Discussion of research tools (for students and collaborators)

Volker Franz, University of Tübingen, Germany

Note: this page is somewhat outdated and needs a general overhaul and update. Nevertheless, many of the general ideas are still valid.

In practical research situations, we repeatedly encounter similar problems. Here is a list of things I found useful.

1. Workflow for experiments and statistical analyses

Currently, a typical experiment of mine consists of the following steps:

Running the experiment

To program an experiment, I use either C++ or Matlab (depending on the setup and the requirements). I use C++ if I need OpenGL for stereo graphics (because the PsychophysicsToolbox does not support this yet) or if a C/C++ program already exists that I can modify, so that staying with C++ is faster. Otherwise, I use Matlab together with the PsychophysicsToolbox and my own OptotrakToolbox. Programming experiments in Matlab is much faster and, for example, makes it easy to produce nice graphics online during the experiment (e.g., plotting the trajectories so that I can test whether all Optotrak markers are visible). It is also easier for our students because they only need to learn one language (Matlab) instead of two (C++ and Matlab). We also offer introductory courses for the PsychophysicsToolbox and Matlab (see the teaching section of my homepage).

All my experiments are currently run on the Windows operating system (XP and 2000). I would prefer Linux or another UNIX system (Mac OS X or SGI IRIX, which I used at the MPI in Tübingen), but it was just faster to use Windows because of compatibility issues (drivers, software, etc.).

For input from and output to external hardware I use the ActiveWire card. It is very cheap (~60 US$/EUR) and connects via USB. It gives me 16 freely programmable input/output channels. The PsychophysicsToolbox also comes with an ActiveWire function. For example, we use the ActiveWire to control our liquid-crystal goggles and to measure reaction times. Recently, I performed (conservative) checks on the timing of the reaction times and found it to be very accurate (delay < 5 ms, variability ~2-3 ms), as did others. We also recently bought a DataTranslation DT9812 USB data acquisition module, which is slightly more versatile than the ActiveWire and also works from within Matlab.

Data transfer from experimental machine

I use unison to transfer the data from my experimental machine (Windows XP/2000) to my desktop (GNU-Linux, Debian) and my notebook (Mac OS X - Powerbook). Unison is excellent: For example, I can easily synchronize my complete home directory between notebook and desktop. On Windows, unison is somewhat slow. But there I only synchronize the current data, so this is no big issue.

Basic statistical analysis

I do all my basic statistical processing (e.g., extracting the maximum grip aperture, time-normalizing trajectories, etc.) in Matlab. Typically, this results in plots for visually inspecting the data and ASCII-files for further analyses. Sometimes the issue of filtering the data arises. An excellent book on digital signal processing that is even available for free on the web is: The Scientist and Engineer's Guide to Digital Signal Processing by Steven W. Smith.
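
The two processing steps named above can be illustrated with a minimal sketch. It is written in Python rather than Matlab and is not the code used in the lab. It assumes that the grip aperture is the Euclidean distance between a thumb marker and an index-finger marker, and that time normalization means linear resampling to a fixed number of points between movement onset and offset. The function names and the default of 101 points are hypothetical choices made for illustration.

```python
import math


def grip_aperture(thumb, index):
    """Distance between thumb and index-finger markers at each sample.

    Both arguments are equally long sequences of (x, y, z) positions.
    """
    return [math.dist(t, i) for t, i in zip(thumb, index)]


def max_grip_aperture(thumb, index):
    """Return (maximum aperture, sample number at which it occurs)."""
    aperture = grip_aperture(thumb, index)
    peak = max(range(len(aperture)), key=aperture.__getitem__)
    return aperture[peak], peak


def time_normalize(values, n_points=101):
    """Linearly resample a 1-D series to n_points equally spaced in time.

    The first and last output values equal the first and last input values,
    so the series is mapped onto normalized time 0..1.
    """
    if len(values) < 2 or n_points < 2:
        raise ValueError("need at least two samples and two output points")
    last = len(values) - 1
    resampled = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)  # fractional index into the input
        lo = min(int(pos), last - 1)     # left neighbour (clamped at the end)
        frac = pos - lo
        resampled.append(values[lo] * (1 - frac) + values[lo + 1] * frac)
    return resampled
```

Real trajectories would usually be filtered first and cut at movement onset and offset. Both steps are left out here.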

Further statistical analyses

Unfortunately, Matlab is very poor at even the simplest statistical analyses we typically use in psychology experiments (e.g., within-subjects repeated measures ANOVA). Therefore, I use two strategies:

  • I encourage students to prepare all further analyses in Matlab and then to use SPSS (e.g., for the ANOVA). Most students have learned SPSS at university, so this is no problem. For this, they typically have to create an ASCII file in Matlab in which each line corresponds to a subject and each column to one measurement of the dependent variable of interest.
  • I myself use a number of different programs. For this, I typically create an ASCII file in Matlab in which each line corresponds to a single trial, specifying the subject, all conditions, and all dependent variables of interest. Typically, I use:
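
To illustrate the two file layouts described above (one line per trial versus one line per subject), here is a minimal Python sketch. It is not one of the programs referred to above. It assumes a tab-separated file with a header line and averages all trials of a subject within each condition. The column names `subject`, `condition`, and `value` are hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def long_to_wide(long_text, subject_col="subject",
                 condition_col="condition", value_col="value"):
    """Convert one-line-per-trial data to one-line-per-subject data.

    Returns (header, rows). Each row holds the subject label followed by
    the mean of that subject's trials in each condition (None if a
    subject has no trials in a condition).
    """
    cells = defaultdict(list)
    reader = csv.DictReader(io.StringIO(long_text), delimiter="\t")
    for row in reader:
        key = (row[subject_col], row[condition_col])
        cells[key].append(float(row[value_col]))

    subjects = sorted({subject for subject, _ in cells})
    conditions = sorted({condition for _, condition in cells})

    header = [subject_col] + conditions
    rows = []
    for subject in subjects:
        means = [mean(cells[(subject, c)]) if (subject, c) in cells else None
                 for c in conditions]
        rows.append([subject] + means)
    return header, rows
```

The wide layout is what SPSS expects for a repeated measures ANOVA. The long layout keeps the individual trials and is therefore the more general of the two.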

Reading data from published graphs / figures

Replication is an integral part of our scientific work (although it is often undervalued, see for example the biting essay of Richard Feynman on: Cargo Cult Science). When we try to replicate the work of others, we want to compare our data to their data. This is complicated by the fact that quite often data are only available as graphs. Engauge is a good, free digitizing software that helps with this.
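
The core idea behind digitizing a graph can be sketched in a few lines. This is not how Engauge is implemented, only a minimal illustration under simple assumptions: the axes are linear and aligned with the image (no rotation), and for each axis two reference points with known values have been clicked. The function name and all coordinates in the usage note are hypothetical.

```python
def make_axis_map(pixel_a, value_a, pixel_b, value_b):
    """Return a function that maps a pixel coordinate to a data value.

    (pixel_a, value_a) and (pixel_b, value_b) are two calibration points
    on one linear axis, e.g. two tick marks with known values.
    """
    if pixel_a == pixel_b:
        raise ValueError("calibration pixels must differ")
    slope = (value_b - value_a) / (pixel_b - pixel_a)

    def to_data(pixel):
        return value_a + slope * (pixel - pixel_a)

    return to_data
```

For example, if the x-axis ticks for 0 and 10 sit at pixel columns 100 and 500, `make_axis_map(100, 0.0, 500, 10.0)` maps pixel column 300 to 5. Image rows usually count from the top, so the y-axis calibration typically has a negative slope, which the same function handles.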

Archiving the data after publication

As mentioned in the last section, replication is an integral part of our scientific work. This also includes that we retain our raw data and all statistical analyses for later use by other investigators (most journals expect us to retain our data for at least 10 years; please read this as: "for the rest of your lifetime").

Therefore, if you have published an article as first author and with me as supervisor, I ask you to do the following after the manuscript has been accepted for publication:

  • Create a data-DVD containing all the raw data, all analyses, and all the files related to writing the article. It should also contain all important emails related to writing the article, e.g. correspondence with the editor, reviews, etc. (at the University of Giessen email folders can easily be exported using the file-manager of the web-mail interface).
  • Add a README.txt file containing a short description of where to find the analyses, the raw data, etc. The file should be a plain text file (you can generate this in Word and save it as "text only", or in the Matlab Editor, or in XEmacs, ...). Please check that this file can be read on other machines too.
  • Burn three copies of this DVD: One for me, two for you. And please check that all three DVDs can be read on other machines too.
  • Label each DVD with authors, title, journal using a special CD/DVD marker pen (this is supposed to be better than putting labels on the DVD; but please don't use: ballpoint pen, pencil, crayon :-). If there are none, ask our secretary to order some of these markers.
  • Give one DVD to me and retain your two copies at a safe place for at least 10 years! You as the first author are responsible for this storage. Giving the DVD to me is just an additional precaution...
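
One possible way to check that a copy "can be read on other machines" is to compare checksums, as in the following minimal Python sketch. This is an assumption about how one might do the check, not a procedure prescribed above. The idea is to compute a SHA-256 checksum for every file in the original folder and then verify each copy against that list. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 checksum of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map every file below root (relative path) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_to_manifest(root, manifest):
    """Compare a copy against a manifest.

    Returns three sorted lists: files missing from the copy, files whose
    content differs, and files that are not listed in the manifest.
    """
    current = build_manifest(root)
    missing = sorted(set(manifest) - set(current))
    changed = sorted(name for name in manifest
                     if name in current and current[name] != manifest[name])
    extra = sorted(set(current) - set(manifest))
    return missing, changed, extra
```

A copy is intact if all three lists returned by `compare_to_manifest` are empty. Reading every file completely in order to compute the checksum also shows that the disc is readable on that machine.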

2. Scientific writing

Writing a scientific manuscript / paper

Here is a short introduction to writing good scientific manuscripts: Gopen and Swan (1990). The Science of Scientific Writing. American Scientist (78), 550-558.

A guideline regarding the ethical aspects of writing a research article can be found at the pages of the Society for Neuroscience: Guidelines: Responsible Conduct Regarding Scientific Communication.

Writing a scientific paper with me

When you write a paper with me and you are first author (e.g., as a PhD student), I ask you to:
  • Before giving a draft to me:
    • Write an email, giving a short summary of what you did and in which state the manuscript is.
    • If it is a revision, prepare in parallel the cover letter giving a detailed point-by-point response to the reviewers' and the editor's concerns. Typically, you should send the draft of this cover letter together with the draft of the manuscript to me.
    • Allow enough time (typically at least one week) for me to read the draft, and warn me a couple of days beforehand that you are going to give me something to read soon.
    • Pay close attention to formatting, citation style, references, etc. When reading a manuscript it can be very distracting to constantly have to correct these things. Under normal circumstances you should not postpone these things to later stages of your writing. If you have good reason to do so, make this explicit in the above-mentioned email.
    • Run a spell checker.
  • When submitting a paper or a revision:
    • Be aware of and obey all deadlines.
    • Carefully check the journal's "guidelines for authors" and obey all requirements (formatting, etc).
    • Be sure to add an appropriate "Acknowledgments" section, mentioning all grants and funding agencies that were involved (for examples see my publications). It is very important that we mention all grants, otherwise the paper will not be counted for the grants, which essentially means that it looks as if you had been lazy, and we will have problems acquiring grants in the future.
    • If it is a revision: Consider whether it is practical to mark everything that has changed relative to the original, submitted version (e.g., by using a yellow background). This does not always make sense but can help the editor, the reviewers, and me to focus on the changes. Some journals even require this nowadays.
    • Ask me before you submit whether it is OK to submit.
    • After submitting: Send me the final, submitted version of the manuscript. Typically you get a PDF "proof" of your submitted version. Store this locally on your computer and also send a copy to me. If you are a member of my Hamburg group: We currently share all submitted versions internally. For details see our Wiki at: Science>Writing research papers.
  • After your paper is accepted:
    • Be happy! Typically you have worked approx. 1-3 years for this moment, sometimes considerably longer. So it is time for a little party...
    • Place an APA-style citation of your paper on your home-page, marked as "accepted for publication by JournalName"
    • Typically, journals do major formatting and some editing. The copy-editor is in charge of this and will send you page proofs of the final to-be-published paper a couple of weeks after your paper was accepted. The copy-editor will not warn you beforehand, but will expect you to check and correct the page proofs within 24-48 hours. Your job is to check whether the page proofs correspond to the final, submitted version. You need to do this very carefully, because errors cannot be corrected later. Checking the page proofs involves reading them from start to end, including figure legends, footnotes, and references, and checking critical passages word by word (e.g., all data in the results section). Data should, of course, be compared not only to the final, submitted draft but also to the original printouts of your data-analysis programs. Also check the Acknowledgments section again and make sure that all relevant grants and funding agencies are really mentioned. At this stage we are typically not allowed to deviate from the final, submitted version. We could in rare cases (e.g., if you found errors in the results section), but we might have to pay for doing so, because depending on the changes the formatting might have to start all over again. Also, of course, send me a copy of the page proofs and of the corrected page proofs and keep a copy on your local computer. The page proofs can already be given to colleagues who want to read the paper.
    • After the paper is (finally) published you typically get a PDF file of the final, published version (this is different from the page proofs, e.g., it now contains page numbers). Please send a copy to me and update your home-page to show the full, official citation.
    • Store all relevant data and the manuscript for the rest of your lifetime. See above: "Archiving the data after publication"

Writing reviews

Writing reviews is an integral part of our scientific work. Here is a nice, short introduction to reviewing a paper: Brainard, D. (2000). How to write an effective manuscript review. Optics & Photonics News, 11(6), 42-43. Another good introduction is: Benos, D. J., Kirk, K. L., & Hall, J. E. (2003). How to review a paper. Advances in Physiology Education, 27 (1-4), 47-52.

A guideline regarding the ethical aspects of writing a review can be found at the pages of the Society for Neuroscience: Guidelines: Responsible Conduct Regarding Scientific Communication. There is also a Committee on Publication Ethics (COPE) formed by editors of peer-reviewed journals. They give detailed advice for editors and authors, including case studies. See also: Council of Science Editors and World Association of Medical Editors.

3. Word processing software

I use two strategies:

  • I encourage students to write with the text-processor they know (typically MS-Word). Graphics import seems to be fairly stable if using the (compressed) tiff format.
  • I myself use LaTeX. The standard LaTeX distribution now creates PDF by default (no longer PostScript), which I think is good. For graphics import, I create the graphics either in PDF format or in EPS format (which I then convert to PDF using epstopdf). The latter is sometimes more stable when using the R programming language or Matlab. (I know it shouldn't matter, but it does.) epstopdf is often included in the TeX / LaTeX distribution.

4. Manipulating pdf-files

pdftk is a nice and free tool which allows you to easily manipulate pdf-files (e.g., combining different pdf-files to one single file, extracting single pages, etc.). It is command-line based and therefore also allows you to write shell-scripts for it (if you wish to do something like this :-). Of course you can also use the Adobe suite which is, however, not free.
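
As a small illustration of scripting pdftk, the following Python sketch builds the command lines for the two operations mentioned above (combining files and extracting pages) and can run them via `subprocess`. It assumes that pdftk is installed and on the search path. The helper names and the file names in the usage note are hypothetical.

```python
import subprocess


def pdftk_merge_command(inputs, output):
    """Arguments for: pdftk in1.pdf in2.pdf ... cat output out.pdf"""
    if not inputs:
        raise ValueError("at least one input file is required")
    return ["pdftk", *inputs, "cat", "output", output]


def pdftk_extract_command(source, first_page, last_page, output):
    """Arguments for: pdftk in.pdf cat FIRST-LAST output out.pdf"""
    if first_page < 1 or last_page < first_page:
        raise ValueError("invalid page range")
    if first_page == last_page:
        pages = str(first_page)
    else:
        pages = f"{first_page}-{last_page}"
    return ["pdftk", source, "cat", pages, "output", output]


def run(command):
    """Run the command; raises CalledProcessError if pdftk fails."""
    subprocess.run(command, check=True)
```

For example, `run(pdftk_merge_command(["manuscript.pdf", "figures.pdf"], "submission.pdf"))` would combine two files into one, and `run(pdftk_extract_command("submission.pdf", 2, 4, "pages.pdf"))` would extract pages 2 to 4.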

5. Creating web-pages

Getting started

Here is a nice, short introduction to HTML, the language which is typically used for web-pages.

Checking your web-page

After creating your web-page, it might be wise to perform automatic online-checks for errors. Typically, there are three possible checks:

The online check for errors in your HTML code can also be included as a link in your web-page, such that your page is automagically tested whenever you click on the link (see the links at the bottom right of this page). For example, assuming you use XHTML 1.0, the source code for the relevant link is:

<p> <a href="http://validator.w3.org/check?uri=referer"> <img src="http://www.w3.org/Icons/valid-xhtml10" alt="Valid XHTML 1.0 Transitional" height="31" width="88" /></a> </p>

Just add this code at the end of your web-page. This will create a button labeled "Valid XHTML 1.0 Transitional". If you click on it, your web-page will be tested.
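
Before using the online validator, a quick local pre-check is possible, because XHTML must be well-formed XML. The following minimal Python sketch only tests well-formedness (e.g., unclosed or wrongly nested tags). It is much weaker than the W3C validator, does not check the page against the XHTML 1.0 rules, and will report named entities such as `&nbsp;` as errors because it does not load the DTD. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def wellformedness_error(xhtml_text):
    """Return None if the text parses as XML, else a short error message."""
    try:
        ET.fromstring(xhtml_text)
    except ET.ParseError as error:
        line, column = error.position
        return f"line {line}, column {column}: {error}"
    return None
```

A page that passes this check can still be invalid XHTML, so the online check remains the real test.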

More advanced stuff for web-pages...

6. Good books on software and programming...

Here is a short and quick list of books I found helpful when learning and working with the different programming languages.

C / C++

  • Deitel & Deitel. C++ How to Program. Prentice Hall. Good, comprehensive introduction and reference book for C++.
  • Kernighan & Ritchie. The C Programming Language. This is the classic book for classic C (written by the creators of C). Helps if you want to know how this language was meant to operate. Also needed if you program in C++.
  • King. C Programming: A Modern Approach. Norton. Christoph Rasche didn't like my choice of the Kernighan & Ritchie book and suggested this one instead...

Matlab

  • Rosenbaum. MATLAB for Behavioral Scientists. Lawrence Erlbaum. This is a short, simple introduction at the beginners level. See also: http://www.matlab-behave.com.

R / S / S-Plus

See also the excellent book collection at: http://www.r-project.org (see there under: "Manuals" and "Books"). If you want to dig deeper into the languages, it might also be useful to buy used copies of the "classics" (the "blue"/"white"/"green" books).
  • Venables and Ripley. Modern Applied Statistics with S. Springer. This is now the standard for R and S.
  • Baron and Li. Notes on R for psychology. This is a good, quick introduction for R. Freely available at: http://finzi.psych.upenn.edu

Smalltalk

Smalltalk is not very common nowadays and I don't use it in my everyday work. But it is such a nice language (fully object-oriented in a consistent way, much nicer than C++...). Therefore, I could not resist putting it here... :-)
  • Lewis. The Art and Science of Smalltalk (Hewlett-Packard Professional Books). I put this book here because it has a nice approach to programming: The idea is that programming is also an art (not only a science) and that it also has to do with communication between people (not only communication with the machine but also communication with other programmers who try to decipher our programs...).

Subversion

I use Subversion (a version control system; the successor of CVS) in my lab to manage our software. There is a good and free book on Subversion available online (German version). See also the printed version at O'Reilly:
  • Sussman, Fitzpatrick, & Pilato. Version Control with Subversion. O'Reilly.

UNIX / GNU Linux / Mac OS X / Cygwin

All these flavors of UNIX (and emulations in the case of Cygwin) are very similar, because the core of the functionality is based on the POSIX standard. Therefore it doesn't matter too much on which version of UNIX you are currently working. Most of the functionality is standardized...
  • Robbins. Unix in a Nutshell. O'Reilly. Good, cheap, short introduction to all versions of UNIX. (Maybe there is also a Linux in a nutshell now?)
  • Abrahams and Larson. Unix for the impatient. Addison-Wesley Longman. Much better than Unix in a nutshell, but also more expensive. Nice title :-) Very comprehensive...
  • Krienke. UNIX-Shell-Programmierung. Hanser Fachbuchverlag. This is a very good and highly recommended (German) introduction to writing shell-scripts for all standard UNIX shells (sh, ksh, bash, csh, tcsh)
  • Jepson & Rothman. Mac OS X for Unix Geeks. O'Reilly. This book comes in different flavors (depending on the OS X version you want to work with). It gives a good idea of the differences between other UNIX versions and Mac OS X. Main message: Luckily the differences are small!

XEmacs / GNU Emacs / Aquamacs

You don't need a book to work with Emacs or XEmacs. You could simply use the built-in help (especially the info system), and most people do this happily for years. However, if you want to use the full power of Emacs / XEmacs, a book might be helpful:
  • Cameron, Rosenblatt & Raymond. Learning GNU Emacs. O'Reilly. This helped me a lot to understand the basic structure and ideas of Emacs and XEmacs.

7. Programming: Why Emacs is your friend...

Programming is typically the generation of text files (nowadays Unicode, in earlier times ASCII files, in any case: plain-text files). You could do this for each programming language separately by using the corresponding editor. For example, you could use the Matlab editor to create Matlab files, the Visual C++ editor to create C/C++ files, the R editor to create R files, etc. However, when you get more proficient you might want to have one and the same editor for all your programming tasks. For this, I use the XEmacs editor. This is a multi-purpose editor which allows you to edit files for essentially any existing programming language. I have used it for C/C++, Matlab, R, SPSS, Lisp, shell scripts, Pascal, Smalltalk, Perl, HTML, PHP, LaTeX/TeX, etc. I also use it as a good file browser (instead of horrible things like the Windows Explorer or the Finder in Mac OS X). Also, I use it as an environment to run a UNIX shell (much better than the usual terminal). I even use it to read my email (which I would not necessarily recommend), such that most of the time I work on a computer I work inside XEmacs (an exception is browsing the web). XEmacs runs on essentially every platform (e.g., Mac OS X, Linux, UNIX, Windows), giving me the same interface for most of my tasks independent of any platform.

XEmacs takes some time to learn, but I think it is worth it. An alternative would be GNU Emacs, which is very similar; I personally prefer XEmacs, but the differences are not very big.

8. Mailing lists

Two mailing lists are most important for our work and I recommend that you subscribe to these: Vision Science Mailinglist (visionlist) and Color and Vision Network (CVNet). If you have a special question, it is always worth having a look at the archive of the visionlist mailing list.

9. Being a graduate student

Find here a nice discussion of what it is like to be a graduate student. Note, however, that part of this discussion is only related to the American system, not the German...

If you want some more serious information related to the situation in Germany, you can have a look at: www.hochschulkarriere.de and: www.academics.de.

10. Searching for jobs after you received your PhD

A while ago, we collected a list of links for job search in academia. This was compiled for the PhD-students of an EU-funded project (PRA / Perception for Recognition and Action), but might also be of general interest.