I'm a big fan of having reproducible workflows, especially when it comes to reproducible data analysis for my research projects. A really handy tool is R Markdown, which allows me to do all my analyses in R and then format the output (including programmatically-generated tables and numbers!) via Markdown into a nice HTML or PDF document.
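To give a minimal sketch of how this works: once the analysis lives in an R Markdown file, rendering it to HTML is a single command in Terminal (the file name analysis.Rmd here is just a placeholder):
# render an R Markdown file to a standalone HTML report (analysis.Rmd is a placeholder name)
Rscript -e 'rmarkdown::render("analysis.Rmd", output_format = "html_document")'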
Many journals now require uploading data and analysis code to an online repository like OSF or ICPSR. One issue is that these repositories are functionally simple: they offer (persistent, time-stamped) storage for data and code, but no interactivity. What I like to do, in addition, is use GitHub Pages to host the rendered HTML output that accompanies my projects. This results in a visually appealing (and reproducible) presentation of the main results.
Below I list some examples:
I wrote a simple Python script to automate awarding bonuses to workers on Mechanical Turk (MTurk). You can find it here. It requires the Amazon Command Line Tools.
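(For reference, the modern AWS CLI exposes the same operation directly via its mturk subcommand; something roughly like the line below should work, where the worker ID, assignment ID, and amount are all placeholders.)
# grant a $0.50 bonus to a single worker (IDs and amount below are placeholders)
aws mturk send-bonus --worker-id A1EXAMPLEWORKER --assignment-id 3EXAMPLEASSIGNMENT --bonus-amount "0.50" --reason "Thanks for your help!"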
I wrote an R function to do this here. The readme on that page also has the plain cURL code and links to the Qualtrics API documentation so you can experiment on your own!
A simple JavaScript implementation (using Raphael.js) of a continuous version of the Inclusion of Other in Self scale (Aron, Aron, & Smollan, 1992).
A simple one-liner to replace identifiable information (e.g., MTurk worker IDs). (Has some cons, but it's fast.)
d0$workerid_random <- match(d0$workerid, unique(sort(d0$workerid)))
If you want to be even cooler and use a one-way hash function, here's a one-liner using the digest() function (from the digest package):
library(digest); d0$workerid_hash <- substr(sapply(as.character(d0$workerid), digest, algo="md5", serialize=F), 1, 6)
(The code above: (1) converts workerid into a character string, (2) uses sapply() to apply digest() to each element, and (3) takes the first 6 characters of each resulting hash.)
During one of our lab hackathons, Justine Kao, Greg Scontras, and I coded up a little interactive web text-visualization demo: colorMeText, which basically colors input text according to ratings using some dictionary (e.g. useful for sentiment analysis, or any other dimension of interest). It's still a work in progress!
There's a neat and simple trick to update R and re-install all your packages here.
Sometimes you just want to concatenate all the data files in a directory into one big file. If they're in a format like .csv and you want to drop the header line from each file, you can use the following command in Terminal:
awk 'FNR > 1' *.csv > combined_file.csv
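(If you instead want to keep a single header row, taken from the first file, a small variation does it: the NR == 1 condition lets the very first line through.)
awk 'FNR > 1 || NR == 1' *.csv > combined_file.csv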
Although Preview on a Mac can compress PDFs (Export->Quartz Filter->Reduce File Size), the images come out really low quality. The solution I used was pdfcompress.com, which gave reasonable results.
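If you'd rather do it locally, Ghostscript (which also shows up below for cropping) can downsample the embedded images. I haven't compared the output quality myself, but the usual recipe is something like:
gs -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook -dNOPAUSE -dBATCH -sOutputFile=out.pdf in.pdf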
I use GIFFun, which is pretty alright if you just need the basic essentials.
There's also a neat guide here that I haven't tried, using ImageMagick on the command line.
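From what I gather, the basic ImageMagick recipe is a one-liner along these lines (untested; the frame file names are placeholders, and -delay is in hundredths of a second):
convert -delay 10 -loop 0 frame*.png animation.gif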
If you crop a PDF file in Preview, it doesn't destructively crop it. The parts you cropped out are still hidden in the file (i.e., you can undo the crop later). I've found that this causes trouble with LaTeX, which may not recognize the new bounding box. If you need to destructively crop the file, one way to do it is with Ghostscript. Say you want to crop "in.pdf" to "out.pdf" (note that you can't use the same filename, because of the way gs works); at the command line, type:
gs -sDEVICE=pdfwrite -dUseCropBox -sOutputFile=out.pdf - < in.pdf
Reference: http://electron.mit.edu/~gsteele/pdf/
[PostScript] Level 1 uses only ASCII-coded RGB values, and is very wasteful, producing very large files. Level 2 includes support for JPEG-encoded images, which produces much smaller files. Level 3 includes support for Zlib compression, making it well suited for making EPS files from PNG files.
In general, level 3 will produce the smallest files. Level 2 provides the best compatibility, and works well with JPEG images.
If you decide to use level 2 PostScript, I recommend converting first to a jpg file. The "convert" program included in ImageMagick uses a quality factor in "percent" that ranges from 0 to 100:
convert -quality 80 fig.png fig.jpg
I find a quality factor of 80 on high-resolution images gives good compression without too much loss in quality. You can then convert the image to EPS using "convert" with the eps2 settings:
convert fig.jpg eps2:fig.eps
If you can use level 3 postscript, you can convert directly from png to eps:
convert fig.png eps3:fig.eps
Using level 3 postscript from a png image file for scientific figures will often produce a very small eps file. Ghostscript is compatible with these level 3 eps files, so this is often a good way to go.
Here's a simple macro you can use so that every time Microsoft Word opens a document, it does so at a specific zoom level. (Personally, I like 100% on my Retina Pro.)
1) In Word, go to Tools->Macro->Macros.
2) In the dropdown next to "Macros in", select Normal (Global Template).
3) Create a new macro called AutoOpen. [This particular name seems to be required for it to run upon opening.]
4) Paste the following macro in, where 100 is the desired zoom percentage.
Sub AutoOpen()
    ActiveWindow.ActivePane.View.Zoom.Percentage = 100
End Sub
(refs: various places like this and this.)
I love having really high sensitivity on my mouse/trackpad. Unfortunately, the maximum you can set in System Preferences isn't high enough for me. There is a way to increase the sensitivity further. In Terminal, typing:
defaults read -g com.apple.mouse.scaling
will give you the current value of your mouse scaling. You can modify it by changing read to write. For example, if you want to set your mouse scaling to 3.0 (the maximum in System Preferences), type:
defaults write -g com.apple.mouse.scaling 3.0
In addition, to change the trackpad, use com.apple.trackpad.scaling. You can also use .scrolling to change scrolling speed.
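For example, to set the trackpad to 3.0 as well:
defaults write -g com.apple.trackpad.scaling 3.0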