The tiniest eye

I learned a bit about ocelloids on the most recent episode of This Week in Microbiology. They are essentially eyes, complete with lenses and retinae, yet they're part of single-celled protists. In case it's not strange enough that a single-celled organism has an eye, it looks like ocelloids may have evolved from chloroplasts. This suggests that the most primitive* forms of light perception were developed by cyanobacteria. Thanks, bacteria.

Here's looking at you. Figure is from Hayakawa et al. (2015) PLoS ONE. The ocelloid parts are marked by H and R.  See the paper for more details.

* Calling a perfectly functional evolutionary development "primitive" may be unnecessarily pejorative here. Development of eye precursors is nothing short of amazing. 

Venn diagrams in R, or how to go around in circles

Did you know you can produce Venn diagrams in R?

Yeah, I wasn't surprised either. It's easy enough to assemble many other kinds of graphs and data visualizations in R, so why not Venn diagrams? 

I tend to feel ambivalent about Venns (see also: Euler diagrams). They have many of the same problems as pie charts: they abstract data to the point of near-meaninglessness, they're completely inappropriate for situations when some values are much smaller or larger than the rest, and they can magnify the importance of otherwise trivial details. That being said, they're still popular and can express simple conclusions easily. If all I want to say is "these groups share n components" then it's hard to do better than a Venn diagram without going into more detail.

So how can we assemble these crazy things? One option is venn() in gplots - see this vignette for some examples. It's described here.

We can use two groups:

library(gplots)
venn(list("Set A"=1:10,"Set B"=0:5))

That's not terribly interesting, so here's four sets:

venn(list("Set A"=0:10,"Set B"=0:5,"Set C"=5:39,"Set D"=7:80))

venn() will take a data frame as long as the values are booleans, so you can turn a data frame like this

  Ab Cd Ef
1 TRUE FALSE FALSE
2 TRUE FALSE FALSE
3 TRUE TRUE FALSE
4 TRUE FALSE TRUE
5 TRUE FALSE TRUE
6 TRUE TRUE FALSE
...

into a Venn diagram of the three sets.
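If you're starting from lists of members rather than a ready-made logical table, here's a rough sketch of one way to build that kind of data frame in base R and hand it to venn(). The set contents and the names `sets`, `items`, and `membership` are made up for illustration; only the column names come from the table above.

```r
# Hypothetical sets; only the column names (Ab, Cd, Ef) come from the post.
sets <- list(Ab = 1:6, Cd = c(3, 6), Ef = c(4, 5))

# Every item that appears in at least one set.
items <- sort(unique(unlist(sets)))

# One logical column per set: TRUE where the item belongs to that set.
membership <- as.data.frame(lapply(sets, function(s) items %in% s))
rownames(membership) <- items

print(membership)

# Draw the diagram only if gplots is installed.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::venn(membership)
}
```

Keeping the membership table around is handy on its own, since it can be checked by eye before anything gets plotted.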

gplots provides just one of the available options. There's also the VennDiagram package and the venneuler package. The former of those packages offers extensive customization but doesn't handle the intersection counts itself. It may work well if the counts are already available. Here's an example:

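library(VennDiagram)

# Counts are supplied directly; n12, n23, and n13 are pairwise overlaps
# and n123 is the three-way overlap.
venn.plot <- draw.triple.venn(
  area1 = 40,
  area2 = 33,
  area3 = 70,
  n12 = 10,
  n23 = 10,
  n13 = 7,
  n123 = 3,
  category = c("First", "Second", "Third"),
  fill = c("blue", "red", "green"),
  lty = "blank",
  cex = 2,
  cat.cex = 2
)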
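Since draw.triple.venn() wants the counts up front, something like the following could work them out from raw sets first. This is only a sketch: the three sets and the helper name `triple_counts` are invented for illustration (the sets are rigged to reproduce the counts used above), and the plotting call is skipped unless VennDiagram is installed.

```r
# Hypothetical helper: derive the counts draw.triple.venn() expects
# from three vectors of set members.
triple_counts <- function(x, y, z) {
  list(
    area1 = length(unique(x)),
    area2 = length(unique(y)),
    area3 = length(unique(z)),
    n12   = length(intersect(x, y)),
    n23   = length(intersect(y, z)),
    n13   = length(intersect(x, z)),
    n123  = length(Reduce(intersect, list(x, y, z)))
  )
}

# Made-up example sets.
first  <- c(1:10, 18:47)
second <- c(1:17, 48:63)
third  <- c(1:3, 11:21, 64:119)

counts <- triple_counts(first, second, third)
str(counts)

# Plot only if VennDiagram is available.
if (requireNamespace("VennDiagram", quietly = TRUE)) {
  grid::grid.newpage()
  do.call(
    VennDiagram::draw.triple.venn,
    c(counts, list(category = c("First", "Second", "Third"),
                   fill = c("blue", "red", "green"),
                   lty = "blank"))
  )
}
```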

The latter of those two packages, venneuler, makes the whole process so criminally easy that you could safely ignore gplots altogether. That is, unless you want the actual counts of each group plotted as above. That may be a job for post-R vector image editing but why do it by hand when you can automate it?
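For a sense of how little venneuler asks for, here's a minimal sketch. It takes a named numeric vector in which "&" joins the names of overlapping sets. The numbers and the names `areas` and `fit` are made up, and the plot is only drawn if venneuler (and the Java setup it relies on) is installed.

```r
# Made-up weights; "A&B" is the overlap between A and B.
areas <- c(A = 30, B = 20, "A&B" = 10)

# Fit and draw the diagram only if venneuler can be loaded.
if (requireNamespace("venneuler", quietly = TRUE)) {
  fit <- venneuler::venneuler(areas)
  plot(fit)
}
```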

 

November 2015 update:

There's also the Vennerable package. It's described here and doesn't appear to be on CRAN but can be downloaded from R-Forge. Vennerable can handle all kinds of exotic n-group Venns, with a maximum n of nine.

Vennerable depends on graph, a package available through Bioconductor.

I haven't used it yet but will likely do so soon. Example output will show up here (further update: Vennerable also needs RBGL from Bioconductor. I ran into some unresolvable dependency issues while testing Vennerable, so it will have to wait until I really need a nine-group diagram, but that may indicate larger problems.)

February 2016 update:

Vennerable is now on GitHub. You'll need the devtools package to install it that way.

Vennerable has some neat features but lacks useful documentation. Here's a quick example.

Using the example data provided with the package, and cutting it down to just three groups:

library(Vennerable)
data(StemCell)
VennRaw <- Venn(StemCell)
Venn3 <- VennRaw[, c("OCT4", "SOX2", "NANOG")]
plot(Venn3, doWeights = FALSE)

and that gives you the basic diagram.

If you use the full set of four groups, Vennerable defaults to overlapping rectangles, like this.

The type argument can be set to "circles", "squares", "triangles", or "AWFE" (that is, the kind of plot favored by British statistician AWF Edwards).

plot(VennRaw,type="AWFE")

Give Vennerable a try if you don't mind figuring out all the other options on your own or waiting for me to post about it again.

 

Two updates

Here's some material related to my complaints about email yesterday: the most recent Reply All podcast. It's partially about giving up on email. It seems that people may more readily abandon email if they're comfortable using some alternatives and have enough social leverage to enforce the switch. I don't think I could get away with it.

In the context of the "zero rating" plan hubbub I posted about a couple of weeks ago, Internet.org now has some app-centered competition. The Jana plan and mCent remind me of the old internet browser toolbars claiming users could make money by surfing. Those were strange times. Does anyone still use "surf" in an internet context?

Investments, email, and monkey tails

I'm a graduate student. It's hard to admit it publicly. There's always the desire to prove to others that I'm doing useful research for some reason other than ongoing career training. For non-students (or, at least, people who don't consider themselves students right now), grad school can sound like a poor investment, like buying a beach house. Sure, it's a way to take a brief vacation from the world, but will you ever see any return on that investment after the next hurricane season?

That's not really my opinion about graduate studies and I'm not sure what the metaphorical equivalent of a hurricane is in that context. I'm glad to have the opportunity. Last week, I took my exams, and once all the dust had settled and I finally had a solid research project to move ahead with, I had some time to get introspective again.

What hasn't been going well? 

Email has been a source of more stress than it should be. It was my primary communication method for a while, and for good reason: a well-organized inbox is a collection of tasks, their context, and the people involved in them. Email keeps everything in one place, ready for when it's needed.

Karlsruhe, Germany was apparently the site of the first email from Germany to the US.  The @ symbol may be called an Affenschwanz, or monkey's tail, in German. One day, maps will refer to this place as Kaffenschwanzruhe. This image is from StromBer on Wikimedia Commons.

The stress I felt about email had little to do with the content of any particular message, though. It was disrupting my attention.  A system intended to hold material until I was ready for it was constantly leaking bits of information in the form of alerts. That's a problem with an easy solution so I just turned them off. Even then, I'd worry about what I was missing. It wasn't just FOMO. I felt like I had an obligation to carefully address every message because the wrong response (and is there ever really a perfect response, other than no response at all?*) could have repercussions in my career. That's especially true for a lowly grad student, right?

I'm beginning to realize that most people don't care. That's a great thing! They'll forgive email typos. They'll forget mistakes. They probably won't notice that I took three hours to get back to them instead of twenty minutes. When I reply to email messages, I'll just focus on the messages rather than trying to fit email in around everything else.  

I'm also going to refrain from apologizing about dull blog posts like this one. Instead, I'm going to link to a clip from an in-development game about procedural architecture and noise pollution and neon cats and mind-reading.  It's by a fellow(?) named Strangethink and the corresponding Twitter feed is an ongoing collection of beautiful oddities.

 

*It's a perfectly bad response. In this flawed system of logic, failing to acknowledge a message is the highest offense.  

Placebo research and alternative reviews

Open access publishing is great. Anything reducing the friction inherent to the spread of scientific results provides a benefit to society. 

That being said, reading the list of PLOS ONE papers on Complementary and Alternative Medicine is disappointing. I like PLOS journals and I know how PLOS ONE works: papers have to be technically sound but any judgement beyond that is limited. The burden of proof does not lie with the journal. The problem is with studies clearly based on a shaky foundation. There's this recent work with using homeopathy to treat depression, for example. 

A selection of 19th century homeopathic remedies. From Wikimedia Commons.

There's an Ars Technica piece about that paper, though there appears to be some debate as to whether all of its author's claims are justified. The comments on the PLOS ONE page appear to have been deleted.* Even so, extraordinary claims (i.e., that homeopathic remedies have any therapeutic effect beyond placebo) require extraordinary evidence. If we are to assume that any study along these lines has been designed and performed properly, a positive result is essentially disproving everything we think we know about science. Researchers shouldn't handle homeopathy studies without that philosophy.

I'm genuinely worried about how much of the research into alternative treatments may, in fact, provide solid results and legitimately novel approaches. It's awfully difficult to find such results when they're diluted in a sea of studies with questionable premises.** Hopefully, public commentary will provide a helpful filter.

*The corresponding 404 page is cute and panic-inducing. Where did that sample go? How long has it been missing?

**That's a half-finished homeopathy joke, there.