Saturday, January 5, 2013

Playing around in the Tree of Life


The Tree of Life web project (TOLweb) aims to consolidate phylogenetic information from multitudes of studies in an effort to build a robust tree of all species. It's a great resource, even now as it becomes increasingly clear that no single phylogeny accurately captures species relationships (due to gene flow and horizontal gene transfer).

A user navigates the tree via a web interface with a hierarchical structure, showing the topology of one clade at a time, along with some pretty pictures. However, this is not entirely satisfying, since only a subset of the data can be viewed at a time. For example, viewing the "Eutheria" (the placental mammals), we can see the various families, but we cannot see the genera or species within each one. This format is OK for exploring the tree, but useless to the biologist who wants to "play" with the tree.

The "Eutheria" page on TOLweb. Clicking on any of the family names would take you to the equivalent page for each family, while clicking on the root would take you up one level (Mammalia).
Once you've found the clade you're interested in, you'd ideally like to have the tree in a more useful format, such as Newick or NEXUS, which allows it to be viewed and manipulated by various programs. Fortunately, although not widely publicised, this is possible.

The first step is to find the unique TOLweb ID of the clade. Each clade in the tree has a name and a number (NODE ID), which is assigned to the node (branching point) at the root of that clade. For example, node 1 is the root of the tree "Life on Earth", while the Eutheria correspond to node 15997. To find the number corresponding to a clade, simply add this line to the address bar of your web browser:

http://tolweb.org/onlinecontributors/app?service=external&page=xml/GroupSearchService&group=xxx

Where xxx is the name of the clade on TOLweb (scientific and common names are usually accepted). This will produce a short XML output with information about this node. Here is the output that was produced when I specified the clade "primates":


<?xml version="1.0" standalone="yes"?>

<NODES COUNT="1">
<NODE ID="15963">
<NAME><![CDATA[Primates]]></NAME>
</NODE>
</NODES>


The important bit is the NODE ID attribute, which indicates that this is node 15963. The complete tree corresponding to this clade can then be obtained in one of two ways. TOLweb itself uses an XML format, which can be obtained directly by pasting this line into the address bar:

http://tolweb.org/onlinecontributors/app?service=external&page=xml/TreeStructureService&node_id=yyy

Where yyy corresponds to the unique node ID described above. Unfortunately, in my experience, not many programs can read this XML format. But I have found one program that can, and it actually allows you to bypass the above step entirely: Archaeopteryx
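If you'd rather script the two lookups than paste URLs by hand, something like the following Python sketch should do it. It only builds the two URLs quoted above and parses the sample "primates" output from this post. The function names are my own, and I'm assuming the service still answers in the same XML shape, which I haven't re-checked.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base address copied from the two URLs quoted in this post.
TOLWEB_APP = "http://tolweb.org/onlinecontributors/app"


def group_search_url(group):
    """URL that looks up the node ID for a clade name (hypothetical helper)."""
    params = {"service": "external", "page": "xml/GroupSearchService", "group": group}
    # safe="/" keeps the slash in the page name unescaped, as in the post's URL.
    return TOLWEB_APP + "?" + urlencode(params, safe="/")


def tree_structure_url(node_id):
    """URL that returns the TOLweb XML tree below a node (hypothetical helper)."""
    params = {"service": "external", "page": "xml/TreeStructureService", "node_id": node_id}
    return TOLWEB_APP + "?" + urlencode(params, safe="/")


def node_ids(xml_text):
    """Map each clade NAME in a GroupSearchService reply to its integer NODE ID."""
    root = ET.fromstring(xml_text.strip())
    return {node.findtext("NAME"): int(node.get("ID")) for node in root.iter("NODE")}


# The reply shown above for the clade "primates".
SAMPLE = """<?xml version="1.0" standalone="yes"?>
<NODES COUNT="1">
<NODE ID="15963">
<NAME><![CDATA[Primates]]></NAME>
</NODE>
</NODES>"""

ids = node_ids(SAMPLE)
print(group_search_url("primates"))
print(ids)
print(tree_structure_url(ids["Primates"]))
```

To run it against the live service you would fetch `group_search_url(...)` with `urllib.request.urlopen` and pass the decoded text to `node_ids`. I've left that out so the example runs offline.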

Archaeopteryx is a feature-rich Java program designed for viewing and manipulating large trees. It has built-in functionality to retrieve TOLweb trees given the node ID.


The primate tree in Archaeopteryx

It's an ideal program for this kind of thing because it has a great "Dyna Hide" function, whereby it only shows the number of taxon names that can fit on the tree at a given zoom level. Zooming in on a sub-clade then reveals additional names.

Zooming in on the top corner reveals more taxon names

The tree can then be exported in a number of formats, including Newick. I like to use FigTree, another Java program, for further tree manipulation.

My own processed and summarized version of the primate tree

Tuesday, August 21, 2012

Gene expression analysis reveals a single origin for Nymphalid butterfly eyespots

This paper from the Antonia Monteiro lab came up in my Reader feed the other day, and I was so impressed I thought it deserved a blog post. Here goes:

The paper uses two approaches to ask whether eyespots have a single origin or multiple origins in Nymphalid butterflies and when the trait evolved, and to shed some light on the gene networks involved.

One approach was to plot the presence/absence of eyespots in 399 Nymphalid species (and 21 outgroup species) onto a previous phylogeny. The distribution of the trait over the tree was used to fit alternative models of single, double or triple origins of the trait, using a Bayesian approach. The authors found that the most consistent model, given the data and the tree, was a single origin of eyespots at the base of the Nymphalidae.

In parallel, the expression of 5 core developmental genes previously implicated in wing patterning (Antennapedia, spalt, engrailed, Distal-less and Notch) in eyespot foci was analysed using antibody stains for 23 Nymphalid species. The presence or absence of expression of each gene in presumptive eyespot foci was scored, and the distribution of trait values across the phylogeny was analysed in a similar way to the presence/absence of eyespots. Their Figure 1 shows a single origin for co-expression of Notch, spalt and Distal-less at the base of the Nymphalidae, but the results for engrailed and Antennapedia (Antp) were ambiguous (the association of these genes with eyespots might have evolved more than once).

The fact that 3 genes together are associated with eyespots at the base of the Nymphalidae strongly suggests that the network for eyespot pattern was co-opted from a pre-existing role elsewhere in development, an interpretation reinforced by the observation that the genes are expressed in a characteristic order (Figure 2):


All in all, this is a really interesting paper, and a brilliant demonstration that butterflies are a wonderful model system for examining the evolutionary and developmental biology of many complex (and visually striking) traits.



Wednesday, July 4, 2012

A desktop wiki instead of a linear logbook - my personal journey


Anyone who works with command line programs enough eventually realises the need to keep a log of their activities. It's not fun, but it's much less painful than having to go back to the manual every time you want to run something. Also, a log helps to keep track of the options and settings that were used.

Up until recently I did this linearly. I had a text document in which I would add a small heading and then the commands I used to do what I was doing. If I wanted to repeat a command used previously, but with slight modification, I would have to search the entire file for that command. This was OK at first, but eventually I had multiple versions of certain commands, each for a slightly different purpose, scattered throughout this file. I wanted to be able to keep commands for the same program together. But I also wanted to keep all the commands for a specific task together, even if that task involved several programs. This didn't seem feasible in the linear text file format, so I started to search for alternatives.

In general I decided that keeping a separate file or page for each task or program would be best. This could be done manually, using multiple text files, but this seemed cumbersome. I discovered that there were several free programs that offered this functionality. Most of these are designed for keeping track of one's thoughts, ideas, to-do lists etc. Popular examples include Tomboy (simple and easy) and BasKet Note Pads (complex and versatile). An additional function that these both have is the ability to create links. You can make links to other files such as papers and manuals, links to websites and, best of all, links to other pages. Rather than a single very long file, I could keep a network of inter-connected, smaller pages.

Tomboy allows easy linking between notes
Main window
BasKet Note Pads allows you to make complex and beautiful pages

After extensive reading I decided to go with a less well-known "desktop wiki" program called Zim. A wiki is simply a website that allows users to edit pages - the best known example being Wikipedia. Zim works like a website in that pages can be linked and organised in a hierarchical structure, but it isn't online; it's all saved locally. Creating links, formatting text and embedding images is really quick and easy. I have completely migrated to Zim, and so far it meets all my needs.

Zim's interface is simple yet powerful
At first the idea of a network of linked notes sounds more complicated than a single log file, but this change has definitely simplified my life, and I recommend it. Zim currently has Linux and Unix versions, but I'm sure there are other programs out there with similar functions.

Friday, April 27, 2012

Fieldwork in South Africa - "coffee is coffee, tea is tea"




This is a quick post, on the urging of Simon, about a recent collecting trip to Limpopo province in South Africa. My co-conspirators for this trip were from the Brakefield lab in Cambridge: Erik van Bergen:

Erik with his first-ever wild-caught butterfly, a Papilio demodocus

Oskar in typical pose and attire
The aim of our trip was to capture live females of Papilio dardanus (the most interesting butterfly in the world), along with various Bicyclus species. The trip came out of a workshop held in Ghana on Afrotropical Lepidoptera research. There, Oskar and I met several extraordinarily interested members of the Lepidopterists' Society of Africa (seriously, go there and join), among them Bennie and Andre Coetzer who, after hearing about the butterfly projects in Cambridge (and over many beers), insisted that we visit them in South Africa, where they would show us a site "where you could pick the butterflies off the trees with your fingers."
Andre (far left) and Bennie (next to Andre) Coetzer in Ghana
After our return to Cambridge, it didn't take long to start planning the South Africa trip.

Upon arriving in Johannesburg, and picking up our loyal Josephine from Avis, we were treated to some incredibly warm hospitality from the Coetzers, before striking out for Limpopo. After a braai (and a little too much fine Windhoek draft) at Nwanedi resort, it was time to get into the field! Our hosts had talked up Mphaphuli (our field site) quite a lot, and we were happy to see it didn't disappoint!


Aside from an absence of Bicyclus ena, all the species we were hoping for were present in abundance, although, given that Erik and I were aiming to establish cultures of butterflies in the UK, there was a slight lack of females (in butterflies, it is usual to observe an excess of males in the field, given the shy and cautious behaviour of females). Oskar, seeking only pheromone samples, completed his fieldwork rapidly.

Keeping the butterflies alive in our accommodation proved to be a little trickier, however. I settled on a regimen of twice-a-day feeding of all captive swallowtails, and leaving them in cages at the field site wherever possible. Despite the high mortality of captured swallowtails (not that surprising given their short life expectancy as adults), I was able to get many many eggs by putting cages with mated females onto host plants.
Quenching the thirst of butterflies and scientists alike

Hand-pairing of Papilio dardanus
Mphaphuli proved so productive that we were able to give ourselves a day to see some other zoology in the Kruger National Park:
Oskar's camera is way better than mine
Too too funny to not include.
After packing up and heading home, it did indeed prove possible to get some livestock established in the UK. I'll try and persuade Erik to write something about the Bicyclus, but from my point of view, although only a minority of the eggs I had hatched, enough larvae have survived to do some crosses and I have eggs again (one generation out from the wild)!


The only minor downside of the trip was that one day after returning to the UK, I started to develop symptoms of African tick-bite fever.

In conclusion, the trip was a roaring success and I owe a great deal of thanks to Oskar, Erik, LepSoc Africa and the Coetzer family.
"I hate my job"


Thursday, November 24, 2011

How to make a (real) butterfly out of paper


The instructions for making an origami butterfly could be sketched on the back of the page it was folded out of, but how long are the instructions for making a real butterfly?

The life recipe of a butterfly, the DNA sequence of its genome, is about 280 million "base pairs" long. The art of origami might offer a means to put this number into perspective. Cambridge scientist Alex Bateman has developed an origami model of the DNA "double-helix". So now, rather than making a paper butterfly, we can try to figure out how much paper we'd need to create the complete DNA sequence of a real one.



In a recent record-breaking feat, Dr Bateman led the construction of a 247-metre-long paper DNA strand. This model represented about 10,000 base pairs, only a tiny fraction of a genome. A complete butterfly genome made in this way would require 25 million sheets of paper and would easily stretch across the Atlantic. The human genome, which is more than ten times the size, would wrap around the world three times!

Of course this is a huge-scale model, and in reality the human genome, stretched out in one long line, would be a few metres in length. Also, the genome is not a single DNA strand - it is divided up into 23 chromosomes (21 in a butterfly). Still, it is remarkable that all of this DNA (in fact two copies of each chromosome - one from each of our parents) is crammed into every single living cell in our bodies.
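For anyone who wants to check the scaling of the paper model, here is a quick back-of-the-envelope sketch in Python. It uses only the numbers quoted above, and it assumes that the model's length grows in direct proportion to the number of base pairs (my assumption, not something taken from Dr Bateman's design).

```python
# Figures quoted in this post.
MODEL_LENGTH_M = 247               # length of the record-breaking paper strand
MODEL_BASE_PAIRS = 10_000          # base pairs it represented (approximate)
BUTTERFLY_GENOME_BP = 280_000_000  # butterfly genome size
SHEETS_FOR_BUTTERFLY = 25_000_000  # sheets quoted for a whole butterfly genome


def paper_length_km(base_pairs):
    """Length of a paper model, assuming length scales linearly with base pairs."""
    return base_pairs / MODEL_BASE_PAIRS * MODEL_LENGTH_M / 1000


print(paper_length_km(BUTTERFLY_GENOME_BP))        # 6916.0
print(BUTTERFLY_GENOME_BP / SHEETS_FOR_BUTTERFLY)  # 11.2
```

So the butterfly genome comes out at roughly 6,900 km of paper, and the sheet count quoted above implies about 11 base pairs per sheet.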

Tuesday, November 15, 2011

Mendeley

To organise my collection of papers and references I use the reference manager Mendeley.


Mendeley is three things at once: a desktop application (very similar to Mekentosj Papers in this respect), a cloud-based community for backup, storage and sharing of papers, and a scientific social network (a bit Facebook, a bit LinkedIn).

The desktop application curates your collection of pdfs:


The panel on the top left contains your list of folders and public groups (see below for more on these); the bottom left panel enables you to search by author, author's keywords, journal or personal tags in your tag-cloud. The middle panel displays the documents in whatever folder is open, and the right panel shows the metadata for the selected paper (title, author, tags, abstract etc.). Clicking on the pdf symbol next to a reference opens the reading screen:


Here you have the option to read fullscreen, edit metadata on the right, and highlight and annotate the paper itself (very handy!). Mendeley tabs all the references you may have open at any one time, making it simple to flip between them.

Importing papers is simple. You can either drag a pdf into your library or select the 'Add Documents' button in the library screen. Mendeley then adds the pdf to your library, checks for metadata and automatically fills out title, author, publication etc. This isn't perfect, but it nearly always works and is improving. It's also simple enough to edit incorrect entries: Mendeley lets you search Google Scholar with the title to fill out these fields if it can't find the reference in its catalogue. Best of all, Mendeley copies the pdf into a folder in its area of your hard drive and renames it in a format of your choosing: 'j.1365-294X.2011.05127.pdf' now becomes 'Legrand et al 2011 Molecular Ecology.pdf' for instance. These are stored in nested folders with a structure of your choosing.
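To make the renaming pattern concrete, here is a small Python sketch of an "author year journal" file name. This is not Mendeley's code, just my own illustration of the format in the example above. The function name is hypothetical and the co-author names are placeholders.

```python
def citation_filename(authors, year, journal, extension=".pdf"):
    """Build an 'Author Year Journal' file name (illustration only, not Mendeley's code)."""
    if len(authors) == 1:
        lead = authors[0]
    elif len(authors) == 2:
        lead = f"{authors[0]} and {authors[1]}"
    else:
        # Three or more authors: first surname followed by "et al".
        lead = f"{authors[0]} et al"
    return f"{lead} {year} {journal}{extension}"


# Placeholder co-authors; only the first surname comes from the example above.
print(citation_filename(["Legrand", "Second Author", "Third Author"], 2011, "Molecular Ecology"))
```

With three or more authors this reproduces the 'Legrand et al 2011 Molecular Ecology.pdf' name from the example.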

This is all fantastic, and the cross-platform nature (there are Windows, OS X and Linux versions of Mendeley Desktop as well as iPhone and iPad apps to read your papers on the move) beats all competition hands-down, but Mendeley really comes into its own with its use of the cloud. Firstly, Mendeley allows you to sync your files between multiple computers - just click 'Sync' in the desktop app and all your references are downloaded. Amazing - no more carting around an external hard drive full of pdfs so you can work at home.

Mendeley is also great for collaboration: you can search for other users by email address and add your colleagues. You can then make groups or shared folders of references (complete with a Facebook-like wall for discussion) with your collaborators. I use this feature to help run our journal club - it works brilliantly!



Finally, you can make a public profile to go with your Mendeley web presence:


Neat. This will make it easier for my colleagues to find me and share references.

Mendeley also does citation management. You can either export folders as BibTeX lists, copy and paste individual citations from the software itself, or install the plugin for Microsoft Word or OpenOffice. This last option is so easy to use it's unreal. In Word:

Alt+M brings up a dialogue box where you can search your library

Select the appropriate reference and it is added. When you've finished the document, use the insert bibliography option in Word:
Very cool. The reference style is customizable too. 

Thursday, November 3, 2011

Thinking about Acraea again today

Female Acraea jodutta
Keep thinking about Acraea species. They're Müllerian mimics from Africa, and often the models for Papilio dardanus. Some are sympatrically polymorphic, some sexually dimorphic and some are infected by Wolbachia male-killers.