Wednesday, June 28, 2006

TreeBASE rocks


I gave a talk today ("Dude, where's my tree?") at the Evolution 2006 meeting at Stony Brook. It was intended as a somewhat tongue-in-cheek overview of some issues concerning TreeBASE, and broader areas of biodiversity informatics, making use of ants as an example (see my SemAnt project).
Michael Donoghue took me aside after the talk and made some interesting points. He was a little tired, understandably, of hearing that "TreeBASE sucks" (e.g., my CIPRES talk), and felt that my constantly saying this was counterproductive. It could also lead to people not putting their data in TreeBASE because they'd heard that it "sucks".
There is an element of social responsibility here, I guess. I resolutely avoid politics. I don't mean this in a pejorative sense; it's just that I don't have the temperament or skill for it, unlike Michael himself (Lee Belbin is another person in this area who strikes me as a very skilled manager).
Now, my talk was intended to be fun, and I was taking the piss out of myself as much as anything. I also think the things we criticise are the things we value the most. But that said, let me make it clear that TreeBASE is very important. As editor of Systematic Biology I've made authors submit data to it. I have a lot of respect for the work Michael, Bill Piel, and Mike Sanderson put into TreeBASE. If you have phylogenetic data, submit it to TreeBASE. It's the best we have. It's just that, well, as a community, we could do better.

Taxonomic names, metadata, and the Semantic Web

My paper "Taxonomic names, metadata, and the Semantic Web" has appeared in Biodiversity Informatics.


Life Science Identifiers (LSIDs) offer an attractive solution to the problem of globally unique identifiers for digital objects in biology. However, I suggest that in the context of taxonomic names, the most compelling benefit of adopting these identifiers comes from the metadata associated with each LSID. By using existing vocabularies wherever possible, and using a simple vocabulary for taxonomy-specific concepts we can quickly capture the essential information about a taxonomic name in the Resource Description Framework (RDF) format. This opens up the prospect of using technologies developed for the Semantic Web to add "taxonomic intelligence" to biodiversity databases. This essay explores some of these ideas in the context of providing a taxonomic framework for the phylogenetic database TreeBASE.
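
To give a rough feel for what "capturing a name in RDF" might look like, here is a minimal Python sketch that writes a handful of N-Triples for one ant name. It is only an illustration, not the vocabulary from the paper: the LSID (urn:lsid:example.org:names:formica-rufa) and the taxonomy namespace (http://example.org/taxonomy#) are made up, as are the helper names literal and name_to_ntriples. The only existing vocabularies reused are RDF itself and Dublin Core (dc:title).

```python
# Existing vocabularies, reused where possible.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
# Hypothetical namespace standing in for a small taxonomy-specific vocabulary.
TAX = "http://example.org/taxonomy#"


def literal(value):
    """Quote a string as an N-Triples literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def name_to_ntriples(lsid, name, rank, authorship):
    """Return N-Triples describing one taxonomic name identified by an LSID."""
    subject = f"<{lsid}>"
    triples = [
        (subject, f"<{RDF}type>", f"<{TAX}TaxonName>"),
        (subject, f"<{DC}title>", literal(name)),
        (subject, f"<{TAX}rank>", literal(rank)),
        (subject, f"<{TAX}authorship>", literal(authorship)),
    ]
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    # The LSID below is invented for illustration only.
    print(name_to_ntriples(
        "urn:lsid:example.org:names:formica-rufa",
        "Formica rufa",
        "species",
        "Linnaeus, 1758",
    ))
```

Each line of output is one subject-predicate-object statement about the name, which is the sort of metadata a Semantic Web tool could then pick up and link to other records.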