Notes

nonparametric bayes

Reading

Arnqvist, G. 2013 Editorial rejects? Novelty, schnovelty! Trends in Ecology & Evolution 28, 448-9. (doi:10.1016/j.tree.2013.05.007)
Accounting for Complementarity to Maximize Monitoring Power for Species Management. 27, 988-999. (doi:10.1111/cobi.12092)
Nes, E. H. Van, Hirota, M., Holmgren, M. & Scheffer, M. 2013 Tipping points in tropical tree cover: linking theory to data. (doi:10.1111/gcb.12398)
Millar, R. B. 2002 Reference priors for Bayesian fisheries models. , 1492-1502. (doi:10.1139/F02-108)
McAllister, M. K. & Kirkwood, G. P. 1998 Bayesian stock assessment: a review and example application using the logistic model. , 1031-1060.

writing

reorganizing, re-writing in intro.
technical issues:
pandoc 1.12 citation formatting issues? How to use bibtex now?
Makefile add dependency flags. Yay, brilliant.

Informatics / ropensci

Interesting discussion with Dave Bridges over semantic metadata with regard to his data management plan.
Karthik asks for thoughts on sustainability in advance of our meeting this fall.
Scott’s taxize paper is out (Chamberlain, S. A. & Szöcs, E. 2013 taxize: taxonomic search and retrieval in R. F1000Research 191, 1-20. doi:10.12688/f1000research.2-191.v1). Great, but it also appears that F1000 (and everyone else, including both my software papers) does a terrible job with code display in papers. Ought to have a more general open letter to publishers on this (Scott writing blog post on issue – yay). Sent this to F1000Research:

Congrats on publishing https://f1000research.com/articles/2-191/v1 This looks like a promising article. Your support of research containing software, with support for features such as continual updating as the software improves, is an invaluable contribution to a crucial and frequently neglected element of academic research, and your innovative publishing workflow with visible and signed reviewer comments is a welcome addition to the academic publishing landscape.

Overall the presentation of your HTML and pdf manuscripts looks quite strong. (I particularly appreciate the linked pdf and the availability of XML version).

It is most unfortunate that code blocks appear as gifs rather than as text (in PDF, HTML, and XML versions). Being able to copy-paste code chunks is a valuable element to ensure reproducible output of the author’s provided code, and this is severely hampered by the use of the GIFs. Though I can copy the text for the most part from the GIF, certain characters are not ASCII encoded, so that commas, for instance, appear not as plain-text ASCII but as a UTF-8 comma symbol I don’t recognize, and which will not work in code.

As you strive to bring publication into the 21st century, please consider addressing this important issue. All lines of code should be machine-readable in the XML, not images, and should use only the ASCII characters they actually involve.
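
The character problem described in the letter can be checked mechanically. A minimal sketch in Python, assuming the copied code is available as a string; the function name `find_non_ascii` and the sample snippet are hypothetical and only illustrate the idea:

```python
import unicodedata


def find_non_ascii(code):
    """Return (line, column, character, Unicode name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(code.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


# Hypothetical snippet as it might come out of a copy-paste: the comma is
# U+FF0C (FULLWIDTH COMMA) rather than the ASCII comma, so R would reject it.
sample = "x <- c(1\uff0c 2)\ny <- mean(x)"
for hit in find_non_ascii(sample):
    print(hit)
```

Any line reported here would need its flagged characters replaced by their ASCII equivalents before the code could run.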

evolution thinking

After another email on the topic, am assembling thoughts on comparative methods promises and pitfalls. Might be worth fleshing out into proper post / article at some point. A broader version of the narrower opinion I’ve been toying with writing for some time (beyond phylogenetic signal; or phylogenetic signal is not biologically meaningful). Similar statements should be made of rates.

Yes, so it’s a good question of just how meaningful that would be. I believe that assigning some Markovian process to some property associated with a species (e.g. Markov chain to categories, BM/OU to some continuous trait) in general is exciting but potentially dicey even in the context that we already apply the phylogenetic comparative method. Um, summarizing roughly, I think I have 3 concerns. (Take with a grain of salt, I’m probably on the far end of skepticism with regard to this stuff, and not completely up to date these days either).

Applicability. It should be reasonable to think of the trait in question as evolving under the hypothesized model (at least approximately). For instance, Brownian Motion may not seem like a very biological model except for purely neutral processes, but I find it 100% reasonable as an approximation to lots of processes. Other models/statistics in this area I do not – such as Pagel’s lambda, because it treats tips differently from the rest of the tree and obviously evolution has no idea if it is on a tip or not. Without a clear picture of the quantities you have in mind it would be difficult for me to be sure how comfortable I am with it.

Power. As the models get richer, clearly our ability to meaningfully distinguish between them goes down. This isn’t much of a problem in a Bayesian framework, since we just get posteriors that look like our priors and we realize we haven’t learned anything, but it can be more misleading in a frequentist framework (though the method I take in that Evolution paper, based on frequentist methods from Huelsenbeck and others in the 1990s (and really Cox in the 1960s), tries to address that in a frequentist way instead).
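
A toy sketch of the simulation-based idea, in Python. This is not the method from the Evolution paper: it assumes a single time series rather than a phylogeny, uses a discretized OU process (an AR(1) model pulled toward 0) as the richer model, and all function names and parameter values are made up for illustration. The null distribution of the likelihood ratio is built by simulating from the fitted simpler model (BM) and refitting both models to each simulated series.

```python
import numpy as np


def simulate(a, sigma, n, rng):
    """Simulate x[0..n] with x[0] = 0 and x[t+1] = a * x[t] + Normal(0, sigma^2)."""
    x = np.zeros(n + 1)
    noise = rng.normal(0.0, sigma, size=n)
    for t in range(n):
        x[t + 1] = a * x[t] + noise[t]
    return x


def fit(x):
    """Maximum likelihood fits: (sigma2 with a fixed at 1, fitted a, sigma2 with fitted a)."""
    prev, nxt = x[:-1], x[1:]
    s2_bm = np.mean((nxt - prev) ** 2)               # BM: a = 1
    a_hat = np.sum(prev * nxt) / np.sum(prev ** 2)   # OU-like: a estimated
    s2_ou = np.mean((nxt - a_hat * prev) ** 2)
    return s2_bm, a_hat, s2_ou


def lr_stat(x):
    """Likelihood ratio statistic 2 * (loglik_OU - loglik_BM) for Gaussian errors."""
    s2_bm, _, s2_ou = fit(x)
    return (len(x) - 1) * np.log(s2_bm / s2_ou)


def bootstrap_p_value(x, n_sims, rng):
    """Share of series simulated from the fitted BM whose statistic is at least the observed one."""
    observed = lr_stat(x)
    sigma_bm = np.sqrt(fit(x)[0])
    null = np.array(
        [lr_stat(simulate(1.0, sigma_bm, len(x) - 1, rng)) for _ in range(n_sims)]
    )
    return float(np.mean(null >= observed))


rng = np.random.default_rng(1)
x_weak = simulate(0.98, 1.0, 50, rng)    # weak pull toward 0: hard to tell from BM
x_strong = simulate(0.5, 1.0, 50, rng)   # strong pull: easy to tell from BM
p_weak = bootstrap_p_value(x_weak, 500, rng)
p_strong = bootstrap_p_value(x_strong, 500, rng)
print(p_weak, p_strong)
```

Repeating the last step over many series simulated from the richer model gives an estimate of power: the fraction of runs in which the simpler model is rejected.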

Identifiability. Other difficulties are more troublesome. A central tenet of the approach is to assume evolution on independent branches is independent – clearly a bit of a strange assumption when the process is supposed to capture competition between species that might lead to different niches, etc. However, relaxing this assumption (a growing interest recently) is yet more troublesome, since the tree no longer tells us much if traits can be correlated for reasons other than shared evolutionary history. This is just one of many ways we get a stronger problem than power – we have non-identifiable elements in our model, where we can explain the data equally effectively with different representations. (This arises in simpler models too of course).
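
A toy illustration of non-identifiability in one of the simpler cases, not taken from the discussion above. It assumes Brownian motion on a star tree (every tip at the same depth, no shared branches) plus independent measurement error, so each tip is Normal(root, sigma2 * depth + error_var) and only that sum enters the likelihood. The function name and trait values are hypothetical.

```python
import numpy as np


def star_tree_loglik(tips, root, sigma2, depth, error_var):
    """Gaussian log-likelihood of independent tips on a star tree of the given depth."""
    var = sigma2 * depth + error_var   # BM variance and error variance only appear as a sum
    resid = np.asarray(tips) - root
    return float(-0.5 * np.sum(np.log(2 * np.pi * var) + resid ** 2 / var))


tips = [0.3, -1.2, 0.8, 1.5, -0.4]   # made-up trait values

# Two different "explanations" of the same data: fast evolution with small
# error, or slow evolution with large error. Both give a tip variance of 1.5.
fast_evolution = star_tree_loglik(tips, root=0.0, sigma2=1.0, depth=1.0, error_var=0.5)
large_error = star_tree_loglik(tips, root=0.0, sigma2=0.5, depth=1.0, error_var=1.0)
print(fast_evolution, large_error)
```

The two log-likelihoods are identical, so no amount of data from this design can separate the rate from the error variance; more tips only sharpen the estimate of their sum.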