Notes

  • writing manuscript
  • slides
  • data request to Glaser.

gpdd

  • Global Popluation Dynamics Database initiated package rgpdd
  • data extraction – No API, but does provide direct dump in the form of a Microsoft Access mdb database. For the moment I’m using mdbtools: mdb-tables and mdb-extract to convert to CSV and then reading those into R. RODBC package installs but cannot read the database directly for me. Since data will fit into working memory no real reason to support database calls from R? (avoids extra dependencies). data.table should speed internal computations.
  • data compression. Looks like raw data exceeds CRAN’s 5MB limit, but zx compressed binaries it will be less than \(<0.4\) MB. Compressing those from within R: save(table_name, file="filename.rda", compress="xz"). Looks like database is essentially static with updates every few years, see Updates page. Together this makes a good case for distributing data directly in the package.

On distributing the data, note:

“Not all of the data in GPDD is in the public domain; some are provided under licence from the BTO. We require permission before we can distribute some of the data sets. The data in these datasets have been removed from the DATA table in this online version of GPDD, but the metadata in the other tables has been left in place.” https://www3.imperial.ac.uk/cpb/databases/gpdd/structure

Emailed database providers to confirm.

  1. Post, E., Brodie, J., Hebblewhite, M., Anders, A. D., Maier, J. a. K. & Wilmers, C. C. 2009 Global Population Dynamics and Hot Spots of Response to Climate Change. BioScience 59, 489-497. (doi:10.1525/bio.2009.59.6.7)
  2. Knape, J. & de Valpine, P. 2012 Are patterns of density dependence in the Global Population Dynamics Database driven by uncertainty about population abundance? Ecology Letters 15, 17-23. (doi:10.1111/j.1461-0248.2011.01702.x)
  3. Sibly, R. M., Barker, D., Denham, M. C., Hone, J. & Pagel, M. 2005 On the regulation of populations of mammals, birds, fish, and insects. Science (New York, N.Y.) 309, 607-10. (doi:10.1126/science.1110760)

Notebook infrastructure

  • ugh should implement script to transform DOIs to links+metadata. probably knitcitations based, but without needing the verbose r citep(" notation. pity markdown doesn’t support rel attribute on links. Maybe possible already? https://drupal.org/node/421832 No luck there, though pandoc can avoid choking on this style of link attribute: https://github.com/jgm/pandoc/wiki/Pandoc-vs-Multimarkdown#image-and-link-attributes when compiling with -f markdown+link_attributes. Unfortunately this doesn’t actually render the attributes but just silently drops them.

  • Adjust link to provide turtle instead of RDF/XML (avoid the download prompt)?

  • I mentioned my involvement started with this post; the (somewhat long) comments are useful.

  • I should have mentioned that recently attended a workshop where environmental scientists were thinking about how to approach these issues related to software and code through the creation of a Software Sustainability Center. (Of which code review could be one possible service it might provide).

  • I might have mentioned that a few organizations are doing code review. Notably, the journal of open research, published by the new UK Software Sustainability Institute. They are one of the few organziations that have code review guidelines

  • Code review was recently proposed and disputed in a Policy Forum piece in Science Magazine discussing these issues.

  • This Times Higher Ed article recently highlighted the issues with Climate code: https://www.timeshighereducation.co.uk/news/save-your-work-give-software-engineers-a-career-track/2006431.article The Climate Code Foundation https://climatecode.org/ is doing excellent work to start addressing these issues.

  • Of course several earlier articles in Nature have highlighted these issues, including the important tension between encouraging scientists to share code even if it isn’t “pretty”, vs improving the quality / readability of code.