Scratch Notes While Outlining Talk

Scratch notes for my seminar

Part I: Tipping Points: Early Warning signals and decision theory


Global change threats. So what do we do?

This is a pendulum. easy to predict.
Not a pendulum. Oscillations not easy to predict. Crashes

Tipping points



No equation for $f_i(s,h)$

Everything matters.

Data is good, getting better.

Crushing our data under parameters.  Stop throwing out the data.


Early Warning signals


Optimal Control problem -- Model the decision process! Action space!

Model choice sucks.

Nonparametrics work.


Why an informatics approach?

What is Ecoinformatics?


 Theory      |     Data
-------------|-------------
 Algorithms  |  Informatics




50 years ago Ecology Golden (Folk) Age / Peter, Paul & Mary: theory had little need of good algorithms
Today the models, data, and statistics are more numerous and more complex. An algorithmic and computational component is now essential.

Today's data needs similar infrastructure

Bioinformatics, ATGC, vertically integrated.

No one need be told we are awash in data today. Yet it is not the data we have but the questions we face which demand this approach.

Let us recall how utterly desperate the warning signal idea is: an extrapolation, not based on averages but on the noise itself, and under the very assumption that everything is about to change suddenly and extrapolation should be impossible.
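
To make "based on the noise itself" concrete, here is a minimal sketch, assuming the indicators in question are the usual ones (variance and lag-1 autocorrelation in a sliding window). The AR(1) simulation, its coefficients, and the window size are hypothetical choices for illustration, not results from any real system.

```python
import numpy as np

def simulate_slowing_down(n=2000, seed=0):
    """AR(1) series whose coefficient drifts from 0.2 to 0.95 (hypothetical values),
    mimicking a system that recovers more and more slowly from perturbations."""
    rng = np.random.default_rng(seed)
    phi = np.linspace(0.2, 0.95, n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi[t] * x[t - 1] + rng.normal(scale=0.1)
    return x

def rolling_indicators(x, window=400):
    """Variance and lag-1 autocorrelation within each sliding window."""
    variances, autocorrs = [], []
    for start in range(len(x) - window + 1):
        w = x[start:start + window]
        variances.append(w.var())
        autocorrs.append(np.corrcoef(w[:-1], w[1:])[0, 1])
    return np.array(variances), np.array(autocorrs)

x = simulate_slowing_down()
var, ac = rolling_indicators(x)
print(f"variance:        first window {var[0]:.4f}, last window {var[-1]:.4f}")
print(f"autocorrelation: first window {ac[0]:.2f}, last window {ac[-1]:.2f}")
```

Both indicators rise here only because the simulation was built to slow down; whether a rise in real data means anything is exactly the hard part.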

Recall, too, that we seek to manage a fishery without knowing the dynamics, not only of one species but of the whole ecosystem that feeds it and eats it, amid a variable and changing climate and socio-economic winds every bit as daunting, across scales of space, time, and ecology.


(sequentially add...)
Fishbase.  Treebase.  GBIF

rOpenSci Story

- Shared challenge ~> collaboration
- collaboration ~> network, developer collective rOpenSci
- Building bridges, teaching tools. Rapid growth. Sloan


- next challenges: from vertically integrated to metadata-driven. Better leverage your own data.
(out of scope? or some data management needed?  differentiate from future work proposal)


(I think we can safely define vertically integrated and metadata driven database without "giving away" all future research proposals...)

Risk in doing this: Prosecutor's fallacy.
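
A minimal sketch of the fallacy, assuming it refers to reading P(signal | collapse) as if it were P(collapse | signal). All three probabilities below are hypothetical numbers chosen only to show the arithmetic.

```python
def posterior_collapse(prior, p_signal_given_collapse, p_signal_given_stable):
    """Bayes' rule: probability of collapse given that a warning signal was seen."""
    evidence = (p_signal_given_collapse * prior
                + p_signal_given_stable * (1 - prior))
    return p_signal_given_collapse * prior / evidence

# Hypothetical numbers: collapses are rare, the signal is fairly reliable.
p = posterior_collapse(prior=0.01,
                       p_signal_given_collapse=0.9,
                       p_signal_given_stable=0.1)
print(f"P(signal | collapse) = 0.90, but P(collapse | signal) = {p:.3f}")
```

With these made-up inputs the posterior is about 0.083: most systems showing the signal are still stable ones, simply because stable systems are so much more common.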

A mandate to try.  Just as policy must be made on the best available science, science must be made on the best available data.





Please don't show this, don't think this:

   Math     |     Ecology
------------|------------------
 Statistics | Computer Science


What does it do?

Access / Discovery. (Data Mining?)
Manipulation
Management

Modular: Reusable, Interoperable components

archiving. metadata.


-----------------------------------------------------------------------

Who has big data already? Who needs you?
The pitch to those not using external data
The pitch to experimental scientist

---------------------------------------------


- Fix figures.  Myers -> Maynard-Smith


------------------------------------------------------------------

What is Ecoinformatics?
=======================


An analogy to computation
-------------------------

Ecoinformatics : data :: computation : models & theory.

We don't need a computer to do ecology.  But it sure helps.

Statistics.  Visualization.


On the transition to ecoinformatics: another analogy
----------------------------------------------------

Having a digital representation of our data, rather than a handwritten
one in a paper notebook, has forever changed how we do science. That is
the promise of ecoinformatics.

Digital data could be backed up, shared, and manipulated with ease.
Most importantly, it provided relatively seamless integration with
all kinds of tools, while minimizing the potential to introduce human
error. Despite the improvement, archiving, sharing, and manipulating
digital data isn't always effortless.  Adapting, formatting, and
manipulating data for analysis is anything but easy or foolproof.
Ecoinformatics seeks to address each of these challenges: archiving,
sharing, manipulation, interoperability, automation, and the reduction
of human error.

How do we do that? What does it look like?


Platform.

We're familiar with platforms for statistics. We're familiar with platforms for data in other contexts: Google Maps. Search. Airline flights.

Yeah, but what does it look like for ecology?

- Vertically integrated databases.  (literally - same column)
- Metadata-driven databases.
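
A minimal sketch of the contrast between the two, under one reading of the terms: "vertically integrated" means every source already shares the same columns, and "metadata-driven" means each dataset keeps its own column names and ships a description of what they mean. The column names, records, and the `harmonize` helper are all hypothetical.

```python
# Vertically integrated: every source uses the same columns,
# so records from different sources simply stack.
SHARED_COLUMNS = ("species", "year", "count")
source_a = [("Gadus morhua", 2001, 120)]
source_b = [("Gadus morhua", 2002, 95)]
stacked = [dict(zip(SHARED_COLUMNS, row)) for row in source_a + source_b]

# Metadata-driven: the dataset keeps its own column names and carries
# metadata mapping each one onto a shared vocabulary.
dataset = {
    "metadata": {"sp": "species", "yr": "year", "n": "count"},
    "rows": [{"sp": "Gadus morhua", "yr": 2003, "n": 80}],
}

def harmonize(dataset):
    """Rename each row's keys using the dataset's own metadata."""
    mapping = dataset["metadata"]
    return [{mapping[key]: value for key, value in row.items()}
            for row in dataset["rows"]]

combined = stacked + harmonize(dataset)
for record in combined:
    print(record)
```

The first style needs agreement on columns up front; the second defers that agreement to the metadata, which is what lets you fold in your own data.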