ISEES software lifecycle and components workshop, Day 1

Range from large projects to the long tail.

  • statistical analyses
  • One-off models
  • community model
  • Services (Blast)

Science Challenges

Outcomes of previous community workshops

18 Grand Challenge areas.

Highlights: Supermodels. Human/biophysical systems shape and are shaped by water availability. (A vision similar to Microsoft’s “Model all Earth”, Purves et al.)

Functions and services areas:

(priority ordering)

  • Computational training (early career - all career)
  • Assimilation and QA/QC tools
  • Collaborative environment
  • decisions and workflows
  • software discovery
  • consultants / collaborator
  • community hub to converge on standards
  • merging disparate tools
  • user-friendly interfaces
  • multiscale coupled modeling framework
  • Software vetting (how about paper vetting?)

Software Systems and Lifecycle

Innovation -> incubation/integration -> product development/hardening -> maintenance and support


  • What is and is not out of scope?
  • Models of success (ESIP grassroots conception of problems). NCEAS working groups. NESCent hackathons.
  • Models of failure

NSF has funded 16 Visions. Many are computer science efforts (compilers, language integration etc). 3 proposals are domain focused with close overlap: ISEES, Long Tail, and Water Science. These need to be aligned for us to succeed.

How about existing stuff:

  • ESIP? Lots of overlap.
  • EarthCube: large top-down effort. Geo directorate, not Cyberinfrastructure (OCI) anyway, so overlap matters less. However, funding chances are better when a proposal draws across directorates. (tap BIO).

Those efforts address integration in general, not software sustainability issues in science.


Software: huge range! Goals differ across products! Incremental improvement over the paper-only model.

Sustainability and Adoption

Town Hall ESIP (Earth Science Information Partners)

NCEAS for software. Instead of showing up with spreadsheets and trying to combine them, we show up with software. We add nubs.

  • Science as Lego. Papers: blocks. without the nubs. Easy to build tall towers. Fragile.
  • Data as Lego. Metadata as nubs.
  • Software as Lego. What are the nubs?

small. unit tests. versioned. dependencies. accessible language and style. Licenses.
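The list of nubs above can be read as a simple checklist. The following is a minimal sketch of that idea, not something presented at the workshop: the `Nubs` class, its field names, and the `missing_nubs` helper are all hypothetical names chosen for illustration, and each field is just a yes/no answer supplied by a reviewer.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class Nubs:
    """Hypothetical checklist; one boolean per nub named in the notes."""
    small: bool = False
    unit_tests: bool = False
    versioned: bool = False
    dependencies_declared: bool = False
    accessible_style: bool = False
    licensed: bool = False


def missing_nubs(nubs: Nubs) -> list[str]:
    """Return the names of the nubs a project still lacks, in checklist order."""
    return [f.name for f in fields(nubs) if not getattr(nubs, f.name)]


# Illustrative project: tested, versioned and licensed, but nothing else yet.
example = Nubs(unit_tests=True, versioned=True, licensed=True)
print(missing_nubs(example))
```

A project with every field set to True returns an empty list, which is one possible reading of "has all its nubs".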

Software Lifecycle Session

Comparison: Commercial, Open Source, Academic Lab.

Note this isn’t software development lifecycle, which would occur within any of these blocks.

Goals:

  • Issues that occur (strengths, weaknesses) in the lifecycle

Services, Activities and Functions

  • Software quality assessment
  • Training
  • Hackathons
  • Archiving

Add nubs. Good development makes for faster development.

User/developer access

Documentation. Versioning. Dependencies


Afternoon breakout

Brainstorm Services, Activities, Functions.

Assist groups in disseminating and hardening software.

Successful academic software is frequently a victim of its own success. Its developers cannot keep up with support and bug requests.

  • Strategies for dissemination and maintenance.
  • Commercialization
    • Connections
    • test data and use cases. (Because open source and commercial often don’t meet academic goals because they don’t have access to the data or questions!)
  • Open Source Release
    • Strategy template: Host repository, contribution guidelines
    • Mentorship for managing development
    • Language and style accessible to community
    • Hackathon for engaging community for future development
  • Hardening of software
  • Written product: Objectives checklist: unit tests, RESTful APIs
  • Meeting space: hackathon for hardening
  • Funding: e.g. for a student Summer of Code style for hardening

Software review and certification

  • tied to software lifecycle
  • Tiered system. e.g. level 3 is external hands-on testing by experts.
  • Couple to publication of software papers or separate model?
  • Scoring system analogous to repository scoring.

Computational career path. // (me): Doesn’t have to be - phylogenetics research driven by computational side.

Leveraging social media, hashtags, Stack Overflow for community

Synthesizing Functions

  1. Assist Groups in Maintenance (CS)
  2. Assist Groups in Hardening (CS)
  3. Software Review and Certification Program (OS) (ASSESSMENT)
  4. Mechanisms for sharing data / use cases (OS)
  5. Operate community-based support services (CB) // no, these discussions happen in native environments.
  6. Provide software use and quality metrics (OS) (ASSESSMENT)
  7. Provide best practices boot camp (T) (BEST PRACTICES)
  8. On site and remote course training (T) (BEST PRACTICES)
  9. The Dating Game - match researchers with engineers (CB)
  10. Support collaborative groups focused on software projects (CB)
  11. Advocate for open source approach (A)
  12. Develop policies to incentivise sharing and licenses (A)
  13. Promote software management plans to NSF etc (A)
  14. Consultation and mentoring services (CS)
  15. Directory for discovery (citation, evaluation) (OS)
  16. Kickstarter funding for software (OS)
  17. Connect industry to academia (CB)
  18. Certify papers as reproducible (RunMyCode service) (OS)
  19. Assist groups in making extensible (CS)
  20. ISEES swag (CB)
  21. Actively survey the community about practices. (OS)
  22. Assist groups in archiving and decommissioning. (CS)
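
To see how the 22 functions spread across the category codes, the list above can be tallied by its first parenthesized tag. This is only a sketch for reading the notes: the codes (CS, OS, CB, T, A) are treated as opaque labels because the notes do not expand them, the secondary labels (ASSESSMENT, BEST PRACTICES) are left out, and `FUNCTION_TAGS` and `group_by_tag` are hypothetical names.

```python
from collections import defaultdict

# Item number -> category code, transcribed from the numbered list above.
FUNCTION_TAGS = {
    1: "CS", 2: "CS", 3: "OS", 4: "OS", 5: "CB", 6: "OS", 7: "T", 8: "T",
    9: "CB", 10: "CB", 11: "A", 12: "A", 13: "A", 14: "CS", 15: "OS",
    16: "OS", 17: "CB", 18: "OS", 19: "CS", 20: "CB", 21: "OS", 22: "CS",
}


def group_by_tag(tags):
    """Group item numbers by category code, keeping numbers in ascending order."""
    groups = defaultdict(list)
    for number, tag in sorted(tags.items()):
        groups[tag].append(number)
    return dict(groups)


# Print one line per category code with its item count and item numbers.
for tag, numbers in sorted(group_by_tag(FUNCTION_TAGS).items()):
    print(f"{tag}: {len(numbers)} items {numbers}")
```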