bssw.io's Introduction


What is Better Scientific Software?

Better Scientific Software is an organization dedicated to improving developer productivity and improving software sustainability for computational science and engineering (CSE).

This repository provides source material for the Better Scientific Software BSSw.io web portal. Better Scientific Software (BSSw) community members can contribute content using standard GitHub tools and processes. Contributions can be made via:

  • Web browser editing: For many people (even BSSw project members), this is probably the preferred way. GitHub provides a nice web editor for Markdown.
  • Cloning: If you have push access, you can clone and commit to this repository. This approach could be best for remote editing and activities that span across multiple source files.
  • Forking: This option is like cloning, but works for anyone. You can make edits to your own forked copy of the repo, either in a browser or from a local repository. Contributions are submitted to BSSw by using a pull request.

For details see our What To Contribute and How To Contribute pages.

Please note that BSSw.io has a Code of Conduct. By participating in the BSSw.io community, you agree to abide by its guidelines.


What is the BSSw.io Editorial Space?

The BSSw.io Editorial Space website hosts documentation related to BSSw.io content authoring as well as editorial review processes.


bssw.io's People

Contributors

adubey64, bartlettroscoe, bernhold, brnorris03, clararaubertas, curfman, danielskatz, davidbernholdt, elaineraybourn, fnrizzi, gonsie, gpieper, haikudeb, hartwiganzt, hnamlanl, jarrah42, jmgate, karbarz, madhusb, maherou, markcmiller86, oamarques, pagrubel, prwolfe, rinkug, ritua2, sbxchicago, shuds13, tscheibe, vahi

bssw.io's Issues

Add curated content on how to grow software projects in DOE/National Labs

DOE doesn't have sustainability initiatives focused on long-term software products (that I know of). AFAIK larger projects are mostly programmatic and depend mainly on sometimes short-term research funding.

Make a how-to on growing software projects within the DOE. Ideas:

  • discretionary funding sources at the labs
    • often this is reserved for maintenance/sustaining existing capabilities, NOT research
  • LDRD, ASCR, other traditionally research-focused funding sources
  • SBIR
  • subcontracts, and companies that can help (Kitware, Krell, others)
    • can be cheaper than hiring within the labs
  • communication channels for advertising software within DOE
    • who to talk to, where to present so that people find out about software
    • potential focus area for facility liaisons -- spreading the word.
  • software release and licensing at the labs
    • who to talk to, what to be aware of when releasing software at the labs
    • potential link for how to choose a license.
  • How to lower barriers for external DOE contributors.

I'd like to work with others on this as I don't have the internal perspective on labs other than LLNL.

Categories: Collaboration

Need 1- and 2-line descriptions for the topics within the Collaboration category.

Mini-WhatIs will also be required, but are not covered in this issue.

Note that #51 covers the licensing topic specifically as a pilot.

Categories: Planning

Need 1- and 2-line descriptions of the topics in the Planning category.

Mini-WhatIs are also needed for all topics, but are not covered in this issue.

Categories: Cross-Cutting

Write 1- and 2-line descriptions for all topics in the Cross-Cutting category.

Mini-WhatIs are also needed, but are not covered by this issue.

How to estimate operational intensity

The operational intensity introduced in the Roofline model -- operations per byte of DRAM traffic -- is a simple model that can be used to determine what architectures are the best match for a given computational kernel, or conversely, in what ways to optimize a kernel so it performs better on a given architecture. Operational intensity is not typically provided directly by performance tools but can be estimated from other readily available measurements.
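As a rough illustration of the estimate described above, the following Python sketch computes operational intensity from an operation count and a DRAM traffic count, then applies the Roofline bound. The function names, the vector-update kernel, and the machine numbers are all hypothetical, and in practice the two counts would come from measurements rather than from hand counting.

```python
def operational_intensity(flops: float, dram_bytes: float) -> float:
    """Return operations per byte of DRAM traffic."""
    if dram_bytes <= 0:
        raise ValueError("dram_bytes must be positive")
    return flops / dram_bytes


def attainable_gflops(intensity: float, peak_gflops: float, peak_bw_gbs: float) -> float:
    """Roofline bound: the lesser of peak compute and intensity * bandwidth."""
    return min(peak_gflops, intensity * peak_bw_gbs)


# Hypothetical kernel: double-precision update y[i] = a * x[i] + y[i].
# Hand-counted per element: 2 operations, 24 bytes (read x, read y, write y).
n = 1_000_000
oi = operational_intensity(flops=2 * n, dram_bytes=24 * n)

# Hypothetical machine: 1000 GFLOP/s peak compute, 100 GB/s DRAM bandwidth.
bound = attainable_gflops(oi, peak_gflops=1000.0, peak_bw_gbs=100.0)

print(f"operational intensity: {oi:.4f} operations/byte")
print(f"attainable performance: {bound:.2f} GFLOP/s")
```

With these assumed numbers the bound is set by memory bandwidth rather than peak compute, which suggests that reducing DRAM traffic would help this kernel more than adding arithmetic throughput.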

Add new curated links file: Ways to publish your software

This page, or aggregate page, would contain curated links to journals and similar mechanisms for making your software citable as academic literature and getting citation-based credit for scientific software development. A preliminary list of options follows, whose descriptions may need improvement by people who are more familiar with each journal:

  • TOMS (ACM Transactions on Mathematical Software): This is a well-established journal whose articles often describe novel algorithms and their implementation as mature, usable software products. It has also pioneered policies to improve the reproducibility of published research.
  • TOMACS (ACM Transactions on Modeling and Computer Simulation): Another well-established journal, which deals more with applications, their impact and results, as well as their methodology (e.g. Verification & Validation).
  • JSS (Journal of Statistical Software): Like TOMS, but with a focus on software which implements statistical methods rather than other mathematical modeling topics.
  • SoftwareX: An Elsevier journal which aims to ensure software is cited and gets credit in the literature. This journal accepts submissions regarding software that is used in any of a wide range of disciplines, from mathematics to the sciences and humanities.
  • JOSS (The Journal of Open Source Software): This journal provides authors with a DOI for their software package without requiring a full-length manuscript. Instead, authors must demonstrate (via a form of peer review) that their package follows certain best practices of open-source software, including proper licensing and documentation, and helps meet scientific research challenges.
  • Zenodo: Like JOSS, Zenodo can provide a DOI for your software. Unlike JOSS, it does not require a review of the software, and can generate a DOI for each release of your package via GitHub integration. Zenodo also allows users to upload data, and obtain a DOI for their data, while also acting as a hosting/distribution platform for others to access that data.

Create sample blog post

Create a file in the format of a blog post, containing all metadata expected to be used for a blog post.

Guide to improving reproducibility in scientific software

Author : @oamarques
EB member: Rinku
There are quite a few groups working on reproducibility in science, with a focus on scientific software. However, there doesn't seem to be much coordination between them, nor any obvious place to go for a guide on how to get started or what the best practices are. This page could provide a starting point for scientists and developers interested in improving this aspect of their work. It could provide links to the broader community, as well as a survey of current best practices and links to get people started.

Write brief article "Better Testing: Start Today"

Better Testing: Start Today

Many teams are concerned about testing, but find it hard to cover existing code and do not have enough resources.
Instead, resolve to cover new functionality with tests.
From now on:

  • No source contributions without tests.
  • A source checkin without test coverage is a fault. It can and should be reported by anyone.
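As a minimal sketch of what covering new functionality with tests can look like, the following Python example pairs a newly contributed function with its tests in the same change. The function `relative_error` and the test names are hypothetical, chosen only for illustration.

```python
import unittest


def relative_error(computed: float, reference: float) -> float:
    """Hypothetical new functionality being contributed."""
    if reference == 0.0:
        raise ValueError("reference must be nonzero")
    return abs(computed - reference) / abs(reference)


class RelativeErrorTest(unittest.TestCase):
    """Tests that arrive in the same contribution as the new function."""

    def test_exact_match_has_zero_error(self):
        self.assertEqual(relative_error(2.0, 2.0), 0.0)

    def test_known_value(self):
        self.assertAlmostEqual(relative_error(1.1, 1.0), 0.1)

    def test_zero_reference_is_rejected(self):
        with self.assertRaises(ValueError):
            relative_error(1.0, 0.0)


# Run the tests programmatically so the result can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RelativeErrorTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Existing untested code is left alone here; only the new function is required to ship with tests.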

Add curated content on sustaining open source projects

There are a whole lot of other organizations already working on ways to sustain open source projects, scientific and otherwise. BSS should link to and leverage these efforts.

This would add a page to the site pointing to a number of recent key works on software sustainability, including:

  • GitHub's Open Source Guides: opensource.guide

  • reports, e.g. Nadia Eghbal's Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure

  • foundations/non-profits devoted to OSS, e.g.:

    • NumFOCUS: sustaining open source projects for data science
    • Linux Foundation
    • Ford Foundation
    • Sloan Foundation
  • recent NSF software sustainability efforts

  • Information on similar efforts in DOE (This site? Others if they exist?)

    • this could grow into a separate howto on how to grow OSS projects at national labs. (funding sources, who to talk to, etc.)
  • others?

Categories: Individual Productivity

Need 1- and 2-line descriptions for the topics within the Individual Productivity category.

Mini-WhatIs are also needed, but are not covered in this issue.

Create sample announcement

Create a file in the format of a sample announcement, containing the metadata an announcement should have.

Add curated: TDD survey paper

Curated pointer to a journal article: Aziz Nanthaamornphong, Jeffrey C. Carver (2015), Software Quality Journal, p. 1-30, Springer US, doi:10.1007/s11219-015-9292-4

Using GitHub Projects feature for Kanban for BSS?

This is the first project where I have really used the GitHub Projects feature for implementing a Kanban process (though I did play with it a little on other projects and was not impressed). While I understand the benefits of using a native GitHub tool for implementing Kanban, I have to say it leaves a lot to be desired. I will not compare GitHub Projects to JIRA (there is no comparison: JIRA blows GitHub Issues and GitHub Projects out of the water in every way for project planning and issue tracking, but JIRA is a commercial tool). Instead, I will compare it to waffle.io, another free tool that implements Kanban using GitHub issues; it is used for Trilinos and TriBITS, which I have used a lot.

First the advantages of using the GitHub Projects feature over waffle.io:

  • With GitHub Projects, you can associate a single GitHub Issue or PR with more than one Kanban board, and it can be in a different stage on each of those boards. (But this may not actually be an advantage, because I don't see any utility for this compared to what you can do with filters on labels in waffle.io.)

Second, the really irritating aspects of GitHub Projects:

  • You have to switch to the Project page and then manually add a new "Card" (which is an Issue or PR, of course) to add an Issue or PR to one of the Project stages. This is a slow workflow.
  • The drop-down to add a new card does not let you just put in the Issue or PR number (i.e. #1234). Instead, it makes you search based on a name and drag the cards over. This is a slow workflow.

Now for the main advantages of waffle.io over GitHub Projects that I am noticing the most:

  • Using waffle.io, one can assign and change the Kanban stage right in the GitHub issue using a label. With waffle.io, one only needs to view the Kanban board when one wants to.
  • One can define various filters for a waffle.io Kanban board based on labels, assignee, milestones, issues and/or PRs. With GitHub Projects, you can't filter on anything. (The ability to create filters makes it unnecessary to support multiple Kanban boards the way GitHub Projects does.)
  • Because labels are used to represent Kanban stages with waffle.io, you can create quick GitHub queries for "ready" or "in progress" issues, for example, or any other type of query supported by GitHub for Issues. (Ironically, you can't search for GitHub issues by their GitHub Project or the stage in that GitHub Project.)

So while the GitHub Projects feature for implementing a Kanban process and Kanban board may get better over time, currently it is pretty bad and I don't see why anyone would use it while waffle.io is available.

So why is BSS using GitHub Projects instead of waffle.io?

Anyone want to debate this?

CC: @maherou, @curfman, @jwillenbring

Improve customer confidence in your updates

When a customer updates to a new version of your software, changes are not just about new features, but often (perhaps mostly) include improvements to existing capabilities.

When a customer is integrating your latest version, they are looking for changes in behavior. Changes include timing differences and changes in input requirements and output data. In HPC software, changes in output can be common, especially with floating-point computations, where differences in the order of operations can produce correct but different results.

In these situations, customers don’t necessarily mind that results have changed, but they want to know that the change is expected, not the result of a regression.
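To illustrate how order of operations alone can change floating-point output, here is a minimal Python sketch that adds the same three values in two different orders. The helper name `sum_in_order` and the tolerance are assumptions made for this example; an appropriate tolerance depends on the application.

```python
import math


def sum_in_order(values):
    """Accumulate values strictly left to right."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]

forward = sum_in_order(values)             # (0.1 + 0.2) + 0.3
backward = sum_in_order(reversed(values))  # (0.3 + 0.2) + 0.1

# The two results differ in the last bits, yet both are acceptable answers.
bitwise_equal = forward == backward
within_tolerance = math.isclose(forward, backward, rel_tol=1e-12)

print(f"forward  = {forward!r}")
print(f"backward = {backward!r}")
print(f"bitwise equal: {bitwise_equal}, within tolerance: {within_tolerance}")
```

A user comparing outputs bit for bit would see a difference here, which is why it helps to state in advance that such differences are expected and how large they may be.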

Improve customer confidence in your update by considering the following:

  • Create an issue in your database (e.g., a GitHub or JIRA issue) for the feature and give it a label indicating that the feature may change software behavior from the user’s perspective.
  • Notify known users of the change prior to release.
  • Document any changes that result in different behavior from your software.
  • Describe in release notes what kind of behavior change can be expected.
  • Provide users with an option to restore previous behavior (e.g., via a runtime or compile time parameter).
  • Include performance differences, even if the changes are improvements.
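One of the suggestions above, an option to restore previous behavior, might be sketched as follows in Python. Everything named here is hypothetical: the function `total`, its `legacy` argument, and the environment variable `MYLIB_LEGACY_SUM` are invented for illustration, and the "new" behavior is simply assumed to be a switch to `math.fsum`.

```python
import math
import os


def total(values, legacy=None):
    """Sum values, with a switch to restore the pre-update behavior.

    legacy=None defers to the hypothetical MYLIB_LEGACY_SUM environment
    variable, so users can opt back in without changing their code.
    """
    if legacy is None:
        legacy = os.environ.get("MYLIB_LEGACY_SUM", "0") == "1"
    if legacy:
        # Previous behavior: plain left-to-right accumulation.
        result = 0.0
        for v in values:
            result += v
        return result
    # New behavior: accurately rounded summation.
    return math.fsum(values)


data = [0.1, 0.2, 0.3]
old_result = total(data, legacy=True)
new_result = total(data, legacy=False)
print(f"legacy: {old_result!r}, new: {new_result!r}")
```

A user who needs to reproduce earlier output can set the flag while validating the update, then remove it once the new results are accepted.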

Some sources for behavior change:

  • Performance optimizations for vectorization: Vectorization represents one of the current commodity performance improvement curves. The number of simultaneous operations a processor can perform (as either SIMD or SIMT) continues to increase as a resource for concurrency. Introducing vector operations into your code, directly or through compiler transformations, will result in floating-point result differences, including differences from one architecture to the next.
  • Reordering of irregular (gather/scatter) computations for better performance: Changes in the order of irregular computations can improve cache utilization and reduce memory bandwidth requirements, leading to better performance. These changes also lead to floating-point result differences.
  • Changes in heuristics for automatic parameter settings: Many algorithms are tunable, able to exploit problem details to improve robustness, reliability or performance. Automatic parameter setting can improve software usability by reducing how many details the user needs to explicitly manage. Improved heuristics, often derived from customer use, can lead to changes in behavior, even though the change is an improvement.

Categories: Reliability

Need 1- and 2-line descriptions of the topics in the Reliability category.

Mini-WhatIs are also needed for all topics, but are not included in this issue.

Using "docker" or other container technology for research software

We've been using docker/containers for quite a while now, and they are very good at encapsulating complex packages with their dependencies, for use on any Linux/x86 system. They can also be used for continuous integration/automated testing, and will be important for HPC too.

I could write an article, if it would be useful.

Categories: Performance

Need 1- and 2-line descriptions of the topics in the Performance category.

Mini-WhatIs are also needed for all topics, but are not covered in this issue.
