Keemei

Validate tabular bioinformatics file formats in Google Sheets

Keemei (canonically pronounced key may) is an open source Google Sheets add-on for validating tabular bioinformatics file formats, including QIIME 2 metadata files.

To get started using Keemei, visit keemei.qiime2.org.

Citation

If you use Keemei for any published research, please include the following citation:

Keemei: cloud-based validation of tabular bioinformatics file formats in Google Sheets. Rideout JR, Chase JH, Bolyen E, Ackermann G, González A, Knight R, Caporaso JG. GigaScience. 2016;5:27. http://dx.doi.org/10.1186/s13742-016-0133-6

The Keemei paper is available at the DOI listed above.

Licensing

Keemei is available under the new BSD license. See LICENSE for Keemei's license.

Keemei uses and distributes Moment.js, available under the MIT license. See licenses/Moment.js.txt.

Credits

Keemei is a QIIME 2 project developed by the Caporaso Lab. See the full list of Keemei's contributors on GitHub. Keemei was originally developed by Jai Ram Rideout (@jairideout) in the Caporaso Lab. Keemei's logo was created by John Chase (@johnchase).

keemei's Issues

Prevent sorting of only *some* columns?

Not sure what the Google Sheets add-ons API permits, but @rob-knight pointed out a frequent problem: people sort the rows of their mapping files but accidentally do not select all the columns, resulting in "scrambled" metadata files.

add About dialog

Add an About dialog that contains info about the add-on's functionality and the file formats it validates. This feature was requested in the review performed by Google.

validate empty cells correctly

Empty cells are currently marked as invalid for the wrong reasons. Specific checks should be put in place for empty cells.
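
As a rough illustration of what a dedicated empty-cell check might look like, here is a minimal sketch. It assumes the sheet contents are available as a 2D array of cell values (the shape that `Range.getValues()` returns in Apps Script). The name `findEmptyCells` and the decision to treat whitespace-only cells as empty are assumptions for illustration, not Keemei's actual code.

```javascript
// Hypothetical helper, not part of Keemei.
// `values` is assumed to be a 2D array: an array of rows, each an array of cells.
function findEmptyCells(values) {
  const empty = [];
  values.forEach((row, r) => {
    row.forEach((cell, c) => {
      // Assumption: cells containing only whitespace count as empty.
      const isEmpty =
        cell === null || cell === undefined || String(cell).trim() === "";
      if (isEmpty) {
        // Report 1-based positions, matching how Sheets numbers rows and columns.
        empty.push({ row: r + 1, column: c + 1 });
      }
    });
  });
  return empty;
}
```

Running this check before the other rules would let empty cells get their own message instead of failing an unrelated rule.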

add sidebar for listing errors and warnings

Adding a sidebar that lists the various errors and warnings would be really helpful for larger spreadsheets where it's hard to find all problems simply by looking at the cells. See the sidebar documentation. Would be awesome if clicking an error/warning focused on the offending cell in the spreadsheet.

clean up Header.gs

Header.gs is pretty messy, and a lot of the code could be refactored to use pieces in Base.gs and Column.gs.

support numeric data

Numeric cells currently throw an error during validation. They need to be cast to strings first.
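
A minimal sketch of the cast described above, assuming raw cell values may be numbers, booleans, strings, or empty. `cellToString` and `rowToStrings` are hypothetical names, not functions from Keemei.

```javascript
// Hypothetical helpers, not part of Keemei.
// Converts a raw cell value to a string so string-based checks do not throw.
function cellToString(value) {
  if (value === null || value === undefined) {
    return "";
  }
  return String(value);
}

// Applies the cast to every cell in a row.
function rowToStrings(row) {
  return row.map(cellToString);
}
```

Note that `String(value)` reflects the raw value, not the formatting shown in the sheet. In Apps Script, `Range.getDisplayValues()` returns the displayed strings directly and may be a better fit, depending on which representation should be validated.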

add website

Current docs are on the repo's wiki. It'd be better to create a dedicated website as the add-on's Help menu will direct users to this site.

Required if add-on is accepted for publication in the web store.

abstract out the rules for different validators?

Improvement Description
Would it be possible to abstract out the rules so that we don't have to define them twice? For example, we might want to build a QIIME 2-based (i.e., Python) validator that works without Google access, e.g., while working on a plane or somewhere access to Google is blocked, such as parts of China or DoD/USDA facilities. Ideally, the abstracted rules could also be used for auto-generating documentation.

improve performance

Validation is pretty slow for larger sheets. Look into cutting down the number of Google API calls -- these are what's slowing it down. Profile via View -> Execution transcript.

add author info to About dialog

@gregcaporaso suggested adding a link to the lab website in the About dialog, along with info about who developed the tool. This info should also be added somewhere on the "website" (currently GitHub wiki) and readme.

add install instructions

Add instructions that show how to install the plugin. Need to cut a release first and make it a real Google Sheets "add-on".

support validation of currency-formatted numbers

A negative number formatted as a currency is marked as an empty cell. Example: (42.45)

Since we're using a third-party library (SheetConverter) for these conversions, this issue may need to be raised on their issue tracker.
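
For reference, the accounting convention in the example wraps a negative amount in parentheses. The sketch below shows one way such text could be read as a negative number. It is a standalone illustration under stated assumptions (only `$`, commas, and whitespace are stripped), `parseAccountingNumber` is a hypothetical name, and it says nothing about how SheetConverter handles these cells.

```javascript
// Hypothetical helper, not part of Keemei or SheetConverter.
// Returns a number, or null if the text cannot be read as one.
function parseAccountingNumber(text) {
  const trimmed = String(text).trim();
  // Accounting style: "(42.45)" means -42.45.
  const match = /^\((.*)\)$/.exec(trimmed);
  const body = match ? match[1] : trimmed;
  // Assumption: only "$", thousands separators, and whitespace need stripping.
  const cleaned = body.replace(/[$,\s]/g, "");
  if (cleaned === "" || Number.isNaN(Number(cleaned))) {
    return null;
  }
  const amount = Number(cleaned);
  return match ? -amount : amount;
}
```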

note location of duplicate cells

It'd be helpful to list the locations of duplicate cells in each error/warning message. Right now the message just states that the cell isn't unique, and the user has to search for the duplicates.
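
One possible way to collect the locations is sketched below. It assumes a single column of values plus its 1-based column index and the sheet row where the data starts. `columnLetter` and `findDuplicateLocations` are hypothetical names, not Keemei functions.

```javascript
// Hypothetical helpers, not part of Keemei.
// Converts a 1-based column index to its A1-notation letters (1 -> "A", 27 -> "AA").
function columnLetter(index) {
  let letters = "";
  while (index > 0) {
    const remainder = (index - 1) % 26;
    letters = String.fromCharCode(65 + remainder) + letters;
    index = Math.floor((index - 1) / 26);
  }
  return letters;
}

// Maps each duplicated value to the A1 addresses of all cells that hold it.
function findDuplicateLocations(columnValues, columnIndex, firstRow) {
  const seen = new Map();
  columnValues.forEach((value, i) => {
    const key = String(value);
    const address = columnLetter(columnIndex) + (firstRow + i);
    if (!seen.has(key)) {
      seen.set(key, []);
    }
    seen.get(key).push(address);
  });
  const duplicates = {};
  for (const [key, addresses] of seen) {
    if (addresses.length > 1) {
      duplicates[key] = addresses;
    }
  }
  return duplicates;
}
```

Each message for a duplicated cell could then list the other addresses recorded for the same value.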

Search for latitude and longitude

Improvement Description
A user may have a column with locations of the form "1846 19th Street", "9500 Gilman Drive", etc., and may want to create latitude and longitude columns from this information.

barcode length checking

Ensure barcodes are all the same length unless the user has specified variable-length barcodes.
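
A minimal sketch of such a check follows. The issue does not say which length should count as the expected one, so this version assumes the most common length is correct and flags the rest. `findBarcodeLengthMismatches` and its `allowVariableLength` flag are hypothetical names.

```javascript
// Hypothetical helper, not part of Keemei.
// Returns the barcodes whose length differs from the most common length.
function findBarcodeLengthMismatches(barcodes, allowVariableLength = false) {
  if (allowVariableLength) {
    return [];
  }
  // Count how many barcodes have each length.
  const counts = new Map();
  for (const barcode of barcodes) {
    counts.set(barcode.length, (counts.get(barcode.length) || 0) + 1);
  }
  // Assumption: the most common length is the intended one (first seen wins ties).
  let expected = null;
  let best = 0;
  for (const [length, count] of counts) {
    if (count > best) {
      expected = length;
      best = count;
    }
  }
  return barcodes
    .map((barcode, index) => ({ index, barcode, expected }))
    .filter((entry) => entry.barcode.length !== expected);
}
```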

support multiple validation functions

It will be important for the set of rules that are applied to be customizable. For example, Qiita has different requirements for its sample template and prep template (qiita-spots/qiita#933) than QIIME 1.x has for its mapping files.

The most sensible approach is probably to provide multiple tool- and version-specific validate functions (e.g., QIIME 1.9.x, Qiita 0.1.x, ...) and let the user choose one or more of them from within the Google Sheets interface.
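
The idea could be sketched as a registry that maps a format name to a validate function. Everything here is hypothetical: the names `validators` and `validateWith`, the format keys, and especially the rule bodies, which are placeholders and do not reflect the real QIIME or Qiita requirements.

```javascript
// Hypothetical registry, not part of Keemei. The rules are placeholders only.
// Each validator takes the sheet rows (2D array) and returns a list of messages.
const validators = {
  "format-a": (rows) =>
    rows[0][0] === "id" ? [] : ["format-a: first header must be 'id'"],
  "format-b": (rows) =>
    rows.length > 1 ? [] : ["format-b: at least one data row is required"],
};

// Runs every selected validator and concatenates their messages.
function validateWith(names, rows) {
  const messages = [];
  for (const name of names) {
    const validate = validators[name];
    if (!validate) {
      throw new Error("Unknown validator: " + name);
    }
    messages.push(...validate(rows));
  }
  return messages;
}
```

With this shape, supporting a new tool or version means adding one entry to the registry, and the menu could be generated from the registry's keys.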

indicate error vs. warning in notes

Keemei doesn't state whether a message (displayed in a note) is an error or a warning. Instead it relies on the color of the cell to indicate this state. The notes should also include this info to improve accessibility.
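
A small sketch of how the note text could carry the severity, assuming each message already knows whether it is an error or a warning. `formatNote` and the label wording are hypothetical.

```javascript
// Hypothetical helper, not part of Keemei.
// Prefixes a message with its severity so the note does not rely on color alone.
function formatNote(severity, message) {
  const labels = { error: "Error", warning: "Warning" };
  if (!Object.prototype.hasOwnProperty.call(labels, severity)) {
    throw new Error("Unknown severity: " + severity);
  }
  return labels[severity] + ": " + message;
}
```

The returned string could then be used wherever the note text is set today.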

validators take state as input

The validators should be refactorable so that they just return the state for each column or row, or even for each cell. The resulting state can then be batched as appropriate.

indicate success

Initially this will be done by either coloring the column header cell green, or perhaps every valid cell.
