Comments (11)

chacalle avatar chacalle commented on May 29, 2024

I think there is a bug with updating the timestamp field because it should just be the last time the document was updated or inserted. I'll open up a separate issue for that.

from fauna.

chacalle avatar chacalle commented on May 29, 2024

So the inclusion_date is just the very first time the document is added to the database. This would be formatted like collection_date as YYYY-MM-DD.

trvrb avatar trvrb commented on May 29, 2024

Yes. Exactly. inclusion_date is first time a document appears and is formatted just like collection_date.
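
A minimal sketch of the rule described above, using plain dicts rather than the real database layer. The helper name `stamp_document`, the `strain` key, and the ISO format used for `timestamp` are assumptions for illustration only; the thread specifies only that inclusion_date is YYYY-MM-DD and is set the first time a document appears.

```python
from datetime import datetime, timezone


def stamp_document(new_doc, existing_doc=None, now=None):
    """Return a copy of new_doc with inclusion_date and timestamp set.

    inclusion_date is written once, on first insert, and then carried over.
    timestamp is refreshed on every insert or update.
    """
    now = now or datetime.now(timezone.utc)  # UTC, as noted in the thread
    stamped = dict(new_doc)
    if existing_doc is not None and "inclusion_date" in existing_doc:
        # Document already in the database: keep the original date.
        stamped["inclusion_date"] = existing_doc["inclusion_date"]
    else:
        # First appearance: format like collection_date (YYYY-MM-DD).
        stamped["inclusion_date"] = now.strftime("%Y-%m-%d")
    # Hypothetical timestamp format; the real one is not given here.
    stamped["timestamp"] = now.isoformat()
    return stamped
```

Passing `now` explicitly keeps the function easy to test; in normal use it would be left as `None`.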

chacalle avatar chacalle commented on May 29, 2024

Another thing to consider: each virus document and each sequence document will have an inclusion_date. When downloading, we merge the virus document and the sequence document with this command:

command = r.table(sequence_table).merge(lambda sequence: r.table(virus_table).get(sequence[index]))

If the virus and sequence documents have different inclusion_date values, RethinkDB's merge command resolves the conflict in favor of the rightmost document in the merge, which here keeps the virus inclusion_date. I think it makes more sense to keep the sequence inclusion_date, since there might be multiple sequences per virus, but this would require some work on the download side to adjust the merge command.
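
The conflict behavior can be sketched with plain Python dicts, which follow the same "later argument wins" rule when unpacked in order. This is an illustration only, not the download code; the function names and the `accession` and `strain` keys are hypothetical.

```python
def merge_right_wins(sequence_doc, virus_doc):
    """Mimic the merge above: on a key conflict the right-hand (virus) value wins."""
    return {**sequence_doc, **virus_doc}


def merge_keep_sequence_date(sequence_doc, virus_doc):
    """Alternative: merge as before, then restore the sequence's inclusion_date."""
    merged = {**sequence_doc, **virus_doc}
    if "inclusion_date" in sequence_doc:
        merged["inclusion_date"] = sequence_doc["inclusion_date"]
    return merged


virus = {"strain": "A/HongKong/4801/2014", "inclusion_date": "2014-04-01"}
sequence = {
    "accession": "EPI2",
    "strain": "A/HongKong/4801/2014",
    "inclusion_date": "2014-08-31",
}

default_merge = merge_right_wins(sequence, virus)
sequence_first = merge_keep_sequence_date(sequence, virus)
```

With the default merge the later sequence date is silently overwritten; the second helper shows one way the download side could be adjusted if the sequence date were preferred.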

trvrb avatar trvrb commented on May 29, 2024

Hmm.... I think I'm okay with attaching inclusion_date to virus when downloading FASTAs. As a concrete example, we'll often want to know something like when did A/HongKong/4801/2014 first appear in the database. More sequences can appear later, but that's not the main interest.

I do like having an inclusion_date for each sequence and an inclusion_date for each virus in the table. This just becomes a question of how to do the merge when downloading.

chacalle avatar chacalle commented on May 29, 2024

Okay so say:
A/HongKong/4801/2014 has an inclusion_date 2014-04-01
The first sequence uploaded with it, EPI1 also has an inclusion_date 2014-04-01
A second sequence uploaded, EPI2 has a later inclusion_date 2014-08-31

Right now the command above would download EPI1 and EPI2 and they would both keep A/HongKong/4801/2014's inclusion_date 2014-04-01.

But this seems to be okay because we care more about when the virus was first uploaded. Both the sequence and the virus will still have an inclusion_date field, though.

trvrb avatar trvrb commented on May 29, 2024

Hmm.... I see. Thinking more, what if we had virus_inclusion_date and sequence_inclusion_date fields. The merge could include one or both in the resulting FASTA. Seems a bit cleaner perhaps. What do you think?

chacalle avatar chacalle commented on May 29, 2024

I like that! They'd both be left after the merge and can be downloaded to the resulting fasta if needed.
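
A rough sketch of why distinct field names avoid the conflict: with virus_inclusion_date and sequence_inclusion_date stored separately, neither value is overwritten by the merge, and either can be chosen for the header. The dict-based merge, the `fasta_header` helper, the `|` delimiter, and the `strain` and `accession` keys are all assumptions for illustration, not the actual download code.

```python
def merge_documents(sequence_doc, virus_doc):
    """Right-hand document wins on conflicts, as in the merge command above."""
    return {**sequence_doc, **virus_doc}


def fasta_header(doc, fields):
    """Build a hypothetical '|'-delimited FASTA header from the chosen fields."""
    return ">" + "|".join(str(doc.get(field, "?")) for field in fields)


virus = {
    "strain": "A/HongKong/4801/2014",
    "virus_inclusion_date": "2014-04-01",
}
sequence = {
    "accession": "EPI2",
    "strain": "A/HongKong/4801/2014",
    "sequence_inclusion_date": "2014-08-31",
}

merged = merge_documents(sequence, virus)
header = fasta_header(
    merged, ["strain", "accession", "virus_inclusion_date", "sequence_inclusion_date"]
)
```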

trvrb avatar trvrb commented on May 29, 2024

Exactly. I like it.

chacalle avatar chacalle commented on May 29, 2024

I believe this works now. I also added the fields to the existing documents in vdb and tdb, defaulting to 2016-09-03. Also, a reminder that the inclusion_date and timestamp fields are based on UTC.

trvrb avatar trvrb commented on May 29, 2024

Fantastic! Thanks so much for making this happen @chacalle.
