Comments (8)

R-Wright-1 commented on August 14, 2024

Hi there,

The library should be compatible, but I think you're missing an intermediate step. When you download using the kraken commands, kraken2-build --add-to-library is run automatically for each fasta file. In the library/added folder of your database you'll see .fna and .fna.masked files for each file that has been added to the library (each under a random-looking name). So you should move all of the files that you copied into the library to a different folder and run something like this: for i in fasta_files/* ; do kraken2-build --add-to-library "$i" --db kraken2_database ; done (note that the files need to be unzipped for this to work).
Once you've run the add-to-library command, you can build the database, which will then include everything you want.
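
For anyone who would rather drive the loop from Python, here is a minimal sketch of the same step. The function names are hypothetical, and fasta_files and kraken2_database are just the placeholder paths used above; the only real command it relies on is kraken2-build --add-to-library FILE --db DB. Files that still end in .gz are skipped, since they need to be unzipped first.

```python
import subprocess
from pathlib import Path


def add_to_library_commands(fasta_dir, db_dir):
    """Build one kraken2-build --add-to-library command per unzipped file."""
    commands = []
    for path in sorted(Path(fasta_dir).iterdir()):
        # Skip directories and anything that is still gzip-compressed.
        if not path.is_file() or path.suffix == ".gz":
            continue
        commands.append(
            ["kraken2-build", "--add-to-library", str(path), "--db", str(db_dir)]
        )
    return commands


def add_all(fasta_dir, db_dir):
    """Run the commands in order, stopping at the first failure."""
    for cmd in add_to_library_commands(fasta_dir, db_dir):
        subprocess.run(cmd, check=True)
```

Calling add_all("fasta_files", "kraken2_database") would then run the command once per file; the database build itself is a separate step afterwards.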

Thanks,
Robyn

from peptides.

termithorbor commented on August 14, 2024

But the command only downloads GCF_000001405.39_GRCh38.p13_genomic.fna.gz, whether I use -human TRUE or FALSE. What I want to do is use a library of mammals in Kraken2 to identify animal species in my WGS sample.

R-Wright-1 commented on August 14, 2024

OK, I should have been more specific here. I think it probably isn't working because Python is expecting either True or False (with only the first letter capitalized). Any of the following should fix the problem: removing the human flag entirely (the default is False, which is what you want), passing --human False, or modifying the script by changing line 42 (currently human = args.human) to human = False.
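
As a general illustration of why capitalization can matter for a flag like this, below is a minimal sketch of a forgiving boolean argument for argparse. It is not the code in the download script; str_to_bool and build_parser are hypothetical names, and the --human definition here is only an assumption about how such a flag could be declared. The background fact is that in Python any non-empty string is truthy, so bool("FALSE") evaluates to True unless the text is parsed explicitly.

```python
import argparse


def str_to_bool(value):
    """Convert common spellings of true/false to a real bool."""
    lowered = value.strip().lower()
    if lowered in {"true", "t", "yes", "y", "1"}:
        return True
    if lowered in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical declaration of the flag discussed in this thread.
    parser.add_argument("--human", type=str_to_bool, default=False)
    return parser
```

With a converter like this, --human TRUE, --human True and --human true would all be treated the same way, and leaving the flag out keeps the default of False.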

Thanks,
Robyn

termithorbor commented on August 14, 2024

I used the command with only the first letter capitalized, but I still only get the GCF_000001405.39_GRCh38.p13_genomic.fna.gz file and no other fasta or fna files.

R-Wright-1 commented on August 14, 2024

Oh that is strange. Did you try the other options, of either not including the flag at all or setting it within the code file?

R-Wright-1 commented on August 14, 2024

OK, I think I have figured it out now. Were you using the --complete True option? If so, the problem was that the script wasn't also accepting 'chromosome' as complete (I think this may be something that changed with the latest NCBI update). If you download the version of download_domain.py that I have just added, this should now be fixed. Let me know if you still have issues!
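
To illustrate the kind of check involved, here is a minimal sketch of a filter that treats both levels as complete. It is not the code in download_domain.py; COMPLETE_LEVELS and is_complete are hypothetical names, and the values are assumed to match the assembly_level column of the NCBI assembly summary ("Complete Genome", "Chromosome", "Scaffold", "Contig").

```python
# Assumption: these are the assembly_level values that should count as complete.
COMPLETE_LEVELS = {"complete genome", "chromosome"}


def is_complete(assembly_level):
    """Return True for assemblies at the Complete Genome or Chromosome level."""
    # Compare case-insensitively so "Chromosome" and "chromosome" both match.
    return assembly_level.strip().lower() in COMPLETE_LEVELS
```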

termithorbor commented on August 14, 2024

Yes that was the problem - Thank you very much :)

R-Wright-1 commented on August 14, 2024

No problem! Just a note that with vertebrate_mammalian you may also want to include the incomplete/scaffold genomes, as only about half of them are complete (the assembly_level column of the assembly summary shows how many there are, so you can check whether they correspond to species of interest to you).
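
A quick way to see that breakdown is to tally the assembly_level column. The sketch below assumes a tab-separated assembly summary whose column header is a comment line starting with "#" (as in the NCBI assembly_summary.txt files); count_assembly_levels is a hypothetical helper, not part of any script mentioned here.

```python
import csv
from collections import Counter


def count_assembly_levels(lines):
    """Tally the assembly_level column of a tab-separated assembly summary."""
    header = None
    counts = Counter()
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row:
            continue
        if row[0].startswith("#"):
            # The column names live on a comment line; other comments are skipped.
            cleaned = [cell.lstrip("#").strip() for cell in row]
            if "assembly_level" in cleaned:
                header = cleaned
            continue
        if header is None:
            raise ValueError("no header line with an assembly_level column found")
        record = dict(zip(header, row))
        counts[record.get("assembly_level", "unknown")] += 1
    return counts
```

Passing an open file handle works, since the function only needs an iterable of lines, for example: with open("assembly_summary.txt") as handle: print(count_assembly_levels(handle)). The file name there is an assumption; use whichever summary file you downloaded.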
