lemur's People

Contributors

bkille, mgnute, nsapoval

lemur's Issues

Output not generated due to Python Error

Hello,
Thank you for developing this tool. It looks interesting. I ran it on some ONT reads from an untargeted sequencing run, using the following command:
lemur -i Data_cutadapt.fastq -o lem_test -d rv221bacarc-rv222fungi --tax-path rv221bacarc-rv222fungi/taxonomy.tsv --mm2-type map-ont --verbose -e LOG_FILE -r species

But it threw the following error:

Traceback (most recent call last):
  File "/home/rd/miniconda3/bin/lemur", line 901, in <module>
    main()
  File "/home/rd/miniconda3/bin/lemur", line 887, in main
    run.EM_complete()
  File "/home/rd/miniconda3/bin/lemur", line 672, in EM_complete
    self.low_abundance_threshold = 1. / n_reads
ZeroDivisionError: float division by zero

The same fastq file worked well and gave the expected results when I used it for taxonomic classification with Kraken2.

I then checked the log file, which contains the following:

2024-06-25 11:15:09 AM INFO:	Starting run of minimap2 at 2024-06-25 11:15:09.730779
2024-06-25 11:16:51 AM DEBUG:	
2024-06-25 11:16:51 AM DEBUG:	[M::mm_idx_gen::40.391*1.47] collected minimizers
[M::mm_idx_gen::47.459*2.30] sorted minimizers
[M::main::47.691*2.30] loaded/built the index for 3335785 target sequence(s)
[M::mm_mapopt_update::47.691*2.30] mid_occ = 1000000
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 3335785
[M::mm_idx_stat::48.247*2.28] distinct minimizers: 51995586 (41.46% are singletons); average occurrences: 11.640; average spacing: 5.451; total length: 3299326635
[M::worker_pipeline::101.127*7.31] mapped 64828 sequences
[M::main] Version: 2.24-r1122
[M::main] CMD: minimap2 -ax map-ont -N 50 -K 500000000 -p .9 -f 0 --sam-hit-only --eqx -t 20 -o lem_test/reads.sam rv221bacarc-rv222fungi/species_taxid.fasta Chry_1_cutadapt.fastq
[M::main] Real time: 101.611 sec; CPU: 739.436 sec; Peak RSS: 20.974 GB

2024-06-25 11:16:51 AM INFO:	Finished running minimap2 in 101787.308 ms
2024-06-25 11:16:51 AM DEBUG:	                                    species                   genus              family  ... subspecies species subgroup species group
tax_id                                                                                   ...                                          
5467               Colletotrichum truncatum          Colletotrichum      Glomerellaceae  ...        NaN              NaN           NaN
5499                           Fulvia fulva                  Fulvia  Mycosphaerellaceae  ...        NaN              NaN           NaN
245174                Wallemia ichthyophaga                Wallemia        Wallemiaceae  ...        NaN              NaN           NaN
1903569           Penicillium riverlandense             Penicillium      Aspergillaceae  ...        NaN              NaN           NaN
36630                  Aspergillus fischeri             Aspergillus      Aspergillaceae  ...        NaN              NaN           NaN
...                                     ...                     ...                 ...  ...        ...              ...           ...
2977707        Synechococcus sp. AH-551-E02           Synechococcus    Synechococcaceae  ...        NaN              NaN           NaN
467085   Candidatus Symbiothrix dinenymphae  Candidatus Symbiothrix                 NaN  ...        NaN              NaN           NaN
1873883        Rhodoplanes sp. JGI PP 4-B12             Rhodoplanes   Hyphomicrobiaceae  ...        NaN              NaN           NaN
2977708        Synechococcus sp. AH-551-E05           Synechococcus    Synechococcaceae  ...        NaN              NaN           NaN
1492901            Bizionia psychrotolerans                Bizionia   Flavobacteriaceae  ...        NaN              NaN           NaN

[47390 rows x 11 columns]
2024-06-25 11:16:51 AM DEBUG:	Loaded taxonomy file rv221bacarc-rv222fungi/taxonomy.tsv
2024-06-25 11:16:51 AM INFO:	Finished loading taxonomy in 88.576 ms
2024-06-25 11:16:51 AM DEBUG:	Initialized abundance vector
2024-06-25 11:16:51 AM INFO:	Finished initializing F in 3.161 ms
2024-06-25 11:16:51 AM INFO:	Finished building alignment model in 0.027 ms
2024-06-25 11:16:56 AM DEBUG:	build_P_rgs_df extracted 100000 reads
2024-06-25 11:16:56 AM DEBUG:	build_P_rgs_df extracted 200000 reads
2024-06-25 11:16:57 AM DEBUG:	build_P_rgs_df extracted 300000 reads
2024-06-25 11:16:58 AM DEBUG:	build_P_rgs_df extracted 400000 reads
2024-06-25 11:16:58 AM DEBUG:	build_P_rgs_df extracted 500000 reads
2024-06-25 11:16:59 AM DEBUG:	build_P_rgs_df extracted 600000 reads
2024-06-25 11:17:00 AM DEBUG:	build_P_rgs_df extracted 700000 reads
2024-06-25 11:17:00 AM DEBUG:	build_P_rgs_df extracted 800000 reads
2024-06-25 11:17:34 AM DEBUG:	Empty DataFrame
Columns: [log_P, Gene, Reference, aln_len, max_aln_len, max_log_P, Name, Genome]
Index: []
2024-06-25 11:17:34 AM INFO:	Finished constructing P(r|s) in 43166.161 ms

However, I do not get any output file with the relative abundances of the microorganisms. The output folder contains only the P_rgs* files.

Could you please help me find a solution?
Thank you.
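
The traceback and the "Empty DataFrame" line in the log point to the same condition: no reads were left when the threshold `1. / n_reads` was computed. A minimal sketch of a guard for that division follows. It assumes a standalone helper: the function name `low_abundance_threshold` and the error message are hypothetical and are not taken from lemur's code.

```python
def low_abundance_threshold(n_reads: int) -> float:
    """Return 1 / n_reads, the threshold computed in the traceback above.

    Hypothetical helper for illustration only; it is not part of lemur.
    """
    if n_reads <= 0:
        # An empty P(r|s) table means there are no reads to normalise by,
        # so fail with a readable message instead of a ZeroDivisionError.
        raise ValueError(
            "P(r|s) table is empty (n_reads = 0); "
            "cannot compute the low-abundance threshold."
        )
    return 1.0 / n_reads


if __name__ == "__main__":
    print(low_abundance_threshold(4))
    try:
        low_abundance_threshold(0)
    except ValueError as err:
        print(f"error: {err}")
```

A guard like this would not explain why the table came out empty, but it would make the failure easier to read than the raw traceback.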

Automated tests fail

After the latest push, the automated tests fail. No major changes were made to the code (only minor error handling was added). The problem appears to be related to how GitHub Actions now handles conda environments. I'd appreciate it if we could look into this and fix it.
