Hi all, I followed the workflow of metaGEM (except using metaspades and das_tool i

See also <a class="issue-link js-issue-link" data-error-text="Failed to load title" da

low number of species after binning for metaGEM model reconstruction

Zhaoju-Deng commented on September 4, 2024


Comments (2)

franciscozorrilla commented on September 4, 2024

Hey Zhaoju,

The numbers you describe sound pretty normal to me: I would expect that an assembly-free approach used by short-read profilers like kraken/metaphlan/mOTUs will have many more "hits" for genomes compared to the assembly-based approach used by metaGEM and other similar workflows.

However, I would warn that many of the low relative abundance hits from short-read profilers may be false positives from closely related species. I would expect that if you use a relative abundance cutoff to filter out low abundance species from the short-read profiler output, then the number of species will start to approach the number obtained from the assembly-based approach. This reflects the fact that assembly-based approaches work great for high coverage or high abundance genomes, but not so well for low abundance/coverage genomes.
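To make the cutoff idea concrete, here is a minimal sketch in Python. It assumes the profiler output has already been parsed into a mapping from taxon name to relative abundance expressed as a fraction between 0 and 1. The function name `filter_by_abundance`, the taxon names, and all abundance values are hypothetical and only serve as an illustration; they are not taken from any real profiler output.

```python
def filter_by_abundance(profile, cutoff):
    """Keep taxa whose relative abundance (fraction, 0-1) is at least `cutoff`."""
    return {taxon: ab for taxon, ab in profile.items() if ab >= cutoff}


# Hypothetical profile: made-up taxa and abundances, for illustration only.
profile = {
    "species_A": 0.52,
    "species_B": 0.30,
    "species_C": 0.12,
    "species_D": 0.04,
    "species_E": 0.015,
    "species_F": 0.005,
}

# Count how many species survive increasingly strict cutoffs.
for cutoff in (0.0, 0.01, 0.05):
    kept = filter_by_abundance(profile, cutoff)
    print(f"cutoff={cutoff:.1%}: {len(kept)} species retained")
```

Comparing the retained count at a few cutoffs against the number of bins recovered per sample should give a rough idea of where the two approaches start to agree. The right cutoff depends on the profiler and the dataset, so treat any particular value as a starting point rather than a recommendation.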

Also, 8-12 genomes is on the lower side. Did you use coverage across multiple samples for binning? This extra mapping information should allow you to get more out of your samples. As an example, consider this publication (https://doi.org/10.1016/j.cell.2019.01.001), where they reconstructed 154,723 genomes from 9,428 human gut metagenomes (~16 genomes/sample); note that they did not use coverage across multiple samples for binning. In the metaGEM paper (https://doi.org/10.1093/nar/gkab815), we reconstructed 4,133 genomes from 137 human gut metagenomes (~30 genomes/sample), and this was using coverage across samples.
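The per-sample yields quoted above are simple ratios. A short Python sketch reproduces them from the genome and sample counts given in this comment; the helper name `genomes_per_sample` and the dictionary labels are just illustrative.

```python
def genomes_per_sample(n_genomes, n_samples):
    """Average number of reconstructed genomes per metagenomic sample."""
    return n_genomes / n_samples


# Counts as quoted in the comment above: (genomes, samples).
studies = {
    "single-sample coverage (10.1016/j.cell.2019.01.001)": (154_723, 9_428),
    "cross-sample coverage, metaGEM (10.1093/nar/gkab815)": (4_133, 137),
}

for label, (n_genomes, n_samples) in studies.items():
    yield_per_sample = genomes_per_sample(n_genomes, n_samples)
    print(f"{label}: {yield_per_sample:.1f} genomes/sample")
```

You can run the same calculation on your own bin counts to see where 8-12 genomes per sample sits relative to these two figures, keeping in mind that the datasets differ in sequencing depth and sample type.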

Note also that sequencing depth and the complexity of your samples will play a big role in the number of genomes reconstructed: if your samples are shallowly sequenced and complex, then you will recover a low number of genomes. If possible, try increasing sequencing depth in your next experiment, or search for a dataset with higher sequencing depth.

I think that the approach you mention, using short-read profilers to select AGORA models for simulation, is understandable but not very elegant. The whole point of metaGEM is to enable direct reconstruction of metabolic models from metagenomes in order to capture the context- and strain-specific information available in your sequencing samples, which is missing from reference genomes and reference-genome-based metabolic models (e.g. AGORA). Consider the following text from the metaGEM paper:

Pangenome analysis of the human gut microbiome demonstrated that the functional repertoire of gut species differ significantly, with a median core genome proportion of only 66% [14], revealing differences in metabolic potentials of individual microbiomes.

There is significant variation in the functional repertoire of the same species across humans, and I would expect the differences in metabolism of the same microbial species between humans and cows to be even greater.

Hope this helps, let me know if you have further questions!
Best,
Francisco


franciscozorrilla commented on September 4, 2024
