Comments (7)

torognes commented on August 19, 2024

I am not sure, but I think UPARSE is quite strict and eliminates clusters that have low abundance or low-quality sequences. As far as I know, it also removes chimeras. That may be why it ends up with fewer clusters.

from swarm.

torognes commented on August 19, 2024

If I am not mistaken, the ITS sequences of fungi (if that's what you are studying) often show gaps of highly variable length when aligned. That may cause Swarm to split groups more than other algorithms do.

frederic-mahe commented on August 19, 2024

@torognes is right. Swarm does only one thing: it makes clusters of sequences. UPARSE, by contrast, also applies aggressive filters to remove rare sequences, low-quality sequences, and chimeras.

In my own analyses I use swarm --difference 1 --fastidious, and I apply all these somewhat arbitrary filters after clustering, rather than before. The idea is that applying filters is less harmful once the clusters are defined.

Other filters that can efficiently reduce the number of clusters:

  • eliminating clusters present in only one technical replicate,
  • eliminating low-abundance clusters present in only one biological replicate,
  • eliminating clusters dissimilar to any known reference sequence.
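
As a rough illustration of the replicate-based filters listed above, here is a minimal Python sketch. It assumes the clusters have already been summarised as a table of read counts per sample; the table layout, the function name `filter_clusters`, and the thresholds `min_samples` and `min_reads` are all hypothetical choices for this example, not part of Swarm or UPARSE.

```python
from typing import Dict

# Hypothetical layout: cluster id -> {sample (replicate) name -> read count}
OtuTable = Dict[str, Dict[str, int]]


def filter_clusters(table: OtuTable, min_samples: int = 2, min_reads: int = 10) -> OtuTable:
    """Drop clusters that are both rare and seen in too few replicates.

    A cluster is kept if it occurs in at least `min_samples` replicates,
    or if its total abundance reaches `min_reads`. Both thresholds are
    arbitrary values chosen for illustration.
    """
    kept = {}
    for cluster, counts in table.items():
        occupied = sum(1 for n in counts.values() if n > 0)  # replicates with reads
        total = sum(counts.values())                         # total abundance
        if occupied >= min_samples or total >= min_reads:
            kept[cluster] = counts
    return kept


table = {
    "otu_a": {"s1": 120, "s2": 98, "s3": 0},  # abundant, two replicates: kept
    "otu_b": {"s1": 1, "s2": 0, "s3": 0},     # singleton in one replicate: dropped
    "otu_c": {"s1": 0, "s2": 0, "s3": 45},    # one replicate but abundant: kept
}
print(sorted(filter_clusters(table)))  # ['otu_a', 'otu_c']
```

Because the filter runs on the finished clusters, rare sequences still contribute their reads to the cluster they belong to before any cluster is discarded.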

Gian77 commented on August 19, 2024

Thank you @torognes and @frederic-mahe

I think that is possible. I do eliminate singletons in UPARSE, and it is true that it removes chimeras automatically.

I can try removing chimeras before running Swarm and removing singletons after generating the clusters.
I am not sure about the reference-based approach, but it may be another, more conservative option.

What if I increase --difference to e.g. 3 or more?

Gian

frederic-mahe commented on August 19, 2024

I suggest eliminating chimeras after clustering and after removing singletons.

Increasing the --difference value will indeed reduce the number of clusters, but it will also reduce the resolution (you will not be able to distinguish taxa with only a few differences in your molecular marker).
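
To make the resolution trade-off concrete, here is a toy Python sketch of single-linkage clustering with a local threshold d. It is only an illustration under simplifying assumptions: it uses a plain edit distance and takes seeds in input order, and it leaves out Swarm's abundance-based seed ordering, its cluster-breaking step, and the fastidious option. The function names and the example sequences are made up for this sketch.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def single_linkage(seqs: List[str], d: int) -> List[List[str]]:
    """Grow each cluster by repeatedly adding sequences within d of any member."""
    unassigned = list(seqs)
    clusters = []
    while unassigned:
        seed = unassigned.pop(0)
        cluster, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            near = [s for s in unassigned if edit_distance(current, s) <= d]
            for s in near:
                unassigned.remove(s)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(cluster)
    return clusters


# Made-up sequences: the second is 1 difference from the first,
# the third is 2 differences from the first.
seqs = ["AAAATTTT", "AAAATTTA", "CCAATTTT"]
for d in (1, 2):
    print(d, len(single_linkage(seqs, d)))
# prints:
# 1 2
# 2 1
```

With d = 1 the third sequence stays in its own cluster. With d = 2 it is merged with the other two, so a taxon that differs by only two positions can no longer be told apart.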

In my own projects, using the high resolution of --difference 1 has always been the greater advantage. The final number of clusters is an issue only if you are trying to get absolute alpha diversity values (and in that case having a lot of replicates is the preferred solution). For normal comparative diversity studies, the total number of clusters doesn't really matter.

Gian77 commented on August 19, 2024

@frederic-mahe,

thanks a lot for the explanation. I will follow your advice and let you know what I get.

Gian

frederic-mahe commented on August 19, 2024

I am going to close this issue. Please feel free to re-open it if need be.
