
Comments (10)

trishorts commented on August 14, 2024

Just want to double check that there is no sequence redundancy in your list of PSMs. Multiple PSMs from a single peak will only yield a single value.

The other possibility is that the current version skips some PSMs because of sequence readability issues. I will have to check on that. If you make the mzML and the FASTA available, I will search and quantify on our end to see what is going on.

from flashlfq.

wsnoble commented on August 14, 2024

There is no sequence redundancy. I thought at first that your guess about sequence readability might indeed be the issue, since my input file is a converted Percolator file. But I double checked, and the input file contains no modifications.

Here are the files: https://drive.google.com/drive/folders/1FF_vrsVwScEb5UJnBsEYHXSBoSPos31D?usp=sharing


trishorts commented on August 14, 2024

Thanks. I'll download and look at them shortly.


trishorts commented on August 14, 2024

Bill,

I applied our whole workflow (calibration, PTM discovery, and search with quantification). Our calibration recommended search tolerances of 6 ppm parent and 11 ppm daughter. With those settings (and PTM discovery) we observed the following results:
All target PSMs within 1% FDR: 18583
All target peptides within 1% FDR: 12810
All target protein groups within 1% FDR: 2492
With respect to quantification, there were 12873 unique peptides. FlashLFQ within MetaMorpheus provided intensities for 12091, with the remaining 782 having 0 intensity. That is a ~94% yield.
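The yield figure can be re-derived from the two intensity counts quoted in this comment. This is only a small arithmetic check, assuming the 12091 quantified peptides and the 782 zero-intensity peptides together make up the full set that was considered.

```python
# Counts quoted in the comment above.
quantified = 12091      # peptides with a non-zero FlashLFQ intensity
zero_intensity = 782    # peptides reported with 0 intensity

# Assumption: these two groups partition the peptides that were quantified.
total = quantified + zero_intensity
yield_fraction = quantified / total

print(f"total peptides considered: {total}")
print(f"quantification yield: {yield_fraction:.1%}")
```

The two counts sum to 12873, and the ratio rounds to the ~94% stated in the comment.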

I will need to run your results separately through FlashLFQ (without doing our search) to see why you got the results that you got. There may be a q-value filtering issue that we can resolve easily.

If you like, I'd be happy to provide you with all my search results. Maybe they are useful to you in some way. Stay tuned for my investigation of your data.


trishorts commented on August 14, 2024

To start, I see 9796 PSMs with a Percolator q-value below 0.01, which agrees with your figure of 9800. However, there are many duplicates in column K (sequence). In fact, I count only 7543 unique sequences. Please see the attached file.
percolatorSequenceWithCounts.txt
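A count like the one described above (PSMs passing a q-value cutoff, then unique sequences among them) could be reproduced roughly as follows. This is a minimal sketch, not the code used to build the attached file: the column names `percolator q-value` and `sequence`, the tab delimiter, and the demo rows are all assumptions made for illustration.

```python
import csv
import io
from collections import Counter

def summarize_psms(handle, q_col="percolator q-value", seq_col="sequence", q_max=0.01):
    """Return (number of PSMs below q_max, Counter of their sequences).

    The default column names are assumptions; adjust them to the actual header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter()
    n_pass = 0
    for row in reader:
        if float(row[q_col]) < q_max:
            n_pass += 1
            counts[row[seq_col]] += 1
    return n_pass, counts

# Tiny made-up table, only to show the shape of the result.
demo = io.StringIO(
    "scan\tpercolator q-value\tsequence\n"
    "101\t0.001\tPEPTIDEK\n"
    "102\t0.002\tPEPTIDEK\n"
    "103\t0.005\tSAMPLER\n"
    "104\t0.200\tLOWSCORER\n"
)
n_pass, seq_counts = summarize_psms(demo)
print(n_pass, len(seq_counts))  # passing PSMs, unique sequences among them
```

A gap between the two numbers means several PSMs share a sequence, which matters here because multiple PSMs from a single peak yield a single quantified value.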


wsnoble commented on August 14, 2024

Thanks! Yes, there are duplicates in column K, but I think they should each be associated with distinct scan numbers.

One obvious thing to check is whether the RTs are sensible. I wrote pyteomics code to try to extract those from the mzML and stick them into the FlashLFQ input file, but I certainly could have messed that step up somehow.
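For reference, the retention-time lookup mentioned here can be sketched as below. This is not the code used in the thread; it assumes Thermo-style spectrum IDs containing `scan=N`, assumes the spectrum dictionaries follow the layout that pyteomics produces for mzML, and assumes the target column wants minutes. The file name `run.mzML` in the comment is a placeholder.

```python
import re

SCAN_RE = re.compile(r"scan=(\d+)")

def scan_to_rt_minutes(spectra):
    """Map scan number -> retention time for pyteomics-style spectrum dicts."""
    rts = {}
    for spec in spectra:
        match = SCAN_RE.search(spec.get("id", ""))
        if match is None:
            continue  # ID without "scan=N"; would need vendor-specific parsing
        rt = spec["scanList"]["scan"][0]["scan start time"]
        # pyteomics values carry a unit in `unit_info`; convert seconds to minutes.
        if getattr(rt, "unit_info", None) == "second":
            rt = float(rt) / 60.0
        rts[int(match.group(1))] = float(rt)
    return rts

# With pyteomics installed, the spectra would come from something like:
#   from pyteomics import mzml
#   with mzml.read("run.mzML") as reader:
#       rts = scan_to_rt_minutes(reader)

# Hand-written stand-ins for two spectra, only to exercise the function.
demo_spectra = [
    {"id": "controllerType=0 controllerNumber=1 scan=1",
     "scanList": {"scan": [{"scan start time": 0.5}]}},
    {"id": "controllerType=0 controllerNumber=1 scan=2",
     "scanList": {"scan": [{"scan start time": 0.75}]}},
]
rts = scan_to_rt_minutes(demo_spectra)
print(rts)
```

Spot-checking a few scan numbers against the raw file, and confirming that the values fall inside the run's gradient length, is a quick way to tell whether the retention times are sensible.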

In case it's not obvious, flash.txt is the file I provided as input to FlashLFQ.


trishorts commented on August 14, 2024

That was not clear, but it does help. I'm also going to check whether my code to pull retention times is working. Stay tuned.


trishorts commented on August 14, 2024

flash.txt only has 6026 values. Is that correct?
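A row-count check of this kind is easy to script before handing a converted file to a quantification tool. The sketch below is illustrative only: the header names and rows in the demo are made up, and `expected_psms` simply reuses the 9796 figure from earlier in the thread.

```python
import io

def count_data_rows(handle):
    """Count non-empty lines after the header line of a delimited text file."""
    lines = [line for line in handle if line.strip()]
    return max(len(lines) - 1, 0)

expected_psms = 9796  # PSMs below the 0.01 q-value cutoff, as counted in the thread

# Made-up stand-in for a converted input file with a header and two rows.
demo = io.StringIO("File Name\tScan Retention Time\nrun\t10.1\nrun\t10.4\n\n")
n_rows = count_data_rows(demo)

if n_rows != expected_psms:
    print(f"row count mismatch: {n_rows} rows vs {expected_psms} expected PSMs")
```

For a real file, the same function can be called on `open("flash.txt")`; a shortfall points at the conversion step rather than at the downstream tool.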

wsnoble commented on August 14, 2024

OMG, that's embarrassing! That means the problem is entirely on my end. I will figure out what went wrong in the conversion. Sorry!


trishorts commented on August 14, 2024

LOL. That is the story of my whole life...

