Comments (10)
Just want to double check that there is no sequence redundancy in your list of PSMs. Multiple PSMs from a single peak will only yield a single value.
The other possibility is that the current version skips some PSMs because of sequence readability issues. I will have to check on that. If you make the mzML and the FASTA available, I will search and quantify on our end to see what is going on.
from flashlfq.
There is no sequence redundancy. I thought at first that your guess about sequence readability might indeed be the issue, since my input file is a converted Percolator file. But I double checked, and the input file contains no modifications.
Here are the files: https://drive.google.com/drive/folders/1FF_vrsVwScEb5UJnBsEYHXSBoSPos31D?usp=sharing
Thanks. I'll download and look at them shortly.
Bill,
I applied our whole workflow (calibration, PTM discovery, and search with quantification). Our calibration recommended search tolerances of 6 ppm parent and 11 ppm daughter. With those settings (and PTM discovery) we observed the following results:
All target PSMs within 1% FDR: 18583
All target peptides within 1% FDR: 12810
All target protein groups within 1% FDR: 2492
With respect to quantification, there were 12810 unique peptides. FlashLFQ within MetaMorpheus provided intensities for 12091, with the remaining 782 having zero intensity. That is a ~94% yield.
I will need to run your results separately through FlashLFQ (without doing our search) to see why you got the results that you got. There is possibly a filtering w.r.t. q-value that we can resolve easily.
If you like, I'd be happy to provide you with all my search results. Maybe they are useful to you in some way. Stay tuned for my investigation of your data.
To start, I see 9796 PSMs with percolator q-value below 0.01. That agrees with your 9800 number. However, there are many duplicates in column K (sequence). In fact, I count only 7543 unique sequences. Please see the attached file.
percolatorSequenceWithCounts.txt
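The count described above (PSMs passing the q-value cutoff versus distinct sequences among them) can be reproduced with a short script. This is only a sketch: the function name `summarize_psms` is made up for illustration, and the column headers `percolator q-value` and `sequence` are assumptions about a tab-delimited Percolator export, so adjust them to match the actual file.

```python
import csv
from collections import Counter


def summarize_psms(path, q_col="percolator q-value", seq_col="sequence",
                   threshold=0.01):
    """Count PSMs below a q-value threshold and how often each sequence occurs.

    The column names are assumed, not taken from a specific tool version.
    Returns (number of passing PSMs, number of unique sequences, Counter).
    """
    counts = Counter()
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            # Keep only PSMs strictly below the q-value cutoff.
            if float(row[q_col]) < threshold:
                counts[row[seq_col]] += 1
    return sum(counts.values()), len(counts), counts
```

Comparing the first two return values shows at a glance how many passing PSMs are repeats of an already-seen sequence.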
Thanks! Yes, there are duplicates in column K, but I think they should each be associated with distinct scan numbers.
One obvious thing to check is whether the RTs are sensible. I wrote pyteomics code to try to extract those from the mzML and stick them into the FlashLFQ input file, but I certainly could have messed that step up somehow.
In case it's not obvious, flash.txt is the file I provided as input to FlashLFQ.
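For anyone wanting to cross-check retention times independently, here is a rough standard-library sketch (not the pyteomics code mentioned above) that maps scan numbers to retention times from an mzML file. It relies on the PSI-MS term `MS:1000016` ("scan start time") and the unit term `UO:0000010` (second). Treating the spectrum `id` as containing `scan=N` is an assumption that holds for Thermo-style native IDs but not for every vendor, and the function name `scan_retention_times` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

SCAN_START_TIME = "MS:1000016"  # PSI-MS accession for "scan start time"
UNIT_SECOND = "UO:0000010"      # Unit Ontology accession for "second"


def local_name(tag):
    """Strip the XML namespace from a tag such as '{ns}spectrum'."""
    return tag.rsplit("}", 1)[-1]


def scan_retention_times(mzml_path):
    """Return {scan number: retention time in minutes} for an mzML file."""
    rts = {}
    for _, elem in ET.iterparse(mzml_path, events=("end",)):
        if local_name(elem.tag) != "spectrum":
            continue
        # Assumption: native ID looks like "... scan=123".
        match = re.search(r"scan=(\d+)", elem.get("id", ""))
        for node in elem.iter():
            if (local_name(node.tag) == "cvParam"
                    and node.get("accession") == SCAN_START_TIME):
                rt = float(node.get("value"))
                if node.get("unitAccession") == UNIT_SECOND:
                    rt /= 60.0  # convert seconds to minutes
                if match:
                    rts[int(match.group(1))] = rt
                break
        elem.clear()  # free memory for large files
    return rts
```

Spot-checking a few scan numbers from the quantification input against this dictionary is a quick way to see whether the retention times were transferred correctly, and whether any scans were dropped along the way.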
That was not clear, but it does help. I'm going to check whether my code to pull retention times is working. Stay tuned.
flash.txt only has 6026 values. Is that correct?
OMG, that's embarrassing! That means the problem is entirely on my end. I will figure out what went wrong in the conversion. Sorry!
LOL. That is the story of my whole life...