Comments (5)
A few things:
- I would do -m 100G for the memory, this is how the memory string is read.
- This looks like inDropsv3 data format, is it? If so, the newest release of kallisto supports this, and we will be releasing kb-python soon that will support this structure.
For clarity this is the structure I am referring to:
R1: Biological Read
R2: Cell BC1 (1-8bp)
R3: Index (1-8bp) UMI (9-14bp)
R4: Cell BC2 (1-8bp)
from kb_python.
Thank you for your quick response! I will try to rerun with -m 100G; I will let you know whether that solves the issue. Does the newest version of kallisto have an inDropsv3 structure that also accepts an R3 library index file? If so, I will try to run kallisto. But it should be possible, right, to specify a new technology string with the library index of inDrops in R3 as a BC? In essence a library index is just a BC (that together with BC1 and BC2 specifies unique cells).
In addition, should the UMI in your inDrops structure not be in R4? I use the following inDrops version 3 structure, which is the same as used in: https://github.com/indrops/indrops (I checked the reads manually):
R1: Biological read
R2: Cell BC1 (1-8 bp)
R3: Index (1-8 bp)
R4: Cell BC2 (1-8 bp) and UMI (9-14 bp).
Again, thank you so much for creating kallisto bus, and kb-python.
from kb_python.
The index file is only necessary if you wish to demultiplex samples that were pooled on the same lane, using the samplesheet.csv file that you create (if you used Illumina short-read sequencing); see the Illumina documentation. kallisto does not use the sample index.
Small side note: I made an error in the comment above. The UMI is in R4, and the structure is:
R1: Biological read
R2: Cell BC1 (1-8 bp)
R3: Index (1-8 bp)
R4: Cell BC2 (1-8 bp) and UMI (9-14 bp).
To process your reads, let's look at main.cpp in the kallisto repo. We see the following lines:
} else if (opt.technology == "INDROPSV3") {
busopt.nfiles = 3;
busopt.seq.push_back(BUSOptionSubstr(2,0,0));
busopt.umi = BUSOptionSubstr(1,8,14);
busopt.bc.push_back(BUSOptionSubstr(0,0,8));
busopt.bc.push_back(BUSOptionSubstr(1,0,8));
In plain English: kallisto expects 3 files. Given how you have defined what R1, R2, R3 and R4 mean, we note that the first half of the cell barcode comes from R2, the second half of the cell barcode comes from R4, the UMI comes from R4, and the biological read is in R1. So the command would be:
kallisto bus -i index.idx -o ./output -x inDropsv3 R2.fastq.gz R4.fastq.gz R1.fastq.gz
Where R2 is the 0th file, R4 is the 1st file and R1 is the 2nd file (0-indexed).
This works with the current release of kallisto and will be added to kb-python soon.
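To make the mapping concrete, here is a minimal Python sketch of how the (file, start, stop) triplets in the main.cpp excerpt could be read. It is an illustration only, not kallisto's actual implementation: the names `Substr`, `extract` and `parse_record` are hypothetical, the reads are made up, and it assumes that a stop of 0 means "to the end of the read".

```python
from typing import NamedTuple, Sequence, Tuple


class Substr(NamedTuple):
    """Hypothetical mirror of the (file, start, stop) triplets quoted above."""
    file: int   # 0-indexed position of the FASTQ file on the command line
    start: int  # 0-indexed start offset within the read
    stop: int   # end offset (exclusive); assumption: 0 means "to end of read"


def extract(reads: Sequence[str], spec: Substr) -> str:
    """Return the slice of the read selected by one triplet."""
    read = reads[spec.file]
    return read[spec.start:] if spec.stop == 0 else read[spec.start:spec.stop]


# Layout copied from the INDROPSV3 branch quoted above.
BARCODE_PARTS = [Substr(0, 0, 8), Substr(1, 0, 8)]
UMI = Substr(1, 8, 14)
SEQ = Substr(2, 0, 0)


def parse_record(reads: Sequence[str]) -> Tuple[str, str, str]:
    """Split one record (one read per file) into barcode, UMI and sequence."""
    barcode = "".join(extract(reads, part) for part in BARCODE_PARTS)
    return barcode, extract(reads, UMI), extract(reads, SEQ)


if __name__ == "__main__":
    # Made-up reads, passed in the order R2, R4, R1 as in the command above.
    r2 = "AAAACCCC"              # cell BC1
    r4 = "GGGGTTTTACGTAC"        # cell BC2 (8 bp) followed by UMI (6 bp)
    r1 = "TTTTTTTTTTGGGGGGGGGG"  # biological read
    print(parse_record([r2, r4, r1]))
```

With the files given as R2, R4, R1, the barcode is the first 8 bases of R2 joined with the first 8 bases of R4, and the UMI is bases 9-14 of R4.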
from kb_python.
Thank you for your reaction. Yes, I understood that I could not use technology = "INDROPSV3" because this kb technology specification expects 3 files that together contain BC1, BC2, UMI, and biological read. Therefore, I used the option in kb to specify a new technology myself that has 3 BCs, one of which is the library index, because in essence a library index is just a BC that together with BC1 and BC2 specifies a unique cell. The issue that I raised here is that specifying a new technology with 3 BCs, a UMI, and a biological read does not work, while it should be possible according to the kallisto bus documentation. I also tried to run with -m 100G and I get a different error, <Signals.SIGKILL: 9> instead of <Signals.SIGSEGV: 11>, but still 0 reads pseudoaligned and an empty bus file (which I think causes the error in the bustools sort command).
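For reference, the custom technology described here can be written down with a small Python helper. This is only a sketch: it assumes the `barcode:umi:sequence` format from the kallisto bus documentation, where each field is one or more comma-separated file,start,stop triplets, and it assumes the FASTQ files are passed in the order R2, R3, R4, R1. The helper names are hypothetical, and producing the string says nothing about whether kallisto accepts it without the error reported above.

```python
from typing import Iterable, Tuple

Triplet = Tuple[int, int, int]  # (file index, start, stop); stop 0 = end of read


def join_triplets(triplets: Iterable[Triplet]) -> str:
    """Flatten triplets into the comma-separated form, e.g. 0,0,8,1,0,8."""
    return ",".join(f"{f},{start},{stop}" for f, start, stop in triplets)


def technology_string(barcodes: Iterable[Triplet], umi: Triplet, seq: Triplet) -> str:
    """Build a 'barcode:umi:sequence' string for the -x option (assumed format)."""
    return ":".join([join_triplets(barcodes), join_triplets([umi]), join_triplets([seq])])


if __name__ == "__main__":
    # Assumed file order on the command line: R2 (0), R3 (1), R4 (2), R1 (3).
    tech = technology_string(
        barcodes=[(0, 0, 8), (1, 0, 8), (2, 0, 8)],  # BC1, library index, BC2
        umi=(2, 8, 14),                              # UMI: bases 9-14 of R4
        seq=(3, 0, 0),                               # whole biological read (R1)
    )
    print(tech)  # 0,0,8,1,0,8,2,0,8:2,8,14:3,0,0
```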
from kb_python.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days
from kb_python.