yfukasawa / longqc
LongQC is a tool for the data quality control of the PacBio and ONT long reads.
License: MIT License
Hello. I just tried your program, and after increasing the number for -p I was able to get further, but now I have a new error. My input file was demultiplexed with qcat and concatenated into one FASTQ. Any ideas? Looking forward to using this software!
python ~/programs/LongQC/longQC.py sampleqc -x ont-ligation -o ontqc -p 4 Cf_KPC_all.fastq
longQC:2020-06-04 13:38:43,599:166:INFO:Cmd: /home/tarah/programs/LongQC/longQC.py sampleqc -x ont-ligation -o ontqc -p 4 Cf_KPC_all.fastq
longQC:2020-06-04 13:38:43,599:216:INFO:Preset "ont-ligation" was applied. Options --pb(--ont) is overwritten.
longQC:2020-06-04 13:38:43,733:288:INFO:Computation of the low complexity region started for a chunk 0
lq_mask:2020-06-04 13:38:43,784:111:INFO:New job was submitted: in->ontqc/analysis/tmp_0.fastq, out->ontqc/analysis/tmp_0.txt
longQC:2020-06-04 13:38:43,785:293:INFO:Adapter search is starting for a chunk 0.
longQC:2020-06-04 13:38:43,785:309:INFO:Computation of the GC fraction started for a chunk 0
lq_utils:2020-06-04 13:38:44,020:358:INFO:list for subsample is not initialized. Initializing now.
lq_adapt:2020-06-04 13:38:44,086:76:INFO:694 reads were skipped due to their short lengths.
lq_adapt:2020-06-04 13:38:44,086:89:INFO:Adapter Sequence: AATGTACTTCGTTCAGTTACGTATTGCT, max identity:0.758621 and the number of trimmed reads: 1
lq_adapt:2020-06-04 13:38:44,218:41:INFO:694 reads were skipped due to their short lengths.
lq_adapt:2020-06-04 13:38:44,218:91:INFO:Adapter Sequence: GCAATACGTAACTGAACG, max identity:0.842105 and the number of trimmed reads: 20
longQC:2020-06-04 13:38:44,563:314:INFO:Adapter search has done for a chunk 0.
longQC:2020-06-04 13:38:44,563:324:INFO:subsample finished for chunk 0.
longQC:2020-06-04 13:38:44,563:344:INFO:Input file parsing was finished. #seqs:3591, #bases: 11155437
lq_mask:2020-06-04 13:38:44,563:114:INFO:Waiting completion of all of jobs...
lq_mask:2020-06-04 13:38:44,658:117:INFO:sdust jobs finished.
lq_mask:2020-06-04 13:38:44,661:87:INFO:sdust output file ontqc/longqc_sdust.txt was made.
lq_mask:2020-06-04 13:38:44,669:93:INFO:tmp file ontqc/analysis/tmp_0.fastq was removed.
lq_mask:2020-06-04 13:38:44,669:93:INFO:tmp file ontqc/analysis/tmp_0.txt was removed.
longQC:2020-06-04 13:38:44,669:348:INFO:Summary table ontqc/longqc_sdust.txt was made.
longQC:2020-06-04 13:38:44,695:354:DEBUG:Highly masked seq list:
longQC:2020-06-04 13:38:44,739:393:INFO:Subsampled seqs were written to a file. #seqs:3591
lq_exec:2020-06-04 13:38:44,749:26:INFO:below command is executed: -Y -l 0 -q 160 -k 15 -w 5 -I 4G -p 160 -t 4 Cf_KPC_all.fastq ontqc/analysis/subsample.fastq
lq_exec:2020-06-04 13:38:44,749:27:INFO:/home/tarah/programs/LongQC/minimap2_mod/minimap2-coverage is started.
longQC:2020-06-04 13:38:44,750:421:INFO:Overlap computation started. Process is 14834
lq_gcfrac:2020-06-04 13:38:44,750:52:INFO:Mean GC composition: 0.510
Traceback (most recent call last):
File "/home/tarah/programs/LongQC/longQC.py", line 917, in <module>
main(args)
File "/home/tarah/programs/LongQC/longQC.py", line 63, in main
args.handler(args)
File "/home/tarah/programs/LongQC/longQC.py", line 424, in command_sample
gc_read_mean, gc_read_sd = lg.plot_unmasked_gc_frac(fp=fig_path_gc)
File "/home/tarah/programs/LongQC/lq_gcfrac.py", line 54, in plot_unmasked_gc_frac
plt.hist(self.r_frac, alpha=0.3, bins=np.arange(min(self.r_frac), max(self.r_frac) + b_width, b_width), color='blue', normed=True)
File "/home/tarah/miniconda3/envs/py36/lib/python3.6/site-packages/matplotlib/pyplot.py", line 2610, in hist
if data is not None else {}), **kwargs)
File "/home/tarah/miniconda3/envs/py36/lib/python3.6/site-packages/matplotlib/__init__.py", line 1565, in inner
return func(ax, *map(sanitize_sequence, args), **kwargs)
File "/home/tarah/miniconda3/envs/py36/lib/python3.6/site-packages/matplotlib/axes/_axes.py", line 6808, in hist
p.update(kwargs)
File "/home/tarah/miniconda3/envs/py36/lib/python3.6/site-packages/matplotlib/artist.py", line 1006, in update
ret = [_update_property(self, k, v) for k, v in props.items()]
File "/home/tarah/miniconda3/envs/py36/lib/python3.6/site-packages/matplotlib/artist.py", line 1006, in <listcomp>
ret = [_update_property(self, k, v) for k, v in props.items()]
File "/home/tarah/miniconda3/envs/py36/lib/python3.6/site-packages/matplotlib/artist.py", line 1002, in _update_property
.format(type(self).__name__, k))
AttributeError: 'Rectangle' object has no property 'normed'
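One reading of this traceback: `normed` was a keyword of matplotlib's `hist` that was deprecated in matplotlib 2.1 and removed in 3.1, with `density=True` as its replacement. Once `hist` no longer recognises it, the keyword is passed on to the bar patches and `Rectangle` rejects it, which matches the last line above. A later traceback on this page shows the same `plt.hist` call in lq_gcfrac.py already using `density=True`, so updating LongQC (or editing that call) may be enough. What the density normalisation computes can be sketched without matplotlib; the function name and sample values below are made up for illustration, and the binning is simplified compared with the `np.arange` call in the traceback.

```python
def density_histogram(values, bin_width):
    """Return (left_edges, densities) such that sum(density * bin_width) == 1."""
    if not values:
        return [], []
    lo = min(values)
    n_bins = int((max(values) - lo) / bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        # Clamp so the maximum value always lands in the last bin.
        idx = min(int((v - lo) / bin_width), n_bins - 1)
        counts[idx] += 1
    total = len(values)
    edges = [lo + i * bin_width for i in range(n_bins)]
    # Density = count / (number of values * bin width), as with density=True.
    densities = [c / (total * bin_width) for c in counts]
    return edges, densities


gc_fractions = [0.42, 0.48, 0.50, 0.51, 0.53, 0.55, 0.61]  # made-up values
edges, densities = density_histogram(gc_fractions, bin_width=0.05)
area = sum(d * 0.05 for d in densities)  # should be 1 up to rounding
```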
Hello....
When I run the command below,
python longQC.py sampleqc -x pb-rs2 -o prueba14 /home/jforero/mis_datos/tutoriales/anaconda3/prueba10.fasta -p 72 -d --fast -m 2 -i 10 --trim_output trimmedsequences
it fails with AttributeError: 'float' object has no attribute 'split'.
I would like to know why this error occurs.
lq_coverage:2020-08-03 10:38:59,423:122:INFO:Estimation of coverage distribution finished.
Traceback (most recent call last):
File "longQC.py", line 932, in <module>
main(args)
File "longQC.py", line 62, in main
args.handler(args)
File "longQC.py", line 592, in command_sample
adp3_pos=np.mean(adp_pos3) if args.adp3 and adp_pos3 and np.mean(adp_pos3) > 0 else None)
File "/datos/datosjforero/tutoriales/anaconda3/LongQC/lq_coverage.py", line 373, in plot_unmapped_frac_terminal
t5l, t3l, il = self.__region_analysis(3, 1)
File "/datos/datosjforero/tutoriales/anaconda3/LongQC/lq_coverage.py", line 596, in __region_analysis
regs = [(int(reg.split('-')[0]), int(reg.split('-')[1])) for reg in str.split(',')]
AttributeError: 'float' object has no attribute 'split'
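The failing line calls `.split(',')` on a value that turned out to be a float. A common way this happens is that a table column expected to hold strings such as `12-40,88-120` contains an empty cell, which pandas reads as the float NaN; whether that is what happened here is an assumption. A defensive parser, with the hypothetical name `parse_regions`, could look like this:

```python
import math


def parse_regions(value):
    """Parse 'start-end,start-end' into a list of (start, end) tuples.

    Missing values (None or a float NaN from an empty table cell) yield an
    empty list instead of raising AttributeError.
    """
    if value is None:
        return []
    if isinstance(value, float) and math.isnan(value):
        return []
    if not isinstance(value, str):
        raise TypeError(f"expected a region string, got {type(value).__name__}")
    regions = []
    for reg in value.split(","):
        reg = reg.strip()
        if not reg:
            continue  # tolerate empty fields such as a trailing comma
        start, end = reg.split("-")
        regions.append((int(start), int(end)))
    return regions
```

With this shape, a read that has no recorded regions simply contributes nothing instead of stopping the run.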
Hello
I hope all is well.
Sorry for the bother, but I have an issue running LongQC and was hoping for some help, as I haven't been able to figure it out.
I believe I have everything installed correctly as per the instructions; however, when I run the command below to run sampleqc on a PacBio Sequel BAM file called 37.bam:
python /pub01/mgemmell/programs_chos_7/longqc/LongQC/longQC.py sampleqc -x pb-sequel -o 37_longqc_results 37.bam
I get the below:
longQC:2021-01-19 13:43:57,198:169:INFO:Cmd: /pub01/mgemmell/programs_chos_7/longqc/LongQC/longQC.py sampleqc -x pb-sequel -o 37_longqc_results 37.bam
longQC:2021-01-19 13:43:57,198:233:INFO:Preset "pb-sequel" was applied. Options --pb(--ont) is overwritten.
Traceback (most recent call last):
File "/pub01/mgemmell/programs_chos_7/longqc/LongQC/longQC.py", line 956, in <module>
main(args)
File "/pub01/mgemmell/programs_chos_7/longqc/LongQC/longQC.py", line 62, in main
args.handler(args)
File "/pub01/mgemmell/programs_chos_7/longqc/LongQC/longQC.py", line 235, in command_sample
file_format_code = guess_format(args.input)
File "/pub01/mgemmell/programs_chos_7/longqc/LongQC/lq_utils.py", line 125, in guess_format
l = f.readline()
File "/pub01/mgemmellprograms_chos_8/anaconda3/envs/longqc/lib/python3.8/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x99 in position 4: invalid start byte
Any help would be appreciated and feel free to ask me any questions if it helps.
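The traceback shows the input being read with `readline()` in text mode and the UTF-8 decoder failing on a binary byte. Whatever 37.bam actually contains, reading the first bytes in binary mode avoids the decode error and shows what kind of file it is. The sketch below is not LongQC's `guess_format`; the function name and return labels are hypothetical. It relies on two known signatures: gzip and BGZF files start with the bytes 1f 8b, and a decompressed BAM stream starts with `BAM\x01`.

```python
import gzip
import os
import tempfile


def sniff_format(path):
    """Guess a read-file type from its leading bytes, never decoding as text."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    if head[:2] == b"\x1f\x8b":  # gzip / BGZF magic number
        with gzip.open(path, "rb") as gz:
            inner = gz.read(4)
        return "bam" if inner == b"BAM\x01" else "gzip"
    if head[:1] == b">":
        return "fasta"
    if head[:1] == b"@":
        return "fastq"
    return "unknown"


# Demo with throwaway files (contents are fabricated for the example).
with tempfile.TemporaryDirectory() as tmp:
    fq = os.path.join(tmp, "reads.fastq")
    with open(fq, "wb") as fh:
        fh.write(b"@read1\nACGT\n+\nIIII\n")
    bam_like = os.path.join(tmp, "reads.bam")
    with gzip.open(bam_like, "wb") as fh:
        fh.write(b"BAM\x01" + b"\x00" * 8)  # only the BAM magic, no real records
    detected = (sniff_format(fq), sniff_format(bam_like))
```

If a check like this reports "unknown" for a file with a .bam extension, the file itself is worth inspecting before looking further at the installation.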
Hi, I tried to compile minimap2-coverage within the cloned LongQC repo, and it seems to be missing a file:
cd LongQC/minimap2-coverage && make
cc -c -g -O2 -Wall -Wc++-compat -DHAVE_KALLOC minimap2-coverage.c -o minimap2-coverage.o
minimap2-coverage.c:4:10: fatal error: zlib.h: No such file or directory
4 | #include <zlib.h>
| ^~~~~~~~
compilation terminated.
make: *** [Makefile:29: minimap2-coverage.o] Error 1
Any ideas?
Hi, I am trying to install pysam and edlib. It shows the following error. Please advise.
--
b) conda install -c bioconda pysam
c) conda install -c bioconda edlib
error-----------
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
ResolvePackageNotFound:
Hi @yfukasawa ,
Again, thanks heaps for this wonderful tool.
Could you kindly explain what happens in the sampleqc step and the logic behind it (or point me to a place where I can find a thorough explanation)? It seems to be generating multiple fastq files at the moment (I started processing only ~1 hour ago). I executed the following script:
python longQC.py sampleqc -x pb-sequel -p 8 -o ${OUT_DIR}/${subdir} ${BAM}
Cheers,
Shani.
Hello,
I've just installed the software. When running:
conda activate LongQC_env
export PATH=/home/m.sevi/software/LongQC/minimap2_mod/:$PATH
python /home/m.sevi/software/LongQC/longQC.py sampleqc -p 4 -x ont-rapid -o fne_qc_out_dir /scratch/m.sevi/processing/WW_HRSD/data/long/basecalled_fq/fne/fne_fastq/fne.fastq
The run fails with standard output:
longQC:2020-06-18 15:11:19,615:166:INFO:Cmd: /home/m.sevi/software/LongQC/longQC.py sampleqc -p 4 -x ont-rapid -o fne_qc_out_dir /scratch/m.sevi/processing/WW_HRSD/data/long/basecalled_fq/fne/fne_fastq/fne.fastq
longQC:2020-06-18 15:11:19,615:216:INFO:Preset "ont-rapid" was applied. Options --pb(--ont) is overwritten.
longQC:2020-06-18 15:11:21,621:288:INFO:Computation of the low complexity region started for a chunk 0
lq_mask:2020-06-18 15:11:22,791:111:INFO:New job was submitted: in->fne_qc_out_dir/analysis/tmp_0.fastq, out->fne_qc_out_dir/analysis/tmp_0.txt
longQC:2020-06-18 15:11:22,792:293:INFO:Adapter search is starting for a chunk 0.
longQC:2020-06-18 15:11:22,792:309:INFO:Computation of the GC fraction started for a chunk 0
lq_utils:2020-06-18 15:11:25,979:380:INFO:list for subsample is not initialized. Initializing now.
lq_adapt:2020-06-18 15:11:30,810:76:INFO:7038 reads were skipped due to their short lengths.
lq_adapt:2020-06-18 15:11:30,811:96:INFO:Adapter Sequence: GTTTTCGCATTTATCGTGAAACGCTTTCGCGTTTTTCGTGCGCCGCTTCA, max identity:-1.000000 and the number of trimmed reads: 0
longQC:2020-06-18 15:11:45,124:320:INFO:Adapter search has done for a chunk 0.
longQC:2020-06-18 15:11:45,125:324:INFO:subsample finished for chunk 0.
Traceback (most recent call last):
File "/home/m.sevi/software/LongQC/longQC.py", line 920, in <module>
main(args)
File "/home/m.sevi/software/LongQC/longQC.py", line 63, in main
args.handler(args)
File "/home/m.sevi/software/LongQC/longQC.py", line 335, in command_sample
if tuple_3:
UnboundLocalError: local variable 'tuple_3' referenced before assignment
Below find information about my environment:
_libgcc_mutex 0.1 main
blas 1.0 mkl
bzip2 1.0.8 h7b6447c_0
ca-certificates 2020.1.1 0 anaconda
certifi 2020.4.5.2 py37_0 anaconda
cycler 0.10.0 py_2 conda-forge
dbus 1.13.6 he372182_0 conda-forge
edlib 1.2.3 h2d50403_1 bioconda
expat 2.2.9 he1b5a44_2 conda-forge
fontconfig 2.13.1 he4413a7_1000 conda-forge
freetype 2.10.2 he06d7ca_0 conda-forge
glib 2.63.1 h3eb4bd4_1
gst-plugins-base 1.14.0 hbbd80ab_1
gstreamer 1.14.0 hb31296c_0
h5py 2.10.0 py37h7918eee_0
hdf5 1.10.4 hb1b8bf9_0
icu 58.2 hf484d3e_1000 conda-forge
intel-openmp 2019.4 243
jinja2 2.11.2 py_0 anaconda
joblib 0.15.1 py_0 anaconda
jpeg 9d h516909a_0 conda-forge
kiwisolver 1.2.0 py37h99015e2_0 conda-forge
krb5 1.17.1 h173b8e3_0
ld_impl_linux-64 2.33.1 h53a641e_7
libcurl 7.69.1 h20c2e04_0
libdeflate 1.6 h516909a_0 conda-forge
libedit 3.1.20181209 hc058e9b_0
libffi 3.3 he6710b0_1
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.3.0 hdf63c60_0
libpng 1.6.37 hed695b0_1 conda-forge
libssh2 1.9.0 h1ba5d50_1
libstdcxx-ng 9.1.0 hdf63c60_0
libuuid 2.32.1 h14c3975_1000 conda-forge
libxcb 1.13 h14c3975_1002 conda-forge
libxml2 2.9.10 he19cac6_1
markupsafe 1.1.1 py37h7b6447c_0 anaconda
matplotlib 3.2.1 0 conda-forge
matplotlib-base 3.2.1 py37hef1b27d_0
mkl 2019.4 243
mkl-service 2.3.0 py37he904b0f_0
mkl_fft 1.0.14 py37hd81dba3_0 r
mkl_random 1.0.4 py37hd81dba3_0 r
ncurses 6.2 he6710b0_1
numpy 1.17.0 py37h7e9f1db_0 r
numpy-base 1.17.0 py37hde5b4d6_0 r
openssl 1.1.1g h7b6447c_0 anaconda
pandas 1.0.4 py37h0573a6f_0 anaconda
pcre 8.44 he1b5a44_0 conda-forge
pip 20.1.1 py37_1
pthread-stubs 0.4 h14c3975_1001 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyqt 5.9.2 py37hcca6a23_4 conda-forge
pysam 0.16.0.1 py37hc501bad_0 bioconda
python 3.7.7 hcff3b4d_5
python-dateutil 2.8.1 py_0 anaconda
python-edlib 1.3.8.post1 py37h99015e2_1 bioconda
python_abi 3.7 1_cp37m conda-forge
pytz 2020.1 py_0 anaconda
qt 5.9.7 h5867ecd_1
readline 8.0 h7b6447c_0
scikit-learn 0.22.1 py37hd81dba3_0 anaconda
scipy 1.4.1 py37h0b6359f_0 anaconda
setuptools 47.3.0 py37_0
sip 4.19.8 py37hf484d3e_0
six 1.15.0 py_0
sqlite 3.31.1 h62c20be_1
tk 8.6.8 hbc83047_0
tornado 6.0.4 py37h8f50634_1 conda-forge
wheel 0.34.2 py37_0
xorg-libxau 1.0.9 h14c3975_0 conda-forge
xorg-libxdmcp 1.1.3 h516909a_0 conda-forge
xz 5.2.5 h7b6447c_0
zlib 1.2.11 h7b6447c_3
I'd appreciate any feedback.
Thank you,
Maria
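`UnboundLocalError` means that `tuple_3` is assigned only on some code paths before `if tuple_3:` is reached. Which path was skipped in this run cannot be told from the log alone. The general pattern and its usual fix look like this; `search_adapter`, `summarize`, and their arguments are hypothetical stand-ins, not LongQC functions.

```python
def search_adapter(reads, adapter):
    """Hypothetical stand-in: return (adapter, number of reads containing it)."""
    return adapter, sum(adapter in r for r in reads)


def summarize(reads, adp5=None, adp3=None):
    # Bind both names up front so later checks never hit an unbound local.
    tuple_5 = None
    tuple_3 = None
    if adp5:
        tuple_5 = search_adapter(reads, adp5)
    if adp3:
        tuple_3 = search_adapter(reads, adp3)
    found = []
    if tuple_5:
        found.append(("5'", tuple_5[1]))
    if tuple_3:  # safe even when no 3' adapter was given
        found.append(("3'", tuple_3[1]))
    return found


result = summarize(["AAGTTTTCGC", "CCCC"], adp5="GTTTTCG")
```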
Hi @yfukasawa ,
Thanks for this tool kit. It is definitely very flexible and important. I am wondering whether you have fully implemented the runqc module within LongQC, or is it still ongoing? I can't seem to find a proper tutorial/readme on this process, and I am facing some issues when running it in Python 3.7 (accessing the scripts manually through a cloned repo). If the implementation of runqc is complete, I am happy to show the errors so that, hopefully, they can be troubleshot.
Thanks heaps,
Shani.
Hi @yfukasawa
I am having trouble installing the Docker image of LongQC; it shows an error related to the glibc version.
error:
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Solving environment: ...working... failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
UnsatisfiableError: The following specifications were found to be incompatible with each other:
Output in format: Requested package -> Available versions
The following specifications were found to be incompatible with your system:
Your installed version is: 2.28
Is there any option other than the Dockerfile to install LongQC? Or is there a way I can fix this bug?
Thanks in advance
Saraswati Awasthi
I received processed nanopore data and wanted to see the overall quality of the final dataset.
However I have the feeling that the adapters are already removed from the dataset.
Is there a way to run LongQC without adapter information?
I read the longQC.py script, there are some adapter sequences. For example,
if args.preset:
    p = args.preset
    if p == 'pb-rs2':
        args.pb = True
        args.adp5 = "ATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGAT" if not args.adp5 else args.adp5
        args.adp3 = "ATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGAT" if not args.adp3 else args.adp3
        minimap2_params = "-Y -l 0 -q 160"
        minimap2_med_score_threshold = 80
        if args.short:
            minimap2_med_score_threshold_short = 60
    elif p == 'pb-sequel':
        args.pb = True
        args.sequel = True
        args.adp5 = "ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT" if not args.adp5 else args.adp5
        args.adp3 = "ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT" if not args.adp3 else args.adp3
        minimap2_params = "-Y -l 0 -q 160"
        minimap2_med_score_threshold = 80
        if args.short:
            minimap2_med_score_threshold_short = 60
For example, I have done a 'pb-sequel' sequencing run, and I don't know what the adapter sequence is. Are the adapter sequences in the longQC.py script correct for my case? How can I judge it?
How did the author find the adapter sequences for PacBio Sequel? Could you point us to a reference?
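One thing that can be checked directly from the two presets quoted above: the pb-sequel adapter string is the reverse complement of the pb-rs2 one, so both describe the same sequence read from opposite strands. This does not by itself confirm that either is right for a given library. The helper name below is made up for the check.

```python
# Translation table for DNA complements (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Adapter strings copied from the preset code quoted in this post.
RS2 = "ATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGAT"
SEQUEL = "ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT"

same_adapter = reverse_complement(RS2) == SEQUEL
print("pb-sequel is the reverse complement of pb-rs2:", same_adapter)
```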
Good morning,
I ran LongQC on ONT data with the following command:
srun longQC.py sampleqc -x ont-rapid -s ${sample} -p 30 -o ${folder_sampleqc}/${sample}_500X_rapid ${long_read}
Some of my samples succeeded, but others seem to crash and I don't understand why (the error is incomprehensible to me, I'm a biologist; see below).
I asked the cluster manager @lecorguille, and he thinks it is a tool-dependent issue rather than an installation one.
Could you help us resolve it?
Thanks a lot,
Regards,
Chloé
lq_coverage:2021-03-16 18:57:46,714:374:INFO:Coordinates of coverage analysis were parsed.
Traceback (most recent call last):
File "/opt/LongQC/longQC.py", line 933, in <module>
main(args)
File "/opt/LongQC/longQC.py", line 63, in main
args.handler(args)
File "/opt/LongQC/longQC.py", line 598, in command_sample
lc.plot_length_vs_coverage(fig_path_cl)
File "/opt/LongQC/lq_coverage.py", line 461, in plot_length_vs_coverage
self.__check_outlier_coverage(interval)
File "/opt/LongQC/lq_coverage.py", line 482, in __check_outlier_coverage
meds = stats['median'][np.where(stats['size']>=LqCoverage.LENGTH_BIN_THRESHOLD)[0]]
File "/opt/conda/lib/python3.8/site-packages/pandas/core/series.py", line 908, in __getitem__
return self._get_with(key)
File "/opt/conda/lib/python3.8/site-packages/pandas/core/series.py", line 943, in _get_with
return self.loc[key]
File "/opt/conda/lib/python3.8/site-packages/pandas/core/indexing.py", line 879, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "/opt/conda/lib/python3.8/site-packages/pandas/core/indexing.py", line 1099, in _getitem_axis
return self._getitem_iterable(key, axis=axis)
File "/opt/conda/lib/python3.8/site-packages/pandas/core/indexing.py", line 1037, in _getitem_iterable
keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
File "/opt/conda/lib/python3.8/site-packages/pandas/core/indexing.py", line 1254, in _get_listlike_indexer
self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
File "/opt/conda/lib/python3.8/site-packages/pandas/core/indexing.py", line 1315, in _validate_read_indexer
raise KeyError(
KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported. The following labels were missing: Int64Index([2, 3], dtype='int64', name='Binned read length'). See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike"
srun: error: cpu-node-18: task 0: Exited with exit code 1
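The failing expression passes the output of `np.where(...)[0]`, which is a list of positions, to `stats['median'][...]`. On a Series whose index holds integer labels, pandas treats such a list as labels, and newer pandas versions raise `KeyError` when a label is absent. That would explain why only some samples fail: those whose 'Binned read length' index has gaps. If this reading is right, positional access with `.iloc[...]` or a boolean mask such as `stats['median'][stats['size'] >= threshold]` should sidestep it. The mismatch can be modelled in plain Python; the bin labels, values, and threshold below are invented.

```python
# A tiny model of a Series: index labels with gaps, plus parallel values.
labels = [0, 1, 4, 5]             # 'Binned read length' labels (invented)
sizes = [3, 2, 40, 55]            # reads per bin (invented)
medians = [7.0, 9.5, 30.1, 31.4]  # median coverage per bin (invented)
THRESHOLD = 30                    # hypothetical minimum bin size

# What np.where(cond)[0] yields: positions, not labels.
positions = [i for i, n in enumerate(sizes) if n >= THRESHOLD]

# Label-based lookup with those numbers fails when a label is absent.
by_label = dict(zip(labels, medians))
missing = [p for p in positions if p not in by_label]

# Position-based lookup (the .iloc behaviour) always works.
selected = [medians[p] for p in positions]

print("positions:", positions)
print("labels that would be reported missing:", missing)
print("values selected by position:", selected)
```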
Good afternoon,
I'm trying to run LongQC on a PacBio Sequel library in FASTQ format.
The software starts just fine, but then crashes with the following error:
longQC:2020-06-08 15:02:23,100:573:INFO:Generating coverage related plots...
lq_coverage:2020-06-08 15:02:23,199:120:INFO:Estimating coverage distribution..
Traceback (most recent call last):
File "/PATH/TO/Andrea/LongQC/longQC.py", line 920, in <module>
main(args)
File "/PATH/TO/Andrea/LongQC/longQC.py", line 63, in main
args.handler(args)
File "/PATH/TO/Andrea/LongQC/longQC.py", line 577, in command_sample
lc = LqCoverage(cov_path, isTranscript=args.transcript, control_filtering=pb_control)
File "/PATH/TO/Andrea/LongQC/lq_coverage.py", line 121, in __init__
self.__est_coverage()
File "/PATH/TO/Andrea/LongQC/lq_coverage.py", line 220, in __est_coverage
model_main_comp = self.__est_coverage_dist_gmm(k_i=2)
File "/PATH/TO/Andrea/LongQC/lq_coverage.py", line 545, in __est_coverage_dist_gmm
nonzeros = self.df[LqCoverage.COVERAGE_COLUMN].values[np.nonzero(self.df[LqCoverage.COVERAGE_COLUMN])]
File "<__array_function__ internals>", line 6, in nonzero
File "/PATH/TO/Andrea/myanaconda/longqc/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 1896, in nonzero
return _wrapfunc(a, 'nonzero')
File "/PATH/TO/Andrea/myanaconda/longqc/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 58, in _wrapfunc
return _wrapit(obj, method, *args, **kwds)
File "/PATH/TO/Andrea/myanaconda/longqc/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 51, in _wrapit
result = wrap(result)
File "/PATH/TO/Andrea/myanaconda/longqc/lib/python3.7/site-packages/pandas/core/generic.py", line 1918, in __array_wrap__
return self._constructor(result, **d).__finalize__(self)
File "/PATH/TO/Andrea/myanaconda/longqc/lib/python3.7/site-packages/pandas/core/series.py", line 292, in __init__
f"Length of passed values is {len(data)}, "
ValueError: Length of passed values is 1, index implies 5000.
Not sure if the problem is with the data or with the dependencies.
Thanks for the help
Andrea
Hi,
Thanks for the great work.
I experience a similar issue as described here #28 and here #34.
longQC:2021-10-27 08:06:14,443:598:INFO:Generating coverage related plots...
Traceback (most recent call last):
File "/storage/home/hcoda1/3/apfennig3/LongQC/longQC.py", line 956, in <module>
main(args)
File "/storage/home/hcoda1/3/apfennig3/LongQC/longQC.py", line 62, in main
args.handler(args)
File "/storage/home/hcoda1/3/apfennig3/LongQC/longQC.py", line 602, in command_sample
lc = LqCoverage(cov_path, isTranscript=args.transcript, control_filtering=pb_control)
File "/storage/home/hcoda1/3/apfennig3/LongQC/lq_coverage.py", line 88, in __init__
self.df = pd.read_table(table_path, sep='\t', header=None, dtype={3: str, 4: str})
File "/storage/home/hcoda1/3/apfennig3/.conda/envs/GBL/lib/python3.9/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/storage/home/hcoda1/3/apfennig3/.conda/envs/GBL/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 683, in read_table
return _read(filepath_or_buffer, kwds)
File "/storage/home/hcoda1/3/apfennig3/.conda/envs/GBL/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 482, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/storage/home/hcoda1/3/apfennig3/.conda/envs/GBL/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 811, in __init__
self._engine = self._make_engine(self.engine)
File "/storage/home/hcoda1/3/apfennig3/.conda/envs/GBL/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1040, in _make_engine
return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
File "/storage/home/hcoda1/3/apfennig3/.conda/envs/GBL/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 69, in __init__
self._reader = parsers.TextReader(self.handles.handle, **kwds)
File "pandas/_libs/parsers.pyx", line 549, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
However, I don't think it's a memory issue. I already reduced the index size to 100M. The peak RSS is 6.7G, and 22.7G during the spiked-in control step, which seems to run through normally. I requested 64G of RAM, which is why I don't think memory is the issue here. This is the command I used to execute the pipeline:
python ${home_dir}LongQC/longQC.py sampleqc -o ${home_dir}scratch/QC/ -i 100M -x pb-sequel --sample_name gbl -m 1 -p 64 ${home_dir}scratch/gbl.subreads.bam
The coverage_out.txt file is empty, causing the error. I attached the coverage_err.txt file, the log file, and the files corresponding to the spiked-in control:
coverage_err_gbl.txt
qc.log
spiked_in_control_gbl.txt
spiked_in_control_gbl_stderr.txt
Any thoughts on this?
Thanks,
Aaron
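Since the empty coverage_out.txt is what triggers `EmptyDataError`, a check between the overlap step and the plotting step would surface the real failure earlier. A minimal sketch, assuming the output and stderr files sit side by side as the file names in this post suggest; the function name and message wording are hypothetical.

```python
import os
import tempfile


def check_coverage_output(out_path, err_path, tail_lines=5):
    """Return None if out_path has content, else a message with the stderr tail."""
    if os.path.exists(out_path) and os.path.getsize(out_path) > 0:
        return None
    tail = []
    if os.path.exists(err_path):
        with open(err_path, "r", errors="replace") as fh:
            tail = fh.read().splitlines()[-tail_lines:]
    return "coverage output is empty; last stderr lines:\n" + "\n".join(tail)


# Demo with throwaway files (contents are fabricated for the example).
with tempfile.TemporaryDirectory() as tmp:
    out_file = os.path.join(tmp, "coverage_out.txt")
    err_file = os.path.join(tmp, "coverage_err.txt")
    open(out_file, "w").close()  # empty, as in the report
    with open(err_file, "w") as fh:
        fh.write("line 1\nline 2\n")
    message = check_coverage_output(out_file, err_file)
```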
Hi, as stated on the GitHub page, for ONT data, there are two optional kits to choose from: 1D ligation and rapid sequencing kit. But as I understand from the ONT document here: https://nanoporetech.com/sites/default/files/s3/Product_brochure_Final_July_2018.pdf (pages 4 to 7 about available kits for library preparation), both are kits for DNA library preparation. But how about RNA? We actually used cDNA-PCR Sequencing Kit in our project, which is a type of RNA library.
hi, @yfukasawa
When I tried to run longQC, it gave me the following error:
python ~/LongQC/longQC.py -h
File "~/longQC.py", line 255
le.exec(*le_args, out=cov_path, err=cov_path_e)
^
SyntaxError: invalid syntax
It seems that "exec" is a keyword in Python; how can I avoid this?
Thank you,
Xiucz.
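`exec` is a reserved word in Python 2 but an ordinary function name in Python 3, so a `SyntaxError` at `le.exec(` usually means the script was started with a Python 2 interpreter. That the `python` command here resolves to Python 2 is an assumption; `python --version` would confirm it. A script can also guard itself, although a guard like the one below only helps in a small launcher file, because a syntax error is raised before any code in the offending file runs. The function name is hypothetical.

```python
import sys


def require_python3(version_info=sys.version_info):
    """Return an error message if the interpreter is older than Python 3."""
    if version_info[0] < 3:
        return "Python 3 is required, found %d.%d" % (version_info[0], version_info[1])
    return None


problem = require_python3()
if problem:
    sys.exit(problem)
```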
Hi, Yoshinori. First thanks for developing this nice tool.
You explained that the second column of the 'longqc_sdust.txt' table is 'the number of bases masked (MDUST)'. I am wondering how you define a masked base. I ask because I didn't find any lower-case bases in my reads.
Hi @yfukasawa ,
Thanks for this useful tool kit. I was running LongQC on my ONT direct cDNA sequencing data using the following script.
python longQC.py sampleqc -x ont-ligation -p 4 -o $out/barcode01 $input/barcode01.fq.gz
The analysis/subsample.fastq file was successfully generated, together with the minimap coverage error and output txt files. However, the figs folder is empty, and I got the following error:
lq_gcfrac:2020-07-20 15:02:39,582:58:INFO:Kernel density estimation done for read GC composition
Traceback (most recent call last):
File "longQC.py", line 932, in <module>
main(args)
File "longQC.py", line 62, in main
args.handler(args)
File "longQC.py", line 435, in command_sample
gc_read_mean, gc_read_sd = lg.plot_unmasked_gc_frac(fp=fig_path_gc)
File "/stornext/Home/data/allstaff/d/dong.x/Programs/LongQC/lq_gcfrac.py", line 60, in plot_unmasked_gc_frac
plt.hist(self.c_frac, alpha=0.3, bins=np.arange(min(self.c_frac), max(self.c_frac) + b_width, b_width), color='red', density=True)
ValueError: min() arg is an empty sequence
Could you please tell me how to fix the problem?
Thanks,
Xueyi
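`min()` fails because `self.c_frac` is empty at that point; what data it is meant to hold, and why none was collected in this run, is not visible in the log. Independent of the cause, the bin computation can be guarded. The sketch roughly mirrors the `np.arange(min, max + b_width, b_width)` expression from the traceback in plain Python; the function name is made up.

```python
def bin_edges(values, b_width):
    """Edges from min(values) up to at least max(values); [] if values is empty."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    edges = [lo]
    while edges[-1] < hi:
        # Multiply rather than accumulate to limit floating-point drift.
        edges.append(lo + len(edges) * b_width)
    return edges


empty_edges = bin_edges([], 0.05)
some_edges = bin_edges([0.40, 0.52], 0.05)
```

A caller could then skip the histogram for that data set whenever the returned list is empty, rather than aborting the whole report.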
Hi, I am trying to run longQC.py with the following command. However, it shows the following error. Please advise.
python longQC.py sampleqc -x pb-rs2 -o /longqc/ merge.fastq.gz
error:
Traceback (most recent call last):
File "/media/bmaurice/Data2/Hybrid_assembly_virus/DMV10_IBV/longqc/LongQC/longQC.py", line 20, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
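The interpreter that ran longQC.py cannot see pandas, which often means the conda environment holding the dependencies was not active. Here is a quick way to see which imports are missing from the current interpreter. The module list is taken from the packages that appear in the tracebacks and the environment listing on this page and may not be complete.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# Import names (not package names): scikit-learn is imported as sklearn.
wanted = ["numpy", "scipy", "pandas", "matplotlib", "sklearn",
          "jinja2", "h5py", "pysam", "edlib"]
missing = missing_modules(wanted)
print("missing:", ", ".join(missing) or "none")
```

Running it with the same `python` that is used to start longQC.py shows whether the problem is the environment rather than LongQC itself.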
Hi, I'm new to this, but I'm currently working with fast5 files generated by a MinION.
While trying to install LongQC on my Mac, when I run ´LongQC/minimap2-coverage && make´ I get the following error:
´minimap2-coverage.c:565:9: error: implicit declaration of function 'compute_reliable_region' is invalid in C99
[-Werror,-Wimplicit-function-declaration]
compute_reliable_region(v, fopt.min_coverage, &regs, &mregs);
^
1 error generated.
make: *** [minimap2-coverage.o] Error 1´
Can anyone help me with this?
Thank you
Hi, I have no problem running LongQC with Docker with the command in the Doc:
docker run -it \
-v YOUR_INPUT_DIR:/input \
-v YOUR_OUTPUT_DIR:/output \
longqc sampleqc \
-x pb-sequel \ **specify a preset and change accordingly.**
-p $(nproc) \ **number of process/cores, this uses all of your cores. change accordingly.**
-o /output/YOUR_SAMPLE_NAME \ **keep /output as this is binded.**
/input/YOUR_INPUT_READ_FILE **keep /input as this is binded.**
But because of Docker's high privilege requirement, it is not allowed on our server, whereas Singularity is supported. So I converted LongQC's Docker image into a Singularity file, longqc.sif, but I'm having problems getting it to run with Singularity. Do you have any suggestions on how to run this longqc.sif with Singularity? Thank you in advance!
I'm running LongQC in my Snakemake Nanopore genome pipeline on a cluster with no admin rights.
rule LongQC:
input:
'filtered_reads/{sample}.fastq.gz',
output:
'QC_results/{sample}',
threads: t
shell:
'''
python ~/SCRATCH_NOBAK/workplace_5z/LongQC/longQC.py sampleqc -x ont-ligation --ncpu {threads} -o {output} {input}
'''
It runs fine, with no error from minimap, until the end.
I attached the output (without subsample.fastq, which is too big). I think only the .html report is missing.
Strangely, it worked twice when I tested it after make:
python ~/SCRATCH_NOBAK/workplace_5z/LongQC/longQC.py sampleqc -x ont-ligation --ncpu 100 -o ~/SCRATCH_NOBAK/workplace_5z/LongQC/hybrid/A22 ~/SCRATCH_NOBAK/workplace_5z/hybrid/reads/long/A22.fastq.gz
Error
longQC:2021-04-07 12:55:48,036:489:INFO:Genarated the sample read length plot.
longQC:2021-04-07 12:55:48,037:491:INFO:Throughput: 500003168
longQC:2021-04-07 12:55:48,037:492:INFO:Length of longest read: 69248
longQC:2021-04-07 12:55:48,037:493:INFO:The number of reads: 57175
longQC:2021-04-07 12:55:48,038:524:INFO:Calculating overlaps of sampled reads...
longQC:2021-04-07 12:55:58,047:524:INFO:Calculating overlaps of sampled reads...
longQC:2021-04-07 12:56:08,055:524:INFO:Calculating overlaps of sampled reads...
longQC:2021-04-07 12:56:18,063:524:INFO:Calculating overlaps of sampled reads...
longQC:2021-04-07 12:56:28,071:524:INFO:Calculating overlaps of sampled reads...
longQC:2021-04-07 12:56:38,079:524:INFO:Calculating overlaps of sampled reads...
longQC:2021-04-07 12:56:48,087:524:INFO:Calculating overlaps of sampled reads...
longQC:2021-04-07 12:56:58,095:522:INFO:Process 66315 for ~/SCRATCH_NOBAK/workplace_5z/LongQC/minimap2-coverage/minimap2-coverage terminated.
longQC:2021-04-07 12:56:58,096:526:INFO:Overlap computation finished.
longQC:2021-04-07 12:57:08,103:598:INFO:Generating coverage related plots...
lq_coverage:2021-04-07 12:57:08,118:121:INFO:Estimating coverage distribution..
lq_coverage:2021-04-07 12:57:08,186:571:DEBUG:GaussianMixture(n_components=2)
lq_coverage:2021-04-07 12:57:08,192:576:INFO:The order of componens 0.009008786180148729 0.001502131247968907
lq_coverage:2021-04-07 12:57:08,192:577:INFO:Means of components: 77.92692972768067 52.171159446345385 k=2
lq_coverage:2021-04-07 12:57:08,192:578:INFO:Covariances of components: 76.03986535434747 209.68414862634 k=2
lq_coverage:2021-04-07 12:57:08,229:123:INFO:Estimation of coverage distribution finished.
lq_coverage:2021-04-07 12:57:08,795:392:INFO:Coordinates of coverage analysis were parsed.
Traceback (most recent call last):
File "~/SCRATCH_NOBAK/workplace_5z/LongQC/longQC.py", line 956, in <module>
main(args)
File "~/SCRATCH_NOBAK/workplace_5z/LongQC/longQC.py", line 62, in main
args.handler(args)
File "~/SCRATCH_NOBAK/workplace_5z/LongQC/longQC.py", line 611, in command_sample
lc.plot_length_vs_coverage(fig_path_cl)
File "/scratch/blumenscheitc/workplace_5z/LongQC/lq_coverage.py", line 479, in plot_length_vs_coverage
self.__check_outlier_coverage(interval)
File "/scratch/blumenscheitc/workplace_5z/LongQC/lq_coverage.py", line 500, in __check_outlier_coverage
meds = stats['median'][np.where(stats['size']>=LqCoverage.LENGTH_BIN_THRESHOLD)[0]]
File "~/.conda/envs/bio39/lib/python3.8/site-packages/pandas/core/series.py", line 877, in __getitem__
return self._get_with(key)
File "~/.conda/envs/bio39/lib/python3.8/site-packages/pandas/core/series.py", line 912, in _get_with
return self.loc[key]
File "~/.conda/envs/bio39/lib/python3.8/site-packages/pandas/core/indexing.py", line 895, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "~/.conda/envs/bio39/lib/python3.8/site-packages/pandas/core/indexing.py", line 1113, in _getitem_axis
return self._getitem_iterable(key, axis=axis)
File "~/.conda/envs/bio39/lib/python3.8/site-packages/pandas/core/indexing.py", line 1053, in _getitem_iterable
keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
File "~/.conda/envs/bio39/lib/python3.8/site-packages/pandas/core/indexing.py", line 1266, in _get_listlike_indexer
self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
File "~/.conda/envs/bio39/lib/python3.8/site-packages/pandas/core/indexing.py", line 1321, in _validate_read_indexer
raise KeyError(
KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported. The following labels were missing: Int64Index([0], dtype='int64', name='Binned read length'). See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike"
Error in job LongQC while creating output file QC_results/A22.
RuleException:
CalledProcessError in line 33 of /scratch/blumenscheitc/workplace_5z/long_reads_preprocess/Snakefile:
Command '
python ~/SCRATCH_NOBAK/workplace_5z/LongQC/longQC.py sampleqc -x ont-ligation --ncpu 100 -o QC_results/A22 filtered_reads/A22.fastq.gz
' returned non-zero exit status 1
File "/scratch/blumenscheitc/workplace_5z/long_reads_preprocess/Snakefile", line 33, in __rule_LongQC
File "/usr/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Removing output files of failed job LongQC since they might be corrupted:
QC_results/A22
Skipped removing non-empty directory QC_results/A22
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
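The `KeyError` above is consistent with positional indices from `np.where` being passed to label-based `[]` indexing on a Series whose index is the binned read length. A minimal sketch of the difference follows, using made-up numbers and an assumed `LENGTH_BIN_THRESHOLD` value. It illustrates the pandas behaviour only and is not LongQC's actual fix.

```python
import numpy as np
import pandas as pd

# Hypothetical per-bin statistics; the index holds bin labels, not positions.
stats = pd.DataFrame(
    {"median": [5.0, 6.0, 7.0], "size": [30, 2, 45]},
    index=pd.Index([500, 1000, 1500], name="Binned read length"),
)
LENGTH_BIN_THRESHOLD = 10  # assumed value, for illustration only

mask = stats["size"] >= LENGTH_BIN_THRESHOLD

# np.where returns positions (here 0 and 2), so select by position with .iloc ...
positions = np.where(mask)[0]
meds_by_position = stats["median"].iloc[positions]

# ... or skip np.where and select with the boolean mask directly.
meds_by_mask = stats["median"].loc[mask]
```

Passing `positions` to `stats["median"][...]` instead would look up the labels 0 and 2, which do not exist in this index, and recent pandas versions raise a `KeyError` in that case.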
Hello,
Thanks for making this awesome tool! I recently installed it and am trying to use the sampleqc
tool. I am running with the below call:
longQC.py sampleqc -p 16 -m 2 -d -x pb-hifi -s <sample_name> -o <out_dir> <sample_name.bam>
The tool appears to have run without any errors, but some of the output is not what I expected.
The fig_longQC_sampleqc_average_qv_<sample_name>.png
and fig_longQC_sampleqc_olp_qv_<sample_name>.png
files appear as below:
Is this expected for a given error, or do I need to review my installation? No error messages were created.
Many thanks in advance!
Hi Yoshinori,
I am trying to trim adaptors from ONT reads:
Do we just use the flags --adapter_5 ADP5 --adapter_3 ADP3?
I have used this:
for s in $(cat samples_ont.txt);do
longQC.py sampleqc -x ont-rapid -o ${today}OntQC_results/${s} LRfastqs/${s}.fastq.gz --adapter_5 ADP5 --adapter_3 ADP3 --trim_output ${today}OntQC_results --ncpu ${SLURM_CPUS_PER_TASK}
done
cd LongQC-1.2.0c/minimap2-coverage && make
cc -c -g -O2 -Wall -Wc++-compat -DHAVE_KALLOC minimap2-coverage.c -o minimap2-coverage.o
minimap2-coverage.c:1:10: fatal error: stdlib.h: No such file or directory
1 | #include <stdlib.h>
| ^~~~~~~~~~
compilation terminated.
make: *** [Makefile:29: minimap2-coverage.o] Error 1
I'm facing this issue. Kindly help me sort this out. Thanks.
Hello @yfukasawa,
I started with the same error as @PerisD. After adding -p 4 it actually ran (got some stats) and got an error that looks like this:
longQC:2020-07-07 08:11:12,923:475:INFO:Genarated the sample read length plot.
longQC:2020-07-07 08:11:12,923:477:INFO:Throughput: 737228
longQC:2020-07-07 08:11:12,923:478:INFO:Length of longest read: 25168
longQC:2020-07-07 08:11:12,923:479:INFO:The number of reads: 121
longQC:2020-07-07 08:11:12,924:508:INFO:Process 450 for /home/user/LongQC/minimap2_mod/minimap2-coverage terminated.
longQC:2020-07-07 08:11:12,924:512:INFO:Overlap computation finished.
longQC:2020-07-07 08:11:22,933:584:INFO:Generating coverage related plots...
lq_coverage:2020-07-07 08:11:22,938:120:INFO:Estimating coverage distribution..
/opt/conda/lib/python3.7/site-packages/pandas/core/ops/array_ops.py:253: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
res_values = method(rvalues)
Traceback (most recent call last):
File "longQC.py", line 932, in <module>
main(args)
File "longQC.py", line 62, in main
args.handler(args)
File "longQC.py", line 588, in command_sample
lc = LqCoverage(cov_path, isTranscript=args.transcript, control_filtering=pb_control)
File "/home/user/LongQC/lq_coverage.py", line 121, in __init__
self.__est_coverage()
File "/home/user/LongQC/lq_coverage.py", line 220, in __est_coverage
model_main_comp = self.__est_coverage_dist_gmm(k_i=2)
File "/home/user/LongQC/lq_coverage.py", line 546, in __est_coverage_dist_gmm
m_f = mixture.GaussianMixture(n_components=k).fit(nonzeros[nonzeros < th_per].reshape(-1,1),1)
File "/opt/conda/lib/python3.7/site-packages/sklearn/mixture/_base.py", line 193, in fit
self.fit_predict(X, y)
File "/opt/conda/lib/python3.7/site-packages/sklearn/mixture/_base.py", line 220, in fit_predict
X = _check_X(X, self.n_components, ensure_min_samples=2)
File "/opt/conda/lib/python3.7/site-packages/sklearn/mixture/_base.py", line 53, in _check_X
ensure_min_samples=ensure_min_samples)
File "/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py", line 73, in inner_f
return f(**kwargs)
File "/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py", line 654, in check_array
context))
ValueError: Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 2 is required.
What went wrong and how can I fix it?
Thanks
Asta
Originally posted by @astulaaa in #3 (comment)
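The `ValueError` says the array handed to `GaussianMixture.fit` had 0 rows while scikit-learn requires at least 2. With only 121 reads in the log, it seems plausible that the `nonzeros[nonzeros < th_per]` filter left nothing to fit. A minimal sketch of a guard that a caller could apply before fitting; the function name and parameters are hypothetical and not part of LongQC.

```python
def values_for_mixture_fit(coverages, upper_threshold, min_samples=2):
    """Return nonzero coverages below the threshold, or None if too few remain.

    min_samples=2 mirrors the minimum quoted in the error message above.
    """
    selected = [c for c in coverages if 0 < c < upper_threshold]
    if len(selected) < min_samples:
        return None  # caller should skip the fit and report insufficient data
    return selected


too_sparse = values_for_mixture_fit([0, 0, 0], upper_threshold=10)
usable = values_for_mixture_fit([1, 2, 50], upper_threshold=10)
```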
I am using 1.1.1 version from docker. I have used following command
python longQC.py sampleqc -x ont-ligation -p 6 -d -o /data/out_dir221 /data/Arun1.CCV.fastq
I have aborted the process because it did not finish even after 15 minutes.
My computer has 64 GB RAM and 6 CPU cores
Can you help me to optimize the performance?
While running LongQC I got this error. Can someone tell me what the problem is?
ValueError: truncated quality string in [my path to the fastq file]
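A "truncated quality string" error usually points at a FASTQ record whose quality line does not match its sequence line in length, for example after an interrupted copy or download. A minimal sketch for locating such records, assuming plain 4-line FASTQ records (no wrapped sequences); the function name is made up for this example.

```python
import io


def find_mismatched_records(handle):
    """Yield (record_number, read_name) for records where the sequence and
    quality strings differ in length. Assumes 4 lines per record."""
    record_number = 0
    while True:
        header = handle.readline()
        if not header:
            break
        record_number += 1
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\r\n")
        if len(seq) != len(qual):
            yield record_number, header.strip().lstrip("@")


# Tiny in-memory example: the second record has a shortened quality string.
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIII\n")
bad = list(find_mismatched_records(example))
```

For a real file, the same function can be given `open(path)` or, for gzipped input, `gzip.open(path, "rt")`.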
Hi,
I am having the following issue for a few samples; two other samples completed successfully and generated the HTML file.
Could you please take a look?
Thank you in advance
longQC:2021-09-10 13:56:57,131:598:INFO:Generating coverage related plots...
Traceback (most recent call last):
File "/home/kgagalova/src/LongQC/longQC.py", line 956, in <module>
main(args)
File "/home/kgagalova/src/LongQC/longQC.py", line 62, in main
args.handler(args)
File "/home/kgagalova/src/LongQC/longQC.py", line 602, in command_sample
lc = LqCoverage(cov_path, isTranscript=args.transcript, control_filtering=pb_control)
File "/home/kgagalova/src/LongQC/lq_coverage.py", line 88, in __init__
self.df = pd.read_table(table_path, sep='\t', header=None, dtype={3: str, 4: str})
File "/home/kgagalova/miniconda3/envs/py3.6bis/lib/python3.6/site-packages/pandas/io/parsers.py", line 767, in read_table
return read_csv(**locals())
File "/home/kgagalova/miniconda3/envs/py3.6bis/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/kgagalova/miniconda3/envs/py3.6bis/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/kgagalova/miniconda3/envs/py3.6bis/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/home/kgagalova/miniconda3/envs/py3.6bis/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/kgagalova/miniconda3/envs/py3.6bis/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 540, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
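`EmptyDataError: No columns to parse from file` is what pandas raises when the file passed to `read_table` has no content, so the step that should have written the coverage table probably produced nothing. A minimal sketch for checking an intermediate file before parsing it; the file names here are placeholders, not LongQC's real output names.

```python
import os
import tempfile


def is_empty_table(path):
    """True if the file is missing or has zero bytes."""
    return not os.path.exists(path) or os.path.getsize(path) == 0


with tempfile.TemporaryDirectory() as tmp:
    empty_path = os.path.join(tmp, "coverage_out.txt")  # placeholder name
    open(empty_path, "w").close()

    filled_path = os.path.join(tmp, "filled.txt")
    with open(filled_path, "w") as fh:
        fh.write("read1\t10\t100\n")

    empty_result = is_empty_table(empty_path)
    filled_result = is_empty_table(filled_path)
```

If the check reports an empty file, the log of the preceding overlap step is the more useful place to look than the pandas traceback.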
I am getting signature errors when trying to build the docker image. I'm trying on macOS 11.5.2, Docker Desktop 3.6.0.
% docker build -t longqc .
[+] Building 2.9s (6/11)
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 1.44kB 0.0s
=> [internal] load .dockerignore 0.1s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/continuumio/miniconda3:latest 1.8s
=> CACHED [1/7] FROM docker.io/continuumio/miniconda3@sha256:592a60b95b547f31c11dc6593832e962952e3178f1fa11db37f43a2afe8df8d7 0.0s
=> CACHED https://api.github.com/repos/yfukasawa/longqc/git/refs/heads/minimap2_update 0.0s
=> ERROR [2/7] RUN apt-get clean all && apt-get update && apt-get upgrade -y && apt-get install -y git build-essential libc6-dev zlib1g-dev && apt-get clean && apt-get purge 0.9s
------
> [2/7] RUN apt-get clean all && apt-get update && apt-get upgrade -y && apt-get install -y git build-essential libc6-dev zlib1g-dev && apt-get clean && apt-get purge:
#4 0.566 Get:1 http://security.debian.org/debian-security buster/updates InRelease [65.4 kB]
#4 0.570 Get:2 http://deb.debian.org/debian buster InRelease [122 kB]
#4 0.603 Get:3 http://deb.debian.org/debian buster-updates InRelease [51.9 kB]
#4 0.712 Err:1 http://security.debian.org/debian-security buster/updates InRelease
#4 0.712 At least one invalid signature was encountered.
#4 0.777 Err:2 http://deb.debian.org/debian buster InRelease
#4 0.777 At least one invalid signature was encountered.
#4 0.849 Err:3 http://deb.debian.org/debian buster-updates InRelease
#4 0.849 At least one invalid signature was encountered.
#4 0.858 Reading package lists...
#4 0.871 W: GPG error: http://security.debian.org/debian-security buster/updates InRelease: At least one invalid signature was encountered.
#4 0.871 E: The repository 'http://security.debian.org/debian-security buster/updates InRelease' is not signed.
#4 0.871 W: GPG error: http://deb.debian.org/debian buster InRelease: At least one invalid signature was encountered.
#4 0.871 E: The repository 'http://deb.debian.org/debian buster InRelease' is not signed.
#4 0.871 W: GPG error: http://deb.debian.org/debian buster-updates InRelease: At least one invalid signature was encountered.
#4 0.871 E: The repository 'http://deb.debian.org/debian buster-updates InRelease' is not signed.
------
executor failed running [/bin/sh -c apt-get clean all && apt-get update && apt-get upgrade -y && apt-get install -y git build-essential libc6-dev zlib1g-dev && apt-get clean && apt-get purge]: exit code: 100
Hi, yfukasawa,
I am running LongQC on a batch of my PacBio unaligned BAMs. For the small samples, it runs smoothly. But for one slightly larger BAM file, it takes a very long time and finally reports an OSError. My command is
longqc sampleqc -x pb-sequel -p 8 -o HG002_90pM_read_LongQC ./m64304e_211014_201856.reads.bam
I have pasted the error-related messages below. Could you help me figure out the reason? Additionally, it took close to two weeks to run this sample before hitting this OSError. Can I specify more CPUs (such as -p 32) to speed it up linearly? Thank you in advance.
Wenchao
File "/opt/LongQC/longQC.py", line 63, in main
args.handler(args)
File "/opt/LongQC/longQC.py", line 829, in command_sample
tpl = env.get_template('web_summary.tpl.html')
File "/opt/conda/lib/python3.9/site-packages/jinja2/environment.py", line 997, in get_template
File "/opt/conda/lib/python3.9/site-packages/jinja2/environment.py", line 958, in _load_template
File "/opt/conda/lib/python3.9/site-packages/jinja2/loaders.py", line 125, in load
File "/opt/conda/lib/python3.9/site-packages/jinja2/loaders.py", line 201, in get_source
OSError: [Errno 5] Input/output error
When I run:
python longQC.py sampleqc -x pb-hifi -o longqc ccs.bam
I get a message saying too high non-sense read fraction
Non-sense fraction is 0.647
but if I use:
python longQC.py sampleqc -x pb-sequel -o longqc ccs.bam
Non-sense fraction goes down to 0.282
I am trying to analyse a Sequel II HiFi run. Which result would be the correct one?
It looks like the only thing that changes for minimap2-coverage is the database k-mer size parameter?
HiFi : -k 15
Sequel : -k 12
Hi,
I just built the docker image (version 1.2) , but when I try to run it I get this error:
Traceback (most recent call last):
File "/root/LongQC/longQC.py", line 29, in <module>
import lq_nanopore
File "/root/LongQC/lq_nanopore.py", line 1, in <module>
import os, sys, time, h5py, json
File "/opt/conda/lib/python3.9/site-packages/h5py/__init__.py", line 33, in <module>
from . import version
File "/opt/conda/lib/python3.9/site-packages/h5py/version.py", line 15, in <module>
from . import h5 as _h5
File "h5py/h5.pyx", line 1, in init h5py.h5
ImportError: /opt/conda/lib/python3.9/site-packages/h5py/defs.cpython-39-x86_64-linux-gnu.so: undefined symbol: H5Pset_fapl_ros3
Any idea what the problem might be?
Cheers
column2 | column3 | column4 | column5 | column6 | |
---|---|---|---|---|---|
m64062_47008_47682 | 197 | 674 | 0.292 | -0.000 | 0 |
m64062_0_35905 | 976 | 35905 | 0.027 | -0.000 | 0 |
m64062_0_157 | 7 | 157 | 0.045 | -0.000 | 0 |
Hello @yfukasawa ,
I am running LongQC
on my PacBio data with the following script.
python longQC.py sampleqc -x pb-sequel -d -p 8 -m 2 -o ${OUT_DIR}/RAW/${subdir} ${BAM}
It seems like something is happening (it generates a fastq file in the analysis directory), but in the end, I get the following error:
longQC:2020-06-19 01:31:56,627:348:INFO:Summary table /stornext/General/data/
/long_read_benchmark/LongQC_output/RAW/S1R/longqc_sdust.txt was made.
Traceback (most recent call last):
File "/stornext/General/data/long_read_benchmark/LongQC/longQC.py", line 920, in <module>
main(args)
File "/stornext/General/data/long_read_benchmark/LongQC/longQC.py", line 63, in main
args.handler(args)
File "/stornext/General/data/long_read_benchmark/LongQC/longQC.py", line 351, in command_sample
df_mask = pd.read_table(lm.get_outfile_path(), sep='\t', header=None)
File "/home/.local/lib/python3.7/site-packages/pandas/io/parsers.py", line 676, in parser_f
return _read(filepath_or_buffer, kwds)
File "/home/.local/lib/python3.7/site-packages/pandas/io/parsers.py", line 448, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/.local/lib/python3.7/site-packages/pandas/io/parsers.py", line 880, in __init__
self._make_engine(self.engine)
File "/home/.local/lib/python3.7/site-packages/pandas/io/parsers.py", line 1114, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/.local/lib/python3.7/site-packages/pandas/io/parsers.py", line 1891, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 532, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
Also, the figs
and the analysis/minimap2
folders are empty.
I have loaded minimap2 into my PATH:
echo $PATH: /stornext/System/data/apps/minimap2/minimap2-2.17/bin:/stornext/System/data/apps/python/python-3.7.0/bin:/stornext/System/data/apps/anaconda2/anaconda2-2019.10/condabin:/stornext/System/data/apps/R/R-3.6.1/lib64/R/bin:/stornext/System/data/apps/hdf5/hdf5-1.8.20/bin:/stornext/System/data/apps/java/java-1.8.0_131/bin:/stornext/System/data/apps/scala/scala-2.12.2/bin:/usr/local/bioinf/bin:/stornext/System/data/apps/fastqc/fastqc-0.11.8/bin:/stornext/System/data/apps/samtools/samtools-1.7/bin:/usr/local/bioinf/bin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/users/allstaff/bin/Linux:/home/users/all_users/bin:/home/users/user/.local/bin
What is happening? Can you please explain whether I'm doing something wrong here. It would be great to use your tool for our analysis!
Thanks and I hope to hear from you soon!
Shani.
Hello! I encountered several errors while installing. Please, help to solve it.
Steps to reproduce:
# install anaconda prerequisites
sudo apt install libgl1-mesa-glx libegl1-mesa libxrandr2 libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6
# install anaconda itself
cd ./Downloads
wget https://repo.anaconda.com/archive/Anaconda3-2020.11-Linux-x86_64.sh
bash ./Anaconda3-2020.11-Linux-x86_64.sh
# allow to install to /home/username/anaconda3
# reject running conda init at start-up of system
# add conda to system $PATH variable
export PATH="/home/username/anaconda3/bin:$PATH"
This part works fine.
2. Install prerequisites, as described in README.md file:
# the following line works without problem:
conda install h5py
# the next line returns the 'UnsatisfiableError'
conda install -c bioconda pysam
# the following line works without problem:
conda install -c bioconda edlib
# the next line returns the 'UnsatisfiableError'
conda install -c bioconda python-edlib
The error message for pysam
is:
~$ conda install -c bioconda pysam
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: /
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
UnsatisfiableError: The following specifications were found to be incompatible with each other:
Output in format: Requested package -> Available versions
The error message for python-edlib
is:
~$ conda install -c bioconda python-edlib
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: /
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
UnsatisfiableError: The following specifications were found to be incompatible with each other:
Output in format: Requested package -> Available versions
sudo apt update && sudo apt upgrade -y
sudo apt install -y apt-transport-https ca-certificates curl gnupg
sudo apt install -y docker-ce docker-ce-cli containerd.io
It works. No error here.
2. following instructions from README.md file:
cd /mnt/d/biotools
wget https://raw.githubusercontent.com/yfukasawa/LongQC/master/Dockerfile
sudo docker build -t longqc .
And the last line raises an error with the following log:
$ sudo docker build -t longqc .
Sending build context to Docker daemon 278.5MB
Step 1/26 : FROM continuumio/miniconda3
---> 52daacd3dd5d
Step 2/26 : MAINTAINER Yoshinori Fukasawa <[email protected]>
---> Using cache
---> 9cfac16065e5
Step 3/26 : RUN apt-get clean all && apt-get update && apt-get upgrade -y && apt-get install -y git build-essential libc6-dev zlib1g-dev && apt-get clean && apt-get purge
---> Using cache
---> 793730a77067
Step 4/26 : ENV USER user
---> Using cache
---> 80979e2b1106
Step 5/26 : ENV HOME /home/${USER}
---> Using cache
---> 0155bb0b6c1a
Step 6/26 : LABEL base_image="miniconda3"
---> Using cache
---> 0a7264391517
Step 7/26 : LABEL software="LongQC docker"
---> Using cache
---> 4b21c78cfca5
Step 8/26 : LABEL software.version="1.2"
---> Using cache
---> 90faa00bc325
Step 9/26 : RUN useradd -m ${USER}
---> Using cache
---> 243296d01b69
Step 10/26 : RUN echo "${USER}:test_pass" | chpasswd
---> Using cache
---> 9c07be4b060a
Step 11/26 : ADD https://api.github.com/repos/yfukasawa/longqc/git/refs/heads/minimap2_update version.json
Downloading 373B
---> Using cache
---> e41881c5db42
Step 12/26 : RUN git clone https://github.com/yfukasawa/LongQC.git $HOME/LongQC
---> Using cache
---> e0e39f42f46a
Step 13/26 : RUN cd $HOME/LongQC/minimap2-coverage && make
---> Using cache
---> 99e02c3d2a24
Step 14/26 : RUN conda update -y conda
---> Using cache
---> 2e13d687d385
Step 15/26 : RUN conda install -y numpy
---> Using cache
---> 0f257a341327
Step 16/26 : RUN conda install -y pandas'>=0.24.0'
---> Using cache
---> 6d627a9d63ec
Step 17/26 : RUN conda install -y scipy
---> Using cache
---> 822e6bb47c1f
Step 18/26 : RUN conda install -y jinja2
---> Using cache
---> fc01b5c48375
Step 19/26 : RUN conda install -y h5py
---> Using cache
---> 8652a02b3a1e
Step 20/26 : RUN conda install -y matplotlib'>=2.1.2'
---> Using cache
---> 6dc66c09087f
Step 21/26 : RUN conda install -y scikit-learn
---> Using cache
---> 987f6afa4db1
Step 22/26 : RUN conda install -y -c bioconda pysam
---> Running in b4ee5f7ad191
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Solving environment: ...working... failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
UnsatisfiableError: The following specifications were found to be incompatible with each other:
Output in format: Requested package -> Available versions
The command '/bin/sh -c conda install -y -c bioconda pysam' returned a non-zero code: 1
Please, advise the fix.
System info:
Hello,
I am using docker build to install LongQC, but it occupied more than 1 TB of tmp space (as shown below). Is that right? This is my first time using Docker.
$docker build -t longqc .
$Sending build context to Docker daemon 745G
Thank you
Tian
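`docker build -t longqc .` sends the whole directory named by the last argument (here `.`) to the Docker daemon as the build context, so a 745G context suggests the command was run from a directory that also holds a lot of data. A minimal sketch for measuring a directory before using it as a build context; the helper name is made up for this example.

```python
import os
import tempfile


def directory_size_bytes(root):
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


# A clean context: a fresh directory that contains only the Dockerfile.
with tempfile.TemporaryDirectory() as context_dir:
    with open(os.path.join(context_dir, "Dockerfile"), "w") as fh:
        fh.write("FROM continuumio/miniconda3\n")
    size = directory_size_bytes(context_dir)
```

Running the build from an otherwise empty directory that contains only the Dockerfile keeps the context down to a few bytes.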
Hi,
I am using longQC for my sequel II data and it ran fine in the past but currently, it is giving me errors for some samples and other samples are running fine.
The surprising thing is that all the figures are generated, but it then fails with the following error:
longQC::INFO:Generated coverage related plots.
lq_coverage : WARNING:Mode of lognormal has no value. Do estimation first.
Traceback (most recent call last):
File "longQC.py", line 918, in <module>
main(args)
File "/longQC.py", line 63, in main
args.handler(args)
File "longQC.py", line 591, in command_sample
or (lc.is_low_coverage() and float(lc.get_logn_mode()) < very_low_coverage_threshold)
TypeError: float() argument must be a string or a number, not 'NoneType'
I have tried different very_low_coverage_threshold, went as low as 1.
I am using the following version:
Project Name: longQC.py
Start Date: 2017-10-10
Version: 0.1
This is my command line
python longQC.py sampleqc -o Sample/longQC -x pb-sequel -s Sample Sample.ccs.fastq
Please let me know where I am going wrong.
Thanks for your help!
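The warning "Mode of lognormal has no value" followed by `float()` receiving `None` suggests that `lc.get_logn_mode()` returned `None`. Changing `very_low_coverage_threshold` cannot help in that case, because the failure happens inside `float()` before the comparison is made. A minimal sketch of a guard, written as a stand-alone function with hypothetical names; how a missing estimate should be treated is an assumption here.

```python
def is_very_low_coverage(is_low_coverage, logn_mode, threshold):
    """Evaluate the condition from the traceback without calling float(None).

    Assumption: when no mode could be estimated, we decline to flag the
    sample rather than guess.
    """
    if logn_mode is None:
        return False
    return is_low_coverage and float(logn_mode) < threshold


no_estimate = is_very_low_coverage(True, None, 5.0)
flagged = is_very_low_coverage(True, "2.5", 5.0)
not_flagged = is_very_low_coverage(True, 8.0, 5.0)
```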
Self posting.
Issue happened in qscore and length_vs_coverage plotting while analyzing datasets having very small variance in read length.
The cause of the error was that the number of bins for read length can be very small in such a case (< 5).
Checking the number of bins before plotting must be done.
Y.
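The check described in the note above could look roughly like the sketch below. The minimum of 5 comes from the "< 5" in the note; the fixed-width binning and the function name are assumptions for illustration and may differ from how LongQC bins read lengths.

```python
MIN_BINS = 5  # taken from the "< 5" mentioned in the note


def enough_bins_to_plot(read_lengths, bin_width):
    """Count distinct fixed-width length bins and compare with the minimum."""
    bins = {length // bin_width for length in read_lengths}
    return len(bins) >= MIN_BINS


# Reads with almost identical lengths fall into a single bin.
narrow = enough_bins_to_plot([1000, 1010, 1020], bin_width=500)
# Reads spread over five bins pass the check.
spread = enough_bins_to_plot([100, 600, 1100, 1600, 2100], bin_width=500)
```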
Hi, I have doubts about some options under the -x parameter.
What is the difference between "ont-ligation", "ont-rapid", "ont-1dsq" options for ONT sequencing and which option should I choose for Nanopore PromethION Ultra-long sequencing?
Thank you!
Dear All,
I got an error that I can't find a solution for. Below is the output from where the process breaks, right at the beginning.
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/disk2/Users/FLIR/Software/LongQC/lq_nanopore.py", line 155, in wrapper
c_id = get_channel_id(fast5) -1
File "/disk2/Users/FLIR/Software/LongQC/lq_nanopore.py", line 120, in get_channel_id
return int(f['/UniqueGlobalKey']['channel_id'].attrs['channel_number'])
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/disk2/Software/Venv/venv36/lib/python3.6/site-packages/h5py/_hl/group.py", line 288, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'UniqueGlobalKey' doesn't exist)"
Do you have any clue how to fix this and make the script run normally?
Thank you in advance.
In the Flanking region analysis plot, what does the y-axis represent and how is it calculated?
info:
SRR10520596.fastq is ~3 GB in size
command:
python longQC.py sampleqc -x pb-rs2 -o /data/out_dir /data/SRR10520596.fastq
logs:
longQC:2020-12-30 16:26:53,142:169:INFO:Cmd: longQC.py sampleqc -x pb-rs2 -o /data/out_dir /data/SRR10520596.fastq
longQC:2020-12-30 16:26:53,143:233:INFO:Preset "pb-rs2" was applied. Options --pb(--ont) is overwritten.
longQC:2020-12-30 16:26:58,169:306:INFO:Computation of the low complexity region started for a chunk 0
lq_mask:2020-12-30 16:27:13,623:111:INFO:New job was submitted: in->/data/out_dir/analysis/tmp_0.fastq, out->/data/out_dir/analysis/tmp_0.txt
longQC:2020-12-30 16:27:13,630:311:INFO:Adapter search is starting for a chunk 0.
longQC:2020-12-30 16:27:13,635:327:INFO:Computation of the GC fraction started for a chunk 0
lq_utils:2020-12-30 16:27:21,706:380:INFO:list for subsample is not initialized. Initializing now.
It froze here for a day, so I had to force quit.
Hi Yoshinori,
Finally, I tried to use LongQC with real data, but I got this error:
The command line
MYPATH/conda_env/circlator/bin/python MYPATH/software/programs/LongQC/longQC.py sampleqc -x pb-rs2 -o longqc/ MYPATH/CompGenomics_Species/raw_reads/PacBio/TF100210M3/TF100210M3_1.fq -p 1
The error output:
Traceback (most recent call last):
File "MYPATH/software/programs/LongQC/longQC.py", line 919, in <module>
main(args)
File "MYPATH/software/programs/LongQC/longQC.py", line 65, in main
args.handler(args)
File "MYPATH/software/programs/LongQC/longQC.py", line 268, in command_sample
lm = LqMask(os.path.join(path_minimap2, "sdust"), args.out, suffix=suffix, max_n_proc=10 if ncpu > 10 else ncpu)
File "MYPATH/software/programs/LongQC/lq_mask.py", line 41, in __init__
self.pool = mp.Pool(self.n_proc)
File "MYPATH/conda_env/circlator/lib/python3.6/multiprocessing/context.py", line 119, in Pool
context=self.get_context())
File "MYPATH/conda_env/circlator/lib/python3.6/multiprocessing/pool.py", line 168, in __init__
raise ValueError("Number of processes must be at least 1")
ValueError: Number of processes must be at least 1
Do you know how could I fix it?
Thanks,
Peris
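`multiprocessing.Pool` raises this `ValueError` whenever it is asked for fewer than one worker, so with `-p 1` the value that reached `mp.Pool(self.n_proc)` must have been below 1. How `ncpu` gets from the command line to that value is not visible in the traceback. A minimal sketch of clamping a requested worker count, with the cap of 10 taken from the `max_n_proc=10 if ncpu > 10 else ncpu` expression above; the function name is hypothetical.

```python
def pool_size(ncpu, cap=10):
    """Clamp a requested worker count to the range [1, cap]."""
    return max(1, min(cap, ncpu))


sizes = [pool_size(n) for n in (0, 1, 4, 16)]
```

Another report in this tracker notes that passing a larger value such as `-p 4` got past this point.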
Hi @yfukasawa,
First of all, thank you for developing LongQC.
I am currently testing the tool to understand all the parameters better and choose their optimal configuration. However, I have several questions about the Index Size and the Short Mode since my test results seem unclear.
I have used two public datasets for my tests: flnc.bam (PacBio, Transcriptomic, ~4 Gb) and pb.bam (Pacbio, Genomic, ~12 Gb).
These are the results of my tests:
longQC.py sampleqc -o /tmp/results -x pb-hifi -n 10000 -p 8 -m 2 -i 1G -t /data/input/flnc.bam
longQC.py sampleqc -o /tmp/results -x pb-sequel -n 10000 -p 8 -m 2 -i 1G -b /data/input/pb.bam
Based on the results, my questions are as follows:
Thanks!,
Adolfo
Hi I am running longQC with docker
sudo docker run -it longqc sampleqc -x pb-sequel -p $(nproc) -o /home/haley/Desktop/LongQC/testPlato/H10_60411L2/barcode05_60411L2/longqctest_fastq1 /home/haley/Desktop/LongQC/testPlato/H10_60411L2/barcode05_60411L2/EC-D-6041-1L2-D-CuP-CeP_LR_1.fastq
I tried with one fastq file but plan to loop the command over many.
I got this error: Error: input file /home/haley/Desktop/LongQC/testPlato/H10_60411L2/barcode05_60411L2/EC-D-6041-1L2-D-CuP-CeP_LR_1.fastq does not exist.
Does anyone know why longqc will not recognize the files?
Thanks!
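A container has its own filesystem, so a host path such as `/home/haley/Desktop/...` is not visible inside it unless the directory is mounted with `-v host_dir:container_dir`. A minimal sketch that builds such a command for one file, which could then be looped over many. The image name `longqc` and the `sampleqc` arguments follow the command above; the `/data` mount point, the function name, and the example paths are assumptions.

```python
import posixpath
import shlex


def docker_longqc_argv(host_dir, fastq_name, out_name, preset="pb-sequel", ncpu=4):
    """Build a docker command that mounts host_dir at /data in the container."""
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:/data",  # make the host directory visible as /data
        "longqc", "sampleqc", "-x", preset, "-p", str(ncpu),
        "-o", posixpath.join("/data", out_name),
        posixpath.join("/data", fastq_name),
    ]


# Placeholder paths; replace with the real host directory and file name.
argv = docker_longqc_argv("/path/to/reads", "sample.fastq", "sample_longqc")
command_line = shlex.join(argv)
```

The resulting list can be passed to `subprocess.run(argv, check=True)`, and both the input and the output paths given to LongQC then refer to locations under `/data` inside the container.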
Hi Yoshinori
I am getting the following error when running:
python longQC.py sampleqc -x pb-hifi -o longqc ccs.bam
I made sure that ccs.bam is a 16GB file
Thank you for the help
longQC:2021-05-07 09:18:02,824:592:INFO:Filteration finished.
longQC:2021-05-07 09:18:12,834:598:INFO:Generating coverage related plots...
Traceback (most recent call last):
File "/LongQC-1.2.0b/longQC.py", line 956, in <module>
main(args)
File "/LongQC-1.2.0b/longQC.py", line 62, in main
args.handler(args)
File "/LongQC-1.2.0b/longQC.py", line 602, in command_sample
lc = LqCoverage(cov_path, isTranscript=args.transcript, control_filtering=pb_control)
File "/LongQC-1.2.0b/lq_coverage.py", line 88, in __init__
self.df = pd.read_table(table_path, sep='\t', header=None)
File "/opt/conda/lib/python3.9/site-packages/pandas/io/parsers.py", line 689, in read_table
return _read(filepath_or_buffer, kwds)
File "/opt/conda/lib/python3.9/site-packages/pandas/io/parsers.py", line 462, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/opt/conda/lib/python3.9/site-packages/pandas/io/parsers.py", line 819, in __init__
self._engine = self._make_engine(self.engine)
File "/opt/conda/lib/python3.9/site-packages/pandas/io/parsers.py", line 1050, in _make_engine
return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
File "/opt/conda/lib/python3.9/site-packages/pandas/io/parsers.py", line 1898, in __init__
self._reader = parsers.TextReader(self.handles.handle, **kwds)
File "pandas/_libs/parsers.pyx", line 521, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
Hi Yoshinori,
Thank you for this tool.
After processing some reads I get a format error. Previously you solved a problem related to UTF-8 format, which you guessed was related to the new PacBio Sequel encoding format. I could finally start running longQC.py "sampleqc", but I get the following error, and I do not know if it is related to the same problem. Any help will be appreciated. Thank you,
python /opt/LongQC/longQC.py sampleqc -x pb-sequel -o ./PacBio_QC/ chk_PacBio_/m54336U_201216_191137.subreads.bam -p 15 -m 2 -i 128
longQC:2021-01-25 16:29:47,375:169:INFO:Cmd: /opt/LongQC/longQC.py sampleqc -x pb-sequel -o ./PacBio_QC/ chk_PacBio_/m54336U_201216_191137.subreads.bam -p 15 -m 2 -i 128
longQC:2021-01-25 16:29:47,376:233:INFO:Preset "pb-sequel" was applied. Options --pb(--ont) is overwritten.
lq_utils:2021-01-25 16:29:47,383:127:DEBUG:chk_PacBio_/m54336U_201216_191137.subreads.bam is a compressed BAM.
longQC:2021-01-25 16:29:47,383:238:INFO:Temporary work file was made at ./PacBio_QC/analysis/pbbam_converted_seq_file.fastq
longQC:2021-01-25 16:31:03,240:306:INFO:Computation of the low complexity region started for a chunk 0
lq_mask:2021-01-25 16:31:17,932:111:INFO:New job was submitted: in->./PacBio_QC/analysis/tmp_0.fastq, out->./PacBio_QC/analysis/tmp_0.txt
longQC:2021-01-25 16:31:17,933:311:INFO:Adapter search is starting for a chunk 0.
longQC:2021-01-25 16:31:17,933:327:INFO:Computation of the GC fraction started for a chunk 0
lq_adapt:2021-01-25 16:32:09,463:77:INFO:2330 reads were skipped due to their short lengths.
lq_adapt:2021-01-25 16:32:09,519:90:INFO:Adapter Sequence: ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT, max identity:0.977778 and the number of trimmed reads: 67
lq_utils:2021-01-25 16:32:17,424:380:INFO:list for subsample is not initialized. Initializing now.
lq_adapt:2021-01-25 16:32:19,634:42:INFO:2333 reads were skipped due to their short lengths.
lq_adapt:2021-01-25 16:32:19,635:92:INFO:Adapter Sequence: ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT, max identity:0.956522 and the number of trimmed reads: 64
longQC:2021-01-25 16:33:17,551:332:INFO:Adapter search has done for a chunk 0.
longQC:2021-01-25 16:33:17,551:342:INFO:subsample finished for chunk 0.
longQC:2021-01-25 16:34:31,081:306:INFO:Computation of the low complexity region started for a chunk 1
lq_mask:2021-01-25 16:34:46,149:111:INFO:New job was submitted: in->./PacBio_QC/analysis/tmp_1.fastq, out->./PacBio_QC/analysis/tmp_1.txt
longQC:2021-01-25 16:34:46,149:311:INFO:Adapter search is starting for a chunk 1.
longQC:2021-01-25 16:34:46,150:327:INFO:Computation of the GC fraction started for a chunk 1
lq_adapt:2021-01-25 16:35:46,389:77:INFO:1358 reads were skipped due to their short lengths.
lq_adapt:2021-01-25 16:35:46,389:90:INFO:Adapter Sequence: ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT, max identity:0.957447 and the number of trimmed reads: 63
lq_adapt:2021-01-25 16:35:55,091:42:INFO:1363 reads were skipped due to their short lengths.
lq_adapt:2021-01-25 16:35:55,091:92:INFO:Adapter Sequence: ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT, max identity:0.957447 and the number of trimmed reads: 62
longQC:2021-01-25 16:36:40,406:332:INFO:Adapter search has done for a chunk 1.
Traceback (most recent call last):
File "/opt/LongQC/longQC.py", line 956, in <module>
main(args)
File "/opt/LongQC/longQC.py", line 62, in main
args.handler(args)
File "/opt/LongQC/longQC.py", line 341, in command_sample
s_reads = pool_res['subsample'].get()
File "/home/mldbotero/anaconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
File "/home/mldbotero/anaconda3/lib/python3.7/multiprocessing/pool.py", line 431, in _handle_tasks
put(task)
File "/home/mldbotero/anaconda3/lib/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/mldbotero/anaconda3/lib/python3.7/multiprocessing/connection.py", line 393, in _send_bytes
header = struct.pack("!i", n)
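The pasted log stops at `header = struct.pack("!i", n)`, so the exception itself is not shown. One known way for that call to fail is a message length `n` that does not fit in a signed 32-bit integer, which happens when a single pickled object sent to a worker is larger than about 2 GiB. Whether that is what happened here is a guess. The sketch below only demonstrates the limit of the `"!i"` format; the function name is made up.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value the "!i" format can encode


def fits_in_length_header(n_bytes):
    """True if n_bytes can be packed as a big-endian signed 32-bit integer."""
    try:
        struct.pack("!i", n_bytes)
        return True
    except struct.error:
        return False


just_fits = fits_in_length_header(INT32_MAX)
too_large = fits_in_length_header(INT32_MAX + 1)
```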
Hello,
I tried to execute LongQC with the following command:
python LongQC/longQC.py sampleqc -o longqc -x ont-rapid -p 4 test/test.fasta
However, I got the following error:
longQC:2020-08-04 11:17:56,650:165:INFO:Cmd: LongQC/longQC.py sampleqc -o longqc -x ont-rapid -p 4 test/test.fasta
longQC:2020-08-04 11:17:56,650:215:INFO:Preset "ont-rapid" was applied. Options --pb(--ont) is overwritten.
Traceback (most recent call last):
File "LongQC/longQC.py", line 932, in <module>
main(args)
File "LongQC/longQC.py", line 62, in main
args.handler(args)
File "LongQC/longQC.py", line 281, in command_sample
for (reads, n_seqs, n_bases) in open_seq_chunk(args.input, file_format_code, chunk_size=args.mem*1024**3, is_upper=True):
File "/home/jovyan/LongQC/lq_utils.py", line 65, in open_seq_chunk
yield from parse_fastx_chunk(fn, chunk_size, is_upper=is_upper)
File "/home/jovyan/LongQC/lq_utils.py", line 268, in parse_fastx_chunk
with pysam.FastxFile(fn) as f:
AttributeError: module 'pysam' has no attribute 'FastxFile'
Can you please help with that?
Thank you in advance!
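A quick way to narrow down an AttributeError like the one above is to check which pysam Python actually imports and whether it exposes FastxFile. Two possible causes, both assumptions here, are an old pysam release or a different file or directory named pysam shadowing the real package. The helper describe_module below is hypothetical and uses only the standard library.

```python
import importlib
import importlib.util


def describe_module(name: str, attr: str) -> dict:
    """Report where a module is loaded from and whether it has an attribute.

    Hypothetical diagnostic helper; not part of LongQC or pysam.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        # The module is not importable from the current environment.
        return {"found": False}
    module = importlib.import_module(name)
    return {
        "found": True,
        "path": spec.origin,  # file the import resolves to
        "version": getattr(module, "__version__", "unknown"),
        "has_attr": hasattr(module, attr),
    }


if __name__ == "__main__":
    # Run this with the same interpreter that runs longQC.py.
    print(describe_module("pysam", "FastxFile"))
```

If "path" points somewhere unexpected, or "has_attr" is False, the environment is the likely culprit rather than the input file.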