mortazavilab / lapa
Alternative polyadenylation detection from diverse data sources such as 3'-seq, long-read and short-reads.
Home Page: https://www.biorxiv.org/content/10.1101/2022.11.08.515683v1
Hello,
I am encountering problems installing lapa. Here is the error I am getting:
#0 1.082 Collecting lapa
#0 1.115 Downloading lapa-0.0.5-py3-none-any.whl (36 kB)
#0 1.132 Requirement already satisfied: setuptools in /opt/conda/lib/python3.10/site-packages (from lapa) (68.0.0)
#0 1.132 Requirement already satisfied: tqdm in /opt/conda/lib/python3.10/site-packages (from lapa) (4.65.0)
#0 1.360 Collecting numpy<=1.23 (from lapa)
#0 1.372 Downloading numpy-1.23.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.0 MB)
#0 1.565 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.0/17.0 MB 70.9 MB/s eta 0:00:00
#0 1.615 Collecting click (from lapa)
#0 1.623 Downloading click-8.1.6-py3-none-any.whl (97 kB)
#0 1.630 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.9/97.9 kB 17.3 MB/s eta 0:00:00
#0 1.781 Collecting pandas (from lapa)
#0 1.790 Downloading pandas-2.0.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.3 MB)
#0 1.889 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.3/12.3 MB 113.1 MB/s eta 0:00:00
#0 1.932 Collecting pybigwig (from lapa)
#0 1.945 Downloading pyBigWig-0.3.22-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (209 kB)
#0 1.954 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 209.7/209.7 kB 31.3 MB/s eta 0:00:00
#0 2.080 Collecting scipy (from lapa)
#0 2.092 Downloading scipy-1.11.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (36.3 MB)
#0 2.376 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 36.3/36.3 MB 63.5 MB/s eta 0:00:00
#0 2.444 Collecting bamread>=0.0.10 (from lapa)
#0 2.458 Downloading bamread-0.0.16-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (650 kB)
#0 2.470 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 650.3/650.3 kB 60.0 MB/s eta 0:00:00
#0 2.497 Collecting pyranges>=0.0.71 (from lapa)
#0 2.508 Downloading pyranges-0.0.129-py3-none-any.whl (1.5 MB)
#0 2.527 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 87.0 MB/s eta 0:00:00
#0 2.552 Collecting sorted-nearest==0.0.33 (from lapa)
#0 2.565 Downloading sorted_nearest-0.0.33.tar.gz (1.2 MB)
#0 2.580 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 98.2 MB/s eta 0:00:00
#0 2.643 Installing build dependencies: started
#0 8.952 Installing build dependencies: finished with status 'done'
#0 8.954 Getting requirements to build wheel: started
#0 9.657 Getting requirements to build wheel: finished with status 'error'
#0 9.666 error: subprocess-exited-with-error
#0 9.666
#0 9.666 × Getting requirements to build wheel did not run successfully.
#0 9.666 │ exit code: 1
#0 9.666 ╰─> [173 lines of output]
#0 9.666
#0 9.666 Error compiling Cython file:
#0 9.666 ------------------------------------------------------------
#0 9.666 ...
#0 9.666 @cython.boundscheck(False)
#0 9.666 @cython.wraparound(False)
#0 9.666 @cython.initializedcheck(False)
#0 9.666 cpdef annotate_clusters64(const long [::1] starts, const long [::1] ends, int slack):
#0 9.666
#0 9.666 cpdef int min_start = starts[0]
#0 9.666 ^
#0 9.666 ------------------------------------------------------------
#0 9.666
#0 9.666 sorted_nearest/src/annotate_clusters.pyx:23:10: Variables cannot be declared with 'cpdef'. Use 'cdef' instead.
#0 9.666
#0 9.666 Error compiling Cython file:
#0 9.666 ------------------------------------------------------------
#0 9.666 ...
#0 9.666 @cython.wraparound(False)
#0 9.666 @cython.initializedcheck(False)
#0 9.666 cpdef annotate_clusters64(const long [::1] starts, const long [::1] ends, int slack):
#0 9.666
#0 9.666 cpdef int min_start = starts[0]
#0 9.666 cpdef int max_end = ends[0]
#0 9.666 ^
#0 9.666 ------------------------------------------------------------
#0 9.666
#0 9.666 sorted_nearest/src/annotate_clusters.pyx:24:10: Variables cannot be declared with 'cpdef'. Use 'cdef' instead.
#0 9.666
#0 9.666 Error compiling Cython file:
#0 9.666 ------------------------------------------------------------
#0 9.666 ...
#0 9.666 @cython.initializedcheck(False)
#0 9.666 cpdef annotate_clusters64(const long [::1] starts, const long [::1] ends, int slack):
#0 9.666
#0 9.666 cpdef int min_start = starts[0]
#0 9.666 cpdef int max_end = ends[0]
#0 9.666 cpdef int i = 0
#0 9.666 ^
#0 9.666 ------------------------------------------------------------
#0 9.666
#0 9.666 sorted_nearest/src/annotate_clusters.pyx:25:10: Variables cannot be declared with 'cpdef'. Use 'cdef' instead.
#0 9.666
#0 9.666 Error compiling Cython file:
#0 9.666 ------------------------------------------------------------
#0 9.666 ...
#0 9.666 cpdef annotate_clusters64(const long [::1] starts, const long [::1] ends, int slack):
#0 9.666
#0 9.666 cpdef int min_start = starts[0]
#0 9.666 cpdef int max_end = ends[0]
#0 9.666 cpdef int i = 0
#0 9.666 cpdef int n_clusters = 1
#0 9.666 ^
#0 9.666 ------------------------------------------------------------
#0 9.666
#0 9.666 sorted_nearest/src/annotate_clusters.pyx:26:10: Variables cannot be declared with 'cpdef'. Use 'cdef' instead.
#0 9.666
#0 9.666 Error compiling Cython file:
#0 9.666 ------------------------------------------------------------
#0 9.666 ...
#0 9.666
#0 9.666 cpdef int min_start = starts[0]
#0 9.666 cpdef int max_end = ends[0]
#0 9.666 cpdef int i = 0
#0 9.666 cpdef int n_clusters = 1
#0 9.666 cpdef int length = len(starts)
#0 9.666 ^
#0 9.666 ------------------------------------------------------------
#0 9.666
#0 9.666 sorted_nearest/src/annotate_clusters.pyx:27:10: Variables cannot be declared with 'cpdef'. Use 'cdef' instead.
#0 9.666
#0 9.666 Error compiling Cython file:
#0 9.666 ------------------------------------------------------------
#0 9.666 ...
#0 9.666 @cython.boundscheck(False)
#0 9.666 @cython.wraparound(False)
#0 9.666 @cython.initializedcheck(False)
#0 9.666 cpdef annotate_clusters32(const int32_t [::1] starts, const int32_t [::1] ends, int slack):
#0 9.666
#0 9.666 cpdef int min_start = starts[0]
#0 9.666 ^
#0 9.666 ------------------------------------------------------------
#0 9.666
#0 9.666 sorted_nearest/src/annotate_clusters.pyx:55:10: Variables cannot be declared with 'cpdef'. Use 'cdef' instead.
#0 9.666
#0 9.666 Error compiling Cython file:
#0 9.666 ------------------------------------------------------------
#0 9.666 ...
#0 9.666 @cython.wraparound(False)
#0 9.666 @cython.initializedcheck(False)
#0 9.666 cpdef annotate_clusters32(const int32_t [::1] starts, const int32_t [::1] ends, int slack):
#0 9.666
#0 9.666 cpdef int min_start = starts[0]
#0 9.666 cpdef int max_end = ends[0]
#0 9.666 ^
#0 9.666 ------------------------------------------------------------
#0 9.666
#0 9.666 sorted_nearest/src/annotate_clusters.pyx:56:10: Variables cannot be declared with 'cpdef'. Use 'cdef' instead.
#0 9.666
#0 9.666 Error compiling Cython file:
#0 9.666 ------------------------------------------------------------
#0 9.666 ...
#0 9.666 @cython.initializedcheck(False)
#0 9.666 cpdef annotate_clusters32(const int32_t [::1] starts, const int32_t [::1] ends, int slack):
#0 9.666
#0 9.666 cpdef int min_start = starts[0]
#0 9.666 cpdef int max_end = ends[0]
#0 9.666 cpdef int i = 0
#0 9.666 ^
#0 9.666 ------------------------------------------------------------
#0 9.666
#0 9.666 sorted_nearest/src/annotate_clusters.pyx:57:10: Variables cannot be declared with 'cpdef'. Use 'cdef' instead.
#0 9.666
#0 9.666 Error compiling Cython file:
#0 9.666 ------------------------------------------------------------
#0 9.666 ...
#0 9.666 cpdef annotate_clusters32(const int32_t [::1] starts, const int32_t [::1] ends, int slack):
#0 9.666
#0 9.666 cpdef int min_start = starts[0]
#0 9.666 cpdef int max_end = ends[0]
#0 9.666 cpdef int i = 0
#0 9.666 cpdef int n_clusters = 1
#0 9.666 ^
#0 9.666 ------------------------------------------------------------
#0 9.666
#0 9.666 sorted_nearest/src/annotate_clusters.pyx:58:10: Variables cannot be declared with 'cpdef'. Use 'cdef' instead.
#0 9.666
#0 9.666 Error compiling Cython file:
#0 9.666 ------------------------------------------------------------
#0 9.666 ...
#0 9.666
#0 9.666 cpdef int min_start = starts[0]
#0 9.666 cpdef int max_end = ends[0]
#0 9.666 cpdef int i = 0
#0 9.666 cpdef int n_clusters = 1
#0 9.666 cpdef int length = len(starts)
#0 9.666 ^
#0 9.666 ------------------------------------------------------------
#0 9.666
#0 9.666 sorted_nearest/src/annotate_clusters.pyx:59:10: Variables cannot be declared with 'cpdef'. Use 'cdef' instead.
#0 9.666 Compiling sorted_nearest/src/sorted_nearest.pyx because it changed.
#0 9.666 Compiling sorted_nearest/src/max_disjoint_intervals.pyx because it changed.
#0 9.666 Compiling sorted_nearest/src/k_nearest.pyx because it changed.
#0 9.666 Compiling sorted_nearest/src/k_nearest_ties.pyx because it changed.
#0 9.666 Compiling sorted_nearest/src/clusters.pyx because it changed.
#0 9.666 Compiling sorted_nearest/src/annotate_clusters.pyx because it changed.
#0 9.666 Compiling sorted_nearest/src/cluster_by.pyx because it changed.
#0 9.666 Compiling sorted_nearest/src/merge_by.pyx because it changed.
#0 9.666 Compiling sorted_nearest/src/introns.pyx because it changed.
#0 9.666 Compiling sorted_nearest/src/windows.pyx because it changed.
#0 9.666 Compiling sorted_nearest/src/tiles.pyx because it changed.
#0 9.666 [ 1/11] Cythonizing sorted_nearest/src/annotate_clusters.pyx
#0 9.666 Traceback (most recent call last):
#0 9.666 File "/opt/conda/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
#0 9.666 main()
#0 9.666 File "/opt/conda/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
#0 9.666 json_out['return_val'] = hook(**hook_input['kwargs'])
#0 9.666 File "/opt/conda/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
#0 9.666 return hook(config_settings)
#0 9.666 File "/tmp/pip-build-env-zg_oj5sl/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 341, in get_requires_for_build_wheel
#0 9.666 return self._get_build_requires(config_settings, requirements=['wheel'])
#0 9.666 File "/tmp/pip-build-env-zg_oj5sl/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 323, in _get_build_requires
#0 9.666 self.run_setup()
#0 9.666 File "/tmp/pip-build-env-zg_oj5sl/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 487, in run_setup
#0 9.666 super(_BuildMetaLegacyBackend,
#0 9.666 File "/tmp/pip-build-env-zg_oj5sl/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 338, in run_setup
#0 9.666 exec(code, locals())
#0 9.666 File "<string>", line 79, in <module>
#0 9.666 File "/tmp/pip-build-env-zg_oj5sl/overlay/lib/python3.10/site-packages/Cython/Build/Dependencies.py", line 1134, in cythonize
#0 9.666 cythonize_one(*args)
#0 9.666 File "/tmp/pip-build-env-zg_oj5sl/overlay/lib/python3.10/site-packages/Cython/Build/Dependencies.py", line 1301, in cythonize_one
#0 9.666 raise CompileError(None, pyx_file)
#0 9.666 Cython.Compiler.Errors.CompileError: sorted_nearest/src/annotate_clusters.pyx
#0 9.666 [end of output]
#0 9.666
#0 9.666 note: This error originates from a subprocess, and is likely not a problem with pip.
#0 9.667 error: subprocess-exited-with-error
#0 9.667
#0 9.667 × Getting requirements to build wheel did not run successfully.
#0 9.667 │ exit code: 1
#0 9.667 ╰─> See above for output.
#0 9.667
#0 9.667 note: This error originates from a subprocess, and is likely not a problem with pip.
------
Dockerfile:11
--------------------
9 | RUN apt-get clean all
10 |
11 | >>> RUN pip3 install lapa
--------------------
ERROR: failed to solve: process "/bin/sh -c pip3 install lapa" did not complete successfully: exit code: 1
Thank you.
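The log above shows Cython failing on cpdef variable declarations while pip builds sorted-nearest 0.0.33 from source. Cython 3 rejects that syntax, so one possible workaround (an assumption, not something confirmed by the maintainers in this thread) is to constrain the build-time Cython to a pre-3 release. The sketch below relies on pip honouring the PIP_CONSTRAINT environment variable inside its isolated build environment; the helper name pip_install_with_cython_pin is hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def pip_install_with_cython_pin(package, cython_spec="cython<3", run=False):
    """Build a pip command whose isolated build env is constrained to Cython < 3.

    Hypothetical helper: it writes a constraints file and exposes it through
    PIP_CONSTRAINT. Nothing is installed unless run=True.
    """
    fd, constraint_path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as handle:
        handle.write(cython_spec + "\n")

    # Copy the current environment and add the constraint file for pip.
    env = dict(os.environ, PIP_CONSTRAINT=constraint_path)
    cmd = [sys.executable, "-m", "pip", "install", package]

    if run:
        subprocess.run(cmd, env=env, check=True)
    return cmd, env


if __name__ == "__main__":
    cmd, env = pip_install_with_cython_pin("lapa")
    print(" ".join(cmd), "with PIP_CONSTRAINT =", env["PIP_CONSTRAINT"])
```

In a Dockerfile the same idea would be a constraints file plus an ENV PIP_CONSTRAINT line before the pip3 install step; whether the resulting build of sorted-nearest then succeeds has not been checked here.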
Would you recommend using pychopper upstream for ONT reads, just as an additional QC step? Or do you think it wouldn't make a difference because of the way LAPA looks for the polyA signal?
Mustafa
It seems that lapa_correct_talon is looking for a column named samples in the read_annot file that TALON generates. This column is not available; however, the sample names are in the dataset column.
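As a stopgap for the column mismatch described above, the read_annot file could be rewritten so that the dataset values are also available under the expected name. This is only a sketch: it assumes the file is tab-separated and that lapa_correct_talon needs nothing more than the column to exist under the name samples, neither of which is verified here. The function name copy_column is hypothetical.

```python
import csv
import io


def copy_column(tsv_text, source="dataset", target="samples"):
    """Return the TSV text with `source` duplicated under the name `target`."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fields = list(reader.fieldnames or [])
    if source not in fields:
        raise KeyError(f"column {source!r} not found in header: {fields}")
    if target not in fields:
        fields.append(target)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[target] = row[source]  # copy the sample name across
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    example = "read_name\tdataset\nread1\tsampleA\n"
    print(copy_column(example))
```

Copying rather than renaming keeps the original dataset column intact, so the file stays usable with TALON itself.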
Following the Google Colab Jupyter notebook, I successfully ran all the code prior to prepare config, gtf and fa. However, when I ran the following:
! lapa --alignment sample_config.csv \
--fasta /home/mustafa/projects/ReferenceGenomes/gencode/v41/GRCh38.primary_assembly.genome.fa \
--annotation gencode.v41.primary_assembly.annotation.utr_fixed.gtf \
--chrom_sizes gencode.v41.chrom_sizes \
--output_dir LAPA_PolyAClusterCalling
After a while I get the error below:
/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/count.py:594: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_all = df.groupby(cols).agg('sum').reset_index()
/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/count.py:598: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
_df = _df.groupby(cols).agg('sum').reset_index()
/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/count.py:598: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
_df = _df.groupby(cols).agg('sum').reset_index()
/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/count.py:598: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
_df = _df.groupby(cols).agg('sum').reset_index()
/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/count.py:598: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
_df = _df.groupby(cols).agg('sum').reset_index()
/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/count.py:598: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
_df = _df.groupby(cols).agg('sum').reset_index()
/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/count.py:598: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
_df = _df.groupby(cols).agg('sum').reset_index()
Traceback (most recent call last):
File "/root/miniconda3/envs/LAPA/bin/lapa", line 8, in <module>
sys.exit(cli_lapa())
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/main.py", line 112, in cli_lapa
lapa(alignment, fasta, annotation, chrom_sizes, output_dir,
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/lapa.py", line 497, in lapa
_lapa(alignment)
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/lapa.py", line 288, in __call__
df_all_count, sample_counts = self.counting(alignment)
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/lapa.py", line 143, in counting
counter._to_bigwig(df_all_count, sample_counts, self.chrom_sizes,
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/count.py", line 561, in _to_bigwig
save_count_bw(df_all, output_dir, chrom_sizes, f'all_{prefix}')
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/count.py", line 197, in save_count_bw
BaseCounter._to_bigwig(df, chrom_sizes, output_dir, prefix)
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/count.py", line 153, in _to_bigwig
bw_from_pyranges(
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/lapa/utils/io.py", line 153, in bw_from_pyranges
gr['+'].to_bigwig(bw_pos_file, chromosome_sizes=chrom_sizes,
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/pyranges/pyranges.py", line 5381, in to_bigwig
result = _to_bigwig(self, path, chromosome_sizes, rpm, divide, value_col, dryrun)
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/pyranges/out.py", line 189, in _to_bigwig
gr = self.to_rle(rpm=rpm, strand=False, value_col=value_col).to_ranges()
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/pyranges/pyranges.py", line 5745, in to_rle
return _to_rle(self, value_col, strand=strand, rpm=rpm, nb_cpu=nb_cpu)
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/pyranges/methods/to_rle.py", line 22, in _to_rle
result = pyrange_apply_single(coverage, ranges, **kwargs)
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/pyranges/multithreaded.py", line 382, in pyrange_apply_single
result = call_f_single(function, nparams, df, **kwargs)
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/pyranges/multithreaded.py", line 31, in call_f_single
return f.remote(df, **kwargs)
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/pyrle/methods.py", line 167, in coverage
runs, values = _coverage(_df.Position.values, _df.Value.values)
File "pyrle/src/coverage.pyx", line 67, in pyrle.src.coverage._coverage
File "/root/miniconda3/envs/LAPA/lib/python3.8/site-packages/numpy/__init__.py", line 284, in __getattr__
raise AttributeError("module {!r} has no attribute "
AttributeError: module 'numpy' has no attribute 'int'
lapa was installed with pip in a new conda environment (python=3.8).
Any help would be very welcome.
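The traceback above ends inside pyrle's compiled code asking NumPy for the attribute int. NumPy deprecated that alias in 1.20 and removed it in 1.24, and the pip log earlier on this page shows lapa 0.0.5 requiring numpy<=1.23, so a newer NumPy in the environment is a plausible cause. A minimal check, assuming only the version string is needed (the function name is hypothetical):

```python
def legacy_int_alias_available(numpy_version):
    """Return True if this NumPy version still exposes the np.int alias.

    The alias was removed in NumPy 1.24, so anything older still has it
    (with a DeprecationWarning from 1.20 onwards).
    """
    major, minor = (int(part) for part in numpy_version.split(".")[:2])
    return (major, minor) < (1, 24)


if __name__ == "__main__":
    for version in ("1.23.0", "1.24.0", "2.0.1"):
        print(version, legacy_int_alias_available(version))
```

In a live environment this would be called as legacy_int_alias_available(numpy.__version__). If it returns False, installing a NumPy release below 1.24 into the environment would be the obvious thing to try.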
Hi,
I am running LAPA with the command -
lapa --alignment /data/salomonis-archive/FASTQs/Grimes/RNA/scRNASeq/10X-Genomics/LGCHMC53-17GEX/PacbioPBMC/PacbioPBMC/outs/possorted_genome_bam.bam --fasta hg38.fa --annotation hg38.gtf --chrom_sizes hg38.chrom_sizes --output_dir pbmc_pacbio_1
I am getting this error -
a3-2020/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/anaconda3-2020/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/users/raw6jg/.local/lib/python3.8/site-packages/lapa/main.py", line 112, in cli_lapa
lapa(alignment, fasta, annotation, chrom_sizes, output_dir,
File "/users/raw6jg/.local/lib/python3.8/site-packages/lapa/lapa.py", line 497, in lapa
_lapa(alignment)
File "/users/raw6jg/.local/lib/python3.8/site-packages/lapa/lapa.py", line 288, in __call__
df_all_count, sample_counts = self.counting(alignment)
File "/users/raw6jg/.local/lib/python3.8/site-packages/lapa/lapa.py", line 142, in counting
df_all_count, sample_counts = counter.to_df()
File "/users/raw6jg/.local/lib/python3.8/site-packages/lapa/count.py", line 583, in to_df
df = pd.concat([
File "/users/raw6jg/.local/lib/python3.8/site-packages/lapa/count.py", line 584, in
self.build_counter(row['path'])
File "/users/raw6jg/.local/lib/python3.8/site-packages/lapa/count.py", line 142, in to_df
return self.to_gr().df.astype({'Chromosome': 'str', 'Strand': 'str'})
File "/users/raw6jg/.local/lib/python3.8/site-packages/lapa/count.py", line 136, in to_gr
return pr.PyRanges(df).count_overlaps(
File "/usr/local/anaconda3-2020/lib/python3.8/site-packages/pyranges/pyranges.py", line 1322, in count_overlaps
counts = pyrange_apply(_number_overlapping, self, other, **kwargs)
File "/usr/local/anaconda3-2020/lib/python3.8/site-packages/pyranges/multithreaded.py", line 236, in pyrange_apply
result = call_f(function, nparams, df, odf, kwargs)
File "/usr/local/anaconda3-2020/lib/python3.8/site-packages/pyranges/multithreaded.py", line 23, in call_f
return f.remote(df, odf, **kwargs)
File "/usr/local/anaconda3-2020/lib/python3.8/site-packages/pyranges/methods/coverage.py", line 27, in _number_overlapping
_self_indexes, _other_indexes = oncls.all_overlaps_both(
File "ncls/src/ncls32.pyx", line 76, in ncls.src.ncls32.NCLS32.all_overlaps_both
File "ncls/src/ncls32.pyx", line 122, in ncls.src.ncls32.NCLS32.all_overlaps_both
File "<array_function internals>", line 5, in resize
File "/usr/local/anaconda3-2020/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 1417, in resize
a = concatenate((a,) * n_copies)
File "<array_function internals>", line 5, in concatenate
ValueError: need at least one array to concatenate
Could you please advise what might be causing this error?
Thanks
Hi Muhammed,
Well done with the documentation updates! This is great. I have upgraded to the latest version, as suggested. However, I have come across an issue (full error at the bottom).
lapa command: lapa --alignment samples.csv --fasta GRCh38.primary_assembly.genome.fa --annotation hg39.utr_fixed.gtf --chrom_sizes chrom_sizes --output_dir lapa_c_vs_t
These are the same input files which worked with the previous version, except that I fixed the UTR as described in the docs:
#gencode_utr_fix --input_gtf mm10.gtf --output_gtf mm10.utr_fixed.gtf
wget -O - https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/gencode.v40.annotation.gtf.gz | gunzip -c > hg38.gtf
gencode_utr_fix --input_gtf hg38.gtf --output_gtf hg39.utr_fixed.gtf
gencode_utr_fix --input_gtf gencode.v39.primary_assembly.annotation.gtf --output_gtf hg39.utr_fixed.gtf
Both of these fail with the main lapa command:
.....
.....
[E::idx_find_and_load] Could not retrieve index file for '/home/pthorpe/scratch/mustafa/lapa/reads_bams/R6_Trt_LONG.fastq.gz.temp.mapped.bam'
Traceback (most recent call last):
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/bin/lapa", line 8, in <module>
sys.exit(cli_lapa())
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/lapa/main.py", line 112, in cli_lapa
lapa(alignment, fasta, annotation, chrom_sizes, output_dir,
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/lapa/lapa.py", line 497, in lapa
_lapa(alignment)
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/lapa/lapa.py", line 288, in __call__
df_all_count, sample_counts = self.counting(alignment)
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/lapa/lapa.py", line 143, in counting
counter._to_bigwig(df_all_count, sample_counts, self.chrom_sizes,
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/lapa/count.py", line 561, in _to_bigwig
save_count_bw(df_all, output_dir, chrom_sizes, f'all_{prefix}')
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/lapa/count.py", line 197, in save_count_bw
BaseCounter._to_bigwig(df, chrom_sizes, output_dir, prefix)
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/lapa/count.py", line 153, in _to_bigwig
bw_from_pyranges(
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/lapa/utils/io.py", line 153, in bw_from_pyranges
gr['-'].to_bigwig(bw_neg_file, chromosome_sizes=chrom_sizes,
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/pyranges/pyranges.py", line 5339, in to_bigwig
result = _to_bigwig(self, path, chromosome_sizes, rpm, divide, value_col, dryrun)
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/pyranges/out.py", line 203, in _to_bigwig
bw.addEntries(chromosomes, starts, ends=ends, values=values)
RuntimeError: The entries you tried to add are out of order, precede already added entries, or otherwise use illegal values.
Please correct this and try again.
Would you be able to help?
regards,
Pete
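The RuntimeError above is raised when the entries handed to the bigWig writer are unsorted, overlap earlier entries, or use values it considers illegal. The thread does not establish which of these applies, so the following is only a diagnostic sketch. It assumes the counts can be exported as (chromosome, start, end) tuples and that the chrom_sizes file has been loaded into a dict in file order. The end-past-chromosome check is a general sanity check, not a documented pyBigWig rule, and the function name is hypothetical.

```python
def find_bigwig_order_problems(entries, chrom_sizes):
    """List (index, reason) pairs for entries a bigWig writer may reject.

    entries: iterable of (chromosome, start, end) in the order to be written.
    chrom_sizes: dict mapping chromosome name to length, in header order.
    """
    order = {chrom: rank for rank, chrom in enumerate(chrom_sizes)}
    problems = []
    previous = None  # (chromosome, end) of the last known-chromosome entry

    for index, (chrom, start, end) in enumerate(entries):
        if chrom not in order:
            problems.append((index, "unknown chromosome"))
            continue
        if start < 0 or end <= start:
            problems.append((index, "invalid interval"))
        elif end > chrom_sizes[chrom]:
            problems.append((index, "end past chromosome length"))
        if previous is not None:
            prev_chrom, prev_end = previous
            if order[chrom] < order[prev_chrom]:
                problems.append((index, "chromosome out of order"))
            elif chrom == prev_chrom and start < prev_end:
                problems.append((index, "overlaps or precedes previous entry"))
        previous = (chrom, end)
    return problems


if __name__ == "__main__":
    sizes = {"chr1": 100, "chr2": 50}
    rows = [("chr1", 0, 10), ("chr1", 5, 20), ("chr2", 0, 60)]
    for index, reason in find_bigwig_order_problems(rows, sizes):
        print(index, reason)
```

If the check flags unknown chromosomes or ends past the chromosome length, a mismatch between the chrom_sizes file and the genome used for alignment would be worth ruling out first. Separately, the "Could not retrieve index file" message at the top of the log suggests the BAM has no index next to it.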
Dear Lapa,
FIX: this is a fix for everything described below:
mamba install -c bioconda pyranges (it seems the pip version of pyranges does not have everything that is required).
I have installed lapa via pip install lapa (but I also had to do pip install cython). I have this in a conda environment running Python 3.8.
If I run:
$ lapa
Traceback (most recent call last):
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/bin/lapa", line 5, in <module>
from lapa.main import cli
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/lapa/__init__.py", line 1, in <module>
from lapa.main import lapa
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/lapa/main.py", line 2, in <module>
from lapa.lapa import lapa
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/lapa/lapa.py", line 3, in <module>
import pyranges as pr
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/pyranges/__init__.py", line 137, in <module>
import pyranges.genomicfeatures as gf
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/pyranges/genomicfeatures.py", line 7, in <module>
from sorted_nearest.src.introns import find_introns
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/sorted_nearest/__init__.py", line 7, in <module>
from sorted_nearest.src.k_nearest_ties import get_all_ties, get_different_ties
ImportError: cannot import name 'get_all_ties' from 'sorted_nearest.src.k_nearest_ties' (/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/sorted_nearest/src/k_nearest_ties.cpython-38-x86_64-linux-gnu.so)
Then, if I force it to use python3.8 with a lapa.py version from GitHub, I get the same error:
$ python3.8 "/mnt/shared/scratch/pthorpe/private/mustafa/lapa/lapa/lapa/lapa.py"
Traceback (most recent call last):
File "/mnt/shared/scratch/pthorpe/private/mustafa/lapa/lapa/lapa/lapa.py", line 3, in <module>
import pyranges as pr
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/pyranges/__init__.py", line 137, in <module>
import pyranges.genomicfeatures as gf
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/pyranges/genomicfeatures.py", line 7, in <module>
from sorted_nearest.src.introns import find_introns
File "/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/sorted_nearest/__init__.py", line 7, in <module>
from sorted_nearest.src.k_nearest_ties import get_all_ties, get_different_ties
ImportError: cannot import name 'get_all_ties' from 'sorted_nearest.src.k_nearest_ties' (/mnt/shared/scratch/pthorpe/apps/conda/envs/python38/lib/python3.8/site-packages/sorted_nearest/src/k_nearest_ties.cpython-38-x86_64-linux-gnu.so)
If I then run pip install pyranges, it says the requirement is already satisfied.
/home/sutai/.local/lib/python3.10/site-packages/lapa-0.0.5-py3.10.egg/lapa/count.py:594: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
Traceback (most recent call last):
File "/home/sutai/biosoft/localcolabfold/localcolabfold/colabfold-conda/bin/lapa", line 33, in <module>
sys.exit(load_entry_point('lapa==0.0.5', 'console_scripts', 'lapa')())
File "/home/sutai/biosoft/localcolabfold/localcolabfold/colabfold-conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/home/sutai/biosoft/localcolabfold/localcolabfold/colabfold-conda/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/home/sutai/biosoft/localcolabfold/localcolabfold/colabfold-conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/sutai/biosoft/localcolabfold/localcolabfold/colabfold-conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/home/sutai/.local/lib/python3.10/site-packages/lapa-0.0.5-py3.10.egg/lapa/main.py", line 112, in cli_lapa
File "/home/sutai/.local/lib/python3.10/site-packages/lapa-0.0.5-py3.10.egg/lapa/lapa.py", line 497, in lapa
File "/home/sutai/.local/lib/python3.10/site-packages/lapa-0.0.5-py3.10.egg/lapa/lapa.py", line 293, in __call__
File "/home/sutai/.local/lib/python3.10/site-packages/lapa-0.0.5-py3.10.egg/lapa/lapa.py", line 149, in clustering
File "/home/sutai/.local/lib/python3.10/site-packages/lapa-0.0.5-py3.10.egg/lapa/cluster.py", line 377, in to_df
File "/home/sutai/.local/lib/python3.10/site-packages/lapa-0.0.5-py3.10.egg/lapa/cluster.py", line 378, in <listcomp>
File "/home/sutai/.local/lib/python3.10/site-packages/lapa-0.0.5-py3.10.egg/lapa/cluster.py", line 255, in to_dict
File "/home/sutai/.local/lib/python3.10/site-packages/lapa-0.0.5-py3.10.egg/lapa/cluster.py", line 135, in to_dict
File "/home/sutai/.local/lib/python3.10/site-packages/lapa-0.0.5-py3.10.egg/lapa/cluster.py", line 118, in peak
File "/home/sutai/biosoft/localcolabfold/localcolabfold/colabfold-conda/lib/python3.10/site-packages/pandas/core/generic.py", line 11986, in rolling
return Window(
File "/home/sutai/biosoft/localcolabfold/localcolabfold/colabfold-conda/lib/python3.10/site-packages/pandas/core/window/rolling.py", line 165, in __init__
self._validate()
File "/home/sutai/biosoft/localcolabfold/localcolabfold/colabfold-conda/lib/python3.10/site-packages/pandas/core/window/rolling.py", line 1168, in _validate
raise ValueError(f"Invalid win_type {self.win_type}")
ValueError: Invalid win_type gaussian
I see that you have 'lapa.correction.Transcript'. How would I be able to use it?
To my understanding, LAPA by default works with 'gene_id'. How would I change the analysis so that it looks at 'transcript_id' instead?
As always, thank you.
Mustafa
Hi there, I encountered this error when running LAPA on a single short-read BAM file. What do you advise to solve this? Thanks!
lapa --alignment ${illumina_bam_dir}/${bamfile} --fasta ${reference_genome_fa} --annotation ${reference_gtf} --chrom_sizes ${chrom_sizes} --output_dir ${outdir}/vb_annot/${samplename}_illumina
Error:
Traceback (most recent call last):
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/bin/lapa", line 8, in <module>
sys.exit(cli_lapa())
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/lib/python3.9/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/lib/python3.9/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/lib/python3.9/site-packages/lapa/main.py", line 112, in cli_lapa
lapa(alignment, fasta, annotation, chrom_sizes, output_dir,
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/lib/python3.9/site-packages/lapa/lapa.py", line 497, in lapa
_lapa(alignment)
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/lib/python3.9/site-packages/lapa/lapa.py", line 288, in __call__
df_all_count, sample_counts = self.counting(alignment)
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/lib/python3.9/site-packages/lapa/lapa.py", line 142, in counting
df_all_count, sample_counts = counter.to_df()
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/lib/python3.9/site-packages/lapa/count.py", line 583, in to_df
df = pd.concat([
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/lib/python3.9/site-packages/lapa/count.py", line 584, in
self.build_counter(row['path'])
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/lib/python3.9/site-packages/lapa/count.py", line 142, in to_df
return self.to_gr().df.astype({'Chromosome': 'str', 'Strand': 'str'})
File "/hpcdata/bcbb/homc/conda_envs/envs/lapa_env/lib/python3.9/site-packages/pandas/core/generic.py", line 6212, in astype
raise KeyError(
KeyError: "Only a column name can be used for the key in a dtype mappings argument. 'Chromosome' not found in columns."
My chrom.sizes file for Anopheles gambiae looks like this, in case that helps (it was generated using samtools faidx as instructed):
AgamP4_2L 49364325
AgamP4_2R 61545105
AgamP4_3L 41963435
AgamP4_3R 53200684
AgamP4_UNKN 42389979
AgamP4_X 24393108
AgamP4_Y_unplaced 237045
AAAB01000047 21505
AAAB01000163 28420
AAAB01000448 22809
AAAB01000791 62303
(..more contigs..)
AgamP4_Mt 15363
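With a non-model genome like this one, one thing worth ruling out is a naming mismatch between the chrom.sizes file and the annotation. The following is a minimal sketch of such a check, not part of LAPA and not a confirmed cause of the KeyError above; the function names are hypothetical, and it assumes a whitespace-separated chrom.sizes file and a tab-separated GTF whose first column is the sequence name.

```python
def read_chrom_sizes(lines):
    """Parse 'name<whitespace>size' lines into a {name: size} dict."""
    sizes = {}
    for line in lines:
        fields = line.split()
        # Skip blank lines and anything whose second field is not an integer.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        sizes[fields[0]] = int(fields[1])
    return sizes


def gtf_seqnames(lines):
    """Collect the distinct sequence names (column 1) of a GTF."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # header/comment or empty line
        names.add(line.split("\t", 1)[0])
    return names


def missing_from_chrom_sizes(chrom_sizes, seqnames):
    """Sequence names used in the annotation but absent from chrom.sizes."""
    return sorted(set(seqnames) - set(chrom_sizes))


if __name__ == "__main__":
    sizes = read_chrom_sizes(["AgamP4_2L 49364325", "AgamP4_X 24393108"])
    gtf = ["AgamP4_2L\tsrc\tthree_prime_utr\t100\t200\t.\t+\t.\tgene_id \"g1\";",
           "2R\tsrc\tthree_prime_utr\t100\t200\t.\t-\t.\tgene_id \"g2\";"]
    print(missing_from_chrom_sizes(sizes, gtf_seqnames(gtf)))
```

In practice the two inputs would be opened files (for example `open("chrom_sizes")`), and an empty result means every annotated sequence name is present in chrom.sizes. The same comparison can be repeated against the reference names in the BAM header.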
Hi Muhammed,
I'm trying to test lapa with RNA-seq short reads. I'm using hisat2 for the mapping (I built the hg38 index with transcripts using the files suggested in the lapa tutorial), and my Python version is 3.9.
After fixing the GTF file and putting all the inputs in the right format, lapa failed while processing the BAM for the first sample with the following error:
$ lapa --alignment samples.csv --fasta genome.fa --annotation genome_utr.gtf --chrom_sizes chrom_sizes --output_dir lapa_test
Traceback (most recent call last):
File "/home/eortiz/.local/bin/lapa", line 8, in
sys.exit(cli_lapa())
File "/zfs/gcl/software/gbf/anaconda3/2021.11/lib/python3.9/site-packages/click/core.py", line 1128, in call
return self.main(*args, **kwargs)
File "/zfs/gcl/software/gbf/anaconda3/2021.11/lib/python3.9/site-packages/click/core.py", line 1053, in main
rv = self.invoke(ctx)
File "/zfs/gcl/software/gbf/anaconda3/2021.11/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/zfs/gcl/software/gbf/anaconda3/2021.11/lib/python3.9/site-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/home/eortiz/.local/lib/python3.9/site-packages/lapa/main.py", line 112, in cli_lapa
lapa(alignment, fasta, annotation, chrom_sizes, output_dir,
File "/home/eortiz/.local/lib/python3.9/site-packages/lapa/lapa.py", line 497, in lapa
_lapa(alignment)
File "/home/eortiz/.local/lib/python3.9/site-packages/lapa/lapa.py", line 288, in call
df_all_count, sample_counts = self.counting(alignment)
File "/home/eortiz/.local/lib/python3.9/site-packages/lapa/lapa.py", line 142, in counting
df_all_count, sample_counts = counter.to_df()
File "/home/eortiz/.local/lib/python3.9/site-packages/lapa/count.py", line 583, in to_df
df = pd.concat([
File "/home/eortiz/.local/lib/python3.9/site-packages/lapa/count.py", line 584, in
self.build_counter(row['path'])
File "/home/eortiz/.local/lib/python3.9/site-packages/lapa/count.py", line 142, in to_df
return self.to_gr().df.astype({'Chromosome': 'str', 'Strand': 'str'})
File "/zfs/gcl/software/gbf/anaconda3/2021.11/lib/python3.9/site-packages/pandas/core/generic.py", line 5791, in astype
raise KeyError(
KeyError: 'Only a column name can be used for the key in a dtype mappings argument.'
I know this error is generated when the names in the columns don't match exactly, but I'm not so sure how to fix it.
Any suggestion is welcome.
Thanks.
Dear @MuhammedHasan
Great work on LAPA; I am planning to start using it very soon. I have a question regarding the poly(A) motifs. According to the preprint, LAPA looks for the canonical AATAAA. Is this the only motif it searches for before determining polyA site usage? Have you considered adding other motifs, such as:
aataaa
attaaa
agtaaa
tataaa
cataaa
gataaa
aatata
aataca
aataga
aaaaag
actaaa
aagaaa
aatgaa
tttaaa
aaaaca
ggggct
I have done some preliminary analysis with SQANTI3, and the motif distribution changes between cell types: AATAAA is used 70% of the time in one and 46% in another. The second most used is ATTAAA. Others, such as AAAAAG and AGTAAA, change the most as percentages between cell types.
Kind Regards
Mustafa
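To make the request concrete, here is a minimal sketch of scanning the sequence upstream of a candidate polyA site for any of the motifs listed above. This is not LAPA's implementation; the function name `find_polya_signals` is hypothetical, and how much upstream sequence to pass in is left to the caller.

```python
# Motifs from the list above, upper-cased.
MOTIFS = [
    "AATAAA", "ATTAAA", "AGTAAA", "TATAAA", "CATAAA", "GATAAA",
    "AATATA", "AATACA", "AATAGA", "AAAAAG", "ACTAAA", "AAGAAA",
    "AATGAA", "TTTAAA", "AAAACA", "GGGGCT",
]


def find_polya_signals(upstream_seq, motifs=MOTIFS):
    """Return (motif, 0-based offset) for every motif occurrence.

    Overlapping occurrences are reported; results are ordered by offset.
    """
    seq = upstream_seq.upper()
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((motif, start))
            start = seq.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Toy sequence containing one non-canonical and one canonical signal.
    print(find_polya_signals("attaaannaataaa"))
```

Counting the returned motifs per sample or cell type would give the kind of usage percentages described above.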
Dear Lapa,
I have run your tool, currently as a test. Could you add some documentation, or pass on some wisdom, on which output files to expect?
I have bw and bed files for all my conditions. I also have polyA_cluster.bed (what do the columns stand for? What does "None@None" in the last columns mean?).
I have no other files. I was wondering if the program terminated early; however, there is no error or warning in any of the Slurm output files (the RAM limit was not exceeded).
Can you please advise?
Kind regards,
Pete
I used LAPA on PacBio data aligned with minimap2
and got the following error:
File "/usr/nzx-cluster/apps/lapa/python3.8.11/bin/lapa", line 8, in
sys.exit(cli_lapa())
File "/usr/nzx-cluster/apps/lapa/python3.8.11/lib/python3.8/site-packages/click/core.py", line 1157, in call
return self.main(*args, **kwargs)
File "/usr/nzx-cluster/apps/lapa/python3.8.11/lib/python3.8/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/usr/nzx-cluster/apps/lapa/python3.8.11/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/nzx-cluster/apps/lapa/python3.8.11/lib/python3.8/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/usr/nzx-cluster/apps/lapa/python3.8.11/lib/python3.8/site-packages/lapa/main.py", line 112, in cli_lapa
lapa(alignment, fasta, annotation, chrom_sizes, output_dir,
File "/usr/nzx-cluster/apps/lapa/python3.8.11/lib/python3.8/site-packages/lapa/lapa.py", line 497, in lapa
_lapa(alignment)
File "/usr/nzx-cluster/apps/lapa/python3.8.11/lib/python3.8/site-packages/lapa/lapa.py", line 297, in call
df_cluster = self.annotate_cluster(df_cluster)
File "/usr/nzx-cluster/apps/lapa/python3.8.11/lib/python3.8/site-packages/lapa/lapa.py", line 155, in annotate_cluster
df = self.create_genomic_regions().annotate(gr)
File "/usr/nzx-cluster/apps/lapa/python3.8.11/lib/python3.8/site-packages/lapa/genomic_regions.py", line 66, in annotate
gr_ann = pr.PyRanges(gr.df, int64=True).join(
File "/usr/nzx-cluster/apps/lapa/python3.8.11/lib/python3.8/site-packages/pyranges/pyranges_main.py", line 2433, in join
dfs = pyrange_apply(_write_both, self, other, **kwargs)
File "/usr/nzx-cluster/apps/lapa/python3.8.11/lib/python3.8/site-packages/pyranges/multithreaded.py", line 207, in pyrange_apply
assert (
AssertionError: Can only do stranded operations when both PyRanges contain strand info
Hi, I am using lapa for DRS and cDNA ONT data. While it runs smoothly on the DRS data, with the cDNA reads it throws an error at the clustering stage.
I used the following command:
lapa --alignment alignment.csv --fasta /references/reference/ucsc/rn7.fa --annotation /references/reference/ucsc/lapa_utrs_ncbiRefSeq.gtf --chrom_sizes /references/reference/ucsc/chrom_sizes.txt --output_dir /ANALYSES/rat/cDNA/LAPA
And here is the traceback:
Traceback (most recent call last):
File "/usr/local/software/lapa/eb16fee/bin/lapa", line 11, in
load_entry_point('lapa==0.0.5', 'console_scripts', 'lapa')()
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/click/core.py", line 1128, in call
return self.main(*args, **kwargs)
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/click/core.py", line 1053, in main
rv = self.invoke(ctx)
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/click/core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/lapa/main.py", line 122, in cli_lapa
non_replicates_read_threhold=non_replicates_read_threhold)
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/lapa/lapa.py", line 497, in lapa
_lapa(alignment)
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/lapa/lapa.py", line 297, in call
df_cluster = self.annotate_cluster(df_cluster)
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/lapa/lapa.py", line 155, in annotate_cluster
df = self.create_genomic_regions().annotate(gr)
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/lapa/genomic_regions.py", line 67, in annotate
gr_gtf, strandedness='same', how='left')
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/pyranges/pyranges.py", line 2257, in join
dfs = pyrange_apply(_write_both, self, other, **kwargs)
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/pyranges/multithreaded.py", line 236, in pyrange_apply
result = call_f(function, nparams, df, odf, kwargs)
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/pyranges/multithreaded.py", line 23, in call_f
return f.remote(df, odf, **kwargs)
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/pyranges/methods/join.py", line 129, in _write_both
scdf, ocdf = _both_dfs(scdf, ocdf, how=how)
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/pyranges/methods/join.py", line 83, in _both_dfs
oh = null_types(ocdf.head(1))
File "/usr/local/software/lapa/eb16fee/lib/python3.6/site-packages/pyranges/methods/join.py", line 67, in null_types
tmp_cat = tmp_cat.cat.add_categories("-1")
File "/usr/local/software/python/3.6.11/lib/python3.6/site-packages/pandas/core/accessor.py", line 89, in f
return self._delegate_method(name, *args, **kwargs)
File "/usr/local/software/python/3.6.11/lib/python3.6/site-packages/pandas/core/arrays/categorical.py", line 2403, in _delegate_method
res = method(*args, **kwargs)
File "/usr/local/software/python/3.6.11/lib/python3.6/site-packages/pandas/core/arrays/categorical.py", line 1023, in add_categories
raise ValueError(msg.format(already_included=already_included))
ValueError: new categories must not include old categories: {'-1'}
I would be grateful for help solving this issue.
Hi Muhammed,
I got the following pyranges error:
Traceback (most recent call last):
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/bin/lapa", line 8, in
sys.exit(cli_lapa())
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/click/core.py", line 1130, in call
return self.main(*args, **kwargs)
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/main.py", line 112, in cli_lapa
lapa(alignment, fasta, annotation, chrom_sizes, output_dir,
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/lapa.py", line 497, in lapa
_lapa(alignment)
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/lapa.py", line 288, in call
df_all_count, sample_counts = self.counting(alignment)
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/lapa.py", line 142, in counting
df_all_count, sample_counts = counter.to_df()
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/count.py", line 583, in to_df
df = pd.concat([
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/count.py", line 584, in
self.build_counter(row['path'])
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/count.py", line 142, in to_df
return self.to_gr().df.astype({'Chromosome': 'str', 'Strand': 'str'})
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/count.py", line 136, in to_gr
return pr.PyRanges(df).count_overlaps(
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/pyranges/pyranges_main.py", line 1385, in count_overlaps
counts = pyrange_apply(_number_overlapping, self, other, **kwargs)
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/pyranges/multithreaded.py", line 231, in pyrange_apply
result = call_f(function, nparams, df, odf, kwargs)
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/pyranges/multithreaded.py", line 21, in call_f
return f.remote(df, odf, **kwargs)
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/pyranges/methods/coverage.py", line 26, in _number_overlapping
_self_indexes, _other_indexes = oncls.all_overlaps_both(starts, ends, indexes)
File "ncls/src/ncls.pyx", line 74, in ncls.src.ncls.NCLS64.all_overlaps_both
File "ncls/src/ncls.pyx", line 115, in ncls.src.ncls.NCLS64.all_overlaps_both
File "<array_function internals>", line 5, in resize
File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 1423, in resize
raise ValueError('all elements of new_shape must be non-negative')
ValueError: all elements of new_shape must be non-negative
In case it helps, here is one read from the BAM file:
molecule/4051_GGCAATACTCGTGACC_B900_Tum_B900_Tum 16 chr1 14424 12 406M140N69M757N108M1I44M659N159M92N198M177N56M GATTGGTGTGCCGTTTTCTCTGGAAGCCTCTTAAGAACACTGTGGCGCAGGCTGGGTGGAGCCGTCCCCCCATGGAGCACAGGCAGACAGAAGTCCCCGCCCCAGCTGTGTGGCCTCAAGCCAGCCTTCCGCTCCTTGAAGCTGGTCTCCACACAGTGCTGGTTCCGTCACCCCCTCCCAAGGAAGTAGGTCTGAGCAGCTTGTCCTGGCTGTGTCCATGTCAGAGCAACGGCCCAAGTCTGGGTCTGGGGGGGAAGGTGTCATGGAGCCCCCTACGATTCCCAGTCGTCCTCGTCCTCCTCTGCCTGTGGCTGCTGCGGTGGCGGCAGAGGAGGGATGGAGTCTGACACGCGGGCAAAGGCTCCTCCGGGCCCCTCACCAGCCCCAGGTCCTTTCCCAGAGATGCCCTTGCGCCTCATGACCAGCTTGTTGAAGAGATCCGACATCAAGTGCCCACCTTGGCTCGTGGCTCTCACTTGCTCCTGCTCCTTCTGCTGCTTCTTCTCCAGCTTTCGCTCCTTCATGCTGCGCAGCTTGGCCTTGCCGATGCCCCCAGCTTGGCGGATGGACTCTAGCAGAGTGGCCCAGCCACCGGAGGGGTCAACCACTTCCCTGGGAGCTCCCTGGACTGAAGGAGACGCGCTGCTGCTGCTGTCGTCCTGCCTGGCGCCTTGGCCTACAGGGGCCGCGGTTGAGGGTGGGAGTGGGGGTGCACTGGCCAGCACCTCAGGAGCTGGGGGTGGTGGTGGGGGCGGTGGGGGTGGTGTTAGTACCCCATCTTGTAGGTCTTGAGAGGCTCGGCTACCTCAGTGTGGAAGGTGGGCAGTTCTGGAATGGTGCCAGGGGCAGAGGGGGCAATGCCGGGGCCCAGGTCGGCAATGTACATGAGGTCGTTGGCAATGCCGGGCAGGTCAGGCAGGTAGGATGGAACATCAATCTCAGGCACCTGGCCCAGGTCTGGCACATAGAAGTAGTTCTCTGGGACCTGCTGTTCCAGCTGCTCTCTCTTGCTGATGGACAAGGGGGCATCAAACAGCTTCT * NM:i:3 ms:i:1031 AS:i:87nn:i:0 ts:A:+ tp:A:P cm:i:307 s1:i:987 s2:i:975 de:f:0.0029 rl:i:0
Let me know if you need any further detail.
Thanks for the help
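For anyone checking reads like the one above against the chrom sizes file, the following sketch derives the reference span and end coordinate from the POS and CIGAR fields. It is only a diagnostic aid under the SAM conventions (1-based POS; M, D, N, = and X consume the reference), not a confirmed explanation of the ValueError, and the function names are hypothetical.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance the reference


def reference_span(cigar):
    """Number of reference bases covered by the alignment, introns included."""
    return sum(int(length) for length, op in CIGAR_RE.findall(cigar)
               if op in REF_CONSUMING)


def reference_end(pos, cigar):
    """1-based inclusive end coordinate for a 1-based leftmost POS."""
    return pos + reference_span(cigar) - 1


def is_reverse(flag):
    """True when the SAM flag has the reverse-strand bit (0x10) set."""
    return bool(flag & 0x10)


if __name__ == "__main__":
    # FLAG, POS and CIGAR taken from the read shown above.
    cigar = "406M140N69M757N108M1I44M659N159M92N198M177N56M"
    print(is_reverse(16), reference_span(cigar), reference_end(14424, cigar))
```

Any read whose computed end exceeds the length listed for its chromosome in the chrom sizes file would point to a mismatch between the alignment reference and the inputs given to lapa.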