algolab / galig
A graph aligner
License: GNU General Public License v3.0
Hi,
I would really appreciate your help with this. I keep getting an error saying that the reference genome was not found.
Here is the command I used:
dixi06@nia-login05:/scratch/a/amaclea3/dixi06/JeanData/Jdata$ singularity exec ./asgal_v1.1.6.sif /galig/asgal -g Medicago_truncatula.MedtrA17_4.0.dna.chromosome.1.fa -a annotation.gtf -s NRNSA_S103_R1_001.fastq.gz -o "LATD Medtr1g009200"
Thanks.
-Dixi
I found that ASGAL reports only the number of supporting reads for each splicing event.
Could you please let me know (with code if possible) how to compute PSI values for each of these events?
Thank you,
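For reference, PSI (percent spliced in) is commonly defined as the number of reads supporting inclusion of an event divided by the sum of the reads supporting inclusion and the reads supporting exclusion. The following is a minimal sketch under that definition only. It assumes both counts are available for an event; how to derive the exclusion count from ASGAL's output is not covered here, and the function name and arguments are hypothetical rather than part of ASGAL. It also ignores any normalization for event or read length that some tools apply.

```python
from typing import Optional


def psi(inclusion_reads: int, exclusion_reads: int) -> Optional[float]:
    """Return PSI = inclusion / (inclusion + exclusion), or None if no reads.

    Hypothetical helper: both counts are assumed to be supplied by the
    caller; this is not an ASGAL function.
    """
    if inclusion_reads < 0 or exclusion_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # PSI is undefined when no read supports either form
    return inclusion_reads / total


if __name__ == "__main__":
    print(psi(30, 10))  # 30 / (30 + 10) = 0.75
```

Returning None for events with no supporting reads keeps undefined values distinguishable from a genuine PSI of 0.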
Hi, I'm attempting to run ASGAL via the Docker image and I encounter an issue in the salmon quant step.
The error is:
Traceback (most recent call last):
  File "/galig/asgal", line 585, in <module>
    main()
  File "/galig/asgal", line 576, in main
    runSalmon(args)
  File "/galig/asgal", line 210, in runSalmon
    command_check_return(salmon_quant_cmd, salmonBam, salmonQuantLog, shell=True, verbose=args.verbose)
  File "/galig/asgal", line 62, in command_check_return
    completed_process.check_returncode()
  File "/usr/lib/python3.8/subprocess.py", line 444, in check_returncode
    raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '/galig/salmon/bin/salmon quant -p 2 -i /data/output/salmon/salmon_index -l A -1 /data/sample_1.fq -2 /data/sample_2.fq -o /data/output/salmon/salmon_out --no-version-check --validateMappings --writeMappings --writeUnmappedNames | samtools view -Sb - | samtools sort -' returned non-zero exit status 1.
When I check the log/salmon_quant.log file, it appears to be caused by a permission-denied error:
[E::hts_open_format] Failed to open file "./samtools.71.441.tmp.0000.bam" : Permission denied
samtools sort: failed to create temporary file "./samtools.71.441.tmp.0000.bam": Permission denied
Do you have any idea how to fix this?
Thanks!
Rachel
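The log above shows samtools sort trying to create its temporary file in the current directory ("./samtools...tmp...bam"), which is where it puts temporaries when the sorted output goes to standard output. samtools sort accepts a -T PREFIX option to place temporaries elsewhere. The sketch below only illustrates building such a command around a directory that has been checked for write access. Whether ASGAL offers a way to pass this through is not stated in the report, and the function names are hypothetical.

```python
import os
import tempfile
from typing import List


def is_writable_dir(path: str) -> bool:
    """True if path is an existing directory in which files can be created."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


def samtools_sort_args(tmp_dir: str, out_bam: str) -> List[str]:
    """Build a `samtools sort` command whose temporaries live under tmp_dir.

    Hypothetical helper: it only assembles the argument list and does not
    run samtools. The trailing "-" makes samtools read from standard input.
    """
    if not is_writable_dir(tmp_dir):
        raise PermissionError(f"{tmp_dir} is not a writable directory")
    prefix = os.path.join(tmp_dir, "samtools_tmp")
    return ["samtools", "sort", "-T", prefix, "-o", out_bam, "-"]


if __name__ == "__main__":
    # The system temporary directory is used purely as an example location.
    print(" ".join(samtools_sort_args(tempfile.gettempdir(), "sorted.bam")))
```

An alternative that needs no change to the command is to start the container with a working directory that the running user can write to.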
Hello everyone,
I have been using ASGAL for some time now and I'm very happy with the results; congratulations on the implementation.
Lately I have been working with samples that carry the ALK ATI isoform. ASGAL hasn't been successful at calling this event. After running the program on a few samples (all with this alteration), I have started to think that ASGAL may not be designed to identify this type of event (based on my reading of the documentation), but before I jump to that conclusion I would like to know your thoughts. This image provides a nice description of the ALK ATI event.
Let me know if I should provide additional information.
Thanks in advance!
Hi Luca,
It's me again. I ran ASGAL in the Docker image you gave me and ran into an issue that took me a while to fix because of a misleading error message:
Starting ASGAL run for /MOUNT/input/fastq/subsample_15_100K_1.fastq and /MOUNT/input/fastq/subsample_15_100K_2.fastq ...
[ Oct 28, 2020 - 2:28:46PM ] args Namespace(allevents=False, annoPath='/MOUNT/input/splicing_variants.gtf', debug=False, e='3', l='15', multiMode=True, outputPath='/MOUNT/output/asgal-output/subsample_15_100K_1-output', refPath='/MOUNT/input/Homo_sapiens.GRCh38.dna.primary_assembly.fa', sample1Path='/MOUNT/input/fastq/subsample_15_100K_1.fastq', sample2Path='/MOUNT/input/fastq/subsample_15_100K_2.fastq', split_only=False, threads='6', transPath='/MOUNT/input/custom_transcripts.fasta', verbose=False, w='3')
Transcripts file /MOUNT/input/splicing_variants.gtf not found. Halting...
I have no name!@061a56a9b51a:/$ ls /MOUNT/input/splicing_variants.gtf
/MOUNT/input/splicing_variants.gtf
The transcript file was missing, but the message told me that the GTF was missing.
I solved this just after I posted the issue here :P
I'll close this issue, but please note the misleading error message.
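The mix-up described above (the message names the annotation path while the file that is actually absent is the transcripts file) can be avoided by pairing each path with its own label when checking inputs. The following is only a sketch of that idea, not ASGAL's actual validation code; the function name and the labels are hypothetical.

```python
import os
from typing import Dict, List


def find_missing_inputs(inputs: Dict[str, str]) -> List[str]:
    """Return one message per missing file, naming the input it belongs to.

    `inputs` maps a human-readable label to a path, so the label printed
    always comes from the same entry as the path that was checked.
    """
    messages = []
    for label, path in inputs.items():
        if not os.path.isfile(path):
            messages.append(f"{label} file {path} not found.")
    return messages


if __name__ == "__main__":
    # Paths taken from the log above, used purely as an example.
    problems = find_missing_inputs({
        "Annotation": "/MOUNT/input/splicing_variants.gtf",
        "Transcripts": "/MOUNT/input/custom_transcripts.fasta",
    })
    for line in problems:
        print(line)
```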
I installed python3, biopython, pysam, gffutils, pandas, cmake, samtools, and zlib in a newly created conda environment:
conda create -n asgal -y
conda activate asgal
conda install python=3.6 -y
conda install biopython
pip install pysam
pip install pandas
conda install samtools -y
conda install cmake -y
conda install gffutils -y
conda install zlib
I then executed the following commands:
git clone --recursive https://github.com/AlgoLab/galig.git
cd galig
make prerequisites
make
However, when I try to run asgal (./asgal -h), I get an error:
Traceback (most recent call last):
  File "./asgal", line 12, in <module>
    from Bio import SeqIO
ModuleNotFoundError: No module named 'Bio'
How can I fix this problem?
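One possible cause, offered here as an assumption rather than a diagnosis, is that the interpreter that runs the asgal script is not the one into which the packages were installed (for example a system python3 instead of the conda environment's python). A small check like the one below, run with the same interpreter that runs asgal, shows which interpreter is in use and which of the required modules it cannot find. The function name is hypothetical.

```python
import importlib.util
import sys
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # sys.executable is the interpreter actually running this script.
    print("interpreter:", sys.executable)
    print("missing:", missing_modules(["Bio", "pysam", "gffutils", "pandas"]))
```

If the interpreter printed here differs from the one inside the conda environment, installing the packages for that interpreter, or running the script explicitly with the environment's python, would be the next thing to try.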
Hi Luca,
We wrote earlier,
It seems that you don't have biopython installed. But from your first message, it seems that you installed it... Can you import the Bio module from the python3 shell?
I apologize for the delay. I was a little too quick with the Dockerfile I wrote for you; unfortunately, it still doesn't work.
I've been trying to use my Docker image with Biopython installed. However, even though the Python shell can import pandas and Bio, asgal cannot:
I have no name!@2cfb5973d9f4:/myvol1$ asgal --multi -g Homo_sapiens.GRCh38.dna.primary_assembly.fa -a splicing_variants.gtf -s one_test_1.fastq -s2 one_test_2.fastq -t splicing_variants_transcripts.fa -o asgal_results
Traceback (most recent call last):
  File "/opt/galig/asgal", line 8, in <module>
    import pandas as pd
ModuleNotFoundError: No module named 'pandas'
I have no name!@2cfb5973d9f4:/myvol1$ python
Python 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>>
>>> import Bio
>>>
Furthermore, may I draw your attention to your own Dockerfile as well? Building it produced the following error:
...
In file included from /galig/sdsl-lite/compiled/include/sdsl/rrr_vector.hpp:27:0,
from /galig/sdsl-lite/compiled/include/sdsl/bit_vectors.hpp:10,
from /galig/src/SplicingGraph.hpp:12,
from /galig/src/SplicingGraph.cpp:1:
/galig/sdsl-lite/compiled/include/sdsl/rrr_helper.hpp: In constructor 'sdsl::binomial_coefficients<n>::impl::impl() [with short unsigned int n = 63]':
/galig/sdsl-lite/compiled/include/sdsl/rrr_helper.hpp:251:9: internal compiler error: Segmentation fault
impl() {
^~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
/galig/Makefile:150: recipe for target 'SplicingGraph.o' failed
make[1]: *** [SplicingGraph.o] Error 1
target.mk:16: recipe for target '/galig/obj' failed
make: *** [/galig/obj] Error 2
The command '/bin/sh -c git clone --recursive https://github.com/AlgoLab/galig.git ; cd galig ; make prerequisites ; make' returned a non-zero code: 2
I apply the patch you gave me in the RUN command of my Dockerfile, before I build salmon and asgal:
wget https://github.com/AlgoLab/galig/files/4437983/CMakeLists.txt.patch.txt ;\
git apply CMakeLists.txt.patch.txt ;\
Of course, you are welcome to a complete Dockerfile as soon as we can compose one together, if you can help.
I am running into a group permission issue with Docker.
docker run -v "$PWD"/input:/data algolab/asgal:v1.1.1
Starting with UID:GID 0:0
groupadd: GID '0' already exists
useradd: group 'group' does not exist
error: failed switching to "user:group": unable to find user user: no matching entries in passwd file
Hi,
I am currently working on paired-end RNA-seq data from a human sample. I need to check the alternative splicing events of a specific gene, NEK1, in the sample, and possibly of all genes afterwards. My sample has no replicates. I was trying to use ASGAL, but when I run SpliceAwareAligner to generate the sample.mem file from the FASTQ files, no output is produced. I am using an annotation GTF for that gene only, extracted from the hg19 human genome GTF.
Here is the command I am using:
bin/SpliceAwareAligner -g hg19.fa -a annotation_NEK1.gtf -s NEK1_NM1.fastq -o asgal/NEK1_NM1_output.mem
Please help me run the tool; I need the data urgently.
Greetings, maintainers,
I need your tool in a Docker container. However, I haven't been able to install it. Could you help me out?
$ docker run -it --rm ubuntu:latest
root@<container-id>:/docker_main# cat /etc/issue
Ubuntu 18.04.4 LTS \n \l
root@<container-id>/docker_main# apt-get update && apt-get install build-essential git python3 python3-pip python3-setuptools python3-biopython python3-biopython-sql python3-pysam cmake libboost1.65-all-dev samtools unzip wget curl zlib1g-dev liblzma-dev libjemalloc-dev libjemalloc1 libghc-bzlib-dev libgff-dev libtbb-dev
root@<container-id>/docker_main# pip3 install gffutils; git clone --recursive https://github.com/AlgoLab/galig.git ; cd galig; make prerequisites
.
..
...
[ 23%] Completed 'libstadenio'
[ 23%] Built target libstadenio
Scanning dependencies of target libtbb
[ 24%] Creating directories for 'libtbb'
[ 25%] Performing download step for 'libtbb'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 125 100 125 0 0 679 0 --:--:-- --:--:-- --:--:-- 679
100 126 100 126 0 0 345 0 --:--:-- --:--:-- --:--:-- 345
100 2843k 0 2843k 0 0 2008k 0 --:--:-- 0:00:01 --:--:-- 5407k
tbb-2018_U3.tgz: FAILED
sha256sum: WARNING: 1 computed checksum did NOT match
tbb-2018_U3.tgz did not match expected SHA256! Exiting.
CMakeFiles/libtbb.dir/build.make:89: recipe for target 'libtbb-prefix/src/libtbb-stamp/libtbb-download' failed
make[4]: *** [libtbb-prefix/src/libtbb-stamp/libtbb-download] Error 1
CMakeFiles/Makefile2:178: recipe for target 'CMakeFiles/libtbb.dir/all' failed
make[3]: *** [CMakeFiles/libtbb.dir/all] Error 2
Makefile:162: recipe for target 'all' failed
make[2]: *** [all] Error 2
[ 8%] Built target libcereal
[ 15%] Built target libdivsufsort
[ 23%] Built target libstadenio
[ 24%] Performing download step for 'libtbb'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 125 100 125 0 0 1893 0 --:--:-- --:--:-- --:--:-- 1893
100 126 100 126 0 0 1482 0 --:--:-- --:--:-- --:--:-- 1482
100 2843k 0 2843k 0 0 2660k 0 --:--:-- 0:00:01 --:--:-- 3650k
tbb-2018_U3.tgz: FAILED
sha256sum: WARNING: 1 computed checksum did NOT match
tbb-2018_U3.tgz did not match expected SHA256! Exiting.
CMakeFiles/libtbb.dir/build.make:89: recipe for target 'libtbb-prefix/src/libtbb-stamp/libtbb-download' failed
make[4]: *** [libtbb-prefix/src/libtbb-stamp/libtbb-download] Error 1
CMakeFiles/Makefile2:178: recipe for target 'CMakeFiles/libtbb.dir/all' failed
make[3]: *** [CMakeFiles/libtbb.dir/all] Error 2
Makefile:162: recipe for target 'all' failed
make[2]: *** [all] Error 2
/galig/Makefile:126: recipe for target '/galig/salmon/bin/salmon' failed
make[1]: *** [/galig/salmon/bin/salmon] Error 2
target.mk:16: recipe for target '/galig/obj' failed
make: *** [/galig/obj] Error 2
Any help would be much appreciated,
Thanking you,
Amit
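The failure above is a SHA-256 mismatch on the downloaded tbb-2018_U3.tgz, which means the file that was fetched is not byte-for-byte the one the build script expects. To inspect such a download by hand, the digest can be recomputed and compared with the expected value from the build script. The sketch below shows only the digest computation. The expected hash is not given in the log, so it is left to the reader to supply, and the function name is hypothetical.

```python
import hashlib
import sys


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Usage: python sha256_check.py tbb-2018_U3.tgz
    for file_path in sys.argv[1:]:
        print(sha256_of(file_path), file_path)
```

Looking at the first bytes of the downloaded file can also help: if it is an HTML page rather than a gzip archive, the download URL no longer serves the archive directly.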
It would be a nice feature to allow the user to define a temporary directory and use it when running samtools and salmon.
See #15
Hello, I had a more general question about running ASGAL in genome-wide mode.
I have a dataset where I know there are 3 novel retained intron events. I was wondering whether, after the pre-filtering step performed by quasi-mapping with Salmon, the reads mapping to these novel retained introns would still be included in the downstream alternative splicing analysis.
I'm asking because I have run ASGAL in genome-wide mode but fail to detect any events. When I look at the output SAM file for a gene that should have a retained intron, there is no coverage (whereas when I look with a more typical tool for spliced alignment to the reference genome, e.g. STAR, there is coverage). I have attached a screenshot of what I mean.
Could you let me know if I am misunderstanding something, or should be running the tool differently?
Thanks!
Rachel
Hi, I encountered the following error when testing with your example data after installation. Do you know what caused the issue? Could it be due to an improper installation?
Thanks!
File "/opt/galig/asgal", line 54
eprint(f"command: '{' '.join(command)}'")
^
SyntaxError: invalid syntax
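The line flagged in the error uses an f-string, a syntax introduced in Python 3.6, so an older interpreter rejects it with exactly this kind of SyntaxError. Which interpreter was used is not stated in the report, so this is a likely explanation rather than a confirmed one. The sketch below checks the running interpreter's version. It deliberately avoids f-strings and type annotations so that it also parses on older interpreters. The function name is hypothetical.

```python
import sys

MIN_VERSION = (3, 6)  # f-strings were added in Python 3.6


def supports_fstrings(version_info=sys.version_info):
    """Return True if the given interpreter version can parse f-strings."""
    return tuple(version_info[:2]) >= MIN_VERSION


if __name__ == "__main__":
    if not supports_fstrings():
        sys.exit("Python %d.%d found; 3.6 or newer is needed" % tuple(sys.version_info[:2]))
    print("interpreter is new enough: %s" % sys.executable)
```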
Hi,
I am trying to run ASGAL and would like to know how the files in the logs/ASGAL folder are produced and what they mean. In genome-wide mode, are they produced for each gene in the genome annotation or only for some of them? If only some, which ones?
Thank you!
Hi,
I am using ASGAL to find MET14 deletion and EGFR variation events in my samples. I am running the genome-wide analysis for these two genes. The ASGAL run completes successfully, but the .mem and SAM files are empty, so no events are reported in the final files.
The command I used is:
./asgal --multi -g genome.fa -a annotation2.gtf -s sample1.fastq.gz -s2 sample2.fastq.gz -t transcript.fa --allevents -o output
Could you please help me with this as soon as possible?
I installed ASGAL in a virtual environment on a shared HPC cluster using Python 3.8 and installed all the required packages with pip install. When I run it, ASGAL fails with an error saying it cannot find Salmon:
[ Mar 08, 2021 - 10:44:56AM ] args Namespace(allevents=False, annoPath='/GENOMEFILES/ensemble_genomefasta/Homo_sapiens.GRCh38.100.gtf', debug=False, e='3', l='15', multiMode=True, outputPath='/SOFTWARES/asgalvm/output/R01', refPath='/GENOMEFILES/ensemble_genomefasta/Homo_sapiens.GRCh38.dna.primary_assembly.fa', sample1Path='/U2OS/u2os_rawdata/63-Z01-F001/raw_data/R01/R01_1_val_1.fq.gz', sample2Path='/U2OS/u2os_rawdata/63-Z01-F001/raw_data/R01/R01_2_val_2.fq.gz', split_only=False, threads='2', transPath='/GENOMEFILES/ensemble_genomefasta/Homo_sapiens.GRCh38.cds.all.fa.gz', verbose=False, w='3')
[ Mar 08, 2021 - 10:44:56AM ] Opening input annotation...
[ Mar 08, 2021 - 10:44:56AM ] Splitting input annotation...
[ Mar 08, 2021 - 10:45:05AM ] number of genes 60683
[##################################################] 60683/60683
[ Mar 08, 2021 - 10:50:03AM ] Done.
[ Mar 08, 2021 - 10:50:03AM ] Splitting input reference...
[ Mar 08, 2021 - 10:50:54AM ] Done.
[ Mar 08, 2021 - 10:50:54AM ] Running Salmon indexing...
Traceback (most recent call last):
  File "/SOFTWARES/asgalvm/galig/asgal", line 585, in <module>
    main()
  File "/SOFTWARES/asgalvm/galig/asgal", line 576, in main
    runSalmon(args)
  File "SOFTWARES/asgalvm/galig/asgal", line 183, in runSalmon
    command_check_return(salmon_index_cmd, salmonIndexLog, salmonIndexLog, verbose=args.verbose)
  File "/SOFTWARES/asgalvm/galig/asgal", line 57, in command_check_return
    completed_process = subprocess.run(command,
  File "/cluster/software/Python/3.8.6-GCCcore-10.2.0/lib/python3.8/subprocess.py", line 489, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/cluster/software/Python/3.8.6-GCCcore-10.2.0/lib/python3.8/subprocess.py", line 854, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/cluster/software/Python/3.8.6-GCCcore-10.2.0/lib/python3.8/subprocess.py", line 1702, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: /SOFTWARES/asgalvm/galig/salmon/bin/salmon
I would greatly appreciate any help in resolving this error.
Thanks
best
Sa
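The traceback shows ASGAL looking for the Salmon binary at salmon/bin/salmon inside the galig directory, next to the asgal script. Other reports on this page show the project Makefile building that same path, so installing only the Python packages with pip would not create it. Whether the build steps were run here is not stated, so treat this as an assumption. A minimal sketch for checking that the binary exists and is executable, with hypothetical function names:

```python
import os


def expected_salmon_path(galig_dir):
    """Return the Salmon location relative to the galig checkout.

    The salmon/bin/salmon layout is taken from the tracebacks above.
    """
    return os.path.join(galig_dir, "salmon", "bin", "salmon")


def is_executable_file(path):
    """True if path is a regular file that the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    # Directory taken from the report above, used purely as an example.
    salmon = expected_salmon_path("/SOFTWARES/asgalvm/galig")
    print(salmon, "ok" if is_executable_file(salmon) else "missing or not executable")
```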