Comments (15)

ldenti avatar ldenti commented on July 29, 2024

Hi,
the error occurs while compiling tbb (a dependency of salmon): it seems that the developers of tbb changed something in their release.

I found a possible workaround (but not a fix) to the problem.

Open the CMakeLists.txt file from the cloned salmon repository and edit the following lines:

  • line 552, from
    set(TBB_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/external/tbb-2018_U3)
    to
    set(TBB_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/external/oneTBB-2018_U3)
  • line 566, from
    ${SHASUM} 23793c8645480148e9559df96b386b780f92194c80120acce79fcdaae0d81f45 tbb-2018_U3.tgz &&
    to
    ${SHASUM} e5f19d747f6adabfc7daf2cc0a1ddcfab0f26bc083d70ea0a63def4a9f3919c5 tbb-2018_U3.tgz &&
  • line 574, from
    SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/external/tbb-2018_U3
    to
    SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/external/oneTBB-2018_U3

This is a patch you can apply: CMakeLists.txt.patch.txt.
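
Since the second edit replaces the checksum that the build compares against the downloaded tbb-2018_U3.tgz, it can help to compute the archive's SHA-256 yourself before touching CMakeLists.txt. A minimal Python sketch (the helper names sha256_of and matches_expected are made up for illustration and are not part of salmon or galig):

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with the hash written in CMakeLists.txt."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling matches_expected on the downloaded archive with the new hash from the list above should return True if the changed release is really the cause; if it returns False for both the old and the new hash, the archive has probably changed again.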

Let me know if it worked.

Best,
Luca

from galig.

amitfenn avatar amitfenn commented on July 29, 2024

Okay, so I got the salmon patch in, but is there a way to stop asgal from downloading and installing salmon again? I still get the same error message.

ldenti avatar ldenti commented on July 29, 2024

So you applied the patch, ran make prerequisites again, and it failed with the same message? That's quite strange: I just tried it and it worked.

This is what I did (inside the docker container):

# install dependencies with apt and pip3
git clone --recursive https://github.com/AlgoLab/galig.git
cd galig/salmon
wget https://github.com/AlgoLab/galig/files/4437983/CMakeLists.txt.patch.txt
git apply CMakeLists.txt.patch.txt
cd ..
make salmon # you can compile just salmon running this
# Then you have to compile lemon and sdsl
# make lemon sdsl

It shouldn't be necessary, but if you had already cloned the repository in your container and tried to compile salmon, you can try removing the folders salmon/external and salmon/build.
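
That cleanup could be scripted as follows. This is only a sketch: remove_stale_build_dirs is a made-up helper, and it assumes it is given the path of the salmon folder inside the cloned galig repository.

```python
import shutil
from pathlib import Path


def remove_stale_build_dirs(salmon_dir, names=("external", "build")):
    """Delete leftover build folders under salmon_dir; return the names removed."""
    removed = []
    for name in names:
        target = Path(salmon_dir) / name
        # Only touch directories that actually exist, so a fresh clone is a no-op
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

For example, remove_stale_build_dirs("salmon") run from the galig folder removes salmon/external and salmon/build if they are present and leaves everything else alone.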

ldenti avatar ldenti commented on July 29, 2024

Anyway, since the problem you are encountering affects the compilation of asgal on any system (not only in a docker container), I removed salmon as a submodule and modified the Makefile to download the pre-compiled salmon binary directly.

You can find the changes in the binsalmon branch. Can you try it out and let me know if it works?

After you have installed all the dependencies, you just have to run:

git clone --recursive https://github.com/AlgoLab/galig.git
cd galig  # git checkout must run inside the cloned repository
git checkout binsalmon
make prerequisites
make

amitfenn avatar amitfenn commented on July 29, 2024

Thank you Luca,

I had earlier downloaded the entire salmon release v0.12.0, rather than just using the patch you provided: https://github.com/AlgoLab/galig/files/4437983/CMakeLists.txt.patch.txt. Your commands were handy too.

I had traced the CMakeLists.txt that you shared with me and I thought I should have gotten that version of Salmon.

Either way, I am also super grateful for the way you fix errors: the patch, the binary, and the changes to your repo. Thank you for your quick response, for fixing this bug for others as well, and for your patience.

amitfenn avatar amitfenn commented on July 29, 2024

Dear Luca,

I thought this issue was closed, but apparently not. I was trying out ASGAL with the --multi option, and I think the error still has to do with ASGAL's interface with Salmon. Perhaps you'd be a better judge of what exactly is going on here.

(base) root@c08985285cee:/docker_main# asgal --multi -g ./Homo_sapiens.GRCh38.dna.primary_assembly.fa -a ./splicing_variants.gtf -s ./test_1.fastq -s2 ./test_2.fastq -t ./splicing_variants_transcripts.fa -o ./asgalresults
[ Apr 30, 2020 - 9:16:15PM ] Opening input annotation...
[ Apr 30, 2020 - 9:16:15PM ] Indexing...
[ Apr 30, 2020 - 9:18:01PM ] Splitting input annotation...
[##################################################] 37648/37648
[ Apr 30, 2020 - 9:19:08PM ] Done.
[ Apr 30, 2020 - 9:19:08PM ] Splitting input reference...
[ Apr 30, 2020 - 9:20:12PM ] Done.
[ Apr 30, 2020 - 9:20:12PM ] Running Salmon indexing...
Traceback (most recent call last):
  File "/opt/galig/asgal", line 536, in <module>
    main()
  File "/opt/galig/asgal", line 530, in main
    runSalmon(args)
  File "/opt/galig/asgal", line 169, in runSalmon
    stdout=open(salmonIndexLog, 'w'), stderr=open(salmonIndexLog, 'w'))
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
PermissionError: [Errno 13] Permission denied: 'salmon'
(base) root@c08985285cee:/docker_main#

Note on docker: Docker setups often have permission issues, so these files are now copies inside the docker container, and the container is running as root. So I don't expect the permission error to really be coming from docker.
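
Errno 13 is EACCES: the operating system refused to execute whatever it resolved 'salmon' to. That typically happens when the entry reached through PATH is a directory or a file without the execute bit, which is a separate question from which user the container runs as. Whether that is the cause here is an assumption; the sketch below (with a made-up helper, find_candidates) only lists what each PATH directory offers under that name:

```python
import os


def find_candidates(name, path_value=None):
    """Return (path, is_regular_file, is_executable_file) for each PATH hit."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    results = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        # lexists() also reports directories and dangling symlinks
        if not os.path.lexists(candidate):
            continue
        is_file = os.path.isfile(candidate)
        is_exec = is_file and os.access(candidate, os.X_OK)
        results.append((candidate, is_file, is_exec))
    return results
```

Running find_candidates("salmon") inside the container and looking for entries where the last value is False would show whether a non-executable 'salmon' sits in one of the PATH directories.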

Any help would be much appreciated,
Thanks,
Amit

ldenti avatar ldenti commented on July 29, 2024

Hi Amit,
did asgal work on the example data?

Can you please send me the salmon log you find in the logs subfolder (inside the output folder)?

Luca

amitfenn avatar amitfenn commented on July 29, 2024

❯ l
total 5,5G
drwxr-sr-x 6 afenn asdockers 4,0K Mai 11 09:05 .
drwxrws--- 14 tim asdockers 4,0K Mai 11 08:47 ..
drwxr-sr-x 2 afenn asdockers 1,5M Mai 11 09:02 annos
-rw-rw-r-- 1 afenn asdockers 1,2G Apr 30 22:50 Homo_sapiens.GRCh38.98.chr.gtf
-rw-rw-r-- 1 afenn asdockers 3,0G Apr 30 22:51 Homo_sapiens.GRCh38.dna.primary_assembly.fa
-rw-rw-r-- 1 afenn asdockers 354M Apr 30 22:51 Homo_sapiens.GRCh38.transcriptome.fa
drwxr-sr-x 2 afenn asdockers 4,0K Mai 11 09:04 logs
drwxr-sr-x 2 afenn asdockers 12K Mai 11 09:04 refs
drwxr-sr-x 4 afenn asdockers 4,0K Mai 11 09:04 salmon
-rw-r--r-- 1 afenn asdockers 146M Apr 30 22:51 splicing_variants.gtf
-rw-r--r-- 1 afenn asdockers 512M Apr 30 22:57 splicing_variants.gtf.db
-rw-rw-r-- 1 afenn asdockers 340M Apr 30 22:51 splicing_variants_transcripts.fa

❯ ls logs salmon -R
logs:
salmon_index.log

salmon:
salmon_index salmon_out

salmon/salmon_index:

salmon/salmon_out:

❯ cat logs/salmon_index.log

Sorry Luca, I double checked this. The logs appear to be empty. Is there a verbose or a debug mode I could try for perhaps more information?

I'd say asgal does not work for my example dataset. It does work without the --multi option.

ldenti avatar ldenti commented on July 29, 2024

Can you please run salmon from terminal and let me know what is its output? This is the command run by the asgal script:

{galig_repo}/salmon/bin/salmon index -p 2 -t [splicing_variants_transcripts.fa] -i [salmon_index]
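
The traceback earlier in this thread shows the asgal script passing two separate open(salmonIndexLog, 'w') handles as stdout and stderr. Two independent handles on the same file both start writing at offset 0 and can overwrite each other, so when reproducing the command by hand it may be more robust to use one handle and redirect stderr into stdout. A sketch under that assumption (run_logged is a hypothetical helper, not asgal's code):

```python
import subprocess


def run_logged(command, log_path):
    """Run command, writing stdout and stderr to one log file; return exit code."""
    with open(log_path, "w") as log:
        # stderr=subprocess.STDOUT makes both streams share the same file handle
        completed = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return completed.returncode
```

For instance, run_logged(["salmon", "index", "-p", "2", "-t", "splicing_variants_transcripts.fa", "-i", "salmon_index"], "salmon_index.log") would capture everything salmon prints in a single file. If the executable itself cannot be started, subprocess.run still raises (as in the PermissionError above) and the log stays empty.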

amitfenn avatar amitfenn commented on July 29, 2024

Hi Luca,

I'm not sure what's going wrong anymore.
I think the first error disappeared once I updated the PATH variable to include {galig_repo}/salmon/bin/.

Here's the output of salmon since then:

salmon index -p 2 -t ./splicing_variants_transcripts.fa -i /myvol1/asgal_results/salmon/salmon_index/
Version Info: Could not resolve upgrade information in the alotted time.
Check for upgrades manually at https://combine-lab.github.io/salmon
[2020-05-12 15:22:26.894] [jLog] [info] building index
[2020-05-12 15:22:26.943] [jointLog] [info] [Step 1 of 4] : counting k-mers
[2020-05-12 15:22:28.521] [jointLog] [warning] Entry with header [ENSG00000281344_template] was longer than 200000 nucleotides.  Are you certain that we are indexing a transcriptome and not a genome?
[2020-05-12 15:22:29.909] [jointLog] [warning] Entry with header [ENSG00000249815_ir] was longer than 200000 nucleotides.  Are you certain that we are indexing a transcriptome and not a genome?
[2020-05-12 15:22:31.721] [jointLog] [warning] Entry with header [ENSG00000282431_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2020-05-12 15:22:35.241] [jointLog] [warning] Entry with header [ENSG00000196376_ir] was longer than 200000 nucleotides.  Are you certain that we are indexing a transcriptome and not a genome?
[2020-05-12 15:22:35.257] [jointLog] [warning] Entry with header [ENSG00000286540_ir] was longer than 200000 nucleotides.  Are you certain that we are indexing a transcriptome and not a genome?
[2020-05-12 15:22:35.631] [jointLog] [warning] Entry with header [ENSG00000237838_ir] was longer than 200000 nucleotides.  Are you certain that we are indexing a transcriptome and not a genome?
[2020-05-12 15:22:41.118] [jointLog] [warning] Entry with header [ENSG00000270961_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2020-05-12 15:22:41.124] [jointLog] [warning] Entry with header [ENSG00000270451_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2020-05-12 15:22:42.275] [jointLog] [warning] Entry with header [ENSG00000184226_ir] was longer than 200000 nucleotides.  Are you certain that we are indexing a transcriptome and not a genome?
[2020-05-12 15:22:42.566] [jointLog] [warning] Entry with header [ENSG00000258394_ir] was longer than 200000 nucleotides.  Are you certain that we are indexing a transcriptome and not a genome?
[2020-05-12 15:22:42.972] [jointLog] [warning] Entry with header [ENSG00000211909_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2020-05-12 15:22:42.972] [jointLog] [warning] Entry with header [ENSG00000227196_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2020-05-12 15:22:42.972] [jointLog] [warning] Entry with header [ENSG00000211915_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2020-05-12 15:22:42.972] [jointLog] [warning] Entry with header [ENSG00000227800_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2020-05-12 15:22:42.972] [jointLog] [warning] Entry with header [ENSG00000211920_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2020-05-12 15:22:42.972] [jointLog] [warning] Entry with header [ENSG00000211921_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2020-05-12 15:22:42.972] [jointLog] [warning] Entry with header [ENSG00000232543_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2020-05-12 15:22:42.972] [jointLog] [warning] Entry with header [ENSG00000237197_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2020-05-12 15:22:42.972] [jointLog] [warning] Entry with header [ENSG00000233655_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2020-05-12 15:22:42.975] [jointLog] [warning] Entry with header [ENSG00000254045_template], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
Elapsed time: 16.632s

[2020-05-12 15:22:43.575] [jointLog] [warning] Removed 2120 transcripts that were sequence duplicates of indexed transcripts.
[2020-05-12 15:22:43.575] [jointLog] [warning] If you wish to retain duplicate transcripts, please use the `--keepDuplicates` flag
[2020-05-12 15:22:43.632] [jointLog] [info] Replaced 3 non-ATCG nucleotides
[2020-05-12 15:22:43.632] [jointLog] [info] Clipped poly-A tails from 619 transcripts
[2020-05-12 15:22:43.703] [jointLog] [info] Building rank-select dictionary and saving to disk
[2020-05-12 15:22:43.727] [jointLog] [info] done
Elapsed time: 0.023286s
[2020-05-12 15:22:44.074] [jointLog] [info] Writing sequence data to file . . . 
[2020-05-12 15:22:44.251] [jointLog] [info] done
Elapsed time: 0.177109s
[2020-05-12 15:22:46.453] [jointLog] [info] Building 32-bit suffix array (length of generalized text is 344391509)
[2020-05-12 15:22:47.500] [jointLog] [info] Building suffix array . . . 
success
saving to disk . . . done
Elapsed time: 0.740055s
done
Elapsed time: 42.1645s
processed 344000000 positions[2020-05-12 15:26:08.570] [jointLog] [info] khash had 142280847 keys
[2020-05-12 15:26:09.233] [jointLog] [info] saving hash to disk . . .
[2020-05-12 15:26:19.516] [jointLog] [info] done
Elapsed time: 10.2826s
[2020-05-12 15:26:33.946] [jLog] [info] done building index

And here are the output files from running asgal, which works okay-ish, I think:

I have no name!@1a2dc4d0e695:/myvol1$ ls -lhtr ./asgal_results/
total - -G (edited to remove older files)
drwxr-sr-x 4 1491850500 1491900551 4.0K May 12 15:12 salmon
drwxr-sr-x 2 1491850500 1491900551 4.0K May 12 15:12 samples
drwxr-sr-x 3 1491850500 1491900551 4.0K May 12 15:12 logs
drwxr-sr-x 2 1491850500 1491900551 1.5M May 12 15:12 annos
drwxr-sr-x 2 1491850500 1491900551 4.0K May 12 15:12 ASGAL
I have no name!@1a2dc4d0e695:/myvol1$ ls -lhtr ./asgal_results/ASGAL/
total 708K
-rw-r--r-- 1 1491850500 1491900551 280K May 12 15:12 ENSG00000223972.mem
-rw-r--r-- 1 1491850500 1491900551 259K May 12 15:12 ENSG00000237491.mem
-rw-r--r-- 1 1491850500 1491900551 134K May 12 15:12 ENSG00000279928.mem
-rw-r--r-- 1 1491850500 1491900551  468 May 12 15:12 ENSG00000236397.mem
-rw-r--r-- 1 1491850500 1491900551  25K May 12 15:12 ENSG00000233614.mem
I have no name!@1a2dc4d0e695:/myvol1$ ls -lhtr ./asgal_results/logs/
total 24K
-rw-r--r-- 1 1491850500 1491900551  16K May 12 15:11 salmon_index.log
-rw-r--r-- 1 1491850500 1491900551 3.2K May 12 15:12 salmon_quant.log
-rw-r--r-- 1 1491850500 1491900551    0 May 12 15:12 samtools.log
drwxr-sr-x 2 1491850500 1491900551 4.0K May 12 15:12 ASGAL
I have no name!@1a2dc4d0e695:/myvol1$ 

However, I don't see any CSV files.

Furthermore, for a single run as well, we don't get any CSV files.

asgal --multi -g Homo_sapiens.GRCh38.dna.primary_assembly.fa -a splicing_variants.gtf -s test_1.fastq -s2 test_2.fastq -t splicing_variants_transcripts.fa -o asgal_results

In asgal_results/logs/ASGAL there was one log file named after a gene. Inside it:

Traceback (most recent call last):
  File "/opt/galig/scripts/detectEvents.py", line 6, in <module>
    from Bio import SeqIO
ModuleNotFoundError: No module named 'Bio'

In the asgal_results/ASGAL folder there was only a mem file, but no events.csv.
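
Because a failed step like this leaves no events.csv and only reports the error in a per-gene log, scanning the log folder for tracebacks can surface the problem faster. A rough sketch, assuming the logs are plain-text files directly inside asgal_results/logs/ASGAL (find_tracebacks is a hypothetical helper):

```python
from pathlib import Path


def find_tracebacks(log_dir):
    """Map each log file containing a Python traceback to its last non-empty line."""
    failures = {}
    for log_file in sorted(Path(log_dir).iterdir()):
        if not log_file.is_file():
            continue
        text = log_file.read_text(errors="replace")
        if "Traceback (most recent call last):" not in text:
            continue
        # The final non-empty line of a traceback names the exception
        lines = [line for line in text.splitlines() if line.strip()]
        failures[log_file.name] = lines[-1]
    return failures
```

Calling find_tracebacks("asgal_results/logs/ASGAL") returns a dictionary such as one log name mapped to the ModuleNotFoundError line shown above, and an empty dictionary when no log contains a traceback.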


Do you think we're using ASGAL wrong?

ldenti avatar ldenti commented on July 29, 2024

It seems that you don't have biopython installed. But from your first message, it seems that you installed it... Can you import the Bio module from the python3 shell?
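
One quick way to check this from the same python3 that runs asgal (the package is installed as biopython but imported as Bio) is to ask the import system whether it can locate the module. A small sketch; module_available is an illustrative name:

```python
import importlib.util


def module_available(name):
    """Return True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(name) is not None
```

If module_available("Bio") returns False in the interpreter that asgal uses, Biopython is missing from that environment even if it is installed for another Python on the same machine.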

amitfenn avatar amitfenn commented on July 29, 2024

I guess I should have taken a closer look at that error message. Sorry to have bothered you again with this, Luca, and thanks for double-checking on Biopython. I thought I had it, but I was mistaken.

I've updated my Dockerfile and it seems to work for me now. I've also attached a Dockerfile that pulls directly from Docker Hub, and it should work just fine. Feel free to share it with anyone who might need this tool in a docker container.

Dockerfile-asgal.txt

Thank you for all your support, Luca

gdv avatar gdv commented on July 29, 2024

Dear Amit,
thanks for the Dockerfile you provided.
I would like to know if you have tried our Dockerfile or if you have written it from scratch.

amitfenn avatar amitfenn commented on July 29, 2024

#FACEPALM.... I wrote it from scratch, because I didn't notice anything about it in the README. I should have paid more attention to your repo. I think your Dockerfile is more elegantly made. It would have saved me a lot of time.

Thanks for pointing it out.

gdv avatar gdv commented on July 29, 2024

This means that we have to put some notice in the README.

Have a nice day
