
Comments (2)

AstrobioMike commented on July 30, 2024

Heya, @Emmashipman,

Just came across this while looking at some other issues here. Are you sure the Slurm compute node can actually reach the location you are trying to access?

You can check that for sure by making a new Slurm script whose only call is a check of the directory you are trying to use. So the call line would be something like this:

call="ls /home/user/RNASEQDATADUMP/HTS_outputs/"

Running that Slurm script and seeing whether it prints the contents of that location or gives you an error will at least tell you if Slurm itself can see the location. (You could probably also just run srun ls /home/user/RNASEQDATADUMP/HTS_outputs/ to check, but you might as well do everything the same way in case the script is making a difference.)
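A minimal sketch of such a check script, assuming it is submitted with sbatch. The job name, the output file name, the check_dir helper, and the TARGET_DIR variable are all made up for illustration, and the path is just the example from above:

```shell
#!/bin/bash
#SBATCH --job-name=check_dir
#SBATCH --output=check_dir_%j.out

# Hypothetical helper: report whether this node can see a directory.
check_dir() {
    if [ -d "$1" ]; then
        echo "visible: $1"
        ls "$1"
    else
        echo "not visible: $1" >&2
        return 1
    fi
}

# TARGET_DIR is an illustrative override; the default is the example path.
check_dir "${TARGET_DIR:-/home/user/RNASEQDATADUMP/HTS_outputs/}" || true
```

If the Slurm output file shows "not visible", the compute node cannot see the directory, whatever the login node shows.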

good luck!


Emmashipman commented on July 30, 2024

Hi, thanks for the reply! I fixed that; it was a small error in my shell script. But I have found a more serious bug: when I run a set of the HTS programs on a pair of PE RNASeq files, I get very strange output and an error message.

No trimmed fastq.gz files are produced, and the JSON file that records the programs run only includes a few of them (e.g., NTrimmer, QWindowTrim, and the final run of hts_Stats) and shows hardly any input or output reads (about 2400, instead of the several million in the file). It looks like this:

"Program_details": {
    "program": "hts_NTrimmer",
    "version": "v1.3.3",
    "options": {
        "append-stats-file": "/home/enshipma/RNASEQDATADUMP/HTS_outputs/N_h>
        "exclude": false,
        "force": false,
        "notes": "remove N chars",
        "uncompressed": false
    }
},
"Fragment": {
    "in": 2319,
    "out": 2319,
    "basepairs_in": 694532,
    "basepairs_out": 689580
},
"Single_end": {
    "in": 18,
    "out": 18,
    "basepairs_in": 4263,
    "basepairs_out": 4234,

As you can see, the number of reads is really low, and the log is missing my entire first five programs; it only starts at NTrimmer.

I also get this error message on some of my runs.

terminate called after throwing an instance of 'std::out_of_range'
what():  basic_string::substr: __pos (which is 10) > this->size() (which is 9)
ERROR: There are not either 3 (SE, itab3), 4 (SE, itab with tags) 5 (PE, itab5), or 6 (PE, itab6), or 8 (PE, itab6 with tags) elements within a tab delimited file line

I found the error message in the github page, saying it had to do with the i/o handler. I assume it's a problem with the tab delimited files that get streamed from program to program.
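If the intermediate stream can be captured to a file (for example with tee, which is an assumption about the setup and not something tried above), a rough way to find the offending lines is to count the tab-delimited fields on each line and report any line whose count is not one of the values the error message lists (3, 4, 5, 6, or 8). The function name find_bad_tab_lines and the file name in the comment are hypothetical:

```shell
#!/bin/bash

# Hypothetical helper: read a tab-delimited stream on stdin and print the
# line number and field count of every line whose field count is not
# 3, 4, 5, 6, or 8 (the counts named in the error message).
find_bad_tab_lines() {
    awk -F'\t' 'NF != 3 && NF != 4 && NF != 5 && NF != 6 && NF != 8 {
        print NR ": " NF " fields"
    }'
}

# Example use on a captured stream (file name is illustrative only):
#   find_bad_tab_lines < captured_stream.tab
```

This only checks the field count, so it says nothing about whether the fields themselves are valid.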

I just ran a test today using only a few of the HTS programs (the SeqScreener) to see if I could isolate the point at which the files fail. I got a JSON file that looked as expected, but no fastq output files.

Here is my code:
hts_Stats -L /home/enshipma/RNASEQDATADUMP/HTS_outputs/B_htsStats.json -N 'initial stats' -1 /home/enshipma/RNASEQDATADUMP/BWTsalt3/BWTsalt3_CKDL220018431-1A_HJ2W5BBXX_L6_1.fq.gz -2 /home/enshipma/RNASEQDATADUMP/BWTsalt3/BWTsalt3_CKDL220018431-1A_HJ2W5BBXX_L6_1.fq.gz |
hts_SeqScreener -A /home/enshipma/RNASEQDATADUMP/HTS_outputs/B_htsStats.json -N 'screen phix' |
hts_SeqScreener -A /home/enshipma/RNASEQDATADUMP/HTS_outputs/B_htsStats.json -N 'count rRNAs' --record --seq /home/enshipma/RNASEQDATADUMP/wheat_rrna.fasta |
hts_Stats -A /home/enshipma/RNASEQDATADUMP/HTS_outputs/B_htsStats.json -N 'final stats' \ -f /home/enshipma/RNASEQDATADUMP/HTS_outputs/B_hts_test1/

The slurmout file appears to contain all 55 million lines, written to stdout, and the error file is blank. I have no output files from this in fastq format. My next step will be to add the program that follows SeqScreener and see if the stream fails there...

I am just wondering whether you have encountered this issue before, and whether you know which step can end up being overzealous in trimming?
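One thing that may be worth ruling out, assuming the command ran exactly as pasted above: in the last hts_Stats call the backslash is followed by a space, not a newline. In bash that escapes the space, so the next word reaches the program as " -f" with a leading space and not as the -f option. Whether this is what actually happened here is only a guess. The show_args helper and the /tmp/out_prefix path below are made up for illustration:

```shell
#!/bin/bash

# Hypothetical helper: print each argument in brackets, one per line,
# so that leading or trailing spaces become visible.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Backslash followed by a space: the space becomes part of the next word.
show_args -N 'final stats' \ -f /tmp/out_prefix
# prints:
# [-N]
# [final stats]
# [ -f]
# [/tmp/out_prefix]
```

With a backslash directly before the newline, or with the whole call on one line and no backslash, the third argument comes out as [-f].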
