Comments (17)
Thanks for the report @dmalzl, will add a fix immediately. I will just remove the flag since most of the time nobody will follow the log output anyway.
from fetchngs.
On the same issue, there seems to be a problem with the configuration of the sra-toolkit, as I get this message after resolving the command-line parameter issue:
This sra toolkit installation has not been configured.
Before continuing, please run: vdb-config --interactive
For more information, see https://www.ncbi.nlm.nih.gov/sra/docs/sra-cloud/
after which it retries and fails
from fetchngs.
I had a look and I don't see the same thing when running prefetch from the container. What version of sra-tools are you running?
Output that I see:
prefetch --help
-p|--progress Show progress
"prefetch" version 2.11.0
I'm happy to remove the hardcoded option, though, since it doesn't really serve a purpose.
On the configuration: There is code in the module that generates a configuration for you. Why is that not working for you? Please open a separate issue on that or join us on slack to discuss.
from fetchngs.
Ah sorry, this is my fault. I had the configuration problem before and therefore used the preinstalled module I have on our cluster, which is v2.9.6.1. I didn't anticipate the version difference, so this was just a problem on my side. However, the problem with the configuration is unchanged. It now runs with my version, but I am really curious why the configuration is not working for me. Would you suggest opening another issue here or going directly to Slack? Which one would be better for discussion?
from fetchngs.
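The version mismatch described above (a cluster module at v2.9.6.1 versus 2.11.0 in the container) can be caught before a run. The following is a minimal sketch, not part of the pipeline: the function names `parse_prefetch_version` and `version_at_least` are hypothetical, the version line format is assumed to match the `"prefetch" version 2.11.0` line quoted earlier in the thread, and the comparison relies on GNU `sort -V`.

```shell
#!/usr/bin/env bash
# Sketch only: compare a prefetch version string against a chosen minimum.
# Function names are hypothetical; they are not part of sra-tools or fetchngs.

# Extract the version number from a line such as: "prefetch" version 2.11.0
# Prints nothing if no line of the input has that shape.
parse_prefetch_version() {
    printf '%s\n' "$1" | sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p'
}

# Succeed if version $1 is greater than or equal to version $2.
# Assumes GNU sort, whose -V option orders version strings.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

One possible use is `version_at_least "$(parse_prefetch_version "$(prefetch --help)")" 2.11.0`. The threshold 2.11.0 is only the container version mentioned in the thread, not a documented minimum.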
I think Slack will be easier, if you don't mind. There's a fetchngs channel.
from fetchngs.
Sorry for being annoying, but I joined and can't find the fetchngs channel. I only have pipelines. Is this the one you were referring to?
from fetchngs.
No worries, found it.
from fetchngs.
Last issue connected to this, I promise, because I fixed it this way. After fixing my ncbi config by adding the two missing lines:
/LIBS/GUID = "bd94a196-a984-494f-909e-14e1ffb250e4"
/libs/cloud/report_instance_identity = "true"
I retried and it worked. To reset the pipeline to its original state, I reverted all my changes and tried again, just to be sure the issue didn't only seem fixed because I had worked my voodoo on the code. Unfortunately, that was the case. The reason is that the stupid prefetch does not just write the file to the cwd but rather to some other directory specified in the ncbi config. Simply adding -o ./$id to the command fixed this. So the complete change to the module should be:
retry_with_backoff.sh prefetch \\
    $args \\
    $id \\
    -o ./$id
This will also prevent others from running into the same issues I experienced today.
from fetchngs.
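The two configuration keys named in the comment above can be checked for before a run. This is a minimal sketch under stated assumptions: `missing_ncbi_keys` is a hypothetical helper, and it only tests whether each key string occurs anywhere in the file, not whether its value is valid.

```shell
#!/usr/bin/env bash
# Sketch only: report which of the two keys from the thread are absent
# from an NCBI configuration file. The function name is hypothetical.

missing_ncbi_keys() {
    local config_file="$1"
    local key
    for key in '/LIBS/GUID' '/libs/cloud/report_instance_identity'; do
        # -F matches the key as a fixed string; -q suppresses output so only
        # the exit status is used. A missing or unreadable file counts as
        # "key absent".
        if ! grep -qF -- "$key" "$config_file" 2>/dev/null; then
            printf '%s\n' "$key"
        fi
    done
}
```

It could be called as `missing_ncbi_keys "$HOME/.ncbi/user-settings.mkfg"`. That path is an assumption about where the configuration lives; the thread does not state it. Empty output means both keys were found.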
Although I don't know if prepending $id with ./ is necessary, since this is something I had to do with v2.9.6.1, but v2.11 does not complain about it.
from fetchngs.
Since this seems specific to your case, i.e., the content of your NCBI configuration, I suggest you make use of the args with a local configuration:
process {
    withName: SRATOOLS_PREFETCH {
        ext.args = { "-o ./$id" }
    }
}
from fetchngs.
Ah, that's clever. Thanks for the suggestion.
from fetchngs.
I have identified yet another possible problem. Using -o ./$id forces the pipeline, at least in my case, to save the data into the cwd of the current process. However, the file is then named $id, which is subsequently passed on to fasterq-dump. But passing just a plain SRA id to fasterq-dump results in fasterq-dump downloading the SRA file again instead of processing the already existing one. I tried suffixing the additional arg with .sra, but then the pipeline tells me it could not find the expected output, since the output it expects from prefetch is a file named with just the SRA id.
I fixed this by changing the output to path("${id}.sra"), which seems to work fine until the process completes, at which point it starts to throw an error, but I haven't had the time to look into this yet. In any case, I think this should be changed to incorporate the -o argument to force the output to have the .sra suffix, in order to avoid futile downloads and data accumulation.
from fetchngs.
The default behavior is that prefetch creates a directory which contains the SRA file, so $id/$id.sra if you will. This directory is then supplied to fasterq-dump, which will look for such a directory. I hadn't used the -o flag before and didn't look at it specifically.
I suggest you use either -O . / --output-directory . to force output in the current directory or -o $id/$id.sra / --output-file $id/$id.sra. Then everything should work as expected.
from fetchngs.
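The $id/$id.sra layout described in the previous comment can be verified after prefetch finishes, before handing the directory to fasterq-dump. A minimal sketch, assuming only that layout: the helper names `expected_sra_path` and `has_prefetched_sra` are hypothetical and not part of sra-tools or the pipeline.

```shell
#!/usr/bin/env bash
# Sketch only: check for the $id/$id.sra layout described in the thread.
# Function names are hypothetical.

# Print the relative path at which the thread says prefetch places the
# file by default.
expected_sra_path() {
    printf '%s/%s.sra\n' "$1" "$1"
}

# Succeed if that file exists below the given base directory
# (defaults to the current directory when no second argument is given).
has_prefetched_sra() {
    local id="$1"
    local base_dir="${2:-.}"
    [ -f "$base_dir/$(expected_sra_path "$id")" ]
}
```

A check such as `has_prefetched_sra "$id" || echo "no $id/$id.sra in cwd"` would have flagged the earlier situation in which -o ./$id produced a plain file named $id instead of the directory.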
Okay. Again something I didn't know. I now just deleted my ncbi config file to let the pipeline take care of it. Hope that settles it now. Thanks
from fetchngs.
I'm working on some improvements for the pipeline based on your troubles but it will still take a while until they are released.
from fetchngs.
No worries. Since I was not really dependent on the configurations in my file, simply deleting it seems to have done the trick for now. But I guess at least some future users will profit from my experiences. Thanks for taking care of it.
from fetchngs.
Looks like this has been resolved. Will close for now but feel free to re-open if the problem persists.
from fetchngs.