Comments (12)
@arostamianfar This seems like an issue for you.
from deepvariant.
The actual error is "The TF examples in /mnt/data/input/gs/wgs-test-shan/test_samples/UDN689484temp/examples/examples_output.tfrecord-00000-of-00064.gz has image/format 'None' (expected 'raw') which means you might need to rerun make_examples to genenerate the examples again."
@pichuan @depristo this is odd since the pipeline ran as a single workflow. The model and docker binary paths also seem correct. One issue I can think of is most of the shards being empty (the output has 64 shards, but it's only 1.3KB in total). Do you know if empty shards could cause such an error?
P.S. the 'gsutil not found' error is actually harmless. I think we should provide a 'parser' for these errors based on the logs that provides a meaningful error message.
from deepvariant.
Yep, it's caused by empty shards. I was able to reproduce this by using 64 shards with the quickstart test data. @depristo, should I file a separate issue for this, since it's not really a Docker issue?
@chenshan03: thanks for the report. As a workaround until this bug is fixed, you may reduce the number of shards to avoid having empty ones.
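To see which shards are empty before choosing a smaller shard count, a minimal sketch along these lines may help. It assumes the shards are gzip-compressed TFRecord files (as the `.gz` path in the error suggests), so a shard with zero records decompresses to zero bytes. The function names and the glob pattern are hypothetical and not part of DeepVariant. As a rough sanity check, an empty gzip stream is about 20 bytes, so 64 shards totalling 1.3KB is consistent with every shard being empty.

```python
import gzip
from pathlib import Path


def is_empty_shard(path):
    """Return True if a gzip-compressed shard decompresses to zero bytes."""
    with gzip.open(path, "rb") as f:
        return f.read(1) == b""


def find_empty_shards(directory, pattern="*.tfrecord-*-of-*.gz"):
    """Return the sorted file names of all empty shards in a directory.

    The default pattern is an assumption based on the file name in the
    error message; adjust it to match your own output naming.
    """
    return sorted(
        p.name for p in Path(directory).glob(pattern) if is_empty_shard(p)
    )
```

Calling `find_empty_shards("examples/")` on a local copy of the examples directory returns the names of the shards that contain no records.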
from deepvariant.
@pichuan @scott7z I believe the empty shards bug has been fixed, is that correct?
from deepvariant.
Hi Mark and Asha,
here's what I believe the current status is:
(1) If just one shard out of many is empty (a shard file that exists but contains 0 records), the code moves on to the next shard and attempts to read image/format from it. This is what Mark meant by the previously fixed empty shards bug.
(2) However, if all the shard files exist but all of them contain 0 records, the current code can fail with the error message above.
In this case, if the actual error message observed is:
The TF examples in /mnt/data/input/gs/wgs-test-shan/test_samples/UDN689484temp/examples/examples_output.tfrecord-00000-of-00064.gz has image/format 'None' (expected 'raw')
It seems like this call_variants run is being done specifically on that one file. And if that file has 0 records, unfortunately it will currently fail with that error. :-(
So, I think this is a real bug that we should fix, because we do expect the use case where users run 64 separate call_variants processes, and some of them might get a single input file that is completely empty. Is that correct?
from deepvariant.
yes, I think this is a real bug that still exists.
Due to the distributed nature of the cloud process, some machines may get shards that are all empty. Also, we actually only supply one of the shards to each process, so (1) doesn't really apply (there is no 'next shard').
You can reproduce this by adding "--shards 64" to the quickstart test data configuration in https://cloud.google.com/genomics/deepvariant.
from deepvariant.
My view is that if all shards are empty we should just write an empty CVO file. If that's not what happens right now, let's add a bug to buganizer and fix it.
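As an illustration of what "an empty CVO file" could look like on disk, here is a minimal sketch. It assumes the output is a gzip-compressed record file, in which case zero records means a valid gzip stream with a zero-byte payload. The helper name is hypothetical; this is not DeepVariant's implementation.

```python
import gzip


def write_empty_output(path):
    """Write a valid gzip file whose decompressed payload is zero bytes."""
    # Opening and closing the file without writing still emits the gzip
    # header and trailer, so downstream readers see a well-formed,
    # empty stream rather than a missing or truncated file.
    with gzip.open(path, "wb"):
        pass
```

The resulting file is not zero bytes on disk, because the gzip header and trailer are still written, but reading it back yields no data.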
from deepvariant.
I filed a bug in buganizer.
from deepvariant.
This has been fixed by the DeepVariant 0.5.1 release that just came out a few minutes ago. Thank you for raising attention to this issue.
from deepvariant.
Hi Cory (@cmclean),
Thank you for the new release, but if we look at the new timings with the 0.5.1
release, they seem to have gotten longer than with the previous version:
Timings: Whole Genome Case Study - 0.5 (pink) vs. 0.5.1 (green) [figure not shown]
Timings: Exome Case Study - 0.5 (pink) vs. 0.5.1 (green) [figure not shown]
What is the cause of the additional delay in version 0.5.1
as compared to the previous one?
Thanks,
Paul
from deepvariant.
Hi Paul,
Two quick suggestions. First, I'd recommend posting this question in a separate issue, to keep the discussion clean since this is a very interesting and general observation.
Second, it's unclear to us whether this is just normal variation in cloud timing, since not all machines you create are identical. For example, the case study command:
gcloud beta compute instances create "${USER}-deepvariant-casestudy" --scopes "compute-rw,storage-full,cloud-platform" --image-family "ubuntu-1604-lts" --image-project "ubuntu-os-cloud" --machine-type "custom-64-131072" --boot-disk-size "300" --boot-disk-type "pd-ssd" --zone "us-west1-b"
doesn't specify the exact CPU platform, so we're likely getting Skylake processors sometimes and Broadwell processors other times. That alone could account for the variation in timing we are seeing here.
from deepvariant.
Hi all,
it has recently been reported again that the crashing issue on empty shards for call_variants wasn't fully resolved last time. I just released v0.6.1, which should really resolve this issue now:
https://github.com/google/deepvariant/releases/tag/v0.6.1
The issue was that I didn't properly return in the if branch where an empty shard was detected:
12f9e67
(And the unit test I had for it was flawed. We'll fix the unit test in a later release.)
This time I've tested it manually on an empty shard, and confirmed that call_variants works when there are zero records.
Please feel free to report if you see any issues again. Thank you!
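For readers curious about this class of bug, the sketch below shows the general control-flow pattern and a check that would catch a missing return. It is a hypothetical stand-in, not the code from the commit above: `run_on_shard`, `predict`, and `write_empty_output` are illustrative names only.

```python
from itertools import chain

_NO_RECORD = object()  # sentinel distinguishing "no records" from any real value


def run_on_shard(records, predict, write_empty_output):
    """Process one shard and return the number of records handled."""
    it = iter(records)
    first = next(it, _NO_RECORD)
    if first is _NO_RECORD:
        write_empty_output()
        # The early return matters: without it, control would fall
        # through to the prediction step below with nothing to process.
        return 0
    count = 0
    for record in chain([first], it):
        predict(record)
        count += 1
    return count
```

A test for the empty case can assert not only that the empty output was written, but also that `predict` was never called, which is the behavior a missing return would break.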
from deepvariant.