
Comments (6)

yfarjoun commented on August 16, 2024

I suspect that the interval-list isn't quite in the right format.
It would help resolve this issue if the head of the interval-file (header
and a few of the first intervals) were reproduced in this issue.

However, as a side-note, I feel the need to comment that BED intervals are
not Picard intervals:

BED is 0-based and half-open (the end coordinate is excluded), while
Picard is 1-based and closed (both ends are included), thus

in BED:
1	100	200

is equivalent to
1	101	200
in Picard.
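
The conversion described above can be sketched in a few lines of Python. This is only an illustration of the coordinate arithmetic; the function name `bed_to_picard` is hypothetical and is not part of Picard or htsjdk.

```python
from typing import Tuple


def bed_to_picard(chrom: str, bed_start: int, bed_end: int) -> Tuple[str, int, int]:
    """Convert a 0-based, half-open BED interval to a 1-based, closed interval.

    Hypothetical helper for illustration; not a Picard or htsjdk function.
    """
    if bed_start < 0 or bed_end <= bed_start:
        raise ValueError(f"invalid BED interval: {chrom} {bed_start} {bed_end}")
    # Only the start shifts by one; the BED end (exclusive) equals the
    # closed end (inclusive) numerically.
    return chrom, bed_start + 1, bed_end


chrom, start, end = bed_to_picard("1", 100, 200)
print(chrom, start, end)
```

The interval covers the same 100 bases in both systems: 200 - 100 in BED and 200 - 101 + 1 in the closed form.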

Cheers,
Yossi.

On Mon, Dec 22, 2014 at 2:16 PM, Stephane Plaisance <
[email protected]> wrote:

Please help with this error!
My lists start with the matching .dict content followed by BED rows from
SeqCapEZ_Exome_v3.

java -jar /opt/biotools/picard/picard.jar CalculateHsMetrics
BI=/opt/biodata/SeqCapEZ_Exome_v3/bait.list
TI=/opt/biodata/SeqCapEZ_Exome_v3/target.list
I=Patient1_results/Patient1_gatk.bam
R=/opt/biodata/reference/human/GRCh37.73.fa O=test
[Mon Dec 22 22:11:52 CET 2014] picard.analysis.directed.CalculateHsMetrics
BAIT_INTERVALS=[/opt/biodata/SeqCapEZ_Exome_v3/bait.list]
TARGET_INTERVALS=[/opt/biodata/SeqCapEZ_Exome_v3/target.list]
INPUT=Patient1_results/Patient1_gatk.bam OUTPUT=test
REFERENCE_SEQUENCE=/opt/biodata/reference/human/GRCh37.73.fa
METRIC_ACCUMULATION_LEVEL=[ALL_READS] VERBOSITY=INFO QUIET=false
VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000
CREATE_INDEX=false CREATE_MD5_FILE=false
[Mon Dec 22 22:11:52 CET 2014] Executing as splaisan@r710bits on Linux
3.10.0-123.13.2.el7.x86_64 amd64; OpenJDK 64-Bit Server VM
1.7.0_71-mockbuild_2014_10_03_09_36-b00; Picard version:
1.126(4691ee611ac205d4afe2a1b7a2ea975a6f997426_1417447214) IntelDeflater
[Mon Dec 22 22:11:53 CET 2014] picard.analysis.directed.CalculateHsMetrics
done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=1515716608
To get help, see
http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMException: Invalid interval record contains 1 fields: track name=target_region description="Target Regions"
at htsjdk.samtools.util.IntervalList.fromReader(IntervalList.java:367)
at htsjdk.samtools.util.IntervalList.fromFile(IntervalList.java:293)
at htsjdk.samtools.util.IntervalList.fromFiles(IntervalList.java:322)
at picard.analysis.directed.CollectTargetedMetrics.doWork(CollectTargetedMetrics.java:87)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:187)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:89)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:99)



from picard.

splaisan commented on August 16, 2024

Hi Yossi,

You spotted it:

track name=target_region description="Target Regions"

had been left behind after merging the .dict file and the BED data.
I will try again after deleting this line and let you know.

Thanks already,

Stephane Plaisance
[email protected]


splaisan commented on August 16, 2024

Dear Yossi,

Thanks a lot for finding MY mistake.

I have now generated corrected lists, using the GATK .dict file as the header and appending the BED data, modified to add 1 to the start coordinate and formatted as five columns.
==> It now works perfectly.
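
For readers who want to reproduce this fix, here is a minimal sketch of one way to assemble such a list, assuming the interval-list layout is the `@`-prefixed sequence-dictionary header followed by five tab-separated columns (sequence, start, end, strand, name). The function name `build_interval_list`, the default `+` strand, and the generated `interval_N` names are assumptions made for illustration, and the header and BED rows in the demo are made up.

```python
from typing import Iterable, List


def build_interval_list(dict_lines: Iterable[str], bed_lines: Iterable[str]) -> List[str]:
    """Combine a .dict header with BED rows into interval-list style lines.

    Hypothetical helper: skips BED 'track'/'browser'/comment lines, shifts the
    start by +1, and emits five columns (sequence, start, end, strand, name).
    """
    # Keep only the SAM-style header lines (@HD, @SQ, ...) from the .dict file.
    out = [line.rstrip("\r\n") for line in dict_lines if line.startswith("@")]
    for n, raw in enumerate(bed_lines, start=1):
        line = raw.rstrip("\r\n")
        # A leftover 'track' line is what triggered the error in this thread.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"BED line {n} has fewer than 3 columns: {line!r}")
        chrom, start, end = fields[0], int(fields[1]) + 1, int(fields[2])
        # Assumption: use the BED name column if present, else a generated name.
        name = fields[3] if len(fields) > 3 and fields[3] else f"interval_{n}"
        # Assumption: default to '+' when the BED row carries no strand column.
        strand = fields[5] if len(fields) > 5 and fields[5] in ("+", "-") else "+"
        out.append("\t".join([chrom, str(start), str(end), strand, name]))
    return out


# Illustrative input only; a real header comes from the reference .dict file.
header = ["@HD\tVN:1.4\tSO:coordinate", "@SQ\tSN:1\tLN:249250621"]
bed = ['track name=target_region description="Target Regions"', "1\t100\t200\texon_1"]
for row in build_interval_list(header, bed):
    print(row)
```

Dropping the `track` line in code rather than by hand avoids the same error recurring when the vendor BED file is regenerated.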

The only thing missing is a guideline for interpreting the many returned metrics.
Is there any documentation on some of the non-obvious lines below?

Cheers

BAIT_SET SeqCap_EZ_Exome_v3-bait
GENOME_SIZE 3101804739
BAIT_TERRITORY 63822601
TARGET_TERRITORY 63322733
BAIT_DESIGN_EFFICIENCY 0.992168
TOTAL_READS 91215692
PF_READS 91215692
PF_UNIQUE_READS 88626067
PCT_PF_READS 1
PCT_PF_UQ_READS 0.97161
PF_UQ_READS_ALIGNED 76507144
PCT_PF_UQ_READS_ALIGNED 0.863258
PF_UQ_BASES_ALIGNED 7541771454
ON_BAIT_BASES 4370601991
NEAR_BAIT_BASES 2272083894
OFF_BAIT_BASES 899085569
ON_TARGET_BASES 4226639166
PCT_SELECTED_BASES 0.880786
PCT_OFF_BAIT 0.119214
ON_BAIT_VS_SELECTED 0.657957
MEAN_BAIT_COVERAGE 68.480474
MEAN_TARGET_COVERAGE 67.78099
PCT_USABLE_BASES_ON_BAIT 0.485669
PCT_USABLE_BASES_ON_TARGET 0.469672
FOLD_ENRICHMENT 28.164876
ZERO_CVG_TARGETS_PCT 0.016662
FOLD_80_BASE_PENALTY 2.118156
PCT_TARGET_BASES_2X 0.973948
PCT_TARGET_BASES_10X 0.941579
PCT_TARGET_BASES_20X 0.89234
PCT_TARGET_BASES_30X 0.812007
PCT_TARGET_BASES_40X 0.706267
PCT_TARGET_BASES_50X 0.591199
PCT_TARGET_BASES_100X 0.168662
HS_LIBRARY_SIZE 679044448
HS_PENALTY_10X 3.813658
HS_PENALTY_20X 3.849584
HS_PENALTY_30X 3.883513
HS_PENALTY_40X 3.921435
HS_PENALTY_50X 3.958558
HS_PENALTY_100X 4.169321
AT_DROPOUT 0.07418
GC_DROPOUT 13.513451
SAMPLE
LIBRARY
READ_GROUP
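
Several of the ratio metrics above can be recomputed from the raw counts in the same table, which helps when reading the output. The sketch below uses formulas as I understand them from the Picard HsMetrics definitions; treat them as assumptions and check them against the official metric documentation. The function name `derived` is hypothetical.

```python
# Values copied from the metrics listed above.
reported = {
    "GENOME_SIZE": 3101804739,
    "BAIT_TERRITORY": 63822601,
    "TARGET_TERRITORY": 63322733,
    "TOTAL_READS": 91215692,
    "PF_UNIQUE_READS": 88626067,
    "PF_UQ_READS_ALIGNED": 76507144,
    "PF_UQ_BASES_ALIGNED": 7541771454,
    "ON_BAIT_BASES": 4370601991,
    "NEAR_BAIT_BASES": 2272083894,
    "OFF_BAIT_BASES": 899085569,
    "BAIT_DESIGN_EFFICIENCY": 0.992168,
    "PCT_PF_UQ_READS": 0.97161,
    "PCT_PF_UQ_READS_ALIGNED": 0.863258,
    "PCT_SELECTED_BASES": 0.880786,
    "PCT_OFF_BAIT": 0.119214,
    "ON_BAIT_VS_SELECTED": 0.657957,
    "MEAN_BAIT_COVERAGE": 68.480474,
    "FOLD_ENRICHMENT": 28.164876,
}


def derived(m: dict) -> dict:
    """Recompute ratio metrics from raw counts (formulas are assumptions)."""
    selected = m["ON_BAIT_BASES"] + m["NEAR_BAIT_BASES"]
    aligned = m["PF_UQ_BASES_ALIGNED"]
    return {
        "BAIT_DESIGN_EFFICIENCY": m["TARGET_TERRITORY"] / m["BAIT_TERRITORY"],
        "PCT_PF_UQ_READS": m["PF_UNIQUE_READS"] / m["TOTAL_READS"],
        "PCT_PF_UQ_READS_ALIGNED": m["PF_UQ_READS_ALIGNED"] / m["PF_UNIQUE_READS"],
        "PCT_SELECTED_BASES": selected / aligned,
        "PCT_OFF_BAIT": m["OFF_BAIT_BASES"] / aligned,
        "ON_BAIT_VS_SELECTED": m["ON_BAIT_BASES"] / selected,
        "MEAN_BAIT_COVERAGE": m["ON_BAIT_BASES"] / m["BAIT_TERRITORY"],
        # Fraction of aligned bases on bait, relative to the bait share of the genome.
        "FOLD_ENRICHMENT": (m["ON_BAIT_BASES"] / aligned)
        / (m["BAIT_TERRITORY"] / m["GENOME_SIZE"]),
    }


for name, value in derived(reported).items():
    print(f"{name}: computed {value:.6f}, reported {reported[name]}")
```

In this run the on-bait, near-bait, and off-bait base counts sum exactly to PF_UQ_BASES_ALIGNED, so PCT_SELECTED_BASES and PCT_OFF_BAIT add up to 1.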

Stephane Plaisance
[email protected]


yfarjoun commented on August 16, 2024

https://broadinstitute.github.io/picard/picard-metric-definitions.html#HsMetrics

(first hit in google....)

Yossi.


Rahel14350 commented on August 16, 2024

@splaisan @yfarjoun
I tried to build the bait and target interval lists (once using the headers from the .bam file and once using the headers from mm10.dict) like this:
m=0
for i in $(cat mm10_list.interval_list) ;
do echo "$i"; echo "MPMPMP$m"; m=$(($m + 1)) ;
done > mm10_list.interval_list.tmp2
perl -0777 -i -pe "s/\nMPMPMP/\ttarget_/g" mm10_list.interval_list.tmp2
perl -0777 -i -pe "s/\r\t/\t/g" mm10_list.interval_list.tmp2
perl -0777 -i -pe "s/:/\t/g" mm10_list.interval_list.tmp2
perl -0777 -i -pe "s/-/\t/g" mm10_list.interval_list.tmp2
cat header.txt mm10_list.interval_list.tmp2 > mm10_list.interval_list.picard

but I still get an error when running CalculateHsMetrics. Could you please let me know whether the way I am building the bait and target files is correct? Also, I am using the same file for BAIT_INTERVALS and TARGET_INTERVALS; is that correct as well?
Many thanks in advance,
Rahel


yfarjoun commented on August 16, 2024

Please ask this question on the forum here: https://software.broadinstitute.org/gatk
