Sir,
I am getting an error while writing a CWL tool for FindBadGenomicKmersSpark. The tool description is as follows.
#!/usr/bin/env cwl-runner
# This tool description was generated automatically by wdl2cwl ver. 0.2
class: CommandLineTool
cwlVersion: v1.0
requirements:
  - class: ShellCommandRequirement
  - class: InlineJavascriptRequirement
  - class: DockerRequirement
    dockerPull: quay.io/biocontainers/gatk4:4.1.6.0--py38_0
  - class: InitialWorkDirRequirement
    listing:
      - $(inputs.ReferenceGenome)
inputs:
  - id: ReferenceGenome
    type: File
  - id: ReferenceGenomeDict
    type: File
  - id: sampleName
    type: string
outputs:
  - id: kmers_to_ignore
    type: File
    outputBinding:
      glob: $(inputs.sampleName).txt
baseCommand: []
arguments:
  - valueFrom: |-
      gatk FindBadGenomicKmersSpark -R $(inputs.ReferenceGenome.path) -O $(inputs.sampleName).txt
    shellQuote: false
It asks for the reference genome dictionary file, even though I have set the tool requirements to give access to the directory containing the reference genome and its dictionary file. It still does not work.
The error is as follows.
[May 28, 2020 10:00:10 AM GMT] org.broadinstitute.hellbender.tools.spark.sv.evidence.FindBadGenomicKmersSpark done. Elapsed time: 0.22 minutes.
Runtime.totalMemory()=230686720
***********************************************************************
A USER ERROR has occurred: Fasta dict file for reference /ySLAXq/NormalizeFasta.fasta does not exist. Please see http://gatkforums.broadinstitute.org/discussion/1601/how-can-i-prepare-a-fasta-file-to-use-as-reference for help creating it.
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
20/05/28 10:00:10 INFO ShutdownHookManager: Shutdown hook called
20/05/28 10:00:10 INFO ShutdownHookManager: Deleting directory /tmp/spark-0ec24ec5-7519-42d9-a6f7-316529086161
Using GATK jar /usr/local/share/gatk4-4.1.6.0-0/gatk-package-4.1.6.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /usr/local/share/gatk4-4.1.6.0-0/gatk-package-4.1.6.0-local.jar FindBadGenomicKmersSpark -R /ySLAXq/NormalizeFasta.fasta -O test.txt
INFO [job FindBadGenomicKmers.cwl] Max memory used: 252MiB
ERROR [job FindBadGenomicKmers.cwl] Job error:
("Error collecting output for parameter 'kmers_to_ignore':\nFindBadGenomicKmers.cwl:44:5: Did not find output file with glob pattern: '['test.txt']'", {})
WARNING [job FindBadGenomicKmers.cwl] completed permanentFail
{}
WARNING Final process status is permanentFail
I think you also need to add the reference sequence dictionary to the InitialWorkDirRequirement listing, so that it is staged next to the reference FASTA.
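A minimal sketch of that change, assuming the rest of the tool description stays as posted: only the `requirements` entry for `InitialWorkDirRequirement` is shown, with `ReferenceGenomeDict` (the input already declared in the question) added to the listing. It also assumes the dictionary file is named after the reference, for example `NormalizeFasta.dict` for `NormalizeFasta.fasta`, since GATK looks it up by name alongside the FASTA.

```yaml
requirements:
  - class: InitialWorkDirRequirement
    listing:
      # Stage the reference FASTA in the working directory (as in the original tool)
      - $(inputs.ReferenceGenome)
      # Also stage the sequence dictionary, so it sits beside the FASTA
      - $(inputs.ReferenceGenomeDict)
```

With both files staged in the same directory, the `-R $(inputs.ReferenceGenome.path)` argument should not need to change. An alternative that was not tried in this thread would be to declare the dictionary under `secondaryFiles` on the `ReferenceGenome` input.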