I am writing some CWL wrappers for bioinformatics tools, and I’ve run into a situation that I don’t know how to represent in CWL. Some tools accept input files in one of several formats. One common example is a tool that accepts a file of aligned reads in either BAM, SAM, or CRAM format. Is there a way to represent this faithfully in CWL?
I assume you have seen the file formats lesson in the user guide. Your input parameter definition can take an array of accepted file formats in the format field. If you want indexes in secondaryFiles, you will need to list the name pattern for each index type and mark each one as not required.
Something like:
cwlVersion: v1.1
inputs:
  reads:
    type: File
    format: [SAM, BAM, CRAM]  # not real symbols for formats, use edam ontology
    secondaryFiles:
      - pattern: .bai
        required: false
      - pattern: .crai
        required: false
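For reference, here is a fuller sketch of the same idea with the placeholder symbols swapped for EDAM identifiers. The tool (`samtools flagstat`), the output name `stats`, and the EDAM release listed under `$schemas` are assumptions chosen for illustration, so adjust them for the tool you are wrapping and double-check the format IDs against the EDAM ontology.

```yaml
cwlVersion: v1.1
class: CommandLineTool
# Illustrative choice of tool; any command that reads SAM/BAM/CRAM would do.
baseCommand: [samtools, flagstat]

$namespaces:
  edam: http://edamontology.org/
$schemas:
  # Assumed EDAM release; lets the runner reason about format relationships.
  - http://edamontology.org/EDAM_1.18.owl

inputs:
  reads:
    type: File
    # Any one of these formats is accepted for this input.
    format:
      - edam:format_2573  # SAM
      - edam:format_2572  # BAM
      - edam:format_3462  # CRAM
    # Indexes are optional, so a file without an index still validates.
    secondaryFiles:
      - pattern: .bai
        required: false
      - pattern: .crai
        required: false
    inputBinding:
      position: 1

outputs:
  stats:
    type: stdout
```

Note that a pattern such as `.bai` is appended to the full file name (`reads.bam.bai`). If your indexes are named `reads.bai` instead, the pattern `^.bai` strips one extension before appending.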
If you are creating CWL descriptions for bioinformatics tools, be sure to check out https://github.com/common-workflow-library/bio-cwl-tools to save yourself some time. Contributions of new descriptions are also very welcome!
Thanks! This is what I was looking for.