bowtie2 takes as an argument a string that represents a partial path: it specifies neither a folder nor a file, but a prefix that bowtie2 then uses to locate the rest of the files. Given something like `genomes/prefix`, it would expect to find the files `genomes/prefix.1.bt2`, `genomes/prefix.2.bt2`, etc. in that folder.
How can you use such an input, but still allow CWL to pick up these files? I see two approaches in the bio-cwl-tools repo. One is to specify as input a file in the folder, and then use `secondaryFiles` to pick up the rest.
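A minimal sketch of this pattern, not the repo's actual definition, might look like the following. The input names `index_file` and `reads` are made up for illustration, and I'm assuming the primary file is `<prefix>.1.bt2` and that the index is a small one made of the usual six `.bt2` files.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: bowtie2

requirements:
  # Needed because the valueFrom expression below calls a JavaScript method.
  InlineJavascriptRequirement: {}

inputs:
  index_file:            # hypothetical name; points at <prefix>.1.bt2
    type: File
    secondaryFiles:
      # "^" strips one extension from the primary file name, so "^^" turns
      # prefix.1.bt2 into prefix before the new suffix is appended.
      - ^^.2.bt2
      - ^^.3.bt2
      - ^^.4.bt2
      - ^^.rev.1.bt2
      - ^^.rev.2.bt2
    inputBinding:
      prefix: -x
      # Recover the prefix bowtie2 expects by dropping the ".1.bt2" suffix.
      valueFrom: '$(self.path.replace(/\.1\.bt2$/, ""))'
  reads:                 # hypothetical name; unpaired reads
    type: File
    inputBinding:
      prefix: -U

outputs:
  alignment:
    type: stdout         # bowtie2 writes SAM to stdout when -S is not given

stdout: aligned.sam
```

Note that the caller still has to turn `genomes/prefix` into `genomes/prefix.1.bt2` before invoking the tool, and the tool then turns it back into the prefix.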
A second approach is to just pass the directory as the input to CWL, using a `Directory` type. This approach is used by an alternative tool description in the same repo.
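Again as a rough sketch rather than the repo's actual definition, the `Directory` variant could look something like this. The input names `index_dir`, `index_basename`, and `reads` are hypothetical, and I'm assuming the basename is passed as a separate string input.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: bowtie2

inputs:
  index_dir:             # hypothetical name; the folder holding the *.bt2 files
    type: Directory
  index_basename:        # hypothetical name; e.g. "prefix"
    type: string
  reads:                 # hypothetical name; unpaired reads
    type: File
    inputBinding:
      prefix: -U

arguments:
  # Rebuild "<staged directory>/<basename>", which is what bowtie2 -x expects.
  - prefix: -x
    valueFrom: $(inputs.index_dir.path)/$(inputs.index_basename)

outputs:
  alignment:
    type: stdout         # bowtie2 writes SAM to stdout when -S is not given

stdout: aligned.sam
```

Here the single prefix `genomes/prefix` has to be split by the caller into a directory (`genomes`) and a basename (`prefix`).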
The problem with both of these approaches is this: in neither case does the CWL tool accept exactly what bowtie2 wants. They both require you to take the prefix and modify it to point either to a file, or to a directory. While this works, it would be much more satisfying to have the CWL interface reflect the actual tool interface.
Of course we could specify an input parameter as a string and then just pass the prefix – but then CWL doesn’t pick these up as files. So, is there some way to have one input parameter be derived from a value specified in another input parameter? Or what’s the way to accomplish this?
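To make the string-only variant concrete, here is roughly what I mean. This is a sketch, and the input names `index_prefix` and `reads` are just placeholders.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: bowtie2

inputs:
  index_prefix:          # hypothetical name; e.g. "genomes/prefix"
    type: string         # a plain string: CWL does not interpret it as a path
    inputBinding:
      prefix: -x
  reads:                 # hypothetical name; unpaired reads
    type: File
    inputBinding:
      prefix: -U

outputs:
  alignment:
    type: stdout         # bowtie2 writes SAM to stdout when -S is not given

stdout: aligned.sam
```

The interface now matches bowtie2 exactly, but because `index_prefix` is only a string, the runner has no `File` or `Directory` objects to stage or mount. As far as I can tell this only works when the index files already exist at that exact path in the environment where the command runs, for example an absolute path on a local run without a container.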
I’m in a scenario where I’m dealing with a tool (refgenie), which is built to work with bowtie2 directly, so it provides the prefix. I want to be able to take the output provided by refgenie and use it directly as input to a CWL tool definition for bowtie2. The arguments expected by the tool should be identical to those expected by the CWL definition. How can I do this?
Side note: one additional problem with the first approach is that it specifies a FASTA file in that folder, which is not actually required by bowtie2.