Can you convert an input string into a File or Directory input?

bowtie2 takes as an argument a string that represents a partial path. It specifies neither a folder nor a file, but a prefix that bowtie2 then uses to locate the rest of the files. Given something like genomes/prefix, it would expect to find in that folder the files genomes/prefix.1.bt2, genomes/prefix.2.bt2, etc.

How can you use such an input, but still allow CWL to add these files? I see 2 approaches in the bio-cwl-tools repo. One is to specify as input a file in the folder, and then use secondaryFiles to pick up the rest:
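The repo file itself isn't reproduced here. As a rough sketch of the secondaryFiles pattern only, assuming a FASTA file sits next to the index and using a made-up input name (reference), it might look like this. The command-line binding and the reads input are left out.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: bowtie2

inputs:
  # Hypothetical input name. The primary file is a FASTA stored next to
  # the index, e.g. genomes/prefix.fa
  reference:
    type: File
    # A leading "^" removes one extension from the primary file name before
    # the suffix is appended, so prefix.fa resolves to prefix.1.bt2, etc.
    secondaryFiles:
      - ^.1.bt2
      - ^.2.bt2
      - ^.3.bt2
      - ^.4.bt2
      - ^.rev.1.bt2
      - ^.rev.2.bt2

# Outputs omitted in this sketch.
outputs: []
```

With this shape the job file has to name genomes/prefix.fa rather than genomes/prefix.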

A second approach is to just pass the directory as the input to CWL, using a Directory type. This approach is used by another alternative tool description in the same repo:
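Again not the repo's actual description, just a sketch of the Directory pattern under my own assumptions: the input names index_dir and index_basename are hypothetical, and the reads input is omitted.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: bowtie2

inputs:
  # Hypothetical names. The directory holds prefix.1.bt2, prefix.2.bt2, ...
  index_dir:
    type: Directory
  # The last path component of the prefix has to be supplied separately.
  index_basename:
    type: string

arguments:
  # Rebuild the prefix bowtie2 expects from the staged directory path.
  - prefix: -x
    valueFrom: $(inputs.index_dir.path)/$(inputs.index_basename)

# Outputs omitted in this sketch.
outputs: []
```

Here genomes/prefix has to be split into genomes (the Directory) and prefix (the string) before it reaches CWL.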

The problem with both of these approaches is this: in neither of these is the CWL tool actually accepting the exact same thing that bowtie2 wants. They both require you to take the prefix and modify it to point either to a file, or to a directory. While this works, it would be much more satisfying to have the CWL interface reflect the actual tool interface.

Of course we could specify an input parameter as a string and then just pass the prefix – but then CWL doesn’t pick these up as files. So, is there some way to have one input parameter be derived from a value specified in another input parameter? Or what’s the way to accomplish this?
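For comparison, the plain string version would be something like the sketch below (index_prefix is a name I made up). The interface matches bowtie2 exactly, but the runner only sees a string, so nothing gets staged.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: bowtie2

inputs:
  # Hypothetical name. The value is passed through verbatim as "-x <string>".
  # CWL does not treat it as a path, so the .bt2 files are not staged or
  # mounted into a container.
  index_prefix:
    type: string
    inputBinding:
      prefix: -x

# Outputs omitted in this sketch.
outputs: []
```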

I’m in a scenario where I’m dealing with a tool (refgenie) that is built to work with bowtie2 directly, so it provides the prefix. I want to be able to take the output provided directly by refgenie and just use it as input to a CWL tool definition for bowtie2. The arguments expected by the tool should be identical to those expected by the CWL definition. How can I do this?

Side note: one additional problem with the first approach is that it specifies a fasta file in that folder, which is not actually required by bowtie2.

I came up with another way, where you use the input prefix as input to an expression in an InitialWorkDirRequirement:

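#!/usr/bin/env cwl-runner

class: CommandLineTool
cwlVersion: v1.0

requirements:
  InlineJavascriptRequirement: {}
  InitialWorkDirRequirement:
    listing: |
      ${
        // Start with the file named by the string input itself.
        var files = [{ "class": "File", "location": "file:///" + inputs.path_str }];
        // Add one File object per companion suffix.
        ['.fai', '.1.bt2'].forEach(function (el) {
          files.push({ "class": "File", "location": "file:///" + inputs.path_str + el });
        });
        return files;
      }

inputs:
  path_str:
    type: string

# The output name below is a placeholder.
outputs:
  listing:
    type: stdout

stdout: listing.txt

baseCommand: ls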

(I used a subset of the required suffixes in this example.) Note that the input must be an absolute path, i.e.

path_str: /tmp/cwl/input_dir/file.fasta

If the corresponding files are missing you get a fairly straightforward File Not Found error.

I must admit, though, that to me the first example (the one with the secondaryFiles) feels like the correct way to do things, as it is explicit from the CWL side and does not require any JavaScript.

Hmm, that’s an interesting approach, but here you’re actually passing the full file path as the parameter, not just a prefix. I want the input to be a prefix only.

Also, when I tried this, CWL did not translate the input string into the temp file path in the work dir. I presume that’s because it doesn’t know the value is actually a file path that should be converted, since it’s encoded as a string.

What I need to do is essentially add a new input parameter (a File or Directory type) that is automatically derived from a provided input parameter. It’s sort of the opposite of valueFrom. With valueFrom, you give me a File or Directory input and I can derive something else from it to pass to the tool.
In my case I need to go the other way: I pass a prefix as a string, and I want to pre-process it within CWL to create a File type from that prefix.
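To make the valueFrom direction concrete, here is a sketch that starts from a File and derives the prefix string for bowtie2. It assumes a FASTA named prefix.fa next to the index; the input name reference is hypothetical, and the reads input is omitted.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: bowtie2

inputs:
  # Hypothetical name. The job file points at e.g. genomes/prefix.fa
  reference:
    type: File
    secondaryFiles:
      - ^.1.bt2
      - ^.2.bt2
      - ^.3.bt2
      - ^.4.bt2
      - ^.rev.1.bt2
      - ^.rev.2.bt2
    inputBinding:
      prefix: -x
      # File -> string: staged directory plus the file name without its
      # last extension, which gives <staged dir>/prefix
      valueFrom: $(self.dirname)/$(self.nameroot)

# Outputs omitted in this sketch.
outputs: []
```

This only goes from File to string. What I’m after is the reverse, and I don’t see an equivalent field for that on an input parameter.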

I think perhaps this is impossible. It seems like a limitation of CWL.

Here is what Galaxy does

They also require explicit setting of the reference