How to iterate over input file array from arguments -> valueFrom

Perhaps this is an unconventional way to do things, but below is a (simplified) version of my CWL file.

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
requirements:
  ShellCommandRequirement: {}
inputs:
  main_vcf_file:
    type: File
  subtract_vcf_files:
    type:
      type: array
      items: File
outputs:
  example_out:
    type: stdout
stdout: output.txt
arguments:
    - shellQuote: false
      valueFrom: >
        echo $(inputs.subtract_vcf_files)

        for f in $(inputs.subtract_vcf_files);
        do
          echo \${f};
        done

My job file is this:

main_vcf_file:
  class: File
  path: abc.vcf
subtract_vcf_files:
- {class: File, path: VCF/qrs.vcf}
- {class: File, path: VCF/tuv.vcf}

I want to be able to iterate through inputs.subtract_vcf_files and invoke a command on each file path. But it’s giving me a full JSON representation of this variable rather than just the paths.

Is there a way to do this?

Thanks for your time!

Hello!

The following will do what you want (explanation after the code):

cwlVersion: v1.0
class: CommandLineTool
requirements:
  ShellCommandRequirement: {}
  InlineJavascriptRequirement: {}
inputs:
  files:
    type:
      type: array
      items: File
outputs:
  example_out:
    type: stdout
stdout: output.txt
arguments:
    - shellQuote: false
      valueFrom: >
        ${
          var cmd = "";
          for( var i = 0; i < inputs.files.length; i++) {
             cmd += "\n echo " + inputs.files[i].path;
          }
          return cmd;
        }

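The code you tried mixed bash with CWL parameter references. $(inputs.subtract_vcf_files) is expanded by the CWL runner before the shell ever sees the command, and because the value is an array of File objects rather than a string, it is inserted as JSON text. For what you want to do, it is better to build the whole script with a JavaScript expression, which is why InlineJavascriptRequirement is added. In the JS expression we use the .path attribute of each inputs.files[i] object. Here I’ve just used echo, which you would replace with the command you need.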
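
To see what the expression hands to the shell, the same loop can be run outside CWL. This is only a sketch: the inputs object below is a hypothetical stand-in for what the runner provides, and the paths are made up.

```javascript
// Hypothetical stand-in for the `inputs` object supplied by the CWL runner.
// The paths are invented for illustration only.
const inputs = {
  files: [
    { class: "File", path: "/data/VCF/qrs.vcf" },
    { class: "File", path: "/data/VCF/tuv.vcf" },
  ],
};

// Same logic as the body of the ${...} expression in the tool above.
function buildCommand(inputs) {
  var cmd = "";
  for (var i = 0; i < inputs.files.length; i++) {
    // Each file contributes one "echo <path>" line to the script.
    cmd += "\n echo " + inputs.files[i].path;
  }
  return cmd;
}

const script = buildCommand(inputs);
console.log(script);
```

The result is a single string with one echo command per line, which is what the shell ends up executing because shellQuote is false.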

Here is another way of doing it, this time with less JS and more bash, which you may prefer:

cwlVersion: v1.0
class: CommandLineTool
requirements:
  ShellCommandRequirement: {}
  InlineJavascriptRequirement: {}
inputs:
  files:
    type:
      type: array
      items: File
outputs:
  example_out:
    type: stdout
stdout: output.txt
arguments:
    - shellQuote: false
      valueFrom: >
        file_array=(${
          var cmd = "";
          for( var i = 0; i < inputs.files.length; i++) {
             cmd += "\"" + inputs.files[i].path + "\" ";
          }
          return cmd;
        });

        for ((i=0; i < \${#file_array[@]}; i++))
        do
         echo \${file_array[\$i]};
        done

Here I’ve used an initial JS expression to create a bash array, and the rest is pure bash.
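
As a rough illustration of why each path is wrapped in double quotes, here is the JS part on its own. The inputs object is a hypothetical stand-in for the runner's, and the second path deliberately contains a space.

```javascript
// Hypothetical stand-in for the runner-provided `inputs` object.
// The second path contains a space on purpose.
const inputs = {
  files: [
    { class: "File", path: "/data/VCF/qrs.vcf" },
    { class: "File", path: "/data/VCF/my sample.vcf" },
  ],
};

// Same logic as the ${...} expression that fills file_array=( ... ).
function buildArrayBody(inputs) {
  var cmd = "";
  for (var i = 0; i < inputs.files.length; i++) {
    // Double quotes keep a path with spaces together as one array element.
    cmd += "\"" + inputs.files[i].path + "\" ";
  }
  return cmd;
}

const arrayLine = "file_array=(" + buildArrayBody(inputs) + ");";
console.log(arrayLine);
```

With the quotes, bash sees two array elements here rather than three. Paths that themselves contain double quotes or $ would still need extra escaping, which this sketch does not attempt.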

Thanks for both of these solutions!

For reasons that are too complicated to explain in this post, I decided that I wanted it to work without using JavaScript. Below is my solution for that.

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
requirements:
  ShellCommandRequirement: {}
  InitialWorkDirRequirement:
    listing:
    - entryname: parse_paths.py
      entry: |-
        import json, sys

        for d in json.loads(sys.argv[1]):
            print(d["path"])
inputs:
  main_vcf_file:
    type: File
  subtract_vcf_files:
    type:
      type: array
      items: File
outputs:
  example_out:
    type: stdout
stdout: output.txt
arguments:
    - shellQuote: false
      valueFrom: >
        for f in `python3 parse_paths.py '$(inputs.subtract_vcf_files)'`;
        do
          echo \${f};
        done

This line, cmd += "\n echo " + inputs.files[i].path;, won’t add any whitespace, will it? I would like to get the nameroot of each input file, to use as a sample name for instance, and then run a scatter with dotproduct.
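
One possible way to look at both points is the sketch below. It assumes the runner fills in the nameroot field that the CWL v1.0 File type describes (basename minus its extension). The inputs object, including its nameroot values, is a hypothetical stand-in, and the scatter step itself is not shown.

```javascript
// Hypothetical stand-in for the runner-provided `inputs` object.
// The nameroot/basename/nameext values are written by hand here; in CWL
// they are assumed to be filled in by the runner.
const inputs = {
  files: [
    { class: "File", path: "/data/VCF/qrs.vcf", basename: "qrs.vcf", nameroot: "qrs", nameext: ".vcf" },
    { class: "File", path: "/data/VCF/tuv.vcf", basename: "tuv.vcf", nameroot: "tuv", nameext: ".vcf" },
  ],
};

// Variant of the loop from the first answer that also emits the nameroot.
function buildNamedCommand(inputs) {
  var cmd = "";
  for (var i = 0; i < inputs.files.length; i++) {
    var f = inputs.files[i];
    // "\n echo " contributes a newline and a leading space before echo;
    // nothing is appended after the path.
    cmd += "\n echo " + f.nameroot + " " + f.path;
  }
  return cmd;
}

const namedScript = buildNamedCommand(inputs);
// JSON.stringify makes the whitespace in the generated string visible.
console.log(JSON.stringify(namedScript));
```

The only whitespace the original line adds is the newline and single space in front of each echo, plus the space that is part of "echo ". The path itself is appended unchanged.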