How to iterate over input file array from arguments -> valueFrom

Perhaps this is an unconventional way to do things, but below is a (simplified) version of my CWL file.

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
requirements:
  ShellCommandRequirement: {}
inputs:
  main_vcf_file:
    type: File
  subtract_vcf_files:
    type:
      type: array
      items: File
outputs:
  example_out:
    type: stdout
stdout: output.txt
arguments:
    - shellQuote: false
      valueFrom: >
        echo $(inputs.subtract_vcf_files)

        for f in $(inputs.subtract_vcf_files);
        do
          echo \${f};
        done

My job file is this:

main_vcf_file:
  class: File
  path: abc.vcf
subtract_vcf_files:
- {class: File, path: VCF/qrs.vcf}
- {class: File, path: VCF/tuv.vcf}

I want to be able to iterate through inputs.subtract_vcf_files and invoke a command on each file path. But it’s giving me a full JSON representation of this variable rather than just the paths.

Is there a way to do this?

Thanks for your time!

Hello!

The following will do what you want (explanation after the code):

cwlVersion: v1.0
class: CommandLineTool
requirements:
  ShellCommandRequirement: {}       # run the command line through a shell
  InlineJavascriptRequirement: {}   # required for the ${...} JavaScript expression
inputs:
  files:
    type:
      type: array
      items: File
outputs:
  example_out:
    type: stdout
stdout: output.txt
arguments:
    - shellQuote: false
      valueFrom: >
        ${
          var cmd = "";
          for( var i = 0; i < inputs.files.length; i++) {
             cmd += "\n echo " + inputs.files[i].path;
          }
          return cmd;
        }

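The code you tried mixed bash and CWL parameter references. $(inputs.subtract_vcf_files) is substituted by the CWL runner before the shell ever runs, and an array of File objects is serialized as JSON, which is why you saw the full JSON representation rather than just the paths. For what you want to do, it is better to use a JavaScript expression to build the whole script. As you can see, in the JS expression we use the .path attribute of each files[i] object. Here I’ve just used echo, which you would replace with the command you need.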
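To see what the expression hands to the shell, the same loop can be run outside CWL. This is only a sketch in plain JavaScript: the `inputs` object is a hand-written stand-in for what a CWL runner would supply, the paths are made up, and `buildCommand` is a hypothetical helper name rather than anything CWL defines.

```javascript
// Stand-in for the `inputs` object a CWL runner provides to expressions.
// The paths are illustrative; a real runner supplies its own staged paths.
var inputs = {
  files: [
    { class: "File", path: "/data/VCF/qrs.vcf" },
    { class: "File", path: "/data/VCF/tuv.vcf" }
  ]
};

// Same loop as in the valueFrom expression: one "echo <path>" per file,
// each prefixed by a newline so the shell sees separate commands.
function buildCommand(files) {
  var cmd = "";
  for (var i = 0; i < files.length; i++) {
    cmd += "\n echo " + files[i].path;
  }
  return cmd;
}

var cmd = buildCommand(inputs.files);
// JSON.stringify makes the embedded newlines visible as \n.
console.log(JSON.stringify(cmd));
```

Note that the path is concatenated unquoted, so a path containing spaces would be split by the shell; wrapping it in escaped double quotes is one way to guard against that.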

Here is another way of doing it, this time with less JS and more bash, which you may prefer:

cwlVersion: v1.0
class: CommandLineTool
requirements:
  ShellCommandRequirement: {}
  InlineJavascriptRequirement: {}
inputs:
  files:
    type:
      type: array
      items: File
outputs:
  example_out:
    type: stdout
stdout: output.txt
arguments:
    - shellQuote: false
      valueFrom: >
        file_array=(${
          var cmd = "";
          for( var i = 0; i < inputs.files.length; i++) {
             cmd += "\"" + inputs.files[i].path + "\" ";
          }
          return cmd;
        });

        for ((i=0; i < \${#file_array[@]}; i++))
        do
         echo \${file_array[\$i]};
        done

Here I’ve used an initial JS expression to build the contents of a bash array, and the rest is pure bash.
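As a rough illustration of what that first expression produces, here is the same string-building loop as standalone JavaScript. The `inputs` object and its paths are invented for the example (one path deliberately contains a space), and `bashArrayBody` is a hypothetical name.

```javascript
// Hand-written stand-in for the CWL `inputs` object; paths are illustrative.
var inputs = {
  files: [
    { path: "/data/VCF/qrs.vcf" },
    { path: "/data/my samples/tuv.vcf" }
  ]
};

// Mirrors the expression inside file_array=( ... ): each path is wrapped
// in double quotes and followed by a space.
function bashArrayBody(files) {
  var body = "";
  for (var i = 0; i < files.length; i++) {
    body += "\"" + files[i].path + "\" ";
  }
  return body;
}

// The text the shell receives for the array assignment.
var script = "file_array=(" + bashArrayBody(inputs.files) + ");";
console.log(script);
```

The double quotes keep a path with spaces together as a single array element. They do not protect against paths that themselves contain double quotes, `$`, or backticks, and the array element in the loop body would also need quoting to stay intact when it is used.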

Thanks for both of these solutions!

For reasons that are too complicated to explain in this post, I decided that I wanted it to work without using JavaScript. Below is my solution for that.

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
requirements:
  ShellCommandRequirement: {}
  InitialWorkDirRequirement:
    listing:
    - entryname: parse_paths.py
      entry: |-
        import json, sys

        # argv[1] is the JSON-serialized array of File objects; print one path per line
        for d in json.loads(sys.argv[1]):
            print(d["path"])
inputs:
  main_vcf_file:
    type: File
  subtract_vcf_files:
    type:
      type: array
      items: File
outputs:
  example_out:
    type: stdout
stdout: output.txt
arguments:
    - shellQuote: false
      valueFrom: >
        for f in `python3 parse_paths.py '$(inputs.subtract_vcf_files)'`;
        do
          echo \${f};
        done

The line cmd += "\n echo " + inputs.files[i].path; won’t add any whitespace, will it? I would like to get the nameroot of each input file to use, for instance, as a sample name, and then run a scatter with dotproduct.
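On the whitespace point, the literal "\n echo " does contribute a newline, one space before echo, and one space after it; the path itself is appended unchanged. For the sample names, CWL File objects carry a nameroot field (the basename without its final extension), so it can be read the same way as .path. A minimal sketch in plain JavaScript, assuming a hand-written `inputs` object in place of the one a runner fills in, with `sampleNames` and `buildCommand` as hypothetical helper names:

```javascript
// Stand-in for the CWL `inputs` object. A real runner fills in basename,
// nameroot and nameext itself; here they are written out by hand.
var inputs = {
  files: [
    { path: "/data/VCF/qrs.vcf", basename: "qrs.vcf", nameroot: "qrs", nameext: ".vcf" },
    { path: "/data/VCF/tuv.vcf", basename: "tuv.vcf", nameroot: "tuv", nameext: ".vcf" }
  ]
};

// Collect one sample name per file, in the same order as the files.
function sampleNames(files) {
  var names = [];
  for (var i = 0; i < files.length; i++) {
    names.push(files[i].nameroot);
  }
  return names;
}

// Same pattern as the echo loop, but printing "<nameroot> <path>" per file.
function buildCommand(files) {
  var cmd = "";
  for (var i = 0; i < files.length; i++) {
    cmd += "\n echo " + files[i].nameroot + " " + files[i].path;
  }
  return cmd;
}

var names = sampleNames(inputs.files);
var cmd = buildCommand(inputs.files);
console.log(names.join(","));
console.log(JSON.stringify(cmd));
```

Because the names array has the same length and order as the file array, the two could plausibly be scattered together with dotproduct in a workflow step, though that wiring is not shown here.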