Tidiest way to specify heterogeneous listing logic

alexiswl · August 26, 2020, 11:22am

Hello all, first question in this group so please let me know if there’s anything I can improve on.

I have a conundrum that involves mounting an array of files at a specific set of mount points (which can be done through the listing attribute of InitialWorkDirRequirement) AND writing an inline script to the workflow at the same time.

This is the code that I have so far. I’ve abstracted the script and the script name into functions so that the listing attribute is clearer. Ideally the script would be its own entry in the listing, but I don’t know how to specify that AND the array of files.

cwlVersion: v1.1
class: CommandLineTool

requirements:
  DockerRequirement:
    dockerPull: "ubuntu:latest"  # For testing only
  InlineJavascriptRequirement:
    expressionLib:
      - var get_script_path = function(){
          /*
          Abstract script path, can then be referenced in baseCommand attribute too
          Makes things more readable.  FIXME
          */
          return "/scripts/run-script.sh";
        }
      - var get_script_contents = function(){
          /*
          Split dirent out from the listing JS.
          Makes things a little more readable
          */
          return "#!/usr/bin/env bash\n" +
                 "\n" +
                 "# Fail on non-zero exit of subshell\n" +
                 "set -e\n" +
                 "\n" +
                 "# Initialise dragen\n" +
                 "/opt/edico/bin/dragen \\\n" +
                 "  --partial-reconfig DNA-MAPPER \\\n" +
                 "  --ignore-version-check true\n" +
                 "\n" +
                 "# Create directories\n" +
                 "mkdir -p /ephemeral/ref \"" + inputs.out_dir + "\"\n" +
                 "\n" +
                 "# Tar ref data into scratch space\n" +
                 "tar -C /ephemeral/ref -xvf \"" + inputs.ref_data.path + "\"\n" +
                 "\n" +
                 "\# Run dragen command\n" +
                 "/opt/edico/bin/dragen \\\n" +
                 "  --ref-dir /ephemeral/ref \\\n" +
                 "  --fastq-list \"" + inputs.fastq_list.path + "\" \\\n" +
                 "  --output-file-prefix \"" + inputs.out_prefix + "\" \\\n" +
                 "  --output-directory \"" + inputs.out_dir + "\" \\\n" +
                 "  --force \\\n" +
                 "  --lic-instance-id-location /opt/instance-identity \\\n" +
                 "  --enable-duplicate-marking true \\\n" +
                 "  --enable-map-align-output true \\\n" +
                 "  --enable-variant-caller true\n";
        }
  InitialWorkDirRequirement:
    listing: |
        ${
            /*
            Initialise the array of files to mount
            Add in the script path and the script contents
            */

            var e = [{"entryname": get_script_path(),
                      "entry": get_script_contents()}];

            /*
            Check if input_mounts record is defined
            */
            if (inputs.input_mounts === null){
                return e;
            }

            /*
            Alias the two arrays to keep the code below short.
            Use var rather than let, since CWL expressions are ECMAScript 5.1
            */
            var file_mounts_points_array = inputs.input_mounts.file_mount_points;
            var file_objs_array = inputs.input_mounts.file_objs;

            /*
            Check records have the same number of items
            */
            if (file_mounts_points_array.length !== file_objs_array.length){
              /*
              Just return the inline script
              */
              return e;
            }

            /*
            Iterate through each file to mount
            Mount each file at the mount point that has the same index.
            */
            file_objs_array.forEach(function(f_obj, index){
              e.push({
                  'entry': f_obj,
                  'entryname': file_mounts_points_array[index]
              });
            });

            /*
            Return the script entry plus one entry per mounted file
            */
            return e;
        }

inputs:
  fastq_list:
    doc: |
      Path to the fastq csv list file
    type: File
  out_prefix:
    doc: |
      The prefix given to all output files
    type: string
  out_dir:
    doc: |
      The directory where all output files are placed
    type: string
  ref_data:
    doc: |
      Path to ref data tarball
    type: File
  input_mounts:
    type:
      # - "null"
      - type: record
        name: mount_arrays
        fields:
          file_objs:
            type: File[]
          file_mount_points:
            type: string[]

outputs:
  # Will also include mounted-files.txt
  dragen_germline_directory:
    type: Directory
    outputBinding:
      glob: "."

# We add bash since we don't have executable permissions on our bash script
#baseCommand: ["bash"]
baseCommand: ["ls"]  # For testing only

arguments:
  # Run Script
  - valueFrom: "$(get_script_path())"

alexiswl · September 20, 2023, 10:52pm

Answered my own question (albeit a few years later).

The listing attribute of InitialWorkDirRequirement can also be an array.

One element of this array can be a JS expression containing the mounting logic, while another can be an entry holding a multiline bash script.

See https://github.com/umccr/cwl-ica/blob/b576cd5e9e1365d1e341fee3d7af5bd4a625b4e3/tools/dragen-somatic/4.2.4/dragen-somatic__4.2.4.cwl#L48-L219 as an example.
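
For anyone landing here, a minimal sketch of that array layout, adapted from the tool in the question. It is not copied from the linked file: the relative script name run-script.sh, the placeholder echo command, and the output name out are assumptions made for illustration. It also assumes the runner accepts an array of entryname/entry records returned from an expression element, so check this against the engine you use.

```yaml
cwlVersion: v1.1
class: CommandLineTool

requirements:
  InlineJavascriptRequirement: {}
  InitialWorkDirRequirement:
    listing:
      # Element 1: the inline script as its own entry.
      # $(...) inside `entry` is evaluated by CWL before the file is written.
      - entryname: run-script.sh        # hypothetical name
        entry: |
          #!/usr/bin/env bash
          set -e
          mkdir -p "$(inputs.out_dir)"
          # Placeholder for the real command
          echo "done" > "$(inputs.out_dir)/done.txt"
      # Element 2: an expression that builds one entry per file to mount.
      - |
        ${
          var e = [];
          if (inputs.input_mounts === null) {
            return e;
          }
          var files = inputs.input_mounts.file_objs;
          var mount_points = inputs.input_mounts.file_mount_points;
          if (files.length !== mount_points.length) {
            return e;
          }
          for (var i = 0; i < files.length; i++) {
            e.push({"entryname": mount_points[i], "entry": files[i]});
          }
          return e;
        }

inputs:
  out_dir:
    type: string
  input_mounts:
    type:
      - "null"
      - type: record
        name: mount_arrays
        fields:
          file_objs:
            type: File[]
          file_mount_points:
            type: string[]

outputs:
  out:                                  # hypothetical output name
    type: Directory
    outputBinding:
      glob: $(inputs.out_dir)

baseCommand: ["bash", "run-script.sh"]
```

With the script as a plain entry there is no need to build it by string concatenation in expressionLib. The trade-off is that CWL interprets $( and ${ inside entry, so any shell substitutions or shell variables in the script would need escaping.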