Pass writeable output directory to a tool

Hi,

I have a tool with one argument like this:

inputs:
  - id: output_dir
    type: Directory
    inputBinding:
      position: 2
      prefix: '-o'
    label: Output dir.
    doc: Output directory where artifacts will be extracted.

I have tried to follow the instructions in the “User manual” for “Staging input Files”, but I have been unsuccessful so far:

requirements:
  - class: InitialWorkDirRequirement
    listing:
      - entryname: $(inputs.output_dir)
        writable: true

I get this when I execute it:

../playground/CWL/output/local/astrolabe/2019-08-12-15-27-48/app.cwl:1:1:   Object
                                                                            `../playground/CWL/output/local/astrolabe/2019-08-12-15-27-48/app.cwl#astrolabe`
                                                                            is not valid because
                                                                              tried
                                                                              `CommandLineTool` but
../playground/CWL/output/local/astrolabe/2019-08-12-15-27-48/app.cwl:62:5:      the
                                                                                `requirements`
                                                                                field is not valid
                                                                                because
                                                                                  tried array of
                                                                                  <InlineJavascriptRequirement
                                                                                  or SchemaDefRequirement
                                                                                  or DockerRequirement or
                                                                                  SoftwareRequirement or
                                                                                  InitialWorkDirRequirement
                                                                                  or EnvVarRequirement or
                                                                                  ShellCommandRequirement
                                                                                  or ResourceRequirement
                                                                                  or
                                                                                  SubworkflowFeatureRequirement
                                                                                  or
                                                                                  ScatterFeatureRequirement
                                                                                  or
                                                                                  MultipleInputFeatureRequirement
                                                                                  or
                                                                                  StepInputExpressionRequirement>
                                                                                  but
../playground/CWL/output/local/astrolabe/2019-08-12-15-27-48/app.cwl:72:9:          item is
                                                                                    invalid because
../playground/CWL/output/local/astrolabe/2019-08-12-15-27-48/app.cwl:74:13:           the
                                                                                      `listing` field is
                                                                                      not valid because
                                                                                        tried array
                                                                                        of <File or Directory
                                                                                        or Dirent or string or
                                                                                        Expression> but
../playground/CWL/output/local/astrolabe/2019-08-12-15-27-48/app.cwl:75:17:               item
                                                                                          is invalid because
                                                                                            - tried
                                                                                            File but
                                                                                               
                                                                                                Missing 'class' field
                                                                                            - tried
                                                                                            Directory but
                                                                                               
                                                                                                Missing 'class' field
                                                                                            - tried
                                                                                            Dirent but
                                                                                               
                                                                                                missing required field
                                                                                                `entry`

Any hints?

Hello @FJ_Sanchez!

entryname needs to be a string; I think you need entry instead.
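
A minimal sketch of what that change might look like, assuming the directory input is still named output_dir as in your snippet. Here entry carries the Directory object to stage, and entryname is left out so the directory keeps its own name:

```yaml
requirements:
  - class: InitialWorkDirRequirement
    listing:
      # entry takes the Directory object to stage; entryname (a string) is optional
      - entry: $(inputs.output_dir)
        writable: true   # ask for a writable copy in the working directory
```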

Also, do you really need the user to pass in a directory (possibly non-empty) for the outputs to go in?

You can manually set the -o output dir:

arguments:
  - prefix: -o
    valueFrom: $(runtime.outdir)  # the current working directory

That fixes the issue. I also think that passing the working directory directly to the tool might be a good idea; I’ll explore this.

So I got this working by following your suggestion of using $(runtime.outdir) and adding an output to actually keep the data I care about, but this doesn’t completely give me what I want. I need to call the same tool a number of times with different input arguments, and it expects the directory structure generated by the previous steps. So far I haven’t been able to save the outputs to a specific directory. For example, let’s say that I want to use /tmp/output as the output directory: how can I specify this? Also, all the files present there should be in the working directory of the workflow and be writable.

This is what I’m trying right now:

class: CommandLineTool
cwlVersion: v1.0
$namespaces:
  sbg: 'https://www.sevenbridges.com/'
id: astrolabe
baseCommand: []
inputs:
  - id: output_dir
    type: Directory
    inputBinding:
      position: 2
      prefix: '-o'
      valueFrom: $(runtime.outdir)/out
  - id: rosbag
    type: File
    inputBinding:
      position: 4
      prefix: '-b'
  - id: json_file
    type:
      - 'null'
      - File
      - type: array
        items: File
    inputBinding:
      position: 5
      prefix: '-j'
outputs:
  - id: output
    doc: Directory with the extracted data.
    label: Output data
    type: Directory?
    outputBinding:
      glob: $(runtime.outdir)/out
label: astrolabe extract rosbag
arguments:
  - position: 1
    valueFrom: extract
  - position: 3
    valueFrom: rosbag
requirements:
  - class: ResourceRequirement
    coresMin: 1
    ramMax: 12000
  - class: DockerRequirement
    dockerPull: >-
      registry.corp.XXXXX
  - class: InitialWorkDirRequirement
    listing:
      - entry: $(inputs.output_dir)
        writable: true
  - class: InlineJavascriptRequirement
successCodes:
  - 0
permanentFailCodes:
  - 1
  - 2
  - 4
  - 8
  - -1

Using valueFrom like this overrides the path of the input directory. Only use $(runtime.outdir) for a step that doesn’t need an input directory passed in.

There is no working directory for the entire workflow. Only files and directories that are explicitly marked as outputs and connected to other steps are preserved.
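
As a rough sketch of what "connected to other steps" can look like, here is a two-step workflow in which the Directory produced by the first step is fed to the second. The file names extract.cwl and process.cwl, the step names, and the workflow input names are all hypothetical; only the output_dir, rosbag, and output ids are taken from the tool above, and the second tool is assumed to use the same ids.

```yaml
cwlVersion: v1.0
class: Workflow
inputs:
  rosbag: File
  initial_dir: Directory          # hypothetical name for the starting directory
outputs:
  final_dir:
    type: Directory
    outputSource: second_step/output   # only declared outputs are kept
steps:
  first_step:
    run: extract.cwl              # hypothetical file holding the tool above
    in:
      rosbag: rosbag
      output_dir: initial_dir
    out: [output]
  second_step:
    run: process.cwl              # hypothetical second tool
    in:
      output_dir: first_step/output    # directory written by the previous step
    out: [output]
```

The engine stages first_step/output for second_step, so neither tool needs to know a fixed filesystem path.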

OK, I understand. I have changed my tool like this:

class: CommandLineTool
cwlVersion: v1.0
$namespaces:
  sbg: 'https://www.sevenbridges.com/'
id: astrolabe
baseCommand: []
inputs:
  - id: output_dir
    type: Directory
    inputBinding:
      position: 2
      prefix: '-o'
  - id: rosbag
    type: File
    inputBinding:
      position: 4
      prefix: '-b'
    label: Rosbag
    doc: Rosbag to analyse.
  - id: json_file
    type:
      - 'null'
      - File
      - type: array
        items: File
    inputBinding:
      position: 5
      prefix: '-j'
outputs:
  - id: output
    type: Directory?
    outputBinding:
      glob: $(runtime.outdir)
label: astrolabe extract rosbag
arguments:
  - position: 1
    valueFrom: extract
  - position: 3
    valueFrom: rosbag
requirements:
  - class: ResourceRequirement
    coresMin: 1
    ramMax: 12000
  - class: DockerRequirement
    dockerPull: >-
      registry.corp.XXXX
  - class: InitialWorkDirRequirement
    listing:
      - entry: $(inputs.output_dir)
        writable: true
  - class: InlineJavascriptRequirement
successCodes:
  - 0
permanentFailCodes:
  - 1
  - 2
  - 4
  - 8
  - -1

Now, in the output directory of cwltool, I get a directory called exi42dim/cwl_test with the outputs from the step, but what I would like is for these files to end up in the directory given as input in my job.json:

{
    "json_file": {
        "class": "File",
        "path": "/home/francisco/data/tutorial/CT-1117.json"
    },
    "output_dir": {
        "class": "Directory",
        "path": "/tmp/cwl_test"
    },
    "rosbag": {
        "class": "File",
        "path": "/home/francisco/data/tutorial/sample_bag.bag"
    }
}

Is this possible?

You should probably glob on $(inputs.output_dir.basename).

With that change I get nothing in /tmp/cwl_test, nor in the cwltool output directory.

The directory isn’t edited in place. A new directory is made.

So is it impossible to get the outputs from the tool exactly where I want?

It probably shouldn’t be an optional output either.

CWL is designed so that your workflow can be run on multiple systems at once. Therefore we need to know where everything is and cannot hard-code fixed filesystem paths.

If you need the outputs from this step for another step, then connect them together; the engine will handle the intermediate parts for you.

But what about the results? These are the files that I want as the final result. Can’t I define where they should be at the end of the execution? Where can they usually be found after running a workflow or tool?

Each file or directory that you want needs to be named in the outputs section
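
Putting the earlier suggestions together, the outputs section might look something like the sketch below. It assumes the input directory is still staged into the working directory through InitialWorkDirRequirement under its own name, which is why the glob uses the basename rather than a fixed path:

```yaml
outputs:
  - id: output
    type: Directory               # required rather than optional (Directory?)
    outputBinding:
      # matches the staged copy of the input directory in the working directory
      glob: $(inputs.output_dir.basename)
```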

After experimenting a bit more with your feedback, I think I understand what I can and cannot do. I am able to control what gets saved in the designated output directory using the outputs section of the CommandLineTool. It is not possible to write any output outside of that directory at the end of the execution, but you can chain to other steps when building a workflow.