Problems with Records not behaving as intended or passing to baseCommand

I have written a CWL tool description in which I wanted to use type: record to define inputs that require exactly one answer.

More specifically, the intent is for one of codebook: pkl or codebook: exp to be required, one of spots: pkl, spots: exp, or spots: null to be required, and one of transcripts: pkl or transcripts: exp to be required.

When I run the following YAML input file with cwltool, the only type: File or type: Directory inputs that get passed to the command line are --codebook-exp and --roi.

Why am I not seeing --transcript-pkl or --spots-pkl? And if it isn’t reading these in, why aren’t they being treated as mandatory?
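Independently of what the CWL runner does, the "exactly one of" rule can also be enforced inside the script itself, so that a dropped argument fails fast instead of surfacing later as an unrelated error. The following is only a minimal sketch: the real argument parser in qcDriver.py is not shown in this thread, so build_parser is a hypothetical stand-in that reuses the flag names from the CWL description.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in this thread."""
    parser = argparse.ArgumentParser(prog="qcDriver.py")

    # Exactly one codebook source must be given.
    codebook = parser.add_mutually_exclusive_group(required=True)
    codebook.add_argument("--codebook-pkl", type=Path)
    codebook.add_argument("--codebook-exp", type=Path)

    # Spots are optional, but at most one source may be given.
    spots = parser.add_mutually_exclusive_group(required=False)
    spots.add_argument("--spots-pkl", type=Path)
    spots.add_argument("--spots-exp", type=Path)

    # Exactly one transcript source must be given.
    transcripts = parser.add_mutually_exclusive_group(required=True)
    transcripts.add_argument("--transcript-pkl", type=Path)
    transcripts.add_argument("--transcript-exp", type=Path)

    parser.add_argument("--roi", type=Path)
    parser.add_argument("--x-size", type=int)
    parser.add_argument("--y-size", type=int)
    parser.add_argument("--z-size", type=int)
    parser.add_argument("--run-ripley", action="store_true")
    return parser


if __name__ == "__main__":
    # argparse exits with an error if a required group is missing.
    print(build_parser().parse_args())
```

With a parser like this, a command line that lacks both --transcript-pkl and --transcript-exp is rejected by argparse before any processing starts.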

YML file

codebook:
  exp:
    class: Directory
    path: /mnt/data/intron_rep4/txprocconverted/
spots:
  pkl:
    class: File
    path: /mnt/ccisar/pickles/collab_spots.pkl
transcripts:
  pkl:
    class: File
    path: /mnt/ccisar/pickles/collab_transcripts.pkl
roi:
  class: File
  path: /mnt/data/intron_rep4/collab_segmask/RoiSet.zip
imagesize:
  x-size: 2048
  y-size: 2048
  z-size: 11
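As a sanity check on the input object itself, the "one key per record" intent can be verified before handing the file to a runner. This is a rough sketch under stated assumptions: CHOICES and check_job are hypothetical names, and the job is written as a plain Python dict equivalent to the YAML above rather than loaded from the file.

```python
from typing import Dict, List

# Hypothetical table of "pick one" inputs:
# input name -> (allowed keys, whether the input may be omitted/null).
CHOICES = {
    "codebook": ({"pkl", "exp"}, False),
    "spots": ({"pkl", "exp"}, True),
    "transcripts": ({"pkl", "exp"}, False),
}


def check_job(job: Dict) -> List[str]:
    """Return a list of problems; an empty list means the job looks fine."""
    problems = []
    for name, (allowed, nullable) in CHOICES.items():
        value = job.get(name)
        if value is None:
            if not nullable:
                problems.append(f"{name}: missing, expected one of {sorted(allowed)}")
            continue
        if not isinstance(value, dict):
            problems.append(f"{name}: expected a mapping with one key")
            continue
        keys = set(value)
        if len(keys) != 1 or not keys <= allowed:
            problems.append(
                f"{name}: expected exactly one of {sorted(allowed)}, got {sorted(keys)}"
            )
    return problems


# The job from the YAML file above, written as a dict.
job = {
    "codebook": {"exp": {"class": "Directory", "path": "/mnt/data/intron_rep4/txprocconverted/"}},
    "spots": {"pkl": {"class": "File", "path": "/mnt/ccisar/pickles/collab_spots.pkl"}},
    "transcripts": {"pkl": {"class": "File", "path": "/mnt/ccisar/pickles/collab_transcripts.pkl"}},
    "roi": {"class": "File", "path": "/mnt/data/intron_rep4/collab_segmask/RoiSet.zip"},
    "imagesize": {"x-size": 2048, "y-size": 2048, "z-size": 11},
}

print(check_job(job))
```

An empty result here suggests the input file matches the intended shape, which points the investigation at the runner rather than at the YAML.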

FYI: I adjusted your account, can you try posting the link to your repo again?

That works now, thanks!


Which CWL runner are you using? Can you share some output or logs from that runner?

See also

cwltool --validate https://raw.githubusercontent.com/hubmapconsortium/spatial-transcriptomics-pipeline/master/steps/qc.cwl
INFO /home/michael/cwltool/env3.9/bin/cwltool 3.1.20210909133024
ERROR Tool definition failed validation:
https://raw.githubusercontent.com/hubmapconsortium/spatial-transcriptomics-pipeline/master/steps/qc.cwl:3:1: "outputs" section
                                                                                                             is not valid.

I had to make some changes, but here is a log from running your CWL description with the CWL reference runner and empty fake inputs:

$ cwltool d430_qc.cwl d430_inputs.yml 
INFO /home/michael/cwltool/env3.9/bin/cwltool 3.1.20210909133024
INFO Resolved 'd430_qc.cwl' to 'file:///home/michael/cwltool/d430/d430_qc.cwl'
INFO [job d430_qc.cwl] /tmp/17s559pv$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/17s559pv,target=/cRIYSz \
    --mount=type=bind,source=/tmp/nvxwreul,target=/tmp \
    --mount=type=bind,source=/home/michael/cwltool/d430/txprocconverted,target=/var/lib/cwl/stg36171afc-b0a3-40af-bd8f-5d1a85707011/txprocconverted,readonly \
    --mount=type=bind,source=/home/michael/cwltool/d430/RoiSet.zip,target=/var/lib/cwl/stgaa029c93-2ae8-4724-acbb-8cac6a76205f/RoiSet.zip,readonly \
    --mount=type=bind,source=/home/michael/cwltool/d430/collab_spots.pkl,target=/var/lib/cwl/stg91a3fc3d-1e09-45c0-a418-902d5f6080a6/collab_spots.pkl,readonly \
    --mount=type=bind,source=/home/michael/cwltool/d430/collab_transcripts.pkl,target=/var/lib/cwl/stg4ad1c23d-7106-420b-a924-3f0226bc6410/collab_transcripts.pkl,readonly \
    --workdir=/cRIYSz \
    --read-only=true \
    --net=none \
    --user=1000:1000 \
    --rm \
    --cidfile=/tmp/3pyo89jz/20210911134538-354260.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/cRIYSz \
    docker.pkg.github.com/hubmapconsortium/spatial-transcriptomics-pipeline/starfish:latest \
    /opt/qcDriver.py \
    --codebook-exp \
    /var/lib/cwl/stg36171afc-b0a3-40af-bd8f-5d1a85707011/txprocconverted \
    --spots-pkl \
    /var/lib/cwl/stg91a3fc3d-1e09-45c0-a418-902d5f6080a6/collab_spots.pkl \
    --transcript-pkl \
    /var/lib/cwl/stg4ad1c23d-7106-420b-a924-3f0226bc6410/collab_transcripts.pkl \
    --roi \
    /var/lib/cwl/stgaa029c93-2ae8-4724-acbb-8cac6a76205f/RoiSet.zip \
    --x-size \
    2048 \
    --y-size \
    2048 \
    --z-size \
    11

Here is the updated description

#!/usr/bin/env cwl-runner

class: CommandLineTool
cwlVersion: v1.1
baseCommand: /opt/qcDriver.py

requirements:
  DockerRequirement:
    dockerPull: docker.pkg.github.com/hubmapconsortium/spatial-transcriptomics-pipeline/starfish:latest

outputs: []

inputs:
  codebook:
    type:
      - type: record
        name: pkl
        fields:
          pkl:
            type: File
            inputBinding:
              prefix: --codebook-pkl
            doc: A codebook for this experiment, saved in a python pickle.
      - type: record
        name: exp
        fields:
          exp:
            type: Directory
            inputBinding:
              prefix: --codebook-exp
            doc: A directory with a 'experiment.json' file inside, which has the corresponding codebook for this experiment.
  spots:
    type:
      - 'null'
      - type: record
        name: pkl
        fields:
          pkl:
            type: File
            inputBinding:
              prefix: --spots-pkl
            doc: Spots found in this experiment, saved in a python pickle.
      - type: record
        name: exp
        fields:
          exp:
            type: File
            inputBinding:
              prefix: --spots-exp
            doc: The location of OUTPUT FROM EXPERIMENT. NETCDF?

  transcripts:
    type:
      - type: record
        name: pkl
        fields:
          pkl:
            type: File
            inputBinding:
              prefix: --transcript-pkl
            doc: The output DecodedIntensityTable, saved in a python pickle.
      - type: record
        name: exp
        fields:
          exp:
            type: File
            inputBinding:
              prefix: --transcript-exp
            doc: The location of OUTPUT FROM EXPERIMENT. NETCDF?

  roi:
    type: File?
    inputBinding: 
      prefix: --roi
    doc: The location of the RoiSet.zip, if applicable.

  imagesize:
    - 'null'
    - type: record
      fields:
        - name: x-size
          type: int
          inputBinding:
            prefix: --x-size
          doc: x-dimension of image
        - name: y-size
          type: int
          inputBinding:
            prefix:  --y-size
          doc: y-dimension of image
        - name: z-size
          type: int
          inputBinding:
            prefix: --z-size
          doc: number of z-stacks

  find-ripley:
    type: boolean?
    inputBinding:
      prefix: --run-ripley
    doc: If true, will run ripley K estimates to find spatial density measures.  Can be slow.

Note that no outputs are collected.

This doesn’t appear to solve the issue. When I run this, neither --transcript-pkl nor --spots-pkl gets passed to the Python script, which was the problem with the original specification.

Is this because of the older cwltool version?

Output from the original cwl file is as follows:

INFO /home/ubuntu/miniconda3/bin/cwltool 3.0.20200807132242
INFO Resolved 'steps/qc.cwl' to 'file:///mnt/cwl/spatial-transcriptomics-pipeline/steps/qc.cwl'
INFO [job qc.cwl] /mnt/tmp/q79zrf03$ docker \
    run \
    -i \
    --mount=type=bind,source=/mnt/tmp/q79zrf03,target=/QqJZyL \
    --mount=type=bind,source=/mnt/tmp/twe21v3l,target=/tmp \
    --mount=type=bind,source=/mnt/data/intron_rep4/txprocconverted,target=/var/lib/cwl/stg30946621-981e-433e-9a12-f6dc94732882/txprocconverted,readonly \
    --mount=type=bind,source=/mnt/data/intron_rep4/collab_segmask/RoiSet.zip,target=/var/lib/cwl/stgcb9d5cff-e696-432c-9f62-77ac9ded27eb/RoiSet.zip,readonly \
    --workdir=/QqJZyL \
    --read-only=true \
    --net=none \
    --user=1000:1000 \
    --rm \
    --env=TMPDIR=/tmp \
    --env=HOME=/QqJZyL \
    --cidfile=/mnt/tmp/7x2zhu9g/20210913200455-524223.cid \
    docker.pkg.github.com/hubmapconsortium/spatial-transcriptomics-pipeline/starfish:latest \
    /opt/qcDriver.py \
    --codebook-exp \
    /var/lib/cwl/stg30946621-981e-433e-9a12-f6dc94732882/txprocconverted \
    --roi \
    /var/lib/cwl/stgcb9d5cff-e696-432c-9f62-77ac9ded27eb/RoiSet.zip \
    --x-size \
    2048 \
    --y-size \
    2048 \
    --z-size \
    11
Namespace(codebook_exp=PosixPath('/var/lib/cwl/stg30946621-981e-433e-9a12-f6dc94732882/txprocconverted'), codebook_pkl=None, roi=PosixPath('/var/lib/cwl/stgcb9d5cff-e696-432c-9f62-77ac9ded27eb/RoiSet.zip'), run_ripley=False, spots_exp=None, spots_pkl=None, transcript_exp=None, transcript_pkl=None, x_size=2048, y_size=2048, z_size=11)
100%|██████████| 1740/1740 [00:49<00:00, 35.24it/s]
Traceback (most recent call last):
  File "/opt/qcDriver.py", line 527, in <module>
    run("6_qc/", transcripts, codebook, size, spots, roi, args.run_ripley)
  File "/opt/qcDriver.py", line 462, in run
    trRes["density"] = getTranscriptDensity(transcripts, codebook)
  File "/opt/qcDriver.py", line 342, in getTranscriptDensity
    return np.shape(transcripts.data)[0] / len(codebook.target)
AttributeError: 'bool' object has no attribute 'data'
INFO [job qc.cwl] Max memory used: 0MiB
WARNING [job qc.cwl] completed permanentFail
{
    "qc_metrics": {
        "location": "file:///mnt/cwl/spatial-transcriptomics-pipeline/6_qc",
        "basename": "6_qc",
        "class": "Directory",
        "listing": [
            {
                "class": "File",
                "location": "file:///mnt/cwl/spatial-transcriptomics-pipeline/6_qc/2021-13-09_20%3A07_TXconversion.log",
                "basename": "2021-13-09_20:07_TXconversion.log",
                "checksum": "sha1$f31d422fcd03428b6453affd6d33771d4b7ca133",
                "size": 1312,
                "path": "/mnt/cwl/spatial-transcriptomics-pipeline/6_qc/2021-13-09_20:07_TXconversion.log"
            },
            {
                "class": "File",
                "location": "file:///mnt/cwl/spatial-transcriptomics-pipeline/6_qc/graph_output.pdf",
                "basename": "graph_output.pdf",
                "checksum": "sha1$27cdda449b113741c48f2bfc05ce7332e72bc655",
                "size": 208,
                "path": "/mnt/cwl/spatial-transcriptomics-pipeline/6_qc/graph_output.pdf"
            }
        ],
        "path": "/mnt/cwl/spatial-transcriptomics-pipeline/6_qc"
    }
}
WARNING Final process status is permanentFail

I am at a bit of a loss as to why this outputs: binding isn’t considered valid. As best I can tell, it is the same as in other CWL specifications I’ve written in this same project (e.g. steps/segmentation.cwl), which pass validation. Is this an error being deferred from something else?

Did you check in and push your latest code? There is no outputs section in qc.cwl at commit c2d9a68108094a0c9a4ef279c9b93244d7617247 of hubmapconsortium/spatial-transcriptomics-pipeline.

Please also compare that to the changes I made in post #6 of this thread (Problems with Records not behaving as intended or passing to baseCommand).

Ah, I didn’t realize that I hadn’t pushed the latest version. It is uploaded and passes validation now.

I did run the modified cwl you wrote, but experienced the same problem with the directories not getting passed (below). Is this entirely because of the version difference in cwltool?

$ cwltool steps/qc2.cwl qc_intron4_collab.yml

INFO /home/ubuntu/miniconda3/bin/cwltool 3.0.20200807132242
INFO Resolved 'steps/qc2.cwl' to 'file:///mnt/cwl/spatial-transcriptomics-pipeline/steps/qc2.cwl'
INFO [job qc2.cwl] /mnt/tmp/dlte55eh$ docker \
    run \
    -i \
    --mount=type=bind,source=/mnt/tmp/dlte55eh,target=/qnPMkp \
    --mount=type=bind,source=/mnt/tmp/8rnwesek,target=/tmp \
    --mount=type=bind,source=/mnt/data/intron_rep4/txprocconverted,target=/var/lib/cwl/stgce69d91e-54ad-4597-89e7-760a49baab46/txprocconverted,readonly \
    --mount=type=bind,source=/mnt/data/intron_rep4/collab_segmask/RoiSet.zip,target=/var/lib/cwl/stg272078e1-7d50-4f2d-bad2-018aaf4f10c8/RoiSet.zip,readonly \
    --workdir=/qnPMkp \
    --read-only=true \
    --net=none \
    --user=1000:1000 \
    --rm \
    --env=TMPDIR=/tmp \
    --env=HOME=/qnPMkp \
    --cidfile=/mnt/tmp/eb0we_be/20210914212533-985220.cid \
    docker.pkg.github.com/hubmapconsortium/spatial-transcriptomics-pipeline/starfish:latest \
    /opt/qcDriver.py \
    --codebook-exp \
    /var/lib/cwl/stgce69d91e-54ad-4597-89e7-760a49baab46/txprocconverted \
    --roi \
    /var/lib/cwl/stg272078e1-7d50-4f2d-bad2-018aaf4f10c8/RoiSet.zip \
    --x-size \
    2048 \
    --y-size \
    2048 \
    --z-size \
    11

Here is my input file and the result from running your latest qc.cwl at commit c2d9a68108094a0c9a4ef279c9b93244d7617247 of hubmapconsortium/spatial-transcriptomics-pipeline:

codebook:
  exp:
    class: Directory
    path: txprocconverted
spots:
  pkl:
    class: File
    path: collab_spots.pkl
transcripts:
  pkl:
    class: File
    path: collab_transcripts.pkl
roi:
  class: File
  path: RoiSet.zip
imagesize:
  x-size: 2048
  y-size: 2048
  z-size: 11
$ cwltool d430_qc_new.cwl ./d430_inputs.yml 
INFO /home/michael/cwltool/env3.9/bin/cwltool 3.1.20210917120557
INFO Resolved 'd430_qc_new.cwl' to 'file:///home/michael/cwltool/d430/d430_qc_new.cwl'
INFO [job d430_qc_new.cwl] /tmp/zgyk7aoz$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/zgyk7aoz,target=/QJbuJB \
    --mount=type=bind,source=/tmp/vm42cymc,target=/tmp \
    --mount=type=bind,source=/home/michael/cwltool/d430/txprocconverted,target=/var/lib/cwl/stgd017b054-beb9-4bd2-87c4-b715cefbc4b7/txprocconverted,readonly \
    --mount=type=bind,source=/home/michael/cwltool/d430/RoiSet.zip,target=/var/lib/cwl/stg891be482-8f1d-4e65-a774-3957deaa3110/RoiSet.zip,readonly \
    --mount=type=bind,source=/home/michael/cwltool/d430/collab_spots.pkl,target=/var/lib/cwl/stgd773eb33-d1d4-4c2c-ba13-84b38caaa9f2/collab_spots.pkl,readonly \
    --mount=type=bind,source=/home/michael/cwltool/d430/collab_transcripts.pkl,target=/var/lib/cwl/stg9e9ced24-45a1-46f0-9847-9e19114f748f/collab_transcripts.pkl,readonly \
    --workdir=/QJbuJB \
    --read-only=true \
    --net=none \
    --user=1000:1000 \
    --rm \
    --cidfile=/tmp/xfxx8wc9/20210920110952-438323.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/QJbuJB \
    docker.pkg.github.com/hubmapconsortium/spatial-transcriptomics-pipeline/starfish:latest \
    /opt/qcDriver.py \
    --codebook-exp \
    /var/lib/cwl/stgd017b054-beb9-4bd2-87c4-b715cefbc4b7/txprocconverted \
    --spots-pkl \
    /var/lib/cwl/stgd773eb33-d1d4-4c2c-ba13-84b38caaa9f2/collab_spots.pkl \
    --transcript-pkl \
    /var/lib/cwl/stg9e9ced24-45a1-46f0-9847-9e19114f748f/collab_transcripts.pkl \
    --roi \
    /var/lib/cwl/stg891be482-8f1d-4e65-a774-3957deaa3110/RoiSet.zip \
    --x-size \
    2048 \
    --y-size \
    2048 \
    --z-size \
    11

I updated cwltool to version 3.1.20210922130607 and I am no longer having this error. Thanks for the help.
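For anyone comparing their own setup against this thread, the version strings in the logs above can be compared numerically. This is a small sketch, not a statement about which release contains the fix: the thread only shows that 3.0.20200807132242 dropped the record fields and that the three 3.1 releases listed below passed them through. The helper names are hypothetical.

```python
def parse_cwltool_version(text: str) -> tuple:
    """Turn '3.1.20210922130607' into (3, 1, 20210922130607)."""
    return tuple(int(part) for part in text.strip().split("."))


# Versions taken from the logs in this thread.
FAILING = "3.0.20200807132242"
WORKING = ["3.1.20210909133024", "3.1.20210917120557", "3.1.20210922130607"]


def is_older_than_known_good(installed: str) -> bool:
    """True if `installed` predates every version seen working here.

    This is a heuristic only; the exact release that changed the
    behaviour is not identified in the thread.
    """
    oldest_good = min(parse_cwltool_version(v) for v in WORKING)
    return parse_cwltool_version(installed) < oldest_good


print(is_older_than_known_good(FAILING))
```

The installed version appears on the first INFO line that cwltool prints at startup, as in the logs above.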
