Fixed step order in workflow

Stikus · May 12, 2023, 7:49am

Hello, I assume this question has already been asked, but I searched for similar topics and found nothing.

Our workflow has several steps (callers, typers) that all start from the same BAM file, which is why their order can vary. For some data, one of these steps fails and kills the whole workflow. I want to set the order of steps so that the more crucial steps execute first and we still get their outputs. Is there a standard way to do this?

HrishiDhondge · May 25, 2023, 3:56pm

Hello @Stikus,

The order of steps is as you arranged them in your workflow. So I didn’t fully understand your question.
Could you provide more details on how you are implementing these steps with an example?

Stikus · May 26, 2023, 9:09am

The order in which steps are listed in the scheme/workflow doesn’t control the execution order.

Here are some examples:

scheme.cwl:

cwlVersion: v1.0
class: Workflow
id: wget-count-wf-parallel-001
doc: |
  Parallel workflow (YML) to run several copies of sequential pairs of tools (every
  sequential pair is a 'simple test Wget Count workflow', based on Alpine docker image).
  This workflow includes such elements as:
    - non-required WF and T inputs (marked both by "?" after main type and "- 'null'" as a separate type),
    - type as a string (not a list of one element),
    - defaults for WF and T inputs,
    - references inside T,
    - includes of T into WF step runs,
    - baseCommand usage,
    - comments after "#" (YML only),
    - multiline descriptions with the help of ">" and "|" (YML only)
  Last modification date: 08.05.2019 09:35.
# A line that starts with '#' is ignored
inputs:
  - id: WFI_WgetUrl1_out_filename
    doc: Base name for downloaded file
    type: string?
    default: index1-WFIdefault.html
  - id: WFI_WordsCount1_out_filename
    doc: Base name for file with a counter
    type: string?
  - id: WFI_url1
    doc: Url to download
    type: string
  - id: WFI_word1
    doc: A word to count
    type: string
  - id: WFI_WgetUrl2_out_filename
    doc: Base name for downloaded file
    type: string?
    default: index2-WFIdefault.html
  - id: WFI_WordsCount2_out_filename
    doc: Base name for file with a counter
    type: string?
  - id: WFI_url2
    doc: Url to download
    type: string
  - id: WFI_word2
    doc: A word to count
    type: string
outputs:
  - id: WFO_countfile1
    type: File
    outputSource: S_WordsCount1/outfile_count
  - id: WFO_outfile1
    type: File
    outputSource: S_WgetUrl1/outfile
  - id: WFO_countfile2
    type: File
    outputSource: S_WordsCount2/outfile_count
  - id: WFO_outfile2
    type: File
    outputSource: S_WgetUrl2/outfile
steps:
  - id: S_WgetUrl1
    in:
      - id: out_filename
        source: WFI_WgetUrl1_out_filename
      - id: url
        source: WFI_url1
    out:
      - id: outfile
    run: wget-url-tool.yml
  - id: S_WordsCount1
    in:
      - id: file
        source: S_WgetUrl1/outfile
      - id: out_filename
        source: WFI_WordsCount1_out_filename
      - id: word
        source: WFI_word1
    out:
      - id: outfile_count
    run: words-count-tool.yml
  - id: S_WgetUrl2
    in:
      - id: out_filename
        source: WFI_WgetUrl2_out_filename
      - id: url
        source: WFI_url2
    out:
      - id: outfile
    run: wget-url-tool.yml
  - id: S_WordsCount2
    in:
      - id: file
        source: S_WgetUrl2/outfile
      - id: out_filename
        source: WFI_WordsCount2_out_filename
      - id: word
        source: WFI_word2
    out:
      - id: outfile_count
    run: words-count-tool.yml

values.yml:

WFI_url1: https://en.wikipedia.org/wiki/Category:Bioinformatics_software
WFI_word1: bio
WFI_url2: https://en.wikipedia.org/wiki/SAMtools
WFI_word2: samtools
WFI_WgetUrl2_out_filename: index2-WFI-from_values_file.html
WFI_WordsCount2_out_filename: count-from_values_file.txt

wget-url-tool.yml:

cwlVersion: v1.0
class: CommandLineTool
id: wget-tool
doc: A tool to download a file from a url and to save it into a file
baseCommand: [bash, -c, "cd $HOME; wget --no-check-certificate -q $0 -O $1"]
inputs:
  - id: url
    doc: Url to download
    type: string
    inputBinding:
      position: 1
  - id: out_filename
    doc: Output downloaded filename
    type: string?
    inputBinding:
      position: 2
    default: index.html
outputs:
  - id: outfile
    type: File
    outputBinding:
      glob: $(inputs.out_filename)
requirements:
  - class: DockerRequirement
    dockerPull: 1science/alpine
  - class: ResourceRequirement
    coresMin: 1
    outdirMin: 2000
    ramMin: 500
    tmpdirMin: 2000

words-count-tool.yml:

cwlVersion: v1.0
class: CommandLineTool
id: wordscount-tool
doc: A tool to count the amount of a specified word in a provided file
baseCommand: [bash, -c, "cd $HOME; sed -e \"s/\\($1\\)/\\1\\n/g\" $0 | grep -c \"$1\" > $2"]
inputs:
  - id: file
    doc: A file to count words in
    type: File
    inputBinding:
      position: 1
  - id: word
    doc: A word to count
    type: string
    inputBinding:
      position: 2
  - id: out_filename
    doc: Output filename with count
    type: string?
    inputBinding:
      position: 3
    default: count.txt
outputs:
  - id: outfile_count
    type: File
    outputBinding:
      glob: $(inputs.out_filename)
requirements:
  - class: DockerRequirement
    dockerPull: 1science/alpine

command to run: cwltool scheme.cwl values.yml |& tee -a log.log
log.log:

INFO /usr/local/bin/cwltool 3.1.20230425144158
INFO Resolved 'scheme.cwl' to 'file:///home/bio/scheme.cwl'
INFO [workflow ] start
INFO [workflow ] starting step S_WgetUrl2
INFO [step S_WgetUrl2] start
WARNING [job S_WgetUrl2] Skipping Docker software container '--memory' limit despite presence of ResourceRequirement with ramMin and/or ramMax setting. Consider running with --strict-memory-limit for increased portability assurance.
WARNING [job S_WgetUrl2] Skipping Docker software container '--cpus' limit despite presence of ResourceRequirement with coresMin and/or coresMax setting. Consider running with --strict-cpu-limit for increased portability assurance.
INFO [job S_WgetUrl2] /tmp/9pgm7u9d$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/9pgm7u9d,target=/GolwUU \
    --mount=type=bind,source=/tmp/gde66af9,target=/tmp \
    --workdir=/GolwUU \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/8e5zoa3m/20230526114528-071946.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/GolwUU \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; wget --no-check-certificate -q $0 -O $1' \
    https://en.wikipedia.org/wiki/SAMtools \
    index2-WFI-from_values_file.html
INFO [job S_WgetUrl2] Max memory used: 0MiB
INFO [job S_WgetUrl2] completed success
INFO [step S_WgetUrl2] completed success
INFO [workflow ] starting step S_WgetUrl1
INFO [step S_WgetUrl1] start
WARNING [job S_WgetUrl1] Skipping Docker software container '--memory' limit despite presence of ResourceRequirement with ramMin and/or ramMax setting. Consider running with --strict-memory-limit for increased portability assurance.
WARNING [job S_WgetUrl1] Skipping Docker software container '--cpus' limit despite presence of ResourceRequirement with coresMin and/or coresMax setting. Consider running with --strict-cpu-limit for increased portability assurance.
INFO [job S_WgetUrl1] /tmp/_39m4crr$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/_39m4crr,target=/GolwUU \
    --mount=type=bind,source=/tmp/5jxxjeb6,target=/tmp \
    --workdir=/GolwUU \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/gd4ggr7d/20230526114529-079936.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/GolwUU \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; wget --no-check-certificate -q $0 -O $1' \
    https://en.wikipedia.org/wiki/Category:Bioinformatics_software \
    index1-WFIdefault.html
INFO [job S_WgetUrl1] Max memory used: 0MiB
INFO [job S_WgetUrl1] completed success
INFO [step S_WgetUrl1] completed success
INFO [workflow ] starting step S_WordsCount1
INFO [step S_WordsCount1] start
INFO [job S_WordsCount1] /tmp/9osg00te$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/9osg00te,target=/GolwUU \
    --mount=type=bind,source=/tmp/hj9mqom8,target=/tmp \
    --mount=type=bind,source=/tmp/_39m4crr/index1-WFIdefault.html,target=/var/lib/cwl/stg0a937bcb-f23c-4862-bd3f-79adb9c585a9/index1-WFIdefault.html,readonly \
    --workdir=/GolwUU \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/2rvh6_pk/20230526114530-092544.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/GolwUU \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; sed -e "s/\($1\)/\1\n/g" $0 | grep -c "$1" > $2' \
    /var/lib/cwl/stg0a937bcb-f23c-4862-bd3f-79adb9c585a9/index1-WFIdefault.html \
    bio \
    count.txt
INFO [job S_WordsCount1] Max memory used: 0MiB
INFO [job S_WordsCount1] completed success
INFO [step S_WordsCount1] completed success
INFO [workflow ] starting step S_WordsCount2
INFO [step S_WordsCount2] start
INFO [job S_WordsCount2] /tmp/qrncx4ow$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/qrncx4ow,target=/GolwUU \
    --mount=type=bind,source=/tmp/a02asp4_,target=/tmp \
    --mount=type=bind,source=/tmp/9pgm7u9d/index2-WFI-from_values_file.html,target=/var/lib/cwl/stg0754c423-51cd-4834-9288-116506974446/index2-WFI-from_values_file.html,readonly \
    --workdir=/GolwUU \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/nxtx5fml/20230526114531-105476.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/GolwUU \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; sed -e "s/\($1\)/\1\n/g" $0 | grep -c "$1" > $2' \
    /var/lib/cwl/stg0754c423-51cd-4834-9288-116506974446/index2-WFI-from_values_file.html \
    samtools \
    count-from_values_file.txt
INFO [job S_WordsCount2] Max memory used: 0MiB
INFO [job S_WordsCount2] completed success
INFO [step S_WordsCount2] completed success
INFO [workflow ] completed success
{
    "WFO_countfile1": {
        "location": "file:///home/bio/count.txt",
        "basename": "count.txt",
        "class": "File",
        "checksum": "sha1$37e3ecf5f468d8af8698ba15797184523a4b401c",
        "size": 3,
        "path": "/home/bio/count.txt"
    },
    "WFO_outfile1": {
        "location": "file:///home/bio/index1-WFIdefault.html",
        "basename": "index1-WFIdefault.html",
        "class": "File",
        "checksum": "sha1$ecfcd7f6785119948145604288ef0dc2cfac1b26",
        "size": 63022,
        "path": "/home/bio/index1-WFIdefault.html"
    },
    "WFO_countfile2": {
        "location": "file:///home/bio/count-from_values_file.txt",
        "basename": "count-from_values_file.txt",
        "class": "File",
        "checksum": "sha1$29581e412c0981bffd7d0f4a9cdd9b114fb80947",
        "size": 3,
        "path": "/home/bio/count-from_values_file.txt"
    },
    "WFO_outfile2": {
        "location": "file:///home/bio/index2-WFI-from_values_file.html",
        "basename": "index2-WFI-from_values_file.html",
        "class": "File",
        "checksum": "sha1$04db83843ff22701e5247ac4555ed82cc75e209b",
        "size": 94238,
        "path": "/home/bio/index2-WFI-from_values_file.html"
    }
}INFO Final process status is success


INFO /usr/local/bin/cwltool 3.1.20230425144158
INFO Resolved 'scheme.cwl' to 'file:///home/bio/scheme.cwl'
INFO [workflow ] start
INFO [workflow ] starting step S_WgetUrl2
INFO [step S_WgetUrl2] start
WARNING [job S_WgetUrl2] Skipping Docker software container '--memory' limit despite presence of ResourceRequirement with ramMin and/or ramMax setting. Consider running with --strict-memory-limit for increased portability assurance.
WARNING [job S_WgetUrl2] Skipping Docker software container '--cpus' limit despite presence of ResourceRequirement with coresMin and/or coresMax setting. Consider running with --strict-cpu-limit for increased portability assurance.
INFO [job S_WgetUrl2] /tmp/5ska444w$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/5ska444w,target=/MzwRqk \
    --mount=type=bind,source=/tmp/i2nyhbg9,target=/tmp \
    --workdir=/MzwRqk \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/um_6bnax/20230526114613-136998.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/MzwRqk \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; wget --no-check-certificate -q $0 -O $1' \
    https://en.wikipedia.org/wiki/SAMtools \
    index2-WFI-from_values_file.html
INFO [job S_WgetUrl2] Max memory used: 0MiB
INFO [job S_WgetUrl2] completed success
INFO [step S_WgetUrl2] completed success
INFO [workflow ] starting step S_WgetUrl1
INFO [step S_WgetUrl1] start
WARNING [job S_WgetUrl1] Skipping Docker software container '--memory' limit despite presence of ResourceRequirement with ramMin and/or ramMax setting. Consider running with --strict-memory-limit for increased portability assurance.
WARNING [job S_WgetUrl1] Skipping Docker software container '--cpus' limit despite presence of ResourceRequirement with coresMin and/or coresMax setting. Consider running with --strict-cpu-limit for increased portability assurance.
INFO [job S_WgetUrl1] /tmp/sh7fb95j$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/sh7fb95j,target=/MzwRqk \
    --mount=type=bind,source=/tmp/xuiiirql,target=/tmp \
    --workdir=/MzwRqk \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/m9i1hfar/20230526114614-145138.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/MzwRqk \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; wget --no-check-certificate -q $0 -O $1' \
    https://en.wikipedia.org/wiki/Category:Bioinformatics_software \
    index1-WFIdefault.html
INFO [job S_WgetUrl1] Max memory used: 0MiB
INFO [job S_WgetUrl1] completed success
INFO [step S_WgetUrl1] completed success
INFO [workflow ] starting step S_WordsCount2
INFO [step S_WordsCount2] start
INFO [job S_WordsCount2] /tmp/pdgw6nnd$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/pdgw6nnd,target=/MzwRqk \
    --mount=type=bind,source=/tmp/jxjausbs,target=/tmp \
    --mount=type=bind,source=/tmp/5ska444w/index2-WFI-from_values_file.html,target=/var/lib/cwl/stg5c7e7a76-2787-48e7-a704-5ff2f8c5fd0d/index2-WFI-from_values_file.html,readonly \
    --workdir=/MzwRqk \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/1use22_t/20230526114615-153849.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/MzwRqk \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; sed -e "s/\($1\)/\1\n/g" $0 | grep -c "$1" > $2' \
    /var/lib/cwl/stg5c7e7a76-2787-48e7-a704-5ff2f8c5fd0d/index2-WFI-from_values_file.html \
    samtools \
    count-from_values_file.txt
INFO [job S_WordsCount2] Max memory used: 0MiB
INFO [job S_WordsCount2] completed success
INFO [step S_WordsCount2] completed success
INFO [workflow ] starting step S_WordsCount1
INFO [step S_WordsCount1] start
INFO [job S_WordsCount1] /tmp/bqo2efyj$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/bqo2efyj,target=/MzwRqk \
    --mount=type=bind,source=/tmp/dy7sbx18,target=/tmp \
    --mount=type=bind,source=/tmp/sh7fb95j/index1-WFIdefault.html,target=/var/lib/cwl/stgdd99d157-583b-4df0-b9af-6027f7ffbed8/index1-WFIdefault.html,readonly \
    --workdir=/MzwRqk \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/neppa_8x/20230526114616-166386.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/MzwRqk \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; sed -e "s/\($1\)/\1\n/g" $0 | grep -c "$1" > $2' \
    /var/lib/cwl/stgdd99d157-583b-4df0-b9af-6027f7ffbed8/index1-WFIdefault.html \
    bio \
    count.txt
INFO [job S_WordsCount1] Max memory used: 0MiB
INFO [job S_WordsCount1] completed success
INFO [step S_WordsCount1] completed success
INFO [workflow ] completed success
{
    "WFO_countfile1": {
        "location": "file:///home/bio/count.txt",
        "basename": "count.txt",
        "class": "File",
        "checksum": "sha1$37e3ecf5f468d8af8698ba15797184523a4b401c",
        "size": 3,
        "path": "/home/bio/count.txt"
    },
    "WFO_outfile1": {
        "location": "file:///home/bio/index1-WFIdefault.html",
        "basename": "index1-WFIdefault.html",
        "class": "File",
        "checksum": "sha1$ecfcd7f6785119948145604288ef0dc2cfac1b26",
        "size": 63022,
        "path": "/home/bio/index1-WFIdefault.html"
    },
    "WFO_countfile2": {
        "location": "file:///home/bio/count-from_values_file.txt",
        "basename": "count-from_values_file.txt",
        "class": "File",
        "checksum": "sha1$29581e412c0981bffd7d0f4a9cdd9b114fb80947",
        "size": 3,
        "path": "/home/bio/count-from_values_file.txt"
    },
    "WFO_outfile2": {
        "location": "file:///home/bio/index2-WFI-from_values_file.html",
        "basename": "index2-WFI-from_values_file.html",
        "class": "File",
        "checksum": "sha1$04db83843ff22701e5247ac4555ed82cc75e209b",
        "size": 94238,
        "path": "/home/bio/index2-WFI-from_values_file.html"
    }
}INFO Final process status is success


INFO /usr/local/bin/cwltool 3.1.20230425144158
INFO Resolved 'scheme.cwl' to 'file:///home/bio/scheme.cwl'
INFO [workflow ] start
INFO [workflow ] starting step S_WgetUrl2
INFO [step S_WgetUrl2] start
WARNING [job S_WgetUrl2] Skipping Docker software container '--memory' limit despite presence of ResourceRequirement with ramMin and/or ramMax setting. Consider running with --strict-memory-limit for increased portability assurance.
WARNING [job S_WgetUrl2] Skipping Docker software container '--cpus' limit despite presence of ResourceRequirement with coresMin and/or coresMax setting. Consider running with --strict-cpu-limit for increased portability assurance.
INFO [job S_WgetUrl2] /tmp/52tlovzq$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/52tlovzq,target=/TcOXrL \
    --mount=type=bind,source=/tmp/xa62viun,target=/tmp \
    --workdir=/TcOXrL \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/dvyto87i/20230526114626-206672.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/TcOXrL \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; wget --no-check-certificate -q $0 -O $1' \
    https://en.wikipedia.org/wiki/SAMtools \
    index2-WFI-from_values_file.html
INFO [job S_WgetUrl2] Max memory used: 0MiB
INFO [job S_WgetUrl2] completed success
INFO [step S_WgetUrl2] completed success
INFO [workflow ] starting step S_WgetUrl1
INFO [step S_WgetUrl1] start
WARNING [job S_WgetUrl1] Skipping Docker software container '--memory' limit despite presence of ResourceRequirement with ramMin and/or ramMax setting. Consider running with --strict-memory-limit for increased portability assurance.
WARNING [job S_WgetUrl1] Skipping Docker software container '--cpus' limit despite presence of ResourceRequirement with coresMin and/or coresMax setting. Consider running with --strict-cpu-limit for increased portability assurance.
INFO [job S_WgetUrl1] /tmp/ibarffhw$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/ibarffhw,target=/TcOXrL \
    --mount=type=bind,source=/tmp/ggnpb5rr,target=/tmp \
    --workdir=/TcOXrL \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/hi7dz5b1/20230526114627-214633.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/TcOXrL \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; wget --no-check-certificate -q $0 -O $1' \
    https://en.wikipedia.org/wiki/Category:Bioinformatics_software \
    index1-WFIdefault.html
INFO [job S_WgetUrl1] Max memory used: 0MiB
INFO [job S_WgetUrl1] completed success
INFO [step S_WgetUrl1] completed success
INFO [workflow ] starting step S_WordsCount1
INFO [step S_WordsCount1] start
INFO [job S_WordsCount1] /tmp/chxa_6u0$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/chxa_6u0,target=/TcOXrL \
    --mount=type=bind,source=/tmp/f91jzzbq,target=/tmp \
    --mount=type=bind,source=/tmp/ibarffhw/index1-WFIdefault.html,target=/var/lib/cwl/stg78b0261e-b926-4bec-83e3-28870bb0a640/index1-WFIdefault.html,readonly \
    --workdir=/TcOXrL \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/v3sp08zn/20230526114628-227535.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/TcOXrL \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; sed -e "s/\($1\)/\1\n/g" $0 | grep -c "$1" > $2' \
    /var/lib/cwl/stg78b0261e-b926-4bec-83e3-28870bb0a640/index1-WFIdefault.html \
    bio \
    count.txt
INFO [job S_WordsCount1] Max memory used: 0MiB
INFO [job S_WordsCount1] completed success
INFO [step S_WordsCount1] completed success
INFO [workflow ] starting step S_WordsCount2
INFO [step S_WordsCount2] start
INFO [job S_WordsCount2] /tmp/vtazkojh$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/vtazkojh,target=/TcOXrL \
    --mount=type=bind,source=/tmp/0qsfphro,target=/tmp \
    --mount=type=bind,source=/tmp/52tlovzq/index2-WFI-from_values_file.html,target=/var/lib/cwl/stg0ddc1d5a-40a5-4bc8-bd3c-06f966e59b12/index2-WFI-from_values_file.html,readonly \
    --workdir=/TcOXrL \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/hw2fpzav/20230526114629-240674.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/TcOXrL \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; sed -e "s/\($1\)/\1\n/g" $0 | grep -c "$1" > $2' \
    /var/lib/cwl/stg0ddc1d5a-40a5-4bc8-bd3c-06f966e59b12/index2-WFI-from_values_file.html \
    samtools \
    count-from_values_file.txt
INFO [job S_WordsCount2] Max memory used: 0MiB
INFO [job S_WordsCount2] completed success
INFO [step S_WordsCount2] completed success
INFO [workflow ] completed success
{
    "WFO_countfile1": {
        "location": "file:///home/bio/count.txt",
        "basename": "count.txt",
        "class": "File",
        "checksum": "sha1$37e3ecf5f468d8af8698ba15797184523a4b401c",
        "size": 3,
        "path": "/home/bio/count.txt"
    },
    "WFO_outfile1": {
        "location": "file:///home/bio/index1-WFIdefault.html",
        "basename": "index1-WFIdefault.html",
        "class": "File",
        "checksum": "sha1$ecfcd7f6785119948145604288ef0dc2cfac1b26",
        "size": 63022,
        "path": "/home/bio/index1-WFIdefault.html"
    },
    "WFO_countfile2": {
        "location": "file:///home/bio/count-from_values_file.txt",
        "basename": "count-from_values_file.txt",
        "class": "File",
        "checksum": "sha1$29581e412c0981bffd7d0f4a9cdd9b114fb80947",
        "size": 3,
        "path": "/home/bio/count-from_values_file.txt"
    },
    "WFO_outfile2": {
        "location": "file:///home/bio/index2-WFI-from_values_file.html",
        "basename": "index2-WFI-from_values_file.html",
        "class": "File",
        "checksum": "sha1$04db83843ff22701e5247ac4555ed82cc75e209b",
        "size": 94238,
        "path": "/home/bio/index2-WFI-from_values_file.html"
    }
}INFO Final process status is success


INFO /usr/local/bin/cwltool 3.1.20230425144158
INFO Resolved 'scheme.cwl' to 'file:///home/bio/scheme.cwl'
INFO [workflow ] start
INFO [workflow ] starting step S_WgetUrl1
INFO [step S_WgetUrl1] start
WARNING [job S_WgetUrl1] Skipping Docker software container '--memory' limit despite presence of ResourceRequirement with ramMin and/or ramMax setting. Consider running with --strict-memory-limit for increased portability assurance.
WARNING [job S_WgetUrl1] Skipping Docker software container '--cpus' limit despite presence of ResourceRequirement with coresMin and/or coresMax setting. Consider running with --strict-cpu-limit for increased portability assurance.
INFO [job S_WgetUrl1] /tmp/6c7jztmr$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/6c7jztmr,target=/NTUBVx \
    --mount=type=bind,source=/tmp/hz2t59lh,target=/tmp \
    --workdir=/NTUBVx \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/fqh93cvb/20230526114725-681656.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/NTUBVx \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; wget --no-check-certificate -q $0 -O $1' \
    https://en.wikipedia.org/wiki/Category:Bioinformatics_software \
    index1-WFIdefault.html
INFO [job S_WgetUrl1] Max memory used: 0MiB
INFO [job S_WgetUrl1] completed success
INFO [step S_WgetUrl1] completed success
INFO [workflow ] starting step S_WgetUrl2
INFO [step S_WgetUrl2] start
WARNING [job S_WgetUrl2] Skipping Docker software container '--memory' limit despite presence of ResourceRequirement with ramMin and/or ramMax setting. Consider running with --strict-memory-limit for increased portability assurance.
WARNING [job S_WgetUrl2] Skipping Docker software container '--cpus' limit despite presence of ResourceRequirement with coresMin and/or coresMax setting. Consider running with --strict-cpu-limit for increased portability assurance.
INFO [job S_WgetUrl2] /tmp/_zh8yq8n$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/_zh8yq8n,target=/NTUBVx \
    --mount=type=bind,source=/tmp/bqzi65n5,target=/tmp \
    --workdir=/NTUBVx \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/vf5o_9ar/20230526114726-690087.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/NTUBVx \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; wget --no-check-certificate -q $0 -O $1' \
    https://en.wikipedia.org/wiki/SAMtools \
    index2-WFI-from_values_file.html
INFO [job S_WgetUrl2] Max memory used: 0MiB
INFO [job S_WgetUrl2] completed success
INFO [step S_WgetUrl2] completed success
INFO [workflow ] starting step S_WordsCount1
INFO [step S_WordsCount1] start
INFO [job S_WordsCount1] /tmp/871jpb1b$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/871jpb1b,target=/NTUBVx \
    --mount=type=bind,source=/tmp/ypbqj7ow,target=/tmp \
    --mount=type=bind,source=/tmp/6c7jztmr/index1-WFIdefault.html,target=/var/lib/cwl/stgdacaaf8f-bfe2-43c2-bf66-3139f9f011cb/index1-WFIdefault.html,readonly \
    --workdir=/NTUBVx \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/a31srkwy/20230526114727-702418.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/NTUBVx \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; sed -e "s/\($1\)/\1\n/g" $0 | grep -c "$1" > $2' \
    /var/lib/cwl/stgdacaaf8f-bfe2-43c2-bf66-3139f9f011cb/index1-WFIdefault.html \
    bio \
    count.txt
INFO [job S_WordsCount1] Max memory used: 0MiB
INFO [job S_WordsCount1] completed success
INFO [step S_WordsCount1] completed success
INFO [workflow ] starting step S_WordsCount2
INFO [step S_WordsCount2] start
INFO [job S_WordsCount2] /tmp/v50ol5bn$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/v50ol5bn,target=/NTUBVx \
    --mount=type=bind,source=/tmp/4lrabiq9,target=/tmp \
    --mount=type=bind,source=/tmp/_zh8yq8n/index2-WFI-from_values_file.html,target=/var/lib/cwl/stg918eea8a-c35d-4f0d-88af-6d4b49e9f0ba/index2-WFI-from_values_file.html,readonly \
    --workdir=/NTUBVx \
    --read-only=true \
    --user=1094:1094 \
    --rm \
    --cidfile=/tmp/u2r_pg0p/20230526114728-714774.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/NTUBVx \
    1science/alpine \
    bash \
    -c \
    'cd $HOME; sed -e "s/\($1\)/\1\n/g" $0 | grep -c "$1" > $2' \
    /var/lib/cwl/stg918eea8a-c35d-4f0d-88af-6d4b49e9f0ba/index2-WFI-from_values_file.html \
    samtools \
    count-from_values_file.txt
INFO [job S_WordsCount2] Max memory used: 0MiB
INFO [job S_WordsCount2] completed success
INFO [step S_WordsCount2] completed success
INFO [workflow ] completed success
{
    "WFO_countfile1": {
        "location": "file:///home/bio/count.txt",
        "basename": "count.txt",
        "class": "File",
        "checksum": "sha1$37e3ecf5f468d8af8698ba15797184523a4b401c",
        "size": 3,
        "path": "/home/bio/count.txt"
    },
    "WFO_outfile1": {
        "location": "file:///home/bio/index1-WFIdefault.html",
        "basename": "index1-WFIdefault.html",
        "class": "File",
        "checksum": "sha1$ecfcd7f6785119948145604288ef0dc2cfac1b26",
        "size": 63022,
        "path": "/home/bio/index1-WFIdefault.html"
    },
    "WFO_countfile2": {
        "location": "file:///home/bio/count-from_values_file.txt",
        "basename": "count-from_values_file.txt",
        "class": "File",
        "checksum": "sha1$29581e412c0981bffd7d0f4a9cdd9b114fb80947",
        "size": 3,
        "path": "/home/bio/count-from_values_file.txt"
    },
    "WFO_outfile2": {
        "location": "file:///home/bio/index2-WFI-from_values_file.html",
        "basename": "index2-WFI-from_values_file.html",
        "class": "File",
        "checksum": "sha1$04db83843ff22701e5247ac4555ed82cc75e209b",
        "size": 94238,
        "path": "/home/bio/index2-WFI-from_values_file.html"
    }
}INFO Final process status is success

As you can see in the log, the order varies:

  4: INFO [workflow ] starting step S_WgetUrl2
 29: INFO [workflow ] starting step S_WgetUrl1
 54: INFO [workflow ] starting step S_WordsCount1
 79: INFO [workflow ] starting step S_WordsCount2

144: INFO [workflow ] starting step S_WgetUrl2
169: INFO [workflow ] starting step S_WgetUrl1
194: INFO [workflow ] starting step S_WordsCount2
219: INFO [workflow ] starting step S_WordsCount1

284: INFO [workflow ] starting step S_WgetUrl2
309: INFO [workflow ] starting step S_WgetUrl1
334: INFO [workflow ] starting step S_WordsCount1
359: INFO [workflow ] starting step S_WordsCount2

424: INFO [workflow ] starting step S_WgetUrl1
449: INFO [workflow ] starting step S_WgetUrl2
474: INFO [workflow ] starting step S_WordsCount1
499: INFO [workflow ] starting step S_WordsCount2

I want to find a way to pin it, i.e. S_WgetUrl1 → S_WordsCount1 → S_WgetUrl2 → S_WordsCount2. Does anyone know how?

HrishiDhondge · May 26, 2023, 10:00am

Hello @Stikus,

Thanks for the explanation.
The only dependency is that of words-count-tool on wget-url-tool, so it will always run after wget-url-tool. If you want to run the steps in the order you mentioned (S_WgetUrl1 → S_WordsCount1 → S_WgetUrl2 → S_WordsCount2), you can try giving the S_WgetUrl2 step an optional input sourced from S_WordsCount1.

I hope it helps.

Stikus · May 26, 2023, 10:04am

Your solution will work, but it is a workaround. Is there a standard way that avoids such pseudo-inputs?

When a workflow has 20+ steps, managing these pseudo-inputs can become very tedious.

mrc · May 26, 2023, 10:18am

Welcome @Stikus ;

If your main concern is to keep running your CWL workflow as far as possible and not stopping early due to errors in another part of the same workflow, then I would suggest configuring your workflow runner/platform to do so.

For example, cwltool has an --on-error=continue option. What workflow runner/platform do you use?

Stikus · May 26, 2023, 10:32am

We’re using cwltool, primarily on Ubuntu.

Thanks for the advice about --on-error=continue, we’ll look into it. I assume that the exit code of cwltool will be nonzero in this case?

cwltool will try to run as many steps as possible and collect as many outputs as possible, correct?

Correct me if I’m wrong, but is there no standard way to fix the order of steps (or set priorities)? Is this simply not a focus of the CWL team, or is it not part of the CWL concept at all?

locos · June 7, 2023, 1:30am

No, that’s not correct.

According to the official doc, "Workflow steps are not necessarily run in the order they are listed, instead the order is determined by the dependencies between steps (using source)."

Official doc
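
The quoted rule can be illustrated with a minimal sketch that reuses the two tool files from this thread. The step and parameter names (fetch_step, count_step, url, word, count) are made up for illustration. count_step is listed first, yet it can only start after fetch_step, because its file input takes its source from fetch_step/outfile:

```yaml
cwlVersion: v1.0
class: Workflow
inputs:
  url: string
  word: string
outputs:
  count:
    type: File
    outputSource: count_step/outfile_count
steps:
  # Listed first, but it has to wait: its `file` input comes from fetch_step.
  count_step:
    run: words-count-tool.yml
    in:
      file: fetch_step/outfile   # this source is what creates the dependency
      word: word
    out: [outfile_count]
  # Listed second, but it has no upstream dependency, so it is free to start first.
  fetch_step:
    run: wget-url-tool.yml
    in:
      url: url
    out: [outfile]
```

Steps that share no such link, like the two wget steps in the original workflow, have no defined relative order, which matches the variation seen in the logs.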

locos · June 7, 2023, 9:52am

Can you provide an example explaining how to set this optional input ?

mrc · June 7, 2023, 10:20am

Correct, the exit code should be nonzero if there is any failure, even with --on-error=continue.

With --on-error=continue, yes.

Process scheduling priority is not a concept in the CWL standards, no. However, one could develop an extension (hint) to indicate this priority. With a code contribution, cwltool could even respect that hint, when possible, with the --enable-ext flag. If this hint becomes popular, it could become a part of a future version of the CWL standards.
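
Purely as an illustration of what such an extension hint might look like: the namespace, the class name ex:StepPriority, and the priority field below are all hypothetical. No CWL runner, cwltool included, implements them, so at best a runner would ignore or warn about the hint:

```yaml
cwlVersion: v1.0
class: Workflow
$namespaces:
  ex: "https://example.org/cwl-ext#"   # hypothetical namespace, not a real extension
inputs:
  url: string
outputs:
  page:
    type: File
    outputSource: fetch/outfile
steps:
  fetch:
    run: wget-url-tool.yml
    in:
      url: url
    out: [outfile]
    hints:
      - class: ex:StepPriority   # hypothetical hint, not implemented by any runner
        priority: 10             # hypothetical field; higher could mean "schedule earlier"
```

Because it is a hint rather than a requirement, a runner that does not understand it should still be able to execute the workflow, just without any ordering guarantee.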

mrc · June 7, 2023, 10:30am

Sure, using the example from @Stikus:

1. S_WgetUrl1 → S_WordsCount1 (already linked, no extra input needed)
2. S_WordsCount1 → S_WgetUrl2 (look for the unused_input I added below)
3. S_WgetUrl2 → S_WordsCount2 (already linked, no extra input needed).

cwlVersion: v1.0
class: Workflow
id: wget-count-wf-parallel-001
doc: |
  Parallel workflow (YML) to run several copies of sequential pairs of tools (every
  sequential pair is a 'simple test Wget Count workflow', based on Alpine docker image).
  This workflow includes such elements as:
    - non-required WF and T inputs (marked both by "?" after main type and "- 'null'" as a separate type),
    - type as a string (not a list of one element),
    - defaults for WF and T inputs,
    - references inside T,
    - includes of T into WF step runs,
    - baseCommand usage,
    - comments after "#" (YML only),
    - multiline descriptions with the help of ">" and "|" (YML only)
  Last modification date: 08.05.2019 09:35.
# A line that starts with '#' is ignored
inputs:
  - id: WFI_WgetUrl1_out_filename
    doc: Base name for downloaded file
    type: string?
    default: index1-WFIdefault.html
  - id: WFI_WordsCount1_out_filename
    doc: Base name for file with a counter
    type: string?
  - id: WFI_url1
    doc: Url to download
    type: string
  - id: WFI_word1
    doc: A word to count
    type: string
  - id: WFI_WgetUrl2_out_filename
    doc: Base name for downloaded file
    type: string?
    default: index2-WFIdefault.html
  - id: WFI_WordsCount2_out_filename
    doc: Base name for file with a counter
    type: string?
  - id: WFI_url2
    doc: Url to download
    type: string
  - id: WFI_word2
    doc: A word to count
    type: string
outputs:
  - id: WFO_countfile1
    type: File
    outputSource: S_WordsCount1/outfile_count
  - id: WFO_outfile1
    type: File
    outputSource: S_WgetUrl1/outfile
  - id: WFO_countfile2
    type: File
    outputSource: S_WordsCount2/outfile_count
  - id: WFO_outfile2
    type: File
    outputSource: S_WgetUrl2/outfile
steps:
  - id: S_WgetUrl1
    in:
      - id: out_filename
        source: WFI_WgetUrl1_out_filename
      - id: url
        source: WFI_url1
    out:
      - id: outfile
    run: wget-url-tool.yml
  - id: S_WordsCount1
    in:
      - id: file
        source: S_WgetUrl1/outfile
      - id: out_filename
        source: WFI_WordsCount1_out_filename
      - id: word
        source: WFI_word1
    out:
      - id: outfile_count
    run: words-count-tool.yml
  - id: S_WgetUrl2
    in:
      - id: out_filename
        source: WFI_WgetUrl2_out_filename
      - id: url
        source: WFI_url2
      - id: unused_input
        source: S_WordsCount1/outfile_count  # forces S_WgetUrl2 to run only after S_WordsCount1
    out:
      - id: outfile
    run: wget-url-tool.yml
  - id: S_WordsCount2
    in:
      - id: file
        source: S_WgetUrl2/outfile
      - id: out_filename
        source: WFI_WordsCount2_out_filename
      - id: word
        source: WFI_word2
    out:
      - id: outfile_count
    run: words-count-tool.yml

locos · June 8, 2023, 6:23am

What if step S_WordsCount1 has no output file, or its output file is already used by another parameter?

mrc · June 13, 2023, 7:32pm

One could add a fake output to an output-less step.

Output files from one step can be used by any number of other steps.

locos · June 14, 2023, 3:31am

With outputEval: ?
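
One possible shape for such a fake output, as a minimal sketch that nobody in this thread has confirmed: the tool id, the placeholder command, and the output name done are invented for illustration. The output is a constant string produced by outputEval, so no file and no glob are needed:

```yaml
cwlVersion: v1.0
class: CommandLineTool
id: no-output-tool                       # hypothetical tool, not part of the workflow above
requirements:
  - class: InlineJavascriptRequirement   # needed because outputEval below is a JavaScript expression
baseCommand: ["true"]                    # placeholder command that writes no files
inputs: []
outputs:
  - id: done                             # fake output; exists only so other steps can depend on it
    type: string
    outputBinding:
      outputEval: $("done")              # constant value, evaluated after the command finishes
```

A later step could then list something like `source: <step_id>/done` on an otherwise unused input, in the same way as unused_input in the workflow above.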