How can type "File" be incompatible with type "File"?

I am getting this error message from cwltool:

demo.cwl:45:9: Source ‘indexed_sequences’ of type “File” is incompatible
demo.cwl:49:7: with sink ‘reference’ of type “File”
source has linkMerge method merge_nested
demo.cwl:13:5: Source ‘reference’ of type “File” is incompatible
demo.cwl:49:7: with sink ‘reference’ of type “File”
source has linkMerge method merge_nested

I am at a loss to understand how the type File can be incompatible with the type File. The problem seems to be that the source is the output of bwa index, which has secondary files associated with it, and the sink is bwa mem, which needs the base reference file as a command-line input but also needs the secondary files to be present. I’ve tried all sorts of variations on how to specify this, including a number of examples from GitHub, but I can’t get this simple workflow past cwltool’s validation.

Here are the relevant files:

bwa_index.cwl

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool

requirements:
  InitialWorkDirRequirement:
    listing: [ $(inputs.sequences) ]
#TODO: Enable after this issue is fixed: https://github.com/common-workflow-language/cwltool/issues/80
#hints:
#  - $import: bwa-docker.yml

inputs:
  algorithm:
    type: string?
    inputBinding:
      prefix: -a
    doc: |
      BWT construction algorithm: bwtsw or is (Default: auto)
  sequences:
    type: File
    inputBinding:
      valueFrom: $(self.basename)
      position: 4
  block_size:
    type: int?
    inputBinding:
      prefix: -b
    doc: |
      Block size for the bwtsw algorithm (effective with -a bwtsw) (Default: 10000000)

outputs:
  indexed_sequences:
    type: File
    secondaryFiles:
      - .amb
      - .ann
      - .bwt
      - .pac
      - .sa
    outputBinding:
      glob: $(inputs.sequences.basename)

baseCommand:
- bwa
- index

bwa_mem.cwl

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool

requirements:
- class: InlineJavascriptRequirement

inputs:
  reference:
    type: File
#    secondaryFiles:
#        - .amb
#        - .ann
#        - .bwt
#        - .pac
#        - .sa
    inputBinding:
      position: 2

  output_filename: string

  reads:
    type: File[]
    inputBinding:
      position: 3

  smart_pairing:
    type: boolean?
    inputBinding:
      position: 1
      prefix: -p

  threads:
    type: int?
    inputBinding:
      position: 1
      prefix: -t
    doc: -t INT        number of threads [1]

  min_std_max_min:
    type: int[]?
    inputBinding:
      position: 1
      prefix: -I
      itemSeparator: ','
stdout: $(inputs.output_filename)

outputs:
  aligned_reads:
    type: File
    outputBinding:
      glob: $(inputs.output_filename)

baseCommand:
- bwa
- mem

demo.cwl

#!/usr/bin/env cwl-runner

class: Workflow
cwlVersion: v1.0

requirements:
  InitialWorkDirRequirement:
    listing: [ $(inputs.compressed_file) ]
  MultipleInputFeatureRequirement: {}

inputs:
  reference:
    type: File

  reads:
    type: File[]

  smart_pairing:
    type: boolean

  output_filename:
    type: string

outputs:
  indexed_sequences:
    type: File
    secondaryFiles:
      - .amb
      - .ann
      - .bwt
      - .pac
      - .sa
    outputSource: bwa_index/indexed_sequences

  bwamem_output:
    type: File
    outputSource: bwa_mem/aligned_reads

steps:
  bwa_index:
    run: bwa_index.cwl
    in:
      sequences: reference
    out:
      [ indexed_sequences ]
  bwa_mem:
    run: bwa_mem.cwl
    in:
      reference:
        source: [ bwa_index/indexed_sequences, reference ]
      reads: reads
      output_filename: output_filename
      smart_pairing: smart_pairing
    out:
      [ aligned_reads ]

I’ve updated the demo.cwl file to add the secondary files to the reference input:

inputs:
  reference:
    type: File
    secondaryFiles:
      - .amb
      - .ann
      - .bwt
      - .pac
      - .sa

That didn’t work either. Same error.

OK, after browsing through the Workflow spec, I changed the reference input to:

    in:
      reference:
        source: [ reference ]

and didn’t mention the secondary files. That got me past the incompatible source/sink type error and the bwa_index step executed successfully and produced the index files. However, the workflow still failed with a cryptic error from cwltool:

cwltool.errors.WorkflowException: Expression evaluation error:
Expecting value: line 1 column 1 (char 0)

My question is line 1 column 1 of what?

Here’s some additional output:

101 (function(){return ((inputs.compressed_file));})()
stdout was: 'undefined'
stderr was: ''


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/work/miniconda3/envs/python3.9/lib/python3.9/site-packages/cwltool/workflow_job.py", line 857, in job
    for newjob in step.iterable:
  File "/work/miniconda3/envs/python3.9/lib/python3.9/site-packages/cwltool/workflow_job.py", line 781, in try_make_job
    yield from jobs
  File "/work/miniconda3/envs/python3.9/lib/python3.9/site-packages/cwltool/workflow_job.py", line 77, in job
    yield from self.step.job(joborder, output_callback, runtimeContext)
  File "/work/miniconda3/envs/python3.9/lib/python3.9/site-packages/cwltool/workflow.py", line 485, in job
    raise WorkflowException(str(exc)) from exc
cwltool.errors.WorkflowException: Expression evaluation error:
Expecting value: line 1 column 1 (char 0)
script was:
03     "output_filename": "VCtest_S2.sam",
04     "reads": [
05         {
06             "class": "File",
07             "location": "file:///work/snakemake_cwl_demo/cwl/no_unzip/VCtest_S2_R1.fastq",
08             "size": 39902,
09             "basename": "VCtest_S2_R1.fastq",
10             "nameroot": "VCtest_S2_R1",
11             "nameext": ".fastq",
12             "path": "/tmp/xr4v8bai/stg394e219c-a3ee-4a25-b492-3052c245003b/VCtest_S2_R1.fastq",
13             "dirname": "/tmp/xr4v8bai/stg394e219c-a3ee-4a25-b492-3052c245003b"
14         },
15         {
16             "class": "File",
17             "location": "file:///work/snakemake_cwl_demo/cwl/no_unzip/VCtest_S2_R2.fastq",
18             "size": 40148,
19             "basename": "VCtest_S2_R2.fastq",
20             "nameroot": "VCtest_S2_R2",
21             "nameext": ".fastq",
22             "path": "/tmp/xr4v8bai/stg99d955a1-5510-4802-a3dc-aecd2e625920/VCtest_S2_R2.fastq",
23             "dirname": "/tmp/xr4v8bai/stg99d955a1-5510-4802-a3dc-aecd2e625920"
24         }
25     ],
26     "reference": {
27         "class": "File",
28         "location": "file:///work/snakemake_cwl_demo/cwl/no_unzip/mmv_NC_001510.fasta",
29         "size": 5291,
30         "basename": "mmv_NC_001510.fasta",
31         "nameroot": "mmv_NC_001510",
32         "nameext": ".fasta",
33         "secondaryFiles": [
34             {
35                 "location": "file:///work/snakemake_cwl_demo/cwl/no_unzip/mmv_NC_001510.fasta.amb",
36                 "basename": "mmv_NC_001510.fasta.amb",
37                 "class": "File",
38                 "nameroot": "mmv_NC_001510.fasta",
39                 "nameext": ".amb",
40                 "size": 9,
41                 "path": "/tmp/xr4v8bai/stg03506b13-a58a-4513-9aaa-11b833127d47/mmv_NC_001510.fasta.amb",
42                 "dirname": "/tmp/xr4v8bai/stg03506b13-a58a-4513-9aaa-11b833127d47"
43             },
44             {
45                 "location": "file:///work/snakemake_cwl_demo/cwl/no_unzip/mmv_NC_001510.fasta.ann",
46                 "basename": "mmv_NC_001510.fasta.ann",
47                 "class": "File",
48                 "nameroot": "mmv_NC_001510.fasta",
49                 "nameext": ".ann",
50                 "size": 87,
51                 "path": "/tmp/xr4v8bai/stg03506b13-a58a-4513-9aaa-11b833127d47/mmv_NC_001510.fasta.ann",
52                 "dirname": "/tmp/xr4v8bai/stg03506b13-a58a-4513-9aaa-11b833127d47"
53             },
54             {
55                 "location": "file:///work/snakemake_cwl_demo/cwl/no_unzip/mmv_NC_001510.fasta.bwt",
56                 "basename": "mmv_NC_001510.fasta.bwt",
57                 "class": "File",
58                 "nameroot": "mmv_NC_001510.fasta",
59                 "nameext": ".bwt",
60                 "size": 5240,
61                 "path": "/tmp/xr4v8bai/stg03506b13-a58a-4513-9aaa-11b833127d47/mmv_NC_001510.fasta.bwt",
62                 "dirname": "/tmp/xr4v8bai/stg03506b13-a58a-4513-9aaa-11b833127d47"
63             },
64             {
65                 "location": "file:///work/snakemake_cwl_demo/cwl/no_unzip/mmv_NC_001510.fasta.pac",
66                 "basename": "mmv_NC_001510.fasta.pac",
67                 "class": "File",
68                 "nameroot": "mmv_NC_001510.fasta",
69                 "nameext": ".pac",
70                 "size": 1289,
71                 "path": "/tmp/xr4v8bai/stg03506b13-a58a-4513-9aaa-11b833127d47/mmv_NC_001510.fasta.pac",
72                 "dirname": "/tmp/xr4v8bai/stg03506b13-a58a-4513-9aaa-11b833127d47"
73             },
74             {
75                 "location": "file:///work/snakemake_cwl_demo/cwl/no_unzip/mmv_NC_001510.fasta.sa",
76                 "basename": "mmv_NC_001510.fasta.sa",
77                 "class": "File",
78                 "nameroot": "mmv_NC_001510.fasta",
79                 "nameext": ".sa",
80                 "size": 2624,
81                 "path": "/tmp/xr4v8bai/stg03506b13-a58a-4513-9aaa-11b833127d47/mmv_NC_001510.fasta.sa",
82                 "dirname": "/tmp/xr4v8bai/stg03506b13-a58a-4513-9aaa-11b833127d47"
83             }
84         ],
85         "path": "/tmp/xr4v8bai/stg03506b13-a58a-4513-9aaa-11b833127d47/mmv_NC_001510.fasta",
86         "dirname": "/tmp/xr4v8bai/stg03506b13-a58a-4513-9aaa-11b833127d47"
87     },
88     "smart_pairing": true,
89     "min_std_max_min": null,
90     "threads": null
91 };
92 var self = null;
93 var runtime = {
94     "cores": 1,
95     "ram": 1024,
96     "tmpdirSize": 1024,
97     "outdirSize": 1024,
98     "tmpdir": "/tmp/g0x1y58h",
99     "outdir": "/tmp/d66kvx89"
100 };
101 (function(){return ((inputs.compressed_file));})()
stdout was: 'undefined'
stderr was: ''

INFO [workflow ] completed permanentFail
DEBUG [workflow ] outputs {
    "bwamem_output": null,
    "indexed_sequences": null
}
DEBUG Removing intermediate output directory /tmp/99onenpo
{
    "bwamem_output": null,
    "indexed_sequences": null
}
WARNING Final process status is permanentFail

Hello,

Hope you’ve managed to solve this, if not, the following may be of interest.

  1. For the bwa mem step, you have the source input as

    • bwa_index/indexed_sequences
    • reference

    Two sources are merged into an array of files (File[]), but the sink expects a single File, which is why you are getting the incompatibility error.

    You likely just need bwa_index/indexed_sequences since this will have the reference as the main file and the bwa indexes as the secondary files.

  2. As for your evaluation error:

    You have specified InitialWorkDirRequirement inside a CWL Workflow; this requirement should only be used inside a tool. You are getting undefined because “compressed_file” is nowhere in your inputs.
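
Putting both points together, here is a rough sketch of how demo.cwl might look. I have not run it, and it assumes bwa_index.cwl and bwa_mem.cwl stay exactly as you posted them. I also dropped MultipleInputFeatureRequirement, on the assumption that nothing else in the workflow needs it once the step has a single source.

```yaml
#!/usr/bin/env cwl-runner

class: Workflow
cwlVersion: v1.0

# No InitialWorkDirRequirement here: it belongs in a CommandLineTool, and
# inputs.compressed_file is not one of the workflow inputs.

inputs:
  reference:
    type: File

  reads:
    type: File[]

  smart_pairing:
    type: boolean

  output_filename:
    type: string

outputs:
  indexed_sequences:
    type: File
    secondaryFiles:
      - .amb
      - .ann
      - .bwt
      - .pac
      - .sa
    outputSource: bwa_index/indexed_sequences

  bwamem_output:
    type: File
    outputSource: bwa_mem/aligned_reads

steps:
  bwa_index:
    run: bwa_index.cwl
    in:
      sequences: reference
    out:
      [ indexed_sequences ]
  bwa_mem:
    run: bwa_mem.cwl
    in:
      # A single source keeps the type as File. Listing two sources merges
      # them (merge_nested) into File[], which the File sink rejects.
      reference: bwa_index/indexed_sequences
      reads: reads
      output_filename: output_filename
      smart_pairing: smart_pairing
    out:
      [ aligned_reads ]
```

If the index files turn out not to be staged next to the reference when bwa mem runs, re-enabling the commented-out secondaryFiles block on the reference input in bwa_mem.cwl may be worth trying.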


That worked! Thanks for correcting my rookie mistakes. You are a gentleman and a scholar.
