How does ResourceRequirement work?

Hi,

I’m working on a project that uses Workflows with ResourceRequirement entries. I’ve been trying to detect any conflicts between ramMin/ramMax and coresMin/coresMax values in our CWL files.

As asked by my coworker, I tested this using cwltool to see if it can identify:

  • whether a step's (run's) requirements are no larger than the global requirements
  • whether resourceMin is no larger than resourceMax in any requirement

before running the job, but it seems that cwltool does not check for conflicts between the minimum and maximum resource values(?)

For example, even with the following configurations, the job still runs:

cwlVersion: v1.2
class: Workflow
label: "Higher coresMin Workflow"
doc: The Workflow ResourceRequirement has a coresMin higher than its coresMax = failure

requirements:
  ResourceRequirement:
    coresMin: 4 # > coresMax=2, shouldn't work, right?
    coresMax: 2
inputs: []
outputs: []

steps:
  good_step:
    run:
      class: CommandLineTool
      baseCommand: ["echo", "Hello World"]
      inputs: []
      outputs: []
    out: []
    in: []

cwlVersion: v1.2
class: Workflow
label: "Higher WorkflowStep coresMin than Workflow coresMax"
doc: The WorkflowStep ResourceRequirement has a coresMin higher than the Workflow ResourceRequirement coresMax = failure

inputs: []
outputs: []
requirements:
  ResourceRequirement:
    coresMax: 2 # also equals coresMin (when coresMin not specified)

steps:
  too_high_cores:
    requirements:
      ResourceRequirement:
        coresMin: 4 # > globalCoresMax=2, shouldn't work right?
    run:
      class: CommandLineTool
      baseCommand: ["echo", "Hello World"]
      inputs: []
      outputs: []
    in: []
    out: []

Am I doing/understanding something wrong? Or is my CWL job simply too simple for the ResourceRequirement to be taken into account?

Hi @Stellatsuu

ResourceRequirement is for CommandLineTool, not for Workflow.
Setting ResourceRequirement in the workflow will not set this as a ‘global’ variable.

cwlVersion: v1.2
class: Workflow
label: "Higher WorkflowStep coresMin than Workflow coresMax"
doc: |
  The WorkflowStep ResourceRequirement has a coresMin higher than the 
  Workflow ResourceRequirement coresMax = failure

inputs: []
outputs: []


steps:
  too_high_cores:
    requirements:
      ResourceRequirement:
        coresMax: 2
        coresMin: 4
    run:
      class: CommandLineTool
      baseCommand: ["echo", "Hello World"]
      inputs: []
      outputs: []
    in: []
    out: []

However, you are right regarding the coresMin / coresMax contradiction.
Even with --strict-cpu-limit set, cwltool doesn’t fail.
This seems to violate the definition in Common Workflow Language (CWL) Command Line Tool Description, v1.2

However, cwltool is just one implementation of CWL. Other implementations such as Toil may adhere to the definition I linked above.

Welcome to the CWL forum @Stellatsuu !

Reviewing the standard itself (https://www.commonwl.org/v1.2/Workflow.html#Requirements_and_hints):

If the same process requirement appears at different levels of the workflow, the most specific instance of the requirement is used, that is, an entry in requirements on a process implementation such as CommandLineTool will take precedence over an entry in requirements specified in a workflow step, and an entry in requirements on a workflow step takes precedence over the workflow. Entries in hints are resolved the same way.

But what you are trying to do is reasonable, and there have been previous discussions about blending requirements, or reformulating them for CWL v2 to make that easier.

Back to your example, only the innermost ResourceRequirement will be evaluated by any CWL-compliant engine. If the ResourceRequirement in the CommandLineTool were under hints instead of requirements, then it would have been overridden by the ResourceRequirement entry under the workflow step's requirements.

If there was only a ResourceRequirement specified in the example workflow and not at the step level nor in the CommandLineTool then it would have been applied on the CommandLineTool. Though I don’t personally recommend that.
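The lookup order described above can be sketched in a few lines of Python. This is only an illustration, not how cwltool or any other engine is implemented: the function name `effective_requirement` and the plain-dict inputs are hypothetical, and the sketch assumes that requirements and hints are written in map form (as in the examples in this thread) and that an entry in requirements at any level wins over an entry in hints at any level.

```python
def effective_requirement(class_name, tool, step, workflow):
    """Return the entry an engine would use for `class_name`, or None.

    `tool`, `step` and `workflow` are plain dicts (hypothetical shape)
    with optional "requirements" and "hints" keys in map form, e.g.
    {"requirements": {"ResourceRequirement": {"coresMin": 2}}}.
    """
    # Assumption: every "requirements" entry outranks every "hints" entry.
    for section in ("requirements", "hints"):
        # Within a section, the most specific level is checked first.
        for level in (tool, step, workflow):
            entry = (level.get(section) or {}).get(class_name)
            if entry is not None:
                return entry
    return None
```

With a ResourceRequirement under the tool's hints and another under the step's requirements, the sketch returns the step's entry, which matches the override case described above.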

Good catch, both of you. Yes, cwltool should complain when a max is less than the min. Can someone open a bug report at https://github.com/common-workflow-language/cwltool/issues/new?template=BLANK_ISSUE ? I can help whoever implements that logic check.
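A minimal sketch of what such a check could look like, written as standalone Python rather than as cwltool code. The function names and the plain-dict input are assumptions made for illustration; only the field names come from the ResourceRequirement definition. Values that are CWL expressions are skipped, since they can only be compared once the inputs are known.

```python
from numbers import Real

# Min/max field pairs of ResourceRequirement in CWL v1.2.
PAIRS = [
    ("coresMin", "coresMax"),
    ("ramMin", "ramMax"),
    ("tmpdirMin", "tmpdirMax"),
    ("outdirMin", "outdirMax"),
]


def is_number(value):
    """True for ints and floats, but not for booleans or strings."""
    return isinstance(value, Real) and not isinstance(value, bool)


def find_min_max_conflicts(resource_req):
    """Return one message per pair whose minimum exceeds its maximum.

    `resource_req` is a plain dict holding the fields of a single
    ResourceRequirement (hypothetical input shape).
    """
    conflicts = []
    for min_key, max_key in PAIRS:
        low = resource_req.get(min_key)
        high = resource_req.get(max_key)
        # Skip missing fields and non-numeric values such as expressions.
        if is_number(low) and is_number(high) and low > high:
            conflicts.append(
                f"{min_key} ({low}) is larger than {max_key} ({high})"
            )
    return conflicts
```

For the first example in this thread (coresMin: 4, coresMax: 2), the function returns a single message about coresMin; a validator could turn any non-empty result into an error before the job starts.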

Hello, thank you both for your responses

I remember seeing a CWL file in the cwltool tests called count-lines1-wf.cwl, which contained a Workflow with a ResourceRequirement; it was used to test sequential workflows. Is it allowed to include a ResourceRequirement in a Workflow only for this purpose?

So the priority in requirements is like this: CommandLineTool > WorkflowStep > Workflow?
And if there are hints, requirements come first, so the priority is: Req[CommandLineTool > WorkflowStep > Workflow] > Hint[CommandLineTool > WorkflowStep > Workflow]?

So, if I understand correctly, we need to specify the ResourceRequirement for each individual step and CommandLineTool that requires it?
If two steps (or CLTs) need the same resource requirements, I should define the requirement in both steps separately (even if the resource values are the same), rather than placing it in the Workflow to apply it to all steps?
For example, instead of doing this:

cwlVersion: v1.2
class: Workflow
label: "Higher coresMin Workflow"
doc: The Workflow ResourceRequirement has a coresMin higher than its coresMax = failure

requirements:
  ResourceRequirement:
    coresMin: 4
    coresMax: 2
inputs: []
outputs: []

steps:
  good_step_1:
    run:
      class: CommandLineTool
      baseCommand: ["echo", "Hello World"]
      inputs: []
      outputs: []
    out: []
    in: []
  good_step_2:
    run:
      class: CommandLineTool
      baseCommand: ["echo", "Hello World"]
      inputs: []
      outputs: []
    out: []
    in: []

We should do something like this (in steps or CLT):

cwlVersion: v1.2
class: Workflow
label: "Higher coresMin Workflow"
doc: The Workflow ResourceRequirement has a coresMin higher than its coresMax = failure

inputs: []
outputs: []

steps:
  good_step_1:
    requirements:
      ResourceRequirement:
        coresMin: 4
        coresMax: 2
    run:
      class: CommandLineTool
      baseCommand: ["echo", "Hello World"]
      inputs: []
      outputs: []
    out: []
    in: []
  good_step_2:
    requirements:
      ResourceRequirement:
        coresMin: 4
        coresMax: 2
    run:
      class: CommandLineTool
      baseCommand: ["echo", "Hello World"]
      inputs: []
      outputs: []
    out: []
    in: []

Yes, it is allowed by the CWL standards to use a CLT-only requirement in a Workflow for the purposes of flowing down / overriding into the steps. Personally I don’t really recommend it.

Correct. Likewise if there was a sub-workflow in there.

From the perspective of portability & re-use it is nice if the CLTs are sufficient on their own. So if you know a particular tool always needs some particular minimum requirements (especially if you can dynamically determine more accurate numbers via the inputs to the CLT), then I recommend putting that ResourceRequirement as an entry in the CLT’s hints.

Hi,

I work with @Stellatsuu.
Thank you very much for your tips and guidance!

Regarding resourceMin greater than resourceMax:

@Stellatsuu opened a GitHub issue ("cwltool doesn't complain when max is lesser than min in ResourceRequirements", issue #2163 in common-workflow-language/cwltool). Thanks for reviewing it; we will soon come up with the associated PR.

Regarding CommandLineTools requirements exceeding Workflow requirements:

I would like to provide additional context if it can help you understand the use case.
We’re working on transitioning our distributed workflow management system (DIRAC) from a custom workflow implementation to CWL. Our system has the following objects:

  • Jobs: Can be either a Workflow or a CLT, executed on a specific single worker node.

    • For single CLT, using one set of requirements for scheduling and execution works well.
    • However, for Workflows with varying requirements at different levels, we face a challenge: we need one set of requirements for scheduling the job on a resource and getting an allocation, and we must ensure the job respects these limits during execution (using cgroups). If CLT-level requirements exceed the “scheduling” requirements, the job can be killed.
  • Transformations: Act as templates to create jobs with similar Workflows/CLTs and requirements, but different inputs.

  • Productions: Large workflows where each step represents a transformation.

(For more details about our system and the planned transition: https://www.epj-conferences.org/articles/epjconf/abs/2025/22/epjconf_chep2025_01074/epjconf_chep2025_01074.html)

Based on your suggestion, I understand we should internally use the maximum requirement values across all CLTs to schedule a given job.
In the example below, we would use the requirements from the first CLT to schedule the job:

cwlVersion: v1.2
class: Workflow

inputs: []
outputs: []

# This is not recommended and should be avoided, right?
requirements:
  ResourceRequirement:
    coresMin: 2
    coresMax: 4

steps:
  good_step_1:
    run:
      class: CommandLineTool
      requirements:
        ResourceRequirement:
          coresMin: 3
          coresMax: 6
      baseCommand: ["echo", "Hello World"]
      inputs: []
      outputs: []
    out: []
    in: []
  good_step_2:
    run:
      class: CommandLineTool
      # No requirements: bad practice! But will inherit from the ones defined at the workflow level IIUC.
      baseCommand: ["echo", "Hello World"]
      inputs: []
      outputs: []
    out: []
    in: []

Note: This approach requires iterating over all step requirements to find the maximum values, which does not seem particularly convenient (but would probably have to be done in any case to validate the Workflow).
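The iteration mentioned in the note could look roughly like the sketch below. It is a simplification under several stated assumptions: the workflow is already loaded as a plain dict, requirements are in map form, every step embeds its CommandLineTool inline under run (no file references or sub-workflows), hints are ignored, values are plain numbers, and steps run one at a time so the largest value per field is enough. The name `scheduling_envelope` is hypothetical.

```python
def scheduling_envelope(workflow):
    """Return the largest numeric value per ResourceRequirement field
    across the effective requirements of all steps.

    Assumes inline tools, map-form requirements and no hints.
    """
    wf_req = (workflow.get("requirements") or {}).get("ResourceRequirement")
    envelope = {}
    for step in workflow.get("steps", {}).values():
        tool = step.get("run", {})
        # Most specific level first: tool, then step, then workflow.
        candidates = (
            (tool.get("requirements") or {}).get("ResourceRequirement"),
            (step.get("requirements") or {}).get("ResourceRequirement"),
            wf_req,
        )
        effective = next((c for c in candidates if c is not None), {})
        for key, value in effective.items():
            # Skip expressions and other non-numeric values.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                envelope[key] = max(envelope.get(key, value), value)
    return envelope
```

Applied to the workflow above, the first step contributes its own values (3 and 6) and the second step falls back to the workflow-level values (2 and 4), so the result is coresMin 3 and coresMax 6, i.e. the requirements of the first CLT.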

Again, thank you very much for your support!