Network Connectivity Issues in CWL Environment

mbsabath · June 2, 2021, 4:15pm

I’m working on a CWL workflow running on a Linux VM, where one step makes a very large number of API calls. The workflow completes successfully on my laptop, and the code also works when I run it on the VM outside of a cwl-runner call. But when it runs in the CWL environment, we start seeing network connectivity errors after a short period of time.

My guess is that there might be a configuration issue somewhere, but I’m not really sure where to start. Have people run into this before, or does anyone have suggestions?

tetron · June 2, 2021, 8:33pm

Hi @mbsabath, did you include the NetworkAccess requirement in your workflow?
https://www.commonwl.org/v1.2/CommandLineTool.html#NetworkAccess

mbsabath · June 2, 2021, 8:43pm

Hi @tetron, thanks for pointing that out, I didn’t include that in my workflow. Do you know of somewhere with a good example of what including that requirement looks like?

Edit: Never mind, figured it out (I think). Went with

requirements:
  NetworkAccess:
    networkAccess: true
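
For context, here is a minimal sketch of where that block sits in a complete tool description. The curl command, the url input, and the response.txt file name are assumptions made for illustration; they are not from the workflow discussed in this thread. Note that the field name is lowercase `requirements`.

```yaml
cwlVersion: v1.2
class: CommandLineTool

# Hypothetical command: fetch a URL so the step actually needs the network.
baseCommand: [curl, --silent, --fail]

requirements:
  # Declares that this tool needs outbound network access.
  NetworkAccess:
    networkAccess: true

inputs:
  url:
    type: string
    inputBinding:
      position: 1

# Capture whatever the command prints to standard output.
stdout: response.txt

outputs:
  response:
    type: stdout
```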

mbsabath · June 2, 2021, 9:14pm

Unfortunately no luck with that, I may need to dig a bit deeper.

tetron · June 3, 2021, 2:48pm

Which runner are you using?

Does it work for a little bit, and then start failing, or not work at all?

Can you share your workflow?

mbsabath · June 3, 2021, 5:10pm

I’m just using the reference runner. After checking debug logs, it looks like no HTTP connections are ever established. The system we’re on uses an HTTP proxy, which I can see possibly causing issues. I can definitely share my workflow, although what’s being run is fairly abstracted and lives in a package we haven’t released publicly yet.

tetron · June 3, 2021, 5:20pm

How is the proxy configured? Perhaps you have to pass through some environment variables?

mbsabath · June 3, 2021, 6:42pm

Yup, that’s the exact issue: I needed to define an HTTP_PROXY environment variable. I’d like those variables to be defined only when an http_proxy input is provided. Would a JavaScript expression be the best way to go about that?

tetron · June 3, 2021, 6:48pm

Yes. Although I don’t remember exactly the behavior you get if an environment variable is an empty string or null. That might be a gap in the spec.

mbsabath · June 3, 2021, 6:56pm

I’ve been getting errors with a null variable, but setting a default of the empty string resolves the issue.
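
A minimal sketch of the approach described above, assuming the proxy URL is passed in as a string input named http_proxy (the name used earlier in this thread). The script name call_api.py and the extra HTTPS_PROXY entry are assumptions for illustration. A plain parameter reference is enough here, so no InlineJavascriptRequirement is needed. The empty-string default avoids the null-value errors mentioned above, although how a given program treats an empty HTTP_PROXY value depends on that program.

```yaml
cwlVersion: v1.2
class: CommandLineTool

# Hypothetical script that makes the API calls.
baseCommand: [python, call_api.py]

requirements:
  NetworkAccess:
    networkAccess: true
  # Set the proxy variables inside the tool's environment from the input.
  EnvVarRequirement:
    envDef:
      HTTP_PROXY: $(inputs.http_proxy)
      HTTPS_PROXY: $(inputs.http_proxy)   # assumption: same proxy for HTTPS

inputs:
  http_proxy:
    type: string
    default: ""   # empty string rather than null when no proxy is given

outputs: []
```

With this layout, an input file containing a line such as http_proxy: "http://proxy.example.org:3128" (a placeholder address) would set both variables for the step, and omitting the input would leave them as empty strings.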

mrc · June 4, 2021, 10:20am

Another option is to run cwltool --preserve-environment HTTP_PROXY my_workflow.cwl my_inputs.yml, which passes the variable through from the calling environment, instead of adding it to the tool description.

HrishiDhondge · June 2, 2023, 9:04am

Hello @tetron,
Is there any example of this configuration?
Would it also be useful if I’m using CWL as a wrapper around the Python script?