Network Connectivity Issues in CWL Environment

I’m working on a CWL workflow running on a Linux VM, where one step involves making a fairly massive number of API calls. The workflow completes successfully on my laptop, and the code also works on the VM when I run it outside of a cwl-runner call, but when it is run in the CWL environment we start seeing network connectivity errors after a short period of time.

My guess is that there might be a configuration issue somewhere, but I’m not really sure where to start. Have people run into this before, or does anyone have any suggestions?

Hi @mbsabath, did you include the NetworkAccess requirement in your workflow?
https://www.commonwl.org/v1.2/CommandLineTool.html#NetworkAccess

Hi @tetron, thanks for pointing that out, I didn’t include that in my workflow. Do you know of somewhere with a good example of what including that requirement looks like?

Edit: Never mind, figured it out (I think). Went with

requirements:
  NetworkAccess:
    networkAccess: true
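
For anyone else looking for an example, here is a minimal sketch of where that block sits in a complete tool description. The tool itself (a `curl` call taking a `url` input) is made up for illustration and is not from the workflow discussed here; only the `NetworkAccess` block follows the spec linked above. CWL field names are case-sensitive, so the top-level key is lowercase `requirements`.

```yaml
cwlVersion: v1.2
class: CommandLineTool
# Hypothetical command, used only to have something that needs the network.
baseCommand: [curl, --silent, --fail]
requirements:
  # Declares that this tool needs outgoing network access.
  NetworkAccess:
    networkAccess: true
inputs:
  url:
    type: string
    inputBinding:
      position: 1
outputs:
  response:
    type: stdout
stdout: response.txt
```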

Unfortunately no luck with that, I may need to dig a bit deeper.

Which runner are you using?

Does it work for a little bit, and then start failing, or not work at all?

Can you share your workflow?

I’m just using the reference runner. After checking debug logs, it looks like no HTTP connections are ever established. The system we’re on is using an HTTP proxy, which I can see possibly causing issues? I can definitely share my workflow, although what’s being run is fairly abstracted and in a package we haven’t released publicly yet.

How is the proxy configured? Perhaps you have to pass through some environment variables?

Yup, that’s the exact issue: I needed to define an HTTP_PROXY environment variable. I’d like to set up those variables to be defined only when an http_proxy input is defined. Would a JavaScript expression be the best way to go about that?

Yes. Although I don’t remember exactly the behavior you get if an environment variable is an empty string or null. That might be a gap in the spec.

I’ve been getting errors with a null variable, but setting a default of the empty string resolves the issue.
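
A rough sketch of what that can look like, assuming an optional `http_proxy` input with an empty-string default and `EnvVarRequirement` to pass it through. The command name `my_api_client` is a placeholder, and setting `HTTPS_PROXY` as well is an assumption that may not apply to every setup. Plain parameter references are enough here; a full `${...}` JavaScript expression would additionally need `InlineJavascriptRequirement`.

```yaml
cwlVersion: v1.2
class: CommandLineTool
# Placeholder executable, assumed to be on PATH; substitute your own command.
baseCommand: my_api_client
requirements:
  NetworkAccess:
    networkAccess: true
  EnvVarRequirement:
    envDef:
      # Parameter references; these evaluate to "" when http_proxy is not supplied.
      HTTP_PROXY: $(inputs.http_proxy)
      HTTPS_PROXY: $(inputs.http_proxy)
inputs:
  http_proxy:
    type: string
    default: ""
outputs: []
```

Note that this sets the variables to an empty string rather than leaving them unset when no proxy is given, and how a given HTTP client treats an empty HTTP_PROXY depends on the client.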

Another option is to use cwltool --preserve-environment HTTP_PROXY my_workflow.cwl my_inputs.yml instead of adding to the tool description.

Hello @tetron,
Is there any example of this configuration?
Would it also be useful if I’m using CWL as a wrapper around the Python script?