I’m working on a CWL workflow running on a Linux VM, where one step makes a very large number of API calls. The workflow completes successfully on my laptop, and the code also works when I run it on the VM outside of a cwl-runner call, but when it runs under CWL we start seeing network connectivity errors after a short time.
My guess is that there might be a configuration issue somewhere, but I’m not really sure where to start. Has anyone run into this before, or does anyone have suggestions?
Hi @tetron, thanks for pointing that out, I didn’t include that in my workflow. Do you know of somewhere with a good example of what including that requirement looks like?
Edit: Nevermind, figured it out (I think). Went with
Unfortunately no luck with that, I may need to dig a bit deeper.
Which runner are you using?
Does it work for a little bit, and then start failing, or not work at all?
Can you share your workflow?
I’m just using the reference runner. After checking debug logs, it looks like no HTTP connections are ever established. The system we’re on uses an HTTP proxy, which I can see possibly causing issues. I can definitely share my workflow, although what’s being run is fairly abstracted and lives in a package we haven’t released publicly yet.
How is the proxy configured? Perhaps you have to pass through some environment variables?
Yes, although I don’t remember exactly what behavior you get if an environment variable is an empty string or null. That might be a gap in the spec.
I’ve been getting errors with a null variable, but setting a default of the empty string resolves the issue.
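For anyone landing here later, this is a rough sketch of what passing the proxy settings through the tool description might look like. It is not the poster’s actual tool: the input names `http_proxy` and `https_proxy`, the output name `environment`, and the file name `env.txt` are made up for illustration, and `baseCommand: env` is only there so the run prints the environment the tool actually sees. The `NetworkAccess` entry assumes CWL v1.1 or later, where a tool has to request network access explicitly.

```yaml
cwlVersion: v1.2
class: CommandLineTool

# Diagnostic stand-in: prints the environment visible inside the tool.
# Replace with the real command once the proxy variables show up.
baseCommand: env

requirements:
  # Assumption: the tool needs outbound network access (CWL v1.1+).
  NetworkAccess:
    networkAccess: true
  # Forward the proxy settings from inputs into the tool environment.
  EnvVarRequirement:
    envDef:
      HTTP_PROXY: $(inputs.http_proxy)
      HTTPS_PROXY: $(inputs.https_proxy)

inputs:
  # Hypothetical input names. The empty-string default avoids the
  # null-value error mentioned above when no proxy is supplied.
  http_proxy:
    type: string
    default: ""
  https_proxy:
    type: string
    default: ""

# Capture the printed environment so it can be inspected after the run.
stdout: env.txt
outputs:
  environment:
    type: stdout
```

If `HTTP_PROXY` appears in `env.txt` with the expected value, the same `requirements` and `inputs` entries should be reusable on the real tool. Some libraries read the lowercase `http_proxy` / `https_proxy` names instead, so it may be worth setting both.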
Another option is to use `cwltool --preserve-environment HTTP_PROXY my_workflow.cwl my_inputs.yml` instead of adding to the tool description.
Is there any example of this configuration?
Would it also be useful if I’m using CWL as a wrapper around the Python script?