Cwl-runner - tmpdir not deleted

Hello!

I’m using cwl-runner 3.1.20210628163208 and I’m running a CWL Workflow with several steps.
The execution is successful, but the tmp dirs are not deleted. I’ve tried passing --rm-tmpdir explicitly, even though it’s the default behaviour, but no luck: the tmp dirs are still there.

Am I missing something obvious, or did I run into a bug?

I seem to be having the same issue. So it sounds like a bug, but I haven’t really looked into it yet.

Hello!

Based upon the version number, I assume that you’ve got the CWL reference runner (cwltool) installed as the default cwl-runner. Did a previous version of cwltool correctly delete the temporary directories?

Hello @mrc
Your assumptions are correct. I can’t tell whether it’s a regression with respect to a previous version of cwltool.
If needed, I can try a few versions to verify this.

Hello @mrc, as mentioned by @bartn, a workflow that has this problem can be found at the workflow hub:

https://workflowhub.eu/workflows/154

Thanks @jjkoehorst ; the RO-Crate seems to be incomplete:

$ cwltool workflow_ngtax_picrust2.cwl 
INFO /home/michael/cwltool/env3.9/bin/cwltool 3.1.20210928171851
INFO Resolved 'workflow_ngtax_picrust2.cwl' to 'file:///home/michael/cwltool/discourse_401/workflow_ngtax_picrust2.cwl'
Cache entry deserialization failed, entry ignored
ERROR Tool definition failed validation:
workflow_ngtax_picrust2.cwl:72:1:  checking field `steps`
workflow_ngtax_picrust2.cwl:74:3:    checking object `workflow_ngtax_picrust2.cwl#fastqc`
workflow_ngtax_picrust2.cwl:75:5:      Field `run` contains undefined reference to
                                       `file:///home/michael/cwltool/fastqc/fastqc.cwl`

It probably should have been packed with cwltool --pack first?

Do you have test data?

I’m getting issues running from your repo: cwltool cwl/cwl/workflows/workflow_ngtax_picrust2.cwl cwl/tests/ngtax/NGTAX_Silva138.1_100.yaml

It seems you run cwltool from within your CWL workflow?

Okay, the issue with the RO-Crate is being looked at by the workflowhub.eu developers (a fix is demonstrated at Quality assessment, amplicon classification and functional prediction)

@mrc I am not sure it is a workflow file issue. The issue I was raising is that, no matter the workflow, the temp directory is not cleared when the run is finished. This causes temporary files to accumulate on the system.

Yes, that’s the behavior we also have.

Had a break for lunch, so I decided to take a look at this issue.

I think I found the issue; I just need to figure out how to write a test. I used the hello workflow example from the user guide.

Running with a debugger: cwltool --tmpdir-prefix /tmp/cwl/ /tmp/1st-tool.cwl /tmp/echo-job.yml

I had created /tmp/cwl/ and copied the user guide examples to /tmp. I set two breakpoints, one at each place where temporary directories are removed: in executors.py and in job.py.
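For anyone who wants to check for leftovers without a debugger, here is a minimal sketch that compares the contents of a directory before and after a run. The helper name leftovers_after is made up for illustration and is not part of cwltool; it assumes the temporary directories are created inside the directory being watched, as with the --tmpdir-prefix /tmp/cwl/ setup above.

```python
import os
from typing import Callable, List


def leftovers_after(directory: str, run: Callable[[], object]) -> List[str]:
    """Return entries that appeared in `directory` during `run` and remain afterwards.

    `leftovers_after` is a hypothetical helper, not a cwltool function.
    """
    before = set(os.listdir(directory))
    run()  # e.g. a function that launches the workflow
    after = set(os.listdir(directory))
    return sorted(after - before)


# Example usage (assumes cwltool is installed and the files exist):
#
#   import subprocess
#   cmd = ["cwltool", "--tmpdir-prefix", "/tmp/cwl/",
#          "/tmp/1st-tool.cwl", "/tmp/echo-job.yml"]
#   print(leftovers_after("/tmp/cwl", lambda: subprocess.run(cmd, check=True)))
```

An empty list means everything created during the run was cleaned up; any names in the list are directories or files that were left behind.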

In executors.py, there’s a line:

if runtime_context.cachedir is None:

The issue arises when cachedir is "" (an empty string), which was my case when running the hello workflow. Since "" is not None, the condition is false, the body of the if statement is skipped, and the temporary directory is never removed.
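To make the difference concrete, here is a small standalone sketch. It is not the cwltool code: the function names cleanup_strict and cleanup_falsy are made up for illustration, and the second variant is only one possible way to handle the empty string, which may differ from the actual change in cwltool.

```python
import os
import shutil
import tempfile
from typing import Dict, Optional


def cleanup_strict(tmpdir: str, cachedir: Optional[str]) -> None:
    """Remove tmpdir only when cachedir is exactly None (the check quoted above)."""
    if cachedir is None:
        shutil.rmtree(tmpdir, ignore_errors=True)


def cleanup_falsy(tmpdir: str, cachedir: Optional[str]) -> None:
    """Hypothetical variant: treat None and "" the same way."""
    if not cachedir:
        shutil.rmtree(tmpdir, ignore_errors=True)


def demo() -> Dict[str, bool]:
    """For each variant, report whether the tmpdir is left behind when cachedir is ""."""
    left_behind = {}
    for name, cleanup in (("strict", cleanup_strict), ("falsy", cleanup_falsy)):
        tmpdir = tempfile.mkdtemp()
        cleanup(tmpdir, "")  # empty string, as in the hello workflow run
        left_behind[name] = os.path.exists(tmpdir)
        shutil.rmtree(tmpdir, ignore_errors=True)  # tidy up after the demo
    return left_behind
```

With an empty-string cachedir, the strict `is None` check leaves the directory in place, while the falsy check removes it.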

If anyone would like to patch cwltool locally and try the fix, it’s a one-line change: Remove temporary directories when cache dir is empty string by kinow · Pull Request #1541 · common-workflow-language/cwltool · GitHub

Bruno
