Are cwltool.main text arguments a hard requirement for provenance?

It seems that the text of the command-line arguments is always required when requesting provenance, even when providing preparsed arguments, and even though setup_provenance will happily skip them if absent. Why is this a hard requirement?

One can invoke cwltool.main with two optional parameters: args for preparsed arguments, and argsl for command-line arguments still to be parsed. Both are Optional and, if empty, are assigned from sys.argv. However, when provenance is requested, cwltool.main insists that argsl not be empty.

Concretely, this means that if we invoke cwltool only with preparsed arguments, we must also provide an unparsed list of text arguments if we want to request provenance.
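To make this concrete, here is a minimal self-contained sketch of the behavior as I understand it. It is a simplified stand-in, not cwltool's actual source: `toy_parser` and `toy_main` are hypothetical names that only mimic the handling of `argsl` and `args` described above.

```python
import argparse
import sys
from typing import List, Optional


def toy_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for cwltool's argument parser.
    parser = argparse.ArgumentParser(prog="toy-cwltool")
    parser.add_argument("--provenance", default=None)
    parser.add_argument("workflow", nargs="?")
    return parser


def toy_main(
    argsl: Optional[List[str]] = None,
    args: Optional[argparse.Namespace] = None,
) -> int:
    # Hypothetical model of the behavior described in this post.
    if args is None:
        if argsl is None:
            argsl = sys.argv[1:]
        args = toy_parser().parse_args(argsl)
    if args.provenance:
        # The check in question: text arguments are demanded even
        # though preparsed arguments were supplied.
        if argsl is None:
            raise ValueError("argsl cannot be None")
    return 0


preparsed = toy_parser().parse_args(["--provenance", "prov", "wf.cwl"])

# Preparsed arguments alone are rejected once provenance is requested.
try:
    toy_main(args=preparsed)
    outcome = "ok"
except ValueError as err:
    outcome = f"rejected: {err}"

# Workaround: also pass the unparsed text arguments.
workaround = toy_main(argsl=["--provenance", "prov", "wf.cwl"], args=preparsed)
```

In this sketch the first call is rejected, and the second succeeds only because the redundant text list is passed alongside the preparsed namespace.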

Note that the only use of this list is in cwltool.setup_provenance, where it is logged in the provenance if nonempty and ignored otherwise. Either way, the parsed arguments are then logged in the provenance.
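A rough sketch of that logging behavior, again under assumptions: `toy_setup_provenance` is a hypothetical function that returns the lines it would record, and it does not reproduce cwltool's real provenance output format.

```python
import argparse
from typing import List, Optional


def toy_setup_provenance(
    args: argparse.Namespace, argsl: Optional[List[str]]
) -> List[str]:
    # Hypothetical stand-in: collects what would go into the provenance log.
    log: List[str] = []
    if argsl:
        # Text arguments are recorded only when present.
        log.append("command line: " + " ".join(argsl))
    # Parsed arguments are recorded regardless.
    for key, value in sorted(vars(args).items()):
        log.append(f"{key}={value}")
    return log


parsed = argparse.Namespace(provenance="prov", workflow="wf.cwl")
with_text = toy_setup_provenance(parsed, ["--provenance", "prov", "wf.cwl"])
without_text = toy_setup_provenance(parsed, None)
```

The only difference between the two results is the extra command-line entry, which is why the text arguments look optional rather than essential.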

In case providing text arguments isn’t really a requirement, I went ahead and opened issue cwltool#1963, with a very simple test case.

As I mentioned there, the current behavior of cwltool prevents requesting provenance in Calrissian (CWL on Kubernetes).

The conclusion was that providing the text of the arguments isn’t a requirement for provenance. Issue cwltool#1963 was resolved by pull request cwltool#1964 (merged, thank you @mrc).

We can now obtain a provenance RO-Crate when running CWL with Calrissian, albeit with a patched version until the next cwltool release.
