I’m using a step to git clone a Git repository into a specific directory:
cwlVersion: v1.0
class: CommandLineTool
doc: Git cloner
requirements:
  DockerRequirement:
    dockerPull: alpine/git:latest
  InlineJavascriptRequirement: {}
baseCommand: [git, clone]
inputs:
  url:
    doc: URL of repository to clone
    type: string
    inputBinding:
      position: 1
  output_dir:
    doc: Name of directory to clone into
    type: string?  # optional, because the glob expression below handles null
    default: null
    inputBinding:
      position: 2
outputs:
  cloned:
    type: Directory
    outputBinding:
      glob: |
        ${
          // No output_dir given: use the last URL segment minus its 4-character suffix (".git")
          if (inputs.output_dir == null){
            return inputs.url.split('/').slice(-1)[0].slice(0, -4);
          } else {
            return inputs.output_dir;
          }
        }
  stderr: stderr
stderr: git_clone-stderr.log
Judging by the cached output (--cachedir CACHE), this works fine in the /tmp directory: for example, with output_dir: /data/git/repo in the job YML, the tool runs git clone <REPO URL> /data/git/repo and cwltool creates .CACHE/<JOB ID>/data/git/repo.
However, when the workflow is run from, say, /home/stephan/workflow, I only get /home/stephan/workflow/repo. What I want to get is /home/stephan/workflow/data/git/repo instead. How can I do this?
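For reference, a job file of the kind described above might look roughly like this. The repository URL is a placeholder, not the one actually used; only the output_dir value comes from the description.

```yaml
# Hypothetical job file for the tool above
url: https://example.org/user/repo.git   # placeholder URL (assumption)
output_dir: /data/git/repo               # value quoted in the description
```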
(Sidenote, not sure if it matters, I’m using --singularity with apptainer 1.1.3.)
If you simplify the outputBinding to be just glob: * then you will get the entire directory structure, and not just the named (sub)-directory you have now.
In CWL, Directories have a name and a listing of files and sub-directories; we don’t preserve the name of the parent directories.
Tool definition file:///home/stephan/src/tools/generic/git_shallow_clone.cwl failed validation:
tools/generic/git_shallow_clone.cwl:31:13: while scanning an alias
tools/generic/git_shallow_clone.cwl:31:14: expected alphabetic or numeric character, but found '\n'
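This validation error is most likely a YAML parsing issue rather than a CWL one: an unquoted * begins a YAML alias, so the parser expects an alias name after it and finds the end of the line. Quoting the pattern avoids that. The following is only a sketch of how the outputs section might then look; the array type is an assumption, made because "*" can match more than one entry (for example the cloned tree's top-level directory and the stderr log file).

```yaml
outputs:
  cloned:
    # Assumption: "*" may match several files and directories,
    # so the output is declared as an array of File or Directory.
    type:
      type: array
      items: [File, Directory]
    outputBinding:
      glob: "*"   # quoted, because a bare * starts a YAML alias
```

If only the directory tree is wanted, a narrower quoted pattern that matches just its top-level directory would keep the output a single Directory.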