We are looking for a way to run CWL workflows with conda dependencies on an HPC.
The HPC has no internet access, but local mirrors of bioconda and conda-forge are available, aliases of which can be specified in .condarc.
One can manually install a conda package (e.g. fastqc) on the HPC with conda install -c http://<address/local/mirror>/bioconda --override-channels fastqc.
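For several tools, the same command could be assembled programmatically. The following is only a sketch: `conda_install_argv` is a hypothetical helper, `http://mirror.example/bioconda` is a placeholder for the real mirror address, and `--yes` is added here only to skip conda's confirmation prompt.

```python
from typing import List


def conda_install_argv(package: str, channels: List[str]) -> List[str]:
    """Build a `conda install` argument list restricted to the given channels.

    Hypothetical helper; the channel URLs are whatever local mirrors exist.
    """
    argv = ["conda", "install", "--yes", "--override-channels"]
    for channel in channels:
        # Each mirror is passed as its own -c option, in priority order.
        argv += ["-c", channel]
    argv.append(package)
    return argv


if __name__ == "__main__":
    # Placeholder mirror address; substitute the real local mirror.
    print(" ".join(conda_install_argv("fastqc", ["http://mirror.example/bioconda"])))
```

The resulting list can be handed to `subprocess.run` on the HPC login node; because `--override-channels` is included, conda should ignore any channels configured in .condarc and use only the mirrors passed with `-c`.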
Now I would like to avoid installing each tool manually and to keep the CWL document as portable as possible.
The following minimal example works well on my own machine, with cwltool --beta-conda-dependencies workflow.cwl.
Hello @Brilator! It isn’t exactly what you asked for, but did you try running cwltool --beta-conda-dependencies --beta-dependencies-directory path/to/local/conda_env pointing to a local conda environment that already has the needed programs installed?
I think it should be enough to add conda_ensure_channels: ["channel1_URL", "channel2_NAME", ...] to the app_config dict in cwltool/software_requirements.py:DependenciesConfiguration.build_job_script() .
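To make the suggestion concrete, here is a minimal sketch of what the extended dictionary might look like. It does not import cwltool: `build_app_config` is a hypothetical stand-in for the dict built inside `build_job_script()`, the `conda_auto_install` and `conda_auto_init` keys are an assumption about what is already there, and the channel entries are placeholders. Whether the dependency resolver accepts a list here (rather than, say, a comma-separated string) is untested.

```python
from typing import Any, Dict, List


def build_app_config(debug: bool, ensure_channels: List[str]) -> Dict[str, Any]:
    """Hypothetical stand-in for the app_config dict in build_job_script()."""
    return {
        # Assumed existing keys (the conda_auto_* settings).
        "conda_auto_install": True,
        "conda_auto_init": True,
        "debug": debug,
        # The proposed addition: channels given as full URLs or as names.
        "conda_ensure_channels": list(ensure_channels),
    }


if __name__ == "__main__":
    # Placeholder channel entries, as in the suggestion above.
    print(build_app_config(False, ["channel1_URL", "channel2_NAME"]))
```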
Thanks for your quick responses!
This would also help as a first step to ship the data and analysis as a whole.
Unfortunately, I did not get it to run with the conda environment created using the above command on macOS (the differing OS might also be an issue?).
I’ve tried
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x2ae2eaaf0bc0>, 'Connection to github.com timed out. (connect timeout=30)'))
WARNING Conda installation requested and failed.
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x2ace5bc988e0>, 'Connection to github.com timed out. (connect timeout=30)'))
WARNING Conda installation requested and failed.
ERROR Failed to install conda
Setting the conda_auto_* options to false obviously prevents the installation request.
I also tried adding the path to a local conda executable (I deduced from the Planemo docs, "Commands — Planemo 0.75.26 documentation", that this might be an option).
And I added the full URLs to the local conda channels.
INFO cwltool 0.1.dev4672+g8f00f73
INFO Resolved 'workflow.cwl' to '<...>/workflow.cwl'
INFO [job workflow.cwl] /var/tmp/pbs.13533619.hpc-batch/lpchmx7o$ which \
fastqc
which: no fastqc in <printing my $PATH>
WARNING [job workflow.cwl] exited with status: 1
WARNING [job workflow.cwl] completed permanentFail
{}
WARNING Final process status is permanentFail
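The "which: no fastqc" line in the log above indicates that the tool was not on the job's PATH. One quick way to check whether a given conda environment contains the executable at all is sketched below; `find_tool` is a hypothetical helper, and the environment prefix is whatever directory was passed to cwltool or created by conda.

```python
import os
import shutil
from typing import Optional


def find_tool(tool: str, env_prefix: str) -> Optional[str]:
    """Return the path of `tool` in the environment's bin/ directory, or None."""
    bin_dir = os.path.join(env_prefix, "bin")
    # Search only this directory instead of the current PATH.
    return shutil.which(tool, path=bin_dir)
```

If this returns a path while the job still fails, the environment exists but is presumably not being activated for the job; if it returns None, the package was never installed into that prefix.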
By the way, should I be worried about the dev version? cwltool 0.1.dev4672+g8f00f73