Generation of RO-Crate failed

Hello,

I am trying to create an RO-Crate for my workflow, following this talk. According to the talk, I can generate the RO-Crate from the provenance information with the runcrate tool.

I installed runcrate as per the instructions.

Now to the actual problem: when I run my CWL workflow with provenance capture enabled, I get the following warning:
research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7ff8e3e73dc0>

Then, when I tried to generate the RO-Crate from the saved provenance using runcrate, I got the following error:
Entity wf:main not found in Provenance<urn:uuid:669d5067-67a4-47f6-8744-29a1e9d93690 from /path/to/Provenance/metadata/provenance/primary.cwlprov.xml>

I am not sure how to address this. The command I used to generate the RO-Crate from the provenance information was:
runcrate convert Test_latest/Prov1/ -o Test_lates/ROcrate
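One way to narrow down the "Entity wf:main not found" error might be to check whether that identifier appears in the provenance file at all. The following is a minimal sketch using only the Python standard library. The function name `find_identifier` is hypothetical, and a plain text search is an assumption: it ignores XML namespaces and may not match how runcrate actually resolves entities.

```python
def find_identifier(prov_file, identifier="wf:main"):
    """Return the 1-based line numbers of prov_file that mention identifier."""
    hits = []
    # errors="replace" keeps the scan going even if the file has odd bytes.
    with open(prov_file, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if identifier in line:
                hits.append(number)
    return hits


if __name__ == "__main__":
    # Placeholder path copied from the error message; adjust to the real location.
    path = "/path/to/Provenance/metadata/provenance/primary.cwlprov.xml"
    try:
        print(find_identifier(path))
    except FileNotFoundError:
        print(f"No such file: {path}")
```

An empty list would suggest that the workflow entity was never written to the provenance file, which would point at the provenance capture step rather than at the conversion.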

I also tried running runcrate report Test_lates/Prov1/, which fails with the following error:
raise ValueError(f"Not a valid RO-Crate: missing {Metadata.BASENAME}")
ValueError: Not a valid RO-Crate: missing ro-crate-metadata.json
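The message suggests that runcrate report expects a directory containing ro-crate-metadata.json, which the folder written by cwltool --provenance presumably does not have. A rough sketch for telling the two layouts apart is shown below. The function name `describe_directory` and the returned labels are hypothetical, and the file names are taken only from the two error messages above.

```python
from pathlib import Path


def describe_directory(path):
    """Guess whether path looks like an RO-Crate or a cwltool provenance folder."""
    root = Path(path)
    # The report error names this file as the marker of a valid RO-Crate.
    if (root / "ro-crate-metadata.json").is_file():
        return "ro-crate"
    # The convert error shows provenance stored under metadata/provenance/.
    prov_dir = root / "metadata" / "provenance"
    if prov_dir.is_dir() and any(prov_dir.glob("primary.cwlprov*")):
        return "cwlprov"
    return "unknown"


if __name__ == "__main__":
    # Directory name as used in the report command above.
    print(describe_directory("Test_lates/Prov1/"))
```

If the provenance folder is reported as "cwlprov", the report command would likely need to be pointed at the output of a successful runcrate convert instead.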

How can I solve this issue and generate the RO-Crate successfully?
Also, is it good practice to add either the provenance information or the RO-Crate to WorkflowHub along with the code?

Is your workflow public? If not, can you post the entire log?

What is your cwltool version?
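If it helps, the installed versions can be read from the active Python environment with the standard library. This is only a sketch: the helper name `installed_version` is hypothetical, and it assumes both tools were installed as the distributions "cwltool" and "runcrate" in the environment being queried.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("cwltool", "runcrate"):
        print(name, installed_version(name))
```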

Yes, the workflow is public and registered on WorkflowHub here.
I am using the latest version of cwltool==3.1.20230601100705.
Below is the entire log:

Entire Log
[2023-06-19 13:33:13] INFO /users/hdhondge/miniconda3/envs/CroMaSt/bin/cwltool 3.1.20230601100705
[2023-06-19 13:33:13] INFO [cwltool] /users/hdhondge/miniconda3/envs/CroMaSt/bin/cwltool --parallel --timestamps --provenance Run_prov --leave-tmpdir --outdir=/data2/hdhondge/CroMaSt/Results2/ CroMaSt.cwl Results1/new_param.yml
[2023-06-19 13:33:13] INFO Resolved 'CroMaSt.cwl' to 'file:///local/data2/hdhondge/CroMaSt/CroMaSt.cwl'
Tools/align_compute_avg.cwl:44:7: Warning: Field 'location' contains undefined reference to
                                  'file:///local/data2/hdhondge/CroMaSt/Tools/split_PDB'
[2023-06-19 13:33:29] WARNING Workflow checker warning:
Tools/resmapping_cath_instances_subwf.cwl:89:12: Source 'resmapped_domains' of type ["null",
                                                 "File"] may be incompatible
Tools/resmapping_cath_instances_subwf.cwl:104:5:   with sink 'cath_domain_posi_file' of type
                                                   "File"
Tools/resmapping_cath_instances_subwf.cwl:103:5:   Source is from conditional step and may
                                                   produce `null`
Tools/resmapping_cath_instances_subwf.cwl:89:12: Source 'resmapped_domains' of type ["null",
                                                 "File"] may be incompatible
Tools/resmapping_cath_instances_subwf.cwl:104:5:   with sink 'cath_domain_posi_file' of type
                                                   "File"
Tools/align_compute_avg.cwl:44:7: Warning: Field 'location' contains undefined reference to
                                  'file:///local/data2/hdhondge/CroMaSt/Tools/split_PDB'
Tools/align_compute_avg.cwl:44:7: Warning: Field 'location' contains undefined reference to
                                  'file:///local/data2/hdhondge/CroMaSt/Tools/split_PDB'
Tools/align_compute_avg.cwl:44:7: Warning: Field 'location' contains undefined reference to
                                  'file:///local/data2/hdhondge/CroMaSt/Tools/split_PDB'
Tools/align_compute_avg.cwl:44:7: Warning: Field 'location' contains undefined reference to
                                  'file:///local/data2/hdhondge/CroMaSt/Tools/split_PDB'
Tools/align_compute_avg.cwl:44:7: Warning: Field 'location' contains undefined reference to
                                  'file:///local/data2/hdhondge/CroMaSt/Tools/split_PDB'
Tools/align_compute_avg.cwl:44:7: Warning: Field 'location' contains undefined reference to
                                  'file:///local/data2/hdhondge/CroMaSt/Tools/split_PDB'
Tools/align_compute_avg.cwl:44:7: Warning: Field 'location' contains undefined reference to
                                  'file:///local/data2/hdhondge/CroMaSt/Tools/split_PDB'
Tools/align_compute_avg.cwl:44:7: Warning: Field 'location' contains undefined reference to
                                  'file:///local/data2/hdhondge/CroMaSt/Tools/split_PDB'
[2023-06-19 13:33:45] WARNING Workflow checker warning:
CroMaSt.cwl:226:12: Source 'averaged_structs' of type ["null", "File"] may be incompatible
CroMaSt.cwl:325:7:    with sink 'core_struct' of type "File"
                      source has linkMerge method merge_nested
                      Source is from conditional step, but pickValue is not used
CroMaSt.cwl:226:12: Source 'averaged_structs' of type ["null", "File"] may be incompatible
CroMaSt.cwl:379:7:    with sink 'core_domain_struct' of type ["File", {"type": "array", "items":
                      "File"}]
                      source has linkMerge method merge_nested
                      Source is from conditional step, but pickValue is not used
CroMaSt.cwl:287:55: Source 'cath_crossmap_passed' of type ["null", "File"] may be incompatible
CroMaSt.cwl:389:7:    with sink 'crossmap_cath' of type "File"
CroMaSt.cwl:287:11: Source 'pfam_crossmap_passed' of type ["null", "File"] may be incompatible
CroMaSt.cwl:388:7:    with sink 'crossmap_pfam' of type "File"
CroMaSt.cwl:226:12: Source 'averaged_structs' of type ["null", "File"] may be incompatible
CroMaSt.cwl:299:7:    with sink 'core_struct' of type "File"
                      source has linkMerge method merge_nested
                      Source is from conditional step, but pickValue is not used
CroMaSt.cwl:226:12: Source 'averaged_structs' of type ["null", "File"] may be incompatible
CroMaSt.cwl:261:7:    with sink 'core_avg' of type [{"type": "array", "items": "File"}, "File"]
                      source has linkMerge method merge_nested
                      Source is from conditional step, but pickValue is not used
CroMaSt.cwl:148:25: Source 'cath_structs' of type ["null", "File"] may be incompatible
CroMaSt.cwl:159:7:    with sink 'resmapped_cath' of type "File"
                      source has linkMerge method merge_nested
                      Source is from conditional step, but pickValue is not used
CroMaSt.cwl:148:11: Source 'pfam_structs' of type ["null", "File"] may be incompatible
CroMaSt.cwl:168:7:    with sink 'resmapped_pfam' of type "File"
                      source has linkMerge method merge_nested
                      Source is from conditional step, but pickValue is not used
CroMaSt.cwl:340:12: Source 'unmapped_aligned_results' of type ["null", "File"] may be incompatible
CroMaSt.cwl:551:5:    with sink 'align_unmap_cath' of type "File"
CroMaSt.cwl:550:5:    Source is from conditional step and may produce `null`
CroMaSt.cwl:340:12: Source 'unmapped_aligned_results' of type ["null", "File"] may be incompatible
CroMaSt.cwl:551:5:    with sink 'align_unmap_cath' of type "File"
CroMaSt.cwl:314:12: Source 'unmapped_aligned_results' of type ["null", "File"] may be incompatible
CroMaSt.cwl:533:5:    with sink 'align_unmap_pfam' of type "File"
CroMaSt.cwl:532:5:    Source is from conditional step and may produce `null`
CroMaSt.cwl:314:12: Source 'unmapped_aligned_results' of type ["null", "File"] may be incompatible
CroMaSt.cwl:533:5:    with sink 'align_unmap_pfam' of type "File"
CroMaSt.cwl:287:55: Source 'cath_crossmap_passed' of type ["null", "File"] may be incompatible
CroMaSt.cwl:463:5:    with sink 'crossmapped_cath_passed' of type "File"
CroMaSt.cwl:287:11: Source 'pfam_crossmap_passed' of type ["null", "File"] may be incompatible
CroMaSt.cwl:457:5:    with sink 'crossmapped_pfam_passed' of type "File"
CroMaSt.cwl:148:25: Source 'cath_structs' of type ["null", "File"] may be incompatible
CroMaSt.cwl:475:5:    with sink 'crossres_mappedcath' of type "File"
CroMaSt.cwl:474:5:    Source is from conditional step and may produce `null`
CroMaSt.cwl:148:25: Source 'cath_structs' of type ["null", "File"] may be incompatible
CroMaSt.cwl:475:5:    with sink 'crossres_mappedcath' of type "File"
CroMaSt.cwl:148:11: Source 'pfam_structs' of type ["null", "File"] may be incompatible
CroMaSt.cwl:469:5:    with sink 'crossres_mappedpfam' of type "File"
CroMaSt.cwl:468:5:    Source is from conditional step and may produce `null`
CroMaSt.cwl:148:11: Source 'pfam_structs' of type ["null", "File"] may be incompatible
CroMaSt.cwl:469:5:    with sink 'crossres_mappedpfam' of type "File"
CroMaSt.cwl:340:56: Source 'failed_domains_list' of type ["null", "File"] may be incompatible
CroMaSt.cwl:563:5:    with sink 'unmap_cath_failed' of type "File"
CroMaSt.cwl:562:5:    Source is from conditional step and may produce `null`
CroMaSt.cwl:340:56: Source 'failed_domains_list' of type ["null", "File"] may be incompatible
CroMaSt.cwl:563:5:    with sink 'unmap_cath_failed' of type "File"
CroMaSt.cwl:340:38: Source 'domain_like_list' of type ["null", "File"] may be incompatible
CroMaSt.cwl:557:5:    with sink 'unmap_cath_passed' of type "File"
CroMaSt.cwl:556:5:    Source is from conditional step and may produce `null`
CroMaSt.cwl:340:38: Source 'domain_like_list' of type ["null", "File"] may be incompatible
CroMaSt.cwl:557:5:    with sink 'unmap_cath_passed' of type "File"
CroMaSt.cwl:314:56: Source 'failed_domains_list' of type ["null", "File"] may be incompatible
CroMaSt.cwl:545:5:    with sink 'unmap_pfam_failed' of type "File"
CroMaSt.cwl:544:5:    Source is from conditional step and may produce `null`
CroMaSt.cwl:314:56: Source 'failed_domains_list' of type ["null", "File"] may be incompatible
CroMaSt.cwl:545:5:    with sink 'unmap_pfam_failed' of type "File"
CroMaSt.cwl:314:38: Source 'domain_like_list' of type ["null", "File"] may be incompatible
CroMaSt.cwl:539:5:    with sink 'unmap_pfam_passed' of type "File"
CroMaSt.cwl:538:5:    Source is from conditional step and may produce `null`
CroMaSt.cwl:314:38: Source 'domain_like_list' of type ["null", "File"] may be incompatible
CroMaSt.cwl:539:5:    with sink 'unmap_pfam_passed' of type "File"
[2023-06-19 13:33:45] INFO [provenance] Adding to RO file:///data2/hdhondge/CroMaSt/Results1/core_avgStruct.pdb
[2023-06-19 13:33:45] INFO [provenance] Adding to RO file:///data2/hdhondge/CroMaSt/Results1/domain_like_structures.json
[2023-06-19 13:33:45] INFO [provenance] Adding to RO file:///data2/hdhondge/CroMaSt/Results1/failed_domains_list.json
[2023-06-19 13:33:45] INFO [provenance] Adding to RO file:///data2/hdhondge/CroMaSt/Results1/family_ids.json
[2023-06-19 13:33:45] INFO [provenance] Adding to RO file:///data2/hdhondge/CroMaSt/Results1/CroMaSt_input.yml
[2023-06-19 13:33:45] INFO [provenance] Adding to RO file:///data2/hdhondge/CroMaSt/Results1/crossmapped_cath_passed.json
[2023-06-19 13:33:45] INFO [provenance] Adding to RO file:///data2/hdhondge/CroMaSt/Results1/crossmapped_pfam_passed.json
[2023-06-19 13:33:45] INFO [provenance] Adding to RO file:///data2/hdhondge/CroMaSt/Results1/true_domains.json
[2023-06-19 13:33:45] INFO [workflow ] starting step get_family_ids
[2023-06-19 13:33:45] INFO [workflow ] start
[2023-06-19 13:33:45] INFO [step get_family_ids] start
[2023-06-19 13:33:45] INFO [job get_family_ids] /tmp/u9f97b3d$ python3 \
    get_family_ids.py \
    -p \
    PF03467 \
    PF03880 \
    PF04847 \
    PF05172 \
    PF08675 \
    PF08777 \
    PF08952 \
    PF09162 \
    PF11608 \
    PF11835 \
    PF13893 \
    PF16367 \
    PF16842 \
    PF17774 \
    -n \
    1 \
    -f \
    /tmp/u9f97b3d/family_ids.json
[2023-06-19 13:33:45] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:33:45] INFO [job get_family_ids] completed success
[2023-06-19 13:33:45] INFO [step get_family_ids] completed success
[2023-06-19 13:33:45] INFO [workflow ] starting step cath_domain_instances
[2023-06-19 13:33:45] INFO [step cath_domain_instances] start
[2023-06-19 13:33:45] INFO [workflow cath_domain_instances] starting step filter_cath_structures
[2023-06-19 13:33:45] INFO [workflow cath_domain_instances] start
[2023-06-19 13:33:45] INFO [step filter_cath_structures] start
[2023-06-19 13:33:45] INFO [workflow ] starting step pfam_domain_instances
[2023-06-19 13:33:45] INFO [step pfam_domain_instances] start
[2023-06-19 13:33:45] INFO [workflow pfam_domain_instances] start
[2023-06-19 13:33:45] INFO [workflow pfam_domain_instances] starting step filter_pfam_structures
[2023-06-19 13:33:45] INFO [step filter_pfam_structures] start
[2023-06-19 13:33:45] INFO [job filter_cath_structures] /tmp/kk_tecl6$ python3 \
    separate_cath.py \
    -l \
    31 \
    -d \
    /tmp/kk_tecl6/obsolete_PDB_entry_ids.txt \
    -o \
    obsolete_cath.txt \
    -c \
    /tmp/kk_tecl6/cath-domain-description-file.txt \
    -n \
    Filtered_CATH.csv \
    -s \
    part.csv \
    -f \
    /tmp/fadlj2ia/stgeff70189-3142-41fb-a5fa-4443ddae0016/family_ids.json
[2023-06-19 13:33:45] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:33:45] INFO [job filter_pfam_structures] /tmp/v65wak31$ python3 \
    separate_pfam.py \
    -l \
    31 \
    -d \
    /tmp/v65wak31/obsolete_PDB_entry_ids.txt \
    -o \
    obsolete_pfam.txt \
    -p \
    /tmp/v65wak31/pdbmap \
    -n \
    Filtered_Pfam.csv \
    -s \
    part.csv \
    -f \
    /tmp/msoym7xl/stg9327731c-da08-4b5d-8ad6-7363e40a4b17/family_ids.json
[2023-06-19 13:33:45] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:33:48] INFO [job filter_pfam_structures] Max memory used: 61MiB
[2023-06-19 13:33:49] INFO [job filter_pfam_structures] completed success
[2023-06-19 13:33:49] INFO [step filter_pfam_structures] completed success
[2023-06-19 13:33:49] INFO [workflow pfam_domain_instances] starting step resmapping_pfam_structs
[2023-06-19 13:33:49] INFO [step resmapping_pfam_structs] start
[2023-06-19 13:33:49] INFO [workflow resmapping_pfam_structs] starting step resmapping_for_Pfam_UP2PDB
[2023-06-19 13:33:49] INFO [workflow resmapping_pfam_structs] start
[2023-06-19 13:33:49] INFO [step resmapping_for_Pfam_UP2PDB] start
[2023-06-19 13:33:49] INFO [job resmapping_for_Pfam_UP2PDB] /tmp/a7tz2mcq$ python3 \
    resmapping_pfam2pdb.py \
    -f \
    /tmp/yopz756i/stg0894283c-86cb-4034-a279-f65d7896420f/0_part.csv \
    -s \
    /tmp/yopz756i/stg3a87b5c4-d857-4fdc-9844-0fb5026bd856/SIFTS \
    -m \
    pfam_resMapped.csv \
    -l \
    lost_pfam.txt
[2023-06-19 13:33:49] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:33:56] INFO [job filter_cath_structures] Max memory used: 60MiB
[2023-06-19 13:33:56] INFO [job filter_cath_structures] completed success
[2023-06-19 13:33:56] INFO [step filter_cath_structures] completed success
[2023-06-19 13:33:56] INFO [workflow cath_domain_instances] starting step resmapping_cath_structs
[2023-06-19 13:33:56] WARNING [job step resmapping_cath_structs] Notice: scattering over empty input in 'flt_files'.  All outputs will be empty.
[2023-06-19 13:33:56] INFO [step resmapping_cath_structs] completed success
[2023-06-19 13:33:56] INFO [workflow cath_domain_instances] starting step collect_lost_instances
[2023-06-19 13:33:56] INFO [step collect_lost_instances] start
[2023-06-19 13:33:56] INFO [workflow cath_domain_instances] starting step add_domain_positions
[2023-06-19 13:33:56] INFO [step add_domain_positions] will be skipped
[2023-06-19 13:33:56] INFO [step add_domain_positions] completed skipped
[2023-06-19 13:33:56] INFO [job collect_lost_instances] /tmp/89aoormj$ python3 \
    collect_lost_instances.py \
    cath_lost_resmap_domain_StIs.json
[2023-06-19 13:33:56] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:33:56] INFO [job collect_lost_instances] completed success
[2023-06-19 13:33:56] INFO [step collect_lost_instances] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(
[2023-06-19 13:33:56] INFO [workflow cath_domain_instances] completed success
[2023-06-19 13:33:56] INFO [step cath_domain_instances] completed success
[2023-06-19 13:36:05] INFO [job resmapping_for_Pfam_UP2PDB] Max memory used: 92MiB
[2023-06-19 13:36:06] INFO [job resmapping_for_Pfam_UP2PDB] completed success
[2023-06-19 13:36:06] INFO [step resmapping_for_Pfam_UP2PDB] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(
[2023-06-19 13:36:06] INFO [workflow resmapping_pfam_structs] completed success
[2023-06-19 13:36:06] INFO [step resmapping_pfam_structs] completed success
[2023-06-19 13:36:06] INFO [workflow pfam_domain_instances] starting step add_domain_positions_2
[2023-06-19 13:36:06] INFO [step add_domain_positions_2] start
[2023-06-19 13:36:06] INFO [workflow pfam_domain_instances] starting step collect_lost_instances_2
[2023-06-19 13:36:06] INFO [step collect_lost_instances_2] start
[2023-06-19 13:36:06] INFO [job add_domain_positions] /tmp/oskxpuaa$ python3 \
    add_domain_num.py \
    -i \
    /tmp/8andmar0/stg152329b2-b9b7-4a97-9b8a-5df6ef72b078/pfam_resMapped.csv \
    -o \
    pfam_resmapped_domain_StIs.csv
[2023-06-19 13:36:06] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:36:06] INFO [job collect_lost_instances_2] /tmp/b_wxjed8$ python3 \
    collect_lost_instances.py \
    pfam_lost_resmap_domain_StIs.json
[2023-06-19 13:36:06] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:36:06] INFO [job collect_lost_instances_2] completed success
[2023-06-19 13:36:06] INFO [step collect_lost_instances_2] completed success
[2023-06-19 13:36:07] INFO [job add_domain_positions] completed success
[2023-06-19 13:36:07] INFO [step add_domain_positions_2] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(
[2023-06-19 13:36:07] INFO [workflow pfam_domain_instances] completed success
[2023-06-19 13:36:07] INFO [step pfam_domain_instances] completed success
[2023-06-19 13:36:07] INFO [workflow ] starting step add_crossmapped_to_resmapped
[2023-06-19 13:36:07] INFO [step add_crossmapped_to_resmapped] start
[2023-06-19 13:36:07] INFO [job add_crossmapped_to_resmapped] /tmp/1c9nhrjh$ python3 \
    add_crossmapped2resmapped.py \
    -p \
    /tmp/m7u_2moc/stgc0b9e5ce-6838-402d-8da3-ba22cb3ced91/pfam_resmapped_domain_StIs.csv \
    -px \
    /tmp/m7u_2moc/stge75002b9-a239-4074-a759-b9be2faca38c/crossmapped_pfam_passed.json \
    -cx \
    /tmp/m7u_2moc/stg611f1bed-afd4-4ea3-bfc0-f8771c467dc0/crossmapped_cath_passed.json \
    -pr \
    pfam_res_crossMapped.csv \
    -cr \
    cath_res_crossMapped.csv
[2023-06-19 13:36:07] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:36:07] INFO [job add_crossmapped_to_resmapped] completed success
[2023-06-19 13:36:07] INFO [step add_crossmapped_to_resmapped] completed success
[2023-06-19 13:36:07] INFO [workflow ] starting step compare_instances_CATH_Pfam
[2023-06-19 13:36:07] INFO [step compare_instances_CATH_Pfam] start
[2023-06-19 13:36:07] INFO [job compare_instances_CATH_Pfam] /tmp/51a73ev0$ python3 \
    compare_cath_pfam.py \
    -l \
    31 \
    -c \
    /tmp/wiubg70o/stg6191943c-6107-4b39-8267-4ff23c7c9d21/cath_res_crossMapped.csv \
    -p \
    /tmp/wiubg70o/stg5b1ab25a-96d9-4782-95a7-9de2e4d95c47/pfam_res_crossMapped.csv \
    -f \
    /tmp/51a73ev0/true_domains.json \
    -uq_pf \
    unique_pfam.csv \
    -uq_ca \
    unique_cath.csv
[2023-06-19 13:36:07] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:36:08] INFO [job compare_instances_CATH_Pfam] completed success
[2023-06-19 13:36:08] INFO [step compare_instances_CATH_Pfam] completed success
[2023-06-19 13:36:08] INFO [workflow ] starting step crossmapping_Pfam2CATH
[2023-06-19 13:36:08] INFO [step crossmapping_Pfam2CATH] start
[2023-06-19 13:36:08] INFO [workflow ] starting step format_core_list
[2023-06-19 13:36:08] INFO [step format_core_list] start
[2023-06-19 13:36:08] INFO [workflow ] starting step crossmapping_CATH2Pfam
[2023-06-19 13:36:08] INFO [step crossmapping_CATH2Pfam] start
[2023-06-19 13:36:08] INFO [job crossmapping_Pfam2CATH] /tmp/4mtel3mt$ python3 \
    map_unique_struct_pfam2cath.py \
    -c \
    /tmp/_qfq79rc/stg9e7b5ea3-cf77-4177-9fcf-d189d5cfe17b/cath-domain-description-file.txt \
    -x \
    pfam_crossMapped_cath.jsonx \
    -l \
    31 \
    -u \
    pfam_unq_unmapped.jsonx \
    -p \
    /tmp/_qfq79rc/stg98b7c529-4d10-4e58-99b1-2677e350f761/unique_pfam.csv
[2023-06-19 13:36:08] INFO [job format_core_list] /tmp/tr8p7lta$ python3 \
    list_true_domains.py
[2023-06-19 13:36:08] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:36:08] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:36:08] INFO [job crossmapping_CATH2Pfam] /tmp/u1alckm8$ python3 \
    map_unique_struct_cath2pfam.py \
    -x \
    cath_crossMapped_pfam.jsonx \
    -l \
    31 \
    -u \
    cath_unq_unmapped.jsonx \
    -p \
    /tmp/br56pq7z/stg6a228f02-e1fd-44fb-8358-ded924000e04/pdbmap \
    -c \
    /tmp/br56pq7z/stg8062ff82-25bf-4040-99f6-b69a6db90659/unique_cath.csv
[2023-06-19 13:36:08] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:36:08] INFO [job format_core_list] completed success
[2023-06-19 13:36:08] INFO [step format_core_list] completed success
[2023-06-19 13:36:08] INFO [workflow ] starting step chop_and_avg_for_core
[2023-06-19 13:36:08] INFO [step chop_and_avg_for_core] will be skipped
[2023-06-19 13:36:08] INFO [step chop_and_avg_for_core] completed skipped
[2023-06-19 13:36:09] INFO [job crossmapping_CATH2Pfam] completed success
[2023-06-19 13:36:09] INFO [step crossmapping_CATH2Pfam] completed success
[2023-06-19 13:36:09] INFO [workflow ] starting step unmapped_from_cath
[2023-06-19 13:36:09] INFO [step unmapped_from_cath] will be skipped
[2023-06-19 13:36:09] INFO [step unmapped_from_cath] completed skipped
[2023-06-19 13:36:09] INFO [workflow ] starting step chop_and_avg_for_CATH2Pfam
[2023-06-19 13:36:09] INFO [step chop_and_avg_for_CATH2Pfam] start
[2023-06-19 13:36:09] INFO [workflow chop_and_avg_for_CATH2Pfam] starting step chop_and_avg_from_list
[2023-06-19 13:36:09] INFO [workflow chop_and_avg_for_CATH2Pfam] start
[2023-06-19 13:36:09] WARNING [job step chop_and_avg_from_list] Notice: scattering over empty input in 'in_file'.  All outputs will be empty.
[2023-06-19 13:36:09] INFO [step chop_and_avg_from_list] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(
[2023-06-19 13:36:09] INFO [workflow chop_and_avg_for_CATH2Pfam] completed success
[2023-06-19 13:36:09] INFO [step chop_and_avg_for_CATH2Pfam] completed success
[2023-06-19 13:37:57] INFO [job crossmapping_Pfam2CATH] Max memory used: 1633MiB
[2023-06-19 13:37:57] INFO [job crossmapping_Pfam2CATH] completed success
[2023-06-19 13:37:57] INFO [step crossmapping_Pfam2CATH] completed success
[2023-06-19 13:37:57] INFO [workflow ] starting step chop_and_avg_for_Pfam2CATH
[2023-06-19 13:37:57] INFO [step chop_and_avg_for_Pfam2CATH] start
[2023-06-19 13:37:57] INFO [workflow chop_and_avg_for_Pfam2CATH] starting step chop_and_avg_from_list_2
[2023-06-19 13:37:57] INFO [workflow chop_and_avg_for_Pfam2CATH] start
[2023-06-19 13:37:57] WARNING [job step chop_and_avg_from_list_2] Notice: scattering over empty input in 'in_file'.  All outputs will be empty.
[2023-06-19 13:37:57] INFO [step chop_and_avg_from_list_2] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(
[2023-06-19 13:37:57] INFO [workflow chop_and_avg_for_Pfam2CATH] completed success
[2023-06-19 13:37:57] INFO [step chop_and_avg_for_Pfam2CATH] completed success
[2023-06-19 13:37:57] INFO [workflow ] starting step unmapped_from_pfam
[2023-06-19 13:37:57] INFO [step unmapped_from_pfam] start
[2023-06-19 13:37:57] INFO [workflow unmapped_from_pfam] start
[2023-06-19 13:37:57] INFO [workflow unmapped_from_pfam] starting step per_unp_dom_instance
[2023-06-19 13:37:57] INFO [step per_unp_dom_instance] start
[2023-06-19 13:37:57] INFO [workflow ] starting step align_avg_structs_pairwise
[2023-06-19 13:37:57] INFO [step align_avg_structs_pairwise] start
[2023-06-19 13:37:57] INFO [job per_unp_dom_instance] /tmp/i8076r4i$ python3 \
    crossmapped_per_unp_dom.py \
    -f \
    /tmp/qxxv1f5n/stgea00de1c-7406-4ffd-b1a5-668ae8a2e2c7/pfam_unq_unmapped.jsonx > /tmp/i8076r4i/5a0b5b0ee8819feece292f4af464b2ccbffcead7
[2023-06-19 13:37:57] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:57] INFO [job align_avg_structs_pairwise] /tmp/6k4obnu6$ python3 \
    pairwise_aligner.py \
    -t \
    /tmp/i_f8nxc2/stg6b0813eb-025b-436b-9ae3-aa13422a1f7b/core_avgStruct.pdb \
    -r \
    align_Struct_analysis.csv
[2023-06-19 13:37:57] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:57] INFO [job per_unp_dom_instance] completed success
[2023-06-19 13:37:57] INFO [step per_unp_dom_instance] completed success
[2023-06-19 13:37:57] INFO [workflow unmapped_from_pfam] starting step avg_unp_domains
[2023-06-19 13:37:57] INFO [step avg_unp_domains] start
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains] starting step chop_structs
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains] start
[2023-06-19 13:37:57] INFO [step chop_structs] start
[2023-06-19 13:37:57] INFO [step avg_unp_domains] start
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_2] start
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_2] starting step chop_structs_2
[2023-06-19 13:37:57] INFO [step chop_structs_2] start
[2023-06-19 13:37:57] INFO [job chop_structs] /tmp/knkvijgl$ python3 \
    chop_struct2domains.py \
    -f \
    /tmp/2puoo18c/stgd8991b78-f390-42f8-bff2-15b858be8183/unmapped_domain_1_A0A6A5Q318.json \
    -p \
    /tmp/2puoo18c/stge8492157-02db-4490-bf6a-a0182a13663a/PDB_files \
    -s \
    split_PDB \
    -k \
    KPAX_RESULTS > /tmp/knkvijgl/99a4dda21068e1076bcbc268ceeef8242974f012

I removed the pandas warning from one of the tools to keep it relatively short.
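Since the same warning repeats many times, a summary may be easier to read than the full log. Below is a small sketch that counts distinct WARNING messages. The name `count_warnings` is hypothetical, and the timestamp pattern is an assumption based on the `--timestamps` format visible in this log.

```python
import re
from collections import Counter

# Matches the "[YYYY-MM-DD HH:MM:SS] " prefix seen in the log.
TIMESTAMP = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\] ")
# Object addresses differ between runs, so they are masked before counting.
OBJECT_ADDRESS = re.compile(r" at 0x[0-9a-fA-F]+")


def count_warnings(log_lines):
    """Count WARNING messages, ignoring timestamps and object addresses."""
    counts = Counter()
    for line in log_lines:
        message = TIMESTAMP.sub("", line.rstrip("\n"))
        if message.startswith("WARNING "):
            counts[OBJECT_ADDRESS.sub(" at 0x...", message)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_warnings.py < cwltool.log
    for message, count in count_warnings(sys.stdin).most_common():
        print(count, message)
```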

Second part of the log
[2023-06-19 13:37:57] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:57] INFO [step avg_unp_domains] start
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_3] starting step chop_structs_3
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_3] start
[2023-06-19 13:37:57] INFO [step chop_structs_3] start
[2023-06-19 13:37:57] INFO [step avg_unp_domains] start
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_4] starting step chop_structs_4
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_4] start
[2023-06-19 13:37:57] INFO [step chop_structs_4] start
[2023-06-19 13:37:57] INFO [job chop_structs_2] /tmp/d_pe67_n$ python3 \
    chop_struct2domains.py \
    -f \
    /tmp/_gilevj2/stgb6990e4f-ebd7-45fc-bba7-a143307172c7/unmapped_domain_1_P0A9P6.json \
    -p \
    /tmp/_gilevj2/stgd1491899-895c-4f95-8bdf-33ed098e1363/PDB_files \
    -s \
    split_PDB \
    -k \
    KPAX_RESULTS > /tmp/d_pe67_n/99a4dda21068e1076bcbc268ceeef8242974f012
[2023-06-19 13:37:57] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:57] INFO [job chop_structs_3] /tmp/8obiuj5w$ python3 \
    chop_struct2domains.py \
    -f \
    /tmp/59f125cu/stg72a7eb10-5387-4ad4-8f11-c62adca26492/unmapped_domain_1_P21693.json \
    -p \
    /tmp/59f125cu/stgdc71e63d-8e35-4cc6-a8bf-0895dfb4e6f6/PDB_files \
    -s \
    split_PDB \
    -k \
    KPAX_RESULTS > /tmp/8obiuj5w/99a4dda21068e1076bcbc268ceeef8242974f012
[2023-06-19 13:37:57] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:57] INFO [step avg_unp_domains] start
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_5] starting step chop_structs_5
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_5] start
[2023-06-19 13:37:57] INFO [step chop_structs_5] start
[2023-06-19 13:37:57] INFO [job chop_structs_4] /tmp/fnm_81tm$ python3 \
    chop_struct2domains.py \
    -f \
    /tmp/c1lwczc8/stg8a784610-722b-4bda-97f4-3b02d7ec8686/unmapped_domain_1_P40567.json \
    -p \
    /tmp/c1lwczc8/stg0699b957-5bce-4867-a063-93994c2e291e/PDB_files \
    -s \
    split_PDB \
    -k \
    KPAX_RESULTS > /tmp/fnm_81tm/99a4dda21068e1076bcbc268ceeef8242974f012
[2023-06-19 13:37:57] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:57] INFO [step avg_unp_domains] start
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_6] starting step chop_structs_6
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_6] start
[2023-06-19 13:37:57] INFO [step chop_structs_6] start
[2023-06-19 13:37:57] INFO [step avg_unp_domains] start
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_7] starting step chop_structs_7
[2023-06-19 13:37:57] INFO [job chop_structs_5] /tmp/azsbjhlt$ python3 \
    chop_struct2domains.py \
    -f \
    /tmp/ujxr5t1s/stgada7885b-7272-4f8c-aca8-687ea495a548/unmapped_domain_1_P49960.json \
    -p \
    /tmp/ujxr5t1s/stgc7e58974-733d-4318-881f-88490933c94f/PDB_files \
    -s \
    split_PDB \
    -k \
    KPAX_RESULTS > /tmp/azsbjhlt/99a4dda21068e1076bcbc268ceeef8242974f012
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_7] start
[2023-06-19 13:37:57] INFO [step chop_structs_7] start
[2023-06-19 13:37:57] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:57] INFO [job chop_structs_6] /tmp/pfna9l3i$ python3 \
    chop_struct2domains.py \
    -f \
    /tmp/sby8gw5b/stg90f1d5a6-d1dd-42c6-845c-e7c99adce8ea/unmapped_domain_1_Q17RY0.json \
    -p \
    /tmp/sby8gw5b/stg192d82da-e5eb-4a97-a9dc-dd554618c9b6/PDB_files \
    -s \
    split_PDB \
    -k \
    KPAX_RESULTS > /tmp/pfna9l3i/99a4dda21068e1076bcbc268ceeef8242974f012
[2023-06-19 13:37:57] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:57] INFO [step avg_unp_domains] start
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_8] starting step chop_structs_8
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_8] start
[2023-06-19 13:37:57] INFO [step chop_structs_8] start
[2023-06-19 13:37:57] INFO [job chop_structs_7] /tmp/9zb160n4$ python3 \
    chop_struct2domains.py \
    -f \
    /tmp/opwywd6n/stg1453f640-2d96-4da5-881a-fddb5a3b5571/unmapped_domain_1_Q4G0J3.json \
    -p \
    /tmp/opwywd6n/stg65757dbd-cff6-428b-a6c6-ffac55137860/PDB_files \
    -s \
    split_PDB \
    -k \
    KPAX_RESULTS > /tmp/9zb160n4/99a4dda21068e1076bcbc268ceeef8242974f012
[2023-06-19 13:37:57] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:57] INFO [step avg_unp_domains] start
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_9] starting step chop_structs_9
[2023-06-19 13:37:57] INFO [workflow avg_unp_domains_9] start
[2023-06-19 13:37:57] INFO [step chop_structs_9] start
[2023-06-19 13:37:57] INFO [job chop_structs_8] /tmp/ern5y6m4$ python3 \
    chop_struct2domains.py \
    -f \
    /tmp/7rrb3ies/stgbca36e17-7959-4836-b3a1-259dbacad8f3/unmapped_domain_1_Q8VDG3.json \
    -p \
    /tmp/7rrb3ies/stgb83672e9-ed6d-4334-8492-79063c0f95c3/PDB_files \
    -s \
    split_PDB \
    -k \
    KPAX_RESULTS > /tmp/ern5y6m4/99a4dda21068e1076bcbc268ceeef8242974f012
[2023-06-19 13:37:57] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:57] INFO [job chop_structs_9] /tmp/dxo87uab$ python3 \
    chop_struct2domains.py \
    -f \
    /tmp/9p_jmoxl/stg3abe8f3c-e5c1-4a22-9729-583c91171527/unmapped_domain_1_Q9BZB8.json \
    -p \
    /tmp/9p_jmoxl/stgdbc83757-0cb4-4da4-b3ea-5bea178429fb/PDB_files \
    -s \
    split_PDB \
    -k \
    KPAX_RESULTS > /tmp/dxo87uab/99a4dda21068e1076bcbc268ceeef8242974f012
[2023-06-19 13:37:57] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:58] INFO [job chop_structs_2] Max memory used: 93MiB
[2023-06-19 13:37:58] INFO [job chop_structs_2] completed success
[2023-06-19 13:37:58] INFO [step chop_structs_2] completed success
[2023-06-19 13:37:58] INFO [workflow avg_unp_domains_2] starting step avg_chopped_structs_unp_domains_2
[2023-06-19 13:37:58] INFO [step avg_chopped_structs_unp_domains_2] start
Warning: invalid field 'nameroot', expected one of: 'class', 'location', 'path', 'basename', 'listing'
Warning: invalid field 'nameext', expected one of: 'class', 'location', 'path', 'basename', 'listing'
[2023-06-19 13:37:58] INFO [job avg_chopped_structs_unp_domains] /tmp/1sarkp9w$ python3 \
    align_compute_avg.py \
    -f \
    /tmp/ylz_b1tn/stg703fe899-995e-42bc-a2f2-a1f214ae6569/99a4dda21068e1076bcbc268ceeef8242974f012 \
    -s \
    /tmp/ylz_b1tn/stg6eede303-f8ea-4269-85c6-ec923be45c7e/split_PDB \
    -k \
    KPAX_RESULTS
[2023-06-19 13:37:58] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
/tmp/6k4obnu6/pairwise_aligner.py:88: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  kaln_df = kaln_df.append(one, ignore_index=True)
/tmp/6k4obnu6/pairwise_aligner.py:88: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  kaln_df = kaln_df.append(one, ignore_index=True)
[2023-06-19 13:37:59] INFO [job chop_structs_5] Max memory used: 82MiB
[2023-06-19 13:37:59] INFO [job chop_structs_5] completed success
[2023-06-19 13:37:59] INFO [step chop_structs_5] completed success
[2023-06-19 13:37:59] INFO [workflow avg_unp_domains_5] starting step avg_chopped_structs_unp_domains_5
[2023-06-19 13:37:59] INFO [step avg_chopped_structs_unp_domains_5] start
Warning: invalid field 'nameroot', expected one of: 'class', 'location', 'path', 'basename', 'listing'
Warning: invalid field 'nameext', expected one of: 'class', 'location', 'path', 'basename', 'listing'
[2023-06-19 13:37:59] INFO [job avg_chopped_structs_unp_domains_2] /tmp/2vtd79tl$ python3 \
    align_compute_avg.py \
    -f \
    /tmp/4tqimjfw/stg492c51f4-dfba-4785-a804-bb032cce8f7c/99a4dda21068e1076bcbc268ceeef8242974f012 \
    -s \
    /tmp/4tqimjfw/stga20b73e6-e846-4620-8229-c1165adb506e/split_PDB \
    -k \
    KPAX_RESULTS
[2023-06-19 13:37:59] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:59] INFO [job align_avg_structs_pairwise] Max memory used: 79MiB
[2023-06-19 13:37:59] INFO [job chop_structs_8] Max memory used: 81MiB
[2023-06-19 13:37:59] INFO [job chop_structs_8] completed success
[2023-06-19 13:37:59] INFO [step chop_structs_8] completed success
[2023-06-19 13:37:59] INFO [workflow avg_unp_domains_8] starting step avg_chopped_structs_unp_domains_8
[2023-06-19 13:37:59] INFO [step avg_chopped_structs_unp_domains_8] start
Warning: invalid field 'nameroot', expected one of: 'class', 'location', 'path', 'basename', 'listing'
Warning: invalid field 'nameext', expected one of: 'class', 'location', 'path', 'basename', 'listing'
[2023-06-19 13:37:59] INFO [job avg_chopped_structs_unp_domains_3] /tmp/g188qd7n$ python3 \
    align_compute_avg.py \
    -f \
    /tmp/6sm4opu4/stgff6f541c-5dda-4f3c-b81f-cfcc0e2ab72d/99a4dda21068e1076bcbc268ceeef8242974f012 \
    -s \
    /tmp/6sm4opu4/stgaf0cf259-a260-41ff-961e-52c49d32dd80/split_PDB \
    -k \
    KPAX_RESULTS
[2023-06-19 13:37:59] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:59] INFO [job chop_structs_7] Max memory used: 84MiB
[2023-06-19 13:37:59] INFO [job chop_structs_7] completed success
[2023-06-19 13:37:59] INFO [step chop_structs_7] completed success
[2023-06-19 13:37:59] INFO [workflow avg_unp_domains_7] starting step avg_chopped_structs_unp_domains_7
[2023-06-19 13:37:59] INFO [step avg_chopped_structs_unp_domains_7] start
Warning: invalid field 'nameroot', expected one of: 'class', 'location', 'path', 'basename', 'listing'
Warning: invalid field 'nameext', expected one of: 'class', 'location', 'path', 'basename', 'listing'
[2023-06-19 13:37:59] INFO [job avg_chopped_structs_unp_domains_4] /tmp/ugf3i2vx$ python3 \
    align_compute_avg.py \
    -f \
    /tmp/69_e6m9s/stg2dd3c52a-8d3c-458c-9ca7-0a91e0714303/99a4dda21068e1076bcbc268ceeef8242974f012 \
    -s \
    /tmp/69_e6m9s/stg9919d2c1-272a-4628-894e-c219c1662a8f/split_PDB \
    -k \
    KPAX_RESULTS
[2023-06-19 13:37:59] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:37:59] INFO [job align_avg_structs_pairwise] completed success
[2023-06-19 13:37:59] INFO [step align_avg_structs_pairwise] completed success
[2023-06-19 13:37:59] INFO [workflow ] starting step check_alignment_scores
[2023-06-19 13:37:59] INFO [step check_alignment_scores] start
[2023-06-19 13:37:59] INFO [job check_alignment_scores] /tmp/qpspckfl$ python3 \
    check_threshold.py \
    -a \
    /tmp/1o4v9wcz/stg88d1302e-80f9-4263-b9b4-39ad6956e1ec/align_Struct_analysis.csv \
    -f \
    /tmp/1o4v9wcz/stg35049313-0c3b-4c93-b83d-b845cd75c342/family_ids.json \
    -s \
    Mscore \
    -t \
    0.6 \
    -px \
    /tmp/1o4v9wcz/stg6c1bedd7-5687-4a60-96ba-d33ce517e828/pfam_crossMapped_cath.jsonx \
    -cx \
    /tmp/1o4v9wcz/stg6811ecdf-4a45-4339-a44f-5504935dddc4/cath_crossMapped_pfam.jsonx
[2023-06-19 13:37:59] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:38:00] INFO [job avg_chopped_structs_unp_domains] Max memory used: 85MiB
[2023-06-19 13:38:00] INFO [job avg_chopped_structs_unp_domains] completed success
[2023-06-19 13:38:00] INFO [step avg_chopped_structs_unp_domains_2] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(
[2023-06-19 13:38:00] INFO [workflow avg_unp_domains_2] completed success
[2023-06-19 13:38:00] INFO [job chop_structs_3] Max memory used: 82MiB
[2023-06-19 13:38:00] INFO [job chop_structs_3] completed success
[2023-06-19 13:38:00] INFO [step chop_structs_3] completed success
[2023-06-19 13:38:00] INFO [workflow avg_unp_domains_3] starting step avg_chopped_structs_unp_domains_3
[2023-06-19 13:38:00] INFO [step avg_chopped_structs_unp_domains_3] start
Warning: invalid field 'nameroot', expected one of: 'class', 'location', 'path', 'basename', 'listing'
Warning: invalid field 'nameext', expected one of: 'class', 'location', 'path', 'basename', 'listing'
[2023-06-19 13:38:00] INFO [job avg_chopped_structs_unp_domains_5] /tmp/ngmtvunp$ python3 \
    align_compute_avg.py \
    -f \
    /tmp/0yopn99a/stg0df48036-cc59-4644-ad07-69d78719eb81/99a4dda21068e1076bcbc268ceeef8242974f012 \
    -s \
    /tmp/0yopn99a/stgc4d23acb-f796-49f1-9689-6ae535439702/split_PDB \
    -k \
    KPAX_RESULTS
[2023-06-19 13:38:00] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:38:00] INFO [job check_alignment_scores] completed success
[2023-06-19 13:38:00] INFO [step check_alignment_scores] completed success
[2023-06-19 13:38:00] INFO [job chop_structs_9] Max memory used: 96MiB
[2023-06-19 13:38:00] INFO [job chop_structs_9] completed success
[2023-06-19 13:38:00] INFO [step chop_structs_9] completed success
[2023-06-19 13:38:00] INFO [workflow avg_unp_domains_9] starting step avg_chopped_structs_unp_domains_9
[2023-06-19 13:38:00] INFO [step avg_chopped_structs_unp_domains_9] start
Warning: invalid field 'nameroot', expected one of: 'class', 'location', 'path', 'basename', 'listing'
Warning: invalid field 'nameext', expected one of: 'class', 'location', 'path', 'basename', 'listing'
[2023-06-19 13:38:00] INFO [job avg_chopped_structs_unp_domains_6] /tmp/j03tbsar$ python3 \
    align_compute_avg.py \
    -f \
    /tmp/g7t8srpy/stgd81e49f3-e4de-4aa2-8dbf-cb1814dfd347/99a4dda21068e1076bcbc268ceeef8242974f012 \
    -s \
    /tmp/g7t8srpy/stgf3e4b27b-27db-4156-97da-d624d5ca7e63/split_PDB \
    -k \
    KPAX_RESULTS
[2023-06-19 13:38:00] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:38:01] INFO [job avg_chopped_structs_unp_domains_5] completed success
[2023-06-19 13:38:01] INFO [step avg_chopped_structs_unp_domains_3] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(
[2023-06-19 13:38:01] INFO [workflow avg_unp_domains_3] completed success
[2023-06-19 13:38:01] INFO [job chop_structs_6] Max memory used: 94MiB
[2023-06-19 13:38:01] INFO [job chop_structs_6] completed success
[2023-06-19 13:38:01] INFO [step chop_structs_6] completed success
[2023-06-19 13:38:01] INFO [workflow avg_unp_domains_6] starting step avg_chopped_structs_unp_domains_6
[2023-06-19 13:38:01] INFO [step avg_chopped_structs_unp_domains_6] start
Warning: invalid field 'nameroot', expected one of: 'class', 'location', 'path', 'basename', 'listing'
Warning: invalid field 'nameext', expected one of: 'class', 'location', 'path', 'basename', 'listing'
[2023-06-19 13:38:01] INFO [job avg_chopped_structs_unp_domains_7] /tmp/ll7pv5wy$ python3 \
    align_compute_avg.py \
    -f \
    /tmp/5pwqh3lu/stg3c63165e-1782-4896-99b3-c78c3fba458d/99a4dda21068e1076bcbc268ceeef8242974f012 \
    -s \
    /tmp/5pwqh3lu/stgf12cdead-1f42-47d8-8633-05762560e2e6/split_PDB \
    -k \
    KPAX_RESULTS
[2023-06-19 13:38:01] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
unmapped_domain_1_P49960 2 unmapped_domain_1_P49960_avgStruct.pdb IF error happens
[2023-06-19 13:38:02] INFO [job avg_chopped_structs_unp_domains_2] Max memory used: 86MiB
[2023-06-19 13:38:02] INFO [job avg_chopped_structs_unp_domains_2] completed success
[2023-06-19 13:38:02] INFO [step avg_chopped_structs_unp_domains_5] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(
[2023-06-19 13:38:02] INFO [workflow avg_unp_domains_5] completed success
unmapped_domain_1_Q8VDG3 3 unmapped_domain_1_Q8VDG3_avgStruct.pdb IF error happens
[2023-06-19 13:38:03] INFO [job avg_chopped_structs_unp_domains_3] Max memory used: 92MiB
[2023-06-19 13:38:03] INFO [job avg_chopped_structs_unp_domains_3] completed success
[2023-06-19 13:38:03] INFO [step avg_chopped_structs_unp_domains_8] completed success
[2023-06-19 13:38:03] INFO [workflow avg_unp_domains_8] completed success
unmapped_domain_1_Q17RY0 2 unmapped_domain_1_Q17RY0_avgStruct.pdb IF error happens
[2023-06-19 13:38:04] INFO [job avg_chopped_structs_unp_domains_7] Max memory used: 98MiB
[2023-06-19 13:38:04] INFO [job avg_chopped_structs_unp_domains_7] completed success
[2023-06-19 13:38:04] INFO [step avg_chopped_structs_unp_domains_6] completed success
[2023-06-19 13:38:04] INFO [workflow avg_unp_domains_6] completed success
unmapped_domain_1_Q9BZB8 2 unmapped_domain_1_Q9BZB8_avgStruct.pdb IF error happens
[2023-06-19 13:38:04] INFO [job avg_chopped_structs_unp_domains_6] Max memory used: 94MiB
[2023-06-19 13:38:04] INFO [job avg_chopped_structs_unp_domains_6] completed success
[2023-06-19 13:38:04] INFO [step avg_chopped_structs_unp_domains_9] completed success
[2023-06-19 13:38:04] INFO [workflow avg_unp_domains_9] completed success
unmapped_domain_1_Q4G0J3 3 unmapped_domain_1_Q4G0J3_avgStruct.pdb IF error happens
[2023-06-19 13:38:05] INFO [job avg_chopped_structs_unp_domains_4] Max memory used: 86MiB
[2023-06-19 13:38:05] INFO [job avg_chopped_structs_unp_domains_4] completed success
[2023-06-19 13:38:05] INFO [step avg_chopped_structs_unp_domains_7] completed success
[2023-06-19 13:38:05] INFO [workflow avg_unp_domains_7] completed success
[2023-06-19 13:38:10] INFO [job chop_structs] Max memory used: 67MiB
[2023-06-19 13:38:10] INFO [job chop_structs] completed success
[2023-06-19 13:38:10] INFO [step chop_structs] completed success
[2023-06-19 13:38:10] INFO [workflow avg_unp_domains] starting step avg_chopped_structs_unp_domains
[2023-06-19 13:38:10] INFO [step avg_chopped_structs_unp_domains] start
Warning: invalid field 'nameroot', expected one of: 'class', 'location', 'path', 'basename', 'listing'
Warning: invalid field 'nameext', expected one of: 'class', 'location', 'path', 'basename', 'listing'
[2023-06-19 13:38:10] INFO [job avg_chopped_structs_unp_domains_8] /tmp/u0ay_u1u$ python3 \
    align_compute_avg.py \
    -f \
    /tmp/w14i7c72/stg73efed26-7ed7-4def-8844-0191d559c710/99a4dda21068e1076bcbc268ceeef8242974f012 \
    -s \
    /tmp/w14i7c72/stg663c438e-2550-40a0-a2f6-6ac432f1c8a3/split_PDB \
    -k \
    KPAX_RESULTS
[2023-06-19 13:38:10] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
unmapped_domain_1_A0A6A5Q318 2 unmapped_domain_1_A0A6A5Q318_avgStruct.pdb IF error happens
[2023-06-19 13:38:12] INFO [job avg_chopped_structs_unp_domains_8] Max memory used: 101MiB
[2023-06-19 13:38:12] INFO [job avg_chopped_structs_unp_domains_8] completed success
[2023-06-19 13:38:12] INFO [step avg_chopped_structs_unp_domains] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(
[2023-06-19 13:38:12] INFO [workflow avg_unp_domains] completed success
[2023-06-19 13:38:13] INFO [job chop_structs_4] Max memory used: 88MiB
[2023-06-19 13:38:13] INFO [job chop_structs_4] completed success
[2023-06-19 13:38:13] INFO [step chop_structs_4] completed success
[2023-06-19 13:38:13] INFO [workflow avg_unp_domains_4] starting step avg_chopped_structs_unp_domains_4
[2023-06-19 13:38:13] INFO [step avg_chopped_structs_unp_domains_4] start
Warning: invalid field 'nameroot', expected one of: 'class', 'location', 'path', 'basename', 'listing'
Warning: invalid field 'nameext', expected one of: 'class', 'location', 'path', 'basename', 'listing'
[2023-06-19 13:38:13] INFO [job avg_chopped_structs_unp_domains_9] /tmp/yesd51wa$ python3 \
    align_compute_avg.py \
    -f \
    /tmp/q6bru3pe/stg7651e23c-915f-461f-aa27-f235c5bcd495/99a4dda21068e1076bcbc268ceeef8242974f012 \
    -s \
    /tmp/q6bru3pe/stg49cc0000-bbd9-4d1e-9970-39cdeb4183ee/split_PDB \
    -k \
    KPAX_RESULTS
[2023-06-19 13:38:13] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
unmapped_domain_1_P40567 5 unmapped_domain_1_P40567_avgStruct.pdb IF error happens
[2023-06-19 13:38:16] INFO [job avg_chopped_structs_unp_domains_9] Max memory used: 99MiB
[2023-06-19 13:38:16] INFO [job avg_chopped_structs_unp_domains_9] completed success
[2023-06-19 13:38:16] INFO [step avg_chopped_structs_unp_domains_4] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(
[2023-06-19 13:38:16] INFO [workflow avg_unp_domains_4] completed success
[2023-06-19 13:38:16] INFO [step avg_unp_domains] completed success
[2023-06-19 13:38:16] INFO [workflow unmapped_from_pfam] starting step copy_avg_dom
[2023-06-19 13:38:16] INFO [step copy_avg_dom] start
[2023-06-19 13:38:16] INFO [job copy_avg_dom] /tmp/gjy7c1j9$ python \
    script.py
[2023-06-19 13:38:16] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:38:16] INFO [job copy_avg_dom] completed success
[2023-06-19 13:38:16] INFO [step copy_avg_dom] completed success
[2023-06-19 13:38:16] INFO [workflow unmapped_from_pfam] starting step pairwise_align_avg_structs
[2023-06-19 13:38:16] INFO [step pairwise_align_avg_structs] start
Warning: invalid field 'nameroot', expected one of: 'class', 'location', 'path', 'basename', 'listing'
Warning: invalid field 'nameext', expected one of: 'class', 'location', 'path', 'basename', 'listing'
[2023-06-19 13:38:16] INFO [job pairwise_align_avg_structs] /tmp/sg5h986e$ python3 \
    pairwise_aligner.py \
    -d \
    /tmp/5_ww6ydm/stgf212598a-bb2f-4aef-ac61-17a140e2507c/avg_split_PDB \
    -t \
    /tmp/5_ww6ydm/stge310164b-0824-4ae1-9088-902e4da7e862/core_avgStruct.pdb \
    -r \
    align_Struct_analysis.csv
[2023-06-19 13:38:16] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:38:17] INFO [job pairwise_align_avg_structs] completed success
[2023-06-19 13:38:17] INFO [step pairwise_align_avg_structs] completed success
[2023-06-19 13:38:17] INFO [workflow unmapped_from_pfam] starting step check_threshold_step
[2023-06-19 13:38:17] INFO [step check_threshold_step] start
[2023-06-19 13:38:17] INFO [job check_threshold_step] /tmp/kwq3zd1o$ python3 \
    filter_align_scores.py \
    -s \
    Mscore \
    -f \
    pfam_unmapped_failed_structs.csv \
    -p \
    pfam_unmapped_passed_structs.csv \
    -t \
    0.6 \
    -i \
    /tmp/tojafpj5/stgbd8e2a5d-3614-4dc6-b933-9749bd217281/align_Struct_analysis.csv \
    -x \
    /tmp/tojafpj5/stg42d6c252-aad0-465e-a6d8-2ce59548d1f3/pfam_unq_unmapped.jsonx
[2023-06-19 13:38:17] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
/tmp/kwq3zd1o/filter_align_scores.py:44: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  final_df = final_df.append(tmp_group, ignore_index=True)
/tmp/kwq3zd1o/filter_align_scores.py:44: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  final_df = final_df.append(tmp_group, ignore_index=True)
[2023-06-19 13:38:18] INFO [job check_threshold_step] completed success
[2023-06-19 13:38:18] INFO [step check_threshold_step] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(
[2023-06-19 13:38:18] INFO [workflow unmapped_from_pfam] completed success
[2023-06-19 13:38:18] INFO [step unmapped_from_pfam] completed success
[2023-06-19 13:38:18] INFO [workflow ] starting step gather_failed_domains
[2023-06-19 13:38:18] INFO [step gather_failed_domains] start
[2023-06-19 13:38:18] INFO [workflow ] starting step gather_domain_like
[2023-06-19 13:38:18] INFO [step gather_domain_like] start
[2023-06-19 13:38:18] INFO [job gather_failed_domains] /tmp/2da1zwkx$ python3 \
    merge_unmapped.py \
    -p \
    /tmp/opu_ecbw/stg8f2ebd9c-919c-4a84-90dd-1037f4a596ba/pfam_unmapped_failed_structs.csv \
    -px \
    /tmp/opu_ecbw/stgd32eb944-5c8d-4f59-a42f-86d88e311ca0/crossmapped_pfam_failed.json \
    -o \
    /tmp/2da1zwkx/failed_domains_list.json \
    -cx \
    /tmp/opu_ecbw/stg13df5bed-0e05-4fe8-90c2-91a108fef038/crossmapped_cath_failed.json
[2023-06-19 13:38:18] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:38:18] INFO [job gather_domain_like] /tmp/3sia3mqf$ python3 \
    merge_unmapped.py \
    -p \
    /tmp/0t6jrnj7/stgf0c4f059-ff36-47d3-8764-5bcdbb23be4e/pfam_unmapped_passed_structs.csv \
    -o \
    /tmp/3sia3mqf/domain_like_structures.json
[2023-06-19 13:38:18] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
[2023-06-19 13:38:18] INFO [job gather_failed_domains] completed success
[2023-06-19 13:38:18] INFO [step gather_failed_domains] completed success
[2023-06-19 13:38:18] INFO [job gather_domain_like] completed success
[2023-06-19 13:38:18] INFO [step gather_domain_like] completed success
[2023-06-19 13:38:18] INFO [workflow ] starting step create_new_parameters
[2023-06-19 13:38:18] INFO [step create_new_parameters] start
[2023-06-19 13:38:18] INFO [job create_new_parameters] /tmp/8ciw6370$ python3 \
    create_param.py \
    -i \
    /tmp/0ns_ouf1/stg44e2a912-db6a-4cc5-8e75-5e6d1b35323c/CroMaSt_input.yml \
    -o \
    new_param.yml \
    -px \
    /tmp/0ns_ouf1/stg1b4b9d23-d42f-4cae-b45a-9e65084c6ff3/crossmapped_pfam_passed.json \
    -cx \
    /tmp/0ns_ouf1/stg86c01cfd-dc5c-45ce-b1c7-4a771ff54560/crossmapped_cath_passed.json
[2023-06-19 13:38:18] WARNING research_obj set but one of process_run_id or prov_obj is missing from runtimeContext: <cwltool.context.RuntimeContext object at 0x7f7755d43a00>
new_param.yml
[2023-06-19 13:38:18] INFO [job create_new_parameters] completed success
[2023-06-19 13:38:18] INFO [step create_new_parameters] completed success
[2023-06-19 13:38:18] INFO [workflow ] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(

[2023-06-19 13:38:19] INFO Final process status is success
[2023-06-19 13:38:19] INFO [provenance] Finalizing Research Object
[2023-06-19 13:38:19] INFO [provenance] Deleting existing /local/data2/hdhondge/CroMaSt/Run_prov
[2023-06-19 13:38:21] INFO [provenance] Research Object saved to /local/data2/hdhondge/CroMaSt/Run_prov
Outputs info from Workflow log

{
    "align_unmap_cath": null,
    "align_unmap_pfam": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/align_Struct_analysis.csv",
        "basename": "align_Struct_analysis.csv",
        "class": "File",
        "checksum": "sha1$a6078f3b267d3d91daddb3a12f7c616f576c8e21",
        "size": 2291,
        "format": "http://edamontology.org/format_3752",
        "path": "/data2/hdhondge/CroMaSt/Results2/align_Struct_analysis.csv"
    },
    "all_domain_like": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/domain_like_structures.json",
        "basename": "domain_like_structures.json",
        "class": "File",
        "checksum": "sha1$e191e6e3abca6c918c63ade666a2a17a81121403",
        "size": 31693,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/domain_like_structures.json"
    },
    "all_failed_domains": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/failed_domains_list.json",
        "basename": "failed_domains_list.json",
        "class": "File",
        "checksum": "sha1$b15cca0131877e09d2414c448cf0f62f459b14b4",
        "size": 9616,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/failed_domains_list.json"
    },
    "allmap_cath": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/cath_crossMapped_pfam.jsonx",
        "basename": "cath_crossMapped_pfam.jsonx",
        "class": "File",
        "checksum": "sha1$bf21a9e8fbc5a3846fb05b4fa0859e0917b2202f",
        "size": 2,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/cath_crossMapped_pfam.jsonx"
    },
    "allmap_pfam": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/pfam_crossMapped_cath.jsonx",
        "basename": "pfam_crossMapped_cath.jsonx",
        "class": "File",
        "checksum": "sha1$bf21a9e8fbc5a3846fb05b4fa0859e0917b2202f",
        "size": 2,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/pfam_crossMapped_cath.jsonx"
    },
    "avg_alignment_result": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/align_Struct_analysis.csv_2",
        "basename": "align_Struct_analysis.csv",
        "class": "File",
        "checksum": "sha1$dba86745697ceeb3e546e4eb218eb58a703a184e",
        "size": 630,
        "format": "http://edamontology.org/format_3752",
        "path": "/data2/hdhondge/CroMaSt/Results2/align_Struct_analysis.csv_2"
    },
    "cath_crossmap_pfam_avg": ,
    "core_domains_list": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/coreDomains.json",
        "basename": "coreDomains.json",
        "class": "File",
        "checksum": "sha1$8e9686a3cb95b4c0423cdf0d9538fb0db09572f0",
        "size": 51040,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/coreDomains.json"
    },
    "core_structure": null,
    "crossmap_cath": ,
    "crossmap_pfam": ,
    "crossmapped_cath_passed": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/crossmapped_cath_passed.json",
        "basename": "crossmapped_cath_passed.json",
        "class": "File",
        "checksum": "sha1$bf21a9e8fbc5a3846fb05b4fa0859e0917b2202f",
        "size": 2,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/crossmapped_cath_passed.json"
    },
    "crossmapped_pfam_passed": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/crossmapped_pfam_passed.json",
        "basename": "crossmapped_pfam_passed.json",
        "class": "File",
        "checksum": "sha1$bf21a9e8fbc5a3846fb05b4fa0859e0917b2202f",
        "size": 2,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/crossmapped_pfam_passed.json"
    },
    "crossres_mappedcath": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/cath_res_crossMapped.csv",
        "basename": "cath_res_crossMapped.csv",
        "class": "File",
        "checksum": "sha1$1e6f28b9b3a46bf26bcc1b5db72b9af30b99bf64",
        "size": 4209,
        "format": "http://edamontology.org/format_3752",
        "path": "/data2/hdhondge/CroMaSt/Results2/cath_res_crossMapped.csv"
    },
    "crossres_mappedpfam": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/pfam_res_crossMapped.csv",
        "basename": "pfam_res_crossMapped.csv",
        "class": "File",
        "checksum": "sha1$e94de2557620c3aca8954ca10613e4832cc31fa8",
        "size": 5526,
        "format": "http://edamontology.org/format_3752",
        "path": "/data2/hdhondge/CroMaSt/Results2/pfam_res_crossMapped.csv"
    },
    "family_ids_x": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/family_ids.json",
        "basename": "family_ids.json",
        "class": "File",
        "checksum": "sha1$3d12faf52957d5247dd74c31d41a96a6fbe1f7f8",
        "size": 380,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/family_ids.json"
    },
    "next_parmfile": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/new_param.yml",
        "basename": "new_param.yml",
        "class": "File",
        "checksum": "sha1$77d6f1be66b9250a28b22839437e1f19bc308ead",
        "size": 2209,
        "format": "EDAM - Bioscientific data analysis ontology - YAML - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/new_param.yml"
    },
    "pfam_crossmap_cath_avg": ,
    "reslost_cath": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/cath_lost_resmap_domain_StIs.json",
        "basename": "cath_lost_resmap_domain_StIs.json",
        "class": "File",
        "checksum": "sha1$d51cbfa7d4c3dc2549c684225aa38c9591462499",
        "size": 59,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/cath_lost_resmap_domain_StIs.json"
    },
    "reslost_pfam": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/pfam_lost_resmap_domain_StIs.json",
        "basename": "pfam_lost_resmap_domain_StIs.json",
        "class": "File",
        "checksum": "sha1$d51cbfa7d4c3dc2549c684225aa38c9591462499",
        "size": 59,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/pfam_lost_resmap_domain_StIs.json"
    },
    "resmapped_cath": null,
    "resmapped_pfam": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/pfam_resmapped_domain_StIs.csv",
        "basename": "pfam_resmapped_domain_StIs.csv",
        "class": "File",
        "checksum": "sha1$d681f02920a46b8a822c697688e77f5304f918d0",
        "size": 5527,
        "format": "http://edamontology.org/format_3752",
        "path": "/data2/hdhondge/CroMaSt/Results2/pfam_resmapped_domain_StIs.csv"
    },
    "true_domains": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/true_domains.json",
        "basename": "true_domains.json",
        "class": "File",
        "checksum": "sha1$be55bc64e52d35333280cca4e82f5f9e58262d8f",
        "size": 149282,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/true_domains.json"
    },
    "unmap_cath": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/cath_unq_unmapped.jsonx",
        "basename": "cath_unq_unmapped.jsonx",
        "class": "File",
        "checksum": "sha1$c99352b290c7b0b80192d494f88a3b55608d6694",
        "size": 30,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/cath_unq_unmapped.jsonx"
    },
    "unmap_cath_failed": null,
    "unmap_cath_passed": null,
    "unmap_pfam": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/pfam_unq_unmapped.jsonx",
        "basename": "pfam_unq_unmapped.jsonx",
        "class": "File",
        "checksum": "sha1$27687398f6c69d082b333f6b57fcd1c11353184e",
        "size": 1295,
        "format": "EDAM - Bioscientific data analysis ontology - JSON - Classes | NCBO BioPortal",
        "path": "/data2/hdhondge/CroMaSt/Results2/pfam_unq_unmapped.jsonx"
    },
    "unmap_pfam_failed": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/pfam_unmapped_failed_structs.csv",
        "basename": "pfam_unmapped_failed_structs.csv",
        "class": "File",
        "checksum": "sha1$af68f8216c7deaf47642eddaf7ad407f09a7d2f8",
        "size": 139,
        "format": "http://edamontology.org/format_3752",
        "path": "/data2/hdhondge/CroMaSt/Results2/pfam_unmapped_failed_structs.csv"
    },
    "unmap_pfam_passed": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/pfam_unmapped_passed_structs.csv",
        "basename": "pfam_unmapped_passed_structs.csv",
        "class": "File",
        "checksum": "sha1$87923f1963875f5d228799c7aef3cb00e1d50ad2",
        "size": 1161,
        "format": "http://edamontology.org/format_3752",
        "path": "/data2/hdhondge/CroMaSt/Results2/pfam_unmapped_passed_structs.csv"
    }
}

The log was quite long, so I had to split it into parts. :)

Update:

I tried the --provenance option with a single tool from the same workflow, followed by the runcrate convert command to generate the RO-Crate, and it worked! So might there be something fishy with the workflow itself?
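
To narrow down how the workflow run differs from the single-tool run, it may help to check whether each provenance folder mentions the wf:main entity that runcrate complained about. Below is a minimal sketch, assuming the provenance files sit under metadata/provenance/ as in the error message. The function name files_mentioning is hypothetical, and a plain text search is only a rough hint, not how runcrate itself resolves entities.

```python
from pathlib import Path


def files_mentioning(ro_dir, needle="wf:main"):
    """Return names of provenance files under ro_dir that contain `needle`.

    Assumes the layout seen in the runcrate error message:
    <ro_dir>/metadata/provenance/primary.cwlprov.*
    """
    prov_dir = Path(ro_dir) / "metadata" / "provenance"
    hits = []
    for path in sorted(prov_dir.glob("primary.cwlprov.*")):
        # Plain substring search; serialisations differ, so this is only a hint.
        text = path.read_text(encoding="utf-8", errors="replace")
        if needle in text:
            hits.append(path.name)
    return hits


if __name__ == "__main__":
    # Folder names taken from the two runs in this thread; adjust as needed.
    for ro in ("Run_prov", "Run_prov2"):
        print(ro, files_mentioning(ro))
```

If the single-tool folder lists files and the workflow folder lists none, that would suggest the workflow-level provenance was never written, which would fit the runtimeContext warning from the first run.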

Tool Name: separate_pfam.cwl

Log for Tool
[2023-06-19 16:10:25] INFO /users/hdhondge/miniconda3/envs/CroMaSt/bin/cwltool 3.1.20230601100705
[2023-06-19 16:10:25] INFO [cwltool] /users/hdhondge/miniconda3/envs/CroMaSt/bin/cwltool --timestamps --provenance Run_prov2/ --outdir=/data2/hdhondge/CroMaSt/Results2/ Tools/separate_pfam.cwl yml/CroMaSt_input.yml
[2023-06-19 16:10:25] INFO Resolved 'Tools/separate_pfam.cwl' to 'file:///local/data2/hdhondge/CroMaSt/Tools/separate_pfam.cwl'
[2023-06-19 16:12:27] INFO [job separate_pfam.cwl] /tmp/n3ogdty0$ python3 \
    separate_pfam.py \
    -l \
    31 \
    -d \
    /tmp/n3ogdty0/obsolete_PDB_entry_ids.txt \
    -o \
    obsolete_pfam.txt \
    -p \
    /tmp/n3ogdty0/pdbmap \
    -n \
    Filtered_Pfam.csv \
    -s \
    part.csv \
    -f \
    /tmp/8bcctq_k/stg55b78be9-0ce6-43d6-874c-e910e905b71c/fam_ids.json
/tmp/n3ogdty0/separate_pfam.py:73: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  pfam_df = pfam_df.append({'PDB_id': line[0], 'Chain_id': line[1], \
/tmp/n3ogdty0/separate_pfam.py:73: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  pfam_df = pfam_df.append({'PDB_id': line[0], 'Chain_id': line[1], \
/tmp/n3ogdty0/separate_pfam.py:73: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  pfam_df = pfam_df.append({'PDB_id': line[0], 'Chain_id': line[1], \
/tmp/n3ogdty0/separate_pfam.py:73: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  pfam_df = pfam_df.append({'PDB_id': line[0], 'Chain_id': line[1], \
[2023-06-19 16:14:13] INFO [job separate_pfam.cwl] Max memory used: 53MiB
[2023-06-19 16:14:13] INFO [job separate_pfam.cwl] completed success
/users/hdhondge/miniconda3/envs/CroMaSt/lib/python3.10/site-packages/rdflib/plugins/serializers/nt.py:40: UserWarning: NTSerializer always uses UTF-8 encoding. Given encoding was: None
  warnings.warn(
{
    "pfam_obs": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/obsolete_pfam.txt",
        "basename": "obsolete_pfam.txt",
        "class": "File",
        "checksum": "sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709",
        "size": 0,
        "format": "http://edamontology.org/format_2330",
        "path": "/data2/hdhondge/CroMaSt/Results2/obsolete_pfam.txt"
    },
    "pfam_structs": {
        "location": "file:///data2/hdhondge/CroMaSt/Results2/Filtered_Pfam.csv",
        "basename": "Filtered_Pfam.csv",
        "class": "File",
        "checksum": "sha1$8fe3f0b3c50a03ee572f1da19afca2e809a99f69",
        "size": 194,
        "format": "http://edamontology.org/format_3752",
        "path": "/data2/hdhondge/CroMaSt/Results2/Filtered_Pfam.csv"
    },
    "splitted_pfam_sep": [
        {
            "location": "file:///data2/hdhondge/CroMaSt/Results2/0_part.csv",
            "basename": "0_part.csv",
            "class": "File",
            "checksum": "sha1$8fe3f0b3c50a03ee572f1da19afca2e809a99f69",
            "size": 194,
            "format": "http://edamontology.org/format_3752",
            "path": "/data2/hdhondge/CroMaSt/Results2/0_part.csv"
        }
    ]
}
[2023-06-19 16:14:36] INFO Final process status is success
[2023-06-19 16:14:36] INFO [provenance] Finalizing Research Object
[2023-06-19 16:14:37] INFO [provenance] Deleting existing /local/data2/hdhondge/CroMaSt/Run_prov2
[2023-06-19 16:16:25] INFO [provenance] Research Object saved to /local/data2/hdhondge/CroMaSt/Run_prov2
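
As an aside, the FutureWarning in the log above comes from DataFrame.append being called in separate_pfam.py. The script itself is not shown here, so the following is only a sketch of the usual replacement: collect the rows in a list and build the frame once, or use pandas.concat as the warning suggests. The column names PDB_id and Chain_id come from the warning text; the input lines and both function names are assumptions.

```python
import pandas as pd


def rows_to_frame(lines):
    """Build a DataFrame from parsed lines without DataFrame.append."""
    # Collect plain dicts first; growing a DataFrame row by row is slow.
    rows = [{"PDB_id": line[0], "Chain_id": line[1]} for line in lines]
    return pd.DataFrame(rows, columns=["PDB_id", "Chain_id"])


def add_rows(existing, lines):
    """Add new rows to an existing frame using pandas.concat."""
    return pd.concat([existing, rows_to_frame(lines)], ignore_index=True)


if __name__ == "__main__":
    df = rows_to_frame([("1abc", "A"), ("2xyz", "B")])
    df = add_rows(df, [("3def", "C")])
    print(df)
```

This is unrelated to the RO-Crate error, but it would quiet the repeated warnings and keep the script working once append is removed from pandas.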