Bioinformatics analysis of single-cell RNA sequencing using CWL

Presenter: Michael Kotliar

Session 3 (APAC-Americas), Friday, March 4th, 02:05 UTC

Summary: Here, we outline the bioinformatics processing of single-cell RNA sequencing (scRNA-Seq) data and the clustering of the heterogeneous cellular populations that make up pancreatic tumors, using Common Workflow Language (CWL) pipelines. In the original paper (Gabitova-Cornell et al., 2020), the scRNA-Seq data were analyzed manually with command-line tools and R. However, because tool versions, libraries, and execution environments change, simply repeating the same sequence of commands is likely to produce different results for different users. To ensure the reproducibility and portability of our analytic approach, we converted the analysis into CWL pipelines and executed them on the user-friendly Scientific Data Analysis Platform (SciDAP, https://scidap.com). The open-source CWL pipelines used here are available on GitHub in the repositories Barski-lab/sc-seq-analysis (CWL toolkit for single-cell sequencing data analysis) and datirium/workflows (CWL-based bioinformatics workflows). As a workflow runner we used CWL-Airflow (Kotliar et al., 2019); however, the same pipelines can be executed in any other CWL-based execution environment. For details, refer to https://doi.org/10.1016/j.xpro.2021.100989 (Surumbayeva, Kotliar et al., 2021).

Please leave your questions below!

As an alternative to YouTube, this presentation is also available on ConfTube.