Bioinformatics Systems Engineer, EMBL-EBI


Grading: Grade 5 - 6 (monthly salary starting at £2,809 - £3,143 after tax) + other paid benefits

Closing date: 6 March 2022

  • Contract duration: This position is a contract-based role that will end on 30/04/2024.

  • International applicants: We recruit internationally and successful candidates are offered visa exemptions. Read more on our page for international applicants.

  • Diversity and inclusion: At EMBL-EBI, we strongly believe that inclusive and diverse teams benefit from higher levels of innovation and creative thought. We encourage applications from women, LGBTQ+ individuals and individuals of all nationalities.

  • Job location: This role is based in Hinxton, near Cambridge, UK. You will be required to relocate if you are based overseas and you will receive a generous relocation package to support you.

  • How to apply: To apply, please submit a cover letter and a CV through our online system.

Reference number: EBI01988

We are looking for an innovative technical engineer who can help develop and deploy the workflow and data infrastructure that is becoming increasingly fundamental for our high-throughput life sciences research.

You will work in a team within the IT department that engages with teams across the institute to bring new workflow-related technologies into production.

Your role

You will primarily engineer workflow and data infrastructure systems and standards in order to demonstrate the scalability and performance of these systems alongside our local approaches. You largely won’t be building things from scratch - you’ll be collaborating with developers situated inside and outside of EMBL-EBI on a variety of open-source projects.

The workflow systems we are interested in are those that are established and popular in the bioinformatics space with the potential for long-term sustainability. These include projects such as Nextflow, Snakemake, Galaxy, Toil and Arvados. On the data infrastructure side, the engineering doesn’t fall so neatly into specific projects. However, to give one example, we’re interested in Refgenie, an emerging system to store, access and transfer genome resources.

You may also work on standards such as the Common Workflow Language (CWL) and emerging workflow, tools and data APIs from the Global Alliance for Genomics and Health (GA4GH), such as the Workflow Execution Service (WES) and the Data Repository Service (DRS).

Projects and standards are important only insofar as they can form integrated infrastructure that can carry out research. So you’ll not only help engineer systems but also put them together into new end-to-end data processing solutions, or use your knowledge to improve our existing installations. The use cases are many and varied - for example to run QA processing on genome sequences submitted to EMBL-EBI archives or to process distributed biomedical data hosted in different countries. You will engage with the relevant teams within EMBL-EBI and the wider bioinformatics community to drive improvements and address issues as they are identified.

This is where your bioinformatics experience comes in as it will give you insight into the problems that our service and research teams are really trying to solve. However, please be aware that this is an engineering role with a bioinformatics flavour rather than a bioinformatics role with an engineering flavour. Your core duties will always be on the engineering side of the workflow and data infrastructure systems rather than the pure bioinformatics side.

You’ll also exercise your cloud skills as infrastructure is increasingly hosted on internal clouds, external clouds, and in a hybrid model. You’ll have a good knowledge of Kubernetes, Docker and associated projects such as Terraform and Prometheus, and you’ll be able to extend your skills as opportunities arise.

We’ve named a lot of technologies in this text and there are a few hard requirements in the section below. However, your overriding skills will be your ability to collaborate intelligently and work autonomously. You’ll be asked by your manager to tackle specific projects from time to time but you’ll also have the initiative to formulate your own work proposals within the general requirements of the cluster. You’ll attend project calls and sometimes physically travel outside the UK to external project meetings and events such as hackathons.

From time to time as requested you’ll also carry out other technical duties such as reviewing technical proposals and presenting your work at meetings.

You have

  • A bachelor’s degree or higher in bioinformatics, computer science, software engineering, or equivalent experience.

  • Experience working with bioinformatics workflows. You do not necessarily need to have created workflows but you must be able to demonstrate an understanding of components commonly used within them and the results they are trying to achieve.

  • Experience working with at least one of the workflow systems commonly used in bioinformatics such as Nextflow, Snakemake and Galaxy.

  • Contributed to projects used by open-source communities. You must be able to demonstrate pull request acceptance and evidence of collaborative discussion.

  • Developed software in Python or Java.

  • Created containerised software in Docker or Singularity and deployed software using container orchestration tools such as Kubernetes.

  • Deployed software on cloud infrastructures such as GCP or AWS.

  • Good verbal and written communication skills in English.

You might also have

  • Contributed to open-source workflow systems such as Nextflow, Snakemake and Galaxy. This need not be code - equally valid are, for example, contributions of documentation, substantive discussion and organization of builds and releases.

  • Worked on scientific projects and collaborations where participants are distributed over a number of different organisations and countries. For example, you may have experience working with the ELIXIR distributed infrastructure for life science data or in the life science community through organisations such as GA4GH.

  • Integrated individual components (e.g. microservices) into a larger system using facilities such as message queueing and REST APIs.

  • Experience working with continuous integration systems such as Jenkins or Travis.