SITE RELIABILITY ENGINEER

Company
Altmetric
Location

Remote | London

Closing Date

No closing date

We’re looking for a talented Site Reliability Engineer to join our growing team!

What we do

Altmetric analyses the online activity around scholarly content to measure the broader impact of science and research. We deliver and support products such as the Altmetric Explorer and the Altmetric badges. Our customers include institutions across Europe, North America and Australia and scholarly publishers such as Springer Nature, Wiley, Taylor & Francis and MIT Press.

How we do it

As part of our engineering team you’ll work with an infrastructure that processes hundreds of thousands of social media posts and serves over 50 million API requests every single day. With the help of the team, you will be responsible for the stability and performance of our applications and will contribute to projects aimed at improving how we continuously deliver our products whilst ensuring the security and integrity of our systems.

The vast majority of our codebase is Ruby and PHP, managed with GitHub, built and tested with CircleCI and deployed to Linux (Ubuntu) based machines using Docker and HashiCorp Nomad & Consul. We run several high-throughput public-facing web applications at scale using Nginx and a number of CDNs, backed by a diverse set of Redis, MongoDB, Elasticsearch and PostgreSQL database clusters. We monitor the performance and reliability of our entire stack using Sensu, Grafana and InfluxDB. We’ve also found configuration management (Chef in our case, but we really like Puppet/Ansible/Terraform as well) and knowledge of a scripting or programming language to be very helpful in automating various aspects of what we do and reducing the boring, repetitive work we dislike.

During the work day, we use Kanban to keep track of our work and Slack and Zoom to communicate.

As the products we support are widely used across all time zones, the engineering team is also an integral part of the on-call rota, which helps keep our applications available around the clock. The PagerDuty rota is financially compensated, and its scheduling and alerting are continuously reviewed and improved to protect the work-life balance of everyone involved, which is a very important aspect of everything we do at Altmetric.

We have some exciting projects coming up, which we expect to include migrating our entire setup to the cloud. Strong experience with migrating to and/or operating in a cloud environment will definitely be valuable here.

If working with the above sounds like it might be your cup of tea (or coffee), or alternatively if you have good experience with some technology that you think would be really useful to us and blow our minds, do get in touch. Ultimately and most of all, we’re looking for people who are keen to learn, teach and be flexible in their approach.

Requirements

You’ll need production experience with:

  • General operation/management/security for Linux-based systems (particularly Debian-based operating systems such as Ubuntu)
  • Cloud-based infrastructure/services and deployments (AWS/GCP preferred)
  • A programming/scripting language (such as Bash, Ruby or Python)
  • CI/CD systems (GitHub, CircleCI, Jenkins)
  • Container management/orchestration (Docker, HashiCorp Nomad/Consul)
  • Operating web applications (such as PHP, Ruby on Rails or Django applications), preferably in a high-throughput environment
  • Operating terabyte-level database clusters (Redis, MongoDB, Elasticsearch, PostgreSQL)
  • Server and resource monitoring (Sensu, Grafana, InfluxDB)
  • Configuration management software (such as Chef or Puppet)

Bonus points for experience with:

  • HashiCorp Terraform

Ultimately and above all we’re looking for people who are keen to learn and flexible in their approach.

Benefits

Our offices are currently based in The Smithson building, in Clerkenwell, London. As a portfolio company of Digital Science, we share our office with other scientific start-ups including figshare, Overleaf & Symplectic. Most of our developers and engineers work remotely around the UK and Europe, but the office is always open to you whenever you’d like to come in. (Outside of these unusual times, of course.)

As a company, work-life balance is very important to us: we have flexible working hours and our teams have been set up to work well remotely for a number of years now – we’re quite good at it.

To create time for personal development, we hold “hackdays” every month for team members to explore new topics and technologies and work with people outside their usual product team. These projects range from building a prototype or experimenting with a new technology to taking online training or just reading that software development book you never get around to during the week.

We offer a competitive market-rate salary, and all members of the team are provided with a personal laptop (tailored to individual requirements) and an annual training & conference budget that includes international travel. Our wider benefits package also includes private pension contribution, life insurance and income assurance, a travel loan, and other discounts and contributions to improve the quality of life and work for our colleagues.

We’re proud to be an equal opportunity employer, which has given us a wide diversity of backgrounds throughout the team. This is something that Louise (our VP of Engineering), Lewis (our CTO) and Kathy (our CEO) have always strived for and we’re certainly pleased with the progress so far.

If this role interests you, please apply here.

© 2022 Digital Science & Research Solutions Ltd. All Rights Reserved