Cloud Storage Site Reliability Engineer F/M

Ceph is an open-source, distributed storage system with no single point of failure. It is increasingly used as a software-defined storage solution for cloud infrastructure. Our project's goal is to provide Ceph-as-a-Service, so that others can use it without having to set it up and manage it themselves. The project is best described as delivering Infrastructure as Code: we write software that sets up systems so that no manual sysadmin intervention is required. We use Python as our primary programming language and Puppet for setting up and managing systems. We also contribute all our patches upstream.

Your role:

- Developing massive, distributed storage services for cloud computing

- Developing IaaS (Infrastructure-as-a-Service) applications in Python and Puppet

- Optimizing systems with regard to efficiency

- Automating the Continuous Delivery process

- Extending monitoring capabilities

Requirements:

- 4+ years' experience with GNU/Linux systems administration, troubleshooting or performance improvements

- Very good knowledge of at least one of the following languages: Python, Puppet or Ruby

- Good command of spoken and written English

Nice to have:

- Experience with software-defined storage systems (Ceph, Gluster or similar)

- Familiarity with good software development practices (writing tests, documentation)

- Experience in working with distributed systems

- Familiarity with Python frameworks: Celery, SQLAlchemy

- Experience with kernel development

- Knowledge of C/C++ programming

- Experience with Docker and container technology

- Experience with OpenStack

- French language skills

Your team:


Our Storage team develops storage solutions used throughout OVH by our customers as well as internally. Common terminology includes NFS,…