qubitsok.com

Europe, United Kingdom, Cambridge

Posted 20 days ago

System Reliability Engineer - Quantum Computing

🏢 Quantinuum

AI Summarised

Role Type

🛠️ Engineer / Developer

Role Focus

🏗️ Build Systems

Seniority

🌿 Experienced

Employer Type

🏢 Industry

The System Reliability Engineer will maintain and enhance the Quantinuum Nexus cloud platform, ensuring its high performance, reliability, and security for quantum researchers. This role involves expert management of Kubernetes clusters, primarily Amazon EKS, and associated distributed systems infrastructure. Key duties include managing costs, optimizing performance through monitoring tools like OpenTelemetry and AWS CloudWatch, and collaborating with development teams to quickly resolve outages and issues.

Key Responsibilities

Manage the architecture, performance, security, and cost efficiency of managed Kubernetes instances like Amazon EKS and the distributed systems built upon them.

Collect logs, traces, and metrics using OpenTelemetry and make them available through AWS products such as X-Ray and CloudWatch to monitor Nexus performance and reliability.

Use monitoring data to verify that the Nexus platform meets high standards for performance and reliability, and direct the team toward improvements when necessary.

Report and monitor issues and outages as they occur, and diagnose their causes.

Collaborate closely with the development team, providing necessary information to quickly identify and resolve production issues.

Required Skills

Expert knowledge of managed Kubernetes instances such as Amazon EKS.

Experience working with distributed systems.

Proficiency using tools such as Helm, Karpenter, and k9s.

Experience collecting logs, traces, and metrics for distributed systems.

Experience using AWS CloudWatch to locate bugs and performance issues.

Experience improving declarative Infrastructure as Code tools such as Terraform.

Professional experience working with Python.

Nice-to-have Skills

Experience with PostgreSQL.

Experience working in a continuous deployment environment.

Experience with triaging and debugging issues in code.

Familiarity with the OpenTelemetry standard and SDKs.

Technology Tags

Cloud platforms

The role is centered on managing and maintaining the cloud-based Quantinuum Nexus platform using specific AWS products like EKS and CloudWatch.

Classical programming

The candidate must have professional experience working with Python for debugging and production analysis.

Programming Tools

The role explicitly requires experience with infrastructure and observability tools like Helm, Karpenter, k9s, Terraform, and OpenTelemetry.

Network integration

Expertise in working with managed Kubernetes instances and distributed systems is a core requirement for the position.

Benchmarking

The role focuses on collecting logs, traces, and metrics to ensure the platform meets high standards for performance and reliability.

Quantum Middleware

The engineer supports the Quantinuum Nexus cloud platform, which serves as an intermediary layer between researchers and quantum computers.

Quantum Runtime Software

The role involves maintaining the performance and reliability of the platform where quantum experiments and job executions occur.
