Site Reliability Engineer – UKIC (South)

On Site : 1
Contract Rate : 550
Contract Job : 1
Salary range high : 800
Salary range low : 550

Location: South of the UK / Hybrid
Clearance: Must hold active UKIC (South) clearance

The Role

We’re supporting a requirement for several Site Reliability Engineers to join a high-assurance engineering environment that delivers secure, resilient digital services within a sensitive UK government setting.

This role would suit someone with a strong background in production reliability, platform operations, automation, observability and incident response who is comfortable working across complex cloud-based or hybrid infrastructure. You’ll play a key role in keeping services robust, scalable and supportable, while driving improvements in reliability, performance and operational maturity.

Responsibilities

  • Support the availability, performance and resilience of critical live services
  • Build and improve automation across operational processes, deployments and platform management
  • Design and maintain monitoring, alerting and observability tooling across services and infrastructure
  • Troubleshoot complex incidents, conduct root cause analysis, and implement preventative improvements
  • Work closely with engineering, platform and delivery teams to improve reliability and reduce operational risk
  • Contribute to capacity planning, service scaling, failover readiness and disaster recovery approaches
  • Help shape SRE best practice, including SLIs, SLOs, error budgets and operational standards
  • Support continuous improvement across CI/CD, release management and operational tooling

Experience Required

  • Strong experience in a Site Reliability Engineering, DevOps, or production support engineering role
  • Experience supporting business-critical live services in secure or complex environments
  • Strong understanding of Linux/Unix systems, networking fundamentals, and infrastructure troubleshooting
  • Experience with cloud platforms such as AWS, Azure or GCP
  • Hands-on experience with Infrastructure as Code, ideally Terraform or similar tooling
  • Experience with containers and orchestration, such as Docker and Kubernetes
  • Knowledge of CI/CD tooling and automated deployment pipelines
  • Strong experience with monitoring and observability tools such as Prometheus, Grafana, ELK, Datadog, Splunk or similar
  • Scripting or coding capability in tools/languages such as Python, Bash, Go or similar
  • Strong incident management, problem-solving and stakeholder communication skills
