Site Reliability Engineer – UKIC (South)

South West England
Contract
Consulting
GBP 550 - 800 per day
Posted 4 weeks ago

Location: South of the UK / Hybrid
Clearance: Must hold active UKIC (South) clearance

The Role

We’re supporting a requirement for several Site Reliability Engineers to join a high-assurance engineering environment that delivers secure, resilient digital services within a sensitive UK government setting.

This role would suit someone with a strong background in production reliability, platform operations, automation, observability, and incident response, who is comfortable working across complex cloud-based or hybrid infrastructure. You’ll play a key role in ensuring services are robust, scalable and supportable, while driving improvements in reliability, performance and operational maturity.

Responsibilities

Support the availability, performance and resilience of critical live services
Build and improve automation across operational processes, deployments and platform management
Design and maintain monitoring, alerting and observability tooling across services and infrastructure
Troubleshoot complex incidents, conduct root cause analysis, and implement preventative improvements
Work closely with engineering, platform and delivery teams to improve reliability and reduce operational risk
Contribute to capacity planning, service scaling, failover readiness and disaster recovery approaches
Help shape SRE best practice, including SLIs, SLOs, error budgets and operational standards
Support continuous improvement across CI/CD, release management and operational tooling

Experience Required

Strong experience in a Site Reliability Engineering, DevOps, or production support engineering role
Experience supporting business-critical live services in secure or complex environments
Strong understanding of Linux/Unix systems, networking fundamentals, and infrastructure troubleshooting
Experience with cloud platforms such as AWS, Azure or GCP
Hands-on experience with Infrastructure as Code, ideally Terraform or similar tooling
Experience with containers and orchestration, such as Docker and Kubernetes
Knowledge of CI/CD tooling and automated deployment pipelines
Strong experience with monitoring and observability tools such as Prometheus, Grafana, ELK, Datadog, Splunk or similar
Scripting or coding capability in tools/languages such as Python, Bash, Go or similar
Strong incident management, problem-solving and stakeholder communication skills
