Lucidya

Site Reliability Engineer

Permanent
Riyadh, Saudi Arabia
Experience 2 - 5 yrs

Job overview

Date posted: 24/03/2026
Location: Riyadh, Saudi Arabia
Salary: SAR 15,000 - 20,000 per month
Compensation: Salary only
Experience: 2 - 5 yrs
Seniority: Experienced
Qualification: Bachelor's degree
Expiration date: 08/05/2026

Job description

As a Site Reliability Engineer at Lucidya in Riyadh, you will ensure the reliability, performance, and scalability of our AI-native customer experience platform. You will design, implement, and maintain highly available, fault-tolerant infrastructure across cloud environments, proactively identify potential failures, and build automation to eliminate manual operational work. You will manage Kubernetes clusters, optimize cloud resources, improve CI/CD pipelines, and implement monitoring and observability systems to prevent downtime. The role requires close collaboration with engineering and DevOps teams, responding to incidents, performing root cause analyses, and driving improvements to make our systems robust and scalable.

Required skills

Site Reliability Engineering (SRE)

DevOps

Cloud Infrastructure Management (AWS, GCP, Azure)

Infrastructure as Code (Terraform)

Kubernetes (EKS, GKE)

Docker

CI/CD Pipelines (Jenkins, GitHub Actions, Bitbucket)

Monitoring & Observability (Prometheus, Grafana, Datadog, ELK)

Incident Response & Root Cause Analysis

Automation & Scripting (Python, Bash)

Networking and Load Balancing

High Availability & Fault Tolerance Design

Distributed Systems

Performance Optimization

Cloud Resource Management

Troubleshooting & Problem Solving

Collaboration with DevOps and Engineering Teams

Technical Documentation

Key responsibilities

Design and maintain infrastructure that is highly available, fault-tolerant, and scalable
Proactively identify and eliminate single points of failure to prevent incidents
Manage cloud workloads across AWS, GCP, or Azure using Infrastructure as Code (Terraform)
Operate and scale Kubernetes clusters, troubleshoot issues, and ensure smooth deployments
Implement and refine monitoring and alerting systems (Prometheus, Grafana, Datadog, ELK)
Respond to incidents, lead root cause analysis, and implement preventive measures
Automate workflows and infrastructure management to eliminate repetitive manual tasks
Optimize cloud resource usage to balance cost and performance
Collaborate with DevOps and engineering teams to resolve performance bottlenecks
Contribute to CI/CD improvements and deployment reliability
Document infrastructure, processes, and incidents to support knowledge sharing
Identify opportunities to improve system reliability, scalability, and operational efficiency

Experience & skills

3+ years of experience in SRE, DevOps, or infrastructure engineering
Hands-on experience with cloud platforms (AWS, GCP, Azure) and distributed systems
Proficient with Kubernetes and Docker in production environments
Experience with Infrastructure as Code (Terraform or similar)
Strong scripting skills in Python, Bash, or similar languages
Understanding of CI/CD pipelines and automation
Knowledge of networking, load balancing, and high-availability design
Experience implementing monitoring and observability tools (Prometheus, Grafana, Datadog, ELK)
Ability to troubleshoot complex issues and perform root cause analysis
Calm under pressure and methodical in incident response
Excellent communication and collaboration skills
Ownership mindset and proactive approach to reliability challenges
Cloud or Linux certifications are a plus
Experience with RabbitMQ or Redis in production environments is a plus
Familiarity with Ansible or AWX is advantageous
Exposure to multi-cloud or hybrid environments is a plus
