
Trapeze
Senior DevOps Engineer (Java & Cloud Infrastructure)
- Permanent
- Dubai, United Arab Emirates
- Experience 5 - 10 yrs
Job expiry date: 24/03/2026
Job overview
Date posted: 07/02/2026
Location: Dubai, United Arab Emirates
Salary: Undisclosed
Compensation: Comprehensive package
Experience: 5 - 10 yrs
Seniority: Senior & Lead
Qualification: Bachelor's degree
Expiration date: 24/03/2026
Job description
The Senior DevOps Engineer drives infrastructure excellence, reliability, and automation across on-premises and cloud platforms in Dubai. The role carries end-to-end ownership of CI/CD pipelines, hybrid deployments, containerized Java services, and automated testing frameworks. On the infrastructure side, responsibilities include orchestrating and automating infrastructure using Ansible, Foreman, Satellite, PXE/Kickstart, and Terraform; building HA clusters, load balancers, and resilient storage; establishing hardened RHEL baselines with CIS, SELinux, and firewalld; and managing patching, kernel upgrades, and OS troubleshooting. The engineer implements observability via Prometheus, Grafana, and ELK with SLO/SLI-driven alerting. On the Java side, the engineer optimizes runtime environments, tunes JVMs (G1/ZGC, heap sizing, thread pools), and profiles applications with JFR, Async-Profiler, and APM tools such as Dynatrace, New Relic, and AppDynamics. The engineer also leads automated load testing, penetration testing, zero-downtime deployments, and container orchestration on Kubernetes/OpenShift. Cloud expertise across Azure, AWS, and GCP is required, covering identity, networking, security, logging, cost governance, and hybrid connectivity (ExpressRoute, Direct Connect, and VPNs). Throughout, the role emphasizes reliability, security, incident response, and compliance.
Key responsibilities
- Automate on-prem infrastructure at scale using Ansible, Foreman, Satellite, PXE/Kickstart, and Terraform
- Build HA clusters, load balancers, and resilient storage for high-performance workloads
- Establish hardened RHEL baselines (CIS, SELinux, firewalld) and manage lifecycle patching
- Troubleshoot OS issues including systemd, cgroups, I/O schedulers, and NIC offloads, and perform capacity planning
- Implement observability with Prometheus, Grafana, and ELK, backed by SLO/SLI-driven alerting and actionable runbooks
- Operate and optimize Java application servers (JBoss, WildFly, Tomcat, WebLogic) and tune JVM parameters
- Profile applications with JFR, Async-Profiler, and APM tools; conduct automated load testing with JMeter, Gatling, or k6
- Implement automated penetration testing and vulnerability scanning integrated into CI/CD pipelines
- Engineer zero-downtime deployments using blue/green and canary strategies, and tune performance
- Containerize Java services with Docker/Podman and deploy on Kubernetes/OpenShift clusters
- Build and maintain secure CI/CD pipelines using Jenkins and Azure DevOps, with artifact management (Nexus/Artifactory), SBOMs, and image signing
- Design and operate workloads across Azure, AWS, and GCP with consistent patterns for identity, networking, security, logging, and cost governance
- Build hybrid connectivity between on-prem and cloud estates using ExpressRoute, Direct Connect, Cloud Interconnect, and VPNs
- Lead incident response and post-mortems to improve MTTR and overall system reliability
- Implement robust secrets management, PKI/certificate rotation, and least-privilege access; support audits and compliance
Experience & skills
- 8+ years of hands-on DevOps or Site Reliability Engineering experience
- Strong Java runtime expertise including JVM/GC tuning, thread and heap diagnostics, and application server operations
- Experience operating on-premises environments, including VMware
- DevOps certification in Azure, AWS, or GCP is advantageous
- Proficiency in Ansible, Terraform, Foreman, Satellite, and PXE/Kickstart
- Experience with containerization and orchestration tools including Docker, Podman, Kubernetes, and OpenShift
- Experience with CI/CD tools including Jenkins and Azure DevOps
- Experience with monitoring and observability tools such as Prometheus, Grafana, and the ELK Stack
- Experience in automated load testing (JMeter, Gatling, k6) and application performance monitoring (Dynatrace, New Relic, AppDynamics)
- Experience with security and compliance tooling including OWASP ZAP, Burp Suite, and Snyk
- Knowledge of hybrid cloud architectures across Azure, AWS, and GCP including networking, identity, and security management
- Ability to troubleshoot complex systems, optimize performance, and implement scalable, secure solutions