Senior Observability and Monitoring Engineer
by Tabby in FinTech & Digital Payments
The Senior Observability and Monitoring Engineer designs, operates, and evolves company-wide observability platforms in a high-scale fintech environment. The role supports a strategic migration from Datadog-based observability (logs, metrics, APM, RUM, dashboards, alerts) to a self-hosted stack centered on Elastic Enterprise, while maintaining full compliance with regulatory requirements for audit, access, application, and database logging. The engineer also supports the gradual decommissioning of legacy Datadog infrastructure.

Day-to-day responsibilities include managing Elastic Enterprise clusters, with a focus on index lifecycle management, scaling, retention, access control, and backups, and building and maintaining log ingestion pipelines with Fluentd, Fluent Bit, Logstash, and Beats. The engineer defines SLIs, SLOs, and error budgets to improve system reliability, and implements APM and metrics solutions using tools such as Prometheus, VictoriaMetrics, Mimir, Grafana Loki, and Grafana Tempo. Infrastructure is automated with Terraform, Helm, and GitOps tools such as FluxCD and ArgoCD.

The engineer collaborates with SRE, DevOps, and Security teams to ensure observability systems integrate with SOC2 compliance frameworks and SIEM platforms. The role requires strong experience in observability system design, monitoring at scale, and distributed system visibility. The engineer also contributes architecture documentation, data flow mapping, and observability best practices across a global fintech organization that handles large-scale transaction systems and compliance-driven logging requirements.