Virtusa

AI/ML Engineer

Permanent
Dubai, United Arab Emirates
Experience 5 - 10 yrs

Job expiry date: 19/09/2025

Job overview

Date posted
05/08/2025
Location
Dubai, United Arab Emirates
Salary
AED 20,000 - 30,000 per month
Compensation
Comprehensive package
Experience
5 - 10 yrs
Seniority
Experienced
Qualification
Bachelor's degree
Expiration date
19/09/2025

Job description

We are looking for an AI/ML-focused Data Engineer with deep expertise in building intelligent data pipelines for unstructured content and integrating with modern machine learning ecosystems. The ideal candidate will be hands-on in PySpark and Python, and experienced in document classification, cleansing, and building AI-first applications using LLMs, vector databases, and RAG frameworks. This role bridges data engineering and machine learning across the enterprise.

Required skills

PySpark

Python

document classification

data cleansing

quality metrics

LLMs

LangChain

Transformers

Hugging Face

FAISS

vector databases

Redis

RAG frameworks

document chunking

metadata tagging

semantic search

OCR

NLP

CI/CD

agile methodologies

Key responsibilities

Build robust, scalable data processing pipelines for unstructured documents using PySpark and Python.
Implement document cleansing, classification, and enrichment to prepare data for AI/ML applications.
Develop and integrate data workflows into LLM pipelines and support RAG architectures.
Chunk documents, generate vector embeddings, and apply metadata tagging for semantic search and question-answering (QA) systems.
Collaborate with AI architects, data engineers, and platform teams to design end-to-end AI solutions.
Communicate pipeline quality, data readiness, and model integration strategies to stakeholders.
Apply Agile and CI/CD practices to continuously deliver AI capabilities.

Experience & skills

5+ years of commercial experience, with at least 2 years in a relevant AI/ML role.
Strong proficiency in PySpark and distributed data frameworks.
Solid experience in Python and ML/AI libraries (e.g., Transformers, LangChain, Hugging Face, FAISS).
Expertise in processing unstructured data, including OCR, NLP, classification, and tagging.
Familiarity with vector databases such as Redis, and with embedding models for RAG pipelines.
Understanding of the LLM lifecycle, including fine-tuning, inference, and prompt engineering.
Experience in agile environments and working with cross-functional teams.
Excellent communication skills, with the ability to engage both technical and business stakeholders.
