
Urgent! Senior MLOps Engineer Job Opening In Dubai – Now Hiring Open Data Science

Senior MLOps Engineer



Job description

Brief description of the vacancy

We are seeking a Senior MLOps Engineer with proven experience in deploying and managing large-scale ML infrastructure for LLMs, TTS, STT, Stable Diffusion, and other GPU-intensive models in production.

You will lead the design and operation of cost-efficient, high-availability, and high-performance serving stacks in a Kubernetes-based AWS environment.

About the company

Company: Identity AI Labs

A fast-growing and well-funded AI startup in the UAE.

The company's mission is to redefine how humans interact with AI through emotionally intelligent, relationship-focused technology.

Responsibilities

  • You will architect, deploy, and maintain scalable ML infrastructure on AWS EKS using Terraform and Helm.

  • You will own end-to-end model deployment pipelines for LLMs, diffusion models (LDM/Stable Diffusion), and other generative/AI models requiring high GPU throughput.

  • You will design cost-effective, auto-scaling serving systems using tools like Triton Inference Server, vLLM, Ray Serve, or similar model-serving frameworks.

  • You will build and maintain CI/CD pipelines integrating the ML model lifecycle (training → validation → packaging → deployment).

  • You will optimize GPU resource utilization and implement job orchestration with frameworks like KServe, Kubeflow, or custom workloads on EKS.

  • You will deploy and manage FluxCD (or ArgoCD) for GitOps-based deployment and environment promotion.

  • You will implement robust monitoring, logging, and alerting for model health and infrastructure performance (Prometheus, Grafana, Loki).

  • You will collaborate closely with ML Engineers and Software Engineers to ensure smooth integration, observability, and feedback loops.

Requirements
  • 2–3 years of experience with model serving frameworks such as Triton, vLLM, Ray Serve, TorchServe, or similar.

  • 2–3 years of experience deploying and optimizing LLMs and LDMs (e.g., Stable Diffusion) under high load with GPU-aware scaling.

  • 3–4 years of experience with Kubernetes (EKS) and infrastructure-as-code (Terraform, Helm).

  • 4–5 years of hands-on software engineering experience in Python, with production-grade experience in the ML model lifecycle.

  • Nice to have: familiarity with Go or Rust for backend or performance-critical systems.

Working conditions

Full-time job in the Dubai office, with official employment and a full relocation package.


