ZainTECH | AI & Machine Learning | 2h ago

Data & AI Operations Specialist

India
Full-time
Not Disclosed

Job Description

Role Overview: The Data & AI Operations Specialist serves as the Level 3 technical lead for the Artificial Intelligence and Data Platform estate. You will be responsible for the architecture, engineering, and advanced troubleshooting of AI infrastructure, data pipelines, and MLOps lifecycles across a multi-cloud environment (Azure and OCI).

Responsibilities

  • AI Infrastructure & Platform Engineering: Maintain the monitoring architecture and configure advanced dashboards (Grafana/Azure Monitor). Manage Azure Machine Learning (AML) workspaces, compute targets, and Databricks cluster lifecycles. Oversee GPU resource allocation and optimize cost-performance. Ensure all AI services use private endpoints, VNET integration, and RBAC controls.
  • Data Pipeline & ETL Management: Own the design, optimization, and remediation of Azure Data Factory (ADF) and Synapse pipelines. Resolve complex bottlenecks related to authentication failures, data format changes, and ETL performance. Author SOPs for the L1 NOC team.
  • MLOps & Model Lifecycle: Implement CI/CD pipelines for model training, testing, and deployment to AML endpoints. Configure data drift detection thresholds and automated retraining triggers. Develop self-healing scripts and automated recovery runbooks.
  • Governance & Compliance: Implement and maintain audit logging for AI decisions and model outputs (SIEM/vSOC). Conduct quarterly AI governance reviews to ensure compliance with NESA standards and data privacy guidelines.

Requirements

  • AI/ML Platforms: Deep expertise in Azure Machine Learning and Databricks.
  • Data Integration: Proficiency in Azure Data Factory and Synapse.
  • Infrastructure-as-Code (IaC): Experience with Terraform or ARM Templates for reproducible deployments.
  • Observability & Containerization: Ability to use Dynatrace, Grafana, Azure Monitor, AKS, Istio Service Mesh, and KEDA.
  • ITIL Mastery: Strong understanding of ITIL-aligned Incident, Change, and Problem management.
  • Security Mindset & Technical Writing: Familiarity with NESA standards and UAE data residency requirements, and the ability to draft complex SOPs and RCAs within 48 hours.
  • Certifications: Microsoft Azure Data Scientist Associate or Azure AI Engineer Associate is highly preferred.
