Roles & Responsibilities
Senior Cloud Operations Engineer – AI / ML Platforms
Our client is looking for an experienced cloud specialist to lead and optimise the operation, reliability and security of its Azure-based AI platform, ensuring seamless, scalable and secure delivery of AI / ML services.
Responsibilities :
- Ensure high availability, performance and reliability of Azure-based AI cloud platforms.
- Lead real-time monitoring, incident detection and response to minimize downtime and risk.
- Design and manage disaster recovery and business continuity plans for critical AI services.
- Oversee cybersecurity operations including threat detection, access control and vulnerability management.
- Ensure compliance with security policies, standards and regulatory frameworks through regular audits and reporting.
- Collaborate with MLOps, LLMOps and engineering teams to embed automation, observability and security into platform operations.
- Drive operational improvements by implementing automation and adopting SRE and DevOps best practices.
- Manage the AI platform operations team by assigning tasks and ensuring smooth day-to-day operations.
Requirements :
- Bachelor’s Degree in Computer Science, Engineering or a related discipline.
- Minimum 6 years of experience in cloud platform operations and / or administration.
- In-depth expertise in Azure services, especially Azure Monitor, Log Analytics and Application Insights.
- Proven skills in incident management, site reliability engineering (SRE) and disaster recovery planning.
- Strong knowledge of cloud security tools and practices (IAM, SIEM / SOAR, EDR, Firewalls, etc.).
- Proficient in infrastructure-as-code tools (Terraform, Bicep, ARM, etc.) and scripting languages (PowerShell, Python, etc.).
- Experience managing AI / ML infrastructure components such as AKS, GPU VMs, data pipelines and model deployment would be an added advantage.
- Familiarity with security compliance standards like ISO 27001, CIS and NIST.
Location : Central Singapore
Kindly indicate your current / last salary details and your notice period.
If the above speaks to you, we'd love to hear from you. Please send your updated CV to ezmir.m@peoplesearch.jobs (Ezmir, Reg No : R22105085) if you feel there is a fit with your experience and interest. You may also forward this opportunity to someone who would be a great fit for this role.
All information will be kept strictly confidential. We regret to inform you that only successful applicants will be contacted.
PeopleSearch Pte Ltd
EA License No : 16S8057