Senior Cloud Operations Engineer – AI / ML Platforms

PEOPLESEARCH PTE. LTD. • Islandwide, SG
16 hours ago
Job description

Roles & Responsibilities

Our client is looking for an experienced cloud specialist to lead and optimise the operation, reliability and security of its Azure-based AI platform, ensuring seamless, scalable and secure delivery of AI / ML services.

Responsibilities :

  • Ensure high availability, performance and reliability of Azure-based AI cloud platforms.
  • Lead real-time monitoring, incident detection and response to minimise downtime and risk.
  • Design and manage disaster recovery and business continuity plans for critical AI services.
  • Oversee cybersecurity operations including threat detection, access control and vulnerability management.
  • Ensure compliance with security policies, standards and regulatory frameworks through regular audits and reporting.
  • Collaborate with MLOps, LLMOps and engineering teams to embed automation, observability and security into platform operations.
  • Drive operational improvements by implementing automation and adopting SRE and DevOps best practices.
  • Manage the AI platform operations team by assigning tasks and ensuring smooth day-to-day operations.

Requirements :

  • Bachelor’s Degree in Computer Science, Engineering or a related discipline.
  • Minimum 6 years of experience in cloud platform operations and / or administration.
  • In-depth expertise in Azure services, especially Azure Monitor, Log Analytics and Application Insights.
  • Proven skills in incident management, site reliability engineering (SRE) and disaster recovery planning.
  • Strong knowledge of cloud security tools and practices (IAM, SIEM / SOAR, EDR, Firewalls, etc.).
  • Proficient in infrastructure-as-code tools (Terraform, Bicep, ARM, etc.) and scripting languages (PowerShell, Python, etc.).
  • Experience managing AI / ML infrastructure components such as AKS, GPU VMs, data pipelines and model deployment would be an added advantage.
  • Familiarity with security compliance standards like ISO 27001, CIS and NIST.

Location : Central Singapore

Kindly indicate your current / last salary details and your notice period.

If the above speaks to you, we’d love to hear from you. If you feel there is a fit with your experience and interest, please send your updated CV to ezmir.m@peoplesearch.jobs (Ezmir, Reg No : R22105085). You are also welcome to forward this opportunity to someone who would be a great fit for the role.

All information will be kept strictly confidential. We regret that only successful applicants will be contacted.

PeopleSearch Pte Ltd

EA License No : 16S8057
