At Xtremax, our Operations Support Engineers play a key role in ensuring the reliability, stability, and efficiency of mission-critical systems. In this role, you’ll work closely with developers, product managers, and user support teams to monitor system performance, resolve technical issues, and implement preventive measures. Your contributions will directly support smooth operations, high uptime, and reliable services for our users. Candidates with public sector experience are preferred, as this role supports IT projects for government agencies.
Responsibilities:
System Monitoring & Performance
- Monitor and analyse product runtime environments (production and non-production) to ensure optimal system performance.
- Implement continuous improvement strategies to enhance system reliability and efficiency.
Incident & Problem Management
- Manage application and security incidents, performing problem determination and coordinating with internal teams and vendors for resolution.
- Escalate issues as necessary to minimize business impact.
Operational Processes & Compliance
- Develop and maintain operations and process guides to meet audit and compliance requirements.
- Handle day-to-day operational activities, analyse performance data, and prepare status reports for stakeholders and management.
- Ensure operational processes align with IM8 and ISO 27001 standards.
- Conduct periodic compliance drills and support audit preparation.
Team Coordination & Support
- Lead and coordinate with operations teams and vendors to ensure 24/7 system support availability.
- Facilitate communication between teams to resolve operational issues efficiently.
Automation & Proactive Operations
- Build self-healing systems with automated remediation for common alerts.
- Implement Infrastructure as Code (IaC) pipelines to reduce manual configuration drift.
Observability & Incident Readiness
- Deploy full-stack monitoring with predictive analytics (CloudWatch Anomaly Detection, Stackdriver, Azure Monitor).
- Integrate alerting with central NOC/SOC for faster escalation and resolution.
Collaboration & Enablement
- Serve as the bridge between app teams and infra teams, enabling self-service for troubleshooting.
- Train agency teams on operational best practices and tool adoption (e.g., ITSM workflows, DevOps pipelines).
Requirements
Must Have
- A Bachelor’s degree in Computer Science, Information Technology, or a related field.
- 2–5 years of relevant experience.
- Proven experience as an Operations Engineer or in a similar IT role.
- Familiarity with ITSM tools (e.g., Remedy, Zendesk, ServiceDesk) for change and incident management workflows.
- Experience in implementing security and access controls for production and test environments.
- Proficiency with full-stack monitoring tools (e.g., APM tools, CloudWatch, Stackdriver, OpenAPM stack).
- Experience with automation tools (e.g., Terraform, Ansible) to minimize downtime and reduce human error.
- Knowledge of agile methodologies, DevOps pipelines, test-driven development, and information security practices.
- Cloud infrastructure experience.
- Strong problem-solving and communication skills, with the ability to explain complex issues to non-technical audiences.
- A collaborative, resourceful mindset with the ability to deliver innovative solutions.
- Experience with Linux and Windows administration.
Good to Have:
- Experience with Singapore Government projects is an advantage.
- Database experience and scripting experience (Shell script, PowerShell, Python) are an advantage.
Preferred Certifications:
- AWS Certified DevOps Engineer – Professional
- Microsoft Certified: Azure DevOps Engineer Expert
- Google Professional Cloud DevOps Engineer
- HashiCorp Certified: Terraform Associate
- Certified Kubernetes Administrator (CKA)
- ITIL 4 Foundation