Talent.com
Site Reliability Engineer (SRE) for WOG with 4 years experience (contract) - D14 Geylang, Eunos, SG

WEBSPARKS PTE. LTD.
D14 Geylang, Eunos, SG
14 days ago
Job description

Roles & Responsibilities

Contract Duration: 12 months (renewable)

We are seeking a skilled and passionate Engineer to join our team to build and operate a Whole-of-Government (WoG) runtime platform.

As a Site Reliability Engineer, you will design and operate the GitLab, AWS, and Kubernetes-based infrastructure and solutions that power our runtime platform, ensuring its stability, scalability, and performance.

Key Responsibilities

Toil Reduction & Automation

  • Identify repetitive tasks and develop automation via CI/CD pipelines, ensuring integration with cross-functional teams to reduce manual intervention and improve operational efficiency.

Observability & System Health

  • Implement comprehensive observability solutions (logs, metrics, traces, alerts) around the four Golden Signals (latency, traffic, errors, saturation), and build automation for proactive system health assessments and self-remediation.

Production Support & Incident Management

  • Participate in on-call rotations, promptly respond to incidents to minimize MTTR, and conduct thorough post-incident reviews to implement preventive measures and improve system resilience.

Security & Compliance

  • Design and implement solutions that are secure and compliant by collaborating with dedicated security teams, conducting regular audits, and integrating advanced vulnerability scanning tools.

Maintenance, Optimisation & Performance

  • Identify and resolve performance bottlenecks and operational issues, define and track KPIs (e.g., MTTR, system uptime, cost efficiency), and drive ongoing optimisation efforts.

Strategic Customer Engagement

  • Act as a technical advisor for tenants, guide them on containerization and best practices for cloud-native deployments, and participate in strategic initiatives to enhance platform scalability and performance.

Knowledge Sharing & Documentation

  • Develop and maintain detailed playbooks, runbooks, and documentation to facilitate team-wide knowledge sharing, streamline incident response, and ensure that critical processes are well understood across the team.

Continuous Learning & Innovation

  • Stay current with the latest AWS, Kubernetes, and industry developments, and proactively recommend improvements and innovative solutions to maintain a competitive and reliable platform.

Required Qualifications

  • Minimum 4 years of relevant experience.
  • Bachelor's degree or Diploma in Computer Science, Engineering, or a related field (or equivalent experience).
  • Proven experience as a Site Reliability Engineer or similar role, with a strong background in containerization, orchestration, and cloud-native technologies.
  • Proven ability to troubleshoot and resolve complex technical issues in containerized applications.
  • Demonstrated experience with incident management, including post-incident reviews and continuous improvement.
  • Strong documentation skills and experience in knowledge sharing across teams.
  • Deep understanding of AWS, Kubernetes (including AWS EKS), and operational best practices, plus familiarity with multi-cloud or hybrid environments.
  • Solid grasp of networking, security, and storage in both AWS and Kubernetes contexts.
  • Experience integrating Kubernetes with AWS cloud technologies (e.g., Secrets Manager, Load Balancers) and using infrastructure-as-code (Terraform or similar).
  • Hands-on experience with containerization tools (Kubernetes, Kustomize, Helm) and automation scripting (Go, Python, Bash, or equivalent).
  • Ability to write and maintain automated tests or conduct thorough manual testing for automation scripts, ensuring the reliability and effectiveness of automated solutions.
  • Familiarity with CI/CD tools (GitLab CI/CD, ArgoCD) and version control systems (Git).
  • Experience with observability/monitoring tools (Prometheus, Grafana, ELK Stack) and defining SLOs and Error Budgets.
  • Certifications such as Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD) are a plus.
  • Experience with developing Kubernetes operators using Go, service mesh technologies, and Chaos Engineering is a plus.

Soft Skills

  • Proactive in identifying problems and recommending strategic solutions.
  • Excellent problem-solving skills with a robust analytical mindset.
  • Clear, concise, and effective communication skills; adept at collaborating across cross-functional teams, including development, security, and customer-facing groups.
  • Ability to remain calm and effective under pressure, especially during incident response.
  • Adaptability to rapid change with a continuous learning mindset, sharing knowledge to foster team growth.
  • Customer-focused with the ability to translate technical insights into understandable, actionable guidance.
  • Leadership and mentoring capabilities that contribute to a resilient and collaborative team environment are a plus.

Please note that a coding test will be included in the second round of interviews for selected candidates.
