Raffles Quay, Singapore | Posted on 06/17/2025
KLAARA specializes in AI-driven solutions tailored to the needs of organizations in Capital Markets, Insurance, Reinsurance, and other sectors. Our cutting-edge platform empowers businesses to transform their unstructured data into actionable intelligence, fueling better decision-making and driving competitive advantage.
We are seeking a highly skilled Solution Architect to lead the Infrastructure and Cloud Architecture of our AI-driven platform, particularly in the financial and banking sectors. In this role, you will be responsible for designing scalable, secure, and resilient cloud-native environments that support complex AI/ML workloads within the firm and at customer sites, ensuring optimal instrumentation, scalability, and compliance with security standards.
You will collaborate closely with cross-functional teams including Machine Learning Engineering, Security, Software Engineers, Business Analysts, and Customers to ensure seamless delivery, performance testing, and documentation of our solutions. This is a hands-on role requiring strong technical acumen, cross-functional teamwork, and occasional code fixes. You are expected to:
- Design, implement, and manage scalable cloud infrastructure (AWS, Azure, or GCP) tailored for AI/ML workloads.
- Lead and execute the implementation of software deployment packages at customer sites, ensuring robust instrumentation and scalable performance.
- Design, develop, and execute load and scalability tests to validate system reliability under various conditions.
- Implement and maintain monitoring and alerting systems to proactively identify and resolve issues in development, staging, and production environments.
- Participate in incident response, root cause analysis, and contribute to continuous improvement of system reliability.
- Develop and maintain automation scripts and tools to streamline deployment, infrastructure provisioning, and routine operational tasks.
- Create and maintain clear, comprehensive documentation for customers on deployment, configuration, and operations.
- Collaborate with internal development teams, business analysts, and customers to align technical solutions with business requirements.
- Troubleshoot and resolve deployment and integration issues, occasionally contributing fixes in TypeScript or Rust.
- Ensure compliance with security and regulatory requirements, particularly in Banking environments.
- Maintain and optimize CI/CD and MLOps pipelines and deployment automation using Jenkins and Atlassian tools.
- Manage and maintain development, staging, and production environments, ensuring consistency and high availability.
- Support and manage infrastructure components using Linux, Docker, and Kubernetes.
- Integrate and manage application components such as Kong, OpenTelemetry, Kafka, RabbitMQ, and MySQL.
- Work with systems designed using microservice architecture, ensuring seamless deployment and integration across services.
- Utilize Infrastructure as Code (IaC) tools such as Terraform, Ansible, or Helm to provision and manage infrastructure effectively.
Requirements
- Ph.D. or Master's Degree in Information Technology, Computer Science, Engineering, or related field.
- Minimum of 10 years of experience in Cloud Operations, Infrastructure Engineering, and Automation, with at least 5 years in AWS or similar Cloud operations (Azure, GCP), preferably within the Banking or Financial Services industry.
- Proven experience with CI/CD tools such as Jenkins and the Atlassian suite (Bitbucket, Jira, Confluence).
- Solid understanding of mainstream cloud products and services including AWS, Azure, GCP, and OpenShift.
- Hands-on experience with Linux, Docker, Kubernetes, and cloud-native deployment practices.
- Solid understanding of microservice architecture and of deploying and managing distributed systems.
- Experience with monitoring and observability tools, especially OpenTelemetry, and managing alerts and incident response processes.
- Proficiency in automation and scripting (e.g., Bash, Python, or similar) for deployment and infrastructure tasks.
- Experience in building and implementing Infrastructure as Code (IaC) with tools such as Terraform or OpenTofu.
- Experience with API and Microservices architecture patterns for deploying ML models on cloud.
- Familiarity with API gateways (e.g., Kong), message brokers (Kafka, RabbitMQ), and databases (MySQL, PostgreSQL).
- Ability to design and execute performance, load, and scalability tests.
- Basic proficiency in TypeScript and/or Rust, sufficient for reading code and implementing minor fixes.
- Strong understanding of security, compliance, and risk management practices in financial services or banking.
- Excellent communication and collaboration skills; able to interact with technical and non-technical stakeholders.
Skill set
AWS, Azure, GCP, Linux, OpenShift, Red Hat, SQL, Solution Architect, Cloud Computing, AI, Fintech, Banking, Infrastructure, Java, MLOps.