The Role : Architecting the Cognitive Platform
We are seeking a foundational AI Systems Engineer to design, build, and own the core infrastructure that powers our entire company.
This is a unique, high-leverage role. You are not just supporting one product; you are building the single, unified platform that must serve two critical functions :
- Our SaaS Platform : A scalable, multi-tenant, and low-latency infrastructure that delivers our Causal AI models to paying customers via robust APIs / UI.
- Our AI Research Platform : A high-performance, flexible environment that empowers our researchers to conduct massive "self-play" simulations, run "hero training runs" on vast, multimodal datasets, and rapidly prototype new models.
Your work will be the backbone that connects our most advanced research with real-world, high-stakes industrial data.
What You Will Do
- Build the Core Cloud-Native Platform : Design, build, and manage our entire infrastructure from the ground up on Kubernetes (K8s), using Infrastructure as Code (Terraform, Pulumi) for everything.
- Engineer the SaaS Delivery Architecture : Implement the multi-tenant, secure, and highly-available service architecture for our customer-facing APIs. This includes API gateways, service mesh, observability, and logging.
- Create the MLOps / Research Engine : Build the internal AI / ML platform. This includes managing data versioning (DVC, Pachyderm), orchestrating on-demand GPU / TPU-heavy training workloads, and providing researchers with feature stores and a "self-service" environment for experimentation.
- Master Real-Time Data & Orchestration : Engineer the high-throughput, real-time data ingestion pipelines (e.g., Kafka, Pulsar, Spark Streaming) required to model "network cascades" and "perishable inventory" in sectors like aviation and logistics.
- Own Complex Dataflow (DAGs) : Design, implement, and manage the complex dataflow orchestration (e.g., Airflow, Dagster, Prefect) that powers both our production ETL / ELT and our complex, multi-stage AI simulation and training loops.
- Champion CI / CD & GitOps : Own and enforce a rigorous CI / CD and GitOps-based discipline. You will be responsible for building the automated pipelines that enable our "relentless shipping" culture, allowing us to deploy to production safely and multiple times a day.
- Unify the Data Layer : Design and manage our central data lakehouse (e.g., Databricks, Snowflake) to act as the "single source of truth," serving real-time analytics for our SaaS platform and batch workloads for AI research.
Ideal Candidate Profile
- A "Full-Stack" Infrastructure Engineer : You are a systems-level thinker who is equally comfortable in the domains of cloud-native infrastructure (K8s, Networking), data engineering (Kafka, Spark), and MLOps (GPU workloads, orchestration).
- Deep Cloud-Native Expertise : You have multiple years of hands-on, in-production experience with Kubernetes, Terraform (or other IaC), and a major cloud provider (AWS / GCP / Azure).
- CI / CD & Automation Fanatic : You live and breathe automation. You have extensive experience building and maintaining robust CI / CD pipelines (e.g., GitLab CI, Jenkins, ArgoCD) and believe GitOps is the standard.
- SaaS & MLOps Fluency : You have ideally built platforms that serve both external B2B customers (with SLAs, security, and multi-tenancy) and internal R&D teams (with needs for flexibility, speed, and massive compute).
- A "Relentless Shipper" (Startup Mentality) : You are a pragmatic, proactive builder who thrives in a fast-paced startup environment. You understand that "done" is better than "perfect" and are comfortable with tight release schedules and high ownership.
- Technical Polyglot : You possess deep expertise in Python and / or Go, shell scripting, and the modern data stack (SQL, orchestration tools, streaming platforms).