Roles & Responsibilities
Team :
Our Tech & Product team is defining the next generation of trusted enterprise computing in the cloud. We're a fast-paced, agile, and innovative team. We're highly collaborative and work across all areas of our technology stack. We enable critical services for the business, qualify complex compute changes, and trailblaze new engineering solutions for the cloud.
Responsibilities :
- You have experience balancing live-site management, feature delivery, and retirement of technical debt across an entire development team.
- Develop tools and products for service deployments, system configuration management, monitoring, diagnostics, and performance measurement.
- Support homegrown self-serve services and software design solutions.
- Review and manage version-controlled deployment technologies.
- An unwavering love of shipping software.
- Excited about building reliable, self-healing services on unreliable hardware.
- Design, deploy, and continuously improve important infrastructure services.
- Provide input into long-range platform requirements and operational guidelines, with a focus on automation and continuous improvement of Platform Service availability.
- Analyze and understand how stakeholders are using the platform and help drive continuous improvement of the offering based on that.
- Ensure that we are continuously raising our standard of engineering excellence by implementing best practices for coding, testing, code coverage, and deployment.
- Scope work, manage the backlog, and keep project delivery predictable.
- Share daily operational responsibilities with the team.
Required Skills :
- BS or MS in Computer Science or equivalent experience.
- Strong CS fundamentals including data structures, algorithms, and distributed systems.
- You care about code simplicity and performance.
- Minimum 5-8 years of industry experience in designing, building, and supporting scalable, available, and low-latency distributed systems.
- Understanding of OO programming and concepts (Java, C++, C#, Python).
- Experience designing, developing, debugging, and operating resilient distributed systems that run across thousands of compute nodes in multiple data centres.
- Fluency in one or more scripting languages such as Python or Ruby.
- Solve challenging technical problems related to security, parallel and distributed systems, programming, resource management, large-scale system maintenance, and more!
- Experience with AWS, GCP, or another cloud PaaS provider.
- Solid understanding of how to configure, deploy, manage, and maintain large cloud-hosted systems, including auto-scaling, monitoring, performance tuning, troubleshooting, and disaster recovery.
- Proficiency with source control, continuous integration, and testing pipelines.
- Being a great listener, collaborator, communicator, and mentor.
- Championing a culture and work environment that promotes diversity and inclusion.
- A strong background in open source technology.
- Experience using telemetry and metrics to drive operational excellence.
- Experience with building APIs and services using REST, SOAP, etc.
- Experience in the development of distributed / scalable systems and high-volume transaction applications.
- Knowledge of professional software engineering and best practices for the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
- Experience with building services on top of relational and non-relational stores like SQL Server, MySQL, PostgreSQL, Cassandra, or MongoDB.
Preferred Skills :
- Experience with containers and orchestration services like Kubernetes, Docker, etc.
- Experience with HBase, Hadoop, and large-scale big data systems.
- Knowledge about cloud security and best practices.
- Deep understanding of fundamental network technologies like DNS, Load Balancing, SSL, TCP/IP, SQL, HTTP.