Site Reliability Engineer

- Full-Time

Position Objective

You will be part of the Modzy engineering team, which is dedicated to helping define, implement, and operate the technical infrastructure of a world-class artificial intelligence solution.


  • You will provide an expert’s perspective on how to improve and maintain service level objectives (SLOs) for systems that will deploy to a wide variety of infrastructure and handle missions critical to the federal government.
  • You will implement processes, assessments, custom tools, and vendor appliances to help enable peak reliability, stability, and operational engagement for microservice development teams across the western hemisphere.
  • You will act as a subject matter expert (SME) for implementing DevSecOps best practices while establishing a continuous delivery pipeline.
  • You will assess performance and stability issues in production, lead blameless postmortem retrospectives, develop frameworks that collect and evaluate system telemetry, and write operational runbooks.


  • A minimum of 5 years of professional full-stack development experience
  • Production experience working with Kubernetes and a major cloud provider, such as AWS, Azure, or GCE
  • Experience with continuous integration and delivery pipelines, such as Jenkins, CircleCI, or Travis CI
  • Experience with a scripting language, such as Bash, Python, or Ruby
  • Experience with corporate networking fundamentals and network security best practices
  • Experience with service meshes
  • Knowledge of how to set up and administer Splunk, Elasticsearch, or other log aggregation products
  • Knowledge of how to set up and administer RDBMS products, such as PostgreSQL
  • A security clearance is a huge plus!
  • BS degree in CS or Computer Engineering
