Careers

Senior Site Reliability Engineer

Position Objective

Do you love finding ways to make systems more efficient? Do you find it impossible to simply maintain when you could improve? Engineering to make a system more resilient and efficient frees up time and money to build more capabilities. Whether you come from a background in network engineering, systems administration, or software development – if you have a passion for making systems better, we need you!

Responsibilities

  • You’ll work with the Modzy team to develop more robust systems by building a resilient infrastructure.
  • You’ll build in redundancy, implement monitoring tools, and automate wherever possible.
  • You’ll reduce toil by scripting routine tasks and automating self-repair. This is your chance to leverage your expertise in modern cloud and container monitoring while assisting junior engineers and broadening your knowledge base.

Skills

  • 2+ years of experience with monitoring a production cloud or container-based system
  • Experience with administering Kubernetes
  • Experience with administering a cloud environment
  • Experience with building queries in log aggregation tools, including Splunk or Elasticsearch
  • Experience with instrumenting cloud or container systems via a metrics collector, including Prometheus, SignalFX, DataDog, or New Relic
  • Ability to obtain a security clearance
  • BA or BS degree

Nice If You Have

  • Experience collecting and analyzing system traces using tools including Jaeger, OpenTracing, SignalFX, or DataDog
  • Experience with Azure