Staff Site Reliability Engineer

Elevare Search • Anywhere

6 years - 10 years

Negotiable

Posted: 4 days ago

Healthcare

Full-time

Job Summary

Blink Health is a healthcare technology company focused on making prescription medications more accessible and affordable through innovative digital platforms. The Staff Site Reliability Engineer will lead reliability and observability practices across the organization, designing scalable, resilient cloud infrastructure and automation while partnering with engineering teams to improve platform performance, operational maturity, and system reliability at scale.

Job Description

Blink Health is seeking a Staff Site Reliability Engineer to establish and advance reliability engineering practices across its healthcare technology platforms. This role is a senior technical leadership position responsible for improving system resilience, scalability, and operational excellence across cloud infrastructure and application services that support millions of patients.

Responsibilities:
- Establish and evolve SRE best practices including SLIs, SLOs, error budgets, incident response, and postmortems.
- Define and drive observability strategy across metrics, logging, tracing, dashboards, and alerting.
- Design and implement automation to reduce operational toil and improve system reliability.
- Lead large, ambiguous infrastructure and reliability initiatives from concept through delivery.
- Partner with engineering teams to improve developer workflows, tooling, and operational readiness.
- Provide technical mentorship, architecture guidance, and design and code reviews across teams.

Requirements:
- 7+ years of experience in site reliability, infrastructure, or platform engineering roles.
- Expert-level troubleshooting across application, system, and network layers.
- Strong Linux and networking expertise, including load balancing, DNS, and TCP/IP.
- Experience with automation and tooling using languages such as Python, Go, or Bash.
- Deep experience with AWS and Kubernetes, including production-grade architectures.
- Strong background in Infrastructure as Code using tools such as Terraform.

Benefits:
- Opportunity to work on systems that directly improve healthcare access and affordability.
- Collaborative, learning-focused engineering culture.
- Equal opportunity workplace committed to diversity and inclusion.

This role offers the chance to shape reliability at scale within a high-impact healthcare platform.

Key Skills

Site reliability engineering, Cloud infrastructure, AWS, Kubernetes, Observability, Automation, Linux, Networking, Infrastructure as Code
