Wormhole Foundation is seeking a Crypto Production Engineer to ensure the reliability, security, and operational excellence of its blockchain production infrastructure. This role focuses on uptime, observability, deployment workflows, and incident response across distributed systems supporting interoperability technologies. The engineer will collaborate with engineering, DevOps, and validator partners to maintain a minimum of 99.99% uptime, excluding scheduled maintenance.
Responsibilities:
- Act as incident commander and first responder during production incidents.
- Lead incident triage, root cause analysis, and retrospective documentation.
- Build detailed incident timelines and preventative runbooks.
- Improve reliability, observability, monitoring, and alerting systems.
- Harden infrastructure for security and operational resilience.
- Enhance deployment workflows and reduce operational friction.
Requirements:
- Bachelor’s or Master’s degree in computer science or a related field, or 5+ years of relevant experience.
- Proven experience leading incident response across global stakeholders.
- Strong knowledge of reliability engineering and distributed systems.
- Experience with Grafana, Datadog, Splunk, PagerDuty, or similar tools.
- Ability to write and debug code in Go, Rust, or Java.
- Experience operating production environments in AWS or GCP.
This full-time remote role requires strong communication skills and the ability to operate independently while leading critical production initiatives across global teams.