Senior Site Reliability Engineer - Klaviyo｜Meet.jobs

Salary

156k - 235k USD Annually

Job description

Klaviyo is growing fast and we have openings for all skill levels across all of our teams. Learn more about our engineering culture at https://klaviyo.tech

Site Reliability Engineering (SRE) is what you get when you treat system operations as a software engineering problem. The mission of the Site Reliability Engineering team is to provide services, tooling, and guidance to Klaviyo's product engineers to make them more productive and ensure their services are sufficiently reliable, scalable, and secure.

The SRE team builds foundational backend services as well as tooling and automation to allow product teams to release and scale their software reliably and predictably. SREs are team players who work collaboratively amongst themselves and with engineers from product teams to build the platform Klaviyo relies on to power its products.

As a Senior Site Reliability Engineer you will own multiple foundational Klaviyo services and make a big impact on the productivity of our product engineering teams.

How you will make a difference:

Ship foundational services to enable Klaviyo engineering to move faster with confidence
Design and develop systems and processes that keep Klaviyo's services highly available and scalable
Design, build and deliver software to dramatically improve the availability, scalability, latency, and efficiency of Klaviyo’s services
Achieve breakthroughs in system throughput by identifying and eliminating bottlenecks
Leverage technology such as Python, Go, Bash, Django, AWS, Kubernetes, Terraform, MySQL, Apache Pulsar, Redis, and ClickHouse to advance Klaviyo’s platform
Champion best practices by actively collaborating with other teams in a culture that values technical design review
Contribute to the company as a subject matter expert in multiple areas, constantly pushing yourself to be a better engineer and to level up your peers within your team and across Klaviyo
Mentor and pair with other Klaviyo engineers to build better software by focusing on performance, self-healing systems, configuration as code, defensive programming, application security, etc.
Participate in periodic on-call duties with a focus on solving issues when they are discovered, preventing recurrences, and minimizing alert fatigue
Work hand-in-hand with product-facing engineers to ship impactful code
Perform quantitative analysis to understand and scale Klaviyo systems and manage the cross-functional effort to resolve scalability issues
Produce and advocate for preventative, upstream solutions with internal stakeholders and external vendors and dependencies
Confidently make informed, data-driven decisions in a fast-paced environment with competing priorities
Evangelize Site Reliability best practices across the engineering organization and community

Who you are:

BA or BS degree in Computer Science or a related field, or equivalent experience
5+ years of responsibility for operating and scaling complex distributed systems
Experience developing applications in Python, Ruby, Go, etc.
Experience working on an engineering team building software
Fundamental understanding of Linux (we run Ubuntu) and all layers of the networking stack; you should be confident administering and debugging production Linux systems
Ability to stay composed and manage complex systems during outages, and to drive failures through root cause analysis to the prevention of future issues
