Site Reliability Engineering Manager - Web Infrastructure and Gateway - Klaviyo|Meet.jobs

Salary

192k - 288k USD Annually

Required Skills

    Job Description

     

    Site Reliability Engineering (SRE) is what you get when you treat system operations as a software engineering problem. The mission of the Site Reliability Engineering group is to provide services, tooling, and guidance to Klaviyo's product engineers to make them more productive and ensure their services are sufficiently reliable, scalable, and secure.

    We are seeking an Engineering Manager to join our SRE team to drive the evolution of our web infrastructure and gateway. We are in the early stages of building a modern edge services platform to support our product engineering teams. This is an exciting opportunity to join a new team and have a big impact on the future of how Klaviyo makes new services available.

    Our Mission

    API Gateway Deployment: Develop and operate an extensible API gateway platform. Create a positive developer experience by introducing new features to the API gateway platform while maintaining or enhancing system reliability. Unlock domain decomposition for user-facing web services at Klaviyo.

    Maintaining System Stability: Ensure the current systems remain stable through proactive monitoring and optimization.

    Customer Engagement and Feedback: Gather and act on customer feedback to prioritize feature development and roadmap decisions.

    How You'll Make a Difference

    • Manage 3-5 Site Reliability Engineers remotely or in Klaviyo's Boston office.
    • Help individuals on your team develop and execute SMART goals and personal development plans that align with Klaviyo's goals and objectives, and understand how their work fits into the bigger picture.
    • Interview, hire, and level up the Web Infrastructure and Gateway engineering team.
    • Work with the team on project planning and defining milestones, identifying dependencies, and meeting business goals.
    • Participate in deep system design and implementation discussions within your team and across partner teams to ensure that we're building the right systems and keeping quality high.
    • Level up the team through hands-on coaching and individual contribution. This includes pairing with direct reports to design, write, and deliver software to improve the scalability, reliability, and security of Klaviyo's systems.
    • Iterate and improve upon engineering-wide processes like recruiting, onboarding, performance management, communication, and Agile software development.

    Who You Are

    • Successfully led and delivered routing and network infrastructure projects spanning multiple quarters and involving input from multiple external stakeholders.
    • Experience coaching and growing Site Reliability Engineers.
    • Experience developing and rolling out engineering-wide processes.
    • Hands-on coding experience. This role will require about 30-50% coding.
    • Focused on providing a positive user experience.
    • Technical experience in:

      • Edge routing
      • API gateway systems
      • Networking
      • High-scale, high-availability, high-performance HTTP traffic
      • Observability

    Klaviyo