Site Reliability Engineer - Field Operations - C3.ai|Meet.jobs

Salary

120k - 158k USD Annually

    Job description

    The Role

    We are looking for Site Reliability Engineers to join our team in Tysons, VA, and Redwood City, CA.

    Responsibilities:

    • Maximize system uptime and availability, ensuring that functional and performance SLAs are met.
    • Establish end-to-end monitoring and alerting on all critical aspects.
    • Solve complex problems for critical services and build automation to prevent problem recurrence.
    • Influence and create new designs, architectures, standards, and methods for supporting the platform.
    • Initiate and lead scripting and automation to streamline system updates and upgrades.
    • Set up critical infrastructure, tools, and frameworks to streamline the deployment cycle.
    • Work cross-functionally with Services and Engineering teams.

    Qualifications:

    • Demonstrated experience in deploying, managing, and operating scalable and fault-tolerant Linux/Kubernetes/JVM-based infrastructure in AWS, GCP, and other public clouds.
    • Expertise in Linux Operating Systems, Networking, and Database concepts.
    • Experience with Cassandra (or another NoSQL alternative).
    • Expertise in cloud providers, such as Amazon Web Services, Azure, and GCP.
    • Experience with configuration management and infrastructure-as-code tools such as Ansible or Terraform.
    • Experience using Ruby or Python to automate and monitor systems.
    • Excellent problem-solving, critical thinking, and communication skills.
    • Experience supporting commercial SaaS solutions as a DevOps engineer or system administrator.
    • BS or MS in Computer Science, related field, or equivalent professional experience.

    Candidates must be authorized to work in the United States without the need for current or future company sponsorship.

    C3.ai