Lead Site Reliability Engineer, SRE

Requisition ID 2022-12125
Job Function
Information Technology
Position Type
Experienced Professionals
Posting Location
US-GA-Atlanta

Overview

At Chick-fil-A, Site Reliability Engineering is a technical function that also depends on influence. Across our 2,000+ restaurants, cloud environments, and private data centers, SREs work with our DevOps teams to introduce and hone SRE principles, establish reliability goals, and develop tooling for operational observability. We are a small team working through many different patterns to bring observability to everyone. SREs at Chick-fil-A collaborate across teams and roles, feed learnings back into the organization, and along the way learn the many ways technology is used.

This role focuses strongly on building the tooling and integrations needed to onboard services easily into our next-generation platform. It mixes platform and application-level work in support of a vision of out-of-the-box visibility, monitoring, and dashboarding.

Our Flexible Future model offers a healthy mix of working in person and remotely, strengthening key elements of the Chick-fil-A culture by fostering collaboration and community. 

Responsibilities

  • Work independently with DevOps teams to refine running production systems:
      - Build on-call processes
      - Create Incident Management and Response procedures
      - Instrument for observability
      - Monitor SLIs
  • Work to varying degrees with DevOps teams:
      - Provide consultation on SRE best practices
      - Give guidance on specific topics
      - Oversee groups of dedicated engineers
      - Embed directly with teams
  • Work with teams to define SLOs and error budgets
  • Ensure services and systems meet availability needs of customers
  • Document learnings to share with the broader engineering teams
  • Ensure clear communication around SRE objectives
  • Collaborate broadly across the entire engineering organization
  • Oversee other SREs to bring best practices or learnings from across the organization to them
  • Build internal tooling around operational observability
  • Bring a strong mindset of continual improvement and an aversion to toil and automatable tasks
  • Advocate for SRE as a part of engineering culture
  • Act as a conduit for Architecture, Security, Tools, and Common Engineering
  • Keep abreast of industry changes and evaluate for implementation

Location: Hybrid

Minimum Qualifications

  • Bachelor’s Degree or the equivalent combination of education, training and experience from which comparable skills can be acquired
  • 5+ years of total software engineering experience
  • 3+ years supporting a production system on a DevOps team
  • 2+ years in cloud platforms such as Amazon Web Services, Google Cloud, or Microsoft Azure
  • Agile/DevOps Development
  • Software Engineering
  • Development and support of a production system
  • Cloud Platform Experience
  • Significant experience with:
      - Building and supporting systems
      - Enterprise cloud providers
      - Production containerized environments
      - Excellent written and verbal communication
      - CI/CD pipelines
  • Ability to build strong relationships, collaborate, and influence diverse groups of engineers and non-technical roles
  • Ability to influence other engineers without organizational authority

Preferred Qualifications

  • 7+ years of total software engineering experience
  • 4+ years supporting a production system on a DevOps team
  • 4+ years in cloud platforms such as Amazon Web Services, Google Cloud, or Microsoft Azure
  • Agile/DevOps Development
  • Software Engineering
  • Development and support of a production system
  • Cloud Platform Experience

Minimum Years of Experience

5

Travel Requirements

5%

Required Level of Education

Bachelor's degree or equivalent experience

Preferred Level of Education

Bachelor's degree

Major/Concentration

Computer Science, Computer Engineering, or related technical field
