Senior Site Reliability Engineer
Recognized as the No. 1 site trusted by real estate professionals, Realtor.com® has been at the forefront of online real estate for over 25 years, connecting buyers, sellers, and renters with trusted insights and expert guidance to find their perfect home. Through its robust suite of tools, Realtor.com® makes a significant impact not only on the real estate industry at large but also on consumers navigating the biggest purchase of their lives, by providing a user experience that is easy to use, easy to understand, and, most of all, easy to make decisions with.
Join us on our mission to empower more people to find their way home by breaking barriers to entry, making the right connections, and building confidence through expert guidance.
About the Role
We are seeking a Senior Site Reliability Engineer to join our newly formed Operations Excellence organization, reporting to the Director, Operations Excellence. This role will contribute to the reliability, observability, and operational excellence of our platform infrastructure serving millions of users. As a Senior SRE, you will be a strong technical contributor who implements best practices, solves complex problems, and enables our 600+ engineers to deliver exceptional customer experiences.
You will work on critical platform systems including EKS infrastructure, Skyway (CI/CD), Frontdoor (Tyk API Gateway), Pantheon (Apollo GraphQL Federation), and our observability stack, while contributing to chaos engineering practices and cost optimization initiatives with measurable ROI.
We believe in leveraging the best tools to solve problems faster. You will be expected to utilize AI coding assistants and LLMs proficiently to accelerate development velocity, generate boilerplate, and troubleshoot complex debugging scenarios. Beyond simple usage, this role requires the critical judgment to verify AI-generated outputs for security, performance, and accuracy. You should be comfortable integrating AI tooling into your daily workflow to eliminate repetitive tasks, allowing you to focus on high-impact architectural and strategic engineering challenges.
What You'll Do
Platform Reliability & Infrastructure
- Implement and maintain highly available AWS infrastructure including EKS clusters, Fargate (ECS), and multi-region architectures
- Support reliability of critical services: Skyway (CI/CD), Frontdoor (Tyk), Pantheon (Apollo GraphQL), and supporting infrastructure
- Monitor SLIs, SLOs, and error budgets for Tier 1/2/3 systems; participate in architectural reviews for reliability and cost-efficiency
- Implement reliability patterns including circuit breakers, graceful degradation, and automated failover
Observability & Cost Optimization
- Implement observability solutions using New Relic for APM, distributed tracing, metrics, and logging to enable rapid troubleshooting
- Build dashboards and alerts that reduce MTTD and MTTR; contribute to observability standards across teams
- Identify infrastructure cost optimization opportunities and implement FinOps practices including rightsizing and resource lifecycle management
- Support cost-conscious architecture decisions and CI/CD spend optimization (CircleCI, Argo CD)
Chaos Engineering & Incident Response
- Execute chaos engineering experiments to identify system weaknesses; contribute to frameworks for safe production testing
- Participate in game day exercises and disaster recovery simulations; create runbooks and automation for resilience
- Participate in the on-call rotation for critical systems; conduct post-incident reviews and implement improvements
- Support incident response processes and contribute to the System Health Scorecard
Technical Contribution
- Contribute as a strong technical individual contributor to the Operations Excellence team
- Collaborate with Platform Engineering, Quality Engineering, and product teams on reliability initiatives
- Support security initiatives including AWS Secrets Manager migration and compliance requirements (SOC 2, PCI, GDPR)
- Contribute to Developer Experience metrics and platform adoption goals
- May provide technical guidance to junior team members
What You'll Bring
Experience & Expertise
- 5+ years in Site Reliability Engineering, DevOps, or Infrastructure Engineering with demonstrated success improving system reliability
- Bachelor’s degree or equivalent experience
- 3+ years hands-on experience with AWS (EKS, EC2, RDS, S3, CloudWatch, IAM) and Kubernetes including cluster management
- Proficient programming skills (Python, Go, or Java) with infrastructure automation and Infrastructure as Code experience (Terraform, CloudFormation)
- Production experience with observability tools (New Relic, Datadog, Prometheus, Grafana, Splunk) and distributed systems
- Experience with CI/CD platforms and GitOps workflows (CircleCI, Argo CD, Jenkins), as well as with on-call rotations and incident response
- Preferred: Exposure to chaos engineering tools, API Gateway technologies (Tyk/Kong), GraphQL federation (Apollo), cost optimization initiatives, FinOps principles
Technical Skills
- Cloud & Infrastructure: AWS (EKS, Fargate, Lambda, VPC, Route 53, CloudFront), Kubernetes, Docker, Istio Service Mesh
- CI/CD & GitOps: Argo CD, CircleCI, Jenkins, GitHub Actions
- Observability: New Relic (APM, distributed tracing, metrics, and logging); Splunk (logging)
- IaC & Automation: Terraform, CloudFormation, Helm, Kustomize, Python/Go/Bash
- Platform Services: Tyk Gateway, Apollo GraphQL, AWS Secrets Manager, Vault
- Incident Management: OpsGenie, PagerDuty, ServiceNow
Professional Qualities
- Strong communication skills with ability to explain technical concepts to diverse audiences
- Collaborative approach working across engineering, product, and business teams
- Self-motivated with ability to solve complex problems within established practices and policies
- Data-driven decision making with customer-centric approach and empathy for developer experience
Do the best work of your life at Realtor.com®
Here, you’ll partner with a diverse team of experts as you use leading-edge tech to empower everyone to meet a crucial goal: finding their way home. And you’ll find your way home too. At Realtor.com®, you’ll bring your full self to work as you innovate with speed, serve our consumers, and champion your teammates. In return, we’ll provide you with a warm, welcoming, and inclusive culture; intellectual challenges; and the development opportunities you need to grow.
Diversity is important to us. Therefore, Realtor.com® is an Equal Opportunity Employer regardless of age, color, national origin, race, religion, creed, gender, sex, sexual orientation, gender identity and/or expression, marital status, status as a disabled veteran and/or veteran of the Vietnam Era, or any other characteristic protected by federal, state, or local law. In addition, Realtor.com® will provide reasonable accommodations for otherwise qualified disabled individuals.