Nathan Geyer

UPTIME 24/7

Now Serving: Nathan Geyer

Senior Site Reliability Engineer · Atlanta, GA

Experience

Years in SRE & DevOps

Uptime

99.9%

Production availability

Cost Savings

$1.7M

Annual cloud reduction

On-Call

24/7

Pager always on

kitchen.log — tail -f

About the Cook

I started at 15, working the graveyard shift as a fry cook at Waffle House. It taught me how to stay calm in a rush and not make people wait — habits that turned out to translate pretty well to running production systems.

These days I'm a Senior Site Reliability Engineer. I work with teams to keep their services healthy, automate the work that shouldn't need a human, and resolve incidents before customers notice. Most of my time goes to Kubernetes, observability, and the unglamorous infrastructure plumbing that holds everything else up.

The best systems work the way the best kitchens do — busy behind the scenes, calm out front.

🔥 Thrives under pressure

⚡ Automates everything

📟 On-call ready

🤝 Team player

🎵 The Jukebox 🎵

IoT Data Logging Platform

Tech Stack: Android, Java, Firebase, Cloud Functions

A side project for streaming IoT sensor readings to a phone, with configurable thresholds that page when something drifts out of range. The same observability principles I apply at work, applied to hardware.

Key Features: Real-time data streaming, configurable alert thresholds, historical data visualization.

Kubernetes Cluster Optimizer

Tech Stack: Golang, Kubernetes, Prometheus

A Go service that audits Kubernetes clusters for over-provisioning, recommends right-sized resources, and validates the math against Prometheus before any change ships.

Impact: Right-sizing recommendations adopted across teams, with measurable reductions in cluster compute spend.
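
As a rough illustration of the right-sizing idea (not the project's actual Go code), the sketch below compares a container's CPU request with a high percentile of observed usage and suggests a smaller request with some headroom. The function names, the 95th percentile, and the 20% headroom are assumptions made up for this example; in the real tool the usage samples would come from Prometheus.

```python
import math
from dataclasses import dataclass


@dataclass
class Recommendation:
    current_millicores: int
    suggested_millicores: int
    over_provisioned: bool


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_cpu(request_millicores, usage_millicores, pct=95, headroom_pct=20):
    """Suggest a CPU request from observed usage (hypothetical policy)."""
    peak = percentile(usage_millicores, pct)
    suggested = math.ceil(peak * (100 + headroom_pct) / 100)
    # This sketch only ever recommends shrinking a request, never growing it.
    over = suggested < request_millicores
    return Recommendation(
        current_millicores=request_millicores,
        suggested_millicores=suggested if over else request_millicores,
        over_provisioned=over,
    )
```

Calling `recommend_cpu(1000, samples)` with a list of usage samples in millicores returns a `Recommendation`; a change would only be proposed when `over_provisioned` is true.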

Incident Response Automation

Tech Stack: Python, PagerDuty API, Slack API, Datadog

Detects anomalies, opens the ticket, drops the relevant runbook into Slack, and gives the on-call engineer a head start. MTTR dropped 45% — most of which was time we used to spend looking for the runbook.

Features: Auto-remediation for common issues, intelligent alert routing, post-incident analysis with auto-drafted summaries.
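
The routing step could look something like the sketch below: map an incoming alert to a runbook and a delivery channel before anything is posted. The alert fields, the `RUNBOOKS` table, the example.com URLs, and the "critical alerts page, everything else goes to chat" policy are hypothetical placeholders, and no real PagerDuty, Slack, or Datadog API calls are made here.

```python
from dataclasses import dataclass

# Hypothetical mapping from alert tag to runbook link; not the project's real data.
RUNBOOKS = {
    "disk-full": "https://runbooks.example.com/disk-full",
    "high-latency": "https://runbooks.example.com/high-latency",
}
DEFAULT_RUNBOOK = "https://runbooks.example.com/general-triage"


@dataclass(frozen=True)
class Alert:
    service: str
    tag: str
    severity: str  # e.g. "critical" or "warning"


def route(alert):
    """Build the message an on-call engineer would receive for this alert."""
    # Unknown tags fall back to a general triage runbook.
    runbook = RUNBOOKS.get(alert.tag, DEFAULT_RUNBOOK)
    # Assumed policy: only critical alerts page; everything else goes to chat.
    channel = "page" if alert.severity == "critical" else "chat"
    return {
        "channel": channel,
        "text": f"[{alert.severity}] {alert.service}: {alert.tag}",
        "runbook": runbook,
    }
```

Keeping the routing decision a pure function like this makes it easy to test separately from whatever client actually sends the page or chat message.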

GitHub Repositories

Open-source work, side projects, and tools that earned their keep — over at GitHub (@n8orz).

Inside: infrastructure automation, monitoring utilities, CI/CD pipeline templates, and Terraform modules.

Order History

FanDuel 2025 — Present

Senior Site Reliability Engineer

Atlanta, GA

Lead technical direction inside the SRE team — driving mission and ownership conversations, and mentoring engineers on Kubernetes, Kafka, automation, and CI/CD.
Embed with product teams to harden critical services: instrument what matters, drive incidents to mitigation, and run blameless postmortems that actually change behavior.
Operate multiple EKS clusters and supporting infrastructure as code via Terraform, with deploys automated through Buildkite and GitHub Actions.
Partner with product, networking, and ITOps to diagnose and shut down production-impacting outages fast.

Greenlight 2022 — 2025

Software Engineer II — Site Reliability Engineering

Remote — Atlanta, GA

Designed, deployed, and tuned Kubernetes microservices to hit 99.99% availability on production workloads.
Built TypeScript and Golang automation to take deployment, monitoring, and maintenance off engineers' plates.
Stood up monitoring and alerting with Datadog, Prometheus, and Grafana — held alert precision above 99%.
Led incident response, ran root-cause analysis, and shipped the preventative work so the same fire didn't restart.
Mentored junior engineers on CI/CD, resilience, and cloud security.

Infor 2021 — 2022

Software Engineer, DevOps

Remote — Atlanta, GA

Ran infrastructure for 75+ microservices across EKS and AKS with Terraform and ArgoCD.
Built scalable, cost-effective container security auditing in AWS ECR.
Shipped CI/CD pipelines on GitHub Actions and GitLab CI to cut deployment time.
Held Tier 3 on-call, keeping production highly available under pressure.
Cut incident response time 30% with custom Grafana dashboards and proactive Splunk alerting.

InductiveHealth Informatics 2020 — 2021

Software Engineer

Remote — Atlanta, GA

Built and maintained high-throughput, HIPAA-compliant data pipelines for national-scale epidemiological systems.
Hardened test coverage on critical data workflows, sharpening reliability in extraction and reporting.
Shipped new features for an internal CDC web app and closed authorization-flow vulnerabilities along the way.

Waffle House Age 15

Fry Cook

Where it all started

Learned to stay calm under pressure at 3 AM
Mastered parallel processing (multiple orders, one grill)
First exposure to 24/7 operations and incident response
Discovered that "the system is down" hits different at a diner

On the Wall

Education

Bachelor of Computer Science

Georgia State University, Atlanta, GA

Graduated Magna Cum Laude with Honors

Certification

Certified Kubernetes Application Developer

Cloud Native Computing Foundation

Personal Project

IoT Data Logging/UI Android Application

2020

Mobile app that tracks IoT sensor data with cloud storage and alerting thresholds, built around SRE principles of reliability and observability.
