Airtable is the no-code app platform that empowers people closest to the work to accelerate their most critical business processes. More than 500,000 organizations, including 80% of the Fortune 100, rely on Airtable to transform how work gets done.
The Observability team at Airtable ensures that our engineers have the tools they need to measure performance, monitor reliability, and debug issues in real time. Our mission is to provide actionable insights into errors and crashes, fueling a better and more reliable experience for millions of users. We build the logging, metrics, and tracing systems used by nearly every engineering team at Airtable.
We also work on LLM observability for AI-powered features, providing visibility into prompts, model calls, and retrieval-augmented generation (RAG) components, with a focus on latency, reliability, cost, safety signals, and evaluation quality.
If you’re excited about building resilient systems at scale, empowering engineers with best-in-class observability, and shaping the future of Airtable’s infrastructure, we’d love to hear from you.
What You’ll Do:
Architect and scale core observability
- Lead the design and evolution of logging, metrics, and tracing pipelines to handle massive data volumes
- Evaluate and integrate new technologies (e.g., OpenTelemetry, ClickHouse, the ELK stack) that strengthen Airtable’s observability posture
- Guide and mentor a growing team of infrastructure engineers; share best practices in distributed tracing, monitoring, and logging
- Define and uphold coding standards and operational excellence across the org
- Partner with Deploy Infrastructure, Service Orchestration, and Product teams to embed observability throughout the development lifecycle
- Align infrastructure decisions with business goals to detect issues before they impact customers
- Own end-to-end reliability for observability tools and establish SLAs, SLOs, and error budgets
- Optimize performance and cost of large-scale data pipelines and storage
- Shape the observability roadmap, prioritizing initiatives like improved tracing coverage, advanced monitoring dashboards, and next-gen logging pipelines
- Track emerging observability tools and practices to keep Airtable’s monitoring capabilities at the cutting edge
Extend observability to LLM and AI features
- Instrument prompts, model calls, and RAG pipelines to capture latency, reliability, cost, and safety signals
- Design online and offline evaluation loops for LLM quality, includin