Senior Autonomy Engineer - Data Curation
Skydio
Skydio is the leading US drone company and the world leader in autonomous flight, the key technology for the future of drones and aerial mobility. The Skydio team combines deep expertise in artificial intelligence, best-in-class hardware and software product development, operational excellence, and customer obsession to empower a broader, more diverse audience of drone users, from utility inspectors (https://www.skydio.com/solutions/energy-and-utilities) to first responders (https://www.skydio.com/solutions/public-safety), soldiers in battlefield scenarios (https://www.skydio.com/solutions/national-security/tactical-isr), and beyond (https://www.skydio.com/solutions).
About the Role:
As a Senior Engineer on the Autonomy Data Curation team, you’ll help build a data flywheel. You’ll collect the data that matters most from internal and external fleets of drones and turn it into high-quality, model-ready datasets that Autonomy teams can consume quickly and confidently for training and model development. Close peers will include the Deep Learning and Computer Vision teams.
This role is an individual contributor who will report to the Director of Autonomy Data Curation.
How You’ll Make an Impact:
- Build and operate pipelines that transform raw autonomy logs & media into curated datasets with strong observability and clear ownership to make curated data more broadly reusable.
- Build tooling that makes data discovery and slicing fast and self-serve for Autonomy teams. For example: media search tooling and hard mining loops with infra for auto-routing data to annotation.
- Improve dataset quality and repeatability: versioning, provenance, and automated checks.
- Apply privacy and security requirements throughout our processes (access controls, retention, redaction/anonymization).
- Build with a data-driven, impact-forward mindset, using dashboards that highlight cost, dataset balance, and audit details.
What Makes You a Good Fit:
- 5+ years of professional software engineering experience (or equivalent), with significant ownership of production systems.
- Strong proficiency in programming, demonstrable in at least one of our most frequently used languages (Python/C++).
- Hands-on experience building data pipelines for large-scale datasets (ETL/ELT, streaming or batch, orchestration).
- Experience with data modeling, schema evolution, and dataset/version management.
- Solid understanding of reliability engineering: monitoring, incident response, backfills, and operational rigor.
- Ability to work across ambiguous interfaces (data + tooling + model consumers) and drive decisions.
Nice To Haves:
- Experience with autonomy/robotics data: flight logs, self-driving car data, sensor fusion traces, video, geospatial metadata.
- Experience with labeling workflows, annotation tooling, and labeling QA at scale.
- Familiarity with privacy concepts (PII handling, redaction, access control, audit logs).
- Experience with vector/semant