Software Engineer, Data Platform

Notion
10 months ago
Full-time
Remote
Worldwide
Remote Engineering
ABOUT US:

Notion helps you build beautiful tools for your life’s work. In today's world of endless apps and tabs, Notion provides one place for teams to get everything done, seamlessly connecting docs, notes, projects, calendar, and email—with AI built in to find answers and automate work. Millions of users, from individuals to large organizations like Toyota, Figma, and OpenAI, love Notion for its flexibility and choose it because it helps them save time and money.

In-person collaboration is essential to Notion's culture. We require all team members to work from our offices on Mondays, Tuesdays, and Thursdays, our designated Anchor Days. Certain teams or positions may require additional in-office workdays.


ABOUT THE ROLE:

Join Notion’s Data Platform team as we build out the data infrastructure for Notion's next phase of large enterprise customers and novel agentic data use cases. You’ll help design and build the core data platform that powers Notion’s AI, analytics, and search while meeting stringent security, privacy, and compliance requirements. This role focuses on the data platform layer (storage, compute, pipelines, governance) and partners closely with Infrastructure, Security, Search Platform, AI, and Data Engineering.


WHAT YOU'LL DO:

- Design and evolve the data lakehouse

Build and operate core lakehouse components (e.g., Iceberg/Hudi/Delta tables, catalogs, schema management) that serve as the source of truth for analytics, AI, and search.

- Own critical data pipelines and services

Design, implement, and harden batch and streaming pipelines (Spark, Kafka, etc.) that move and transform data reliably across regions.

- Advance EKM and encryption-by-design

Work with Security and platform teams to integrate Enterprise Key Management (EKM) into data workflows, including file- and record-level encryption and safe key handling in Spark and storage systems.

- Improve data access, auditability, and residency

Build primitives for fine-grained access control, auditing, and data residency so customers can see who accessed what, where, and under which guarantees.

- Drive reliability and observability

Raise the operational bar for our data stack: improve on-call experience, debugging, and alerting for data jobs and services.

- Optimize large-scale performance and cost

Tackle performance and cost challenges across Kafka, Spark, and storage for very large workspaces (20k+ users, multi-cell deployments).

- Shape the platform roadmap

Provide product and infrastructure engineering teams with reliable, scalable data infrastructure that enables novel large-volume, agentic use cases.

SKILLS YOU'LL NEED:

- Experience: 2+ years building and operating data platforms or large-scale infrastructure for SaaS or similar environments.

- Programming: Strong skills in at least one of Python, Scala, or TypeScript; comfortable working with SQL for analytics and data modeling.

- Distributed data systems: