Release Engineer - Data Plane

ClickHouse
2 months ago
Full-time
Remote
Worldwide
Remote Engineering

About ClickHouse

Recognized on the 2025 Forbes Cloud 100 list, ClickHouse is one of the most innovative and fast-growing private cloud companies. With more than 3,000 customers and ARR that has grown over 250 percent year over year, ClickHouse leads the market in real-time analytics, data warehousing, observability, and AI workloads.

The company’s sustained, accelerating momentum was recently validated by a $400M Series D financing round. Over the past three months, customers including Capital One, Lovable, Decagon, Polymarket, and Airwallex have adopted the platform or expanded existing deployments. These customers join an established base of AI innovators and global brands such as Meta, Cursor, Sony, and Tesla.

We’re on a mission to transform how companies use data. Come be a part of our journey!

About the team

The Release Team owns the safe, continuous delivery of ClickHouse Cloud, a managed database platform running tens of thousands of ClickHouse clusters. We are responsible for upgrading and maintaining those clusters at scale, building the internal tooling that makes it possible, and being the last line of defense when something doesn't go according to plan.

About the role

This role is an equal split between operational execution and software development. You are responsible for the operational side: coordinating and running upgrades, dealing with edge cases that don't fit the happy path, and keeping tens of thousands of clusters healthy in production. At the same time, you are building and constantly improving the systems that make the next rollout safer and more automated than the last. If you find satisfaction in both writing the playbook and executing it, including the messy parts, this role is for you.

What you'll do

  • Plan and execute rolling upgrades across tens of thousands of ClickHouse clusters, ensuring safety, correctness, and minimal customer impact
  • Own the full release pipeline, from pre-upgrade validation and staged rollouts to post-upgrade monitoring and incident response
  • Investigate and resolve production issues as part of a regular on-call rotation, including snowflake clusters and edge cases that automation can't yet handle
  • Build and improve the internal tooling and automation that make large-scale database operations reliable and repeatable
  • Work closely with the core database and cloud infrastructure teams to identify operational pain points and turn them into solved problems
  • Support and educate other engineering teams using our internal tools