Release Engineer - Data Plane
Job Description
About ClickHouse
Recognized on the 2025 Forbes Cloud 100 list, ClickHouse is one of the most innovative and fast-growing private cloud companies. With more than 3,000 customers and ARR that has grown over 250 percent year over year, ClickHouse leads the market in real-time analytics, data warehousing, observability, and AI workloads.
About the team
The Release Team owns the safe, continuous delivery of ClickHouse Cloud, a managed database platform running tens of thousands of ClickHouse clusters. We are responsible for upgrading and maintaining those clusters at scale, building the internal tooling that makes it possible, and being the last line of defense when something doesn't go according to plan.
About the role
This role is an equal split between operational execution and software development. You are responsible for the operational side: coordinating and running upgrades, dealing with edge cases that don't fit the happy path, and keeping tens of thousands of clusters healthy in production. At the same time, you are building and constantly improving the systems that make the next rollout safer and more automated than the last. If you find satisfaction in both writing the playbook and executing it, including the messy parts, this role is for you.
What you'll do
- Plan and execute rolling upgrades across tens of thousands of ClickHouse clusters, ensuring safety, correctness, and minimal customer impact
- Own the full release pipeline: from pre-upgrade validation and staged rollouts to post-upgrade monitoring and incident response
- Investigate and resolve production issues as part of a regular on-call rotation, including snowflake clusters and edge cases that automation can't yet handle
- Build and improve the internal tooling and automation that makes large-scale database operations reliable and repeatable
- Work closely with the core database and cloud infrastructure teams to identify operational pain points and turn them into solved problems
- Support and educate other engineering teams using our internal tools
About you
- 5+ years of experience operating stateful distributed systems in production, such as databases, message queues, or storage systems
- Hands-on experience running upgrades or maintenance operations on live production data stores, at scale
- Strong production debugging skills; you are comfortable digging into unfamiliar systems under pressure
- Experience with cloud infrastructure (AWS, Azure, or GCP) and Kubernetes
- Software development experience in Go (or strong experience in another language and genuine willingness to learn)
- Experience with ClickHouse preferred (as a user, operator or contributor)
Compensation
For roles based in the United States, the typical starting salary range for this position is listed above. In certain locations, such as the San Francisco Bay Area and the New York City Metro Area, a premium market range may apply, as listed.
Perks
- Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries.
- Healthcare - Employer contributions towards your healthcare.
- Equity in the company - Every new team member who joins our company receives stock options.
- Time off - Flexible time off in the US, generous entitlement in other countries.
- A $500 home office setup if you're a remote employee.
- Global Gatherings - We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites.