Senior/Staff Platform Engineer
Job Description
About VRChat: VRChat offers a first-of-its-kind social platform that powers a rapidly growing collection of immersive virtual reality experiences. With over 250,000 custom-created worlds and counting, VRChat empowers a global community to bring their imaginations to life on any device, anywhere in the world. Backed by $100M in venture funding from investors such as Makers Fund, Anthos Capital, and HTC, our remote-first engineering team includes veterans from Netflix, Meta, Google, and Discord.
Position Overview
We are seeking a Senior/Staff Platform Engineer to lead the reliability, performance, and scalability of our production infrastructure. Reporting directly to the Head of Platform, you will take full operational ownership of the systems behind the platform and keep them tuned and highly available. This role combines classical site reliability engineering with data-driven governance, using error budgets, SLIs/SLOs, and DORA metrics to support millions of concurrent real-time connections while managing large-scale distributed database clusters.
Key Responsibilities
- Production Infrastructure Governance: Manage and optimize our large-scale production infrastructure with a focus on high availability, performance, security, and cost efficiency.
- Data-Driven SRE Practices: Define, monitor, and enforce reliability targets using SLIs, SLOs, SLAs, error budgets, and DORA metrics.
- Telemetry & Incident Management: Design monitoring, logging, and real-time alerting systems and dashboards, and drive end-to-end incident management, root cause analysis (RCA), and blameless postmortems.
- Infrastructure-as-Code Automation: Standardize and accelerate operational workflows by writing declarative infrastructure configurations with IaC tools such as Terraform or OpenTofu.
- NoSQL Datastore Orchestration: Maintain, configure, and optimize high-concurrency database deployments across MongoDB, Elasticsearch, and Redis.
- Cross-Functional Code Engagement: Collaborate closely with product engineering teams through code reviews and occasional backend feature or tooling development to build strong shared technical context.
Required Skills & Qualifications
- 8+ years of professional experience in Site Reliability Engineering (SRE), DevOps, Platform Engineering, or Infrastructure Engineering roles.
- Deep, hands-on production experience scaling high-concurrency, distributed cloud or hybrid cloud computing environments.
- Expertise automating cloud infrastructure with infrastructure-as-code tools such as Terraform or OpenTofu.
- Strong command of Linux systems engineering, low-level networking, and observability practices.
- Solid track record operating, configuring, and optimizing MongoDB, Elasticsearch, and Redis datastores.
- Outstanding technical communication skills, with the ability to turn highly ambiguous infrastructure bugs into clear, measurable engineering solutions.
- Location: 100% remote-first company, open to qualified candidates worldwide (Global Remote).
Preferred Qualifications (Nice to Have)
- Production experience managing AWS environments, including cost optimization and secure multi-account organizational architectures.
- Familiarity with running containerized workloads reliably on Kubernetes, including service meshes, ingress configuration, and advanced networking solutions such as Cilium.
- Prior experience supporting large-scale storage systems, real-time distributed pipelines, or content delivery networks (CDNs).
What We Offer
- The opportunity to scale and secure a virtual reality platform relied upon daily by millions of creators worldwide.
- Competitive global compensation package, including stock options.
- Comprehensive health benefits and the freedom to work from anywhere.
- Retirement matching programs (401(k) for US employees, RRSP for Canadian employees).
- Generous paid holidays, flexible/unlimited vacation, and paid parental leave.