Senior Technical Program Manager, Machine Learning Infrastructure
Canada
United States
Job Description
About Cohere: Cohere is the leading security-first enterprise AI company, co-headquartered in Toronto and San Francisco, with key international footprints across London, New York City, Montreal, Seoul, Germany, and Paris. We build and deploy cutting-edge foundation AI models and end-to-end solutions engineered specifically to resolve complex, real-world business challenges at scale. Comprising a world-class collective of researchers, infrastructure developers, and product designers, we obsess over software craftsmanship and operational precision, building the robust infrastructure that enables widespread global enterprise adoption of frontier AI systems.
Position Overview
We are seeking a logical, execution-oriented, and technically deep Senior Technical Program Manager (TPM) to lead our Machine Learning Infrastructure portfolio. This is a permanent, full-time remote position open to candidates in Canada, Europe, or the United States. You will own the program narrative, milestone and dependency mapping, and deployment lifecycle accountability for our high-throughput inference, efficiency, model serving, and API endpoint systems. This is not a static project-scheduling or clerical data-entry role: you will actively drive system scaling, incident root-cause mitigation, and cross-disciplinary resource coordination, partnering directly with core modeling researchers, distributed systems developers, customer success directors, and executive enterprise stakeholders. The role requires 5+ years of specialized depth in the ML space, along with the ability to apply engineering best practices to ML operations, diagnose structural pain points in fast-paced, low-structure environments, and direct overlapping technical initiatives in support of a rapidly expanding international user base.
Key Responsibilities
- Infrastructure Program Portfolio Governance: Own and drive the end-to-end strategic execution of technical portfolios covering runtime inference, computing efficiency, containerized serving, and endpoint design to guarantee model stability for thousands of enterprise clients.
- Cross-Functional Architecture Coordination: Lead rigorous coordination across core modeling researchers, compute infrastructure developers, and customer-facing engineering teams to improve deployment velocity for machine learning systems.
- Process Engineering and Bottleneck Removal: Diagnose operational pain points and build scalable, repeatable workflows that filter out administrative noise, allowing engineering squads to focus strictly on system development and feature optimization.
- Incident Mitigation and Root-Cause Analysis: Partner closely with incident management leads to ensure production bottlenecks and deployment anomalies are rigorously investigated, documented, and remediated via timely technical fixes.
- Ruthless Priority Management: Navigate complex overlapping project dependencies, systematically vetting and filtering demands to ensure that Cohere’s high-impact strategic milestones are achieved cleanly.
- Multi-Tier Stakeholder Communication: Deliver exceptionally clear, consistent, and timely program updates, timeline maps, and system health charts tailored accurately for both deeply technical developers and non-technical business leadership.
- Strategic Technical Partnership: Act as a primary operational advisor to senior technical directors, providing precise low-level tactical diagnostics alongside macro-level strategic planning frameworks to maximize engineering impact.
Required Skills & Qualifications
- A minimum of 5 years of verified professional history operating as a Technical Program Manager (TPM) or Engineering Program Manager (EPM) within a high-concurrency technology company.
- Mandatory Portfolio Specialization: Direct experience managing technical programs specifically within Machine Learning Infrastructure, with a deep understanding of model inference mechanics, serving architectures, endpoint implementation, and compute efficiency.
- In-depth technical knowledge of ML infrastructure system interactions and modern engineering development standards, enabling a reliable balance of velocity and structure.
- Proven capability to step into highly dynamic, fast-paced, and low-structure environments independently, rolling up your sleeves to map clarity out of operational ambiguity.
- Outstanding verbal and written communication strengths in English, with the conversational flexibility required to navigate dense software engineering reviews and executive leadership updates smoothly.
- Timezone Alignment: While we accept applications globally, we strongly prioritize candidates located in, or fully aligned with, the Eastern Time (ET) timezone.
- Location: This is a 100% remote position open to qualified engineering program managers residing permanently in Canada, Europe, or the United States.
Preferred Strategic Indicators (Nice to Have)
- Prior technical background in a hands-on software development, DevOps infrastructure, or machine learning engineering role, allowing you to dive into complex internal code repositories as a subject matter expert.
- Clear conceptual comprehension of how large foundation models are trained, tuned, and embedded into production architectures.
What We Offer
- Compensation: Highly competitive global base salary backed by generous equity grants calibrated to your infrastructure management depth.
- A flexible, remote-friendly work environment offering fully supported home office set-up stipends ($500) or comprehensive monthly local co-working desk allowances.
- Robust personal benefits portfolio including top-tier Health and Dental insurance lines alongside a dedicated separate budget for mental health care.
- RRSP matching, 401(k) pathways, and region-specific Pension Scheme structures tailored directly to your local workspace country.
- Unmatched personal flexibility featuring 6 full weeks of paid annual vacation (30 working days) plus a $75 weekly corporate lunch stipend.
- 100% Paid Parental Leave top-up protection extending for up to 6 months for either parent.
- Annual enrichment allowances covering arts/culture, fitness/wellness, family quality time, and an annual training budget for technical conferences, courses, and executive coaching.
- Fully paid travel provisions to visit international company offices, augmented by our highly anticipated annual company-wide team offsite retreats.