Grafana Labs · AI & Machine Learning · 3h ago

Staff AI Engineer - Grafana Ops, AI/ML

Remote (USA)
Full-time
USD 174,986 - USD 209,983

Job Description

Introduction

Grafana Labs is a remote-first, open-source powerhouse with over 20M users globally. We help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack. We are scaling fast, maintaining an open-source legacy, a global collaborative culture, and a passion for meaningful work. Our team thrives in an innovation-driven environment where transparency, autonomy, and trust are paramount.

This is a remote opportunity, and at this time we are only considering applicants in USA time zones.

Staff AI Engineer

The Opportunity: At Grafana, we build observability tools that help users understand, respond to, and improve their systems. The Grafana AI teams play a key role by helping users make sense of complex observability data through AI-driven features. These capabilities reduce toil, lower the barrier of domain expertise, and surface meaningful signals from noisy environments. Our team operates with a high degree of autonomy and ownership, empowering engineers to make decisions, move quickly, and validate ideas early.

We’re looking for an AI Software Engineer with a strong software engineering background, a quick iteration mindset, and a passion for experimentation, balanced by a focus on shipping and scaling impactful features. You’ll work closely with cross-functional teams to develop, test, and ship AI-powered features that improve infrastructure and observability quality through automation, and expand AI agents across the observability stack to assist with incident response. As the team matures, there’s a broad opportunity to expand or redefine this role based on impact and initiative.

What You’ll Be Doing:

  • Build and deliver AI solutions: Take ownership of developing high-performance AI features to help users detect, triage, and resolve incidents using observability data and tools.
  • Rapid experimentation and iteration: Implement a highly iterative process for prototyping, testing, and validating with real users, including shipping and evolving LLM- or agent-powered workflows for incident lifecycle management and automated analysis tasks.
  • Collaborate cross-functionally: Work with data analysts, product managers, and designers to shape AI-driven product features, including integration of agentic components with internal tools, alerting systems, runbooks, and developer workflows.
  • Utilize AI tools effectively: Use AI and automation tools to enhance both product functionality and your own development workflows.
  • Effective communication: Communicate clearly and contribute across teams in a highly dynamic and collaborative environment.
  • Ownership and impact: Take full ownership of the AI solutions you develop, ensuring they are innovative, scalable, maintainable, and aligned with real user workflows.

We invest heavily in developer productivity, offering modern AI coding assistants and company-funded usage budgets. We encourage pragmatic AI-assisted development (faster prototyping, test generation, refactors, documentation) paired with strong code review and quality standards. You’ll also have access to frontier models (e.g., GPT-Codex 5/3, Claude Opus 4.6, Gemini 3 Pro).

What Makes You a Great Fit:

  • Strong engineering skills: Solid experience building production software systems (backend and/or full stack). You’re a self-starter, capable of tackling complex engineering problems with minimal supervision.
  • AI experience with a practical mindset: Familiarity with AI technologies and frameworks, with a focus on delivering high-quality, real-world solutions.
  • Quick iteration and experimentation: Comfortable releasing prototypes, collecting feedback, and iterating with a pragmatic mindset.
  • Proven initiative: You take ownership, drive projects forward, and define scope in ambiguous situations.
  • Collaborative attitude: Communicate effectively with peers, product managers, and designers; open to feedback and solutions-oriented.

Requirements:

  • Experience with LLMs, prompt engineering, and building applications powered by GenAI.
  • Proven track record of delivering software that made it into production and is actively used by users.
  • Exposure to working in cloud-native environments (e.g., AWS, GCP, Azure).
  • Experience using observability tools to understand and troubleshoot system behavior.

Bonus Points For:

  • Experience building or working with agent frameworks or multi‑agent workflows.
  • Experience with infrastructure/DevOps-related tooling: Kubernetes, Docker, Terraform, or similar for deployments.
  • Familiarity with model fine-tuning techniques.
  • Experience building observability tooling.

Compensation & Rewards:

In the United States, the base compensation range for this role is USD 174,986 - USD 209,983. Actual compensation may vary based on level, experience, and skillset. Benefits include equity, bonus (if applicable), and other benefits listed here.

All of our roles include Restricted Stock Units (RSUs), giving every team member ownership in Grafana Labs' success. We believe in shared outcomes—RSUs help us stay aligned and invested as we scale globally.

Why You’ll Thrive at Grafana Labs:

  • 100% Remote, Global Culture – As a remote-only company, we bring together talent from around the world, united by a culture of collaboration and shared purpose.
  • Scaling Organization – Tackle meaningful work in a high-growth, ever-evolving environment.
  • Transparent Communication – Expect open decision-making and regular company-wide updates.
  • Innovation-Driven – Autonomy and support to ship great work and try new things.
  • Open Source Roots – Built on community-driven values that shape how we work.
  • Empowered Teams – High trust, low ego culture that values outcomes over optics.
  • Career Growth Pathways – Defined opportunities to grow and develop your career.
  • Approachable Leadership – Transparent execs who are involved, visible, and human.
  • Passionate People – Join a team of smart, supportive folks who care deeply about what they do.
  • In-Person Onboarding – We want you to thrive from day 1, so you'll join your fellow new ‘Grafanistas’ in person to learn all about what we do and how we do it.
  • Balance Is Key – We operate a global annual leave policy of 30 days per year. Three of those days are reserved for Grafana Shutdown Days so the team can really disconnect.
