About Innodata: Innodata (NASDAQ: INOD) is a global data engineering and AI technology company with a 36+ year record of delivering high-quality data and outcomes for its customers. Its mission is to enable the responsible advancement of artificial intelligence: Innodata provides the specialized data, evaluation frameworks, and human expertise required to build enterprise-grade AI systems that can be trusted at scale. The company offers solutions, infrastructure, and engineering services for generative AI builders, large language model developers, and technology adopters worldwide. For audio engineers, Innodata offers a remote environment in which to build automated processing pipelines and produce secure, high-fidelity voice datasets.

Position Overview

We are seeking an analytical, systems-minded Audio Engineer to join our centralized AI and Dataset Production team in a full-time, remote role based in the United States. In this technical leadership role, you will own the technical core of our voice and speech solutions: defining, managing, and optimizing the signal chains, post-processing recipes, and technical specifications that give an Innodata dataset its consistent acoustic “sound signature.” Unlike traditional music tracking or studio mixing work, this role centers on building automated validation checks so that every hour of conversational data, delivered across dozens of languages, meets a strict technical bar. The position calls for an experienced engineer who is fluent in audio quality metrics, scripts data-processing tasks in Python, and partners with Solutions Architects to maximize the downstream performance of speech AI models.

Key Responsibilities

End-to-End Signal Chain Governance: Define, document, and implement the global audio signal chain guidelines and automated post-processing pipelines for all multilingual voice collection programs, in line with established audio engineering standards.
Technical Acoustic Blueprint Specification: Define and enforce technical properties for each dataset category, including sample rate, bit depth, container format, loudness target (LUFS), noise floor, and channel configuration.
Consistent Sound Signature Architecture: Design and maintain a uniform, spec-compliant acoustic profile across highly diverse recording environments, including professional studios, distributed remote setups, and real-world or telephony captures.
Automated Data Quality Assurance Engineering: Write and run automated QA scripts, supported by manual review procedures, to validate large audio batches against customer specifications before final delivery.
Programmatic Audio Batch Processing: Write, test, and scale automated media pipelines and file transformation routines in Python, using tools and libraries including ffmpeg, sox, pydub, and librosa.
Cross-Functional Solution Alignment: Partner with internal Solutions Architects to translate complex or abstract enterprise acoustic requirements into achievable, performant technical specifications and processing recipes.
Vendor Setup Testing and Validation: Specify, audit, and validate external hardware and microphone setups for remote contributors, and run regular remote signal-chain diagnostic checks from a small in-house studio.
Downstream Model Performance Optimization: Continuously analyze speech quality metrics to understand how low-level acoustic engineering decisions affect the training accuracy of text-to-speech (TTS) and automatic speech recognition (ASR) models.
Operational Hygiene and Anti-Scam Safeguards: Follow company transparency guidelines in your daily work, handle company data securely, and use only the official verifyjoboffer@innodata.com address for job-offer verification and reporting.

Required Skills & Qualifications

Proven professional experience in advanced audio engineering, digital signal processing (DSP), acoustic quality assurance, audio post-processing automation, or technology-sector media consulting.
Deep technical command of audio properties, sample rate conversion principles, codec parameters, gain staging, LUFS loudness normalization, spectral noise reduction, and linguistic data structures.
Expert-level ability to design repeatable mastering recipes and troubleshoot signal anomalies using audio engineering software and batch utilities.
Practical experience writing automation scripts, parsing file metadata, and configuring continuous media processing pipelines in Python, specifically with ffmpeg, sox, or pydub.
Demonstrated experience optimizing voice data specifically for training text-to-speech (TTS), automatic speech recognition (ASR), or conversational AI systems.
Excellent verbal and written communication skills in business-fluent English, enabling effective collaboration across distributed multilingual teams and precise documentation of technical specifications.
Location: This position is open only to qualified audio engineers who reside permanently in the **United States**. The role is 100% remote.

Preferred Strategic Indicators (Nice to Have)

Prior commercial audio or data engineering experience at an enterprise speech AI company, big-data annotation marketplace, linguistic research lab, or high-throughput podcast or voice-over production platform.
Hands-on familiarity with objective speech metrics such as PESQ, STOI, or WER, and with how they vary across cloud network conditions.
Exposure to database repositories or basic data curation workflows under Git version control.
A process-oriented, outcome-driven mindset, a passion for consistency at scale rather than one-off recordings, and a desire to work at the forefront of global artificial intelligence training.

What We Offer

Experience-Calibrated United States Salary: A full-time annual base salary of $120,000 – $160,000 USD, calibrated to your audio engineering expertise, Python skills, and dataset governance experience.
The opportunity to direct, shape, and build the automated processing pipelines and acoustic sound signatures behind the data for the world’s leading speech AI models.
A fully remote, work-from-home arrangement with scheduling trust and no commute, from anywhere in the United States.
Direct exposure to a highly visible, NASDAQ-listed global technology company with a 36+ year reputation for uncompromising technical quality.
Continuous skill development under experienced technical leads, covering advanced programmatic media transformations, large-scale data schema optimization, and frontier conversational AI platforms.


