A sovereign voice AI appliance — 2U, on-prem — ships the entire voice AI platform inside: the same model, the same pipeline, and the same APIs across Auricus Voice 8 / 16 / 32.
Two paths, one stack
- Live and near-real-time — windowed streaming for contact centers, supervision, and compliance workflows. ~48–64 concurrent near-real-time streams per Auricus Voice 32 appliance before queue limits dominate.
- Batch at scale — sustained ~100 files/min on Auricus Voice 32; end-to-end (ingest, language ID, delivery) ~240–300 files/hour under typical overhead.
Same pipeline serves both — no separate “real-time” and “batch” SKUs, no second integration to build.
Quality observability
- Per-language WER dashboards — separate tracking for 16 kHz vs 8 kHz so phone-band quality is never hidden behind wideband averages.
- Continuous evaluation harness — same suite used for customer pilots and internal regression checks, so the numbers we share with you are directly comparable to the numbers we run against.
- Customer-shared benchmarks — published quality figures will be added here as we complete the next round of cross-language reference runs.
Language reach
- ~99 transcription languages.
- 107 automatic language-ID classes (93.3% benchmark accuracy).
- Mixed-language deployments handled by per-language priors and regional defaults.
- Language ID runs in the sub-second band on the primary path; a CPU fallback is available for resilience.
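The per-language priors and regional defaults mentioned above can be sketched as a simple re-weighting of raw language-ID scores. A minimal sketch, assuming a score-dictionary interface; the language codes, scores, and prior values below are illustrative, not the product's actual API:

```python
# Sketch: re-weighting language-ID scores with per-language deployment priors.
# All scores, language codes, and prior values here are illustrative.

def apply_priors(lid_scores: dict[str, float], priors: dict[str, float]) -> dict[str, float]:
    """Multiply each raw LID score by a deployment prior and renormalise."""
    weighted = {lang: score * priors.get(lang, 1.0) for lang, score in lid_scores.items()}
    total = sum(weighted.values())
    return {lang: w / total for lang, w in weighted.items()}

# Raw classifier output: Spanish and Portuguese are near-ties.
raw = {"es": 0.40, "pt": 0.38, "en": 0.22}
# A Spain-region deployment biases toward Spanish and away from Portuguese.
regional_prior = {"es": 2.0, "pt": 0.5, "en": 1.0}

posterior = apply_priors(raw, regional_prior)
best = max(posterior, key=posterior.get)  # → "es" under this prior
```

The same mechanism covers mixed-language fleets: each site ships its own prior table while the underlying classifier stays identical.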
Integration
- REST + JSON ingestion over HTTPS.
- Bearer-token authentication.
- Async worker model — submit a job, then either poll or receive a webhook callback with the result.
- Webhook events: transcript (raw transcript) and wer (corrected transcript, raw transcript, and measured WER when ground truth is supplied).
- HTTP 429 backpressure with Retry-After semantics for both per-job and queue-depth limits.
- Webhook retries with dead-letter handling for terminal failures.
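The submit-then-poll half of the async worker model, including the 429 / Retry-After backpressure handling, can be sketched as a small polling loop. This is a sketch only: the endpoint path, field names, and response shape are assumptions, and `fetch` stands in for a real HTTP client such as `requests`:

```python
# Sketch of the poll side of the async job flow, honouring HTTP 429 backpressure.
# Endpoint path, field names, and response shapes are assumptions, not the
# documented API; `fetch` stands in for a real HTTP client.
import time

def poll_job(fetch, job_id: str, max_attempts: int = 10, poll_interval: float = 1.0) -> dict:
    """Poll a submitted job until it completes, waiting out Retry-After on 429."""
    for _ in range(max_attempts):
        status, headers, body = fetch(f"/v1/jobs/{job_id}")
        if status == 429:
            # Backpressure: wait the server-suggested interval, then retry.
            time.sleep(float(headers.get("Retry-After", 1)))
            continue
        if status == 200 and body.get("state") == "done":
            return body
        time.sleep(poll_interval)  # still running; poll again
    raise TimeoutError(f"job {job_id} did not finish within {max_attempts} attempts")

# Demo with a canned response sequence instead of a live appliance.
responses = iter([
    (429, {"Retry-After": "0"}, {}),                      # queue-depth limit hit
    (200, {}, {"state": "running"}),                      # job still in flight
    (200, {}, {"state": "done", "transcript": "hello"}),  # finished
])
result = poll_job(lambda path: next(responses), "job-123", poll_interval=0)
```

In production the same loop is usually replaced by the webhook callback; polling remains useful as a fallback when inbound webhooks are blocked by network policy.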
→ Public spec sheet: Specifications.
Observability
Auricus Voice ships with the operational surface area of a modern platform — not a black-box appliance.
- Prometheus metrics endpoint in standard exposition format. Stage durations, device utilisation, queue depth, jobs in flight, jobs by language, WER ratio, language-detection counters, canary metrics.
- Grafana dashboards: system overview, pipeline performance (P50 / P95 / P99 + per-stage breakdown), quality (WER trends + per-language), device health (temperature, power, utilisation, errors), queue management, SLO tracking, language analytics.
- Structured audit logs — every job submission, completion, and failure emitted with a request ID for downstream SIEM ingestion.
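Because the metrics endpoint uses the standard Prometheus text exposition format, any scraper or a few lines of code can read it. A minimal sketch of parsing simple gauge lines; the metric names below are illustrative stand-ins, not the appliance's documented names:

```python
# Sketch: reading gauges out of a Prometheus text-format scrape.
# Metric names below are illustrative; the real endpoint's names may differ.

def parse_exposition(text: str) -> dict[str, float]:
    """Parse simple `name value` / `name{labels} value` lines, skipping comments."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name_part, value = line.rsplit(" ", 1)
        metrics[name_part] = float(value)
    return metrics

sample = """\
# HELP queue_depth Jobs waiting in the queue
# TYPE queue_depth gauge
queue_depth 7
jobs_in_flight 12
jobs_total{language="es"} 341
"""
m = parse_exposition(sample)
```

In practice you would point Prometheus itself at the endpoint rather than hand-parsing; the sketch only shows that the format is plain text and trivially machine-readable.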
Service-level objectives
| SLO | Target |
| --- | --- |
| Availability | 99.9% (1-hour completed-vs-all-events ratio); ~43.2 min downtime budget per 30 days |
| Latency | 95% of pipeline stages complete within ≤ 10 s (1-hour rolling window) |
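The ~43.2-minute downtime budget follows directly from the 99.9% target over a 30-day window:

```python
# Error budget for a 99.9% availability target over a 30-day window.
target = 0.999
window_minutes = 30 * 24 * 60              # 43,200 minutes in 30 days
budget_minutes = (1 - target) * window_minutes
# budget_minutes ≈ 43.2, the downtime budget quoted for the availability SLO.
```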
Sustainability
- Auricus Voice 32 at full load: ~600 W typical appliance draw — vs comparable GPU-based racks measured in kilowatts.
- 2U chassis replaces multi-server GPU racks, cutting embodied materials, cooling load, and end-of-life e-waste.
- End-of-life take-back under the EPR framework — vendor-managed return logistics for retired appliances; no customer responsibility for downstream WEEE handling.
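The power gap compounds over a year of continuous operation. A back-of-envelope sketch, assuming 24/7 duty; the 4 kW rack figure is an illustrative comparison point, not a measured competitor number:

```python
# Back-of-envelope annual energy at the ~600 W full-load figure, assuming
# 24/7 operation. The 4 kW GPU-rack figure is a hypothetical comparison point.
hours_per_year = 365 * 24                  # 8,760 h
appliance_kwh = 0.6 * hours_per_year       # ~5,256 kWh/yr at 600 W
gpu_rack_kwh = 4.0 * hours_per_year        # a hypothetical 4 kW GPU rack
ratio = gpu_rack_kwh / appliance_kwh       # rack draws ~6.7x the energy
```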
SKU comparison
| SKU | Concurrent live calls | Matched annual audio (M min/yr) | Batch (files/min) | End-to-end (files/hr) | Peak accelerator power | Target use case |
| --- | --- | --- | --- | --- | --- | --- |
| Auricus Voice 8 | 12 | 2 | 25 | 60–75 | 80 W | Mid-size contact center, departmental fleet |
| Auricus Voice 16 | 24 | 4 | 50 | 120–150 | 160 W | Large enterprise contact center, regional carrier |
| Auricus Voice 32 | 48 | 8 | 100 | 240–300 | 320 W | National contact center, telco / government scale |
- Concurrent live calls = sustained concurrent near-real-time conversations per appliance.
- Matched annual audio = realistic operational capacity per appliance under a typical mixed real-time + batch duty cycle.
- Batch (files/min) = sustained transcription throughput per appliance.
- End-to-end (files/hr) = end-to-end throughput including ingest, language ID, decode, and delivery overhead.
- Peak accelerator power scales linearly across the family; add ~120 W chassis baseline for total appliance draw.
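The total-draw rule in the footnotes is simple arithmetic per SKU:

```python
# Total appliance draw per SKU under the footnote's rule:
# peak accelerator power plus the ~120 W chassis baseline.
CHASSIS_BASELINE_W = 120
accelerator_w = {
    "Auricus Voice 8": 80,
    "Auricus Voice 16": 160,
    "Auricus Voice 32": 320,
}
total_draw_w = {sku: w + CHASSIS_BASELINE_W for sku, w in accelerator_w.items()}
# e.g. Auricus Voice 32 → 440 W nominal from this rule.
```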
→ See the matched-workload savings comparison vs cloud STT.