Public datasheet. All key figures describe Auricus Voice 32 unless otherwise noted; throughput, concurrency, and accelerator power scale linearly across Auricus Voice 8 / 16 / 32.

For further integration details, contact us; full documentation is shared under NDA with qualified pilot customers.

The tabular sections below remain in English to stay aligned with the reference publication.

Hardware & deployment

Parameter Value
Form factor Purpose-built rack-mount appliance
Product family Auricus Voice 8 / 16 / 32 (8 / 16 / 32 purpose-built edge AI accelerators per chassis, field-flexible)
Typical chassis power (Voice 32) ~600 W at full load (incl. host CPU, RAM, NICs, fans, PSU losses)
Inference stack Compiled INT8
Deployment model Single appliance, on-prem, no inbound internet dependency for inference

Performance — streaming / near–real-time

Metric Auricus Voice 32
Sustained concurrent streams ~48–64
Latency profile Bounded by your rack — no provider RTT, no shared-region jitter

Performance — batch

Metric Auricus Voice 32
Throughput ~100 files/min
End-to-end (ingest → language ID → decode → delivery) ~240–300 files/hour
Compressed-source overhead ~10–20% additional CPU time on MP3 vs uncompressed WAV
Long-form (> 30 s) Supported with reduced per-file efficiency vs short clips

Audio formats

Parameter Value
Containers / codecs WAV, MP3, FLAC, OGG
Sample rates Phone-band (8 kHz) and wideband (16 kHz) — quality dashboards track both separately
Channels Mono and stereo input; per-channel handling configurable
Long audio No hard upper bound; raise client read timeouts for files > ~15 min

Languages & language identification

Parameter Value
Transcription languages ~99 (multilingual coverage)
Language-ID classes 107 (automatic detection)
LID benchmark accuracy 93.3% on the reference corpus
Detection latency Sub-second on the primary path; CPU fallback available
Per-language priors Configurable boosts and regional defaults for mixed-language deployments

API & integration

Parameter Value
Transport HTTPS / HTTP, REST + JSON
Authentication Bearer token
Async model Job submission, asynchronous workers, horizontal scale-out across accelerators
Result delivery Polling or webhook callback to a customer-supplied URL
Webhook events transcript (raw transcript) · wer (corrected + raw + measured WER when ground truth is supplied)
Backpressure Standard HTTP 429 with Retry-After semantics for both per-job and queue-depth limits
Webhook reliability Retries with dead-letter handling for terminal failures
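As a sketch of the async flow above: a client submits a job over REST + JSON with a bearer token and backs off on HTTP 429 using Retry-After. The transport, auth, and backpressure semantics come from this table; the `/v1/jobs` path, payload shape, and `backoff_delay` helper are illustrative placeholders, since real endpoint paths are shared only under NDA.

```python
import json
import time
import urllib.error
import urllib.request

def backoff_delay(headers, attempt, base=1.0, cap=60.0):
    """Honor a Retry-After header (seconds form) if present, else exponential backoff."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return min(float(retry_after), cap)
        except ValueError:
            pass  # HTTP-date form of Retry-After not handled in this sketch
    return min(base * (2 ** attempt), cap)

def submit_job(base_url, token, audio_path):
    """POST an audio file as an async job; retries on HTTP 429 backpressure.

    '/v1/jobs' is a placeholder path, not the documented endpoint.
    """
    with open(audio_path, "rb") as f:
        body = f.read()
    for attempt in range(6):
        req = urllib.request.Request(
            f"{base_url}/v1/jobs",
            data=body,
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "audio/wav",
            },
            method="POST",
        )
        try:
            # Generous read timeout: the spec advises raising client read
            # timeouts for files longer than ~15 minutes.
            with urllib.request.urlopen(req, timeout=1800) as resp:
                return json.load(resp)["job_id"]
        except urllib.error.HTTPError as e:
            if e.code != 429:
                raise
            time.sleep(backoff_delay(e.headers, attempt))
    raise RuntimeError("job submission kept hitting backpressure")
```

Polling would then GET the job resource until it reaches a terminal state; registering a webhook callback avoids polling entirely.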

Detailed endpoint paths, queue topology, port assignments, and tuning knobs are shared with pilot customers under NDA.
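On the customer side, the webhook callback URL only needs to accept the two event types listed above and acknowledge quickly, so the appliance-side retry and dead-letter path is not triggered needlessly. A minimal receiver sketch follows; the event names `transcript` and `wer` come from the table, but the `event` / `data` envelope is an assumed payload shape, not the documented schema.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def parse_event(body: bytes):
    """Split a webhook payload into (event_type, payload).

    The {"event": ..., "data": ...} envelope is an assumption for this
    sketch; the real schema is part of the NDA documentation.
    """
    msg = json.loads(body)
    event = msg["event"]
    if event not in ("transcript", "wer"):
        raise ValueError(f"unknown event type: {event}")
    return event, msg["data"]

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event, data = parse_event(self.rfile.read(length))
        except (ValueError, KeyError, json.JSONDecodeError):
            # Malformed payloads get a 4xx so they end up in the
            # dead-letter path instead of being retried forever.
            self.send_response(400)
            self.end_headers()
            return
        # Acknowledge before doing heavy work; slow 2xx responses risk
        # tripping the sender's retry logic.
        self.send_response(204)
        self.end_headers()
        print(f"{event}: {json.dumps(data)[:120]}")

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), WebhookHandler).serve_forever()
```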

Observability

Parameter Value
Metrics endpoint Prometheus exposition format
Metric families Stage durations, device utilisation, queue depth, jobs in flight, jobs by language, WER ratio, language-detection counters, canary metrics
Shipped Grafana dashboards System overview, pipeline performance (P50/P95/P99 + per-stage breakdown), quality (WER trends + per-language), device health, queue management, SLO tracking, language analytics
SLO — availability 99.9% target (1-hour completed-vs-all-events ratio); ~43.2 min downtime / 30-day budget
SLO — latency 95% of pipeline stages complete within ≤ 10 s (1-hour rolling)
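The ~43.2-minute figure follows directly from the 99.9% availability target over a 30-day window; a one-line arithmetic check (no product API involved):

```python
def downtime_budget_minutes(availability_target: float, window_days: int) -> float:
    """Error budget: the fraction of the window the SLO allows to be down."""
    return (1.0 - availability_target) * window_days * 24 * 60

# 99.9% over a 30-day window leaves about 43.2 minutes of budget.
print(round(downtime_budget_minutes(0.999, 30), 1))
```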

Compliance & data handling

Parameter Value
Network egress None required for inference
Audio retention Configurable; default is in-memory plus working storage purged after job completion
Transcript retention Customer-controlled
Encryption in transit TLS terminates at the appliance ingress
Audit logging Job submission, completion, and failure events emitted as structured logs with request IDs
Personal-data minimisation Audio retained only for job duration; transcripts delivered and purged from appliance-side stores per customer policy
No automated individual decision-making Transcription only; no Art. 22 GDPR profiling or automated decision-making in the product

→ Full compliance posture, including the EU Digital Omnibus mapping: Compliance.

Variants comparison

SKU Concurrent live calls Matched annual audio (M min/yr) Batch (files/min) End-to-end (files/hr) Peak accelerator power Target use case
Auricus Voice 8 12 2 25 60–75 80 W Mid-size contact center, departmental fleet
Auricus Voice 16 24 4 50 120–150 160 W Large enterprise contact center, regional carrier
Auricus Voice 32 48 8 100 240–300 320 W National contact center, telco / government scale

Concurrent live calls = sustained concurrent near-real-time conversations per appliance.
Matched annual audio = realistic operational capacity per appliance under a typical mixed real-time + batch duty cycle.
Batch (files/min) = sustained transcription throughput per appliance.
End-to-end (files/hr) = end-to-end throughput including ingest, language ID, decode, and delivery overhead.
Peak accelerator power scales linearly across the family; add ~120 W chassis baseline for total appliance draw.
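Because the family scales linearly, right-sizing reduces to table lookup and arithmetic. The sketch below picks the smallest SKU covering a workload, using the figures from the variants table and the ~120 W chassis baseline noted above; the `pick_sku` helper itself is illustrative, not a product tool.

```python
# (sku, concurrent live calls, batch files/min, peak accelerator watts)
# Figures taken from the variants table above.
FAMILY = [
    ("Auricus Voice 8", 12, 25, 80),
    ("Auricus Voice 16", 24, 50, 160),
    ("Auricus Voice 32", 48, 100, 320),
]

CHASSIS_BASELINE_W = 120  # per the note under the variants table

def pick_sku(required_calls: int, required_files_per_min: int = 0):
    """Smallest SKU covering both live and batch requirements.

    Returns (sku_name, approximate total appliance draw in watts:
    peak accelerator power plus the chassis baseline).
    """
    for sku, calls, files, watts in FAMILY:
        if calls >= required_calls and files >= required_files_per_min:
            return sku, watts + CHASSIS_BASELINE_W
    raise ValueError("workload exceeds a single Voice 32 appliance")
```

For example, a site needing 20 concurrent calls and 40 files/min lands on Auricus Voice 16.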


Specifications represent the current production reference appliance and are subject to change. Performance figures are configuration-dependent. Customer pilots use the same evaluation harness for traceability.