Public datasheet. All key figures describe Auricus Voice 32 unless otherwise noted; throughput, concurrency, and accelerator power scale linearly across Auricus Voice 8 / 16 / 32.

For further integration details, contact us; full documentation is shared under NDA with qualified pilot customers.

The tabular sections below remain in English to stay aligned with the reference publication.

Hardware & deployment

Parameter Value
Form factor Purpose-built rack-mount appliance
Product family Auricus Voice 8 / 16 / 32 (8 / 16 / 32 purpose-built edge AI accelerators per chassis, field-flexible)
Typical chassis power (Voice 32) ~600 W at full load (incl. host CPU, RAM, NICs, fans, PSU losses)
Inference stack Compiled INT8
Deployment model Single appliance, on-prem, no inbound internet dependency for inference

Performance — streaming / near–real-time

Metric Auricus Voice 32
Sustained concurrent streams ~48–64
Latency profile Bounded by your rack — no provider RTT, no shared-region jitter

Performance — batch

Metric Auricus Voice 32
Throughput ~100 files/min
End-to-end (ingest → language ID → decode → delivery) ~240–300 files/hour
Compressed-source overhead ~10–20% additional CPU time on MP3 vs uncompressed WAV
Long-form (> 30 s) Supported with reduced per-file efficiency vs short clips

Audio formats

Parameter Value
Containers / codecs WAV, MP3, FLAC, OGG
Sample rates Phone-band (8 kHz) and wideband (16 kHz) — quality dashboards track both separately
Channels Mono and stereo input; per-channel handling configurable
Long audio No hard upper bound; raise client read timeouts for files > ~15 min

Languages & language identification

Parameter Value
Transcription languages ~99 (multilingual coverage)
Language-ID classes 107 (automatic detection)
LID benchmark accuracy 93.3% on the reference corpus
Detection latency Sub-second on the primary path; CPU fallback available
Per-language priors Configurable boosts and regional defaults for mixed-language deployments

API & integration

Parameter Value
Transport HTTPS / HTTP, REST + JSON
Authentication Bearer token
Async model Job submission, asynchronous workers, horizontal scale-out across accelerators
Result delivery Polling or webhook callback to a customer-supplied URL
Webhook events transcript (raw transcript) · wer (corrected + raw + measured WER when ground truth is supplied)
Backpressure Standard HTTP 429 with Retry-After semantics for both per-job and queue-depth limits
Webhook reliability Retries with dead-letter handling for terminal failures
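As a sketch of the async flow above: a client submits a job over REST + JSON with a bearer token and backs off on HTTP 429 using Retry-After. The transport, auth, and backpressure semantics come from this table; the `/v1/jobs` path, payload shape, and `backoff_delay` helper are illustrative placeholders, since real endpoint paths are shared only under NDA.

```python
import json
import time
import urllib.error
import urllib.request

def backoff_delay(headers, attempt, base=1.0, cap=60.0):
    """Honor a Retry-After header (seconds form) if present, else exponential backoff."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return min(float(retry_after), cap)
        except ValueError:
            pass  # HTTP-date form of Retry-After not handled in this sketch
    return min(base * (2 ** attempt), cap)

def submit_job(base_url, token, audio_path):
    """POST an audio file as an async job; retries on HTTP 429 backpressure.

    '/v1/jobs' is a placeholder path, not the documented endpoint.
    """
    with open(audio_path, "rb") as f:
        body = f.read()
    for attempt in range(6):
        req = urllib.request.Request(
            f"{base_url}/v1/jobs",
            data=body,
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "audio/wav",
            },
            method="POST",
        )
        try:
            # Generous read timeout: the spec advises raising client read
            # timeouts for files longer than ~15 minutes.
            with urllib.request.urlopen(req, timeout=1800) as resp:
                return json.load(resp)["job_id"]
        except urllib.error.HTTPError as e:
            if e.code != 429:
                raise
            time.sleep(backoff_delay(e.headers, attempt))
    raise RuntimeError("job submission kept hitting backpressure")
```

Polling would then GET the job resource until it reaches a terminal state; registering a webhook callback avoids polling entirely.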

Detailed endpoint paths, queue topology, port assignments, and tuning knobs are shared with pilot customers under NDA.
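On the customer side, the webhook callback URL only needs to accept the two event types listed above and acknowledge quickly, so the appliance-side retry and dead-letter path is not triggered needlessly. A minimal receiver sketch follows; the event names `transcript` and `wer` come from the table, but the `event` / `data` envelope is an assumed payload shape, not the documented schema.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def parse_event(body: bytes):
    """Split a webhook payload into (event_type, payload).

    The {"event": ..., "data": ...} envelope is an assumption for this
    sketch; the real schema is part of the NDA documentation.
    """
    msg = json.loads(body)
    event = msg["event"]
    if event not in ("transcript", "wer"):
        raise ValueError(f"unknown event type: {event}")
    return event, msg["data"]

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event, data = parse_event(self.rfile.read(length))
        except (ValueError, KeyError, json.JSONDecodeError):
            # Malformed payloads get a 4xx so they end up in the
            # dead-letter path instead of being retried forever.
            self.send_response(400)
            self.end_headers()
            return
        # Acknowledge before doing heavy work; slow 2xx responses risk
        # tripping the sender's retry logic.
        self.send_response(204)
        self.end_headers()
        print(f"{event}: {json.dumps(data)[:120]}")

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), WebhookHandler).serve_forever()
```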

Observability

Parameter Value
Metrics endpoint Prometheus exposition format
Metric families Stage durations, device utilisation, queue depth, jobs in flight, jobs by language, WER ratio, language-detection counters, canary metrics
Shipped Grafana dashboards System overview, pipeline performance (P50/P95/P99 + per-stage breakdown), quality (WER trends + per-language), device health, queue management, SLO tracking, language analytics
SLO — availability 99.9% target (1-hour completed-vs-all-events ratio); ~43.2 min downtime / 30-day budget
SLO — latency 95% of pipeline stages complete within ≤ 10 s (1-hour rolling)
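The ~43.2-minute figure follows directly from the 99.9% availability target over a 30-day window; a one-line arithmetic check (no product API involved):

```python
def downtime_budget_minutes(availability_target: float, window_days: int) -> float:
    """Error budget: the fraction of the window the SLO allows to be down."""
    return (1.0 - availability_target) * window_days * 24 * 60

# 99.9% over a 30-day window leaves about 43.2 minutes of budget.
print(round(downtime_budget_minutes(0.999, 30), 1))
```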

Compliance & data handling

Parameter Value
Network egress None required for inference
Audio retention Configurable; default is in-memory plus working storage purged after job completion
Transcript retention Customer-controlled
Encryption in transit TLS terminates at the appliance ingress
Audit logging Job submission, completion, and failure events emitted as structured logs with request IDs
Personal-data minimisation Audio retained only for job duration; transcripts delivered and purged from appliance-side stores per customer policy
No automated individual decision-making Transcription only; no Art. 22 GDPR profiling or automated decision-making in the product

→ Full compliance posture, including the EU Digital Omnibus mapping: Compliance.

Variants comparison

SKU Concurrent live calls Matched annual audio (M min/yr) Batch (files/min) End-to-end (files/hr) Peak accelerator power Target use case
Auricus Voice 8 12 2 25 60–75 80 W Mid-size contact center, departmental fleet
Auricus Voice 16 24 4 50 120–150 160 W Large enterprise contact center, regional carrier
Auricus Voice 32 48 8 100 240–300 320 W National contact center, telco / government scale

Concurrent live calls = sustained concurrent near-real-time conversations per appliance.
Matched annual audio = realistic operational capacity per appliance under a typical mixed real-time + batch duty cycle.
Batch (files/min) = sustained transcription throughput per appliance.
End-to-end (files/hr) = end-to-end throughput including ingest, language ID, decode, and delivery overhead.
Peak accelerator power scales linearly across the family; add ~120 W chassis baseline for total appliance draw.
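Because the family scales linearly, right-sizing reduces to table lookup and arithmetic. The sketch below picks the smallest SKU covering a workload, using the figures from the variants table and the ~120 W chassis baseline noted above; the `pick_sku` helper itself is illustrative, not a product tool.

```python
# (sku, concurrent live calls, batch files/min, peak accelerator watts)
# Figures taken from the variants table above.
FAMILY = [
    ("Auricus Voice 8", 12, 25, 80),
    ("Auricus Voice 16", 24, 50, 160),
    ("Auricus Voice 32", 48, 100, 320),
]

CHASSIS_BASELINE_W = 120  # per the note under the variants table

def pick_sku(required_calls: int, required_files_per_min: int = 0):
    """Smallest SKU covering both live and batch requirements.

    Returns (sku_name, approximate total appliance draw in watts:
    peak accelerator power plus the chassis baseline).
    """
    for sku, calls, files, watts in FAMILY:
        if calls >= required_calls and files >= required_files_per_min:
            return sku, watts + CHASSIS_BASELINE_W
    raise ValueError("workload exceeds a single Voice 32 appliance")
```

For example, a site needing 20 concurrent calls and 40 files/min lands on Auricus Voice 16.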


Specifications represent the current production reference appliance and are subject to change. Performance figures are configuration-dependent. Customer pilots use the same evaluation harness for traceability.