Voice cloning data with cloning rights in writing.
Voice cloning is the most legally fraught use case in audio AI. Generic TTS consent does not cover the synthesis of a recognisable person; ours does. Every speaker signs a release that names voice cloning explicitly. Each release is identity-verified, jurisdiction-aware, and revocable under a written SLA. No other supplier ships this stack.
Built for the work.
Named cloning consent
Every release uses the words “voice cloning,” “voice model,” and “synthetic voice.” Identity verified at signing, jurisdiction recorded.
Multi-take capture
Neutral, warm, urgent, sad, excited, whispered takes of the same script — the emotional sweep your few-shot encoder needs.
Studio-grade audio
48 kHz / 24-bit WAV, recorded in treated rooms on Shure SM7B, Rode NT1, or Sennheiser MKH 416 chains. No rooms with refrigerators.
Phonetic balance
Reads cover the full ARPABET phoneme set so the cloned voice generalises to text it never spoke.
Revocation SLA
Speakers can withdraw. We propagate the takedown to licensees inside the contractual window, in writing.
Ethical guardrails
No fraud, no political deepfakes, no non-consensual intimate content, no impersonation of non-consenting public figures. We have walked away from contracts.
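For buyers who want to check delivered files against the 48 kHz / 24-bit WAV spec above, the following is a minimal Python sketch using the standard-library wave module. The function names and the silent test file are illustrative assumptions, not part of any delivery tooling. Depending on the Python version, the wave module may reject WAV files that use extended headers, in which case a dedicated audio library would be needed.

```python
import io
import wave
from typing import List

EXPECTED_RATE_HZ = 48_000   # 48 kHz, as stated in the spec above
EXPECTED_SAMPLE_BYTES = 3   # 24-bit PCM is 3 bytes per sample


def check_wav_spec(source) -> List[str]:
    """Return a list of problems; an empty list means the file matches.

    `source` may be a file path or a binary file-like object.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        if rate != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE_HZ} Hz")
        if width != EXPECTED_SAMPLE_BYTES:
            problems.append(f"sample width is {width * 8}-bit, expected 24-bit")
    return problems


def make_silent_wav(rate_hz: int, sample_bytes: int, frames: int = 480) -> io.BytesIO:
    """Build a tiny mono WAV of silence in memory (for demonstration only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_bytes)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00" * sample_bytes * frames)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(check_wav_spec(make_silent_wav(48_000, 3)))  # matches the spec
    print(check_wav_spec(make_silent_wav(44_100, 2)))  # wrong rate and bit depth
```

This only inspects the header; it says nothing about room treatment or microphone chain, which have to be taken from the manifest.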
How much voice you actually need.
The thing other suppliers do not ship.
Named consent
The release form literally says “voice cloning.” Most suppliers hide behind generic “AI training” language. We do not.
Cloning rights in writing
Per-speaker, per-jurisdiction, per-buyer where possible. Auditable from day one.
Multi-take capture
Same script across emotional states + free conversation + phonetic-balance reads, in one session.
Emotion variation
Neutral, warm, urgent, sad, excited, whispered. Tagged in the manifest so your encoder can condition.
Ethical guardrails
EU AI Act Article 5 baked into every contract. No impersonation, no political deepfakes, no fraud — refused in writing.
Revocation SLA
Speakers can withdraw, keep a named contact for life, and have takedowns propagated downstream. Other suppliers cannot do this because they do not know who their speakers are.
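Because emotion tags live in the manifest, a buyer can check that each script really has all six takes before training. The sketch below assumes a hypothetical manifest layout in which each row carries a "script_id" and an "emotion" field; the real field names may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping

# The six emotional states listed above.
REQUIRED_EMOTIONS = frozenset(
    {"neutral", "warm", "urgent", "sad", "excited", "whispered"}
)


def missing_emotions(manifest_rows: Iterable[Mapping[str, str]]) -> Dict[str, List[str]]:
    """Map each script ID to the emotions it lacks; complete scripts are omitted.

    The "script_id" and "emotion" keys are assumed names, not a documented schema.
    """
    seen = defaultdict(set)
    for row in manifest_rows:
        seen[row["script_id"]].add(row["emotion"])

    gaps = {}
    for script_id, emotions in seen.items():
        missing = REQUIRED_EMOTIONS - emotions
        if missing:
            gaps[script_id] = sorted(missing)
    return gaps
```

An empty result means every script in the manifest has a take for each emotional state.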
From email to first cloned voice.
Sample request
Tell us the model and tier. We return a 30-min sample with audio, cloning licence template, and consent receipt within 48 hours.
Mutual NDA
Standard one-page mutual.
MSA + Voice Cloning licence
Named cloning rights, perpetual commercial training, identity verification on every speaker, EU AI Act compliant.
First delivery
Pilot voice with multi-take audio, emotion sweep, ARPABET reads, manifest, and per-speaker signed release.
Manifest & provenance
Per-file lineage: speaker, mic, room, consent version, jurisdiction, ID verification, SHA-256 per file.
Ongoing + revocation SLA
Named contact for life. If a speaker withdraws, takedown propagates to licensees inside the contractual window.
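The per-file lineage in the manifest step can be checked on the buyer's side by recomputing each SHA-256 digest. The sketch below assumes a hypothetical record shape built from the fields listed above (speaker, mic, room, consent version, jurisdiction, ID verification, SHA-256); the actual manifest format and field names may differ.

```python
import hashlib
from dataclasses import dataclass
from typing import BinaryIO


@dataclass(frozen=True)
class ManifestRecord:
    """One manifest row. Field names are assumptions for illustration."""
    path: str
    speaker_id: str
    mic: str
    room: str
    consent_version: str
    jurisdiction: str
    id_verified: bool
    sha256: str


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in chunks so large audio files never load fully into memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_record(record: ManifestRecord, stream: BinaryIO) -> bool:
    """True if the stream's digest matches the manifest and ID verification is recorded."""
    return record.id_verified and sha256_of_stream(stream) == record.sha256.lower()
```

In practice the stream would come from `open(record.path, "rb")`. A mismatch means the file changed after the manifest was written and should be queried before it enters a training set.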
Common questions.
What is voice cloning training data?
Speech recordings paired with explicit, named, written consent from each speaker authorising the synthesis of a model of their voice. Quality, multi-take coverage, and the legal stack matter as much as hours.
Why does cloning need a separate licence?
Voice cloning is the most legally fraught use case in audio AI. Generic TTS or “AI training” consent does not cover synthesising a recognisable person. Our Voice Cloning licence names cloning explicitly, in writing, on every speaker. No other supplier does this.
Do speakers actually know they are being cloned?
Yes. The release form uses the words “voice cloning,” “voice model,” and “synthetic voice,” lists the buyer where possible, and is signed before recording. Identity is verified at signing.
What about the EU AI Act?
Article 5 prohibits manipulative and deceptive AI practices that cause significant harm; Article 50 requires that synthetic audio be disclosed as such. Our licence requires downstream deployment to comply with both, and forbids non-consensual impersonation, fraud, and political deepfakes.
How much audio per cloned voice?
Few-shot clones (XTTS, OpenVoice) work from 5–30 seconds. High-fidelity clones want 30 min – 2 hrs. Production-grade voice doubles want 5–20 hrs across multiple emotional states and recording sessions.
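As a rough planning aid, the lower bounds quoted above can be turned into a lookup. This is a sketch only: the tier labels and the function name are illustrative, and the thresholds are simply the minimums from the answer above (5 seconds, 30 minutes, 5 hours), not a guarantee of quality at that duration.

```python
from typing import Optional

# (label, minimum seconds of audio), highest tier first.
TIER_MINIMUM_SECONDS = [
    ("voice double", 5 * 3600),   # production-grade: from 5 hours
    ("high-fidelity", 30 * 60),   # high-fidelity: from 30 minutes
    ("few-shot", 5),              # few-shot: from 5 seconds
]


def highest_tier_supported(total_seconds: float) -> Optional[str]:
    """Return the highest tier whose minimum duration is met, or None if below all."""
    for label, minimum in TIER_MINIMUM_SECONDS:
        if total_seconds >= minimum:
            return label
    return None
```

Durations that fall between the quoted ranges (for example, 10 minutes) map to the tier below, since they meet that tier's minimum but not the next one's.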
Can a speaker revoke?
Yes. We operate a written revocation SLA — speakers can withdraw, and we propagate the takedown to licensees within the contractual window. The named consent contact is for life.
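On the licensee side, honouring a takedown amounts to removing every file from the withdrawn speaker before a deadline. The sketch below assumes manifest rows with a hypothetical "speaker_id" field, and it treats the length of the contractual window as a parameter because that figure comes from the contract, not from this page.

```python
from datetime import date, timedelta
from typing import Iterable, List, Mapping, Tuple


def split_on_revocation(
    records: Iterable[Mapping[str, str]],
    revoked_speaker_ids: Iterable[str],
) -> Tuple[List[Mapping[str, str]], List[Mapping[str, str]]]:
    """Split manifest rows into (keep, takedown) by speaker ID.

    The "speaker_id" key is an assumed field name.
    """
    revoked = set(revoked_speaker_ids)
    keep, takedown = [], []
    for record in records:
        (takedown if record["speaker_id"] in revoked else keep).append(record)
    return keep, takedown


def takedown_deadline(revoked_on: date, window_days: int) -> date:
    """Last day to complete the takedown; window_days is whatever the contract specifies."""
    return revoked_on + timedelta(days=window_days)
```

Models already trained on the withdrawn files are a separate question that the licence terms, not this filter, would need to address.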
Do you do multi-take and emotion sweeps?
Yes. Custom cloning sessions include neutral, warm, urgent, sad, excited, and whispered takes of the same script, plus phonetically balanced reads, plus free conversation.
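Phonetic balance can be spot-checked by comparing the phonemes in the delivered reads against the ARPABET inventory. The sketch below uses the 39-phoneme set found in the CMU Pronouncing Dictionary and assumes the reads have already been transcribed into ARPABET symbols by some pronunciation lexicon or grapheme-to-phoneme tool; that input format is an assumption, not a description of the delivered manifest.

```python
from typing import Iterable, List, Sequence

# The 39 ARPABET phonemes used by the CMU Pronouncing Dictionary (stress digits removed).
ARPABET = frozenset(
    "AA AE AH AO AW AY EH ER EY IH IY OW OY UH UW "
    "B CH D DH F G HH JH K L M N NG P R S SH T TH V W Y Z ZH".split()
)


def missing_phonemes(transcripts: Iterable[Sequence[str]]) -> List[str]:
    """Return ARPABET phonemes that never occur in the given phoneme sequences.

    Each transcript is a sequence of symbols such as ["HH", "AH0", "L", "OW1"];
    trailing stress digits (0, 1, 2) are stripped before comparison.
    """
    covered = set()
    for phones in transcripts:
        for phone in phones:
            covered.add(phone.rstrip("012"))
    return sorted(ARPABET - covered)
```

An empty list means every phoneme appears at least once; it does not measure how evenly they are distributed, which would need per-phoneme counts.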
What will you refuse to clone?
Non-consenting public figures, minors, deceased persons without estate authority, and any use case targeting fraud, political impersonation, or non-consensual sexual content. We have walked away from contracts over this.
How is voice cloning data priced?
Per voice, with tiers for exclusivity and territory. Custom commissions are quoted on hours, sessions, and exclusivity. Contact partnerships@aipodcast.io.
Want a representative sample?
30 minutes of audio + transcripts + metadata, delivered within 48 hours of NDA.