Podcast audio licensing for AI labs that want the whole catalogue.
AI labs need real human conversation at scale. Podcasters have hundreds of thousands of hours of it — recorded well, in studios, with the people who can grant rights cleanly. We are the bridge: bulk back-catalogue licensing with retroactive guest re-clearance, exclusivity tiers, transparent creator revenue share, and the only consent stack a frontier lab can actually defend in court.
Built for the work.
Direct from creators
Bulk back catalogues licensed directly from working podcasters and networks. Not scraped, not RSS-rinsed.
Rights documented
Signed creator release plus retroactive guest re-clearance on every legacy episode. Written speaker releases.
Studio-grade audio
Real podcast studios — Shure SM7B, Rode NT1, MKH 416 chains in treated rooms. EBU R128 broadcast loudness.
Transcripts included
Word-aligned transcripts with diarization in JSON, CTM, TextGrid, SRT, or VTT.
Custom acquisition
Need a specific show, network, or vertical? We can broker the licence end-to-end, including the legal lift on guest re-clearance.
Enterprise terms
Indemnification, three exclusivity tiers, and a named human contact on every deal.
What you can license.
Inside the deal.
Back catalogue
The full released archive — every episode, every season — with retroactive guest re-clearance.
Exclusive segments
Curated extracts: deep-dive episodes, specific guests, domain-balanced cuts, or interview formats.
Future episodes
Rolling forward consent. New episodes flow into your shard monthly with the same provenance.
Ad-free masters
Production masters with ad inserts removed: cleaner signal, with no repeated host-read ads polluting the data.
Raw multitrack
Per-mic stems where available. Speaker isolation pre-baked, perfect for source separation and diarization training.
Transcripts
Word-aligned text in JSON/CTM/SRT/VTT. Standalone licence for text LLMs and RAG.
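For a sense of how the transcript formats relate, here is a minimal sketch that turns word-aligned JSON into CTM lines. The JSON field names (word, start, end, speaker) and the recording ID are assumptions for illustration, not our delivered schema. The CTM layout used is the common "recording channel start duration word" form.

```python
import json

def words_to_ctm(words, recording_id, channel="1"):
    """Convert word-aligned entries into CTM lines.

    Each entry is assumed (hypothetically) to carry "word", "start" and "end",
    with times in seconds. CTM stores a start time and a duration, not an end time.
    """
    lines = []
    for w in words:
        duration = w["end"] - w["start"]
        lines.append(
            f"{recording_id} {channel} {w['start']:.2f} {duration:.2f} {w['word']}"
        )
    return lines

# Hypothetical sample of a word-aligned transcript with diarization labels.
sample = json.loads("""
[
  {"word": "welcome", "start": 0.50, "end": 0.90, "speaker": "host"},
  {"word": "back",    "start": 0.95, "end": 1.20, "speaker": "host"}
]
""")

for line in words_to_ctm(sample, "ep001"):
    print(line)
```

Plain CTM has no speaker column, so the diarization labels are dropped in this sketch. A consumer that needs them would read the JSON or TextGrid delivery instead.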
From email to first catalogue.
Sample request
Tell us the verticals, hours, and exclusivity tier. We return a 30-min sample shard plus a candidate catalogue list within 48 hours.
Mutual NDA
Standard one-page mutual NDA.
MSA + bulk licence
Licence terms negotiated per project, covering exclusivity tier and indemnity, with a named human contact on every deal.
First delivery
Pilot back-catalogue tranche with audio, transcripts, manifests, per-episode consent receipts, and guest clearance log.
Manifest & provenance
Per-episode lineage: show, host, every guest, recording date, jurisdiction, consent version, SHA-256.
Ongoing delivery
Monthly increments of new episodes, exclusivity enforcement, creator royalty reporting, and a named human contact on every deal.
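On the buyer side, the per-episode lineage listed under Manifest & provenance lends itself to an automated check on receipt. The sketch below assumes one manifest entry per episode as a dictionary. The field names are hypothetical stand-ins for the lineage items named above, and the audio bytes are a placeholder. It confirms that every field is present and that the SHA-256 of the delivered audio matches the manifest.

```python
import hashlib
import io

# Hypothetical field names mirroring the lineage items listed above.
REQUIRED_FIELDS = (
    "show", "host", "guests", "recording_date",
    "jurisdiction", "consent_version", "sha256",
)

def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large audio files stay out of memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def check_entry(entry, stream):
    """Return a list of problems for one manifest entry; an empty list means it passed."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in entry]
    if "sha256" in entry and sha256_of_stream(stream) != entry["sha256"].lower():
        problems.append("sha256 mismatch")
    return problems

# Placeholder audio and an illustrative manifest entry (all values invented).
audio = b"placeholder audio bytes"
entry = {
    "show": "Example Show",
    "host": "Example Host",
    "guests": ["Example Guest"],
    "recording_date": "2021-03-04",
    "jurisdiction": "GB",
    "consent_version": "v2",
    "sha256": hashlib.sha256(audio).hexdigest(),
}

print(check_entry(entry, io.BytesIO(audio)))
```

In practice the stream would be an audio file opened in binary mode, and any episode that reports a problem would be held back from training until the lineage is resolved.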
Common questions.
Can you legally license podcast audio for AI?
Yes, when the rights are properly granted. Every creator signs an explicit AI training release before audio enters the catalogue, and every guest in every episode is cleared in writing, at recording for new episodes and retroactively for legacy ones. Scraped public feeds do not meet this bar.
Can I license a whole back catalogue?
Yes. We license bulk back catalogues (entire shows, networks, or partial tranches), including retroactive guest re-clearance for legacy episodes. This is the fastest clean path to large volumes of consented conversational audio.
What about co-hosts and guests?
Every voice in every recording has a signed release. For new episodes, guests sign when they walk into the studio. For legacy episodes, we do retroactive guest re-clearance — and we exclude any episode where a guest cannot be reached or declines.
What can I license?
Released back catalogue, exclusive segments, future episodes on a rolling basis, ad-free production masters, and raw multitrack stems where available. Aligned transcripts ship with every tier.
Is the audio exclusive?
Three tiers: non-exclusive (default), category-exclusive (no other AI buyer in your sub-vertical), and full-exclusive (no other AI buyer at all). The two exclusive tiers carry a premium.
How is revenue shared with creators?
Creators receive a transparent revenue share on every licensed hour, paid quarterly, with per-episode reporting. This is what makes consent sustainable — speakers stay paid for the life of the model.
What does it cost?
Pricing depends on hours, language, exclusivity tier, and metadata depth. Contact partnerships@aipodcast.io for a quote.
How is this different from a public dataset?
Public datasets lack documented consent, commercial training rights, speaker metadata, and a revocation path. Ours has all four, plus indemnity terms negotiated per deal, an audit trail, and a named human contact for life.
I am a podcaster being acquired or considering selling — how does that work?
Talk to us. We can structure either a pure data licence (you keep the show) or work alongside an acquisition with the AI rights as a separate, ongoing royalty stream. Email partnerships@aipodcast.io.
Want a representative sample?
30 minutes of audio + transcripts + metadata, delivered after a quick scoping call.