What metadata is collected from podcast episodes?

A full list of metadata collected from podcast episodes submitted to Audioburst

Written by Tara


When Audioburst's AI-powered system processes new audio, it collects and analyzes a standard set of metadata for each minute of content. This metadata allows us to surface just the right news story, podcast clip, or playlist to listeners. 

Below is a list of all of the information that is collected when you submit a podcast episode to Audioburst's system. 

(The metadata collected for radio content is described in a separate article.)

Some of the data is pulled directly from your RSS feed, while other pieces of data are created during processing: 


Podcast Name - The name of the podcast, as it appears in the RSS

Podcast Description - The podcast description, as it appears in the RSS

Episode Name - The name of the episode, as it appears in the RSS

Episode Description - The description of the episode, as it appears in the RSS

Entry Date - The episode's release date, as it appears in the RSS

Air Time - A timestamp for when the episode was released, as it appears in the RSS

Full Transcript - An automatic system-generated transcription of the episode

Keywords - A list of the keywords used in the episode. Keywords are short phrases from the audio that carry meaningful information about the content of the clip

Entities - A list of the main entities from the episode. An entity is a word or phrase that represents a person, location, company, job title, or another meaningful attribute in the content 

Duration - The length of the episode 

Bursts - A list of burst IDs extracted from the episode (bursts are Audioburst's short-form clips created automatically by our AI-powered engine)

Per-word Timing - Each word in the transcript is tagged with its start time, its duration, and a confidence score for the system's transcription of that word

Natural Language Processing (NLP) Data - Enrichment of the extracted keywords and entities with relevancy, type, and sentiment

Audioburst Analysis - An in-depth breakdown of the relationships between bursts, the extraction of the main audio and textual features, and a comparison of each burst with the previous and next bursts.

Each breakdown contains the core keywords and entities for the bursts in that section, allowing an understanding of what the user is listening to in the defined episode analysis area.
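For developers who work with this metadata programmatically, below is a minimal sketch of how the fields above could be represented as a structured record. The interface and field names (AudioburstEpisodeMetadata, WordTiming, NlpAnnotation, and so on) are illustrative assumptions made for this article, not the actual schema returned by Audioburst's API.

```typescript
// Illustrative sketch only: these interface and field names are assumptions
// made for this article, not Audioburst's actual API schema.

// Timing and confidence attached to each transcribed word.
interface WordTiming {
  word: string;
  startTimeSeconds: number;  // offset from the start of the episode
  durationSeconds: number;   // how long the word is spoken
  confidence: number;        // transcription confidence, e.g. 0.0 - 1.0
}

// NLP enrichment attached to a keyword or entity.
interface NlpAnnotation {
  text: string;
  type: string;              // e.g. "person", "location", "company"
  relevancy: number;         // how central the term is to the content
  sentiment: number;         // e.g. a negative-to-positive score
}

// One metadata record per submitted podcast episode.
interface AudioburstEpisodeMetadata {
  // Pulled directly from the RSS feed
  podcastName: string;
  podcastDescription: string;
  episodeName: string;
  episodeDescription: string;
  entryDate: string;         // episode release date
  airTime: string;           // release timestamp
  durationSeconds: number;

  // Created during processing
  fullTranscript: string;
  keywords: NlpAnnotation[];
  entities: NlpAnnotation[];
  burstIds: string[];        // IDs of the short-form clips cut from the episode
  perWordTiming: WordTiming[];
}
```

The grouping in the sketch mirrors the distinction made above: the first set of fields comes straight from your RSS feed, while the second set is generated during processing by Audioburst's engine.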
