
A method to find the wanted audio content by analyzing the audio content based on the embedded audio tags and recognizing the voiceprint

IP.com Disclosure Number: IPCOM000246942D
Publication Date: 2016-Jul-18
Document File: 6 page(s) / 102K

Publishing Venue

The IP.com Prior Art Database

Abstract

The core idea of this invention is to mark key points by automatically embedding tags in the audio stream and to detect different human voices by voiceprint analysis. The embedded tags, which humans cannot hear, are added to the audio stream automatically on the provider side by analyzing symbols in the speech texts that contain the search key words of the speech: date, name, features, links of the websites, and so on. The audio dividing part uses voiceprint recognition technology. Different people have different voiceprints, which can be used to recognize them. While an audio stream is playing, the system detects the voiceprints of the different people, and based on each person's speech, the audio is divided into pieces.


This idea includes two parts: the provider and the consumer.

Provider system:

The provider can be a radio broadcaster, a meeting or lecture speaker, a dialog show, etc. The provider's speech has some predefined key words that are related to predefined tags.

While a provider delivers a speech, the system tracks the voice, locates the speech progress in the text, analyzes the highlighted key word tokens, and adds tags into the audio. The embedded audio tags can only be heard and recognized by mobile devices, not by human ears. The embedded audio tags are like symbols in texts that contain the search key words of the speech: date, name, features, links of the websites, and so on.
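As an illustration only, the following Python sketch shows one way such inaudible tags could be mixed into the speech audio at the moment a key word is spoken. The near-ultrasonic FSK encoding, the carrier frequencies, and the tag string format are assumptions made for this example; the disclosure does not specify an encoding.

```python
import numpy as np

SAMPLE_RATE = 44100
F0, F1 = 18000.0, 18500.0    # near-ultrasonic carriers for bits 0/1 (assumed)
BIT_SECONDS = 0.01           # duration of each encoded bit (assumed)

def encode_tag(tag_text):
    """Encode a tag string (e.g. 'date=2016-07-18;url=example.com') as a
    low-amplitude, high-frequency tone sequence that devices can decode but
    human ears are unlikely to notice."""
    bits = [int(b) for byte in tag_text.encode("utf-8")
            for b in format(byte, "08b")]
    samples_per_bit = int(SAMPLE_RATE * BIT_SECONDS)
    t = np.arange(samples_per_bit) / SAMPLE_RATE
    tones = [np.sin(2 * np.pi * (F1 if bit else F0) * t) for bit in bits]
    return 0.05 * np.concatenate(tones)   # low amplitude keeps the tag inaudible

def embed_tag(speech, tag_text, at_second):
    """Mix the encoded tag into the speech audio at the moment the tracked
    key word is spoken."""
    tone = encode_tag(tag_text)
    start = int(at_second * SAMPLE_RATE)
    out = speech.copy()
    end = min(start + len(tone), len(out))
    out[start:end] += tone[:end - start]
    return out
```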


Consumer system and features

Main flow of the consumer side


The system is formed by three parts: the Voice Recording part, the Frequency Scanning/Matching part, and the Audio Dividing part.

1. The Voice Recording part records the voice. It starts recording when it is activated in the system and stores the recorded audio in an audio file format in the device storage. Using the signal from the Audio Dividing part, it divides the whole audio content into pieces that are displayed and stored chronologically and individually.
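The dividing signal comes from the voiceprint analysis described in the abstract. Below is a minimal sketch of that idea, assuming a normalized averaged spectrum stands in for a real voiceprint model and a fixed cosine-distance threshold marks speaker changes; both are illustrative assumptions, not the disclosure's stated method.

```python
import numpy as np

SAMPLE_RATE = 44100
WINDOW_SECONDS = 1.0          # analysis window per voiceprint estimate (assumed)

def window_voiceprint(window):
    """Crude stand-in for a voiceprint: a normalized magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(window))
    return spectrum / (np.linalg.norm(spectrum) + 1e-9)

def split_points_by_voiceprint(audio, threshold=0.35):
    """Return sample indices where a new piece should start, i.e. where the
    voiceprint of consecutive windows differs more than the threshold."""
    step = int(SAMPLE_RATE * WINDOW_SECONDS)
    boundaries, previous = [0], None
    for start in range(0, len(audio) - step + 1, step):
        vp = window_voiceprint(audio[start:start + step])
        if previous is not None and 1.0 - float(np.dot(vp, previous)) > threshold:
            boundaries.append(start)   # voiceprint changed: likely a new speaker
        previous = vp
    return boundaries
```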

2. The Frequency Scanning/Matching part completes its tasks in a loop.

The first task is to scan the Frequency Modulation (FM) channels: scan the whole frequency range from low to high, recognize the active radio channels, and transform each channel's live audio signal into a frequency spectrum.

Second, it gets the audio input from the Voice Recording part and transforms that audio into a frequency spectrum as well. Then it compares this spectrum with the scanned channel's spectrum. If the two spectrums do not match (in wave frequency, amplitude, etc.), it scans the next FM channel and compares again until the exact radio channel is found.
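A minimal Python sketch of this scanning/matching loop follows. The tune_fm() capture helper, the FM band limits, and the similarity threshold are placeholders assumed for illustration; the disclosure does not name a radio API.

```python
import numpy as np

def normalized_spectrum(audio):
    """Transform an audio buffer into a normalized frequency spectrum."""
    mag = np.abs(np.fft.rfft(audio))
    return mag / (np.linalg.norm(mag) + 1e-9)

def find_matching_channel(recorded, tune_fm, match_threshold=0.8):
    """Scan the FM band from low to high and return the first channel whose
    live audio spectrum matches the recorded audio's spectrum.

    tune_fm(freq_mhz) is a hypothetical capture helper that returns a short
    sample of the channel's live audio, or None if the channel is inactive."""
    n = len(recorded)
    reference = normalized_spectrum(recorded)
    for freq_mhz in np.arange(87.5, 108.0, 0.1):   # typical FM band, 100 kHz steps
        channel_audio = tune_fm(freq_mhz)
        if channel_audio is None or len(channel_audio) < n:
            continue                               # inactive channel or too little audio
        similarity = float(np.dot(reference, normalized_spectrum(channel_audio[:n])))
        if similarity >= match_threshold:
            return freq_mhz                        # exact audio radio channel found
    return None                                    # no match in this pass of the loop
```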


The Frequency Matching part examines the frequency and amplitude of the audio wave obtained from the Voice Recording part and forms an audiogram. Then it examines the scanned channel's audio wave to form another audiogram. Next, the two audiograms are compared over the same wave frequency range and amplitudes to obtain per-band similarity rates, and the dispersion of all these rates is calculated. If the dispersion is below an adjustable ratio, for example,...
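To make this comparison concrete, here is a hedged sketch of the audiogram dispersion check, assuming a fixed number of frequency bands and an example threshold; the disclosure only states that the ratio is adjustable.

```python
import numpy as np

def audiogram(audio, bands=32):
    """Average magnitude per frequency band: a coarse audiogram."""
    mag = np.abs(np.fft.rfft(audio))
    return np.array([band.mean() for band in np.array_split(mag, bands)])

def audiograms_match(recorded, channel, max_dispersion=0.05):
    """Compare per-band amplitude rates between the two audiograms and test
    whether their dispersion stays below the adjustable ratio."""
    a, b = audiogram(recorded), audiogram(channel)
    rates = a / (b + 1e-9)                         # per-band similarity rates
    dispersion = float(np.var(rates / (rates.mean() + 1e-9)))
    return dispersion < max_dispersion             # below the ratio: channels match
```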