Multimedia Big Data Analytics

Description

Extraction of information from multimedia documents that combine audio, image, video, and text, together with knowledge analysis across large volumes of data drawn from these heterogeneous sources.

Competitive Advantages

For years, many institutions and companies have been generating and accessing massive amounts of multimedia data. The value of this data drops sharply, or is lost entirely, if it cannot be accessed easily and immediately: resources that are not properly cataloged and easy to locate are effectively worthless.
Generating textual metadata from the image, video, or audio content of multimedia documents, using advanced processing techniques such as Deep Learning and other Artificial Intelligence methods, enables indexing, cataloging, and text-based search, unlocking the full value of the content available to an organization.
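As a rough illustration of how textual metadata makes multimedia content searchable, the sketch below builds a small inverted index over per-file descriptions and answers keyword queries against it. This is only a minimal example under stated assumptions, not VivoLab's implementation: the file names, the metadata strings, and the helper functions `tokenize`, `build_index`, and `search` are all hypothetical, and the metadata is assumed to have already been produced by upstream speech, audio, or image analysis.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(metadata):
    """Map each token to the set of document IDs whose metadata contains it."""
    index = defaultdict(set)
    for doc_id, text in metadata.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents whose metadata contains every query token."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Hypothetical metadata, standing in for the output of automatic
# transcription, acoustic classification, and scene description.
metadata = {
    "clip_001.mp4": "Speaker A discusses the city budget in a meeting room",
    "clip_002.mp4": "Outdoor urban scene with sirens and traffic noise",
    "clip_003.wav": "Music followed by a speech about the budget",
}

index = build_index(metadata)
print(sorted(search(index, "budget")))
print(sorted(search(index, "budget meeting")))
```

A production system would typically rely on a dedicated search engine with ranking and language-aware tokenization; the point here is only that once content is described in text, ordinary text retrieval applies to audio and video as well.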

VivoLab has extensive experience in multimedia information processing and in the development of machine learning and artificial intelligence systems. It also has its own large-scale computing infrastructure, featuring state-of-the-art graphics processing units (GPUs).

Applications

A vast amount of valuable information can be extracted from multimedia documents. The analysis processes can be tailored to the nature of the information source:

Transcription of video content: converting the speech of individuals appearing in the video into text.
Identification and tracking of speakers or key participants using both visual and acoustic information.
Acoustic segmentation and classification of content: speech, music, noise, and relevant acoustic events (e.g., bangs, sirens, explosions).
Scene description based on the images within the document: indoors, outdoors, urban area, meeting room, etc.
Sentiment analysis based on both the speech signal and the spoken content of individuals in the document.
Automatic generation of reports and summaries from multimedia documents.
Development of efficient, inclusive, and accessible interfaces and content.

Classification

Technology Areas:

Artificial Intelligence and cognitive systems
Interaction technologies (e.g., human-machine interaction, motion recognition, and language technologies)
Data mining, Big Data, database management

Categories:

Technological development
Technology transfer
Concept validation and prototyping
Testing and validation

Keywords

Multimedia Big Data; Automatic speech recognition; Facial and voice biometrics; Image and video processing and analysis; Audio processing and analysis; Natural Language Processing; Deep Learning; Artificial Intelligence

Contact

info@aragonedih.com

Success Stories

Etiqmedia

ETIQMEDIA is a technology company focused on developing tools to optimize the management and exploitation of audiovisual content. Since its inception, Etiqmedia has received support from VivoLab for the development of key technological components within its product line. This has allowed it to develop proprietary and distinctive technology, securing a strong position in a highly competitive market.

Corporación RTVE

VivoLab and RTVE have a long-standing technological collaboration that has enabled the public broadcaster to streamline and improve the management of some of its routine processes. This long-term relationship has culminated in the creation of a dedicated research chair aimed at automating parts of RTVE’s audiovisual and audio content documentation process, both during the program production phase and for permanent archival in its documentary repositories.