Juan Carlos Atenco, Juan Carlos Moreno and Juan Manuel Ramirez
In this work we present a bimodal multitask network for audiovisual biometric recognition. The proposed network performs the fusion of features extracted from face and speech data through a weighted sum to jointly optimize the contribution of each modali...
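The weighted-sum feature fusion described in this abstract can be sketched roughly as follows; the embedding dimension, the softmax-normalized fusion weights, and all values here are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

# Hypothetical sketch of weighted-sum fusion of face and speech features.
# In the actual network the fusion weights would be learned jointly with
# the multitask objectives; here they are toy fixed logits.

rng = np.random.default_rng(0)

D = 128                           # shared embedding dimension (assumed)
face_emb = rng.normal(size=D)     # stand-in for the face-branch features
speech_emb = rng.normal(size=D)   # stand-in for the speech-branch features

# One learnable scalar logit per modality, softmax-normalized so the
# modality contributions sum to 1 and can be optimized jointly.
logits = np.array([0.2, -0.1])
weights = np.exp(logits) / np.exp(logits).sum()

fused = weights[0] * face_emb + weights[1] * speech_emb

assert fused.shape == (D,)
assert np.isclose(weights.sum(), 1.0)
```

Normalizing the weights keeps the fused embedding on the same scale as the per-modality embeddings while still letting training shift emphasis between modalities.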
Wondimu Lambamo, Ramasamy Srinivasagan and Worku Jifara
Speaker recognition systems perform very well on datasets without noise or mismatch. However, performance degrades with environmental noise, channel variation, and physical and behavioral changes in the speaker. The types of Spea...
Jingwen Yang and Ruohua Zhou
Whisper speaker recognition (WSR) has received extensive attention from researchers in recent years, and it plays an important role in medical, judicial, and other fields. In particular, the establishment of a whisper dataset is essential for the study...
Seunguook Lim and Jihie Kim
Emotion recognition in conversation (ERC) is receiving increasing attention as interactions between humans and machines grow in a variety of services such as chat-bots and virtual assistants. As emotional expressions within a conversation can heav...
Fei Xie, Dalong Zhang and Chengming Liu
Transformer models are now widely used for speech processing tasks due to their powerful sequence modeling capabilities. Previous work determined an efficient way to model speaker embeddings using the Transformer model by combining transformers with conv...
Nikolaos Vryzas, Nikolaos Tsipas and Charalampos Dimoulas
Radio is evolving in a changing digital media ecosystem. Audio-on-demand has shaped the landscape of big unstructured audio data available online. In this paper, a framework for knowledge extraction is introduced, to improve discoverability and enrichmen...
Francesc Alías, Antonio Bonafonte and António Teixeira
The main goal of this Special Issue is to present the latest advances in research and novel applications of speech and language technologies based on the works presented at the IberSPEECH edition held in Barcelona in 2018, paying special attention to tho...
Shih-An Li, Yu-Ying Liu, Yun-Chien Chen, Hsuan-Ming Feng, Pi-Kang Shen and Yu-Che Wu
This paper presents a voice-interactive robot system that can conveniently execute assigned service tasks in real-life scenarios. It is equipped with a microphone so that users can control the robot with spoken commands; the voice commands are then reco...
Zuhragvl Aysa, Mijit Ablimit, Hankiz Yilahun and Askar Hamdulla
In multi-lingual, multi-speaker environments (e.g., international conference scenarios), speech, language, and background sounds can overlap. In real-world scenarios, source separation techniques are needed to separate target sounds. Downstream tasks, su...
Pavitra Patel, A. A. Chaudhari, M. A. Pund and D. H. Deshmukh
Pages 56-64
Speech emotion recognition is an important issue that affects human-machine interaction. Automatic recognition of human emotion in speech aims at identifying the underlying emotional state of a speaker from the speech signal. Gaussian mixture models...
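The Gaussian-mixture-model approach named in this abstract can be sketched as follows: one diagonal-covariance GMM per emotion class, with an utterance assigned to the emotion whose model gives the highest total frame log-likelihood. The parameters here are toy fixed values for illustration; in practice they would come from EM training on labeled speech:

```python
import numpy as np

def gmm_loglik(frames, weights, means, variances):
    """Sum over frames of log sum_k w_k * N(x | mu_k, diag(var_k))."""
    total = 0.0
    for x in frames:
        comp = []
        for w, mu, var in zip(weights, means, variances):
            log_n = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
            comp.append(np.log(w) + log_n)
        m = max(comp)  # log-sum-exp over components for numerical stability
        total += m + np.log(sum(np.exp(c - m) for c in comp))
    return total

rng = np.random.default_rng(1)
frames = rng.normal(loc=1.0, size=(50, 4))  # toy MFCC-like feature frames

# Two-component GMM per emotion; toy parameters, not trained values.
models = {
    "neutral": (np.array([0.5, 0.5]), np.zeros((2, 4)), np.ones((2, 4))),
    "angry":   (np.array([0.5, 0.5]), np.ones((2, 4)),  np.ones((2, 4))),
}
scores = {emo: gmm_loglik(frames, *p) for emo, p in models.items()}
best = max(scores, key=scores.get)
print(best)  # frames drawn near mean 1.0 match the "angry" model
```

Summing per-frame log-likelihoods treats frames as independent given the emotion, which is the standard GMM classification assumption.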
Driss Khalil, Amrutha Prasad, Petr Motlicek, Juan Zuluaga-Gomez, Iuliia Nigmatulina, Srikanth Madikeri and Christof Schuepbach
In air traffic management (ATM), voice communications are critical for ensuring the safe and efficient operation of aircraft. The pertinent voice communications, between the air traffic controller (ATCo) and the pilot, are usually transmitted in a single channel, which po...
Xiao Xu, Xuehan Zhang, Zhongxu Bao, Xiaojie Yu, Yuqing Yin, Xu Yang and Qiang Niu
Hand gesture recognition is an essential Human-Computer Interaction (HCI) mechanism for users to control smart devices. While traditional device-based methods support acceptable recognition performance, the recent advance in wireless sensing could enable...
Abdelfatah Ahmed, Mohamed Bader, Ismail Shahin, Ali Bou Nassif, Naoufel Werghi and Mohammad Basel
The Arabic language has always been an immense source of attraction to various people from different ethnicities by virtue of the significant linguistic legacy that it possesses. Consequently, a multitude of people from all over the world are yearning to...
Hanif Fakhrurroja, Carmadi Machbub, Ary Setijadi Prihatmanto and Ayu Purwarianti
Pages 44-67
This paper proposes a way to control home appliances using a multimodal interaction system based on speech, gestures, and smartphone applications. Speech, in the Indonesian language, and gestures from users are captured with a Kinect v2 sensor. Speech reco...
Rania M. Ghoniem, Abeer D. Algarni and Khaled Shaalan
In multi-modal emotion-aware frameworks, it is essential to estimate the emotional features and then fuse them to different degrees. This basically follows either a feature-level or a decision-level strategy. In all likelihood, while features from several moda...
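The two fusion strategies this abstract contrasts can be sketched side by side; the modality names, feature dimensions, class posteriors, and combination weights below are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
speech_feat = rng.normal(size=40)  # e.g., prosodic/spectral statistics
face_feat = rng.normal(size=60)    # e.g., facial expression descriptors

# Feature-level fusion: concatenate modality features before a single
# downstream classifier sees them.
fused_features = np.concatenate([speech_feat, face_feat])
assert fused_features.shape == (100,)

# Decision-level fusion: each modality produces its own class posteriors,
# which are combined afterwards, here by a weighted average (toy weights
# that sum to 1 so the result remains a valid distribution).
speech_post = np.array([0.7, 0.2, 0.1])
face_post = np.array([0.4, 0.5, 0.1])
w_speech, w_face = 0.6, 0.4
fused_post = w_speech * speech_post + w_face * face_post

assert np.isclose(fused_post.sum(), 1.0)
print(int(np.argmax(fused_post)))  # class 0 wins under these weights
```

Feature-level fusion lets a single model learn cross-modal interactions, while decision-level fusion keeps per-modality models independent and only blends their outputs.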
Mauro Zaninelli
A portable wireless device with a "vocal commands" feature for activating the mechanical milking phase in conventional milking parlors was developed and tested to increase the level of automation in the milking procedures. The device was tested in the la...
Lizhen Jia, Yanyan Xu and Dengfeng Ke
Recent speech enhancement studies have mostly focused on completely separating noise from human voices. Due to the lack of specific structures for harmonic fitting in previous studies and the limitations of the traditional convolutional receptive field, ...
Miodrag D. Kušljević and Vladimir V. Vujičić
Although voiced speech signals are physical signals that are only approximately harmonic, whereas electric power signals are truly harmonic, the algorithms used for harmonic analysis in electric power systems can be successfully used in speech processing, includin...
Sara Sekkate, Mohammed Khalil, Abdellah Adib and Sofia Ben Jebara
Because one of the key issues in improving the performance of Speech Emotion Recognition (SER) systems is the choice of an effective feature representation, most of the research has focused on developing feature-level fusion using a large set of featur...
Esther Rituerto-González, Alba Mínguez-Sánchez, Ascensión Gallardo-Antolín and Carmen Peláez-Moreno
A Speaker Identification system for a personalized wearable device to combat gender-based violence is presented in this paper. Speaker recognition systems exhibit a decrease in performance when the user is under emotional or stress conditions, thus the o...