Results
Found 3474 publication records. Showing 3474 according to the selection in the facets.
Hits |
Authors |
Title |
Venue |
Year |
Link |
Author keywords |
17 | Xavier Alameda-Pineda |
Egocentric Audio-Visual Scene Analysis. A Machine Learning and Signal Processing Approach. (Analyse Égocentrique de Scènes Audio-Visuelles. Une approche par Apprentissage Automatique et Traitement du Signal). |
|
2013 |
RDF |
|
17 | Kenji Ozawa, Masashi Obinata, Yuichiro Kinoshita |
Kansei Estimation Models for the Sense of Presence in Audio-Visual Content with Different Audio Reproduction Methods. |
SNPD |
2012 |
DBLP DOI BibTeX RDF |
|
17 | Shifeng Pan, Jianhua Tao 0001, Ya Li |
The CASIA Audio Emotion Recognition Method for Audio/Visual Emotion Challenge 2011. |
ACII (2) |
2011 |
DBLP DOI BibTeX RDF |
|
17 | Meriem Bendris |
Indexation audio-visuelle des personnes dans un contexte de télévision. (Audio-visual indexing of people in TV-context). |
|
2011 |
RDF |
|
17 | Mickael Rouvier |
Structuration de contenus audio-visuel pour le résumé automatique. (Audio-visual content structuring for automatic summarization). |
|
2011 |
RDF |
|
17 | Marc Rébillat |
Vibrations de plaques multi-excitateurs de grandes dimensions pour la création d'environnements virtuels audio-visuels: approches acoustique, mécanique et perceptive. (Vibrations of large multi-actuator panels for the creation of audio-visual virtual environments: acoustical, mechanical and perceptual approaches). |
|
2011 |
RDF |
|
17 | Erik Vladimir Ortega González |
Étude de son 3D pour une interaction audio-visuelle en environnement virtuel enrichi. (Study of 3D sound for an audio-visual interaction in enriched virtual environment). |
|
2011 |
RDF |
|
17 | Benjamin Belmudez, Sebastian Möller 0001, Blazej Lewcio, Alexander Raake, Muhammad Amir Mehmood |
Audio and video channel impact on perceived audio-visual quality in different interactive contexts. |
MMSP |
2009 |
DBLP DOI BibTeX RDF |
|
17 | Yasemin Demir, Engin Erzin, Yücel Yemez, A. Murat Tekalp |
Evaluation of audio features for audio-visual analysis of dance figures. |
EUSIPCO |
2008 |
DBLP BibTeX RDF |
|
17 | Naoki Nitanda, Miki Haseyama, Hideo Kitajima |
An audio-scene cut detection method using fuzzy c-means algorithm for audio-visual indexing. |
ISCAS (2) |
2004 |
DBLP BibTeX RDF |
|
17 | Roland Goecke, Gerasimos Potamianos, Chalapathy Neti |
Noisy audio feature enhancement using audio-visual speech data. |
ICASSP |
2002 |
DBLP DOI BibTeX RDF |
|
17 | Martin Heckmann, Thorsten Wild, Frédéric Berthommier, Kristian Kroschel |
Comparing audio- and a-posteriori-probability-based stream confidence measures for audio-visual speech recognition. |
INTERSPEECH |
2001 |
DBLP DOI BibTeX RDF |
|
17 | Alexandre Joly, Nathalie Montard, Marcel Buttin |
Audio-visual quality and interactions between television audio and video. |
ISSPA |
2001 |
DBLP DOI BibTeX RDF |
|
17 | John R. Hershey, Javier R. Movellan |
Audio Vision: Using Audio-Visual Synchrony to Locate Sounds. |
NIPS |
1999 |
DBLP BibTeX RDF |
|
17 | Caterina Saraceno, Riccardo Leonardi |
Identification of Story Units in Audio-Visual Sequences by Joint Audio and Video Processing. |
ICIP (1) |
1998 |
DBLP DOI BibTeX RDF |
|
17 | László Czap |
Audio and audio-visual perception of consonants disturbed by white noise and 'cocktail party'. |
ICSLP |
1998 |
DBLP DOI BibTeX RDF |
|
17 | Jinsil Seo, Greg J. Corness |
nite_aura: audio-visual immersive installation. |
ACM Multimedia |
2008 |
DBLP DOI BibTeX RDF |
alternative immersive installation |
17 | Huanhuan Lu, Bingjun Zhang, Ye Wang 0007, Wee Kheng Leow |
iDVT: an interactive digital violin tutoring system based on audio-visual fusion. |
ACM Multimedia |
2008 |
DBLP DOI BibTeX RDF |
fingering analysis, multimodal fusion, hand tracking, music transcription, onset detection |
17 | Shishir Shah |
Commentary Paper on "Person Tracking With Audio-Visual Cues Using the Iterative Decoding Framework". |
AVSS |
2008 |
DBLP DOI BibTeX RDF |
|
17 | Shankar T. Shivappa, Mohan M. Trivedi, Bhaskar D. Rao |
Person Tracking with Audio-Visual Cues Using the Iterative Decoding Framework. |
AVSS |
2008 |
DBLP DOI BibTeX RDF |
|
17 | Amitava Das, Ohil K. Manyam, Makarand Tapaswi |
Audio-Visual Person Authentication with Multiple Visualized-Speech Features and Multiple Face Profiles. |
ICVGIP |
2008 |
DBLP DOI BibTeX RDF |
|
17 | Ebroul Izquierdo |
Editorial: Knowledge Engineering, Semantics, and Signal Processing in Audio-Visual Information Retrieval. |
IEEE Trans. Circuits Syst. Video Technol. |
2007 |
DBLP DOI BibTeX RDF |
|
17 | Dongdong Li, Yingchun Yang, Zhaohui Wu 0001 |
Dynamic Bayesian Networks for Audio-Visual Speaker Recognition. |
ICB |
2006 |
DBLP DOI BibTeX RDF |
|
17 | Robbie De Sutter, Stijn Notebaert, Rik Van de Walle |
Evaluation of Metadata Standards in the Context of Digital Audio-Visual Libraries. |
ECDL |
2006 |
DBLP DOI BibTeX RDF |
|
17 | JongSuk Choi, Munsang Kim, Hyun-Don Kim |
Probabilistic Speaker Localization in Noisy Environments by Audio-Visual Integration. |
IROS |
2006 |
DBLP DOI BibTeX RDF |
|
17 | Wolfgang Hürst |
Interactive audio-visual video browsing. |
ACM Multimedia |
2006 |
DBLP DOI BibTeX RDF |
multimedia user interfaces, interactivity, video browsing |
17 | Ming S. Liu, Thomas S. Huang |
Video Based Person Authentication via Audio/Visual Association. |
ICME |
2006 |
DBLP DOI BibTeX RDF |
|
17 | Sukhee Cho, Namho Hur, Jinwoong Kim, Kugjin Yun, Soo In Lee |
Carriage of 3D Audio-Visual Services by T-DMB. |
ICME |
2006 |
DBLP DOI BibTeX RDF |
|
17 | Véronique Malaisé, Lora Aroyo, Hennie Brugman, Luit Gazendam, Annemieke de Jong, Christian Negru, Guus Schreiber |
Evaluating a Thesaurus Browser for an Audio-visual Archive. |
EKAW |
2006 |
DBLP DOI BibTeX RDF |
|
17 | Michael Ditze, Torsten Bresser, Frank Berger |
Resource Adaptation for Audio-Visual Devices in the UPnP QoS Architecture. |
AINA (2) |
2006 |
DBLP DOI BibTeX RDF |
Universal Plug and Play, Service Oriented Architectures, End-to-end Quality of Service |
17 | Dongdong Li, Yingchun Yang, Zhenyu Shan, Gang Pan 0001, Zhaohui Wu 0001 |
AVAS: An Audio-Visual Attendance System. |
PCM |
2006 |
DBLP DOI BibTeX RDF |
time and attendance system, multimodal, speaker recognition |
17 | David A. Sadlier, Noel E. O'Connor |
Event detection in field sports video using audio-visual features and a support vector Machine. |
IEEE Trans. Circuits Syst. Video Technol. |
2005 |
DBLP DOI BibTeX RDF |
|
17 | Rebecca Lunsford, Sharon L. Oviatt, Rachel Coulston |
Audio-visual cues distinguishing self- from system-directed speech in younger and older adults. |
ICMI |
2005 |
DBLP DOI BibTeX RDF |
intended addressee, open-microphone engagement, spoken amplitude, system adaptation, user modeling, multimodal interaction, universal access, gaze, individual differences |
17 | Jukka Holm, Kai Havukainen, Juha Arrasvuori |
Personalizing game content using audio-visual media. |
Advances in Computer Entertainment Technology |
2005 |
DBLP DOI BibTeX RDF |
image-controlled games, media-controlled games, musically controlled games, personalization, games, music, audio, MIDI |
17 | Changsheng Xu, Xi Shao, Namunu Chinthaka Maddage, Mohan S. Kankanhalli |
Automatic music video summarization based on audio-visual-text analysis and alignment. |
SIGIR |
2005 |
DBLP DOI BibTeX RDF |
chorus, shot, summarization, alignment, music video, lyrics |
17 | Bouchra Abboud, Hervé Bredin, Guido Aversano, Gérard Chollet |
Audio-visual Identity Verification: An Introductory Overview. |
WNSP |
2005 |
DBLP DOI BibTeX RDF |
|
17 | David Demirdjian, Kevin W. Wilson, Michael Siracusa, Trevor Darrell |
Real-time audio-visual tracking for meeting analysis. |
ICMI |
2004 |
DBLP DOI BibTeX RDF |
speaker localization, tracking, stereo, microphone array |
17 | Alexandros Eleftheriadis, Danny Hong |
Flavor: a formal language for audio-visual object representation. |
ACM Multimedia |
2004 |
DBLP DOI BibTeX RDF |
XFlavor, flavor, media representation |
17 | Huaxin Xu, Tat-Seng Chua |
The fusion of audio-visual features and external knowledge for event detection in team sports video. |
Multimedia Information Retrieval |
2004 |
DBLP DOI BibTeX RDF |
semantic, event detection, sports video, event modeling |
17 | Raphaël Troncy |
Integrating Structure and Semantics into Audio-visual Documents. |
ISWC |
2003 |
DBLP DOI BibTeX RDF |
|
17 | Milind R. Naphade, Thomas S. Huang |
Recognizing high-level audio-visual concepts using context. |
ICIP (3) |
2001 |
DBLP DOI BibTeX RDF |
|
17 | Gopal Sarma Pingali, Gamze Tunali, Ingrid Carlbom |
Audio-visual tracking for natural interactivity. |
ACM Multimedia (1) |
1999 |
DBLP DOI BibTeX RDF |
|
17 | Antonis Karidis, Apostolos Meliones, Georgios I. Stassinopoulos |
Efficient Integrated Collaborative Production of High-End Audio-Visual Content. |
ISCC |
1999 |
DBLP DOI BibTeX RDF |
audiovisual production, distributed audiovisual production, distributed video production, on-line collaboration, non-interactive collaboration, distributed virtual studio, content creation tool, CSCW, collaborative work, MPEG, conferencing, distribution management, virtual studio, video production, task sharing, digital video editing |
17 | Antonis Karidis, Apostolos Meliones |
DAVID-E: A Collaborative Working Paradigm for Distributed High-End Audio-Visual Content Creation. |
ICMCS, Vol. 2 |
1999 |
DBLP DOI BibTeX RDF |
networked collaborative working environment, audiovisual production, distributed virtual studio, CSCW, MPEG, remote collaboration, content creation, conferencing, distribution management, task sharing |
17 | S. Ben-Yacoub, Juergen Luettin, Kenneth Jonsson, Jiri Matas, Josef Kittler |
Audio-Visual Person Verification. |
CVPR |
1999 |
DBLP DOI BibTeX RDF |
Multi-Modal Verification, SVM, Face Recognition, Fusion, Speaker Recognition |
17 | Georg Carle, Jörg Ottensmeyer |
RTMC: An Error Control Protocol for IP-based Audio-Visual Multicast Applications. |
ICCCN |
1998 |
DBLP DOI BibTeX RDF |
Audio Video Applications, Transport Protocol, FEC, IP Multicast |
16 | Gamal Fahmy, John A. Black Jr., Sethuraman Panchanathan |
Texture characterization for joint compression and classification based on human perception in the wavelet domain. |
IEEE Trans. Image Process. |
2006 |
DBLP DOI BibTeX RDF |
|
16 | Nikola K. Kasabov |
Integrative connectionist learning systems inspired by nature: current models, future trends and challenges. |
Nat. Comput. |
2009 |
DBLP DOI BibTeX RDF |
Connectionist learning systems, Evolving spiking neural networks, Multiple task learning, Multimodal audio-visual information processing, Gene-spiking neural networks, Computational neurogenetic modeling, Quantum spiking neural networks, Artificial neural networks, Spiking neural networks |
16 | M. H. Sadaghiani, Kah Phooi Seng, Siew Wen Chin, Li-Minn Ang |
Enhanced Lips Detection and Tracking System. |
IVIC |
2009 |
DBLP DOI BibTeX RDF |
Lips detection and tracking, lips-skin ratio, H∞ filtering, Watershed, audio-visual speech recognition
16 | Martha A. Larson, Roeland Ordelman, Franciska de Jong, Wessel Kraaij, Joachim Köhler |
Searching multimedia content with a spontaneous conversational speech track. |
ACM Multimedia |
2009 |
DBLP DOI BibTeX RDF |
audio-visual retrieval, multimedia access, spoken content, speech recognition, speech |
16 | Dhaval Shah, Kyu Jeong Han, Shrikanth S. Narayanan |
A Low-Complexity Dynamic Face-Voice Feature Fusion Approach to Multimodal Person Recognition. |
ISM |
2009 |
DBLP DOI BibTeX RDF |
biometric, multimodal, speaker, audio-visual |
16 | Azzedine Boukerche, Haifa Maamar, Abu Hossain |
An efficient hybrid multicast transport protocol for collaborative virtual environment with networked haptic. |
Multim. Syst. |
2008 |
DBLP DOI BibTeX RDF |
Collaborative haptic audio visual environments, Reliable multicast transport protocol, Brain tumor tele-surgery application, Tracheotomy tele-surgery application, Collaborative virtual environments |
16 | Serafeim Papastefanos, Fotis Andritsopoulos, Vassiliki Mpilili, Christos Theoharatos, Nikos Achilleopoulos |
Direct searching of multimedia content based on video characteristics extracted from compressed domain. |
DIMEA |
2008 |
DBLP DOI BibTeX RDF |
audio visual search, system design, multimedia content, compressed domain |
16 | Dinesh Babu Jayagopi, Sileye O. Ba, Jean-Marc Odobez, Daniel Gatica-Perez |
Predicting two facets of social verticality in meetings from five-minute time slices and nonverbal cues. |
ICMI |
2008 |
DBLP DOI BibTeX RDF |
audio-visual feature extraction, social verticality, status, dominance, meetings |
16 | Hayley Hung, Dinesh Babu Jayagopi, Sileye O. Ba, Jean-Marc Odobez, Daniel Gatica-Perez |
Investigating automatic dominance estimation in groups from visual attention and speaking activity. |
ICMI |
2008 |
DBLP DOI BibTeX RDF |
audio-visual feature extraction, dominance modeling, visual focus of attention, meetings |
16 | Ulrich Fischer 0003 |
Walking the Edit - A Research Project of the Master Cinema Network in Switzerland. |
ICIDS |
2008 |
DBLP DOI BibTeX RDF |
Audio-visual database and metadata, automatic editing engine, participative concept, geo-localisation (GPS), interface design, cartography, urban space |
16 | Josef Chaloupka, Zdenek Chaloupka |
Czech Artificial Computerized Talking Head George. |
COST 2102 Conference (Prague) |
2008 |
DBLP DOI BibTeX RDF |
Audio-visual speech synthesis, Czech talking head |
16 | Steffi Beckhaus, Roland Schröder-Kroll, Martin Berghoff |
Back to the sandbox: playful interaction with granules landscapes. |
TEI |
2008 |
DBLP DOI BibTeX RDF |
continuous TUIs, innovative human computer interaction, interactive audio-visual installation, music tables |
16 | Athanasios K. Noulas, Ben J. A. Kröse |
On-line multi-modal speaker diarization. |
ICMI |
2007 |
DBLP DOI BibTeX RDF |
multi-modal, speaker diarization, audio-visual, speaker detection |
16 | Cynthia A. Murnan |
The magical world of an information commons. |
SIGUCCS |
2007 |
DBLP DOI BibTeX RDF |
academic commons, architect, audio/visual, café, circulation, finishes, furnishings, group study, information commons, interlibrary loan, design, collaboration, information technology, computers, library, workstations, reference, multi-media |
16 | Hayley Hung, Dinesh Babu Jayagopi, Chuohao Yeo, Gerald Friedland, Sileye O. Ba, Jean-Marc Odobez, Kannan Ramchandran, Nikki Mirghafori, Daniel Gatica-Perez |
Using audio and video features to classify the most dominant person in a group meeting. |
ACM Multimedia |
2007 |
DBLP DOI BibTeX RDF |
audio-visual feature extraction, dominance modelling, meetings, data annotation |
16 | Athanasios K. Noulas, Ben J. A. Kröse |
EM detection of common origin of multi-modal cues. |
ICMI |
2006 |
DBLP DOI BibTeX RDF |
audio-visual synchrony, multi-modal cue assignment, multi-modal, content extraction, speaker detection |
16 | Reede Ren, Joemon M. Jose |
Attention guided football video content recommendation on mobile devices. |
MobiMedia |
2006 |
DBLP DOI BibTeX RDF |
attention feature, audio-visual fusion, sports video retrieval, linear prediction |
16 | Riccardo Rabagliati |
AVI and the art system: interactive works at the Venice Biennale. |
AVI |
2006 |
DBLP DOI BibTeX RDF |
audio-visual installation, digital art, interactive experience |
16 | Uma Srinivasan 0001, Silvia Pfeiffer, Surya Nepal, Michael H. Lee, Lifang Gu, Stephen Barrass |
A Survey of MPEG-1 Audio, Video and Semantic Analysis Techniques. |
Multim. Tools Appl. |
2005 |
DBLP DOI BibTeX RDF |
audio-visual segmentation, sports highlights, feature extraction, video analysis, content analysis, semantic analysis, audio analysis, scene change detection, MPEG-1 |
16 | Sy Bor Wang, David Demirdjian |
Inferring body pose using speech content. |
ICMI |
2005 |
DBLP DOI BibTeX RDF |
arm gesture recognition, audio-visual tracking, untethered body pose tracking
16 | Jiping Liu, Yanxiang He, Min Peng 0002 |
NewsBR: A Content-Based News Video Browsing and Retrieval System. |
CIT |
2004 |
DBLP DOI BibTeX RDF |
Content-based video browsing and retrieval, news story segmentation, audio-visual features analysis, caption recognition |
16 | Brian Ireson |
"Minions". |
ACM Multimedia |
2004 |
DBLP DOI BibTeX RDF |
audio-visual projection, basic stamp, christian, islam, prayer, ultrasonic range finder, multimedia, interactive, culture |
16 | Iain A. Matthews, Timothy F. Cootes, J. Andrew Bangham, Stephen J. Cox, Richard W. Harvey |
Extraction of Visual Features for Lipreading. |
IEEE Trans. Pattern Anal. Mach. Intell. |
2002 |
DBLP DOI BibTeX RDF |
sieve, connected-set morphology, statistical methods, active appearance model, Audio-visual speech recognition |
16 | Kaoru Nakazono |
Frame Rate as a QoS Parameter and Its Influence on Speech Perception. |
Multim. Syst. |
1998 |
DBLP DOI BibTeX RDF |
Audio-visual integration, McGurk effect, QoS, Multimedia communication |
16 | Daby M. Sow, Alexandros Eleftheriadis |
Algorithmic Representation of Visual Information. |
ICIP (2) |
1997 |
DBLP DOI BibTeX RDF |
algorithmic representation, complexity distortion theory, programmable communication systems design, programmable decoders, intelligent encoders design, tools downloading, audio-visual information, video coding, MPEG-4, rate distortion theory, visual information, algorithm selection |
16 | Yoshitaka Shibata, Naoya Seta, Shogo Shimizu |
Media synchronization protocols for packet audio-video system on multimedia information networks. |
HICSS (2) |
1995 |
DBLP DOI BibTeX RDF |
audio-visual systems, media synchronization protocols, packet audio-video system, multimedia information networks, distributed multimedia information services, semantically synchronized multimedia objects, distributed workstation environment, data output timing, packet stream regulation, audio/video transmission system architecture, strict synchronization, relaxed synchronization, silence-detected synchronization, operating system environments, interprocess communication functions, tasks/threads, synchronization accuracy evaluation, performance evaluation, timing, UNIX, packet switching, synchronisation, multimedia communication, rate control, network operating systems, access protocols, information networks, continuous media, Mach, load conditions
15 | Xiaoyuan He, Ryo Kojima, Osamu Hasegawa |
Developmental Word Grounding Through a Growing Neural Network With a Humanoid Robot. |
IEEE Trans. Syst. Man Cybern. Part B |
2007 |
DBLP DOI BibTeX RDF |
|
15 | Kyungae Cha, Sangwook Kim |
MPEG-4 STUDIO: An Object-Based Authoring System for MPEG-4 Contents. |
Multim. Tools Appl. |
2005 |
DBLP DOI BibTeX RDF |
MPEG-4 contents, MPEG-4 content authoring, MPEG-4 systems, multimedia authoring |
15 | Harini Sridharan, Aankus Mani, Hari Sundaram |
A Multimodal Complexity Comprehension-Time Framework for Automated Presentation Synthesis. |
ICME |
2005 |
DBLP DOI BibTeX RDF |
|
15 | Jongeun Cha, Seung Man Kim, Ian Oakley, Jeha Ryu, Kwan-Heng Lee |
Haptic Interaction with Depth Video Media. |
PCM (1) |
2005 |
DBLP DOI BibTeX RDF |
|
15 | Shigeo Morishima, Satoshi Nakamura 0001 |
Multi-Modal Translation System and Its Evaluation. |
ICMI |
2002 |
DBLP DOI BibTeX RDF |
|
15 | Nicola Adami, Alessandro Bugatti, A. Corghi, Riccardo Leonardi, Pierangelo Migliorati, Lorenzo A. Rossi, Caterina Saraceno |
ToCAI: A Framework for Indexing and Retrieval of Multimedia Documents. |
ICIAP |
1999 |
DBLP DOI BibTeX RDF |
|
15 | Katharina Garbe, Iris Herbst |
Extending X3D with Perceptual Auditory Properties. |
VR |
2008 |
DBLP DOI BibTeX RDF |
|
15 | Thomas Moeck, Nicolas Bonneel, Nicolas Tsingos, George Drettakis, Isabelle Viaud-Delmon, David Alloza |
Progressive perceptual audio rendering of complex scenes. |
SI3D |
2007 |
DBLP DOI BibTeX RDF |
audio rendering, auditory masking, ventriloquism, clustering |
15 | Kent Walker, William L. Martens |
Perception of Audio-Generated and Custom Motion Programs in Multimedia Display of Action-Oriented DVD Films. |
HAID |
2006 |
DBLP DOI BibTeX RDF |
|
15 | Robert G. Malkin, Datong Chen, Jie Yang 0001, Alex Waibel |
Directing Attention in Online Aggregate Sensor Streams via Auditory Blind Value Assignment. |
ICME |
2006 |
DBLP DOI BibTeX RDF |
|
15 | Cormac Herley |
Accurate repeat finding and object skipping using fingerprints. |
ACM Multimedia |
2005 |
DBLP DOI BibTeX RDF |
repeat finding, segmentation |
15 | John W. Fisher III, Trevor Darrell |
Probabilistic Models and Informative Subspaces for Audiovisual Correspondence. |
ECCV (3) |
2002 |
DBLP DOI BibTeX RDF |
|
15 | Atsushi Kara |
Protecting Privacy in Remote-Patient Monitoring. |
Computer |
2001 |
DBLP DOI BibTeX RDF |
|
14 | Xize Cheng, Linjun Li, Tao Jin 0004, Rongjie Huang, Wang Lin, Zehan Wang, Huangdai Liu, Ye Wang, Aoxiong Yin, Zhou Zhao |
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Jinxing Zhou, Dan Guo, Yiran Zhong, Meng Wang 0001 |
Improving Audio-Visual Video Parsing with Pseudo Visual Labels. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Joanna Hong, Minsu Kim, Jeongsoo Choi, Yong Man Ro |
Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Yusheng Dai, Hang Chen, Jun Du, Xiaofei Ding, Ning Ding, Feijun Jiang, Chin-Hui Lee 0001 |
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Junjie Li, Meng Ge, Zexu Pan, Rui Cao, Longbiao Wang, Jianwu Dang 0001, Shiliang Zhang |
Rethinking the visual cues in audio-visual speaker extraction. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Xize Cheng, Tao Jin, Rongjie Huang, Linjun Li, Wang Lin, Zehan Wang, Ye Wang, Huadai Liu, Aoxiong Yin, Zhou Zhao |
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition. |
ICCV |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Chenyue Zhang, Hang Chen, Jun Du, Bao-Cai Yin, Jia Pan, Chin-Hui Lee 0001 |
Incorporating Visual Information Reconstruction into Progressive Learning for Optimizing audio-visual Speech Enhancement. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Chen Chen, Dong Wang 0013, Thomas Fang Zheng |
CN-CVS: A Mandarin Audio-Visual Dataset for Large Vocabulary Continuous Visual to Speech Synthesis. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Valentina Sanguineti, Sanket Kumar Thakur, Pietro Morerio, Alessio Del Bue, Vittorio Murino |
Audio-Visual Inpainting: Reconstructing Missing Visual Information with Sound. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Ehsan Yaghoubi, André Peter Kelm, Timo Gerkmann, Simone Frintrop |
Acoustic and Visual Knowledge Distillation for Contrastive Audio-Visual Localization. |
ICMI |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Yusheng Dai, Hang Chen, Jun Du, Xiaofei Ding, Ning Ding, Feijun Jiang, Chin-Hui Lee 0001 |
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder. |
ICME |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Joanna Hong, Minsu Kim, Jeongsoo Choi, Yong Man Ro |
Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring. |
CVPR |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Vijay John, Yasutomo Kawanishi |
Audio-Visual Sensor Fusion Framework Using Person Attributes Robust to Missing Visual Modality for Person Recognition. |
MMM (2) |
2023 |
DBLP DOI BibTeX RDF |
|
14 | Tingle Li, Yichen Liu, Andrew Owens, Hang Zhao |
Learning Visual Styles from Audio-Visual Associations. |
CoRR |
2022 |
DBLP DOI BibTeX RDF |
|
14 | Xinmeng Xu, Yang Wang, Jie Jia, Binbin Chen 0006, Dejun Li |
Improving Visual Speech Enhancement Network by Learning Audio-visual Affinity with Multi-head Attention. |
CoRR |
2022 |
DBLP DOI BibTeX RDF |
|
Displaying results #301 - #400 of 3474 (100 per page).