
Searching for phrase audio-visual (changed automatically) with no syntactic query expansion in all metadata.

Publication years (Num. hits)
1974-1993 (15) 1994-1996 (21) 1997 (55) 1998 (29) 1999 (34) 2000 (42) 2001 (73) 2002 (68) 2003 (103) 2004 (113) 2005 (100) 2006 (101) 2007 (144) 2008 (145) 2009 (118) 2010 (84) 2011 (103) 2012 (95) 2013 (102) 2014 (101) 2015 (85) 2016 (90) 2017 (99) 2018 (139) 2019 (160) 2020 (198) 2021 (269) 2022 (292) 2023 (418) 2024 (78)
Publication types (Num. hits)
article(1159) book(1) incollection(17) inproceedings(2241) phdthesis(43) proceedings(13)
Venues (Conferences, Journals, ...)
CoRR(567) ICASSP(158) INTERSPEECH(152) AVSP(136) HAVE(79) ACM Multimedia(75) ICME(69) ICMI(57) AVEC@ACM Multimedia(45) AVEC@MM(41) CVPR(40) IEEE Trans. Multim.(39) EUSIPCO(35) MMSP(32) IEEE Access(28) AAAI(24) More (+10 of total 800)
GrowBag graphs for keywords (Num. hits/coverage): the graphs summarize 800 occurrences of 559 keywords.

Results
Found 3474 publication records. Showing 3474 according to the selection in the facets
Hits Authors Title Venue Year Link Author keywords
11Olivier Gillet, Slim Essid, Gaël Richard On the Correlation of Automatic Audio and Visual Segmentations of Music Videos. Search on Bibsonomy IEEE Trans. Circuits Syst. Video Technol. The full citation details ... 2007 DBLP  DOI  BibTeX  RDF
11Matej Rojc, Tomaz Rotovnik, Miso Brus, Dusan Jan, Zdravko Kacic Embodied Conversational Agents in Wizard-of-Oz and Multimodal Interaction Applications. Search on Bibsonomy COST 2102 Workshop (Vietri) The full citation details ... 2007 DBLP  DOI  BibTeX  RDF speech recognition, conversational agents, text-to-speech synthesis, speech-to-speech translation
11Daniel Gatica-Perez, Dong Zhang 0001, Samy Bengio Extracting information from multimedia meeting collections. Search on Bibsonomy Multimedia Information Retrieval The full citation details ... 2005 DBLP  DOI  BibTeX  RDF human interaction modeling, semantic, graphical models, meeting
11Dusan Macho, Jaume Padrell, Alberto Abad, Climent Nadeu, Javier Hernando, John W. McDonough, Matthias Wölfel, Ulrich Klee, Maurizio Omologo, Alessio Brutti, Piergiorgio Svaizer, Gerasimos Potamianos, Stephen M. Chu Automatic Speech Activity Detection, Source Localization, and Speech Recognition on the Chil Seminar Corpus. Search on Bibsonomy ICME The full citation details ... 2005 DBLP  DOI  BibTeX  RDF
11John R. Smith, David S. Doermann, Amarnath Gupta, Jonathan Goldstein, Uri Shaft, Nalini K. Ratha Multimedia applications: beyond similarity searches. Search on Bibsonomy CVDB The full citation details ... 2005 DBLP  DOI  BibTeX  RDF
11Lina Peng, K. Selçuk Candan, Kyung Dong Ryu, Karam S. Chatha, Hari Sundaram ARIA: an adaptive and programmable media-flow architecture for interactive arts. Search on Bibsonomy ACM Multimedia The full citation details ... 2004 DBLP  DOI  BibTeX  RDF multi-model art, tools for creating multimedia art, interactive
11Regunathan Radhakrishnan, Ajay Divakaran, Ziyou Xiong A time series clustering based framework for multimedia mining and summarization using audio features. Search on Bibsonomy Multimedia Information Retrieval The full citation details ... 2004 DBLP  DOI  BibTeX  RDF video summarization, time series analysis, audio classification
11Samy Bengio Multimodal Authentication Using Asynchronous HMMs. Search on Bibsonomy AVBPA The full citation details ... 2003 DBLP  DOI  BibTeX  RDF
11Kieron Messer, Josef Kittler, Barbara Levienaise-Obadia, William J. Christmas, Dimitri Koubaroulis Generation of semantic cues for sports video annotation. Search on Bibsonomy ICIP (3) The full citation details ... 2001 DBLP  DOI  BibTeX  RDF
11Sascha Spors, Rudolf Rabenstein, Norbert Strobel Joint audio-video object tracking. Search on Bibsonomy ICIP (1) The full citation details ... 2001 DBLP  DOI  BibTeX  RDF
11Ismail Haritaoglu, Alex Cozzi, David Koons, Myron Flickner, Dmitry N. Zotkin, Ramani Duraiswami, Yaser Yacoob Attentive Toys. Search on Bibsonomy ICME The full citation details ... 2001 DBLP  DOI  BibTeX  RDF
11Yiqiang Chen, Wen Gao 0001, Zhaoqi Wang, Li Zuo Speech Driven MPEG-4 Based Face Animation via Neural Network. Search on Bibsonomy IEEE Pacific Rim Conference on Multimedia The full citation details ... 2001 DBLP  DOI  BibTeX  RDF
11Jason P. A. Charlesworth, Philip N. Garner Spoken content metadata and MPEG-7. Search on Bibsonomy ACM Multimedia Workshops The full citation details ... 2000 DBLP  DOI  BibTeX  RDF robust retrieval, spoken content, interoperability, MPEG-7, automatic speech recognition, spoken document retrieval
11Leonardo Chiariglione MPEG: Achievements and Future Projects. Search on Bibsonomy ICMCS, Vol. 1 The full citation details ... 1999 DBLP  DOI  BibTeX  RDF
11Javad Peymanfard, Samin Heydarian, Ali Lashini, Hossein Zeinali, Mohammad Reza Mohammadi, Nasser Mozayani A multi-purpose audio-visual corpus for multi-modal Persian speech recognition: The Arman-AV dataset. Search on Bibsonomy Expert Syst. Appl. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Shiqing Zhang, Yijiao Yang, Chen Chen, Xingnan Zhang, Qingming Leng, Xiaoming Zhao 0002 Deep learning-based multimodal emotion recognition from audio, visual, and text modalities: A systematic review of recent advancements and future prospects. Search on Bibsonomy Expert Syst. Appl. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Sisi You, Yukun Zuo, Hantao Yao, Changsheng Xu Incremental Audio-Visual Fusion for Person Recognition in Earthquake Scene. Search on Bibsonomy ACM Trans. Multim. Comput. Commun. Appl. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yibo Zhang, Weiguo Lin, Junfeng Xu Joint Audio-Visual Attention with Contrastive Learning for More General Deepfake Detection. Search on Bibsonomy ACM Trans. Multim. Comput. Commun. Appl. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Gülnaziye Bingöl, Simone Porcu, Alessandro Floris, Luigi Atzori QoE Estimation of WebRTC-based Audio-visual Conversations from Facial and Speech Features. Search on Bibsonomy ACM Trans. Multim. Comput. Commun. Appl. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Dandan Zhu, Kun Zhu 0024, Weiping Ding 0001, Nana Zhang, Xiongkuo Min, Guangtao Zhai, Xiaokang Yang MTCAM: A Novel Weakly-Supervised Audio-Visual Saliency Prediction Model With Multi-Modal Transformer. Search on Bibsonomy IEEE Trans. Emerg. Top. Comput. Intell. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Mengting Liu, Ying Zhou, Yuwei Wu, Feng Gao Cogeneration of Innovative Audio-visual Content: A New Challenge for Computing Art. Search on Bibsonomy Mach. Intell. Res. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yasuki Noguchi Audio-Visual Fission Illusion and Individual Alpha Frequency: Perspective on Buergers and Noppeney (2022). Search on Bibsonomy J. Cogn. Neurosci. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yasheng Sun, Wenqing Chu, Hang Zhou, Kaisiyuan Wang, Hideki Koike AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation. Search on Bibsonomy IEEE Access The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Jun Zhang 0030, Yi Xiao, Yizhuang Ding, Liuchen Chen, Aiguo Song Interaction-Based Active Perception Method and Vibration-Audio-Visual Information Fusion for Asteroid Surface Material Identification. Search on Bibsonomy IEEE Trans. Instrum. Meas. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yinsheng Li, Shaoshuai Guo, Maixia Fu A new method of audio-visual environment emotion assessment based on range fusion decision. Search on Bibsonomy Multim. Tools Appl. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Kholoud Alwashmi, Georg F. Meyer, Fiona J. Rowe, Ryan Ward Enhancing learning outcomes through multisensory integration: A fMRI study of audio-visual training in virtual reality. Search on Bibsonomy NeuroImage The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yidi Li, Jiale Ren, Yawei Wang, Guoquan Wang, Xia Li, Hong Liu 0008 Audio-visual keyword transformer for unconstrained sentence-level keyword spotting. Search on Bibsonomy CAAI Trans. Intell. Technol. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Xin Sun, Xuan Wang, Qiong Liu, Xi Zhou Multi-Level Signal Fusion for Enhanced Weakly-Supervised Audio-Visual Video Parsing. Search on Bibsonomy IEEE Signal Process. Lett. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Xiaoting Wu, Xueyi Zhang, Xiaoyi Feng, Miguel Bordallo López, Li Liu 0002 Audio-Visual Kinship Verification: A New Dataset and a Unified Adaptive Adversarial Multimodal Learning Approach. Search on Bibsonomy IEEE Trans. Cybern. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Julio Navío-Marco, Luis Manuel Ruiz-Gómez, Raquel Arguedas Sanz, Carmen López-Martín The student as a prosumer of educational audio-visual resources: a higher education hybrid learning experience. Search on Bibsonomy Interact. Learn. Environ. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Zhengyu Zhu, Chao Luo, Liping Liao, Pei Lin, Yao Li Combining key pronunciation detection, frontal lip reconstruction, and time-delay for audio-visual consistency judgment. Search on Bibsonomy Digit. Signal Process. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Edurne Bernal-Berdun, Mateo Vallejo, Qi Sun, Ana Serrano, Diego Gutierrez Modeling the Impact of Head-Body Rotations on Audio-Visual Spatial Perception for Virtual Reality Applications. Search on Bibsonomy IEEE Trans. Vis. Comput. Graph. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Qin Yang, Yuqi Li, Chenglin Li, Hao Wang 0183, Sa Yan, Li Wei, Wenrui Dai, Junni Zou, Hongkai Xiong, Pascal Frossard SVGC-AVA: 360-Degree Video Saliency Prediction With Spherical Vector-Based Graph Convolution and Audio-Visual Attention. Search on Bibsonomy IEEE Trans. Multim. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yuanyuan Jiang, Jianqin Yin, Yonghao Dang Leveraging the Video-Level Semantic Consistency of Event for Audio-Visual Event Localization. Search on Bibsonomy IEEE Trans. Multim. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Maregu Assefa, Wei Jiang 0016, Jinyu Zhan, Kumie Gedamu, Getinet Yilma, Melese Ayalew, Deepak Adhikari Audio-Visual Contrastive and Consistency Learning for Semi-Supervised Action Recognition. Search on Bibsonomy IEEE Trans. Multim. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Haochen Han, Qinghua Zheng, Minnan Luo, Kaiyao Miao, Feng Tian 0002, Yan Chen 0031 Noise-Tolerant Learning for Audio-Visual Action Recognition. Search on Bibsonomy IEEE Trans. Multim. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yasheng Sun, Wenqing Chu, Hang Zhou, Kaisiyuan Wang, Hideki Koike AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Qilang Ye, Zitong Yu, Xin Liu Answering Diverse Questions via Text Attached with Key Audio-Visual Clues. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Bruno Korbar, Jaesung Huh, Andrew Zisserman Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Xueyuan Chen, Yuejiao Wang, Xixin Wu, Disong Wang, Zhiyong Wu 0001, Xunying Liu, Helen Meng Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Zhe Chen, Heyang Liu, Wenyi Yu, Guangzhi Sun, Hongcheng Liu, Ji Wu, Chao Zhang, Yu Wang, Yanfeng Wang M3AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yusheng Dai, Hang Chen, Jun Du, Ruoyu Wang 0029, Shihao Chen, Jiefeng Ma, Haotian Wang, Chin-Hui Lee A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11He Wang, Pengcheng Guo, Pan Zhou, Lei Xie MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Samuel Pegg, Kai Li, Xiaolin Hu 0001 TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao 0001 HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Hao Wang, Shuhei Kurita, Shuichiro Shimizu, Daisuke Kawahara SlideAVSR: A Dataset of Paper Explanation Videos for Audio-Visual Speech Recognition. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11R. Gnana Praveen, Jahangir Alam Dynamic Cross Attention for Audio-Visual Person Verification. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Adrian S. Roman, Baladithya Balamurugan, Rithik Pothuganti Enhanced Sound Event Localization and Detection in Real 360-degree audio-visual soundscapes. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Haoxu Wang, Ming Cheng, Qiang Fu, Ming Li Robust Wake Word Spotting With Frame-Level Cross-Modal Attention Based Audio-Visual Conformer. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Jongsuk Kim, Hyeongkeun Lee, Kyeongha Rho, Junmo Kim, Joon Son Chung EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Elena Ryumina, Maxim Markitantov, Dmitry Ryumin, Heysem Kaya, Alexey Karpov 0001 Audio-Visual Compound Expression Recognition Method based on Late Modality Fusion and Rule-based Decision. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11HyoJung Han, Mohamed Anwar, Juan Pino 0001, Wei-Ning Hsu, Marine Carpuat, Bowen Shi, Changhan Wang XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yuxin Guo, Shijie Ma, Yuhao Zhao, Hu Su, Wei Zou Cross Pseudo-Labeling for Semi-Supervised Audio-Visual Source Localization. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Swapnil Bhosale, Haosen Yang, Diptesh Kanojia, Jiankang Deng, Xiatian Zhu Unsupervised Audio-Visual Segmentation with Modality Alignment. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yukun Zuo, Hantao Yao, Liansheng Zhuang, Changsheng Xu Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yan-Bo Lin, Gedas Bertasius Siamese Vision Transformers are Scalable Audio-visual Learners. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar 0003, William Laney, Andrew Owens, Alexander Richard Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Jinxiang Liu, Yikun Liu, Fei Zhang, Chen Ju, Ya Zhang 0002, Yanfeng Wang Audio-Visual Segmentation via Unlabeled Frame Exploitation. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11R. Gnana Praveen, Jahangir Alam Audio-Visual Person Verification based on Recursive Fusion of Joint Cross-Attention. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11José-M. Acosta-Triana, David Gimeno-Gómez, Carlos D. Martínez-Hinarejos AnnoTheia: A Semi-Automatic Annotation Toolkit for Audio-Visual Speech Technologies. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Tassadaq Hussain, Kia Dashtipour, Yu Tsao 0001, Amir Hussain 0001 Audio-Visual Speech Enhancement in Noisy Environments via Emotion-Based Contextual Cues. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Fan Yu, Haoxu Wang, Xian Shi, Shiliang Zhang LCB-net: Long-Context Biasing for Audio-Visual Speech Recognition. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Rui Wang, Dengpan Ye, Long Tang, Yunming Zhang, Jiacheng Deng 0003 AVT2-DWF: Improving Deepfake Detection with Audio-Visual Fusion and Dynamic Weighting Strategies. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yuxin Guo, Shijie Ma, Hu Su, Zhiqing Wang, Yuhao Zhao, Wei Zou, Siyang Sun, Yun Zheng Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Xianghu Yue, Xiaohai Tian, Malu Zhang, Zhizheng Wu 0001, Haizhou Li 0001 CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Heqing Zou, Meng Shen, Yuchen Hu, Chen Chen 0075, Eng Siong Chng, Deepu Rajan Cross-Modality and Within-Modality Regularization for Audio-Visual DeepFake Detection. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Denis Dresvyanskiy, Maxim Markitantov, Jiawei Yu, Peitong Li, Heysem Kaya, Alexey Karpov 0001 SUN Team's Contribution to ABAW 2024 Competition: Audio-visual Valence-Arousal Estimation and Expression Recognition. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11R. Gnana Praveen, Jahangir Alam Cross-Attention is Not Always Needed: Dynamic Cross-Attention for Audio-Visual Dimensional Emotion Recognition. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Christian Marinoni, Riccardo Fosco Gramaccioni, Changan Chen, Aurelio Uncini, Danilo Comminiello Overview of the L3DAS23 Challenge on Audio-Visual Extended Reality. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Qilang Ye, Zitong Yu, Rui Shao, Xinyu Xie, Philip H. S. Torr, Xiaochun Cao CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yunlong Tang 0002, Daiki Shimada, Jing Bi, Chenliang Xu AVicuna: Audio-Visual LLM with Interleaver and Context-Boundary Alignment for Temporal Referential Dialogue. Search on Bibsonomy CoRR The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Hengwei Liu, Xiaodong Gu 0001 Masked co-attention model for audio-visual event localization. Search on Bibsonomy Appl. Intell. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Subhayu Ghosh, Snehashis Sarkar, Sovan Ghosh, Frank Zalkow, Nanda Dulal Jana Audio-visual speech synthesis using vision transformer-enhanced autoencoders with ensemble of loss functions. Search on Bibsonomy Appl. Intell. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Pierre Albert, Fasih Haider, Saturnino Luz CUSCO: An Unobtrusive Custom Secure Audio-Visual Recording System for Ambient Assisted Living. Search on Bibsonomy Sensors The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Hao-Yan Zhang, Long-Bo Zhang, Qi-Feng Shi, Zhen-Tao Liu Audio-Visual Bimodal Combination-Based Speaker Tracking Method for Mobile Robot. Search on Bibsonomy J. Adv. Comput. Intell. Intell. Informatics The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement. Search on Bibsonomy IEEE ACM Trans. Audio Speech Lang. Process. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Zhe Chen, Hongcheng Liu, Yu Wang 0027 DialogMCF: Multimodal Context Flow for Audio Visual Scene-Aware Dialog. Search on Bibsonomy IEEE ACM Trans. Audio Speech Lang. Process. The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yaoting Wang, Weisong Liu, Guangyao Li, Jian Ding, Di Hu 0001, Xi Li Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer. Search on Bibsonomy AAAI The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Zhangbin Li, Dan Guo, Jinxing Zhou, Jing Zhang, Meng Wang 0001 Object-Aware Adaptive-Positivity Learning for Audio-Visual Question Answering. Search on Bibsonomy AAAI The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Shengyi Gao, Zhe Chen, Guo Chen, Wenhai Wang, Tong Lu AVSegFormer: Audio-Visual Segmentation with Transformer. Search on Bibsonomy AAAI The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Xiulong Liu, Sudipta Paul 0007, Moitreya Chatterjee, Anoop Cherian CAVEN: An Embodied Conversational Agent for Efficient Audio-Visual Navigation in Noisy Environments. Search on Bibsonomy AAAI The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Renjie Wu 0008, Hu Wang, Feras Dayoub, Hsiang-Ting Chen Segment beyond View: Handling Partially Missing Modality for Audio-Visual Semantic Segmentation. Search on Bibsonomy AAAI The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Jiadong Wang, Zexu Pan, Malu Zhang, Robby T. Tan, Haizhou Li 0001 Restoring Speaking Lips from Occlusion for Audio-Visual Speech Recognition. Search on Bibsonomy AAAI The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Dawei Hao, Yuxin Mao, Bowen He, Xiaodong Han, Yuchao Dai, Yiran Zhong Improving Audio-Visual Segmentation with Bidirectional Generation. Search on Bibsonomy AAAI The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Abduljalil Radman, Jorma Laaksonen AV-PEA: Parameter-Efficient Adapter for Audio-Visual Multimodal Learning. Search on Bibsonomy VISIGRAPP (2): VISAPP The full citation details ... 2024 DBLP  BibTeX  RDF
11Sze An Peter Tan, Guangyu Gao, Jia Zhao Audio-Visual Segmentation by Leveraging Multi-scaled Features Learning. Search on Bibsonomy MMM (2) The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Shilong Yu, Chenhui Yang MAVAR-SE: Multi-scale Audio-Visual Association Representation Network for End-to-End Speaker Extraction. Search on Bibsonomy MMM (2) The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Yating Xu, Conghui Hu, Gim Hee Lee Rethink Cross-Modal Fusion in Weakly-Supervised Audio-Visual Video Parsing. Search on Bibsonomy WACV The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Jinxiang Liu, Yu Wang, Chen Ju, Chaofan Ma, Ya Zhang, Weidi Xie Annotation-free Audio-Visual Segmentation. Search on Bibsonomy WACV The full citation details ... 2024 DBLP  DOI  BibTeX  RDF
11Jiwei Zhang 0012, Yi Yu 0001, Suhua Tang, Wei Li 0012, Jianming Wu Multi-scale network with shared cross-attention for audio-visual correlation learning. Search on Bibsonomy Neural Comput. Appl. The full citation details ... 2023 DBLP  DOI  BibTeX  RDF
11Yiming Zhao, Hongdong Zhao, Xuezhi Zhang, Weina Liu Vehicle classification based on audio-visual feature fusion with low-quality images and noise. Search on Bibsonomy J. Intell. Fuzzy Syst. The full citation details ... 2023 DBLP  DOI  BibTeX  RDF
11Jiwei Zhang 0012, Yi Yu 0001, Suhua Tang, Jianming Wu, Wei Li 0012 Variational Autoencoder with CCA for Audio-Visual Cross-modal Retrieval. Search on Bibsonomy ACM Trans. Multim. Comput. Commun. Appl. The full citation details ... 2023 DBLP  DOI  BibTeX  RDF
11Donghuo Zeng, Jianming Wu, Gen Hattori, Rong Xu, Yi Yu 0001 Learning Explicit and Implicit Dual Common Subspaces for Audio-visual Cross-modal Retrieval. Search on Bibsonomy ACM Trans. Multim. Comput. Commun. Appl. The full citation details ... 2023 DBLP  DOI  BibTeX  RDF
11Dandan Zhu, Xuan Shao, Qiangqiang Zhou, Xiongkuo Min, Guangtao Zhai, Xiaokang Yang A Novel Lightweight Audio-visual Saliency Model for Videos. Search on Bibsonomy ACM Trans. Multim. Comput. Commun. Appl. The full citation details ... 2023 DBLP  DOI  BibTeX  RDF
11Rynhardt Kruger, Febe de Wet, Thomas Niesler Mathematical Content Browsing for Print-disabled Readers Based on Virtual-world Exploration and Audio-visual Sensory Substitution. Search on Bibsonomy ACM Trans. Access. Comput. The full citation details ... 2023 DBLP  DOI  BibTeX  RDF
11Triantafyllos Kefalas, Eftychia Fotiadou, Markos Georgopoulos, Yannis Panagakis, Pingchuan Ma 0001, Stavros Petridis, Themos Stafylakis, Maja Pantic KAN-AV dataset for audio-visual face and speech analysis in the wild. Search on Bibsonomy Image Vis. Comput. The full citation details ... 2023 DBLP  DOI  BibTeX  RDF
11Luis Guillermo, Jose-Maria Rojas, Willy Ugarte Emotional 3D speech visualization from 2D audio visual data. Search on Bibsonomy Int. J. Model. Simul. Sci. Comput. The full citation details ... 2023 DBLP  DOI  BibTeX  RDF
11Yutao Zhang, Kaixing Wu, Mengfan Zhao An Audio-Visual Separation Model Integrating Dual-Channel Attention Mechanism. Search on Bibsonomy IEEE Access The full citation details ... 2023 DBLP  DOI  BibTeX  RDF
11Yuya Chiba, Ryuichiro Higashinaka Dialogue Situation Recognition in Everyday Conversation From Audio, Visual, and Linguistic Information. Search on Bibsonomy IEEE Access The full citation details ... 2023 DBLP  DOI  BibTeX  RDF
11Maryam Qamar, Suleman Qamar, Muhammad Muneeb, Sung-Ho Bae, Anis Ur Rahman 0001 Saliency Prediction in Uncategorized Videos Based on Audio-Visual Correlation. Search on Bibsonomy IEEE Access The full citation details ... 2023 DBLP  DOI  BibTeX  RDF
Displaying result #501 - #600 of 3474 (100 per page)
Maintained by L3S.
Previously maintained by Jörg Diederich.
Based upon DBLP by Michael Ley.
Data released under the ODC-BY 1.0 license.