Results
Found 3474 publication records. Showing 3474 according to the selection in the facets
Hits |
Authors |
Title |
Venue |
Year |
Link |
Author keywords |
11 | Leandro A. Passos, João Paulo Papa, Javier Del Ser, Amir Hussain 0001, Ahsan Adeel |
Multimodal audio-visual information fusion using canonical-correlated Graph Neural Network for energy-efficient speech enhancement. |
Inf. Fusion |
2023 |
DBLP DOI BibTeX RDF |
|
11 | |
Retracted: Investigating the interactive audio-visual course mode for college English using virtual reality and artificial intelligence. |
IET Softw. |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Di Guo, Huaping Liu 0001, Fuchun Sun 0001 |
Audio-visual language instruction understanding for robotic sorting. |
Robotics Auton. Syst. |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Qiya Song, Bin Sun 0001, Shutao Li |
Multimodal Sparse Transformer Network for Audio-Visual Speech Recognition. |
IEEE Trans. Neural Networks Learn. Syst. |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yuqin Cao, Xiongkuo Min, Wei Sun 0029, Guangtao Zhai |
Attention-Guided Neural Networks for Full-Reference and No-Reference Audio-Visual Quality Assessment. |
IEEE Trans. Image Process. |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yuqin Cao, Xiongkuo Min, Wei Sun 0029, Guangtao Zhai |
Subjective and Objective Audio-Visual Quality Assessment for User Generated Content. |
IEEE Trans. Image Process. |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Guinan Li, Jiajun Deng, Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Mingyu Cui, Helen Meng, Xunying Liu |
Audio-Visual End-to-End Multi-Channel Speech Separation, Dereverberation and Recognition. |
IEEE ACM Trans. Audio Speech Lang. Process. |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Xinyuan Qian, Zhengdong Wang, Jiadong Wang, Guohui Guan, Haizhou Li 0001 |
Audio-Visual Cross-Attention Network for Robotic Speaker Tracking. |
IEEE ACM Trans. Audio Speech Lang. Process. |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Shentong Mo, Weiguo Pian, Yapeng Tian |
Class-Incremental Grouping Network for Continual Audio-Visual Learning. |
ICCV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yuxin Mao, Jing Zhang, Mochu Xiang, Yiran Zhong, Yuchao Dai |
Multimodal Variational Auto-encoder based Audio-Visual Segmentation. |
ICCV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Muhammad Adi Nugroho, Sangmin Woo, Sumin Lee, Changick Kim |
Audio-Visual Glance Network for Efficient Video Recognition. |
ICCV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jinyu Chen, Wenguan Wang, Si Liu 0001, Hongsheng Li, Yi Yang 0001 |
Omnidirectional Information Gathering for Knowledge Transfer-based Audio-Visual Navigation. |
ICCV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Xiaobao Guo, Nithish Muthuchamy Selvaraj, Zitong Yu, Adams Wai-Kin Kong, Bingquan Shen, Alex C. Kot |
Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning. |
ICCV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Kranthi Kumar Rachavarapu, A. N. Rajagopalan 0001 |
Boosting Positive Segments for Weakly-Supervised Audio-Visual Video Parsing. |
ICCV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Weiguo Pian, Shentong Mo, Yunhui Guo, Yapeng Tian |
Audio-Visual Class-Incremental Learning. |
ICCV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Zhe Niu, Brian Mak |
On the Audio-visual Synchronization for Lip-to-Speech Synthesis. |
ICCV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jie Hong, Zeeshan Hayder, Junlin Han, Pengfei Fang, Mehrtash Harandi, Lars Petersson |
Hyperbolic Audio-visual Zero-shot Learning. |
ICCV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yang Liu 0084, Ying Tan, Haoyuan Lan |
Self-Supervised Contrastive Learning for Audio-Visual Action Recognition. |
ICIP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yuqin Cao, Xiongkuo Min, Wei Sun 0029, Xiao-Ping (Steven) Zhang, Guangtao Zhai |
Audio-Visual Quality Assessment for User Generated Content: Database and Method. |
ICIP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Pavel Korshunov, Haolin Chen, Philip N. Garner, Sébastien Marcel |
Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes. |
IJCB |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Syrine Haddad, Olfa Dâassi, Safya Belghith |
Emotion Recognition from Audio-Visual Information based on Convolutional Neural Network. |
ICCAD |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Moinak Bhattacharya, Prateek Prasanna |
Audio-visual feature fusion for improved thoracic disease classification. |
Medical Imaging: Computer-Aided Diagnosis |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Shan Liu, Bohan Wu, Shu Ma, Zhen Yang |
Advanced Audio-Visual Multimodal Warnings for Drivers: Effect of Specificity and Lead Time on Effectiveness. |
HCI (8) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Kazuki Seto, Yumi Asahi |
Sound Logo to Increase TV Advertising Effectiveness Based on Audio-Visual Features. |
HCI (5) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Chang Wang, Jun Du, Hang Chen, Ruoyu Wang 0029, Chao-Han Huck Yang, Jiangjiang Zhao, Yuling Ren, Qinglong Li, Chin-Hui Lee 0001 |
Enhancing Privacy Preservation with Quantum Computing for Word-Level Audio-Visual Speech Recognition. |
APSIPA ASC |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yu-Ching Chung, Ji-Yan Han, Bo-Sin Wang, Wei-Zhong Zheng, Kung-Yao Shen, Ying-Hui Lai |
An Audio-Visual Speech Enhancement System Based on 3D Image Features: An Application in Hearing Aids. |
APSIPA ASC |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Haodong Zhou, Tao Li, Jie Wang, Lin Li, Qingyang Hong |
CASA-Net: Cross-attention and Self-attention for End-to-End Audio-visual Speaker Diarization. |
APSIPA ASC |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yoto Fujita, Yoshiaki Bando, Keisuke Imoto, Masaki Onishi, Kazuyoshi Yoshii |
DOA-Aware Audio-Visual Self-Supervised Learning for Sound Event Localization and Detection. |
APSIPA ASC |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Vinaya Sree Katamneni, Ajita Rattani |
MIS-AVoiDD: Modality Invariant and Specific Representation for Audio-Visual Deepfake Detection. |
ICMLA |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Amandine Brunetto, Sascha Hornauer, Stella X. Yu, Fabien Moutarde |
The Audio-Visual BatVision Dataset for Research on Sight and Sound. |
IROS |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Haru Kondoh, Asako Kanezaki |
Multi-Goal Audio-Visual Navigation Using Sound Direction Map. |
IROS |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yusaku Nakajima, Masashi Hamaya, Kazutoshi Tanaka, Takafumi Hawai, Felix von Drigalski, Yasuo Takeichi, Yoshitaka Ushiku, Kanta Ono |
Robotic Powder Grinding with Audio-Visual Feedback for Laboratory Automation in Materials Science. |
IROS |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yizhuo Yang, Shenghai Yuan, Muqing Cao, Jianfei Yang, Lihua Xie |
AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness. |
IROS |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Zexu Pan, Gordon Wichern, Yoshiki Masuyama, François G. Germain, Sameer Khurana, Chiori Hori, Jonathan Le Roux |
Scenario-Aware Audio-Visual TF-Gridnet for Target Speech Extraction. |
ASRU |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jiachen Lian, Alexei Baevski, Wei-Ning Hsu, Michael Auli |
Av-Data2Vec: Self-Supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations. |
ASRU |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Cheng-I Jeff Lai, Freda Shi, Puyuan Peng, Yoon Kim, Kevin Gimpel, Shiyu Chang, Yung-Sung Chuang, Saurabhchand Bhati, David D. Cox, David Harwath, Yang Zhang 0001, Karen Livescu, James R. Glass |
Audio-Visual Neural Syntax Acquisition. |
ASRU |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Changan Chen, Wei Sun, David Harwath, Kristen Grauman |
Learning Audio-Visual Dereverberation. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yuqian Kuang, Xiaopeng Fan |
Collaborative Audio-Visual Event Localization Based on Sequential Decision and Cross-Modal Consistency. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Zirun Zhu, Hemin Yang, Min Tang, Ziyi Yang, Sefik Emre Eskimez, Huaming Wang |
Real-Time Audio-Visual End-To-End Speech Enhancement. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Zhongweiyang Xu, Xulin Fan, Mark Hasegawa-Johnson |
Dual-Path Cross-Modal Attention for Better Audio-Visual Speech Extraction. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Roshan Sharma, Weipeng He, Ju Lin, Egor Lakomkin, Yang Liu, Kaustubh Kalgaonkar |
Egocentric Audio-Visual Noise Suppression. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jiahong Li, Chenda Li, Yifei Wu, Yanmin Qian |
Robust Audio-Visual ASR with Unified Cross-Modal Attention. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Rodrigo Mira, Buye Xu, Jacob Donley, Anurag Kumar 0003, Stavros Petridis, Vamsi Krishna Ithapu, Maja Pantic |
LA-VOCE: LOW-SNR Audio-Visual Speech Enhancement Using Neural Vocoders. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | R. Gnana Praveen, Eric Granger, Patrick Cardinal |
Recursive Joint Attention for Audio-Visual Fusion in Regression Based Emotion Recognition. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Gaopeng Xu, Xianliang Wang, Sang Wang, Junfeng Yuan, Wei Guo, Wei Li, Jie Gao |
The NIO System for Audio-Visual Diarization and Recognition in MISP Challenge 2022. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Zhe Wang, Shilong Wu, Hang Chen, Mao-Kui He, Jun Du, Chin-Hui Lee 0001, Jingdong Chen, Shinji Watanabe 0001, Sabato Marco Siniscalchi, Odette Scharenborg, Diyuan Liu, Baocai Yin, Jia Pan, Jianqing Gao, Cong Liu 0006 |
The Multimodal Information Based Speech Processing (Misp) 2022 Challenge: Audio-Visual Diarization And Recognition. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Cassia Valentini-Botinhao, Andrea Lorena Aldana Blanco, Ondrej Klejch, Peter Bell 0001 |
Efficient Intelligibility Evaluation Using Keyword Spotting: A Study on Audio-Visual Speech Enhancement. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Christian Marinoni, Riccardo F. Gramaccioni, Changan Chen, Aurelio Uncini, Danilo Comminiello |
Overview of the L3DAS23 Challenge on Audio-Visual Extended Reality. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Mandar Gogate, Kia Dashtipour, Amir Hussain 0001 |
Towards Pose-Invariant Audio-Visual Speech Enhancement in the Wild for Next-Generation Multi-Modal Hearing Aids. |
ICASSP Workshops |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Hongbo Chen, Dongchen Zhu, Guanghui Zhang, Wenjun Shi, Xiaolin Zhang, Jiamao Li |
CM-CS: Cross-Modal Common-Specific Feature Learning For Audio-Visual Video Parsing. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Hui Chen, Hanyi Zhang, Longbiao Wang, Kong Aik Lee, Meng Liu, Jianwu Dang 0001 |
Self-Supervised Audio-Visual Speaker Representation with Co-Meta Learning. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yifei Wu, Chenda Li, Yanmin Qian |
Light-Weight Visualvoice: Neural Network Quantization On Audio Visual Speech Separation. |
ICASSP Workshops |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jing-Xuan Zhang, Genshun Wan, Zhen-Hua Ling, Jia Pan, Jianqing Gao, Cong Liu 0006 |
Self-Supervised Audio-Visual Speech Representations Learning by Multimodal Self-Distillation. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Haitao Xu, Liangfa Wei, Jie Zhang 0042, Jianming Yang, Yannan Wang, Tian Gao, Xin Fang, Li-Rong Dai 0001 |
A Multi-Scale Feature Aggregation Based Lightweight Network for Audio-Visual Speech Enhancement. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | I-Chun Chern, Kuo-Hsuan Hung, Yi-Ting Chen, Tassadaq Hussain, Mandar Gogate, Amir Hussain 0001, Yu Tsao 0001, Jen-Cheng Hou |
Audio-Visual Speech Enhancement and Separation by Utilizing Multi-Modal Self-Supervised Embeddings. |
ICASSP Workshops |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Ming Cheng, Haoxu Wang, Ziteng Wang, Qiang Fu, Ming Li 0026 |
The WHU-Alibaba Audio-Visual Speaker Diarization System for the MISP 2022 Challenge. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Xiaoming Ren, Chao Li, Shenjian Wang, Biao Li |
Practice of the Conformer Enhanced Audio-Visual Hubert on Mandarin and English. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Haoxu Wang, Ming Cheng, Qiang Fu, Ming Li 0026 |
The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yangcheng Li, Zefang Yu, Suncheng Xiang, Ting Liu 0016, Yuzhuo Fu |
AV-TAD: Audio-Visual Temporal Action Detection With Transformer. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Timothée Dhaussy, Bassam Jabaian, Fabrice Lefèvre, Radu Horaud |
Audio-Visual Speaker Diarization in the Framework of Multi-User Human-Robot Interaction. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Ya Jiang, Hang Chen, Jun Du, Qing Wang 0008, Chin-Hui Lee 0001 |
Incorporating Lip Features into Audio-Visual Multi-Speaker DOA Estimation by Gated Fusion. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Ali Golmakani, Mostafa Sadeghi, Romain Serizel |
Audio-Visual Speech Enhancement with a Deep Kalman Filter Generative Model. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Chang-Sung Sung, Jun-Cheng Chen, Chu-Song Chen |
Hearing and Seeing Abnormality: Self-Supervised Audio-Visual Mutual Learning for Deepfake Detection. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Pingchuan Ma 0001, Alexandros Haliassos, Adriana Fernandez-Lopez, Honglie Chen, Stavros Petridis, Maja Pantic |
Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Ruize Xu, Ruoxuan Feng, Shi-Xiong Zhang, Di Hu 0001 |
MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Rajat Hebbar, Digbalay Bose, Krishna Somandepalli, Veena Vijai, Shrikanth Narayanan |
A Dataset for Audio-Visual Sound Event Detection in Movies. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Tao Li, Haodong Zhou, Jie Wang, Qingyang Hong, Lin Li |
The XMU System for Audio-Visual Diarization and Recognition in MISP Challenge 2022. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Pengcheng Guo, He Wang, Bingshen Mu, Ao Zhang, Peikun Chen |
The NPU-ASLP System for Audio-Visual Speech Recognition in MISP 2022 Challenge. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Meng Liu, Kong Aik Lee, Longbiao Wang, Hanyi Zhang, Chang Zeng, Jianwu Dang 0001 |
Cross-Modal Audio-Visual Co-Learning for Text-Independent Speaker Verification. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jiuxin Lin, Xinyu Cai, Heinrich Dinkel, Jun Chen 0024, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Zhiyong Wu 0001, Yujun Wang, Helen Meng |
Av-Sepformer: Cross-Attention Sepformer for Audio-Visual Target Speaker Extraction. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Prerna Singh, Ayush Tripathi, Lalan Kumar, Tapan Kumar Gandhi |
Brain Connectivity Features-based Age Group Classification using Temporal Asynchrony Audio-Visual Integration Task. |
EMBC |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Hannes Oppermann, Antonia Thelen, Jens Haueisen |
Entrainment and resonance effects with a new mobile audio-visual stimulation device. |
EMBC |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yagna Gudipalli, Gauri Deshpande, Sachin Patel, Björn W. Schuller |
Deep Modelling Strategies for Human Confidence Classification using Audio-visual Data. |
EMBC |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Xiaojing Yu, Lan Zhang 0002, Xiang-Yang Li |
E-Talk: Accelerating Active Speaker Detection with Audio-Visual Fusion and Edge-Cloud Computing. |
SECON |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Sunan Li, Hailun Lian, Cheng Lu 0005, Yan Zhao, Chuangao Tang, Yuan Zong, Wenming Zheng |
Audio-Visual Group-based Emotion Recognition using Local and Global Feature Aggregation based Multi-Task Learning. |
ICMI |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Guangyao Li, Wenxuan Hou, Di Hu 0001 |
Progressive Spatio-temporal Perception for Audio-Visual Question Answering. |
ACM Multimedia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Tianyu Liu, Peng Zhang 0005, Wei Huang 0013, Yufei Zha, Tao You, Yanning Zhang |
Induction Network: Audio-Visual Modality Gap-Bridging for Self-Supervised Sound Source Localization. |
ACM Multimedia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Shiping Ge, Zhiwei Jiang, Yafeng Yin, Cong Wang, Zifeng Cheng, Qing Gu |
Learning Event-Specific Localization Preferences for Audio-Visual Event Localization. |
ACM Multimedia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Chenyu Yang, Mengxi Chen, Yanfeng Wang, Yu Wang 0027 |
Uncertainty-Guided End-to-End Audio-Visual Speaker Diarization for Far-Field Recordings. |
ACM Multimedia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Sung Jin Um, Dongjin Kim, Jung Uk Kim |
Audio-Visual Spatial Integration and Recursive Attention for Robust Sound Source Localization. |
ACM Multimedia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Chen Liu 0028, Peike Patrick Li, Xingqun Qi, Hu Zhang, Lincheng Li, Dadong Wang, Xin Yu 0002 |
Audio-Visual Segmentation by Exploring Cross-Modal Mutual Semantics. |
ACM Multimedia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Chao Sun, Min Chen 0003, Jialiang Cheng, Han Liang, Chuanbo Zhu 0002, Jincai Chen |
SCLAV: Supervised Cross-modal Contrastive Learning for Audio-Visual Coding. |
ACM Multimedia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Hongye Liu, Xianhai Xie, Yang Gao, Zhou Yu 0001 |
Parameter-Efficient Transfer Learning for Audio-Visual-Language Tasks. |
ACM Multimedia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Haotian Wang, Yuxuan Xi, Hang Chen, Jun Du, Yan Song 0001, Qing Wang 0008, Hengshun Zhou, Chenxi Wang, Jiefeng Ma, Pengfei Hu 0006, Ya Jiang, Shi Cheng, Jie Zhang 0042, Yuzhe Weng |
Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023. |
ACM Multimedia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Chenyang Lyu, Wenxi Li, Tianbo Ji, Longyue Wang, Liting Zhou, Cathal Gurrin, Linyi Yang, Yi Yu 0001, Yvette Graham, Jennifer Foster |
Graph-Based Video-Language Learning with Multi-Grained Audio-Visual Alignment. |
ACM Multimedia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jiayi Zhang, Weixin Li 0001 |
Multi-Modal and Multi-Scale Temporal Fusion Architecture Search for Audio-Visual Video Parsing. |
ACM Multimedia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Wenrui Li, Xi-Le Zhao, Zhengyu Ma, Xingtao Wang, Xiaopeng Fan, Yonghong Tian 0001 |
Motion-Decoupled Spiking Transformer for Audio-Visual Zero-Shot Learning. |
ACM Multimedia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Soichiro Komura, Katsuyoshi Maeyama, Akira Taniguchi, Tadahiro Taniguchi |
Lexical Acquisition from Audio-Visual Streams Using a Multimodal Recurrent State-Space Model. |
ICDL |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yumi Hughes, Kae Mukai, Katsumi Watanabe, Kazutoshi Kudo |
An on-line study about recognition of improvisation theatre using audio-visual information. |
CogSci |
2023 |
DBLP BibTeX RDF |
|
11 | Huilin Tian, Jingke Meng, Yuhan Yao, Wei-Shi Zheng 0001 |
Unimodal-Multimodal Collaborative Enhancement for Audio-Visual Event Localization. |
PRCV (6) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Pritam Sarkar, Aaron Posen, Ali Etemad |
AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work. |
AAAI |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Pritam Sarkar, Ali Etemad |
Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity. |
AAAI |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Mingrui Lao, Nan Pu, Yu Liu 0012, Kai He, Erwin M. Bakker, Michael S. Lew |
COCA: COllaborative CAusal Regularization for Audio-Visual Question Answering. |
AAAI |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Chen Chen 0075, Yuchen Hu, Qiang Zhang, Heqing Zou, Beier Zhu, Eng Siong Chng |
Leveraging Modality-Specific Representations for Audio-Visual Speech Recognition via Reinforcement Learning. |
AAAI |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Simon Jenni, Alexander Black 0001, John P. Collomosse |
Audio-Visual Contrastive Learning with Temporal Self-Supervision. |
AAAI |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jingfei Xia, Mingchen Zhuge, Tiantian Geng, Shun Fan, Yuantai Wei, Zhenyu He 0001, Feng Zheng |
Skating-Mixer: Long-Term Sport Audio-Visual Modeling with MLPs. |
AAAI |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Peijun Bao, Wenhan Yang, Boon Poh Ng, Meng Hwa Er, Alex C. Kot |
Cross-Modal Label Contrastive Learning for Unsupervised Audio-Visual Event Localization. |
AAAI |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Mina Huh, Saelyne Yang, Yi-Hao Peng, Xiang 'Anthony' Chen, Young-Ho Kim, Amy Pavel |
AVscript: Accessible Video Editing with Audio-Visual Scripts. |
CHI |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Wenting Zhao 0003, Shigang Wang, Yan Zhao 0012, Jian Wei, Tianshu Li |
A Novel Intelligent Assessment Based on Audio-Visual Data for Chinese Zither Fingerings. |
ICIG (4) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Xiaoyu Wu, Jucheng Qiu, Qiurui Yue |
GLTCM: Global-Local Temporal and Cross-Modal Network for Audio-Visual Event Localization. |
ICIG (2) |
2023 |
DBLP DOI BibTeX RDF |
|