Results
Found 3474 publication records. Showing 3474 according to the selection in the facets.
Hits |
Authors |
Title |
Venue |
Year |
Link |
Author keywords |
11 | Yuxin Zhu, Xilei Zhu, Huiyu Duan, Jie Li, Kaiwei Zhang, Yucheng Zhu, Li Chen, Xiongkuo Min, Guangtao Zhai |
Audio-Visual Saliency for Omnidirectional Videos. |
ICIG (5) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Hong-Liang Dai, Xinfeng Zhang 0003, Haiyang Yu 0002 |
An Attention-based Audio-visual Fusion Method for Short Video Classification. |
BDIOT |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yibo He, Kah Phooi Seng, Li-Minn Ang, Xingyu Zhao |
Cycle-Consistent Generative Adversarial Network Architectures for Audio Visual Speech Recognition. |
ICSPCC |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Peng Zhang, Hui Zhao, Meijuan Li, Yida Chen, Jianqiang Zhang, Fuqiang Wang, Xiaoming Wu |
Audio-Visual Emotion Recognition Based on Multi-Scale Channel Attention and Global Interactive Fusion. |
SMC |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jinxin Wang, Chao Yang 0024, Zhongwen Guo, Xiaomei Li, Weigang Wang |
An End-to-End Mandarin Audio-Visual Speech Recognition Model with a Feature Enhancement Module. |
SMC |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yapeng Li, Yong Luo 0002, Bo Du 0001 |
Audio-Visual Generalized Zero-Shot Learning Based on Variational Information Bottleneck. |
ICME |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jinxin Wang, Zhongwen Guo, Chao Yang 0024, Xiaomei Li, Ziyuan Cui |
Multi-Scale Hybrid Fusion Network for Mandarin Audio-Visual Speech Recognition. |
ICME |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Wenrui Li, Zhengyu Ma, Liang-Jian Deng, Hengyu Man, Xiaopeng Fan |
Modality-Fusion Spiking Transformer Network for Audio-Visual Zero-Shot Learning. |
ICME |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Otniel-Bogdan Mercea, Thomas Hummel 0001, A. Sophia Koepke, Zeynep Akata |
Text-to-Feature Diffusion for Audio-Visual Few-Shot Learning. |
DAGM |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yutong Jiang, Kaoru Hirota, Yaping Dai, Ye Ji, Shuai Shao |
Abnormal Emotion Recognition Based on Audio-Visual Modality Fusion. |
ICIRA (1) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Shentong Mo, Yapeng Tian |
Audio-Visual Grouping Network for Sound Localization from Mixtures. |
CVPR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Junwen Xiong, Ganglai Wang, Peng Zhang 0005, Wei Huang 0013, Yufei Zha, Guangtao Zhai |
CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective. |
CVPR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Weixuan Sun, Jiayi Zhang, Jianyuan Wang, Zheyuan Liu 0002, Yiran Zhong, Tianpeng Feng, Yandong Guo, Yanhao Zhang, Nick Barnes |
Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning. |
CVPR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Chao Feng, Ziyang Chen, Andrew Owens |
Self-Supervised Video Forensics by Audio-Visual Anomaly Detection. |
CVPR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jiaben Chen, Renrui Zhang, Dongze Lian, Jiaqi Yang, Ziyao Zeng, Jianbo Shi |
iQuery: Instruments as Queries for Audio-Visual Sound Separation. |
CVPR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Reuben Tan, Arijit Ray, Andrea Burns, Bryan A. Plummer, Justin Salamon, Oriol Nieto, Bryan Russell, Kate Saenko |
Language-Guided Audio-Visual Source Separation via Trimodal Consistency. |
CVPR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Chao Huang, Yapeng Tian, Anurag Kumar, Chenliang Xu |
Egocentric Audio-Visual Object Localization. |
CVPR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Wenru Zheng, Ryota Yoshihashi, Rei Kawakami, Ikuro Sato, Asako Kanezaki |
Multi Event Localization by Audio-Visual Fusion with Omnidirectional Camera and Microphone Array. |
CVPR Workshops |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Davide Cozzolino, Alessandro Pianese, Matthias Nießner, Luisa Verdoliva |
Audio-Visual Person-of-Interest DeepFake Detection. |
CVPR Workshops |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Aggelina Chatziagapi, Dimitris Samaras |
AVFace: Towards Detailed Audio-Visual 4D Face Reconstruction. |
CVPR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Junyu Gao 0002, Mengyuan Chen, Changsheng Xu |
Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception. |
CVPR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yan-Bo Lin, Yi-Lin Sung, Jie Lei 0003, Mohit Bansal, Gedas Bertasius |
Vision Transformers are Parameter-Efficient Audio-Visual Learners. |
CVPR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Tiantian Geng, Teng Wang, Jinming Duan 0001, Runmin Cong, Feng Zheng |
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline. |
CVPR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Juheon Hwang, Jiwoo Kang |
Audio-visual Neural Face Generation with Emotional Stimuli. |
IEEE Big Data |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yi-Lu Jiang, Wen-Chang Chang, Chih-Yi Chiu |
Pineapple Quality Classification in a Multimodal Audio-Visual Dataset. |
IEEE Big Data |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Xingjian Diao, Ming Cheng, Shitong Cheng |
AV-MaskEnhancer: Enhancing Video Representations through Audio-Visual Masked Autoencoder. |
ICTAI |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Donghuo Zeng, Kazushi Ikeda |
Triplet Loss with Curriculum Learning for Audio-Visual Retrieval. |
ISM |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Antonio Rios-Navarro, Enrique Piñero-Fuentes, Salvador Canas-Moreno, Aqib Javed, Jim Harkin, Alejandro Linares-Barranco |
LIPSFUS: A neuromorphic dataset for audio-visual sensory fusion of lip reading. |
ISCAS |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Abhijeet Bishnu, Ankit Gupta 0008, Mandar Gogate, Kia Dashtipour, Tughrul Arslan, Ahsan Adeel, Amir Hussain 0001, Mathini Sellathurai, Tharmalingam Ratnarajah |
Live Demonstration: Cloud-based Audio-Visual Speech Enhancement in Multimodal Hearing-aids. |
ISCAS |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Zheng Zhang 0043, Zheng Ning, Chenliang Xu, Yapeng Tian, Toby Jia-Jun Li |
PEANUT: A Human-AI Collaborative Tool for Annotating Audio-Visual Data. |
UIST |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Leena Mathur, Ralph Adolphs, Maja J. Mataric |
Towards Intercultural Affect Recognition: Audio-Visual Affect Recognition in the Wild Across Six Cultures. |
FG |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Ryan Buyssens |
[in]florescence - a tangible audio-visual installation. |
SIGGRAPH Labs |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Josef Chaloupka, Karel Palecek |
Audio-Visual Broadcast Transcription System in the Era of Covid-19. |
TSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Hang Zhang, Xin Li, Lidong Bing |
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding. |
EMNLP (Demos) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yuanyuan Jiang, Jianqin Yin |
Target-Aware Spatio-Temporal Reasoning via Answering Questions in Dynamic Audio-Visual Scenarios. |
EMNLP (Findings) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Joanna Hong, Se Jin Park, Yong Man Ro |
Intuitive Multilingual Audio-Visual Speech Recognition with a Single-Trained Model. |
EMNLP (Findings) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Craig Cieciura, Maxine Glancy, Philip J. B. Jackson |
Producing Personalised Object-Based Audio-Visual Experiences: an Ethnographic Study. |
IMX |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Gaurav Singh, Paul Ghanem, Taskin Padir |
Sporadic Audio-Visual Embodied Assistive Robot Navigation For Human Tracking. |
PETRA |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Haoyi Duan, Yan Xia, Mingze Zhou, Li Tang, Jieming Zhu, Zhou Zhao |
Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual Downstream Tasks. |
NeurIPS |
2023 |
DBLP BibTeX RDF |
|
11 | Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Aleksander Krause 0001, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji |
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. |
NeurIPS |
2023 |
DBLP BibTeX RDF |
|
11 | Yung-Hsuan Lai, Yen-Chun Chen 0001, Frank Wang |
Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser. |
NeurIPS |
2023 |
DBLP BibTeX RDF |
|
11 | Yingying Fan, Yu Wu 0011, Bo Du, Yutian Lin |
Revisit Weakly-Supervised Audio-Visual Video Parsing from the Language Perspective. |
NeurIPS |
2023 |
DBLP BibTeX RDF |
|
11 | Susan Liang, Chao Huang, Yapeng Tian, Anurag Kumar, Chenliang Xu |
AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis. |
NeurIPS |
2023 |
DBLP BibTeX RDF |
|
11 | Yuxin Guo, Shijie Ma, Hu Su, Zhiqing Wang, Yuhao Zhao, Wei Zou, Siyang Sun, Yun Zheng |
Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization. |
NeurIPS |
2023 |
DBLP BibTeX RDF |
|
11 | Shentong Mo, Bhiksha Raj |
Weakly-Supervised Audio-Visual Segmentation. |
NeurIPS |
2023 |
DBLP BibTeX RDF |
|
11 | Luchcha Lam, Minsoo Choi, Magzhan Mukanova, Klay Hauser, Fangzheng Zhao, Richard E. Mayer, Christos Mousas, Nicoletta Adamo-Villani |
Effects of Body Type and Voice Pitch on Perceived Audio-Visual Correspondence and Believability of Virtual Characters. |
SAP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Qichen Zheng, Jie Hong, Moshiur Farazi |
A Generative Approach to Audio-Visual Generalized Zero-Shot Learning: Combining Contrastive and Discriminative Techniques. |
IJCNN |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Silong Liang, Chunxiao Li, Naying Cui, Minghui Sun, Hao Xue |
3DSEAVNet: 3D-Squeeze-and-Excitation Networks for Audio-Visual Saliency Prediction. |
IJCNN |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jinqiao Dou, Xi Chen 0025, Yuehai Wang |
Specialty may be better: A decoupling multi-modal fusion network for Audio-visual event localization. |
IJCNN |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Gnana Praveen Rajasekhar, Jahangir Alam |
Audio-Visual Speaker Verification via Joint Cross-Attention. |
SPECOM (2) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Salam Nandakishor, Debadatta Pati |
Improvement of Audio-Visual Keyword Spotting System Accuracy Using Excitation Source Feature. |
SPECOM (2) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Denis Ivanko, Elena Ryumina, Dmitry Ryumin, Alexandr Axyonov, Alexey M. Kashevnik, Alexey Karpov 0001 |
EMO-AVSR: Two-Level Approach for Audio-Visual Emotional Speech Recognition. |
SPECOM (1) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yidan Fan, Yongxin Yu, Wenhuan Lu, Yahong Han |
A Cross-modal and Redundancy-reduced Network for Weakly-Supervised Audio-Visual Violence Detection. |
MMAsia |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Darshan Singh S, Anchit Gupta, C. V. Jawahar, Makarand Tapaswi |
Unsupervised Audio-Visual Lecture Segmentation. |
WACV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Tanvir Mahmud, Diana Marculescu |
AVE-CLIP: AudioCLIP-based Multi-window Temporal Transformer for Audio Visual Event Localization. |
WACV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Madhav Agarwal, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C. V. Jawahar |
Audio-Visual Face Reenactment. |
WACV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Arda Senocak, Junsik Kim 0001, Tae-Hyun Oh, Dingzeyu Li, In So Kweon |
Event-Specific Audio-Visual Fusion Layers: A Simple and New Perspective on Video Understanding. |
WACV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Maxime Burchi, Radu Timofte |
Audio-Visual Efficient Conformer for Robust Speech Recognition. |
WACV |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jialiang Cheng, Chao Sun, Jincai Chen, Ping Lu 0006 |
Audio-visual mutual learning for Weakly Supervised Violence Detection. |
ICISE |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Jialing Zou, Jiahao Mei, Guangze Ye, Tianyu Huai, Qiwei Shen, Daoguo Dong |
EMID: An Emotional Aligned Dataset in Audio-Visual Modality. |
MCGE@MM |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Kei Suzuki, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai |
Audio-Visual Class Association Based on Two-stage Self-supervised Contrastive Learning towards Robust Scene Analysis. |
SII |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yuan Gong 0001, Andrew Rouditchenko, Alexander H. Liu, David Harwath, Leonid Karlinsky, Hilde Kuehne, James R. Glass |
Contrastive Audio-Visual Masked Autoencoder. |
ICLR |
2023 |
DBLP BibTeX RDF |
|
11 | Haoyue Cheng, Zhaoyang Liu, Wayne Wu, Limin Wang 0002 |
Filter-Recovery Network for Multi-Speaker Audio-Visual Speech Separation. |
ICLR |
2023 |
DBLP BibTeX RDF |
|
11 | Shentong Mo, Pedro Morgado 0001 |
A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition. |
ICML |
2023 |
DBLP BibTeX RDF |
|
11 | Yuchen Hu, Ruizhe Li 0001, Chen Chen 0075, Heqing Zou, Qiushi Zhu, Eng Siong Chng |
Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition. |
IJCAI |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Xilei Zhu, Huiyu Duan, Yuqin Cao, Yuxin Zhu, Yucheng Zhu, Jing Liu, Li Chen, Xiongkuo Min, Guangtao Zhai |
Perceptual Quality Assessment of Omnidirectional Audio-Visual Signals. |
CICAI (2) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Hanyuan Wang, Majid Mirmehdi, Dima Damen, Toby Perrett |
Centre Stage: Centricity-based Audio-Visual Temporal Action Detection. |
BMVC Workshop |
2023 |
DBLP BibTeX RDF |
|
11 | Yating Xu, Conghui Hu, Gim Hee Lee |
Motion and Context-Aware Audio-Visual Conditioned Video Prediction. |
BMVC |
2023 |
DBLP BibTeX RDF |
|
11 | Feixiang Wang, Shuang Yang, Shiguang Shan, Xilin Chen 0001 |
Dual Attention for Audio-Visual Speech Enhancement with Facial Cues. |
BMVC |
2023 |
DBLP BibTeX RDF |
|
11 | Jiarui Yu, Haoran Li, Yanbin Hao, Jinmeng Wu, Tong Xu 0001, Shuo Wang 0008, Xiangnan He 0001 |
How Can Contrastive Pre-training Benefit Audio-Visual Segmentation? A Study from Supervised and Zero-shot Perspectives. |
BMVC |
2023 |
DBLP BibTeX RDF |
|
11 | Tomoya Yoshinaga, Keitaro Tanaka, Shigeo Morishima |
Audio-Visual Speech Enhancement with Selective Off-Screen Speech Extraction. |
EUSIPCO |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Özkan Çayli, Xubo Liu, Volkan Kiliç, Wenwu Wang 0001 |
Knowledge Distillation for Efficient Audio-Visual Video Captioning. |
EUSIPCO |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Kenichi Ito, Juro Hosoi, Yuki Ban, Takayuki Kikuchi, Kyosuke Nakagawa, Hanako Kitagawa, Chizuru Murakami, Yosuke Imai, Shin'ichi Warisawa |
Wind comfort and emotion can be changed by the cross-modal presentation of audio-visual stimuli of indoor and outdoor environments. |
VR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Dan Si, Qing Ye, Jindi Lv, Yuhao Zhou, Jiancheng Lv 0001 |
Violence-MFAS: Audio-Visual Violence Detection Using Multimodal Fusion Architecture Search. |
ICONIP (14) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yuchen Hu, Chen Chen 0075, Ruizhe Li 0001, Heqing Zou, Eng Siong Chng |
MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition. |
ACL (1) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren 0006, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin, Zhou Zhao |
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation. |
ACL (1) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Wang Lin, Tao Jin, Wenwen Pan, Linjun Li, Xize Cheng, Ye Wang, Zhou Zhao |
TAVT: Towards Transferable Audio-Visual Text Generation. |
ACL (1) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yuchen Hu, Ruizhe Li 0001, Chen Chen 0075, Chengwei Qin, Qiu-Shi Zhu, Eng Siong Chng |
Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition. |
ACL (1) |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Juan F. Montesinos |
Towards efficient audio-visual source separation and synthesis |
|
2023 |
RDF |
|
11 | Shota Abe, Shuichi Sakamoto, Zhengile Cui, Yôiti Suzuki |
Determination of optimal levels of whole-body vibration using audio-visual information of multimodal content. |
J. Inf. Hiding Multim. Signal Process. |
2022 |
DBLP BibTeX RDF |
|
11 | Yasar Dasdemir, Rüstem Özakar |
Affective states classification performance of audio-visual stimuli from EEG signals with multiple-instance learning. |
Turkish J. Electr. Eng. Comput. Sci. |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Maria Pawelec |
Deepfakes and Democracy (Theory): How Synthetic Audio-Visual Media for Disinformation and Hate Speech Threaten Core Democratic Functions. |
Digit. Soc. |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Jianning Wu, Zhuqing Jiang, Qingchao Chen, Shiping Wen 0001, Aidong Men, Haiying Wang 0005 |
Toward a perceptive pretraining framework for Audio-Visual Video Parsing. |
Inf. Sci. |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Hacene Terbouche, Liam Schoneveld, Oisin Benson, Alice Othmani |
Comparing Learning Methodologies for Self-Supervised Audio-Visual Representation Learning. |
IEEE Access |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Tomoya Sato, Yusuke Sugano, Yoichi Sato |
Self-Supervised Learning for Audio-Visual Relationships of Videos With Stereo Sounds. |
IEEE Access |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Pratibha Kumari 0001, Mukesh Saini |
An Adaptive Framework for Anomaly Detection in Time-Series Audio-Visual Data. |
IEEE Access |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Honghui Xu, Zhipeng Cai 0001, Daniel Takabi, Wei Li 0059 |
Audio-Visual Autoencoding for Privacy-Preserving Video Streaming. |
IEEE Internet Things J. |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Moonisa Ahsan, Fabio Marton, Ruggero Pintus, Enrico Gobbetti |
Audio-visual annotation graphs for guiding lens-based scene exploration. |
Comput. Graph. |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Saswati Debnath, Pinki Roy |
Audio-visual speech recognition based on machine learning approach. |
Int. J. Adv. Intell. Paradigms |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Jannik Zürn, Wolfram Burgard |
Self-Supervised Moving Vehicle Detection From Audio-Visual Cues. |
IEEE Robotics Autom. Lett. |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Xinyuan Qian, Qiquan Zhang, Guohui Guan, Wei Xue |
Deep Audio-Visual Beamforming for Speaker Localization. |
IEEE Signal Process. Lett. |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Gonzalo D. Sad, Lucas D. Terissi, Juan Carlos Gómez |
Complementary models for audio-visual speech classification. |
Int. J. Speech Technol. |
2022 |
DBLP DOI BibTeX RDF |
|
11 | |
Detecting adversarial attacks on audio-visual speech recognition using deep learning method. |
Int. J. Speech Technol. |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Aishan Liu, Huiyuan Xie, Xianglong Liu 0001, Zixin Yin, Shunchang Liu |
Revisiting audio visual scene-aware dialog. |
Neurocomputing |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Zhen Liang, Xihao Zhang, Rushuang Zhou, Li Zhang 0041, Linling Li, Gan Huang, Zhiguo Zhang 0001 |
Cross-individual affective detection using EEG signals with audio-visual embedding. |
Neurocomputing |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Xinyuan Qian, Alessio Brutti, Oswald Lanz, Maurizio Omologo, Andrea Cavallaro |
Audio-Visual Tracking of Concurrent Speakers. |
IEEE Trans. Multim. |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Aihua Zheng, Menglan Hu, Bo Jiang 0002, Yan Huang, Yan Yan 0002, Bin Luo 0001 |
Adversarial-Metric Learning for Audio-Visual Cross-Modal Matching. |
IEEE Trans. Multim. |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey |
AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation. |
CoRR |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Ziwei Ji, Yan Xu 0012, I-Tsun Cheng, Samuel Cahyawijaya, Rita Frieske, Etsuko Ishii, Min Zeng, Andrea Madotto, Pascale Fung |
VScript: Controllable Script Generation with Audio-Visual Presentation. |
CoRR |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Wenliang Dai, Samuel Cahyawijaya, Tiezheng Yu, Elham J. Barezi, Peng Xu 0008, Cheuk Tung Shadow Yiu, Rita Frieske, Holy Lovenia, Genta Indra Winata, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung |
CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition. |
CoRR |
2022 |
DBLP BibTeX RDF |
|