Publications at "Eval4NLP"( http://dblp.L3S.de/Venues/Eval4NLP )

URL (DBLP): http://dblp.uni-trier.de/db/conf/eval4nlp

Publication years (Num. hits)
2020 (18) 2021 (25) 2022 (12) 2023 (20)
Publication types (Num. hits)
inproceedings(71) proceedings(4)
Venues (Conferences, Journals, ...)
Eval4NLP(75)

Results
Found 75 publication records. Showing 75 according to the selection in the facets
Authors | Title | Venue | Year
Yuan Lu, Yu-Ting Lin | Characterised LLMs Affect its Evaluation of Summary and Translation. | Eval4NLP | 2023
Ghazaleh Mahmoudi | Exploring Prompting Large Language Models as Explainable Metrics. | Eval4NLP | 2023
Neema Kotonya, Saran Krishnasamy, Joel R. Tetreault, Alejandro Jaimes | Little Giants: Exploring the Potential of Small LLMs as Evaluation Metrics in Summarization in the Eval4NLP 2023 Shared Task. | Eval4NLP | 2023
Daniil Larionov, Vasiliy Viskov, George Kokush, Alexander Panchenko, Steffen Eger | Team NLLG submission for Eval4NLP 2023 Shared Task: Retrieval-Augmented In-Context Learning for NLG Evaluation. | Eval4NLP | 2023
Abhishek Pradhan, Ketan Kumar Todi | Understanding Large Language Model Based Metrics for Text Summarization. | Eval4NLP | 2023
Pavan Baswani, Ananya Mukherjee, Manish Shrivastava 0001 | LTRC_IIITH's 2023 Submission for Prompting Large Language Models as Explainable Metrics Task. | Eval4NLP | 2023
Jeremy Block, Yu-Peng Chen, Abhilash Budharapu, Lisa Anthony, Bonnie J. Dorr | Summary Cycles: Exploring the Impact of Prompt Engineering on Large Language Models' Interaction with Interaction Log Information. | Eval4NLP | 2023
Savita Bhat, Vasudeva Varma | Large Language Models As Annotators: A Preliminary Evaluation For Annotating Low-Resource Language Content. | Eval4NLP | 2023
Lukas Weber, Krishnan Jothi Ramalingam, Matthias Beyer, Axel Zimmermann 0005 | WRF: Weighted Rouge-F1 Metric for Entity Recognition. | Eval4NLP | 2023
Christoph Leiter, Juri Opitz, Daniel Deutsch, Yang Gao 0021, Rotem Dror, Steffen Eger | The Eval4NLP 2023 Shared Task on Prompting Large Language Models as Explainable Metrics. | Eval4NLP | 2023
Joonghoon Kim, Sangmin Lee, Seung Hun Han, Saeran Park, Jiyoon Lee, Kiyoon Jeong, Pilsung Kang 0001 | Which is better? Exploring Prompting Strategy For LLM-based Metrics. | Eval4NLP | 2023
Daniel Deutsch, Rotem Dror, Steffen Eger, Yang Gao 0021, Christoph Leiter, Juri Opitz, Andreas Rücklé (eds.) | Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems, Eval4NLP 2023, Bali, Indonesia, November 1, 2023 | Eval4NLP | 2023
Rui Zhang, Fuhai Song, Hui Huang, Jinghao Yuan, Muyun Yang, Tiejun Zhao | HIT-MI&T Lab's Submission to Eval4NLP 2023 Shared Task. | Eval4NLP | 2023
Abbas Akkasi, Kathleen C. Fraser, Majid Komeili | Reference-Free Summarization Evaluation with Large Language Models. | Eval4NLP | 2023
Yanran Chen, Steffen Eger | Transformers Go for the LOLs: Generating (Humourous) Titles from Scientific Abstracts End-to-End. | Eval4NLP | 2023
Jad Doughman, Shady Shehata, Leen Al Qadi, Youssef Nafea, Fakhri Karray | Can a Prediction's Rank Offer a More Accurate Quantification of Bias? A Case Study Measuring Sexism in Debiased Language Models. | Eval4NLP | 2023
Nitin Ramrakhiyani, Vasudeva Varma, Girish K. Palshikar, Sachin Pawar | Zero-shot Probing of Pretrained Language Models for Geography Knowledge. | Eval4NLP | 2023
Yixuan Wang, Qingyan Chen, Duygu Ataman | Delving into Evaluation Metrics for Generation: A Thorough Assessment of How Metrics Generalize to Rephrasing Across Languages. | Eval4NLP | 2023
Zahra Kolagar, Sebastian Steindl, Alessandra Zarcone | EduQuick: A Dataset Toward Evaluating Summarization of Informal Educational Content for Social Media. | Eval4NLP | 2023
Vatsal Raina, Adian Liusie, Mark J. F. Gales | Assessing Distractors in Multiple-Choice Tests. | Eval4NLP | 2023
Shohei Higashiyama, Masao Ideuchi, Masao Utiyama, Yoshiaki Oida, Eiichiro Sumita | A Japanese Corpus of Many Specialized Domains for Word Segmentation and Part-of-Speech Tagging. | Eval4NLP | 2022
Yunmeng Li, Jun Suzuki, Makoto Morishita, Kaori Abe, Ryoko Tokuhisa, Ana Brassard, Kentaro Inui | Chat Translation Error Detection for Assisting Cross-lingual Communications. | Eval4NLP | 2022
Kaori Abe, Sho Yokoi, Tomoyuki Kajiwara, Kentaro Inui | Why is sentence similarity benchmark not predictive of application-oriented task performance? | Eval4NLP | 2022
Guanyi Chen, Fahime Same, Kees van Deemter | Assessing Neural Referential Form Selectors on a Realistic Multilingual Dataset. | Eval4NLP | 2022
Ryan Chi, Nathan Kim, Patrick Liu, Zander Lack, Ethan A. Chi | GLARE: Generative Left-to-right AdversaRial Examples. | Eval4NLP | 2022
Mateusz Krubiński, Pavel Pecina | From COMET to COMES - Can Summary Evaluation Benefit from Translation Evaluation? | Eval4NLP | 2022
Parush Gera, Tempestt J. Neal | A Comparative Analysis of Stance Detection Approaches and Datasets. | Eval4NLP | 2022
Daniel Deutsch, Can Udomcharoenchaikit, Juri Opitz, Yang Gao 0021, Marina Fomicheva, Steffen Eger (eds.) | Proceedings of the 3rd Workshop on Evaluation and Comparison of NLP Systems, Eval4NLP 2022, Online, November 20, 2022 | Eval4NLP | 2022
Zachary Zhou, Alisha Zachariah, Devin Conathan, Jeffery Kline | Assessing Resource-Performance Trade-off of Natural Language Models using Data Envelopment Analysis. | Eval4NLP | 2022
Zhengxiang Wang | Random Text Perturbations Work, but not Always. | Eval4NLP | 2022
Juri Opitz, Anette Frank | Better Smatch = Better Parser? AMR evaluation is not so simple anymore. | Eval4NLP | 2022
Roberta Rocca, Alejandro de la Vega | Evaluating the role of non-lexical markers in GPT-2's language modeling behavior. | Eval4NLP | 2022
Qingkai Zeng 0001, Mengxia Yu, Wenhao Yu 0002, Tianwen Jiang, Meng Jiang 0001 | Validating Label Consistency in NER Data Annotation. | Eval4NLP | 2021
Nicolas Garneau, Luc Lamontagne | Trainable Ranking Models to Evaluate the Semantic Accuracy of Data-to-Text Neural Generator. | Eval4NLP | 2021
Heather Lent, Semih Yavuz, Tao Yu, Tong Niu, Yingbo Zhou, Dragomir Radev, Xi Victoria Lin | Testing Cross-Database Semantic Parsers With Canonical Utterances. | Eval4NLP | 2021
Melda Eksi, Erik Gelbing, Jonathan Stieber, Chi Viet Vu | Explaining Errors in Machine Translation with Absolute Gradient Ensembles. | Eval4NLP | 2021
Emma Manning, Nathan Schneider 0001 | Referenceless Parsing-Based Evaluation of AMR-to-English Generation. | Eval4NLP | 2021
Benjamin Murauer, Günther Specht | Developing a Benchmark for Reducing Data Bias in Authorship Attribution. | Eval4NLP | 2021
Chester Palen-Michel, Nolan Holley, Constantine Lignos | SeqScore: Addressing Barriers to Reproducible Named Entity Recognition Evaluation. | Eval4NLP | 2021
Lucie Gianola, Hicham El Boukkouri, Cyril Grouin, Thomas Lavergne, Patrick Paroubek, Pierre Zweigenbaum | Differential Evaluation: a Qualitative Analysis of Natural Language Processing System Behavior Based Upon Data Resistance to Processing. | Eval4NLP | 2021
Oleg V. Vasilyev 0001, John Bohannon | ESTIME: Estimation of Summary-to-Text Inconsistency by Mismatched Embeddings. | Eval4NLP | 2021
Alexey Tikhonov, Igor Samenko, Ivan P. Yamshchikov | StoryDB: Broad Multi-language Narrative Dataset. | Eval4NLP | 2021
Ayush Garg 0001, Sammed S. Kagi, Vivek Srivastava, Mayank Singh 0001 | MIPE: A Metric Independent Pipeline for Effective Code-Mixed NLG Evaluation. | Eval4NLP | 2021
Christoph Wolfgang Leiter | Reference-Free Word- and Sentence-Level Translation Evaluation with Token-Matching Metrics. | Eval4NLP | 2021
Vivek Srivastava, Mayank Singh 0001 | HinGE: A Dataset for Generation and Evaluation of Code-Mixed Hinglish Text. | Eval4NLP | 2021
Yang Liu 0254, Alan Medlar, Dorota Glowacka | Statistically Significant Detection of Semantic Shifts using Contextual Word Embeddings. | Eval4NLP | 2021
Urja Khurana, Eric T. Nalisnick, Antske Fokkens | How Emotionally Stable is ALBERT? Testing Robustness with Stochastic Weight Averaging on a Sentiment Analysis Task. | Eval4NLP | 2021
Peter Polák, Muskaan Singh, Ondrej Bojar | Explainable Quality Estimation: CUNI Eval4NLP Submission. | Eval4NLP | 2021
Marina Fomicheva, Piyawat Lertvittayakumjorn, Wei Zhao 0033, Steffen Eger, Yang Gao 0021 | The Eval4NLP Shared Task on Explainable Quality Estimation: Overview and Results. | Eval4NLP | 2021
Enzo Terreau, Antoine Gourru, Julien Velcin | Writing Style Author Embedding Evaluation. | Eval4NLP | 2021
Yang Gao 0021, Steffen Eger, Wei Zhao 0033, Piyawat Lertvittayakumjorn, Marina Fomicheva (eds.) | Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems, Eval4NLP 2021, Punta Cana, Dominican Republic, November 10, 2021 | Eval4NLP | 2021
Marcos V. Treviso, Nuno Miguel Guerreiro, Ricardo Rei, André F. T. Martins | IST-Unbabel 2021 Submission for the Explainable Quality Estimation Shared Task. | Eval4NLP | 2021
Oskar Wysocki, Malina Florea, Dónal Landers, André Freitas | What is SemEval evaluating? A Systematic Analysis of Evaluation Campaigns in NLP. | Eval4NLP | 2021
Yo Ehara | Evaluation of Unsupervised Automatic Readability Assessors Using Rank Correlations. | Eval4NLP | 2021
Tasnim Kabir, Marine Carpuat | The UMD Submission to the Explainable MT Quality Estimation Shared Task: Combining Explanation Models with Sequence Labeling. | Eval4NLP | 2021
Raphael Rubino, Atsushi Fujita, Benjamin Marie | Error Identification for Machine Translation with Metric Embedding and Attention. | Eval4NLP | 2021
David Chen, Maury Courtland, Adam Faulkner, Aysu Ezen-Can | Error-Sensitive Evaluation for Ordinal Target Variables. | Eval4NLP | 2021
Neslihan Iskender, Tim Polzehl, Sebastian Möller 0001 | Best Practices for Crowd-based Evaluation of German Summarization: Comparing Crowd, Expert and Automatic Evaluation. | Eval4NLP | 2020
Kawin Ethayarajh, Dorsa Sadigh | BLEU Neighbors: A Reference-less Approach to Automatic Evaluation. | Eval4NLP | 2020
Hwanhee Lee, Seunghyun Yoon 0002, Franck Dernoncourt, Doo Soon Kim, Trung Bui, Kyomin Jung | ViLBERTScore: Evaluating Image Caption Using Vision-and-Language BERT. | Eval4NLP | 2020
Jingcheng Niu, Gerald Penn | Grammaticality and Language Modelling. | Eval4NLP | 2020
Hanna Wecker, Annemarie Friedrich, Heike Adel | ClusterDataSplit: Exploring Challenging Clustering-Based Data Splits for Model Performance Evaluation. | Eval4NLP | 2020
Rahul Jha, Keping Bi, Yang Li, Mahdi Pakdaman, Asli Celikyilmaz, Ivan Zhiboedov, Kieran McDonald | Artemis: A Novel Annotation Methodology for Indicative Single Document Summarization. | Eval4NLP | 2020
Shiran Dudy, Steven Bedrick | Are Some Words Worth More than Others? | Eval4NLP | 2020
Adam Poliak | A survey on Recognizing Textual Entailment as an NLP Evaluation. | Eval4NLP | 2020
Kiril Gashteovski, Rainer Gemulla, Bhushan Kotnis, Sven Hertling, Christian Meilicke | On Aligning OpenIE Extractions with Knowledge Bases: A Case Study. | Eval4NLP | 2020
Reda Yacouby, Dustin Axman | Probabilistic Extension of Precision, Recall, and F1 Score for More Thorough Evaluation of Classification Models. | Eval4NLP | 2020
Steffen Eger, Yang Gao 0021, Maxime Peyrard, Wei Zhao 0033, Eduard H. Hovy (eds.) | Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems, Eval4NLP 2020, Online, November 20, 2020 | Eval4NLP | 2020
João Sedoc, Lyle H. Ungar | Item Response Theory for Efficient Human Evaluation of Chatbots. | Eval4NLP | 2020
Nathan Stringham, Mike Izbicki | Evaluating Word Embeddings on Low-Resource Languages. | Eval4NLP | 2020
Jesper Brink Andersen, Mikkel Bak Bertelsen, Mikkel Hørby Schou, Manuel R. Ciosici, Ira Assent | One of these words is not like the other: a reproduction of outlier identification using non-contextual word representations. | Eval4NLP | 2020
Jacob Bremerman, Huda Khayrallah, Douglas W. Oard, Matt Post | On the Evaluation of Machine Translation n-best Lists. | Eval4NLP | 2020
Oleg V. Vasilyev 0001, Vedant Dharnidharka, John Bohannon | Fill in the BLANC: Human-free quality estimation of document summaries. | Eval4NLP | 2020
Xi Chen 0071, Nan Ding 0002, Tomer Levinboim, Radu Soricut | Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance. | Eval4NLP | 2020
Klaus-Michael Lux, Maya Sappelli, Martha A. Larson | Truth or Error? Towards systematic analysis of factual errors in abstractive summaries. | Eval4NLP | 2020
Maintained by L3S.
Previously maintained by Jörg Diederich.
Based upon DBLP by Michael Ley.
Open data: data released under the ODC-BY 1.0 license.