Searching for "reward" with no syntactic query expansion in all metadata.

Publication years (Num. hits)
1967-1985 (17) 1986-1990 (17) 1991-1993 (24) 1994-1995 (22) 1996 (16) 1997-1998 (31) 1999 (27) 2000 (31) 2001 (49) 2002 (63) 2003 (54) 2004 (91) 2005 (98) 2006 (149) 2007 (170) 2008 (164) 2009 (123) 2010 (93) 2011 (81) 2012 (89) 2013 (107) 2014 (86) 2015 (105) 2016 (125) 2017 (159) 2018 (195) 2019 (268) 2020 (321) 2021 (403) 2022 (471) 2023 (585) 2024 (185)
Publication types (Num. hits)
article(2178) incollection(26) inproceedings(2189) mastersthesis(1) phdthesis(25)
Venues (Conferences, Journals, ...)
CoRR(908) NeuroImage(175) AAMAS(88) ICML(74) NeurIPS(73) AAAI(71) J. Cogn. Neurosci.(52) IJCNN(46) IJCAI(40) IEEE Access(38) ICRA(36) ICLR(34) HICSS(29) IROS(28) CogSci(24) AISTATS(22) More (+10 of total 1273)

Results
Found 4420 publication records. Showing 4419 according to the selection in the facets
Authors | Title | Venue | Year
Kamyar Azizzadenesheli, Trung Dang, Aranyak Mehta, Alexandros Psomas 0001, Qian Zhang | Reward Selection with Noisy Observations. | CoRR | 2023
David Yunis, Justin Jung, Falcon Z. Dai, Matthew R. Walter | Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning. | CoRR | 2023
Jiwon Kim, Moon-Ju Kang, KangHun Lee, HyungJun Moon, Bo-Kwan Jeon | Deep Reinforcement Learning for Asset Allocation: Reward Clipping. | CoRR | 2023
Sahana Ramnath, Brihi Joshi, Skyler Hallinan, Ximing Lu, Liunian Harold Li, Aaron Chan, Jack Hessel, Yejin Choi 0001, Xiang Ren 0001 | Tailoring Self-Rationalizers with Multi-Reward Distillation. | CoRR | 2023
Dapeng Zhi, Peixin Wang, Cheng Chen, Min Zhang 0002 | Robustness Verification of Deep Reinforcement Learning Based Control Systems using Reward Martingales. | CoRR | 2023
Haolin Ruan, Zhi Chen, Chin Pang Ho | Risk-Averse MDPs under Reward Ambiguity. | CoRR | 2023
Michael Kölle 0001, Tim Matheis, Philipp Altmann, Kyrill Schmid | Learning to Participate through Trading of Reward Shares. | CoRR | 2023
Ziang Song, Tianle Cai, Jason D. Lee, Weijie J. Su | Reward Collapse in Aligning Large Language Models. | CoRR | 2023
Dhawal Gupta, Yash Chandak, Scott M. Jordan, Philip S. Thomas, Bruno Castro da Silva | Behavior Alignment via Reward Function Optimization. | CoRR | 2023
Tianchi Cai, Shenliao Bao, Jiyan Jiang, Shiji Zhou, Wenpeng Zhang 0003, Lihong Gu, Jinjie Gu, Guannan Zhang | Model-free Reinforcement Learning with Stochastic Reward Stabilization for Recommender Systems. | CoRR | 2023
Wenhao Lu, Sven Magg, Xufeng Zhao, Martin Gromniak, Stefan Wermter | A Closer Look at Reward Decomposition for High-Level Robotic Explanations. | CoRR | 2023
Patrik Keller | Parallel Proof-of-Work with DAG-Style Voting and Targeted Reward Discounting. | CoRR | 2023
Elizabeth Bates, Vasilios Mavroudis, Chris Hicks | Reward Shaping for Happier Autonomous Cyber Security Agents. | CoRR | 2023
Washim Uddin Mondal, Vaneet Aggarwal | Reinforcement Learning with Delayed, Composite, and Partially Anonymous Reward. | CoRR | 2023
Hong-Peng Zhang | Maneuver Decision-Making Through Automatic Curriculum Reinforcement Learning Without Handcrafted Reward functions. | CoRR | 2023
Hongzheng Yang, Cheng Chen, Yueyao Chen, Markus Scheppach, Hon-Chi Yip, Qi Dou 0001 | Uncertainty Estimation for Safety-critical Scene Segmentation via Fine-grained Reward Maximization. | CoRR | 2023
Zishan Ahmad, Suman Saurabh, Vaishakh Sreekanth Menon, Asif Ekbal, Roshni R. Ramnani, Anutosh Maitra | INA: An Integrative Approach for Enhancing Negotiation Strategies with Reward-Based Dialogue System. | CoRR | 2023
Peter Barnett, Rachel Freedman, Justin Svegliato, Stuart Russell 0001 | Active Reward Learning from Multiple Teachers. | CoRR | 2023
Lei Li 0040, Yekun Chai, Shuohuan Wang, Yu Sun, Hao Tian, Ningyu Zhang 0001, Hua Wu | Tool-Augmented Reward Modeling. | CoRR | 2023
Bryan Brandt, Prithviraj Dasgupta | Synthetically Generating Human-like Data for Sequential Decision Making Tasks via Reward-Shaped Imitation Learning. | CoRR | 2023
Shuai Zhao 0006, Xiaohan Wang, Linchao Zhu, Yi Yang 0001 | Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models. | CoRR | 2023
Sruthi Rachamalla, Henry Hexmoor | Driver Safety Reward with Cooperative Platooning using Blockchain. | CoRR | 2023
Jiuzhou Han, Wray L. Buntine, Ehsan Shareghi | Reward Engineering for Generating Semi-structured Explanation. | CoRR | 2023
Fan-Ming Luo, Tian Xu, Xingchen Cao, Yang Yu 0001 | Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning. | CoRR | 2023
Sayak Ray Chowdhury, Xingyu Zhou 0001, Nagarajan Natarajan | Differentially Private Reward Estimation with Preference Feedback. | CoRR | 2023
Mihir Prabhudesai, Anirudh Goyal, Deepak Pathak, Katerina Fragkiadaki | Aligning Text-to-Image Diffusion Models with Reward Backpropagation. | CoRR | 2023
Zaifan Jiang, Xing Huang, Chao Wei | Preference as Reward, Maximum Preference Optimization with Importance Sampling. | CoRR | 2023
Xianjie Zhang, Jiahao Sun, Chen Gong 0005, Kai Wang, Yifei Cao, Hao Chen, Yu Liu | Mutual Information as Intrinsic Reward of Reinforcement Learning Agents for On-demand Ride Pooling. | CoRR | 2023
Yuki Oyama | Global path preference and local response: A reward decomposition approach for network path choice analysis in the presence of locally perceived attributes. | CoRR | 2023
Youjia Zhang, Pingzhong Tang | Collusion-proof And Sybil-proof Reward Mechanisms For Query Incentive Networks. | CoRR | 2023
Lauren H. Cooke, Harvey Klyne, Edwin Zhang, Cassidy Laidlaw, Milind Tambe, Finale Doshi-Velez | Toward Computationally Efficient Inverse Reinforcement Learning via Reward Shaping. | CoRR | 2023
Cevahir Köprülü, Ufuk Topcu | Reward-Machine-Guided, Self-Paced Reinforcement Learning. | CoRR | 2023
Akansha Kalra, Daniel S. Brown | Can Differentiable Decision Trees Learn Interpretable Reward Functions? | CoRR | 2023
Gen Li 0005, Yuling Yan, Yuxin Chen 0002, Jianqing Fan | Minimax-Optimal Reward-Agnostic Exploration in Reinforcement Learning. | CoRR | 2023
Daniel Shin, Anca D. Dragan, Daniel S. Brown | Benchmarks and Algorithms for Offline Preference-Based Reward Learning. | CoRR | 2023
Panagiotis Liampas | Risk-averse Batch Active Inverse Reward Design. | CoRR | 2023
Wesley A. Suttle, Amrit Singh Bedi, Bhrij Patel, Brian M. Sadler, Alec Koppel, Dinesh Manocha | Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic. | CoRR | 2023
Yecheng Jason Ma, William Liang, Guanzhi Wang, De-An Huang, Osbert Bastani, Dinesh Jayaraman, Yuke Zhu, Linxi Fan, Anima Anandkumar | Eureka: Human-Level Reward Design via Coding Large Language Models. | CoRR | 2023
Chendi Qu, Jianping He 0001, Xiaoming Duan, Jiming Chen 0001 | Inverse Reinforcement Learning with Unknown Reward Model based on Structural Risk Minimization. | CoRR | 2023
Sukai Huang, Nir Lipovetzky, Trevor Cohn | A Reminder of its Brittleness: Language Reward Shaping May Hinder Learning for Instruction Following Agents. | CoRR | 2023
Andrea Soltoggio, Eseoghene Ben-Iwhiwhu, Christos Peridis, Pawel Ladosz, Jeffery Dick, Praveen K. Pilly, Soheil Kolouri | The configurable tree graph (CT-graph): measurable problems in partially observable and distal reward environments for lifelong reinforcement learning. | CoRR | 2023
Bhargav Ganguly, Vaneet Aggarwal | Quantum Acceleration of Infinite Horizon Average-Reward Reinforcement Learning. | CoRR | 2023
Changhun Lee, Chiehyeon Lim | A Bi-objective Perspective on Controllable Language Models: Reward Dropout Improves Off-policy Control Performance. | CoRR | 2023
Yiliu Wang, Wei Chen, Milan Vojnovic | Combinatorial Bandits for Maximum Value Reward Function under Max Value-Index Feedback. | CoRR | 2023
Ekdeep Singh Lubana, Johann Brehmer, Pim de Haan, Taco Cohen | FoMo Rewards: Can we cast foundation models as reward functions? | CoRR | 2023
Qisen Yang, Huanqian Wang, Mukun Tong, Wenjie Shi, Gao Huang, Shiji Song | Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning. | CoRR | 2023
Ali Baheri | Understanding Reward Ambiguity Through Optimal Transport Theory in Inverse Reinforcement Learning. | CoRR | 2023
Washim Uddin Mondal, Vaneet Aggarwal | Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes. | CoRR | 2023
Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh | Reward Design with Language Models. | CoRR | 2023
Zihan Zhang, Qiaomin Xie | Sharper Model-free Reinforcement Learning for Average-reward Markov Decision Processes. | CoRR | 2023
Tao Huang, Guangqi Jiang, Yanjie Ze, Huazhe Xu | Diffusion Reward: Learning Rewards via Conditional Video Diffusion. | CoRR | 2023
Boyuan Zheng, Jianlong Zhou, Fang Chen 0001 | Genetic Imitation Learning by Reward Extrapolation. | CoRR | 2023
Peeyush Kumar | Reward Shaping via Diffusion Process in Reinforcement Learning. | CoRR | 2023
Hao Jiang, Tien Mai, Pradeep Varakantham | Solving Constrained Reinforcement Learning through Augmented State and Reward Penalties. | CoRR | 2023
Krishnendu Chatterjee, Ehsan Kafshdar Goharshady, Mehrdad Karrabi, Petr Novotný 0001, Dorde Zikelic | Solving Long-run Average Reward Robust MDPs via Stochastic Games. | CoRR | 2023
Haoyan Yang, Zhitao Li, Yong Zhang, Jianzong Wang, Ning Cheng 0001, Ming Li, Jing Xiao 0006 | PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter. | CoRR | 2023
Aaron Nicolson, Jason Dowling, Bevan Koopman | Longitudinal Data and a Semantic Similarity Reward for Chest X-Ray Report Generation. | CoRR | 2023
Haoxin Lin, Hongqiu Wu, Jiaji Zhang, Yihao Sun, Junyin Ye, Yang Yu | Episodic Return Decomposition by Difference of Implicitly Assigned Sub-Trajectory Reward. | CoRR | 2023
Yashaswini Murthy, Mehrdad Moharrami, R. Srikant 0001 | Performance Bounds for Policy-Based Average Reward Reinforcement Learning Algorithms. | CoRR | 2023
Benjamin D. Kraske, Anshu Saksena, Anna L. Buczak, Zachary N. Sunberg | Explanation through Reward Model Reconciliation using POMDP Tree Search. | CoRR | 2023
Hanze Dong, Wei Xiong 0015, Deepanshu Goyal, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, Tong Zhang 0001 | RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment. | CoRR | 2023
Leo Ardon, Daniel Furelos-Blanco, Alessandra Russo | Learning Reward Machines in Cooperative Multi-Agent Tasks. | CoRR | 2023
Juan Rocamonde, Victoriano Montesinos, Elvis Nava, Ethan Perez, David Lindner | Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning. | CoRR | 2023
Guy Azran, Mohamad H. Danesh, Stefano V. Albrecht, Sarah Keren | Contextual Pre-Planning on Reward Machine Abstractions for Enhanced Transfer in Deep Reinforcement Learning. | CoRR | 2023
Vijay Shankaran Vivekanand, Rajkumar Kubendran | Custom DNN using Reward Modulated Inverted STDP Learning for Temporal Pattern Recognition. | CoRR | 2023
Kunyang Lin, Yufeng Wang, Peihao Chen, Runhao Zeng, Siyuan Zhou, Mingkui Tan, Chuang Gan | DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning. | CoRR | 2023
Yinchuan Li, Zhigang Li, Wenqian Li, Yunfeng Shao 0001, Yan Zheng, Jianye Hao | Generative Flow Networks for Precise Reward-Oriented Active Learning on Graphs. | CoRR | 2023
Shintaro Ueki, Fujio Toriumi, Toshiharu Sugawara | Effect of Monetary Reward on Users' Individual Strategies Using Co-Evolutionary Learning. | CoRR | 2023
Lina Mezghani, Sainbayar Sukhbaatar, Piotr Bojanowski, Alessandro Lazaric, Karteek Alahari | Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping. | CoRR | 2023
Yudi Zhang 0007, Yali Du 0001, Biwei Huang, Ziyan Wang, Jun Wang 0012, Meng Fang, Mykola Pechenizkiy | GRD: A Generative Approach for Interpretable Reward Redistribution in Reinforcement Learning. | CoRR | 2023
Cansu Sancaktar, Justus H. Piater, Georg Martius | Regularity as Intrinsic Reward for Free Play. | CoRR | 2023
Feiyang Wu, Zhaoyuan Gu, Hanran Wu, Anqi Wu, Ye Zhao 0002 | Infer and Adapt: Bipedal Locomotion Reward Learning from Demonstrations via Inverse Reinforcement Learning. | CoRR | 2023
Chengyang Ying, Zhongkai Hao, Xinning Zhou, Hang Su 0006, Songming Liu, Jialian Li, Dong Yan, Jun Zhu 0001 | Reward Informed Dreamer for Task Generalization in Reinforcement Learning. | CoRR | 2023
Nikolina Covic, Jochen Cremer, Hrvoje Pandzic | Learning a Reward Function for User-Preferred Appliance Scheduling. | CoRR | 2023
Hao Li, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu 0004, Zhen-Qiu Feng, Xiao-Yin Liu, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Bo-Xian Yao, Zeng-Guang Hou | CROP: Conservative Reward for Model-based Offline Policy Optimization. | CoRR | 2023
Hadar Schreiber Galler, Tom Zahavy, Guillaume Desjardins, Alon Cohen | APART: Diverse Skill Discovery using All Pairs with Ascending Reward and DropouT. | CoRR | 2023
Gen Li 0005, Wenhao Zhan, Jason D. Lee, Yuejie Chi, Yuxin Chen 0002 | Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning. | CoRR | 2023
Yuan Cheng, Ruiquan Huang, Jing Yang 0002, Yingbin Liang | Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs. | CoRR | 2023
Lin-Chi Wu, Zengjie Zhang, Sofie Haesaert, Zhiqiang Ma 0001, Zhiyong Sun | Risk-Aware Reward Shaping of Reinforcement Learning Agents for Autonomous Driving. | CoRR | 2023
Uri Gadot, Esther Derman, Navdeep Kumar, Maxence Mohamed Elfatihi, Kfir Levy, Shie Mannor | Solving Non-Rectangular Reward-Robust MDPs via Frequency Regularization. | CoRR | 2023
Joar Skalse, Lucy Farnik, Sumeet Ramesh Motwani, Erik Jenner, Adam Gleave, Alessandro Abate | STARC: A General Framework For Quantifying Differences Between Reward Functions. | CoRR | 2023
Xuzhe Dang, Stefan Edelkamp, Nicolas Ribault | CLIP-Motion: Learning Reward Functions for Robotic Actions Using Consecutive Observations. | CoRR | 2023
Ted Moskovitz, Aaditya K. Singh, DJ Strouse, Tuomas Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, Stephen McAleer | Confronting Reward Model Overoptimization with Constrained RLHF. | CoRR | 2023
Vivek Myers, Erdem Biyik, Dorsa Sadigh | Active Reward Learning from Online Preferences. | CoRR | 2023
Chaoyi Gu, Varuna De Silva, Corentin Artaud, Rafael Pina | Embedding Contextual Information through Reward Shaping in Multi-Agent Learning: A Case Study from Google Football. | CoRR | 2023
Jueming Hu, Jean-Raphaël Gaglione, Yanze Wang, Zhe Xu 0005, Ufuk Topcu, Yongming Liu | Reinforcement Learning With Reward Machines in Stochastic Games. | CoRR | 2023
Firas Al-Hafez, Davide Tateo, Oleg Arenz, Guoping Zhao, Jan Peters 0001 | LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning. | CoRR | 2023
Souradip Chakraborty, Amisha Bhaskar, Anukriti Singh, Pratap Tokekar, Dinesh Manocha, Amrit Singh Bedi | REBEL: A Regularization-Based Solution for Reward Overoptimization in Reinforcement Learning from Human Feedback. | CoRR | 2023
Siyuan Li, Weiyang Jin, Zedong Wang, Fang Wu, Zicheng Liu 0006, Cheng Tan 0012, Stan Z. Li | SemiReward: A General Reward Model for Semi-supervised Learning. | CoRR | 2023
Roberto Cipollone 0002, Giuseppe De Giacomo, Marco Favorito, Luca Iocchi, Fabio Patrizi | Exploiting Multiple Abstractions in Episodic RL via Reward Shaping. | CoRR | 2023
Dingwen Kong, Lin F. Yang | Provably Feedback-Efficient Reinforcement Learning via Active Reward Learning. | CoRR | 2023
Yihao Feng, Shentao Yang, Shujian Zhang, Jianguo Zhang 0005, Caiming Xiong, Mingyuan Zhou, Huan Wang | Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems. | CoRR | 2023
Yue Wang 0068, Alvaro Velasquez, George K. Atia, Ashley Prater-Bennette, Shaofeng Zou | Model-Free Robust Average-Reward Reinforcement Learning. | CoRR | 2023
Mingqi Yuan, Bo Li 0037, Xin Jin, Wenjun Zeng | Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning. | CoRR | 2023
Philipp Altmann, Thomy Phan, Fabian Ritz, Thomas Gabor, Claudia Linnhoff-Popien | DIRECT: Learning from Sparse and Shifting Rewards using Discriminative Reward Co-Training. | CoRR | 2023
Ali Abedi 0009, Hossein Karshenas, Peyman Adibi | Multi-modal reward for visual relationships-based image captioning. | CoRR | 2023
John Kliem, Prithviraj Dasgupta | Reward Shaping for Improved Learning in Real-time Strategy Game Play. | CoRR | 2023
Keming Lu, Hongyi Yuan, Runji Lin, Junyang Lin, Zheng Yuan 0002, Chang Zhou, Jingren Zhou | Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models. | CoRR | 2023
Kush Bhatia, Wenshuo Guo, Jacob Steinhardt | Reward Learning as Doubly Nonparametric Bandits: Optimal Design and Scaling Laws. | CoRR | 2023
Ziyuan Cao, Reshma Anugundanahalli Ramachandra, Kelin Yu | Temporal Video-Language Alignment Network for Reward Shaping in Reinforcement Learning. | CoRR | 2023
Displaying results #801 - #900 of 4419 (100 per page)
Maintained by L3S.
Previously maintained by Jörg Diederich.
Based upon DBLP by Michael Ley.
Open data: released under the ODC-BY 1.0 license.