Hits |
Authors |
Title |
Venue |
Year |
Link |
Author keywords |
102 | Kyle Polich, Piotr J. Gmytrasiewicz |
Interactive dynamic influence diagrams. |
AAMAS |
2007 |
DBLP DOI BibTeX RDF |
|
99 | Prashant Doshi, Yifeng Zeng, Qiongyu Chen |
Graphical models for interactive POMDPs: representations and solutions. |
Auton. Agents Multi Agent Syst. |
2009 |
DBLP DOI BibTeX RDF |
Interactive POMDPs, Sequential multiagent decision making, Probabilistic graphical models |
88 | Nevin Lianwen Zhang, Weihong Zhang |
Space-Progressive Value Iteration: An Anytime Algorithm for a Class of POMDPs. |
ECSQARU |
2001 |
DBLP DOI BibTeX RDF |
|
74 | Steven D. Prestwich, Armagan Tarim, Roberto Rossi 0002, Brahim Hnich |
A Cultural Algorithm for POMDPs from Stochastic Inventory Control. |
Hybrid Metaheuristics |
2008 |
DBLP DOI BibTeX RDF |
|
74 | Makoto Tasaki, Yuichi Yabu, Yuki Iwanari, Makoto Yokoo, Milind Tambe, Janusz Marecki, Pradeep Varakantham |
Introducing Communication in Dis-POMDPs with Locality of Interaction. |
IAT |
2008 |
DBLP DOI BibTeX RDF |
|
74 | Prashant Doshi, Yifeng Zeng, Qiongyu Chen |
Graphical models for online solutions to interactive POMDPs. |
AAMAS |
2007 |
DBLP DOI BibTeX RDF |
dynamic influence diagrams, decision-making, agent modeling |
74 | Bharaneedharan Rathnasabapathy, Prashant Doshi, Piotr J. Gmytrasiewicz |
Exact solutions of interactive POMDPs using behavioral equivalence. |
AAMAS |
2006 |
DBLP DOI BibTeX RDF |
|
74 | Georgios Theocharous, Kevin P. Murphy, Leslie Pack Kaelbling |
Representing Hierarchical POMDPs as DBNs for Multi-scale Robot Localization. |
ICRA |
2004 |
DBLP DOI BibTeX RDF |
|
73 | Weihong Zhang |
Value Iteration over Belief Subspace. |
ECSQARU |
2001 |
DBLP DOI BibTeX RDF |
|
70 | Akshat Kumar, Shlomo Zilberstein |
Constraint-based dynamic programming for decentralized POMDPs with structured interactions. |
AAMAS (1) |
2009 |
DBLP BibTeX RDF |
DEC-POMDPs, multiagent planning |
70 | Frans A. Oliehoek, Nikos Vlassis |
Q-value functions for decentralized POMDPs. |
AAMAS |
2007 |
DBLP DOI BibTeX RDF |
decentralized POMDPs, planning under uncertainty, cooperative multiagent systems |
70 | Christopher Amato, Daniel S. Bernstein, Shlomo Zilberstein |
Solving POMDPs using quadratically constrained linear programs. |
AAMAS |
2006 |
DBLP DOI BibTeX RDF |
optimization, POMDPs, planning under uncertainty |
59 | Michael R. James 0001, Satinder Singh 0001 |
SarsaLandmark: an algorithm for learning in POMDPs with landmarks. |
AAMAS (1) |
2009 |
DBLP BibTeX RDF |
reinforcement learning, landmark, POMDP, partial observability |
59 | Enlu Zhou, Michael C. Fu 0001, Steven I. Marcus |
A density projection approach to dimension reduction for continuous-state POMDPs. |
CDC |
2008 |
DBLP DOI BibTeX RDF |
|
59 | Pradeep Varakantham, Janusz Marecki, Yuichi Yabu, Milind Tambe, Makoto Yokoo |
Letting loose a SPIDER on a network of POMDPs: generating quality guaranteed policies. |
AAMAS |
2007 |
DBLP DOI BibTeX RDF |
distributed POMDP, globally optimal solution, partially observable markov decision process (POMDP), multi-agent systems |
59 | Pradeep Varakantham, Rajiv T. Maheswaran, Milind Tambe |
Implementation Techniques for Solving POMDPs in Personal Assistant Agents. |
PROMAS |
2005 |
DBLP DOI BibTeX RDF |
|
59 | Chenggang Wang, James G. Schmolze |
Planning with POMDPs Using a Compact, Logic-Based Representation. |
ICTAI |
2005 |
DBLP DOI BibTeX RDF |
|
59 | Pradeep Varakantham, Rajiv T. Maheswaran, Milind Tambe |
Exploiting belief bounds: practical POMDPs for personal assistant agents. |
AAMAS |
2005 |
DBLP DOI BibTeX RDF |
meeting rescheduling, partially observable markov decision process (POMDP), task allocation |
59 | Piotr J. Gmytrasiewicz, Prashant Doshi |
Interactive POMDPs: Properties and Preliminary Results. |
AAMAS |
2004 |
DBLP DOI BibTeX RDF |
|
58 | Diego R. Pereira, Luciano V. Gonçalves, Graçaliz Pereira Dimuro, Antônio Carlos da Rocha Costa |
Towards the Self-regulation of Personality-Based Social Exchange Processes in Multiagent Systems. |
SBIA |
2008 |
DBLP DOI BibTeX RDF |
self-regulation of social exchanges, Belief-Desire-Intention, multiagent systems, social simulation, Partially Observable Markov Decision Process |
58 | Maayan Roth, Reid G. Simmons, Manuela M. Veloso |
Reasoning about joint beliefs for execution-time communication decisions. |
AAMAS |
2005 |
DBLP DOI BibTeX RDF |
communication, POMDP, distributed execution, robot teams |
55 | Jilles Steeve Dibangoye, Abdel-Illah Mouaddib, Brahim Chaib-draa |
Point-based incremental pruning heuristic for solving finite-horizon DEC-POMDPs. |
AAMAS (1) |
2009 |
DBLP BibTeX RDF |
decentralized pomdps, point-based solver, artificial intelligence, branch-and-bound, planning under uncertainty |
55 | Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J. Spaan |
Lossless clustering of histories in decentralized POMDPs. |
AAMAS (1) |
2009 |
DBLP BibTeX RDF |
decentralized POMDPs, planning under uncertainty, cooperative multiagent systems |
45 | Abdeslam Boularias, Brahim Chaib-draa |
Predictive representations for policy gradient in POMDPs. |
ICML |
2009 |
DBLP DOI BibTeX RDF |
|
45 | Noel Welsh, Jeremy L. Wyatt |
United We Stand: Population Based Methods for Solving Unknown POMDPs. |
EWRL |
2008 |
DBLP DOI BibTeX RDF |
|
45 | Finale Doshi, Joelle Pineau, Nicholas Roy |
Reinforcement learning with limited reinforcement: using Bayes risk for active learning in POMDPs. |
ICML |
2008 |
DBLP DOI BibTeX RDF |
|
45 | Jason D. Williams, S. Young |
Scaling POMDPs for Spoken Dialog Management. |
IEEE Trans. Speech Audio Process. |
2007 |
DBLP DOI BibTeX RDF |
|
45 | Anton Chechetka, Katia P. Sycara |
Subjective approximate solutions for decentralized POMDPs. |
AAMAS |
2007 |
DBLP DOI BibTeX RDF |
perception and action, coordination, cooperation, teamwork, multiagent planning |
45 | Pradeep Varakantham, Ranjit Nair, Milind Tambe, Makoto Yokoo |
Winning back the CUP for distributed POMDPs: planning over continuous belief spaces. |
AAMAS |
2006 |
DBLP DOI BibTeX RDF |
continuous initial beliefs, distributed POMDP, partially observable Markov decision process (POMDP), multi-agent systems |
44 | Masoumeh T. Izadi, Doina Precup |
Point-Based Planning for Predictive State Representations. |
Canadian AI |
2008 |
DBLP DOI BibTeX RDF |
|
44 | Milind Tambe, Emma Bowring, Hyuckchul Jung, Gal A. Kaminka, Rajiv T. Maheswaran, Janusz Marecki, Pragnesh Jay Modi, Ranjit Nair, Stephen Okamoto, Jonathan P. Pearce, Praveen Paruchuri, David V. Pynadath, Paul Scerri, Nathan Schurr, Pradeep Varakantham |
Conflicts in teamwork: hybrids to the rescue. |
AAMAS |
2005 |
DBLP DOI BibTeX RDF |
game theory, BDI, POMDP, DCOP |
40 | Matthijs T. J. Spaan, Geoffrey J. Gordon, Nikos Vlassis |
Decentralized planning under uncertainty for teams of communicating agents. |
AAMAS |
2006 |
DBLP DOI BibTeX RDF |
decentralized POMDPs, artificial intelligence, planning under uncertainty, cooperative multiagent systems |
31 | Jonathan Cohen 0001 |
Formation dynamique d'équipes dans les DEC-POMDPS ouverts à base de méthodes Monte-Carlo. (Dynamic team formation in open DEC-POMDPs with Monte-Carlo methods). |
|
2019 |
RDF |
|
31 | Christopher Amato, Daniel S. Bernstein, Shlomo Zilberstein |
Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs. |
Auton. Agents Multi Agent Syst. |
2010 |
DBLP DOI BibTeX RDF |
|
31 | Ranjit Nair, Pradeep Varakantham, Milind Tambe, Makoto Yokoo |
Networked Distributed POMDPs: A Synthesis of Distributed Constraint Optimization and POMDPs. |
AAAI |
2005 |
DBLP BibTeX RDF |
|
31 | Ranjit Nair, Pradeep Varakantham, Milind Tambe, Makoto Yokoo |
Networked Distributed POMDPs: A Synergy of Distributed Constraint Optimization and POMDPs. |
IJCAI |
2005 |
DBLP BibTeX RDF |
|
30 | Patrick Dallaire, Camille Besse, Stéphane Ross, Brahim Chaib-draa |
Bayesian reinforcement learning in continuous POMDPs with gaussian processes. |
IROS |
2009 |
DBLP DOI BibTeX RDF |
|
30 | Vikram Krishnamurthy |
Optimal Threshold Policies for Multivariate Stopping-Time POMDPs. |
ECSQARU |
2009 |
DBLP DOI BibTeX RDF |
|
30 | Christopher Amato, Shlomo Zilberstein |
Achieving goals in decentralized POMDPs. |
AAMAS (1) |
2009 |
DBLP BibTeX RDF |
|
30 | Stéphane Ross, Brahim Chaib-draa, Joelle Pineau |
Bayesian reinforcement learning in continuous POMDPs with application to robot navigation. |
ICRA |
2008 |
DBLP DOI BibTeX RDF |
|
30 | Nicholas Armstrong-Crews, Manuela M. Veloso |
An approximate algorithm for solving oracular POMDPs. |
ICRA |
2008 |
DBLP DOI BibTeX RDF |
|
30 | Abdeslam Boularias, Masoumeh T. Izadi, Brahim Chaib-draa |
Prediction-Directed Compression of POMDPs. |
ICMLA |
2008 |
DBLP DOI BibTeX RDF |
|
30 | François Laviolette, Ludovic Tobin |
A Stochastic Point-Based Algorithm for POMDPs. |
Canadian AI |
2008 |
DBLP DOI BibTeX RDF |
|
30 | Ai-Hua Bian, Chong-Jun Wang, Shifu Chen |
Preprocessing for Point-Based Algorithms of POMDPs. |
ICTAI (1) |
2008 |
DBLP DOI BibTeX RDF |
|
30 | Feng Wu 0001, Xiaoping Chen |
Solving Large-Scale and Sparse-Reward DEC-POMDPs with Correlation-MDPs. |
RoboCup |
2007 |
DBLP DOI BibTeX RDF |
|
30 | Daan Wierstra, Alexander Förster, Jan Peters 0001, Jürgen Schmidhuber |
Solving Deep Memory POMDPs with Recurrent Policy Gradients. |
ICANN (1) |
2007 |
DBLP DOI BibTeX RDF |
|
30 | Masoumeh T. Izadi, Doina Precup, Danielle Azar |
Belief Selection in Point-Based Planning Algorithms for POMDPs. |
Canadian AI |
2006 |
DBLP DOI BibTeX RDF |
|
30 | David J. Montana, Eric Van Wyk, Marshall Brinn, Joshua Montana, Stephen Milligan |
Genomic computing networks learn complex POMDPs. |
GECCO |
2006 |
DBLP DOI BibTeX RDF |
POMDP, evolutionary neural networks |
30 | Sébastien Paquet, Ludovic Tobin, Brahim Chaib-draa |
Real-Time Decision Making for Large POMDPs. |
Canadian AI |
2005 |
DBLP DOI BibTeX RDF |
|
30 | Ranjit Nair, Milind Tambe, Maayan Roth, Makoto Yokoo |
Communication for Improving Policy Computation in Distributed POMDPs. |
AAMAS |
2004 |
DBLP DOI BibTeX RDF |
|
29 | Yanjie Li, Baoqun Yin, Hongsheng Xi |
Partially Observable Markov Decision Processes and Performance Sensitivity Analysis. |
IEEE Trans. Syst. Man Cybern. Part B |
2008 |
DBLP DOI BibTeX RDF |
|
29 | Kazuteru Miyazaki, Shigenobu Kobayashi |
Proposal of Exploitation-Oriented Learning PS-r#. |
IDEAL |
2008 |
DBLP DOI BibTeX RDF |
|
29 | Daan Wierstra, Tom Schaul, Jan Peters 0001, Jürgen Schmidhuber |
Episodic Reinforcement Learning by Logistic Reward-Weighted Regression. |
ICANN (1) |
2008 |
DBLP DOI BibTeX RDF |
|
29 | Yang Xiang 0004, Franklin Hanshar |
Planning in Multiagent Expedition with Collaborative Design Networks. |
Canadian AI |
2007 |
DBLP DOI BibTeX RDF |
|
29 | Le Tien Dung, Takashi Komeda, Motoki Takagi |
Mixed Reinforcement Learning for Partially Observable Markov Decision Process. |
CIRA |
2007 |
DBLP DOI BibTeX RDF |
|
29 | Daan Wierstra, Jürgen Schmidhuber |
Policy Gradient Critics. |
ECML |
2007 |
DBLP DOI BibTeX RDF |
|
29 | Finale Doshi, Nicholas Roy |
Efficient model learning for dialog management. |
HRI |
2007 |
DBLP DOI BibTeX RDF |
human-robot interaction, decision-making under uncertainty, model learning |
29 | Deepak Verma, Rajesh P. N. Rao |
Planning and Acting in Uncertain Environments using Probabilistic Inference. |
IROS |
2006 |
DBLP DOI BibTeX RDF |
|
29 | Joelle Pineau, Geoffrey J. Gordon |
POMDP Planning for Robust Robot Control. |
ISRR |
2005 |
DBLP DOI BibTeX RDF |
|
29 | Anthony R. Cassandra, Marian H. Nodine, Shilpa Bondale, Steve Ford, David L. Wells |
Using decision-theoretic models to enhance agent system survivability. |
AAMAS |
2005 |
DBLP DOI BibTeX RDF |
|
29 | Sébastien Paquet, Ludovic Tobin, Brahim Chaib-draa |
An online POMDP algorithm for complex multiagent environments. |
AAMAS |
2005 |
DBLP DOI BibTeX RDF |
online search, POMDP |
29 | Ranjit Nair, Milind Tambe |
Coordinating Teams in Uncertain Environments: A Hybrid BDI-POMDP Approach. |
PROMAS |
2004 |
DBLP DOI BibTeX RDF |
|
29 | Rinat Khoussainov |
Towards Well-Defined Multi-agent Reinforcement Learning. |
AIMSA |
2004 |
DBLP DOI BibTeX RDF |
|
29 | David V. Pynadath, Stacy Marsella |
Fitting and Compilation of Multiagent Models through Piecewise Linear Functions. |
AAMAS |
2004 |
DBLP DOI BibTeX RDF |
|
29 | Martijn C. Schut, Michael J. Wooldridge, Simon Parsons |
On Partially Observable MDPs and BDI Models. |
Foundations and Applications of Multi-Agent Systems |
2002 |
DBLP DOI BibTeX RDF |
|
29 | Ivo Kwee, Marcus Hutter, Jürgen Schmidhuber |
Market-Based Reinforcement Learning in Partially Observable Worlds. |
ICANN |
2001 |
DBLP DOI BibTeX RDF |
|
29 | Martin Mundhenk, Judy Goldsmith, Eric Allender |
The Complexity of Policy Evaluation for Finite-Horizon Partially-Observable Markov Decision Processes. |
MFCS |
1997 |
DBLP DOI BibTeX RDF |
|
25 | Stéphane Ross, Masoumeh T. Izadi, Mark Mercer, David L. Buckeridge |
Sensitivity Analysis of POMDP Value Functions. |
ICMLA |
2009 |
DBLP DOI BibTeX RDF |
Value Function Error Bound, Perturbation Analysis, POMDPs |
25 | Simon Andrew Williamson, Enrico H. Gerding, Nicholas R. Jennings |
Reward shaping for valuing communications during multi-agent coordination. |
AAMAS (1) |
2009 |
DBLP BibTeX RDF |
decentralised POMDPs, communication, agents |
25 | Xi-Ren Cao |
Basic Ideas for Event-Based Optimization of Markov Systems. |
Discret. Event Dyn. Syst. |
2005 |
DBLP DOI BibTeX RDF |
Markov decision processes (MDPs), performance potentials, policy gradients, aggregation, perturbation analysis, POMDPs, policy iteration |
16 | Or Wertheim, Dan R. Suissa, Ronen I. Brafman |
Plug'n Play Task-Level Autonomy for Robotics Using POMDPs and Probabilistic Programs. |
IEEE Robotics Autom. Lett. |
2024 |
DBLP DOI BibTeX RDF |
|
16 | Daniele Meli, Alberto Castellini, Alessandro Farinelli |
Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach. |
J. Artif. Intell. Res. |
2024 |
DBLP DOI BibTeX RDF |
|
16 | J.-Anne Yow, Neha Priyadarshini Garg, Wei Tech Ang |
Shared Autonomy of a Robotic Manipulator for Grasping Under Human Intent Uncertainty Using POMDPs. |
IEEE Trans. Robotics |
2024 |
DBLP DOI BibTeX RDF |
|
16 | Daniele Meli, Alberto Castellini, Alessandro Farinelli |
Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
16 | Johan Peralez, Aurélien Delage, Olivier Buffet, Jilles Steeve Dibangoye |
Solving Hierarchical Information-Sharing Dec-POMDPs: An Extensive-Form Game Approach. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
16 | Michael Lanier, Ying Xu, Nathan Jacobs, Chongjie Zhang, Yevgeniy Vorobeychik |
Learning Interpretable Policies in Hindsight-Observable POMDPs through Partially Supervised Reinforcement Learning. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
16 | Huifan Gao, Yifeng Zeng, Yinghui Pan |
Inducing Individual Students' Learning Strategies through Homomorphic POMDPs. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
16 | Yannick Eich, Bastian Alt, Heinz Koeppl |
Approximate Control for Continuous-Time POMDPs. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
16 | Maris F. L. Galesloot, Thiago D. Simão, Sebastian Junges, Nils Jansen 0001 |
Factored Online Planning in Many-Agent POMDPs. |
AAAI |
2024 |
DBLP DOI BibTeX RDF |
|
16 | Yannick Eich, Bastian Alt, Heinz Koeppl |
Approximate Control for Continuous-Time POMDPs. |
AISTATS |
2024 |
DBLP BibTeX RDF |
|
16 | Wei Zheng, Hai Lin 0002 |
Provable-Correct Partitioning Approach for Continuous-Observation POMDPs With Special Observation Distributions. |
IEEE Control. Syst. Lett. |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Manav Vora, Pranay Thangeda, Michael N. Grussing, Melkior Ornik |
Welfare Maximization Algorithm for Solving Budget-Constrained Multi-Component POMDPs. |
IEEE Control. Syst. Lett. |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Marijana Peti, Frano Petric, Stjepan Bogdan |
Decentralized Coordination of Multi-Agent Systems Based on POMDPs and Consensus for Active Perception. |
IEEE Access |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Giacomo Arcieri, Cyprien Hoelzl, Oliver Schwery, Daniel Straub, Konstantinos G. Papakonstantinou, Eleni N. Chatzi |
Bridging POMDPs and Bayesian decision making for robust maintenance planning under model uncertainty: An application to railway systems. |
Reliab. Eng. Syst. Saf. |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Junchao Li, Mingyu Cai, Zhaoan Wang, Shaoping Xiao |
Model-based motion planning in POMDPs with temporal logic specifications. |
Adv. Robotics |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Moran Barenboim, Moshe Shienman, Vadim Indelman |
Monte Carlo Planning in Hybrid Belief POMDPs. |
IEEE Robotics Autom. Lett. |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Franck Djeumou, Christian Ellis, Murat Cubuktepe, Craig Lennon, Ufuk Topcu |
Task-guided IRL in POMDPs that scales. |
Artif. Intell. |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Majid Khonji |
Approximability and efficient algorithms for constrained fixed-horizon POMDPs with durative actions. |
Artif. Intell. |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Timothy L. Molloy, Girish N. Nair |
Smoother Entropy for Active State Trajectory Estimation and Obfuscation in POMDPs. |
IEEE Trans. Autom. Control. |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Victor Cohen, Axel Parmentier |
Future memories are not needed for large classes of POMDPs. |
Oper. Res. Lett. |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Michael H. Lim, Tyler J. Becker, Mykel J. Kochenderfer, Claire J. Tomlin, Zachary N. Sunberg |
Optimality Guarantees for Particle Belief Approximation of POMDPs. |
J. Artif. Intell. Res. |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Maris F. L. Galesloot, Thiago D. Simão, Sebastian Junges, Nils Jansen 0001 |
Factored Online Planning in Many-Agent POMDPs. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Hai Nguyen, Sammie Katt, Yuchen Xiao, Christopher Amato |
On-Robot Bayesian Reinforcement Learning for POMDPs. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Rui Yan 0002, Gabriel Santos, Gethin Norman, David Parker 0001, Marta Kwiatkowska |
Point-based Value Iteration for Neuro-Symbolic POMDPs. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Jonathan N. Lee, Alekh Agarwal, Christoph Dann, Tong Zhang 0001 |
Learning in POMDPs is Sample-Efficient with Hindsight Observability. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Thiago D. Simão, Marnix Suilen, Nils Jansen 0001 |
Safe Policy Improvement for POMDPs via Finite-State Controllers. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Manav Vora, Pranay Thangeda, Michael N. Grussing, Melkior Ornik |
Welfare Maximization Algorithm for Solving Budget-Constrained Multi-Component POMDPs. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Marcus Hörger, Hanna Kurniawati, Dirk P. Kroese, Nan Ye |
Adaptive Discretization using Voronoi Trees for Continuous POMDPs. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Soichiro Nishimori, Sotetsu Koyamada, Shin Ishii |
End-to-End Policy Gradient Method for POMDPs and Explainable Agents. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
16 | Roman Andriushchenko, Alexander Bork, Milan Ceska 0002, Sebastian Junges, Joost-Pieter Katoen, Filip Macák |
Search and Explore: Symbiotic Policy Synthesis in POMDPs. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|