Teaching Large Language Models to Reason with Reinforcement Learning. | D2R Server publishing the DBLP Bibliography Database, hosted at L3S Research Center

Property	Value
dcterms:bibliographicCitation	<http://dblp.uni-trier.de/rec/bibtex/journals/corr/abs-2403-04642>
dc:creator	<https://dblp.l3s.de/d2r/resource/authors/Alex_Havrilla>
dc:creator	<https://dblp.l3s.de/d2r/resource/authors/Christoforos_Nalmpantis>
dc:creator	<https://dblp.l3s.de/d2r/resource/authors/Eric_Hambro>
dc:creator	<https://dblp.l3s.de/d2r/resource/authors/Jane_Dwivedi-Yu>
dc:creator	<https://dblp.l3s.de/d2r/resource/authors/Maksym_Zhuravinskyi>
dc:creator	<https://dblp.l3s.de/d2r/resource/authors/Roberta_Raileanu>
dc:creator	<https://dblp.l3s.de/d2r/resource/authors/Sainbayar_Sukhbaatar>
dc:creator	<https://dblp.l3s.de/d2r/resource/authors/Sharath_Chandra_Raparthy>
dc:creator	<https://dblp.l3s.de/d2r/resource/authors/Yuqing_Du>
foaf:homepage	<http://dx.doi.org/doi.org%2F10.48550%2FarXiv.2403.04642>
foaf:homepage	<https://doi.org/10.48550/arXiv.2403.04642>
dc:identifier	DBLP journals/corr/abs-2403-04642 (xsd:string)
dc:identifier	DOI doi.org/10.48550/arXiv.2403.04642 (xsd:string)
dcterms:issued	2024 (xsd:gYear)
swrc:journal	<https://dblp.l3s.de/d2r/resource/journals/corr>
rdfs:label	Teaching Large Language Models to Reason with Reinforcement Learning. (xsd:string)
foaf:maker	<https://dblp.l3s.de/d2r/resource/authors/Alex_Havrilla>
foaf:maker	<https://dblp.l3s.de/d2r/resource/authors/Christoforos_Nalmpantis>
foaf:maker	<https://dblp.l3s.de/d2r/resource/authors/Eric_Hambro>
foaf:maker	<https://dblp.l3s.de/d2r/resource/authors/Jane_Dwivedi-Yu>
foaf:maker	<https://dblp.l3s.de/d2r/resource/authors/Maksym_Zhuravinskyi>
foaf:maker	<https://dblp.l3s.de/d2r/resource/authors/Roberta_Raileanu>
foaf:maker	<https://dblp.l3s.de/d2r/resource/authors/Sainbayar_Sukhbaatar>
foaf:maker	<https://dblp.l3s.de/d2r/resource/authors/Sharath_Chandra_Raparthy>
foaf:maker	<https://dblp.l3s.de/d2r/resource/authors/Yuqing_Du>
owl:sameAs	<http://bibsonomy.org/uri/bibtexkey/journals/corr/abs-2403-04642/dblp>
owl:sameAs	<http://dblp.rkbexplorer.com/id/journals/corr/abs-2403-04642>
rdfs:seeAlso	<http://dblp.uni-trier.de/db/journals/corr/corr2403.html#abs-2403-04642>
rdfs:seeAlso	<https://doi.org/10.48550/arXiv.2403.04642>
dc:title	Teaching Large Language Models to Reason with Reinforcement Learning. (xsd:string)
dc:type	<http://purl.org/dc/dcmitype/Text>
rdf:type	swrc:Article
rdf:type	foaf:Document
swrc:volume	abs/2403.04642 (xsd:string)