Yuu Jinnai



I am a researcher at CyberAgent AI Lab, focusing on natural language processing, reinforcement learning, and language model alignment. My work ranges from theoretical foundations to practical applications in text generation and AI safety.

Contact: ddyuudd [at] gmail [dot] com
Links: Google Scholar | GitHub | CV

Biography

Research Interests

Artificial Intelligence, Reinforcement Learning, Language Model Alignment, Text Generation, Classical Planning, Heuristic Search

Current Research Focus

Selected Publications

View all publications on Google Scholar

Highlighted Work

   
Best Paper Award: Jinnai Y. 2024. Does Cross-Cultural Alignment Change the Commonsense Morality of Language Models? C3NLP Workshop at ACL 2024. PAPER TALK
Jinnai Y, et al. 2024. Model-based Minimum Bayes Risk Decoding. ICML-24. PAPER CODE TALK
Jinnai Y, et al. 2020. Exploration in Reinforcement Learning with Deep Covering Options. ICLR-20. PAPER

All Publications

2025

   
Yuu Jinnai and Ukyo Honda. 2025. Annotation-Efficient Preference Optimization for Language Model Alignment. In Findings of the Association for Computational Linguistics (EMNLP-25 Findings). PAPER CODE TALK
Yuki Ichihara, Yuu Jinnai, Kaito Ariu, Tetsuro Morimura, Eiji Uchibe. 2025. Theoretical Guarantees for Minimum Bayes Risk Decoding. Annual Meeting of the Association for Computational Linguistics (ACL-25). PAPER
Ayuto Tsutsumi, Yuu Jinnai. 2025. Do Large Language Models Know Folktales? A Case Study of Yokai in Japanese Folktales. In Findings of the Association for Computational Linguistics (ACL-25 Findings). PAPER CODE DATASET
Yuu Jinnai. 2025. Document-Level Text Generation with Minimum Bayes Risk Decoding using Optimal Transport. Annual Meeting of the Association for Computational Linguistics (ACL-25). PAPER CODE TALK
Ichihara, Y., Jinnai, Y., Morimura, T., Ariu, K., Abe, K., Sakamoto, M., & Uchibe, E. (2025). Evaluation of Best-of-N Sampling Strategies for Language Model Alignment. Transactions on Machine Learning Research (TMLR). PAPER CODE TALK
Jinnai, Y., Morimura, T., Ariu, K., & Abe, K. (2025). Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment. 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL-25). PAPER CODE TALK

2024

   
Morimura, T., Sakamoto, M., Jinnai, Y., Abe, K., & Ariu, K. (2024). Filtered Direct Preference Optimization. The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-24). PAPER CODE
Jinnai Y. 2024. Does Cross-Cultural Alignment Change the Commonsense Morality of Language Models? Proceedings of the 2nd Workshop on Cross-Cultural Considerations in NLP (C3NLP Workshop at ACL 2024). Best Paper Award. PAPER TALK MODEL DATASET
Jinnai Y, Morimura T, Honda U, Ariu K, Abe K. 2024. Model-based Minimum Bayes Risk Decoding. Proc. 41st International Conference on Machine Learning (ICML-24). PAPER CODE TALK
Jinnai Y, Ariu K. 2024. Hyperparameter-Free Approach for Faster Minimum Bayes Risk Decoding. In Findings of the Association for Computational Linguistics (ACL-24 Findings). PAPER CODE TALK
Jinnai Y, Honda U, Morimura T, Zhang P. 2024. Generating Diverse and High-Quality Texts by Minimum Bayes Risk Decoding. In Findings of the Association for Computational Linguistics (ACL-24 Findings). PAPER CODE TALK
Ohashi A, Honda U, Morimura T, Jinnai Y. 2024. On the True Distribution Approximation of Minimum Bayes-Risk Decoding. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-24). PAPER CODE TALK

2021

   
Lecarpentier E, Abel D, Asadi K, Jinnai Y, Rachelson E, Littman ML. 2021. Lipschitz Lifelong Reinforcement Learning. Proc. 35th AAAI Conference on Artificial Intelligence (AAAI-21). arXiv Poster CODE

2020

   
Jinnai Y, Park J, Machado MC, Konidaris GD. 2020. Exploration in Reinforcement Learning with Deep Covering Options. Proceedings of the Eighth International Conference on Learning Representations (ICLR-20). PAPER
Wang L*, Zhao Y*, Jinnai Y, Tian Y, Fonseca R. 2020. AlphaX: eXploring Neural Architectures with Deep Neural Networks and Monte Carlo Tree Search. Proc. 34th AAAI Conference on Artificial Intelligence (AAAI-20). *These authors contributed equally to this work. PAPER CODE

2019

   
Jinnai Y, Park JW, Abel D, Konidaris G. 2019. Discovering Options for Exploration by Minimizing Cover Time. Proc. 36th International Conference on Machine Learning (ICML-19). PAPER CODE TALK
Jinnai Y, Abel D, Hershkowitz E, Littman M, Konidaris G. 2019. Finding Options that Minimize Planning Time. Proc. 36th International Conference on Machine Learning (ICML-19). PAPER CODE TALK
Jinnai Y, Abel D, Park JW, Hershkowitz E, Littman M, Konidaris G. 2019. Skill Discovery with Well-Defined Objectives. ICLR Workshop on Structure and Priors in Reinforcement Learning. PAPER
Abel D, Arumugam D, Asadi K, Jinnai Y, Littman M, Wong LS. 2019. State Abstraction as Compression in Apprenticeship Learning. Proc. 33rd AAAI Conference on Artificial Intelligence (AAAI-19). PAPER CODE

2018

   
Abel D*, Jinnai Y*, Guo Y, Konidaris G, Littman M. 2018. Policy and Value Transfer for Lifelong Reinforcement Learning. Proc. 35th International Conference on Machine Learning (ICML-18). *These authors contributed equally to this work. PAPER POSTER CODE TALK by D. Abel
Fukunaga A, Botea A, Jinnai Y, Kishimoto A. 2018. Parallel A* for State-Space Search. Handbook of Parallel Constraint Reasoning, Youssef Hamadi, Lakhdar Sais (eds.), Springer. ISBN 978-3-319-63515-6. BOOK

2017

   
Jinnai Y, Fukunaga A. 2017. A Graph-Partitioning Based Approach for Parallel Best-First Search. ICAPS 2017 Workshop on Heuristics and Search for Domain-Independent Planning (HSDIP). PAPER SLIDES CODE
Jinnai Y, Fukunaga A. 2017. Learning to Prune Dominated Action Sequences in Online Black-box Planning. Proc. 31st AAAI Conference on Artificial Intelligence (AAAI-17). PAPER SLIDES CODE
Jinnai Y, Fukunaga A. 2017. On Hash-Based Work Distribution Methods for Parallel Best-First Search. Journal of Artificial Intelligence Research (JAIR). PAPER CODE
(Preprint) Fukunaga A, Botea A, Jinnai Y, Kishimoto A. 2017. A Survey of Parallel A*. arXiv:1708.05296. PAPER

2016

   
Jinnai Y, Fukunaga A. 2016. Automated Creation of Efficient Work Distribution Functions for Parallel Best-First Search. Proc. 19th International Conference on Automated Planning and Scheduling (ICAPS-16). PAPER SLIDES VIDEO CODE
Jinnai Y, Fukunaga A. 2016. Abstract Zobrist Hashing: An Efficient Work Distribution Method for Parallel Best-First Search. Proc. 30th AAAI Conference on Artificial Intelligence (AAAI-16). PAPER POSTER CODE (PDDL) CODE (sliding-tile, path-finding, MSA)

Grants and Scholarships

Invited Talks

Awards and Honors

Patents

Services