Ryokan Ri


(Photo: on the day of my Ph.D. graduation ceremony.)


About Me

Ryokan Ri (李 凌寒), Ph.D.

I am a Senior Research Engineer at SB Intuitions, specializing in the development and application of large language models (LLMs).

My primary research interest is Natural Language Processing (NLP); my specific focus areas are described under Research and Engineering Focus below.

Links: Google Scholar | LinkedIn | GitHub

Research and Engineering Focus

Large Language Models

LLM Books

I have co-authored books on large language models (LLMs) for Japanese practitioners, including “Introduction to Large Language Models”. These books cover the fundamentals of LLMs, with sample code and practical tips for training and fine-tuning them.

Sarashina: Japanese-centric LLM

I was involved in developing Sarashina, a Japanese-centric LLM. Sarashina achieved top-class performance on Japanese language tasks among open-source LLMs; this was the result of a team effort at SB Intuitions.

FlexEval: LLM Evaluation Tool

While many LLM evaluation libraries are available, they cover different domains and methods, and switching between them is cumbersome. I am developing FlexEval, a unified evaluation tool that supports a wide range of evaluation tasks, metrics, and methods, from few-shot evaluation to LLM-as-a-judge approaches. It abstracts away the implementation details of language models, evaluation datasets, and tasks, allowing users to specify them through configuration files.
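The snippet below is not FlexEval's actual API; it is a minimal, hypothetical Python sketch of the configuration-driven design described above, where models and tasks are selected by name from a config and interact only through small interfaces.

```python
# Conceptual sketch (not FlexEval's actual API): models and evaluation tasks hide
# behind small interfaces, and a config picks concrete implementations by name.
from dataclasses import dataclass
from typing import Protocol


class LanguageModel(Protocol):
    def generate(self, prompt: str) -> str: ...


@dataclass
class EchoModel:
    """Toy stand-in for a real backend (e.g. a local or API-based model)."""
    prefix: str = "echo: "

    def generate(self, prompt: str) -> str:
        return self.prefix + prompt


@dataclass
class ExactMatchTask:
    """Toy evaluation task: prompt/reference pairs scored by exact match."""
    examples: list[tuple[str, str]]

    def evaluate(self, model: LanguageModel) -> float:
        hits = sum(model.generate(p) == ref for p, ref in self.examples)
        return hits / len(self.examples)


# A JSON/YAML config file would name the components; here it is a plain dict.
MODELS = {"echo": EchoModel}
TASKS = {"exact_match": ExactMatchTask}

config = {
    "model": {"name": "echo", "kwargs": {"prefix": "echo: "}},
    "task": {
        "name": "exact_match",
        "kwargs": {"examples": [("hello", "echo: hello"), ("hi", "bye")]},
    },
}

model = MODELS[config["model"]["name"]](**config["model"]["kwargs"])
task = TASKS[config["task"]["name"]](**config["task"]["kwargs"])
print(f"accuracy = {task.evaluate(model):.2f}")  # accuracy = 0.50
```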

Seeking Language Universals in Neural Networks

My Ph.D. research focused on the commonalities across languages captured by neural network models.

“Pretraining with Artificial Language: Studying Transferable Knowledge in Language Models”, R. Ri and Y. Tsuruoka.

In this work, we investigate the transferability of knowledge acquired through language modeling. Natural language can be characterized by various linguistic or statistical properties at different levels, making it difficult to study which are transferable. We created an artificial language with controlled properties, pretrained a neural network on it, and transferred it to natural languages. Our findings show that simple statistical dependencies are key to transferability.
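As a purely illustrative sketch of this methodology, the snippet below generates a corpus for an artificial language whose only controlled property is a simple nesting dependency between paired tokens; the vocabulary size and dependency structure are assumptions for illustration, not the exact setup of the paper.

```python
# Hypothetical sketch: an "artificial language" corpus with a single controlled
# property, namely a nesting dependency between matched head/dependent tokens.
import random

VOCAB_SIZE = 100  # each head word w<i> has a matching dependent token w<i>'


def sample_sentence(max_pairs: int = 8) -> list[str]:
    """Sample a sentence of nested head/dependent pairs, like matched brackets."""
    n_pairs = random.randint(1, max_pairs)
    sentence: list[str] = []
    stack: list[int] = []
    while n_pairs > 0 or stack:
        open_next = n_pairs > 0 and (not stack or random.random() < 0.5)
        if open_next:
            word_id = random.randrange(VOCAB_SIZE)
            sentence.append(f"w{word_id}")
            stack.append(word_id)
            n_pairs -= 1
        else:
            sentence.append(f"w{stack.pop()}'")  # dependent must match its head
    return sentence


random.seed(0)
for _ in range(3):
    print(" ".join(sample_sentence()))
# A model pretrained on such a corpus sees no natural-language content, only the
# statistical dependency induced by the matching pairs, so any transfer to natural
# languages can be attributed to that controlled property.
```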

“mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models”, R. Ri, I. Yamada, and Y. Tsuruoka.

We developed mLUKE, a multilingual encoder-style language model that utilizes information about entities (Wikipedia articles) shared across languages. We demonstrate that including entity information during pre-training improves performance on cross-lingual transfer tasks.
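A minimal usage sketch is shown below, assuming the Hugging Face transformers interface for mLUKE (MLukeTokenizer with LukeModel) and the publicly released studio-ousia/mluke-base checkpoint; class and checkpoint names should be verified against the release.

```python
# Minimal usage sketch (assumes the Hugging Face `transformers` mLUKE classes and
# the `studio-ousia/mluke-base` checkpoint; verify names against the release).
import torch
from transformers import LukeModel, MLukeTokenizer

tokenizer = MLukeTokenizer.from_pretrained("studio-ousia/mluke-base")
model = LukeModel.from_pretrained("studio-ousia/mluke-base")

text = "Tokyo is the capital of Japan."
entity_spans = [(0, 5), (24, 29)]  # character spans of "Tokyo" and "Japan"

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Contextualized vectors for the word tokens and for each marked entity span;
# the entity vocabulary (Wikipedia articles) is shared across languages.
print(outputs.last_hidden_state.shape)         # (1, num_tokens, hidden_size)
print(outputs.entity_last_hidden_state.shape)  # (1, 2, hidden_size)
```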

Education

Work Experience

Awards and Honors

Publications

International Conferences / Journals

For the full publication list, please refer to Google Scholar.

Analyzing Multilinguality of Neural Networks

Multilingual Language Models and Representation Learning

Book

Skills

Natural Languages

Programming Languages

Fun Facts