Tech Buzz China · AI Proem · China Research Collective · Researcher profile · As of Jul 14, 2026

SONG Junxiao (宋俊潇)

Principal Researcher, DeepSeek

Known for

DeepSeek-R1 (co-author, published in Nature) · DeepSeek-Prover series

Current org

DeepSeek (深度求索)

Schools

Zhejiang U (浙江大学) · HKUST

Career path

Studied

Zhejiang U (浙江大学) · HKUST

→

Now

DeepSeek (深度求索)

Inventor of GRPO (the RL algorithm behind R1)

Zhejiang U BS; HKUST PhD (advisor: Daniel Palomar)

Profile

SONG Junxiao (宋俊潇) is a Principal Researcher at DeepSeek. He is a co-author of the landmark DeepSeek-R1 paper published in Nature, which demonstrated how reinforcement learning can incentivize reasoning capabilities in large language models.

Song was previously a Ph.D. researcher at the Hong Kong University of Science and Technology (HKUST), in the Department of Electronic and Computer Engineering. His research at DeepSeek focuses on reinforcement learning for reasoning, mathematical reasoning in LLMs, and automated theorem proving.

He has contributed to multiple DeepSeek model releases including DeepSeek-R1, DeepSeek-Prover, and DeepSeek-Prover-V2.

Known for

DeepSeek-R1 (co-author, published in Nature) · DeepSeek-Prover series · DeepSeek-Prover-V2

Education

Hong Kong University of Science and Technology

Ph.D. in Electronic and Computer Engineering

Articles / interviews

Article

DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning

Nature

Profile links

deepseek.com · researchgate.net/profile/Junxiao-Song-2

Sources

DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning · Nature
Junxiao Song, PhD Researcher · ResearchGate
DeepSeek-Prover-V2 · arXiv · 2025