Aochong Oliver Li

aochongoliverli

https://aochong-li.github.io/

AI & ML interests

Large Language Models, Natural Language Processing, Machine Learning

Organizations

None yet

Collections 1

models 89

aochongoliverli/Qwen2.5-3B-limo-qwq-16k-3epochs-5e-5lr-step150

3B • Updated Mar 27 • 2

aochongoliverli/Qwen2.5-3B-math8k-distill-AM-Distill-Qwen-32B-16k-5epochs-2e-5lr-step500

Text Generation • 3B • Updated Sep 22, 2025 • 2

aochongoliverli/Qwen2.5-3B-math8k-distill-AM-Distill-Qwen-32B-16k-5epochs-2e-5lr-step400

Text Generation • 3B • Updated Sep 22, 2025 • 1

aochongoliverli/Qwen2.5-1.5B-math8k-distill-AM-Distill-Qwen-32B-16k-5epochs-2e-5lr-step600

Text Generation • 2B • Updated Sep 19, 2025 • 1

aochongoliverli/Qwen2.5-0.5B-math8k-distill-AM-Distill-Qwen-32B-16k-5epochs-5e-5lr-step500

Text Generation • 0.5B • Updated Sep 18, 2025 • 1

aochongoliverli/Qwen2.5-0.5B-math8k-distill-AM-Distill-Qwen-32B-16k-5epochs-5e-5lr-step400

Text Generation • 0.5B • Updated Sep 18, 2025 • 1

aochongoliverli/Qwen2.5-7B-math8k-distill-AM-Distill-Qwen-32B-16k-10epochs-5e-5lr-step100

Updated Sep 13, 2025

aochongoliverli/Qwen2.5-1.5B-math8k-distill-AM-Distill-Qwen-32B-16k-5epochs-2e-5lr-step500

Text Generation • 2B • Updated Sep 8, 2025 • 2

aochongoliverli/Qwen2.5-1.5B-math8k-distill-AM-Distill-Qwen-32B-16k-5epochs-2e-5lr-step400

Text Generation • 2B • Updated Sep 8, 2025 • 2

aochongoliverli/Qwen2.5-1.5B-math8k-distill-AM-Distill-Qwen-32B-16k-5epochs-2e-5lr-step300

Text Generation • 2B • Updated Sep 8, 2025 • 2

datasets 72

aochongoliverli/Qwen2.5-1.5B-math8k-AM-5epochs-5e-5lr-step400-dapo-5epochs-8rollouts-16384max-len-rollouts

Viewer • Updated Sep 24, 2025 • 7.59k • 11

aochongoliverli/Qwen2.5-1.5B-math8k-AM-10epochs-2e-5lr-step400-dapo-5epochs-8rollouts-16384max-len-rollouts

Viewer • Updated Sep 21, 2025 • 1.28k • 5

aochongoliverli/Qwen2.5-0.5B-math8k-AM-400steps-dapo-5epochs-8rollouts-16384max-len-rollouts

Viewer • Updated Sep 20, 2025 • 7.59k • 13

aochongoliverli/Qwen2.5-1.5B-math8k-AM-400steps-dapo-5epochs-8rollouts-16384max-len-rollouts

Viewer • Updated Sep 16, 2025 • 7.59k • 41

aochongoliverli/Qwen4B-MegaMath-pro-max-4096-len-sft-no-external-knowledge

Viewer • Updated Sep 6, 2025 • 2.87k • 9

Collections 1

aochongoliverli/KUP

datasets 72

aochongoliverli/wmdp_shot_examples_256

aochongoliverli/wmdp_biochem_inquiries_800

aochongoliverli/pmc_openaccess_split

aochongoliverli/allcode-results

aochongoliverli/allscience-results
