KuKu

dragonkue

AI & ML interests

anything.

Recent Activity

updated a collection about 3 hours ago

papers

upvoted a paper 2 days ago

Is Position Bias in Dense Retrievers Built In-or Learned from Data?

updated a model 5 days ago

dragonkue/colbert-ko-0.1b

dragonkue's collections (8)

papers

Is Position Bias in Dense Retrievers Built In-or Learned from Data?

Paper • 2605.26578 • Published 5 days ago • 13

Reranker Models

A collection of high-performance Korean reranker models, including those I have trained myself as well as other strong baselines.

dragonkue/bge-reranker-v2-m3-ko

Text Ranking • 0.6B • Updated Apr 3, 2025 • 62.8k • 23

telepix/PIXIE-Spell-Reranker-Preview-0.6B

Text Ranking • 0.6B • Updated Apr 2 • 93 • 5

BAAI/bge-reranker-v2-m3

Text Classification • 0.6B • Updated Jun 24, 2024 • 13.7M • 1.01k

Qwen/Qwen3-Reranker-0.6B

Text Ranking • 0.6B • Updated Apr 16 • 1.37M • 354

Multi-modal Retrieval Models

Qwen/Qwen3-VL-Embedding-8B

Sentence Similarity • 8B • Updated Apr 16 • 1.43M • 419

Qwen/Qwen3-VL-Embedding-2B

Sentence Similarity • 2B • Updated Apr 16 • 1.19M • 411

Qwen/Qwen3-VL-Reranker-8B

Text Ranking • 9B • Updated Apr 16 • 454k • 148

Qwen/Qwen3-VL-Reranker-2B

Text Ranking • 2B • Updated Apr 16 • 326k • 193

Korean Sparse Retriever

telepix/PIXIE-Splade-Preview

Feature Extraction • 0.1B • Updated Sep 19, 2025 • 1.53k • 13

yjoonjang/splade-ko-v1

Feature Extraction • 0.1B • Updated Jan 17 • 1.4k • 16

Korean Embedding Models

A collection of high-performance Korean embedding models, including both models I trained myself and other publicly available strong baselines.

dragonkue/snowflake-arctic-embed-l-v2.0-ko

Sentence Similarity • 0.6B • Updated Oct 16, 2025 • 21.6k • 47

dragonkue/BGE-m3-ko

Sentence Similarity • 0.6B • Updated Oct 16, 2025 • 374k • 76

dragonkue/multilingual-e5-small-ko

Sentence Similarity • 0.1B • Updated Oct 16, 2025 • 9.12k • 10

dragonkue/multilingual-e5-small-ko-v2

Sentence Similarity • 0.1B • Updated Oct 16, 2025 • 10.9k • 4

Multilingual Embedding Models

A collection of multilingual embedding models suitable for use as training backbones.

google/embeddinggemma-300m

Sentence Similarity • 0.3B • Updated Sep 25, 2025 • 1.88M • 1.68k

BAAI/bge-m3

Sentence Similarity • Updated Jul 3, 2024 • 31.2M • 3.06k

Snowflake/snowflake-arctic-embed-l-v2.0

Sentence Similarity • 0.6B • Updated Jul 28, 2025 • 1.03M • 247

intfloat/multilingual-e5-large-instruct

Feature Extraction • 0.6B • Updated Jul 10, 2025 • 1.54M • 623

Colbert (multi-vec)

dragonkue/colbert-ko-0.1b

Sentence Similarity • 0.1B • Updated 5 days ago • 319 • 4

LiquidAI/LFM2-ColBERT-350M

Sentence Similarity • 0.4B • Updated 26 days ago • 81.1k • 131

yjoonjang/colbert-ko-v1

Sentence Similarity • 0.1B • Updated Nov 28, 2025 • 22 • 16

mixedbread-ai/mxbai-edge-colbert-v0-32m

Sentence Similarity • 31.9M • Updated Apr 15 • 83.3k • 45

Korean BERT

A collection of backbone models suitable for building Korean embedding or reranker models.

skt/A.X-Encoder-base

Text Classification • 0.1B • Updated Jan 20 • 2.7k • 28
