Instructions to use khazarai/BioGenesis-ToT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
  - Transformers

How to use khazarai/BioGenesis-ToT with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="khazarai/BioGenesis-ToT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("khazarai/BioGenesis-ToT")
model = AutoModelForCausalLM.from_pretrained("khazarai/BioGenesis-ToT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use khazarai/BioGenesis-ToT with vLLM:
Install from pip and serve the model:

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "khazarai/BioGenesis-ToT"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "khazarai/BioGenesis-ToT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker:

```shell
docker model run hf.co/khazarai/BioGenesis-ToT
```
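The same OpenAI-compatible endpoint can be called from Python instead of curl. A minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are mine, and it assumes the vLLM server started above is listening on localhost:8000:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST the payload to the server and return the assistant's reply."""
    payload = build_chat_request("khazarai/BioGenesis-ToT", prompt)
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(build_chat_request("khazarai/BioGenesis-ToT", "What is the capital of France?"))
```

The same client works unchanged against the SGLang server below, with the base URL switched to port 30000.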
- SGLang
How to use khazarai/BioGenesis-ToT with SGLang:
Install from pip and serve the model:

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "khazarai/BioGenesis-ToT" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "khazarai/BioGenesis-ToT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker images:

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "khazarai/BioGenesis-ToT" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "khazarai/BioGenesis-ToT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

- Unsloth Studio
How to use khazarai/BioGenesis-ToT with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```shell
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for khazarai/BioGenesis-ToT to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for khazarai/BioGenesis-ToT to start chatting
```
Using Hugging Face Spaces for Unsloth

No setup required: open https://huggingface.co/spaces/unsloth/studio in your browser and search for khazarai/BioGenesis-ToT to start chatting.
Load the model with FastModel

```shell
pip install unsloth
```

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="khazarai/BioGenesis-ToT",
    max_seq_length=2048,
)
```

- Docker Model Runner
How to use khazarai/BioGenesis-ToT with Docker Model Runner:
```shell
docker model run hf.co/khazarai/BioGenesis-ToT
```
Model Card for BioGenesis-ToT
Overall success rate on the emre/TARA_Turkish_LLM_Benchmark:

- khazarai/BioGenesis-ToT: 51.45
- Qwen/Qwen3-1.7B (base model): 46.82

Fine-tuning improves on the base model by 4.63 points on this benchmark.
BioGenesis-ToT is a fine-tuned version of Qwen3-1.7B, optimized for mechanistic reasoning and explanatory understanding in biology. This model has been trained on the moremilk/ToT-Biology dataset — a reasoning-rich collection of biology questions emphasizing why and how processes occur, rather than simply what happens.
The model demonstrates strong capabilities in:
- Structured biological explanation generation
- Logical and causal reasoning
- Tree-of-Thought (ToT) reasoning in scientific contexts
- Interdisciplinary biological analysis (e.g., bioengineering, medicine, ecology)
Uses
🚀 Intended Use
- Educational and scientific explanation generation
- Biological reasoning and tutoring applications
- Model interpretability research
- Training datasets for reasoning-focused LLMs
⚠️ Limitations
- Not a replacement for expert biological judgment
- May occasionally over-generalize or simplify complex phenomena
- Limited to reasoning quality within biological contexts (not trained for creative writing or coding)
How to Get Started with the Model
Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("khazarai/BioGenesis-ToT")
model = AutoModelForCausalLM.from_pretrained(
    "khazarai/BioGenesis-ToT",
    device_map={"": 0},  # place the whole model on GPU 0
)

question = """
Describe the composition of the plasma membrane and explain how its structure relates to its function of selective permeability.
"""

messages = [
    {"role": "user", "content": question}
]

# Build the prompt; enable_thinking=True asks the Qwen3 chat template
# to let the model emit its reasoning trace before the final answer.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

# Stream generated tokens to stdout as they are produced
_ = model.generate(
    **tokenizer(text, return_tensors="pt").to("cuda"),
    max_new_tokens=2200,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)
```
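With thinking enabled, Qwen3-based models wrap their reasoning in `<think>…</think>` tags ahead of the final answer. A small helper to separate the two parts of the decoded output (a sketch assuming that tag format; the function name `split_thinking` is mine):

```python
import re

def split_thinking(generated: str) -> tuple[str, str]:
    """Split Qwen3-style output into (reasoning, answer).

    Assumes the reasoning, if present, is wrapped in <think>...</think>;
    returns an empty reasoning string when no tags are found.
    """
    match = re.search(r"<think>(.*?)</think>", generated, flags=re.DOTALL)
    if match is None:
        return "", generated.strip()
    reasoning = match.group(1).strip()
    answer = generated[match.end():].strip()
    return reasoning, answer

sample = "<think>The membrane is a phospholipid bilayer...</think>It is selectively permeable because..."
reasoning, answer = split_thinking(sample)
print(reasoning)  # The membrane is a phospholipid bilayer...
print(answer)     # It is selectively permeable because...
```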
🧪 Dataset: moremilk/ToT-Biology
The ToT-Biology dataset emphasizes mechanistic understanding and explanatory reasoning within biology. It’s designed to help AI models develop interpretable, step-by-step reasoning abilities for complex biological systems.
It spans a wide range of biological subdomains:
- Foundational biology: Cell biology, genetics, evolution, and ecology
- Advanced topics: Systems biology, synthetic biology, computational biophysics
- Applied domains: Medicine, agriculture, bioengineering, and environmental science
Dataset features include:
- 🧩 Logical reasoning styles — deductive, inductive, abductive, causal, and analogical
- 🧠 Problem-solving techniques — decomposition, elimination, systems thinking, trade-off analysis
- 🔬 Real-world problem contexts — experiment design, pathway mapping, and data interpretation
- 🌍 Practical relevance — bridging theoretical reasoning and applied biological insight
- 🎓 Educational focus — for both AI training and human learning in scientific reasoning
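To fine-tune or evaluate against this dataset, each record must be mapped into the chat format the model expects. A minimal sketch; the field names `question` and `answer` are assumptions about the ToT-Biology schema, not confirmed, so inspect `dataset.features` after loading with `datasets.load_dataset("moremilk/ToT-Biology")` and adjust:

```python
def to_chat_example(record: dict) -> list[dict]:
    """Convert one dataset record into a user/assistant message pair.

    The keys 'question' and 'answer' are assumed, not verified against
    the actual ToT-Biology schema; rename them to match the real fields.
    """
    return [
        {"role": "user", "content": record["question"]},
        {"role": "assistant", "content": record["answer"]},
    ]

record = {
    "question": "Why does ATP hydrolysis release usable energy?",
    "answer": "Charge repulsion destabilizes the phosphoanhydride bonds...",
}
print(to_chat_example(record))
```

The resulting message lists can be fed directly to `tokenizer.apply_chat_template` for supervised fine-tuning.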
🧭 Objective
This fine-tuning project aims to build an interpretable reasoning model capable of:
- Explaining biological mechanisms clearly and coherently
- Demonstrating transparent, step-by-step thought processes
- Applying logical reasoning techniques to biological and interdisciplinary problems
- Supporting educational and research use cases where reasoning transparency matters
Citation
BibTeX:

```bibtex
@misc{biogenesis-tot,
  title      = {BioGenesis-ToT: A Fine-Tuned Model for Explanatory Biological Reasoning},
  author     = {Rustam Shiriyev},
  year       = {2025},
  publisher  = {Hugging Face},
  base_model = {Qwen3-1.7B},
  dataset    = {moremilk/ToT-Biology},
  license    = {MIT}
}
```