---
base_model:
- meta-llama/Llama-3.2-3B-Instruct
language:
- en
license: apache-2.0
pipeline_tag: question-answering
library_name: transformers
---
<div align="center">
<b style="font-size: 40px;">Gen-8B-R2</b>
</div>

Note: We are still working on this.
Are you looking for a more robust and reliable generation model for your RAG system?

Gen-8B-R2 effectively mitigates hallucinations caused by retrieval noise and information overload.

See the details in our paper: [Link](https://arxiv.org/pdf/2503.04789)
### What is Gen-8B-R2?

This model is a variant of Ext2Gen-8B-R2 that disables the step of extracting relevant sentences from the chunk list.

See the details of Ext2Gen-8B-R2 at https://huggingface.co/DISLab/Ext2Gen-8B-R2
### Recommended Prompt

- query: the query to answer
- chunk_list: the list of retrieved chunks, e.g., ["chunk 1", "chunk 2", "chunk 3"]
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("DISLab/Gen-8B-R2")

def prepare_sample_text(prompt):
    # Wrap the raw prompt in the chat template expected by the model.
    row_json = [{"role": "user", "content": prompt}]
    return tokenizer.apply_chat_template(row_json, tokenize=False)

def format_prompt_template(query, chunk_list):
    # Prefix each chunk with a 1-based chunk ID and join them, one chunk per line.
    chunk_list = ['[Chunk ID: ' + str(idx + 1) + '] ' + chunk_text for idx, chunk_text in enumerate(chunk_list)]
    chunk_list = '\n'.join(chunk_list)

    prompt = '''
You are an expert assistant trained to generate answers based on document chunks.
### Generation Instruction:
- Answer to the Query based on the given Chunk List.
### Query:
%s
### Chunk List:
%s
### Output:
''' % (query, chunk_list)

    return prompt.strip()

# query and noisy_chunks come from your retrieval pipeline.
prompt = format_prompt_template(query, noisy_chunks)
prompt = prepare_sample_text(prompt)
```
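
For illustration, here is a hypothetical invocation. The query and chunk contents below are placeholders we made up for this sketch (one chunk is a deliberate distractor), not examples from the paper:

```python
# Hypothetical inputs; in practice they come from your retriever.
query = "How many people were killed at the Chelmno extermination camp?"
noisy_chunks = [
    "Chelmno was the first of the Nazi German extermination camps.",
    "Estimates of the death toll at Chelmno range from 150,000 to 300,000, mainly Jews.",
    "The museum on the former camp grounds is open to visitors.",  # distractor chunk
]

prompt = prepare_sample_text(format_prompt_template(query, noisy_chunks))
print(prompt)
```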
Note that, unlike Ext2Gen-8B-R2, this prompt outputs only the answer to the query, since the sentence-extraction step is disabled in this variant.

The output follows a consistent format, as shown in the example below.
```
The estimated number of deaths at Chelmno is 150-300,000, mainly Jews.
```
### Recommended Generation Parameters

```python
max_new_tokens=1024,  # or 2048
do_sample=True,
temperature=0.8,
top_p=0.9,
```
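
Putting it all together, here is a minimal end-to-end sketch using these parameters. The loading calls follow the standard Transformers API; the dtype and device placement are assumptions to adjust for your hardware:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("DISLab/Gen-8B-R2")
model = AutoModelForCausalLM.from_pretrained(
    "DISLab/Gen-8B-R2",
    torch_dtype=torch.bfloat16,  # assumption: pick a dtype your GPU supports
    device_map="auto",           # assumption: requires the accelerate package
)

prompt = prepare_sample_text(format_prompt_template(query, noisy_chunks))
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(
    **inputs,
    max_new_tokens=1024,  # or 2048
    do_sample=True,
    temperature=0.8,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the echoed prompt.
answer = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer)
```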