---
license: mit
datasets:
- yzhuang/Agentic-Long-Context-Understanding-QA
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---
<h1 align="center"> πŸ“– Agentic Long Context Understanding πŸ“– </h1>
<p align="center"> <b>Self-Taught Agentic Long Context Understanding</b> (<a href="https://arxiv.org/abs/2502.15920">arXiv</a>).
</p>
<p align="center">
<img src="https://img.shields.io/badge/license-mit-blue.svg">
<img src="https://img.shields.io/badge/python-3.9+-blue">
</p>
<p align="center"> AgenticLU refines complex, long-context queries through self-clarifications and contextual grounding, enabling robust long-document understanding in a single pass.
</p>
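To make the workflow concrete, here is a conceptual sketch of one clarify-then-ground round (an illustration only, not the paper's exact procedure; `chat` is a hypothetical helper, and the real prompts and control flow live in `long_context_llm/`):

```python
# Conceptual illustration of self-clarification + contextual grounding.
# `chat(prompt)` is a hypothetical helper that returns the model's reply.

def clarify_and_answer(chat, document: str, question: str) -> str:
    # Self-clarification: the model asks itself what it still needs to know.
    clarification = chat(
        f"{document}\n\nQuestion: {question}\n"
        "Raise one clarifying question you need answered before responding."
    )
    # Contextual grounding: the model quotes the passages that resolve it.
    evidence = chat(
        f"{document}\n\nClarification: {clarification}\n"
        "Quote the passages from the document that resolve this clarification."
    )
    # Final answer, conditioned on the grounded evidence.
    return chat(
        f"{document}\n\nQuestion: {question}\n"
        f"Clarification: {clarification}\nEvidence: {evidence}\n"
        "Now answer the question."
    )
```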
## Installation Requirements
This codebase is largely based on [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF) and [HELMET](https://github.com/princeton-nlp/HELMET), kudos to them.
The requirements are the same:
```bash
pip install openrlhf
pip install -r ./HELMET/requirements.txt
```
## Dataset & Model
The dataset for SFT and DPO is available [here](https://huggingface.co/datasets/yzhuang/Agentic-Long-Context-Understanding-QA).
The model is available [here](https://huggingface.co/yzhuang/Llama-3.1-8B-Instruct-AgenticLU).
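Both can be loaded with the standard Hugging Face libraries; a minimal sketch (adjust the dtype and device settings to your hardware):

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# SFT/DPO training data
data = load_dataset("yzhuang/Agentic-Long-Context-Understanding-QA")

# AgenticLU-tuned Llama-3.1-8B-Instruct
model_id = "yzhuang/Llama-3.1-8B-Instruct-AgenticLU"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
```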
## Data Generation Pipeline
To generate traces with your own model or dataset, follow these steps:
1. Get an OpenAI API key and set it as an environment variable:
```bash
export OPENAI_API_KEY="your_api_key_here"
```
2. Edit the bash script as needed; `--model_name_or_path` sets the base model, while `--max_sample_size` and `--max_tree_depth` control the search width and depth:
```bash
PYTHONPATH="./":"$PYTHONPATH" python ./long_context_llm/qa_tree_datagen.py \
--model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
--max_sample_size 8 \
--max_tree_depth 2 \
--dataset_name yzhuang/narrative_qa
```
3. The traces will be available as `dataset_dpo`; add this line to push them to your Hugging Face account:
```python
dataset_dpo.push_to_hub("YOUR REPO")
```
## Example Usage
Training scripts for AgenticLU are provided: the [SFT script](bash_scripts/sft_8b.sh) and the [DPO script](bash_scripts/rlhf_8b.sh).
It is important to get [ring-attention](https://github.com/zhuzilin/ring-flash-attention) working: the inputs are extremely long, so training requires ring-attention together with DeepSpeed.
Examples of inference with the agentic workflow can be found [here](HELMET/scripts/run_agents.sh); baseline prompting [scripts](HELMET/scripts/run_prompting.sh) are also available.
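For a quick smoke test outside the HELMET harness, the model can also be queried directly with `transformers` (a minimal sketch; the document and question strings are placeholders, and the scripts above remain the reference workflow):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yzhuang/Llama-3.1-8B-Instruct-AgenticLU"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Placeholder long-document QA prompt; real evaluations go through HELMET.
messages = [{"role": "user", "content": "Document:\n<long document text>\n\nQuestion: <your question>"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```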
## Questions?
If you have any questions related to the code or the paper, feel free to reach out to us at [email protected].
## Citation
If you find our paper and code useful, please cite us:
```bibtex
@misc{zhuang2025selftaughtagenticlongcontext,
title={Self-Taught Agentic Long Context Understanding},
author={Yufan Zhuang and Xiaodong Yu and Jialian Wu and Ximeng Sun and Ze Wang and Jiang Liu and Yusheng Su and Jingbo Shang and Zicheng Liu and Emad Barsoum},
year={2025},
eprint={2502.15920},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.15920},
}
```