Regarding the Data Generation Pipeline

#2
by chansurgeplus - opened

Is it possible to publish the data generation pipeline in the repository at https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/longllm_qlora/src?

At least the exact steps, especially for extracting chunks from books/papers, would be helpful.

Hi, we will! Please stay tuned :)

Any news? :)
