ETL for LLMs
Welcome to our space! 🎊
The Unstructured.io team provides open-source libraries for pre-processing text documents such as PDFs, HTML, and Word documents. The components are packaged as bricks 🧱, which give users the building blocks they need to assemble pipelines targeted at the documents they care about. Bricks in the library fall into three categories.
In this space, we explore different configurations of deep-learning models, each fine-tuned on several datasets that contain a specific document type and its corresponding annotations.
Main GitHub repository link: here