gradio
PyMuPDF
pytesseract
pillow
python-docx
llama-index
python-dotenv
sentence-transformers
scikit-learn
openai
pdf2image
PyPDF2