langchain
openai
streamlit
PyMuPDF
scikit-learn