openai
pymupdf
diff-match-patch
pymongo
spark
PorterStemmer
base
streamlit