metadata
title: Fertilizer Catalog Engine
emoji: 🌽
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 4.0.2
app_file: app.py
pinned: false
license: mit

Leveraging BERT and Active Transfer Learning for Product Mapping and Catalog Enrichment

Project Goal

A fertilizer company wants to map the product names it receives (queries) from several sources, including point-of-sale (POS) systems, to its existing product SKU catalog. To match the two, we use a fuzzy join based on the weighted average of two string similarity metrics: an edit-based metric (Levenshtein) and a token-based metric (Jaccard). If a product requested through POS does not exist in the current catalog, we check whether it is in fact a fertilizer by matching it against the dataset of registered fertilizers from the Ministry of Agriculture, Indonesia. If the product is found in neither the company's catalog nor the ministry dataset, we run text classification with BERT to determine whether it is a fertilizer and worth considering as a new product. To train the model, a sample of the POS dataset is labelled using data scraped from an e-commerce site. We also use active learning (human-in-the-loop) to keep the classification model relevant as new data arrives.
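The fuzzy-matching step can be illustrated with a minimal sketch. It is not the project's actual code: the function names, the equal weighting (`edit_weight=0.5`), the `0.6` acceptance threshold, the lowercasing, and the whitespace tokenization are all assumptions made for illustration. It only shows one plausible way to combine a normalized Levenshtein similarity with a token-level Jaccard similarity.

```python
from typing import List, Optional, Tuple


def levenshtein_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1], where 1 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard index over whitespace-separated tokens (assumed tokenization)."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def combined_similarity(query: str, candidate: str, edit_weight: float = 0.5) -> float:
    """Weighted average of the two metrics; edit_weight is a hypothetical default."""
    q, c = query.lower().strip(), candidate.lower().strip()
    return (edit_weight * levenshtein_similarity(q, c)
            + (1.0 - edit_weight) * jaccard_similarity(q, c))


def best_match(query: str, catalog: List[str], threshold: float = 0.6,
               edit_weight: float = 0.5) -> Optional[Tuple[str, float]]:
    """Return the best-scoring catalog name, or None if it falls below threshold."""
    if not catalog:
        return None
    name = max(catalog, key=lambda n: combined_similarity(query, n, edit_weight))
    score = combined_similarity(query, name, edit_weight)
    return (name, score) if score >= threshold else None
```

Under these assumptions, a query for which `best_match` returns `None` against the company catalog would be passed on to the ministry dataset and, failing that, to the BERT classifier. In practice the weight and threshold would need tuning on labelled query and SKU pairs.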