RMBG-1.4 / README.md

Update README.md

1fabafe verified 10 months ago

4.51 kB

	---
	license: other
	licence_name: bria-rmbg-1.4
	license_link: https://bria.ai/bria-huggingface-model-license-agreement/

	tags:
	- remove background
	- background
	- background removal
	- Pytorch
	- vision
	- legal liability

	extra_gated_prompt: This model weights by BRIA AI can be obtained after a commercial license is agreed upon. Fill in the form below and we reach out to you.
	extra_gated_fields:
	Name: text
	Company/Org name: text
	Org Type (Early/Growth Startup, Enterprise, Academy): text
	Role: text
	Country: text
	Email: text
	By submitting this form, I agree to BRIA’s Privacy policy and Terms & conditions, see links below: checkbox
	---

	# BRIA Background Removal v1.4 Model Card

	RMBG v1.4 is our state-of-the-art background removal model, designed to effectively separate foreground from background in a range of
	categories and image types. This model has been trained on a carefully selected dataset, which includes:
	general stock images, e-commerce, gaming, and advertising content, making it suitable for various use cases.
	Developed by BRIA AI, RMBG v1.4 is available as an open-source tool for non-commercial use.


	![examples](t4.png)

	### Model Description

	- Developed by: [BRIA AI](https://bria.ai/)
	- Model type: Background Removal
	- License: [bria-rmbg-1.4](https://bria.ai/bria-huggingface-model-license-agreement/)
	- The model is open for non-commercial use.
	- Commercial use is subject to a commercial agreement with BRIA. [Contact Us](https://bria.ai/contact-us)

	- Model Description: BRIA RMBG 1.4 is an saliency segmentation model trained exclusively on a professional-grade dataset.



	## Training data
	Bria-RMBG model was trained over 12,000 high-quality, high-resolution, manually labeled (pixel-wise accuracy), fully licensed images.
	For clarity, we provide our data distribution according to different categories, demonstrating our model’s versatility.

	### Distribution of images:

	\| Category \| Distribution \|
	\| -----------------------------------\| -----------------------------------:\|
	\| Objects only \| 45.11% \|
	\| People with objects/animals \| 25.24% \|
	\| People only \| 17.35% \|
	\| people/objects/animals with text \| 8.52% \|
	\| Text only \| 2.52% \|
	\| Animals only \| 1.89% \|

	\| Category \| Distribution \|
	\| -----------------------------------\| -----------------------------------------:\|
	\| Photorealistic \| 87.70% \|
	\| Non-Photorealistic \| 12.30% \|


	\| Category \| Distribution \|
	\| -----------------------------------\| -----------------------------------:\|
	\| Non Solid Background \| 52.05% \|
	\| Solid Background \| 47.95%


	\| Category \| Distribution \|
	\| -----------------------------------\| -----------------------------------:\|
	\| Single main foreground object \| 51.42% \|
	\| Multiple objects in the foreground \| 48.58% \|


	## Qualitative Evaluation

	![examples](results.png)

	- Inference Time : 1 sec on Nvidia A10 GPU

	## Architecture

	The model’s architecture is based on [IS-Net](https://github.com/xuebinqin/DIS).
	Yet, we employ a distinct training scheme and utilize our proprietary data for the training process, enhancing the model's effectiveness.


	## Usage

	```python
	import os
	import numpy as np
	from skimage import io
	from glob import glob
	from tqdm import tqdm
	import cv2
	import torch.nn.functional as F
	from torchvision.transforms.functional import normalize
	from models import BriaRMBG

	input_size=[1024,1024]
	net=BriaRMBG()

	model_path = "./model.pth"
	im_path = "./example_image.jpg"
	result_path = "."

	if torch.cuda.is_available():
	net.load_state_dict(torch.load(model_path))
	net=net.cuda()
	else:
	net.load_state_dict(torch.load(model_path,map_location="cpu"))
	net.eval()

	# prepare input
	im = io.imread(im_path)
	if len(im.shape) < 3:
	im = im[:, :, np.newaxis]
	im_size=im.shape[0:2]
	im_tensor = torch.tensor(im, dtype=torch.float32).permute(2,0,1)
	im_tensor = F.interpolate(torch.unsqueeze(im_tensor,0), size=input_size, mode='bilinear').type(torch.uint8)
	image = torch.divide(im_tensor,255.0)
	image = normalize(image,[0.5,0.5,0.5],[1.0,1.0,1.0])

	if torch.cuda.is_available():
	image=image.cuda()

	# inference
	result=net(image)

	# post process
	result = torch.squeeze(F.interpolate(result[0][0], size=im_size, mode='bilinear') ,0)
	ma = torch.max(result)
	mi = torch.min(result)
	result = (result-mi)/(ma-mi)

	# save result
	im_name=im_path.split('/')[-1].split('.')[0]
	im_array = (result*255).permute(1,2,0).cpu().data.numpy().astype(np.uint8)
	cv2.imwrite(os.path.join(result_path, im_name+".png"), im_array)
	```