jonathandinu committed on
Commit 92cb9d5 · verified · 1 Parent(s): b2b06c3

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +126 -8
README.md CHANGED
@@ -1,15 +1,133 @@
1
  ---
2
  language: en
3
- license: cc0-1.0
4
  library_name: transformers
5
  tags:
6
- - vision
7
- - image-segmentation
8
- - nvidia/mit-b5
9
- - transformers.js
10
- - onnx
11
  datasets:
12
- - celebamaskhq
13
  ---
14
 
15
- ## Face Parsing
1
  ---
2
  language: en
 
3
  library_name: transformers
4
  tags:
5
+ - vision
6
+ - image-segmentation
7
+ - nvidia/mit-b5
8
+ - transformers.js
9
+ - onnx
10
  datasets:
11
+ - celebamaskhq
12
  ---
13
 
14
+ # Face Parsing
15
+
16
+ [Semantic segmentation](https://huggingface.co/docs/transformers/tasks/semantic_segmentation) model fine-tuned from [nvidia/mit-b5](https://huggingface.co/nvidia/mit-b5) on [CelebAMask-HQ](https://github.com/switchablenorms/CelebAMask-HQ) for face parsing. For additional options, see the Transformers [SegFormer docs](https://huggingface.co/docs/transformers/model_doc/segformer).
17
+
18
+ > ONNX model for web inference contributed by [Xenova](https://huggingface.co/Xenova).
19
+
20
+ ## Usage in Python
21
+
22
+ ```python
23
+ import torch
24
+ from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation
25
+ from PIL import Image
26
+ import requests
27
+
28
+ # convenience expression for automatically determining device
29
+ device = (
30
+ "cuda"
31
+ # Device for NVIDIA or AMD GPUs
32
+ if torch.cuda.is_available()
33
+ else "mps"
34
+ # Device for Apple Silicon (Metal Performance Shaders)
35
+ if torch.backends.mps.is_available()
36
+ else "cpu"
37
+ )
38
+
39
+ # load models
40
+ image_processor = SegformerImageProcessor.from_pretrained("jonathandinu/face-parsing")
41
+ model = SegformerForSemanticSegmentation.from_pretrained("jonathandinu/face-parsing")
42
+ model.to(device)
43
+
44
+ # expects a PIL.Image or torch.Tensor
45
+ url = "http://images.cocodataset.org/val2017/000000039769.jpg"
46
+ image = Image.open(requests.get(url, stream=True).raw)
48
+
49
+ # run inference on image
50
+ inputs = image_processor(images=image, return_tensors="pt").to(device)  # keep inputs on the same device as the model
51
+ outputs = model(**inputs)
52
+ logits = outputs.logits # shape (batch_size, num_labels, height/4, width/4)
53
+
54
+ # resize output to match input image dimensions
55
+ upsampled_logits = torch.nn.functional.interpolate(logits,
56
+ size=image.size[::-1], # PIL size is (W, H); reversed to H x W
57
+ mode='bilinear',
58
+ align_corners=False)
59
+
60
+ # get label masks
61
+ masks = upsampled_logits.argmax(dim=1)[0]
62
+ ```
63
+
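The `height/4, width/4` in the logits comment refers to the size of the preprocessed model input, not the original image, which is why the logits are upsampled afterwards. A minimal sketch of that arithmetic follows. It assumes the encoder's first patch embedding is a 7x7 convolution with stride 4 and padding 3 (the SegFormer defaults in Transformers) and that the decode head predicts at that first-stage resolution. `conv_out` and `logits_hw` are hypothetical helper names, not part of any library.

```python
from typing import Tuple


def conv_out(size: int, kernel: int = 7, stride: int = 4, padding: int = 3) -> int:
    """Output length of a strided convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def logits_hw(height: int, width: int) -> Tuple[int, int]:
    """Spatial size of the logits for a (height, width) model input.

    Assumption: predictions are made at the resolution of the first
    encoder stage, roughly 1/4 of the input in each dimension.
    """
    return conv_out(height), conv_out(width)


print(logits_hw(512, 512))  # (128, 128)
print(logits_hw(480, 640))  # (120, 160)
```

Under these assumptions a 512x512 input yields 128x128 logits, so the bilinear interpolation step scales the prediction back up to the original image size.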
64
+ ## Usage in the browser (Transformers.js)
65
+
66
+ ```js
67
+ import {
68
+ pipeline,
69
+ env,
70
+ } from "https://cdn.jsdelivr.net/npm/@xenova/transformers";
71
+
72
+ // important to prevent errors since the model files are likely remote on HF hub
73
+ env.allowLocalModels = false;
74
+
75
+ // instantiate image segmentation pipeline with pretrained face parsing model
76
+ const model = await pipeline("image-segmentation", "jonathandinu/face-parsing");
77
+
78
+ // async inference since it could take a few seconds (`url` is a string pointing to the input image)
79
+ const output = await model(url);
80
+
81
+ // each label is a separate mask object
82
+ // [
83
+ // { score: null, label: 'background', mask: transformers.js RawImage { ... }}
84
+ // { score: null, label: 'hair', mask: transformers.js RawImage { ... }}
85
+ // ...
86
+ // ]
87
+ for (const m of output) {
88
+ console.log(`Found ${m.label}`);
89
+ m.mask.save(`${m.label}.png`);
90
+ }
91
+ ```
92
+
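The per-label list returned by the Transformers.js pipeline can be approximated on the Python side by splitting the label mask into one binary mask per class. The sketch below uses plain nested lists so that it runs without any dependencies. `split_by_label` is a hypothetical helper, and the ids in `id2label` are made up for illustration (the real mapping should be read from `model.config.id2label`).

```python
from typing import Dict, List

Grid = List[List[int]]


def split_by_label(label_map: Grid, id2label: Dict[int, str]) -> Dict[str, Grid]:
    """Turn an H x W grid of class ids into one binary mask per label present."""
    present = {class_id for row in label_map for class_id in row}
    return {
        id2label[class_id]: [[int(v == class_id) for v in row] for row in label_map]
        for class_id in sorted(present)
    }


# Hypothetical ids for illustration only; use model.config.id2label in practice.
id2label = {0: "background", 1: "hair"}

# Tiny stand-in for `masks.tolist()` from the Python usage example.
label_map = [
    [0, 0, 1],
    [0, 1, 1],
]

for label, mask in split_by_label(label_map, id2label).items():
    print(label, mask)
```

With tensors, the same idea is usually written as a comparison such as `masks == class_id`, which yields a boolean mask for that class.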
93
+ ### p5.js
94
+
95
+ Since [p5.js](https://p5js.org/) is built around an animation loop, the model has to be loaded asynchronously before the sketch starts making predictions.
96
+
97
+ ```js
98
+ // ...
99
+
100
+ // asynchronously load transformers.js and instantiate model
101
+ async function preload() {
102
+ // load transformers.js library with a dynamic import
103
+ const { pipeline, env } = await import(
104
+ "https://cdn.jsdelivr.net/npm/@xenova/transformers"
105
+ );
106
+
107
+ // important to prevent errors since the model files are remote on HF hub
108
+ env.allowLocalModels = false;
109
+
110
+ // instantiate image segmentation pipeline with pretrained face parsing model
111
+ model = await pipeline("image-segmentation", "jonathandinu/face-parsing");
112
+
113
+ print("face-parsing model loaded");
114
+ loading = false;
115
+ }
116
+
117
+ // ...
118
+ ```
119
+
120
+ [full p5.js example](https://editor.p5js.org/jonathan.ai/sketches/wZn15Dvgh)
121
+
122
+ ## Model Description
123
+
124
+ - **Developed by:** [Jonathan Dinu](https://twitter.com/jonathandinu)
125
+ - **Model type:** Transformer-based semantic segmentation image model
126
+ - **License:** non-commercial research and educational purposes
127
+ - **Resources for more information:** Transformers docs on [Segformer](https://huggingface.co/docs/transformers/model_doc/segformer) and/or the [original research paper](https://arxiv.org/abs/2105.15203).
128
+
129
+ ## Limitations and Bias
130
+
131
+ ### Bias
132
+
133
+ While the capabilities of computer vision models are impressive, they can also reinforce or exacerbate social biases. The [CelebAMask-HQ](https://github.com/switchablenorms/CelebAMask-HQ) dataset used for fine-tuning is large but not necessarily diverse or representative.