jonathandinu committed on
Commit 65972ac · verified · 1 Parent(s): 92cb9d5

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +16 -8
  2. demo.png +0 -0
README.md CHANGED
@@ -13,6 +13,8 @@ datasets:
 
 # Face Parsing
 
+![example image and output](demo.png)
+
 [Semantic segmentation](https://huggingface.co/docs/transformers/tasks/semantic_segmentation) model fine-tuned from [nvidia/mit-b5](https://huggingface.co/nvidia/mit-b5) with [CelebAMask-HQ](https://github.com/switchablenorms/CelebAMask-HQ) for face parsing. For additional options, see the Transformers [Segformer docs](https://huggingface.co/docs/transformers/model_doc/segformer).
 
 > ONNX model for web inference contributed by [Xenova](https://huggingface.co/Xenova).
@@ -21,8 +23,11 @@ datasets:
 
 ```python
 import torch
+from torch import nn
 from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation
+
 from PIL import Image
+import matplotlib.pyplot as plt
 import requests
 
 # convenience expression for automatically determining device
@@ -42,23 +47,27 @@ model = SegformerForSemanticSegmentation.from_pretrained("jonathandinu/face-pars
 model.to(device)
 
 # expects a PIL.Image or torch.Tensor
-url = "http://images.cocodataset.org/val2017/000000039769.jpg"
+url = "https://images.unsplash.com/photo-1539571696357-5a69c17a67c6"
 image = Image.open(requests.get(url, stream=True).raw)
-pixel_values = F.resize(image, (512, 512)).unsqueeze(0)
 
 # run inference on image
-inputs = image_processor(images=image, return_tensors="pt")
+inputs = image_processor(images=image, return_tensors="pt").to(device)
 outputs = model(**inputs)
-logits = outputs.logits # shape (batch_size, num_labels, height/4, width/4)
+logits = outputs.logits # shape (batch_size, num_labels, ~height/4, ~width/4)
 
 # resize output to match input image dimensions
 upsampled_logits = nn.functional.interpolate(logits,
-size=image.shape[1:], # H x W
+size=image.size[::-1], # H x W
 mode='bilinear',
 align_corners=False)
 
 # get label masks
-masks = upsampled_logits.argmax(dim=1)[0]
+labels = upsampled_logits.argmax(dim=1)[0]
+
+# move to CPU to visualize in matplotlib
+labels_viz = labels.cpu().numpy()
+plt.imshow(labels_viz)
+plt.show()
 ```
 
 ## Usage in the browser (Transformers.js)
@@ -111,7 +120,6 @@ async function preload() {
 model = await pipeline("image-segmentation", "jonathandinu/face-parsing");
 
 print("face-parsing model loaded");
-loading = false;
 }
 
 // ...
@@ -130,4 +138,4 @@ async function preload() {
 
 ### Bias
 
-While the capabilities of computer vision models are impressive, they can also reinforce or exacerbate social biases. The [CelebAMask-HQ](https://github.com/switchablenorms/CelebAMask-HQ) dataset used for fine-tuning is large but not necessarily perfectly diverse.
+While the capabilities of computer vision models are impressive, they can also reinforce or exacerbate social biases. The [CelebAMask-HQ](https://github.com/switchablenorms/CelebAMask-HQ) dataset used for fine-tuning is large but not necessarily perfectly diverse or representative. Also, they are images of.... just celebrities.
demo.png ADDED
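
The switch from `size=image.shape[1:]` to `size=image.size[::-1]` in the README diff is easy to miss, so here is a minimal sketch of why the reversal seems to be needed. A PIL image reports its size as (width, height) and has no `shape` attribute, while `nn.functional.interpolate` takes its target `size` as (height, width). The blank image, the random logits, and the class count of 19 are stand-ins assumed for illustration; they are not taken from the model or the commit.

```python
import torch
from torch import nn
from PIL import Image

# Synthetic stand-ins (assumptions for illustration, not real model output):
# a blank 640x480 image and random logits with a hypothetical 19 classes.
image = Image.new("RGB", (640, 480))   # PIL takes and reports (width, height)
logits = torch.randn(1, 19, 128, 128)  # (batch, num_labels, h, w); h and w are arbitrary here

# interpolate expects the target size as (height, width), so reverse PIL's tuple
target_size = image.size[::-1]  # (480, 640)

upsampled_logits = nn.functional.interpolate(
    logits, size=target_size, mode="bilinear", align_corners=False
)

# one class index per pixel, with the same spatial shape as the input image
labels = upsampled_logits.argmax(dim=1)[0]
print(image.size, tuple(labels.shape))  # (640, 480) (480, 640)
```

Passing `image.size` without reversing it would still run, but it would produce a mask with height and width swapped for any non-square image.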