OPEA · Safetensors · molmo · custom_code · 4-bit precision · intel/auto-round
cicdatopea committed · Commit 72c90ad · verified · 1 Parent(s): 1c3c008

Update README.md

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -42,7 +42,7 @@ text = "Describe this image."
  # process the image and text
  inputs = processor.process(
  images=[Image.open(requests.get(image_url, stream=True).raw)],
- text="Describe this image."
+ text=text
  )
 
  # move inputs to the correct device and make a batch of size 1
@@ -50,7 +50,6 @@ inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}
  inputs["images"] = inputs["images"].to(model.dtype)
 
  # generate output; maximum 200 new tokens; stop generation when <|endoftext|> is generated
- # with torch.autocast(device_type="cuda", enabled=True, dtype=torch.bfloat16):
  output = model.generate_from_batch(
  inputs,
  GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
@@ -73,19 +72,20 @@ print(generated_text)
  image_url = "http://images.cocodataset.org/train2017/000000411975.jpg"
  text = "How many people are there on the baseball field in the picture??"
  ##INT4:
- ## The image captures a lively scene on a baseball field, where a man in a blue shirt and khaki shorts is teaching a young girl how to play baseball. The girl, dressed in a white shirt and blue pants, is intently focused on holding a baseball bat. Nearby, a woman in a light blue shirt and blue jeans is bending down, likely offering guidance. The field is a mix of green grass and brown dirt, with a white line marking the edge of the grass. In the foreground, a man in a blue shirt stands observing the lesson, while two other individuals, one in a white shirt and another in a blue shirt, are seated on the ground, watching the interaction. The scene is set against the backdrop of a baseball diamond, with the pitcher's mound and base paths visible, adding context to the baseball lesson taking place.
+ ## Counting the <points x1="46.5" y1="37.1" x2="58.6" y2="48.3" x3="76.5" y3="33.0" alt="people on the baseball field">people on the baseball field</points> shows a total of 3.
 
  ##FP32:
- ## The image captures a lively scene on a baseball field, where a man, woman, and child are engaged in a game of baseball. The man, dressed in a black and white striped polo shirt, khaki cargo shorts, and sneakers, stands on the right side of the image. He is holding a microphone and appears to be calling out, possibly acting as an umpire or coach. The woman, wearing a light blue shirt and blue jeans, is bending over to hand a bat to the child, who is dressed in a white shirt and gray pants. The child is positioned in the middle of the field, ready to take a swing. In the foreground, a man in a blue shirt stands on the edge of the dirt area, observing the scene. Additionally, two other individuals are visible in the bottom left corner, also watching the action unfold. The background features the green grass of the outfield and the dirt of the infield, with a white line marking the edge of the grass.
+ ## Counting the <points x1="46.5" y1="37.6" x2="58.5" y2="49.0" x3="76.0" y3="33.1" alt="people on the baseball field">people on the baseball field</points> shows a total of 3.
 
 
  image_url = "https://intelcorp.scene7.com/is/image/intelcorp/processor-overview-framed-badge:1920-1080?wid=480&hei=270"
  text = "Which company does this image represent?"
  ##INT4:
- ## The image features a rectangular Intel logo set against a gradient blue background. The background transitions from a dark blue in the upper left corner to a lighter blue in the bottom right. The logo itself is composed of three nested squares. The outermost square is a light blue, followed by a slightly darker blue square, and finally, a medium blue square at the center. The medium blue square contains the text "Intel Inside" in white, with "Intel" positioned above "Inside." The word "Intel" is in a larger font, while "Inside" is slightly smaller. Additionally, there is a small trademark symbol (™) next to the "E" in "Intel." The overall design is clean and professional, emphasizing the brand's identity through its consistent use of blue tones and the iconic Intel logo.
+ ## The image represents Intel, a well-known technology company. The logo features the text "Intel" in white lowercase letters, followed by "INSIDE" in uppercase letters. This iconic logo design is instantly recognizable and has been a symbol of Intel's brand for many years.
 
  ##FP32:
- ## The image features a rectangular Intel logo set against a gradient blue background. The background transitions from a dark blue in the top left corner to a lighter blue in the bottom right. The logo itself is composed of three nested squares. The outermost square is a light blue rectangle with a smaller square missing from the bottom right corner. Inside this rectangle is a darker blue square, and within that is the smallest square, which contains the text "Intel Inside" in white. The word "Intel" is positioned above "Inside," with "Intel" being slightly larger. The overall design is clean and minimalistic, emphasizing the iconic Intel branding.
+ ## The image represents Intel, a well-known technology company. The logo features the text "Intel" in white lowercase letters, with "INSIDE" in uppercase letters below it. This iconic logo design is instantly recognizable and associated with Intel's brand in the computer industry.
+
  ```
 
  ### Generate the model
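
Note on the updated baseball example: with `text=text` the model now answers the counting question and returns a `<points ...>` tag in place of a free-form description. The sketch below is not part of the README. It shows one way to read the coordinates out of such an answer and to compare the INT4 and FP32 outputs quoted in the diff. The helper names `parse_points`, `to_pixels`, and `max_shift` are hypothetical, and the pixel conversion assumes the coordinates are percentages of the image width and height, which the README does not state.

```python
import re

# Matches attributes such as x1="46.5" or y3="33.0" inside a <points ...> tag.
_COORD = re.compile(r'\b([xy])(\d+)="([-+]?\d+(?:\.\d+)?)"')


def parse_points(answer: str) -> list[tuple[float, float]]:
    """Return the (x, y) pairs from the first <points ...> tag in `answer`."""
    tag = re.search(r"<points\b([^>]*)>", answer)
    if tag is None:
        return []
    xs: dict[int, float] = {}
    ys: dict[int, float] = {}
    for axis, index, value in _COORD.findall(tag.group(1)):
        (xs if axis == "x" else ys)[int(index)] = float(value)
    # Keep only indices that carry both coordinates, in index order.
    return [(xs[i], ys[i]) for i in sorted(xs) if i in ys]


def to_pixels(points, width, height):
    """Scale points to pixels. Assumption: inputs are percentages (0-100)."""
    return [(x / 100.0 * width, y / 100.0 * height) for x, y in points]


def max_shift(a, b):
    """Largest per-coordinate difference between two equally long point lists."""
    return max(max(abs(ax - bx), abs(ay - by)) for (ax, ay), (bx, by) in zip(a, b))


# Answers copied from the "+" lines of the diff.
INT4_ANSWER = (
    'Counting the <points x1="46.5" y1="37.1" x2="58.6" y2="48.3" x3="76.5" '
    'y3="33.0" alt="people on the baseball field">people on the baseball '
    "field</points> shows a total of 3."
)
FP32_ANSWER = (
    'Counting the <points x1="46.5" y1="37.6" x2="58.5" y2="49.0" x3="76.0" '
    'y3="33.1" alt="people on the baseball field">people on the baseball '
    "field</points> shows a total of 3."
)

int4_points = parse_points(INT4_ANSWER)
fp32_points = parse_points(FP32_ANSWER)
print(len(int4_points), len(fp32_points))            # 3 3
print(round(max_shift(int4_points, fp32_points), 1))  # 0.7
```

On these two answers the quantized and full-precision models agree on the count, and no coordinate differs by more than 0.7 in the tag's own units.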