OPEA/Molmo-7B-D-0924-int4-sym-inc

+---
+license: apache-2.0
+datasets:
+- NeelNanda/pile-10k
+base_model:
+- allenai/Molmo-7B-D-0924
+---
+## Model Details
+This is an int4 model with group_size 128 and symmetric quantization of [allenai/Molmo-7B-D-0924](https://huggingface.co/allenai/Molmo-7B-D-0924), generated by [intel/auto-round](https://github.com/intel/auto-round). Load the model with `revision="e64d453"` to use the AutoGPTQ format.
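A minimal sketch of pinning that revision when loading, assuming a GPTQ-capable backend is installed alongside `transformers`. The helper names `gptq_load_kwargs` and `load_gptq_model_and_processor` are hypothetical and exist only for this illustration; the heavy imports are deferred so nothing is downloaded until the loader is called.

```python
# Repository and revision as stated in the model details above.
QUANTIZED_MODEL_PATH = "OPEA/Molmo-7B-D-0924-int4-sym-inc"
GPTQ_REVISION = "e64d453"


def gptq_load_kwargs():
    """Keyword arguments shared by the processor and model loaders (hypothetical helper)."""
    return {
        "trust_remote_code": True,
        "torch_dtype": "auto",
        "device_map": "auto",
        # `revision` selects a specific commit or branch of the Hub repository.
        "revision": GPTQ_REVISION,
    }


def load_gptq_model_and_processor():
    """Load the AutoGPTQ-format checkpoint (hypothetical helper; downloads weights)."""
    # Imported lazily so defining this function does not require transformers.
    from transformers import AutoModelForCausalLM, AutoProcessor

    kwargs = gptq_load_kwargs()
    processor = AutoProcessor.from_pretrained(QUANTIZED_MODEL_PATH, **kwargs)
    model = AutoModelForCausalLM.from_pretrained(QUANTIZED_MODEL_PATH, **kwargs)
    return model, processor
```

Calling `load_gptq_model_and_processor()` returns a `(model, processor)` pair that can be used in place of the two loading steps in the inference example.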
+## How To Use
+### INT4 Inference
+```python
+from auto_round import AutoRoundConfig ## must import for auto-round format
+from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig
+from PIL import Image
+import requests
+quantized_model_path = "OPEA/Molmo-7B-D-0924-int4-sym-inc"
+# load the processor
+processor = AutoProcessor.from_pretrained(
+    quantized_model_path,
+    trust_remote_code=True,
+    torch_dtype='auto',
+    device_map='auto'
+)
+# load the model
+model = AutoModelForCausalLM.from_pretrained(
+    quantized_model_path,
+    trust_remote_code=True,
+    torch_dtype='auto',
+    device_map='auto'
+)
+image_url = "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"
+text = "Describe this image."
+# process the image and text
+inputs = processor.process(
+    images=[Image.open(requests.get(image_url, stream=True).raw)],
+    text=text
+)
+# move inputs to the correct device and make a batch of size 1
+inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}
+inputs["images"] = inputs["images"].to(model.dtype)
+# generate output; maximum 200 new tokens; stop generation when <|endoftext|> is generated
+# with torch.autocast(device_type="cuda", enabled=True, dtype=torch.bfloat16):
+output = model.generate_from_batch(
+    inputs,
+    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
+    tokenizer=processor.tokenizer
+)
+# only get generated tokens; decode them to text
+generated_tokens = output[0,inputs['input_ids'].size(1):]
+generated_text = processor.tokenizer.decode(generated_tokens, skip_special_tokens=True)
+# print the generated text
+print(generated_text)
+##INT4:
+## In this serene beach scene, a woman with long, dark hair sits on the sandy shore, facing left. She is dressed in a plaid shirt with rolled-up sleeves, black pants, and sandals. Her eyes are closed, and she is smiling warmly as she reaches out to high-five a large, light brown dog. The dog, possibly a Labrador or a similar breed, sits on its hind legs with its front paws raised, eagerly engaging in the friendly gesture. The dog is adorned with a blue harness featuring a pattern of pink and blue flowers, and a red leash lies on the sand beside it. The beach is calm, with gentle waves lapping at the shore, and the sky above is a clear, light blue. The sun is setting, casting a soft, warm glow over the scene, enhancing the tranquil and joyful atmosphere.
+##BF16:
+## In this serene beach scene, a woman and her dog share a tender moment. The woman, with long dark hair, is seated on the sandy shore, her legs crossed as she faces the ocean. She is wearing a plaid shirt with rolled-up sleeves, black pants, and sandals. Her eyes are closed, and she is smiling warmly at her canine companion. The dog, a light brown Labrador, sits beside her with its front paws raised, eagerly reaching out to touch her hand. The dog is adorned with a blue harness decorated with pink and green flowers, and a red leash lies on the sand nearby. The beach is calm, with gentle waves lapping at the shore, and the sky above is a clear, light blue. The sun is setting, casting a soft, golden glow over the scene, enhancing the peaceful and joyful atmosphere.
+image_url = "http://images.cocodataset.org/train2017/000000411975.jpg"
+text = "How many people are there on the baseball field in the picture?"
+##INT4:
+## The image captures a lively scene on a baseball field, where a man in a blue shirt and khaki shorts is teaching a young girl how to play baseball. The girl, dressed in a white shirt and blue pants, is intently focused on holding a baseball bat. Nearby, a woman in a light blue shirt and blue jeans is bending down, likely offering guidance. The field is a mix of green grass and brown dirt, with a white line marking the edge of the grass. In the foreground, a man in a blue shirt stands observing the lesson, while two other individuals, one in a white shirt and another in a blue shirt, are seated on the ground, watching the interaction. The scene is set against the backdrop of a baseball diamond, with the pitcher's mound and base paths visible, adding context to the baseball lesson taking place.
+##BF16:
+## The image captures a lively scene on a baseball field, where a man, woman, and child are engaged in a game of baseball. The man, dressed in a black and white striped polo shirt, khaki cargo shorts, and sneakers, stands on the right side of the image. He is holding a microphone and appears to be calling out, possibly acting as an umpire or coach. The woman, wearing a light blue shirt and blue jeans, is bending over to hand a bat to the child, who is dressed in a white shirt and gray pants. The child is positioned in the middle of the field, ready to take a swing. In the foreground, a man in a blue shirt stands on the edge of the dirt area, observing the scene. Additionally, two other individuals are visible in the bottom left corner, also watching the action unfold. The background features the green grass of the outfield and the dirt of the infield, with a white line marking the edge of the grass.
+image_url = "https://intelcorp.scene7.com/is/image/intelcorp/processor-overview-framed-badge:1920-1080?wid=480&hei=270"
+text = "Which company does this image represent?"
+##INT4:
+## The image features a rectangular Intel logo set against a gradient blue background. The background transitions from a dark blue in the upper left corner to a lighter blue in the bottom right. The logo itself is composed of three nested squares. The outermost square is a light blue, followed by a slightly darker blue square, and finally, a medium blue square at the center. The medium blue square contains the text "Intel Inside" in white, with "Intel" positioned above "Inside." The word "Intel" is in a larger font, while "Inside" is slightly smaller. Additionally, there is a small trademark symbol (™) next to the "E" in "Intel." The overall design is clean and professional, emphasizing the brand's identity through its consistent use of blue tones and the iconic Intel logo.
+##BF16:
+## The image features a rectangular Intel logo set against a gradient blue background. The background transitions from a dark blue in the top left corner to a lighter blue in the bottom right. The logo itself is composed of three nested squares. The outermost square is a light blue rectangle with a smaller square missing from the bottom right corner. Inside this rectangle is a darker blue square, and within that is the smallest square, which contains the text "Intel Inside" in white. The word "Intel" is positioned above "Inside," with "Intel" being slightly larger. The overall design is clean and minimalistic, emphasizing the iconic Intel branding.
+```
+### Generate the model
+Here is the sample command to reproduce the model.
+```bash
+pip install auto-round
+auto-round-mllm \
+--model allenai/Molmo-7B-D-0924 \
+--device 0 \
+--group_size 128 \
+--bits 4 \
+--iters 1000 \
+--nsample 512 \
+--seqlen 2048 \
+--format 'auto_gptq,auto_round' \
+--output_dir "./tmp_autoround"
+```
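To illustrate what `--bits 4` and `--group_size 128` with symmetric quantization mean for the stored weights, here is a rough sketch in plain Python. It uses simple round-to-nearest within each group, which is an assumption made for illustration only: AutoRound itself tunes the rounding via signed gradient descent, and its exact scale computation may differ. All function names are hypothetical.

```python
import random


def quantize_group_sym(weights, bits=4):
    """Symmetric round-to-nearest quantization of one group (illustrative only)."""
    qmax = 2 ** (bits - 1) - 1   # 7 for int4
    qmin = -(2 ** (bits - 1))    # -8 for int4
    max_abs = max(abs(w) for w in weights)
    # Symmetric: a single scale per group and no zero point.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return q, scale


def dequantize_group(q, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [v * scale for v in q]


def quantize_row(row, group_size=128, bits=4):
    """Split a weight row into consecutive groups and quantize each one separately."""
    return [
        quantize_group_sym(row[start:start + group_size], bits)
        for start in range(0, len(row), group_size)
    ]


# Toy weight row: 256 values, so two groups of 128.
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(256)]

groups = quantize_row(row)
restored = [x for q, scale in groups for x in dequantize_group(q, scale)]
max_err = max(abs(a - b) for a, b in zip(row, restored))
print(f"groups: {len(groups)}, max abs reconstruction error: {max_err:.6f}")
```

Each group keeps its own scale, so an outlier weight only coarsens the resolution of the 128 values that share its group rather than the whole row.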
+## Ethical Considerations and Limitations
+The model can produce factually incorrect output, and should not be relied on to produce factually accurate information. Because of the limitations of the pretrained model and the finetuning datasets, it is possible that this model could generate lewd, biased or otherwise offensive outputs.
+Therefore, before deploying any applications of the model, developers should perform safety testing.
+## Caveats and Recommendations
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
+Here is a useful link to learn more about Intel's AI software:
+- [Intel Neural Compressor](https://github.com/intel/neural-compressor)
+## Disclaimer
+The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.
+## Cite
+@article{cheng2023optimize,
+  title={Optimize weight rounding via signed gradient descent for the quantization of llms},
+  author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi},
+  journal={arXiv preprint arXiv:2309.05516},
+  year={2023}
+}
+
+[arxiv](https://arxiv.org/abs/2309.05516) [github](https://github.com/intel/auto-round)