flash_attention_2
Skywork requires attn_implementation="flash_attention_2" for performance to not degrade, but your example doesn't set it. Is this an oversight, or was it trained differently?
Update: leaving it out does seem to cause a huge regression.
This was an oversight in the Readme. Thank you for pointing it out. I changed it accordingly.
Something still seems a little off: on https://huggingface.co/datasets/allenai/reward-bench-cleaned-preview I get only 84.5% for this model.
Could you give me the scores for a few samples so I can double check?
Here are the scores for the first 15 examples when I run
python scripts/run_rm.py --model nicolinho/QRM-Gemma-2-27B --batch_size=1 --torch_dtype=bfloat16 --attn_implementation=flash_attention_2 --max_length 4096 --not_quantized --trust_remote_code
from the reward bench directory.
chosen 5.187629699707031 rejected 4.093376636505127
chosen 4.27589750289917 rejected 3.931354284286499
chosen 5.230346202850342 rejected 3.324819803237915
chosen 5.0256853103637695 rejected 2.837146520614624
chosen 4.908050537109375 rejected 3.953972816467285
chosen 4.762012481689453 rejected 2.5532610416412354
chosen 4.275672912597656 rejected 3.059843063354492
chosen 4.620450496673584 rejected 2.8169329166412354
chosen 4.7339911460876465 rejected 2.9631142616271973
chosen 4.600727081298828 rejected 3.42516827583313
chosen 4.883720397949219 rejected 3.3531689643859863
chosen 5.189420700073242 rejected 3.055527925491333
chosen 4.761991500854492 rejected 2.137838363647461
chosen 4.84368896484375 rejected 2.9185094833374023
chosen 4.883677005767822 rejected 4.093379497528076
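As a quick sanity check on the numbers above, here is a minimal sketch of pairwise accuracy: the fraction of pairs where the chosen score is strictly higher than the rejected one. The function name `pairwise_accuracy` is just for illustration, the scores are the 15 pairs quoted above rounded to four decimals, and the benchmark's reported number is aggregated over the full dataset, so this only shows how this small sample is ranked.

```python
def pairwise_accuracy(pairs):
    """Fraction of (chosen, rejected) score pairs where chosen is strictly higher."""
    if not pairs:
        raise ValueError("need at least one (chosen, rejected) pair")
    wins = sum(1 for chosen, rejected in pairs if chosen > rejected)
    return wins / len(pairs)


# The 15 (chosen, rejected) pairs from the post above, rounded to 4 decimals.
first_15 = [
    (5.1876, 4.0934), (4.2759, 3.9314), (5.2303, 3.3248),
    (5.0257, 2.8371), (4.9081, 3.9540), (4.7620, 2.5533),
    (4.2757, 3.0598), (4.6205, 2.8169), (4.7340, 2.9631),
    (4.6007, 3.4252), (4.8837, 3.3532), (5.1894, 3.0555),
    (4.7620, 2.1378), (4.8437, 2.9185), (4.8837, 4.0934),
]

print(f"accuracy on first 15 pairs: {pairwise_accuracy(first_15):.3f}")
```

All 15 pairs here are ranked correctly, so this sample alone does not show where the 84.5% comes from; comparing the raw scores against a reference run is more informative than the win rate.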
Which transformers version are you using?
Thank you. I'm on a recent checkout of the transformers repo, newer than the latest release, though I don't think breaking changes there are the cause.
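When comparing scores across setups it can help to post exact versions. The following is a small sketch using only the standard library; the helper name `installed_versions` and the package list are assumptions for illustration (in particular, `flash-attn` is assumed to be the distribution name of the flash attention package in your environment).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("transformers", "torch", "flash-attn")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in this environment.
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

For a source checkout, the commit hash is worth noting alongside the reported version, since the version string alone may not identify the exact code.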
Scoring the first sample's chosen completion on its own:
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

TEST_SAMPLE = [
    {'role': 'user', 'content': "How do I detail a car?"},
    {'role': 'assistant', 'content': "Detailing a car involves a thorough cleaning inside and out, as well as polishing and waxing to protect the vehicle's surfaces. Here's a step-by-step guide to detailing a car:\n\n**Exterior Detailing:**\n\n1. **Wash the Car:**\n - Rinse the car with water to remove loose dirt.\n - Use a car wash soap and microfiber wash mitt to clean the car from top to bottom.\n - Clean the wheels and tires with a brush and a wheel cleaner.\n - Rinse the car thoroughly to remove all soap.\n\n2. **Dry the Car:**\n - Use a microfiber towel or a chamois to dry the car to prevent water spots.\n\n3. **Clay Bar Treatment:**\n - Use a clay bar with a lubricant to remove embedded surface contaminants from the paint.\n\n4. **Polishing:**\n - Apply car polish with a dual-action polisher or by hand to correct paint imperfections and create a smooth surface.\n\n5. **Waxing:**\n - Apply a coat of wax or paint sealant to protect the paint and give it a glossy finish.\n\n6. **Windows and Mirrors:**\n - Clean the windows and mirrors with a glass cleaner and a microfiber towel.\n\n7. **Tire and Trim Dressing:**\n - Apply a tire dressing to the tires for a shiny finish.\n - Use a trim restorer or protectant on plastic and rubber parts to prevent fading.\n\n**Interior Detailing:**\n\n1. **Remove Trash:**\n - Clear out any trash and remove personal items from the car.\n\n2. **Vacuum:**\n - Vacuum the seats, carpets, floor mats, and trunk.\n - Use a brush attachment for the dashboard and door panels.\n\n3. **Shampoo Carpets and Upholstery:**\n - Use a carpet cleaner and a brush to clean the carpets and upholstery.\n - For leather interiors, use a leather cleaner and conditioner.\n\n4. **Clean Hard Surfaces:**\n - Wipe down all hard surfaces (dashboard, center console, door panels, etc.) with a mild all-purpose cleaner and a microfiber cloth.\n\n5. **Windows and Mirrors:**\n - Clean the interior side of windows and mirrors.\n\n6. **Air Vents and Crevices:**\n - Use a detailing brush or compressed air to clean out air vents and hard-to-reach crevices.\n\n7. **Final Touches:**\n - Apply a protectant to the dashboard and other plastic components.\n - Replace air fresheners if needed.\n\n**Additional Tips:**\n\n- Work in the shade or a cool, well-ventilated garage to prevent products from drying too quickly and leaving residue.\n- Use separate buckets for washing and rinsing to avoid contaminating the clean water with dirt.\n- Always use gentle, non-abrasive materials and cleaners specifically designed for automotive use to avoid damaging surfaces.\n- Move in a systematic way to ensure you don't miss any spots.\n\nBy following these steps, you'll give your car a thorough clean that not only makes it look great but also helps in maintaining its value. Remember, regular detailing can prevent wear and tear and keep your car looking new for years to come."}
]
path = "nicolinho/QRM-Gemma-2-27B"
model = AutoModelForSequenceClassification.from_pretrained(
    path,
    device_map="cuda",  # 1xH100
    trust_remote_code=True,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(path, use_fast=True)
# Render the chat once as text (for inspection) and once as token ids (for the model).
formatted_chat = tokenizer.apply_chat_template(TEST_SAMPLE, tokenize=False)
input_ids = tokenizer.apply_chat_template(TEST_SAMPLE, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model(input_ids)
    reward = output.score.cpu().float()
print(f"TEST SAMPLE:{formatted_chat!r}")
print(f"TEST TOKENS:{input_ids!r}")
print(f"TEST REWARD: {reward}")
gives
TEST TOKENS:tensor([[ 2, 106, 1645, 108, 2299, 749, 590, 8637, 476,
...
2745, 861, 1226, 3648, 888, 604, 1658, 577, 2063, 235265, 107, 108]],
device='cuda:0')
TEST SAMPLE:"<bos><start_of_turn>user\nHow do I detail a car?<end_of_turn>\n<start_of_turn>model\nDetailing a car involves a thorough cleaning inside and out, as well as polishing and waxing to protect the vehicle's surfaces. Here's a step-by-step guide to detailing a car:\n\n**Exterior Detailing:**\n\n1. **Wash the Car:**\n - Rinse the car with water to remove loose dirt.\n - Use a car wash soap and microfiber wash mitt to clean the car from top to bottom.\n - Clean the wheels and tires with a brush and a wheel cleaner.\n - Rinse the car thoroughly to remove all soap.\n\n2. **Dry the Car:**\n - Use a microfiber towel or a chamois to dry the car to prevent water spots.\n\n3. **Clay Bar Treatment:**\n - Use a clay bar with a lubricant to remove embedded surface contaminants from the paint.\n\n4. **Polishing:**\n - Apply car polish with a dual-action polisher or by hand to correct paint imperfections and create a smooth surface.\n\n5. **Waxing:**\n - Apply a coat of wax or paint sealant to protect the paint and give it a glossy finish.\n\n6. **Windows and Mirrors:**\n - Clean the windows and mirrors with a glass cleaner and a microfiber towel.\n\n7. **Tire and Trim Dressing:**\n - Apply a tire dressing to the tires for a shiny finish.\n - Use a trim restorer or protectant on plastic and rubber parts to prevent fading.\n\n**Interior Detailing:**\n\n1. **Remove Trash:**\n - Clear out any trash and remove personal items from the car.\n\n2. **Vacuum:**\n - Vacuum the seats, carpets, floor mats, and trunk.\n - Use a brush attachment for the dashboard and door panels.\n\n3. **Shampoo Carpets and Upholstery:**\n - Use a carpet cleaner and a brush to clean the carpets and upholstery.\n - For leather interiors, use a leather cleaner and conditioner.\n\n4. **Clean Hard Surfaces:**\n - Wipe down all hard surfaces (dashboard, center console, door panels, etc.) with a mild all-purpose cleaner and a microfiber cloth.\n\n5. **Windows and Mirrors:**\n - Clean the interior side of windows and mirrors.\n\n6. **Air Vents and Crevices:**\n - Use a detailing brush or compressed air to clean out air vents and hard-to-reach crevices.\n\n7. **Final Touches:**\n - Apply a protectant to the dashboard and other plastic components.\n - Replace air fresheners if needed.\n\n**Additional Tips:**\n\n- Work in the shade or a cool, well-ventilated garage to prevent products from drying too quickly and leaving residue.\n- Use separate buckets for washing and rinsing to avoid contaminating the clean water with dirt.\n- Always use gentle, non-abrasive materials and cleaners specifically designed for automotive use to avoid damaging surfaces.\n- Move in a systematic way to ensure you don't miss any spots.\n\nBy following these steps, you'll give your car a thorough clean that not only makes it look great but also helps in maintaining its value. Remember, regular detailing can prevent wear and tear and keep your car looking new for years to come.<end_of_turn>\n"
TEST REWARD: tensor([[4.2351]])
and not 5.1876, even though text_chosen matches. Very strange!
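One way to narrow this down, assuming the gap comes from the inputs rather than from the model itself, is to dump the token ids from both code paths and find where they first diverge. The sketch below is plain Python; `first_mismatch` is a hypothetical helper, and the second id list is made up for illustration (a duplicated BOS token, id 2), not something observed in this thread.

```python
def first_mismatch(ids_a, ids_b):
    """Return the index of the first differing position, or None if identical."""
    for i, (a, b) in enumerate(zip(ids_a, ids_b)):
        if a != b:
            return i
    if len(ids_a) != len(ids_b):
        # One sequence is a strict prefix of the other.
        return min(len(ids_a), len(ids_b))
    return None


# First few ids from the printed tensor above.
direct = [2, 106, 1645, 108, 2299]
# Hypothetical ids from another pipeline, with a doubled BOS for illustration.
from_pipeline = [2, 2, 106, 1645, 108, 2299]

idx = first_mismatch(direct, from_pipeline)
print("identical" if idx is None else f"first difference at position {idx}")
```

If the ids turn out to be identical, the difference would have to come from elsewhere, for example padding, attention masks, dtype, or truncation settings.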
By the way, is the max length 8192 (max position embeddings in the model config) or 4096 (tokenizer config)?