license: apache-2.0

A model fine-tuned with DPO (Direct Preference Optimization) and LoRA on a preference dataset.
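For readers unfamiliar with DPO, the following is a minimal sketch of the standard per-pair DPO loss, not the code used by the trainer linked in this card. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for illustration; this card does not state the beta used in training.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single (chosen, rejected) pair.

    Each argument is the summed log-probability of a response under the
    policy (the model being tuned) or the frozen reference model.
    beta is an assumed value; the model card does not specify it.
    """
    # How much more the policy prefers each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(x)) written in a numerically stable form.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))
```

The loss equals log 2 when the policy and reference agree, and it falls as the policy raises the chosen response relative to the rejected one.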

LoRA Experiment

RWKV-5.2-3b-World-DPO is the model with the LoRA weights merged into the base model.

Base Model

RWKV-5-World-3B-v2-20231113-ctx4096

Parameters: LoRA rank 8, LoRA alpha 16, context length 4096, epoch 19
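To show how the rank and alpha values enter a merged checkpoint, here is a minimal sketch of the usual LoRA merge rule, W' = W + (alpha / rank) * B A, in plain Python. It assumes the conventional alpha / rank scaling (which would be 16 / 8 = 2 here); the linked trainer's actual merge script may differ. The function `merge_lora` and the matrix layout are hypothetical.

```python
def merge_lora(W, A, B, rank=8, alpha=16):
    """Return W + (alpha / rank) * (B @ A) as a new nested list.

    W: out_dim x in_dim base weight
    B: out_dim x rank   LoRA "up" matrix
    A: rank x in_dim    LoRA "down" matrix
    Assumes the conventional alpha / rank scaling factor.
    """
    scale = alpha / rank
    merged = [row[:] for row in W]  # copy so the base weight is untouched
    for i in range(len(W)):
        for j in range(len(W[0])):
            # (B @ A)[i][j], the low-rank update for this entry
            delta = sum(B[i][k] * A[k][j] for k in range(rank))
            merged[i][j] += scale * delta
    return merged
```

Once merged this way, the model has the same shape as the base model and can be loaded without any LoRA-specific code.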

Dataset: 1,000 randomly chosen pairs from https://huggingface.co./datasets/HuggingFaceH4/ultrafeedback_binarized
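One way to draw such a random subset reproducibly is sketched below. This is an illustration only: the card does not say how the pairs were selected, and the function `sample_pair_indices`, the `seed` argument, and the `dataset_size` parameter are all hypothetical.

```python
import random

def sample_pair_indices(dataset_size, n_pairs=1000, seed=0):
    """Pick n_pairs distinct row indices from a dataset of dataset_size rows.

    A fixed seed (an assumption, not from the model card) makes the
    selection repeatable across runs.
    """
    rng = random.Random(seed)
    # sample() draws without replacement, so no pair is chosen twice.
    return sorted(rng.sample(range(dataset_size), n_pairs))
```

The returned indices could then be used to select the corresponding chosen/rejected rows from the preference dataset.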

Trainer: https://github.com/OpenMOSE/RWKV-LM-RLHF-DPO-LoRA
