Models: 270

Active filters: reward-trainer

mnoukhov/pythia410m-rm-tldr6.9b

Text Classification • Updated Jun 20 • 191

trl-internal-testing/rm_160m

Text Classification • Updated Jun 20 • 7

vwxyzjn/rm_1b

Text Classification • Updated Jun 20

trl-internal-testing/rm_sentiment_1b

Text Classification • Updated Jun 25 • 7

SiMajid/value_reward_modeling

Text Classification • Updated Jun 21 • 9

SiMajid/deberta_value

Text Classification • Updated Jun 22 • 7

SiMajid/xlm-roberta-base

Text Classification • Updated Jun 21 • 9

SiMajid/opt-350-value

Text Classification • Updated Jun 22 • 10

trl-internal-testing/rm_descriptiveness_1b

Text Classification • Updated Jun 25 • 6

trl-internal-testing/rm_hh_1b

Text Classification • Updated Jun 26 • 9

trl-internal-testing/rm_tldr_1b

Text Classification • Updated Jun 26 • 8

smohammadi/tinyllama_rm_sentiment_1b

Text Classification • Updated Jun 28 • 9

prometheus04/tinystarcoder-rlhf-model

Text Generation • Updated Jun 29 • 12

Baidicoot/reward_modeling

Updated Jul 2 • 3

Baidicoot/gemma-2b-jailbreak-RM

Updated Jul 2 • 7 • 1

mnoukhov/pythia160m-rm-tldr6.9b

Text Classification • Updated Jul 4 • 78

mnoukhov/pythia1b-rm-tldr6.9b

Text Classification • Updated Jul 3 • 11

blai88/reward_modeling_anthropic_hh

Updated Jul 6 • 3

mnoukhov/pythia2.8b-rm-tldr6.9b

Text Classification • Updated Jul 7 • 8

steve-sli/0721_185958-google-gemma-2b

steve-sli/0721_201833-google-gemma-2b

steve-sli/0721_210648-google-gemma-2b

steve-sli/0721_210856-google-gemma-2b

steve-sli/0721_211205-google-gemma-2b

steve-sli/0721_222324-google-gemma-2b

Updated Jul 21 • 1

SiMajid/value-reward-model-opt-350m-v3

Text Classification • Updated Jul 23 • 8

SiMajid/value-reward-model-opt-350m-v11

Text Classification • Updated Jul 25 • 10

SiMajid/value-reward-model-opt-350m-v12

Text Classification • Updated Jul 25 • 10

Penghaoo/workspace

Updated Jul 25 • 4

ChokeGM/train_dir

Text Classification • Updated Jul 26 • 5