# WSDM Cup 2023 BERT Checkpoints

This repo contains the checkpoints from our submissions to the WSDM Cup 2023 competition: Pre-training for Web Search and Unbiased Learning for Web Search.
## Papers released

Please refer to our papers for the details of each task:

- Task 1 (Unbiased Learning to Rank): Multi-Feature Integration for Perception-Dependent Examination-Bias Estimation
- Task 2 (Pre-training for Web Search): Pretraining De-Biased Language Model with Large-scale Click Logs for Document Ranking
## Method Overview

- Pre-train BERT with an MLM loss plus a CTR prediction loss (or a multi-task CTR prediction loss); see the pretraining sketch below.
- Fine-tune BERT with a pairwise ranking loss; see the loss sketch below.
- Obtain prediction scores from the different BERT variants.
- Combine the BERT features with sparse features via ensemble learning; see the ensemble sketch below.

Details will be provided in the submission paper.
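As a rough illustration of the pretraining objective, here is a minimal PyTorch sketch that combines a masked-language-modeling loss with a CTR (click) prediction loss. The base model name, head shapes, and loss weighting are assumptions for illustration, not the exact competition code.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class BertWithMlmAndCtr(nn.Module):
    """Sketch of joint MLM + CTR pretraining (head shapes are assumptions)."""

    def __init__(self, name: str = "bert-base-chinese", ctr_weight: float = 1.0):
        super().__init__()
        self.bert = BertModel.from_pretrained(name)
        hidden = self.bert.config.hidden_size
        self.mlm_head = nn.Linear(hidden, self.bert.config.vocab_size)
        self.ctr_head = nn.Linear(hidden, 1)  # click probability from [CLS]
        self.ctr_weight = ctr_weight

    def forward(self, input_ids, attention_mask, mlm_labels, clicks):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # MLM loss; whole-word masking (WWM) only changes how mlm_labels are
        # generated, not the loss itself. Unmasked positions carry label -100.
        mlm_logits = self.mlm_head(out.last_hidden_state)
        mlm_loss = nn.functional.cross_entropy(
            mlm_logits.view(-1, mlm_logits.size(-1)),
            mlm_labels.view(-1),
            ignore_index=-100,
        )
        # CTR prediction loss on the [CLS] representation.
        ctr_logits = self.ctr_head(out.last_hidden_state[:, 0]).squeeze(-1)
        ctr_loss = nn.functional.binary_cross_entropy_with_logits(
            ctr_logits, clicks.float()
        )
        return mlm_loss + self.ctr_weight * ctr_loss
```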
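For the fine-tuning step, a margin hinge over (positive, negative) document pairs is one common form of pairwise ranking loss: the preferred document of each pair should outscore the other by at least the margin. This is a sketch of that idea; the margin value and the exact loss used in the competition are assumptions.

```python
import torch

def pairwise_ranking_loss(pos_scores: torch.Tensor,
                          neg_scores: torch.Tensor,
                          margin: float = 1.0) -> torch.Tensor:
    """Margin hinge loss over (positive, negative) document pairs.

    pos_scores / neg_scores: relevance scores from the fine-tuned BERT for
    the preferred and non-preferred document of each pair.
    """
    return torch.clamp(margin - (pos_scores - neg_scores), min=0).mean()

# Usage with toy scores:
pos = torch.tensor([2.0, 0.5])
neg = torch.tensor([1.0, 1.5])
loss = pairwise_ranking_loss(pos, neg)  # tensor(1.0)
```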
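For the final ensemble, one plausible realization (an assumption, not the documented pipeline) is a learning-to-rank GBDT trained on the concatenated BERT scores and sparse features, e.g. LightGBM with the lambdarank objective:

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)

# Toy stand-ins: 1000 query-doc pairs, 6 BERT scores + 4 sparse features.
bert_scores = rng.normal(size=(1000, 6))   # scores from the 6 checkpoints
sparse_feats = rng.normal(size=(1000, 4))  # e.g. BM25, term overlap (assumed)
X = np.hstack([bert_scores, sparse_feats])
y = rng.integers(0, 5, size=1000)          # graded relevance labels
group = [50] * 20                          # 20 queries x 50 docs each

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=200)
ranker.fit(X, y, group=group)
scores = ranker.predict(X)                 # final ensemble ranking scores
```

A tree ensemble is a natural fit here because it handles heterogeneous inputs (dense neural scores alongside sparse statistics) without feature scaling.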
## BERT features

### 1) Model details (checkpoints: Download Here)
| Index | Model Flag | Method | Pretrain steps | Finetune steps | DCG on leaderboard |
|---|---|---|---|---|---|
| 1 | large_group2_wwm_from_unw4625K | M1 | 1700K | 5130 | 11.96214 |
| 2 | large_group2_wwm_from_unw4625K | M1 | 1700K | 5130 | N/A |
| 3 | base_group2_wwm | M2 | 2150K | 5130 | ~11.32363 |
| 4 | large_group2_wwm_from_unw4625K | M1 | 590K | 5130 | 11.94845 |
| 5 | large_group2_wwm_from_unw4625K | M1 | 1700K | 4180 | N/A |
| 6 | large_group2_mt_pretrain | M3 | 1940K | 5130 | N/A |
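The checkpoints are presumably standard PyTorch BERT weights; below is a minimal loading sketch under that assumption. The file name, config values, and `strict=False` handling of extra head weights are illustrative, not verified against the actual files.

```python
import torch
from transformers import BertConfig, BertModel

# Assumed layout: a 24-layer BERT-large config for methods M1/M3
# (use num_hidden_layers=12 and BERT-base sizes for method M2).
config = BertConfig(
    num_hidden_layers=24,
    hidden_size=1024,
    num_attention_heads=16,
    intermediate_size=4096,
)
model = BertModel(config)

# Hypothetical checkpoint file name; key names may need remapping.
state = torch.load("large_group2_wwm_from_unw4625K.bin", map_location="cpu")
model.load_state_dict(state, strict=False)  # pretraining heads may not match
model.eval()
```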
### 2) Method details
| Method | Model layers | Details |
|---|---|---|
| M1 | 24 | WWM & CTR prediction as pretraining tasks |
| M2 | 12 | WWM & CTR prediction as pretraining tasks |
| M3 | 24 | WWM & multi-task CTR prediction as pretraining tasks |
## Contacts

- Xiangsheng Li: [email protected]
- Xiaoshu Chen: [email protected]