WSDM Cup 2023 BERT Checkpoints:

Paper released

Please refer to our paper for details of our approach in this competition:

Method Overview

  • Pre-train BERT with MLM and CTR prediction losses (or a multi-task CTR prediction loss).
  • Fine-tune BERT with a pairwise ranking loss.
  • Obtain prediction scores from the different BERT models.
  • Use ensemble learning to combine the BERT features with sparse features.
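
The first two steps above can be illustrated with a minimal, dependency-free sketch of the loss terms. This is not the released training code: the function names, the `ctr_weight` parameter, the unweighted sum of the MLM and CTR terms, and the choice of a logistic (RankNet-style) pairwise loss are all assumptions made for illustration, since the exact formulation is not given here.

```python
import math


def binary_cross_entropy(logit, label):
    """Numerically stable binary cross-entropy on one logit; label is 0 or 1."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def masked_lm_loss(token_log_probs):
    """Mean negative log-likelihood of the true tokens at masked positions."""
    return -sum(token_log_probs) / len(token_log_probs)


def pretrain_loss(token_log_probs, ctr_logit, clicked, ctr_weight=1.0):
    """MLM loss plus a weighted click (CTR) prediction loss.

    ctr_weight is a hypothetical knob; the actual weighting is not stated.
    """
    return masked_lm_loss(token_log_probs) + ctr_weight * binary_cross_entropy(
        ctr_logit, clicked
    )


def pairwise_ranking_loss(score_pos, score_neg):
    """Logistic pairwise loss, log(1 + exp(-(score_pos - score_neg))).

    Assumed form; it is small when the preferred document scores higher.
    """
    diff = score_pos - score_neg
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
```

For example, `pairwise_ranking_loss(2.0, 0.0)` is smaller than `pairwise_ranking_loss(0.0, 2.0)`, so minimizing it pushes the preferred document's score above the other's.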

Details will be updated in the submission paper.

BERT features:

1) Model details: Checkpoints Download Here
| Index | Model Flag | Method | Pretrain step | Finetune step | DCG on leaderboard |
|---|---|---|---|---|---|
| 1 | large_group2_wwm_from_unw4625K | M1 | 1700K | 5130 | 11.96214 |
| 2 | large_group2_wwm_from_unw4625K | M1 | 1700K | 5130 | NAN |
| 3 | base_group2_wwm | M2 | 2150K | 5130 | ~11.32363 |
| 4 | large_group2_wwm_from_unw4625K | M1 | 590K | 5130 | 11.94845 |
| 5 | large_group2_wwm_from_unw4625K | M1 | 1700K | 4180 | NAN |
| 6 | large_group2_mt_pretrain | M3 | 1940K | 5130 | NAN |
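
For readers unfamiliar with the leaderboard metric, here is a small sketch of discounted cumulative gain (DCG). The exact gain function and cutoff used by the leaderboard are not stated here, so the exponential gain `2^rel - 1`, the `log2` discount, and the default `k=10` are assumptions, and `dcg_at_k` is a hypothetical helper name.

```python
import math


def dcg_at_k(relevances, k=10):
    """DCG over the top-k relevance labels, given in ranked order.

    Assumes gain 2^rel - 1 and discount log2(position + 1), positions from 1.
    """
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )
```

Placing a highly relevant document first gives a larger value than placing it second, for instance `dcg_at_k([3, 0]) > dcg_at_k([0, 3])`.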
2) Method details
| Method | Model Layers | Details |
|---|---|---|
| M1 | 24 | WWM & CTR prediction as pretraining tasks |
| M2 | 12 | WWM & CTR prediction as pretraining tasks |
| M3 | 24 | WWM & Multi-task CTR prediction as pretraining tasks |
