Model Card for mlpf-clic-clusters-v1.9.0

This model reconstructs particles in a detector, based on the tracks and calorimeter clusters recorded by the detector.

Model Details

The performance is measured with respect to generator-level jets and MET computed from Pythia particles, i.e. the truth-level jets and MET.

[Figures, jet performance: ttbar jet resolution; qq jet resolution; ttbar jet resolution]
[Figures, MET performance: ttbar MET resolution; qq MET resolution; ttbar MET resolution]
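
As a minimal sketch of how a resolution with respect to truth-level quantities can be quantified, the snippet below computes the response (reconstructed value divided by truth-level value) and summarizes it by its median and by the interquartile range divided by the median. The function name `response_summary`, the already-matched input arrays, and the choice of IQR/median as the width estimator are assumptions for illustration; they are not necessarily what was used to produce the figures.

```python
import numpy as np


def response_summary(reco, truth):
    """Summarize the response reco/truth for matched objects.

    reco, truth: equal-length sequences, e.g. jet pT of matched
    reconstructed and truth-level jets, or per-event MET values.
    Returns (median response, IQR / median).
    """
    reco = np.asarray(reco, dtype=float)
    truth = np.asarray(truth, dtype=float)
    if reco.shape != truth.shape:
        raise ValueError("reco and truth must have the same shape")

    # Response is 1.0 for a perfectly reconstructed object.
    response = reco / truth

    # Median gives the scale; the interquartile range gives a
    # width that is robust against non-Gaussian tails.
    q25, q50, q75 = np.percentile(response, [25, 50, 75])
    return q50, (q75 - q25) / q50


if __name__ == "__main__":
    # Toy values only, not results from this model.
    scale, width = response_summary([8.0, 9.0, 10.0, 11.0, 12.0], [10.0] * 5)
    print(f"median response = {scale:.2f}, IQR/median = {width:.2f}")
```

The same function applies to jets and to MET, since it only assumes one reconstructed and one truth-level number per matched entry.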

Model Description

  • Developed by: Joosep Pata, Eric Wulff, Farouk Mokhtar, Mengke Zhang, David Southwick, Maria Girone, Javier Duarte, Michael Kagan
  • Model type: transformer
  • License: Apache License

Direct Use

This model may be used to study the physics and computational performance of ML-based reconstruction in simulation.

Out-of-Scope Use

This model is not intended for physics measurements on real data.

Bias, Risks, and Limitations

The model has only been trained on simulation data and has not been validated against real data. The model has not been peer reviewed or published in a peer-reviewed journal.

How to Get Started with the Model

Training Details

Trained on 8x MI250X for 26 epochs over ~3 days. The training was continued twice from a checkpoint due to the 24h time limit.
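
Continuing a run across several time-limited jobs requires finding the most recent checkpoint and working out how many epochs remain. The following is a minimal sketch of that bookkeeping. The file naming scheme `checkpoint-<epoch>.pth` and the helper names are hypothetical and chosen for illustration only; the actual checkpoint layout written by the training code may differ.

```python
import re
from pathlib import Path

# Hypothetical naming scheme: checkpoint-<epoch>.pth
CKPT_PATTERN = re.compile(r"checkpoint-(\d+)\.pth$")


def latest_checkpoint(ckpt_dir):
    """Return (epoch, path) of the newest checkpoint, or None if none exist."""
    best = None
    for path in Path(ckpt_dir).glob("checkpoint-*.pth"):
        match = CKPT_PATTERN.search(path.name)
        if match is None:
            continue
        # Compare numerically so that epoch 12 sorts after epoch 2.
        epoch = int(match.group(1))
        if best is None or epoch > best[0]:
            best = (epoch, path)
    return best


def remaining_epochs(ckpt_dir, total_epochs):
    """Number of epochs still to run, given the checkpoints on disk."""
    best = latest_checkpoint(ckpt_dir)
    done = 0 if best is None else best[0]
    return max(total_epochs - done, 0)
```

With one checkpoint saved per epoch, a job that hits the time limit loses at most the epoch that was in progress.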

Training Data

The following datasets were used:


The truth and target definition was updated in jpata/particleflow#345 with respect to Pata, J., Wulff, E., Mokhtar, F. et al. Improved particle-flow event reconstruction with scalable neural networks for current and future particle detectors. Commun Phys 7, 124 (2024).

In particular, target particles for MLPF reconstruction are based on status=1 particles. For non-interacting status=1 particles, nearby (dR<0.2) interacting status=0 particles are used instead. Note that truth and target jets are defined in the center-of-mass frame, whereas PF particles are defined in the lab frame.
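
The dR<0.2 proximity criterion can be sketched as follows. This is an illustration only: it assumes dR is the usual angular distance in (eta, phi) with the azimuthal difference wrapped into [-pi, pi), and the function names `delta_r` and `nearby_candidates` are hypothetical. The actual target definition is the one implemented in jpata/particleflow#345.

```python
import numpy as np


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(deta^2 + dphi^2), with dphi wrapped to [-pi, pi)."""
    deta = np.asarray(eta1) - np.asarray(eta2)
    dphi = (np.asarray(phi1) - np.asarray(phi2) + np.pi) % (2 * np.pi) - np.pi
    return np.hypot(deta, dphi)


def nearby_candidates(eta, phi, cand_eta, cand_phi, max_dr=0.2):
    """Indices of candidate particles within max_dr of the particle at (eta, phi).

    Assumption: (eta, phi) belongs to a non-interacting status=1 particle and
    the candidates are the interacting status=0 particles of the same event.
    """
    dr = delta_r(eta, phi, np.asarray(cand_eta), np.asarray(cand_phi))
    return np.flatnonzero(dr < max_dr)


if __name__ == "__main__":
    # The first candidate is close once phi wraps around; the second is far away.
    idx = nearby_candidates(0.0, 3.1, [0.05, 1.0], [-3.1, 0.0])
    print(idx.tolist())
```

Wrapping the azimuthal difference matters for particles on either side of phi = ±pi, which would otherwise appear almost 2*pi apart.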

The datasets were generated using Key4HEP with the following scripts:

Training Procedure

Training script
#!/bin/bash
#SBATCH --job-name=mlpf-train
#SBATCH --account=project_465000301
#SBATCH --time=1-00:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=32
#SBATCH --mem=200G
#SBATCH --gpus-per-task=8
#SBATCH --partition=standard-g
#SBATCH --no-requeue
#SBATCH -o logs/slurm-%x-%j-%N.out

cd /scratch/project_465000301/particleflow

module load LUMI/24.03 partition/G

export IMG=/scratch/project_465000301/pytorch-rocm6.2.simg
export PYTHONPATH=hep_tfds
export TFDS_DATA_DIR=/scratch/project_465000301/tensorflow_datasets
export MIOPEN_USER_DB_PATH=/tmp/${USER}-${SLURM_JOB_ID}-miopen-cache
export TF_CPP_MAX_VLOG_LEVEL=-1 # suppress "ROCm fusion is enabled" messages
export ROCM_PATH=/opt/rocm
export KERAS_BACKEND=torch


# Training with the PyTorch config, run inside the ROCm container
singularity exec \
    --rocm \
    -B /scratch/project_465000301 \
    -B /tmp \
    --env LD_LIBRARY_PATH=/opt/rocm/lib/ \
     $IMG python3 mlpf/ --dataset clic --gpus 8 \
     --data-dir $TFDS_DATA_DIR --config parameters/pytorch/pyg-clic.yaml \
     --train --gpu-batch-multiplier 128 --num-workers 8 --prefetch-factor 100 --checkpoint-freq 1 --conv-type attention --dtype bfloat16 --lr 0.0001 --num-epochs 30


Evaluation script
#!/bin/bash
#SBATCH --partition gpu
#SBATCH --gres gpu:mig:1
#SBATCH --mem-per-gpu 200G
#SBATCH -o logs/slurm-%x-%j-%N.out

cd ~/particleflow

singularity exec -B /scratch/persistent --nv \
     --env PYTHONPATH=hep_tfds \
     --env KERAS_BACKEND=torch \
     $IMG  python3 mlpf/ --dataset clic --gpus 1 \
     --data-dir /scratch/persistent/joosep/tensorflow_datasets --config parameters/pytorch/pyg-clic.yaml \
     --test --make-plots --gpu-batch-multiplier 100 --load $WEIGHTS --dtype bfloat16 --prefetch-factor 10 --num-workers 8 --ntest 50000



Glossary

  • PF: particle flow reconstruction
  • MLPF: machine learning for particle flow
  • CLIC: Compact Linear Collider

Model Card Contact

Joosep Pata