Add TF weights
#2, opened by neggles
Model converted by the transformers' pt_to_tf CLI. All converted model outputs and hidden layers were validated against its PyTorch counterpart.
Maximum crossload output difference=9.060e-06; Maximum crossload hidden layer difference=6.409e-03;
Maximum conversion output difference=9.060e-06; Maximum conversion hidden layer difference=6.409e-03;
CAUTION: The maximum admissible error was manually increased to 0.1!
See GitHub PR #25558 for details. The maximum admissible error was overridden because the hidden states differ more than expected in TF; the final output logits are within 1.788e-05 for all model variants/sizes.
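For readers unfamiliar with this kind of check, here is a minimal sketch of comparing two sets of outputs against a maximum admissible error. It is not the pt_to_tf implementation: the function names and the toy values are hypothetical, and only the 0.1 threshold is taken from the report above.

```python
from typing import Sequence


def max_abs_diff(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Return the largest element-wise absolute difference between two flat sequences."""
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must have the same length")
    # default=0.0 covers the empty case, where there is nothing to differ.
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def within_tolerance(
    reference: Sequence[float], candidate: Sequence[float], max_error: float
) -> bool:
    """Check that no element differs by more than max_error."""
    return max_abs_diff(reference, candidate) <= max_error


# Hypothetical toy values standing in for flattened PyTorch and TF tensors.
pt_logits = [0.25, -1.5, 3.0]
tf_logits = [0.25, -1.5, 3.0 + 9.0e-06]
pt_hidden = [0.10, 0.20, 0.30]
tf_hidden = [0.10, 0.20, 0.30 + 6.0e-03]

MAX_ERROR = 0.1  # the manually increased threshold quoted in this PR

print("logits ok:", within_tolerance(pt_logits, tf_logits, MAX_ERROR))
print("hidden ok:", within_tolerance(pt_hidden, tf_hidden, MAX_ERROR))
```

With a much tighter threshold, the hidden-state check in this toy example would fail while the logits check would still pass, which mirrors the situation described above.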