Model converted by the transformers pt_to_tf CLI. All converted model outputs and hidden layers were validated against their PyTorch counterparts.

Maximum crossload output difference=2.003e-05; Maximum crossload hidden layer difference=2.060e-04;
Maximum conversion output difference=2.003e-05; Maximum conversion hidden layer difference=2.060e-04;

CAUTION: The maximum admissible error was manually increased to 0.1!

Note: Actual output differences are:

List of maximum output differences above the threshold (1e-05):
logits: 1.019e-05

List of maximum hidden layer differences above the threshold (1e-05):
hidden_states[1]: 1.431e-05
hidden_states[2]: 7.629e-05
hidden_states[3]: 9.537e-05
hidden_states[4]: 3.815e-05
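
To make the report above easier to read, here is a minimal sketch of how such a check could work. It assumes the metric is the maximum absolute elementwise difference between the PyTorch and TensorFlow tensors. The function names (`max_abs_diff`, `above_threshold`, `within_admissible_error`) are hypothetical and are not the CLI's actual code. The numbers are the ones listed in this report.

```python
def max_abs_diff(a, b):
    """Largest absolute elementwise difference between two equally shaped nested lists."""
    if isinstance(a, (list, tuple)):
        if len(a) != len(b):
            raise ValueError("shape mismatch")
        return max((max_abs_diff(x, y) for x, y in zip(a, b)), default=0.0)
    return abs(a - b)


def above_threshold(differences, threshold=1e-5):
    """Return only the named differences that exceed the reporting threshold."""
    return {name: d for name, d in differences.items() if d > threshold}


def within_admissible_error(differences, max_error):
    """True if every named difference is at or below the admissible error."""
    return all(d <= max_error for d in differences.values())


# Per-tensor differences copied from the report above.
reported = {
    "logits": 1.019e-05,
    "hidden_states[1]": 1.431e-05,
    "hidden_states[2]": 7.629e-05,
    "hidden_states[3]": 9.537e-05,
    "hidden_states[4]": 3.815e-05,
}

# All five exceed the default 1e-05 threshold, so all five are listed.
flagged = above_threshold(reported)

# With the admissible error raised to 0.1, the conversion is accepted.
accepted = within_admissible_error(reported, max_error=0.1)
```

Under these assumptions, every listed value is above 1e-05 but far below the manually raised limit of 0.1, which is why the report both lists them and still accepts the conversion.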

The conversion code had a minor error when I created this. See the GitHub PR for details.

Thank you for your PR, @neggles!

lysandre changed pull request status to merged
