allenai/olmOCR-7B-0225-preview · Working with JSON template inputs / structured outputs

Hello Team,

Great, great model. It has impressive performance on handwriting even for domain specific vocabulary !
In my use-case, I have the capability to know the JSON template of the data that I want to extract in advance. We would have a database of known templates with their declared schema.

How would you recommend to take benefit of that with your model ? My first idea would be to add it to the prompt but if I refer to your paper:
4.2 - Increasing robustness
"Unlike when we created olmOCR-mix-0225, we do not enforce a specific JSON schema during inference on our fine-tuned model. This is for two reasons: first, we find that open source tools designed to force decode a sequence into a particular schema are unreliable, and that enforcing a schema which is even slightly off from what the model expects can cause generations to go out-of-domain or collapse into repetitions. Second, and most importantly, we note that, since the model was extensively fine-tuned on the structured output, it reliably adheres to the required schema without constraints."

Looking forward for your expertise !