## UIE (Universal Information Extraction)

### Introduction

UIE (Universal Information Extraction) is a state-of-the-art information extraction method available in PaddleNLP; see the [PaddleNLP model zoo](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/uie) for details.  
The paper is available [here](https://arxiv.org/pdf/2203.12277.pdf).

### Usage

I saved the UIE model as an entire model (ERNIE 3.0 backbone + start/end layers), so you need to load it as follows:

#### 1. Clone this model to a local path

```sh
git lfs install
git clone https://huggingface.co/xyj125/uie-base-chinese
```

If you don't have `git-lfs` installed, you can also:

  * Download the files manually by clicking `Files and versions` at the top of this card.
  
#### 2. Load the model from the local path

```python
import os
import torch
from transformers import AutoTokenizer

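uie_model = 'uie-base-zh'                                               # local model directory; adjust to where you cloned it
# The checkpoint is a whole pickled model, not a state_dict; PyTorch >= 2.6 needs weights_only=False here.
model = torch.load(os.path.join(uie_model, 'pytorch_model.bin'))        # load UIE model
tokenizer = AutoTokenizer.from_pretrained('uie-base')                   # load tokenizer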
...

start_prob, end_prob = model(input_ids=batch['input_ids'],
                             token_type_ids=batch['token_type_ids'],
                             attention_mask=batch['attention_mask'])
print(f'start_prob ({type(start_prob)}): {start_prob.size()}')          # start_prob
print(f'end_prob ({type(end_prob)}): {end_prob.size()}')                # end_prob
...
```

The model returns one start probability and one end probability per token position. Here is the output with batch_size=16 and max_seq_len=256:
```python
start_prob (<class 'torch.Tensor'>): torch.Size([16, 256])
end_prob (<class 'torch.Tensor'>): torch.Size([16, 256])
```