Pretrained ELECTRA Language Model for Korean (bw-electra-base-discriminator)

### Usage

## Load Model and Tokenizer

```python
from transformers import ElectraModel, TFElectraModel, ElectraTokenizer

# TensorFlow
model = TFElectraModel.from_pretrained("ifuseok/bw-electra-base-discriminator")

# PyTorch (loads the TensorFlow checkpoint via from_tf=True)
# model = ElectraModel.from_pretrained("ifuseok/bw-electra-base-discriminator", from_tf=True)

tokenizer = ElectraTokenizer.from_pretrained("ifuseok/bw-electra-base-discriminator", do_lower_case=False)
```

## Tokenizer example

```python
from transformers import ElectraTokenizer

tokenizer = ElectraTokenizer.from_pretrained("ifuseok/bw-electra-base-discriminator")
tokenizer.tokenize("[CLS] Big Wave ELECTRA 모델을 공개합니다. [SEP]")
```

## Example using ElectraForPreTraining (PyTorch)

```python
import torch
from transformers import ElectraForPreTraining, ElectraTokenizer

discriminator = ElectraForPreTraining.from_pretrained("ifuseok/bw-electra-base-discriminator", from_tf=True)
tokenizer = ElectraTokenizer.from_pretrained("ifuseok/bw-electra-base-discriminator", do_lower_case=False)

sentence = "아무것도 하기가 싫다."       # original sentence (shown for reference)
fake_sentence = "아무것도 하기가 좋다."  # same sentence with one word replaced

fake_tokens = tokenizer.tokenize(fake_sentence)
fake_inputs = tokenizer.encode(fake_sentence, return_tensors="pt")

discriminator_outputs = discriminator(fake_inputs)
# Map each logit to 0 or 1 by its sign
predictions = torch.round((torch.sign(discriminator_outputs[0]) + 1) / 2)

# Drop the [CLS] and [SEP] positions so predictions line up with fake_tokens
print(list(zip(fake_tokens, predictions.tolist()[0][1:-1])))
```

## Example using ElectraForPreTraining (TensorFlow)

```python
import tensorflow as tf
from transformers import TFElectraForPreTraining, ElectraTokenizer

discriminator = TFElectraForPreTraining.from_pretrained("ifuseok/bw-electra-base-discriminator")
tokenizer = ElectraTokenizer.from_pretrained("ifuseok/bw-electra-base-discriminator", do_lower_case=False)

sentence = "아무것도 하기가 싫다."       # original sentence (shown for reference)
fake_sentence = "아무것도 하기가 좋다."  # same sentence with one word replaced

fake_tokens = tokenizer.tokenize(fake_sentence)
fake_inputs = tokenizer.encode(fake_sentence, return_tensors="tf")

discriminator_outputs = discriminator(fake_inputs)
# Map each logit to 0 or 1 by its sign
predictions = tf.round((tf.sign(discriminator_outputs[0]) + 1) / 2).numpy()

# Drop the [CLS] and [SEP] positions so predictions line up with fake_tokens
print(list(zip(fake_tokens, predictions.tolist()[0][1:-1])))
```
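
The `round((sign(logits) + 1) / 2)` step in both examples above turns each discriminator logit into a 0/1 label, where 1 means the token is predicted to be replaced and 0 means it is predicted to be original. The following is a minimal sketch of that arithmetic in plain Python, with no model download. The logit values are made up for illustration and are not real outputs of this model; `sign`, `label_from_logit`, and `sigmoid` are helper names introduced here.

```python
import math

def sign(x):
    # Returns -1, 0, or 1, like torch.sign / tf.sign on a scalar
    return (x > 0) - (x < 0)

def label_from_logit(logit):
    # Same arithmetic as round((sign(logit) + 1) / 2) in the examples above
    return round((sign(logit) + 1) / 2)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical per-token logits (illustrative values only)
logits = [-3.2, -1.5, 4.1, -2.7]

labels = [label_from_logit(x) for x in logits]

# Thresholding the sigmoid probability at 0.5 gives the same labels,
# because sigmoid(x) > 0.5 exactly when x > 0
labels_from_probability = [int(sigmoid(x) > 0.5) for x in logits]

print(labels)
print(labels_from_probability)
```

Under these assumptions, a positive logit maps to 1 and a negative logit maps to 0, so the sign-based formula is a compact way of thresholding the discriminator's probability at 0.5.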