ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding
How to use swtx/ernie-gram-chinese with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="swtx/ernie-gram-chinese")

# Or load the tokenizer and model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("swtx/ernie-gram-chinese")
model = AutoModel.from_pretrained("swtx/ernie-gram-chinese")
```
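Once loaded, the model can be used as a sentence encoder. The sketch below is a minimal illustration (the example sentence and variable names are ours, not from the original card): it tokenizes a Chinese sentence, runs a forward pass, and reads out the token-level hidden states.

```python
# A minimal feature-extraction sketch; the example sentence is
# illustrative, not from the original model card.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("swtx/ernie-gram-chinese")
model = AutoModel.from_pretrained("swtx/ernie-gram-chinese")
model.eval()

inputs = tokenizer("百度是一家人工智能公司", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Token-level features: shape (batch_size, sequence_length, hidden_size)
features = outputs.last_hidden_state
print(features.shape)  # e.g. torch.Size([1, seq_len, 768])
```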
More details: https://arxiv.org/abs/2010.12148
| Model Name | Language | Model Structure |
|---|---|---|
| ernie-gram-chinese | Chinese | Layers: 12, Hidden size: 768, Attention heads: 12 |
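The structure in the table can be confirmed from the hosted config. The sketch below assumes the checkpoint uses standard BERT-style config field names, which converted ERNIE checkpoints for Transformers typically do.

```python
# A sketch for confirming the table above from the hosted config;
# assumes BERT-style config field names (an assumption on our part).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("swtx/ernie-gram-chinese")
print(config.num_hidden_layers)    # expected: 12
print(config.hidden_size)          # expected: 768
print(config.num_attention_heads)  # expected: 12
```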
This released PyTorch model was converted from the officially released PaddlePaddle ERNIE-Gram model, and a series of experiments was conducted to verify the accuracy of the conversion.
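The card does not detail those experiments. As one hedged example of the kind of sanity check involved (not the authors' actual tests), the sketch below runs the converted model on a fixed input and verifies that the output is correctly shaped, finite, and deterministic in eval mode.

```python
# A hypothetical sanity check (not the authors' actual conversion tests):
# run the converted model twice on the same input and confirm the output
# is correctly shaped, finite, and deterministic in eval mode.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("swtx/ernie-gram-chinese")
model = AutoModel.from_pretrained("swtx/ernie-gram-chinese").eval()

inputs = tokenizer("这是一个测试句子", return_tensors="pt")
with torch.no_grad():
    out1 = model(**inputs).last_hidden_state
    out2 = model(**inputs).last_hidden_state

assert out1.shape[-1] == model.config.hidden_size  # 768 per the table above
assert torch.isfinite(out1).all()
assert torch.equal(out1, out2)
```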