ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding
How to use swtx/ernie-gram-chinese with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="swtx/ernie-gram-chinese")

# Or load the tokenizer and model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("swtx/ernie-gram-chinese")
model = AutoModel.from_pretrained("swtx/ernie-gram-chinese")
```
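Once loaded, the model can be used as a sentence encoder. The sketch below is a minimal illustration (the example sentence and variable names are ours, not from the original card): it tokenizes a Chinese sentence, runs a forward pass, and reads out the token-level hidden states.

```python
# A minimal feature-extraction sketch; the example sentence is
# illustrative, not from the original model card.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("swtx/ernie-gram-chinese")
model = AutoModel.from_pretrained("swtx/ernie-gram-chinese")
model.eval()

inputs = tokenizer("百度是一家人工智能公司", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Token-level features: shape (batch_size, sequence_length, hidden_size)
features = outputs.last_hidden_state
print(features.shape)  # e.g. torch.Size([1, seq_len, 768])
```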
More details: https://arxiv.org/abs/2010.12148
| Model Name | Language | Model Structure |
|---|---|---|
| ernie-gram-chinese | Chinese | Layers: 12, Hidden size: 768, Attention heads: 12 |
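The structure in the table can be confirmed from the hosted config. The sketch below assumes the checkpoint uses standard BERT-style config field names, which converted ERNIE checkpoints for Transformers typically do.

```python
# A sketch for confirming the table above from the hosted config;
# assumes BERT-style config field names (an assumption on our part).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("swtx/ernie-gram-chinese")
print(config.num_hidden_layers)    # expected: 12
print(config.hidden_size)          # expected: 768
print(config.num_attention_heads)  # expected: 12
```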
This released PyTorch model was converted from the officially released PaddlePaddle ERNIE-Gram model, and a series of experiments was conducted to verify the accuracy of the conversion.
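The card does not detail those experiments. As one hedged example of the kind of sanity check involved (not the authors' actual tests), the sketch below runs the converted model on a fixed input and verifies that the output is correctly shaped, finite, and deterministic in eval mode.

```python
# A hypothetical sanity check (not the authors' actual conversion tests):
# run the converted model twice on the same input and confirm the output
# is correctly shaped, finite, and deterministic in eval mode.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("swtx/ernie-gram-chinese")
model = AutoModel.from_pretrained("swtx/ernie-gram-chinese").eval()

inputs = tokenizer("这是一个测试句子", return_tensors="pt")
with torch.no_grad():
    out1 = model(**inputs).last_hidden_state
    out2 = model(**inputs).last_hidden_state

assert out1.shape[-1] == model.config.hidden_size  # 768 per the table above
assert torch.isfinite(out1).all()
assert torch.equal(out1, out2)
```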