How to use google/flan-t5-xl with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xl")
int8 model consumes the same GPU memory as default model.
#15
by Iamexperimenting - opened
Hi team, when I try to load the flan-t5-xl model in int8, I see the same GPU memory being consumed as with the default model. Could you please help me here? I'm using SageMaker Studio with an ml.g4dn.xlarge instance.
for default - it consumes 11448 MiB / 15109 MiB
for float16 - it consumes 7532 MiB / 15109 MiB
for int8 - it consumes 11448 MiB / 15109 MiB
Thanks
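For reference, the numbers in the post can be compared against a rough weight-only estimate. The sketch below is a back-of-the-envelope calculation, not a measurement: the parameter count of about 2.85 billion is an assumption (an approximate figure for flan-t5-xl, not stated in this thread), and the helper name `estimate_weight_mib` is hypothetical. Reported GPU usage also includes the CUDA context, allocator cache, and activations, so real readings will be higher than these estimates.

```python
# Rough weight-only memory estimate per precision.
# ASSUMPTION: ~2.85e9 parameters for flan-t5-xl (approximate, not from the thread).
ASSUMED_PARAM_COUNT = 2_850_000_000

# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def estimate_weight_mib(param_count: int, dtype: str) -> float:
    """Return the memory needed for the weights alone, in MiB."""
    return param_count * BYTES_PER_PARAM[dtype] / 2**20


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        mib = estimate_weight_mib(ASSUMED_PARAM_COUNT, dtype)
        print(f"{dtype:>8}: ~{mib:,.0f} MiB for weights only")
```

Under that assumption, an int8 load would be expected to land well below the float16 figure. An int8 reading identical to the default (float32) reading may therefore indicate that the 8-bit path did not take effect, although the thread does not confirm the cause.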
I just shared a model that might be helpful to you:
https://huggingface.co/limcheekin/flan-t5-xl-ct2
Hi, good day. May I ask how much RAM it consumes when used on CPU?