Hi, I would like to convert my Llama-based fine-tuned safetensors model to TFLite so I can use it on Android. Is there a tutorial that explains how to do the conversion?
How about this? In addition, converting to ONNX first and then to TFLite seems to have been a common approach in the past, and it still is.
Hi! Converting a fine-tuned model like your Llama-based safetensors model to TFLite for use on Android can be a bit tricky, but it is definitely possible.
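from transformers import TFAutoModelForCausalLM
# Note: this only works if transformers provides a TensorFlow class for your architecture.
# To my knowledge it does not for Llama, so this call may fail for a Llama checkpoint.
model = TFAutoModelForCausalLM.from_pretrained("your-llama-model-path")
# saved_model=True also writes a TensorFlow SavedModel under saved_model_directory/saved_model/1
model.save_pretrained("saved_model_directory", saved_model=True)
import tensorflow as tf
# Path to the SavedModel written above
saved_model_dir = 'saved_model_directory/saved_model/1'
# Convert the model to TFLite format
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
tflite_model = converter.convert()
# Save the converted TFLite model
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)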
But you should compare the size of the converted model with your Android device’s RAM, since the model has to fit in memory to run.
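To make the size check concrete, here is a rough back-of-the-envelope sketch. It only counts the weights (parameters times bits per weight) and ignores activations and runtime overhead, so treat it as a lower bound. The 7B parameter count, the 8 GiB device, and the 50% usable-RAM share are all assumptions for illustration, as are the helper names `estimate_weight_bytes` and `fits_in_ram`.

```python
def estimate_weight_bytes(num_params: int, bits_per_weight: int) -> int:
    """Return the approximate size of the weights alone, in bytes."""
    return num_params * bits_per_weight // 8


def fits_in_ram(num_params: int, bits_per_weight: int, device_ram_gib: float,
                usable_fraction: float = 0.5) -> bool:
    """Check whether the weights fit in an assumed usable share of device RAM."""
    budget_bytes = device_ram_gib * 1024**3 * usable_fraction
    return estimate_weight_bytes(num_params, bits_per_weight) <= budget_bytes


if __name__ == "__main__":
    num_params = 7_000_000_000  # hypothetical 7B-parameter model
    for bits in (32, 16, 8, 4):
        gib = estimate_weight_bytes(num_params, bits) / 1024**3
        ok = fits_in_ram(num_params, bits, device_ram_gib=8)  # assumed 8 GiB phone
        print(f"{bits:>2}-bit weights: {gib:5.1f} GiB, fits in half of 8 GiB: {ok}")
```

Under these assumptions only the 4-bit variant fits, which is why quantization or a smaller base model is usually needed before a Llama-sized model is practical on a phone.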