When I fine-tune a TinyLlama model on a sample of the Alpaca data, training works fine in Colab. However, when I try to run it locally on a MacBook (macOS Ventura 13.6.4) using MPS, the validation loss is NaN at the first step. What could be causing this?
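
    import torch
    from transformers import AutoModelForCausalLM, TrainingArguments
    from peft import LoraConfig
    from trl import SFTTrainer

    model = AutoModelForCausalLM.from_pretrained(
        pretrained_model_name_or_path="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        device_map="mps",
        trust_remote_code=True,
        low_cpu_mem_usage=True,
        torch_dtype=torch.float16
    )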

    peft_config = LoraConfig(
        r=16,
        lora_alpha=16,
        lora_dropout=0.1,
        bias="none",
        task_type="CAUSAL_LM",
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        modules_to_save=None,
    )

    training_args = TrainingArguments(
        output_dir="./alpaca_output/",
        report_to="none",
        per_device_train_batch_size=4,
        per_device_eval_batch_size=4,
        learning_rate=2e-4,
        lr_scheduler_type="cosine",
        num_train_epochs=1,
        evaluation_strategy="steps",
        # logging strategies
        logging_strategy="steps",
        logging_steps=1,
        gradient_checkpointing=True,
        gradient_accumulation_steps=1,
        seed=1,
        save_strategy="epoch",
    )

    trainer = SFTTrainer(
        model,
        peft_config=peft_config,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        packing=True,
        max_seq_length=1024,
        args=training_args,
        formatting_func=create_alpaca_prompt
    )
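
To narrow down where the NaN first shows up, the logged losses can be scanned after a run. The following is only a minimal sketch: the helper name `first_nonfinite_losses` and the sample entries are hypothetical and not part of the script above. It assumes the log entries are dictionaries with `loss` or `eval_loss` plus a `step` key, which is the shape `trainer.state.log_history` normally has.

```python
import math


def first_nonfinite_losses(log_history, keys=("loss", "eval_loss")):
    """Return {key: step} for the first entry where each loss is NaN or inf.

    Hypothetical helper for illustration; `log_history` is assumed to be a
    list of dicts such as the one stored in `trainer.state.log_history`.
    """
    first = {}
    for entry in log_history:
        for key in keys:
            value = entry.get(key)
            if (
                key not in first
                and isinstance(value, (int, float))
                and not math.isfinite(value)
            ):
                first[key] = entry.get("step")
    return first


# Made-up sample that mimics the symptom described above:
# the training loss is finite, the evaluation loss is NaN at step 1.
sample_history = [
    {"loss": 1.92, "step": 1},
    {"eval_loss": float("nan"), "step": 1},
    {"loss": 1.85, "step": 2},
]

print(first_nonfinite_losses(sample_history))  # {'eval_loss': 1}
```

On a real run this would be called as `first_nonfinite_losses(trainer.state.log_history)` after `trainer.train()`. If only `eval_loss` is reported while `loss` stays finite, that points at the evaluation pass rather than the optimizer step, which may help when comparing the Colab and MPS runs.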