How to use blind-assist/internvl3-2b-walk-lora-v1 with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
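# InternVL3 ships custom modeling code, so trust_remote_code=True is required
base_model = AutoModelForCausalLM.from_pretrained("OpenGVLab/InternVL3-2B", trust_remote_code=True)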
model = PeftModel.from_pretrained(base_model, "blind-assist/internvl3-2b-walk-lora-v1")

This is a LoRA adapter for InternVL3-2B, fine-tuned on the WalkVLM dataset to assist visually impaired individuals with navigation hazard detection.
import torch
from peft import PeftModel
from transformers import AutoModel, AutoTokenizer
# Load Base Model
base_model = AutoModel.from_pretrained(
    "OpenGVLab/InternVL3-2B",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("OpenGVLab/InternVL3-2B", trust_remote_code=True)
# Load LoRA Adapter
model = PeftModel.from_pretrained(base_model, "blind-assist/internvl3-2b-walk-lora-v1")
# Merge for faster inference (optional)
model = model.merge_and_unload()
# Use for inference
response = model.chat(
    tokenizer=tokenizer,
    pixel_values=pixel_values,  # Your preprocessed image
    question="Describe any obstacles in this scene.",
    generation_config=dict(max_new_tokens=256),
)
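The snippet above expects `pixel_values` to exist already. A minimal sketch of one way to build it follows, assuming a single 448x448 tile with ImageNet mean/std normalization. The function name `preprocess_single_tile` is hypothetical, and the base model's own loader splits an image into several tiles, so treat this as a simplified stand-in rather than the reference preprocessing.

```python
import numpy as np
from PIL import Image

# Assumed normalization constants (standard ImageNet statistics).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_single_tile(image, size=448):
    """Resize to size x size, normalize, and return a (1, 3, size, size) float32 array."""
    image = image.convert("RGB").resize((size, size), Image.BICUBIC)
    arr = np.asarray(image, dtype=np.float32) / 255.0  # (H, W, 3), values in [0, 1]
    arr = (arr - IMAGENET_MEAN) / IMAGENET_STD         # per-channel normalization
    arr = arr.transpose(2, 0, 1)[np.newaxis, ...]      # (1, 3, H, W)
    return np.ascontiguousarray(arr)


# Synthetic image so the sketch runs without a file; use Image.open(path) in practice.
demo = Image.new("RGB", (640, 480), color=(128, 128, 128))
tile = preprocess_single_tile(demo)
print(tile.shape, tile.dtype)
```

The resulting array can be converted with `torch.from_numpy(tile).to(torch.bfloat16)` and moved to the model's device before it is passed as `pixel_values`.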
If PEFT doesn't work due to model architecture, use manual merging:
# See our inference script at:
# https://github.com/Blind-Assist/InternVL/blob/walkvlm/internvl_chat/test_finetuned_model.py
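For orientation, manual merging of a LoRA adapter generally means folding the low-rank update into each targeted weight matrix. The sketch below illustrates that arithmetic on toy NumPy arrays, assuming the standard `lora_alpha / r` scaling. It is not the linked inference script, and `merge_lora_weight` is a hypothetical helper name.

```python
import numpy as np


def merge_lora_weight(weight, lora_A, lora_B, lora_alpha, r):
    """Return W + (lora_alpha / r) * B @ A for one linear layer.

    Assumed shapes: weight (out, in), lora_A (r, in), lora_B (out, r).
    """
    scaling = lora_alpha / r
    return weight + scaling * (lora_B @ lora_A)


# Toy sizes for illustration only; the real rank and alpha are in adapter_config.json.
rng = np.random.default_rng(0)
out_features, in_features, r, lora_alpha = 8, 6, 2, 4
W = rng.standard_normal((out_features, in_features))
A = rng.standard_normal((r, in_features))
B = rng.standard_normal((out_features, r))
x = rng.standard_normal(in_features)

merged = merge_lora_weight(W, A, B, lora_alpha, r)

# The merged layer should match the base output plus the scaled LoRA branch.
unmerged_out = W @ x + (lora_alpha / r) * (B @ (A @ x))
print(np.allclose(merged @ x, unmerged_out))
```

After merging, the adapter matrices are no longer needed at inference time, which is why a merged model runs at the same cost as the base model.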
adapter_config.json - PEFT LoRA configuration
adapter_model.safetensors - LoRA weights only (~50MB)
Same as base model (OpenGVLab/InternVL3-2B)
Base model
OpenGVLab/InternVL3-2B-Pretrained