HAC: Parameter-Efficient Hyperbolic Adaptation of CLIP for Zero-Shot VQA

HAC (Hyperbolic Adaptation of CLIP) is a parameter-efficient framework that adapts pretrained CLIP models to hyperbolic space through lightweight fine-tuning. Hyperbolic embeddings capture hierarchical structure more effectively than traditional Euclidean embeddings, which benefits tasks such as zero-shot Visual Question Answering (VQA).

This repository contains the weights for HAC-B w/ LoRA.

Environment Setup

Create and configure the environment using Conda:

git clone https://github.com/fdibiton/HAC.git
cd HAC
conda create -n hac python=3.9 --yes
conda activate hac

# Install dependencies
python -m pip install --pre timm
python -m pip install -r requirements.txt

Evaluation

To run zero-shot VQA evaluation with the HAC-B w/ LoRA model, place the hac_vit_b_lora.pth file in the ./checkpoints directory and run:

python scripts/evaluate.py \
    --config configs/eval_vqa_all_categories.py \
    --train-config configs/train_hac_vit_b_lora.py \
    --checkpoint-path checkpoints/hac_vit_b_lora.pth

Note: The VQA evaluation datasets need to be downloaded and arranged beforehand. Please refer to the instructions in the GitHub repository for details.

Citation

@inproceedings{dibiton2026hac,
    title={HAC: Parameter-Efficient Hyperbolic Adaptation of CLIP for Zero-Shot VQA},
    author={Dibitonto, Francesco and Beyan, Cigdem and Murino, Vittorio},
    booktitle={International Conference on Pattern Recognition (ICPR)},
    year={2026}
}