HAC: Parameter-Efficient Hyperbolic Adaptation of CLIP for Zero-Shot VQA
Paper: 2604.23665
HAC (Hyperbolic Adaptation of CLIP) is a parameter-efficient framework that adapts pretrained CLIP models to hyperbolic space through lightweight fine-tuning. Hyperbolic embeddings capture hierarchical structure more effectively than traditional Euclidean embeddings, which benefits tasks such as zero-shot Visual Question Answering (VQA).
This repository contains the weights for HAC-B w/ LoRA.
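The paper's core idea is to map CLIP's Euclidean embeddings into hyperbolic space. As an illustrative sketch only (not the repository's implementation), the standard exponential map at the origin of the Poincaré ball, and the geodesic distance on it, look like this; the curvature parameter `c` and function names here are assumptions for illustration:

```python
import math

def expmap0(v, c=1.0):
    """Exponential map at the origin of the Poincare ball with curvature -c.

    Maps a Euclidean (tangent-space) vector v into the open ball of
    radius 1/sqrt(c); this is how Euclidean features are typically
    projected into hyperbolic space.
    """
    sqrt_c = math.sqrt(c)
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    coef = math.tanh(sqrt_c * norm) / (sqrt_c * norm)
    return [coef * x for x in v]

def poincare_distance(x, y, c=1.0):
    """Geodesic distance between two points on the Poincare ball."""
    sqrt_c = math.sqrt(c)
    diff2 = sum((a - b) ** 2 for a, b in zip(x, y))
    nx2 = sum(a * a for a in x)
    ny2 = sum(b * b for b in y)
    arg = 1.0 + 2.0 * c * diff2 / ((1.0 - c * nx2) * (1.0 - c * ny2))
    return (1.0 / sqrt_c) * math.acosh(arg)
```

Note that `expmap0` always lands strictly inside the unit ball (for `c=1`), so hyperbolic distances between mapped embeddings are well defined.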
Create and configure the environment using Conda:
```shell
git clone https://github.com/fdibiton/HAC.git
cd HAC
conda create -n hac python=3.9 --yes
conda activate hac
# Install dependencies
python -m pip install --pre timm
python -m pip install -r requirements.txt
```
To run zero-shot VQA evaluation with the HAC-B w/ LoRA model, place the `hac_vit_b_lora.pth` file in the `./checkpoints` directory and run:
```shell
python scripts/evaluate.py \
  --config configs/eval_vqa_all_categories.py \
  --train-config configs/train_hac_vit_b_lora.py \
  --checkpoint-path checkpoints/hac_vit_b_lora.pth
```
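The "w/ LoRA" in the model name refers to Low-Rank Adaptation, where a frozen weight matrix is adjusted by a trainable low-rank product. A minimal, dependency-free sketch of the merge step (illustrative only; `lora_update` and its defaults are not from the HAC codebase):

```python
def lora_update(W, A, B, alpha=16.0):
    """Merge a low-rank update into a frozen weight matrix:
    W' = W + (alpha / r) * B @ A.

    W: d_out x d_in (frozen), B: d_out x r, A: r x d_in (trainable).
    Only r * (d_out + d_in) parameters are trained instead of
    d_out * d_in, which is what makes the adaptation lightweight.
    """
    r = len(A)                      # LoRA rank
    scale = alpha / r
    d_out, d_in = len(W), len(W[0])
    # Compute the low-rank product B @ A.
    BA = [[sum(B[i][k] * A[k][j] for k in range(r)) for j in range(d_in)]
          for i in range(d_out)]
    # Add the scaled update to the frozen weights.
    return [[W[i][j] + scale * BA[i][j] for j in range(d_in)]
            for i in range(d_out)]
```

At inference time the update can be merged once into the base weights, so the adapted model runs at the same cost as the original.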
Note: The VQA evaluation datasets must be downloaded and organized beforehand. Please refer to the instructions in the GitHub repository for details.
```bibtex
@inproceedings{dibiton2026hac,
  title={HAC: Parameter-Efficient Hyperbolic Adaptation of CLIP for Zero-Shot VQA},
  author={Dibitonto, Francesco and Beyan, Cigdem and Murino, Vittorio},
  booktitle={International Conference on Pattern Recognition (ICPR)},
  year={2026}
}
```