Abstract
Large language models (LLMs) exhibit exceptional reasoning capabilities and have played significant roles in knowledge-based visual question-answering (VQA) systems. By conditioning on in-context examples and task-specific prompts, they comprehensively understand input questions and provide answers relevant to the context. However, due to the reliance on in-context examples, LLMs are susceptible to inheriting dataset biases in context descriptions and the provided examples. Innovative methods are required to ensure that LLMs can deliver unbiased yet contextually relevant responses. To tackle this challenge, we present GRAph-based Contextual DEbiasing (GRACE), a novel graph-based method for debiasing knowledge-based VQA models. This approach consists of two novel and generally applicable components. First, we propose an unsupervised context graph learning method that combats biases by explicitly creating a balanced context graph under the guidance of fairness constraints. Second, building upon the context graph, we consider both semantic features and reasoning processes to enhance prompting with more relevant and diverse in-context examples. Through extensive experimentation on both in-distribution (OK-VQA) and out-of-distribution (VQA-CP, GQA-OOD) datasets, we demonstrate the effectiveness of GRACE in mitigating biases and achieving generalization. Additionally, analyses of the model performance across gender groups demonstrate GRACE’s potential impacts on social equity. Our source code is publicly available at https://github.com/SuperJohnZhang/ContextGraphKVQA.
Original language | English (US) |
---|---|
Title of host publication | Computer Vision – ECCV 2024 - 18th European Conference, Proceedings |
Editors | Aleš Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 176-194 |
Number of pages | 19 |
ISBN (Print) | 9783031726422 |
DOIs | |
State | Published - 2025 |
Event | 18th European Conference on Computer Vision, ECCV 2024 - Milan, Italy Duration: Sep 29 2024 → Oct 4 2024 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 15075 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 18th European Conference on Computer Vision, ECCV 2024 |
---|---|
Country/Territory | Italy |
City | Milan |
Period | 9/29/24 → 10/4/24 |
Bibliographical note
Publisher Copyright:© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
Keywords
- Debias
- Knowledge VQA
- Large Language Model