Task description and call for participation: SISAP 2025 Indexing Challenge

Introduction

The SISAP Indexing Challenge 2025 invites researchers and practitioners to participate in exciting tasks to advance the state of the art in similarity search and indexing. The challenge provides a platform for presenting innovative solutions and pushing the boundaries of efficiency and effectiveness in large-scale similarity search indexes. This year, we are opening two challenging tasks.

The datasets are available at https://huggingface.co/datasets/sadit/SISAP2025/tree/main; you can clone the full repository or download individual files.
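As a minimal sketch of the two download options, the snippet below uses the `huggingface_hub` Python package (an assumption; the call does not prescribe any tool). The helper name `download_dataset` and the default directory are hypothetical, and no file names are assumed: pass the name of a file you see listed in the repository.

```python
from pathlib import Path

REPO_ID = "sadit/SISAP2025"  # dataset repository named in the call


def download_dataset(local_dir="sisap2025-data", filename=None):
    """Fetch one file, or the whole repository when filename is None."""
    # Imported lazily so this module loads even if huggingface_hub is missing.
    from huggingface_hub import hf_hub_download, snapshot_download

    Path(local_dir).mkdir(parents=True, exist_ok=True)
    if filename is None:
        # Equivalent in spirit to cloning the full repository.
        return snapshot_download(
            repo_id=REPO_ID, repo_type="dataset", local_dir=local_dir
        )
    # Download a single file chosen by the caller.
    return hf_hub_download(
        repo_id=REPO_ID, filename=filename, repo_type="dataset", local_dir=local_dir
    )
```

Calling `download_dataset()` with no arguments fetches everything, which may be large; passing `filename` limits the transfer to one file.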

Task 1: Resource-limited indexing

This task challenges participants to develop memory-efficient indexing solutions with reranking capabilities. Each solution will be run in a Linux container with limited memory and storage resources.
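One common way to combine a small memory footprint with reranking is a two-stage search: filter candidates with a compact representation, then rerank the survivors with exact distances. The sketch below is only an illustration of that idea, not a reference solution. It assumes sign-of-random-projection binary sketches for the first stage and Euclidean distance for the rerank; the call does not state the distance function, and all function names and parameters here are hypothetical.

```python
import math
import random


def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random Gaussian directions in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def sketch(vec, planes):
    """Pack the sign of each projection into one Python int (one bit per plane)."""
    code = 0
    for i, plane in enumerate(planes):
        if sum(p * x for p, x in zip(plane, vec)) >= 0.0:
            code |= 1 << i
    return code


def hamming(a, b):
    """Number of differing bits between two sketches."""
    return bin(a ^ b).count("1")


def search(query, data, codes, planes, k, n_candidates):
    """Stage 1: rank by Hamming distance on sketches. Stage 2: rerank exactly."""
    q_code = sketch(query, planes)
    order = sorted(range(len(data)), key=lambda i: hamming(q_code, codes[i]))
    candidates = order[:n_candidates]
    # Only the candidates need their full-precision vectors for this step.
    candidates.sort(key=lambda i: math.dist(query, data[i]))
    return candidates[:k]


# Toy data: 200 random 16-dimensional vectors, 64-bit sketches.
rng = random.Random(42)
data = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
planes = make_hyperplanes(dim=16, bits=64)
codes = [sketch(v, planes) for v in data]
result = search(data[7], data, codes, planes, k=5, n_candidates=50)
```

Raising `n_candidates` trades query time for quality, since more vectors reach the exact rerank stage. In a memory-limited setting the full-precision vectors would typically stay on disk and be read only for the candidates.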

Task 2: K-nearest neighbor graph (a.k.a. metric self-join)

In this task, participants are asked to develop memory-efficient indexing solutions that will be used to compute an approximation of the k-nearest neighbor graph for k=15. Each solution will be run in a Linux container with limited memory and storage resources.
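To make the target concrete, the following sketch computes an exact k-nearest neighbor graph by brute force on a toy dataset and measures how much of it an approximate graph recovers. It is quadratic in the number of points, so it is useful only as a small-scale baseline. Euclidean distance, the exclusion of each point from its own neighbor list, and recall as the quality measure are assumptions made for illustration; the function names are hypothetical.

```python
import heapq
import math
import random


def knn_graph(data, k):
    """Exact k-nearest neighbor graph by brute force, excluding self-matches."""
    graph = []
    for i, u in enumerate(data):
        others = ((math.dist(u, v), j) for j, v in enumerate(data) if j != i)
        graph.append([j for _, j in heapq.nsmallest(k, others)])
    return graph


def recall(approx, exact):
    """Fraction of true neighbors recovered, pooled over all nodes."""
    hits = sum(len(set(a) & set(e)) for a, e in zip(approx, exact))
    return hits / sum(len(e) for e in exact)


# Toy data: 100 random points in 8 dimensions, k=15 as in the task.
rng = random.Random(0)
points = [[rng.random() for _ in range(8)] for _ in range(100)]
exact = knn_graph(points, k=15)
```

An approximate method would produce its own list of neighbor lists in the same shape, which can then be compared against `exact` with `recall(approx, exact)`.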

Test Data and Queries:

Hardware specifications

The evaluation will be carried out on a machine with the following specifications:

Registration and Participation

  1. Register for the challenge by opening a "Pre-registration request" issue in the GitHub repository https://github.com/sisap-challenges/challenge2025/. Fill in the requested information; it will be used to stay in contact with you while the challenge is open.

  2. During the development phase, participants will have access to a gold standard corresponding to that phase.

  3. Teams are required to provide public GitHub repositories with working GitHub Actions and clear instructions on how to run their solutions with the correct hyperparameters (up to 16 sets) for each task. You can use a small dataset such as the provided CCNEWS. Submissions are required to run in Docker containers. Examples will be released soon; please visit the challenge site for updates.

  4. Participants' repositories will be cloned and tested at the time of the challenge. Results will be shared with the authors for verification and potential fixes before the final rankings are published.

  5. The evaluation queryset for Task 1 and the evaluation dataset for Task 2 will be disclosed after the evaluation phase.

Paper Submissions

All participants will be considered for paper submissions. We aim to accommodate all accepted papers within the conference program. Papers should be short, focusing on the presentation and poster.

We look forward to your participation and innovative solutions in the SISAP Indexing Challenge 2025! Let's push the frontiers of similarity search and indexing together.

Final comments

Any transformation of the dataset to load, index, and solve nearest neighbor queries is allowed. Transformations include, but are not limited to, packing into different data types, dimensionality reduction, locality-sensitive hashing, product quantization, and transforming into binary sketches. Reproducibility and open science are primary goals of the challenge, so we accept only public GitHub repositories with working GitHub Actions as submissions. Indexing algorithms may be already published or original contributions.
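As an illustration of the simplest transformation in that list, packing into a different data type, the sketch below quantizes floating-point coordinates to 8-bit codes with a uniform grid. It assumes all coordinates fall in a known range `[lo, hi]`; the range, the function names, and the choice of uniform scalar quantization are hypothetical and are not prescribed by the challenge.

```python
from array import array


def quantize(vec, lo, hi):
    """Map floats in [lo, hi] to 8-bit codes (values outside are clipped)."""
    scale = 255.0 / (hi - lo)
    return array("B", (min(255, max(0, round((x - lo) * scale))) for x in vec))


def dequantize(codes, lo, hi):
    """Approximate reconstruction of the original floats from the codes."""
    step = (hi - lo) / 255.0
    return [lo + c * step for c in codes]


vec = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes = quantize(vec, lo=-1.0, hi=1.0)
approx = dequantize(codes, lo=-1.0, hi=1.0)
# For in-range inputs the error is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(vec, approx))
```

Each coordinate then occupies one byte, and the reconstruction error per coordinate is bounded by half of `(hi - lo) / 255`. Whether that loss is acceptable depends on the dataset and on how much the later reranking step can correct.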

You can find more detailed information, data access, and registration at the SISAP Indexing Challenge website https://sisap-challenges.github.io/2025/

Important Dates

SISAP Indexing Challenge Chairs

CC BY-SA 4.0 sisap challenge committee. Last modified: February 24, 2025. Website built with Franklin.jl and the Julia programming language.