The goal is to build a tool that takes an input image (e.g., Fig. 1) and, with optional filters such as location or date range to narrow the search, automatically scans the e-rara archive for visually similar pages. The tool should return direct links to the matching pages, enabling researchers and other users to quickly identify recurring motifs, printer’s devices, illustrations, or other visual elements across the archive.
Inputs
The dataset for this challenge is provided by e-rara.ch, which hosts digitized versions of historical books and offers an API for image access. The full archive contains over 154,000 titles and millions of scanned pages. In practice, however, researchers often limit their scope to fewer than 100 titles, amounting to a few thousand pages, which makes local processing feasible. Processing on the Ubelix cluster is also possible.
Goals
Art historians and scholars in related fields would greatly benefit from the ability to search for visually similar images within large catalogues of historical prints. A particularly valuable use case is identifying recurring visual elements - such as printer's imprints - across different books and editions.
For the results to be relevant, the matching should be robust to visual variations such as:
- Different sizes
- Mirroring or rotation
- Ink smudges or degradation
- Colorization
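One way to reason about these variations is to look at how a simple global fingerprint can absorb each of them. The sketch below is only an illustration under stated assumptions, not the method prescribed by the challenge: it uses a difference hash (dHash), treats an image as a nested list of pixel values, and all function names are hypothetical. Grayscale conversion addresses colorization, resizing to a fixed grid addresses size, hashing all eight mirrored and 90-degree-rotated versions addresses mirroring and coarse rotation, and comparing by Hamming distance leaves some tolerance for smudges. Arbitrary rotation angles, or a small motif embedded in a full page, would likely need local feature descriptors instead.

```python
from typing import Iterator, List

Gray = List[List[float]]  # rows of grayscale intensities


def to_gray(rgb_rows) -> Gray:
    """Collapse (r, g, b) pixels to luminance so colorized and plain prints compare alike."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb_rows]


def resize(img: Gray, width: int, height: int) -> Gray:
    """Nearest-neighbour resize to a fixed grid, which removes the effect of scan size."""
    src_h, src_w = len(img), len(img[0])
    return [
        [img[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def dhash(img: Gray, size: int = 8) -> int:
    """Difference hash: one bit per pair of horizontally adjacent cells (size * size bits)."""
    small = resize(img, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def orientations(img: Gray) -> Iterator[Gray]:
    """Yield the 8 versions of the image under mirroring and 90-degree rotations."""
    current = img
    for _ in range(4):
        yield current
        yield [row[::-1] for row in current]                   # horizontal mirror
        current = [list(col) for col in zip(*current[::-1])]   # rotate 90 degrees clockwise


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def distance(query: Gray, candidate_hash: int) -> int:
    """Smallest Hamming distance between any orientation of the query and a stored hash."""
    return min(hamming(dhash(view), candidate_hash) for view in orientations(query))
```

A page would then count as a match when `distance` falls below a threshold that has to be tuned on real scans; averaging pixels during the resize, rather than sampling single pixels as done here, would probably make the hash less sensitive to local ink damage.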
Constraints & Considerations
- Approaches using image classifiers, local feature descriptors, or other vision methods are welcome.
- A fast matching algorithm is required given the large number of fetched images.
- Solutions that do not require a GPU and can run locally are especially encouraged.
- Creativity in lightweight or approximate matching is valued.
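As one possible reading of "fast", CPU-only, and "approximate", the following sketch indexes 64-bit image hashes so that a query does not have to be compared against every page. It is an assumption-laden illustration, not a required design: the `HashIndex` class and its methods are hypothetical, and it presumes each page has already been reduced to a 64-bit integer fingerprint by some hashing step. The hash is split into 4 bands of 16 bits; if two hashes differ in at most 3 bits, at least one band must be identical (pigeonhole principle), so only pages sharing a band with the query are checked in full.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

BANDS = 4        # number of chunks the 64-bit hash is split into
BAND_BITS = 16   # bits per chunk (BANDS * BAND_BITS == 64)


def bands(h: int) -> List[Tuple[int, int]]:
    """Split a 64-bit hash into (band index, band value) keys."""
    mask = (1 << BAND_BITS) - 1
    return [(i, (h >> (i * BAND_BITS)) & mask) for i in range(BANDS)]


class HashIndex:
    """In-memory index that finds stored hashes within a small Hamming distance of a query."""

    def __init__(self) -> None:
        self.buckets: Dict[Tuple[int, int], List[str]] = defaultdict(list)
        self.hashes: Dict[str, int] = {}

    def add(self, page_url: str, h: int) -> None:
        """Register a page under each of its band keys."""
        self.hashes[page_url] = h
        for key in bands(h):
            self.buckets[key].append(page_url)

    def query(self, h: int, max_distance: int = BANDS - 1) -> List[Tuple[int, str]]:
        """Return (distance, page_url) pairs sorted by distance, closest first."""
        if max_distance >= BANDS:
            raise ValueError("banding only guarantees full recall for max_distance < BANDS")
        # Candidates are pages that share at least one exact band with the query.
        candidates = {url for key in bands(h) for url in self.buckets.get(key, [])}
        hits = [(bin(h ^ self.hashes[url]).count("1"), url) for url in candidates]
        return sorted(hit for hit in hits if hit[0] <= max_distance)
```

With a few thousand pages, a plain linear scan over all hashes may already be fast enough; the banding only starts to pay off as the working set grows toward the full archive. Larger distance tolerances would need more bands or a different approximate nearest-neighbour structure.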
Team
Our team will ideally include:
- Computer Vision engineer: interested in image processing, feature extraction, and pattern detection.
- Backend engineer: someone with expertise in working with APIs and cloud data extraction.
- Usability engineer: a designer interested in creating a web-based UI for our tool.
Contacts
For any questions, contact matteo.boi@unibe.ch.
This challenge originates from Torben Hanhart at the Institute of Art History, University of Bern.
Fig. 1: Printer’s imprint used in Bern, ca. 1400–1600. Example reference image, with the corresponding correct match identified within the archive.