Detection of missing and misplaced books

Description

Can software be used to determine whether thousands of physical books are correctly placed on shelves and which are missing? This check is currently performed manually by staff at Leiden University Libraries. In this project, students are designing software that can be used in practice to significantly speed up this time-consuming task. The focus is on practically usable software, not on flawless recognition algorithms. Students are given the freedom to choose their own approach and explore creative solutions, as long as they make well-founded choices that contribute to faster and more targeted checks of bookcases.

In theory, you could identify each book individually by scanning its barcode. In practice, this isn't realistic: the university library holds kilometers of books, and because the barcode is located inside the book, every book would have to be physically taken off the shelf. A much more efficient approach is to capture images of each shelf, have software analyze them, and compare the results with data from the database. By recognizing the location code on each book's spine, the software can determine whether a book is correctly positioned relative to the surrounding books and flag any missing books. Additional information on the spine, such as the title, can be used to verify whether the software has correctly recognized the location code. The challenge lies not in perfect recognition, but in identifying most discrepancies correctly, so that the software can serve as a first filter for further inspection.
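As an illustration of the comparison step, the sketch below takes the location codes the database expects on a shelf (already in correct order) and the codes read from an image (left to right), and reports which books look misplaced or missing. It is one possible approach, not a prescribed one: it treats the longest run of books that is already in the right relative order as correct and flags the rest. The function name `check_shelf` and the sample codes are made up for this example, and the codes are assumed to be unique per shelf.

```python
from bisect import bisect_left


def check_shelf(expected, detected):
    """Compare codes read from a shelf image with the expected order.

    expected: location codes in correct shelf order (assumed unique).
    detected: location codes as read from the image, left to right.
    """
    rank = {code: i for i, code in enumerate(expected)}
    known = [c for c in detected if c in rank]
    unexpected = [c for c in detected if c not in rank]

    # Longest strictly increasing subsequence of expected ranks:
    # books on it are in the right relative order, the others are not.
    tail_vals = []  # smallest final rank of a run of each length
    tail_idx = []   # index in `known` of that final book
    prev = [-1] * len(known)
    for i, code in enumerate(known):
        r = rank[code]
        k = bisect_left(tail_vals, r)
        if k == len(tail_vals):
            tail_vals.append(r)
            tail_idx.append(i)
        else:
            tail_vals[k] = r
            tail_idx[k] = i
        prev[i] = tail_idx[k - 1] if k > 0 else -1

    # Walk back from the end of the longest run to collect its members.
    in_order = set()
    i = tail_idx[-1] if tail_idx else -1
    while i != -1:
        in_order.add(i)
        i = prev[i]

    seen = set(known)
    return {
        "misplaced": [c for i, c in enumerate(known) if i not in in_order],
        "missing": [c for c in expected if c not in seen],
        "unexpected": unexpected,
    }


expected = ["QA76.5", "QA76.6", "QA76.7", "QA76.8", "QA76.9"]
detected = ["QA76.5", "QA76.8", "QA76.6", "QA76.7", "ZZ1"]
report = check_shelf(expected, detected)
```

For this sample, `report` marks `QA76.8` as misplaced, `QA76.9` as missing, and `ZZ1` as unexpected. A code that was misread by the recognition step would also show up as unexpected, which is one reason the output is best treated as a hint for manual inspection rather than a verdict.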

Expected MVP

A minimum viable version demonstrates that software can meaningfully support the manual inspection process by making deviations on the shelf visible.

The MVP:

* can read an Excel export from the library database and sort it by LCC location code.
* can create or read images of a single bookshelf.
* detects individual book spines in the image.
* detects location codes (LCC) as accurately as possible (but not perfectly), allowing for assumptions based on data from the database.
* compares the information detected from the images with the expected books according to the Excel export.
* can indicate per shelf whether manual inspection is required (live during filming or saved in such a way that the shelf is easy to find), so that staff can specifically check where something is likely to be wrong.
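Sorting by LCC location code cannot rely on plain string comparison: `QA9` belongs before `QA76`, and the digits of a Cutter such as `.P98` are compared as a decimal fraction, so `.P98` comes before `.P985`. The sketch below shows one simplified way to build a sort key. It assumes the export has already been loaded into a list of row dictionaries (for example with a spreadsheet library) and that the code sits in a column called `location_code`; that column name, the function name `lcc_sort_key`, and the sample rows are all hypothetical. Real exports may contain code variants this pattern does not cover.

```python
import re

# Class letters, optional class number, then everything else (Cutters, year).
_LCC = re.compile(r"^\s*(?P<cls>[A-Z]{1,3})\s*(?P<num>\d+(?:\.\d+)?)?(?P<rest>.*)$")
_CUTTER = re.compile(r"\.?([A-Z])(\d+)")
_YEAR = re.compile(r"(?<![A-Z.\d])(\d{4})\b")


def lcc_sort_key(code):
    """Return a tuple that sorts LCC-style codes in shelf order (simplified)."""
    m = _LCC.match(code.upper())
    if m is None:
        # Unparseable codes sort after all regular class letters.
        return ("~", 0.0, (), 0)
    number = float(m.group("num") or 0)
    rest = m.group("rest")
    # Cutter digits stay strings: "98" < "985" < "99", like decimal fractions.
    cutters = tuple(_CUTTER.findall(rest))
    year = _YEAR.search(rest)
    return (m.group("cls"), number, cutters, int(year.group(1)) if year else 0)


# Hypothetical rows, standing in for the Excel export.
rows = [
    {"location_code": "QA76.9.D3 C67 2001", "title": "Database Systems"},
    {"location_code": "QA76.73.P98 L88 2019", "title": "Learning Python"},
    {"location_code": "QA9 .B5", "title": "Logic"},
    {"location_code": "PR6005.O4 Z5", "title": "Conrad"},
    {"location_code": "QA76.73.P985 A1", "title": "Another Python Book"},
]
rows.sort(key=lambda row: lcc_sort_key(row["location_code"]))
ordered_codes = [row["location_code"] for row in rows]
```

After sorting, the sample rows run from `PR6005.O4 Z5` through `QA9 .B5`, `QA76.73.P98 L88 2019`, and `QA76.73.P985 A1` to `QA76.9.D3 C67 2001`. Keeping the key function separate from the data loading makes it straightforward to swap in a different key for other location code systems later.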

A fully developed version:

* detects, in addition to the location code, other information (title, author, or editor) on the book's spine.
* uses the additional information on the spine of the book to verify that the location code has been read correctly, or to determine the location code if it has not been correctly recognized.
* can not only indicate that a discrepancy exists on a shelf but also determine its position.
* presents these discrepancies in a way that makes it easy for staff to physically locate them.
* supports multiple types of location codes; in addition to LCC, support for the location codes of core books in the law library is particularly desirable for practical use in daily checks.
* can process multiple bookshelves in succession.
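Using the title to double-check a location code could look roughly like the sketch below. It compares the title read from the spine with the title the database lists for the recognized code, and falls back to searching by title when the code is not in the database. Fuzzy matching with Python's `difflib.SequenceMatcher` is just one option chosen for illustration; the function names, the `catalogue` dictionary, and the threshold of 0.6 are assumptions that would need tuning on real spine images.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve_book(ocr_code, ocr_title, catalogue, threshold=0.6):
    """Return (location_code, status) for one detected spine.

    catalogue: dict mapping expected location codes to titles.
    threshold: minimum similarity to accept a title match (assumed value).
    """
    if ocr_code in catalogue:
        # The code exists in the database: does the title agree with it?
        if similarity(ocr_title, catalogue[ocr_code]) >= threshold:
            return ocr_code, "confirmed"
        return ocr_code, "code-title mismatch"

    # The code was not recognized: try to recover it from the title.
    best_code, best_score = None, 0.0
    for code, title in catalogue.items():
        score = similarity(ocr_title, title)
        if score > best_score:
            best_code, best_score = code, score
    if best_score >= threshold:
        return best_code, "recovered from title"
    return None, "unresolved"


# Hypothetical catalogue entries and noisy readings.
catalogue = {
    "QA76.73.P98 L88": "Learning Python",
    "QA76.9.D3 C67": "Database Systems",
}
confirmed = resolve_book("QA76.73.P98 L88", "Learnlng Pyth0n", catalogue)
recovered = resolve_book("QA7G.73", "Database Systems", catalogue)
```

Here `confirmed` keeps the recognized code despite two misread characters in the title, and `recovered` maps the unreadable code to `QA76.9.D3 C67` via the title. Spines that end up as "code-title mismatch" or "unresolved" are natural candidates for manual inspection.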

For this project, what matters is that location codes define a sort order; the substantive meaning of the codes is not relevant. Leiden University Libraries (UBL) uses various shelving systems, of which the Library of Congress Classification (LCC) system is the most common and forms the basis for this project. Other location codes are used for specific collections, including those for core books in the law library. Other systems used in the university library include NLM (National Library of Medicine) and MSC (Mathematics Subject Classification).

Project details