Images to Text for Museum Collection
Museum collections are an important source of information for understanding historical knowledge about culture, people, and scientific discoveries. Museums host a range of data, including metadata (such as descriptive texts), images of objects (often from multiple views), and notes. For this project, we use the collections of the University of Melbourne medical museums (Medical History Museum, Henry Forman Atkinson Dental Museum, and Harry Brookes Allen Museum of Anatomy and Pathology).
Some collections contain images (some taken from different viewing angles or at different scales) together with their associated descriptions. This project will explore vision-language models to extract a detailed description of an object, and will use the outcomes to enhance existing collection descriptions or to construct new descriptions where no descriptive text is available.
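The enhance-or-construct decision described above can be sketched roughly as follows. This is a minimal illustration only, not the project's actual pipeline: the CollectionRecord type, the prompt wording, and the stub_captioner function are all hypothetical, and a real vision-language model would be plugged in wherever the captioner is called.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CollectionRecord:
    """Hypothetical record: one museum object with its images and optional text."""
    object_id: str
    image_paths: List[str]
    description: Optional[str] = None


# Hypothetical interface: any callable taking (image paths, prompt) and returning text.
# A real vision-language model would be wrapped to match this signature.
Captioner = Callable[[List[str], str], str]


def build_prompt(record: CollectionRecord) -> str:
    """Choose between enhancing an existing description and writing a new one."""
    existing = (record.description or "").strip()
    if existing:
        return (
            "Enhance the following catalogue description using details "
            "visible in the images. Existing description: " + existing
        )
    return "Write a detailed catalogue description of the object shown in the images."


def describe(record: CollectionRecord, captioner: Captioner) -> str:
    """Send all views of the object, plus the chosen prompt, to the captioner."""
    return captioner(record.image_paths, build_prompt(record))


def stub_captioner(image_paths: List[str], prompt: str) -> str:
    """Stand-in for a model call (assumption); it only reports which branch ran."""
    mode = "enhanced" if prompt.startswith("Enhance") else "new"
    return f"[{mode} description from {len(image_paths)} image(s)]"
```

Passing the captioner in as a parameter keeps the branching logic independent of any particular model, so different vision-language models could be compared on the same records.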
Supervisors: Caren Han, Jianzhong Qi, Eun-Jung Holden
https://mdhs.unimelb.edu.au/about/our-museums
Literature reading
1. I2T: Image Parsing to Text Description
2010, And-Or Graph