Although deep neural networks have been widely adopted and proven effective across many applications, training by backpropagation remains complex and difficult to interpret in a way that mirrors human understanding. Humans can swiftly grasp and interpret complex patterns to make decisions, whereas neural networks depend on computationally intensive learning procedures.
The XV Scanner™ is the world's only fully dedicated X-ray lung function scanner, designed to do something unique: capture a single, full-breath cycle in motion from four different angles simultaneously.
To optimise the scanning procedure, it is desirable to have a real-time, non-contact method for monitoring the patient's breathing. This project focuses on developing a depth-camera-based breath-tracking algorithm that extracts a highly dependable, real-time breath trace from a seated patient. The breath trace may then be used to automatically initiate and terminate X-ray exposures during the patient's breathing cycle.
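As a minimal sketch of one plausible approach (the chest region of interest, frame format, and smoothing window here are illustrative assumptions, not the project's actual design), the mean depth over a chest region can be tracked per frame, so chest rise and fall appears as a one-dimensional signal over time:

```python
import numpy as np

def breath_trace(depth_frames, roi, window=5):
    """Estimate a breath trace from a sequence of depth frames.

    depth_frames: iterable of 2-D numpy arrays (depth per pixel)
    roi: (top, bottom, left, right) bounds of the chest region
    window: moving-average length for temporal smoothing
    """
    top, bottom, left, right = roi
    # Mean chest depth per frame rises and falls with respiration.
    trace = np.array([f[top:bottom, left:right].mean() for f in depth_frames])
    # Simple moving average to suppress sensor noise.
    kernel = np.ones(window) / window
    return np.convolve(trace, kernel, mode="same")
```

Peaks and troughs of such a trace would then mark the turning points of the breathing cycle at which exposures could be started or stopped.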
Museum collections are an important source of information for understanding historical knowledge about culture, people, and scientific discoveries. Museums hold a range of data, including metadata (such as descriptive texts), images of objects (from multiple views), and notes. For this project, we use the collections of the University of Melbourne medical museums (Medical History Museum, Henry Forman Atkinson Dental Museum, and Harry Brookes Allen Museum of Anatomy and Pathology).
With the growing adoption of wearable and mobile devices, the use of AI in mobile health is becoming increasingly relevant for a variety of healthcare applications. However, deploying AI models on these resource-limited client devices poses significant challenges, restricting their applicability. This is especially the case with the introduction of large language models (LLMs) and multimodal LLMs, which require substantial memory and therefore present additional deployment hurdles.
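As a rough back-of-the-envelope illustration (the 7-billion-parameter size and fp16 precision are illustrative assumptions, not figures from this project), the weights alone of a moderately sized LLM already exceed the memory available on most mobile devices:

```python
# Memory footprint of LLM weights alone (ignoring activations and KV cache).
params = 7e9          # assumed: a 7-billion-parameter model
bytes_per_param = 2   # assumed: fp16 precision
print(f"~{params * bytes_per_param / 2**30:.1f} GiB")  # ~13.0 GiB
```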
Model reprogramming [1] and input visual prompts [2] have been used successfully to reuse pretrained image classifiers. This project aims to explore whether model reprogramming can enable the transfer of pretrained deep neural networks across low-level image processing tasks. Specifically, we will investigate the feasibility of adapting pretrained models (e.g., U-Net) to image denoising, enhancement, and super-resolution by introducing task-specific visual prompts.
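As a minimal sketch of the reprogramming idea (the additive prompt form, frozen backbone, and class names are assumptions for illustration, not a prescribed design), a learnable image-sized prompt is added to every input while the pretrained network stays frozen, so only the prompt is trained for the new task:

```python
import torch
import torch.nn as nn

class VisualPromptWrapper(nn.Module):
    """Freeze a pretrained model; learn only an additive input prompt."""

    def __init__(self, pretrained, image_shape=(3, 256, 256)):
        super().__init__()
        self.pretrained = pretrained
        for p in self.pretrained.parameters():
            p.requires_grad = False  # backbone stays fixed
        # One task-specific prompt, broadcast across the batch.
        self.prompt = nn.Parameter(torch.zeros(1, *image_shape))

    def forward(self, x):
        return self.pretrained(x + self.prompt)
```

Training then optimises only the prompt against the new task's loss, e.g. an L2 loss between the wrapped model's output and clean images for denoising.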
Pose-guided person image synthesis aims to render an image of a person with a desired pose and appearance: the appearance is defined by a given source image and the pose by a set of keypoints. Control over the synthesised person image in terms of pose and style is an important requirement for applications such as e-commerce, virtual reality, the metaverse, and content generation for the entertainment industry. However, current solutions usually require a significant amount of time to generate an image. This project aims to develop a new diffusion model that can generate images in seconds.
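One common way to condition such a model (sketched here under assumptions; the project's actual architecture may differ) is to rasterise the pose keypoints into per-joint heatmaps and stack them with the source appearance image as input channels to the denoiser:

```python
import torch

def keypoints_to_heatmaps(keypoints, h, w, sigma=4.0):
    """Rasterise (x, y) keypoints into one Gaussian heatmap per joint."""
    ys = torch.arange(h, dtype=torch.float32).view(h, 1)
    xs = torch.arange(w, dtype=torch.float32).view(1, w)
    maps = [torch.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma**2))
            for x, y in keypoints]
    return torch.stack(maps)  # (num_joints, h, w)

# Toy example: conditioning channels for one denoising step.
pose = [(64.0, 40.0), (60.0, 80.0), (70.0, 120.0)]   # illustrative keypoints
heatmaps = keypoints_to_heatmaps(pose, 128, 128).unsqueeze(0)
x_t = torch.randn(1, 3, 128, 128)      # noisy target image at step t
source = torch.randn(1, 3, 128, 128)   # source appearance image
cond = torch.cat([x_t, source, heatmaps], dim=1)  # (1, 9, 128, 128) input
```

The speed-up to second-scale generation would then come from the sampling side (e.g., few-step samplers or distillation) rather than from the conditioning itself.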