Author : Chenyan Wu
Book Synopsis: Spatial, Temporal, and Morphological Perspectives, by Chenyan Wu
Spatial, Temporal, and Morphological Perspectives was written by Chenyan Wu and released in 2024. Book excerpt: Artificial intelligence (AI) has undergone significant transformation over the past decade, influencing a multitude of sectors and reshaping our industrial, economic, and societal frameworks. One outstanding application in this evolution is ChatGPT, which stems from the field of Natural Language Processing (NLP). This technology has been successfully integrated into programming assistance, education, brainstorming, and similar tasks, notably enhancing workforce efficiency. Meanwhile, several promising computer vision (CV) applications, including autonomous driving, intelligent household robots, and AI medical diagnostics, are still in their developmental stages, with aspirations to reach milestones analogous to those accomplished by ChatGPT. Each of these three CV applications requires collaborative interaction with humans. Thus, for these systems to gain widespread adoption, it is crucial that they deeply understand visual data depicting humans. This dissertation is dedicated to the analysis of such data, exploring it from three distinct perspectives: spatial and temporal for the human body, and morphological for organs. The human body can be conceptualized as a geometric entity in 3-D space. This dissertation begins by examining a primary spatial attribute of the human body: its orientation relative to the camera's perspective. By building the most comprehensive human orientation dataset to date and developing a robust neural network, we achieve strong results in estimating human orientation. Subsequently, we advance the spatial representation of human bodies to its fullest extent, aiming to reconstruct every human mesh within a single image.
Unlike conventional methods that depend on learning image features, we construct coherent multi-human meshes using only multi-human 2-D poses as input, processed through a single graph neural network. Surprisingly, this simple network, despite its minimal input information, performs comparably to, or even better than, previous image-based approaches across various benchmarks. Such results indicate significant potential for future image-based approaches. Additionally, we investigate the human body from a temporal perspective. As human bodies move over time, static human images evolve into videos depicting human motion. To study human motion, we construct a highly precise video dataset focusing on human motor elements and propose a Transformer network to represent these elements. Our findings further demonstrate that features derived from human motion can significantly improve performance in Bodily Expressed Emotion Understanding (BEEU), setting a new state of the art in BEEU. These spatial-temporal characteristics do not account for the detailed appearances and textures of human organs. This detailed visual information differentiates individuals and plays a vital role in AI medical diagnosis. Lastly, this dissertation employs segmentation methodologies to analyze the morphological characteristics of a specific human organ: the placenta. In sum, our research highlights the profound potential of AI in understanding the visual data of humans, paving the way for innovative applications and enhanced human-machine collaboration.