Author : Dustin Lange
Publisher : Universitätsverlag Potsdam
ISBN 10 : 3869560819
Total Pages : 32 pages
Book Synopsis Extracting Structured Information from Wikipedia Articles to Populate Infoboxes by : Dustin Lange
Released : 2010
Book excerpt: Roughly every third Wikipedia article contains an infobox, a table that displays important facts about the subject in attribute-value form. The schema of an infobox, i.e., the set of attributes that can be expressed for a concept, is defined by an infobox template. Authors often do not specify all template attributes, which results in incomplete infoboxes. With iPopulator, we introduce a system that automatically populates the infoboxes of Wikipedia articles by extracting attribute values from the article's text. In contrast to prior work, iPopulator detects the structure of attribute values and exploits it to extract the parts of a value independently. We have tested iPopulator on the entire set of infobox templates and provide a detailed analysis of its effectiveness. For instance, we achieve an average extraction precision of 91% for 1,727 distinct infobox template attributes.
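The idea of filling missing infobox attributes from article text, with each part of a value extracted on its own, can be illustrated with a minimal Python sketch. This is not iPopulator's implementation: the excerpt does not describe how the system finds or learns value structures, so the hand-written regular expressions, the attribute names (`founded`, `employees`), the value formats, and the sample sentence below are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical infobox: None marks an attribute the author left unspecified.
infobox = {"name": "Example Corp", "founded": None, "employees": None}

# Hypothetical hand-written patterns, one per attribute. Each named group is
# one "part" of the value, so the parts are captured independently.
PATTERNS = {
    "founded": re.compile(r"founded in (?P<year>\d{4})"),
    "employees": re.compile(
        r"(?P<count>\d[\d,]*) employees \(as of (?P<year>\d{4})\)"
    ),
}

# Hypothetical value structure per attribute: how the parts are assembled.
FORMATS = {"founded": "{year}", "employees": "{count} ({year})"}


def populate(infobox, text):
    """Return a copy of the infobox with missing values filled from text."""
    filled = dict(infobox)
    for attr, value in infobox.items():
        # Keep values the author already provided; skip unknown attributes.
        if value is not None or attr not in PATTERNS:
            continue
        match = PATTERNS[attr].search(text)
        if match:
            # Assemble the value from its independently extracted parts.
            filled[attr] = FORMATS[attr].format(**match.groupdict())
    return filled


text = "Example Corp was founded in 1998 and has 5,000 employees (as of 2009)."
result = populate(infobox, text)
print(result)
```

In this sketch an attribute stays empty when its pattern finds no match, which trades recall for precision; whether iPopulator makes the same trade-off is not stated in the excerpt.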