Search Engine-Based Web Information Extraction

Author(s): Gijs Geleijnse (Philips Research, The Netherlands)
Copyright: 2009
Pages: 34
Source title: Semantic Web Engineering in the Knowledge Society
Source Author(s)/Editor(s): Jorge Cardoso (SAP Research, Germany) and Miltiadis D. Lytras (Effat University, Saudi Arabia)
DOI: 10.4018/978-1-60566-112-4.ch009

Abstract

This chapter discusses approaches to finding, extracting, and structuring information from natural language texts on the Web. Such structured information can be expressed and shared using the standard Semantic Web languages and hence be interpreted by machines. We focus on two tasks in Web information extraction: the first part covers mining facts from the Web, and the second presents an approach to collecting community-based metadata. A search engine is used to retrieve potentially relevant texts, from which instances and relations are extracted. The proposed approaches are illustrated with various case studies, which show that information can be reliably extracted from the Web using simple techniques.
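
As a rough illustration of the retrieve-then-extract idea described in the abstract, the following Python sketch applies one textual pattern to a handful of snippets and counts the relation instances it finds. It is not taken from the chapter: the "was born in" pattern, the function names, and the hard-coded snippets (which stand in for search engine results and include one deliberately wrong statement) are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical pattern for a "born in" relation (an assumption, not the chapter's).
# A name is one or more capitalized words separated by single spaces.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
PATTERN = re.compile(rf"({NAME}) was born in ({NAME})")

# Stand-in for texts returned by a search engine; a real system would
# query a search API here. The last snippet is intentionally incorrect.
SNIPPETS = [
    "Rembrandt was born in Leiden in 1606.",
    "Johannes Vermeer was born in Delft in 1632.",
    "It is well known that Rembrandt was born in Leiden.",
    "Some sources wrongly claim Rembrandt was born in Amsterdam.",
]


def extract_relations(snippets):
    """Count every (person, place) pair matched by PATTERN in the snippets."""
    counts = Counter()
    for snippet in snippets:
        for person, place in PATTERN.findall(snippet):
            counts[(person, place)] += 1
    return counts


def most_supported_place(counts, person):
    """Return the place most often paired with `person`, or None if unseen."""
    candidates = {place: n for (p, place), n in counts.items() if p == person}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    relation_counts = extract_relations(SNIPPETS)
    for (person, place), n in sorted(relation_counts.items()):
        print(f"{person} -> {place}: {n}")
    print(most_supported_place(relation_counts, "Rembrandt"))
```

Counting how often each pair occurs is one simple way to prefer well-supported statements over isolated ones; whether the chapter filters results this way is not stated in the abstract.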
