Query Rewriting for Extracting Data Behind HTML Forms

Authors: David W. Embley, Stephen W. Liddle, Xueqi (Helen) Chen

Tags: 2004, conceptual modeling

Much of the information on the Web is stored in specialized searchable databases and can only be accessed by interacting with a form or a series of forms. As a result, enabling automated agents and Web crawlers to interact with form-based interfaces designed primarily for humans is of great value. This paper describes a system that can fill out Web forms automatically according to a given user query against an ontological description of an application domain and, to the extent possible, can extract just the relevant data behind these Web forms. Experimental results on two application domains show that the approach can work well.
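
The split between filling out the form and extracting only the relevant data can be pictured with a minimal sketch. It is not the paper's implementation: the `FormField` class, the `rewrite_query` and `filter_records` helpers, and the car-ad field names are all hypothetical, and the sketch assumes the mapping from ontology concepts to form fields is already known and that constraints are simple equality tests.

```python
from dataclasses import dataclass


@dataclass
class FormField:
    """Hypothetical description of one form input."""
    name: str            # parameter name used when submitting the form
    concept: str         # ontology concept this field is assumed to match
    options: tuple = ()  # allowed values for a select field; empty means free text


def rewrite_query(query, fields):
    """Split a query into values the form can accept and leftover constraints.

    `query` maps ontology concepts to required values (equality only).
    Returns (form_values, residual): form_values is what gets submitted,
    residual is what must be checked against the returned records.
    """
    by_concept = {f.concept: f for f in fields}
    form_values, residual = {}, {}
    for concept, value in query.items():
        field = by_concept.get(concept)
        if field is not None and (not field.options or value in field.options):
            form_values[field.name] = value
        else:
            # The form has no matching field, or does not offer this value.
            residual[concept] = value
    return form_values, residual


def filter_records(records, residual):
    """Keep only records that satisfy every leftover constraint."""
    return [
        r for r in records
        if all(r.get(concept) == value for concept, value in residual.items())
    ]


# Illustrative form and query (invented for this sketch).
fields = [
    FormField("mk", "Make", ("Ford", "Honda")),
    FormField("zip", "ZipCode"),
]
query = {"Make": "Honda", "Color": "red", "ZipCode": "84602"}
form_values, residual = rewrite_query(query, fields)
```

Here `Make` and `ZipCode` go into the form submission, while `Color` has no corresponding field and is applied afterwards by `filter_records` to whatever records the form returns.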

Read the full paper here: https://link.springer.com/chapter/10.1007/978-3-540-30466-1_31