Authors: Mengchi Liu, Tok Wang Ling
Tags: 2000, conceptual modeling
Most documents available over the web conform to the HTML specification. Such documents are hierarchically structured in nature. The existing graph-based or tree-based data models for the web only provide a very low level representation of such hierarchical structure. In this paper, we introduce a conceptual model for the web that is able to represent the complex hierarchical structure within the web documents at a high level that is close to human conceptualization/visualization of the documents. We also describe how to convert HTML documents based on this conceptual model. Using the conceptual model and conversion method, we can capture the essence (i.e., semistructure) of HTML documents in a natural and simple way.Read the full paper here: https://link.springer.com/chapter/10.1007/3-540-45393-8_17