ER - Intl Conf on Conceptual Modeling

Extraction of Partial XML Documents Using IR-Based Structure and Contents Analysis

October 20, 2020


Authors: Hiroko Kinutani, Kenji Hatano, Masatoshi Yoshikawa, Shunsuke Uemura

Tags: 2001, conceptual modeling

As Internet technologies develop, XML is becoming widely used as a standard data/document format. Although XML documents have attracted public attention, the application of IR technologies to XML document retrieval is still at an early stage. We foresee that typical end-user XML queries will be very terse, like those used with current Web search engines. Therefore, an XML search engine should be able to return appropriate retrieval results from only a few keywords. In this paper, we introduce the notion of context nodes. Context nodes are used to automatically extract coherent partial documents without knowledge of XML document structures. This method is useful because it does not require domain analysts to analyze DTDs and specify candidate partial documents beforehand. We use the term “context search” for search methods that employ the notion of context nodes. As an instantiation of context search methods, we have developed algorithms to identify result partial documents in the vector space model. We conducted a performance evaluation to verify the effectiveness of our method.
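
To make the general idea concrete, here is a minimal sketch of keyword retrieval over partial XML documents in a vector space model. It is not the paper's context-node algorithm: it simply treats every element's subtree as a candidate partial document and ranks candidates by cosine similarity of raw term frequencies. The function names (`tokenize`, `cosine`, `rank_subtrees`) and the sample document are assumptions made for illustration.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_subtrees(xml_text, query):
    """Score each element's subtree text against the query, best first.

    Assumption: every element is a candidate partial document. The paper
    instead selects candidates via context nodes, which is not modeled here.
    """
    root = ET.fromstring(xml_text)
    query_vec = Counter(tokenize(query))
    scored = []
    for elem in root.iter():
        subtree_vec = Counter(tokenize(" ".join(elem.itertext())))
        scored.append((cosine(query_vec, subtree_vec), elem.tag))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical sample document, used only for illustration.
SAMPLE = """
<book>
  <chapter><title>XML retrieval</title><para>Keyword search over XML documents.</para></chapter>
  <chapter><title>Cooking</title><para>Recipes for pasta and soup.</para></chapter>
</book>
"""

ranking = rank_subtrees(SAMPLE, "xml keyword search")
best_score, best_tag = ranking[0]
print(best_tag, round(best_score, 3))
```

In this sketch the small, focused subtree scores higher than the whole document because unrelated text in larger subtrees dilutes the cosine score. A practical system would likely add term weighting such as TF-IDF and a principled way to choose which subtrees count as coherent partial documents.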

Read the full paper here: https://link.springer.com/chapter/10.1007/3-540-46140-X_26

