Data & Knowledge Engineering

Natural language analysis for semantic document modeling

October 22, 2020


Authors: Jon Atle Gulla, Terje Brasethvik

Tags: 2001, conceptual modeling

To ease the retrieval of documents published on the Web, the documents should be classified in a way that users find helpful and meaningful. This paper presents an approach to semantic document classification and retrieval based on natural language analysis and conceptual modeling. Users may define their own conceptual domain model, which is then used in combination with linguistic tools to define a controlled vocabulary for a document collection. Users may browse this domain model and interactively classify documents by selecting model fragments that describe the contents of the documents. Natural language tools are used to analyze the text of the documents and propose relevant model fragments in terms of selected domain model concepts and named relations. The proposed fragments are refined by the users and stored as document descriptions in RDF–XML format. For document retrieval, lexical analysis is used to pre-process search expressions and map these to the domain model for manual query-refinement. A prototype of the system is described, and the approach is illustrated with examples from a document collection published by the Norwegian Center for Medical Informatics (KITH).
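The concept-proposal and query-mapping steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the domain model, its concept names, the term lists, and the function names `tokenize` and `propose_concepts` are all hypothetical, and simple term counting stands in for the linguistic tools used in the paper. The same function is applied to document text (to propose concepts for classification) and to a search expression (to map it onto the domain model before manual refinement).

```python
import re
from collections import Counter

# Hypothetical domain model: each concept lists the surface terms that
# form its controlled vocabulary. These names are illustrative only.
DOMAIN_MODEL = {
    "Patient": {"patient", "patients"},
    "HealthRecord": {"record", "records"},
    "Diagnosis": {"diagnosis", "diagnoses"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def propose_concepts(text, model=DOMAIN_MODEL):
    """Return the concepts whose terms occur in the text.

    Concepts are ordered by descending match count, with ties broken
    alphabetically so the result is deterministic.
    """
    token_counts = Counter(tokenize(text))
    scores = {
        concept: sum(token_counts[term] for term in terms)
        for concept, terms in model.items()
    }
    hits = [concept for concept, score in scores.items() if score > 0]
    return sorted(hits, key=lambda concept: (-scores[concept], concept))


if __name__ == "__main__":
    document = "The patient was given a diagnosis, and the patient record was updated."
    print(propose_concepts(document))

    query = "diagnoses for patients"
    print(propose_concepts(query))
```

In the approach the abstract describes, the proposed concepts would then be refined by the user rather than accepted automatically, which is why the sketch returns a ranked list instead of a single classification.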

Read the full paper here: https://pdf.sciencedirectassets.com/271546/1-s2.0-S0169023X00X00626/1-s2.0-S0169023X01000167/main.pdf?X-Amz-Security-Token=IQoJb3JpZ2luX2VjEBcaCXVzLWVhc3QtMSJHMEUCIQDVQn32LWNqaanjhMZLX3VM2Dvlo%2Bg8bEoMeXQeNUacCQIgOkhoUwx9j%2FlLBATE3%2BnVrwRwIe7LeXm08Yex7m%2FbLqgqtAMIMBADGgwwNTkwMDM1NDY4NjUiDK%2BO61X02G0vZFZfHCqRA6AL9JCIAprdzUASo%2B8yWHJWeQfLhN9WhJKTuFM0%2F9V2z%2BCy74gJ%2BS%2FIrqiS%2BPqhEF2EjGqRRMScroI%2FfI4muJsxlluECjAKIrcOhSIJNC0%2FuCNDTKbsJXesgBSCtkm6lkdV7gPU3%2BebnDT3K1TcMRrd19ar4Z6%2B5wYkec9tn8WRCpsOVHS4%2F%2BmcJLBwyFq9%2BCBvmRmiM9cKXs3zx3Oo4Z%2BfoA2WpHsxxRn1f9zGXMbGcujQ%2FhGcjY8m%2F%2BSZDimzFFS62vcS8DVcy2lEFoIp03g1Ah8HJ0RiVSciNBxA%2BL%2F1E6R8jER1bC9QEoUd9%2BrA9ApTo1vAhwKIDMYzkQgjzfYC1LB3dKaZLLDbiPbAB%2FCwsaXAFVV7IC%2BvQEN0N3tpHazTUXm0b5liwabTbMRV%2BH7KS4gN0fdDRebAxPOE5zlYQl3XcQSk7Xd6aUEJLuIRDpCnrJaksczcfbWOdE5bR%2FHSuXqnih8J09g654kbVw7djqxaykb9TlF4HlxNiLXg33w1EO3sydUdbvlV7xvwhdhNMNWOk%2FsFOusBeH%2F2YPtrLMHhj6kcBUYzoVUH%2FaYtyZU1MDHwP86U%2BCY2T9KsOF3XribUlYqvXO5JCbItsz1mvEBsdHI7BAvwDP2Xwc7pWNmIsK8lBoSvme%2BFA%2B3LVCyvzasOsMdVZbFwbRR092S9qRz392QzNHdyIv1UErxkSn9Qw%2FHUATd79sWHMLe74vGqkKPNTi0fdcox2hWOgZ1N6SX6WX2QrOgmLFRIyyMxHN68%2B1oIIMfFwst5EeKP9ydVotoFnxqyr7tcYAH5321U7NhAz5soGLs3s%2F7RHwI7Bzg8DIErxmtdUacW8s9jCMgieslpwg%3D%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20200918T160134Z&X-Amz-SignedHeaders=host&X-Amz-Expires=300&X-Amz-Credential=ASIAQ3PHCVTYTRIXDH4Q%2F20200918%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=2e657dbd93825a3a29173b9051f374f3f06e2ae6519fb04f462078fa4301e76b&hash=36433c7929d97251c75ff17b8fb502a0bc57b728782e6e939d48dbf26fe3568f&host=68042c943591013ac2b2430a89b270f6af2c76d8dfd086a07176afe7c76c2c61&pii=S0169023X01000167&tid=spdf-0f863fb0-0c92-4577-9d1e-f7402dee994c&sid=ce3e6b217906524963386c21db5fa5ebfaccgxrqa&type=client
