CHOSUN

A Semantic Retrieval System for Web Documents Based on Conceptual Distance and Density (개념적 거리와 밀도를 기반으로 한 의미적 웹 문서 검색시스템)

Author(s)
황희철
Issued Date
2006
Abstract
The development and spread of internet technology has made a large volume of lecture data readily available. This data comes in many forms, from simple text to multimedia such as audio and video. However, data that follows custom templates cannot easily be retrieved with web queries. Describing and organizing this vast amount of content is therefore essential for realizing the full potential of the web as an information resource.
Classification should be considered within the larger context of subject-based metadata. Automatic classification is needed for at least two reasons. The first is the sheer scale of the resources available on the web and their ever-changing nature. The second is that classification is itself a subjective activity: different applications need different classification schemes. Domain-specific classification schemes that can be applied quickly to large amounts of content by automatic methods hold great promise for generating effective metadata.
As web content shifts from text to multimedia, metadata will become even more important. Semantic interpretation of multimedia is incomplete without some mechanism for understanding semantic content that is not directly visible. For this reason, human-assisted content annotation in natural language is one of the most common methods, particularly in multimedia retrieval applications, and it provides a means of exploiting syntactic, semantic, and lexical information. A simple form of human-assisted semantic annotation attaches a textual description (such as a keyword or a short sentence) to the multimedia data. Text-based (for example, keyword-based) image retrieval has two problems: the user must remember the exact annotation words in order to retrieve an item, and semantic or concept clusters must be managed efficiently. More advanced techniques such as thesaurus-based term rewriting can alleviate the problems of traditional keyword-based information retrieval. However, keyword-based retrieval still operates at the level of syntactic pattern matching. In other words, dissimilarity between terms is usually computed by string matching, not concept matching.
In this paper, we propose a method for classifying documents from dispersed web data by subject, using a similarity measure or conceptual density based on the subject classification of U-WIN. In particular, we focus on the similarity measure and conceptual density for automatic subject classification.
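The idea of concept matching rather than string matching can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy hierarchy, the edge-counting distance, the `1 / (1 + distance)` similarity, and the names `PARENT`, `conceptual_distance`, `similarity`, and `classify` are all assumptions made for illustration, and the hierarchy is not U-WIN data. Conceptual density is not covered here.

```python
from typing import Dict, List, Optional

# Hypothetical toy hierarchy (child -> parent). This is NOT U-WIN data.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "science": "entity",
    "art": "entity",
    "physics": "science",
    "biology": "science",
    "music": "art",
    "painting": "art",
    "optics": "physics",
    "genetics": "biology",
}


def ancestors(concept: str) -> List[str]:
    """Return the path from a concept up to the root, concept first."""
    path: List[str] = []
    node: Optional[str] = concept
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def conceptual_distance(a: str, b: str) -> int:
    """Count the edges between a and b via their lowest common ancestor."""
    steps_from_a = {c: i for i, c in enumerate(ancestors(a))}
    for steps_from_b, c in enumerate(ancestors(b)):
        if c in steps_from_a:
            return steps_from_a[c] + steps_from_b
    raise ValueError("concepts share no common ancestor")


def similarity(a: str, b: str) -> float:
    """Assumed similarity: shorter paths give values closer to 1."""
    return 1.0 / (1.0 + conceptual_distance(a, b))


def classify(terms: List[str], subjects: List[str]) -> str:
    """Pick the subject with the highest mean similarity to the terms."""
    return max(
        subjects,
        key=lambda s: sum(similarity(t, s) for t in terms) / len(terms),
    )


if __name__ == "__main__":
    # "optics" and "genetics" share no characters worth matching,
    # yet both sit under "science" in the toy hierarchy.
    print(conceptual_distance("optics", "genetics"))
    print(classify(["optics", "genetics"], ["science", "art"]))
```

In this sketch, two terms with unrelated spellings can still be close when they share a nearby ancestor, which is the property that plain keyword matching lacks.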
Alternative Title
A Semantic Retrieval System for Web Documents based on Conceptual Distance and Density
Alternative Author(s)
Hwang, Hee Chul
Affiliation
Graduate School of Education, Chosun University
Department
Graduate School of Education, Information and Computer Education
Advisor
김판구
Awarded Date
2006-08
Table Of Contents
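List of Tables = ⅲ
List of Figures = ⅳ
ABSTRACT = ⅴ
Ⅰ. Introduction = 1
A. Research Background = 1
B. Research Scope and Content = 2
Ⅱ. Related Work = 3
A. Underlying Technologies for Document Retrieval Systems = 3
1. Document Indexing = 3
2. Content-Based Document Filtering = 4
3. Collaborative Information Recommendation = 6
4. Multi-Agent Information Retrieval = 7
5. Learning User Interests = 8
B. Existing Internet Information Retrieval = 10
1. Search Engines = 10
2. Metasearch Engines = 11
3. Offline Browsers = 12
4. Recommender Systems = 13
5. News and Mail Filtering Systems = 15
6. Push Services = 16
C. Ontology = 19
Ⅲ. Proposed System and Similarity Measurement = 21
A. System Architecture = 21
1. Educational Information Metadata = 21
2. U-WIN = 22
B. Similarity Measurement = 26
1. Conceptual Distance Measurement = 26
2. Conceptual Density Measurement = 29
Ⅳ. Experiments and Analysis of Results = 32
A. Experiment Details = 32
B. Analysis of Results = 35
Ⅴ. Conclusion = 38
References = 39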
Degree
Master
Publisher
Graduate School of Education, Chosun University
Citation
황희철. (2006). 개념적 거리와 밀도를 기반으로 한 의미적 웹 문서 검색시스템.
Type
Dissertation
URI
https://oak.chosun.ac.kr/handle/2020.oak/14636
http://chosun.dcollection.net/common/orgView/200000232771
Appears in Collections:
Education > 3. Theses(Master)
Authorize & License
  • Authorize: Open
  • Embargo: 2008-09-01
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.