The Method of Domain Ontology Population Using Link Grammar (Link grammar를 이용한 도메인 온톨로지 확장 방안)

Author(s)
Byungsu Youn (윤병수)
Issued Date
2009
Abstract
An ontology consists of concepts, their definitions, and the relations among them. Recently, as the volume of data has grown, many ontologies no longer supply users with the information needed for reasoning, so the need for ontology population is considerable and the topic has been studied widely. Most studies, however, extract concepts, relations, and properties manually, which costs a great deal of time and money. Automatic ontology population has been studied to solve this problem. It saves time and money, but it requires an additional knowledge base or thesaurus and is therefore dependent on that source.
In this study, I address ontology population that does not rely on a well-known external dictionary but instead uses Link grammar and the infoboxes of Wikipedia. Link grammar is a theory of syntactic parsing for English. I analyze link patterns to determine which ones yield candidate relation-concept pairs, extract triples, and assign a weight to each triple. As a result, I obtain relations and concepts ranked by weight; the weights serve to select good triples. I then apply infoboxes and navboxes to classify the relations and concepts.
First, I gather biology documents from Wikipedia and apply a "stampling" process to obtain the body text of each wiki document. I then extract terminology, which forms a database used to select important sentences. Terminology extraction proceeds through tokenizing, tagging, and extraction, after which the terms are verified. Next, I select the important sentences and apply Link grammar to them. I define seven patterns for obtaining concept-relation triples, and then identify good triples by their PMI (pointwise mutual information) and TF values.
Finally, I classify concepts using infoboxes and navboxes, and classify relations using a predefined classification table that defines the relation hierarchy. I generate metadata from these properties and then visualize it.
Some errors remain in this study; they arise from incorrect POS tagging, stopword interference, and anaphora resolution. Nevertheless, the visualization works well, and more than one relation can be extracted from a single sentence. In future work, I will study named entity recognition in order to find correct triples.
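The triple-ranking step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the PMI and TF values are computed or combined, so the following makes its own assumptions: PMI is computed over the (subject, object) concept pair from counts of the extracted triples, TF is taken as the raw count of each triple, and triples are ordered by PMI with TF as a tie-breaker. The function name `rank_triples` and the sample triples are hypothetical and do not come from the thesis.

```python
import math
from collections import Counter


def rank_triples(triples):
    """Rank (subject, relation, object) triples by PMI, then by frequency.

    Assumption: probabilities are estimated from the list of extracted
    triples itself; the thesis may estimate them differently.
    """
    total = len(triples)
    pair_counts = Counter((s, o) for s, _, o in triples)
    subj_counts = Counter(s for s, _, _ in triples)
    obj_counts = Counter(o for _, _, o in triples)
    triple_counts = Counter(triples)

    scored = []
    for (s, r, o), tf in triple_counts.items():
        p_pair = pair_counts[(s, o)] / total
        p_subj = subj_counts[s] / total
        p_obj = obj_counts[o] / total
        # Pointwise mutual information of the two concepts.
        pmi = math.log2(p_pair / (p_subj * p_obj))
        scored.append(((s, r, o), pmi, tf))

    # Highest PMI first; ties are broken by higher frequency.
    scored.sort(key=lambda item: (item[1], item[2]), reverse=True)
    return scored


if __name__ == "__main__":
    # Hypothetical triples, for illustration only.
    sample = [
        ("cell", "contains", "nucleus"),
        ("cell", "contains", "nucleus"),
        ("cell", "has", "membrane"),
        ("enzyme", "catalyzes", "reaction"),
    ]
    for triple, pmi, tf in rank_triples(sample):
        print(triple, round(pmi, 3), tf)
```

Sorting on a (PMI, TF) key is only one way to combine the two values; a weighted sum or a threshold on each value would fit the abstract's description equally well.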
Alternative Title
The Method of Domain Ontology Population Using Link grammar
Alternative Author(s)
Byungsu Youn
Affiliation
Intelligent Computing Laboratory
Department
General Graduate School, Department of Computer Engineering
Advisor
Pankoo Kim (김판구)
Awarded Date
2010-02
Table Of Contents
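Ⅰ. Introduction 1
A. Research Background and Purpose 1
B. Research Content and Organization 3

Ⅱ. Related Work 4
A. Research Using Wikipedia 4
B. Link Grammar 6

Ⅲ. Key Sentence Extraction 11
A. Document Processing 12
1. Wikipedia Body Text Extraction 12
2. Sentence Preprocessing 13
B. Key Sentence Extraction 14
1. Keyword Extraction 14
a. Tokenizing 15
b. Tagging 16
c. Keyword Extraction 17
d. Comparison of Results with Term Finder 19
2. Extraction of Sentences Containing Keywords 20
a. Sentence Extraction Using Keywords 21
b. Sentence Splitting 21

Ⅳ. Relation and Concept Extraction Using Link Grammar 24
A. Pattern-Based Triple Extraction 24
1. Definition of Triples and Link Relations 24
2. Triple Generation Patterns Within a Sentence 25
B. Triple Ranking 32
1. Weighting by Mutual Information 32
2. Weighting Using Instances 33

Ⅴ. Ontology Design and Visualization 35
A. Definition of Instances and Relations 35
1. Instance Definition 35
2. Relation Definition 37
B. Metadata Generation 40
1. Generation of Relations and Instances 40
2. Links Within Triples and Relation Generation 42
C. Visualization 44
1. Interface 44

Ⅵ. Experiments and Evaluation 47
A. Experimental and Application Methods 47
B. Evaluation by Precision Rate and Recall Rate 51
C. Experimental Evaluation 53

Ⅶ. Conclusion and Suggestions 54

References 55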
Degree
Master
Publisher
Chosun University
Citation
Youn, Byungsu. (2009). The Method of Domain Ontology Population Using Link Grammar [Link grammar를 이용한 도메인 온톨로지 확장 방안].
Type
Dissertation
URI
https://oak.chosun.ac.kr/handle/2020.oak/8395
http://chosun.dcollection.net/common/orgView/200000239169
Appears in Collections:
General Graduate School > 3. Theses(Master)
Authorize & License
  • Authorize: Open
  • Embargo: 2010-01-25