CHOSUN

A Study on Document Structure Similarity Analysis Based on Rhetorical Structure Theory Relation Labels

Author(s)
서동원
Issued Date
2023
Abstract
Text processing, which previously relied mainly on statistical techniques, has improved greatly in performance with the development of deep learning. As a result, a growing number of companies are adopting text processing models in their business. These models are used mainly for document analysis, such as classifying customer queries and analyzing reports. More recently, models based on document analysis techniques have also begun to be used in the recruitment process, where they have significantly increased interviewers' efficiency by quantifying each item of an applicant's resume.
Each document has different characteristics depending on its purpose. For example, a report used by a company or research institute interprets the results of an experiment or is written in the form of causal relationships. In contrast, an opinion piece may also interpret a phenomenon, but it is written to assert the author's view of it and to persuade the reader. Likewise, a resume is written in the form of a cover letter. Texts of the same type share structural commonalities even when their content differs.
In this paper, we use Rhetorical Structure Theory (RST) to identify the structural commonalities that characterize each type of text and to assess how well a newly written text conforms to the structure of its type. In the experiment, we constructed a separate text vector space for each of two groups of documents of different types and analyzed the similarity.
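The kind of comparison the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the thesis's actual method: it assumes each document has already been reduced to a sequence of RST relation labels, that a document is vectorized as relative label frequencies, and that each document group is summarized by the mean of its vectors. The label inventory, the sample documents, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical label inventory; real RST parsers use a larger relation set.
LABELS = ["Elaboration", "Cause", "Contrast", "Evaluation", "Background"]


def label_vector(labels):
    """Map a document's RST relation labels to a relative-frequency vector."""
    counts = Counter(labels)
    total = sum(counts[name] for name in LABELS)
    if total == 0:
        return [0.0] * len(LABELS)
    return [counts[name] / total for name in LABELS]


def centroid(vectors):
    """Average a group's vectors (an assumed way to summarize one document type)."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(LABELS))]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up label sequences standing in for parsed documents of two types.
reports = [
    ["Cause", "Cause", "Elaboration", "Background"],
    ["Cause", "Elaboration", "Elaboration", "Background"],
]
opinions = [
    ["Evaluation", "Contrast", "Evaluation", "Elaboration"],
    ["Contrast", "Evaluation", "Evaluation", "Background"],
]

report_centroid = centroid([label_vector(d) for d in reports])
opinion_centroid = centroid([label_vector(d) for d in opinions])

# Compare a newly written document against each group.
new_doc = label_vector(["Cause", "Elaboration", "Cause", "Background"])
sim_report = cosine_similarity(new_doc, report_centroid)
sim_opinion = cosine_similarity(new_doc, opinion_centroid)
print(f"report: {sim_report:.3f}, opinion: {sim_opinion:.3f}")
```

Under these assumptions, a higher similarity to one group's centroid would suggest that the new text is structured more like documents of that type.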
Alternative Title
A study on the document structure similarity analysis based on Rhetoric Structure Theory Relationship Label
Alternative Author(s)
DongWon Seo
Affiliation
Chosun University Graduate School
Department
Graduate School, Department of Computer Engineering
Advisor
김판구
Awarded Date
2023-02
Table Of Contents
ABSTRACT

Introduction

Related Work
A. Text Analysis
i. Morphological Analysis
ii. Syntax Analysis
iii. Semantic Analysis
iv. Pragmatic Analysis
B. Rhetorical Structure Theory
i. RST Label
ii. Syntactic Parser
iii. RST Parser
C. Vector Similarity Analysis
i. Similarity Analysis in Vector Space: Cosine Similarity Analysis
ii. Text Vectorization
(a) One-hot Encoding
(b) Frequency-based Text Vectorization (TF-IDF)
(c) Word Embedding

Document Structure Similarity Analysis System
A. Document Structure Similarity Analysis System
B. Experiment

Conclusion

References
Degree
Master
Publisher
Chosun University Graduate School
Citation
서동원. (2023). A Study on Document Structure Similarity Analysis Based on Rhetorical Structure Theory Relation Labels.
Type
Dissertation
URI
https://oak.chosun.ac.kr/handle/2020.oak/17656
http://chosun.dcollection.net/common/orgView/200000650855
Appears in Collections:
General Graduate School > 3. Theses(Master)
Authorize & License
  • Authorize: Open
  • Embargo: 2023-02-24

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.