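A Study on Document Structure Similarity Analysis Based on Rhetorical Structure Theory Relation Labels (수사 구조 이론 관계 레이블 기반 문서 구조 유사성 분석에 대한 연구)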
- Author(s)
- 서동원
- Issued Date
- 2023
- Abstract
- Text processing, which used to rely mainly on statistical techniques, has improved greatly in performance with the development of deep learning. As a result, more and more companies are adopting text processing models in their business. These models are mainly used for document analysis tasks such as classifying customer queries and analyzing reports. More recently, models based on document analysis techniques have also begun to be used in the recruitment process, where they have significantly increased interviewers' efficiency by quantifying each item of an applicant's resume.
Each document has different characteristics depending on its purpose. For example, a report used by a company or a research institute interprets the results of an experiment or is written in the form of causal relationships. By contrast, an opinion piece may also interpret a phenomenon, but it is written to assert the author's opinion about it and to persuade the reader. Likewise, a resume is written in the form of a cover letter. Texts of the same type share structural commonalities even when their content differs.
In this paper, using Rhetorical Structure Theory (RST), we try to find the structural commonalities that characterize each type of text and to assess how well a newly written text is structured relative to them. The experiment analyzed similarity by constructing two text vector spaces for two groups of documents of different text types.
- Alternative Title
- A study on the document structure similarity analysis based on Rhetoric Structure Theory Relationship Label
- Alternative Author(s)
- DongWon Seo
- Affiliation
- Chosun University, General Graduate School
- Department
- General Graduate School, Department of Computer Engineering
- Advisor
- 김판구
- Awarded Date
- 2023-02
- Table Of Contents
Introduction 1
Related Work 5
A. Text Analysis 5
i. Morphological Analysis 7
ii. Syntax Analysis 12
iii. Semantic Analysis 13
iv. Pragmatic Analysis 14
B. Rhetorical Structure Theory 16
i. RST Label 16
ii. Syntactic Parser 20
iii. RST Parser 23
C. Vector Similarity Analysis 25
i. Similarity Analysis in Vector Space: Cosine Similarity 25
ii. Text Vectorization 26
(a) One-hot Encoding 27
(b) Frequency-based Text Vectorization (TF-IDF) 27
(c) Word Embedding 28
Document Structure Similarity Analysis System 30
A. Document Structure Similarity Analysis System 30
B. Experiment 34
Conclusion 39
References 40
- Degree
- Master
- Publisher
- Chosun University Graduate School
- Citation
- 서동원. (2023). 수사 구조 이론 관계 레이블 기반 문서 구조 유사성 분석에 대한 연구.
- Type
- Dissertation
- https://oak.chosun.ac.kr/handle/2020.oak/17656
Appears in Collections:
- General Graduate School > 3. Theses(Master)
- Authorize & License
- Authorize: Open
- Embargo: 2023-02-24