MPEG-7, currently being developed as a standard, aims at semantic recognition that covers not only low-level features but also the representation of spatio-temporal relationships and event recognition. However, it goes no further than describing content by attaching keywords to media objects as metadata, and expressing truly semantic content is not yet possible. To recognize video content semantically, we therefore consider a fine-grained semantic representation of the moving objects in a video to be essential. Building on the moving-object segmentation and tracking techniques developed so far, this thesis proposes applying an ontology as a means of representing the spatio-temporal relationships between objects and of matching these representations. This approach is expected to find many applications in future multimedia data processing.
The MPEG-7 visual standard under development specifies content-based descriptors that allow users or agents (or search engines) to measure similarity in images or video based on visual criteria, and that can be used to efficiently identify, filter, or browse images or video based on visual content. More specifically, MPEG-7 specifies color, texture, object shape, global motion, and object motion features for this purpose. This paper outlines the aim, methodologies, and broad details of the MPEG-7 standard development for video event description. Beyond the assistance provided by the MPEG-7 tools, we also put forward a novel method for video event analysis and description based on domain knowledge. Semantic concepts in the context of the video event are described in one specific domain, enriched with qualitative attributes of the semantic objects, multimedia processing approaches, and domain-independent factors: low-level features (pixel color, motion vectors, and spatio-temporal relationships). To apply large-scale semantic knowledge to vision problems effectively, and to support retrieval and indexing by naive users in semantic (human) language, this paper addresses a few major issues. First, how can we obtain the semantic shot for the specific domain knowledge? An existing algorithm is adopted to solve this problem. Second, what visual observables should be collected? This usually depends on the problem domain. Here, we take one shot of a billiard game clip as the specific domain knowledge.
Third, how can these observables be translated into a semantic representation? We address this issue from two aspects: first, video event representation using MPEG-7 high-level descriptors, which are defined in the MPEG-7 XML files; second, video object motion analysis with the help of the MPEG-7 low-level descriptors (video object motion detection and motion trajectory analysis). In addition, the most important contribution of this work is exploiting the video object ontology to map the MPEG-7 high-level descriptors to the low-level feature descriptors defined in the MPEG's logical structure.
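As a rough illustration of the third issue, the following minimal sketch shows one way low-level observables (object trajectories) might be turned into qualitative spatio-temporal labels of the kind an object ontology could hold. It is not the method of the thesis: the function names, the relation vocabulary (`left_of`, `approaches`, `meets`, and so on), the `touch_dist` threshold, and the sample billiard trajectories are all assumptions made for the example.

```python
from math import hypot

def distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return hypot(p[0] - q[0], p[1] - q[1])

def spatial_relation(p, q):
    """Qualitative horizontal relation of point p with respect to point q."""
    if p[0] < q[0]:
        return "left_of"
    if p[0] > q[0]:
        return "right_of"
    return "aligned_with"

def motion_relation(a, b, touch_dist=1.0):
    """Label how the gap between two trajectories changes over a shot.

    a and b are lists of (x, y) centroids, one per frame (hypothetical format).
    touch_dist is an assumed threshold below which the objects count as touching.
    """
    start = distance(a[0], b[0])
    end = distance(a[-1], b[-1])
    if end <= touch_dist:
        return "meets"
    if end < start:
        return "approaches"
    if end > start:
        return "moves_away_from"
    return "keeps_distance_from"

def describe(name_a, a, name_b, b):
    """Build a small semantic description from two tracked trajectories."""
    return {
        "subject": name_a,
        "object": name_b,
        "spatial_at_start": spatial_relation(a[0], b[0]),
        "motion": motion_relation(a, b),
    }

# Hypothetical trajectories from one billiard shot.
cue_ball = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
target_ball = [(5.0, 0.0), (5.0, 0.0), (5.0, 0.0)]
description = describe("cue_ball", cue_ball, "target_ball", target_ball)
print(description)
```

In this sketch, a retrieval query phrased in semantic terms (for example, "cue ball meets target ball") could be matched against the generated labels rather than against raw motion vectors.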
Semantic Video Event Description Assisted by Building Domain Knowledge
Song, Dan
Graduate School, Chosun University
Graduate School, Department of Computer Science
Table Of Contents
1. Introduction
2. Related Work
3. Video Analysis Based on Domain Knowledge
3.1. Overview
3.2. Video Shot Detection for Domain Knowledge
3.3. Event Representation using MPEG-7 High-Level Descriptors
3.4. Video Object Description with Low-Level Features
4. Domain Knowledge Ontology Infrastructure Building
4.1. Building Object Ontology in Domain Knowledge
4.2. Concept Definitions in Domain Knowledge
4.3. Video Event Representation
4.4. Retrieval
5. Conclusion
References
Song, Dan. (2005). Semantic Video Event Description Assisted by Building Domain Knowledge.
