A new approach on search for similar documents with multiple categories using fuzzy clustering

Date

2008

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Searching for similar documents plays an important role in text mining and document management. Both similar-document search and other text mining applications generally focus on document classification, trying to determine the class or category to which each document belongs. The present study investigates the case in which documents belong to more than one category. The system used in this study is a similar-document search system based on fuzzy clustering, which accommodates documents that belong to more than one category. The proposed approach solves the multi-category problem in two stages. The first stage identifies the documents that belong to more than one category. The second stage determines the categories to which these documents belong. For these two stages, the alpha-threshold Fuzzy Similarity Classification Method (alpha-FSCM) and the Multiple Categories Vector Method (MCVM) are proposed, respectively. Experimental results showed that the proposed system can efficiently distinguish the documents that belong to more than one category. In determining which documents belong to which classes, the proposed system performs better than the traditional approach. (c) 2007 Elsevier Ltd. All rights reserved.
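
The two-stage idea can be illustrated with a minimal sketch. It assumes each document already has fuzzy membership degrees per category (for example, from a fuzzy clustering step) and applies a simple threshold to them. This is not the paper's alpha-FSCM or MCVM, whose definitions are not given in the abstract; the function names, membership values, and the threshold `alpha=0.3` are all hypothetical.

```python
from typing import Dict, List, Tuple

Memberships = Dict[str, float]  # category name -> fuzzy membership degree


def categories_above_alpha(memberships: Memberships, alpha: float) -> List[str]:
    """Return, in sorted order, every category whose membership is at least alpha."""
    return sorted(cat for cat, degree in memberships.items() if degree >= alpha)


def split_by_category_count(
    docs: Dict[str, Memberships], alpha: float
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Separate documents into single-category and multi-category groups.

    Stage 1 (illustrative): a document is multi-category when more than one
    membership degree reaches the threshold.
    Stage 2 (illustrative): its categories are the ones that reached it.
    """
    single: Dict[str, List[str]] = {}
    multi: Dict[str, List[str]] = {}
    for doc_id, memberships in docs.items():
        cats = categories_above_alpha(memberships, alpha)
        if len(cats) > 1:
            multi[doc_id] = cats
        else:
            single[doc_id] = cats
    return single, multi


# Hypothetical membership degrees, invented for illustration only.
docs = {
    "doc_a": {"sports": 0.90, "economy": 0.06, "health": 0.04},
    "doc_b": {"sports": 0.55, "economy": 0.40, "health": 0.05},
}

single, multi = split_by_category_count(docs, alpha=0.3)
print(single)
print(multi)
```

With these invented values, `doc_b` has two memberships above the threshold and is treated as a multi-category document, while `doc_a` stays in a single category. The threshold controls how readily a document is assigned to several categories.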

Keywords

text mining, document similarity, similarity search, fuzzy clustering, multiple categories

Source

EXPERT SYSTEMS WITH APPLICATIONS

WoS Q Value

Q1

Scopus Q Value

Q1

Volume

34

Issue

4
