A NEW FEATURE SELECTION METHOD FOR TEXT CATEGORIZATION BASED ON INFORMATION GAIN AND PARTICLE SWARM OPTIMIZATION

Date

2014

Publisher

IEEE

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

The rapid growth in the number of documents created in digital media makes it necessary to analyze and classify these documents automatically. Feature extraction, feature selection, and classifier selection all affect the performance of document analysis and classification. A fundamental problem in text document categorization is the very large number of extracted features. In this study, text categorization was performed using a new feature selection method based on the IG (information gain) and PSO (particle swarm optimization) algorithms. The Reuters-21578 and Classic3 corpora were used in the experiments. The roots of the words in the corpus texts were taken as the features. Feature selection and categorization were performed using the IG and PSO algorithms together with the k-Nearest Neighbors (k-NN) and Naive Bayes classifiers. The performance of the proposed system was evaluated using the CA (Classification Accuracy), Precision, Recall, and F-measure criteria.
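
To make the information gain (IG) criterion named in the abstract more concrete, the following is a minimal sketch of the standard IG score for a binary "term present / term absent" feature. It is not the authors' implementation: the function names (entropy, information_gain, rank_terms), the representation of documents as sets of word roots, and the toy data are all assumptions made for illustration. The abstract does not say how IG and PSO are combined, so the PSO stage is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """IG of the binary feature 'term occurs in the document'.

    docs   : list of sets of word roots (hypothetical representation)
    labels : list of class labels, aligned with docs
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Class entropy after splitting on the term, weighted by split size.
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def rank_terms(docs, labels):
    """Return the vocabulary sorted by decreasing information gain."""
    vocab = set().union(*docs)
    return sorted(vocab,
                  key=lambda t: information_gain(docs, labels, t),
                  reverse=True)


# Toy data, invented for illustration only (not taken from the corpora).
docs = [{"oil", "price"}, {"oil", "export"},
        {"wheat", "price"}, {"wheat", "crop"}]
labels = ["crude", "crude", "grain", "grain"]
ranked = rank_terms(docs, labels)
```

In this toy example, "oil" and "wheat" separate the two classes perfectly and rank highest, while "price" occurs equally in both classes and scores zero. A pipeline of the kind the abstract describes could keep only the top-ranked terms before any further search over feature subsets; that ordering is an assumption here, not a detail confirmed by the record.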

Description

3rd IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS) -- NOV 27-29, 2014 -- PEOPLES R CHINA

Keywords

Text categorization, feature selection, particle swarm optimization

Journal or Series

2014 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS)

WoS Q Value

N/A

Scopus Q Value

N/A