A NEW FEATURE SELECTION METHOD FOR TEXT CATEGORIZATION BASED ON INFORMATION GAIN AND PARTICLE SWARM OPTIMIZATION
Date
2014
Authors
Publisher
IEEE
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
The rapid growth in the number of documents created in digital media makes it necessary to analyze and classify these documents automatically. Feature extraction, feature selection, and classifier selection all affect the performance of document analysis and classification. A fundamental problem in text document categorization is the very large number of extracted features. In this study, text categorization was performed using a new feature selection method based on the IG (information gain) and PSO (particle swarm optimization) algorithms. The Reuters-21578 and Classic3 corpora were used in the experiments. The roots of the words in the corpus texts were taken as the features. Feature selection and categorization were performed with the k-Nearest Neighbors (k-NN) and Naive Bayes classifiers, using the IG and PSO algorithms. The performance of the proposed system was evaluated using the CA (Classification Accuracy), Precision, Recall, and F-measure criteria.
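As an illustration of the IG step named in the abstract, the following is a minimal sketch of information-gain term ranking, not the authors' implementation. It assumes binary term presence per document and the textbook definition IG(t) = H(C) - P(t) H(C|t) - P(not t) H(C|not t). The function names (`entropy`, `information_gain`, `rank_terms`) and the toy documents are hypothetical, and the PSO search stage is not shown.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (base 2) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """IG of a term, treating its presence in a document as a binary feature.

    docs   : list of sets of tokens (e.g. word roots), one set per document
    labels : list of class labels, aligned with docs
    """
    n = len(docs)
    with_term = Counter()
    without_term = Counter()
    for tokens, label in zip(docs, labels):
        if term in tokens:
            with_term[label] += 1
        else:
            without_term[label] += 1

    h_class = entropy(list(Counter(labels).values()))
    n_with = sum(with_term.values())
    n_without = n - n_with
    # Class entropy conditioned on presence/absence of the term.
    h_cond = (n_with / n) * entropy(list(with_term.values())) \
        + (n_without / n) * entropy(list(without_term.values()))
    return h_class - h_cond


def rank_terms(docs, labels):
    """Return the vocabulary sorted by decreasing information gain."""
    vocab = set().union(*docs)
    return sorted(vocab, key=lambda t: information_gain(docs, labels, t),
                  reverse=True)


# Toy example (made-up documents, not taken from Reuters-21578 or Classic3).
docs = [{"oil", "price"}, {"oil", "barrel"},
        {"wheat", "price"}, {"wheat", "farm"}]
labels = ["crude", "crude", "grain", "grain"]
top_terms = rank_terms(docs, labels)[:2]
```

In a pipeline of the kind the abstract describes, a ranking like this could be used to shrink the vocabulary before a search procedure such as PSO explores feature subsets. How the two stages are combined in the paper is not stated in this record.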
Description
3rd IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS) -- NOV 27-29, 2014 -- PEOPLES R CHINA
Keywords
Text categorization, feature selection, particle swarm optimization
Journal or Series
2014 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS)
WoS Q Value
N/A
Scopus Q Value
N/A