A Boolean function approach to feature selection in consistent decision information systems

Date

2011

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

The goal of feature selection (FS) is to find a minimal subset (MS) R of the condition feature set C such that R has the same classification power as C, and then to reduce the dataset by discarding all features not contained in R. A dataset may have many MSs, and finding all of them is known to be an NP-hard problem. Therefore, when only one MS is required, a heuristic that finds one or a small number of possible MSs is used. In this case, however, there is a risk that the best MSs will be overlooked. When the best solution of an FS task is required, the discernibility matrix (DM)-based approach, which generates all MSs, is used. Two factors often cause DM-based FS programs to overflow the computer's memory and fail. One is the large size of the discernibility function (DF) for large datasets; the other is the intractable space complexity of converting a DF to disjunctive normal form (DNF). However, most of the terms of the DF, and most of the temporary results generated during the DF-to-DNF conversion, are usually redundant. Consequently, the minimized DF (DFmin) and the final DNF are usually much simpler than the original DF and those temporary results, respectively. Based on these facts, we developed a logic function-based feature selection method that derives DFmin from the truth table image of a dataset and converts it to DNF while preventing the occurrence of redundant terms. The proposed method requires no more memory than is required for constructing DFmin and the final DNF separately. Owing to this property, it can process most datasets that cannot be processed by DM-based programs. (C) 2011 Elsevier Ltd. All rights reserved.
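
The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it builds the discernibility clauses directly from object pairs rather than from a truth table image, and it uses plain absorption both to minimize the DF and to discard redundant terms during the DF-to-DNF conversion. The function names (`absorb`, `discernibility_clauses`, `minimal_subsets`) and the toy dataset are assumptions made for illustration only.

```python
from itertools import combinations


def absorb(sets):
    """Drop every set that is a proper superset of another set (absorption law)."""
    sets = set(sets)
    return {s for s in sets if not any(t < s for t in sets)}


def discernibility_clauses(rows, decisions):
    """Build the clauses of the discernibility function and minimize them.

    Each clause is the set of feature indices on which two objects with
    different decisions differ. Absorption leaves a minimized clause set.
    """
    clauses = set()
    for (r1, d1), (r2, d2) in combinations(zip(rows, decisions), 2):
        if d1 != d2:
            diff = frozenset(i for i, (a, b) in enumerate(zip(r1, r2)) if a != b)
            if not diff:
                raise ValueError("decision table is not consistent")
            clauses.add(diff)
    return absorb(clauses)


def minimal_subsets(clauses):
    """Convert the conjunction of clauses to DNF, absorbing after every step.

    A term that already shares a feature with the clause is kept unchanged;
    otherwise it is extended by each feature of the clause. Redundant terms
    are removed immediately so that intermediate results stay small.
    """
    terms = {frozenset()}
    for clause in sorted(clauses, key=len):
        terms = absorb(
            {t if t & clause else t | {a} for t in terms for a in clause}
        )
    return terms


# Hypothetical toy dataset: three condition features, one decision.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
decisions = [0, 1, 1, 0]
reducts = minimal_subsets(discernibility_clauses(rows, decisions))
print(sorted(sorted(r) for r in reducts))
```

For this toy table the minimized DF has two clauses, (f1 OR f2) and (f0 OR f2), and the sketch reports the two minimal subsets {f2} and {f0, f1}. Absorbing after each multiplication step is one simple way to keep intermediate results close to the size of the final DNF; the paper's own technique for avoiding redundant terms may differ.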

Keywords

Information system, Datasets, Feature selection, Discernibility function, Boolean functions

Journal or Series

EXPERT SYSTEMS WITH APPLICATIONS

WoS Q Value

Q1

Scopus Q Value

Q1

Volume

38

Issue

7

Citation