Abstract:
The generalization ability of a classifier is an important issue for any classification task. Two prominent
problems affecting generalization ability are over-fitting and class imbalance. It is thus important
to address these problems while developing a classification system. There has been an enormous
amount of work on classification problems in the machine learning literature, but handling over-fitting
and class imbalance is still an open issue. Most existing approaches suffer to some degree from these
problems. This thesis presents a new evolutionary system, EDARIC, for rule induction and classification.
The evolutionary approach used in our new system is based on a destructive method that
starts with large rules and gradually reduces their size as evolution progresses. The novelty of
this thesis lies mainly in the way it handles the over-fitting problem: it incorporates
an intelligent deletion mechanism that produces smaller, and hence more general, rules. Another strength
of EDARIC is its simplicity, which is due to using a minimum number of operators and parameters
during evolution. Furthermore, EDARIC evolves multiple populations with appropriate operators
and uses an ensemble system to classify future unknown instances. These features help in avoiding
over-fitting and class-imbalance problems, which in turn improves the generalization ability
of a classification system. EDARIC has been tested on 30 standard and 33 imbalanced benchmark
data sets against more than 20 state-of-the-art evolutionary approaches and six state-of-the-art non-evolutionary
approaches. EDARIC has also been tested against its own variant (without the intelligent
deletion mechanism). The experimental results show that our proposed evolutionary system obtains
better generalization performance than the existing algorithms. As expected, EDARIC also
obtained better generalization performance than its own variant, which did not incorporate the intelligent
deletion mechanism.