Feature Selection Package - Algorithms - Gini Index
Description
The Gini Index is a statistical measure of dispersion. It is derived from
the Lorenz curve and is commonly used to quantify inequality in wealth
distributions, although it applies to many other kinds of data.
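To make the link to the Lorenz curve concrete, the following is a minimal sketch of the general statistical definition, in which the Gini coefficient equals one minus twice the area under the Lorenz curve. It is not taken from the fsGini source, and the sample vector v is made up for illustration.

```matlab
% Sketch only: Gini coefficient of a nonnegative sample via the Lorenz curve.
% This illustrates the general statistic, not the internals of fsGini.
v = sort([1 2 3 4]);                     % illustrative sample, ascending
n = numel(v);
lorenz = [0, cumsum(v) / sum(v)];        % cumulative share at 0, 1/n, ..., 1
p = linspace(0, 1, n + 1);               % cumulative population share
areaUnder = trapz(p, lorenz);            % area under the Lorenz curve
G = 1 - 2 * areaUnder;                   % 0 means perfectly equal values
```

For a sample with identical values the Lorenz curve is the diagonal, the area is 0.5, and G is 0.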
Usage
Method Signature:
[out] = fsGini(X,Y)
Output:
out:
A struct containing the following fields:
- w - a list containing the Gini index weight of each feature, to be
read together with fList.
- fList - the list of features ranked by their ability to classify
the data. fList(1) is the most important feature.
- prf - will always be -1. This means the smaller the feature weight,
the more relevant the feature.
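As a rough illustration of how the output fields fit together, the sketch below reorders the columns of a data matrix by fList and looks up the matching weights. The struct is a hand-made stand-in for the result of fsGini, its values are invented, and indexing w by original feature position is an assumption rather than something this page states.

```matlab
% Hypothetical stand-in for out = fsGini(X,Y); values are made up.
out.w = [0.40 0.10 0.25];      % assumed: one weight per column of X
out.fList = [2 3 1];           % feature indices in ranked order
out.prf = -1;

X = [1 2 3; 4 5 6];            % 2 instances, 3 features

Xranked = X(:, out.fList);     % columns of X rearranged in ranked order
wRanked = out.w(out.fList);    % weights listed in the same order as fList
```

Keeping only the first k columns of Xranked would give a reduced data set containing the k features that appear first in fList.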
Input:
X:
The feature matrix for the current data: each column holds one feature's
values across all instances, and each row holds one instance.
Y:
The class labels of the instances, as a single column of integers: 1 2 3 4 5 ...
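The expected input layout can be sketched as follows. The numbers are invented for illustration, and the variable name labels is hypothetical; the point is only that X has one row per instance and Y is a column with one label per row of X.

```matlab
% Illustrative inputs: 4 instances (rows), 3 features (columns).
X = [5.1 3.5 1.4;
     4.9 3.0 1.4;
     6.3 3.3 6.0;
     5.8 2.7 5.1];

labels = [1 1 2 2];            % labels stored as a row vector
Y = labels(:);                 % reshape into single column form

sameCount = (size(X, 1) == size(Y, 1));   % one label per instance
```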
Code Example
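% Using the wine data set, which can be found at
% [fspackage_location]/classifiers/knn/wine.mat
% (assumes that folder is on the MATLAB path and that the file defines X and Y)
load('wine.mat');
out = fsGini(X,Y);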
Keyword in Evaluator Framework
gini
Paper
BibTex entry for:
Gini, C. "Variabilità e mutabilità." 1912. Reprinted in Memorie di metodologia statistica (Ed. E. Pizetti and T. Salvemini.) Rome: Libreria Eredi Virgilio Veschi, 1955.
@article{gini-1912,
author = {Gini, C.},
title = {Variabilit{\`a} e mutabilit{\`a}},
journal = {Memorie di metodologia statistica},
publisher = {Libreria Eredi Virgilio Veschi, Rome},
year = {1912}
}