Feature Selection Package Documentation

The Gini index is a statistical measure of dispersion. It is derived from the Lorenz curve and is most commonly used to quantify inequality in wealth distributions, although it applies to other distributions as well. In this package it is used to score features for classification.
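For reference, the standard relationship between the Gini index and the Lorenz curve can be written as below. Here L(p) is the Lorenz curve (the share of the total held by the bottom fraction p of the population), A is the area between the line of perfect equality and the Lorenz curve, and B is the area under the Lorenz curve. These symbols are introduced for illustration only; this documentation does not state which exact form fsGini computes.

```latex
G \;=\; \frac{A}{A + B} \;=\; 1 - 2\int_{0}^{1} L(p)\,dp ,
\qquad A + B = \tfrac{1}{2}
```

G is 0 when the Lorenz curve coincides with the line of equality and approaches 1 as the distribution becomes maximally concentrated.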

Method Signature:

[out] = fsGini(X,Y)

Output:
out: A struct containing the following fields:

w - a list containing the weight (Gini index) of each feature when matched with fList.
fList - the list of features ranked by their ability to classify the data. fList(1) is the least important feature.
prf - will always be -1. This means the greater the feature weight, the more relevant the feature.

Input:
X: The feature matrix. Each column is a feature vector over all instances, and each row is a single instance.
Y: The class labels of the instances, as a single column of integers: 1 2 3 4 5 ...
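A minimal usage sketch, assuming fsGini from this package is on the MATLAB path. The toy values in X and Y are made up for illustration, and the field names are taken from the Output section above; if the capitalization of a field differs in your copy of the package, fieldnames(out) will show the actual names.

```matlab
% Toy data (hypothetical): 6 instances (rows), 3 features (columns).
X = [1 5 0;
     1 3 1;
     2 4 0;
     5 4 1;
     6 3 0;
     6 5 1];

% Class labels as a single column, one label per row of X.
Y = [1; 1; 1; 2; 2; 2];

% Call the function with the documented signature.
out = fsGini(X, Y);

% Inspect the returned struct without assuming exact field capitalization.
disp(fieldnames(out));

% Ranked feature indices, as described in the Output section.
disp(out.fList);
```

Each entry of out.fList is a column index into X, so X(:, out.fList) reorders the features according to the ranking.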

BibTeX entry for:

Gini, C. "Variabilità e mutabilità." 1912. Reprinted in Memorie di metodologia statistica (Ed. E. Pizetti and T. Salvemini.) Rome: Libreria Eredi Virgilio Veschi, 1955.

@article{gini-1912,
   author = {Gini, C.},
   title = {Variabilità e mutabilità.},
   journal = {Memorie di metodologia statistica},
   publisher = {Libreria Eredi Virgilio Veschi, Rome},
   year = {1912}
}