GB-AFS
GB-AFS (Graph-Based Automatic Feature Selection) is a method for identifying the optimal subset of features needed to maintain predictive performance, without requiring user-specified parameters such as the number of features to keep. This self-sufficiency is the 'Automatic' in its name. As a filter-based method, GB-AFS is model-agnostic: feature selection happens in the preprocessing phase, independently of the predictive model used downstream.
The main strength of GB-AFS is that it determines the smallest required set of features on its own, which avoids a common limitation of filter-based methods: they typically depend on user-supplied configuration.
Using GB-AFS: Code Examples and Visualization
GBFS is a Python library for feature selection in multi-class classification tasks. This section walks through initializing the GB-AFS object with your dataset, selecting features, and visualizing the feature space.
Initialization and Parameters
To start, initialize the GB-AFS object with your dataset and selection criteria:
[Code listing: main.py]
Parameters Explained
- dataset_path: Path to your dataset file. Ensure your dataset is in CSV format or another compatible format.
- separability_metric: Metric for evaluating feature separability.
- dim_reducer_model: Dimensionality reduction model applied to your dataset. Must implement a fit_transform method for compatibility.
- label_column: Name of the column with labels in your dataset. Defaults to 'class'.
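The fit_transform requirement on dim_reducer_model is a duck-typing contract, so any object exposing that method should qualify. The sketch below illustrates the idea with a structural type check and a toy reducer. Both `SupportsFitTransform` and `TopVarianceReducer` are hypothetical names introduced here for illustration; they are not part of the library, and the library may validate the model differently or not at all.

```python
from statistics import pvariance
from typing import List, Protocol, Sequence, runtime_checkable

Matrix = Sequence[Sequence[float]]


@runtime_checkable
class SupportsFitTransform(Protocol):
    """Structural type: anything that exposes a fit_transform method."""

    def fit_transform(self, X: Matrix) -> List[List[float]]: ...


class TopVarianceReducer:
    """Toy reducer (hypothetical): keeps the n_components highest-variance columns."""

    def __init__(self, n_components: int = 2) -> None:
        self.n_components = n_components

    def fit_transform(self, X: Matrix) -> List[List[float]]:
        columns = list(zip(*X))  # transpose rows into columns
        # Rank column indices by population variance, highest first.
        ranked = sorted(
            range(len(columns)),
            key=lambda j: pvariance(columns[j]),
            reverse=True,
        )
        keep = sorted(ranked[: self.n_components])  # preserve original column order
        return [[row[j] for j in keep] for row in X]


def check_reducer(model: object) -> None:
    """Raise TypeError if the model lacks a fit_transform method."""
    if not isinstance(model, SupportsFitTransform):
        raise TypeError("dim_reducer_model must implement fit_transform")
```

A runtime-checkable protocol only verifies that the method exists, not its signature, which matches the loose "must implement fit_transform" wording of the parameter description.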
Currently supported values for separability_metric are jm, bhattacharyya, and wasserstein. To request support for additional metrics, please open an issue in the repository.
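For intuition about what these three metrics measure, here is a minimal sketch of their textbook definitions for a single feature observed in two classes. It assumes each class is Gaussian for the Bhattacharyya and Jeffries-Matusita (JM) distances, uses the JM convention bounded in [0, 2], and computes the 1-Wasserstein distance for equal-sized samples only. The function names are illustrative, and the library's own implementation may differ.

```python
import math
from statistics import fmean, pvariance
from typing import Sequence


def bhattacharyya_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Bhattacharyya distance between two 1-D samples, assuming each is Gaussian."""
    mu_a, mu_b = fmean(a), fmean(b)
    var_a, var_b = pvariance(a), pvariance(b)
    if var_a <= 0 or var_b <= 0:
        raise ValueError("both samples need strictly positive variance")
    mean_var = (var_a + var_b) / 2.0
    # Term driven by the gap between the class means.
    term_mean = (mu_a - mu_b) ** 2 / (8.0 * mean_var)
    # Term driven by the mismatch between the class variances.
    term_var = 0.5 * math.log(mean_var / math.sqrt(var_a * var_b))
    return term_mean + term_var


def jm_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Jeffries-Matusita distance in [0, 2]; 2 means fully separable classes."""
    return 2.0 * (1.0 - math.exp(-bhattacharyya_distance(a, b)))


def wasserstein_distance_1d(a: Sequence[float], b: Sequence[float]) -> float:
    """1-Wasserstein distance between two equal-sized 1-D samples."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-sized samples")
    # For equal-sized samples, the optimal transport plan pairs sorted values.
    return fmean([abs(x - y) for x, y in zip(sorted(a), sorted(b))])
```

Unlike the Bhattacharyya distance, which grows without bound, JM saturates at 2, so very well-separated class pairs do not dominate a comparison across features.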
Feature Selection
Once the GB-AFS object is initialized, you can proceed with the feature selection process:
[Code listing: main.py]
Visualizing the Feature Space
GB-AFS also includes a method to visualize the selected features within the feature space, providing insights into their distribution and separability:
[Code listing: main.py]
This method generates a scatter plot highlighting the selected features. Features are displayed with their separability power indicated by color intensity, and selected features are marked distinctly.
References and Further Reading
For a deeper understanding of the GB-AFS method and its background, consider exploring the official paper.