Usage Guide
This guide provides an overview of how to use glmpynet for regularized logistic regression. It covers the hybrid API, integration with Scikit-learn, and the configuration system.
API Philosophy: A Hybrid Approach
glmpynet.LogisticRegression is designed to be a seamless, high-performance
replacement for sklearn.linear_model.LogisticRegression. To achieve this,
it provides a user-friendly hybrid API.
Simplicity by Default: For most use cases, you can interact with the class using familiar Scikit-learn parameters like
Candpenalty. This ensures smooth integration with tools likeGridSearchCV.Glmnet on Request: For advanced users who want to access the full power of the
glmnetengine, the class also provides an “escape hatch” to useglmnet-native parameters likealphaandnlambdadirectly.
The fit method automatically translates the user-provided parameters into
the format required by the C++ backend.
Basic Usage
For most users, the API will feel identical to Scikit-learn’s.
Example: L1 Regularization (Lasso)
This example shows how to train a model with an L1 penalty and make predictions, including getting class probabilities.
from glmpynet.logistic_regression import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
# 1. Generate a synthetic binary classification dataset
X, y = make_classification(random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
# 2. Instantiate and fit the model with sklearn-style parameters
model = LogisticRegression(penalty='l1', C=0.5)
model.fit(X_train, y_train)
# 3. Make predictions
y_pred = model.predict(X_test)
# 4. Get class probabilities
y_proba = model.predict_proba(X_test)
print("Predicted probabilities for the first 5 samples:")
print(y_proba[:5])
Advanced Usage: Direct glmnet Parameters
If you are an advanced user familiar with the R glmnet package, you can
bypass the Scikit-learn style parameters and control the solver directly
by providing alpha. If alpha is provided, the penalty parameter
is ignored.
# Instantiate the model using the glmnet 'alpha' parameter for elastic net
# This is equivalent to penalty='l1'
model_glmnet = LogisticRegression(alpha=1.0)
model_glmnet.fit(X_train, y_train)
Integration with Scikit-learn
glmpynet.LogisticRegression is a fully compliant Scikit-learn estimator
and works seamlessly with tools like Pipeline and GridSearchCV.
Example: Using with GridSearchCV
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
# Create a pipeline
pipeline = Pipeline([
('scaler', StandardScaler()),
('model', LogisticRegression())
])
# Define a parameter grid to search over
param_grid = {
'model__penalty': ['l1', 'l2'],
'model__C': [0.1, 1.0, 10.0]
}
# Set up and run the grid search
grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)
print(f"Best parameters found: {grid_search.best_params_}")
print(f"Best cross-validation score: {grid_search.best_score_:.2f}")