{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Hyperparameter Tuning with GridSearchCV\n", "\n", "This notebook demonstrates hyperparameter tuning for `LogisticRegression` using `GridSearchCV` to optimize `C` and `penalty`.\n", "\n", "## Setup\n", "We use a synthetic dataset with varying feature informativeness." ] }, { "cell_type": "code", "metadata": { "ExecuteTime": { "end_time": "2025-07-22T14:41:11.112129Z", "start_time": "2025-07-22T14:41:09.541668Z" } }, "source": [ "from sklearn.datasets import make_classification\n", "from sklearn.model_selection import train_test_split, GridSearchCV\n", "from sklearn.metrics import accuracy_score\n", "from glmpynet import LogisticRegression\n", "\n", "# Generate synthetic dataset\n", "X, y = make_classification(n_samples=200, n_features=20, n_informative=10, random_state=42)\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n", "\n", "# Define grid search\n", "param_grid = {\n", " 'C': [0.1, 1.0, 10.0],\n", " 'penalty': ['l1', 'l2']\n", "}\n", "model = LogisticRegression()\n", "grid_search = GridSearchCV(model, param_grid, cv=5, scoring='accuracy')\n", "grid_search.fit(X_train, y_train)\n", "\n", "# Evaluate best model\n", "print(f\"Best Parameters: {grid_search.best_params_}\")\n", "print(f\"Best Cross-Validation Score: {grid_search.best_score_:.2f}\")\n", "y_pred = grid_search.predict(X_test)\n", "accuracy = accuracy_score(y_test, y_pred)\n", "print(f\"Test Accuracy: {accuracy:.2f}\")" ], "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Best Parameters: {'C': 0.1, 'penalty': 'l1'}\n", "Best Cross-Validation Score: 0.82\n", "Test Accuracy: 0.72\n" ] } ], "execution_count": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Explanation\n", "- `GridSearchCV` tunes `C` and `penalty` to maximize cross-validation accuracy.\n", "- The dataset has 10 informative features, testing regularization effects.\n", "- Best parameters typically include `C=1.0` or `10.0` with `l2` penalty.\n", "- With `glmnet`, expect different optimal parameters due to elastic-net regularization." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.0" } }, "nbformat": 4, "nbformat_minor": 2 }