{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Basic Binary Classification with glmpynet\n", "\n", "This notebook demonstrates the basic usage of `glmpynet.LogisticRegression` for binary classification, including fitting the model, making predictions, and evaluating accuracy.\n", "\n", "## Setup\n", "We use a synthetic dataset from `sklearn.datasets.make_classification` for reproducibility." ] }, { "cell_type": "code", "metadata": { "ExecuteTime": { "end_time": "2025-07-22T14:39:38.037332Z", "start_time": "2025-07-22T14:39:36.408324Z" } }, "source": [ "import numpy as np\n", "from sklearn.datasets import make_classification\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.metrics import accuracy_score\n", "from glmpynet import LogisticRegression\n", "\n", "# Generate synthetic dataset\n", "X, y = make_classification(n_samples=200, n_features=20, n_classes=2, random_state=42)\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n", "\n", "# Instantiate and fit the classifier with default parameters\n", "model = LogisticRegression()\n", "model.fit(X_train, y_train)\n", "\n", "# Predict and evaluate\n", "y_pred = model.predict(X_test)\n", "accuracy = accuracy_score(y_test, y_pred)\n", "print(f\"Accuracy: {accuracy:.2f}\")\n", "\n", "# Predict probabilities\n", "y_proba = model.predict_proba(X_test)\n", "print(f\"Sample probabilities:\\n{y_proba[:5]}\")" ], "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 0.88\n", "Sample probabilities:\n", "[[0.94532103 0.05467897]\n", " [0.98537361 0.01462639]\n", " [0.9899398 0.0100602 ]\n", " [0.00795596 0.99204404]\n", " [0.0825289 0.9174711 ]]\n" ] } ], "execution_count": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Explanation\n", "- `LogisticRegression()` is instantiated with default parameters (`C=1.0`, `penalty='l2'`).\n", "- The synthetic dataset has 200 samples and 20 features; the model is trained on the 160-sample training split, and the remaining 40 samples are held out for testing.\n", "- Predictions and 
probabilities are computed on the held-out test set. Accuracy typically falls between 0.85 and 0.95 for this dataset (0.88 in the run above).\n", "- After transitioning to `glmnet`, expect similar accuracy, but the predicted probabilities may differ because of elastic-net regularization." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.0" } }, "nbformat": 4, "nbformat_minor": 2 }