{ "cells": [ { "cell_type": "markdown", "id": "01e8e78a", "metadata": {}, "source": [ "# Interpreting Regression" ] }, { "cell_type": "code", "execution_count": 1, "id": "4e02a824", "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "import numpy as np\n", "import pandas as pd\n", "from sklearn import datasets, linear_model\n", "from sklearn.metrics import mean_squared_error, r2_score\n", "from sklearn.model_selection import cross_val_score\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.preprocessing import PolynomialFeatures\n", "sns.set_theme(font_scale=2,palette='colorblind')" ] }, { "cell_type": "markdown", "id": "ffbbdf53", "metadata": {}, "source": [ "## Multivariate Regression\n", "\n", "We can also load data from scikit-learn.\n", "\n", "This dataset includes 10 features measured on a given date and a measure of\n", "diabetes disease progression measured one year later. The predictor we can train\n", "with this data might be something a doctor uses to calculate a patient's risk." 
] }, { "cell_type": "code", "execution_count": 2, "id": "be1c8d38", "metadata": {}, "outputs": [], "source": [ "# Load the diabetes dataset\n", "diabetes_X, diabetes_y = datasets.load_diabetes(return_X_y=True)\n", "# Hold out 20 samples for testing (an integer test_size is a sample count)\n", "X_train, X_test, y_train, y_test = train_test_split(diabetes_X, diabetes_y,\n", " test_size=20, random_state=0)" ] }, { "cell_type": "code", "execution_count": 3, "id": "c03b30ae", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(422, 10)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_train.shape" ] }, { "cell_type": "code", "execution_count": 4, "id": "608cedb3", "metadata": {}, "outputs": [], "source": [ "regr_db = linear_model.LinearRegression()" ] }, { "cell_type": "code", "execution_count": 5, "id": "2215569c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'fit_intercept': True, 'copy_X': True, 'n_jobs': None, 'positive': False}" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "regr_db.__dict__" ] }, { "cell_type": "markdown", "id": "7d154b5d", "metadata": {}, "source": [ "At this point the estimator object holds only its hyperparameters; it has no learned coefficients until we fit it, which we do as usual." ] }, { "cell_type": "code", "execution_count": 6, "id": "f371c793", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
LinearRegression()