abalone_data <- read_csv(here("posts/2023-09-20-abalone_age_ML/abalone-data.csv"))
Introduction
Abalones are marine snails that are consumed by many cultures. Determining the age of an abalone is a time-consuming task: the traditional method is to cut the shell through the cone, stain it, and count the number of rings under a microscope. Other, easier-to-obtain measurements can instead be used to predict an abalone's age.
In this blog, I will be using regularized regression machine learning models made using R to predict the age of abalone based on a dataset with variables related to the sex, physical dimensions of the shell, and various weight measurements, along with the number of rings in the shell.
Regularized Regression Models
Regularizing a regression model helps prevent overfitting to the training data, thereby reducing the variance of the model's predictions. Regularization means having the model self-regulate by making sure its regression coefficients don't get too large.
Regularized regression is an alternative to Ordinary Least Squares (OLS) regression. OLS regression chooses the line, out of the many that can be drawn, that minimizes the sum of squared errors (SSE).
Regularized regression adds a penalty term to the OLS objective that constrains the size of the coefficients: a coefficient can only increase if there is a comparable decrease in the model's error. Introducing this penalty is intended to improve the model's predictive performance.
I will be evaluating two types of regularized regression in this blog, ridge regression and lasso regression.
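For reference, the objective functions that the two approaches minimize can be written as follows (a standard formulation, with SSE as defined above, regression coefficients $\beta_1, \dots, \beta_p$, and tuning parameter $\lambda$):

$$
\text{Ridge:} \quad \min_{\beta} \; SSE + \lambda \sum_{j=1}^{p} \beta_j^2
\qquad
\text{Lasso:} \quad \min_{\beta} \; SSE + \lambda \sum_{j=1}^{p} |\beta_j|
$$

Setting $\lambda = 0$ recovers OLS in both cases; larger values of $\lambda$ shrink the coefficients more aggressively.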
Data Exploration
Data Splitting
I’ll start with splitting the data into a 70/30 training/test split.
I’ll be using the caret
package in our approach to this task, as it supports the cration of predictive models. I will use the glmnet
package in order to perform ridge regression and the lasso. The main function in this package is glmnet(), which can be used to fit ridge regression models, lasso models, and more. In particular, we must pass in an x matrix of predictors as well as a y outcome vector , and we do not use the y∼x syntax.
#setting seed for reproducibility
set.seed(711)
# take a stratified sample
abalone_data_split <- initial_split(data = abalone_data,
                                    prop = .70,
                                    strata = Rings) #stratify by outcome variable
abalone_data_split
<Training/Testing/Total>
<2922/1255/4177>
abalone_data_train <- training(abalone_data_split) #takes training portion from split data
abalone_data_test <- testing(abalone_data_split) #takes test portion from split data
#essentially have two new data sets
abalone_data_train
# A tibble: 2,922 × 10
...1 Sex Length Diameter Height Whole_weight Shucked_weight Viscera_weight
<dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2 M 0.35 0.265 0.09 0.226 0.0995 0.0485
2 5 I 0.33 0.255 0.08 0.205 0.0895 0.0395
3 6 I 0.425 0.3 0.095 0.352 0.141 0.0775
4 17 I 0.355 0.28 0.085 0.290 0.095 0.0395
5 36 M 0.465 0.355 0.105 0.480 0.227 0.124
6 38 F 0.45 0.355 0.105 0.522 0.237 0.116
7 43 I 0.24 0.175 0.045 0.07 0.0315 0.0235
8 44 I 0.205 0.15 0.055 0.042 0.0255 0.015
9 45 I 0.21 0.15 0.05 0.042 0.0175 0.0125
10 46 I 0.39 0.295 0.095 0.203 0.0875 0.045
# ℹ 2,912 more rows
# ℹ 2 more variables: Shell_weight <dbl>, Rings <dbl>
skim(abalone_data_train) #takes a summary of this data
Name | abalone_data_train |
Number of rows | 2922 |
Number of columns | 10 |
_______________________ | |
Column type frequency: | |
character | 1 |
numeric | 9 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
Sex | 0 | 1 | 1 | 1 | 0 | 3 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
…1 | 0 | 1 | 2087.01 | 1210.03 | 2.00 | 1033.50 | 2103.50 | 3121.75 | 4177.00 | ▇▇▇▇▇ |
Length | 0 | 1 | 0.52 | 0.12 | 0.07 | 0.45 | 0.54 | 0.62 | 0.81 | ▁▂▅▇▂ |
Diameter | 0 | 1 | 0.41 | 0.10 | 0.06 | 0.35 | 0.42 | 0.48 | 0.65 | ▁▂▅▇▁ |
Height | 0 | 1 | 0.14 | 0.04 | 0.00 | 0.12 | 0.14 | 0.16 | 1.13 | ▇▁▁▁▁ |
Whole_weight | 0 | 1 | 0.83 | 0.49 | 0.00 | 0.44 | 0.80 | 1.16 | 2.83 | ▇▇▅▁▁ |
Shucked_weight | 0 | 1 | 0.36 | 0.22 | 0.00 | 0.19 | 0.33 | 0.51 | 1.35 | ▇▇▃▁▁ |
Viscera_weight | 0 | 1 | 0.18 | 0.11 | 0.00 | 0.09 | 0.17 | 0.25 | 0.76 | ▇▇▂▁▁ |
Shell_weight | 0 | 1 | 0.24 | 0.14 | 0.00 | 0.13 | 0.23 | 0.33 | 0.90 | ▇▇▃▁▁ |
Rings | 0 | 1 | 9.94 | 3.22 | 1.00 | 8.00 | 9.00 | 11.00 | 29.00 | ▁▇▂▁▁ |
Fit a Regularized Regression Model
Ridge regression models:
- Push correlated features toward each other, so the model doesn't end up with one wildly positive coefficient and another wildly negative one
- Do not perform feature selection; all available features are retained in the final model
- The tuning parameter is lambda:
  - Lambda = 0 has no effect, and the objective is simply minimizing SSE
  - As lambda increases toward infinity, the penalty becomes large and forces coefficients toward zero (but not all the way)

Lasso regression models:
- Will push coefficients all the way to zero
- Can improve the model and perform automated feature selection
- "Lasso" stands for least absolute shrinkage and selection operator
I’ll be using the model.matrix()
function to create a predictor matrix, x, and assign the Rings variable to an outcome vector, y.
pred_x <- model.matrix(Rings ~ ., #Rings predicted by all other variables
                       abalone_data_train)[, -1] #take out first (intercept) column
# transform y with log() transformation
out_y <- log(abalone_data_train$Rings)
Next, I'll fit a ridge model (controlled by the alpha parameter) using the glmnet() function and make a plot showing how the estimated coefficients change with lambda. Lambda is the tuning parameter for both ridge and lasso regression; it helps keep the model from over-fitting to the training data.
#fit a ridge model
ridge <- glmnet(x = pred_x,
                y = out_y,
                alpha = 0)
#plot() the glmnet model object
plot(ridge, xvar = "lambda")
Using k-fold Cross Validation Resampling and Tuning Models
One method of estimating a model's generalization error is k-fold cross validation. Tuning is the process of finding the optimal value of lambda.
I will fit a ridge regression model and a lasso model, both using cross validation, to compare which model is the better predictor. The glmnet package provides a cv.glmnet() function to do this (similar to the glmnet() function that we just used). The alpha argument controls which type of model is being run. Then I'll plot the results.
# apply CV ridge regression
ridge_reg <- cv.glmnet(
  x = pred_x,
  y = out_y,
  alpha = 0 #pure ridge is alpha = 0
)
# apply CV lasso regression
lasso_reg <- cv.glmnet(
  x = pred_x,
  y = out_y,
  alpha = 1 #pure lasso is alpha = 1
)
# plot results
par(mfrow = c(1, 2))
plot(ridge_reg, main = "Ridge penalty\n\n")
# first line is minimum lambda
#second line is 1 sd away
plot(lasso_reg, main = "Lasso penalty\n\n")
The numbers along the upper x axis indicate the number of features in the model, the lower x axis shows the value of the penalty log(lambda), and the y axis shows the 10-fold CV mean squared error between the predicted and observed outcome at the corresponding penalty. The left dotted line in each plot marks the value of log(lambda) that produces the minimum mean cross-validated error. The right dotted line in each plot marks the value of log(lambda) that gives the most regularized model whose cross-validated error is within one standard error of that minimum. The red dots make up the cross-validation curve, with the error bars indicating one standard deviation above and below. For both ridge and lasso regression, the MSE increases as the penalty increases, so the performance of the models gets worse as the value of lambda increases.
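The two dotted lines correspond to values stored on each cv.glmnet object, so they can be checked directly; a quick sketch using the objects fitted above:
# penalties marked by the dotted lines: lambda.min (left) and lambda.1se (right)
log(c(ridge_reg$lambda.min, ridge_reg$lambda.1se))
log(c(lasso_reg$lambda.min, lasso_reg$lambda.1se))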
Inspecting the ridge model object created with cv.glmnet(), I want to know the minimum mean squared error (MSE) and the value of lambda associated with this minimum MSE. MSE measures the amount of error in a model.
#inspect ridge model
head(ridge_reg)
$lambda
[1] 210.81570089 192.08741505 175.02289851 159.47434659 145.30708517
[6] 132.39840421 120.63649488 109.91948117 100.15453741 91.25708434
[11] 83.15005649 75.76323464 69.03263769 62.89996842 57.31210858
[16] 52.22065881 47.58151941 43.35450837 39.50301333 35.99367450
[21] 32.79609566 29.88258091 27.22789479 24.80904366 22.60507659
[26] 20.59690389 18.76713172 17.09991146 15.58080246 14.19664691
[31] 12.93545592 11.78630567 10.73924276 9.78519803 8.91590801
[36] 8.12384331 7.40214346 6.74455744 6.14538955 5.59945008
[41] 5.10201036 4.64876182 4.23577864 3.85948374 3.51661784
[46] 3.20421120 2.91955791 2.66019243 2.42386827 2.20853849
[51] 2.01233802 1.83356746 1.67067838 1.52225991 1.38702654
[56] 1.26380692 1.15153380 1.04923471 0.95602359 0.87109309
[61] 0.79370758 0.72319679 0.65894998 0.60041068 0.54707185
[66] 0.49847149 0.45418866 0.41383979 0.37707540 0.34357706
[71] 0.31305463 0.28524371 0.25990345 0.23681434 0.21577641
[76] 0.19660743 0.17914137 0.16322694 0.14872631 0.13551387
[81] 0.12347519 0.11250600 0.10251128 0.09340446 0.08510666
[86] 0.07754602 0.07065705 0.06438007 0.05866072 0.05344946
[91] 0.04870116 0.04437468 0.04043256 0.03684064 0.03356782
[96] 0.03058575 0.02786859 0.02539283 0.02313700 0.02108157
$cvm
[1] 0.10105059 0.10021351 0.10007332 0.09997545 0.09986844 0.09975146
[7] 0.09962364 0.09948404 0.09933164 0.09916535 0.09898401 0.09878637
[13] 0.09857111 0.09833683 0.09808204 0.09780515 0.09750460 0.09717867
[19] 0.09682560 0.09644358 0.09603080 0.09558539 0.09510553 0.09458941
[25] 0.09403528 0.09344150 0.09280658 0.09212920 0.09140827 0.09064299
[31] 0.08983290 0.08897794 0.08807850 0.08713547 0.08615028 0.08512497
[37] 0.08406218 0.08296516 0.08183780 0.08068461 0.07951063 0.07832139
[43] 0.07712282 0.07592114 0.07472273 0.07353398 0.07236116 0.07121025
[49] 0.07008683 0.06899589 0.06794177 0.06692804 0.06595745 0.06503187
[55] 0.06415232 0.06331898 0.06253121 0.06178771 0.06108651 0.06042511
[61] 0.05980021 0.05920928 0.05864996 0.05811806 0.05761006 0.05712263
[67] 0.05665263 0.05619735 0.05575652 0.05532311 0.05490039 0.05448054
[73] 0.05406846 0.05365618 0.05324998 0.05284175 0.05243929 0.05203758
[79] 0.05163540 0.05123857 0.05084387 0.05045044 0.05006493 0.04968524
[85] 0.04931157 0.04894272 0.04858578 0.04823904 0.04790330 0.04757925
[91] 0.04726772 0.04696939 0.04668484 0.04641446 0.04615849 0.04591692
[97] 0.04569012 0.04547725 0.04527888 0.04510068
$cvsd
[1] 0.003770703 0.003759516 0.003723304 0.003719699 0.003715752 0.003711432
[7] 0.003706707 0.003701537 0.003695885 0.003689707 0.003682957 0.003675584
[13] 0.003667536 0.003658755 0.003649180 0.003638743 0.003627375 0.003615005
[19] 0.003601552 0.003586936 0.003571069 0.003553863 0.003535225 0.003515060
[25] 0.003493272 0.003469763 0.003444435 0.003417195 0.003387948 0.003356609
[31] 0.003323098 0.003287343 0.003249286 0.003208884 0.003166109 0.003120955
[37] 0.003073437 0.003023599 0.002971510 0.002917270 0.002861010 0.002802894
[43] 0.002743118 0.002681910 0.002619524 0.002556245 0.002492378 0.002428246
[49] 0.002364185 0.002300534 0.002237633 0.002175816 0.002115398 0.002056677
[55] 0.001999921 0.001945370 0.001893229 0.001843668 0.001796823 0.001752805
[61] 0.001711763 0.001673478 0.001638242 0.001606037 0.001576812 0.001550563
[67] 0.001527294 0.001507006 0.001489580 0.001475266 0.001463850 0.001455632
[73] 0.001450352 0.001448445 0.001449546 0.001454223 0.001461934 0.001473238
[79] 0.001488074 0.001505921 0.001527271 0.001551951 0.001579158 0.001609156
[85] 0.001641469 0.001676191 0.001711731 0.001748420 0.001785621 0.001822769
[91] 0.001859352 0.001894881 0.001928907 0.001961034 0.001990928 0.002018321
[97] 0.002043051 0.002065043 0.002084141 0.002099895
$cvup
[1] 0.10482129 0.10397302 0.10379663 0.10369515 0.10358419 0.10346289
[7] 0.10333035 0.10318558 0.10302753 0.10285506 0.10266697 0.10246196
[13] 0.10223865 0.10199558 0.10173122 0.10144390 0.10113198 0.10079368
[19] 0.10042715 0.10003052 0.09960187 0.09913926 0.09864076 0.09810447
[25] 0.09752855 0.09691127 0.09625102 0.09554639 0.09479621 0.09399960
[31] 0.09315600 0.09226529 0.09132779 0.09034435 0.08931639 0.08824593
[37] 0.08713562 0.08598876 0.08480931 0.08360188 0.08237164 0.08112428
[43] 0.07986594 0.07860305 0.07734225 0.07609022 0.07485353 0.07363850
[49] 0.07245101 0.07129642 0.07017940 0.06910386 0.06807285 0.06708855
[55] 0.06615225 0.06526435 0.06442444 0.06363137 0.06288333 0.06217791
[61] 0.06151197 0.06088275 0.06028820 0.05972409 0.05918687 0.05867319
[67] 0.05817993 0.05770435 0.05724610 0.05679838 0.05636424 0.05593617
[73] 0.05551881 0.05510462 0.05469953 0.05429597 0.05390122 0.05351082
[79] 0.05312347 0.05274449 0.05237114 0.05200240 0.05164409 0.05129440
[85] 0.05095304 0.05061891 0.05029751 0.04998746 0.04968892 0.04940201
[91] 0.04912707 0.04886428 0.04861374 0.04837549 0.04814942 0.04793524
[97] 0.04773317 0.04754229 0.04736302 0.04720058
$cvlo
[1] 0.09727988 0.09645399 0.09635002 0.09625576 0.09615269 0.09604003
[7] 0.09591694 0.09578251 0.09563576 0.09547565 0.09530105 0.09511079
[13] 0.09490357 0.09467807 0.09443286 0.09416641 0.09387723 0.09356367
[19] 0.09322405 0.09285665 0.09245973 0.09203153 0.09157031 0.09107435
[25] 0.09054201 0.08997174 0.08936215 0.08871200 0.08802032 0.08728638
[31] 0.08650980 0.08569060 0.08482922 0.08392659 0.08298417 0.08200402
[37] 0.08098875 0.07994156 0.07886629 0.07776734 0.07664962 0.07551849
[43] 0.07437970 0.07323923 0.07210320 0.07097773 0.06986878 0.06878201
[49] 0.06772264 0.06669536 0.06570414 0.06475223 0.06384205 0.06297520
[55] 0.06215240 0.06137361 0.06063798 0.05994404 0.05928968 0.05867230
[61] 0.05808845 0.05753580 0.05701171 0.05651202 0.05603324 0.05557206
[67] 0.05512534 0.05469034 0.05426694 0.05384785 0.05343654 0.05302491
[73] 0.05261811 0.05220773 0.05180044 0.05138753 0.05097735 0.05056435
[79] 0.05014733 0.04973265 0.04931660 0.04889849 0.04848577 0.04807608
[85] 0.04767011 0.04726653 0.04687405 0.04649062 0.04611768 0.04575648
[91] 0.04540836 0.04507451 0.04475593 0.04445343 0.04416757 0.04389860
[97] 0.04364707 0.04341221 0.04319474 0.04300078
$nzero
s0 s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 s13 s14 s15 s16 s17 s18 s19
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
s20 s21 s22 s23 s24 s25 s26 s27 s28 s29 s30 s31 s32 s33 s34 s35 s36 s37 s38 s39
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
s40 s41 s42 s43 s44 s45 s46 s47 s48 s49 s50 s51 s52 s53 s54 s55 s56 s57 s58 s59
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
s60 s61 s62 s63 s64 s65 s66 s67 s68 s69 s70 s71 s72 s73 s74 s75 s76 s77 s78 s79
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
s80 s81 s82 s83 s84 s85 s86 s87 s88 s89 s90 s91 s92 s93 s94 s95 s96 s97 s98 s99
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
#find minimum MSE
min(ridge_reg$cvm)
[1] 0.04510068
#find value of lambda at this minimum val
ridge_reg$lambda.min
[1] 0.02108157
The minimum MSE of the cross validation folds is 0.0451. The corresponding lambda value at this MSE minimum is 0.0211.
I’ll do the same for the lasso model.
#inspect lasso model
head(lasso_reg)
$lambda
[1] 2.108157e-01 1.920874e-01 1.750229e-01 1.594743e-01 1.453071e-01
[6] 1.323984e-01 1.206365e-01 1.099195e-01 1.001545e-01 9.125708e-02
[11] 8.315006e-02 7.576323e-02 6.903264e-02 6.289997e-02 5.731211e-02
[16] 5.222066e-02 4.758152e-02 4.335451e-02 3.950301e-02 3.599367e-02
[21] 3.279610e-02 2.988258e-02 2.722789e-02 2.480904e-02 2.260508e-02
[26] 2.059690e-02 1.876713e-02 1.709991e-02 1.558080e-02 1.419665e-02
[31] 1.293546e-02 1.178631e-02 1.073924e-02 9.785198e-03 8.915908e-03
[36] 8.123843e-03 7.402143e-03 6.744557e-03 6.145390e-03 5.599450e-03
[41] 5.102010e-03 4.648762e-03 4.235779e-03 3.859484e-03 3.516618e-03
[46] 3.204211e-03 2.919558e-03 2.660192e-03 2.423868e-03 2.208538e-03
[51] 2.012338e-03 1.833567e-03 1.670678e-03 1.522260e-03 1.387027e-03
[56] 1.263807e-03 1.151534e-03 1.049235e-03 9.560236e-04 8.710931e-04
[61] 7.937076e-04 7.231968e-04 6.589500e-04 6.004107e-04 5.470718e-04
[66] 4.984715e-04 4.541887e-04 4.138398e-04 3.770754e-04 3.435771e-04
[71] 3.130546e-04 2.852437e-04 2.599034e-04 2.368143e-04 2.157764e-04
[76] 1.966074e-04 1.791414e-04 1.632269e-04 1.487263e-04 1.355139e-04
[81] 1.234752e-04 1.125060e-04 1.025113e-04 9.340446e-05
$cvm
[1] 0.10062068 0.09363761 0.08719941 0.08178955 0.07731528 0.07361844
[7] 0.07056437 0.06804298 0.06596307 0.06424436 0.06280284 0.06147836
[13] 0.06025119 0.05918640 0.05830676 0.05757759 0.05697411 0.05647471
[19] 0.05606198 0.05572053 0.05543812 0.05520472 0.05499648 0.05473028
[25] 0.05445873 0.05369819 0.05205259 0.05065539 0.04949829 0.04853919
[31] 0.04774056 0.04708089 0.04652605 0.04607464 0.04568779 0.04535745
[37] 0.04508427 0.04485471 0.04466536 0.04450730 0.04437715 0.04426876
[43] 0.04417988 0.04410692 0.04404566 0.04399501 0.04395262 0.04392116
[49] 0.04390205 0.04388789 0.04384475 0.04374347 0.04362202 0.04351883
[55] 0.04343351 0.04336283 0.04330407 0.04325528 0.04321498 0.04318149
[61] 0.04315369 0.04313059 0.04311157 0.04309579 0.04308262 0.04307181
[67] 0.04306282 0.04305544 0.04304929 0.04304415 0.04304001 0.04303656
[73] 0.04303369 0.04303129 0.04302939 0.04302771 0.04302643 0.04302536
[79] 0.04302448 0.04302377 0.04302314 0.04302264 0.04302228 0.04302269
$cvsd
[1] 0.004671339 0.004480974 0.004144777 0.003868594 0.003637909 0.003446647
[7] 0.003288545 0.003158840 0.003053394 0.002966980 0.002897399 0.002849305
[13] 0.002800528 0.002754500 0.002719805 0.002692277 0.002671008 0.002654625
[19] 0.002642354 0.002633340 0.002626828 0.002622295 0.002618588 0.002617145
[25] 0.002629552 0.002654682 0.002609113 0.002555043 0.002510587 0.002474131
[31] 0.002441149 0.002413325 0.002385445 0.002369313 0.002350763 0.002332022
[37] 0.002316471 0.002302372 0.002290722 0.002280780 0.002271843 0.002263104
[43] 0.002255779 0.002249514 0.002244039 0.002238635 0.002234054 0.002232952
[49] 0.002238990 0.002243972 0.002237215 0.002235569 0.002234950 0.002234581
[55] 0.002234558 0.002234703 0.002234934 0.002235192 0.002235587 0.002235993
[61] 0.002236414 0.002236885 0.002237321 0.002237774 0.002238175 0.002238641
[67] 0.002239027 0.002239437 0.002239805 0.002240198 0.002240502 0.002240825
[73] 0.002241129 0.002241371 0.002241635 0.002241891 0.002242093 0.002242321
[79] 0.002242511 0.002242691 0.002242837 0.002242976 0.002243137 0.002243263
$cvup
[1] 0.10529202 0.09811858 0.09134418 0.08565814 0.08095318 0.07706509
[7] 0.07385292 0.07120182 0.06901646 0.06721134 0.06570024 0.06432767
[13] 0.06305172 0.06194090 0.06102657 0.06026987 0.05964512 0.05912934
[19] 0.05870433 0.05835387 0.05806495 0.05782701 0.05761507 0.05734743
[25] 0.05708829 0.05635287 0.05466171 0.05321043 0.05200888 0.05101332
[31] 0.05018170 0.04949422 0.04891149 0.04844395 0.04803855 0.04768948
[37] 0.04740074 0.04715708 0.04695608 0.04678808 0.04664900 0.04653186
[43] 0.04643566 0.04635644 0.04628970 0.04623364 0.04618667 0.04615412
[49] 0.04614104 0.04613186 0.04608196 0.04597904 0.04585697 0.04575341
[55] 0.04566807 0.04559753 0.04553901 0.04549047 0.04545057 0.04541748
[61] 0.04539011 0.04536747 0.04534889 0.04533356 0.04532080 0.04531045
[67] 0.04530185 0.04529488 0.04528909 0.04528435 0.04528052 0.04527738
[73] 0.04527482 0.04527266 0.04527102 0.04526960 0.04526853 0.04526768
[79] 0.04526699 0.04526646 0.04526598 0.04526561 0.04526541 0.04526596
$cvlo
[1] 0.09594934 0.08915663 0.08305463 0.07792095 0.07367737 0.07017179
[7] 0.06727583 0.06488414 0.06290967 0.06127738 0.05990544 0.05862906
[13] 0.05745066 0.05643190 0.05558696 0.05488532 0.05430310 0.05382009
[19] 0.05341962 0.05308719 0.05281129 0.05258242 0.05237789 0.05211314
[25] 0.05182918 0.05104350 0.04944348 0.04810035 0.04698771 0.04606506
[31] 0.04529941 0.04466757 0.04414060 0.04370532 0.04333703 0.04302543
[37] 0.04276780 0.04255234 0.04237464 0.04222652 0.04210531 0.04200566
[43] 0.04192411 0.04185741 0.04180162 0.04175637 0.04171856 0.04168821
[49] 0.04166306 0.04164391 0.04160753 0.04150791 0.04138707 0.04128425
[55] 0.04119896 0.04112812 0.04106914 0.04102008 0.04097940 0.04094549
[61] 0.04091728 0.04089371 0.04087425 0.04085801 0.04084445 0.04083317
[67] 0.04082379 0.04081601 0.04080948 0.04080395 0.04079951 0.04079573
[73] 0.04079256 0.04078992 0.04078775 0.04078582 0.04078434 0.04078304
[79] 0.04078197 0.04078108 0.04078031 0.04077966 0.04077914 0.04077943
$nzero
s0 s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 s13 s14 s15 s16 s17 s18 s19
0 2 2 2 2 2 2 2 2 3 3 4 4 4 4 4 4 4 4 4
s20 s21 s22 s23 s24 s25 s26 s27 s28 s29 s30 s31 s32 s33 s34 s35 s36 s37 s38 s39
4 4 5 5 5 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7
s40 s41 s42 s43 s44 s45 s46 s47 s48 s49 s50 s51 s52 s53 s54 s55 s56 s57 s58 s59
7 7 7 7 8 8 8 8 8 8 9 10 10 10 10 10 10 10 10 10
s60 s61 s62 s63 s64 s65 s66 s67 s68 s69 s70 s71 s72 s73 s74 s75 s76 s77 s78 s79
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
s80 s81 s82 s83
10 10 10 10
#find minimum MSE
min(lasso_reg$cvm)
[1] 0.04302228
#find corresponding lambda value
lasso_reg$lambda.min
[1] 0.0001025113
The minimum MSE of the cross validation folds is 0.0430. The corresponding lambda value is 1.025e-4.
I will refer to the "one-standard-error" rule when tuning lambda to select the best model. This rule tells us to pick the most parsimonious model (fewest predictors) that still remains within one standard error of the overall minimum cross validation error. The cv.glmnet() model object automatically stores the value of lambda associated with the model whose MSE is one standard error from the minimum ($lambda.1se).
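As a short sketch of applying that rule directly (the original code below looks at $nzero instead), lambda.1se and the coefficients that survive at that penalty can be pulled from each cv.glmnet object:
# lambda one standard error above the minimum-MSE lambda
ridge_reg$lambda.1se
lasso_reg$lambda.1se
# number of non-zero coefficients (excluding the intercept) at lambda.1se
sum(coef(ridge_reg, s = "lambda.1se") != 0) - 1
sum(coef(lasso_reg, s = "lambda.1se") != 0) - 1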
Now I’ll find the number of predictors associated with this model
#number of predictors associated with ridge regression model
ridge_reg$nzero
s0 s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 s13 s14 s15 s16 s17 s18 s19
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
s20 s21 s22 s23 s24 s25 s26 s27 s28 s29 s30 s31 s32 s33 s34 s35 s36 s37 s38 s39
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
s40 s41 s42 s43 s44 s45 s46 s47 s48 s49 s50 s51 s52 s53 s54 s55 s56 s57 s58 s59
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
s60 s61 s62 s63 s64 s65 s66 s67 s68 s69 s70 s71 s72 s73 s74 s75 s76 s77 s78 s79
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
s80 s81 s82 s83 s84 s85 s86 s87 s88 s89 s90 s91 s92 s93 s94 s95 s96 s97 s98 s99
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
#number of predictors in lasso model
lasso_reg$nzero
s0 s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 s13 s14 s15 s16 s17 s18 s19
0 2 2 2 2 2 2 2 2 3 3 4 4 4 4 4 4 4 4 4
s20 s21 s22 s23 s24 s25 s26 s27 s28 s29 s30 s31 s32 s33 s34 s35 s36 s37 s38 s39
4 4 5 5 5 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7
s40 s41 s42 s43 s44 s45 s46 s47 s48 s49 s50 s51 s52 s53 s54 s55 s56 s57 s58 s59
7 7 7 7 8 8 8 8 8 8 9 10 10 10 10 10 10 10 10 10
s60 s61 s62 s63 s64 s65 s66 s67 s68 s69 s70 s71 s72 s73 s74 s75 s76 s77 s78 s79
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
s80 s81 s82 s83
10 10 10 10
The ridge regression model retains all 10 predictors at every value of lambda. The number of predictors in the lasso regression model decreases from 10 to 0 as the penalty, log(lambda), increases.
Regularized Regression Comparison
The lasso regression model worked better for this task. It reaches a slightly lower minimum cross-validated MSE than ridge, and it keeps the error comparable across a wide range of penalty values. The lasso model also retains fewer predictors while staying within one standard error of the overall minimum cross validation error, meaning fewer predictors are needed to achieve similar predictions.
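As a final, hedged sketch (not part of the original analysis), the selected lasso model could be applied to the held-out test set created earlier; because the outcome was modeled as log(Rings), predictions need to be back-transformed:
# build the test-set predictor matrix the same way as for training
test_x <- model.matrix(Rings ~ ., abalone_data_test)[, -1]
# predict log(Rings) at the one-standard-error lambda, then back-transform
pred_rings <- exp(predict(lasso_reg, newx = test_x, s = "lambda.1se"))
# test-set RMSE on the original Rings scale
sqrt(mean((abalone_data_test$Rings - pred_rings)^2))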
Citation
@online{favre2023,
author = {Favre, Kiran},
title = {Predicting the {Age} of {Abalone} with {Regularized}
{Regression} {Models}},
date = {2023-09-25},
url = {https://kiranfavre.github.io/posts/2023-09-25_abaloneageML/},
langid = {en}
}