GENERALIZED LINEAR MODEL
Generalized linear models extend the interpretable linear regression model, which predicts a target as a weighted sum of its inputs. Linear regression makes interpretability easy because the learned relationships are linear: the model describes the dependence of the target variable y on the features X through a weighted sum. It is mainly used by statisticians and computer scientists to tackle quantitative problems (C Molnar, 2019).
A GLM is one of the extensions of the linear model for modeling outcomes that are not well described by a Gaussian fit. Its essential defining feature is that it allows non-Gaussian outcome distributions and connects the expected value of the outcome distribution to the weighted sum of the features through a (possibly nonlinear) link function. A GLM can therefore model categorical outcomes and count outcomes, which a linear regression model cannot produce (C Molnar, 2019).
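As a concrete illustration of this flexibility, the short sketch below fits a GLM to a count outcome. It is only a sketch: it assumes the Python library statsmodels is available and uses synthetic data invented here purely for the example.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    X = sm.add_constant(x)                        # design matrix with an intercept column

    # Synthetic count outcome drawn from a Poisson distribution (for illustration only)
    y_count = rng.poisson(np.exp(0.5 + 0.8 * x))

    # Poisson GLM: a non-Gaussian outcome distribution with the family's default log link
    poisson_fit = sm.GLM(y_count, X, family=sm.families.Poisson()).fit()
    print(poisson_fit.params)                     # estimated weights β0, β1 on the log scale

An ordinary least-squares regression on the same data could return negative or non-integer predictions, whereas the Poisson GLM respects the count nature of the outcome.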
A GLM consists of three essential parts, namely:
Random component – refers to the probability
distribution of the response variable (Y), e.g., the normal distribution
for Y in linear regression or the binomial distribution
for Y in binary logistic regression. Also called the noise
model or error model.
Systematic component – specifies the explanatory
variables (X1, X2, …, Xk) in the model, more specifically their
linear combination, which forms the so-called linear predictor;
e.g., β0 + β1x1 + β2x2, as we have seen in linear
regression.
Link function, η or g(μ) – specifies the link
between the random and systematic components. It states how the expected value
of the response relates to the linear predictor of the explanatory
variables; e.g., the identity link η = g(E(Yi)) = E(Yi) for
linear regression, or the logit link η = logit(π) for logistic
regression.
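These three parts map directly onto how a GLM is specified in software. The sketch below, again assuming statsmodels and using synthetic data made up for the example, fits a binary logistic regression and notes which piece plays which role:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    x1 = rng.normal(size=200)
    x2 = rng.normal(size=200)
    X = sm.add_constant(np.column_stack([x1, x2]))   # systematic component: β0 + β1x1 + β2x2

    eta = 0.3 + 1.0 * x1 - 0.5 * x2                  # linear predictor η (true values, for simulation)
    y = rng.binomial(1, 1 / (1 + np.exp(-eta)))      # random component: binomial response Y

    # family=Binomial() sets the random component; its default link is the logit,
    # so the link function η = logit(π) connects E(Y) to the linear predictor.
    logit_fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
    print(logit_fit.summary())

Here the choice of family fixes the random component, the columns of X and their weights form the systematic component, and the (default) logit link ties the two together.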