FAQ: How are the likelihood ratio, Wald, and Lagrange multiplier (score) tests different and/or similar?

**Purpose:** This page introduces the concepts of the a) likelihood ratio test, b) Wald test, and c) score test. To see how the likelihood ratio test and Wald test are implemented in Stata, refer to "How can I perform the likelihood ratio and Wald test in Stata?"

A researcher estimated the following model, which predicts high versus low writing scores on a standardized test (**hiwrite**) using students’ gender (**female**) and their scores on standardized tests in reading (**read**), math (**math**), and science (**science**). The output for the model looks like this:

## The likelihood

All three tests use the likelihood of the models being compared to assess their fit. The likelihood is the probability of the data given the parameter estimates. The goal of estimation is to find values for the parameters (coefficients) that maximize the value of the likelihood function, that is, to find the set of parameter estimates that make the data most likely. Many procedures use the log of the likelihood, rather than the likelihood itself, because it is easier to work with. For a model of a discrete outcome, such as the logistic regression above, the log likelihood (i.e., the log of the likelihood) will always be negative, with higher values (closer to zero) indicating a better-fitting model. Although the above example involves a logistic regression model, these tests are very general and can be applied to any model with a likelihood function. Note that even models for which a likelihood or a log likelihood is not typically displayed by statistical software (e.g., ordinary least squares regression) have likelihood functions.
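As a rough illustration of the quantity described above, the following sketch computes the log likelihood of a binary-outcome model from its predicted probabilities. The outcomes and both sets of predicted probabilities are made up for illustration; they are not taken from the **hiwrite** example, and the function name is hypothetical.

```python
import math

def bernoulli_loglik(y, p):
    """Log likelihood of 0/1 outcomes y given predicted probabilities p."""
    return sum(
        math.log(pi) if yi == 1 else math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )

# Hypothetical outcomes and predictions (illustrative values only).
y = [1, 1, 0, 1, 0, 0, 1, 0]
p_good = [0.9, 0.8, 0.2, 0.7, 0.3, 0.1, 0.8, 0.2]  # tracks y closely
p_poor = [0.5] * len(y)                             # ignores the data

ll_good = bernoulli_loglik(y, p_good)
ll_poor = bernoulli_loglik(y, p_poor)
print(f"better-fitting predictions: {ll_good:.3f}")
print(f"uninformative predictions:  {ll_poor:.3f}")
```

Both values are negative, and the predictions that track the outcomes more closely give the log likelihood nearer to zero.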


As mentioned above, the likelihood is a function of the coefficient estimates and the data. The data are fixed, that is, you cannot change them, so one changes the estimates of the coefficients in such a way as to maximize the probability (likelihood). Different parameter estimates, or sets of estimates, give different values of the likelihood. In the figure below, the arch or curve shows the changes in the value of the likelihood for changes in one parameter (**a**). On the x-axis are values of **a**, while the y-axis is the value of the likelihood at the corresponding value of **a**. Most models have more than one parameter, but if the values of all the other coefficients in the model are fixed, changes in a given **a** will show a similar picture. The vertical line marks the value of **a** that maximizes the likelihood.
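The curve described above can be traced numerically. The sketch below assumes a deliberately simple one-parameter model (a binomial with unknown success probability **a**) and hypothetical data of 7 successes in 10 trials. It evaluates the log likelihood, which has the same maximizer as the likelihood, on a grid of candidate values and picks the value with the highest log likelihood.

```python
import math

def loglik(a, successes, trials):
    """Binomial log likelihood (constant term dropped) at parameter a."""
    return successes * math.log(a) + (trials - successes) * math.log(1.0 - a)

# Hypothetical data: 7 successes in 10 trials (illustrative only).
successes, trials = 7, 10

# Candidate values of a strictly between 0 and 1.
grid = [i / 100 for i in range(1, 100)]
values = [loglik(a, successes, trials) for a in grid]

# The maximizing value plays the role of the vertical line in the figure.
a_hat = grid[values.index(max(values))]
print(f"a maximizing the log likelihood: {a_hat:.2f}")
```

A grid search is used here only because it mirrors the picture; estimation routines in statistical software generally rely on iterative optimization instead.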

## The likelihood ratio test

The LR test is performed by estimating two models and comparing the fit of one model to the fit of the other. Removing predictor variables from a model will almost always make the model fit less well (i.e., the model will have a lower log likelihood), but it is necessary to test whether the observed difference in model fit is statistically significant. The LR test does this by comparing the log likelihoods of the two models. If this difference is statistically significant, then the less restrictive model (the one with more variables) is said to fit the data significantly better than the more restrictive model. If one has the log likelihoods from the models, the LR test is fairly easy to calculate. The formula for the LR test statistic is:

$$LR = -2 \ln\left(\frac{L(m_1)}{L(m_2)}\right) = 2(loglik(m_2)-loglik(m_1))$$

Here $L(m_*)$ denotes the likelihood of the respective model (either Model 1 or Model 2), and $loglik(m_*)$ denotes the natural log of the model’s final likelihood (i.e., the log likelihood); $m_1$ is the more restrictive model, and $m_2$ is the less restrictive model.

Under the null hypothesis that the constraints hold, the resulting test statistic is asymptotically distributed chi-squared, with degrees of freedom equal to the number of parameters that are constrained (in the current example, the number of variables removed from the model, i.e., 2).

Using the same example as above, we will run both the full and the restricted model, and assess the difference in fit using the LR test. Model 1 is the model using **female** and **read** as predictors (by not including **math** and **science** in the model, we restrict their coefficients to zero). Below is the output for model 1. We will skip the interpretation of the results because that is not the focus of our discussion, but we will make note of the final log likelihood printed just above the table of coefficients ($loglik(m_1) = -102.45$).

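Model 2 adds **math** and **science** to the predictors in model 1; its final log likelihood is $loglik(m_2) = -84.42$. Substituting both log likelihoods into the formula gives:

$LR = 2 \times (-84.419842 - (-102.44518)) = 2 \times (-84.419842 + 102.44518) = 36.050676$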

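So our likelihood ratio test statistic is $36.05$ (distributed chi-squared), with two degrees of freedom. We can now use a table or some other method to find the associated *p*-value, which is $p < 0.0001$, so the model with all four predictors fits significantly better than the model with only **female** and **read**.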
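The calculation can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library: the two log likelihoods are the values quoted on this page, while the function names are hypothetical and the chi-squared helper relies on a closed form that is valid only for an even number of degrees of freedom.

```python
import math

def lr_statistic(loglik_restricted, loglik_full):
    """LR = 2 * (log likelihood of full model - log likelihood of restricted model)."""
    return 2.0 * (loglik_full - loglik_restricted)

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable, for even df only.

    For df = 2m the survival function has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this helper only handles positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))

# Log likelihoods reported in the text for the two models.
loglik_m1 = -102.44518   # restricted model: female and read only
loglik_m2 = -84.419842   # full model: female, read, math, and science

lr = lr_statistic(loglik_m1, loglik_m2)
p_value = chi2_sf_even_df(lr, df=2)
print(f"LR = {lr:.6f}, p = {p_value:.2e}")
```

For odd degrees of freedom a general routine is needed, for example the chi-squared survival function in a statistics library.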

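## The Wald test

The Wald test approximates the LR test but requires estimating only one model, the one that includes the parameters of interest. Testing whether the coefficients for **math** and **science** are simultaneously equal to zero in the full model gives the output below.

     ( 1)  math = 0
     ( 2)  science = 0

               chi2(  2) =   27.53
             Prob > chi2 =    0.0000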

## The Lagrange multiplier or score test

As with the Wald test, the Lagrange multiplier test requires estimating only a single model. The difference is that with the Lagrange multiplier test, the model estimated does not include the parameter(s) of interest. This means that, in our example, we can use the Lagrange multiplier test to test whether adding **science** and **math** to the model will result in a significant improvement in model fit, after running a model with just **female** and **read** as predictor variables. The test statistic is calculated based on the slope of the likelihood function, evaluated at the estimates from the model that includes only **female** and **read**. This estimated slope, or “score”, is the reason the Lagrange multiplier test is sometimes called the score test. The scores are then used to estimate the improvement in model fit if additional variables were included in the model. The test statistic is the expected change in the chi-squared statistic for the model if a variable or set of variables is added to the model. Because it tests for improvement of model fit if variables that are currently omitted are added to the model, the Lagrange multiplier test is sometimes also referred to as a test for omitted variables. Score tests are also sometimes referred to as modification indices, particularly in the structural equation modeling literature.
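For readers who prefer a formula, the score statistic is commonly written as follows. The notation is an assumption introduced here and does not come from the original page: $U(\theta)$ is the score vector (the gradient of the log likelihood), $I(\theta)$ is the information matrix, and $\tilde{\theta}$ is the vector of estimates from the restricted model, with the coefficients of the omitted variables fixed at zero.

$$S = U(\tilde{\theta})^{\top} \, I(\tilde{\theta})^{-1} \, U(\tilde{\theta}), \qquad U(\theta) = \frac{\partial \, loglik(\theta)}{\partial \theta}$$

Under the null hypothesis, $S$ is asymptotically distributed chi-squared with degrees of freedom equal to the number of constrained parameters. If the omitted variables would not improve the fit, the slope at $\tilde{\theta}$ is close to zero and $S$ is small.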


Below is output for the logistic regression model using the variables **female** and **read** as predictors of **hiwrite** (this is the same as Model 1 from the LR test).

    logit: score tests for omitted variables

    Term                 |   score   df      p
    ---------------------+----------------------
                    math |   28.94    1   0.0000
                 science |   15.39    1   0.0001
    ---------------------+----------------------
       simultaneous test |   35.51    2   0.0000
    ---------------------+----------------------

## A comparison of the three tests

As discussed above, all three tests address the same basic question: does constraining parameters to zero (i.e., leaving out these predictor variables) significantly reduce the fit of the model? The difference between the tests is how they go about answering that question. As you have seen, in order to perform a likelihood ratio test, one must estimate both of the models one wishes to compare. The advantage of the Wald and Lagrange multiplier (or score) tests is that they approximate the LR test but require that only one model be estimated. Both the Wald and the Lagrange multiplier tests are asymptotically equivalent to the LR test, that is, as the sample size becomes infinitely large, the values of the Wald and Lagrange multiplier test statistics will become increasingly close to the test statistic from the LR test. In finite samples, the three will tend to generate somewhat different test statistics, but will generally come to the same conclusion.

An interesting relationship between the three tests is that, when the model is linear, the three test statistics have the following relationship: Wald ≥ LR ≥ score (Johnston and DiNardo 1997, p. 150). That is, the Wald test statistic will always be at least as large as the LR test statistic, which will, in turn, always be at least as large as the test statistic from the score test. The ordering need not hold for nonlinear models: in the logistic regression example on this page, the Wald statistic (27.53) is smaller than both the LR statistic (36.05) and the score statistic (35.51).

When computing power was much more limited, and many models took a long time to run, being able to approximate the LR test using a single model was a fairly major advantage. Today, for most of the models researchers are likely to want to compare, computational time is not an issue, and we generally recommend running the likelihood ratio test in most situations. This is not to say that one should never use the Wald or score tests. For example, the Wald test is commonly used to perform multiple degree of freedom tests on sets of dummy variables used to model categorical predictor variables in regression (for more information see our webbooks on Regression with Stata, SPSS, and SAS, specifically Chapter 3 – Regression with Categorical Predictors). The advantage of the score test is that it can be used to search for omitted variables when the number of candidate variables is large.
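To see how the three tests can disagree in value while agreeing in conclusion, the sketch below takes the three statistics reported on this page for the **math** and **science** constraints (LR = 36.05, Wald chi2(2) = 27.53, simultaneous score test = 35.51) and converts each to a *p*-value. It uses the fact that a chi-squared variable with exactly two degrees of freedom has upper-tail probability exp(-x/2); the 0.05 significance level is an assumption chosen for illustration.

```python
import math

# Test statistics reported in the text, each with 2 degrees of freedom.
statistics = {"LR": 36.05, "Wald": 27.53, "score": 35.51}

def chi2_sf_2df(x):
    """Upper-tail probability of a chi-squared variable with exactly 2 df."""
    return math.exp(-x / 2.0)

alpha = 0.05  # assumed significance level, chosen for illustration
p_values = {name: chi2_sf_2df(x) for name, x in statistics.items()}
for name, p in p_values.items():
    decision = "reject" if p < alpha else "do not reject"
    print(f"{name:>5}: statistic = {statistics[name]:.2f}, p = {p:.2e}, {decision} H0")
```

All three *p*-values fall far below 0.0001, so each test rejects the hypothesis that both coefficients are zero even though the statistics themselves differ.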

*Figure based on a figure in Fox (1997, p. 570); used with author’s permission.*

One way to better understand how the three tests are related, and how they are different, is to look at a graphical representation of what they are testing. The figure above illustrates what each of the three tests does. Along the x-axis (labeled “a”) are possible values of the parameter **a** (in our example, this would be the regression coefficient for either **math** or **science**). Along the y-axis are the values of the log likelihood corresponding to those values of **a**. The LR test compares the log likelihood of a model with the parameter **a** constrained to some value (in our example zero) to that of a model where **a** is freely estimated. It does this by comparing the height of the likelihoods for the two models to see if the difference is statistically significant (remember, higher values of the likelihood indicate better fit). In the figure above, this corresponds to the vertical distance between the two dotted lines. In contrast, the Wald test compares the parameter estimate **a-hat** to **a_0**; **a_0** is the value of **a** under the null hypothesis, which generally states that **a** = 0. If **a-hat** is significantly different from **a_0**, this suggests that freely estimating **a** (using **a-hat**) significantly improves model fit. In the figure, this is shown as the distance between **a_0** and **a-hat** on the x-axis (highlighted by the solid lines). Finally, the score test looks at the slope of the log likelihood when **a** is constrained (in our example to zero). That is, it looks at how quickly the likelihood is changing at the (null) hypothesized value of **a**. In the figure above this is shown as the tangent line at **a_0**.
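The three geometric ideas above (vertical distance, horizontal distance, and slope) can be computed side by side for a model with a single parameter. The sketch below assumes a binomial model with success probability **a**, hypothetical data of 60 successes in 100 trials, and a null value **a_0** = 0.5; none of these come from the **hiwrite** example, and the function names are illustrative.

```python
import math

def loglik(a, k, n):
    """Binomial log likelihood (constant dropped) for k successes in n trials."""
    return k * math.log(a) + (n - k) * math.log(1.0 - a)

def score(a, k, n):
    """Slope of the log likelihood at a."""
    return k / a - (n - k) / (1.0 - a)

def information(a, n):
    """Expected (Fisher) information for a binomial proportion."""
    return n / (a * (1.0 - a))

# Hypothetical data and null value (illustrative only).
k, n, a_0 = 60, 100, 0.5
a_hat = k / n  # value of a that maximizes the log likelihood

# LR: vertical distance between the log likelihoods at a_hat and a_0.
lr = 2.0 * (loglik(a_hat, k, n) - loglik(a_0, k, n))
# Wald: horizontal distance between a_hat and a_0, scaled by the
# information (inverse variance) evaluated at a_hat.
wald = (a_hat - a_0) ** 2 * information(a_hat, n)
# Score: squared slope at a_0, scaled by the variance evaluated at a_0.
score_stat = score(a_0, k, n) ** 2 / information(a_0, n)

print(f"LR = {lr:.3f}, Wald = {wald:.3f}, score = {score_stat:.3f}")
```

With these made-up numbers the three statistics are close but not identical. Note that only the LR statistic needs the log likelihood at both **a_hat** and **a_0**; the Wald statistic uses quantities at **a_hat** alone, and the score statistic uses quantities at **a_0** alone.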

## References

Fox, J. (1997) Applied regression analysis, linear models, and related methods. Thousand Oaks, CA: Sage Publications.


Johnston, J. and DiNardo, J. (1997) Econometric methods (4th ed.). New York, NY: The McGraw-Hill Companies, Inc.