F-statistic calculator (2024)

Last updated: Mar 18, 2024

Table of contents

What is F-statistic?
How to calculate the F-statistic using an F-statistic table?
How to calculate the F-statistic in linear regression?
FAQs

The F-statistic calculator (or F-test calculator) helps you test whether two normally distributed populations have equal variances, based on the ratio of the variances of samples drawn from them.

Read further, and learn the following:

What is an F-statistic;
What is the F-statistic formula; and
How to interpret an F-statistic in regression.

What is F-statistic?

Broadly speaking, an F-statistic is a test statistic used to compare the variances of two populations. While an F-test may appear in various statistical or econometric problems, we apply it most frequently to regression analysis containing multiple explanatory variables. In this vein, an F-statistic is comparable to a T-statistic, the main difference being that the F-test examines several regression coefficients jointly, whereas the T-test examines only an individual one.

In the following article, we introduce the F-test in its most basic form using the F-distribution table for better intuition. Then we show how to calculate the F-statistic in linear regressions (see the calculator's Multiple regression mode) and explain how to interpret an F-statistic in regression analysis.

How to calculate the F-statistic using an F-statistic table?

The best way to grasp the essence of the F-test is to consider its most basic form. Consider two populations, from each of which we draw a sample of $n$ observations. To test whether the two populations are likely to have the same variance (denoted by $S^2_i$, $i = 1, 2$), we follow these steps:

Specify the null hypothesis $H_0$ (which in our simple case is that the two variances are equal) and the alternative hypothesis $H_1$ (which supposes that the two variances are different).

$\footnotesize \qquad \begin{aligned}H_0 : S^2_1 &= S^2_2 \\H_1 : S^2_1 &\neq S^2_2\end{aligned}$

Determine the variance of the samples (here you may find our variance calculator useful).
Calculate the F-test statistic by dividing the two variances.

$\footnotesize \qquad F = \frac{S^2_1}{S^2_2}$

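Determine the degrees of freedom $(\text{df}_i)$ of the two samples as $\text{df}_i = n - 1$, with $n$ being the number of observations taken from each of the two populations.
Look up the critical value in the F-distribution table for the chosen significance level and these degrees of freedom, and reject $H_0$ if $F$ exceeds it.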
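
The variance and F-ratio steps above can be sketched in Python using only the standard library. The function name `variance_ratio_f` and both samples are hypothetical and serve purely as an illustration. The sketch assumes each sample holds at least two observations and that the second sample's variance is not zero.

```python
from statistics import variance


def variance_ratio_f(sample_1, sample_2):
    """Return (F, df_1, df_2) for the ratio of two sample variances."""
    s2_1 = variance(sample_1)  # unbiased sample variance (divides by n - 1)
    s2_2 = variance(sample_2)
    f_value = s2_1 / s2_2
    return f_value, len(sample_1) - 1, len(sample_2) - 1


# Hypothetical samples of equal size (n = 8), for illustration only.
sample_1 = [2, 4, 4, 4, 5, 5, 7, 9]
sample_2 = [1, 2, 3, 4, 5, 6, 7, 8]

f_value, df_1, df_2 = variance_ratio_f(sample_1, sample_2)
print(f"F = {f_value:.4f} with ({df_1}, {df_2}) degrees of freedom")
```

The resulting F-value is then compared with the tabulated critical value for the same pair of degrees of freedom.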

How to calculate the F-statistic in linear regression?

Analysts mainly apply the F-statistic to multiple regression models (and so can you, with our F-test statistic calculator in Multiple regression mode). It is therefore worth extending the basic analysis above in this direction.

Let's assume we have the following regression model (the full, or unrestricted, model) and want to know whether it explains the data significantly better than its reduced form (the restricted model). In other words, we test whether the coefficients dropped from the restricted model (that is, the effects of the excluded variables) are jointly equal to zero in the population:

$\footnotesize \text{Full model}\\y = \beta_0 + \beta_1x_1+ \beta_2x_2+ \beta_3x_3+ \hat{u}\\[1em]\footnotesize \text{Restricted model}\\y = \beta_0 + \beta_1x_1+ \hat{u}$

where:

$\beta_0$ – Constant or intercept;
$y$ – Dependent variable (also called the regressand, response variable, explained variable, or output variable);
$x_i\, , \ i = 1, 2, 3$ – Independent variables (also called regressors, explanatory variables, controlled variables, or input variables);
$\beta_i\, ,\ i = 1, 2, 3$ – Coefficients; and
$\hat{u}$ – Residual (or error term).

To conduct the F-test and obtain the F-statistic (or F-value), we need to take the following steps:

State the hypothesis we want to test.
In our case, the null hypothesis $(H_0)$ is that the last two coefficients are jointly equal to zero in the unrestricted model. Put differently, the joint effect of the related independent variables is insignificant.
In turn, the alternative hypothesis $(H_1)$ is that at least one of these coefficients is not equal to zero.

$\footnotesize\qquad \text{Specific case} \\\qquad H_0: \beta_2= \beta_3 = 0 \\[1em]\qquad \text{General case} \\\qquad H_0: \beta_{K-J+1} = \cdots = \beta_K = 0$

where:
- $J$ is the number of restrictions (in the present case, $J=2$); and
- $K$ is the total number of coefficients (in the present case, $K = 3$).

Now, to gain information on which model fits better, we need to obtain the sum of squared residuals ($\text{SSR}$) of each model. Because the restricted model has fewer regressors, its sum of squared residuals can never be smaller than that of the full model (i.e., $\text{SSR}_R \geq \text{SSR}_F$).

$\footnotesize \qquad \text{SSR} = \sum^N_{i=1} \hat{u}^2_i$
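
One way to obtain the two sums of squared residuals is to fit the full and restricted models by ordinary least squares. The following is a minimal pure-Python sketch that solves the normal equations by Gauss-Jordan elimination. The helper `ols_ssr` and all data values are hypothetical, and the sketch assumes the regressors are not perfectly collinear. In practice, a statistics library would normally do this step.

```python
def ols_ssr(columns, y):
    """Fit y on an intercept plus the given regressor columns by least
    squares and return the sum of squared residuals."""
    n = len(y)
    # Design matrix rows: [1, x_1, ..., x_k]
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Normal equations (X'X) b = X'y stored as an augmented matrix
    aug = [
        [sum(r[a] * r[b] for r in rows) for b in range(p)]
        + [sum(r[a] * yi for r, yi in zip(rows, y))]
        for a in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    beta = [aug[i][p] / aug[i][i] for i in range(p)]
    residuals = [
        yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)
    ]
    return sum(e * e for e in residuals)


# Hypothetical data (N = 8), for illustration only.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
x3 = [1, 0, 0, 1, 1, 0, 1, 0]
y = [3.1, 4.9, 9.2, 9.8, 15.1, 14.9, 21.0, 20.2]

ssr_restricted = ols_ssr([x1], y)        # y on intercept and x1
ssr_full = ols_ssr([x1, x2, x3], y)      # y on intercept, x1, x2, x3
print(f"SSR_R = {ssr_restricted:.4f}, SSR_F = {ssr_full:.4f}")
```

Adding regressors can only lower (or leave unchanged) the sum of squared residuals, so `ssr_full` should not exceed `ssr_restricted`.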

However, the real question is whether the sum of squared residuals of the restricted model is significantly larger than that of the full model (i.e., $\text{SSR}_R \gg \text{SSR}_F$). To find out, we apply the following F-statistic formula to compute the F-ratio.

$\qquad \footnotesize F = \frac{(\text{SSR}_R-\text{SSR}_F) / J}{\text{SSR}_F/(N-K)}$

where:
- $F$ – F-statistic;
- $\text{SSR}_F$ – Sum of squared residuals of the full model;
- $\text{SSR}_R$ – Sum of squared residuals of the restricted model;
- $J$ – Number of restrictions;
- $K$ – Total number of coefficients; and
- $N$ – Number of observations in the sample.
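
As a quick arithmetic illustration, the sketch below evaluates the F-ratio in its standard form, with $\text{SSR}_F/(N-K)$ in the denominator, which is also the form given in the FAQ section. The function name and all input numbers are hypothetical. The sketch assumes that `n_coefficients` counts every estimated coefficient of the full model, including the intercept.

```python
def f_statistic(ssr_restricted, ssr_full, n_restrictions, n_obs, n_coefficients):
    """F-ratio comparing a restricted regression model with the full model."""
    # Average reduction in SSR per restriction that is lifted
    numerator = (ssr_restricted - ssr_full) / n_restrictions
    # Residual variance estimate of the full model
    denominator = ssr_full / (n_obs - n_coefficients)
    return numerator / denominator


# Hypothetical inputs, for illustration only.
f_value = f_statistic(
    ssr_restricted=120.0,
    ssr_full=100.0,
    n_restrictions=2,
    n_obs=44,
    n_coefficients=4,
)
print(f_value)  # (20 / 2) / (100 / 40) = 4.0
```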

Naturally, the larger the F-statistic, the more evidence we have against the null hypothesis (note that the F-statistic increases as the gap between the two sums of squared residuals widens). To decide on rejection, however, we need a critical value of the F-statistic.
We find the critical F-value $(F^J_{N-K;\alpha})$ in the F-distribution table for a specified significance level $(\alpha)$ by looking for the intersection corresponding to the degrees of freedom, where $\text{df}_1 = J$ is at the top and $\text{df}_2 = N-K$ is at the side of the table (we can also say that, under the null hypothesis, $F$ has an F-distribution with $J$ and $N - K$ degrees of freedom). If $F$ is larger than its critical value, we can reject the null hypothesis.

So how to interpret the F-statistic in regression?

The F-test can be interpreted as testing whether the increase in explained variance (equivalently, the drop in the sum of squared residuals) when moving from the restricted model to the more general model is significant. Formally, the critical value is defined by:

$\footnotesize P\{F > F^J_{N-K;\alpha}\} = \alpha$

where $\alpha$ is the significance level of the test. For example, if $N - K = 40$ and $J = 4$, the critical value at the 5% level is $F^J_{N-K; \alpha} = 2.606$.
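
A tabulated critical value such as the one quoted above can be cross-checked numerically. The following standard-library sketch evaluates the F-distribution's tail probability through the regularized incomplete beta function (computed with a continued fraction) and then finds the critical value by bisection. All function names are hypothetical, and this is one possible approach rather than the method the calculator itself uses. A statistics library would normally provide this directly.

```python
from math import exp, lgamma, log


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Even step of the recurrence
        coef = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        d = 1.0 + coef * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + coef / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence
        coef = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        d = 1.0 + coef * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + coef / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """I_x(a, b) for a, b > 0 and 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_survival(f_value, df_1, df_2):
    """P(F > f_value) for an F-distribution with (df_1, df_2) degrees of freedom."""
    z = df_1 * f_value / (df_1 * f_value + df_2)
    return 1.0 - regularized_incomplete_beta(df_1 / 2.0, df_2 / 2.0, z)


def f_critical(alpha, df_1, df_2):
    """Value c with P(F > c) = alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while f_survival(hi, df_1, df_2) > alpha:
        hi *= 2.0  # widen the bracket until the tail is below alpha
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f_survival(mid, df_1, df_2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


critical = f_critical(0.05, 4, 40)
print(round(critical, 3))  # compare with the tabulated 2.606
```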

FAQs

What is the difference between an F-test and a T-test?

The F-test and the T-test differ in several ways:

The T-test checks the significance of a single explanatory variable, whereas the F-test checks the joint significance of several explanatory variables (or of the whole model).
While the T-test is used to compare the means of two populations, the F-test is applied to compare two population variances.
The T-statistic follows the Student's t-distribution under the null hypothesis, while the F-statistic follows the F-distribution.
While the T-test is a univariate hypothesis test used when the population standard deviation is unknown, the F-test is applied to test the equality of the variances of two normal populations.

Can an F-statistic be negative?

No. Since variances can never be negative (they are built from squared values), neither the numerator nor the denominator of the F-statistic formula can be negative, so the F-value cannot be negative either.

What is a high F-statistic?

While a large F-value tends to indicate that the null hypothesis can be rejected, you can confidently reject the null only if the F-value is larger than its critical value.

Is the F-distribution symmetric?

No. The curve of the F-distribution is not symmetrical but skewed to the right (the curve has a long tail on its right side), where the shape of the curve depends on the degrees of freedom.

How to calculate F-statistic?

To calculate the F-statistic, in general, follow the steps below.

State the null hypothesis and the alternative hypothesis.
Determine the F-value. In regression, use F = [(SSE₁ – SSE₂) / m] / [SSE₂ / (n−k)], where SSE₁ and SSE₂ are the residual sums of squares of the restricted and full models, m is the number of restrictions, n is the number of observations, and k is the number of coefficients in the full model. In an analysis of variance, the F-value is the between-group mean square divided by the within-group mean square.
Find the critical value in the F-table for the chosen significance level and degrees of freedom.
Compare the F-value with the critical value, and reject or fail to reject the null hypothesis.

What is the F-statistic of two populations with variances of 10 and 5?

The F-statistic of two populations with variances of 10 and 5 is 2. To get this result, it suffices to divide the two variances.
