When we are given data, we try to validate it and analyze it mathematically. Raw data on its own tells us little; we must extract the details from it. Statistical hypothesis testing is one important step that we rely upon to make a decision or reach a conclusion.
A hypothesis test is a method of statistical inference used to assess whether a hypothesis about a population parameter is valid. Hypothesis tests are used in various applications, including medical testing, quality control and manufacturing, criminology, psychology, and the social sciences.
A hypothesis test does not prove that a statement about a population parameter is true or false. Instead, it assesses whether the sample data provide enough evidence against the statement. To do this, it compares the observed value of a sample statistic with the value expected under the hypothesis.
For example, suppose someone claims that the sale of umbrellas increases by 40 percent in the rainy season, or that a particular vaccine is effective against certain viruses. In both cases we need statistical evidence to reject the claim or fail to reject it. Reaching that evidence-based conclusion is called hypothesis testing.
Let us discuss it further to gain more clarity on it.
Types of Hypotheses
- The Null Hypothesis
- The Alternative Hypothesis
Null Hypothesis
The null hypothesis is the baseline assumption that there is no effect or no difference, for example that two data sets have the same mean. It treats everything as the same or equal. It is the default position that we hold until the data give us enough evidence against it. It is denoted by H0.
Alternate Hypothesis
The alternate hypothesis is the statement that contradicts the null hypothesis: it says that there is an effect or a difference. It usually represents the idea we actually want to find evidence for. If the data are unlikely under the null hypothesis, we reject the null hypothesis in favor of the alternate one. It is denoted by H1 or Ha.
We are done with the introductory part. Let’s dive deep into the topic to learn more about hypothesis testing.
Here are some important terms related to hypothesis testing which we must know.
- Population
- Sample
- P-value
- Significance level
- One-tailed test
- Two-tailed test
- Type 1 and Type 2 Error
Population (N) – It refers to the entire group about which we want to draw a conclusion.
Sample (n) – It refers to the smaller set of elements selected from the population. We perform the statistical test on the sample and draw a conclusion about the entire population, because it is usually impractical to experiment on the whole population.
P value – It is the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. It is not the probability that the null hypothesis is true. Suppose we have a P value of 0.01; it means that if the null hypothesis were true, only about 1 in 100 experiments would produce a result this extreme.
The significance level (α)
It is the threshold at which we reject the null hypothesis. It is the probability of rejecting the null hypothesis when it is actually true, and it is chosen before the test is run.
If our P value is less than or equal to the significance level, we reject the null hypothesis in favor of the alternate hypothesis. Otherwise, we fail to reject the null hypothesis.
Suppose we have α=0.05; it means that we have a 95% confidence level and accept a 5% chance of rejecting the null hypothesis when it is true.
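The relationship between the P value and α can be sketched in a few lines of Python. This is a minimal illustration that assumes a test statistic following the standard normal distribution; the value `z_observed = 2.2` and the helper name `two_sided_p_value` are made up for the example.

```python
from statistics import NormalDist

def two_sided_p_value(z: float) -> float:
    """Probability, under H0, of a standard normal value at least as extreme as |z|."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05        # significance level chosen before looking at the data
z_observed = 2.2    # hypothetical test statistic, for illustration only

p_value = two_sided_p_value(z_observed)
reject_null = p_value <= alpha  # decision rule: reject H0 when p <= alpha

print(f"p = {p_value:.4f}, reject H0: {reject_null}")
```

Here the P value is below 0.05, so the null hypothesis would be rejected at the 5% level.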
The One-Tailed Test
It is also known as the directional test. It checks for an effect in one specified direction only, so the critical region lies in one tail of the distribution.
For example, suppose we want to test whether the average age of our employees is below 30, and the average age in our sample is 29. This is a one-tailed test because we only care about one direction: whether the average age is below 30. The null hypothesis would be that the average age is 30 or more, and the alternate hypothesis would be that it is below 30.
Two-Tailed Test
In the two-tailed test, we consider both tails (upper and lower) of the distribution. The critical region is two-sided. The null hypothesis gets rejected if the test statistic falls in either tail, that is, if it is far enough from the hypothesized value in either direction.
For example,
Let us say we want to test if there is a significant difference between the average SAT scores of public and private school students. In this case, we have two groups (public and private schools) and one continuous variable (SAT score). We will use a two-tailed test because a difference in either direction matters. Our null hypothesis would be that there is no difference between the average SAT scores of students who attended public schools and students who attended private schools. The alternate hypothesis would be that there is a difference between the average SAT scores.
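The difference between the two kinds of test shows up in the critical values. The sketch below assumes a standard normal test statistic; the function name `critical_values` and the labels "lower", "upper" and "two-sided" are chosen only for this illustration.

```python
from statistics import NormalDist

def critical_values(alpha: float, tail: str):
    """Return (lower, upper) critical z-values; None marks a tail that is not tested."""
    std_normal = NormalDist()
    if tail == "two-sided":
        z = std_normal.inv_cdf(1 - alpha / 2)  # alpha is split across both tails
        return (-z, z)
    if tail == "lower":
        return (std_normal.inv_cdf(alpha), None)  # all of alpha in the left tail
    if tail == "upper":
        return (None, std_normal.inv_cdf(1 - alpha))  # all of alpha in the right tail
    raise ValueError("tail must be 'two-sided', 'lower' or 'upper'")

for tail in ("two-sided", "lower", "upper"):
    print(tail, critical_values(0.05, tail))
```

At α = 0.05 the two-sided cutoffs are about ±1.96, while a one-sided test places its single cutoff closer to zero, at about 1.645 in the tested direction.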
Type 1 and Type 2 Error
Two types of errors can occur when we test a hypothesis.
Type 1 Error: A Type-I error occurs when the sample results lead us to reject the null hypothesis even though it is true.
Type 2 Error: A Type-II error occurs when we fail to reject the null hypothesis even though it is false.
Suppose an instructor evaluates a paper to decide whether a student passes or fails.
H0: Candidate has passed
H1: Candidate has failed
A Type I error occurs when the instructor fails the candidate [rejects H0] although the candidate scored the passing marks [H0 was true].
A Type II error occurs when the instructor passes the candidate [fails to reject H0] although the candidate did not score the passing marks [H1 is true].
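Both error rates can be estimated by simulation. The sketch below is only an illustration under assumed settings: a two-tailed z-test with a null mean of 100, a known standard deviation of 15, samples of 30, and a hypothetical true mean of 107.5 for the case where the null hypothesis is false. None of these numbers come from the instructor example.

```python
import random
from statistics import NormalDist

def rejection_rate(true_mean, null_mean=100.0, sigma=15.0, n=30,
                   alpha=0.05, trials=4000, seed=0):
    """Fraction of simulated samples in which a two-tailed z-test rejects H0."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (sum(sample) / n - null_mean) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true here, so every rejection is a Type I error.
type_1_rate = rejection_rate(true_mean=100.0)

# H0 is false here, so every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=107.5)

print(f"Type I rate: {type_1_rate:.3f}, Type II rate: {type_2_rate:.3f}")
```

The estimated Type I rate should land close to the chosen α, while the Type II rate depends on how far the true mean is from the null value and on the sample size.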
Now that we have discussed the essential terminology, let us see the types of tests and the kind of statistics they work upon.
- Z-test -> comparison of means (large sample or known population standard deviation)
- Student's T-test -> comparison of means (small sample, unknown population standard deviation)
- ANOVA test -> comparison of means across more than two groups, based on analysis of variance
- Chi-Square test -> association between categorical variables
- F test -> comparison of variances
So far, we have discussed the key terminology and the types of tests that can be performed.
Now let us look at the steps of hypothesis testing for more clarity.
Step 1 – State the null hypothesis (H0) as well as the alternate hypothesis (H1/Ha).
Step 2 – State the significance level (the value of α).
Step 3 – Select a representative sample from the population.
Step 4 – Perform the appropriate test (Z-test, T-test, etc.) based on the nature of the dataset and obtain a P value.
Step 5 – Compare the obtained P value with the significance level α.
Step 6 – Based on the outcome of step 5, decide whether to reject or fail to reject the null hypothesis stated in step 1: reject it if the P value is less than or equal to α.
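The steps above can be sketched as one small function. This is a minimal illustration for a two-tailed one-sample z-test with a known population standard deviation; the function name `one_sample_z_test` and the sample of package weights are hypothetical.

```python
from statistics import NormalDist, mean

def one_sample_z_test(sample, null_mean, sigma, alpha=0.05):
    """Two-tailed one-sample z-test with a known population standard deviation."""
    n = len(sample)
    # Step 4: compute the test statistic and its P value.
    z = (mean(sample) - null_mean) / (sigma / n ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Steps 5 and 6: compare the P value with alpha and decide.
    return z, p_value, p_value <= alpha

# Steps 1 to 3 (hypothetical): H0 says the mean weight is 500 g, H1 says it is not,
# alpha is 0.05, and eight packages were sampled.
weights = [498.2, 501.1, 499.5, 497.8, 500.4, 496.9, 499.0, 498.5]
z, p_value, reject_null = one_sample_z_test(weights, null_mean=500.0, sigma=2.0)

print(f"z = {z:.3f}, p = {p_value:.3f}, reject H0: {reject_null}")
```

With these made-up weights the P value is above 0.05, so we would fail to reject the null hypothesis.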
To sum up our entire learning, we will see one example of how to implement everything that we learned in this article.
Problem statement
The average IQ of students in a class is 100, with a standard deviation of 13. A researcher thinks a particular kind of training will positively or negatively affect the IQ level of students. A sample of 35 students who took part in the training program was chosen, and an IQ test was performed. The mean IQ level of the sample was 143. Test the hypothesis that the training program affected the IQ level of the students.
Solution:
Step 1: State the null hypothesis: H0: μ = 100 and the alternate hypothesis: H1: μ ≠ 100.
Step 2: State our alpha level. We’ll use 0.05 for this example. As this is a two-tailed test, split the alpha between the two tails: 0.05/2 = 0.025.
Step 3: Find the z-score associated with our alpha level. Each tail holds an area of 0.025, so we look up the z-score for a cumulative area of 0.975 (1 - 0.025 = 0.975), which is 1.96. As this is a two-tailed test, we also consider the left tail (z = -1.96).
Step 4: Find the test statistic using this formula: z = (x̄ – μ0) / (σ/√n) = (143 – 100) / (13/√35) ≈ 19.57.
Step 5: If the test statistic from Step 4 is less than -1.96 or greater than 1.96 (Step 3), reject the null hypothesis. In this case, 19.57 is far greater than 1.96, so we can reject the null.
Therefore, we can conclude that the training program had a statistically significant effect on the IQ level of the students. Since the sample mean is higher than 100, the effect was positive.
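The arithmetic in the solution can be checked with a short script. It uses the values from the solution steps (hypothesized mean 100, standard deviation 13, sample size 35, sample mean 143); the variable names are chosen for this sketch.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 100         # hypothesized population mean under H0
sigma = 13         # population standard deviation
n = 35             # sample size
sample_mean = 143  # mean IQ observed after training
alpha = 0.05       # significance level, two-tailed

# Step 3: critical value that leaves alpha/2 in each tail.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Step 4: test statistic.
z = (sample_mean - mu_0) / (sigma / sqrt(n))

# Step 5: reject H0 when the statistic falls in either tail.
reject_null = abs(z) > z_critical

print(f"z = {z:.2f}, critical value = {z_critical:.2f}, reject H0: {reject_null}")
```

The statistic is far beyond the critical value, so the script agrees with rejecting the null hypothesis.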
Process of Hypothesis Testing Involves:
- Setting up the null and alternative hypotheses.
- Selecting an appropriate test statistic.
- Determining the level of significance.
- Calculating the p-value.
- Making a decision based on the p-value.
Conclusion
Hypothesis testing is a crucial statistical tool used to make inferences about population parameters based on sample data. Every test is framed around two competing statements: the null hypothesis and the alternate hypothesis.
Hypothesis testing plays a critical role in various fields, including medicine, economics, psychology, and sociology, to name a few. It helps researchers and analysts draw conclusions from data and make informed decisions based on evidence. While there are limitations to hypothesis testing, such as the potential for Type I and Type II errors, it remains an essential tool for drawing conclusions from data and advancing scientific knowledge.