Basic concepts in Testing of Hypothesis

Tests of Hypothesis

Sampling theory

             Sampling is a part of our day-to-day life. Sampling is preferred to complete enumeration because it is less time-consuming, less expensive, and more accurate and reliable. Moreover, in some cases sampling is the only possible method of data collection.

Population

            A population is a group of items, units or subjects which is under reference of study. It may consist of a finite or an infinite number of units (the universe).

Sample

         A sample is a part or fraction of a population selected on some basis. It consists of a few items of the population. In principle, a sample should be a true representative of the population. Usually a random sample is selected. By random sampling, we mean sampling in which each and every unit of the population has an equal and independent chance of being included in the sample.
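As a small sketch, simple random sampling can be simulated in Python; the population values and sample size below are hypothetical.

```python
import random

# Hypothetical population of 10 labelled units
population = list(range(1, 11))

random.seed(0)  # fixed seed so the draw is reproducible

# Draw a simple random sample of 4 units without replacement:
# every unit has the same chance of being included in the sample.
sample = random.sample(population, k=4)
print(sample)
```

Note that sampling without replacement gives every unit an equal chance of inclusion; for strictly independent draws one would sample with replacement (e.g. `random.choices`).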

       In order to draw inferences about a phenomenon, sampling is a well-accepted tool. The entire population cannot be studied, for the reasons stated above, and in such situations sampling is used. A properly drawn sample is very useful in drawing reliable conclusions. Here we draw a sample from a probability distribution rather than from a group of objects.

Random sample from a continuous probability distribution

              A random sample from a continuous probability distribution f(x, ϴ) is nothing but the set of values of independent and identically distributed (i.i.d.) random variables with the common probability distribution f(x, ϴ).

Definition: If x1, x2, ----, xn are i.i.d. random variables with p.d.f. f(x, ϴ), then they form a random sample from the population with p.d.f. f(x, ϴ).

Note: i) For drawing inference we use the numerical values of x1, x2, ----, xn.

ii) By independence, the joint p.d.f. of x1, x2, ----, xn is the product of the individual p.d.f.s: f(x1, ϴ) · f(x2, ϴ) · --- · f(xn, ϴ).

Statistic

              Using the random sample x1, x2, ----, xn we draw conclusions about the unknown probability distribution. However, the probability distribution can be studied only if the parameter ϴ is known, so we use the sample observations for this purpose. The sample observations are summarized, and the summarized quantity is called a statistic (estimator).

Definition: If x1, x2, ----, xn is a random sample from a probability distribution f(x, ϴ), then t = t(x1, x2, ----, xn), a function of the sample values which does not involve the unknown parameter ϴ, is called a statistic (estimator).

Parameter

               A function based on population values is called a parameter. If f(x, ϴ) is a p.d.f., then the constant ϴ involved in it is called a parameter. A statistic is a random variable, while a parameter is a constant. Since a statistic is a random variable, it possesses some probability distribution, which need not be the same as that of the parent distribution f(x, ϴ). A parameter, being a constant, does not possess a probability distribution.

Estimator  

               An estimator is a function of sample values for estimating the population parameter. A particular value of an estimator, obtained from a fixed set of values of a random sample, is known as an estimate. An estimate stands for the value of a parameter. For example, the sample mean x̄ is an estimate of the population mean µ, and the sample variance S² is an estimate of the population variance σ².
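The sample mean and sample variance mentioned above can be computed directly; the mileage readings below are hypothetical illustrative data.

```python
import statistics

# Hypothetical sample of mileage readings (km per litre)
x = [48.2, 51.0, 49.5, 50.3, 47.8, 52.1]

x_bar = statistics.mean(x)    # estimate of the population mean mu
s2 = statistics.variance(x)   # estimate of the population variance sigma^2 (divisor n - 1)

print(x_bar, s2)
```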

Unbiased Estimate

                                  A statistic or estimator t is said to be an unbiased estimate of the population parameter ϴ if E(t) = ϴ, i.e. E(statistic or estimator) = parameter.
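A quick Monte Carlo sketch of unbiasedness, assuming a uniform(0, 10) population (so the true mean is µ = 5): averaging the sample mean over many repeated samples should come out close to µ, reflecting E(x̄) = µ.

```python
import random
import statistics

random.seed(1)
mu = 5.0          # true population mean of the uniform(0, 10) distribution
estimates = []
for _ in range(5000):
    # draw a sample of size 20 and record the sample mean
    sample = [random.uniform(0, 10) for _ in range(20)]
    estimates.append(statistics.mean(sample))

# The average of the 5000 estimates approximates E(x-bar), which equals mu
print(statistics.mean(estimates))
```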

Sampling distribution of statistic and Standard error

                   If x1, x2, ----, xn is a random sample from a probability distribution f(x, ϴ), then the probability distribution of T = t(x1, x2, ----, xn) is called its sampling distribution, and the standard deviation of T is called its standard error (S.E.).

                      
In testing of hypothesis, the standard error of T is important. Some typical statistics along with their standard errors are:

S.E.(x̄) = σ/√n
S.E.(p) = √(PQ/n), where Q = 1 − P
S.E.(x̄1 − x̄2) = √(σ1²/n1 + σ2²/n2)
S.E.(p1 − p2) = √(P1Q1/n1 + P2Q2/n2)

where p1 and p2 are proportions obtained using two samples from two populations with proportions P1 and P2.
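These standard-error formulas are easy to evaluate numerically; the values of σ, n, P1, P2 etc. below are hypothetical.

```python
import math

sigma, n = 4.0, 64
se_mean = sigma / math.sqrt(n)            # S.E. of the sample mean: 4/8 = 0.5

P = 0.4                                   # hypothetical population proportion
se_prop = math.sqrt(P * (1 - P) / n)      # S.E. of a sample proportion, Q = 1 - P

P1, n1, P2, n2 = 0.4, 100, 0.5, 120
se_diff = math.sqrt(P1 * (1 - P1) / n1 + P2 * (1 - P2) / n2)  # S.E. of p1 - p2

print(se_mean, se_prop, se_diff)
```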

***

Tests of significance

      A very important aspect of sampling theory is the study of tests of significance. By tests of significance, we decide, on the basis of sample results, whether the deviation between the observed sample statistic and the parameter value, or the deviation between two independent sample statistics, is significant or insignificant (i.e. due to chance or sampling fluctuations).

Hypothesis

A definite statement about a population parameter is called a hypothesis (a hypothesis is a claim to be tested). For example: a particular scooter gives an average of 50 km per litre; the proportion of unemployed persons is the same in two different states; the average life of an article produced by company A is greater than that of company B.

Null Hypothesis

A hypothesis of no difference is called the null hypothesis. Alternatively, the null hypothesis is the hypothesis which is tested for possible rejection under the assumption that it is true (Prof. R. A. Fisher). For example, in the case of a single statistic, H0 will be that the sample statistic does not differ significantly from the parameter value, i.e. H0: μ = μ0, and in the case of two statistics, H0 will be that the two sample statistics do not differ significantly, i.e. H0: μ1 = μ2.

 Choice of null hypothesis

i) A hypothesis whose faulty rejection is more harmful.

ii) A hypothesis under which we can find the probability distribution of the test statistic.

Alternative Hypothesis

Any hypothesis which is complementary to the null hypothesis is called an alternative hypothesis. It is denoted by H1. For example, if H0: μ = μ0, i.e. the population has a specified mean μ0, then the alternative hypothesis could be

i) H1: μ ≠ μ0 (i.e. μ > μ0 or μ < μ0)   ii) H1: μ > μ0   iii) H1: μ < μ0

The alternative hypothesis in (i) is known as a two-sided (two-tailed) alternative, while those in (ii) and (iii) are known as one-sided alternatives: right-tailed and left-tailed respectively.
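As a sketch of how the three alternatives lead to different p-values, here is a large-sample z-test of H0: μ = μ0 for the scooter-mileage example; the sample figures (x̄ = 48.5, σ = 4, n = 64) are hypothetical.

```python
import math

def phi(z):
    # standard normal cumulative distribution function via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

x_bar, mu0, sigma, n = 48.5, 50.0, 4.0, 64
z = (x_bar - mu0) / (sigma / math.sqrt(n))   # test statistic; equals -3.0 here

p_two_sided = 2 * (1 - phi(abs(z)))  # H1: mu != mu0 (two-tailed)
p_right = 1 - phi(z)                 # H1: mu > mu0  (right-tailed)
p_left = phi(z)                      # H1: mu < mu0  (left-tailed)

print(z, p_two_sided, p_right, p_left)
```

With z = −3, the two-tailed p-value is about 0.0027, so H0 would be rejected at the usual 5% level against the two-sided alternative.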
