B. Sc. III Semester V

Subject-Statistics-XI

DSE-E15: Sampling Theory

Theory: 36 Hours. (Credit 02)

Unit-1: Simple and Stratified Random Sampling

Introduction:

Sampling is used quite often in day-to-day life. For example, in a shop we assess the quality of sugar, wheat or any other commodity by taking a handful of it from the bag, and then decide whether or not to purchase it. A housewife normally tastes a small portion of the cooked food to find out whether it is properly cooked and contains the right quantity of salt.

In sampling theory we first define the following terms.

i)  Population (Universe)

The group of individuals under study, that is, the totality of the objects of study, is called the population or universe. For example, if we are going to study the economic conditions of primary teachers in Maharashtra state, then all the primary teachers in Maharashtra state together form the population or universe for the study. A population may be a group of men, animals, trees, electric bulbs, cars, etc.

ii) Sample

A finite subset of individuals in a population is called a sample and the number of individuals in a sample is called the sample size. (A part of the population is called sample).

iii) Census method

The method of collecting data from the entire population is called the census method. If the census method is followed in the above example, we have to collect data about the economic conditions of every primary teacher in Maharashtra state.

iv) Sampling method

If, instead of studying the entire population, only a part of it is studied, the method is called the sampling method. Thus, if the sampling method is used in the above example, we would study the economic conditions of a few properly selected primary teachers and then estimate the results for all the teachers. In short, if the data are collected from a selected few members, it is called the sampling method.

Advantages of sampling method over census method

i) Time

If the population is large (as it generally is), then studying the entire population requires a lot of time, not only for collecting the data but also for analyzing them. In contrast, collecting and analyzing a sample greatly reduces the time required. In cases where the results are required quickly, the census method is not used.

ii) Cost

The study of the entire population is also very costly. Since in a sample survey only a part of the population is studied, the cost involved is correspondingly smaller. The sampling method is therefore much more economical than the census method.

iii) Reliability (Accuracy) 

Since only a part of the population is studied in a sample, a number of precautions can be taken and a very careful investigation can be made. In the census method, on the other hand, information may be lost or recorded wrongly on account of the large size of the population. Because the sample is small, it is possible to check both the information collected and the results obtained during analysis. All this leads to increased reliability of the sampling method.

iv) Details of information

Again, since the size of the sample is small, every member of the sample can be studied rigorously and detailed information can be obtained about it.

v) In some cases sampling is the only possible method

In certain investigations the census method cannot be used at all, typically because the test destroys the unit or because the whole cannot be examined. Examples are testing the blood of a human body, inspecting crackers or explosive materials, and measuring the lifetime of electric components. In such cases sampling is the only possible method. For all these reasons the sampling method is generally preferred to the census method.

Some concepts in sampling

Distinguishable Elementary Unit

The ultimate unit in a population which is distinguishable and identifiable is called an elementary unit. For example, in a population census every individual person is an elementary unit.

The population under study must be divided into small parts called sampling units, or simply units. The sampling units together must cover the entire population and they must not overlap. For example, in a socio-economic survey a family is a sampling unit, whereas in a health survey an individual is the sampling unit. In a population of light bulbs, the unit is a single bulb. In sampling an agricultural crop, the unit may be a field or an area of land whose size and shape are immaterial. Thus a sampling unit is the smallest part of the population which cannot be further subdivided for the purpose of the survey, and it may consist of one or more elementary units of the population. In short, a well-defined and identifiable element or group of elements on which observations can be made is called a sampling unit.

 Sampling Frame

In order to cover the entire population, there should be some list or map called the sampling frame. It is an exhaustive list of all members or elements of population. It gives guidelines to cover the entire population.

As the sampling frame determines the structure of the sample survey, it must be up-to-date and non overlapping. In a socio economic survey, frame may be determined from the records at Gram panchayat or ration cards.

Samples can be selected in two ways.

Random sampling

In this method, the sample is selected impartially. Personal or any other kind of bias in selection is avoided and a purely statistical approach is used. Because these methods are least affected by personal bias, they are widely used in practice. Random sampling is also referred to as probability sampling, since the laws of probability can be applied to a randomly selected sample.

Advantages

i) Random sampling does not need the detailed information about the population for its effectiveness.

ii) It provides estimates whose precision can be measured.

iii) It is possible to evaluate the relative efficiency of various sample designs only when random sampling is used.

 Limitations

i) It requires a very high level of skills and experience for its use.

ii) To plan and to execute a random sample, a lot of time is required.

iii) As compared to non random sampling, the cost involved in random sampling is large.

 Due to these limitations, non random sampling is used quite often in practice.

Non random sampling

It is a process of sampling without randomization. A non random sample is selected on the basis of judgment or convenience and not on the basis of probability. The investigator selects elements in any manner that suits him. For example, he may select elements on a first come, first served basis.

To select candidates for a debate competition, suitable candidates will be chosen deliberately. This is purposive (non random) sampling. Similarly, in an advertisement campaign for cosmetics, a sample of youngsters will certainly be taken. Since the selection is subjective, this method is unscientific and the reliability of its results cannot be measured.

Convenient Sampling:

It involves selecting participants who are easily accessible and convenient to reach.

Example: A researcher wants to study the average amount spent on coffee per day. They stand outside a popular coffee shop and ask customers exiting the shop about their daily coffee expenses. This sample is convenient, but may not represent the entire population.

Purposive Sampling:

It involves selecting participants based on specific criteria or characteristics relevant to the research question.

Example: A researcher wants to study the experiences of entrepreneurs who have successfully launched startups. They select participants from a list of award-winning entrepreneurs, ensuring that the sample has the desired expertise and experience.

Judgment Sampling:

It involves selecting participants based on the researcher's expertise and knowledge about the population.

Example: A researcher wants to study the impact of a new teaching method on student learning outcomes. They select classrooms and students based on their knowledge of the schools and teachers, with the aim of obtaining a sample that represents the population.

 

Snowball sampling:

It is used to select participants for a study, particularly in cases where the population is hard to reach or hidden. It's called "snowball" because the sample size grows incrementally, like a snowball rolling down a hill, gathering more participants as it goes.

Example: Studying a rare disease, researchers start with a few patients (seeds) and ask them to refer other patients they know, creating a snowball effect to gather more participants.

 Quota sampling:

It is a non-probability sampling technique used in research to select participants that represent specific subgroups or characteristics of a population. In quota sampling, the population is divided into subgroups based on relevant characteristics such as age, gender, income, occupation, etc. The researcher then sets a quota (a specific number) for each subgroup, so that the sample reflects the diversity of the population. Participants are selected until these quotas are filled, often through convenience or snowball sampling methods.

Quota sampling is commonly used in market research, social sciences, and opinion polls, where the goal is to understand specific segments of the population rather than the entire population.
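The quota-filling step described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `quota_sample`, the `group_of` callback and the example respondents are hypothetical and do not come from the notes. Respondents are taken in the order in which they arrive (a convenience order), and each is accepted only while the quota of his or her subgroup is still open.

```python
def quota_sample(respondents, group_of, quotas):
    """Fill each subgroup's quota from respondents in arrival order.

    respondents: iterable of units, in the order they are met (assumption).
    group_of:    function returning the subgroup label of a unit.
    quotas:      dict mapping subgroup label -> number of units wanted.
    """
    remaining = dict(quotas)          # copy so the caller's dict is untouched
    selected = []
    for person in respondents:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if all(count == 0 for count in remaining.values()):
            break                     # every quota is filled
    return selected


# Hypothetical respondents: (name, age group)
arrivals = [("A", "young"), ("B", "young"), ("C", "young"),
            ("D", "old"), ("E", "old")]
sample = quota_sample(arrivals, lambda person: person[1],
                      {"young": 2, "old": 1})
print(sample)
```

Note that no randomization is involved: who enters the sample depends entirely on the order of arrival, which is why quota sampling is classified as non random.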

Note: 1. As a sample is selected in order to study the population, it should represent all the important characteristics of the population. Thus a sample is a miniature of the population.

2. Sampling units should be independent.

3. The sample should be evenly spread over the population. This can be achieved by dividing the population into homogeneous subgroups and selecting a sample from each subgroup.

 

            

Methods of sampling

There are various methods used to select a sample from the population. We shall study the following methods of sampling.

Simple random sampling (SRS)

             In this method, each item in the population has an equal and independent chance of being selected in the sample.

Suppose we take a sample of size n, without replacement, from a finite population of size N. Then there are NCn = N!/(n!(N-n)!) possible samples. A sampling method in which each of the NCn samples has an equal chance of being selected is known as simple random sampling, and the sample obtained by this method is called a simple random sample. The following methods are commonly used for selecting a simple random sample.

Lottery method

In this method, the numbers or the names of all the members of the population are written on separate pieces of paper of the same size, shape and color. The pieces are folded in the same manner, mixed up thoroughly in a drum, and the required number of pieces is drawn blindly. All this ensures that each member of the population has an equal opportunity of being included in the sample. The method is used for drawing the prizes of a lottery, hence the name.
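The lottery method can be imitated on a computer. The following is a minimal sketch, assuming Python's standard `random` module as the "drum"; the function name `lottery_draw` and the `seed` argument are illustrative choices, not part of the notes.

```python
import random


def lottery_draw(population, n, seed=None):
    """Draw n members by shuffling all the 'slips' and taking the first n."""
    rng = random.Random(seed)      # seed only makes the sketch reproducible
    slips = list(population)       # one slip of paper per member
    rng.shuffle(slips)             # mix the slips thoroughly in the drum
    return slips[:n]               # draw the required number of slips


# Example: draw 5 members from a population numbered 1 to 50
winners = lottery_draw(range(1, 51), 5, seed=1)
print(winners)
```

Because a drawn slip is not put back, this corresponds to sampling without replacement.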

Table of random numbers

If the population is large, the lottery method is tedious to follow. An alternative is the method of random numbers. In this method, all the items are given serial numbers. Then a book of random numbers is opened at random and numbers are read off starting from any row and any column. Numbers that are larger than the population size, or that have already occurred, are ignored. The items bearing the remaining numbers are included in the sample.
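The reading-off procedure can be sketched as follows. This is an illustration only: a pseudo-random generator stands in for the printed book of random numbers, and the name `select_by_random_numbers` and the rule of discarding out-of-range or repeated numbers are assumptions made for the sketch.

```python
import random


def select_by_random_numbers(N, n, seed=None):
    """Select n distinct serial numbers from 1..N, as with a random number table."""
    if not 0 <= n <= N:
        raise ValueError("sample size must lie between 0 and N")
    rng = random.Random(seed)              # stands in for the printed table
    digits = len(str(N))                   # read numbers of this many digits
    chosen = []
    while len(chosen) < n:
        number = rng.randrange(10 ** digits)   # e.g. 00..99 when N has 2 digits
        # Discard numbers outside 1..N and numbers already selected
        if 1 <= number <= N and number not in chosen:
            chosen.append(number)
    return chosen


# Example: select 5 items from a population with serial numbers 1 to 50
print(select_by_random_numbers(50, 5, seed=3))
```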

  SRSWR and SRSWOR

If the units are selected one by one in such a way that a selected unit is replaced in the population before the next draw, the method is known as simple random sampling with replacement (SRSWR). If a unit selected once is not replaced in the population before the next draw, the method is known as simple random sampling without replacement (SRSWOR).
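The difference between the two schemes can be illustrated with Python's standard `random` module: `choices` draws with replacement and `sample` draws without replacement. The population and the seed below are arbitrary values chosen for the sketch.

```python
import random

rng = random.Random(42)            # arbitrary seed, for reproducibility only
population = [1, 2, 3, 4]

# SRSWR: a drawn unit is returned before the next draw, so repeats can occur
srswr_sample = rng.choices(population, k=2)

# SRSWOR: a drawn unit is not returned, so all units in the sample are distinct
srswor_sample = rng.sample(population, k=2)

print("SRSWR :", srswr_sample)
print("SRSWOR:", srswor_sample)
```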

For example, let a population of size N = 4 contain the units 1, 2, 3 and 4. The possible samples of size 2 are:

SRSWOR (order of selection ignored):

(1,2), (1,3), (1,4), (2,3), (2,4), (3,4)

Total = 4C2 = 6

SRSWR (order of selection taken into account):

(1,1), (1,2), (1,3), (1,4)

(2,1), (2,2), (2,3), (2,4)

(3,1), (3,2), (3,3), (3,4)

(4,1), (4,2), (4,3), (4,4)

Total = 4^2 = 16
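The two lists of possible samples can be checked by enumeration. The sketch below uses `itertools.combinations` for SRSWOR (unordered, no repeats) and `itertools.product` for SRSWR (ordered, repeats allowed); the variable names are illustrative.

```python
from itertools import combinations, product

units = [1, 2, 3, 4]   # population of size N = 4
n = 2                  # sample size

# SRSWOR: unordered samples without repetition, NCn of them
srswor_samples = list(combinations(units, n))

# SRSWR: ordered samples with repetition allowed, N^n of them
srswr_samples = list(product(units, repeat=n))

print(len(srswor_samples), srswor_samples)
print(len(srswr_samples), srswr_samples)
```

The counts agree with the totals 6 and 16 given in the example.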





                                                                      ***

Sampling of dichotomous attributes

Attributes are qualitative (non measurable) characteristics, for example honesty, gender, intelligence, etc.

Population may be divided into two or more classes according to attributes. An attribute which can be classified into two classes is called as dichotomous attribute and the classification is called dichotomous classification.

For ex: If attribute is gender then population is classified into two classes male and female.

 Notations:

Consider a population of N units divided into two mutually exclusive and exhaustive classes according to the given attribute.
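As a small illustration of working with a dichotomous attribute, the sketch below computes the proportion of sample units that fall in one of the two classes. The symbols used here (a for the number of sample units possessing the attribute, n for the sample size, p = a/n for the sample proportion) and the function name are assumptions made for this sketch, since the notation is not given in the text above.

```python
def sample_proportion(sample, has_attribute):
    """Return p = a / n, the proportion of sample units with the attribute.

    a = number of units in the sample possessing the attribute (assumed symbol)
    n = sample size (assumed symbol)
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    a = sum(1 for unit in sample if has_attribute(unit))
    return a / n


# Hypothetical sample of 5 units classified by gender
genders = ["M", "F", "F", "M", "F"]
p_female = sample_proportion(genders, lambda g: g == "F")
print(p_female)
```

Since the two classes are mutually exclusive and exhaustive, the proportion in the other class is 1 - p.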

                                                               


***

Determination of sample size

 Introduction

In sampling theory, some of the most important questions that statisticians or researchers face while planning a sample survey are:

I) What should be the size of sample?

II) How large or small should be the sample that it may be representative of the whole population?

III) Does the selected sample have the smallest possible sampling error?

IV) How to determine the sample size for further statistical study?

Two important facts are considered when determining the sample size:

I) If sample size is too small, it may not serve to achieve the objective of the study.

II) If sample size is too large, it may require huge money (cost of study), time and human resources.

A sample with a small sampling error is considered a good representative of the population. The sample size must therefore be determined by a statistical procedure that keeps this error small.

Determination of sample size

In planning a sample survey, a stage is always reached at which a decision must be made about the size of the sample. The first problem the statistician faces is to determine the sample size so that the unknown population parameter can be estimated with a specified degree of precision.

Estimation always involves some error. Let d be the margin of error in estimation. We want to find the sample size for which the error in estimation stays within this margin. This can be done in the following two cases:

I) When the margin of error (d) and the confidence coefficient (1-α) are known (pre-specified).

II) When the coefficient of variation (C.V.) and the confidence coefficient (1-α) are known.
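The two cases can be sketched numerically. The formulas used below are the usual normal-approximation results for estimating a population mean under simple random sampling and are an assumption of this sketch, since the formulas themselves do not appear in the text above: n0 = (z S / d)^2 when the margin of error d and the population standard deviation S are given, n0 = (z C.V. / r)^2 when the coefficient of variation and a relative margin of error r are given, and n = n0 / (1 + n0 / N) when a finite population of size N is sampled without replacement. Here z is the upper α/2 point of the standard normal distribution, and all function and parameter names are illustrative.

```python
import math
from statistics import NormalDist


def z_value(alpha):
    """Upper alpha/2 point of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def finite_population_adjust(n0, N):
    """Adjust n0 for SRSWOR from a finite population of size N (assumed formula)."""
    return n0 if N is None else n0 / (1 + n0 / N)


def sample_size_margin(S, d, alpha, N=None):
    """Case I: margin of error d and confidence coefficient 1 - alpha given."""
    n0 = (z_value(alpha) * S / d) ** 2
    return math.ceil(finite_population_adjust(n0, N))


def sample_size_cv(cv, r, alpha, N=None):
    """Case II: coefficient of variation cv (as a fraction) and relative error r given."""
    n0 = (z_value(alpha) * cv / r) ** 2
    return math.ceil(finite_population_adjust(n0, N))


# Hypothetical inputs: S = 10, d = 1, 95% confidence
print(sample_size_margin(S=10, d=1, alpha=0.05))            # large population
print(sample_size_margin(S=10, d=1, alpha=0.05, N=1000))    # N = 1000
print(sample_size_cv(cv=0.5, r=0.05, alpha=0.05))
```

The result is rounded up, because rounding down would give slightly less precision than specified.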

 

                  

    

 

    

                                                

          

         



 
