Monday, 11 February 2019

Sampling Design



Census and Sample Survey

  • All items in any field of inquiry constitute a ‘Universe’ or ‘Population.’

  • A complete enumeration of all items in the ‘population’ is known as a census inquiry.

When only some of the items (a sample) are studied instead, the inquiry is a sample survey. The researcher must then prepare a sample design for the study, i.e., plan how the sample should be selected and what size it should be.

Implications of a Sample Design

·         A sample design is a definite plan for obtaining a sample from a given population.

·         It refers to the technique or the procedure the researcher would adopt in selecting items for the sample.

·         Sample design may as well lay down the number of items to be included in the sample, i.e., the size of the sample. Sample design is determined before data are collected.

STEPS IN SAMPLE DESIGN

While developing a sampling design, the researcher must pay attention to the following points:

a)      Type of universe: The first step in developing any sample design is to clearly define the set of objects, technically called the Universe, to be studied. The universe can be finite or infinite. In a finite universe the number of items is certain, but in an infinite universe the number of items is infinite, i.e., we cannot have any idea about the total number of items. The population of a city, the number of workers in a factory and the like are examples of finite universes, whereas the number of stars in the sky, the listeners of a specific radio programme, the throws of a die etc. are examples of infinite universes.


b)      Sampling unit: A decision has to be taken concerning the sampling unit before selecting the sample. The sampling unit may be a geographical one such as a state, district or village, a construction unit such as a house or flat, a social unit such as a family, club or school, or it may be an individual. The researcher will have to decide on one or more of such units to select for the study.



c)      Source list: This is also known as the ‘sampling frame’, from which the sample is to be drawn. It contains the names of all items of a universe (in case of a finite universe only). If a source list is not available, the researcher has to prepare it. Such a list should be comprehensive, correct, reliable and appropriate. It is extremely important for the source list to be as representative of the population as possible.


d)     Size of sample: This refers to the number of items to be selected from the universe to constitute a sample. The size of the sample should be neither excessively large nor too small. It should be optimum. An optimum sample is one which fulfills the requirements of efficiency, representativeness, reliability and flexibility. While deciding the size of the sample, the researcher must determine the desired precision as well as an acceptable confidence level for the estimate.

e)      Parameters of interest: In determining the sample design, one must consider the question of the specific population parameters which are of interest. For instance, we may be interested in estimating the proportion of persons with some characteristic in the population, or we may be interested in knowing some average or the other measure concerning the population. There may also be important sub-groups in the population about whom we would like to make estimates. All this has a strong impact upon the sample design we would accept.

f)       Budgetary constraint: Cost considerations, from a practical point of view, have a major impact upon decisions relating not only to the size of the sample but also to the type of sample. This fact can even lead to the use of a non-probability sample.

g)      Sampling procedure: Finally, the researcher must decide the type of sample to use, i.e., the technique to be used in selecting the items for the sample. In fact, this technique or procedure stands for the sample design itself. There are several sample designs (described below) out of which the researcher must choose one for the study. Obviously, the design to select is the one which, for a given sample size and for a given cost, has the smaller sampling error.
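
As a rough illustration of point (d), the sketch below turns a desired precision and confidence level into a sample size for estimating a proportion. It assumes simple random sampling and uses the common textbook approximation n = z^2 p(1 - p) / e^2 with an optional finite population correction. Neither the formula nor the function name `sample_size_for_proportion` comes from the text above, so treat it as one possible starting point rather than a rule.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, p=0.5, population=None):
    """Approximate sample size needed to estimate a proportion.

    margin     -- desired precision (half-width of the interval), e.g. 0.05
    confidence -- desired confidence level, e.g. 0.95
    p          -- anticipated proportion; 0.5 is the most cautious guess
    population -- size of a finite universe, or None for a very large one
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite population correction: a small universe needs fewer items.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


print(sample_size_for_proportion(0.05))                   # very large universe
print(sample_size_for_proportion(0.05, population=2000))  # finite universe
```

With a precision of 0.05 and a 95 per cent confidence level the first call gives 385; the second call gives a smaller number because the universe is finite.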

Criteria of Selecting a Sampling Procedure

In this context one must remember that two costs are involved in a sampling analysis viz.,

          i.            the cost of collecting the data and
        ii.            the cost of an incorrect inference resulting from the data.



The researcher must keep in view the two causes of incorrect inferences viz., systematic bias and sampling error. A systematic bias results from errors in the sampling procedures, and it cannot be reduced or eliminated by increasing the sample size. At best the causes responsible for these errors can be detected and corrected. Usually a systematic bias is the result of one or more of the following factors:

a)      Inappropriate sampling frame: If the sampling frame is inappropriate i.e., a biased representation of the universe, it will result in a systematic bias.

b)     Defective measuring device: If the measuring device is constantly in error, it will result in systematic bias. In survey work, systematic bias can result if the questionnaire or the interviewer is biased. Similarly, if the physical measuring device is defective there will be systematic bias in the data collected through such a measuring device.

c)      Non-respondents: If we are unable to sample all the individuals initially included in the sample, there may arise a systematic bias. The reason is that in such a situation the likelihood of establishing contact or receiving a response from an individual is often correlated with the measure of what is to be estimated.

d)     Indeterminacy principle: Sometimes we find that individuals act differently when kept under observation than they do when not observed. For instance, if workers are aware that somebody is observing them in the course of a work study, on the basis of which the average length of time to complete a task will be determined and the quota for piece work set accordingly, they generally tend to work more slowly than they do when unobserved. Thus, the indeterminacy principle may also be a cause of a systematic bias.

e)      Natural bias in the reporting of data: Natural bias of respondents in the reporting of data is often the cause of a systematic bias in many inquiries. There is usually a downward bias in the income data collected by a government taxation department, whereas we find an upward bias in the income data collected by some social organisation. People in general understate their incomes if asked about them for tax purposes, but they overstate them if asked in a context of social status or affluence. Generally in psychological surveys, people tend to give what they think is the ‘correct’ answer rather than revealing their true feelings.


Sampling errors are the random variations in the sample estimates around the true population parameters. Since they occur randomly and are equally likely to be in either direction, they tend to compensate one another, and the expected value of such errors is zero. Sampling error decreases as the size of the sample increases, and it is smaller when the population is homogeneous.
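
The behaviour of sampling error described above can be checked with a small simulation. The sketch below is only an illustration: the two populations are made-up normal values, and `spread_of_sample_means` is a hypothetical helper that draws repeated simple random samples and reports how much the sample means vary.

```python
import random
import statistics


def spread_of_sample_means(population, sample_size, repeats=500, seed=0):
    """Standard deviation of the means of repeated simple random samples."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


# Two made-up populations with the same mean but different variability.
gen = random.Random(42)
heterogeneous = [gen.gauss(50, 10) for _ in range(10_000)]
homogeneous = [gen.gauss(50, 2) for _ in range(10_000)]

small_sample = spread_of_sample_means(heterogeneous, 25)
large_sample = spread_of_sample_means(heterogeneous, 400)
small_homogeneous = spread_of_sample_means(homogeneous, 25)

print(small_sample, large_sample, small_homogeneous)
```

The spread for samples of 400 should come out well below the spread for samples of 25, and the homogeneous population should show a smaller spread than the heterogeneous one at the same sample size.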



Characteristics of a Good Sample Design

From what has been stated above, we can list down the characteristics of a good sample design as under:

o  Sample design must result in a truly representative sample.

o  Sample design must result in a small sampling error.

o  Sample design must be viable in the context of funds available for the research study.

o  Sample design must be such that systematic bias can be controlled in a better way.

o  Sample should be such that the results of the sample study can be applied, in general, for the universe with a reasonable level of confidence.


Different Types of Sample Designs

There are different types of sample designs based on two factors viz., the representation basis and the element selection technique. On the representation basis, a sample may be a probability sample or a non-probability sample. Probability sampling is based on the concept of random selection, whereas non-probability sampling is ‘non-random’ sampling. On the element selection basis, the sample may be either unrestricted or restricted. When each sample element is drawn individually from the population at large, the sample so drawn is known as an ‘unrestricted sample’, whereas all other forms of sampling are covered under the term ‘restricted sampling’. The following chart exhibits the sample designs as explained above.

Thus, sample designs are basically of two types viz., non-probability sampling and probability sampling. We take up these two designs separately.






CHART SHOWING BASIC SAMPLING DESIGNS

Element selection technique | Representation basis: probability sampling | Representation basis: non-probability sampling
Unrestricted sampling | Simple random sampling | Haphazard sampling or convenience sampling
Restricted sampling | Complex random sampling (such as cluster sampling, systematic sampling, stratified sampling etc.) | Purposive sampling (such as quota sampling, judgement sampling)


a)      Non-probability sampling:

  • Non-probability sampling is that sampling procedure which does not afford any basis for estimating the probability that each item in the population has of being included in the sample.

  • Non-probability sampling is also known by different names such as deliberate sampling, purposive sampling and judgement sampling.

  • In this type of sampling, items for the sample are selected deliberately by the researcher; his choice concerning the items remains supreme.

  • In other words, under non-probability sampling the organisers of the inquiry purposively choose the particular units of the universe for constituting a sample on the basis that the small mass that they so select out of a huge one will be typical or representative of the whole.

  • For instance, if economic conditions of people living in a state are to be studied, a few towns and villages may be purposively selected for intensive study on the principle that they can be representative of the entire state.

  • In such a design, personal element has a great chance of entering into the selection of the sample.

  • The investigator may select a sample which shall yield results favourable to his point of view and if that happens, the entire inquiry may get vitiated. Thus, there is always the danger of bias entering into this type of sampling technique.

  • But if the investigators are impartial, work without bias and have the necessary experience to exercise sound judgement, the results obtained from an analysis of a deliberately selected sample may be tolerably reliable.

  • However, in such a sampling, there is no assurance that every element has some specifiable chance of being included.

  • Sampling error in this type of sampling cannot be estimated and the element of bias, great or small, is always there. As such this sampling design is rarely adopted in large inquiries of importance. However, in small inquiries and researches by individuals, this design may be adopted because of the relative advantage of time and money inherent in this method of sampling.

  • Quota sampling is also an example of non-probability sampling. Under quota sampling the interviewers are simply given quotas to be filled from the different strata, with some restrictions on how they are to be filled.

  • In other words, the actual selection of the items for the sample is left to the interviewer’s discretion. This type of sampling is very convenient and is relatively inexpensive.

  • But the samples so selected certainly do not possess the characteristic of random samples. Quota samples are essentially judgement samples and inferences drawn on their basis are not amenable to statistical treatment in a formal way.

b)      Probability sampling:

·         Probability sampling is also known as ‘random sampling’ or ‘chance sampling’.


·         Under this sampling design, every item of the universe has a known, non-zero chance of inclusion in the sample; in simple random sampling that chance is equal for every item.

·         It is, so to say, a lottery method in which individual units are picked up from the whole group not deliberately but by some mechanical process.

·         Here it is blind chance alone that determines whether one item or the other is selected.

·         The results obtained from probability or random sampling can be assured in terms of probability i.e., we can measure the errors of estimation or the significance of results obtained from a random sample, and this fact brings out the superiority of random sampling design over the deliberate sampling design.

·         Random sampling rests on the law of Statistical Regularity, which states that a sample chosen at random will, on average, have the same composition and characteristics as the universe.

·         This is the reason why random sampling is considered as the best technique of selecting a representative sample.

·         Random sampling from a finite population refers to that method of sample selection which gives each possible sample combination an equal probability of being picked and each item in the entire population an equal chance of being included in the sample.

·         This applies to sampling without replacement, i.e., once an item is selected for the sample, it cannot appear in the sample again. (Sampling with replacement, in which the element selected for the sample is returned to the population before the next element is selected, is used less frequently. In that procedure the same element could appear more than once in the same sample.)

In brief, the implications of random sampling (or simple random sampling) are:

§  It gives each element in the population an equal probability of getting into the sample; and all choices are independent of one another.

§  It gives each possible sample combination an equal probability of being chosen.
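
A minimal sketch of simple random sampling without replacement, using Python's standard `random.sample`. The population of 20 numbered items and the helper name `simple_random_sample` are made up for illustration.

```python
import math
import random


def simple_random_sample(items, n, seed=None):
    """Draw n distinct items so that every combination of n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(items, n)  # sampling without replacement


population = list(range(1, 21))  # a small finite universe of 20 items
sample = simple_random_sample(population, 5, seed=1)

# Number of distinct samples of size 5 that could have been drawn.
possible_samples = math.comb(len(population), 5)

print(sample)
print(possible_samples)  # 15504
```

Each of the 15504 possible combinations has the same probability of being the one drawn, which is the second implication listed above.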


Random Sample from an Infinite Universe

So far we have talked about random sampling, keeping in view only finite populations. But what about random sampling in the context of infinite populations? It is relatively difficult to explain the concept of a random sample from an infinite population. However, a few examples will show the basic characteristic of such a sample. Suppose we consider 20 throws of a fair die as a sample from the hypothetically infinite population which consists of the results of all possible throws of the die. If the probability of getting a particular number, say 1, is the same for each throw and the 20 throws are all independent, then we say that the sample is random. Similarly, sampling with replacement from a finite population is treated as sampling from an infinite population, and the sample is considered a random sample if in each draw all elements of the population have the same probability of being selected and successive draws are independent. In brief, one can say that the selection of each item in a random sample from an infinite population is controlled by the same probabilities and that successive selections are independent of one another.
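
Both situations in the paragraph above can be imitated in a few lines. This is only a sketch: the seed and the five-item population are arbitrary choices made for the example.

```python
import random

rng = random.Random(2019)  # fixed seed so the run is repeatable

# 20 independent throws of a fair die: every face has the same
# probability on every throw.
throws = [rng.randint(1, 6) for _ in range(20)]

# Sampling with replacement from a finite population: every draw is made
# from the full population, so successive draws are independent and the
# same item may be drawn more than once.
population = ["A", "B", "C", "D", "E"]
draws = rng.choices(population, k=8)

print(throws)
print(draws)
```

Because eight draws are taken from only five items, at least one item is bound to repeat, which cannot happen when sampling without replacement.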

Complex Random Sampling Designs

Probability sampling under restricted sampling techniques, as stated above, may result in complex random sampling designs. Such designs may as well be called ‘mixed sampling designs’ for many of such designs may represent a combination of probability and non-probability sampling procedures in selecting a sample. Some of the popular complex random sampling designs are as follows:

a)      Systematic sampling:

·         In some instances, the most practical way of sampling is to select every i-th item on a list. Sampling of this type is known as systematic sampling.

·         An element of randomness is introduced into this kind of sampling by using random numbers to pick up the unit with which to start. For instance, if a 4 per cent sample is desired, the first item would be selected randomly from the first twenty-five and thereafter every 25th item would automatically be included in the sample.

·         Thus, in systematic sampling only the first unit is selected randomly and the remaining units of the sample are selected at fixed intervals. Although a systematic sample is not a random sample in the strict sense of the term, it is often considered reasonable to treat a systematic sample as if it were a random sample.

·         Systematic sampling has certain plus points. It can be taken as an improvement over a simple random sample inasmuch as the systematic sample is spread more evenly over the entire population.

·         It is an easier and less costly method of sampling and can be conveniently used even in case of large populations.

·         But there are certain dangers too in using this type of sampling. If there is a hidden periodicity in the population, systematic sampling will prove to be an inefficient method of sampling.

·         For instance, suppose every 25th item produced by a certain production process is defective. If we are to select a 4% sample of the items of this process in a systematic manner, we would either get all defective items or all good items in our sample depending upon the random starting position.

·         If all elements of the universe are ordered in a manner representative of the total population, i.e., the population list is in random order, systematic sampling is considered equivalent to random sampling.

·         But if this is not so, then the results of such sampling may, at times, not be very reliable. In practice, systematic sampling is used when lists of population are available and they are of considerable length.
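
The 4 per cent example and the periodicity danger can both be reproduced with a short sketch. The list of 1000 items and the helper name `systematic_sample` are assumptions made for illustration.

```python
import random


def systematic_sample(items, interval, seed=None):
    """Pick a random start within the first `interval` items, then every interval-th item."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # the only random choice in the procedure
    return items[start::interval]


# A 4 per cent sample corresponds to taking every 25th item.
items = list(range(1, 1001))
sample = systematic_sample(items, 25, seed=3)

# Hidden periodicity: suppose every 25th item produced is defective.
is_defective = [item % 25 == 0 for item in items]
flags = systematic_sample(is_defective, 25, seed=3)

print(len(sample))  # 40 items out of 1000
print(set(flags))   # a single value: all defective or all good
```

Whatever the random start, the sampled items share the same position in the production cycle, so the sample contains either only defective items or only good ones.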






b)     Stratified sampling:

·         If a population from which a sample is to be drawn does not constitute a homogeneous group, stratified sampling technique is generally applied in order to obtain a representative sample.

·         Under stratified sampling the population is divided into several sub-populations that are individually more homogeneous than the total population (the different sub-populations are called ‘strata’) and then we select items from each stratum to constitute a sample.

·         Since each stratum is more homogeneous than the total population, we are able to get more precise estimates for each stratum and by estimating more accurately each of the component parts, we get a better estimate of the whole.

·         In brief, stratified sampling results in more reliable and detailed information.
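
A minimal sketch of stratified sampling follows. The text does not say how many items to take from each stratum; the sketch assumes proportional allocation (the same fraction from every stratum), and the strata of firms are invented for the example.

```python
import random


def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the given fraction from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        size = max(1, round(len(members) * fraction))  # at least one item per stratum
        sample[name] = rng.sample(members, size)
    return sample


# Hypothetical population split into three more homogeneous sub-populations.
strata = {
    "small firms": [f"S{i}" for i in range(600)],
    "medium firms": [f"M{i}" for i in range(300)],
    "large firms": [f"L{i}" for i in range(100)],
}

picked = stratified_sample(strata, 0.10, seed=7)
sizes = {name: len(members) for name, members in picked.items()}
print(sizes)
```

Every stratum is represented in the sample in proportion to its size, which a single simple random sample of 100 firms would not guarantee.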


c)      Cluster sampling:

·         If the total area of interest happens to be a big one, a convenient way in which a sample can be taken is to divide the area into a number of smaller non-overlapping areas and then to randomly select a number of these smaller areas (usually called clusters), with the ultimate sample consisting of all (or samples of) units in these small areas or clusters.

·         Thus in cluster sampling the total population is divided into a number of relatively small subdivisions which are themselves clusters of still smaller units and then some of these clusters are randomly selected for inclusion in the overall sample.

·         Suppose we want to estimate the proportion of machine-parts in an inventory which are defective. Also assume that there are 20000 machine parts in the inventory at a given point of time, stored in 400 cases of 50 each. Now using cluster sampling, we would consider the 400 cases as clusters, randomly select ‘n’ cases and examine all the machine-parts in each randomly selected case.

·         Cluster sampling, no doubt, reduces cost by concentrating surveys in selected clusters. But it is certainly less precise than simple random sampling: there is not as much information in ‘n’ observations within a cluster as there is in ‘n’ randomly drawn observations. Cluster sampling is used only because of the economic advantage it possesses; estimates based on cluster samples are usually more reliable per unit cost.
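
The inventory example can be sketched as follows. The defect flags are simulated with an arbitrary 3 per cent rate purely so that there is something to inspect; the rate, the choice of 20 cases and the helper name `cluster_sample` are all assumptions.

```python
import random


def cluster_sample(cases, n_cases, seed=None):
    """Randomly select whole cases (clusters) and keep every part inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(cases), n_cases)
    return {case: cases[case] for case in chosen}


# 400 cases of 50 parts each; True marks a defective part (simulated data).
gen = random.Random(0)
inventory = {
    case: [gen.random() < 0.03 for _ in range(50)]
    for case in range(400)
}

chosen = cluster_sample(inventory, 20, seed=1)
inspected = [flag for parts in chosen.values() for flag in parts]
estimate = sum(inspected) / len(inspected)

print(len(inspected))  # 20 cases x 50 parts = 1000 parts examined
print(estimate)        # estimated proportion of defective parts
```

Only 20 cases have to be opened to examine 1000 parts, which is where the cost advantage comes from.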

d)     Area sampling:

·         If clusters happen to be some geographic subdivisions, in that case cluster sampling is better known as area sampling.

·         In other words, cluster designs, where the primary sampling unit represents a cluster of units based on geographic area, are distinguished as area sampling.

·         The plus and minus points of cluster sampling are also applicable to area sampling.


e)      Multi-stage sampling:

·         Multi-stage sampling is a further development of the principle of cluster sampling. Suppose we want to investigate the working efficiency of nationalised banks in India and we want to take a sample of few banks for this purpose.

·         The first stage is to select large primary sampling units such as states in a country. Then we may select certain districts and interview all banks in the chosen districts. This would represent a two-stage sampling design with the ultimate sampling units being clusters of districts.

·         If instead of taking a census of all banks within the selected districts, we select certain towns and interview all banks in the chosen towns, this would represent a three-stage sampling design.

·         If instead of taking a census of all banks within the selected towns, we randomly sample banks from each selected town, then it is a case of using a four-stage sampling plan.

·         If we select randomly at all stages, we will have what is known as ‘multi-stage random sampling design’.

·         Ordinarily multi-stage sampling is applied in big inquiries extending over a considerably large geographical area, say, the entire country. There are two advantages of this sampling design viz.,

(a) It is easier to administer than most single-stage designs, mainly because the sampling frame under multi-stage sampling is developed in partial units.

(b) A large number of units can be sampled for a given cost under multi-stage sampling because of sequential clustering, whereas this is not possible in most of the simple designs.
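
The four-stage plan for the bank example can be sketched as nested random selection. Everything in the sketch is hypothetical: the frame of states, districts, towns and banks is generated with placeholder names, and the numbers taken at each stage are arbitrary.

```python
import random


def multi_stage_sample(frame, n_states, n_districts, n_towns, n_banks, seed=None):
    """Select randomly at every stage: states, then districts, then towns, then banks."""
    rng = random.Random(seed)
    selected = []
    for state in rng.sample(sorted(frame), n_states):
        districts = frame[state]
        for district in rng.sample(sorted(districts), n_districts):
            towns = districts[district]
            for town in rng.sample(sorted(towns), n_towns):
                for bank in rng.sample(towns[town], n_banks):
                    selected.append((state, district, town, bank))
    return selected


# Made-up frame: 8 states x 5 districts x 4 towns x 6 banks.
frame = {
    f"state{s}": {
        f"district{s}.{d}": {
            f"town{s}.{d}.{t}": [f"bank{s}.{d}.{t}.{b}" for b in range(6)]
            for t in range(4)
        }
        for d in range(5)
    }
    for s in range(8)
}

sample = multi_stage_sample(frame, 3, 2, 2, 2, seed=11)
print(len(sample))  # 3 x 2 x 2 x 2 = 24 banks
```

Note that lists of districts, towns and banks are needed only inside the units selected at the previous stage, which is what advantage (a) refers to.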

f)       Sequential sampling:

·         This is a somewhat complex sample design.

·         The ultimate size of the sample under this technique is not fixed in advance, but is determined according to mathematical decision rules on the basis of information yielded as the survey progresses.

·         This is usually adopted for acceptance sampling plans in the context of statistical quality control.

·         When a particular lot is to be accepted or rejected on the basis of a single sample, it is known as single sampling; when the decision is to be taken on the basis of two samples, it is known as double sampling and in case the decision rests on the basis of more than two samples but the number of samples is certain and decided in advance, the sampling is known as multiple sampling.

·         But when the number of samples is more than two but it is neither certain nor decided in advance, this type of system is often referred to as sequential sampling. Thus, in brief, we can say that in sequential sampling, one can go on taking samples one after another as long as one desires to do so.
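
The text does not say which mathematical decision rule is used. One well-known choice is Wald's sequential probability ratio test, and the sketch below assumes that rule. The acceptable and unacceptable defect rates (`p_good`, `p_bad`) and the error risks (`alpha`, `beta`) are made-up values, and `sequential_decision` is a hypothetical name.

```python
import math


def sequential_decision(observations, p_good=0.02, p_bad=0.10, alpha=0.05, beta=0.10):
    """Inspect items one at a time and stop as soon as the evidence is strong enough.

    observations -- iterable of booleans, True meaning the item is defective
    Returns (decision, items_inspected), where decision is "accept",
    "reject" or "continue" (evidence still inconclusive).
    """
    upper = math.log((1 - beta) / alpha)  # crossing it means: reject the lot
    lower = math.log(beta / (1 - alpha))  # crossing it means: accept the lot
    log_ratio = 0.0
    count = 0
    for defective in observations:
        count += 1
        if defective:
            log_ratio += math.log(p_bad / p_good)
        else:
            log_ratio += math.log((1 - p_bad) / (1 - p_good))
        if log_ratio >= upper:
            return "reject", count
        if log_ratio <= lower:
            return "accept", count
    return "continue", count


print(sequential_decision([True] * 5))    # ('reject', 2)
print(sequential_decision([False] * 10))  # ('continue', 10)
print(sequential_decision([False] * 60))
```

The number of items inspected is not fixed in advance: two defective items in a row are enough to reject, ten good items are not yet enough to accept, and a long run of good items eventually leads to acceptance.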


Conclusion

From the brief description of the various sample designs presented above, we can say that normally one should resort to simple random sampling because under it bias is generally eliminated and the sampling error can be estimated. But purposive sampling is considered more appropriate when the universe happens to be small and a known characteristic of it is to be studied intensively. There are situations in real life in which sample designs other than simple random sampling may be considered better (say easier to obtain, cheaper or more informative) and as such may be used. When random sampling is not possible, we must necessarily use a sampling design other than random sampling. At times, several methods of sampling may well be used in the same study.

