Sampling Design
Census and Sample Survey
- All items in any field of inquiry constitute a
‘Universe’ or ‘Population.’
- A complete enumeration of all items in the
‘population’ is known as a census inquiry
Researcher
must prepare a sample design for his study i.e., he must plan how a sample
should be selected and of what size such a sample would be.
Implications of a Sample Design
·
A sample
design is a definite plan for obtaining a sample from a given population.
·
It refers
to the technique or the procedure the researcher would adopt in selecting items
for the sample.
·
Sample design may as well lay down the number of items to be included
in the sample i.e., the size of the sample. Sample design is determined before
data are collected.
STEPS
IN SAMPLE DESIGN
While developing a sampling
design, the researcher must pay attention to the following points:
a) Type of universe: The first step in
developing any sample design is to clearly define the set of objects,
technically called the Universe, to be studied. The universe can be finite or
infinite. In a finite universe the number of items is certain, but in the case of an
infinite universe the number of items is infinite, i.e., we cannot have any
idea about the total number of items. The population of a city, the number of
workers in a factory and the like are examples of finite universes, whereas the
number of stars in the sky, the listeners of a specific radio programme, the throws
of a die etc. are examples of infinite universes.
b) Sampling unit: A decision has to be
taken concerning a sampling unit before selecting sample. Sampling unit may be a
geographical one such as state, district, village, etc., or a construction unit
such as house, flat, etc., or it may be a social unit such as family, club,
school, etc., or it may be an individual. The researcher will have to decide
one or more of such units that he has to select for his study.
c) Source list: It is also known as
‘sampling frame’ from which sample is to be drawn. It contains the names of
all items of a universe (in case of finite universe only). If source list is
not available, researcher has to prepare it. Such a list should be
comprehensive, correct, reliable and appropriate. It is extremely important for
the source list to be as representative of the population as possible.
d) Size of sample: This refers to the
number of items to be selected from the universe to constitute a sample. The size of sample
should neither be excessively large, nor too small. It should be optimum. An
optimum sample is one which fulfills the requirements of efficiency,
representativeness, reliability and flexibility. While deciding the size of
sample, researcher must determine the desired precision as also an acceptable
confidence level for the estimate.
e) Parameters of interest: In determining the
sample design, one must consider the question of the specific population parameters
which are of interest. For instance, we may be interested in estimating the
proportion of persons with some characteristic in the population, or we may be
interested in knowing some average or the other measure concerning the
population. There may also be important sub-groups in the population about whom
we would like to make estimates. All this has a
strong impact upon the sample design we would accept.
f) Budgetary constraint: Cost considerations,
from practical point of view, have a major impact upon decisions relating to not
only the size of the sample but also to the type of sample. This fact can even
lead to the use of a non-probability sample.
g) Sampling procedure: Finally, the
researcher must decide the type of sample he will use i.e., he must decide
about the technique to be used in selecting the items for the sample. In fact,
this technique or procedure stands for the sample design itself. There are
several sample designs (explained in the pages that follow) out of which the
researcher must choose one for his study. Obviously, he must select that design
which, for a given sample size and for a given cost, has a smaller sampling
error.
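Point (d) above, on choosing a sample size from a desired precision and confidence level, can be illustrated with a standard textbook formula for estimating a proportion. The formula is not given in these notes, so treat this as a sketch under assumptions: n0 = z^2 p(1 - p) / e^2, with an optional correction when the universe is finite. The function name and the default values (z = 1.96 for roughly 95% confidence, p = 0.5 as the most cautious guess) are illustrative choices.

```python
import math


def sample_size_for_proportion(margin, z=1.96, p=0.5, population=None):
    """Approximate sample size needed to estimate a proportion.

    margin     -- desired precision (half-width of the interval), e.g. 0.05
    z          -- normal quantile for the confidence level (1.96 ~ 95%)
    p          -- anticipated proportion; 0.5 gives the largest sample
    population -- size of a finite universe, or None for an infinite one
    """
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    if population is not None:
        # Finite population correction: fewer items are needed
        # when the universe itself is small.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


print(sample_size_for_proportion(0.05))
print(sample_size_for_proportion(0.05, population=2000))
```

A tighter margin or a higher confidence level raises the required size, which is where the budgetary constraint in point (f) starts to matter.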
Criteria of Selecting a Sampling Procedure
In this
context one must remember that two costs are involved in a sampling analysis
viz.,
i.
the cost of collecting the data and
ii.
the cost of an incorrect inference resulting from
the data.
Researcher
must keep in view the two causes of incorrect inferences viz., systematic bias
and sampling error. A systematic bias results
from errors in the sampling procedures, and it cannot be reduced or eliminated by increasing the sample size. At best
the causes responsible for these errors can be detected and corrected. Usually
a systematic bias is the result of one or more of the following factors:
a)
Inappropriate sampling frame: If the sampling frame
is inappropriate i.e., a biased representation of the universe, it will result in a
systematic bias.
b)
Defective measuring device: If the measuring
device is constantly in error, it will result in systematic bias. In survey work,
systematic bias can result if the questionnaire or the interviewer is biased.
Similarly, if the physical measuring device is defective there will be
systematic bias in the data collected through such a measuring device.
c)
Non-respondents: If we are unable to
sample all the individuals initially included in the sample, there may arise a
systematic bias. The reason is that in such a situation the likelihood of
establishing contact or receiving a response from an individual is often
correlated with the measure of what is to be estimated.
d)
Indeterminacy principle: Sometimes we find
that individuals act differently when kept under observation than they do when kept
in non-observed situations. For instance, if workers are aware that somebody is
observing them in the course of a work study on the basis of which the average
length of time to complete a task will be determined and accordingly the quota
will be set for piece work, they generally tend to work slowly in comparison to
the speed with which they work if kept unobserved. Thus, the indeterminacy
principle may also be a cause of a systematic bias.
e)
Natural bias in the reporting of data: Natural bias of respondents in the reporting of data is often the cause of a
systematic bias in many inquiries. There is usually a downward bias in the
income data collected by government taxation department, whereas we find an
upward bias in the income data collected by some social organisation. People in
general understate their incomes if asked about it for tax purposes, but they
overstate the same if asked for social status or their affluence. Generally in
psychological surveys, people tend to give what they think is the ‘correct’
answer rather than revealing their true feelings.
Sampling errors are the
random variations in the sample estimates around the true population parameters. Since they occur randomly
and are equally likely to be in either direction, their nature happens to be of
compensatory type and the expected value of such errors happens to be equal to
zero. Sampling error decreases with the increase in the size of the sample, and
it happens to be of a smaller magnitude in case of homogeneous population.
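The contrast between systematic bias and sampling error can be seen in a small simulation. This is a sketch with invented numbers: a synthetic population, and a hypothetical measuring device that always reads 2 units too high. The spread of the error should shrink as the sample grows, while the average error should stay near the device offset however large the sample is.

```python
import random
import statistics


def mean_error(population, n, trials, rng, device_offset=0.0):
    """Average (estimate - true mean) and its spread over repeated samples."""
    true_mean = statistics.fmean(population)
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, n)
        # device_offset models a measuring device that is constantly in error.
        estimate = statistics.fmean(x + device_offset for x in sample)
        errors.append(estimate - true_mean)
    return statistics.fmean(errors), statistics.pstdev(errors)


rng = random.Random(0)
population = [rng.gauss(100, 15) for _ in range(10_000)]

for n in (25, 100, 400):
    bias, spread = mean_error(population, n, 500, rng, device_offset=2.0)
    print(f"n={n:4d}  average error={bias:6.2f}  spread of error={spread:5.2f}")
```

Setting device_offset to 0.0 removes the systematic component, leaving only the compensating random variation described above.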
Characteristics of a Good Sample Design
From what
has been stated above, we can list down the characteristics of a good sample
design as under:
o Sample design must result in a
truly representative sample.
o Sample design must be such that it
results in a small sampling error.
o Sample design must be viable in
the context of funds available for the research study.
o Sample design must be such
that systematic bias can be controlled in a better way.
o Sample should be such that the
results of the sample study can be applied, in general, for the universe with a
reasonable level of confidence.
Different Types of Sample Designs
There are
different types of sample designs based on two factors viz., the representation
basis and the element selection technique. On the representation basis, the
sample may be probability sampling or it may be non-probability sampling.
Probability sampling is based on the concept of random selection, whereas
non-probability sampling is ‘non-random’ sampling. On element selection basis,
the sample may be either unrestricted or restricted. When each sample element
is drawn individually from the population at large, then the sample so drawn is
known as ‘unrestricted sample’, whereas all other forms of sampling are covered
under the term ‘restricted sampling’. The following chart exhibits the sample
designs as explained above.
Thus,
sample designs are basically of two types viz., non-probability sampling and
probability sampling. We take up these two designs separately.
CHART SHOWING BASIC SAMPLING DESIGNS
Element selection technique | Representation basis: Probability sampling | Representation basis: Non-probability sampling
Unrestricted sampling | Simple random sampling | Haphazard sampling or convenience sampling
Restricted sampling | Complex random sampling (such as cluster sampling, systematic sampling, stratified sampling etc.) | Purposive sampling (such as quota sampling, judgement sampling)
a)
Non-probability sampling:
- Non-probability sampling is
that sampling procedure which does not afford any basis for estimating the
probability that each item in the population has of being included in the
sample.
- Non-probability sampling is
also known by different names such as deliberate sampling, purposive
sampling and judgement sampling.
- In this type of sampling,
items for the sample are selected deliberately by the researcher; his
choice concerning the items remains supreme.
- In other words, under
non-probability sampling the organisers of the inquiry purposively choose
the particular units of the universe for constituting a sample on the
basis that the small mass that they so select out of a huge one will be
typical or representative of the whole.
- For instance, if economic
conditions of people living in a state are to be studied, a few towns and
villages may be purposively selected for intensive study on the principle
that they can be representative of the entire state.
- In such a design, personal element has a great
chance of entering into the selection of the sample.
- The investigator may select a sample which
shall yield results favourable to his point of view and if that happens,
the entire inquiry may get vitiated. Thus, there is always the danger of
bias entering into this type of sampling technique.
- But if the investigators are impartial, work
without bias and have the necessary experience so as to take sound
judgement, the results obtained from an analysis of a deliberately selected
sample may be tolerably reliable.
- However, in such a sampling, there is no
assurance that every element has some specifiable chance of being
included.
- Sampling error in this type of sampling cannot
be estimated and the element of bias, great or small, is always there. As
such this sampling design is rarely adopted in large inquiries of
importance. However, in small inquiries and researches by individuals,
this design may be adopted because of the relative advantage of time and
money inherent in this method of sampling.
- Quota
sampling is also an example of non-probability
sampling. Under quota sampling the interviewers are simply given quotas to
be filled from the different strata, with some restrictions on how they
are to be filled.
- In other words, the actual selection of the
items for the sample is left to the interviewer’s discretion. This type of
sampling is very convenient and is relatively inexpensive.
- But the samples so selected certainly do not
possess the characteristic of random samples. Quota samples are
essentially judgement samples and inferences drawn on their basis are not
amenable to statistical treatment in a formal way.
b)
Probability sampling:
·
Probability sampling is also known as
‘random sampling’ or ‘chance sampling’.
·
Under this sampling design, every item
of the universe has a known, non-zero chance of inclusion in the sample; under
simple random sampling that chance is equal for all items.
·
It is, so to say, a lottery method in
which individual units are picked up from the whole group not deliberately but
by some mechanical process.
·
Here it is blind chance alone that
determines whether one item or the other is selected.
·
The results obtained from probability
or random sampling can be assured in terms of probability i.e., we can measure
the errors of estimation or the significance of results obtained from a random
sample, and this fact brings out the superiority of random sampling design over
the deliberate sampling design.
·
Random sampling ensures the law of
Statistical Regularity which states that if on an average the sample chosen is
a random one, the sample will have the same composition and characteristics as
the universe.
·
This is the reason why random sampling
is considered as the best technique of selecting a representative sample.
·
Random sampling from a finite population refers to
that method of sample selection which gives each possible sample combination an
equal probability of being picked up and each item in the entire population to
have an equal chance of being included in the sample.
·
This applies to sampling without replacement i.e.,
once an item is selected for the sample, it cannot appear in the sample again.
(Sampling with replacement is used less frequently; in that procedure the
element selected for the sample is returned to the population before the next
element is selected. In such a situation the same element could appear
twice in the same sample.)
·
In brief, the
implications of random sampling (or simple random sampling) are:
§ It gives
each element in the population an equal probability of getting into the sample;
and all choices are independent of one another.
§ It gives each possible sample combination an
equal probability of being chosen.
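The two implications listed above can be checked by brute force on a tiny universe. The sketch below uses a hypothetical population of five labelled items and samples of size two; it enumerates every possible sample combination and counts how often each item appears, then draws one simple random sample without replacement.

```python
import random
from itertools import combinations

population = ["A", "B", "C", "D", "E"]  # hypothetical finite universe, N = 5
n = 2

# Every possible sample combination of size n (without replacement).
all_samples = list(combinations(population, n))
prob_each_sample = 1 / len(all_samples)

# Inclusion probability of each item = share of samples that contain it.
inclusion = {
    item: sum(item in s for s in all_samples) / len(all_samples)
    for item in population
}

print(len(all_samples), prob_each_sample)
print(inclusion)

# Drawing one simple random sample without replacement.
rng = random.Random(42)
print(rng.sample(population, n))
```

Each item turns up in the same share of the possible samples, namely n/N, which is what an equal chance of inclusion means here.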
Random Sample from an Infinite Universe
So far we
have talked about random sampling, keeping in view only the finite populations.
But what about random sampling in context of infinite populations? It is
relatively difficult to explain the concept of random sample from an infinite
population. However, a few examples will show the basic characteristic of such
a sample. Suppose we consider 20 throws of a fair die as a sample from the
hypothetically infinite population which consists of the results of all
possible throws of the die. If the probability of getting a
particular number, say 1, is the same for each throw and the 20 throws are all
independent, then we say that the sample is random. Similarly, it would be said
to be sampling from an infinite population if we sample with replacement from a
finite population and our sample would be considered as a random sample if in
each draw all elements of the population have the same probability of being
selected and successive draws happen to be independent. In brief, one can say
that the selection of each item in a random sample from an infinite population
is controlled by the same probabilities and that successive selections are
independent of one another.
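Both situations described in this section can be imitated with Python's standard random module. This is only an illustration: the seed and the five-item population are arbitrary choices, and a pseudo-random generator stands in for a physical die.

```python
import random

rng = random.Random(7)

# 20 independent throws of a fair die: each face has probability 1/6 on
# every throw, whatever happened on earlier throws.
throws = [rng.randint(1, 6) for _ in range(20)]

# Sampling with replacement from a finite population behaves the same way:
# every draw sees the full population with unchanged probabilities.
population = ["A", "B", "C", "D", "E"]  # hypothetical items
draws = rng.choices(population, k=20)

print(throws)
print(draws)
```

Because drawn items are returned before the next draw, the sample of 20 can be larger than the population of 5, which is why such sampling is treated as sampling from an infinite population.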
Complex Random Sampling Designs
Probability
sampling under restricted sampling techniques, as stated above, may result in
complex random sampling designs. Such designs may as well be called ‘mixed
sampling designs’ for many of such designs may represent a combination of
probability and non-probability sampling procedures in selecting a sample. Some
of the popular complex random sampling designs are as follows:
a)
Systematic sampling:
·
In some instances, the most practical
way of sampling is to select every ith item on a list. Sampling of this
type is known as systematic sampling.
·
An element of randomness is introduced into this kind of
sampling by using random numbers to pick up the unit with which to start. For
instance, if a 4 per cent sample is desired, the first item would be selected
randomly from the first twenty-five and thereafter every 25th item would
automatically be included in the sample.
·
Thus, in systematic sampling only the
first unit is selected randomly and the remaining units of the sample are
selected at fixed intervals. Although a systematic sample is not a random
sample in the strict sense of the term, it is often considered reasonable
to treat a systematic sample as if it were a random sample.
·
Systematic sampling has certain plus points. It can
be taken as an improvement over a simple random sample in as much as the
systematic sample is spread more evenly over the entire population.
·
It is an easier and less costly method of
sampling and can be conveniently used even in case of large populations.
·
But there are certain dangers too in using this
type of sampling. If there is a hidden periodicity in the population,
systematic sampling will prove to be an inefficient method of sampling.
·
For instance, suppose every 25th item produced by a certain
production process is defective. If we are to select a 4% sample of the items
of this process in a systematic manner, we would either get all defective items
or all good items in our sample depending upon the random starting position.
·
If all elements of the universe are ordered in a
manner representative of the total population, i.e., the population list is in
random order, systematic sampling is considered equivalent to random sampling.
·
But if this is not so, then the results of such
sampling may, at times, not be very reliable. In practice, systematic sampling
is used when lists of population are available and they are of considerable
length.
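The procedure and the periodicity danger described above can be sketched together. The production run below is hypothetical: 1000 items in which every 25th is defective, sampled at a 4 per cent rate (every 25th item after a random start). The function name is an illustrative choice.

```python
import random


def systematic_sample(items, k, rng):
    """Pick a random start among the first k items, then every k-th item."""
    start = rng.randrange(k)
    return items[start::k]


# Hypothetical production run of 1000 items in which every 25th is defective.
items = ["defective" if (i + 1) % 25 == 0 else "good" for i in range(1000)]

rng = random.Random(3)
sample = systematic_sample(items, 25, rng)  # a 4 per cent sample
print(len(sample), set(sample))
```

Whatever the starting position, the 40 sampled items are all of one kind, because the sampling interval coincides with the hidden period of the process.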
b)
Stratified sampling:
·
If a population from which a sample is
to be drawn does not constitute a homogeneous group, stratified sampling
technique is generally applied in order to obtain a representative sample.
·
Under stratified sampling the
population is divided into several sub-populations that are individually more
homogeneous than the total population (the different sub-populations are called
‘strata’) and then we select items from each stratum to constitute a sample.
·
Since each stratum is more homogeneous
than the total population, we are able to get more precise estimates for each
stratum and by estimating more accurately each of the component parts, we get a
better estimate of the whole.
·
In brief, stratified sampling results
in more reliable and detailed information.
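A minimal sketch of stratified selection follows. It assumes proportional allocation, that is, the same sampling fraction in every stratum; the notes above do not say how many items to take from each stratum, so this is only one possible rule. The workforce data and the function name are invented for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, rng):
    """Draw the same fraction at random from every stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical workforce: (worker id, grade) pairs.
workers = [(i, "skilled") for i in range(200)] + \
          [(i, "unskilled") for i in range(200, 1000)]

rng = random.Random(11)
chosen = stratified_sample(workers, lambda w: w[1], 0.10, rng)
print(len(chosen), sum(1 for w in chosen if w[1] == "skilled"))
```

With this rule each stratum is represented in the sample in the same proportion as in the population, something a simple random sample achieves only on average.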
c)
Cluster sampling:
·
If the total area of interest happens
to be a big one, a convenient way in which a sample can be taken is to
divide the area into a number of smaller non-overlapping areas and then to
randomly select a number of these smaller areas (usually called clusters), with
the ultimate sample consisting of all (or samples of) units in these small
areas or clusters.
·
Thus in cluster sampling the total population is
divided into a number of relatively small subdivisions which are themselves
clusters of still smaller units and then some of these clusters are randomly
selected for inclusion in the overall sample.
·
Suppose we want to estimate the proportion of
machine-parts in an inventory which are defective. Also assume that there are
20000 machine parts in the inventory at a given point of time, stored in 400
cases of 50 each. Now using cluster sampling, we would consider the 400 cases
as clusters and randomly select ‘n’
cases and examine all the machine-parts in each randomly selected case.
·
Cluster sampling, no doubt, reduces cost by
concentrating surveys in selected clusters. But certainly it is less precise
than random sampling. There is also not as much information in ‘n’ observations within a cluster as
there happens to be in ‘n’ randomly
drawn observations. Cluster sampling is used only because of the economic
advantage it possesses; estimates based on cluster samples are usually more
reliable per unit cost.
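The machine-parts example can be sketched as follows. The inventory is simulated: the 3% defect rate, the seed and the choice of 20 cases are invented for illustration and do not come from the notes.

```python
import random

rng = random.Random(5)

# Hypothetical inventory: 400 cases (clusters) of 50 parts each;
# True marks a defective part (about 3% here, an invented rate).
cases = [[rng.random() < 0.03 for _ in range(50)] for _ in range(400)]

n = 20  # number of cases to open; chosen arbitrarily for the sketch
selected = rng.sample(cases, n)

# Examine every part in each selected case.
inspected = [part for case in selected for part in case]
estimate = sum(inspected) / len(inspected)

print(len(inspected), round(estimate, 4))
```

Only 20 cases have to be opened to inspect 1000 parts, which is the cost advantage; the price is that parts within one case may resemble one another more than parts drawn at random from the whole inventory.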
d)
Area sampling:
·
If clusters happen to be some
geographic subdivisions, in that case cluster sampling is better known as area
sampling.
·
In other words, cluster designs, where
the primary sampling unit represents a cluster of units based on geographic
area, are distinguished as area sampling.
·
The plus and minus points of cluster sampling
are also applicable to area sampling.
e)
Multi-stage sampling:
·
Multi-stage sampling is a further
development of the principle of cluster sampling. Suppose we want to
investigate the working efficiency of nationalised banks in India and we want
to take a sample of few banks for this purpose.
·
The first stage is to select large
primary sampling units such as states in a
country. Then we may select certain districts and interview all banks in the
chosen districts. This would represent a two-stage sampling design with the
ultimate sampling units being clusters of districts.
·
If instead of taking a census of all banks within
the selected districts, we select certain towns and interview all banks in the
chosen towns, this would represent a three-stage sampling design.
·
If instead of taking a census of all banks within
the selected towns, we randomly sample banks from each selected town, then it
is a case of using a four-stage sampling plan.
·
If we select randomly at all stages, we will have
what is known as ‘multi-stage random sampling design’.
·
Ordinarily multi-stage sampling is applied in big
inquiries extending to a considerably large geographical area, say, the entire
country. There are two advantages of this sampling design viz.,
·
(a) It is easier to administer than most single stage
designs mainly because of the fact that the sampling frame under multi-stage
sampling is developed in partial units.
·
(b) A large number of units can be
sampled for a given cost under multi-stage sampling because of sequential clustering,
whereas this is not possible in most of the simple designs.
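The four-stage bank example can be sketched with a nested structure of states, districts, towns and banks. Everything here is hypothetical: the names, the counts at each level and the helper function. For simplicity the sketch builds the whole frame up front, whereas in practice lists would be prepared only for the units selected at the previous stage.

```python
import random


def multistage_sample(tree, sizes, rng):
    """Randomly select units level by level through a nested dict.

    tree  -- nested dicts; the innermost values are lists of final units
    sizes -- how many units to keep at each stage, outermost first
    """
    if isinstance(tree, list):  # final stage: sample the units themselves
        return rng.sample(tree, min(sizes[0], len(tree)))
    chosen = rng.sample(sorted(tree), min(sizes[0], len(tree)))
    result = []
    for key in chosen:
        result.extend(multistage_sample(tree[key], sizes[1:], rng))
    return result


# Hypothetical frame: states -> districts -> towns -> banks.
frame = {
    f"state{s}": {
        f"district{s}.{d}": {
            f"town{s}.{d}.{t}": [f"bank{s}.{d}.{t}.{b}" for b in range(6)]
            for t in range(5)
        }
        for d in range(4)
    }
    for s in range(3)
}

rng = random.Random(9)
banks = multistage_sample(frame, [2, 2, 3, 2], rng)
print(len(banks))
```

Because selection is random at every stage, this corresponds to what the notes call a multi-stage random sampling design.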
f)
Sequential sampling:
·
This sampling design is a somewhat
complex sample design.
·
The ultimate size of the sample under this technique
is not fixed in advance, but is determined according to mathematical decision
rules on the basis of information yielded as survey progresses.
·
This is usually adopted in case of
acceptance sampling plan in context of statistical quality control.
·
When a particular lot is to be accepted
or rejected on the basis of a single sample, it is known as single sampling;
when the decision is to be taken on the basis of two samples, it is known as
double sampling and in case the decision rests on the basis of more than two
samples but the number of samples is certain and decided in advance, the
sampling is known as multiple sampling.
·
But when the number of samples is more
than two but it is neither certain nor decided in advance, this type of system
is often referred to as sequential sampling. Thus, in brief, we can say that in
sequential sampling, one can go on taking samples one after another as long as
one desires to do so.
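The notes do not say which mathematical decision rules are meant. One well-known rule of this kind is Wald's sequential probability ratio test, used here purely as an assumed example for lot acceptance. The defect rates p0 and p1 and the risks alpha and beta are illustrative values, and the function name is hypothetical.

```python
import math


def sprt_accept_lot(inspections, p0=0.01, p1=0.05, alpha=0.05, beta=0.10):
    """Wald's sequential probability ratio test on a stream of inspections.

    inspections -- iterable of booleans, True meaning a defective item
    p0, p1      -- acceptable and unacceptable defect rates (assumed values)
    alpha, beta -- tolerated risks of wrongly rejecting / wrongly accepting
    Returns (decision, items_inspected); decision is "accept", "reject",
    or "undecided" if the stream ends before a boundary is crossed.
    """
    upper = math.log((1 - beta) / alpha)   # cross it: reject the lot
    lower = math.log(beta / (1 - alpha))   # cross it: accept the lot
    llr = 0.0
    count = 0
    for defective in inspections:
        count += 1
        if defective:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "reject", count
        if llr <= lower:
            return "accept", count
    return "undecided", count


print(sprt_accept_lot([False] * 200))
print(sprt_accept_lot([True, True, False, True]))
```

The number of items inspected is not fixed in advance: inspection stops as soon as the accumulated evidence crosses one of the two boundaries.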
Conclusion
From a
brief description of the various sample designs presented above, we can say
that normally one should resort to simple random sampling because under it bias
is generally eliminated and the sampling error can be estimated. But purposive
sampling is considered more appropriate when the universe happens to be small
and a known characteristic of it is to be studied intensively. There are
situations in real life under which sample designs other than simple random
samples may be considered better (say easier to obtain, cheaper or more informative)
and as such the same may be used. In a situation when random sampling is not
possible, then we have to use necessarily a sampling design other than random
sampling. At times, several methods of sampling may well be used in the same
study.