Sampling methods - YouTube
Channel: unknown
[5]
Welcome to Research Fundamental Modules, the
NIeCer course one.
[14]
Today, we are going to talk about some aspects
of Sampling Methods.
[22]
Sampling is really required, whenever you
are dealing with a very large population and
[28]
you want some quick information.
[31]
Now, look at the definition of sampling.
[35]
Sampling is a procedure by which some
members of the population are selected, and
[42]
they are supposed to be representative
of the entire population.
[47]
See, if you have a population of people like
this and you are looking at a portion of them,
[54]
that portion is called a Sample.
[58]
I would like to introduce you to some concepts
and one of them is the Study population.
[65]
What do you mean by Study population?
[68]
The study population is the population to
which the results of the study are to be inferred.
[77]
Say for example, how many injections do people
receive each year in India?
[84]
The study population in this case is the entire
population of India.
[88]
Suppose your research question is, how many
needle-sticks do health care workers experience
[95]
each year in India?
[96]
Then the study population becomes health care
workers of India.
[101]
Suppose your study question is, how many
hospitals have a needle-stick prevention
[107]
policy in India?
[108]
Then your study population in this case becomes
hospitals of India.
[115]
The sample which we select should be representative
of the population for which we require an
[123]
answer and this representation should be in
accordance with seasonality, the day of
[130]
the week, the time of the week.
[132]
Whether it is urban or rural, the sample
should also match the composition of
[138]
age, sex and other demographic characteristics
of the population.
[142]
See now: let us introduce you to some concepts
or terminologies that are often used in the
[152]
sampling parlance.
[155]
What do you mean by Sampling Unit?
[157]
Sometimes it is called basic sampling unit,
BSU.
[161]
These are the elementary units that will be
sampled; they could be people or health care
[168]
workers or hospitals, as we saw in our
earlier example.
[173]
What do you mean by Sampling Frame?
[175]
The sampling frame is a list of all sampling
units in the population and what do you mean
[183]
by Sampling Scheme?
[184]
The sampling scheme is a method used to select
sampling units from the sampling frame.
[191]
So now, let us look at why we should
sample populations rather than study them in full.
[202]
If you have enough resources, you can probably
study the entire population.
[208]
But still, even if you have the resources,
it is not wise to study the entire population
[217]
because often the population is very large,
and with a large population, when you are going
[223]
to collect information, one of the major constraints
could be time.
[228]
You may require a lot of time to collect information,
and you may say, I will employ a lot of people
[235]
to do that, but what would probably
happen is, if you have lots of people collecting
[240]
information, there could be a lot of inter-observer
variation, which could add a
[246]
tremendous amount of error, and unfortunately
you cannot measure the amount of such error.
[253]
So, it often happens that by doing
a sample survey you get more accurate information.
[264]
The information that you get from sample surveys
is often more accurate than the information you
[271]
get from large-scale studies of the whole population.
[275]
So, a population could be an entire universe,
whereas a sample could be a few selected small
[282]
regions.
[283]
Let us look at a practical example.
[287]
Suppose, the ministry of health of a country
X wants to estimate the proportion of children
[293]
in elementary schools, who have been immunized
against childhood infectious diseases.
[298]
You can just imagine the task: the proportion
of children in all the elementary schools of a country who
[305]
have been immunized against childhood infections.
[308]
So that is the task, but one of the conditions
that the ministry has set is that the task must be completed
[315]
in one month.
[316]
So, the objective is to estimate the proportion
of immunized children and you want the results
[321]
in a month's time.
[322]
Now, let us look at the different ways
in which you can get this information.
[329]
Or, in other words, what are the different types
of samples that you could draw?
[335]
See, broadly speaking the sample could be
a Non-probability Sample or a Probability
[345]
Sample.
[346]
What do you mean by a Non-probability Sample?
[347]
In a non-probability sample, the probability
of a unit being selected
[353]
for your study is not known.
[357]
It could be a convenience sample or a purposive
sample. You just pick whatever
[363]
region is convenient to you, close
by to your place; you go and
[368]
see the first 100 people that you come across,
and that could be a convenience sample.
[372]
What could happen?
[374]
That sample could be biased, or it can
give either a best-case or a worst-case scenario, because,
[379]
you know, it is a convenient location.
[383]
You may get results very different
from those of a location which is not very convenient,
[388]
or which is very remote and difficult to approach.
[391]
And also, some of these are very subjective
samples, and to derive objective criteria
[401]
from a subjective sample is always difficult.
[406]
But, nevertheless, these non-probability sampling
methods are still useful, and they are
[414]
extensively used, mainly to generate hypotheses
or to prepare for more systematic probability
[425]
samples.
[426]
Now let us look at, what do you mean by Probability
Samples?
[430]
In a probability sample, every unit in the
population has a known probability of being
[437]
selected.
[439]
What is the advantage?
[441]
This is the only sampling method that allows us to
draw valid conclusions about the population.
[447]
It removes the possibility of bias in selection
of subjects and also ensures that each subject
[455]
has a known probability of being chosen.
[459]
It allows application of statistical theory,
because many of the statistical tests that
[463]
you do insist on random sampling, and
these tests are valid only if the sample
[471]
is a random sample.
[474]
I would like to introduce you to the
concept called Sampling Error.
[479]
No sample is a perfect mirror image of the
population.
[483]
You know, whenever you pick a sample from
a population and look at the results,
[491]
they may not be exactly the same as the results
in the population.
[496]
But, fortunately, the magnitude of the error can
be measured in terms of probability in the
[504]
case of probability samples.
[507]
This is expressed by the standard error of a mean
or a proportion or a difference, and that is a
[515]
function of the sample size and the variability
in the measurement.
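As a rough illustration of how the standard error depends on the sample size and on the variability, here is a minimal Python sketch. It assumes simple random sampling from a large population (no finite population correction), and the function names and numbers are illustrative, not taken from the lecture.

```python
import math


def standard_error_of_mean(sample_sd: float, n: int) -> float:
    """Standard error of a sample mean: s / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sample_sd / math.sqrt(n)


def standard_error_of_proportion(p: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion p must be between 0 and 1")
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical numbers: 50% immunized in a sample of 100 children.
se_100 = standard_error_of_proportion(0.5, 100)
# Quadrupling the sample size halves the standard error.
se_400 = standard_error_of_proportion(0.5, 400)
```

The point of the comparison is that the error shrinks with the square root of the sample size, so halving the error needs four times as many units.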
[519]
So, sampling error is a very important component
in sampling theory, which helps us in determining
[529]
the sample size and so on.
[534]
Now, let us look at some of the popular sampling
methodologies that are employed in sample
[543]
surveys.
[545]
Let us look at the first, Simple random
sampling.
[551]
As the name suggests, it is a very simple sampling
procedure, very easy to understand, in which
[560]
every individual sampling unit has an
equal chance of being included in the sample.
[568]
How do you do that?
[570]
We number all the units and we randomly draw
units.
[574]
The advantages, as I mentioned, are that it is very
simple and the sampling error is also very easily
[586]
measured.
[587]
The major limitation is that you need to have
a complete list of all units.
[592]
Many times it may not be available, and also
sometimes you may get a sample which is
[599]
very different from the whole population and
not very representative of it.
[606]
See, an example of simple random sampling
could be: if you have a list of, say,
[612]
these 48 names, you pick the random numbers
9, 18, 32, and 40.
[617]
So the names at those positions are selected
as your sample.
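The drawing of random numbers from a numbered list can be sketched in Python as follows. This is only an illustration: the 48 placeholder names and the helper name `simple_random_sample` are assumptions, and the seed is there only to make the draw repeatable.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame, each with equal chance.

    Returns the 1-based positions drawn and the corresponding units.
    """
    if not 0 <= n <= len(frame):
        raise ValueError("n must be between 0 and the size of the frame")
    rng = random.Random(seed)
    # Number all the units 1..N and draw n distinct numbers at random.
    positions = sorted(rng.sample(range(1, len(frame) + 1), n))
    return positions, [frame[p - 1] for p in positions]


# Hypothetical sampling frame: a complete list of 48 names.
frame = [f"name_{i}" for i in range(1, 49)]
positions, selected = simple_random_sample(frame, 4, seed=1)
```

Note that the function cannot run without `frame`, which mirrors the limitation mentioned above: a complete list of units has to exist first.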
[621]
Now, the next sampling type is Systematic
Sampling.
[628]
In systematic sampling,
[631]
what is done is that the initial sampling unit
is picked at random and then every kth unit
[638]
after that in your population is examined.
[642]
A unit is drawn every k units, and there is an
equal chance of being selected for each
[650]
unit.
[651]
So, you calculate the sampling interval, called
k, which is N, the population size, divided by the
[658]
sample size that you require.
[660]
And you draw a random number which is less
than or equal to k for the starting point, and draw every
[667]
kth unit from that first unit.
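The interval-and-random-start procedure just described might look like this in Python. It is a minimal sketch under the assumption that N is at least the required sample size and that the interval is rounded down to a whole number; the helper name `systematic_sample` and the list of 48 houses are illustrative.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Systematic sample of n units from an ordered sampling frame.

    Returns the interval k, the random start (1-based), and the sample.
    """
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the size of the frame")
    k = N // n                      # sampling interval, rounded down
    rng = random.Random(seed)
    start = rng.randint(1, k)       # random number <= k (inclusive bounds)
    # Take the start unit, then every kth unit after it.
    positions = [start + i * k for i in range(n)]
    return k, start, [frame[p - 1] for p in positions]


# Hypothetical frame: 48 houses numbered along a street, 6 to be visited.
houses = list(range(1, 49))
k, start, sample = systematic_sample(houses, 6, seed=0)
```

With 48 houses and a sample of 6, the interval is 8, so a field worker only needs the starting house and the rule "every eighth house".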
[670]
What are the advantages?
[671]
It ensures representativeness across the list.
[675]
It is easy to implement.
[677]
You give a worker an instruction: you start
from this house and go to every 10th house,
[682]
and so on, until you have covered all the houses; it is
very easily done.
[686]
But if there is some sort of cycle in the specific
characteristic that you are studying, then
[693]
you might get a sample which is
very atypical in systematic sampling.
[699]
Also, some of the statistical measures that
you are going to compute are difficult
[705]
with systematic sampling,
because you do not have exact formulas; you
[710]
may have to use approximate formulas.
[713]
An example of systematic sampling: you
see, first, a red house is selected,
[721]
and then every eighth house from that is selected,
and all the red houses among these houses are
[731]
your selected sample.
[733]
There is a sampling method called Stratified
Sampling.
[739]
The principle of it is that you classify the population
into homogeneous subgroups, which are called
[745]
'strata', and you draw a sample from each
stratum and combine the results of all the strata
[752]
to get an idea of the whole population.
[755]
The advantage is that it is more precise if the
variable is associated with the strata, and all subgroups
[763]
are represented, allowing for separate conclusions
about each one of them.
[768]
Suppose a natural stratification is male and
female: you have an estimate for males,
[773]
you have an estimate for females, and you can
have a combined estimate for males and females,
[777]
for the whole population.
[779]
But the disadvantage is that sampling error is
difficult to measure, and there could be a loss
[784]
of precision if you have
a lot of strata and each stratum has
[789]
small numbers in it.
[793]
An example of stratified sampling: suppose
you want to estimate the vaccination coverage
[797]
in your country.
[798]
One sample is drawn from each region, north, east,
south and west, and the estimate is calculated
[804]
for each stratum, and at the end you
can weight the strata according to the size
[814]
of the regions.
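Weighting the stratum estimates by region size can be sketched as below. The region populations and coverage figures are made-up numbers for illustration only, and `stratified_estimate` is a hypothetical helper name.

```python
def stratified_estimate(strata):
    """Combine stratum estimates, weighting each by its population size.

    strata maps a stratum name to (population_size, stratum_estimate).
    """
    total = sum(size for size, _ in strata.values())
    if total <= 0:
        raise ValueError("total population size must be positive")
    return sum(size * estimate for size, estimate in strata.values()) / total


# Hypothetical regions: (population, estimated vaccination coverage).
regions = {
    "north": (4_000_000, 0.80),
    "east": (1_000_000, 0.60),
    "south": (3_000_000, 0.90),
    "west": (2_000_000, 0.70),
}

national = stratified_estimate(regions)
# A plain average of the four regional estimates ignores region size.
unweighted = sum(est for _, est in regions.values()) / len(regions)
```

With these invented figures the weighted national estimate is about 0.79, while the unweighted average of the four regions is about 0.75, which is why the weighting step matters when regions differ in size.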
[816]
Another important type of sampling, which
is very widely used in health surveys
[824]
in research, is called Cluster Sampling.
[829]
The principle of cluster sampling is that
a random sample of groups, or clusters, of
[834]
units is drawn, and all or a proportion of the units are included
from these selected clusters.
[841]
Its advantages are that it is simple, we do not
require a list of all units, and less travel
[850]
and fewer resources are required, because you are
going to select clusters and you are going
[855]
to look only within those clusters.
[857]
And the disadvantage is, if the clusters are
homogeneous then it may result in a large
[865]
design effect.
[868]
All the people in a cluster may have very
similar results, which could result in
[875]
a design effect, and sampling error is difficult
to measure in cluster sampling.
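The lecture does not give a formula for the design effect. One commonly used approximation, assuming clusters of roughly equal size, is 1 + (m - 1) * ICC, where m is the number of subjects per cluster and ICC is the intracluster correlation. The sketch below uses that approximation; the function names and the numbers are illustrative assumptions.

```python
def design_effect(cluster_size: int, icc: float) -> float:
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n: int, deff: float) -> float:
    """Size of a simple random sample giving about the same precision."""
    if deff <= 0:
        raise ValueError("design effect must be positive")
    return n / deff


# Hypothetical survey: 20 clusters of 30 children, ICC assumed to be 0.05.
deff = design_effect(cluster_size=30, icc=0.05)
n_effective = effective_sample_size(20 * 30, deff)
```

Under these assumed values the design effect is about 2.45, so the 600 children surveyed carry roughly the information of 245 children drawn by simple random sampling. The more alike the subjects within a cluster, the larger the ICC and the larger this loss.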
[883]
The sampling unit is not a subject, but a
group or a cluster of subjects.
[888]
The assumptions here are that the variability
among the clusters is minimal,
[895]
and the variability within each cluster is what
is observed in the general population.
[901]
Now, how is cluster sampling usually
done?
[905]
It is done as a two-stage approach.
[909]
In the first stage, clusters are selected with probability proportional
to size. That is: select the number of clusters
[915]
to be included; compute a cumulative list
of the population in each unit, with a
[921]
grand total; divide the grand total by the
number of clusters to obtain the sampling
[926]
interval; choose a random number between 1 and the sampling interval and identify
the first cluster; add the sampling interval
[932]
and identify the second cluster, and so on;
and by repeating the same procedure, identify
[938]
all the clusters.
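The first-stage steps above (cumulative list, sampling interval, random start, repeated addition of the interval) can be sketched in Python like this. It assumes whole-number populations and a random start between 1 and the interval; the helper name `pps_systematic` and the village populations are hypothetical.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def pps_systematic(populations, n_clusters, seed=None):
    """Select clusters with probability proportional to size.

    Returns the sampling interval, the random start, and the 0-based
    indices of the selected units (a very large unit can appear twice).
    """
    cumulative = list(accumulate(populations))   # cumulative population list
    grand_total = cumulative[-1]
    interval = grand_total // n_clusters         # sampling interval
    if interval < 1:
        raise ValueError("more clusters requested than people in the list")
    rng = random.Random(seed)
    start = rng.randint(1, interval)             # random number in 1..interval
    selected = []
    for i in range(n_clusters):
        target = start + i * interval            # add the interval each time
        # The unit whose cumulative range contains the target is selected.
        selected.append(bisect_left(cumulative, target))
    return interval, start, selected


# Hypothetical villages and their populations; 4 clusters wanted.
village_populations = [1000, 400, 200, 300, 1200, 900]
interval, start, chosen = pps_systematic(village_populations, 4, seed=7)
```

Because larger villages cover a wider stretch of the cumulative list, they are more likely to contain one of the targets, which is what "proportional to size" means here.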
[940]
Once your clusters are identified, then in
the second stage, in each cluster, you select
[946]
a random sample using the sampling frame,
because, as I mentioned earlier on
[953]
simple random sampling, when you want to do
simple random sampling you need to have
[958]
the complete list, that is, your sampling frame.
[961]
So, in a small cluster it is possible for
you to formulate the sampling frame and you
[967]
can select people from that sampling frame
on a random basis.
[974]
Another important sampling methodology employed
is called Multistage Sampling.
[983]
In multistage sampling, especially with a very large population where
you want some estimates at the national
[988]
level, you need to do the sampling in several
chained samples, and several statistical units
[996]
are there.
[997]
The advantage is that no complete listing
of the population is required, and it is the most
[1003]
feasible approach for large populations.
[1008]
The disadvantage is that there are several sampling
units, and the sampling error is at times very
[1014]
difficult to measure unless you follow certain very
specific methodologies for selecting at each
[1020]
stage.
[1021]
Some of the key issues that I would like to
bring to you is we cannot study the whole
[1028]
population so we sample it.
[1034]
Studying the whole population could result
in inaccurate results. Taking a sample leads
[1043]
to sampling error, but that is easily measurable;
we do not have a measure for non-sampling
[1052]
error, whereas we have a measure for sampling
error.
[1055]
Good design and quality assurance ensure validity,
while an appropriate sample size will ensure
[1063]
precision.
[1065]
Probability samples are the only ones that
allow the use of statistics as we know them,
[1071]
and so it is always an advantage to use a probability
sample, so that you can have a valid conclusion,
[1080]
a precise conclusion, and you can also employ
statistical tests on them.
[1084]
Thank you so much.