{"id":189,"date":"2015-07-12T17:06:59","date_gmt":"2015-07-12T07:06:59","guid":{"rendered":"http:\/\/www.statulator.com\/blogg\/?p=189"},"modified":"2020-01-09T19:37:40","modified_gmt":"2020-01-09T08:37:40","slug":"demystifying-statistics-estimating-sample-size-for-a-survey","status":"publish","type":"post","link":"https:\/\/www.statulator.com\/blog\/demystifying-statistics-estimating-sample-size-for-a-survey\/","title":{"rendered":"Demystifying statistics: Estimating sample size for a survey"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large is-style-default\"><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"1024\" src=\"http:\/\/www.statulator.com\/blogg\/wp-content\/uploads\/2020\/01\/association-152746_1280-1024x1024.png\" alt=\"\" class=\"wp-image-191\" srcset=\"https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/association-152746_1280-1024x1024.png 1024w, https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/association-152746_1280-300x300.png 300w, https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/association-152746_1280-150x150.png 150w, https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/association-152746_1280-768x768.png 768w, https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/association-152746_1280.png 1280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2>Introduction<\/h2>\n\n\n\n<p>Whether you want to understand people\u2019s preferences for a product, estimate the proportion of people preferring a political party or estimate the prevalence of a disease&nbsp;in a population, you will need to calculate the number of respondents sufficient for your survey objective. How can you calculate this magic number?<\/p>\n\n\n\n<p>To obtain a completely accurate answer, you will have to ask each and every individual in your population. 
A&nbsp;<em>population<\/em>&nbsp;in statistics is defined as all the individuals about whom you want to obtain information. For example, if you want to understand voters\u2019 preferences for a political party in a country, then all the voters of that country are the population. If you are interested in estimating the prevalence of obesity among teenagers in a country, then all teenagers of that country make up your population. If you are interested in understanding the proportion of students in a school preferring a particular brand of chocolates, then all the students in that school are your population.<\/p>\n\n\n\n<p>Fortunately, it is not necessary to ask each and every individual in the population if you are willing to be a little less accurate. You can collect data from only a proportion of people and still be able to get reasonably good answers that reflect the perceptions, habits, preferences or disease status of the population. That proportion of people is called a&nbsp;<em>sample<\/em>&nbsp;and the number of people that you need to select is called the&nbsp;<em>sample size<\/em>.<\/p>\n\n\n\n<p>How big should your sample size be? It depends on the following four parameters:<\/p>\n\n\n\n<h3>Population size<\/h3>\n\n\n\n<p>Paradoxically, the size of your target population doesn\u2019t have a huge influence on sample size if the population is \u2018large\u2019, which is usually the case. Therefore, in most practical situations you can specify your population size to be \u2018infinite\u2019. The actual population size only needs&nbsp;to be specified if you are planning to sample a considerable proportion of the population (say &gt;10%).<\/p>\n\n\n\n<h3>Expected proportion<\/h3>\n\n\n\n<p>It might sound ironic, but you do need to guesstimate the proportion you expect to obtain. 
You could obtain this estimate from previous surveys or from expert opinion, but if there is absolutely no information about what you are trying to measure, then a 50% value can be used to be conservative, as it will provide the&nbsp;largest sample size.<\/p>\n\n\n\n<h3>Precision or margin of error<\/h3>\n\n\n\n<p>The margin of error tells you how much error or imperfection in your result you are willing to accept. If you are willing to accept a larger error, then your sample size will be smaller, and vice versa. For example, let\u2019s assume that 40% of the people in a sample prefer a political party. With a margin of error of 5%, you will be&nbsp;<em>\u2018fairly\u2019<\/em>&nbsp;confident that the proportion of people in the population preferring that political party is between 35% and 45% (i.e. 40 \u00b1 5). This range is called a&nbsp;<strong>confidence interval<\/strong>&nbsp;in statistical jargon. Similarly, with a 2% margin of error the confidence interval will be from 38% to 42%. Thus, with a 2% margin of error you will get a more precise answer than with a 5% margin of error, but you will need a larger sample size.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-style-default\"><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"614\" src=\"http:\/\/www.statulator.com\/blogg\/wp-content\/uploads\/2020\/01\/arrow-2886223_1920-1-1024x614.jpg\" alt=\"\" class=\"wp-image-749\" srcset=\"https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/arrow-2886223_1920-1-1024x614.jpg 1024w, https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/arrow-2886223_1920-1-300x180.jpg 300w, https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/arrow-2886223_1920-1-768x461.jpg 768w, https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/arrow-2886223_1920-1.jpg 1280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<h3>Level of confidence<\/h3>\n\n\n\n<p>Note that we can only 
be&nbsp;<em>fairly<\/em>&nbsp;(not 100%) confident that 35% to 45% of people would prefer the political party in the above example. This word \u2018<em>fairly<\/em>\u2019 is quantified using the level of confidence. Usually a value of 95% for the level of confidence is used, but other levels (such as 90% or 99%) can also be used. So if you want to be 99% confident of your result that&nbsp;35% to 45% of people prefer a party, you will have to select a larger sample size than if you are willing&nbsp;to be a little less (95% or 90%) confident. Statistically speaking, a 95% confidence means that&nbsp;<em>if you repeat your survey a large number of times, and calculate a confidence interval each time, your intervals will include the true population proportion 95% of the time<\/em>. If you are not sure of what confidence level you need, it is better to stick to the conventional value of 95%.<\/p>\n\n\n\n<p>That\u2019s it! So assuming an infinite population, you just need to specify the expected proportion (use 50% if you are not sure), the level of confidence (use 95% if not sure) and the margin of error to calculate the sample size for estimating a proportion.<\/p>\n\n\n\n<h4>Notes:&nbsp;<\/h4>\n\n\n\n<ol><li><em>Margin of error&nbsp;<\/em>can be a bit tricky to decide as its interpretation depends on the expected proportion. For example, a margin of error of 5% is okay&nbsp;for a 40% expected proportion (as the confidence interval will be from 35% to 45%) but not for an expected proportion of 4%, as the confidence interval will be from -1% to 9%, which does not make sense because a proportion cannot be negative. Therefore, sometimes it is recommended to select the margin of error&nbsp;<em>relative<\/em>&nbsp;to the expected proportion. For example, if we decide on a relative margin of error of &#8216;<em>10% of the expected proportion<\/em>&#8217; then the absolute margin of error will be 5% for an expected proportion of 50% (i.e. 10%*50) and 0.5% for an expected proportion of 5% (10%*5). 
Both of these margins would make good sense.<\/li><li>The above approach calculates the sample size for estimating a proportion with a certain confidence. The approach to calculate the sample size for&nbsp;<em>comparing proportions<\/em>&nbsp;is a bit different. I&nbsp;will discuss it in a future&nbsp;blog.<\/li><li>This sample size calculation assumes simple random sampling. We will discuss calculation of sample sizes for other designs in a future blog.<\/li><\/ol>\n\n\n\n<h2>Implementation<\/h2>\n\n\n\n<p>Sample size can be easily calculated using a calculator that we recently developed:&nbsp;<strong><a rel=\"noreferrer noopener\" href=\"http:\/\/statulator.com\/SampleSize\/ss1P.html\" target=\"_blank\">http:\/\/statulator.com\/SampleSize\/ss1P.html<\/a><\/strong>. The calculator also shows you a visualisation of the changes in sample size for a range of expected proportions and margins of error. You can also create a table with a range of sample sizes and download it for discussion with your colleagues or collaborators.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-style-default\"><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"760\" src=\"http:\/\/www.statulator.com\/blogg\/wp-content\/uploads\/2020\/01\/Screen-Shot-2020-01-01-at-5.04.51-pm-1024x760.png\" alt=\"\" class=\"wp-image-193\" srcset=\"https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/Screen-Shot-2020-01-01-at-5.04.51-pm-1024x760.png 1024w, https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/Screen-Shot-2020-01-01-at-5.04.51-pm-300x223.png 300w, https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/Screen-Shot-2020-01-01-at-5.04.51-pm-768x570.png 768w, https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/Screen-Shot-2020-01-01-at-5.04.51-pm.png 1238w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>For example, if you expect 15% of teenagers in a large population to be obese and you want to estimate 
this proportion with 95% confidence and with a 10% relative margin of error (i.e. 10%*15 = 1.5% absolute margin of error), specify these values in the calculator and click calculate to obtain the required sample size&nbsp;(2177 individuals). The calculator also&nbsp;interprets the results for you, which you could adapt for your project proposal or a journal article.<\/p>\n\n\n\n<p>This calculator also provides some other options to adjust the sample size for clustering, response rate etc., which I will discuss in a future blog.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-style-default\"><img decoding=\"async\" loading=\"lazy\" width=\"635\" height=\"605\" src=\"http:\/\/www.statulator.com\/blogg\/wp-content\/uploads\/2020\/01\/Screen-Shot-2020-01-01-at-5.05.48-pm.png\" alt=\"\" class=\"wp-image-194\" srcset=\"https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/Screen-Shot-2020-01-01-at-5.05.48-pm.png 635w, https:\/\/www.statulator.com\/blog\/wp-content\/uploads\/2020\/01\/Screen-Shot-2020-01-01-at-5.05.48-pm-300x286.png 300w\" sizes=\"(max-width: 635px) 100vw, 635px\" \/><\/figure>\n\n\n\n<p><br><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction Whether you want to understand people\u2019s preferences for a product, estimate the proportion of people preferring a political party or estimate the prevalence of a disease&nbsp;in a population, you will need to calculate the number of respondents sufficient for your survey objective. How can you calculate this magic number? 
To obtain a completely accurate&hellip;&nbsp;<a href=\"https:\/\/www.statulator.com\/blog\/demystifying-statistics-estimating-sample-size-for-a-survey\/\" class=\"\" rel=\"bookmark\">Read More &raquo;<span class=\"screen-reader-text\">Demystifying statistics: Estimating sample size for a survey<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":191,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_coblocks_attr":"","_coblocks_dimensions":"","_coblocks_responsive_height":"","_coblocks_accordion_ie_support":"","neve_meta_sidebar":"","neve_meta_container":"","neve_meta_enable_content_width":"","neve_meta_content_width":0,"neve_meta_title_alignment":"","neve_meta_author_avatar":"","neve_post_elements_order":"","neve_meta_disable_header":"","neve_meta_disable_footer":"","neve_meta_disable_title":""},"categories":[30,20],"tags":[25,29,19,24,28],"_links":{"self":[{"href":"https:\/\/www.statulator.com\/blog\/wp-json\/wp\/v2\/posts\/189"}],"collection":[{"href":"https:\/\/www.statulator.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.statulator.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.statulator.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.statulator.com\/blog\/wp-json\/wp\/v2\/comments?post=189"}],"version-history":[{"count":5,"href":"https:\/\/www.statulator.com\/blog\/wp-json\/wp\/v2\/posts\/189\/revisions"}],"predecessor-version":[{"id":750,"href":"https:\/\/www.statulator.com\/blog\/wp-json\/wp\/v2\/posts\/189\/revisions\/750"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.statulator.com\/blog\/wp-json\/wp\/v2\/media\/191"}],"wp:attachment":[{"href":"https:\/\/www.statulator.com\/blog\/wp-json\/wp\/v2\/media?parent=189"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.statulator.com\/blog\/wp-json\/wp\/v2\/categories?post=189"},{"taxonomy"
:"post_tag","embeddable":true,"href":"https:\/\/www.statulator.com\/blog\/wp-json\/wp\/v2\/tags?post=189"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}