Exact Nonparametric Tests for Comparing Means - A Personal Summary
Karl H. Schlag
European University Institute¹
December 14

¹ Economics Department, European University Institute. Via della Piazzuola 43, Florence, Italy, [email protected]
Abstract

We present an overview of exact nonparametric tests for comparing the means of two random variables based on two independent i.i.d. samples, when outcomes are known to be contained in some given bounded interval.
1 The Setting

A statistical inference problem is parametric if the data generating process is known up to a finite dimensional parameter space. It is nonparametric if it is not parametric (Fraser, 1957). A statistical inference problem is distribution-free if there are no assumptions on the distribution underlying the data beyond what is known; e.g. often one has some information about the possible support. So the Fisher exact test for comparing the means of two binomial distributions is uniformly most powerful for a parametric and distribution-free statistical inference problem: it is known that observations are only 0 or 1, so the restriction to binomial distributions is implied by what is known and is not an assumption on the distribution. The null and alternative hypotheses used to prove that a permutation test is most powerful (see below) make the associated statistical inference problem nonparametric but not distribution-free.

A test is exact if its properties are derived for finite samples, not only for a sufficiently large sample size. Often "properties" refers to the size of the test (or an upper bound on its size). This means that there is a formula for the size of the test, or a formula for an upper bound on the size; exact simply means that these formulas are correct for any given sample size.

Assume that there are two underlying random variables $Y^1$ and $Y^2$ with unknown means $EY^1$ and $EY^2$, and two separate samples $(y_k^1)_{k=1}^n$ and $(y_k^2)_{k=1}^n$ of independent realizations of $Y^1$ and $Y^2$ respectively. There is a known interval $[\underline{\omega}, \overline{\omega}]$ with the property that $Y^1, Y^2 \in [\underline{\omega}, \overline{\omega}]$. So the joint distribution $P$ is an element of $\Delta([\underline{\omega}, \overline{\omega}]^2)$; let $P^1$ and $P^2$ denote the respective marginals. The data generating process can be identified with $P$. As we allow for any $P \in \Delta([\underline{\omega}, \overline{\omega}]^2)$, our setting is nonparametric. 
2 One-Sided Test

Consider the objective of choosing an exact test of the null hypothesis $H_0: EY^1 \le EY^2$ (or $H_0: EY^1 = EY^2$) against the alternative hypothesis $H_1: EY^1 > EY^2$.

2.1 UMP

Note that there is no uniformly most powerful (UMP) test for our hypotheses. Nor is there a uniformly most powerful unbiased test. However there are most powerful tests and
other exact tests.

2.2 T Test

The two sample t-test is not an exact test in our setting, as it assumes that the underlying random variables are normally distributed. However neither $Y^1$ nor $Y^2$ can be normally distributed, as their ranges are bounded. Without knowing more about the random variables one cannot argue that the t-test can be applied if the sample is sufficiently large. It is only pointwise asymptotically level $\alpha$; i.e. for given $P$ and given $\varepsilon$ there is some $N(P, \varepsilon)$ such that if $n > N(P, \varepsilon)$ then the size of the test is within $\varepsilon$ of $\alpha$. However, the notion of sufficiently large depends on $P$, and it turns out that $\{N(P, \varepsilon) : P \in \Delta([\underline{\omega}, \overline{\omega}]^2)\}$ is an unbounded set. The two sample t-test is not uniformly asymptotically level: in fact, its size for any given sample size $n$ is equal to 1. For the argument one can easily adapt the constructive proof of Lehmann and Loh (1990), who show this statement for testing the mean of a single random variable. Hence the existence of a bound $N(P, \varepsilon)$ is useless. In simple words, it is the difference between pointwise and uniform convergence that makes the t-test inapplicable even if the sample is very large. The example that can be used to prove the above statement is a simple variation of the one in Lehmann and Loh (1990). 
Assume that $Y^1 = 1/2$ almost surely and that $Y^2$ has two mass points, one at $\mu$ and the other at 0, where $\mu$ is strictly larger than but very close to $1/2$, such that $EY^2 = 1/2$. Notice that the closer $\mu$ is to $1/2$, the more mass $Y^2$ puts on $\mu$, as $EY^2 = 1/2$. Provided $\mu$ is sufficiently close to $1/2$, with large probability each of the $n$ realizations of $Y^2$ is equal to $\mu$. If only $Y^2 = \mu$ is observed then, since $\mu > 1/2$ and since both sample variances are 0, the null is rejected under the t-test even though $EY^1 = EY^2$. Since the probability of the event that only $\mu$ is observed can be made arbitrarily large by choosing $\mu$ very close to $1/2$, the probability of wrongly rejecting the null can be made arbitrarily close to 1. Thus the size of the two sample t-test is equal to 1. Notice that this example can easily be adjusted to apply only to continuous distributions without changing the conclusion. Notice also that the variance of any random variable in our setting is bounded. The phenomenon behind this example is that we cannot rule out that most weight is put on one value very close to $1/2$. The obvious reaction to the above example is to suggest to first test for normality of
the errors. Such tests would be able to discover the above counterexample. However this does not mean that they would make the t-test valid, given the current state of research. The problem with tests for normality (e.g. Shapiro-Wilk) is that the null hypothesis asserts that the errors are normally distributed. To claim that the errors are normally distributed is thus a claim about not being able to reject the null. It can be very likely that normality cannot be rejected even though the errors are not normally distributed. This affects the size of the combined test, first of normality and then of $EY^1 \le EY^2$. In fact, without results on the probability of rejecting $EY^1 \le EY^2$ with the t-test when errors are not normally distributed, one has to work with the upper bound that the t-test always rejects $EY^1 \le EY^2$ whenever errors are not normal. But then the size of the combined test is dominated by the probability of wrongly not being able to reject that errors are normal. Now note that the maximal probability of wrongly not rejecting normality will be approximately equal to or larger than the probability of correctly accepting the null, because there are distributions that are not normal but almost normal. So if the size of the test for normality is 10%, then the probability of concluding that errors are normally distributed even when they are not can be 90% or higher. In this case the best upper bound, given current research, on the size of the combined test is 90%. Note that the bound above is not tight: when errors are almost normal, the size of the t-test will be close to its size when errors are normal. 
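As an aside, the two-mass-point counterexample above is easy to check by simulation. The sketch below is a minimal pure-Python illustration, not taken from the paper: the Welch t statistic is hand-rolled, and 2.05 is used as an approximate two-sided 5% critical value (both are my choices). It estimates how often the t-test rejects $EY^1 = EY^2$ as the upper mass point $\mu$ approaches $1/2$.

```python
import math
import random
import statistics

def welch_t(x, y):
    """Two-sample Welch t statistic (hand-rolled for illustration)."""
    nx, ny = len(x), len(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    se = math.sqrt(vx / nx + vy / ny)
    if se == 0.0:
        # Both samples constant: any difference in means is "infinitely" significant.
        return math.inf if statistics.mean(x) != statistics.mean(y) else 0.0
    return (statistics.mean(x) - statistics.mean(y)) / se

def rejection_rate(mu, n=30, reps=2000, crit=2.05, seed=0):
    """Estimate Pr(t-test rejects EY1 = EY2) in the two-mass-point example.

    Y1 = 1/2 almost surely; Y2 puts mass 1/(2*mu) on mu and the rest on 0,
    so that EY2 = 1/2. crit = 2.05 approximates the two-sided 5% critical value.
    """
    rng = random.Random(seed)
    p = 0.5 / mu
    rejections = 0
    for _ in range(reps):
        y1 = [0.5] * n
        y2 = [mu if rng.random() < p else 0.0 for _ in range(n)]
        if abs(welch_t(y1, y2)) > crit:
            rejections += 1
    return rejections / reps

for mu in (0.9, 0.52, 0.501):
    print(mu, rejection_rate(mu))  # rejection probability approaches 1 as mu -> 1/2
```

With $\mu$ far from $1/2$ the rejection rate stays near the nominal level, while for $\mu$ very close to $1/2$ almost every sample consists only of $\mu$'s and the null is almost surely rejected, in line with the argument above.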
This will be true (for an appropriate distance function) in the current framework, as we assume that payoffs belong to $[0, 1]$. In order to assess the size of the combined test one would have to derive it explicitly, in particular obtain finite sample results on the size of the t-test when errors are almost normal. As this has not been done, it is an open question whether tests for normality can be used to reinstate the t-test when payoffs are bounded.

2.3 Permutation Tests

Permutation tests are exact tests in terms of size. A permutation test is a most powerful unbiased test of $H_0: EY^1 = EY^2$ against a simple alternative hypothesis $H_1: P = \bar{P}$, where $\bar{P}$ is given with $E_{\bar{P}} Y^1 > E_{\bar{P}} Y^2$. Accordingly, all permutations of the data are considered in which the observations are randomly reassigned to one of the two samples, and the null is rejected if the true sample lies within the fraction $\alpha$ of permutations under
which the data is most likely under $\bar{P}$. This is the theory. In practice, typically only the test statistic itself is specified, without mentioning whether it is most powerful relative to some alternative hypothesis. The test remains exact in terms of size. A popular test statistic is the difference between the two sample means. I have not come across any distribution with bounded support that can be specified as the alternative hypothesis that makes even this test most powerful. Note that the Wilcoxon rank sum test (or Mann-Whitney U test) is also a permutation test, which can be used to compare means and to compare distributions. It can be useful when data is given in the form of ranks. However I do not see the sense of first transforming the data into ranks and then applying this test, as the transformation throws away information; I cannot imagine that it is most powerful against any specific distribution. The literature on permutation tests typically ignores the possibility of using permutation tests to create most powerful tests, as it is interested in distribution-free tests or at least in composite alternatives.

2.4 Aside on Methodology

Before we proceed with our presentation of tests we present some material on the methodology of selecting tests.

2.4.1 Multiple Testing on the Same Data

It is not good practice to evaluate multiple tests on the same data without correcting for this. The reason is that this typically increases the type I error underlying the final evaluation. To illustrate, assume that the null hypothesis could not be rejected with a first test but that it can be rejected based on a second test, where both tests are unbiased with the same size $\alpha$. Consider an underlying data generating process $P$ with $EY^1 = EY^2$. 
As the first test is unbiased, the probability of wrongly rejecting the null using this test alone is equal to $\alpha$. Consequently, the probability of wrongly rejecting the null hypothesis under $P$ with the two-step procedure is equal to $\alpha + \varphi(P)$, where $\varphi(P)$ is the probability that the first test does not reject the null while the second test does. If the two tests are not identical then $\varphi(P) > 0$, and this two step procedure leads to a rejection of the null hypothesis with probability strictly larger than $\alpha$. More generally, if neither test is more powerful than the other, then the sequential procedure of testing until
obtaining a rejection strictly increases the type I error. Of course one could report how many other tests have been tried; however it is difficult to derive exactly how the type I error increases (the upper bound on the type I error when checking two tests is $2\alpha$). As one can hardly verify how many other tests were tried, it is useful to have an understanding of which tests should be applied to which setting. This is best done by comparing tests. When a uniformly most powerful test exists there is no need to worry about multiple testing of the same data: one can say that there is a best test. However in our setting a UMP test does not exist. Permutation tests are most powerful given the assumptions on how the alternative hypothesis has been limited. As different alternative hypotheses satisfying $EY^1 > EY^2$ yield different tests, each is most powerful only for its specific alternative. Again it can be difficult to justify which alternative is most plausible, and it is not good practice to try several alternatives.

2.4.2 Choice of a Test: Ex-Ante vs Ex-Post

When there is no uniformly most powerful test it is important to first select a test and then to gather data. One problem with looking for a test after the data is gathered is illustrated above: it is hard to check how many tests were run but not reported. A further problem is more fundamental and is due to the fact that size is an ex-ante concept, which means the test may not be conditioned on the observed data. Often, if not typically, one can design a test, given the data gathered, that will reject the null on that data. We illustrate. Consider the test that rejects the null if the difference between the two sample means is equal to $\delta \ne 0$, where $\delta$ is given: so reject if $\bar{y}^1 - \bar{y}^2 = \delta$. Then the size of this test will be small if the sample size $n$ is sufficiently large. Note that it is important that $\delta$ is chosen before the data is gathered. 
Otherwise $\delta$ could be set equal to the difference of the observed averages and the null would always be rejected; so setting $\delta$ after the data is observed yields size equal to 1. An objection to the above example could be that the suggested test is not plausible: one would expect that if the null is rejected for $\bar{y}^1 - \bar{y}^2 = \delta$ with $\delta > 0$ then it is also rejected for $\bar{y}^1 - \bar{y}^2 \ge \delta$. However this is only a heuristic argument. The real problem is that the above test is not credible, as it is neither most powerful nor satisfies any other optimality principle.
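Returning to the permutation test of Section 2.3, the version based on the difference of sample means can be sketched as follows. This is a minimal pure-Python illustration of the mechanics (random reassignments approximate the full permutation distribution; the data and function name are hypothetical, not from the paper):

```python
import random

def permutation_pvalue(y1, y2, reps=5000, seed=0):
    """One-sided permutation test of H0: EY1 = EY2 against H1: EY1 > EY2,
    using the difference of sample means as the test statistic.
    The permutation distribution is approximated by random reassignments."""
    rng = random.Random(seed)
    n1 = len(y1)
    observed = sum(y1) / n1 - sum(y2) / len(y2)
    pooled = list(y1) + list(y2)
    hits = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # random reassignment of the pooled data
        stat = sum(pooled[:n1]) / n1 - sum(pooled[n1:]) / (len(pooled) - n1)
        if stat >= observed:
            hits += 1
    # Counting the observed assignment itself keeps the size bounded by alpha.
    return (hits + 1) / (reps + 1)

# Hypothetical outcomes in [0, 1]:
y1 = [0.9, 0.8, 0.85, 0.95, 0.7, 0.9, 0.8, 0.75]
y2 = [0.4, 0.5, 0.35, 0.6, 0.45, 0.5, 0.3, 0.55]
print(permutation_pvalue(y1, y2))  # well below 0.05: reject H0 at the 5% level
```

The exactness in size comes from the exchangeability of the pooled sample under the null; the choice of the difference-of-means statistic is the popular one discussed above, not one proven to be most powerful against some alternative.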
2.4.3 The Value of the P Value

It is standard practice to record p values when testing data. In particular, one cites whether the null can be rejected at the 1% level or at the 5% level; in either case the null is rejected. P values above 5% typically lead to a statement of not being able to reject the null. Consider this practice: what is the real size of the test? In other words, how often will the null be wrongly rejected? Is the p value really the smallest level of significance leading to rejection of the null? Well, if the null is rejected as long as the p value is below 5%, then the size is 5%. So regardless of whether or not the data has a p value below 1%, if the statistician would also have rejected the null had the p value been above 1% but below 5%, then the fact that the data has a p value below 1% is of no importance: the test has size 5%. The specific p value is useless for the given data; it only matters whether it is below or above 5%. In particular the p value is not the smallest level of significance leading to rejection of the null. The issue behind this phenomenon is the same as in the previous section. The test is formulated, by identifying whether significance is at the 1% or the 5% level, after the data has been gathered. Size is an ex-ante concept, a function of the test as specified before the data is observed; looking at the p value is an ex-post operation. If one can credibly commit to reject the null if and only if the p value is below 1%, then a p value below 1% would carry the information that the size of the test is 1%. However, standard practice calls to reject if and only if the p value is below 5%; hence observing a p value of 1% or lower carries only the information that the p value is below 5%. It does not matter how much lower it is. Of course the p value has value for the general objective: upon discovering that it is very low one can be more ambitious when gathering similar data next time, e.g. 
one can choose to gather less data.

2.4.4 The Role of the Null Hypothesis

Often the null hypothesis cannot be rejected, in which case we would like to argue that there is no treatment effect. This is where the type II error comes in handy. If a test is most powerful for some distribution satisfying the alternative, then we know that the null could not have been rejected more often under this alternative. However it would be even nicer to know explicitly what the type II error is. Moreover, how does one argue that the distribution underlying the alternative is well chosen, given that the type II error
depends on this distribution? One would wish to find some way of making inferences without making specific unverifiable assumptions on the distributions, to avoid these issues. We show how to do this now.

2.4.5 Comparing Tests

Even if there is no uniformly most powerful test one can still compare tests. This comparison will not build on specific assumptions on the distributions underlying either the null or the alternative. The idea is to consider a worst case analysis centered around the parameters of interest, which here are the means. Specifically, for any given $\mu^1$ and $\mu^2$ with $\mu^1 > \mu^2$ one can consider all distributions $P$ with $EY^1 = \mu^1$ and $EY^2 = \mu^2$ and calculate the maximal type II error of wrongly not rejecting the null within this class. In other words, one can calculate the minimal power within this set. This leads to a minimal power function that depends on $\mu^1$ and $\mu^2$. Alternatively, for any given $d > 0$ one can derive the maximal type II error among all distributions $P$ with $EY^1 \ge EY^2 + d$. This leads to a minimal power function that depends on $d$. Note that $\{P : EY^2 < EY^1 < EY^2 + d\}$ can be interpreted as an indifference zone, a set of underlying distributions for which the decision maker does not care whether or not the null hypothesis is rejected. It is necessary to consider $d > 0$, as the type II error for $d = 0$ equals $1 - \alpha$. For $d > 0$ one can typically make the type II error small provided the sample is sufficiently large; the smaller $d$, the larger the sample must be. Using these minimal power functions one can evaluate the event of not rejecting the null as a function of either $(\mu^1, \mu^2)$ or of $d$, and one can compare tests by comparing minimal power functions. We show in a different paper that there is a best test in terms of such minimal power functions. In other words, there is a uniformly most powerful test when inference is based only on the means postulated under the alternative hypothesis. Such a test is called parameter most powerful. 
A general approach followed in the literature (see Lehmann and Romano, 2005) is to limit the set of distributions belonging to the alternative hypothesis. So one sets $H_1: P \in K$ where $K \subset \{P : EY^1 > EY^2\}$. In this case, $\{P : EY^1 > EY^2\} \setminus K$ is the indifference zone. The alternative hypothesis "only contains those distributions differing so widely from those postulated by the null hypothesis that false acceptance of the null is a serious error" (Lehmann and Romano, 2005, p. 320). In
this sense it is natural to formulate the indifference zone in terms of the underlying means. For instance, one can fix $d > 0$ and set $H_1: EY^1 \ge EY^2 + d$; in this case $\{P : EY^2 < EY^1 < EY^2 + d\}$ is the indifference zone. A maximin test is a test that maximizes, over all tests of size $\alpha$, the minimum power over all $P \in K$. In particular, a parameter most powerful test is a maximin test when $K$ can be described in terms of the means only.

2.4.6 Unbiasedness

Unbiasedness is an intuitive property which can add enough structure to find a best test. For instance, there is no UMP test for comparing the means of two Bernoulli distributed random variables, but there is a UMP unbiased test for that setting, called the Fisher exact test. However, lack of unbiasedness should not be a reason to reject a test. Unbiasedness means that the null should never be rejected with higher probability when it is true than when the alternative is true. This sounds intuitive; however we are comparing very small probabilities in this statement, and there is no decision theoretic basis for doing so. The reason why unbiasedness fails is typically due to distributions belonging to the alternative that are very close to the null. In our setting, worrying about rejecting the null with probability smaller than $\alpha$ under some alternatives typically concerns alternatives with $EY^1 > EY^2$ but $EY^1 \approx EY^2$. Above we argued, however, that it is plausible not to worry too much about alternatives that are close to the null in the sense that $EY^2 < EY^1 < EY^2 + d$ for some small $d$. We now return to our presentation of exact tests.

2.5 Test Based on Hoeffding Bounds

In the following I construct a nonparametric test in which every distribution belongs either to the null or to the alternative hypothesis. Lower bounds on minimal power functions are easily derived. The test is not unbiased. I use the Hoeffding bounds, similar to what Bickel et al. (1989) did for testing the mean of a single sample. 
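The resulting construction (the bound it rests on is stated formally below) rejects when the difference of sample means exceeds a threshold $t_\alpha$ chosen so that Hoeffding's inequality caps the type I error at $\alpha$. A minimal Python sketch, assuming equal sample sizes and using my own function names:

```python
import math

def t_alpha(n, alpha=0.05, lo=0.0, hi=1.0):
    """Threshold solving exp(-n * t^2 / (hi - lo)^2) = alpha."""
    return (hi - lo) * math.sqrt(math.log(1.0 / alpha) / n)

def hoeffding_test(y1, y2, alpha=0.05, lo=0.0, hi=1.0):
    """Reject H0: EY1 <= EY2 when mean(y1) - mean(y2) >= t_alpha(n).
    This sketch assumes equal sample sizes."""
    n = len(y1)
    assert len(y2) == n
    return sum(y1) / n - sum(y2) / n >= t_alpha(n, alpha, lo, hi)

def type2_bound(d, n, alpha=0.05, lo=0.0, hi=1.0):
    """Upper bound on the type II error when EY1 = EY2 + d, valid for d > t_alpha(n)."""
    return math.exp(-n * (d - t_alpha(n, alpha, lo, hi)) ** 2 / (hi - lo) ** 2)

# Numbers from the text: n = 30, [lo, hi] = [0, 1], alpha = 0.05.
print(round(t_alpha(30), 3))            # 0.316
print(round(type2_bound(0.55, 30), 3))  # about 0.19
```

Both printed values reproduce the numerical example discussed in this section: the rejection threshold and the worst-case type II error bound as a function of the effect size $d$.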
Following a corollary of Hoeffding (1963, eq. 2.7),

$$\Pr\left( (\bar{y}^1 - \bar{y}^2) - (EY^1 - EY^2) \ge t \right) \le e^{-nt^2/(\overline{\omega} - \underline{\omega})^2}$$

holds for $t > 0$. So if $t_\alpha$ solves $e^{-n t_\alpha^2/(\overline{\omega} - \underline{\omega})^2} = \alpha$, then the rule is to reject the null when $\bar{y}^1 - \bar{y}^2 \ge t_\alpha$. The nice thing about this approach is that one can calculate an upper bound on the type II error. If $EY^1 = EY^2 + d$ then

$$\Pr\left( \bar{y}^1 - \bar{y}^2 \le t_\alpha \right) = \Pr\left( (\bar{y}^1 - \bar{y}^2) - d \le t_\alpha - d \right) \le e^{-n(d - t_\alpha)^2/(\overline{\omega} - \underline{\omega})^2}$$

for
$d > t_\alpha$. For example, if $n = 30$, $[\underline{\omega}, \overline{\omega}] = [0, 1]$ and $\alpha = 0.05$, then one easily verifies that $t_\alpha = \sqrt{\ln 20}/\sqrt{30} \approx 0.316$ and that the type II error is bounded above by $0.2$ when $d$ is roughly $0.55$ or larger.

2.6 Using Tests Derived for a Single Sample

The above are the only deterministic exact nonparametric one-sided tests I know of that were explicitly constructed for the two sample problem. There are exact tests for the one sample problem (see Schlag, 2006) that can of course also be applied to our setting: one can redefine $Y = Y^1 - Y^2$ and test the null that $EY \le 0$ against the alternative that $EY > 0$, where it is known that $Y \in [-1, 1]$.

2.7 Parameter Most Powerful Tests

In a separate paper (Schlag, 2006) I have constructed a randomized exact unbiased test that I call parameter most powerful, as it minimizes the type II error whenever the set of alternatives can be described in terms of the underlying means only. It maximizes the minimal power function that depends only on $\mu^1$ and $\mu^2$. In a first step the data is randomly transformed into binary data. In a second step one applies the Fisher exact test, which is the uniformly most powerful unbiased test for the binary case. My test is randomized, as the transformation of the data is random; hence the recommendation of the test is typically randomized: reject the null hypothesis with probability $x\%$ where typically $0 < x < 100$. The property of being parameter most powerful comes at the cost of being randomized. The advantage is that it can be used to evaluate the relative efficiency of other deterministic tests.

References

[1] Bickel, P., J. Godfrey, J. Neter and H. Clayton (1989), Hoeffding Bounds for Monetary Unit Sampling in Auditing, Contr. Pap. I.S.I. Meeting, Paris.

[2] Fraser, D.A.S. (1957), Nonparametric Methods in Statistics, New York: John Wiley and Sons.
[3] Hoeffding, W. (1963), Probability Inequalities for Sums of Bounded Random Variables, J. Amer. Stat. Assoc. 58(301), 13-30.

[4] Lehmann, E.L. and J.P. Romano (2005), Testing Statistical Hypotheses, 3rd edition, Springer.

[5] Schlag, K.H. (2006), Nonparametric Minimax Risk Estimates and Most Powerful Tests for Means, European University Institute, Mimeo.
Point and Interval Estimates
Point and Interval Estimates Suppose we want to estimate a parameter, such as p or µ, based on a finite sample of data. There are two main methods: 1. Point estimate: Summarize the sample by a single number
3. Mathematical Induction
3. MATHEMATICAL INDUCTION 83 3. Mathematical Induction 3.1. First Principle of Mathematical Induction. Let P (n) be a predicate with domain of discourse (over) the natural numbers N = {0, 1,,...}. If (1)
Elements of probability theory
2 Elements of probability theory Probability theory provides mathematical models for random phenomena, that is, phenomena which under repeated observations yield di erent outcomes that cannot be predicted
THE KRUSKAL WALLLIS TEST
THE KRUSKAL WALLLIS TEST TEODORA H. MEHOTCHEVA Wednesday, 23 rd April 08 THE KRUSKAL-WALLIS TEST: The non-parametric alternative to ANOVA: testing for difference between several independent groups 2 NON
Hypothesis testing - Steps
Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =
A Simpli ed Axiomatic Approach to Ambiguity Aversion
A Simpli ed Axiomatic Approach to Ambiguity Aversion William S. Neilson Department of Economics University of Tennessee Knoxville, TN 37996-0550 [email protected] March 2009 Abstract This paper takes the
MODIFIED PARAMETRIC BOOTSTRAP: A ROBUST ALTERNATIVE TO CLASSICAL TEST
MODIFIED PARAMETRIC BOOTSTRAP: A ROBUST ALTERNATIVE TO CLASSICAL TEST Zahayu Md Yusof, Nurul Hanis Harun, Sharipah Sooad Syed Yahaya & Suhaida Abdullah School of Quantitative Sciences College of Arts and
Introduction to Hypothesis Testing
I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true
Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means
Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis
Optimal insurance contracts with adverse selection and comonotonic background risk
Optimal insurance contracts with adverse selection and comonotonic background risk Alary D. Bien F. TSE (LERNA) University Paris Dauphine Abstract In this note, we consider an adverse selection problem
Maximum Likelihood Estimation
Math 541: Statistical Theory II Lecturer: Songfeng Zheng Maximum Likelihood Estimation 1 Maximum Likelihood Estimation Maximum likelihood is a relatively simple method of constructing an estimator for
Study Guide for the Final Exam
Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make
3.4 Statistical inference for 2 populations based on two samples
3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted
The Variability of P-Values. Summary
The Variability of P-Values Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 27695-8203 [email protected] August 15, 2009 NC State Statistics Departement Tech Report
UCLA. Department of Economics Ph. D. Preliminary Exam Micro-Economic Theory
UCLA Department of Economics Ph. D. Preliminary Exam Micro-Economic Theory (SPRING 2011) Instructions: You have 4 hours for the exam Answer any 5 out of the 6 questions. All questions are weighted equally.
Error Type, Power, Assumptions. Parametric Tests. Parametric vs. Nonparametric Tests
Error Type, Power, Assumptions Parametric vs. Nonparametric tests Type-I & -II Error Power Revisited Meeting the Normality Assumption - Outliers, Winsorizing, Trimming - Data Transformation 1 Parametric
CHAPTER 14 NONPARAMETRIC TESTS
CHAPTER 14 NONPARAMETRIC TESTS Everything that we have done up until now in statistics has relied heavily on one major fact: that our data is normally distributed. We have been able to make inferences
1 Error in Euler s Method
1 Error in Euler s Method Experience with Euler s 1 method raises some interesting questions about numerical approximations for the solutions of differential equations. 1. What determines the amount of
Projects Involving Statistics (& SPSS)
Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES Contents 1. Random variables and measurable functions 2. Cumulative distribution functions 3. Discrete
The Wilcoxon Rank-Sum Test
1 The Wilcoxon Rank-Sum Test The Wilcoxon rank-sum test is a nonparametric alternative to the twosample t-test which is based solely on the order in which the observations from the two samples fall. We
How To Test For Significance On A Data Set
Non-Parametric Univariate Tests: 1 Sample Sign Test 1 1 SAMPLE SIGN TEST A non-parametric equivalent of the 1 SAMPLE T-TEST. ASSUMPTIONS: Data is non-normally distributed, even after log transforming.
CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE
1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,
p ˆ (sample mean and sample
Chapter 6: Confidence Intervals and Hypothesis Testing When analyzing data, we can t just accept the sample mean or sample proportion as the official mean or proportion. When we estimate the statistics
Non-Inferiority Tests for Two Proportions
Chapter 0 Non-Inferiority Tests for Two Proportions Introduction This module provides power analysis and sample size calculation for non-inferiority and superiority tests in twosample designs in which
Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing
Chapter 8 Hypothesis Testing 1 Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing 8-3 Testing a Claim About a Proportion 8-5 Testing a Claim About a Mean: s Not Known 8-6 Testing
ECON 459 Game Theory. Lecture Notes Auctions. Luca Anderlini Spring 2015
ECON 459 Game Theory Lecture Notes Auctions Luca Anderlini Spring 2015 These notes have been used before. If you can still spot any errors or have any suggestions for improvement, please let me know. 1
Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10
CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10 Introduction to Discrete Probability Probability theory has its origins in gambling analyzing card games, dice,
StatCrunch and Nonparametric Statistics
StatCrunch and Nonparametric Statistics You can use StatCrunch to calculate the values of nonparametric statistics. It may not be obvious how to enter the data in StatCrunch for various data sets that
Risk Aversion. Expected value as a criterion for making decisions makes sense provided that C H A P T E R 2. 2.1 Risk Attitude
C H A P T E R 2 Risk Aversion Expected value as a criterion for making decisions makes sense provided that the stakes at risk in the decision are small enough to \play the long run averages." The range
Economics 326: Duality and the Slutsky Decomposition. Ethan Kaplan
Economics 326: Duality and the Slutsky Decomposition Ethan Kaplan September 19, 2011 Outline 1. Convexity and Declining MRS 2. Duality and Hicksian Demand 3. Slutsky Decomposition 4. Net and Gross Substitutes
X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)
CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.
EconS 503 - Advanced Microeconomics II Handout on Cheap Talk
EconS 53 - Advanced Microeconomics II Handout on Cheap Talk. Cheap talk with Stockbrokers (From Tadelis, Ch. 8, Exercise 8.) A stockbroker can give his client one of three recommendations regarding a certain
HYPOTHESIS TESTING WITH SPSS:
HYPOTHESIS TESTING WITH SPSS: A NON-STATISTICIAN S GUIDE & TUTORIAL by Dr. Jim Mirabella SPSS 14.0 screenshots reprinted with permission from SPSS Inc. Published June 2006 Copyright Dr. Jim Mirabella CHAPTER
START Selected Topics in Assurance
START Selected Topics in Assurance Related Technologies Table of Contents Introduction Some Statistical Background Fitting a Normal Using the Anderson Darling GoF Test Fitting a Weibull Using the Anderson
6.4 Normal Distribution
Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information
How To Solve A Minimum Set Covering Problem (Mcp)
Measuring Rationality with the Minimum Cost of Revealed Preference Violations Mark Dean and Daniel Martin Online Appendices - Not for Publication 1 1 Algorithm for Solving the MASP In this online appendix
Review of Bivariate Regression
Review of Bivariate Regression A.Colin Cameron Department of Economics University of California - Davis [email protected] October 27, 2006 Abstract This provides a review of material covered in an
LOGNORMAL MODEL FOR STOCK PRICES
LOGNORMAL MODEL FOR STOCK PRICES MICHAEL J. SHARPE MATHEMATICS DEPARTMENT, UCSD 1. INTRODUCTION What follows is a simple but important model that will be the basis for a later study of stock prices as
Comparing Means in Two Populations
Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we
Sensitivity Analysis 3.1 AN EXAMPLE FOR ANALYSIS
Sensitivity Analysis 3 We have already been introduced to sensitivity analysis in Chapter via the geometry of a simple example. We saw that the values of the decision variables and those of the slack and
2 Sample t-test (unequal sample sizes and unequal variances)
Variations of the t-test: Sample tail Sample t-test (unequal sample sizes and unequal variances) Like the last example, below we have ceramic sherd thickness measurements (in cm) of two samples representing
5.1 Identifying the Target Parameter
University of California, Davis Department of Statistics Summer Session II Statistics 13 August 20, 2012 Date of latest update: August 20 Lecture 5: Estimation with Confidence intervals 5.1 Identifying
Analysis of Variance ANOVA
Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.
Nonparametric Statistics
Nonparametric Statistics References Some good references for the topics in this course are 1. Higgins, James (2004), Introduction to Nonparametric Statistics 2. Hollander and Wolfe, (1999), Nonparametric
MATHEMATICAL INDUCTION. Mathematical Induction. This is a powerful method to prove properties of positive integers.
MATHEMATICAL INDUCTION MIGUEL A LERMA (Last updated: February 8, 003) Mathematical Induction This is a powerful method to prove properties of positive integers Principle of Mathematical Induction Let P
Section 13, Part 1 ANOVA. Analysis Of Variance
Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability
Chapter 3: The Multiple Linear Regression Model
Chapter 3: The Multiple Linear Regression Model Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of
26 Integers: Multiplication, Division, and Order
26 Integers: Multiplication, Division, and Order Integer multiplication and division are extensions of whole number multiplication and division. In multiplying and dividing integers, the one new issue
