
## Large Deviations

21 Feb 2013 15:04

The limit theorems of probability theory --- the weak and strong laws of large numbers, the central limit theorem, etc. --- basically say that averages taken over large samples (of well-behaved independent, identically distributed random variables) converge on expectation values. (The strong law of large numbers asserts almost-sure convergence, the central limit theorem asserts a kind of convergence in distribution, etc.) These results say little or nothing about the rate of convergence, however, which is often important for many applications of probability theory, e.g., statistical mechanics. One way to address this is the theory of large deviations. (I believe the terminology goes back to Varadhan in the 1970s, but that's just an impression, rather than research.)

Let me say things sloppily first, so the idea comes through, and then more precisely, so people who know the subject won't get too upset. Suppose $X$ is a random variable with expected value $\mathbf{E}[X]$, and we consider $S_n \equiv \frac{1}{n}\sum_{i=1}^{n}{X_i}$, the sample mean of $n$ samples of $X$. $S_n$ "obeys a large deviations principle" if there is a non-negative function $r$, called the rate function, such that $\Pr{\left(\left| \mathbf{E}[X] - S_n \right| \geq \epsilon\right)} \rightarrow e^{-nr(\epsilon)} ~.$
(The rate function has to obey some sensible but technical continuity conditions.) This is a large deviation result, because the difference between the empirical mean and the expectation stays constant as $n$ grows --- there has to be a larger and larger conspiracy, as it were, among the samples to keep deviating from the expectation in the same way. Now, one reason what I've stated isn't really enough to satisfy a mathematician is that the right-hand side converges on zero, so any functional form for the probability which also converges on zero would satisfy it, but we want to pick out exponential convergence. The usual way is to look at the limiting growth rate of the probability. Also, we want the probability that the difference between the empirical mean and the expectation falls into any arbitrary set. So one usually sees the LDP asserted in some form like, for any reasonable set $A$, $\lim_{n\rightarrow\infty}{-\frac{1}{n}\log{\mathrm{Pr}\left(\left| \mathbf{E}[X] - S_n \right| \in A\right)}} = \inf_{x\in A}{r(x)} ~.$
(Actually, to be completely honest, I really shouldn't be assuming that there is a limit to those probabilities. Instead I should connect the lim inf of that expression to the infimum of the rate function over the interior of $A$, and the lim sup to the infimum of the rate function over the closure of $A$.)
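
To make the sloppy statement concrete, here is a minimal numerical sketch in Python. It assumes the $X_i$ are IID Bernoulli variables with success probability $p$, which is an illustrative choice and not something fixed by the text above. In that case the deviation probability is an exact binomial sum, and the rate function is known in closed form from Cramér's theorem, so one can watch $-\frac{1}{n}\log\Pr$ settle down towards it. All the function names are made up for this example.

```python
import math

def log_binom_pmf(n, k, p):
    """Log of the Binomial(n, p) probability of exactly k successes."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coeff + k * math.log(p) + (n - k) * math.log(1 - p)

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def log_deviation_prob(n, p, eps):
    """log Pr(|p - S_n| >= eps) for the mean S_n of n Bernoulli(p) draws."""
    # The small tolerance guards against floating-point error in k / n - p.
    terms = [log_binom_pmf(n, k, p) for k in range(n + 1)
             if abs(k / n - p) >= eps - 1e-12]
    return log_sum_exp(terms)

def bernoulli_rate(x, p):
    """Rate function for Bernoulli(p) sample means, valid for 0 < x < 1."""
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))

def two_sided_rate(p, eps):
    """r(eps): the cheaper of the two ways of deviating by at least eps."""
    return min(bernoulli_rate(p + eps, p), bernoulli_rate(p - eps, p))

# Assumed illustrative values; any 0 < p - eps and p + eps < 1 will do.
p, eps = 0.3, 0.2
for n in (10, 100, 1000, 10000):
    empirical_rate = -log_deviation_prob(n, p, eps) / n
    print(n, empirical_rate, two_sided_rate(p, eps))
```

The first printed column should drift towards the second as $n$ grows, but only slowly: the gap shrinks roughly like $(\log n)/n$, which is exactly the sub-exponential slop that the limit in the precise statement is designed to ignore.
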

Similar large deviation principles can be stated for the empirical distribution, the empirical process, functionals of sample paths, etc., rather than just the empirical mean. There are tricks for relating LDPs on higher-level objects, like the empirical distribution over trajectories, to LDPs on lower-level objects, like empirical means. (These go under names like "the contraction principle".)

Since ergodic theory extends the probabilistic limit laws to stochastic processes, rather than just sequences of independent variables, it shouldn't be surprising that large deviation principles also hold for some stochastic processes. I am particularly interested in LDPs for Markov processes, and their applications. There are further important connections to information theory, since in an awful lot of situations, the large deviations rate function is the Kullback-Leibler divergence, a.k.a. the relative entropy.
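
A small sketch of how the relative entropy and the contraction principle fit together, under assumptions chosen only for illustration: the samples are IID rolls of a fair six-sided die. By Sanov's theorem the empirical distribution then has rate function $D(q\|p)$, and contracting down to the empirical mean means minimizing $D(q\|p)$ over all $q$ with the required mean. That minimum is attained by an exponentially tilted version of $p$, and it should agree with the Legendre-transform form of the rate function from Cramér's theorem. The helper names below are hypothetical, and the grid search is deliberately crude.

```python
import math

FACES = [1, 2, 3, 4, 5, 6]
FAIR = [1 / 6] * 6  # assumed base distribution: a fair die

def tilted(theta):
    """Exponentially tilted die: q_i proportional to FAIR_i * exp(theta * face_i)."""
    weights = [p * math.exp(theta * x) for p, x in zip(FAIR, FACES)]
    total = sum(weights)
    return [w / total for w in weights]

def mean(q):
    """Expected face value under the distribution q."""
    return sum(qi * x for qi, x in zip(q, FACES))

def kl_divergence(q, p):
    """D(q || p) in nats, skipping terms where q puts no mass."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def tilt_for_mean(target, lo=-20.0, hi=20.0):
    """Bisect for the theta whose tilted die has the target mean (1 < target < 6)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if mean(tilted(mid)) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def rate_via_kl(target):
    """Contraction: smallest D(q || FAIR) among q with the target mean."""
    return kl_divergence(tilted(tilt_for_mean(target)), FAIR)

def rate_via_legendre(target):
    """Cramer form: sup over theta of theta * target - log E[exp(theta * X)]."""
    def objective(theta):
        log_mgf = math.log(sum(p * math.exp(theta * x) for p, x in zip(FAIR, FACES)))
        return theta * target - log_mgf
    # Crude grid search over theta in [-5, 5] with step 0.001.
    return max(objective(t / 1000) for t in range(-5000, 5001))

target = 4.5
print(rate_via_kl(target), rate_via_legendre(target))
# Another distribution with the same mean, for comparison; it should cost more.
print(kl_divergence([0, 0, 0, 0.5, 0.5, 0], FAIR))
```

The two numbers on the first line should agree up to the resolution of the grid, while the untilted competitor on the second line has a visibly larger divergence.
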

Related, but strictly speaking distinct topics:

• Finite-sample deviation inequalities, such as the Bernstein, Chernoff and Hoeffding inequalities, which bound the probability of averages departing by more than a certain amount from expectation values at given finite sample sizes;
• Concentration of measure, roughly speaking upper bounds on deviation probabilities holding uniformly over large classes of functions. (Note that large deviations principles have matching upper and lower bounds, and need only hold asymptotically.)
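
To illustrate the contrast with finite-sample deviation inequalities, here is a rough sketch comparing two such bounds with an exact tail probability at fixed $n$. It assumes IID Bernoulli($p$) samples and looks only at the upper tail; the function names are invented for the example. The Hoeffding bound used is the standard one for variables confined to $[0,1]$, and the Chernoff bound is the optimized exponential bound, whose exponent for Bernoulli variables is a relative entropy.

```python
import math

def exact_upper_tail(n, p, eps):
    """Pr(S_n - p >= eps) for the mean S_n of n Bernoulli(p) draws."""
    # Smallest success count k with k / n >= p + eps, with a float-safety margin.
    k_min = math.ceil(n * (p + eps) - 1e-9)
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

def hoeffding_bound(n, eps):
    """Hoeffding: Pr(S_n - E[X] >= eps) <= exp(-2 n eps^2) for X in [0, 1]."""
    return math.exp(-2 * n * eps**2)

def chernoff_bound(n, p, eps):
    """Optimized Chernoff bound for Bernoulli(p): exp(-n D(p + eps || p))."""
    x = p + eps
    kl = x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))
    return math.exp(-n * kl)

# Assumed illustrative values, kept small so the binomial terms stay in float range.
p, eps = 0.3, 0.2
for n in (10, 50, 200):
    print(n, exact_upper_tail(n, p, eps),
          chernoff_bound(n, p, eps), hoeffding_bound(n, eps))
```

Both bounds hold at every $n$, not just in the limit, but they are only upper bounds: the exact probability sits below them by a factor that the inequalities say nothing about.
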
Recommended:
• James Bucklew, Large Deviation Techniques in Decision, Simulation, and Estimation
• Thomas Cover and Joy Thomas, Elements of Information Theory [Very nice chapter on large deviations for IID sequences]
• Amir Dembo and Ofer Zeitouni, Large Deviations Techniques and Applications [Chapters 2, 4 and 5, and parts of chapter 6, are available in postscript format via Prof. Dembo's page for his course on large deviations]
• Frank den Hollander, Large Deviations [Nice introductory text for people with an applied probability background. Short.]
• Richard S. Ellis
  • "The Theory of Large Deviations: from Boltzmann's 1877 Calculation to Equilibrium Macrostates in 2D Turbulence", Physica D 133 (1999): 106--136
  • Entropy, Large Deviations, and Statistical Mechanics
• M. I. Freidlin and A. D. Wentzell, Random Perturbations of Dynamical Systems
• Hugo Touchette, "The Large Deviations Approach to Statistical Mechanics", Physics Reports 478 (2009): 1--69, arxiv:0804.0327
• S. R. S. Varadhan, "Large Deviations", Annals of Probability 36 (2008): 397--419 [Copy via Prof. Varadhan. Wald Lecture for 2005.]
Recommended, more specialized:
• R. R. Bahadur, Some Limit Theorems in Statistics [1971. The notation is now much more transparent, and the proofs of many basic theorems considerably simplified. But if there's a better source for statistical applications than this little book, I've yet to find it.]
• Julien Barré, Freddy Bouchet, Thierry Dauxois and Stefano Ruffo, "Large deviation techniques applied to systems with long-range interactions", cond-mat/0406358 = Journal of Statistical Physics 119 (2005): 677--713
• Michel Benaïm and Jörgen W. Weibull, "Deterministic Approximation of Stochastic Evolution in Games", Econometrica 71 (2003): 879--903 [JSTOR]
• Christian Borgs, Jennifer Chayes and David Gamarnik, "Convergent sequences of sparse graphs: A large deviations approach", arxiv:1302.4615 [See under graph limits]
• Arijit Chakrabarty, "Effect of truncation on large deviations for heavy-tailed random vectors", arxiv:1107.2476
• Sourav Chatterjee and S. R. S. Varadhan, "The large deviation principle for the Erdos-Renyi random graph", arxiv:1008.1946
• J.-R. Chazottes and D. Gabrielli, "Large deviations for empirical entropies of Gibbsian sources", math.PR/0406083 = Nonlinearity 18 (2005): 2545--2563 [This is a very cool result which shows that block entropies, and entropy rates estimated from those blocks, obey the large deviation principle even as one lets the length of the blocks grow with the amount of data, provided the block-length doesn't grow too quickly (only logarithmically). I wish I could write papers like this.]
• W. De Roeck, Christian Maes and Karel Netocny, "H-Theorems from Autonomous Equations", cond-mat/0508089 [this basically derives the H-theorem of statistical mechanics as a large deviations result, assuming a certain reasonable Markovian form for the macroscopic dynamics. In fact, we have a separate argument that you don't have that Markovian form, you're just not trying hard enough; see here]
• Paul Dupuis, "Large Deviations Analysis of Some Recursive Algorithms with State-Dependent Noise", Annals of Probability 16 (1988): 1509--1536 [Open access]
• Gregory L. Eyink
  • "Action principle in nonequilibrium statistical dynamics," Physical Review E 54 (1996): 3419--3435 [Least action as a consequence of Markovian LDP]
  • "A Variational Formulation of Optimal Nonlinear Estimation," physics/0011049 [Nice connections between optimal state estimation (assuming a known form for the underlying stochastic process), nonequilibrium statistical mechanics, and large deviations theory, leading to tractable-looking numerical schemes for estimation.]
• Jin Feng and Thomas G. Kurtz, Large Deviations for Stochastic Processes [Online]
• Fuqing Gao and Xingqiu Zhao, "Delta method in large deviations and moderate deviations for estimators", Annals of Statistics 39 (2011): 1211-1240, arxiv:1105.3552 [This is based on an extension of the "contraction principle" which is of independent interest]
• S. Orey and S. Pelikan, "Large deviations principles for stationary processes", Annals of Probability 16 (1988): 1481--1496
• Eric Smith, "Large-deviation principles, stochastic effective actions, path entropies, and the structure and meaning of thermodynamic descriptions", arxiv:1102.3938
• Eric Smith and Supriya Krishnamurthy, "Symmetry and Collective Fluctuations in Evolutionary Games", SFI Working Paper 11-03-010
To read:
• Paul H. Algoet and Brian H. Marcus, "Large Deviation Theorems for Empirical Types of Markov Chains Constrained to Thin Sets," IEEE Trans. Info. Theory 38 (1992): 1276--1291
• Alexei Andreanov, Giulio Biroli, Jean-Philippe Bouchaud, and Alexandre Lefèvre, "Field theories and exact stochastic equations for interacting particle systems", Physical Review E 74 (2006): 030101 = cond-mat/0602307
• David Andrieux, "Equivalence classes for large deviations", arxiv:1208.5699
• Ellen Baake, Frank den Hollander and Natali Zint, "How T-Cells Use Large Deviations to Recognize Foreign Antigens", arxiv:q-bio.SC/0605016 [Presumably == the paper of the same title in Journal of Mathematical Biology 57 (2008): 841--861, but that orders the authors Zint, Baake and den Hollander.]
• J. Barral and P. Goncalves, "On the Estimation of the Large Deviations Spectrum", Journal of Statistical Physics 144 (2011): 1256--1283
• L. Bertini, A. De Sole, D. Gabrielli, G. Jona-Lasinio, C. Landim, "Large deviation approach to non equilibrium processes in stochastic lattice gases", arxiv:math/0602557
• Matthias Birkner, Andreas Greven and Frank den Hollander, "Quenched large deviation principle for words in a letter sequence", arxiv:0807.2611
• Igor Bjelakovic, Jean-Dominique Deuschel, Tyll Krueger, Ruedi Seiler, Rainer Siegmund-Schultze and Arleta Szkola
• Amarjit Budhiraja, Paul Dupuis, Markus Fischer, "Large deviation properties of weakly interacting processes via weak convergence methods", arxiv:1009.6030
• Amarjit Budhiraja, Paul Dupuis, Vasileios Maroulas, "Large deviations for infinite dimensional stochastic dynamical systems", Annals of Applied Probability 36 (2008): 1390--1420 = arxiv:0808.3631
• Raphaël Cerf and Pierre Petit, "Cramér's theorem for asymptotically decoupled fields", arxiv:1103.4415 [The English abstract is extremely interesting, but unfortunately this paper is in French, so my marking it "to read" is misleading.]
• Arijit Chakrabarty, "Central Limit Theorem and Large Deviations for truncated heavy-tailed random vectors", arxiv:1003.2159
• Po-Ning Chen, "Generalization of Gartner-Ellis theorem", IEEE Transactions on Information Theory 46 (2000): 2752--2760
• Zhiyi Chi
• Igor Chueshov and Annie Millet, "Stochastic 2D hydrodynamical type systems: Well posedness and large deviations", arxiv:0807.1810
• A. de Acosta, "A general nonconvex large deviation result II", Annals of Probability 32 (2004): 1873--1901 = math.PR/0410101
• Zach Deitz and Sunder Sethuraman, "Large deviations for a class of nonhomogeneous Markov chains", math.PR/0404230
• B. Derrida, "Non equilibrium steady states: fluctuations and large deviations of the density and of the current", cond-mat/0703762
• B. Derrida, Joel L. Lebowitz and Eugene R. Speer, "Exact Large Deviation Functional for the Density Profile in a Stationary Nonequilibrium Open System," cond-mat/0105110
• Paul Dupuis and Richard S. Ellis, A Weak Convergence Approach to the Theory of Large Deviations [PDF preprint]
• Vlad Elgart and Alex Kamenev, "Rare Events Statistics in Reaction--Diffusion Systems", cond-mat/0404241 [i.e., large deviations]
• Andreas Engel, Remi Monasson and Alexander K. Hartmann, "On Large Deviation Properties of Erdos-Renyi Random Graphs", Journal of Statistical Physics 117 (2004): 387--426
• Parisa Fatheddin, Jie Xiong, "Large Deviation Principle for Some Measure-Valued Processes", arxiv:1204.3501
• Hans Follmer and Steven Orey, "Large Deviations for the Empirical Field of a Gibbs Measure", Annals of Probability 16 (1988): 961--977
• Jorge Garcia, "A Large Deviation Principle for Stochastic Integrals", Journal of Theoretical Probability 21 (2008): 476--501
• Cristian Giardinà, Jorge Kurchan, Luca Peliti, "Direct evaluation of large-deviation functions", cond-mat/0511248 ["numerical [evaluation of] probabilities of large deviations of physical quantities, such as current or density, that are local in time. The large-deviation functions are given in terms of the typical properties of a modified dynamics, and since they no longer involve rare events, can be evaluated efficiently and over a wider ranges of values."]
• Yuri Golubev, Vladimir Spokoiny, "Exponential bounds for minimum contrast estimators", arxiv:0901.0655
• Nathael Gozlan and Christian Léonard
• Alice Guionnet, "Large deviations and stochastic calculus for large random matrices", Probability Surveys 1 (2004): 72--172 [Open access]
• O. V. Gulinskii and R. S. Liptser, "Example of Large Deviations for Stationary Processes", Theory of Probability and Applications 44 (1999): 211--225 [PDF]
• Te Sun Han
  • "Hypothesis Testing with the General Source", IEEE Transactions on Information Theory 46 (2000): 2415--2427 = math.PR/0004121 ["The asymptotically optimal hypothesis testing problem with the general sources as the null and alternative hypotheses is studied.... Our fundamental philosophy in doing so is first to convert all of the hypothesis testing problems completely to the pertinent computation problems in the large deviation-probability theory. ... [This] enables us to establish quite compact general formulas of the optimal exponents of the second kind of error and correct testing probabilities for the general sources including all nonstationary and/or nonergodic sources with arbitrary abstract alphabet (countable or uncountable). Such general formulas are presented from the information-spectrum point of view."]
  • "An information-spectrum approach to large deviation theorems", cs.IT/0606104
• Zhishui Hu, John Robinson, Qiying Wang, "Cramér-type large deviations for samples from a finite population", Annals of Statistics 35 (2007): 673--696, arxiv:0708.1880
• Henrik Hult and Gennady Samorodnitsky, "Large deviations for point processes based on stationary sequences with heavy tails", Journal of Applied Probability 47 (2010): 1--40
• Svante Janson, "Large deviations for sums of partly dependent random variables", Random Structures and Algorithms 24 (2004): 234--248 ["We use and extend a method by Hoeffding to obtain strong large deviation bounds for sums of dependent random variables with suitable dependency structure. The method is based on breaking up the sum into sums of independent variables. Applications are given to U-statistics, random strings and random graphs." Applied here only to Erdos-Renyi (IID) random graphs, but might be extendable to Markov random graphs...? PDF preprint]
• Giovanni Jona-Lasinio, "From fluctuations in hydrodynamics to nonequilibrium thermodynamics", arxiv:1003.4164
• Vladislav Kargin, "A Large Deviation Inequality for Vector Functions on Finite Reversible Markov Chains", math.PR/0508538
• Gerhard Keller, Equilibrium States in Ergodic Theory [blurb]
• Michael Keyl, "Quantum state estimation and large deviations", quant-ph/0412053
• Yuri Kifer, "Large deviations and adiabatic transitions for dynamical systems and Markov processes in fully coupled averaging", arxiv:0710.2405
• Yuri Kifer, S. R. S. Varadhan, "Nonconventional Large Deviations Theorems", arxiv:1206.0156
• Yuichi Kitamura, "Empirical likelihood methods in econometrics: Theory and Practice", Cowles Foundation Discussion Paper No. 1569 (2006)
• F. Klebaner and R. Liptser, "Large Deviations for Past-Dependent Recursions", math.PR/0603407 [Corrected version of Problems of Information Transmission 32 (1996): 23--34]
• Ioannis Kontoyiannis and S. P. Meyn
• D. Lacoste, A. W. C. Lau and K. Mallick, "Fluctuation theorem and large deviation function for a solvable model of a molecular motor", Physical Review E 78 (2008): 011915
• Vivien Lecomte, Cécile Appert-Rolland, and Frédéric van Wijland
  • "Thermodynamic formalism for systems with Markov dynamics", cond-mat/0606211
  • "Thermodynamic formalism and large deviation functions in continuous time Markov dynamics", cond-mat/0703435
• Vivien Lecomte and Julien Tailleur, "A numerical approach to large deviations in continuous time", Journal of Statistical Mechanics: Theory and Experiment 2007: P03004
• Raphael Lefevere, Mauro Mariani, Lorenzo Zambotti, "Large deviations for renewal processes", arxiv:1009.2659
• Christian Léonard, "Entropic Projections and Dominating Points", ESAIM: Probability and Statistics 14 (2010): 343--381, arxiv:0711.0206 ["Generalized entropic projections and dominating points are solutions to convex minimization problems related to conditional laws of large numbers"]
• Robert Sh. Liptser and Anatolii A. Pukhalskii, "Limit theorems on large deviations for semimartingales", math.PR/0510028 [But published in a journal in 1992]
• Fotis Loukissas, "Precise Large Deviations for Long-Tailed Distributions", Journal of Theoretical Probability 25 (2012): 913--924
• Yutao Ma, Ran Wang, Liming Wu, "Moderate Deviation Principle for dynamical systems with small random perturbation", arxiv:1107.3432
• Claudio Macci, "Large Deviations for Empirical Estimators of the Stationary Distribution of a Semi-Markov Process with Finite State Space", Communications in Statistics: Theory and Methods 37 (2008): 3077--3089
• Satya N. Majumdar and Alan J. Bray, "Large-Deviation Functions for Nonlinear Functionals of a Gaussian Stationary Markov Process", cond-mat/0202138 = Physical Review E 65 (2002): 051112
• Matteo Marsili, "On the concentration of large deviations for fat tailed distributions", arxiv:1201.2817
• David McAllester, "A Statistical Mechanics Approach to Large Deviations Theorems" [E-print available via CiteSeer --- published?]
• Thomas Mikosch, Olivier Wintenberger, "Precise large deviations for dependent regularly varying sequences", arxiv:1206.1395
• Abdelkader Mokkadem, Mariane Pelletier and Baba Thiam, "Large and moderate deviations principles for kernel estimators of the multivariate regression", math.ST/0703341
• K. Netocny and F. Redig, "Large deviations for quantum spin systems", math-ph/0404018 = Journal of Statistical Physics 117 (2004): 521--547
• Enzo Olivieri and Maria Eulalia Vares, Large Deviations and Metastability [Blurb]
• Huyen Pham, "Some applications and methods of large deviations in finance and insurance", math.PR/0702473
• Mark Pollicott and Richard Sharp, "Large Deviations, Fluctuations and Shrinking Intervals", Communications in Mathematical Physics 290 (2009): 321--334
• Anatoly Puhalskii, Large Deviations and Idempotent Probability
• Anatolii A. Puhalskii, "Stochastic processes in random graphs", math.PR/0402183 [Large deviations for Erdos-Renyi graphs. Memo to self: how much work would it be to extend this to Markovian graphs?]
• Hong Qian, "Relative Entropy: Free Energy Associated with Equilibrium Fluctuations and Nonequilibrium Deviations", math-ph/0007010 = Physical Review E 63 (2001): 042103
• Olivier Rivoire, "The cavity method for large deviations", cond-mat/0506164 = Journal of Statistical Mechanics: Theory and Experiment (2005): P07004 ["A method is introduced for studying large deviations in the context of statistical physics of disordered systems. The approach, based on an extension of the cavity method to atypical realizations of the quenched disorder, allows us to compute exponentially small probabilities (rate functions) over different classes of random graphs."]
• David Ruelle, Thermodynamic Formalism
• Shin-ichi Sasa, "Physics of Large Deviation", arxiv:1204.5584
• L. Saulis and V. A. Statulevicius, Limit Theorems for Large Deviations
• Carolyn Schroeder, "I-Projection and Conditional Limit Theorems for Discrete Parameter Markov Processes", Annals of Probability 21 (1993): 721--758
• Adam Shwartz, Large Deviations in Performance Modeling
• Joe Suzuki, "A Markov chain analysis of genetic algorithms: large deviation principle approach", Journal of Applied Probability 47 (2010): 967--975
• Vincent Y. F. Tan, Animashree Anandkumar, Lang Tong and Alan S. Willsky, "A Large-Deviation Analysis of the Maximum-Likelihood Learning of Markov Tree Structures", IEEE Transactions on Information Theory 57 (2011): 1714--1735, arxiv:0905.0940 [Large deviations for Chow-Liu trees]
• Hugo Touchette, Rosemary J. Harris, "Large deviation approach to nonequilibrium systems", arxiv:1110.5216
• José Trashorras, Olivier Wintenberger, "Large deviations for bootstrapped empirical measures", arxiv:1110.4620
• Wei Wang, A. J. Roberts and Jinqiao Duan, "Large deviations for slow-fast stochastic partial differential equations", arxiv:1001.4826 ["the rate function is exactly that of the averaged equation plus the fluctuating deviation which is a stochastic partial differential equation with small Gaussian perturbation"]
• Lingjiong Zhu, "Process-Level Large Deviations for General Hawkes Processes", arxiv:1108.2431
To write:
• CRS, "Large Deviations in Exponential Families of Stochastic Automata"

Previous versions: 2005-11-09 17:39 (but not the first version by any means)

Notebooks:     Hosted, but not endorsed, by the Center for the Study of Complex Systems