Notebooks

## Statistics

22 Sep 2014 20:41

An application of probability, with intimate ties to machine learning, non-demonstrative inference and induction.

Since June 2005, I have been a (very, very junior) professor of statistics. This made me interested in how to teach it.

Dependent data
Statistical inference for stochastic processes, a.k.a. time-series analysis. Signal processing and filtering. Spatial statistics.
Model selection
Especially: adapting to unknown characteristics of the data, like unknown noise distributions, or unknown smoothness of the regression function.
Model discrimination
That is, designing experiments so as to discriminate between competing classes of model. Issues of adapting to the data arise here as well.
Rates of convergence of estimators to true values
Empirical process theory. (Cf. some questions in ergodic theory).
Estimating distribution functions
And estimating entropies, or other functionals of distributions.
Non-parametric methods
Both those that are genuinely distribution-free, and those that would more accurately be called mega-parametric (even infinitely-parametric) methods, such as neural networks.
Regression
Bootstrapping and other resampling methods
Cross-validation
Sufficient statistics
Exponential families
Information Geometry
Partial identification of parametric statistical models
Causal Inference
Decision theory
Conventional, and the sorts with some connection to how real decisions are made.
Graphical models
Monte Carlo and other simulation methods
"De-Bayesing"
Ways of taking Bayesian procedures and eliminating dependence on priors, either by replacing them by initial point-estimates, or by showing the prior doesn't matter, asymptotically or hopefully sooner. See: Frequentist consistency of Bayesian procedures.
Computational Statistics
Statistics of structured data
Statistics on manifolds
i.e., what to do when the data live in a continuous but non-Euclidean space.
Grammatical Inference
Factor analysis
Mixture models
Multiple testing
Predictive distributions
... especially if they have confidence/coverage properties
Density estimation
Especially conditional density estimation, and density estimation on graphical models.
Indirect inference
"Missing mass" and species abundance problems
I.e., how much of the distribution have we not yet seen?
Independence Tests, Conditional Independence Tests, Measures of Dependence and Conditional Dependence
Two-Sample Tests
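(To make the "missing mass" entry above concrete: the classic Good-Turing estimate says the probability of all species not yet observed is about the fraction of observations that occur exactly once. A minimal sketch, my own illustration rather than anything from the references below, with a made-up function name:)

```python
from collections import Counter

def good_turing_missing_mass(sample):
    """Good-Turing estimate of the 'missing mass': the total probability
    of species not yet seen, estimated as (# singleton species) / n."""
    counts = Counter(sample)
    n = len(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n

# Six draws, two species ('c' and 'd') seen exactly once:
print(good_turing_missing_mass(["a", "a", "b", "b", "c", "d"]))  # 1/3
```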
Recommended, non-technical:
• Francis Galton, "Statistical Inquiries into the Efficacy of Prayer," Fortnightly Review 12 (1872): 125--135 [online]
• Larry Gonick and Woollcott Smith, The Cartoon Guide to Statistics
• Ian Hacking, The Taming of Chance [Putting chance to work in the 19th century]
• D. Huff, How to Lie with Statistics
• Theodore Porter, The Rise of Statistical Thinking, 1820--1900
• Constance Reid, Neyman from Life [Biography of Jerzy Neyman, one of the makers of modern statistical theory, and, I am happy to say, among the brighter lights of my alma mater. Reid does an excellent job of explaining Neyman's work in terms accessible to the general reader. There is a new edition, titled simply Neyman, but otherwise unchanged. Review by Steve Laniel]
• Edward R. Tufte
• The Visual Display of Quantitative Information
• Visual Explanations
Recommended, technical, big pictures:
• Ole E. Barndorff-Nielsen and David R. Cox, Inference and Asymptotics
• M. S. Bartlett
• "Inference and Stochastic Processes", Journal of the Royal Statistical Society A 130 (1967): 457--478 [JSTOR]
• "Chance or Chaos?", Journal of the Royal Statistical Society A 153 (1990): 321--347 [JSTOR]
• Richard A. Berk, Regression Analysis: A Constructive Critique [Mini-review]
• Leo Breiman, "Statistical Modeling: The Two Cultures", Statistical Science 16 (2001): 199--231 [very much including the discussion by others and the reply by Breiman. Thanks to Chris Wiggins for alerting me to this.]
• David R. Brillinger, "The 2005 Neyman Lecture: Dynamic Indeterminism in Science", Statistical Science 23 (2008): 48--64, arxiv:0808.0620 [With discussions and response]
• D. R. Cox and Christl A. Donnelly, Principles of Applied Statistics [Review: Turning Scientific Perplexity into Ordinary Statistical Uncertainty]
• Harald Cramér, Mathematical Methods of Statistics [Review]
• C. David Garson, Statnotes: An Online Textbook
• Peter Guttorp, Stochastic Modeling of Scientific Data [Good introduction to using dependent data]
• Trevor Hastie, Robert Tibshirani and Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction [Website, with full text free in PDF]
• Robert E. Kass, "Statistical Inference: The Big Picture", Statistical Science 26 (2011): 1--19, arxiv:1106.2895
• Tony Lin [Prof. Dr. Lin was working on his doctorate when I was an undergrad at Berkeley; we became friends at the I-House, if that is the word I want for someone who offered to keep my brain alive in a jigger-glass and subject it to random electrical shocks ("Jzzt! Jzzt!"). But despite his questionable tastes in acquaintances, he's a damn good statistician and a model teacher.]
• Deborah Mayo, Error and the Growth of Experimental Knowledge [Review: We Have Ways of Making You Talk, or, Long Live Peircism-Popperism-Neyman-Pearson Thought!]
• NIST, Electronic Handbook of Statistical Methods [Full text free online]
• E. J. G. Pitman, Some Basic Theory for Statistical Inference [Review: Intermediate Statistics from an Advanced Point of View]
• Jorma Rissanen, Stochastic Complexity in Statistical Inquiry [Review: Less Is More, or, Ecce data!]
• Mark Schervish, Theory of Statistics
• John R. Taylor, An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements [a.k.a. "the book with the train-wreck on the cover"]
• Larry Wasserman
• All of Statistics
• All of Nonparametric Statistics
Recommended, technical, close-ups:
• A. C. Atkinson and A. N. Donev, Optimum Experimental Design [Review]
• F. Bacchus, H. E. Kyburg and M. Thalos, "Against Conditionalization," Synthese 85 (1990): 475--506 [Why "Dutch book" arguments do not, in fact, mean that rational agents must be Bayesian reasoners]
• Andrew Barron and Nicolas Hengartner, "Information theory and superefficiency", Annals of Statistics 26 (1998): 1800--1825
• M. S. Bartlett, "The Statistical Significance of Odd Bits of Information", Biometrika 39 (1952): 228--237 [A goodness-of-fit test based on fluctuations of the entropy. JSTOR]
• M. J. Bayarri and James O. Berger, "P Values for Composite Null Models", Journal of the American Statistical Association 95 (2000): 1127--1142 [To be read in conjunction with Robins, van der Vaart and Ventura, below. JSTOR]
• Anil K. Bera and Aurobindo Ghosh, "Neyman's Smooth Test and Its Applications in Econometrics", pp. 177--230 in Aman Ullah, Alan T. K. Wan and Anoop Chaturvedi (eds.), Handbook of Applied Econometrics and Statistical Inference, SSRN/272888
• Julian Besag, "A Candidate's Formula: A Curious Result in Bayesian Prediction", Biometrika 76 (1989): 183 [A wonderful and bizarre expression for the Bayesian predictive density, in terms of how adding a new data point would change the posterior. JSTOR]
• Pier Bissiri, Chris Holmes, Stephen Walker, "A General Framework for Updating Belief Distributions", arxiv:1306.6430
• David Blackwell and M. A. Girshick, Theory of Games and Statistical Decisions
• Leo Breiman, "No Bayesians in Foxholes", IEEE Expert: Intelligent Systems and Their Applications 12 (1997): 21--24 [PDF reprint; comments by Andy Gelman]
• Jochen Brocker, "A Lower Bound on Arbitrary f-Divergences in Terms of the Total Variation", arxiv:0903.1765
• Peter Bühlmann and Sara van de Geer, Statistics for High-Dimensional Data: Methods, Theory and Applications [Review]
• Ronald W. Butler, "Predictive Likelihood Inference with Applications", Journal of the Royal Statistical Society B 48 (1986): 1--38 ["in the predictive setting, all parameters are nuisance parameters". JSTOR]
• Venkat Chandrasekaran and Michael I. Jordan, "Computational and Statistical Tradeoffs via Convex Relaxation", Proceedings of the National Academy of Sciences (USA) 110 (2013): E1181--E1190, arxiv:1211.1073
• Hwan-sik Choi and Nicholas M. Kiefer, "Differential Geometry and Bias Correction in Nonnested Hypothesis Testing" [PDF preprint via Kiefer]
• A. C. Davison and D. V. Hinkley, Bootstrap Methods and their Applications
• J. Bradford DeLong and Kevin Lang, "Are All Economic Hypotheses False?", Journal of Political Economy 100 (1992): 1257--1272 [PDF preprint. The point is about abuses of hypothesis testing, not economic hypotheses as such.]
• John Earman, Bayes or Bust? A Critical Account of Bayesian Confirmation Theory
• Michael Evans, "What does the proof of Birnbaum's theorem prove?", arxiv:1302.5468
• S. N. Evans and P. B. Stark, "Inverse Problems as Statistics" [Abstract, PDF]
• Steve Fienberg, The Analysis of Cross-Classified Categorical Data
• Don Fraser, "Is Bayes posterior just quick and dirty confidence?", Statistical Science 26 (2011): 299--316, arxiv:1112.5582 [See also the discussions by others, and Fraser's reply. My answer to the question posed in Fraser's title is "yes", or rather "YES!"]
• Andrew Gelman, Jennifer Hill and Masanao Yajima, "Why we (usually) don't have to worry about multiple comparisons" [PDF preprint]
• Andrew Gelman and Iain Pardoe, "Average predictive comparisons for models with nonlinearity, interactions, and variance components", Sociological Methodology 37 (2007): 23--51 [PDF preprint, Gelman's comments]
• Christopher Genovese, Peter Freeman, Larry Wasserman, Robert C. Nichol and Christopher Miller, "Inference for the Dark Energy Equation of State Using Type IA Supernova Data", Annals of Applied Statistics 3 (2009): 144--178, arxiv:0805.4136 [I am biased, because Genovese and Wasserman are friends, but this seems to me a model of a modern applied statistics paper: use interesting statistical ideas to say something helpful about an important scientific problem on its own terms, rather than distorting the problem until it "looks like a nail".]
• Charles J. Geyer, "Le Cam Made Simple: Asymptotics of Maximum Likelihood without the LLN or CLT or Sample Size Going to Infinity", arxiv:1206.4762 [There are two separable points here. One is that much of the usual asymptotic theory of maximum likelihood follows from the quadratic form of the likelihood alone; whenever and however that is reached, those consequences follow. Approximately quadratic likelihoods imply approximations to the usual asymptotics. This is unquestionably correct. The other is some bashing of results like the law of large numbers and central limit theorem, which seems misguided to me.]
• Tilmann Gneiting, "Making and Evaluating Point Forecasts", Journal of the American Statistical Association 106 (2011): 746--762, arxiv:0912.0902
• Trygve Haavelmo, "The Probability Approach in Econometrics", Econometrica 12 (1944, supplement): iii--115 [JSTOR]
• Mark S. Handcock and Martina Morris, Relative Distribution Methods in the Social Sciences [Review: Beyond Mean and Deviance]
• Bruce E. Hansen
• "The Likelihood Ratio Test Under Nonstandard Conditions: Testing the Markov Switching Model of GNP", Journal of Applied Econometrics 7 (1992): S61--S82 [I very much like the approach of treating the likelihood ratio as an empirical process; why haven't I seen it before? (Also, the state-of-the-art in simulating Gaussian processes must be much better now than what Hansen had in '92, which would make this even more practical.) PDF reprint]
• "Inference when a nuisance parameter is not identified under the null hypothesis", Econometrica 64 (1996): 413--430
• Jeffrey D. Hart, Nonparametric Smoothing and Lack-of-Fit Tests [Mini-review]
• Nils Lid Hjort and David Pollard, "Asymptotics for minimisers of convex processes", arxiv:1107.3806 [Very elegant]
• Peter J. Huber
• Wilbert C. M. Kallenberg and Teresa Ledwina, "Data-driven smooth tests when the hypothesis is composite", Journal of the American Statistical Association 92 (1997): 1094--1104 [Abstract, PDF reprint; JSTOR]
• Gary King, A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data [Review]
• Gary King and Margaret Roberts, "How Robust Standard Errors Expose Methodological Problems They Do Not Fix" [PDF preprint]
• Solomon W. Kullback, Information Theory and Statistics
• Michael Lavine and Mark J. Schervish, "Bayes Factors: What They Are and What They Are Not" [PS preprint]
• Steffen Lauritzen, Extremal Families and Systems of Sufficient Statistics [See comments under sufficient statistics]
• J. F. Lawless and Marc Fredette, "Frequentist prediction intervals and predictive distributions", Biometrika 92 (2005): 529--542 ["Frequentist predictive distributions are defined as confidence distributions .... A simple pivotal-based approach that produces prediction intervals and predictive distributions with well-calibrated frequentist probability interpretations is introduced, and efficient simulation methods for producing predictive distributions are considered. Properties related to an average Kullback-Leibler measure of goodness for predictive or estimated distributions are given."]
• Lucien Le Cam
• "Neyman and Stochastic Models" [PDF. Some vignettes of Neyman putting together models, and his model-building process.]
• "Maximum Likelihood; An Introduction" [PDF. Not really an introduction, but rather a collection of examples of where it just does not work, or at least doesn't work well.]
• Erich L. Lehmann, "On likelihood ratio tests", math.ST/0610835
• Jing Lei, Alessandro Rinaldo, and Larry Wasserman, "A Conformal Prediction Approach to Explore Functional Data", arxiv:1302.6452
• Jing Lei, James Robins, and Larry Wasserman, "Efficient Nonparametric Conformal Prediction Regions", arxiv:1111.1418
• Jing Lei and Larry Wasserman, "Distribution Free Prediction Bands", arxiv:1203.5422
• Bing Li, "A minimax approach to consistency and efficiency for estimating equations," Annals of Statistics 24 (1996): 1283--1297 [online version]
• Bruce Lindsay and Liawei Liu, "Model Assessment Tools for a Model False World", Statistical Science 24 (2009): 303--318, arxiv:1010.0304 [Their model-adequacy index is, essentially, the number of samples needed to detect the falsity of the model with some reasonable, pre-set level of power, with fixed size/significance level. This is a very natural quantity. In fact, by results which go back to Kullback's book, the power grows exponentially, with a rate equal to the Kullback-Leibler divergence rate. (More exactly, one minus the power goes to zero exponentially at that rate, but you know what I meant.) Large deviations theory includes generalizations of this result. Many statisticians, I'd guess, would prefer the Lindsay-Liu index because they will find it more natural to gauge error in terms of a sample size rather than bits, but to each their own.]
• Brad Luen and Philip B. Stark, "Testing earthquake predictions", pp. 302--315 in Deborah Nolan and Terry Speed (eds.), Probability and Statistics: Essays in Honor of David A. Freedman [The issues arise, however, not just for earthquakes but for all sorts of clustered events]
• Charles Manski, Identification for Prediction and Decision [Review]
• Deborah G. Mayo and D. R. Cox, "Frequentist statistics as a theory of inductive inference", math.ST/0610846
• Neri Merhav, "Bounds on Achievable Convergence Rates of Parameter Estimators via Universal Coding", IEEE Transactions on Information Theory 40 (1994): 1210--1215 [PDF reprint via Prof. Merhav]
• Karthika Mohan, Judea Pearl and Jin Tian, "Graphical Models for Inference with Missing Data", NIPS 2013 [There was at least one preprint version with the more pointed title "Missing Data as a Causal Inference Problem"]
• M. B. Nevel'son and R. Z. Has'minskii, Stochastic Approximation and Recursive Estimation
• Andrey Novikov, "Optimal sequential multiple hypothesis tests", arxiv:0811.1297
• David Pollard
• "Asymptotics via Empirical Processes", Statistical Science 4 (1989): 341--354
• Empirical Processes: Theory and Applications
• Jeffrey S. Racine, "Nonparametric Econometrics: A Primer", Foundations and Trends in Econometrics 3 (2008): 1--88 [Good primer of nonparametric techniques for regression, density estimation and hypothesis testing; next to no economic content (except for examples). Presumes reasonable familiarity with parametric statistics. PDF reprint]
• J. N. K. Rao, "Some recent advances in model-based small area estimation", Survey Methodology 25 (1999): 175--186
• James M. Robins and Ya'acov Ritov, "Toward a curse of Dimensionality Appropriate (CODA) Asymptotic Theory for Semi-Parametric Models", Statistics in Medicine 16 (1997): 285--319 [PDF reprint via Prof. Robins]
• James M. Robins, Aad van der Vaart and Valérie Ventura, "Asymptotic Distribution of P Values in Composite Null Models", Journal of the American Statistical Association 95 (2000): 1143--1156 [JSTOR. Paired article with Bayarri and Berger, above. The discussions and rejoinders (pp. 1157--1172) are valuable.]
• George G. Roussas, Contiguity of Probability Measures: Some Applications in Statistics [Mini-review]
• C. Scott and R. Nowak, "A Neyman-Pearson Approach to Statistical Learning", IEEE Transactions on Information Theory 51 (2005): 3806--3819 [Comments: Learning Your Way to Maximum Power]
• Steven G. Self and Kung-Yee Liang, "Asymptotic Properties of Maximum Likelihood Estimators and Likelihood Ratio Tests Under Nonstandard Conditions", Journal of the American Statistical Association 82 (1987): 605--610 [JSTOR]
• Tom Shively, Stephen Walker, "On the Equivalence between Bayesian and Classical Hypothesis Testing", arxiv:1312.0302
• Jeffrey S. Simonoff, Smoothing Methods in Statistics
• Spyros Skouras, "Decisionmetrics: Towards a Decision-Based Approach to Econometrics," SFI Working Paper 2001-11-064 [Applies far outside econometrics. If what you really want to do is to minimize a known loss function, optimizing a conventional accuracy measure, e.g. least squares, can be highly counterproductive.]
• Aris Spanos
• "The Curve-Fitting Problem, Akaike-type Model Selection, and the Error Statistical Approach" [Or: could your model selection tell you that Kepler is better than Ptolemy? Technical report, economics dept., Virginia Tech, 2006. PDF]
• "Where do statistical models come from? Revisiting the problem of specification", math.ST/0610849
• Yun Ju Sung, Charles J. Geyer, "Monte Carlo likelihood inference for missing data models", Annals of Statistics 35 (2007): 990--1011, arxiv:0708.2184
• Alexandre B. Tsybakov, Introduction to Nonparametric Estimation [Review]
• Sara van de Geer, Empirical Process Theory in M-Estimation [Finding non-asymptotic rates of convergence for common estimators]
• Quang H. Vuong, "Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses", Econometrica 57 (1989): 307--333
• Grace Wahba, Spline Models for Observational Data
• Michael E. Wall, Andreas Rechtsteiner and Luis M. Rocha, "Singular Value Decomposition and Principal Component Analysis," physics/0208101
• Michael D. Ward, Brian D. Greenhill and Kristin M. Bakke, "The perils of policy by p-value: Predicting civil conflicts", Journal of Peace Research 47 (2010): 363--375
• Larry Wasserman, "Low Assumptions, High Dimensions", RMM 2 (2011): 201--209
• Halbert White, Estimation, Inference and Specification Analysis [Review]
• Achilleas Zapranis and Apostolos-Paul Refenes, Principles of Neural Model Identification, Selection and Adequacy, with Applications to Financial Econometrics
• Sven Zenker, Jonathan Rubin, Gilles Clermont, "From Inverse Problems in Mathematical Physiology to Quantitative Differential Diagnoses", PLoS Computational Biology 3 (2007): e205
• Johanna F. Ziegel and Tilmann Gneiting, "Copula Calibration", arxiv:1307.7650
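(As a gloss on the Lindsay-Liu entry above: by Stein's lemma, the probability of failing to detect the false model q when the truth is p decays like exp(-n D(p||q)), so the sample size needed to reach power 1 - beta is roughly log(1/beta)/D(p||q). A back-of-the-envelope sketch for discrete distributions, my own illustration with hypothetical function names, not code from any of the references:)

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p||q), in nats, for discrete
    distributions given as aligned lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def samples_to_detect(p, q, beta=0.05):
    """Rough Lindsay-Liu-style adequacy index: sample size at which a
    likelihood-ratio test of the false model q against the truth p has
    type-II error about beta, since that error ~ exp(-n * D(p||q))."""
    return math.ceil(math.log(1 / beta) / kl_divergence(p, q))

# A fair coin, misdescribed as 90/10, takes only a handful of tosses:
print(samples_to_detect([0.5, 0.5], [0.9, 0.1], beta=0.05))  # 6
```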
Not altogether recommended:
• Peter J. Diggle and Amanda G. Chetwynd, Statistics and Scientific Method: An Introduction for Students and Researchers [A missed opportunity.]
To read:
• Felix Abramovich, Yoav Benjamini, David L. Donoho and Iain M. Johnstone, "Adapting to Unknown Sparsity by controlling the False Discovery Rate", math.ST/0505374 [I don't really care about sparsity, but they promise novel relations between the FDR control and asymptotic minimaxity and complexity-penalized model selection.]
• Gianfranco Adimari and Annamaria Guolo, "A note on the asymptotic behaviour of empirical likelihood statistics", Statistical Methods and Applications 19 (2010): 463--476
• Stéphanie Allassonniere, Estelle Kuhn, "Convergent Stochastic Expectation Maximization algorithm with efficient sampling in high dimension. Application to deformable template model estimation", arxiv:1207.5938
• Elizabeth S. Allman, Catherine Matias, John A. Rhodes, "Identifiability of parameters in latent structure models with many observed variables", Annals of Statistics 37 (2009): 3099--3132, arxiv:0809.5032
• Miguel A. Arcones, "Bahadur Efficiency of the Likelihood Ratio Test" [PDF preprint from 2005, presumably since published...]
• Barry C. Arnold et al., Conditional Specification of Statistical Models
• R. A. Bailey, Design of Comparative Experiments
• Sivaraman Balakrishnan, Alessandro Rinaldo, Don Sheehy, Aarti Singh, Larry Wasserman, "Minimax Rates for Homology Inference", arxiv:1112.5627
• Roger Barlow, "Asymmetric Errors", physics/0401042
• Ole E. Barndorff-Nielsen and David R. Cox, "Prediction and Asymptotics", Bernoulli 2 (1996): 319--340
• Ole E. Barndorff-Nielsen, David R. Cox and Claudia Klüppelberg (eds.), Complex Stochastic Systems
• Zvika Ben-Haim and Yonina C. Eldar, "The Cramer-Rao Bound for Sparse Estimation", arxiv:0905.4378
• Yoav Benjamini, Marina Bogomolov, "Adjusting for selection bias in testing multiple families of hypotheses", arxiv:1106.3670
• David R. Bickel
• "The Strength of Statistical Evidence for Composite Hypotheses: Inference to the Best Explanation" [Preprint]
• "Resolving conflicts between statistical methods by probability combination: Application to empirical Bayes analyses of genomic data", arxiv:1111.6174
• "A prior-free framework of coherent inference and its derivation of simple shrinkage estimators" [preprint]
• Peter J. Bickel, C. A. J. Klaassen, Y. Ritov and J. A. Wellner, Efficient and Adaptive Estimation for Semiparametric Models
• Peter J. Bickel and Bo Li, "Regularization in Statistics", Test 15 (2006): 271--344 [PDF reprint]
• Peter J. Bickel and Y. Ritov, "Non-Parametric Estimators Which Can Be 'Plugged-In'", UCB Stat. Tech. Rep. 602 [abstract, pdf]
• Lucien Birgé
• Gilles Blanchard, Sylvain Delattre, Etienne Roquain, "Testing over a continuum of null hypotheses", arxiv:1110.3599
• Michael Blum, "Approximate Bayesian Computation: a non-parametric perspective", arxiv:0904.0635
• Ingwer Borg and Patrick J. F. Groenen, Modern Multidimensional Scaling: Theory and Application
• A. R. Brazzale and A. C. Davison, "Accurate Parametric Inference for Small Samples", Statistical Science 23 (2008): 465--484 [Apparently, a preview for the book.]
• A. R. Brazzale, A. C. Davison and N. Reid, Applied Asymptotics: Case Studies in Small-Sample Statistics
• Trevor S. Breusch, "Hypothesis Testing in Unidentified Models", Review of Economic Studies 53 (1986): 635--651 [JSTOR]
• Adam D. Bull, "Honest adaptive confidence bands and self-similar functions", arxiv:1110.4985
• Florentina Bunea, Alexandre B. Tsybakov, Marten H. Wegkamp, Adrian Barbu, "Spades and Mixture Models", Annals of Statistics 38 (2010): 2525--2558, arxiv:0901.2044
• Dizza Bursztyn and David M. Steinberg, "Comparison of designs for computer experiments", Journal of Statistical Planning and Inference 136 (2006): 1103--1119
• T. Tony Cai, "Minimax and Adaptive Inference in Nonparametric Function Estimation", Statistical Science 27 (2012): 31--50
• T. Tony Cai and Mark G. Low, "An adaptation theory for nonparametric confidence intervals", Annals of Statistics 32 (2004): 1805--1840 = math.ST/0503662
• Emmanuel Candes and Terence Tao, "Near Optimal Signal Recovery from Random Projections and Universal Encoding Strategies", math.CA/0410542
• Herve Cardot, Andre Mas and Pascal Sarda, "CLT in Functional Linear Regression Models", math.ST/0508073
• Kamalika Chaudhuri and Daniel Hsu, "Convergence Rates for Differentially Private Statistical Estimation", arxiv:1206.6395
• Djalil Chafai and Didier Concordet, "On the strong consistency of approximated M-estimators", math.ST/0507102 [Sounds cool...]
• In Hong Chang and Rahul Mukerjee, "Asymptotic results on the frequentist mean squared error of generalized Bayes point predictors", Statistics and Probability Letters 67 (2004): 65--71 [Note to self: file this one under "de-Bayesing".]
• Sandra Chapman, George Rowlands and Nicholas Watkins
• "Extremum statistics: A framework for data analysis," cond-mat/0106015
• "Extremum Statistics and Signatures of Long Range Correlations," cond-mat/0106015
• "The relationship between extremum statistics and universal fluctuations," cond-mat/0007275
• Xiaohong Chen, Markus Reiss, "On rate optimality for ill-posed inverse problems in econometrics", arxiv:0709.2003 [Non-parametric instrumental variables?]
• Xinjia Chen, "Sequential Tests of Statistical Hypotheses with Confidence Limits", arxiv:1007.4278
• N. N. Chentsov, Statistical Decision Rules and Optimal Inference
• Zhiyi Chi, "Effects of statistical dependence on multiple testing under a hidden Markov model", Annals of Statistics 39 (2011): 439--473
• Christine Choirat and Raffaello Seri, "Estimation in Discrete Parameter Models", Statistical Science 27 (2012): 278--293
• Bertrand Clarke, "Desiderata for a Predictive Theory of Statistics", Bayesian Analysis 5 (2010): 1--36
• Sandy Clarke, Peter Hall, "Robustness of multiple testing procedures against dependence", Annals of Statistics 37 (2009): 332--358, arxiv:0903.0464
• Arthur Cohen and Harold B. Sackrowitz, "Decision theory results for one-sided multiple comparison procedures", math.ST/0504505 = Annals of Statistics 33 (2005): 126--144
• Arthur Cohen, Harold B. Sackrowitz, Minya Xu, "A new multiple testing method in the dependent case", arxiv:0906.3082 = Annals of Statistics 37 (2009) 1518--1544
• Daniel Commenges, "Statistical models: Conventional, penalized and hierarchical likelihood", Statistics Surveys 3 (2009): 1--17, arxiv:0808.4042
• Daniel Commenges, Helene Jacqmin-Gadda, Cecile Proust, and Jeremie Guedj, "A Newton-Like Algorithm for Likelihood Maximization: The Robust-Variance Scoring Algorithm", math.ST/0610402
• D. R. Cox and Nanny Wermuth, Multivariate Dependencies: Models, Analysis and Interpretation
• Anirban DasGupta, Asymptotic Theory of Statistics and Probability
• Alexandre d'Aspremont, Onureena Banerjee, Laurent El Ghaoui, "First-order methods for sparse covariance selection", math.OC/0609812
• I. Dattner, A. Goldenshluger, A. Juditsky, "On deconvolution of distribution functions", arxiv:1006.3918 ["nonparametric estimation of a continuous distribution function from observations with measurement errors... rate optimal estimators based on direct inversion of empirical characteristic function"]
• P. L. Davies
• "Data Features", Statistica Neerlandica 49 (1995): 185--245
• "Approximating Data", Journal of the Korean Statistical Society 37 (2008): 191--211 [With discussion and rejoinder. Open access?]
• P. L. Davies, A. Kovac and M. Meise, "Nonparametric Regression, Confidence Regions and Regularization", arxiv:0711.0690
• A. Philip Dawid, Steven de Rooij, Glenn Shafer, Alexander Shen, Nikolai Vereshchagin, Vladimir Vovk, "Martingales and p-values as measures of evidence", arxiv:0912.4269
• Pierpaolo De Blasi and Stephen G. Walker, "Bayesian Estimation of the Discrepancy with Misspecified Parametric Models", Bayesian Analysis 8 (2013): 781--800
• Aurore Delaigle, Peter Hall and Jiashun Jin, "Robustness and accuracy of methods for high dimensional data analysis based on Student's t-statistic", Journal of the Royal Statistical Society B forthcoming (2011)
• Joshua V Dillon, Guy Lebanon, "Stochastic Composite Likelihood", Journal of Machine Learning Research 11 (2010): 2597--2633, apparently the final version of arxiv:1003.0691
• David L. Donoho, "Estimation by epsilon-nets" (Le Cam Lecture, 2003; find citation)
• David L. Donoho and Richard C. Liu, "The 'Automatic' Robustness of Minimum Distance Functionals", Annals of Statistics 16 (1988): 552--586
• David L. Donoho and Jared Tanner, "Observed Universality of Phase Transitions in High-Dimensional Geometry, with Implications for Modern Data Analysis and Signal Processing", arxiv:0906.2530
• Mathias Drton, "Likelihood ratio tests and singularities", Annals of Statistics 37 (2009): 979--1012, arxiv:math.ST/0703360
• Mathias Drton and Seth Sullivant, "Algebraic statistical models", math.ST/0703609
• Jin-Chuan Duan and Andras Fulop, "A stable estimator of the information matrix under EM for dependent data", Statistics and Computing 21 (2011): 83--91
• John C. Duchi, Michael I. Jordan, Martin J. Wainwright, "Local Privacy and Statistical Minimax Rates", arxiv:1302.3203
• Morris L. Eaton, Multivariate Statistics: A Vector Space Approach ["a version of multivariate statistical theory in which vector space and invariance methods replace, to a large extent, more traditional multivariate methods"]
• Sam Efromovich
• Bradley Efron, "Size, power and false discovery rates", Annals of Statistics 35 (2007): 1351--1377, arxiv:0710.2245
• Werner Ehm, Jürgen Kornmeier, and Sven P. Heinrich, "Multiple testing along a tree", Electronic Journal of Statistics 4 (2010): 461--471 =? arxiv:0902.2296
• Thibault Espinasse, Paul Rochet, "A Cramér-Rao inequality for non differentiable models", arxiv:1204.2763
• Michael Evans and Gun Ho Jang, "Invariant P-values for model checking", Annals of Statistics 38 (2010): 512--525
• Jianqing Fan and Jian Zhang, "Sieve empirical likelihood ratio tests for nonparametric functions", Annals of Statistics 32 (2004): 1858--1907 = math.ST/0503667
• Stefano Favaro, Antonio Lijoi, and Igor Prünster, "Asymptotics for a Bayesian nonparametric estimator of species variety", Bernoulli 18 (2012): 1267--1283
• Thomas S. Ferguson, A Course in Large Sample Theory
• Jean-David Fermanian and Bernard Salanié "A Nonparametric Simulated Maximum Likelihood Estimation Method", Econometric Theory 20 (2004): 701--734
• Ana K. Fermin and Carenne Ludena, "A Statistical view of Iterative Methods for Linear Inverse Problems", math.ST/0504064
• Luisa Turrin Fernholz, von Mises Calculus for Statistical Functionals
• S. E. Fienberg, P. Hersh, A. Rinaldo and Y. Zhou, "Maximum Likelihood Estimation in Latent Class Models For Contingency Table Data", arxiv:0709.3535
• D. A. S. Fraser, N. Reid, E. Marras and G. Y. Yi, "Default priors for Bayesian and frequentist inference", Journal of the Royal Statistical Society B 72 (2010): 631--654
• A. Fraysse, "Why minimax is not that pessimistic", arxiv:0902.3311 [Because, apparently, learning a generic function is just as hard as minimax leads you to think. Bummer if true.]
• Magalie Fromont and Béatrice Laurent, "Adaptive goodness-of-fit tests in a density model", Annals of Statistics 34 (2006): 680--720, math.ST/0607013
• Kenji Fukumizu, Le Song, Arthur Gretton, "Kernel Bayes' rule", arxiv:1009.5736 ["A kernel method is proposed for realizing Bayes' rule, based on representations of probability distributions in reproducing kernel Hilbert spaces (RKHS). The empirical RKHS embeddings of the conditional probabilities and prior are expressed as feature mappings of samples, and an RKHS embedding of the posterior distribution is computed, again based on a feature mapping of a sample."]
• Axel Gandy and Patrick Rubin-Delanchy, "An algorithm to compute the power of Monte Carlo tests with guaranteed precision", Annals of Statistics 41 (2013): 125--142, arxiv:1110.1248
• Surya Ganguli and Haim Sompolinsky, "Statistical Mechanics of Compressed Sensing", Physical Review Letters 104 (2010): 188701
• Seymour Geisser, Predictive Inference
• Christopher R. Genovese and Larry Wasserman
• Josep Ginebra, "On the Measure of the Information in a Statistical Experiment", Bayesian Analysis (2007): 167--212
• Tilmann Gneiting, Fadoua Balabdaoui and Adrian E. Raftery, "Probabilistic forecasts, calibration and sharpness", Journal of the Royal Statistical Society 69 (2007): 243--268
• Tilmann Gneiting and Roopesh Ranjan, "Combining predictive distributions", Electronic Journal of Statistics 7 (2013): 1747--1782
• Yuri Golubev, Vladimir Spokoiny, "Exponential bounds for minimum contrast estimators", arxiv:0901.0655
• Grassberger and Nadal (eds.), From Statistical Physics to Statistical Inference and Back
• Ulf Grenander, Abstract Inference
• Robert Hable, "Asymptotic Normality of Support Vector Machines for Classification and Regression", arxiv:1010.0535
• Peter Hall, Hans-Georg Müller, Fang Yao, "Estimation of functional derivatives", Annals of Statistics 37 (2009): 3307--3329, arxiv:0909.1157
• Marc Hallin, Davy Paindaveine, and Miroslav Siman, "Multivariate quantiles and multiple-output regression quantiles: From L1 optimization to halfspace depth", Annals of Statistics 38 (2010): 635--669 [with discussion]
• Wolfgang Härdle, Marlene Müller, Stefan Sperlich and Axel Werwatz, Nonparametric and Semiparametric Models: An Introduction [Full text online]
• Matthew T. Harrison, "Valid p-Values using Importance Sampling", arxiv:1104.2910
• Heng Lian, "Empirical Likelihood Confidence Intervals for Nonparametric Functional Data Analysis", arxiv:0904.0843
• David A. Hensher et al., Applied Choice Analysis: A Primer ["Application of quantitative statistical methods to study choices made by individuals"]
• Tim Hesterberg, Nam Hee Choi, Lukas Meier, Chris Fraley, "Least angle and $\ell_1$ penalized regression: A review", Statistics Surveys 2 (2008): 61--93, arxiv:0802.0964
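As a small illustration of the method surveyed in the Hesterberg et al. review above (my own sketch, not taken from the paper): the $\ell_1$ penalty can be minimized by cyclic coordinate descent with soft-thresholding, and it drives irrelevant coefficients exactly to zero.

```python
import numpy as np

def soft_threshold(x, t):
    """Soft-thresholding operator, the proximal map of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_cd(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)||y - X b||^2 + alpha*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    r = y.astype(float).copy()        # residual y - X b (b starts at 0)
    z = (X ** 2).sum(axis=0) / n      # per-coordinate curvature
    for _ in range(n_iter):
        for j in range(p):
            # partial residual correlation for coordinate j
            rho = X[:, j] @ (r + X[:, j] * b[j]) / n
            b_new = soft_threshold(rho, alpha) / z[j]
            r += X[:, j] * (b[j] - b_new)
            b[j] = b_new
    return b

# synthetic sparse problem: only the first three coefficients are nonzero
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
beta = np.zeros(10)
beta[:3] = [3.0, -2.0, 1.5]
y = X @ beta + 0.05 * rng.normal(size=200)

b_hat = lasso_cd(X, y, alpha=0.1)
# b_hat recovers the sparsity pattern, with the usual shrinkage bias of
# roughly alpha on the nonzero coefficients
```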
• David Hinkley, "Predictive Likelihood", Annals of Statistics 7 (1979): 718--728
• Nils Lid Hjort, Ian W. McKeague, Ingrid Van Keilegom, "Extending the scope of empirical likelihood", arxiv:0904.2949 = Annals of Statistics 37 (2009): 1079--1111
• Peter D. Hoff, "A hierarchical eigenmodel for pooled covariance estimation", Journal of the Royal Statistical Society B 71 (2009): 971--992
• Peter Hoff, Jon Wakefield, "Bayesian sandwich posteriors for pseudo-true parameters", arxiv:1211.0087
• Marc Hoffmann and Richard Nickl, "On adaptive inference and confidence bands", Annals of Statistics 39 (2011): 2383--2409
• Torsten Hothorn, Thomas Kneib, Peter Bühlmann, "Conditional transformation models", Journal of the Royal Statistical Society B forthcoming
• Joel L. Horowitz, Semiparametric and Nonparametric Methods in Econometrics
• Serkan Hosten, Amit Khetan and Bernd Sturmfels, "Solving the Likelihood Equations", math.ST/0408270
• Ping-Hung Hsieh, "A nonparametric assessment of model adequacy based on Kullback-Leibler divergence", Statistics and Computing 23 (2013): 149--162
• Mia Hubert, Peter J. Rousseeuw, Stefan Van Aelst, "High-Breakdown Robust Multivariate Methods", Statistical Science 23 (2008): 92--119, arxiv:0808.0657
• Alexander Ilin, Tapani Raiko, "Practical Approaches to Principal Component Analysis in the Presence of Missing Values", Journal of Machine Learning Research 11 (2010): 1957--2000
• Stefano M. Iacus and Davide La Torre
• "Approximating Distribution Functions by Iterated Function Systems," math.PR/0111152
• "Nonparametric estimation of distribution and density functions in presence of missing data: an IFS approach," math.PR/0302016
• Thomas Jaki and R. Webster West, "Maximum Kernel Likelihood Estimation", Journal of Computational and Graphical Statistics 17 (2008): 976--993
• Jiantao Jiao, Kartik Venkat, Tsachy Weissman, "Maximum Likelihood Estimation of Functionals of Discrete Distributions", arxiv:1406.6959
• Adam M. Johansen, Arnaud Doucet and Manuel Davy, "Particle methods for maximum likelihood estimation in latent variable models", Statistics and Computing 18 (2008): 47--57
• Ana Justel, Daniel Pena, Ruben Zamar, "A multivariate Kolmogorov-Smirnov test of goodness of fit", Statistics and Probability Letters 35 (1997): 251--259 [PDF reprint via Prof. Pena]
• Paul Kabaila and Kreshna Syuhada, "The Asymptotic Efficiency of Improved Prediction Intervals", arxiv:0901.1911
• Ata Kaban, "Non-parametric detection of meaningless distances in high dimensional data", Statistics and Computing 22 (2011): 375--385
• Oscar Kempthorne, "The classical problem of inference--goodness of fit", Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, 235--249
• D. F. Kerridge, "Inaccuracy and Inference", Journal of the Royal Statistical Society B 23 (1961): 184--194
• Yuichi Kitamura, "Empirical likelihood methods in econometrics: Theory and Practice", Cowles Foundation Discussion Paper No. 1569 (2006)
• Ioannis Kontoyiannis and S. P. Meyn, "Computable exponential bounds for screened estimation and simulation", Annals of Applied Probability 18 (2008): 1491--1518, arxiv:math/0612040
• Jim Kuelbs and Anand N. Vidyashankar, "Asymptotic inference for high-dimensional data", Annals of Statistics 38 (2010): 836--869
• Solomon Kullback, "Probability densities with given marginals," Annals of Mathematical Statistics 39 (1968): 1236--1243
• Masayuki Kumon and Akimichi Takemura, "On a simple strategy weakly forcing the strong law of large numbers in the bounded forecasting game", math.PR/0508190 ["In the framework of the game-theoretic probability of Shafer and Vovk (2001) ... construct an explicit strategy weakly forcing the strong law of large numbers (SLLN) in the bounded forecasting game. ... simple finite-memory strategy based on the past average of Reality's moves, which weakly forces the strong law of large numbers with the convergence rate of $O(\sqrt{\log n/n})$.... We show that if Reality violates SLLN, then the exponential growth rate of Skeptic's capital process is explicitly described in terms of the Kullback divergence between the average of Reality's moves when she violates SLLN and the average when she observes SLLN."]
• Tze Leung Lai, Shulamith T. Gross, and David Bo Shen, "Evaluating probability forecasts", Annals of Statistics 39 (2011): 2356--2382, arxiv:1202.5140
• Mikhail Langovoy, "Data-driven goodness-of-fit tests", arxiv:0708.0169
• Guy Lebanon, The Analysis of Data
• Youngjo Lee and John A. Nelder, "Likelihood Inference for Models with Unobservables: Another View", Statistical Science 24 (2009): 255--269, arxiv:1010.0303 [with discussion and replies following]
• E. L. Lehmann and Joseph P. Romano, "Generalizations of the Familywise Error Rate", math.ST/0507420 = Annals of Statistics 33 (2005): 1138--1154
• Matthieu Lerasle, "Adaptive non-asymptotic confidence balls in density estimation", arxiv:1007.4528
• M. Lerasle, R. I. Oliveira, "Robust Empirical Mean Estimators", arxiv:1112.3914
• Bo Li and Marc G. Genton, "Nonparametric Identification of Copula Structures", Journal of the American Statistical Association 108 (2013): 666--675
• Feng Liang, Sayan Mukherjee, Mike West, "The Use of Unlabeled Data in Predictive Modeling", arxiv:0710.4618 = Statistical Science 22 (2007): 189--205
• Percy Liang, Francis Bach, Guillaume Bouchard and Michael I. Jordan, "Asymptotically Optimal Regularization in Smooth Parametric Models" [PDF preprint via Prof. Jordan]
• Victor S. L'vov, Anna Pomyalov and Itamar Procaccia, "Outliers, Extreme Events and Multiscaling," nlin.CD/0009049
• Christian K. Machens, "Adaptive sampling by information maximization," physics/0112070
• Edouard Machery, "Power and Negative Results", Philosophy of Science 79 (2012): 808--820
• Robert Mariano, Til Schuermann and Melvyn J. Weeks (eds.), Simulation-Based Inference in Econometrics: Methods and Applications
• Ryan Martin, Chuanhai Liu, "Inferential models: A framework for prior-free posterior probabilistic inference", arxiv:1206.4091
• McCabe and Tremayne, Modern Asymptotic Theory
• P. McCullagh
• Tensor Methods in Statistics
• "What is a statistical model?", Annals of Statistics 30 (2002): 1225--1310 [With commentary and rejoinder. Not sure, from a superficial glance, if there's really anything to this or it's just what one of my mentors calls "algebraic noodling".]
• Nicolai Meinshausen and John Rice, "Estimating the proportion of false null hypotheses among a large number of independently tested hypotheses", math.ST/0501289
• Alexander Meister, Deconvolution Problems in Nonparametric Statistics ["e.g., density estimation based on contaminated data, errors-in-variables regression, and image reconstruction"]
• K. L. Mengersen, P. Pudlo, C. P. Robert, "Bayesian computation via empirical likelihood", arxiv:1205.5658
• Vladimir N. Minin, John D. O'Brien, Arseni Seregin, "Empirically corrected estimation of complete-data population summaries under model misspecification", arxiv:0911.0930
• David Mumford and Agnes Desolneux, Pattern Theory: The Stochastic Analysis of Real-World Signals
• Richard Nickl, "Donsker-type theorems for nonparametric maximum likelihood estimators", Probability Theory and Related Fields 138 (2007): 411--449
• Andrey Novikov, "Sequential multiple hypothesis testing in presence of control variables", Kybernetika 45 (2009): 507--528, arxiv:0812.2712
• Wojciech Olszewski, Alvaro Sandroni, "A nonmanipulable test", arxiv:0904.0338 = Annals of Statistics 37 (2009): 1013--1039
• Giulio Palombo, "Multivariate Goodness of Fit Procedures for Unbinned Data: An Annotated Bibliography", arxiv:1102.2407
• Leandro Pardo, Statistical Inference Based on Divergence Measures
• Hanxiang Peng and Anton Schick, "Empirical likelihood approach to goodness of fit testing", Bernoulli 19 (2013): 954--981
• William Perkins, Mark Tygert, Rachel Ward, "Significance testing without truth", arxiv:1301.1208
• Donald A. Pierce and Dawn Peters, "Improving on exact tests by approximate conditioning", Biometrika 86 (1999): 265--277 [Via Andrew Gelman's blog; PDF reprint]
• Tomaz Podobnik and Tomi Zivko, "On Consistent and Calibrated Inference about the Parameters of Sampling Distributions", physics/0508017
• Thorsten Poeschel, Werner Ebeling, and Helge Rose, "Guessing probability distributions from small samples," cond-mat/0203467 = Journal of Statistical Physics 80 (1995): 1443
• David Pollard, "Some thoughts on Le Cam's statistical decision theory", arxiv:1107.3811
• Benedikt M. Pötscher, "Confidence Sets Based on Sparse Estimators are Necessarily Large", arxiv:0711.1036
• Joel Predd, Robert Seiringer, Elliott H. Lieb, Daniel Osherson, Vincent Poor, Sanjeev Kulkarni, "Probabilistic coherence and proper scoring rules", IEEE Transactions on Information Theory 55 (2009): 4786, arxiv:0710.3183
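A minimal numerical check of the key property behind the Predd et al. paper above (my sketch, not theirs): the Brier score $(q - Y)^2$ is a proper scoring rule, so its expectation under $Y \sim \mathrm{Bernoulli}(p)$ is minimized by reporting $q = p$ itself.

```python
# Expected Brier score of reporting probability q for a binary outcome
# Y with P(Y = 1) = p: E[(q - Y)^2] = p(q-1)^2 + (1-p)q^2.
def expected_brier(report, p):
    return p * (report - 1.0) ** 2 + (1.0 - p) * report ** 2

p_true = 0.3
grid = [q / 100 for q in range(101)]
# searching over a grid of possible reports, the honest report wins
best_report = min(grid, key=lambda q: expected_brier(q, p_true))
```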
• Ramsay and Silverman, Functional Data Analysis
• C. Radhakrishna Rao, "Diversity: Its measurement, decomposition, apportionment and analysis", Sankhya: The Indian Journal of Statistics 44(A) (1982): 1--22 [Sankhya is not in JSTOR! Why is Sankhya not in JSTOR?!?!]
• R.-D. Reiss and M. Thomas, Statistical Analysis of Extreme Values: With Applications to Insurance, Finance, Hydrology and Other Fields
• Irina Rish, Sparse Modeling: Theory, Algorithms, and Applications
• James Robins, Lingling Li, Eric Tchetgen, Aad van der Vaart, "Higher order influence functions and minimax estimation of nonlinear functionals", arxiv:0805.3040
• James Robins and Aad van der Vaart, "Adaptive nonparametric confidence sets", Annals of Statistics 34 (2006): 229--253, arxiv:math/0605473 ["We construct honest confidence regions for a Hilbert space-valued parameter in various statistical models. The confidence sets can be centered at arbitrary adaptive estimators, and have diameter which adapts optimally to a given selection of models."]
• Sylvain Rubenthaler, Tobias Ryden and Magnus Wiktorsson, "Fast simulated annealing in $\mathbb{R}^d$ and an application to maximum likelihood estimation", math.PR/0609353
• Birgit Rudloff, Ioannis Karatzas, "Testing composite hypotheses via convex duality", Bernoulli 16 (2010): 1224--1239, arxiv:0809.4297
• Susanne M. Schennach, "Point estimation with exponentially tilted empirical likelihood", Annals of Statistics 35 (2007): 634--672, arxiv:0708.1874
• Emilio Seijo and Bodhisattva Sen, "A continuous mapping theorem for the smallest argmax functional", Electronic Journal of Statistics 5 (2011): 421--439
• Galit Shmueli, "To Explain or to Predict?", Statistical Science 25 (2010): 289--310, arxiv:1101.0891
• Haochang Shou, Russell T. Shinohara, Han Liu, Daniel Reich, and Ciprian Crainiceanu, "Soft Null Hypotheses: A Case Study of Image Enhancement Detection in Brain Lesions", Johns Hopkins University, Dept. of Biostatistics Working Paper 257
• Ricardo Silva, Robert B. Gramacy, "Gaussian Process Structural Equation Models with Latent Variables", arxiv:1002.4802 [Heard the talk at UAI 2010, but I want the details.]
• Kesar Singh, Minge Xie, William E. Strawderman, "Confidence distribution (CD) -- distribution estimator of a parameter", pp. 132--150 in Regina Liu, William Strawderman and Cun-Hui Zhang (eds.), Complex Datasets and Inverse Problems: Tomography, Networks and Beyond
• Karline Soetaert, Thomas Petzoldt, "Inverse Modelling, Sensitivity and Monte Carlo Analysis in R Using Package FME", Journal of Statistical Software 33 (2010): 3
• Jascha Sohl-Dickstein, "The Natural Gradient by Analogy to Signal Whitening, and Recipes and Tricks for its Use", arxiv:1205.1828
• Jascha Sohl-Dickstein, Peter Battaglino, Michael R. DeWeese, "Minimum Probability Flow Learning", arxiv:0906.4779
• Christopher G. Small, The Statistical Theory of Shape
• Aris Spanos
• Pablo Sprechmann, Ignacio Ramírez, Guillermo Sapiro, Yonina Eldar, "C-HiLasso: A Collaborative Hierarchical Sparse Modeling Framework", arxiv:1006.1346
• Johan A.K. Suykens, Carlos Alzate, and Kristiaan Pelckmans, "Primal and dual model representations in kernel-based learning", Statistics Surveys 4 (2010): 148--183
• Olivier Thas, Comparing Distributions [mostly about goodness-of-fit tests]
• F. V. Tkachov, "Quasi-optimal observables: Attaining the quality of maximal likelihood in parameter estimation when only a MC event generator is available," physics/0108030
• Samuel Vaiter, Mohammad Golbabaee, Jalal Fadili, Gabriel Peyré, "Model Selection with Piecewise Regular Gauges", arxiv:1307.2342
• Mark J. van der Laan and Sherri Rose, Targeted Learning: Causal Inference for Observational and Experimental Data
• Aki Vehtari and Janne Ojanen, "A survey of Bayesian predictive methods for model assessment, selection and comparison", Statistics Surveys 6 (2012): 142--228
• Guenther Walther, "The Average Likelihood Ratio for Large-scale Multiple Testing and Detecting Sparse Mixtures", arxiv:1111.0328
• Xiaogang Wang and James V. Zidek, "Selecting likelihood weights by cross-validation", math.ST/0505599 = Annals of Statistics 33 (2005): 463--500 ["The (relevance) weighted likelihood was introduced to formally embrace a variety of statistical procedures that trade bias for precision. Unlike its classical counterpart, the weighted likelihood combines all relevant information while inheriting many of its desirable features including good asymptotic properties. However, in order to be effective, the weights involved in its construction need to be judiciously chosen. Choosing those weights is the subject of this article in which we demonstrate the use of cross-validation. We prove the resulting weighted likelihood estimator (WLE) to be weakly consistent and asymptotically normal. An application to disease mapping data is demonstrated." Sounds interesting...]
• Fabian L. Wauthier, Michael I. Jordan, "Heavy-Tailed Processes for Selective Shrinkage", arxiv:1006.3901
• Holger Wendland, Scattered Data Approximation
• Halbert White, Asymptotic Theory for Econometricians [Useful source, it seems, for non-IID central limit theorems]
• Christopher K. I. Williams, "How to Pretend That Correlated Variables Are Independent by Using Difference Observations", Neural Computation 17 (2005): 1--6 ["In many areas of data modeling, observations at different locations (e.g., time frames or pixel locations) are augmented by differences of nearby observations.... These augmented observations are then often modeled as being independent. How can this make sense? We provide two interpretations, showing (1) that the likelihood of data generated from an autoregressive process can be computed in terms of 'independent' augmented observations and (2) that the augmented observations can be given a coherent treatment in terms of the products of experts model..."]
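Interpretation (1) in the Williams annotation above can be verified numerically in the simplest case (my sketch, assuming a Gaussian random walk, i.e. an autoregression with unit coefficient): the exact joint log-likelihood of $x_2, \ldots, x_n$ given $x_1$ coincides with the sum of independent Gaussian log-densities of the difference observations.

```python
import numpy as np

# random walk x_t = x_{t-1} + e_t with Gaussian increments of s.d. sigma
rng = np.random.default_rng(1)
sigma = 0.7
n = 50
x = np.concatenate([[0.0], np.cumsum(sigma * rng.normal(size=n))])

# (1) treat the differences as independent N(0, sigma^2) observations
d = np.diff(x)
ll_diff = np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - d**2 / (2 * sigma**2))

# (2) exact joint Gaussian density of x_2..x_{n+1} given x_1; conditional
# on x_1 the covariance of (x_{1+s} - x_1, x_{1+t} - x_1) is sigma^2*min(s,t)
idx = np.arange(1, n + 1)
S = sigma**2 * np.minimum.outer(idx, idx)
resid = x[1:] - x[0]
_, logdet = np.linalg.slogdet(S)
ll_joint = -0.5 * (n * np.log(2 * np.pi) + logdet
                   + resid @ np.linalg.solve(S, resid))
# ll_joint and ll_diff agree up to numerical error
```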
• Wei Biao Wu, "On false discovery control under dependence", Annals of Statistics 36 (2008): 364--380, arxiv:0903.1971
• Yuhong Yang and Andrew Barron, "Information-theoretic determination of minimax rates of convergence", Annals of Statistics 27 (1999): 1564--1599

Notebooks:     Hosted, but not endorsed, by the Center for the Study of Complex Systems