David Draper (last update 28 October 2001)
Table of Contents
(0. Under the pressure of other commitments this web page has fallen
somewhat out of date and is in serious need of updating; my apologies.)
1. Recent and upcoming events and news items of possible interest:
-
I am the Chair of the newly forming Department of Applied Mathematics
and Statistics (AMS) in the School of Engineering at the University
of California, Santa Cruz (I am a statistician).
AMS is offering a new course next quarter (beginning in the week of 26-30
March 2001) on Bayesian statistical methods and
reasoning, which I will be teaching.
This is a potentially exciting topic for both undergraduate and
graduate students with a variety of interests in science and
engineering, because
-
uncertainty is pervasive in scientific and engineering
problem-solving;
-
the Bayesian approach to the quantification of uncertainty is more
flexible and general than other (e.g., relative frequency) approaches. For
a long time the Bayesian approach was limited in applications by an
inability to perform high-dimensional numerical integrations; but
-
with the advent of powerful computers and new simulation-based
techniques over the past 10 years, the computing problem is now solved
and there has been a revolution in Bayesian methods and applications.
Partly because (i) we are new, (ii) I am unfamiliar with how to market such
courses, and (iii) I may have set the initial prerequisites too high given
that there has not been much teaching of statistics on campus in the past,
the pre-enrollment for this course is very low, and it is in danger of
being canceled.
I am going to be very liberal in interpreting the "permission of
instructor" prerequisite as a way of inviting all students who might be
interested to come to the first few classes -- I will see who shows up and
what their backgrounds are, and will tailor the class to the
enrollment.
If you wish to enroll in this class, please contact me by email at draper@ams.ucsc.edu or in
person (Baskin Engineering 147) -- I have lots of permission codes.
A description of the course follows.
ENG 181: Bayesian Statistics
Spring 2001
Instructor: David Draper
Prerequisites: permission of instructor
This course will provide an introduction to Bayesian statistical methods
for inference and prediction.
Statistics is the study of uncertainty - how to measure it, and what to
do about it. As such, it is of potential interest in many (virtually
all?) aspects of science and decision-making. Of the two main ways to
quantify uncertainty -- involving relative frequency and
subjective (Bayesian) notions of probability -- the second way is
more flexible and general, but for a long time the Bayesian approach was
limited in applications by an inability to perform high-dimensional
numerical integrations. With the advent of powerful computers
and new simulation-based techniques over the past 10 years, the
computing problem is now solved, and there has been a revolution in
Bayesian methods and applications.
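As a toy illustration of the point above (my own sketch, not course material): in a conjugate Gaussian problem the posterior summary can be computed exactly, so we can check that a simulation-based average reproduces the answer that would otherwise require integration. The data values, prior settings, and variable names below are all hypothetical.

```python
import math
import random

# Hypothetical model: y_i ~ Normal(theta, 1), prior theta ~ Normal(0, 10^2).
# This conjugate case has a known closed-form posterior, so it lets us
# verify that simulation recovers the exact answer.
random.seed(1)
y = [1.2, 0.8, 1.5, 1.1, 0.9]
n = len(y)
prior_mean, prior_sd, sigma = 0.0, 10.0, 1.0

# Exact conjugate posterior for a Gaussian mean with known variance
post_var = 1.0 / (1.0 / prior_sd**2 + n / sigma**2)
post_mean = post_var * (prior_mean / prior_sd**2 + sum(y) / sigma**2)

# Simulation-based answer: draw from the posterior and average the draws
draws = [random.gauss(post_mean, math.sqrt(post_var)) for _ in range(100_000)]
mc_mean = sum(draws) / len(draws)

print(round(post_mean, 3), round(mc_mean, 3))  # the two should agree closely
```

In realistic (non-conjugate, high-dimensional) problems there is no closed form to fall back on, which is exactly where MCMC-style simulation earns its keep.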
The course will be methodological but will be guided by a series of
real-world case studies. The first half of the course will involve
symbolic mathematical calculations in the computer package Maple,
and statistical analyses and graphics in the package R; contemporary
Bayesian computation using the package WinBUGS will feature
prominently in the second half.
The instructor, who has won or been nominated for major teaching awards
at four leading universities in the US and England, will survey the
background of the initial participants in the course in mathematics and
probability to decide how the course should be run for maximal benefit of
the participants. He intends to provide a course that will be
interesting and profitable for a variety of students at both the
undergraduate and graduate levels.
The week-by-week breakdown of topics to be covered is as follows.
-
Week 1: Probability as quantification of uncertainty about
observables. Discrete outcomes; single-parameter problems. Case
Study: Hospital-specific prediction of mortality for heart attack
patients.
-
Week 2: Exchangeability as a Bayesian concept parallel to
frequentist independence. Prior, likelihood, posterior, and predictive
distributions.
-
Week 3: Inference and prediction; coherence and calibration.
Conjugate analysis. Software: Maple.
-
Week 4: Comparison with frequentist modeling. Continuous outcomes.
Gaussian models. Software: R.
-
Week 5: Multiparameter problems. Integrating over nuisance
parameters. Case Study: Measurement of physical constants.
-
Week 6: Markov chains. Introduction to Markov Chain Monte Carlo
(MCMC) methods.
-
Week 7: Gibbs sampling and Metropolis-Hastings sampling. Software:
WinBUGS.
-
Week 8: Hierarchical modeling. Case study: Poisson random-effects
modeling: A controlled experiment to assess effectiveness of in-home
geriatric assessment.
-
Week 9: Bayesian model diagnostics. Model expansion and
cross-validation.
-
Week 10: Bayesian model selection and sensitivity analysis.
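The MCMC material of weeks 6-7 can be previewed with a minimal random-walk Metropolis sampler. This is my own sketch (the course uses WinBUGS, not hand-rolled Python); the target, proposal scale, and chain length are arbitrary choices for illustration.

```python
import math
import random

# Minimal Metropolis sampler: draw from a standard Normal target using a
# symmetric random-walk proposal, then check the sample moments.
random.seed(2)

def log_target(theta):
    # log density of N(0, 1), up to an additive constant
    return -0.5 * theta * theta

theta, draws = 0.0, []
for _ in range(50_000):
    proposal = theta + random.gauss(0.0, 1.0)      # random-walk proposal
    log_alpha = log_target(proposal) - log_target(theta)
    if math.log(random.random()) < log_alpha:      # accept/reject step
        theta = proposal
    draws.append(theta)                            # keep current state either way

burned = draws[10_000:]                            # discard burn-in
mean = sum(burned) / len(burned)
var = sum((d - mean) ** 2 for d in burned) / len(burned)
print(round(mean, 2), round(var, 2))  # should be near 0 and 1
```

Gibbs sampling (also week 7) replaces the accept/reject step with exact draws from each full conditional distribution in turn; the Metropolis version shown here needs only the ability to evaluate the target density up to a constant.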
Reading List
The main textbook for the course will be
-
Gelman A, Carlin JB, Stern HS, Rubin DB (1995). Bayesian Data
Analysis. New York: Chapman & Hall.
Supplementary reading will be taken from
-
Gilks WR, Richardson S, Spiegelhalter DJ (1996). Markov Chain Monte
Carlo in Practice. New York: Chapman & Hall.
and from
-
Draper D (2001). Bayesian Hierarchical Modeling. New York:
Springer-Verlag (forthcoming).
Evaluation
There will be homework assignments (more like small take-home tests)
given out in weeks 2, 4, 6, and 8, each due one week later; these will
blend paper-and-pen work, symbolic computing, and statistical and MCMC
calculations. A
take-home final exam will be assigned in week 9 and due at the end
of the final examination period.
-
I gave a tutorial on Bayesian hierarchical modeling at the ISBA
2000 meeting in Crete in May 2000. A revised PostScript version of
this tutorial - which fills in most of the blank pages in the previous
version - is available for downloading here. There are
still a few blank pages in this version; these are just placeholders for
some screen shots from using WinBUGS to fit the models whose
analysis I illustrate (I couldn't figure out how to incorporate the screen
shots into the PostScript document). If you have any comments on the
tutorial I would be interested to hear them at draper@ams.ucsc.edu.
-
I gave a one-day short course on Bayesian hierarchical modeling at
the Interface 2000 meeting (the 32nd symposium on the interface between
computer science and statistics) in New Orleans in April 2000. More details
about the meeting are available here, and a complete
description of the course may be found here. The
course was co-sponsored by LearnStat, the
continuing education branch of the American Statistical Association.
At this year's Interface 2001
meeting in southern California, I will give a
two-day short course on Bayesian hierarchical modeling, with the
first day on introductory and intermediate topics and the
second day on advanced issues. This is scheduled to take place on
(Mon-Tue) 11-12 June 2001.
-
Drafts of articles and book chapters recently finished,
available for downloading:
-
A comparison of Bayesian and likelihood-based methods for fitting
multilevel models (PostScript format)
(with Browne WJ; October 2000): submitted (a substantially revised new
version of the previous August 1999 paper). (We use
simulation studies, whose design is realistic for educational and medical
research, to compare Bayesian and likelihood-based methods for fitting
variance-components (VC) and random-effects logistic regression (RELR)
models. The likelihood (and approximate likelihood) approaches we examine
are based on the methods most widely used in current applied multilevel
analyses: maximum likelihood (ML) and restricted ML (REML) for Gaussian
outcomes, and marginal and penalised quasi-likelihood (MQL and PQL) for
Bernoulli outcomes. Our Bayesian methods use Markov chain Monte Carlo
(MCMC) estimation, with adaptive hybrid Metropolis-Gibbs sampling for RELR
models, and several diffuse prior distributions (inverse gamma and uniform
priors for variance components). For evaluation criteria we consider bias
of point estimates and nominal versus actual coverage of interval
estimates. In two-level VC models we find that (a) both likelihood-based
and Bayesian approaches can be made to produce approximately unbiased
estimates, although the automatic manner in which REML achieves this is an
advantage, but (b) both approaches had difficulty achieving nominal
coverage in small samples and with extreme values of the variance
parameters (as measured by the ratio of variances at levels 1 and 2). With
three-level RELR models we find that (c) quasi-likelihood methods for
estimating random-effects variances performed badly with respect to bias
and coverage in the example we simulated, and (d) Bayesian diffuse-prior
methods lead to well-calibrated point and interval RELR estimates. Given
that the likelihood-based methods we study are considerably faster
computationally than MCMC and that a number of models are typically fit
during the model exploration phase of a multilevel study, one possible
analytic strategy suggested by our results is a hybrid of likelihood-based
and Bayesian methods, with (i) REML and quasi-likelihood estimation (for
their computational speed) during model exploration and (ii) diffuse-prior
Bayesian estimation using MCMC to produce final inferential results. Other
analytic strategies based on less approximate likelihood methods are also
possible but would benefit from further study of the type summarised
here.)
-
Bayesian
and likelihood methods for fitting multilevel models with complex level-1
variation (Browne WJ, Draper D, Goldstein H, Rasbash J; July 2000;
PostScript, 404K; prints 21 pages): submitted. (In
multilevel modeling it is common practice to assume constant variance at
level 1 across individuals. In this paper we consider situations where the
level-1 variance depends on predictor variables. We examine two cases using
a dataset from educational research; in the first case the variance at
level 1 of a test score depends on a continuous "intake score" predictor,
and in the second case the variance is assumed to differ according to
gender. We contrast two maximum-likelihood methods based on iterative
generalized least squares with two MCMC methods based on adaptive hybrid
versions of the Metropolis-Hastings (MH) algorithm, and we use two
simulation experiments to compare these four methods. We find that all four
approaches have good repeated-sampling behavior in the classes of models we
simulate. We conclude by contrasting raw- and log-scale formulations of
the level-1 variance function, and we find that adaptive MH sampling is
considerably more efficient than adaptive rejection sampling when the
heteroscedasticity is modeled polynomially on the log scale.)
-
A Case Study of
Stochastic Optimization in Health Policy: Problem Formulation and
Preliminary Results (with Fouskakis D; May 2000; PostScript, 582K;
prints 22 pages): Journal of Global Optimization, forthcoming.
(We use Bayesian decision theory to address a variable
selection problem arising in attempts to indirectly measure the quality of
hospital care, by comparing observed mortality rates to expected values
based on patient sickness at admission. Our method weighs data collection
costs against predictive accuracy to find an optimal subset of the
available admission sickness variables. The approach involves maximizing
expected utility across possible subsets, using Monte Carlo methods based
on random division of the available data into N modeling and
validation splits to approximate the expectation. After exploring the
geometry of the solution space, we compare a variety of stochastic
optimization methods - including genetic algorithms (GA), simulated
annealing (SA), threshold acceptance (TA), messy simulated annealing (MSA),
and tabu search (TS) - on their performance in finding good subsets of
variables, and we clarify the role of N in the optimization.
Preliminary results indicate that TS is somewhat better than TA and SA in
this problem, with MSA and GA well behind the other three methods.
Sensitivity analysis reveals broad stability of our conclusions.)
-
Implementation and
performance issues in the Bayesian and likelihood fitting of multilevel
models (with Browne WJ; December 1999; PostScript, 525K):
Computational Statistics, forthcoming. (Explores
Bayesian and likelihood fitting methods, in terms of validity of
conclusions, in two-level random-slopes regression (RSR) models, and
compares several Bayesian fitting methods based on Markov chain Monte
Carlo, in terms of computational efficiency, in random-effects logistic
regression (RELR) models.)
-
Model
uncertainty yes, discrete model averaging maybe (September 1999):
comment on "Bayesian model averaging: a tutorial," by Hoeting JA, Madigan
D, Raftery AE, Volinsky CT, Statistical Science, forthcoming. (Argues that variable selection uncertainty in generalized linear
models should be dealt with in a continuous manner via hierarchical
modeling rather than through discrete model averaging, and advocates the
use of expected utility maximization as a basis for model choice.)
-
Book
review (with Fouskakis D; May 1999) of "Tabu Search," by Glover F,
Laguna M, Journal of the Royal Statistical Society, Series D (The
Statistician), forthcoming. (Discusses the only
book-length treatment of tabu search, a good stochastic optimization method
developed by people in operations research and not well known yet by
statisticians.)
-
Scenario and
parametric sensitivity and uncertainty analyses in nuclear waste disposal
risk assessment: the case of GESAMAC (with Saltelli A,
Tarantola S, Prado P; revised May 2000): Chapter 13 in Mathematical and
Statistical Methods for Sensitivity Analysis (Saltelli A, Chan K, Scott
M, eds.), New York: Wiley (2000), 275-292. (Shows that
variance-based sensitivity analyses are not fully adequate in determining
the factors most responsible for high radiologic doses arising from the
failure of underground storage facilities for nuclear waste, and that about
30% of the overall predictive uncertainty for log dose arises from
uncertainty about the scenario describing how the facility will fail -- a
source of uncertainty previously largely ignored or treated qualitatively.
Also explores the use of projection pursuit regression in sensitivity
analysis.)
-
Sampling errors under
non-probability sampling (with Bowater R; January 1999): Chapter 4
in Model Quality Reports in Business Statistics: Theory and Methods for
Quality Evaluation, by Bowater R, Chambers C, Davies P, Draper D,
Skinner C, Smith P. Luxembourg: Eurostat. (Book-length
report produced by team consisting of people from the Office for National
Statistics (UK), Statistics Sweden, and the Universities of Bath and
Southampton; in this chapter we review biases arising from voluntary
sampling, judgmental sampling, quota sampling, and cut-off sampling, and
make recommendations on how to assess and minimize such biases.)
-
Model assumption errors in
survey sampling (with Bowater R; January 1999): Chapter 9 in Model
Quality Reports in Business Statistics: Theory and Methods for Quality
Evaluation, by Bowater R, Chambers C, Davies P, Draper D, Skinner C,
Smith P. Luxembourg: Eurostat. (See item above; in this
chapter we review model assumption errors as they arise in the construction
of index formulae, bench-marking, seasonal adjustment, cut-off sampling,
small-area estimation, and non-ignorable nonresponse, and make
recommendations on how to assess and minimize such errors.)
-
Scenario and parametric
uncertainty in GESAMAC: A methodological study in nuclear waste
disposal risk assessment (with Pereira A, Prado P, Saltelli A, Cheal
R, Eguilior S, Mendes B, Tarantola S; November 1998): Computer Physics
Communications, forthcoming. (Companion piece to the
first item above, with similar results and methods on a different source of
radioactive decay.)
-
Draft chapters of Bayesian Hierarchical Modeling (text and computer
programs) now available for downloading
-
Bayesian MCMC multilevel modeling workshops (6 April and 29 October
1998)
-
Bayesian hierarchical modeling short courses (August 1998)
-
International workshop on stochastic model building and variable
selection (Duke University, 9-10 October 1997)
-
RSS half-day meeting on design and analysis of complex sample
surveys (14 May 1997)
2. Research and teaching
-
Contact information
-
Biographical sketch
-
Statistical philosophy and outlook
-
Current research interests (lots of papers available for downloading)
-
Funded research projects
-
Some thoughts on statistics teaching
-
Present and past postdocs and PhD/MSc students
-
PhD projects on offer
3. Personal information
1. Recent and Upcoming Events of Possible Interest
1.1 Draft chapters of Bayesian Hierarchical Modeling (text and computer programs) now available
I am working on a research monograph on Bayesian Hierarchical
Modeling. The second draft of the first two chapters (together with a
preface, an appendix with computer programs, and references) is now
available (in PostScript format) here for
downloading and free use. (Number of people worldwide who have downloaded
the text and programs so far: about 1,250.)
The intended audience for the book is methodological and applied
statisticians who wish to learn (more) about the formulation and
fitting of hierarchical (multilevel) models from the Bayesian point of view.
An understanding of probability at the level typically required for a
master's degree in statistics would provide ample mathematical
background for reading the book. I have taught subsets of the draft
material successfully to groups including British final-year
undergraduates, American PhD students, and PhD-level researchers enrolled
in short courses (including an award-winning course at the Anaheim Joint
Statistical Meetings in 1997), and the book has also proven useful for
self-study by researchers and graduate students in a variety of disciplines
(including statistics). No previous experience
with Bayesian methods is needed -- all relevant
ideas are covered in a self-contained fashion.
The draft manuscript PostScript file is about 1.5Mb, and when
printed you get 183 pages: Contents and Preface (pp. i-xiv), Chapter
1 (pp. 1-46), Chapter 2 (pp. 47-122), some placeholder stuff you can avoid
printing (pp. 123-138, 169), Appendix 2 (pp. 139-160), and References (pp.
161-168).
The first chapter (46 pages) provides a standalone introduction to
Bayesian modeling in the context of two case studies, and the second
chapter (76 pages) offers an in-depth look at Markov Chain Monte Carlo
(MCMC) methods from first principles, also based on two case studies.
The writing style is informal; the main text is not very mathy, but
each chapter is supplemented by extensive endnotes giving additional
formalism and details.
Appendix 2 contains S+, BUGS, Maple, and C
programs for conducting the analyses in Chapter 2; these programs are also
available for downloading here as a
36Kb text file. Chapter 2, when combined with computer work based on the
code supplied here, might make a nice MCMC tutorial to supplement
any other coverage you may have on this topic, and Chapter 1 might serve as
a gentle introduction to Bayes for advanced undergraduates or beginning
grad students.
The only things I ask in return for the free use of these materials in your
teaching are (a) that you email me giving details
(class or tutorial size and name, level of students) of both (i) when and
where you have used the book and programs, and (ii) any errors or problems
you encounter, or other comments you wish to pass on; and (b) that - if you
like this material - you consider buying a copy of the finished
book, which (I hope) will be published within the next 12 months!
1.2 Bayesian MCMC multilevel modeling workshops (6 April and 29 October 1998)
One-day workshops on MCMC methods in multilevel modeling were given
by David Draper (University of Bath) and Bill Browne (Institute of Education,
University of London) on April 6 and October 29, 1998 at the Institute of Education, using the new
Windows version of MLn (MLwiN), which has recently
been released. The workshops were offered in conjunction with the Multilevel Modelling Project,
led by Harvey Goldstein.
The workshops combined methodology discussion with real-time hands-on
computing experience, and there was also an opportunity for
participants to submit their data sets for inclusion as case studies
for interactive analysis in the afternoon session.
The intended audience was multilevel/hierarchical modelers (of varying
levels of experience, from not much to quite a lot) who wish to learn about
Markov Chain Monte Carlo (MCMC) Bayesian methods and their
implementation in MLwiN, and people interested in learning about the
new interactive model specification, fitting, and diagnostic capabilities
of the new Windows version of MLn also found the workshops
worthwhile. Bill Browne and I are the co-developers of the MCMC
functionality in MLwiN.
We are interested in giving future workshops of this type, and are open to
suggestions on location and timing. If you would like to discuss this
possibility, or for further details, please email Bill Browne, or phone me at +44 (0) 1225
826222. The MLwiN software would be available at a discount to workshop
participants.
A tentative program for the day is as follows:
-
9.00 - 10.30: Introduction to multilevel models, including Gaussian
and generalized linear multilevel models; hands-on computer work
-
10.30 - 11.00: Tea, informal discussion
-
11.00 - 12.15: Introduction to Bayesian inference and prediction
-
12.15 - 1.30: Lunch, informal discussion
-
1.30 - 3.00: Markov Chain Monte Carlo (MCMC) estimation and
diagnostics; hands-on computer work
-
3.00 - 3.30: Coffee, informal discussion
-
3.30 - 5.00: Model formulation, diagnostics, and elaboration;
hands-on computer work, including analysis of participants' data
As the tentative program indicates, the workshop includes lectures on the
basics of multilevel models, Bayesian inference, and MCMC methods, together
with some real-time hands-on computing sessions using the MLwiN package.
There is also an opportunity for participants to submit their data sets for
inclusion as case studies for interactive analysis in the afternoon
session. The first two chapters of my book on
hierarchical modeling (see 1.1 above) -- on
Bayesian modeling and MCMC -- are now available in draft form (with
associated computer code in S+, C, and Maple), and are
distributed as part of the workshop.
The cost of the workshop would be as follows:
-
Workshop only: 250 pounds (150 pounds academic)
-
Workshop + 1 copy of MLwiN: 700 pounds (420 pounds academic)
-
Workshop + upgrade from MLn (DOS): 300 pounds (180 pounds academic)
Anyone who has already purchased MLn (DOS) after October 1st 1997
would receive the upgrade to MLwiN for free.
1.3 Bayesian hierarchical modeling short courses
I have recently given two one-day short courses on Bayesian
hierarchical modeling, based on the book described in 1.1 above, at the
Dallas Joint Statistical Meetings in August 1998, through the
American Statistical Association (ASA) Continuing Education program.
The first day was an invited short course, offered at an
introductory/intermediate level -- it essentially repeated the
course I gave in Anaheim in 1997, which won an ASA
Excellence in Continuing Education award. No
previous exposure to Bayesian methods was needed in this first course --
all ideas were covered in a self-contained fashion. Topics for the first
course included (1) an introduction to Bayesian modeling, (2) MCMC methods
from scratch, (3) formulation of hierarchical models (HMs) based on the
scientific/decision-making context, and (4) diagnostics for HMs.
The second day covered more advanced topics, including (1)
random-effects and mixed models, (2) Bayesian nonparametric inference with
Polya trees, and (3) HMs as an approach to model selection and dealing with
model uncertainty. Each day can be taken in a standalone fashion, or people
can come to both days if they want to.
I am interested in giving these short courses again in the future. If you
would like to suggest a time, place, and audience, please email me.
1.4 International workshop on stochastic model building and variable selection
Duke University, Durham NC
October 9 and 10, 1997
The goal of the Workshop was to bring together researchers interested in
novel computer-based and/or simulation-based aids to model building.
The program of the Workshop included both talks and poster presentations.
For lists of the speakers and participants, practical details, and other
information see this web
page; if you have questions please email Giovanni Parmigiani or me.
1.5 RSS half-day meeting on design and analysis of complex sample surveys
Date and time: 14 May 1997, 2-6.50pm.
Place: headquarters of the
Royal Statistical Society (RSS), 12 Errol Street, London EC1Y
8LX England (voice +44-171-638-8998, fax +44-171-256-7598, email rss@rss.org.uk).
Four papers (available for downloading) by international teams of
leading researchers in survey sampling, together with invited and
contributed discussion and rejoinder. For more information see this web page or
email me.
2. David Draper
2.1 Contact Information
I am a Professor in, and Head of, the Statistics
Group in the
Department of Mathematical Sciences. My phone number is
+44-1225-826222; fax is +44-1225-826492; email is d.draper@maths.bath.ac.uk. My
postal address is Statistics Group, Department of Mathematical Sciences,
University of Bath, Claverton Down, Bath BA2 7AY, England.
2.2 Biographical Sketch
Before coming to Bath I studied for a BSc in mathematics at the University of North Carolina/Chapel
Hill (1970-74) and a PhD in statistics at the University of California/Berkeley
(1975-81); worked at IBM in New
York (1974-75); and taught and did research at the University of Chicago (1981-84),
the RAND Corporation (1984-91),
and UCLA (1991-93), with
brief sabbatical and consulting stints at the University of Washington (1986)
and AT&T Bell Labs
(1987). Since arriving in Bath I have also done some teaching at the University of
Neuchâtel in Switzerland, and given short courses on
Bayesian hierarchical modeling at the joint meetings of the American Statistical Association and the
Institute of Mathematical
Statistics.
2.3 Statistical Philosophy and Outlook
Philosophically I am some kind of de Finetti-style
Bayesian, meaning that for me
-
prediction of observables is more fundamental than inference about
unobservables, and
-
(conditional) exchangeability judgments are fundamental to
predictive modeling.
Sure, you end up thinking about lots of unobservable parameters in
this approach, but they don't come first -- they arise from the use of de
Finetti's theorem to pass from exchangeability to conditional
independence.
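The passage from exchangeability to conditional independence can be written down explicitly. As an illustration (the binary-outcome case, which this page does not spell out), de Finetti's representation theorem says that for an infinitely exchangeable sequence of 0/1 observables $y_1, y_2, \dots$,

```latex
P(y_1, \dots, y_n) \;=\; \int_0^1 \prod_{i=1}^{n} \theta^{y_i} (1 - \theta)^{1 - y_i} \, dF(\theta) ,
```

so the $y_i$ behave as if they were conditionally IID Bernoulli$(\theta)$ given an unobservable $\theta$ with mixing ("prior") distribution $F$: the parameter arises from the exchangeability judgment rather than being assumed at the outset.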
To me it's OK to supplement contextual information from experts with data
analysis in forming your exchangeability judgments, as long as you keep
yourself honest by not using the data twice in the process.
In practice for me this often means employing predictive calibration:
holding out some of the data from the modeling and seeing where the observed
outcomes in the held-out data fall in their respective model-based predictive
distributions. If this kind of predictive calibration fails, then you have to
go back and change the model (which includes the possibility of changing the
``prior'') until you are well-calibrated.
I see this as a kind of fusion of the best of Bayesian and non-Bayesian
reasoning: (1) Bayes by itself (when done right) guarantees internal but not
(2) external consistency, which involves asking inherently frequentist
questions (how often do my predictive intervals include the observed
outcomes?).
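The predictive-calibration idea described above can be sketched in a few lines of code. This is my own minimal illustration, not the author's procedure: the data, the deliberately simple plug-in Gaussian "model" (which ignores parameter uncertainty), and all names are hypothetical.

```python
import random

# Predictive calibration sketch: hold out half the data, then check how
# often central 90% predictive intervals cover the held-out outcomes.
random.seed(3)
data = [random.gauss(5.0, 2.0) for _ in range(200)]
train, held_out = data[:100], data[100:]

# Plug-in Gaussian predictive distribution fit on the training half
n = len(train)
mu = sum(train) / n
sd = (sum((x - mu) ** 2 for x in train) / (n - 1)) ** 0.5

z90 = 1.645  # approx. 95th percentile of the standard Normal
lo, hi = mu - z90 * sd, mu + z90 * sd

coverage = sum(lo <= y <= hi for y in held_out) / len(held_out)
print(round(coverage, 2))  # a well-calibrated model should be close to 0.90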
This philosophy has implications both in research and teaching, which I am
currently refining. If you disagree with the views above and would like to
talk about it, please email me.
2.4 Current Research Interests
-
Bayesian inference and prediction (e.g., Draper D (1996), Utility, sensitivity analysis, and
cross-validation in Bayesian model-checking. Discussion of ``Posterior
predictive assessment of model fitness via realized discrepancies," by A
Gelman et al., Statistica Sinica, 6, 28-35; Draper D and
Madigan D (1997), The scientific value of
Bayesian statistical methods and outlook, IEEE Expert, Trends
and Controversies department, October/December issue; and 2 other
publications);
-
Model uncertainty, and exploration of the mapping from statistical
assumptions to conclusions (e.g., Draper D (1988),
Statistical Science, 3, 239-271; Draper D (1995), Assessment and propagation of model uncertainty (with
discussion), Journal of the Royal Statistical Society Series B,
57, 45--97; Draper D (1997), On the
relationship between model uncertainty and inferential/predictive
uncertainty, under revision for Biometrika; Draper D (1997), Model uncertainty in stochastic and
``deterministic'' systems, Proceedings of the 12th International
Workshop on Statistical Modeling, Biel, July 1997, Schriftenreihe
der Osterreichischen Statistichen Gesellschaft, 5, 43-59; and 2
other publications);
-
Theory of data analysis (e.g., Draper D (1987), On
exchangeability judgments in predictive modeling, and the role of data in
statistical research, Statistical Science, 2, 454-461
(discussion of ``Prediction of Future Observations in Growth Curve
Models,'' by CR Rao); Hadorn D et al. (1992), Cross-validation performance
of patient mortality prediction models, Statistics in Medicine,
11, 475-489; Draper D et al. (1993), Exchangeability and data
analysis (with discussion), Journal of the Royal Statistical Society
Series A, 156, 9-37; Greenland S, Draper D (1997),
Exchangeability. Entry in Encyclopedia of Biostatistics. Armitage P,
Colton T (eds). London: Wiley; Draper D (1998), Discussion of ``Some statistical heresies'' by JK Lindsey,
The Statistician, forthcoming; and 1 other publication);
-
Hierarchical modeling (e.g., Draper D et al. (1993),
Combining Information: Statistical Issues and Opportunities for
Research, Contemporary Statistics Series, No. 1, American Statistical
Association, Alexandria, VA; Draper D (1995),
Inference and hierarchical modeling in the social sciences (with
discussion), Journal of Educational and Behavioral Statistics,
20, 115-147, 233-239; and 3 other publications);
-
Causal inference (e.g., Draper D and Cheal R (1997), Causal
inference via Markov Chain Monte Carlo (in preparation));
-
Sample surveys (e.g., Bayesian analysis of finite-population survey
data using Markov Chain Monte Carlo. Closing discussion, Half-Day
Meeting on Design and Analysis of Complex Sample Surveys, Journal of
the Royal Statistical Society Series B, 60, 96--98);
-
Markov Chain Monte Carlo (MCMC) methods (e.g., Cheal et al.
(1997), MCMC methods for inference on family trees (in preparation);
Goldstein H et al. (1997), A User's Guide to MLn for Windows
(MLwiN), Version 1.0b, London: Institute of Education);
and
-
Applications of statistical methods to the
-
Social sciences (e.g., Steiner A et al. (1996),
Gerontologist, 36, 54-62),
-
Medical and health sciences, particularly quality of
care in health policy (e.g., Dubois R et al. (1987),
New England Journal of Medicine, 317, 1674-1680; Daley J et
al. (1988), Journal of the American Medical Association,
260, 3611-3616; Draper D et al. (1990), Journal of the American
Medical Association, 264, 1956-1961; Swezey R et al. (1997),
Journal of Rheumatology; and 11 other publications), and
-
Biological and environmental sciences (e.g., Cheal et al. (1997), Inference on founder allele frequencies in the
Przewalski horse pedigree (in preparation); Draper (1997), Model uncertainty in stochastic and
"deterministic" systems, Proceedings of the 12th International
Workshop on Statistical Modeling, Biel, July 1997, Schriftenreihe
der Österreichischen Statistischen Gesellschaft, 5, 43-59; Draper
et al. (1998), Scenario and parametric uncertainty in
GESAMAC: A methodological study in nuclear waste disposal risk
assessment, Computer Physics Communications, forthcoming).
If you have written something which you've made public on one or more of
these topics and you are interested in a dialogue, please email me with details on how to get
a copy, and I will try to send you comments.
2.5 Funded Research Projects
I have just finished working with Dr. Ryan
Cheal (Bath) and partners at the Environmental Institute, Joint Research
Centre (Ispra, Italy), CIEMAT (Madrid, Spain), and the University of Stockholm (Sweden) on GESAMAC, a
three-year EC-funded environmental
project exploring the likely effects on the
geosphere from possible future failure of underground containment vessels
for spent nuclear fuel.
Ryan and I were helping to quantify all relevant sources of uncertainty
(from model input scenarios, model structural assumptions,
model parametric variability, and predictive inaccuracy) in
forecasts of radiologic dose arising from containment vessel failure. The
EC has made the working documents from this project confidential, but two
papers now in the open literature are available: Draper (1997), Model uncertainty in stochastic and
"deterministic" systems, Proceedings of the 12th International
Workshop on Statistical Modeling, Biel, July 1997, Schriftenreihe
der Österreichischen Statistischen Gesellschaft, 5, 43-59; and
Draper et al. (1998), Scenario and parametric
uncertainty in GESAMAC: A methodological study in nuclear waste
disposal risk assessment, Computer Physics Communications,
forthcoming.
If you have interests in this area I would like to start a dialogue with
you; please email me.
I have also just finished working with Dr. Russell Bowater and partners at
the Office for National
Statistics, the Department of Social
Statistics at the University
of Southampton, and Statistics
Sweden on a one-year project funded by Eurostat,
to develop and test methodology to better account for all sources of
uncertainty in complex business surveys of the type routinely undertaken by EC member countries.
We reviewed and tested methodologies for
-
estimating and adjusting for bias arising from non-probability
sampling, and
-
assessing uncertainty arising from incorrect modeling assumptions in
the survey sampling context.
The main output of this work is an extensive report, to be published by
Eurostat, giving methodology and best practice in the reporting of business
sample surveys; this should be available early in 1999. If you have an
interest in these areas and would like to start a discussion by email,
please write to me.
2.6 Some Thoughts on Statistics Teaching
I also have a basic interest in the teaching of
statistics at the BSc, MSc, and PhD levels. I have
taught courses at Berkeley, Chicago, RAND, Seattle, UCLA, Bath, and
Neuchâtel, and at the American Joint Statistical Meetings, on introductory
statistics, design of experiments, sample surveys, multivariate methods,
computationally intensive inference, linear models, statistical modeling,
Bayesian inference and prediction, and Bayesian hierarchical
modeling.
I believe that three principles should govern the teaching of statistics at
all levels:
-
It is important to tell the Bayesian and frequentist stories side by
side, so that people can clearly see the strengths and weaknesses of
each approach and can thereby create their own personal combination of what
is good in both approaches. There is a lot of silliness in the frequentist
approach, but being Bayesian is no guarantee of getting the right answer,
either.
What works for me is (a) to reason in a Bayesian way when formulating my
inferences and predictions and (b) to reason in a frequentist way when
evaluating their quality, through predictive calibration (see the
section on philosophy above).
There are certainly other ways to look for the best in Bayes and non-Bayes;
I am sure you have your own (strong) views on the subject, and I would
be interested to hear them.
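Here is a minimal toy sketch of this Bayes-formulate / frequentist-evaluate combination (all numbers invented for illustration): a conjugate normal model supplies 90% posterior intervals, and a frequentist simulation over repeated draws of (truth, data) checks their calibration.

```python
import random, math

random.seed(1)
n, n_sims, covered = 20, 2000, 0
z = 1.645  # central 90% interval multiplier

for _ in range(n_sims):
    mu = random.gauss(0.0, 1.0)                  # truth drawn from the prior
    ybar = random.gauss(mu, 1.0 / math.sqrt(n))  # sufficient statistic of the data
    post_var = 1.0 / (1.0 + n)                   # prior precision 1 + data precision n
    post_mean = post_var * n * ybar              # conjugate normal posterior for mu
    half = z * math.sqrt(post_var)
    if post_mean - half <= mu <= post_mean + half:
        covered += 1

coverage = covered / n_sims
print(f"empirical coverage of nominal 90% intervals: {coverage:.3f}")
```

If the Bayesian machinery is well calibrated, the empirical coverage should come out close to the nominal 90%, and that is exactly the sort of frequentist check in (b).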
-
More emphasis should be placed on prediction than in current
treatments of statistics, which focus almost exclusively on
inference. Both science and decision-making are inherently predictive at
heart: good scientific theories make testable (and accurate) predictions,
and decision theory is all about making predictions about the future under
different scenarios and choosing your favorite future.
Many inferential questions can usefully be rephrased in predictive terms:
e.g., if you are a physician, the key medical question is often not whether
treatment A is better on average than treatment B in some population, but
how different the outcome would be under the two treatments for
the patient in front of you. Given all of this, why do we spend so
little teaching time on prediction?
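The physician's question can be put directly to a posterior. In this toy sketch (all numbers hypothetical), posterior draws for the two treatment means are turned into a predictive distribution for the single patient in front of you:

```python
import random

random.seed(2)
M = 5000
mu_diffs, y_diffs = [], []
for _ in range(M):
    mu_A = random.gauss(5.0, 0.3)   # posterior draw: mean outcome under A
    mu_B = random.gauss(4.5, 0.3)   # posterior draw: mean outcome under B
    mu_diffs.append(mu_A - mu_B)
    # predictive draws for the individual patient add outcome-level noise
    y_diffs.append(random.gauss(mu_A, 2.0) - random.gauss(mu_B, 2.0))

p_avg = sum(d > 0 for d in mu_diffs) / M       # the inferential question
p_patient = sum(d > 0 for d in y_diffs) / M    # the predictive question
print(f"P(A better on average)        = {p_avg:.2f}")
print(f"P(A better for THIS patient)  = {p_patient:.2f}")
```

The same posterior can say that A is better on average with high probability while the prediction for a single patient remains much more uncertain; teaching prediction makes that distinction visible.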
-
Each unit of material should begin with (1) a scientific or
decision-making case study, with sufficient contextual details for the
real-world problem to be clearly in focus. Then (2) the statistical methods
that are the point of this unit can be developed in the context of the case
study, after which (3) these methods can be applied to solve the real-world
problem in (1) that prompted the inquiry in the first place. After this the
unit can be concluded with (4) an investigation of the general properties
of the methods developed in (2).
This four-step approach echoes the process by which the methods were
originally developed, which encourages people to see how ideas are
discovered in the first place. It is especially good to cover step (2) in
an interactive way, asking people to help suggest ideas for what to
do next and exploring rather than condemning dead-ends, because in practice
the discovery process itself often proceeds by learning what is wrong with
each of a series of partial failures.
I am working at present on two books - a monograph on Bayesian hierarchical
modeling (see 1.1 above) and an introductory text - that try to follow
these three principles, at least roughly. If your views on the best way to
teach statistics differ and you would like to talk about it, please email me.
2.7 Present and past postdocs and PhD/MSc
students
-
Dr. Russell Bowater
is a postdoc who finished his PhD with Bernard Silverman at Bristol in 1997,
on MCMC in a nonstandard spatial statistics application in biology. Russell
and I have just finished working on the Eurostat complex sample surveys
project described above.
-
Bill Browne
(bwjsmsr@ioe.ac.uk) recently finished his PhD on Applying MCMC
methods to multi-level models, including putting MCMC into the popular
package MLwiN for
fitting hierarchical models. His dissertation was nominated for the 1998
Leonard Jimmie Savage Award for best Bayesian PhD dissertation in the world.
Earlier we worked together on his MSc dissertation (Topics in
hierarchical modeling), which was the recipient of the James Duthie
Prize in 1995. In October 1998 Bill started a postdoc with Harvey Goldstein
at the Institute of Education
of the University of
London;
-
Dr. Ryan Cheal worked with me on his PhD (Markov chain Monte Carlo
methods for inference on family trees), finishing in 1997. He was until
recently a postdoc on the GESAMAC project (see 2.5
above).
-
Dimitris Fouskakis is a
PhD student working with me on Stochastic optimization for cost-effective
quality assessment in health, a project using Bayesian modeling and
utility analysis to construct optimal scales for measuring inputs (such as
patient sickness at admission to the hospital) in league-table
(input-output) quality assessment. Previously we worked together on his
MSc dissertation, Variable selection via hierarchical modeling and
utility, which was awarded distinction in 1996. Dimitris has been
short-listed for the 1999 Ede and Ravenscroft Research Prize at the
University of Bath.
-
Mark Gittoes started a PhD with me in September 1998 on Hierarchical
modeling for quality assessment in health and education.
-
Daphne Kounali did
her MSc dissertation work with me on Cardiac mortality and dietary risk
factors: Survival analysis with time-varying covariates, finishing in
1998. She is now a medical statistician in the Research and Development
Unit at the Salford Royal
Hospitals NHS Trust.
-
Callum McKail worked with me in 1997 on his MSc dissertation,
Fixing the broken bootstrap: Bayesian inference with skewed and
long-tailed data. He is now with a software development company in the
London area.
-
Kristi Raube did her PhD with me at the RAND Graduate School of Policy Studies, finishing
in 1991; her dissertation was on Health and social support in the
elderly. She is now Acting Director of the Center for Health Administration Studies
at the University of Chicago.
2.8 PhD Projects on Offer
I am currently looking for PhD students to work with me on the following topics:
-
Assessment and propagation of model uncertainty.
Models formalizing facts and assumptions about known and unknown quantities
are central to statistical inference and the prediction of future
observables. It is common practice to search for a reasonable model and then
settle on a single choice, ignoring the model uncertainty uncovered in the
search. Conditioning on a single model, when others with different predictive
consequences are also plausible, under-propagates an important component of
uncertainty, leading to predictive uncertainty assessments that, in
retrospect, are often seen to be too small; a direct consequence is the
recommendation of actions that do not hedge sufficiently against uncertainty.
The Bayesian solution is essentially to deal with model uncertainty in the
way that a nuisance parameter would be treated, by integrating over it as in
the following three-step program (Draper D (1995), Assessment and propagation of model uncertainty (with
discussion), Journal of the Royal Statistical Society Series B,
57, 45-97):
-
Put a prior distribution on model space and update to a posterior on the set
of possible models given the data;
-
Compute a predictive distribution based on each model with nonzero
posterior probability; and
-
Produce a composite posterior predictive distribution that combines
model-specific predictive distributions weighted by their posterior
probabilities on model space.
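In simple conjugate settings the three steps have closed forms. This toy sketch (my own example, not from the paper) runs the full program exactly for two competing normal-mean models, M0 (mu fixed at 0) and M1 (mu ~ N(0, 1)), each with y_i | mu ~ N(mu, 1):

```python
import math

def norm_pdf(x, mean, var):
    # density of N(mean, var) at x
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

n, ybar = 10, 0.8                       # summary of the (hypothetical) data

# Step 1: 50/50 prior on model space, updated via the marginal likelihoods of
# ybar: ybar ~ N(0, 1/n) under M0 and ybar ~ N(0, 1 + 1/n) under M1.
m0 = norm_pdf(ybar, 0.0, 1.0 / n)
m1 = norm_pdf(ybar, 0.0, 1.0 + 1.0 / n)
p0 = 0.5 * m0 / (0.5 * m0 + 0.5 * m1)
p1 = 1.0 - p0

# Step 2: predictive mean for a new observation under each model.
post_mean = (n * ybar) / (n + 1)        # posterior mean of mu under M1
pred0, pred1 = 0.0, post_mean

# Step 3: composite predictive mean, weighted by posterior model probabilities.
pred = p0 * pred0 + p1 * pred1
print(f"P(M0|y)={p0:.3f}, P(M1|y)={p1:.3f}, composite predictive mean={pred:.3f}")
```

The composite predictive is wider than, and centered between, the model-specific predictives; that extra spread is exactly the model uncertainty that conditioning on a single model would discard.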
Current projects in this area include an exploration of the role
bootstrapping and Markov Chain Monte Carlo methods can play in
approximating posterior model probabilities in complex problems. For
example, bootstrapping the modeling process -- creating bootstrap
copies of a data set, conducting parallel modeling exercises on each copy
(including outlier deletion and variable selection and transformation), and
combining within-copy and between-copy uncertainty assessments to produce
better-calibrated predictions -- is a largely untested but promising way to
use the bootstrap to approximate posterior model probabilities that reflect
the realistic complexity of applied data analysis.
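A minimal version of this idea (my construction, not the project's code): bootstrap the data, redo a small BIC-based model-selection exercise on each copy, and read the selection frequencies as rough approximations to posterior model probabilities.

```python
import random, math

random.seed(3)
n = 40
x = [random.uniform(-1, 1) for _ in range(n)]
y = [1.2 * xi + random.gauss(0, 1) for xi in x]   # moderate linear signal

def bic_choice(xs, ys):
    """Pick 'null' (intercept only) or 'linear' by BIC on one data copy."""
    m = len(ys)
    xbar, ybar = sum(xs) / m, sum(ys) / m
    rss0 = sum((yi - ybar) ** 2 for yi in ys)
    sxx = sum((xi - xbar) ** 2 for xi in xs)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(xs, ys)) / sxx
    rss1 = sum((yi - ybar - b * (xi - xbar)) ** 2 for xi, yi in zip(xs, ys))
    bic0 = m * math.log(rss0 / m) + 1 * math.log(m)
    bic1 = m * math.log(rss1 / m) + 2 * math.log(m)
    return "linear" if bic1 < bic0 else "null"

B, linear_count = 500, 0
for _ in range(B):
    idx = [random.randrange(n) for _ in range(n)]  # one bootstrap copy
    if bic_choice([x[i] for i in idx], [y[i] for i in idx]) == "linear":
        linear_count += 1

p_linear = linear_count / B
print(f"selection frequency of the linear model: {p_linear:.2f}")
```

In a real application each bootstrap copy would go through the full modeling exercise (outlier deletion, transformation, variable selection), not just one BIC comparison; the selection frequencies then reflect the variability of the whole process.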
-
Input-output analysis for quality assessment in health and
education.
Two major societal institutions that have recently begun to look more
closely at the quality with which they carry out their mandates are
hospitals and schools. In the case of hospitals, a number of factors
relevant to quality assessment have been identified, including the
processes of care (what health professionals actually do on behalf
of patients), the outcomes of care (what happens to patients as a
result of the processes they receive), and patient sickness at
admission (since hospitals differ widely in the severity of illness of
their patients, and any examination of their health outcomes must bear this
in mind). Similar concepts apply to schools.
In both instances process is difficult and expensive to measure, so
interest has begun to focus on a less costly input-output approach
to quality assessment, in which institutional outcomes are compared after
adjusting for differences in inputs. In the case of hospitals this may take
the form of a contrast between observed and expected mortality
rates, given how sick patients are when they arrive at the hospital; in
schools value-added computations examining A-level results (from
standardized tests taken in the last year of high school) after accounting
for GCSE scores (from another set of standardized tests taken one year
earlier) have begun to emerge. Bayesian
hierarchical modeling that is explicitly tailored
to the multilevel structure of the data (patients nested within hospitals,
students within classrooms within schools) plays a central role in this
approach to quality assessment.
Projects available in this area include (a) the solution of technical
problems posed by the hierarchical modeling of massive data sets and
(b) explicit attempts to guide public policy by using such modeling to
suggest optimal data collection strategies and optimal allocation
of resources between input-output and process-only quality assessments.
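The observed-versus-expected contrast can be sketched in a few lines (all numbers hypothetical); the crude precision weight below stands in for the shrinkage that a full hierarchical model would supply.

```python
# Compare observed deaths O with deaths E expected from admission sickness,
# shrinking the O/E ratios of small hospitals toward 1.
observed = {"A": 30, "B": 4, "C": 55}        # observed deaths (hypothetical)
expected = {"A": 25.0, "B": 2.0, "C": 60.0}  # expected given admission sickness

ratios = {}
for h in observed:
    oe = observed[h] / expected[h]
    # crude precision weight: more expected deaths -> trust the raw ratio more
    # (the constant 10.0 is an arbitrary stand-in for a fitted shrinkage factor)
    w = expected[h] / (expected[h] + 10.0)
    ratios[h] = (oe, w * oe + (1 - w) * 1.0)
    print(f"hospital {h}: O/E = {ratios[h][0]:.2f}, shrunken = {ratios[h][1]:.2f}")
```

Tiny hospital B has a raw O/E of 2.0 on only two expected deaths; the shrunken version pulls it most of the way back toward 1, which is the behavior the hierarchical model delivers automatically with data-determined weights.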
-
Causal inference via Markov chain Monte Carlo.
A major challenge in the analysis of observational studies is the
assessment of the likely influence of unmeasured potential confounding
factors on the estimated effect of the principal causal factor of
interest. Two leading classes of models for such data are selection
models (developed by the economist J Heckman) and counterfactual
models (introduced by J Neyman and developed by D Rubin). The fully
Bayesian analysis of such models with Markov chain Monte Carlo
methods poses several technical problems, including multi-modality of the
posterior distribution and extremely high serial correlation of the Monte
Carlo draws. In addition to solving these problems, this project involves
the specification of appropriate informative prior distributions for key
quantities not well addressed by the data, and a detailed comparison - in
theory, simulations, and case studies - between the selection and
counterfactual approaches.
-
Developing a general-purpose Metropolis-Hastings engine.
The program BUGS, created by the MRC Biostatistics Unit in
Cambridge, is an elegant and rather general-purpose environment within
which to perform approximate Bayesian inference with Gibbs sampling. An
analogous Metropolis-Hastings engine, suitable for an even wider
variety of inferential and predictive situations than that addressed by
BUGS, has not yet been developed. This project would explore several
strategies for constructing such an engine, including adaptive selection of
multivariate normal proposal distributions after appropriate parameter
transformation.
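The core of such an engine is small. Here is a minimal random-walk Metropolis sketch (mine, not BUGS's internals) using a spherical normal proposal, the simplest special case of the adaptive multivariate proposals mentioned above:

```python
import random, math

def metropolis(logpost, x0, step, n_iter, seed=4):
    """Random-walk Metropolis on an arbitrary log-density."""
    rng = random.Random(seed)
    x, lp = list(x0), logpost(x0)
    draws = []
    for _ in range(n_iter):
        prop = [xi + rng.gauss(0, step) for xi in x]   # symmetric proposal
        lp_prop = logpost(prop)
        if math.log(rng.random()) < lp_prop - lp:      # MH acceptance rule
            x, lp = prop, lp_prop
        draws.append(list(x))                          # keep current state
    return draws

# Example target: independent standard normals in 2 dimensions.
logpost = lambda x: -0.5 * sum(xi * xi for xi in x)
draws = metropolis(logpost, [5.0, -5.0], step=1.0, n_iter=5000)
mean0 = sum(d[0] for d in draws[1000:]) / 4000
print(f"post-burn-in mean of first coordinate: {mean0:.2f}")
```

A general-purpose engine would add the pieces this sketch omits: automatic tuning of the step size, a full proposal covariance adapted during burn-in, and parameter transformations to make the normal proposal appropriate.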
-
Nonparametric Bayesian analysis.
Using de Finetti's theorem to approach, from first principles, the construction of
well-calibrated predictive and inferential distributions for observable and
unobservable quantities, respectively, requires placing a prior
distribution on the set of all possible CDFs, a problem that has until
recently had no fully satisfying solution. Even nonparametric bootstrap
confidence intervals, which can be regarded as crude approximations to
posterior distribution summaries of particular interest, perform
surprisingly poorly even with fairly large samples of long-tailed data, because
the empirical CDF has nothing to say about the tails of the distribution
beyond the largest observation. A similar problem arises in Bayesian
analyses of generalized linear models, when attempts to honestly assess
uncertainty about the link function require placing a prior on the set of
all possible regression surfaces.
This project involves the theory and application of MCMC in the context of
Polya trees - and other approaches to non-parametric Bayesian
inference - in a way that is responsive to actual applied prior knowledge
on, e.g., unimodality, tail behavior, and moments of CDFs (on the one hand)
and monotonicity and smoothness of regression surfaces (on the other), in
the context of theory, simulations, and real case studies. I have recently
used Polya trees to solve a consulting problem for AEA Technologies in risk assessment
arising from nuclear waste disposal, and I am eager to explore the
practical limits of this modeling strategy.
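The bootstrap's tail blindness mentioned above is easy to demonstrate: no bootstrap resample can contain a value beyond the largest observation, so resampling has nothing to say about the upper tail. A quick check (my example):

```python
import random

random.seed(5)
data = [random.lognormvariate(0, 2) for _ in range(200)]  # long-tailed sample
sample_max = max(data)

boot_maxes = []
for _ in range(1000):
    resample = [random.choice(data) for _ in range(len(data))]
    boot_maxes.append(max(resample))

print(f"sample max: {sample_max:.1f}")
print(f"largest bootstrap max over 1000 resamples: {max(boot_maxes):.1f}")
assert max(boot_maxes) <= sample_max   # the empirical CDF stops at the max
```

A nonparametric Bayesian prior with genuine tail content (e.g., a Polya tree centered on a long-tailed family) can place posterior mass beyond the observed maximum, which is precisely what the empirical CDF cannot do.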
-
Bayesian predictive validation as an approach to solving the problem
posed by Cromwell's Rule.
Cromwell's Rule (Lindley 1972) reminds us that, in the Bayesian approach
to statistical modeling, anything with zero prior probability must also
have zero posterior probability, no matter how the data came out. This
poses a dilemma for practical Bayesian modeling: we must, on grounds of
feasibility, place zero initial prior probability on vast regions of the
space of all possible models for the observed data, and yet once the data
arrive we may well regret having marked as impossible in the prior various
features of the data that clearly are not impossible, because they actually
occurred. Thus there is a need in Bayesian modeling (just as with other
approaches) to update prior guesses about model structure in light of the
data, but without cheating by using the data twice.
One way out of this difficulty involves the sort of Bayesian nonparametric
analyses described in the previous project. In this project another
approach, out-of-sample predictive validation, will be used to
overcome the central problem posed by Cromwell's Rule. The idea is to
cross-validate the modeling process - by setting aside some data and using
a range of models fitted to another subset of the data to predict the
set-aside observations - but in such a way as to obtain an honest estimate
of predictive accuracy of the composite modeling process. Theory,
simulations, and case studies will be used to explore the strengths and
limitations of this approach.
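A skeletal version of out-of-sample predictive validation (my construction; squared prediction error stands in for a full predictive score): set aside part of the data, fit each candidate model on the rest, and score the models only on the set-aside observations, so model structure is updated by the data without using the same data twice.

```python
import random

random.seed(6)
n = 100
x = [random.uniform(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * xi + random.gauss(0, 0.5) for xi in x]

train, test = range(0, 70), range(70, n)   # fitting set / set-aside set

def fit_mean(idx):
    m = sum(y[i] for i in idx) / len(idx)
    return lambda xi: m                                  # constant model

def fit_line(idx):
    xb = sum(x[i] for i in idx) / len(idx)
    yb = sum(y[i] for i in idx) / len(idx)
    sxx = sum((x[i] - xb) ** 2 for i in idx)
    b = sum((x[i] - xb) * (y[i] - yb) for i in idx) / sxx
    return lambda xi: yb + b * (xi - xb)                 # straight-line model

def holdout_mse(model):
    # score only on the set-aside observations
    return sum((y[i] - model(x[i])) ** 2 for i in test) / len(test)

mse = {name: holdout_mse(fit(train))
       for name, fit in [("mean-only", fit_mean), ("linear", fit_line)]}
print(mse)
```

In the full program the scoring would be repeated over many random splits and done on a proper predictive scale (e.g., log score), so that the honesty of the composite modeling process, not just of one fitted model, is what gets estimated.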
3. Personal Information
My wife is Dr.
Andrea Steiner, a Senior Lecturer in gerontology and health policy
analysis in Social
Sciences and Geriatric Medicine at
the University of
Southampton. We live in an old stone house in Limpley Stoke, a
nice village on the river Avon near Bath, with a canal, some good pubs, and
some excellent hill-walking nearby. I now know a lot more about real ale than I did six years ago.