







Alan Hubbard
Assistant Professor of Biostatistics
School of Public Health
University of California, Berkeley
101 Haviland Hall, MC 7358
Berkeley, CA 94720
PHONE: 510-643-6160
FAX: 510-643-5163
OFFICE: 113B Haviland
EMAIL: hubbard@stat.berkeley.edu
Teaching
PH 242C: Longitudinal Data Analysis
This course covers the statistical issues surrounding estimation of
effects using data on subjects followed through time. The course
emphasizes a regression model approach and discusses disease incidence
modeling, continuous outcome data/linear models, and longitudinal
extensions to nonlinear models (e.g., logistic and Poisson). The
primary focus is on the analysis side, but we will also discuss the
mathematical intuition behind the procedures. The
statistical/mathematical material includes some survival analysis,
normal linear models, logistic and Poisson regression, and matrix
algebra for statistics. Next taught in Spring 2009. Co-taught with
Nicholas Jewell.
PH 296-03: Causal Consulting
The course revolves around researchers (students, faculty, etc.) in
the School of Public Health who want advice on the analysis plan (or
design) for their studies. Each researcher presents a particular
problem to the class, followed by a discussion that defines 1) the
so-called full data one would optimally wish to have to estimate the
effect of interest (i.e., the notion of counterfactuals), 2) the
definition of this effect as a function of the full-data distribution,
3) the observed data, 4) estimators of the parameter of interest from
this full data and 5) the assumptions necessary for this estimator to
yield unbiased estimates of the parameter from the observed data.
It’s a 5-step program that’s guaranteed to change your life, or at
least the way you approach analysis. We emphasize how to provide
proper inference that accounts for the amount of exploration (model
selection) one has done with the data. Next taught in Spring 2009.
Co-taught with Mark van der Laan.
PH 298-53: Methods in Social Epidemiology
This course is designed to review, evaluate and apply methods
currently used in the field of social epidemiology. The course aims to
teach approaches to forming clear research questions and selecting the
best method(s) to answer the questions posed. Initially, we will
discuss approaches to defining clear and specific research questions.
We will then discuss recent controversies around the meaning of
questions posed in social epidemiology, and the ability of currently
used methods to answer them. Finally, we will review, evaluate and
apply a range of different methods that are or could be used to answer
questions in social epidemiology, again emphasizing the types of
questions answered by these methods and their ability to address the
challenges to effectively answering questions in the field. There will
be a mixture of discussion and lecture depending on the topic, with
student participation and questions strongly encouraged. Next taught
in Spring 2009. Co-taught with Jennifer Ahern.
Research
Clustering Functions
This research has revolved around the apparently simple question: How
many different kinds of patients does my data set contain? It was
motivated by a data set from San Francisco General Hospital (SFGH) on
several hundred HIV subjects followed after initiation of HAART.
Subjects were followed irregularly over time and both CD4 counts and
viral loads were recorded. The basic method involves an ad hoc part
(smoothing and prediction at grid points, clustering) and a rigorous
part (choosing the parameters at each step by cross-validation). The
result is a set of clusters defined by the longitudinal profiles of
patients.
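
The pipeline described above (smooth each subject's irregular
measurements, predict at common grid points, cluster the resulting
profiles, and pick tuning parameters by cross-validation) can be
illustrated with a minimal sketch. Everything specific here is an
assumption made for illustration and is not taken from the SFGH
analysis: the simulated data, the Gaussian kernel smoother, the
k-means clustering, the grid, and all function names are hypothetical.
Only the smoothing bandwidth is chosen by cross-validation; the number
of clusters is fixed at 2 to keep the example short.

```python
import math
import random


def kernel_smooth(times, values, t0, bandwidth):
    """Gaussian-kernel weighted average of `values`, evaluated at time t0."""
    weights = [math.exp(-0.5 * ((t - t0) / bandwidth) ** 2) for t in times]
    total = sum(weights)
    if total == 0.0:  # all weights underflowed: fall back to the plain mean
        return sum(values) / len(values)
    return sum(w * v for w, v in zip(weights, values)) / total


def loo_cv_error(subjects, bandwidth):
    """Mean squared error when each observation is predicted from the
    same subject's remaining observations (leave-one-out)."""
    err, n = 0.0, 0
    for times, values in subjects:
        for i in range(len(times)):
            rest_t = times[:i] + times[i + 1:]
            rest_v = values[:i] + values[i + 1:]
            pred = kernel_smooth(rest_t, rest_v, times[i], bandwidth)
            err += (pred - values[i]) ** 2
            n += 1
    return err / n


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [profiles[0]]
    while len(centers) < k:
        centers.append(
            max(profiles, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j]))
            for p in profiles
        ]
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def simulate(n_per_group=10, seed=1):
    """Hypothetical data: a flat group and a rising group, each subject
    observed at 8 irregular times on [0, 10]."""
    rng = random.Random(seed)
    subjects = []
    for slope in (0.0, 1.0):
        for _ in range(n_per_group):
            times = sorted(rng.uniform(0, 10) for _ in range(8))
            values = [slope * t + rng.gauss(0, 0.5) for t in times]
            subjects.append((times, values))
    return subjects


subjects = simulate()

# "Rigorous" step: choose the smoothing bandwidth by cross-validation.
bandwidths = [0.5, 1.0, 2.0, 4.0]
best_bw = min(bandwidths, key=lambda h: loo_cv_error(subjects, h))

# "Ad hoc" steps: predict each subject's curve on a common grid, then cluster.
grid = [1.0, 3.0, 5.0, 7.0, 9.0]
profiles = [[kernel_smooth(t, v, g, best_bw) for g in grid] for t, v in subjects]
labels, centers = kmeans(profiles, k=2)
print("bandwidth:", best_bw, "labels:", labels)
```

Predicting on a shared grid is what makes the subjects comparable:
each irregularly observed trajectory becomes a fixed-length vector, so
any standard clustering routine can be applied to the profiles.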
Dynamic Models of Infectious Disease
Much of my work has focused on infectious diseases and the unique
statistical issues that arise when outcome data among subjects are
inherently related (correlated). Part of the work involves using
mathematical infectious disease models to investigate the potential
bias from ignoring the feedback inherent in infectious diseases
(Eisenberg, et al., 2003). In addition, in a recently submitted paper
analyzing the different contributions (person-to-person,
person-to-environment-to-person) to the Cryptosporidium outbreak in
Milwaukee, we used a novel technique to find the posterior
distribution (the estimation distribution) of the relevant parameters
in the model. This involved a combination of profile likelihood
methods and a modified MCMC algorithm (similar to Hubbard, et al.,
2002).
Risk Assessment
Work with Prof. Mark Nicas on assessing risk from respiratory
infections, also incorporating previous work on dose-response. This
work is motivated by the need to characterize the risk of infection
(and the efficacy of preventive measures) from bioterrorism or from
infection of hospital workers in an outbreak (Nicas and Hubbard, 2002
and Nicas and Hubbard, 2003).
Computational Biology
I have recently completed an initial analysis of Affymetrix data on
workers exposed to benzene. The data (from Prof. Martyn Smith’s lab)
consist of 40,000+ gene expressions measured on 12 workers in China (6
exposed and 6 unexposed, in matched pairs). In addition, we are
examining a very similar data set on dioxin exposure.
Locally Efficient Estimation
Work on (treatment-specific) locally efficient estimation in the
presence of potentially informative censoring and confounding (van der
Laan, Hubbard and Robins, 2002; Hubbard, et al., 2000; and van der
Laan and Hubbard, 1998).
Other Activities
Links
Curriculum vitae
Group in Biostatistics