Biostatistical Methods: Applications of Statistics to Genetics and Molecular Biology


PB HLTH C240D/STAT C245D

Course Description: This course surveys statistical and computational methods in biomedical and genomic research. Biological topics include: modeling meiosis; the genetic mapping of complex human traits, e.g., haplotype inference, linkage analysis, linkage disequilibrium analysis, SNP-based association studies; nucleotide and protein sequence analysis, e.g., sequence alignment, identification of regulatory motifs in DNA sequences, gene finding; the design and analysis of high-throughput gene expression experiments using microarrays and second-generation sequencing, e.g., transcriptome analysis and genome annotation using mRNA-Chip and mRNA-Seq, protein-nucleic acid interaction using ChIP-Chip and ChIP-Seq; and the analysis of biological annotation metadata, e.g., Gene Ontology (GO) annotation.

Statistical and computational methods, introduced within the biological context, include: numerical and graphical summaries of data; stochastic processes, e.g., Markov and hidden Markov processes; loss-based estimation with cross-validation (regression, classification, maximum likelihood estimation, density estimation, variable selection); Markov chain Monte Carlo procedures; multiple hypothesis testing; cluster analysis; resampling (cross-validation, bootstrap); and the design of in silico experiments.

The course discusses statistical computing resources for the analysis of biological data, with emphasis on the R language and environment (www.r-project.org) and Bioconductor software packages (www.bioconductor.org).

The course also provides an introduction to basic notions in genetics and molecular biology and involves the critical reading of articles related to statistical analyses in the biological and medical sciences.