\documentclass[11pt, a4paper]{article}
\usepackage{helvet} \renewcommand{\familydefault}{\sfdefault} 
%%% PAGE DIMENSIONS
\usepackage[margin=2.5cm, head=1.27cm]{geometry} % to change the page dimensions
\usepackage{xcolor}
\usepackage{fancyhdr} % 
\pagestyle{fancy} % 
\lhead{}
\chead{\textcolor{gray}{International Biometric Society}}
\rhead{}
\renewcommand{\headrulewidth}{0pt} % customise the layout...
%
\lfoot{}
\cfoot{International Biometric Conference, Floripa, Brazil, 5--10 December 2010}
\rfoot{}
%
\parindent 0pt
%%%
\begin{document}
%
%Please leave text above unchanged
%Replace below with your information
\begin{center}
\textbf{Bayesian graphical models for whole genome association studies}\\[1em]
Claudio J. Verzilli$^1$, Nigel Stallard$^2$ and John C. Whittaker$^1$\\[1em]
\end{center}
$^1$ Department of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, UK\\
$^2$ Division of Health in the Community, Warwick Medical School, The University of Warwick, UK\\

As the extent of human genetic variation becomes more fully characterised, the research community faces the challenging task of exploiting this information to dissect the heritable component of complex traits. Whole genome association studies offer great promise in this respect, but their analysis poses formidable difficulties. In this paper we describe a computationally efficient approach for mining genotype-phenotype associations that scales to the size of the datasets currently being collected in such studies. We use discrete graphical models as a data mining tool, searching for single or multi-locus patterns of association around a causative site. The approach is fully Bayesian, allowing us to incorporate prior knowledge on the spatial dependencies around each marker due to linkage disequilibrium, which considerably reduces the number of possible graphical structures. We develop an MCMC scheme that yields samples from the posterior distribution of graphs conditional on the data, from which probabilistic statements about the strength of any genotype-phenotype association can be made. Using data simulated under scenarios that vary in marker density, genotype relative risk of a causative allele, and mode of inheritance, we show that the proposed approach has better localisation properties and leads to lower false positive rates than single-locus analyses. Finally, we present an application of our method to a quasi-synthetic dataset in which data on the CYP2D6 region from~\cite{hosking02} are embedded within simulated data on 100K SNPs. The analysis is quick ($< 5$ minutes) and localises the causative site to a very short interval.
\end{document}

