Microarray Informatics
The Microarray Informatics efforts of CHIBI/BPIC are closely coordinated with the Genome Technology Center, which generates a large majority of array-based data for NYU Langone Medical Center. Microarray-centered informatics is currently applied primarily to two high-capacity profiling areas:
1) genome-wide microarray gene expression profiling
2) array-based Q-PCR microRNA expression profiling
We provide assistance with the following modules of typical experimental workflows:
Study and Experiment Design – in-depth discussion of the biological context and key questions asked; experiment type selection (pair-wise comparison, time series, multiparametric studies); cost-effective strategies for replicate studies; optimization of experimental conduct to identify and minimize sources of experimental noise; sample preparation strategy and array platform selection.
Data Preprocessing and Management – upload of annotated raw data into the institutional repository and into client analysis environments; data normalization and transformation (probe-level summarization principles for Affymetrix arrays, condition-centered normalization strategies, experiment interpretation options); data filtering (intensities, QC metrics such as Ct values and Q-PCR flags for microRNA); data reduction strategies.
Differential Expression – statistical significance (t-test statistics, one-way and two-factor ANOVA, multiple testing corrections, Significance Analysis of Microarrays, non-parametric tests, Bayesian estimation of temporal regulation).
Pattern Matching – ad hoc and post hoc template matching procedures to establish non-overlapping patterns of gene expression.
Clustering of reduced-size data – unsupervised (hierarchical), semi-supervised (K-means clustering, Self-Organizing Maps), resampling for support.
Classification – strategies to classify biological samples and predict outcomes based on gene expression profiles (Support Vector Machines, Discriminant Analysis Classifiers, K-nearest neighbor).
MicroRNA Target mRNA Prediction – cross-validation of target prediction algorithms; correlation of miRNA profiles with mRNA and protein expression patterns derived from gene expression and proteomics data; context-score ranking of target mRNAs; determination of patterns of post-transcriptional control (microRNA:mRNA regulatory networks).
Biological Significance – in-depth data and literature mining; functional annotation of gene groups and determination of context-specific biological meaning of expression profiling results using integrated biological knowledgebases (e.g. NIH DAVID, the Database for Annotation, Visualization and Integrated Discovery; GSEA, Gene Set Enrichment Analysis; etc.).
Integration of Multiple Species Data – batch translation of standard IDs into orthologous lists; matching of expression patterns from patient and model organism samples.
Data Publication – formatting data for upload to public repositories (NCBI GEO); visualization of data for publication purposes; generating specific-focus figures and tables for grant applications, research articles and scientific conference presentations.
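As an illustration of the multiple testing corrections mentioned under Differential Expression, the following is a minimal sketch of the Benjamini-Hochberg false discovery rate adjustment. It is not a description of the pipeline used by the facility; the function name `benjamini_hochberg` and the example p-values are hypothetical and chosen only for demonstration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Hypothetical helper for illustration; assumes p_values is a list of
    raw per-gene p-values (e.g. from per-gene t-tests).
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
# Genes whose adjusted p-value falls below a chosen FDR threshold.
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
print(adjusted, significant)
```

Each gene's adjusted value is compared against the chosen false discovery rate (0.05 here, an arbitrary choice) rather than against the raw p-value, which limits the expected proportion of false positives when thousands of genes are tested at once.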
For a consultation please contact:
Jiri Zavadil
OR
InformaticsConsultation@nyumc.org
