A knowledge driven regression model for gene expression and microarray analysis

Rong Jin, Luo Si, Shireesh Srivastava, Zheng Li, Christina Chan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The linear regression model has been widely used in the analysis of gene expression and microarray data to identify a subset of genes that are important to a given metabolic function. One of the key challenges in applying the linear regression model to gene expression data analysis arises from the sparse data problem, in which the number of genes is significantly larger than the number of conditions. To resolve this problem, we present a knowledge driven regression model that incorporates the knowledge of genes from the Gene Ontology (GO) database into the linear regression model. It is based on the assumption that two genes are likely to be assigned similar weights when they share similar sets of GO codes. Empirical studies show that the proposed knowledge driven regression model is effective in reducing the regression errors, and furthermore effective in identifying genes that are relevant to a given metabolite.
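The abstract does not state the objective function, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes that GO similarity between two genes is the Jaccard overlap of their GO code sets, and that "similar weights" is enforced by a graph-Laplacian penalty added to ordinary least squares, with a small ridge term so the system stays solvable when genes outnumber conditions. The function names, the parameters lam and ridge, and the toy annotations are all hypothetical.

```python
import numpy as np


def go_similarity(go_codes):
    """Pairwise Jaccard similarity between the GO code sets of the genes.

    Assumption: Jaccard overlap stands in for whatever similarity the
    paper actually uses. The diagonal is left at zero.
    """
    n = len(go_codes)
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            union = go_codes[i] | go_codes[j]
            if union:
                S[i, j] = S[j, i] = len(go_codes[i] & go_codes[j]) / len(union)
    return S


def fit_knowledge_driven(X, y, S, lam=1.0, ridge=1e-3):
    """Least squares with a penalty that pulls similar genes' weights together.

    Minimizes ||X w - y||^2 + lam * sum_{i<j} S_ij (w_i - w_j)^2 + ridge * ||w||^2.
    The middle term equals lam * w^T L w, where L = D - S is the graph Laplacian,
    so the minimizer solves (X^T X + lam * L + ridge * I) w = X^T y.
    """
    L = np.diag(S.sum(axis=1)) - S
    A = X.T @ X + lam * L + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)


# Toy data (hypothetical): 4 conditions, 6 genes, so genes outnumber conditions.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))   # expression levels, one row per condition
y = rng.normal(size=4)        # metabolite measurement per condition
go_codes = [                  # made-up GO annotations for illustration only
    {"GO:0006629", "GO:0008152"},
    {"GO:0006629", "GO:0008152"},
    {"GO:0006629"},
    {"GO:0006915"},
    {"GO:0006915", "GO:0008152"},
    set(),
]
S = go_similarity(go_codes)
w = fit_knowledge_driven(X, y, S, lam=5.0)
print(np.round(w, 3))
```

Raising lam pulls the weights of genes with overlapping GO codes closer together; with lam set to zero the fit reduces to plain ridge regression, which makes the effect of the GO term easy to compare.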

Language: English (US)
Title of host publication: Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings
Pages: 5326-5329
Number of pages: 4
DOI: 10.1109/IEMBS.2006.260347
State: Published - 2006
Event: 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS'06 - New York, NY, United States
Duration: Aug 30 2006 to Sep 3 2006

Other

Other: 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS'06
Country: United States
City: New York, NY
Period: 8/30/06 to 9/3/06

Profile

Microarrays
Gene expression
Genes
Linear regression
Ontology
Metabolites

ASJC Scopus subject areas

  • Bioengineering

Cite this

Jin, R., Si, L., Srivastava, S., Li, Z., & Chan, C. (2006). A knowledge driven regression model for gene expression and microarray analysis. In Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings (pp. 5326-5329). [4029317] DOI: 10.1109/IEMBS.2006.260347

A knowledge driven regression model for gene expression and microarray analysis. / Jin, Rong; Si, Luo; Srivastava, Shireesh; Li, Zheng; Chan, Christina.

Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings. 2006. p. 5326-5329 4029317.

Jin, R, Si, L, Srivastava, S, Li, Z & Chan, C 2006, A knowledge driven regression model for gene expression and microarray analysis. in Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings., 4029317, pp. 5326-5329, 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS'06, New York, NY, United States, 8/30/06. DOI: 10.1109/IEMBS.2006.260347
Jin R, Si L, Srivastava S, Li Z, Chan C. A knowledge driven regression model for gene expression and microarray analysis. In Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings. 2006. p. 5326-5329. 4029317. DOI: 10.1109/IEMBS.2006.260347
Jin, Rong ; Si, Luo ; Srivastava, Shireesh ; Li, Zheng ; Chan, Christina. / A knowledge driven regression model for gene expression and microarray analysis. Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings. 2006. pp. 5326-5329
@inproceedings{671a4c7be63d407882a65e9fb66829f6,
title = "A knowledge driven regression model for gene expression and microarray analysis",
abstract = "The linear regression model has been widely used in the analysis of gene expression and microarray data to identify a subset of genes that are important to a given metabolic function. One of the key challenges in applying the linear regression model to gene expression data analysis arises from the sparse data problem, in which the number of genes is significantly larger than the number of conditions. To resolve this problem, we present a knowledge driven regression model that incorporates the knowledge of genes from the Gene Ontology (GO) database into the linear regression model. It is based on the assumption that two genes are likely to be assigned similar weights when they share similar sets of GO codes. Empirical studies show that the proposed knowledge driven regression model is effective in reducing the regression errors, and furthermore effective in identifying genes that are relevant to a given metabolite.",
author = "Rong Jin and Luo Si and Shireesh Srivastava and Zheng Li and Christina Chan",
year = "2006",
doi = "10.1109/IEMBS.2006.260347",
language = "English (US)",
isbn = "1424400325",
pages = "5326--5329",
booktitle = "Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings",

}

TY - GEN

T1 - A knowledge driven regression model for gene expression and microarray analysis

AU - Jin,Rong

AU - Si,Luo

AU - Srivastava,Shireesh

AU - Li,Zheng

AU - Chan,Christina

PY - 2006

Y1 - 2006

N2 - The linear regression model has been widely used in the analysis of gene expression and microarray data to identify a subset of genes that are important to a given metabolic function. One of the key challenges in applying the linear regression model to gene expression data analysis arises from the sparse data problem, in which the number of genes is significantly larger than the number of conditions. To resolve this problem, we present a knowledge driven regression model that incorporates the knowledge of genes from the Gene Ontology (GO) database into the linear regression model. It is based on the assumption that two genes are likely to be assigned similar weights when they share similar sets of GO codes. Empirical studies show that the proposed knowledge driven regression model is effective in reducing the regression errors, and furthermore effective in identifying genes that are relevant to a given metabolite.

UR - http://www.scopus.com/inward/record.url?scp=34047136060&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34047136060&partnerID=8YFLogxK

U2 - 10.1109/IEMBS.2006.260347

DO - 10.1109/IEMBS.2006.260347

M3 - Conference contribution

SN - 1424400325

SN - 9781424400324

SP - 5326

EP - 5329

BT - Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings

ER -