A knowledge driven regression model for gene expression and microarray analysis

Rong Jin, Luo Si, Shireesh Srivastava, Zheng Li, Christina Chan

    Research output: Research › Conference contribution

    • 3 Citations

    Abstract

    The linear regression model has been widely used in the analysis of gene expression and microarray data to identify a subset of genes that are important to a given metabolic function. One of the key challenges in applying the linear regression model to gene expression data analysis arises from the sparse data problem, in which the number of genes is significantly larger than the number of conditions. To resolve this problem, we present a knowledge driven regression model that incorporates the knowledge of genes from the Gene Ontology (GO) database into the linear regression model. It is based on the assumption that two genes are likely to be assigned similar weights when they share similar sets of GO codes. Empirical studies show that the proposed knowledge driven regression model is effective in reducing the regression errors, and furthermore effective in identifying genes that are relevant to a given metabolite.
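The abstract states the idea (genes that share similar sets of GO codes should receive similar regression weights) but does not give the formulation. The sketch below is one common way to encode such an assumption, not necessarily the authors' method: a graph-Laplacian penalty added to least squares, with gene-gene similarity taken as the Jaccard overlap of GO code sets. The function names, the Jaccard choice, the `lam` and `ridge` parameters, and the toy GO labels are all assumptions made for illustration.

```python
import numpy as np


def go_similarity(go_sets):
    """Jaccard similarity between the GO code sets of each pair of genes.

    Assumption: Jaccard overlap is used as the similarity measure; the
    abstract does not say how GO similarity is computed.
    """
    n = len(go_sets)
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            union = go_sets[i] | go_sets[j]
            if union:
                S[i, j] = S[j, i] = len(go_sets[i] & go_sets[j]) / len(union)
    return S


def fit_knowledge_driven(X, y, S, lam=1.0, ridge=1e-6):
    """Minimise ||y - Xw||^2 + lam * w^T L w + ridge * ||w||^2.

    L is the graph Laplacian of S, so w^T L w equals
    0.5 * sum_ij S_ij * (w_i - w_j)^2, which penalises weight differences
    between genes with similar GO annotations. The small ridge term keeps
    the system solvable when genes outnumber conditions.
    """
    L = np.diag(S.sum(axis=1)) - S
    d = X.shape[1]
    A = X.T @ X + lam * L + ridge * np.eye(d)
    return np.linalg.solve(A, X.T @ y)


# Toy data: 2 conditions, 4 genes (more genes than conditions).
# The GO labels are placeholders, not real GO identifiers.
go_sets = [{"GO:a", "GO:b"}, {"GO:a", "GO:b"}, {"GO:c"}, {"GO:d"}]
X = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
y = np.array([2.0, 0.0])

S = go_similarity(go_sets)
w_plain = fit_knowledge_driven(X, y, S, lam=0.0)    # ridge only
w_go = fit_knowledge_driven(X, y, S, lam=100.0)     # GO-informed penalty

print("without GO penalty:", w_plain)
print("with GO penalty:   ", w_go)
```

In this toy case genes 0 and 1 carry identical GO sets, so the penalised fit pulls their weights toward each other, while the unpenalised fit assigns them independently. How strongly to weight the penalty (`lam`) would in practice be chosen by cross-validation.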

    Language: English (US)
    Title of host publication: Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings
    Pages: 5326-5329
    Number of pages: 4
    DOI: 10.1109/IEMBS.2006.260347
    State: Published - 2006
    Event: 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS'06 - New York, NY, United States
    Duration: Aug 30 2006 to Sep 3 2006

    Other

    Other: 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS'06
    Country: United States
    City: New York, NY
    Period: 8/30/06 to 9/3/06

    Profile

    Microarrays
    Gene expression
    Genes
    Linear regression
    Ontology
    Metabolites

    ASJC Scopus subject areas

    • Bioengineering

    Cite this

    Jin, R., Si, L., Srivastava, S., Li, Z., & Chan, C. (2006). A knowledge driven regression model for gene expression and microarray analysis. In Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings (pp. 5326-5329). [4029317] DOI: 10.1109/IEMBS.2006.260347

    @inbook{671a4c7be63d407882a65e9fb66829f6,
    title = "A knowledge driven regression model for gene expression and microarray analysis",
    abstract = "The linear regression model has been widely used in the analysis of gene expression and microarray data to identify a subset of genes that are important to a given metabolic function. One of the key challenges in applying the linear regression model to gene expression data analysis arises from the sparse data problem, in which the number of genes is significantly larger than the number of conditions. To resolve this problem, we present a knowledge driven regression model that incorporates the knowledge of genes from the Gene Ontology (GO) database into the linear regression model. It is based on the assumption that two genes are likely to be assigned similar weights when they share similar sets of GO codes. Empirical studies show that the proposed knowledge driven regression model is effective in reducing the regression errors, and furthermore effective in identifying genes that are relevant to a given metabolite.",
    author = "Rong Jin and Luo Si and Shireesh Srivastava and Zheng Li and Christina Chan",
    year = "2006",
    doi = "10.1109/IEMBS.2006.260347",
    isbn = "1424400325",
    pages = "5326--5329",
    booktitle = "Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings",

    }

    TY - CHAP

    T1 - A knowledge driven regression model for gene expression and microarray analysis

    AU - Jin,Rong

    AU - Si,Luo

    AU - Srivastava,Shireesh

    AU - Li,Zheng

    AU - Chan,Christina

    PY - 2006

    Y1 - 2006

    AB - The linear regression model has been widely used in the analysis of gene expression and microarray data to identify a subset of genes that are important to a given metabolic function. One of the key challenges in applying the linear regression model to gene expression data analysis arises from the sparse data problem, in which the number of genes is significantly larger than the number of conditions. To resolve this problem, we present a knowledge driven regression model that incorporates the knowledge of genes from the Gene Ontology (GO) database into the linear regression model. It is based on the assumption that two genes are likely to be assigned similar weights when they share similar sets of GO codes. Empirical studies show that the proposed knowledge driven regression model is effective in reducing the regression errors, and furthermore effective in identifying genes that are relevant to a given metabolite.

    UR - http://www.scopus.com/inward/record.url?scp=34047136060&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=34047136060&partnerID=8YFLogxK

    U2 - 10.1109/IEMBS.2006.260347

    DO - 10.1109/IEMBS.2006.260347

    M3 - Conference contribution

    SN - 1424400325

    SN - 9781424400324

    SP - 5326

    EP - 5329

    BT - Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings

    ER -