A Bayesian framework for knowledge driven regression model in micro-array data analysis

Rong Jin, Luo Si, Christina Chan

    Research output: Contribution to journal › Article

    • 4 Citations

    Abstract

    This paper addresses the sparse data problem in linear regression, in which the number of variables is significantly larger than the number of data points. We assume that, in addition to the measured data points, prior knowledge about the input variables may be provided in the form of pairwise similarities. We present a fully Bayesian framework that exploits this similarity information for linear regression. Empirical studies with gene expression data show that regression errors can be reduced significantly by incorporating similarity information derived from the Gene Ontology.
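
    The idea of regularizing a regression with pairwise similarities can be illustrated with a minimal sketch. This is not the authors' full Bayesian procedure: it is only a MAP point estimate under an assumed Gaussian prior whose precision is built from the graph Laplacian of a similarity matrix (the "Graph Laplacian" keyword suggests a related construction, but the details here are assumptions). The names `graph_laplacian`, `laplacian_map_regression`, `lam` and `eps`, and the toy similarity matrix `S`, are hypothetical and chosen for illustration.

```python
import numpy as np


def graph_laplacian(S):
    """Unnormalized graph Laplacian L = D - S of a symmetric similarity matrix S."""
    S = np.asarray(S, dtype=float)
    return np.diag(S.sum(axis=1)) - S


def laplacian_map_regression(X, y, S, lam=1.0, eps=1e-3):
    """MAP weights for y = X w + noise under an assumed Gaussian prior.

    Minimizes ||y - X w||^2 + lam * w^T (L + eps * I) w, where L is the graph
    Laplacian of S. The eps * I term keeps the system well posed when there
    are more variables than data points.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d = X.shape[1]
    A = X.T @ X + lam * (graph_laplacian(S) + eps * np.eye(d))
    return np.linalg.solve(A, X.T @ y)


# Toy data (assumption): 5 data points, 20 variables in two groups of similar variables.
rng = np.random.default_rng(0)
n, d = 5, 20
S = np.zeros((d, d))
S[:10, :10] = 1.0
S[10:, 10:] = 1.0
np.fill_diagonal(S, 0.0)

w_true = np.concatenate([np.full(10, 1.0), np.full(10, -1.0)])
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)

# Similarity-aware estimate versus plain ridge (zero similarity, eps=1 gives lam * I).
w_hat = laplacian_map_regression(X, y, S, lam=10.0)
w_ridge = laplacian_map_regression(X, y, np.zeros((d, d)), lam=10.0, eps=1.0)

print("error with similarity prior:", np.linalg.norm(w_hat - w_true))
print("error with plain ridge:     ", np.linalg.norm(w_ridge - w_true))
```

    The Laplacian penalty equals half the sum of `S[i, j] * (w[i] - w[j])**2` over all pairs, so variables marked as similar are pulled toward similar weights. In practice `S` would come from a knowledge source such as the Gene Ontology rather than from a hand-built block matrix.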

    Original language: English (US)
    Pages (from-to): 250-267
    Number of pages: 18
    Journal: International Journal of Data Mining and Bioinformatics
    Volume: 2
    Issue number: 3
    DOI: 10.1504/IJDMB.2008.020525
    State: Published - 2008


    Keywords

    • Bayesian analysis
    • Bioinformatics
    • Data mining
    • Data regression
    • Gene expression analysis
    • Graph Laplacian
    • Knowledge driven data regression

    ASJC Scopus subject areas

    • Library and Information Sciences
    • Information Systems
    • Biochemistry, Genetics and Molecular Biology (all)

    Cite this

    A Bayesian framework for knowledge driven regression model in micro-array data analysis. / Jin, Rong; Si, Luo; Chan, Christina.

    In: International Journal of Data Mining and Bioinformatics, Vol. 2, No. 3, 2008, p. 250-267.


    @article{d786703448424113a00cfffd9fb2f3fb,
    title = "A {B}ayesian framework for knowledge driven regression model in micro-array data analysis",
    abstract = "This paper addresses the sparse data problem in linear regression, in which the number of variables is significantly larger than the number of data points. We assume that, in addition to the measured data points, prior knowledge about the input variables may be provided in the form of pairwise similarities. We present a fully Bayesian framework that exploits this similarity information for linear regression. Empirical studies with gene expression data show that regression errors can be reduced significantly by incorporating similarity information derived from the Gene Ontology.",
    keywords = "Bayesian analysis, Bioinformatics, Data mining, Data regression, Gene expression analysis, Graph Laplacian, Knowledge driven data regression",
    author = "Rong Jin and Luo Si and Christina Chan",
    year = "2008",
    doi = "10.1504/IJDMB.2008.020525",
    volume = "2",
    pages = "250--267",
    journal = "International Journal of Data Mining and Bioinformatics",
    issn = "1748-5673",
    publisher = "Inderscience Enterprises Ltd",
    number = "3",
    }

    TY - JOUR
    T1 - A Bayesian framework for knowledge driven regression model in micro-array data analysis
    AU - Jin, Rong
    AU - Si, Luo
    AU - Chan, Christina
    PY - 2008
    Y1 - 2008
    AB - This paper addresses the sparse data problem in linear regression, in which the number of variables is significantly larger than the number of data points. We assume that, in addition to the measured data points, prior knowledge about the input variables may be provided in the form of pairwise similarities. We present a fully Bayesian framework that exploits this similarity information for linear regression. Empirical studies with gene expression data show that regression errors can be reduced significantly by incorporating similarity information derived from the Gene Ontology.
    KW - Bayesian analysis
    KW - Bioinformatics
    KW - Data mining
    KW - Data regression
    KW - Gene expression analysis
    KW - Graph Laplacian
    KW - Knowledge driven data regression
    UR - http://www.scopus.com/inward/record.url?scp=53349127892&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=53349127892&partnerID=8YFLogxK
    U2 - 10.1504/IJDMB.2008.020525
    DO - 10.1504/IJDMB.2008.020525
    M3 - Article
    VL - 2
    SP - 250
    EP - 267
    JO - International Journal of Data Mining and Bioinformatics
    T2 - International Journal of Data Mining and Bioinformatics
    JF - International Journal of Data Mining and Bioinformatics
    SN - 1748-5673
    IS - 3
    ER -