
    Greedy feature selection for glycan chromatography data with the generalized Dirichlet distribution

    Authors
    Galligan, Marie C
    Saldova, Radka
    Campbell, Matthew P
    Rudd, Pauline M
    Murphy, Thomas B
    Issue Date
    2013-05-07
    
    Citation
    BMC Bioinformatics. 2013 May 07;14(1):155
    URI
    http://dx.doi.org/10.1186/1471-2105-14-155
    http://hdl.handle.net/10147/295444
    Abstract
    Background: Glycoproteins are involved in a diverse range of biochemical and biological processes. Changes in protein glycosylation are believed to occur in many diseases, particularly during cancer initiation and progression. The identification of biomarkers for human disease states is becoming increasingly important, as early detection is key to improving survival and recovery rates. To this end, the serum glycome has been proposed as a potential source of biomarkers for different types of cancers. High-throughput hydrophilic interaction liquid chromatography (HILIC) technology for glycan analysis allows for the detailed quantification of the glycan content in human serum. However, the experimental data from this analysis are compositional by nature. Compositional data are subject to a constant-sum constraint, which restricts the sample space to a simplex. Statistical analysis of glycan chromatography datasets should account for their unusual mathematical properties. As the volume of glycan HILIC data being produced increases, there is a considerable need for a framework to support appropriate statistical analysis. Proposed here is a methodology for feature selection in compositional data. The principal objective is to provide a template for the analysis of glycan chromatography data that may be used to identify potential glycan biomarkers.
    Results: A greedy search algorithm, based on the generalized Dirichlet distribution, is carried out over the feature space to search for the set of “grouping variables” that best discriminate between known group structures in the data, modelling the compositional variables using beta distributions. The algorithm is applied to two glycan chromatography datasets. Statistical classification methods are used to test the ability of the selected features to differentiate between known groups in the data. Two well-known methods are used for comparison: correlation-based feature selection (CFS) and recursive partitioning (rpart). CFS is a feature selection method, while recursive partitioning is a tree-based learning algorithm that has been used for feature selection in the past.
    Conclusions: The proposed feature selection method performs well for both glycan chromatography datasets. It is computationally slower, but results in a lower misclassification rate and a higher sensitivity rate than both correlation-based feature selection and the classification tree method.
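The abstract does not give the details of the search, so the following is only a rough sketch of how a greedy forward selection over compositional features might look, not the authors' procedure. It relies on one known property of the generalized Dirichlet distribution: the fractions z_j = x_j / (1 - sum of the parts placed before j) can be modelled as independent beta variables. Everything else is an assumption made for illustration: the function names (`beta_loglik`, `bic_gain`, `greedy_select`), the method-of-moments beta fit, the BIC-style comparison of a per-group model against a pooled model, and the toy data.

```python
import math

def beta_loglik(values):
    """Log-likelihood of a beta distribution fitted by method of moments
    (an approximation chosen for brevity, not a maximum-likelihood fit)."""
    n = len(values)
    mean = sum(values) / n
    var = max(sum((v - mean) ** 2 for v in values) / n, 1e-12)
    k = mean * (1.0 - mean) / var - 1.0          # k = a + b
    a, b = mean * k, (1.0 - mean) * k
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return sum(log_norm + (a - 1.0) * math.log(v) + (b - 1.0) * math.log(1.0 - v)
               for v in values)

def bic_gain(values, labels):
    """BIC-style score: one beta per group versus a single pooled beta.
    Positive values favour treating the variable as a grouping variable."""
    groups = sorted(set(labels))
    grouped = sum(beta_loglik([v for v, g in zip(values, labels) if g == grp])
                  for grp in groups)
    pooled = beta_loglik(values)
    extra_params = 2 * (len(groups) - 1)         # two beta parameters per extra group
    return 2.0 * (grouped - pooled) - extra_params * math.log(len(values))

def greedy_select(X, labels):
    """Assumed greedy forward search: repeatedly add the feature whose
    fraction of the not-yet-selected mass has the largest positive gain."""
    n_features = len(X[0])
    selected = []
    used = [0.0] * len(X)                        # mass taken by selected features
    while len(selected) < n_features - 1:        # the last part is fully determined
        best, best_gain = None, 0.0
        for j in range(n_features):
            if j in selected:
                continue
            z = [row[j] / (1.0 - u) for row, u in zip(X, used)]
            gain = bic_gain(z, labels)
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:                         # no candidate improves the score
            break
        selected.append(best)
        used = [u + row[best] for row, u in zip(X, used)]
    return selected

# Toy compositions (hypothetical, not glycan data): each row sums to 1 and only
# the share of the first part is constructed to differ between the two groups.
shares = [0.10, 0.12, 0.08, 0.11, 0.09, 0.10, 0.50, 0.52, 0.48, 0.51, 0.49, 0.50]
splits = [(0.30, 0.30, 0.40), (0.35, 0.25, 0.40), (0.25, 0.35, 0.40),
          (0.32, 0.33, 0.35), (0.28, 0.27, 0.45), (0.30, 0.30, 0.40)] * 2
X = [[s] + [(1.0 - s) * f for f in split] for s, split in zip(shares, splits)]
labels = ["control"] * 6 + ["case"] * 6
selected = greedy_select(X, labels)
```

The sketch assumes every part is strictly between 0 and 1 and that each group has several samples; zero peaks or single-sample groups would need separate handling. Dividing by the remaining mass is what removes the constant-sum constraint: once the first part is selected, the other parts are judged on their share of what is left rather than on raw proportions that shift merely because the first part changed.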
    Item Type
    Journal Article
    Collections
    Journal articles & published research

     
