Genomics Inform, Volume 1(2); 2003
Data Mining for High Dimensional Data in Drug Discovery and Development.
Kwan R Lee, Daniel C Park, Xiwu Lin, Sergio Eslava
GlaxoSmithKline, Research & Development, Data Exploration Sciences, 1250 South Collegeville Road, Collegeville, PA 19426, USA. kwan.lee@gsk.com
Abstract
Data mining differs from traditional data analysis primarily along one important dimension: the scale of the data. For that reason, principles from computer science as well as statistics are needed to extract information from large data sets. In this paper we briefly review data mining, its characteristics, typical data mining algorithms, and potential and ongoing applications of data mining in the biopharmaceutical industry. The distinguishing characteristics of data mining are its understandability, its scalability, its problem-driven nature, and its analysis of retrospective or observational data rather than data from designed experiments. At a high level, one can identify three types of problems for which data mining is useful: description, prediction, and search. The data mining algorithms briefly reviewed include decision trees and rules, nonlinear classification methods, memory-based methods, model-based clustering, and graphical dependency models. The application areas covered are discovery compound libraries, clinical trial and disease management data, genomics and proteomics, structural databases for candidate drug compounds, and other applications of pharmaceutical relevance.
Keywords: data mining; high dimensional data; genomics; proteomics; pharmacogenomics