Genomics Inform, Volume 16(4); 2018
Published online: December 28, 2018
Opinion: Strategy of Semi-Automatically Annotating Full Text Corpus of Genomics & Informatics
Hyun-Seok Park1,2 
1Bioinformatics Laboratory, ELTEC College of Engineering, Ewha Womans University, Seoul 03760, Korea
2Center for Convergence Research of Advanced Technologies, Ewha Womans University, Seoul 03760, Korea
Corresponding author:  Hyun-Seok Park
Tel: +82-2-3277-3513, Fax: +82-2-3277-2306
Received: December 13, 2018   Revised: December 20, 2018   Accepted: December 20, 2018
There is a community need for an annotated corpus consisting of the full texts of biomedical journal articles. In response, a prototype full-text corpus of Genomics & Informatics, called GNI version 1.0, was recently published, with 499 annotated full-text articles available as a corpus resource. However, GNI needs to be updated, because the texts were only shallow-parsed and were annotated with several existing parsers. I list the issues associated with upgrading the annotations and give my opinion on a methodology for developing the next version of the GNI corpus, based on a semi-automatic strategy for linguistically richer corpus annotation.
Keywords: biomedical text mining, corpus, text analytics
Copyright © 2019 by Korea Genome Organization. All rights reserved.