We develop algorithms that process text and make its information accessible to a wide range of Natural Language Processing-based applications. We also specialize in Korean Language Processing and maintain a number of Korean Language Processing tools and resources. If you are interested, please contact us!

Deep Learning-based Language Processing

We work on Deep Learning-based Language Processing, with a strong focus on word and contextual embeddings based on Transformer architectures. We have developed a number of Korean pre-trained models.

Try out our Korean pre-trained BERT model (KR-BERT), KR-KOSAC-BERT, and many other models.

Semantic Search

We have been working on Semantic Search, with a particular focus on legal and medical documents.

Sentiment / Opinion Analysis

We have been working on (Korean) Sentiment/Opinion Analysis. We trained KR_SpanBert for Aspect-based Sentiment Analysis.

Try out the Korean Sentiment Analysis Corpus (KOSAC).

Korean Temporal Awareness and Reasoning Systems for Question Interpretation

We are working on a Korean version of Temporal Awareness and Reasoning Systems for Question Interpretation, following the TARSQI work at Brandeis University. Currently, we are developing Korean TimeML (a Markup Language for Temporal and Event Expressions).

TimeML is a robust specification language for events and temporal expressions in natural language. It is designed to address four problems in event and temporal expression markup:

  1. Time stamping of events (identifying an event and anchoring it in time)

  2. Ordering events with respect to one another (lexical versus discourse properties of ordering)

  3. Reasoning with contextually underspecified temporal expressions (temporal functions such as 'last week' and 'two weeks before')

  4. Reasoning about the persistence of events (how long does an event or the outcome of an event last)
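
As an illustration of the third problem, the sketch below resolves two underspecified expressions, 'last week' and 'two weeks before', against an anchor date. It is a minimal sketch and not part of TimeML or of our tools: the function names, the Monday-based week, and the example document creation date are all assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Tuple


def resolve_last_week(anchor: date) -> Tuple[date, date]:
    """Return (Monday, Sunday) of the calendar week before the anchor's week.

    Assumption: weeks run Monday to Sunday, as in ISO 8601.
    """
    this_monday = anchor - timedelta(days=anchor.weekday())
    start = this_monday - timedelta(weeks=1)
    return start, start + timedelta(days=6)


def resolve_weeks_before(reference: date, weeks: int) -> date:
    """Resolve 'N weeks before' relative to a reference date."""
    return reference - timedelta(weeks=weeks)


def iso_week_label(day: date) -> str:
    """Format the ISO 8601 week containing `day`, e.g. 'YYYY-Www'."""
    year, week, _ = day.isocalendar()
    return f"{year}-W{week:02d}"


# Hypothetical document creation time used as the temporal anchor.
dct = date(2024, 5, 15)

last_week_start, last_week_end = resolve_last_week(dct)
two_weeks_before = resolve_weeks_before(dct, 2)

print(last_week_start, last_week_end, iso_week_label(last_week_start))
print(two_weeks_before)
```

The same expression yields different values for different anchors, which is why such expressions are treated as temporal functions rather than fixed dates.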

Korean Lexical Resources

We are developing Korean lexical resources for various NLP tasks:

  • KOLON (KOrean Lexicon mapped onto ONtology): we map Korean nouns and predicates (verbs and adjectives) from the Sejong Electronic Dictionary onto the Mikrokosmos Ontology developed at New Mexico State University. KOLON differs from other Korean wordnets in that it separates concepts from lexical items and maps the lexical items onto the concepts. This combines ontological relations with lexical constraints and yields lexical hierarchies as a byproduct. Lexical items carry lexical relations such as hypernymy and homonymy, syntactic information such as subcategorization, and semantic information such as conceptual structures (semantic classifications). A resource browser will be available soon.
  • We are also working on methods for automatically clustering similar words from the web. Measuring word similarity for words that are not listed in a dictionary is important for NLP work. Our similarity measure for Korean helps us enrich our lexical resources with newly coined or unlisted words.
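
The separation of concepts from lexical items described for KOLON can be sketched as a small data structure. This is only an illustration under stated assumptions, not KOLON's actual schema: the class names, the concept labels (ANIMAL, MAMMAL, DOG), and the three Korean entries are hypothetical. It shows how a lexical hierarchy falls out of the concept hierarchy once lexical items are mapped onto concepts.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Concept:
    name: str
    parent: Optional[str] = None  # IS-A link to a more general concept


@dataclass
class LexicalItem:
    lemma: str
    pos: str
    concept: str  # name of the concept this item is mapped onto


class Lexicon:
    """Toy lexicon: concepts and lexical items are stored separately."""

    def __init__(self) -> None:
        self.concepts: Dict[str, Concept] = {}
        self.items: List[LexicalItem] = []

    def add_concept(self, name: str, parent: Optional[str] = None) -> None:
        self.concepts[name] = Concept(name, parent)

    def add_item(self, lemma: str, pos: str, concept: str) -> None:
        self.items.append(LexicalItem(lemma, pos, concept))

    def ancestors(self, concept: str) -> List[str]:
        """Walk the IS-A chain upward, nearest ancestor first."""
        chain: List[str] = []
        current = self.concepts[concept].parent
        while current is not None:
            chain.append(current)
            current = self.concepts[current].parent
        return chain

    def hypernyms(self, lemma: str) -> List[str]:
        """Lemmas mapped onto ancestor concepts of the given lemma's concept."""
        result: List[str] = []
        for item in self.items:
            if item.lemma != lemma:
                continue
            for concept in self.ancestors(item.concept):
                result.extend(
                    other.lemma for other in self.items if other.concept == concept
                )
        return result


# Hypothetical toy data, for illustration only.
lexicon = Lexicon()
lexicon.add_concept("ANIMAL")
lexicon.add_concept("MAMMAL", parent="ANIMAL")
lexicon.add_concept("DOG", parent="MAMMAL")
lexicon.add_item("동물", "noun", "ANIMAL")
lexicon.add_item("포유류", "noun", "MAMMAL")
lexicon.add_item("개", "noun", "DOG")

print(lexicon.hypernyms("개"))
```

Because hypernymy is derived from the concept links rather than stored per word, adding a new lexical item only requires mapping it onto a concept.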

Knowledge Base/Ontology

Ontology research in connection with the semantic processing of natural language is currently an active area. Ontologies, as structures of concepts, form part of the knowledge base needed for lexical bases, lexical networks, semantic networks and meta-NLP. In this field, our lab has been doing the following:

  • Constructing an ontology by structuring concepts and then classifying the Korean lexicon against it. The result is used to establish semantic relations and to build lexicons for specialized fields.
  • Research on applying ontologies in working systems, drawing on experience in developing the Mikrokosmos Ontology at the CRL of New Mexico State University.
  • Research on determining the suitability of Korean words using ontology-based language resources, as well as research on ontology integration.