Resume

Donghyeon Kim

Data and applied AI engineer connecting data structure, NLP/LLM evaluation, and implementation

Data & Applied AI Engineer | Data Systems | AI-DLC / MLOps | NLP / LLM
Location: Incheon, South Korea
Projects: 7
Publications: 3
Training & Certifications: 4

Summary

I work across data structure, AI development workflows, and NLP/LLM evaluation, carrying problems from framing to implementation, validation, and delivery.

  • I turn diverse data surfaces into structures and pipelines that can carry into modeling and operations.
  • I make AI experiments and operating workflows reproducible and observable.

Research, Programs, and Leadership

Graduate Researcher
Intelligent Data Analytics Lab., Gachon University | 2024.03 - 2026.02
  • Led graduate research on EMR-based nursing surveillance decision support and diagnostic classification.
  • Built end-to-end modeling pipelines using KM-BERT ensembles, XGBoost, and both structured and text data.
Research Project Participant
Institute of Information & Communications Technology Planning & Evaluation (IITP) | 2025.09 - 2025.12
  • Implemented evaluation-related code in a human-centered multimodal AI project.
  • Bridged evaluation requirements with actual code and reviewable deliverables.
Research Project Participant
National Research Foundation of Korea (NRF) | 2024.03 - 2025.12
  • Contributed to an NRF-funded clinical AI project centered on nursing surveillance decision support using EMR data.
  • Implemented workflows for clinical text understanding, including keyword extraction, dependency parsing-based preprocessing, topic modeling, and similarity analysis.
Student Leader and Community Organizer
Gachon University / Notion Community Program | 2019.03 - 2025.02
  • Held multiple leadership roles in the official university programming club and served as president in 2022.
  • Planned and led study groups on machine learning, big data, financial ML, and GNNs.

Projects

EMR-Based Nursing Surveillance for Automatic ICD Coding
Clinical AI Research | 2025

Built an automatic ICD coding pipeline for nursing surveillance of abdominal surgery patients using core EMR data.

  • Reviewed overall behavior and rare-class recall together
  • Classified from core EMR data only, without post-hoc documents
Contexta: Local-First ML Observability
Self-directed ML Platform Project | 2026

Designed and built Contexta as a local-first ML observability project for collecting, storing, querying, comparing, and recovering machine learning execution records...

  • Designed the local-first observability architecture
  • Implemented the canonical contract and workspace
Lynxes: Graph Analytics Engine
Graph Systems Engine Project | 2026

Designed and implemented Lynxes, an Apache Arrow-based graph analytics engine focused on CSR indexing, lazy execution, and a high-performance graph processing experi...

  • Designed the graph engine architecture
  • Implemented the CSR traversal structure