Portfolio | Donghyeon Kim

Review Mode

Choose a first reading angle

Start with the official record and role fit, then use the cases for technical depth.

Focus Areas

Reorder representative work by focus area

Choose data, AI workflow, or NLP/LLM to bring the most relevant projects forward.


Primary work

Lynxes: Graph Analytics Engine

Graph Systems Engine Project · 2026

Primary evidence for Data Systems review.

Shows system design and implementation depth for treating graph data as a first-class execution model.

Supporting signals

Contexta: Local-First ML Observability

Self-directed ML Platform Project · 2026

Supporting evidence for Data Systems review.

Primary work

Contexta: Local-First ML Observability

Self-directed ML Platform Project · 2026

Primary evidence for AI-DLC / MLOps review.

Shows operational observability design for tracing, comparing, and recovering AI execution records and artifacts.

Supporting signals

Lynxes: Graph Analytics Engine

Graph Systems Engine Project · 2026

Supporting evidence for AI-DLC / MLOps review.

BloGeek: AI Modules for a React + Spring Blog Project

Collaborative NLP Project · 2023

Supporting evidence for AI-DLC / MLOps review.

Primary work

BloGeek: AI Modules for a React + Spring Blog Project

Collaborative NLP Project · 2023

Primary evidence for NLP / LLM review.

Connects Korean NLP classification and generation models to product-shaped web service features.

Development Flow

Recurring development flow across AI projects

This summarizes recurring work patterns across projects, not a click-by-click stage diagram for one project.

Structured Data Design

Schema and relational design for structured inputs including EMR and feature tables.

Text & Embedding

Tokenization and semantic embedding space structure for unstructured text.

Knowledge Graph

Real-time knowledge graph design compatible with GNNs and recommender systems.

Hybrid Pipeline

Integrating heterogeneous sources into high-performance processing pipelines.

Data surface

Shape structured, text, image, and graph inputs into usable modeling surfaces.

Experiment Tracking

Keep hypotheses, settings, metrics, and artifacts inspectable.

Training flow

Manage model training through reproducible environments and workflows.

Evaluation design

Review errors, rare cases, and domain needs beyond top-line scores.

Deployment: Delivery

Package outputs into reviewable artifacts and execution surfaces.

Observability: Visibility

Use logs, artifacts, and run records to explain system state.

Feedback / Recovery

Connect comparison, recovery, and retrospection into the next cycle.

Language Modeling

Domain-specific pre-training and downstream task adaptation/fine-tuning.

Evaluation Design

Multi-dimensional evaluation addressing rare errors, biases, and domain constraints beyond simple metrics.

Alignment & Preference

Instruction tuning and alignment with human/AI preferences via RLAIF/RLHF.

Research-to-System

Translating advanced NLP research into production-grade web service APIs and lightweight optimized models.

Project Records

Project records

Project title and explanation stay separate from metadata such as data surfaces and availability.

Data Systems

Data Systems Specialist with Graph-native Depth

Turns heterogeneous data into model-ready and system-ready structures, pipelines, and graph representations.

Built project

Contexta: Local-First ML Observability

Shows operational observability design for tracing, comparing, and recovering AI execution records and artifacts.

structured, hybrid

Built project

Lynxes: Graph Analytics Engine

Shows system design and implementation depth for treating graph data as a first-class execution model.

structured, graph, hybrid

Published work / Summary available

EMR-Based Nursing Surveillance for Automatic ICD Coding

Combines heterogeneous structured EMR and Korean clinical text into an evaluable NLP pipeline.

structured, text, hybrid

Built project / Summary available

Dalkom Shop: Internal Employee Mileage Commerce Platform

Shows delivery, security, and observability foundations for running service features in an operational environment.

structured, image, hybrid

AI-DLC / MLOps

AI-DLC and Operational MLOps Engineer

Makes run records, artifacts, model behavior, and feedback paths inspectable enough to improve.

Built project

Contexta: Local-First ML Observability

Shows operational observability design for tracing, comparing, and recovering AI execution records and artifacts.

structured, hybrid

Built project

Lynxes: Graph Analytics Engine

Shows system design and implementation depth for treating graph data as a first-class execution model.

structured, graph, hybrid

Built project / Summary available

Dalkom Shop: Internal Employee Mileage Commerce Platform

Shows delivery, security, and observability foundations for running service features in an operational environment.

structured, image, hybrid

Prototype / Summary available

Devridge: LLM-Based Feedback Bridge for Developers

Shows prompt and interaction design for making LLM output useful in role-based technical review.

text


Built project / Summary available

BloGeek: AI Modules for a React + Spring Blog Project

Connects Korean NLP classification and generation models to product-shaped web service features.

text


Built project / Summary available

FRIMO: Conversational AI for Emotional Support and Diary Generation

Connects Korean NLP models into a user-facing AI pipeline for conversational product experience.

text


NLP / LLM

Applied NLP and LLM Research Engineer

Translates language-model research into better modeling and evaluation decisions in applied systems.

Published work / Summary available

EMR-Based Nursing Surveillance for Automatic ICD Coding

Combines heterogeneous structured EMR and Korean clinical text into an evaluable NLP pipeline.

structured, text, hybrid

Prototype / Summary available

Devridge: LLM-Based Feedback Bridge for Developers

Shows prompt and interaction design for making LLM output useful in role-based technical review.

text


Built project / Summary available

BloGeek: AI Modules for a React + Spring Blog Project

Connects Korean NLP classification and generation models to product-shaped web service features.

text


Built project / Summary available

FRIMO: Conversational AI for Emotional Support and Diary Generation

Connects Korean NLP models into a user-facing AI pipeline for conversational product experience.

text


Project Comparison

Featured work framed by problem, decision, and outcome

The portfolio is not a project list; it is a map of judgment, tradeoffs, and results.

Project Comparison

Compare representative projects by the same criteria

Review featured work by problem, decision, outcome, and availability before reading the full case.

Case	Problem	Decision	Outcome	Availability
Lynxes: Graph Analytics Engine (Graph Systems Engine Project / 2026)	Existing Python graph libraries and generic dataframe wrappers often struggle to combine memory efficiency, traversal performance, and lazy query optimization for large graph analytics.	Designed GraphFrame to own Arrow RecordBatches directly.	Established the foundation for a graph analytics engine with Arrow columnar memory, CSR-based traversal, and lazy collect execution.	Built project / Public
Contexta: Local-First ML Observability (Self-directed ML Platform Project / 2026)	ML experiments and deployment work often scatter metadata, records, and artifacts across tools, making reproducible local observability hard to maintain.	Used a `.contexta/` workspace as the home for separated metadata, records, and artifact storage.	Implemented a local observability foundation for consistently managing and inspecting ML execution history and artifacts.	Built project / Public
BloGeek: AI Modules for a React + Spring Blog Project (Collaborative NLP Project / 2023)	The product needed ML components that could classify emotional polarity and generate stylistic variations of text to support richer blog content workflows.	Used KoBERT for polarity recognition and KoBART for style transfer.	The project gave the team practical AI modules for blog-oriented text processing.	Built project / Summary available

Graph Systems Engine Project / 2026

Lynxes: Graph Analytics Engine

A high-performance graph analytics engine that combines Arrow columnar memory with graph-native traversal structures for Python users.

Problem

Existing Python graph libraries and generic dataframe wrappers often struggle to combine memory efficiency, traversal performance, and lazy query optimization for large graph analytics.

Key Decision

Designed GraphFrame to own Arrow RecordBatches directly.

Outcome

Established the foundation for a graph analytics engine with Arrow columnar memory, CSR-based traversal, and lazy collect execution.
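
For readers unfamiliar with CSR (compressed sparse row) traversal, the idea can be sketched in a few lines of plain Python. This is only an illustration of the general technique, not Lynxes code: the function names `build_csr` and `bfs_order` are hypothetical, and plain lists stand in for the Arrow-backed buffers the project describes.

```python
from collections import deque


def build_csr(num_nodes, edges):
    """Pack (src, dst) pairs into CSR form: an offsets array and a targets array."""
    # Count outgoing edges per source node.
    counts = [0] * num_nodes
    for src, _ in edges:
        counts[src] += 1
    # offsets[n]..offsets[n + 1] delimits the neighbours of node n in `targets`.
    offsets = [0] * (num_nodes + 1)
    for node in range(num_nodes):
        offsets[node + 1] = offsets[node] + counts[node]
    # Fill targets, tracking the next free slot for each source node.
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def bfs_order(offsets, targets, start):
    """Breadth-first traversal that reads each neighbour list as one contiguous slice."""
    seen = [False] * (len(offsets) - 1)
    seen[start] = True
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for i in range(offsets[node], offsets[node + 1]):
            nxt = targets[i]
            if not seen[nxt]:
                seen[nxt] = True
                queue.append(nxt)
    return order


if __name__ == "__main__":
    offsets, targets = build_csr(4, [(0, 1), (0, 2), (1, 3), (2, 3)])
    print(offsets, targets, bfs_order(offsets, targets, 0))
```

Because every neighbour list is a contiguous range of one flat array, traversal avoids per-node objects, which is the property that tends to pair well with columnar memory.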

Self-directed ML Platform Project / 2026

Contexta: Local-First ML Observability

A local-first observability library for tracing, comparing, and recovering ML execution history through one consistent contract.

Problem

ML experiments and deployment work often scatter metadata, records, and artifacts across tools, making reproducible local observability hard to maintain.

Key Decision

Used a `.contexta/` workspace as the home for separated metadata, records, and artifact storage.

Outcome

Implemented a local observability foundation for consistently managing and inspecting ML execution history and artifacts.
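
To make the "separated metadata, records, and artifact storage" idea concrete, here is a minimal sketch under stated assumptions. It is not the Contexta API: the subdirectory names, the file formats, and the functions `init_workspace`, `save_run`, and `load_run` are all hypothetical, chosen only to show one way a `.contexta/` workspace could keep the three stores apart.

```python
import json
import tempfile
from pathlib import Path

# Assumed layout: the source only states that metadata, records, and artifacts
# are stored separately under `.contexta/`; these directory names are illustrative.
STORES = ("metadata", "records", "artifacts")


def init_workspace(project_root):
    """Create the workspace directory with one subdirectory per store."""
    workspace = Path(project_root) / ".contexta"
    for store in STORES:
        (workspace / store).mkdir(parents=True, exist_ok=True)
    return workspace


def save_run(workspace, run_id, params, metrics, artifact):
    """Write run metadata, append one metrics record, and store an artifact blob."""
    meta_path = workspace / "metadata" / f"{run_id}.json"
    meta_path.write_text(json.dumps({"run_id": run_id, "params": params}), encoding="utf-8")
    # Records are append-only JSON lines, so a run can log many steps.
    with (workspace / "records" / f"{run_id}.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(metrics) + "\n")
    (workspace / "artifacts" / f"{run_id}.bin").write_bytes(artifact)


def load_run(workspace, run_id):
    """Read back all three parts of a run for inspection or comparison."""
    meta = json.loads((workspace / "metadata" / f"{run_id}.json").read_text(encoding="utf-8"))
    with (workspace / "records" / f"{run_id}.jsonl").open(encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    artifact = (workspace / "artifacts" / f"{run_id}.bin").read_bytes()
    return meta, records, artifact


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ws = init_workspace(tmp)
        save_run(ws, "run-001", {"lr": 0.001}, {"epoch": 1, "loss": 0.42}, b"weights")
        print(load_run(ws, "run-001"))
```

Keeping the three stores in separate directories means small metadata files can be scanned or compared without touching large artifact files.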

Collaborative NLP Project / 2023

BloGeek: AI Modules for a React + Spring Blog Project

A Korean NLP project connecting emotion classification and style transfer models to a blog product workflow.

Problem

The product needed ML components that could classify emotional polarity and generate stylistic variations of text to support richer blog content workflows.

Key Decision

Used KoBERT for polarity recognition and KoBART for style transfer.

Outcome

The project gave the team practical AI modules for blog-oriented text processing.

Process

A repeatable way of turning research into systems

This section shows how research, experimentation, implementation, and delivery connect.

01

Feature Engineering

Data structure and management

I preprocess complex data in ways that fit the task and design pipelines that preserve data quality and consistency.

02

Reproducible Experiments

Automating training

I train models under carefully controlled code and environment settings and track each run systematically, so that experiments can be reproduced at any time.

03

Robust Evaluation

Validation and assessment

Going beyond simple accuracy, I examine robustness, error cases, and application-context requirements from multiple angles before practical use.
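
As a small illustration of why top-line accuracy alone can mislead, the sketch below compares overall accuracy with per-class recall on a toy imbalanced label set. The labels and the helper names `accuracy` and `per_class_recall` are hypothetical examples, not taken from any project on this page.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per true label: correct predictions divided by that label's support."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}


if __name__ == "__main__":
    # Toy data: nine common cases and one rare case, with a model that
    # always predicts the common label.
    y_true = ["common"] * 9 + ["rare"]
    y_pred = ["common"] * 10
    print(accuracy(y_true, y_pred))
    print(per_class_recall(y_true, y_pred))
```

Here the overall score looks strong while the rare class is never recovered, which is the kind of gap a per-class breakdown is meant to expose.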

04

DevOps

Infrastructure and deployment

I deploy models on AWS or in Docker environments and support reliable operation in real-world settings through continuous integration and automation.

05

System Observability

Monitoring and feedback loops

I collect and visualize logs, resource signals, and prediction outputs from deployed AI systems in real time so that the internal state of the pipeline can be observed transparently.

Research

Research records

Research provides the modeling and evaluation background behind project decisions.

Journal of The Korea Society of Computer and Information / 2025

Deep Learning based Automatic ICD Coding for Nursing Surveillance of Abdominal Surgery Patients

Supports the portfolio claim that NLP/LLM systems should be judged through domain data structure and error distribution, not only headline accuracy.

EMR-Based Nursing Surveillance for Automatic ICD Coding

Recognition

Learning and practice signals behind the work

Certificates and awards support the work record rather than replacing it.

Certificate

Practical Implementation of Monitoring and Testing in DevOps Environments

LLOYDK / 2023.11

Completed practical training in Elastic-based DevOps monitoring and testing.

Certificate

Multi Cloud Orchestration Program

5Works / 2023.12

Completed HashiCorp-based multi-cloud orchestration and IaC training.

Certificate

Company-Led Intensive Project Training

DK Techin / 2024.02

Participated in industry-linked practical training focused on security and DevOps engineering.

Certificate

Micro Degree in Software Specialist Training

Gachon University / 2024.02

Completed a micro-degree program for training software specialists.

Use the resume for formal review and the cases for technical depth

I work across data structure, AI development workflows, and NLP/LLM evaluation, carrying problems from framing to implementation, validation, and delivery.


Email: eastlighting1@gachon.ac.kr
GitHub: github.com/eastlighting1
LinkedIn: www.linkedin.com/in/동현-김-350b4b29b
Google Scholar: scholar.google.com/citations?user=3BpqnYYAAAAJ

Data & Applied AI Engineer
