LLM Coding Tutor

Automated grading with per-commit LLM feedback for introductory programming.

What it is

LLM Coding Tutor is an open platform for teaching introductory programming with automated grading and LLM-driven per-commit feedback.

GitHub Classroom delivers assignments to students.
GitHub Actions grade every commit inside a prebuilt Docker image.
An AI tutor reads the student's code and test results, and writes feedback next to the pass/fail score. The demo assignments run in English; the platform supports other languages as a per-assignment setting — CPF and ECA at Tech University of Korea get bilingual Korean + English feedback.
Every run is harvested — we record the full commit trajectory so we can study how students actually learn.

Students need nothing but a browser. No local toolchain, no Python install, no Docker. The entire loop runs on GitHub.

Why

First-year programming students get stuck on small mechanical errors that are obvious to a human TA and invisible to a traditional autograder. An LLM tutor that reads the code and the failing test output can unblock those moments in seconds — and we can measure whether that help actually helps.

The students don't have to be CS majors. The two courses it powers today are engineering courses at Tech University of Korea — CPF (Computer Programming Fundamentals) for first-year engineering students, and ECA (Engineering Computational Analysis) for juniors who need programming as a means to domain work rather than as an end. The same pipeline should work for any Python or C++ homework in any lab.

The zero-install design — nothing runs on the student's machine; everything runs in GitHub's cloud — was originally for freshmen with weak laptops. The same property makes the platform deployable anywhere with internet: LMIC universities, budget-constrained schools, or classrooms on satellite internet. Bringing personalized feedback to students who would otherwise have none is a direction we are actively working toward.

One more thing worth saying out loud: most of this platform's configuration — Dockerfiles, GitHub Actions workflows, grader and assignment templates, and this landing page — was built with help from several LLM coding assistants. The tools we teach about are the tools we use.

Try it

The fastest way to feel the loop is to accept a demo assignment yourself. Each link opens a GitHub Classroom invite hosted under the try-ai-tutor demo organization; accepting creates a private repo with your username. Push a change, watch CI grade it, read the tutor's feedback.

020 — Rectangle area (code)
Write exercise.py; the grader runs it against a hidden test suite and the tutor explains what went wrong (or right).
020p — Rectangle area (prompt-only)
Write a prompt. An LLM turns it into code. The code gets graded. If it fails, the tutor tells you which detail your prompt was missing. Covered in the demo video below.

You'll need a GitHub account. No local setup is required. If accepting fails with a 500 error, the repo is created but you are not added as a collaborator — please reach out via the contact form with your GitHub username and the assignment name so we can add you manually.

Architecture

At the coarsest level, three independent pieces talk through GitHub:

Grader image
Docker · GHCR
tests + AI tutor baked in

→

Student repo
GitHub Classroom
one per student, per assignment

→

Collector
actions-collector
harvests every run

Grader image — a private Docker image per assignment (ghcr.io/kangwonlee/python-pytest-NNN) containing pytest, style checks, the LLM tutor, and the hidden test cases.
Student repo — created by GitHub Classroom from a public homework template. A thin workflow pulls the grader image and runs it on every push.
Collector — a separate tool that walks every org and archives run logs, scores, and commit trajectories into a local dataset for research.
Owl / Canary — the meta-test framework that grades the graders, so a regression in the autograder is caught before it reaches students.

Each piece is a separately versioned repository under try-ai-tutor and kangwonlee. The LLM tutor ships with configs for Gemini, Claude, Grok, Perplexity, and NVIDIA NIM out of the box, and any OpenAI-compatible REST endpoint can be added by subclassing a small config class. Replacing one piece does not touch the others.

Videos

RoboRacer Lab 02 — autonomous-vehicle AEB demo

The platform applied to F1Tenth autonomous-racing coursework: a safety_node graded against unit tests, closed-loop physics, and an AI tutor. Full write-up on the Lab 02 page.

Prompt-only assignment — 020p area demo

Silent screen capture. A prompt-only assignment: a vague first prompt earns 2/5 and the tutor names the missing details; a revised prompt earns 5/5.

PyCon Korea 2024 — infrastructure walkthrough

Conference talk covering the build-push / Classroom / auto-grading pipeline.