TL;DR
I fine-tuned a compact AI model (a distilled 8B Llama) to run on basic, second-hand computers without needing an internet connection. The model takes any article and, using a structured JSON template and a “Chain of Thought” reasoning process, automatically converts it into an interactive lesson plan complete with key concepts, quizzes, and discussion prompts. The goal is to make a powerful learning tool that’s accessible and useful in places with limited resources.
The Problem
Many children worldwide lack access to quality education due to geographic and economic barriers, leaving them without the critical thinking skills needed to break cycles of poverty and misinformation. Technology can level this playing field by bringing personalized, high-quality educational content directly to underserved communities, even without internet connectivity.
Meet the Pocket-Sized Teacher
My answer is an AI tutor that can live on a second-hand computer and keep working when the power goes out. The goal is to create a fine-tuned model that transforms any text—Wikipedia articles, textbook chapters, news stories—into structured, interactive educational content.
Here’s an example generated from a Wikipedia article on Enzyme Kinetics:
Enzyme Kinetics Worksheet

For children in resource-constrained environments, this means having access to a tireless personal tutor that adapts any content to their learning level, giving every child the same quality educational support regardless of their economic background.
More Examples
Here are a few more examples of worksheets generated by the model. Click on each one to see the rendered HTML output in a new tab.
Structured Learning with JSON
To achieve this, I first needed to define what a “structured learning module” would look like. I decided on a JSON format because it’s machine-readable, structured, and flexible. This means any application—web, mobile, or desktop—can parse and display the content consistently.
Here is the JSON schema I designed for the lesson plans:
{
"type": "object",
"properties": {
"lesson_title": {"type": "string", "minLength": 1},
"learning_objectives": {
"type": "array", "items": {"type": "string"}, "minItems": 2, "maxItems": 5
},
"key_concepts": {
"type": "array", "minItems": 3, "maxItems": 5,
"items": {
"type": "object",
"properties": {
"concept": {"type": "string"},
"explanation": {"type": "string", "minLength": 20},
"analogy_or_simple_example": {"type": "string", "minLength": 15}
},
"required": ["concept", "explanation", "analogy_or_simple_example"]
}
},
"quiz_questions": {
"type": "array", "minItems": 2, "maxItems": 4,
"items": {
"type": "object",
"properties": {
"question": {"type": "string"},
"options": {"type": "array", "items": {"type": "string"}, "minItems": 3, "maxItems": 4},
"answer": {"type": "string"},
"explanation": {"type": "string", "minLength": 15}
},
"required": ["question", "options", "answer", "explanation"]
}
},
"reading_level": {"type": "string"},
"engagement_hook": {"type": "string"},
"thought_experiment_or_discussion": {
"type": "string",
"description": "A resource-free activity to encourage critical thinking, observation, or discussion. Must not require any special materials or specific environments.",
"minLength": 20
}
},
"required": [
"lesson_title", "learning_objectives", "key_concepts", "quiz_questions",
"reading_level", "engagement_hook", "thought_experiment_or_discussion"
]
}
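A nice side effect of pinning everything to a schema is that generated lessons can be checked automatically before they ever reach a student. Here's a rough sketch of what that validation could look like in Python with the jsonschema library (the file paths are placeholders, not my actual pipeline):

import json
from jsonschema import Draft7Validator  # pip install jsonschema

# Load the lesson-plan schema shown above (placeholder path).
with open("lesson_schema.json") as f:
    schema = json.load(f)

def validate_lesson(raw_output: str) -> list[str]:
    """Parse the model's raw text and return a list of schema violations."""
    try:
        lesson = json.loads(raw_output)
    except json.JSONDecodeError as e:
        return [f"not valid JSON: {e}"]
    validator = Draft7Validator(schema)
    return [err.message for err in validator.iter_errors(lesson)]

# Example: reject (and re-generate) the lesson if anything comes back.
errors = validate_lesson(open("generated_lesson.json").read())
if errors:
    print("Lesson rejected:", errors)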
How to Teach a Small Model Big Tricks
A tiny model needs good mentors. The LLM I’m using as the brain is a distilled version of Llama 3.1 8B called DeepSeek-R1-Distill-Llama-8B—small enough to survive on a CPU once quantized to 4 bits, but big enough to analyse an article and come up with useful analogies.
I’m not 100% sure this was the best choice, but it works. I want to test the same data on Qwen3 (not the distill) to see how it performs. I’d also like to try Gemma 3n E4B (about 4B active parameters), but that’s not a reasoning model, so I’d have to strip the reasoning traces out of my dataset. I’m not sure whether models smaller than 4B can understand the article well enough to produce the analogies I’m looking for, so if anyone knows of a smaller reasoning model, please let me know.
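For context, running a 4-bit quantized build of the model on a CPU with llama-cpp-python looks roughly like this; the GGUF file name is a placeholder and the real prompt follows the fine-tune's chat template rather than this simplified one:

from llama_cpp import Llama  # pip install llama-cpp-python

# Placeholder GGUF path: a 4-bit (Q4_K_M) quant of the fine-tuned model.
llm = Llama(
    model_path="lesson-tutor-8b-q4_k_m.gguf",
    n_ctx=8192,    # room for a full article plus the lesson JSON
    n_threads=4,   # the i5-7500T has 4 cores
)

article = open("article.txt").read()
prompt = (
    "Convert the following article into a lesson plan as JSON matching the schema.\n\n"
    f"Article:\n{article}\n"
)

out = llm(prompt, max_tokens=4096, temperature=0.2)
print(out["choices"][0]["text"])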
Boosting Reasoning with “Chain of Thought”
To create a truly capable model, I wanted it to learn the reasoning process behind creating a lesson plan. I wrote a script to generate a “reasoning trace” for each article-JSON pair—a first-person narrative of the thought process a curriculum developer might have. The key was to preserve the model’s internal monologue, the <think> tokens, during training so it would learn how to reason, not just what to output.
By including this trace in the training data, the model learns not just to mimic the output, but to understand the underlying process of instructional design.
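The dataset format will ship with the rest of the stack, but the shape of a single training example is roughly this (a simplified sketch using the common chat-style JSONL convention, with placeholder text standing in for a real article and lesson):

import json

# Placeholders standing in for a real article and its checked lesson plan.
article_text = "...full Wikipedia article text..."
lesson_plan = {"lesson_title": "Enzyme Kinetics", "...": "..."}

# One training example: the assistant turn keeps the <think> monologue
# in front of the final JSON, so the model learns the reasoning, not just the output.
example = {
    "messages": [
        {"role": "user",
         "content": "Convert this article into a lesson plan (JSON):\n" + article_text},
        {"role": "assistant",
         "content": "<think>The article covers enzyme kinetics. A relatable hook "
                    "would be digestion... I need 3-5 key concepts, each with an "
                    "everyday analogy...</think>\n" + json.dumps(lesson_plan)},
    ]
}

with open("train.jsonl", "a") as f:
    f.write(json.dumps(example, ensure_ascii=False) + "\n")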
Things I Need to Take a Look At
The model currently takes about two minutes to generate a lesson plan on my Mac M3 Pro, and that’s with GPU acceleration. The main bottleneck seems to be the “Chain of Thought” reasoning. The model spends a lot of time thinking before it starts writing the JSON output.
My next step is to experiment with a few optimizations:
- Concise Reasoning: I’ll train a version of the model where the reasoning trace is more focused and less verbose. The goal is to see if I can speed up the output without sacrificing quality.
- No Reasoning: I also want to see what happens if I remove the reasoning trace entirely. If the model can produce high-quality lessons without the explicit reasoning step, I can significantly improve performance.
- Different Models: I’m planning to test other models as well. I want to try Qwen3 (which supports reasoning) and Google’s Gemma3N (which doesn’t) to compare their performance and output quality.
The ideal outcome is a model that is both fast and accurate, making it practical for real-world classroom use.
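To compare these variants fairly, I want to split wall-clock time into “thinking” and “writing JSON”. A rough harness like the one below should do it (again llama-cpp-python with the same placeholder model file; the split on </think> assumes the reasoning models' output format):

import time
from llama_cpp import Llama

llm = Llama(model_path="lesson-tutor-8b-q4_k_m.gguf", n_ctx=8192, n_threads=4)
prompt = "Convert the following article into a lesson plan as JSON...\n\n" + open("article.txt").read()

def timed_generation(prompt: str):
    """Stream tokens and record when the model stops thinking and starts the JSON."""
    start = time.perf_counter()
    think_done_at = None
    text = ""
    for chunk in llm(prompt, max_tokens=4096, stream=True):
        text += chunk["choices"][0]["text"]
        if think_done_at is None and "</think>" in text:
            think_done_at = time.perf_counter() - start
    total = time.perf_counter() - start
    return think_done_at, total, text

thinking, total, _ = timed_generation(prompt)
print(f"thinking: {thinking:.1f}s, total: {total:.1f}s")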
Fighting Load-Shedding with Sunshine
None of this matters if the machine dies every time the grid does—a regular occurrence in places like Kharian. For long-term deployments, we’ll need something like a 100Ah LiFePO₄ battery paired with a cheap 300-500W solar panel. Alternatively, we could forgo PCs and use laptops; a used ThinkPad has its own battery, which means one less component to fail in the midsummer heat.
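To sanity-check that sizing, the back-of-the-envelope maths looks like this; the PC's average draw and the local peak-sun-hours are assumptions, not measurements:

# Back-of-the-envelope power budget (all figures are assumptions).
battery_wh = 12.8 * 100            # 100Ah LiFePO4 at ~12.8V nominal ≈ 1280Wh
usable_wh = battery_wh * 0.8       # keep a margin on depth of discharge
pc_draw_w = 30                     # assumed average draw of an i5-7500T ThinkCentre
panel_w = 300                      # lower end of the 300-500W panel range
sun_hours = 5                      # rough peak-sun-hours assumption for Punjab

runtime_h = usable_wh / pc_draw_w               # ≈ 34 hours on battery alone
daily_harvest_wh = panel_w * sun_hours * 0.75   # ≈ 1125Wh/day after system losses

print(f"Battery-only runtime: ~{runtime_h:.0f} h")
print(f"Daily solar harvest: ~{daily_harvest_wh:.0f} Wh vs ~{pc_draw_w * 8} Wh for a school day")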
The parts cost roughly £80 for the computer and another ~£250 for the solar kit, which makes the solar kit by far the most expensive component; I’m looking for ways to bring that cost down. For now, I’ll be deploying the machines without solar backup.
What Happens Next?
The first classroom trial will involve one ThinkCentre PC (i5-7500T, 8GB RAM), my fine-tuned model, and 512GB of offline educational content from Internet-in-a-Box, including Wikipedia, TED Talks, and Khan Academy. A simple app will tie it all together, allowing teachers and students to paste in articles, select a grade level, and get a printable lesson.
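The app can stay very thin: paste the article, run the model, validate the JSON, render something printable. As an illustration of that last step (not the actual front-end), turning a validated lesson into a worksheet is little more than:

import html, json

def lesson_to_html(lesson: dict) -> str:
    """Render a validated lesson plan as a minimal printable worksheet."""
    e = html.escape
    parts = [f"<h1>{e(lesson['lesson_title'])}</h1>",
             f"<p><em>{e(lesson['engagement_hook'])}</em></p>",
             "<h2>Learning objectives</h2><ul>"]
    parts += [f"<li>{e(obj)}</li>" for obj in lesson["learning_objectives"]]
    parts.append("</ul><h2>Key concepts</h2>")
    for c in lesson["key_concepts"]:
        parts.append(f"<h3>{e(c['concept'])}</h3><p>{e(c['explanation'])}</p>"
                     f"<p><strong>Think of it like:</strong> {e(c['analogy_or_simple_example'])}</p>")
    parts.append("<h2>Quiz</h2><ol>")
    for q in lesson["quiz_questions"]:
        opts = "".join(f"<li>{e(o)}</li>" for o in q["options"])
        parts.append(f"<li>{e(q['question'])}<ul>{opts}</ul></li>")
    parts.append("</ol><h2>Discussion</h2>"
                 f"<p>{e(lesson['thought_experiment_or_discussion'])}</p>")
    return "\n".join(parts)

lesson = json.load(open("generated_lesson.json"))  # placeholder path
open("worksheet.html", "w").write(lesson_to_html(lesson))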
If the little AI survives the Pakistani summer, I’ll open-source the whole stack: datasets, training scripts, and the front-end. Break it, fork it, translate it—carry it on. Whatever pushes the idea further and helps more children learn.
I’m rusty at writing, but I’ll keep posting progress here. If you’ve got suggestions, critiques, or a spare stick of DDR4 that deserves a better life, my inbox is open: inbox@bilawal.net.
Key Area | Focus | Status |
---|---|---|
Goal | An offline AI tutor for children in underserved communities. | In Progress |
Model | Fine-tuning DeepSeek-R1-Distill-Llama-8B to run on second-hand, low-power hardware. | Complete |
Method | Using a JSON schema and “Chain of Thought” reasoning to convert articles into interactive lessons. | Complete |
Hardware | Deploying on a ThinkCentre PC (i5-7500T, 8GB RAM) with offline content from Internet-in-a-Box. | Planned |
Next Step | First classroom trial in Pakistan, followed by open-sourcing the entire stack. | Upcoming |