15 AI & LLM Projects (2026)

DodaTech Updated Jun 20, 2026 5 min read

Large language models are reshaping how we build software. The projects below teach prompt engineering, retrieval-augmented generation, AI agent architectures, and fine-tuning, all skills that are in high demand. Start with simple API wrappers and progress to custom RAG pipelines and autonomous agents.

Beginner Projects

1. Prompt Engineering Playground

Difficulty: ⭐
Skills: LLM API (OpenAI/Claude), prompt design, temperature/top-p tuning
Build a UI for testing prompts. Features: adjustable system/user prompts, temperature slider, response streaming, save prompt templates, compare responses side by side.
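
To make the temperature and top-p sliders less of a black box, here is a minimal sketch of what the two settings do to a next-token distribution. It is a simplified illustration, and providers may implement sampling differently in detail. The function names are illustrative, not part of any API.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}
```

Calling `softmax_with_temperature` on the same logits at two temperatures and showing the results side by side is a quick way to demonstrate the slider's effect in the playground.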

2. AI Chatbot UI

Difficulty: ⭐
Skills: Chat completion API, conversation history, streaming responses
Build a clean chat interface for any LLM. Features: message bubbles, markdown rendering in responses, chat history persistence, stop generation button, dark mode.
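
Conversation history eventually outgrows the model's context window, so a chat UI usually needs some trimming policy. The sketch below keeps the system message plus the most recent messages that fit a budget. The four-characters-per-token estimate is a rough assumption; a real tokenizer would be more accurate.

```python
def estimate_tokens(text):
    # Rough heuristic (assumption): about 4 characters per token for English text.
    return max(1, len(text) // 4)

def trim_history(messages, max_tokens):
    """Keep a leading system message, then as many of the newest messages as fit."""
    system = [m for m in messages[:1] if m["role"] == "system"]
    rest = messages[len(system):]
    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system)
    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return system + list(reversed(kept))  # restore chronological order
```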

3. Markdown-to-Text Summarizer

Difficulty: ⭐
Skills: LLM summarization, chunking long text, token counting
Build a tool that summarizes markdown documents. Features: paste markdown input, configurable summary length (short/medium/long), bullet or paragraph output, export summary.
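
One common way to handle documents longer than the context window is to split them into overlapping chunks, summarize each chunk, and then summarize the summaries. The chunking step might look like the sketch below. It counts words rather than tokens for simplicity, and the default sizes are arbitrary assumptions.

```python
def chunk_words(text, max_words=200, overlap=20):
    """Split text into chunks of at most max_words words,
    repeating `overlap` words between neighbouring chunks."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary visible to both chunks.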

4. AI Email Reply Generator

Difficulty: ⭐⭐
Skills: Prompt templates, context injection, tone control
Build an assistant that drafts email replies. Features: paste received email, select tone (formal/friendly/urgent), generate reply draft, edit and copy, multiple variations.
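
Tone control usually comes down to a prompt template with a slot for the tone instruction. A minimal sketch follows; the tone wording and the `build_reply_prompt` name are illustrative assumptions.

```python
TONE_GUIDES = {
    "formal": "Use a formal, professional register.",
    "friendly": "Use a warm, conversational register.",
    "urgent": "Be brief and make the required action and deadline explicit.",
}

def build_reply_prompt(received_email, tone, variations=1):
    """Inject the received email and the chosen tone into a fixed template."""
    if tone not in TONE_GUIDES:
        raise ValueError(f"unknown tone: {tone!r}")
    return (
        "Draft a reply to the email below.\n"
        f"{TONE_GUIDES[tone]}\n"
        f"Write {variations} distinct variation(s), "
        "separated by a line containing only '---'.\n\n"
        f"Email:\n{received_email.strip()}"
    )
```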

5. Content Rewriting Tool

Difficulty: ⭐⭐
Skills: Paraphrasing, style transfer, prompt chaining
Build a tool that rewrites content in different styles. Features: rewrite as professional/casual/academic, preserve key facts, length control (shorter/longer), batch processing for multiple paragraphs.

Intermediate Projects

6. RAG Pipeline (PDF Q&A)

Difficulty: ⭐⭐⭐
Skills: Document chunking, embeddings, vector DB (Chroma/Pinecone), retrieval
Build a system that answers questions from PDF documents. Features: PDF ingestion and chunking, embedding generation, vector store indexing, semantic search, answer generation with source citations.
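
The retrieval and citation steps can be sketched without any external service. The example below uses bag-of-words counts as a stand-in for a real embedding model and a plain list as a stand-in for a vector DB. Both are assumptions made to keep the sketch self-contained, and retrieval quality will be much lower than with real embeddings.

```python
import math
from collections import Counter

def embed(text):
    # Stand-in for an embedding model (assumption): word counts as a sparse vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c["text"])), reverse=True)
    return ranked[:k]

def build_prompt(question, hits):
    """Put retrieved chunks in the prompt, each tagged with its source for citation."""
    context = "\n".join(f"[{hit['source']}] {hit['text']}" for hit in hits)
    return (
        "Answer using only the context below and cite sources in brackets.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```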

7. AI Research Assistant (Web Search + LLM)

Difficulty: ⭐⭐⭐
Skills: Web search API integration, result summarization, citation
Build a research tool that searches the web and summarizes findings. Features: query multiple sources, extract relevant snippets, generate research summary, cite sources, export to markdown.

8. Custom Chatbot with Memory

Difficulty: ⭐⭐⭐
Skills: Conversation buffer, session management, summarization memory
Build a chatbot that remembers past conversations. Features: short-term (recent messages) and long-term (summarized) memory, user identification, memory retrieval on relevant topics, forget/reset command.
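
A minimal sketch of the two memory tiers: recent messages stay verbatim in a small buffer, and messages that fall out of the buffer are folded into a running summary. The `summarize` callable is a placeholder for an LLM call, and the class and method names are hypothetical.

```python
from collections import deque

class ChatMemory:
    def __init__(self, summarize, short_term_size=4):
        # summarize(old_summary, evicted_messages) -> new_summary; an LLM call in practice.
        self.summarize = summarize
        self.short_term_size = short_term_size
        self.recent = deque()
        self.summary = ""

    def add(self, role, content):
        self.recent.append((role, content))
        evicted = []
        while len(self.recent) > self.short_term_size:
            evicted.append(self.recent.popleft())
        if evicted:
            self.summary = self.summarize(self.summary, evicted)

    def context(self):
        """Text to prepend to the next prompt: long-term summary, then recent turns."""
        lines = [f"Summary of earlier conversation: {self.summary}"] if self.summary else []
        lines += [f"{role}: {content}" for role, content in self.recent]
        return "\n".join(lines)

    def reset(self):
        """The forget/reset command: drop both memory tiers."""
        self.recent.clear()
        self.summary = ""
```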

9. AI Code Review Tool

Difficulty: ⭐⭐⭐
Skills: Code context injection, diff analysis, best practices prompting
Build a tool that reviews code diffs. Features: paste code or diff, auto-detect language, review categories (bugs, style, security, performance), suggestion generation, pass/fail rating.
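
Before prompting the model, it helps to isolate the lines that actually changed. The sketch below does that with Python's standard `difflib` and builds a review prompt around the result. The prompt wording and function names are illustrative assumptions.

```python
import difflib

def added_lines(old_source, new_source):
    """Return (line_number_in_new_file, text) for each inserted or replaced line."""
    old_lines, new_lines = old_source.splitlines(), new_source.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines)
    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            changed.extend((j + 1, new_lines[j]) for j in range(j1, j2))
    return changed

def build_review_prompt(old_source, new_source,
                        categories=("bugs", "style", "security", "performance")):
    listing = "\n".join(f"{number}: {text}" for number, text in added_lines(old_source, new_source))
    return (
        f"Review the changed lines below for {', '.join(categories)}. "
        "End with PASS or FAIL and a one-line reason.\n\n" + listing
    )
```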

10. Meeting Note Taker (Transcription + Summary)

Difficulty: ⭐⭐⭐⭐
Skills: Speech-to-text API, LLM summarization, speaker diarization
Build a tool that transcribes and summarizes meetings. Features: upload audio file, speaker identification, timestamped transcript, action item extraction, meeting summary with key decisions.

11. Multi-Agent Research System

Difficulty: ⭐⭐⭐⭐
Skills: Agent orchestration, task delegation, tool use
Build a system with multiple AI agents that collaborate. Features: orchestrator agent delegates to specialist agents (search, summarize, fact-check), agents use tools (web, calculator, DB), final synthesized report.
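
The delegation pattern can be sketched with plain functions standing in for the specialist agents. In a real system each specialist would call an LLM or a tool, and the orchestrator would choose the plan itself. Here the plan is passed in, and all names are hypothetical.

```python
def search_agent(task):
    return f"[search] sources found for: {task}"  # stub for a web-search tool call

def summarize_agent(task):
    return f"[summary] {task[:60]}"  # stub for an LLM summarization call

def fact_check_agent(task):
    return f"[fact-check] no contradictions found in: {task[:60]}"  # stub

class Orchestrator:
    def __init__(self, specialists):
        self.specialists = specialists  # maps a specialist name to a callable

    def run(self, question, plan):
        """Pass the question through each specialist in `plan`,
        feeding every output into the next step, and keep a transcript."""
        transcript = []
        current = question
        for name in plan:
            if name not in self.specialists:
                raise KeyError(f"no specialist named {name!r}")
            current = self.specialists[name](current)
            transcript.append((name, current))
        return transcript
```

Keeping the transcript makes it possible to show which agent contributed what in the final report.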

Advanced Projects

12. Fine-Tune a Small LLM (LoRA)

Difficulty: ⭐⭐⭐⭐⭐
Skills: LoRA / QLoRA, Hugging Face transformers, dataset preparation, evaluation
Fine-tune a small open-source LLM (Llama 3, Mistral, Phi-3) on custom data. Features: prepare instruction dataset, LoRA config, training with PEFT, inference with merged weights, evaluate on holdout set.
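
The arithmetic behind LoRA explains why it is cheap: instead of updating a full d_out × d_in weight matrix W, it trains two small matrices B (d_out × r) and A (r × d_in), and merging adds (alpha / r) · B·A back into W. The sketch below works this out with plain Python lists. It illustrates the math only and does not use the PEFT library.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

def merge_lora(W, A, B, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) using nested lists."""
    scale = alpha / rank
    rows, cols = len(W), len(W[0])
    return [
        [W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(rank)) for j in range(cols)]
        for i in range(rows)
    ]

# A 4096 x 4096 projection at rank 8 trains 65,536 parameters
# instead of 16,777,216, which is about 0.39% of the full matrix.
full = 4096 * 4096
adapter = lora_trainable_params(4096, 4096, 8)
print(adapter, full, adapter / full)
```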

13. AI Coding Agent

Difficulty: ⭐⭐⭐⭐⭐
Skills: Code generation, sandboxed execution, iterative debugging
Build an agent that writes and tests code. Features: natural language task input, generate code skeleton, execute in sandbox, read errors and fix, test generation, explain code output.
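
The generate, execute, read-errors, fix loop can be sketched as below. Note the loud assumption: running code in a subprocess with a timeout is not a real sandbox, so untrusted code still needs a container or VM. The `generate` callable stands in for the LLM, and all names are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

def run_python(code, timeout=5):
    """Run code in a separate interpreter and capture its output.
    A subprocess with a timeout is NOT a security boundary."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "task.py")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(code)
        try:
            proc = subprocess.run(
                [sys.executable, "-I", path],  # -I: isolated mode, ignores user site and env vars
                capture_output=True, text=True, timeout=timeout, cwd=workdir,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": f"timed out after {timeout}s"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}

def solve(task, generate, max_attempts=3):
    """Ask generate(task, feedback) for code, run it, and feed errors back until it passes."""
    feedback, result = "", None
    for attempt in range(1, max_attempts + 1):
        code = generate(task, feedback)
        result = run_python(code)
        if result["ok"]:
            return code, result, attempt
        feedback = result["stderr"]  # the error text becomes context for the next try
    return None, result, max_attempts
```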

14. Autonomous Web Research Agent

Difficulty: ⭐⭐⭐⭐⭐
Skills: Browser automation (Playwright/Selenium), planning, reflection
Build an agent that autonomously researches a topic. Features: accept research question, plan sub-questions, browse websites, extract relevant content, synthesize findings, cite sources, produce structured report.

15. LLM Evaluation Harness

Difficulty: ⭐⭐⭐⭐
Skills: Benchmark datasets, metrics calculation, model comparison
Build a system to evaluate LLM performance. Features: load benchmark datasets (MMLU, TruthfulQA, GSM8K), run evaluations across multiple models, accuracy/F1/ROUGE scores, leaderboard visualization, regression detection.
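
Two of the metrics and the regression check are simple enough to sketch directly. The token-level F1 here is the word-overlap variant often used for short-answer QA. The 0.01 tolerance and the function names are assumptions.

```python
from collections import Counter

def accuracy(predictions, references):
    """Fraction of exact matches, e.g. for multiple-choice benchmarks."""
    return sum(p == r for p, r in zip(predictions, references)) / len(references)

def token_f1(prediction, reference):
    """Harmonic mean of word-level precision and recall."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def detect_regressions(baseline, current, tolerance=0.01):
    """Names of benchmarks whose score dropped by more than `tolerance`."""
    return sorted(
        name for name, score in current.items()
        if name in baseline and score < baseline[name] - tolerance
    )
```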

16. AI Document Analysis Pipeline

Difficulty: ⭐⭐⭐⭐
Skills: Multi-modal LLMs, OCR, document parsing, structured extraction
Build a pipeline that analyzes scanned documents. Features: OCR text extraction, classify document type (invoice, contract, report), extract structured fields (dates, amounts, parties), validate extracted data, export to JSON.

17. Custom RAG with Hybrid Search

Difficulty: ⭐⭐⭐⭐⭐
Skills: Dense + sparse retrieval (BM25), re-ranking, query expansion
Build an advanced RAG system with hybrid search. Features: dense embeddings + BM25 keyword search, reciprocal rank fusion, cross-encoder re-ranking, query expansion with LLM, ablation study to compare retrieval methods.
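
Reciprocal rank fusion combines the dense and BM25 result lists using ranks only, so the two scoring scales never need to be normalized. Each document scores the sum of 1 / (k + rank) over the lists it appears in. A minimal sketch follows; k = 60 is the commonly used default, and the document IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

dense_hits = ["d1", "d2", "d3"]  # hypothetical output of the embedding search
bm25_hits = ["d3", "d1", "d4"]   # hypothetical output of the keyword search
print(reciprocal_rank_fusion([dense_hits, bm25_hits]))
# prints ['d1', 'd3', 'd2', 'd4']
```

Documents that appear near the top of both lists rise above documents that only one retriever found, which is the behaviour an ablation study would compare against dense-only and BM25-only retrieval.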

18. LLM-Powered Data Extraction Tool

Difficulty: ⭐⭐⭐⭐
Skills: Structured output (JSON mode), schema definition, batch processing
Build a tool that extracts structured data from unstructured text. Features: define extraction schema (JSON), process batch of documents, validate extracted data, confidence scoring, handle extraction failures with fallback.
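
Validation and fallback can be sketched as a retry loop: parse the model's reply as JSON, check it against the schema, and on failure call the model again with the errors as feedback. The schema fields and the `call_llm` callable are hypothetical. A production tool might use a JSON Schema validator instead of this hand-rolled type check.

```python
import json

# Hypothetical extraction schema: field name -> accepted Python type(s).
INVOICE_SCHEMA = {"invoice_number": str, "total": (int, float), "currency": str}

def validate(record, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"wrong type for field: {field}")
    return errors

def extract(text, call_llm, schema, max_attempts=2):
    """call_llm(text, errors) -> raw string reply; retried with error feedback."""
    errors = []
    for _ in range(max_attempts):
        raw = call_llm(text, errors)
        try:
            record = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors = [f"invalid JSON: {exc.msg}"]
            continue
        if not isinstance(record, dict):
            errors = ["expected a JSON object"]
            continue
        errors = validate(record, schema)
        if not errors:
            return {"status": "ok", "data": record}
    return {"status": "failed", "errors": errors}  # fallback: report instead of guessing
```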


FAQ

What API should I start with?
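Start with OpenAI (gpt-4o-mini is cheap) or Anthropic (Claude 3 Haiku). Both are billed per token and cost little at small volumes; check each provider's pricing page for current rates and any trial credits. For open-source models, use Ollama locally or Together AI for hosted inference.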
How much does it cost to run these projects?
Beginner projects cost cents per day. Intermediate RAG pipelines cost a few dollars per month (vector DB + API calls). Fine-tuning can cost $5–50 depending on model size and dataset.
What hardware do I need?
For API-based projects, any computer works. For local open-source models, use a GPU with 8 GB or more of VRAM, or run Ollama, which also works on CPU at lower speed. For fine-tuning, rent cloud GPUs (RunPod, Lambda, Colab).
How do I stay updated with the fast-changing LLM space?
Follow the Hugging Face blog, the official OpenAI/Anthropic changelogs, and communities like r/LocalLLaMA and the AI Engineer newsletter. Most of these projects use libraries that evolve quarterly — check docs for the latest API changes.
