In Development

Notesmith AI

Intelligent document summarization — not just shorter, actually better

Python · LLM · AI Agents

The Problem

Summarization tools truncate. Notesmith understands.

Long documents and notes take time to digest. The simple approach, sending everything to an LLM and asking it to shorten the text, breaks down on variable-length inputs (a long document can exceed what a single prompt can hold) and loses important structure. Notesmith instead uses an agentic pipeline in which the LLM decides how to chunk, what to prioritize, and how to format the output based on document type.

Architecture

How it's built

An agentic summarization pipeline with document-type awareness. The agent first analyzes the document's structure, then decides on a chunking strategy, then summarizes with an eye to what is important and what is filler. The output format adapts to the content: bullet points for notes, narrative for articles, key facts for reports.

System Flow

Document Input: raw text / file

Structure Analyzer: type detection

Chunking Strategy: LLM decides splits

Priority Ranker: what matters most

Formatted Output: type-aware summary

Document → Structure Analysis → Chunking → Priority Ranking → Formatted Summary
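The flow above can be sketched as a chain of small functions. This is a minimal illustration, not Notesmith's actual code: the `LLM` callable, the `Analysis` dataclass, the prompt wording, and every function name here are hypothetical, and the chunking step is a simple paragraph split standing in for the LLM-chosen strategy.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class Analysis:
    doc_type: str      # e.g. "notes", "article", "report"
    n_paragraphs: int


def split_paragraphs(text: str) -> list[str]:
    """Split on blank lines and drop empty paragraphs."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def analyze_structure(text: str, llm: LLM) -> Analysis:
    """Stage 1: ask the model for a document type, count paragraphs locally."""
    reply = llm("Classify as notes, article, or report:\n" + text[:500])
    return Analysis(reply.strip().lower(), len(split_paragraphs(text)))


def choose_chunks(text: str, analysis: Analysis) -> list[str]:
    """Stage 2 (placeholder): one chunk per paragraph."""
    return split_paragraphs(text)


def rank_chunks(chunks: list[str], llm: LLM) -> list[str]:
    """Stage 3: order chunks by a model-assigned importance score."""
    def score(chunk: str) -> float:
        reply = llm("Rate importance 0-10:\n" + chunk)
        try:
            return float(reply.strip())
        except ValueError:
            return 0.0  # unparseable reply: treat as lowest priority
    return sorted(chunks, key=score, reverse=True)


def format_summary(ranked: list[str], analysis: Analysis, llm: LLM) -> str:
    """Stage 4: request a summary that names the detected type."""
    prompt = f"Summarize this {analysis.doc_type} document:\n" + "\n\n".join(ranked)
    return llm(prompt)


def summarize(text: str, llm: LLM) -> str:
    analysis = analyze_structure(text, llm)
    chunks = choose_chunks(text, analysis)
    ranked = rank_chunks(chunks, llm)
    return format_summary(ranked, analysis, llm)
```

Passing the model in as a plain callable keeps each stage testable with a stub, which helps when the real model's replies are not deterministic.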

Tech Decisions

Why these choices

Agentic approach over a simple prompt

Handles variable-length input better. A single prompt breaks on long documents; an agent adapts its strategy to the content.
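One way to picture the difference: a short document can go through in a single pass, while a long one has to be split before any model call. The sketch below is an assumption-heavy stand-in for the LLM-driven chunking step. It uses a fixed word budget as a rough proxy for a token limit, and `chunk_by_budget` and `max_words` are hypothetical names.

```python
def chunk_by_budget(text: str, max_words: int = 300) -> list[str]:
    """Pack whole paragraphs into chunks of at most max_words words.

    Paragraphs longer than the budget are split on word boundaries.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    count = 0

    for para in paragraphs:
        words = para.split()

        if len(words) > max_words:
            # Flush what we have, then hard-split the oversized paragraph.
            if current:
                chunks.append("\n\n".join(current))
                current, count = [], 0
            for i in range(0, len(words), max_words):
                chunks.append(" ".join(words[i:i + max_words]))
            continue

        if current and count + len(words) > max_words:
            # Adding this paragraph would exceed the budget: start a new chunk.
            chunks.append("\n\n".join(current))
            current, count = [], 0

        current.append(para)
        count += len(words)

    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

A document that fits the budget comes back as a single chunk, so short inputs still take the cheap single-prompt path.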

Document-type awareness

A meeting note and a research paper need different summaries. Detecting the type first lets the pipeline pick a matching format and emphasis instead of applying one template to everything.
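As a rough illustration of type-first summarization, the sketch below maps a detected type to a format instruction (bullet points for notes, narrative for articles, key facts for reports). The detection here is a crude heuristic standing in for the LLM's structure analysis; the thresholds, `guess_doc_type`, `build_prompt`, and the instruction wording are all assumptions.

```python
FORMAT_INSTRUCTIONS = {
    "notes": "Summarize as concise bullet points.",
    "article": "Summarize as a short narrative paragraph.",
    "report": "List the key facts, one per line.",
}


def guess_doc_type(text: str) -> str:
    """Heuristic type guess based on bullet density and numeric density."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return "article"

    bullet_ratio = sum(ln.startswith(("-", "*", "•")) for ln in lines) / len(lines)
    digit_ratio = sum(any(ch.isdigit() for ch in ln) for ln in lines) / len(lines)

    if bullet_ratio >= 0.3:   # assumed threshold: mostly list-like
        return "notes"
    if digit_ratio >= 0.5:    # assumed threshold: figures on most lines
        return "report"
    return "article"


def build_prompt(text: str) -> str:
    """Prefix the document with the instruction for its guessed type."""
    return f"{FORMAT_INSTRUCTIONS[guess_doc_type(text)]}\n\n{text}"
```

Keeping the type-to-format mapping in a plain dictionary makes it easy to add a new document type without touching the detection logic.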

Challenges & Learnings

What was hard

  • 01. Handling edge cases in document length and structure: badly formatted inputs, code-heavy docs, multi-language content.

  • 02. Making the output genuinely useful, not just shorter. Shorter is easy. Useful requires judgment about what matters.
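Two of the edge cases in the first item, badly formatted inputs and code-heavy docs, lend themselves to a cheap pre-check before any model call. The sketch below is one possible approach under stated assumptions: `normalize` and `code_fraction` are hypothetical helpers, and code is detected only through Markdown-style fences or indentation. Multi-language content is not covered here.

```python
import re
import unicodedata

FENCE = "`" * 3  # Markdown code fence marker


def normalize(text: str) -> str:
    """Clean up common formatting noise before analysis."""
    text = unicodedata.normalize("NFKC", text)              # unify Unicode forms
    text = text.replace("\r\n", "\n").replace("\r", "\n")   # unify line endings
    text = re.sub(r"[ \t]+\n", "\n", text)                  # strip trailing spaces
    text = re.sub(r"\n{3,}", "\n\n", text)                  # collapse blank runs
    return text.strip()


def code_fraction(text: str) -> float:
    """Share of lines that look like code (fenced or indented)."""
    lines = text.splitlines()
    if not lines:
        return 0.0

    in_fence = False
    code_lines = 0
    for ln in lines:
        if ln.strip().startswith(FENCE):
            in_fence = not in_fence
            code_lines += 1          # count the fence line itself
        elif in_fence or ln.startswith(("    ", "\t")):
            code_lines += 1
    return code_lines / len(lines)
```

A high `code_fraction` could route the document to a strategy that keeps code blocks intact instead of splitting them mid-function.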

Screenshots / Demo

In action

Screenshot coming
