Workflow Intelligence Kit: design notes

Workflow Intelligence Kit is a reusable foundation for running an agentic AI assistant inside a single small service business. The kit is opinionated, narrow, and deliberately incomplete in places where most agentic-AI starter projects try to do too much. The rest of this note explains the constraints it picks, the decisions that follow from those constraints, and the things it refuses to do in v1.

Problem

Most agentic AI projects for small businesses fail the same way: they over-scope. The pitch is a general-purpose AI workforce that talks to every system the business runs, with a beautiful web UI and a customer-facing chat surface and a knowledge graph and a vector store and a multi-agent orchestration layer. The build never finishes. The owner ends up with a half-deployed pile of cloud services, monthly bills, and no working assistant.

The opposite failure is also common: a single chatbot wired to one API endpoint, no operator workflow around it, no observability, no scheduled actions, no way to handle the case where the assistant needs a human in the loop. It runs once, demos well, and rots within a quarter.

Workflow Intelligence Kit picks the middle: a single-operator AI assistant runtime, deployed on a single VPS, accessed through a single chat interface, doing a single category of work well. Forkable for the next business. The architecture is what makes that scope sustainable.

The kit assumes the operator is a technically comfortable solo consultant or small-team owner running this for a service business they understand: construction, accounting, real estate, professional services, the same set of verticals Smart Code Shop has worked across since 2016. The assistant is a back-office worker, not a customer-facing product.

Constraints

The kit makes five constraint choices. They are listed below; the rest of this note is what falls out of accepting them.

Single-operator, Telegram-only interface
Read-only data access by default, scoped API tokens
Ubuntu 24.04 + systemd on a Hetzner-class VPS
Credentials in the agent's .env, no parallel auth-profile layer
No customer-facing surface in v1

What these constraints buy: a runtime that one person can install, configure, and operate without a platform team. No queueing infrastructure. No web frontend. No identity provider integration. No multi-tenant data isolation problem. The assistant runs on one VPS, owned by one operator, talking to one chat client, reading from a small set of business systems with read-only scoped tokens.

What they cost: the kit is not a product platform. It is a foundation that someone with operator skills uses to deploy an assistant for a specific business. The assistant cannot be sold as SaaS without rebuilding most of these layers. Customers of the service business never touch the assistant directly. There is no central dashboard for managing fleets of assistants across multiple businesses.

That tradeoff is the point. Most service businesses I've worked with do not need a SaaS platform. They need one workflow handled reliably, in a way that does not blow up the existing operation. The kit gets to a working assistant faster because it refuses to also be the wrong things.

Decisions

Each constraint above forces a downstream decision. The four most consequential are below.

Hermes Agent over LangChain or CrewAI

The kit uses Hermes Agent, the MIT-licensed successor to OpenClaw from Nous Research. I chose it because this is a single-operator system, not an enterprise agent platform, and the runtime needs to stay small enough that one person can understand and maintain it.

LangChain is powerful, but it brings too much abstraction for this use case. The kit does not need a sprawling framework for chains, retrievers, callbacks, agents, graphs, and provider wrappers. It needs a reliable runtime that can receive a message, use tools, remember context, run scheduled jobs, ask for approval, and report back. LangChain can do some of that, but keeping up with its surface area would become its own job.

CrewAI has a different mismatch. It assumes the interesting problem is multi-agent orchestration. That may be useful later, but v1 is deliberately not trying to coordinate a room full of simulated coworkers. The first version needs one dependable assistant with access to the right tools and a clear approval boundary.

n8n is useful as a workflow engine, but this kit needs an agent runtime first and workflow automation second. Hermes fits that shape better. It has native cron, MCP support, persistent memory, chat-first operation, and a small enough mental model that the operator can reason about what the assistant is doing. That matters more than framework popularity.

MCP for tool integration

The kit uses Model Context Protocol as the tool layer because tools should not be trapped inside one agent runtime.

A local business assistant needs to connect to messy systems: calendars, CRMs, email, spreadsheets, file stores, ticketing systems, databases, and one-off internal APIs. Those integrations are where most of the long-term maintenance lives. If every tool is wired directly into the assistant, the runtime becomes the integration layer. That is the wrong place for it.

MCP keeps the boundary cleaner. A tool server can be written in whatever language fits the system it talks to. It can evolve separately from the assistant. It can be reused by other MCP-aware agents later if the operator changes runtimes. That portability matters because the agent market is still moving fast, and betting the whole kit on one framework's private tool format would be dumb in a very avoidable way.

A custom tool layer would be faster for the first demo. It would also age badly. The kit is meant to be lived with, not just shown once. MCP adds some setup cost, but it keeps the integration work modular and portable. That is the trade I want.

The kit does not try to scaffold MCP across every unsupported framework version. That compatibility work turns into a swamp quickly. If a framework cannot consume MCP cleanly, it is out of scope for v1.
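To make the boundary concrete, here is a minimal sketch of the message shapes a tool server exchanges with an MCP-aware agent. It is an illustration under stated assumptions, not the kit's implementation: it handles only the `tools/list` and `tools/call` JSON-RPC methods, omits the initialization handshake and the transport, and the tool name, the `TOOLS` registry, and the stubbed CRM data are all hypothetical. A real server would more likely be built on an MCP SDK.

```python
from typing import Any, Callable

# Hypothetical in-memory registry; the names here are illustrative only.
TOOLS: dict[str, dict[str, Any]] = {}


def tool(name: str, description: str, input_schema: dict[str, Any]):
    """Register a function as a tool, with a JSON Schema for its arguments."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = {
            "description": description,
            "inputSchema": input_schema,
            "handler": fn,
        }
        return fn
    return decorator


@tool(
    name="count_stale_opportunities",
    description="Count opportunities untouched for at least N days (stub data).",
    input_schema={
        "type": "object",
        "properties": {"days": {"type": "integer"}},
        "required": ["days"],
    },
)
def count_stale_opportunities(days: int) -> str:
    ages_in_days = [3, 15, 40, 90]  # stand-in for a read-only CRM query
    return str(sum(1 for age in ages_in_days if age >= days))


def handle(request: dict[str, Any]) -> dict[str, Any]:
    """Answer one JSON-RPC 2.0 request for tools/list or tools/call."""
    req_id, method = request.get("id"), request.get("method")
    if method == "tools/list":
        listed = [
            {"name": n, "description": t["description"], "inputSchema": t["inputSchema"]}
            for n, t in TOOLS.items()
        ]
        return {"jsonrpc": "2.0", "id": req_id, "result": {"tools": listed}}
    if method == "tools/call":
        params = request.get("params", {})
        entry = TOOLS.get(params.get("name"))
        if entry is None:
            error = {"code": -32602, "message": "Unknown tool"}
            return {"jsonrpc": "2.0", "id": req_id, "error": error}
        text = entry["handler"](**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
        return {"jsonrpc": "2.0", "id": req_id, "result": result}
    error = {"code": -32601, "message": "Method not found"}
    return {"jsonrpc": "2.0", "id": req_id, "error": error}
```

The point of the sketch is that nothing in it knows which agent is on the other side. Swapping runtimes changes the caller, not the tool.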

Telegram as the only interface in v1

Telegram is the v1 interface because the operator already lives in chat, and the assistant needs to be reachable from both a phone and a desktop without adding a new app.

A web UI would be slower to build and less useful at the moment the assistant actually needs the operator: approval, clarification, or quick review. Slack and Discord are team-shaped tools. They make sense when the assistant belongs in a shared workspace. This kit is for one operator running a back-office assistant, so a team chat surface adds ceremony without much benefit.

Telegram's Bot API is simple, mature, free, and boring in the right way. It can send messages, receive approvals, deliver files, and handle the lightweight back-and-forth that a human-in-the-loop assistant needs. The operator does not need to manage a Slack workspace, invite a bot to channels, or run a custom mobile app just to approve a draft invoice or review a CRM update.

This is also a scope decision. A future version can add Slack, Discord, email, or a web dashboard. v1 gets one interface that works and keeps the rest of the kit focused.
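As a rough illustration of how little the approval loop needs from Telegram, the sketch below builds a `sendMessage` payload with two inline buttons and filters incoming updates down to the single operator's chat. The function names, the `approve:`/`reject:` callback format, and the `action_id` value are assumptions made for this example. Hermes may well handle this differently internally.

```python
import json
import urllib.request

API_ROOT = "https://api.telegram.org"


def build_approval_request(chat_id: int, summary: str, action_id: str) -> dict:
    """Build a sendMessage payload with Approve/Reject inline buttons."""
    return {
        "chat_id": chat_id,
        "text": f"Approval needed:\n{summary}",
        "reply_markup": {
            "inline_keyboard": [[
                {"text": "Approve", "callback_data": f"approve:{action_id}"},
                {"text": "Reject", "callback_data": f"reject:{action_id}"},
            ]]
        },
    }


def is_from_operator(update: dict, operator_chat_id: int) -> bool:
    """Accept only updates that originate from the single operator's chat."""
    message = update.get("message") or update.get("callback_query", {}).get("message", {})
    return message.get("chat", {}).get("id") == operator_chat_id


def send(token: str, payload: dict) -> dict:
    """POST the payload to the Bot API sendMessage method (network call)."""
    request = urllib.request.Request(
        f"{API_ROOT}/bot{token}/sendMessage",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Dropping every update that does not come from the operator's chat is the whole access-control story under the single-operator constraint, which is part of why a team chat surface would be overkill here.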

Read-only by default, scoped API tokens

The assistant is read-only by default. That is the line that makes the rest of the kit defensible.

Most useful business automation starts with reading: summarize the inbox, inspect CRM records, find stale opportunities, review calendar context, draft follow-ups, compare spreadsheets, surface exceptions. Those tasks are valuable before the assistant writes a single byte back into a business system.

Write access is different. Once an assistant can update records, send messages, create invoices, change statuses, or modify customer data, mistakes become operational events. Early agent systems are especially prone to confident wrongness: bad assumptions, stale context, misunderstood instructions, and tool calls that technically succeed while doing the wrong thing. Giving that kind of system broad write access by default is asking for the fun kind of incident report.

So the kit separates reading from writing. The default tokens are read-only. Write actions require a scoped write token and a human confirmation step. The assistant can draft the action, explain what it plans to change, and ask the operator to approve it. Only then does it execute.

That slows down some workflows, intentionally. The goal is not maximum autonomy on day one. The goal is trust. A small business can tolerate an assistant that asks before it changes customer data. It cannot tolerate an assistant that quietly makes bad updates across core systems because the demo wanted to look impressive.
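The read/write split described above can be sketched as a small gate in front of every tool call. This is one possible shape, not the kit's actual code: `ProposedAction`, `WriteNotPermitted`, and `execute` are hypothetical names, and the `approve` callback stands in for whatever confirmation step the operator sees in chat.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ProposedAction:
    tool: str          # e.g. "crm.update_status" (illustrative name)
    description: str   # plain-language explanation shown to the operator
    writes: bool       # True if the action changes a business system


class WriteNotPermitted(Exception):
    """Raised when a write lacks a scoped token or operator approval."""


def execute(
    action: ProposedAction,
    run: Callable[[], str],
    *,
    write_token: Optional[str] = None,
    approve: Callable[[ProposedAction], bool] = lambda action: False,
) -> str:
    """Run reads immediately; run writes only with a token and an approval."""
    if not action.writes:
        return run()
    if write_token is None:
        raise WriteNotPermitted(f"{action.tool}: no scoped write token configured")
    if not approve(action):
        raise WriteNotPermitted(f"{action.tool}: operator did not approve")
    return run()
```

Note that the default `approve` callback denies everything, so a write that is wired up carelessly fails closed rather than open.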

Out of scope

The kit's "out of scope" list is as important as its "in scope" list. There are six things v1 explicitly does not do.

MCP scaffolding on unsupported framework versions. MCP is moving fast. The kit pins to versions that have been validated against Hermes, and refuses to ship a fragile compatibility layer for older runtimes. If a framework cannot run current MCP, it cannot be in the kit yet.
Local GPU model assumptions. The kit does not assume the operator has GPU hardware. Hosted inference is the path. Local model support is a future option, not a v1 surface.
Mac/launchd support. Ubuntu + systemd only. Mac development is fine; Mac production deployment is not in scope. The kit ships with one OS contract.
Committed live config files. The kit ships templates and a .env.example; the actual operator config never enters the repo. This sounds obvious until you see how many starter projects ship with secrets committed by accident.
Parallel auth-profile layer. Hermes's .env is the single source of credentials. There is no separate identity provider, no SSO layer, no role-based access on top. The single-operator constraint makes that layer pure waste.
Generic knowledge-graph infrastructure. No Neo4j, no LlamaIndex graph store, no homebrew RDF. If a workflow needs structured knowledge, the skill that owns the workflow brings its own (usually a flat data file or a single SQL table). Generic graph infrastructure is a tempting trap that adds cost long before it adds value at this scope.

These exclusions are not "things the kit does not do yet." They are decisions about what the kit refuses to do in v1 so that v1 can ship and stay maintained by one person.
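Treating a single .env file as the only credential source keeps the loading logic small. The sketch below is a hedged illustration of that idea, not how Hermes itself reads its configuration: it parses simple `KEY=VALUE` lines and reports which required keys are missing. The key names in `REQUIRED_KEYS` are invented for the example.

```python
from pathlib import Path

# Hypothetical key names; the real template may define different ones.
REQUIRED_KEYS = ("TELEGRAM_BOT_TOKEN", "CRM_READ_TOKEN")


def load_env(path: Path) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return required keys that are absent or empty, in declaration order."""
    return [key for key in required if not values.get(key)]
```

A startup check that refuses to run while `missing_keys` is non-empty gives the operator one place to look when a credential is wrong.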

Lineage

The kit's design discipline is borrowed from RShuken's flyn-agent pattern, and specifically from the postmortem on that project. The lessons applied here: install ordering must be idempotent and survive partial failure (the install script can be re-run safely at any state); secrets layering must keep credentials out of the repo, out of process env vars when avoidable, and inside a single configurable location; MCP-to-framework compatibility cannot be assumed forward and must be pinned per release. None of these lessons are exotic. All of them are the kind of thing that breaks a starter project the first time it touches a real deployment.
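The idempotent-install lesson can be illustrated with a small step runner that records each completed step and skips it on the next run. This is a sketch of the pattern only, written in Python for consistency with the other examples; the kit's real installer may be a shell script, and `run_steps` and the marker-file layout are assumptions.

```python
from pathlib import Path
from typing import Callable


def run_steps(steps: list[tuple[str, Callable[[], None]]], state_dir: Path) -> list[str]:
    """Run each named step once, in order, and return the names executed now.

    A marker file is written only after a step succeeds, so a failure
    partway through leaves earlier steps recorded and later ones pending.
    """
    state_dir.mkdir(parents=True, exist_ok=True)
    executed: list[str] = []
    for name, step in steps:
        marker = state_dir / f"{name}.done"
        if marker.exists():
            continue  # completed on a previous run
        step()  # an exception here aborts the run and leaves no marker
        marker.touch()
        executed.append(name)
    return executed
```

Markers only make re-runs cheap. Each step should still be safe to repeat on its own, because a crash between finishing a step and writing its marker will cause that step to run again.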

What is next

v1 scaffolding is in progress. The first pieces to land are the idempotent install script for Ubuntu 24.04, the hardening scripts (UFW rules, SSH hardening, fail2ban), the single systemd unit template, and the reference skill against Open-Meteo (a no-auth weather API used purely to verify the MCP integration path). After those land, the next skill is the first real workflow, and the one I want to ship: a specific service-business intake-triage or proposal-drafting workflow that exercises the kit against a real client constraint. That second skill is what tells me whether the kit's scope choices were right.
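For a sense of how small the Open-Meteo reference skill can be, here is a hedged sketch of its core: build a forecast URL, fetch it without credentials, and reduce the response to one line. The function names are hypothetical, and the sketch assumes the public forecast endpoint with the `current=temperature_2m` parameter. The MCP wrapping around it is omitted.

```python
import json
import urllib.parse
import urllib.request

OPEN_METEO_URL = "https://api.open-meteo.com/v1/forecast"


def build_url(latitude: float, longitude: float) -> str:
    """Build a no-auth request URL for the current air temperature."""
    query = urllib.parse.urlencode({
        "latitude": latitude,
        "longitude": longitude,
        "current": "temperature_2m",
    })
    return f"{OPEN_METEO_URL}?{query}"


def summarize(payload: dict) -> str:
    """Reduce a forecast response to a one-line summary for chat."""
    current = payload["current"]
    unit = payload.get("current_units", {}).get("temperature_2m", "")
    return f"{current['temperature_2m']}{unit} at {current['time']}"


def fetch_current_weather(latitude: float, longitude: float) -> str:
    """Fetch and summarize current conditions (performs a network call)."""
    with urllib.request.urlopen(build_url(latitude, longitude), timeout=10) as response:
        return summarize(json.load(response))
```

Keeping URL construction and response parsing separate from the network call means the skill can be checked offline, which suits its role as a wiring test rather than a feature.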

Try it or follow along

The repo is at github.com/prodriguez-dev/workflow-intelligence-kit. v1 scaffolding is underway; design is settled. If you have a service-business workflow that an assistant should be handling and you want to talk through the architecture, get in touch.