HasteKit bundles every primitive you need to take an agent from prompt to production — a drop-in LLM gateway, durable runtimes, sub-agents and handoffs, skills, RAG, built-in tools, MCP, connectors, workflows, channels, triggers, self-evolving memory, and end-to-end observability. One platform. One SDK. One bill.
A drop-in replacement for the OpenAI API. Point any existing OpenAI-compatible SDK at app.hastekit.dev/api/gateway/responses with your sk-uno-… virtual key and ship: no client-library rewrite, no provider lock-in. Rate-limited per project, cost-tracked per request, OpenTelemetry-traced end to end.
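As a rough illustration of what "drop-in" could look like on the wire, the sketch below builds (but does not send) a Responses-style request against the gateway URL using only the Python standard library. The `https://` scheme, the Bearer-token header, the `model`/`input` body shape, and the function name `build_gateway_request` are assumptions based on OpenAI API conventions, not confirmed HasteKit behavior. The model name is a placeholder.

```python
import json
import urllib.request

# Gateway endpoint from the text above; the https:// scheme is an assumption.
GATEWAY_URL = "https://app.hastekit.dev/api/gateway/responses"


def build_gateway_request(prompt: str, model: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI Responses-style POST request (hypothetical helper).

    Assumes the gateway accepts the same Bearer auth and JSON body
    ({"model": ..., "input": ...}) as the OpenAI Responses API.
    """
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder key and model name; substitute your own virtual key.
request = build_gateway_request("Say hello.", "example-model", "sk-uno-placeholder")
# To actually send it: urllib.request.urlopen(request)
```

With an OpenAI-compatible SDK, the equivalent change would typically be limited to the base URL and the API key.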
Pick the model and provider. Tune history with summarizer strategies. Force structured JSON output. Add tools and skills. Build knowledge bases and memories to ground agents in context. Cap max iterations. Snapshot every save as an immutable version and route production to any version via aliases. Run in-process for development, or on Temporal for durable execution.
Long-running agent loops survive process crashes, node restarts, and transient tool failures. Run on Temporal or Restate for replay-safe execution — every step is checkpointed, every retry is automatic, every resume picks up exactly where the agent left off.
Call a sub-agent as a tool with isolated or shared context. Or hand off the conversation entirely — the user keeps chatting, but a specialist takes over. Compose focused teams of agents instead of overloading one monolithic prompt.
Author a SKILL.md bundle with prompts, scripts, and reference docs. Pin it to any agent and it mounts at /skills/<name>/ inside the sandbox — accessible from bash, Python, or Node. Share across agents, share across projects.
Drop in your docs. Configure chunk size, overlap, and embedding model. Attach a knowledge base to one agent or share it across many. At call time, the most relevant chunks are retrieved and injected — with the full citation chain preserved in every trace.
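To make the chunk size and overlap settings concrete, here is a minimal character-based sliding-window chunker. It is a sketch of the general technique only: how HasteKit actually splits documents (by characters, tokens, or sentences) is not stated above, and the function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters, where each window
    repeats the last `overlap` characters of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij']
```

A larger overlap keeps sentences that straddle a boundary retrievable from either neighbor, at the cost of more chunks to embed and store.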
Image generation, speech, transcription, a sandboxed code-execution environment, and a progress-tracker todo — all shipped out of the box. No integration matrix, no API-key shuffle. Toggle each on per agent.
Attach an MCP server or paste in an OpenAPI spec — every operation becomes a tool. Flag any tool as deferred so it stays out of context until the agent reaches for it. Flag any as requires-approval for a human-in-the-loop gate.
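One way to picture "every operation becomes a tool" is a pass over the `paths` object of an OpenAPI document that emits one tool record per HTTP operation. The sketch below is illustrative only: the output fields (`deferred`, `requires_approval`, and the rest) and the function name `tools_from_openapi` are hypothetical and do not describe HasteKit's actual tool schema.

```python
# HTTP methods that OpenAPI allows as operation keys on a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def tools_from_openapi(spec: dict, deferred=(), requires_approval=()) -> list[dict]:
    """Turn each OpenAPI operation into a tool record (hypothetical shape)."""
    tools = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            # Fall back to a synthetic name when operationId is missing.
            name = operation.get("operationId") or f"{method}_{path}"
            tools.append({
                "name": name,
                "description": operation.get("summary", ""),
                "method": method.upper(),
                "path": path,
                # Deferred tools would stay out of context until requested.
                "deferred": name in deferred,
                # Approval-gated tools would wait for a human decision.
                "requires_approval": name in requires_approval,
            })
    return tools


example_spec = {
    "paths": {
        "/tickets": {
            "get": {"operationId": "listTickets", "summary": "List tickets"},
            "post": {"operationId": "createTicket", "summary": "Create a ticket"},
        }
    }
}
tools = tools_from_openapi(example_spec, requires_approval={"createTicket"})
```

In this example, `createTicket` is flagged for approval while `listTickets` is callable directly.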
Gmail, Google Calendar, Slack, Jira, GitHub — and growing. Each ships with curated actions wrapped as tools (send · list · comment · transition). Users connect their own accounts via OAuth; your agent never touches a token.
Build a DAG from the same set of tools your agents already reach for. Run it durably on Temporal. Save it — and it becomes a tool any agent can call. Tools, all the way up.
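The idea of a DAG of tools that is itself callable as a tool can be sketched with Python's standard `graphlib` module. This is a toy in-process executor under stated assumptions: the names `run_workflow`, `nodes`, and `deps` are hypothetical, and nothing here models the durable execution on Temporal mentioned above.

```python
from graphlib import TopologicalSorter


def run_workflow(nodes: dict, deps: dict) -> dict:
    """Run each node after its upstream nodes and collect the results.

    nodes: node name -> callable taking a dict of upstream results
    deps:  node name -> set of upstream node names
    """
    graph = {name: deps.get(name, set()) for name in nodes}
    results = {}
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        upstream = {dep: results[dep] for dep in graph[name]}
        results[name] = nodes[name](upstream)
    return results


nodes = {
    "fetch": lambda up: 3,
    "double": lambda up: up["fetch"] * 2,
    "total": lambda up: up["fetch"] + up["double"],
}
deps = {"double": {"fetch"}, "total": {"fetch", "double"}}

print(run_workflow(nodes, deps))
# → {'fetch': 3, 'double': 6, 'total': 9}
```

Because `run_workflow` is itself just a callable, a saved workflow could be registered as a node in a larger graph, which is the "tools, all the way up" composition in miniature.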
Bind any agent to a Slack channel or Telegram bot. Users chat naturally; the agent replies in-thread with rich formatting, attachments, and approval buttons. One agent, every chat surface.
A cron expression fires the agent every weekday at 9 a.m. A schedule_once wakes it at a future timestamp. A GitHub PR webhook fires it on every open. Triggers are conversation starters that don't need a human in the loop.
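For reference, "every weekday at 9 a.m." in standard five-field cron syntax is `0 9 * * 1-5`. The sketch below checks a timestamp against such an expression. It is a simplified matcher written for illustration (it supports `*`, single values, ranges, and comma lists, but not step values like `*/5`, and it requires all five fields to match), and it does not describe how HasteKit evaluates schedules. Time zone handling is left out.

```python
from datetime import datetime


def _field_matches(field: str, value: int) -> bool:
    """Match one cron field: '*', 'n', 'a-b', or a comma list of those."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` matches a five-field cron expression."""
    minute, hour, day_of_month, month, day_of_week = expr.split()
    # cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day_of_month, when.day)
        and _field_matches(month, when.month)
        and _field_matches(day_of_week, cron_weekday)
    )


WEEKDAYS_AT_NINE = "0 9 * * 1-5"
print(cron_matches(WEEKDAYS_AT_NINE, datetime(2024, 1, 1, 9, 0)))  # Monday → True
print(cron_matches(WEEKDAYS_AT_NINE, datetime(2024, 1, 6, 9, 0)))  # Saturday → False
```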
Each agent maintains its own knowledge as a living wiki — semantic facts and episodic events side by side. Every N turns the agent edits the page: adds new facts, refines old ones, prunes the stale. Recall surfaces them in context, on demand.
OpenTelemetry spans for every gateway call, every agent run, every tool invocation, every workflow node. Aggregate by user, organization, or agent. Watch cost roll up by provider and project, in real time.
Bring your own keys, configure agents with rich capabilities, and get going.