I spent about four months building LLM-powered stuff by hand before I looked seriously at LangChain. REST calls, prompt templates as strings, manual token counting, storing conversation history in a database I had to design. It worked. It was also exhausting in ways I didn’t notice until I stopped doing it that way.
Why I was doing it myself
The first project was straightforward: a chatbot that could answer questions about my Proxmox setup by searching through my documentation. I figured I’d use the OpenAI API, throw in some vector embeddings from a local model, stitch it together with Python. Sounded contained.
It was. For about two weeks. Then I needed the bot to remember what I’d told it earlier in the conversation. Then I needed to switch between OpenAI and Ollama depending on which machine was idle. Then I wanted to add a tool that could actually query my Proxmox API, not just search docs. Each addition meant threading state through another layer, rewriting the prompt to include new instructions, testing edge cases where the model hallucinated API calls.
By month three I had a working system. It lived in about 800 lines of Python. It had bugs that took hours to trace because I couldn’t easily see what the model was “seeing” when it made a decision. Adding a new capability meant understanding five different concerns at once: the prompt, the context window, the tool definitions, the state management, the token limits.
What LangChain actually does
I started reading about LangChain not because I thought my code was bad, but because I needed to add a second chatbot and couldn’t face writing the same infrastructure twice. The core appeal was obvious once I looked at it: it abstracts away the repetitive plumbing.
Here’s the difference. Before, adding a tool meant writing a function, then writing code to construct the prompt that described the tool, then parsing the model’s response to figure out if it wanted to call the tool and which arguments to pass. This was fragile. Models don’t format JSON consistently.
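To make the fragility concrete, here is a minimal sketch of what hand-rolled tool parsing tends to look like. It assumes the model was asked to reply with a JSON object containing "tool" and "arguments" keys; that format, the stub tool, and the function names are all illustrative assumptions, not the original code from this project.

```python
import json
import re

# Hypothetical stub tool; a real one would call the Proxmox API.
def query_proxmox(query: str) -> str:
    return f"stub result for: {query}"

TOOLS = {"query_proxmox": query_proxmox}

def parse_tool_call(model_output: str):
    """Return (tool_name, arguments) if the output contains a usable tool call, else None."""
    # Models often wrap the JSON in prose, so grab the span from the first "{" to the last "}".
    match = re.search(r"\{.*\}", model_output, re.DOTALL)
    if match is None:
        return None
    try:
        payload = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None  # malformed JSON: fall back to treating the output as plain text
    if not isinstance(payload, dict):
        return None
    name = payload.get("tool")
    args = payload.get("arguments", {})
    if not isinstance(name, str) or name not in TOOLS or not isinstance(args, dict):
        return None
    return name, args

def dispatch(model_output: str) -> str:
    """Run the requested tool, or pass the model's text through unchanged."""
    call = parse_tool_call(model_output)
    if call is None:
        return model_output
    name, args = call
    return TOOLS[name](**args)
```

Every branch that returns None is a case that had to be discovered the hard way, and this sketch still does not handle wrong argument names or two JSON objects in one reply.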
With LangChain, you define a tool once:
from langchain.tools import tool

@tool
def query_proxmox(query: str) -> str:
    """Query Proxmox cluster status. Use for node info, VM status, disk space."""
    # Placeholder: the actual implementation would call the Proxmox API here.
    result = f"(no Proxmox backend wired up for: {query})"
    return result
Then you hand that to an agent and it handles most of the rest: formatting the tool instructions, parsing the model’s response, looping until it has an answer. You still have to keep an eye on token limits, but you no longer write the plumbing.
Memory management is similar. Before: I had a database schema for conversations, code to fetch the last N messages, logic to decide what to include based on token counts, debugging when the context got contaminated. With LangChain, you attach a memory object to your chain and it tracks the conversation. Different memory strategies exist. You pick one and move on.
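For a sense of what the hand-written version involves, here is a minimal sketch of trimming conversation history to a token budget. The four-characters-per-token estimate, the message shape, and the function names are assumptions for illustration; this is neither the original code from this project nor how LangChain’s memory classes work internally.

```python
def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token. A real tokenizer is more accurate.
    return max(1, len(text) // 4)

def trim_history(messages, max_tokens, max_messages=10):
    """Keep the newest messages that fit both limits, returned in chronological order."""
    kept = []
    used = 0
    # Walk backwards from the newest message so recent context wins.
    for message in reversed(messages[-max_messages:]):
        cost = estimate_tokens(message["content"])
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    return kept
```

Even this small version has decisions buried in it: whether to drop whole messages or truncate them, and whether the system prompt counts against the budget.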
The part that surprised me
The framework is larger than I expected. Not bloated, exactly, but it has a lot of surface area. It tries to be general-purpose enough to handle everything from simple chains to multi-step agents with tools, memory, and retrievers. That generality means you have to understand more concepts than you might think to avoid shooting yourself in the foot.
I spent a frustrating afternoon wondering why my agent kept calling the same tool twice in a row. Turned out I hadn’t configured the tool’s return_direct parameter correctly. The agent was interpreting the tool output as “keep going” when I meant “this is the final answer.” Small thing. But it meant reading the docs carefully instead of just guessing.
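The distinction is easier to see in a toy loop. This is not LangChain’s implementation, just a sketch of what a return_direct-style flag changes; ToyTool, run_toy_agent, and the decide callback are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ToyTool:
    name: str
    func: Callable[[str], str]
    return_direct: bool = False

def run_toy_agent(decide, tools, question, max_steps=5):
    """decide(question, observations) returns ("call", tool_name, tool_input) or ("finish", answer)."""
    observations = []
    for _ in range(max_steps):
        action = decide(question, observations)
        if action[0] == "finish":
            return action[1]
        _, tool_name, tool_input = action
        tool = tools[tool_name]
        output = tool.func(tool_input)
        if tool.return_direct:
            # The tool output is the final answer; the model is not consulted again.
            return output
        # Otherwise the output goes back to the model, which may decide to call a tool again.
        observations.append(output)
    return "stopped: step limit reached"
```

With the flag off, whether the loop ends after one tool call is entirely up to the model’s next decision, which is where a repeated call can come from.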
The documentation has improved since I started, but it’s still scattered. Some examples use the old API. Some assume you’re using OpenAI when you’re actually trying to run Ollama locally. A few show patterns that work but aren’t idiomatic anymore.
What I actually use it for now
My documentation chatbot lives in a LangChain agent now. It retrieves relevant docs, queries my Proxmox API when I ask about VMs, keeps a 10-message conversation history, and falls back to Claude when it’s unsure. The whole thing is maybe 120 lines of code, including the tool definitions. Maintenance is sane.
I added a second agent that monitors my infrastructure alerts and suggests fixes. Same pattern, different tools. Writing the second one took a day instead of three weeks.
I run both on a machine in my homelab with Ollama for the base model (usually Mistral) and Claude for the harder reasoning tasks. LangChain switches between them smoothly.
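The underlying idea of switching providers can be sketched without any framework. The example below uses plain callables as stand-ins for models; first_available and the stub functions are hypothetical, and LangChain’s own mechanism for this is not shown here.

```python
from typing import Callable, Sequence, Tuple

Model = Callable[[str], str]

def first_available(models: Sequence[Tuple[str, Model]], prompt: str) -> Tuple[str, str]:
    """Try each (name, model) in order; return (name, reply) from the first that succeeds."""
    errors = []
    for name, model in models:
        try:
            return name, model(prompt)
        except Exception as exc:  # broad on purpose: connection errors, timeouts, and so on
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))

# Stand-ins for a local model that is down and a hosted model that answers.
def local_model(prompt: str) -> str:
    raise ConnectionError("local server not running")

def hosted_model(prompt: str) -> str:
    return "answer to " + prompt

name, reply = first_available([("local", local_model), ("hosted", hosted_model)], "is node1 up?")
```

This only covers hard failures. Falling back when the model is merely unsure needs some confidence signal, which is a separate design decision.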
What I miss from the old way
Honest answer: precision. When I was writing the plumbing myself, I understood every decision the system made. I could see exactly what went into the prompt, why a tool was called, where a hallucination started. With LangChain, there’s a layer of abstraction. Most of the time that’s fine. Occasionally when something goes weird, it’s harder to debug because I’m not watching the raw model output.
I also miss the feeling of owning the whole stack. That’s probably just ego. LangChain isn’t a black box. I can read the source. I can override behavior when I need to. But it’s a thicker layer than code I wrote myself.
The other loss is less philosophical. I had to migrate my conversation history schema when I switched to LangChain’s memory objects. Not a big deal for a homelab tool, but it meant thinking about whether to keep the old data, how to translate it, whether it was worth the effort. For a production system this would have been messier.
Would I do it again?
Yes, but with a caveat. LangChain makes sense once your LLM tools get complex enough that the boilerplate becomes the bulk of the work. If you’re just calling an API once and formatting the response, you don’t need it. If you’re combining agents, memory, tool use, and prompt templates, and you want to swap between LLM providers without rewriting everything, it saves time.
The calculus changes if you’re trying to run everything locally. LangChain + Ollama works, but it’s less polished than LangChain + OpenAI. The framework was optimized for the paid API case first. That said, it gets better every release.
I’m running 0.1.13 on two machines right now. The other day I noticed they’re shipping 0.2.x versions, which apparently reorganized some imports. That’s the kind of friction a framework this young still has. Not terrible, but worth knowing about if you’re planning to leave it running for a year without touching it.
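One cheap guard against this kind of drift is a startup check that the installed release series is the one the code was written for. The sketch below uses only the standard library; release_series and check_series are hypothetical helper names, and the (0, 1) series is taken from the 0.1.13 version mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Tuple

def release_series(version_string: str) -> Tuple[int, int]:
    """Turn '0.1.13' into (0, 1). Assumes the first two dot-separated parts are integers."""
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)

def check_series(package: str, expected: Tuple[int, int]) -> str:
    """Report whether the installed package is in the expected major.minor series."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    if release_series(installed) != expected:
        return f"{package} {installed} is outside the expected {expected[0]}.{expected[1]}.x series"
    return f"{package} {installed} ok"

# Example: warn at startup if the environment has moved off the 0.1.x series.
print(check_series("langchain", (0, 1)))
```

Pinning the exact version in the dependency file does the same job more firmly; the check is mainly useful on machines where packages get upgraded by hand.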
What I’d tell someone considering the switch: take one small project and rebuild it with LangChain. Don’t try to architect perfectly. Just see if the abstraction level matches how you think about the problem. If it does, you’ll notice the productivity bump immediately. If it doesn’t, you haven’t wasted much time and you’ll understand why you prefer your own code.