Continue

About This Tool

Continue is an open-source VS Code and JetBrains extension that connects to any LLM — including local ones via Ollama. Get AI code completion, chat, and editing powered by models running on your own hardware. The perfect AI coding tool for privacy-conscious homelabbers who want to keep everything local.

In-Depth Review

As someone who's been running local AI models on my homelab for over a year, Continue has become an essential part of my development workflow. This open-source VS Code and JetBrains extension bridges the gap between powerful local LLMs and practical coding assistance, delivering features that rival GitHub Copilot while keeping everything on your own hardware.

Setup is refreshingly straightforward if you already have Ollama running. After installing the extension, you simply point Continue to your local Ollama instance and select your preferred model. I've tested it extensively with CodeLlama, Deepseek Coder, and even smaller models like Phi-3 on my RTX 4090 setup. The configuration file is well-documented, and switching between different models for different tasks is seamless.

Performance varies significantly based on your hardware and model choice. On my dedicated AI server with 48GB VRAM, CodeLlama 34B provides excellent code completion with sub-2-second response times. However, running smaller models like CodeLlama 7B on more modest hardware still delivers useful results, just with slightly less sophisticated suggestions. The real-time code completion feels natural and doesn't interrupt your flow like some alternatives.

The standout feature is the flexibility to mix and match models. I use different models for different languages – Deepseek Coder for Python and JavaScript, and CodeLlama for systems programming. The chat interface is particularly useful for explaining complex code sections or debugging, and having these conversations completely private is invaluable when working on proprietary projects.

Continue also supports cloud providers if you want to supplement local models with more powerful options, but the local-first approach is what makes it special for homelab users. The API integration means you can connect it to any OpenAI-compatible endpoint, including local deployments of other inference engines.

Limitations include the typical challenges of local AI – you need substantial hardware for the best experience, and setup complexity increases if you want optimal performance. The extension occasionally struggles with very large codebases, and context management isn't as sophisticated as some commercial alternatives. Documentation could be more comprehensive for advanced configurations.

For privacy-conscious developers who want AI coding assistance without sending code to external services, Continue is exceptional. It's not quite plug-and-play for newcomers to self-hosted AI, but for homelab enthusiasts already running local models, it's a must-have tool.

Real-World Use Cases

01 Developing proprietary applications with AI assistance while keeping all code completely private

02 Getting intelligent code completion for Python data analysis scripts using local CodeLlama models

03 Explaining and documenting complex legacy codebases through private AI chat without external API calls

04 Rapid prototyping with AI-generated boilerplate code using locally hosted Deepseek Coder

05 Debugging JavaScript applications with context-aware suggestions from self-hosted models

06 Learning new programming languages with AI tutoring that doesn't leak your practice code externally

07 Generating unit tests and documentation for internal business logic using privacy-preserving local LLMs

Pros & Cons

Pros

Complete code privacy with all AI processing happening on your own hardware
Flexible model selection allowing optimization for different programming languages and tasks
No subscription fees or API costs once you have the local infrastructure running
Seamless integration with existing VS Code and JetBrains workflows without changing development habits
Support for multiple inference backends including Ollama, vLLM, and OpenAI-compatible APIs
Active open-source development with regular updates and community contributions

Cons

Requires significant local GPU resources for optimal performance with larger models
Initial setup complexity for users new to local LLM deployment and configuration
Performance heavily dependent on hardware capabilities and model size choices
Context window limitations compared to cloud-based solutions like GitHub Copilot
Documentation gaps for advanced configuration and troubleshooting scenarios

Works With

VS Code JetBrains IDEs Ollama Docker NVIDIA GPU Apple Silicon vLLM OpenAI API Hugging Face Transformers CUDA ROCm Metal Performance Shaders Kubernetes Podman LM Studio TabbyML

User Ratings

Log in to rate this tool.