LLM Provider Configuration
AI Developer Assistant supports multiple Large Language Model (LLM) providers, giving you flexibility in choosing between cloud-based and local AI models. This guide covers configuration, usage, and best practices for each provider.
Supported Providers
1. OpenAI (Cloud)
OpenAI provides access to models such as GPT-4 and GPT-3.5-turbo through its cloud API.
Configuration
Environment Variables:
export OPENAI_API_KEY="sk-your-openai-api-key"
export LLM_PROVIDER="openai"
export LLM_MODEL="gpt-4"
Configuration File:
# ai-dev.config.local.yaml
llm:
  provider: "openai"
  model: "gpt-4"
  temperature: 0.7
  maxTokens: 2000
  openai:
    apiKey: "${OPENAI_API_KEY}"
    baseUrl: "https://api.openai.com/v1"
    organization: "${OPENAI_ORGANIZATION}" # Optional
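These keys map directly onto OpenAI's Chat Completions request. As a sanity check that the key, model, and parameters line up before running reviews, you can call the API directly; a minimal sketch, assuming curl and jq are available:
# Direct call mirroring the config above (model, temperature, maxTokens)
curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Reply with OK"}],
    "temperature": 0.7,
    "max_tokens": 2000
  }' | jq -r '.choices[0].message.content'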
Usage
# Use OpenAI with command-line options
ai-dev review --llm-provider openai --openai-api-key your-key --llm-model gpt-4
# Use OpenAI with environment variables
ai-dev review # Uses OPENAI_API_KEY and LLM_PROVIDER from environment
Available Models
- gpt-4: Most capable model, best for complex analysis
- gpt-4-turbo: Faster version of GPT-4
- gpt-3.5-turbo: Cost-effective option for most tasks
- gpt-3.5-turbo-16k: Extended context length
Best Practices
# Use GPT-4 for complex code reviews
ai-dev review --llm-model gpt-4 --verbose
# Use GPT-3.5-turbo for quick reviews
ai-dev review --llm-model gpt-3.5-turbo
# Use GPT-4-turbo for balanced performance
ai-dev review --llm-model gpt-4-turbo
2. Google Gemini (Cloud)
Google Gemini provides access to advanced AI models through Google AI Studio.
Configuration
Environment Variables:
export GEMINI_API_KEY="AIzaSy-your-gemini-api-key"
export LLM_PROVIDER="gemini"
export LLM_MODEL="gemini-2.0-flash"
Configuration File:
# ai-dev.config.local.yaml
llm:
  provider: "gemini"
  model: "gemini-2.0-flash"
  temperature: 0.7
  maxTokens: 2000
  gemini:
    apiKey: "${GEMINI_API_KEY}"
    model: "gemini-2.0-flash"
    baseUrl: "https://generativelanguage.googleapis.com/v1beta"
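Before wiring the key into the tool, confirm it works against the same baseUrl. Listing the available models is the simplest check; assuming curl is available:
# Verify the key and base URL by listing available models
curl -s "https://generativelanguage.googleapis.com/v1beta/models?key=$GEMINI_API_KEY"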
Usage
# Use Gemini with command-line options
ai-dev review --llm-provider gemini --gemini-api-key your-key --llm-model gemini-2.0-flash
# Use Gemini with environment variables
ai-dev review # Uses GEMINI_API_KEY and LLM_PROVIDER from environment
Available Models
- gemini-2.0-flash: Latest model with improved performance
- gemini-pro: High-quality model for complex tasks
- gemini-pro-vision: Multimodal model with image understanding
Best Practices
# Use Gemini 2.0 Flash for fast analysis
ai-dev review --llm-model gemini-2.0-flash
# Use Gemini Pro for detailed analysis
ai-dev review --llm-model gemini-pro --verbose
3. Ollama (Local)
Ollama lets you run large language models locally, keeping your code on your machine and enabling offline use.
Installation
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model
ollama pull llama2
ollama pull codellama
ollama pull mistral
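After pulling, confirm which models are installed locally before pointing the tool at one:
# List locally installed models and their sizes
ollama list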
Configuration
Environment Variables:
export OLLAMA_ENABLED="true"
export OLLAMA_BASE_URL="http://localhost:11434"
export OLLAMA_MODEL="llama2"
export LLM_PROVIDER="ollama"
Configuration File:
# ai-dev.config.local.yaml
llm:
  provider: "ollama"
  model: "llama2"
  temperature: 0.7
  maxTokens: 2000
  ollama:
    enabled: true
    baseUrl: "http://localhost:11434"
    model: "llama2"
Usage
# Start Ollama service
ollama serve
# Use Ollama with command-line options
ai-dev review --llm-provider ollama --ollama-enabled --ollama-model llama2
# Use Ollama with environment variables
ai-dev review # Uses OLLAMA_* environment variables
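If a review hangs or errors, isolate the problem by talking to the model outside the tool; if this replies, the issue is in the ai-dev configuration rather than in Ollama:
# Smoke-test the model directly, bypassing ai-dev
ollama run llama2 "Reply with OK"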
Available Models
General Purpose:
- llama2: Meta's Llama 2 model
- mistral: Mistral AI's 7B general-purpose model
Code-Specific:
- codellama: Code-specialized Llama model (pulls the default 7B tag)
- codellama:7b: 7B parameter code model
- codellama:13b: 13B parameter code model
- codellama:34b: 34B parameter code model
Best Practices
# Use CodeLlama for code analysis
ai-dev review --llm-model codellama:7b
# Use Mistral for general analysis
ai-dev review --llm-model mistral
# Use Llama2 for balanced performance
ai-dev review --llm-model llama2
Provider Comparison
| Provider | Hosting | Cost | Performance | Setup | Best For |
|---|---|---|---|---|---|
| OpenAI | Cloud | Pay-per-use | High | Easy | Production, complex analysis |
| Gemini | Cloud | Pay-per-use | High | Easy | Fast analysis, multimodal |
| Ollama | Local | Free | Variable | Medium | Privacy, offline use |
Advanced Configuration
Model Parameters
Configure model behavior with parameters:
# ai-dev.config.local.yaml
llm:
  provider: "openai"
  model: "gpt-4"
  temperature: 0.7       # Sampling randomness; lower = more deterministic
  maxTokens: 2000        # Maximum response length in tokens
  topP: 0.9              # Nucleus sampling cutoff
  frequencyPenalty: 0.0  # Penalize repeated tokens
  presencePenalty: 0.0   # Encourage new topics
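For code review you usually want reproducible, focused output, so the lower end of the temperature range tends to work better than the defaults above. One possible review-oriented profile using the same keys (the values are illustrative, not prescriptive):
# Example: a more deterministic profile for code review
llm:
  provider: "openai"
  model: "gpt-4"
  temperature: 0.2  # Low randomness for consistent findings
  maxTokens: 1500
  topP: 1.0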
Provider-Specific Settings
OpenAI Advanced Configuration
openai:
  apiKey: "${OPENAI_API_KEY}"
  baseUrl: "https://api.openai.com/v1"
  organization: "${OPENAI_ORGANIZATION}"
  timeout: 30000    # Request timeout in ms
  maxRetries: 3     # Number of retries
  retryDelay: 1000  # Delay between retries in ms
Gemini Advanced Configuration
gemini:
  apiKey: "${GEMINI_API_KEY}"
  model: "gemini-2.0-flash"
  baseUrl: "https://generativelanguage.googleapis.com/v1beta"
  timeout: 30000
  maxRetries: 3
  retryDelay: 1000
Ollama Advanced Configuration
ollama:
  enabled: true
  baseUrl: "http://localhost:11434"
  model: "llama2"
  timeout: 60000    # Longer timeout for local models
  maxRetries: 2
  retryDelay: 2000
Switching Between Providers
Command-Line Switching
# Switch to OpenAI
ai-dev review --llm-provider openai --openai-api-key your-key
# Switch to Gemini
ai-dev review --llm-provider gemini --gemini-api-key your-key
# Switch to Ollama
ai-dev review --llm-provider ollama --ollama-enabled
Environment Variable Switching
# Set OpenAI as default
export LLM_PROVIDER="openai"
export OPENAI_API_KEY="your-key"
# Switch to Gemini
export LLM_PROVIDER="gemini"
export GEMINI_API_KEY="your-key"
# Switch to Ollama
export LLM_PROVIDER="ollama"
export OLLAMA_ENABLED="true"
Configuration File Switching
# ai-dev.config.local.yaml
llm:
  provider: "openai"  # Change to "gemini" or "ollama"
  model: "gpt-4"      # Change the model accordingly
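If you switch providers often, editing the file by hand gets tedious; the change can be scripted. A sketch assuming yq v4 (https://github.com/mikefarah/yq) is installed:
# Flip the configured provider and model in one step (assumes yq v4)
yq -i '.llm.provider = "ollama" | .llm.model = "codellama:7b"' ai-dev.config.local.yaml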
Performance Optimization
Model Selection
Choose the right model for your use case:
# Fast analysis with good quality
ai-dev review --llm-model gpt-3.5-turbo
# High-quality analysis
ai-dev review --llm-model gpt-4
# Balanced performance
ai-dev review --llm-model gpt-4-turbo
Token Management
Optimize token usage:
# Limit response length
ai-dev review --llm-max-tokens 1000
# Narrow the review scope to fewer files for simple tasks
ai-dev review --file-patterns "src/utils/helper.ts"
Caching
Enable response caching for repeated requests:
# ai-dev.config.local.yaml
llm:
  provider: "openai"
  model: "gpt-4"
  cache: true
  cacheTTL: 3600  # Cache responses for 1 hour
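A quick way to confirm caching is working is to time two identical runs; assuming the cache key covers the prompt, model, and file contents, the second run should return noticeably faster:
# The first run populates the cache; the repeat should be much faster
time ai-dev review --file-patterns "src/utils/helper.ts"
time ai-dev review --file-patterns "src/utils/helper.ts"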
Security Considerations
API Key Management
Environment Variables (Recommended):
# Never commit API keys to version control
export OPENAI_API_KEY="sk-your-key"
export GEMINI_API_KEY="AIzaSy-your-key"
Configuration Files:
# Use environment variable references
openai:
  apiKey: "${OPENAI_API_KEY}"  # Not: "sk-your-key"
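Since ai-dev.config.local.yaml and any .env file may hold machine-specific secrets, keep them out of version control if your project does not already ignore them:
# Keep local config and env files out of the repository
echo "ai-dev.config.local.yaml" >> .gitignore
echo ".env" >> .gitignore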
Local vs Cloud
Use Local (Ollama) for:
- Sensitive code that shouldn't leave your machine
- Offline development
- Cost control
- Privacy requirements
Use Cloud (OpenAI/Gemini) for:
- Better performance and accuracy
- Access to latest models
- Team collaboration
- Production environments
Troubleshooting
Common Issues
API Key Errors:
# Test API key
curl -H "Authorization: Bearer $OPENAI_API_KEY" https://api.openai.com/v1/models
# Check key format
echo $OPENAI_API_KEY | head -c 10 # Should start with "sk-"
Ollama Connection Issues:
# Check if Ollama is running
curl http://localhost:11434/api/tags
# Start Ollama service
ollama serve
# Pull required model
ollama pull llama2
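If the service is up but reviews still fail, test inference directly against the local API; this separates model problems from ai-dev configuration problems:
# Test inference directly (non-streaming)
curl http://localhost:11434/api/generate \
  -d '{"model": "llama2", "prompt": "Reply with OK", "stream": false}'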
Rate Limiting:
# Reduce request frequency
ai-dev review --file-patterns "src/**/*.ts" --llm-max-tokens 500
# Use smaller models
ai-dev review --llm-model gpt-3.5-turbo
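For large repositories you can also spread requests out by reviewing in batches; a rough sketch that pauses between directories (the directory list and delay are illustrative):
# Review one directory at a time with a pause to stay under rate limits
for dir in src/utils src/core src/api; do
  ai-dev review --file-patterns "$dir/**/*.ts" --llm-max-tokens 500
  sleep 10
done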
Provider-Specific Issues
OpenAI Issues:
- Check API key validity
- Verify organization settings
- Monitor usage limits
Gemini Issues:
- Verify API key format
- Check quota limits
- Ensure model availability
Ollama Issues:
- Ensure service is running
- Check model availability
- Verify system resources
Best Practices
1. Choose the Right Provider
# For production: Use cloud providers
ai-dev review --llm-provider openai --llm-model gpt-4
# For development: Use local models
ai-dev review --llm-provider ollama --llm-model codellama:7b
# For privacy: Use local models
ai-dev review --llm-provider ollama --llm-model llama2
2. Optimize for Cost
# Use smaller models for simple tasks
ai-dev review --llm-model gpt-3.5-turbo
# Limit token usage
ai-dev review --llm-max-tokens 1000
# Use local models when possible
ai-dev review --llm-provider ollama
3. Ensure Reliability
# Use stable models
ai-dev review --llm-model gpt-4
# Enable retries
ai-dev review --llm-max-retries 3
# Use fallback providers
ai-dev review --llm-provider openai --fallback-provider gemini
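The --fallback-provider flag handles failover within a single run. To get the same behavior at the shell level (in CI, for example), a simple sketch that falls back to a local model when the cloud call fails, assuming ai-dev exits non-zero on provider errors:
# Try the cloud provider first; fall back to a local model on failure
ai-dev review --llm-provider openai --llm-model gpt-4 \
  || ai-dev review --llm-provider ollama --llm-model codellama:7b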