LLM Provider Configuration

Learn how to configure and use different LLM providers with AI Developer Assistant

AI Developer Assistant supports multiple Large Language Model (LLM) providers, giving you flexibility in choosing between cloud-based and local AI models. This guide covers configuration, usage, and best practices for each provider.

Supported Providers

1. OpenAI (Cloud)

OpenAI provides access to powerful models like GPT-4 and GPT-3.5-turbo through their cloud API.

Configuration

Environment Variables:

export OPENAI_API_KEY="sk-your-openai-api-key"
export LLM_PROVIDER="openai"
export LLM_MODEL="gpt-4"

Configuration File:

# ai-dev.config.local.yaml
llm:
  provider: "openai"
  model: "gpt-4"
  temperature: 0.7
  maxTokens: 2000

openai:
  apiKey: "${OPENAI_API_KEY}"
  baseUrl: "https://api.openai.com/v1"
  organization: "${OPENAI_ORGANIZATION}"  # Optional

Usage

# Use OpenAI with command-line options
ai-dev review --llm-provider openai --openai-api-key your-key --llm-model gpt-4

# Use OpenAI with environment variables
ai-dev review  # Uses OPENAI_API_KEY and LLM_PROVIDER from environment

Available Models

  • gpt-4: Most capable model, best for complex analysis
  • gpt-4-turbo: Faster, lower-cost variant of GPT-4 with a larger context window
  • gpt-3.5-turbo: Cost-effective option for most tasks
  • gpt-3.5-turbo-16k: GPT-3.5-turbo variant with a 16k-token context window
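
To confirm which of these models your API key can actually access, you can query OpenAI's standard models endpoint (the grep filter is just a convenience for pulling out the model IDs):

# List the model IDs available to your key
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY" | grep -o '"id": *"[^"]*"'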

Best Practices

# Use GPT-4 for complex code reviews
ai-dev review --llm-model gpt-4 --verbose

# Use GPT-3.5-turbo for quick reviews
ai-dev review --llm-model gpt-3.5-turbo

# Use GPT-4-turbo for balanced performance
ai-dev review --llm-model gpt-4-turbo

2. Google Gemini (Cloud)

Google Gemini provides access to advanced AI models through Google AI Studio.

Configuration

Environment Variables:

export GEMINI_API_KEY="AIzaSy-your-gemini-api-key"
export LLM_PROVIDER="gemini"
export LLM_MODEL="gemini-2.0-flash"

Configuration File:

# ai-dev.config.local.yaml
llm:
  provider: "gemini"
  model: "gemini-2.0-flash"
  temperature: 0.7
  maxTokens: 2000

gemini:
  apiKey: "${GEMINI_API_KEY}"
  model: "gemini-2.0-flash"
  baseUrl: "https://generativelanguage.googleapis.com/v1beta"

Usage

# Use Gemini with command-line options
ai-dev review --llm-provider gemini --gemini-api-key your-key --llm-model gemini-2.0-flash

# Use Gemini with environment variables
ai-dev review  # Uses GEMINI_API_KEY and LLM_PROVIDER from environment

Available Models

  • gemini-2.0-flash: Fast, low-latency model; a good default for most reviews
  • gemini-pro: High-quality model for complex tasks
  • gemini-pro-vision: Multimodal model with image understanding
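
To verify a key and see which models it can reach, you can query the Generative Language API directly:

# List available Gemini models (the key is passed as a query parameter)
curl -s "https://generativelanguage.googleapis.com/v1beta/models?key=$GEMINI_API_KEY"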

Best Practices

# Use Gemini 2.0 Flash for fast analysis
ai-dev review --llm-model gemini-2.0-flash

# Use Gemini Pro for detailed analysis
ai-dev review --llm-model gemini-pro --verbose

3. Ollama (Local)

Ollama allows you to run large language models locally on your machine, providing privacy and offline capabilities.

Installation

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama2
ollama pull codellama
ollama pull mistral
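
Once pulled, ollama list confirms which models are installed locally:

# Verify installed models
ollama list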

Configuration

Environment Variables:

export OLLAMA_ENABLED="true"
export OLLAMA_BASE_URL="http://localhost:11434"
export OLLAMA_MODEL="llama2"
export LLM_PROVIDER="ollama"

Configuration File:

# ai-dev.config.local.yaml
llm:
  provider: "ollama"
  model: "llama2"
  temperature: 0.7
  maxTokens: 2000

ollama:
  enabled: true
  baseUrl: "http://localhost:11434"
  model: "llama2"

Usage

# Start Ollama service
ollama serve

# Use Ollama with command-line options
ai-dev review --llm-provider ollama --ollama-enabled --ollama-model llama2

# Use Ollama with environment variables
ai-dev review  # Uses OLLAMA_* environment variables

Available Models

General Purpose:

  • llama2: Meta's Llama 2 model
  • mistral: Mistral AI's model
  • codellama: Code-specialized Llama model

Code-Specific:

  • codellama:7b: 7B parameter code model
  • codellama:13b: 13B parameter code model
  • codellama:34b: 34B parameter code model
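
Bigger models need more memory: Ollama's guidance is roughly 8 GB of RAM for 7B models, 16 GB for 13B, and 32 GB for the 33B/34B class. Check what your machine has before pulling:

# Total memory on Linux
free -h

# Total memory on macOS
sysctl hw.memsize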

Best Practices

# Use CodeLlama for code analysis
ai-dev review --llm-model codellama:7b

# Use Mistral for general analysis
ai-dev review --llm-model mistral

# Use Llama2 for balanced performance
ai-dev review --llm-model llama2

Provider Comparison

Provider | Privacy | Cost        | Performance | Setup  | Best For
---------|---------|-------------|-------------|--------|-----------------------------
OpenAI   | Cloud   | Pay-per-use | High        | Easy   | Production, complex analysis
Gemini   | Cloud   | Pay-per-use | High        | Easy   | Fast analysis, multimodal
Ollama   | Local   | Free        | Variable    | Medium | Privacy, offline use

Advanced Configuration

Model Parameters

Configure model behavior with parameters:

# ai-dev.config.local.yaml
llm:
  provider: "openai"
  model: "gpt-4"
  temperature: 0.7        # Sampling randomness; lower values are more deterministic
  maxTokens: 2000         # Maximum response length
  topP: 0.9              # Nucleus sampling
  frequencyPenalty: 0.0   # Reduce repetition
  presencePenalty: 0.0    # Encourage new topics
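
For code review, consistent output usually matters more than creativity. A sketch of review-oriented settings, assuming the parameters above are honored as documented:

# ai-dev.config.local.yaml
llm:
  provider: "openai"
  model: "gpt-4"
  temperature: 0.2   # Low randomness for reproducible findings
  maxTokens: 2000
  topP: 1.0          # Leave nucleus sampling effectively off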

Provider-Specific Settings

OpenAI Advanced Configuration

openai:
  apiKey: "${OPENAI_API_KEY}"
  baseUrl: "https://api.openai.com/v1"
  organization: "${OPENAI_ORGANIZATION}"
  timeout: 30000          # Request timeout in ms
  maxRetries: 3           # Number of retries
  retryDelay: 1000        # Delay between retries

Gemini Advanced Configuration

gemini:
  apiKey: "${GEMINI_API_KEY}"
  model: "gemini-2.0-flash"
  baseUrl: "https://generativelanguage.googleapis.com/v1beta"
  timeout: 30000
  maxRetries: 3
  retryDelay: 1000

Ollama Advanced Configuration

ollama:
  enabled: true
  baseUrl: "http://localhost:11434"
  model: "llama2"
  timeout: 60000          # Longer timeout for local models
  maxRetries: 2
  retryDelay: 2000

Switching Between Providers

Command-Line Switching

# Switch to OpenAI
ai-dev review --llm-provider openai --openai-api-key your-key

# Switch to Gemini
ai-dev review --llm-provider gemini --gemini-api-key your-key

# Switch to Ollama
ai-dev review --llm-provider ollama --ollama-enabled

Environment Variable Switching

# Set OpenAI as default
export LLM_PROVIDER="openai"
export OPENAI_API_KEY="your-key"

# Switch to Gemini
export LLM_PROVIDER="gemini"
export GEMINI_API_KEY="your-key"

# Switch to Ollama
export LLM_PROVIDER="ollama"
export OLLAMA_ENABLED="true"
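
If you switch often, a small shell helper (a hypothetical convenience, not part of ai-dev) can set the documented variables in one step:

# use_llm: switch providers via the environment variables shown above
use_llm() {
  case "$1" in
    openai) export LLM_PROVIDER="openai" LLM_MODEL="gpt-4" ;;
    gemini) export LLM_PROVIDER="gemini" LLM_MODEL="gemini-2.0-flash" ;;
    ollama) export LLM_PROVIDER="ollama" OLLAMA_ENABLED="true" LLM_MODEL="llama2" ;;
    *) echo "usage: use_llm openai|gemini|ollama" >&2; return 1 ;;
  esac
}

# Example: switch to Ollama, then run a review
use_llm ollama && ai-dev review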

Configuration File Switching

# ai-dev.config.local.yaml
llm:
  provider: "openai"  # Change to "gemini" or "ollama"
  model: "gpt-4"      # Change model accordingly

Performance Optimization

Model Selection

Choose the right model for your use case:

# Fast analysis with good quality
ai-dev review --llm-model gpt-3.5-turbo

# High-quality analysis
ai-dev review --llm-model gpt-4

# Balanced performance
ai-dev review --llm-model gpt-4-turbo

Token Management

Optimize token usage (as a rough rule of thumb, one token is about four characters of English text):

# Limit response length
ai-dev review --llm-max-tokens 1000

# Keep prompts small by narrowing the review scope to specific files
ai-dev review --file-patterns "src/utils/helper.ts"

Caching

Enable response caching for repeated requests:

# ai-dev.config.local.yaml
llm:
  provider: "openai"
  model: "gpt-4"
  cache: true
  cacheTTL: 3600  # Cache for 1 hour
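
A response cache only helps when the prompt is identical, so it mostly pays off when re-running the same review over unchanged files. To force fresh responses, disable it temporarily:

# ai-dev.config.local.yaml
llm:
  cache: false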

Security Considerations

API Key Management

Environment Variables (Recommended):

# Never commit API keys to version control
export OPENAI_API_KEY="sk-your-key"
export GEMINI_API_KEY="AIzaSy-your-key"

Configuration Files:

# Use environment variable references
openai:
  apiKey: "${OPENAI_API_KEY}"  # Not: "sk-your-key"

Local vs Cloud

Use Local (Ollama) for:

  • Sensitive code that shouldn't leave your machine
  • Offline development
  • Cost control
  • Privacy requirements

Use Cloud (OpenAI/Gemini) for:

  • Better performance and accuracy
  • Access to latest models
  • Team collaboration
  • Production environments

Troubleshooting

Common Issues

API Key Errors:

# Test API key
curl -H "Authorization: Bearer $OPENAI_API_KEY" https://api.openai.com/v1/models

# Check key format
echo $OPENAI_API_KEY | head -c 10  # Should start with "sk-"

Ollama Connection Issues:

# Check if Ollama is running
curl http://localhost:11434/api/tags

# Start Ollama service
ollama serve

# Pull required model
ollama pull llama2
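
If the service responds but requests hang, check whether the model is actually loaded into memory:

# Show models currently loaded in memory
ollama ps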

Rate Limiting:

# Reduce request volume by narrowing scope and capping response tokens
ai-dev review --file-patterns "src/**/*.ts" --llm-max-tokens 500

# Use smaller models
ai-dev review --llm-model gpt-3.5-turbo

Provider-Specific Issues

OpenAI Issues:

  • Check API key validity
  • Verify organization settings
  • Monitor usage limits

Gemini Issues:

  • Verify API key format
  • Check quota limits
  • Ensure model availability

Ollama Issues:

  • Ensure service is running
  • Check model availability
  • Verify system resources

Best Practices

1. Choose the Right Provider

# For production: Use cloud providers
ai-dev review --llm-provider openai --llm-model gpt-4

# For development: Use local models
ai-dev review --llm-provider ollama --llm-model codellama:7b

# For privacy: Use local models
ai-dev review --llm-provider ollama --llm-model llama2

2. Optimize for Cost

# Use smaller models for simple tasks
ai-dev review --llm-model gpt-3.5-turbo

# Limit token usage
ai-dev review --llm-max-tokens 1000

# Use local models when possible
ai-dev review --llm-provider ollama

3. Ensure Reliability

# Use stable models
ai-dev review --llm-model gpt-4

# Enable retries
ai-dev review --llm-max-retries 3

# Use fallback providers
ai-dev review --llm-provider openai --fallback-provider gemini

Start with a cloud provider for better performance and accuracy, then consider local models for privacy-sensitive projects or cost optimization. You can always switch between providers based on your specific needs.