Project Overview
This AI-powered code review assistant integrates with GitHub to automatically analyze pull requests using GPT-4. It provides intelligent feedback on code quality, potential bugs, security vulnerabilities, and adherence to best practices.
Architecture
┌──────────────┐
│ GitHub │
│ Webhook │
└──────┬───────┘
│
▼
┌──────────────┐ ┌──────────────┐
│ FastAPI │─────▶│ OpenAI │
│ Backend │ │ GPT-4 API │
└──────┬───────┘ └──────────────┘
│
▼
┌──────────────┐
│ PostgreSQL │
│ Database │
└──────────────┘
Core Features
- Automated PR analysis triggered by GitHub webhooks
- Multi-language support (Python, JavaScript, TypeScript, Java, Go)
- Security vulnerability detection using pattern matching and AI
- Code smell identification (complexity, duplication, naming)
- Best practice suggestions based on language-specific guidelines
- Inline comments posted directly on the PR
- Summary reports with overall code quality score
- Custom rule configuration per repository
Implementation
Backend Service
The FastAPI backend handles webhook events and orchestrates the review process:
from fastapi import FastAPI, Request, BackgroundTasks
from openai import OpenAI
import httpx
from typing import List, Dict
import os
app = FastAPI()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
class CodeReviewer:
def __init__(self, repo_owner: str, repo_name: str, pr_number: int):
self.repo_owner = repo_owner
self.repo_name = repo_name
self.pr_number = pr_number
self.github_token = os.getenv("GITHUB_TOKEN")
async def get_pr_diff(self) -> str:
"""Fetch the PR diff from GitHub API"""
url = f"https://api.github.com/repos/{self.repo_owner}/{self.repo_name}/pulls/{self.pr_number}"
headers = {
"Authorization": f"token {self.github_token}",
"Accept": "application/vnd.github.v3.diff"
}
async with httpx.AsyncClient() as client:
response = await client.get(url, headers=headers)
return response.text
async def analyze_code(self, diff: str) -> List[Dict]:
"""Use GPT-4 to analyze the code changes"""
prompt = f"""You are an expert code reviewer. Analyze the following code diff and provide detailed feedback.
Focus on:
1. Potential bugs or logic errors
2. Security vulnerabilities
3. Performance issues
4. Code style and best practices
5. Maintainability concerns
Provide specific, actionable feedback with line numbers when possible.
Diff:
{diff}
Format your response as a JSON array of review comments with this structure:
[
{{
"line": <line_number>,
"severity": "error|warning|info",
"category": "bug|security|performance|style|maintainability",
"message": "Detailed explanation of the issue",
"suggestion": "How to fix it"
}}
]
"""
response = client.chat.completions.create(
model="gpt-4-turbo-preview",
messages=[
{"role": "system", "content": "You are an expert code reviewer."},
{"role": "user", "content": prompt}
],
temperature=0.3,
response_format={"type": "json_object"}
)
import json
return json.loads(response.choices[0].message.content)
async def post_review_comments(self, comments: List[Dict]):
"""Post review comments to the GitHub PR"""
url = f"https://api.github.com/repos/{self.repo_owner}/{self.repo_name}/pulls/{self.pr_number}/comments"
headers = {
"Authorization": f"token {self.github_token}",
"Accept": "application/vnd.github.v3+json"
}
async with httpx.AsyncClient() as client:
for comment in comments:
body = f"""**{comment['severity'].upper()}** - {comment['category']}
{comment['message']}
**Suggestion:** {comment['suggestion']}
---
*Generated by AI Code Reviewer 🤖*
"""
payload = {
"body": body,
"commit_id": await self.get_latest_commit(),
"path": comment.get("file", ""),
"line": comment["line"]
}
await client.post(url, headers=headers, json=payload)
async def generate_summary(self, comments: List[Dict]) -> str:
"""Generate an overall review summary"""
errors = len([c for c in comments if c['severity'] == 'error'])
warnings = len([c for c in comments if c['severity'] == 'warning'])
summary = f"""## AI Code Review Summary
**Total Issues Found:** {len(comments)}
- 🔴 Errors: {errors}
- 🟡 Warnings: {warnings}
- ℹ️ Info: {len(comments) - errors - warnings}
### Category Breakdown
"""
categories = {}
for comment in comments:
cat = comment['category']
categories[cat] = categories.get(cat, 0) + 1
for cat, count in categories.items():
summary += f"- {cat.title()}: {count}\n"
# Calculate quality score
score = max(0, 100 - (errors * 10) - (warnings * 5))
summary += f"\n**Code Quality Score:** {score}/100\n"
return summary
@app.post("/webhook/github")
async def github_webhook(request: Request, background_tasks: BackgroundTasks):
"""Handle GitHub webhook events"""
payload = await request.json()
# Only process pull request events
if payload.get("action") not in ["opened", "synchronize"]:
return {"status": "ignored"}
pr = payload["pull_request"]
repo = payload["repository"]
# Queue the review in the background
background_tasks.add_task(
review_pull_request,
repo["owner"]["login"],
repo["name"],
pr["number"]
)
return {"status": "queued"}
async def review_pull_request(owner: str, repo: str, pr_number: int):
"""Perform the code review"""
reviewer = CodeReviewer(owner, repo, pr_number)
# Get the PR diff
diff = await reviewer.get_pr_diff()
# Analyze with AI
comments = await reviewer.analyze_code(diff)
# Post comments to GitHub
await reviewer.post_review_comments(comments)
# Post summary
summary = await reviewer.generate_summary(comments)
await reviewer.post_pr_comment(summary)
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8000)
Security Analysis Module
Additional security-focused analysis using pattern matching:
import re
from typing import List, Dict
class SecurityAnalyzer:
"""Detect common security vulnerabilities"""
PATTERNS = {
"hardcoded_secrets": [
r'password\s*=\s*["\'][^"\']+["\']',
r'api_key\s*=\s*["\'][^"\']+["\']',
r'secret\s*=\s*["\'][^"\']+["\']',
],
"sql_injection": [
r'execute\([^)]*\+[^)]*\)',
r'\.format\([^)]*\).*execute',
],
"xss_vulnerability": [
r'innerHTML\s*=',
r'dangerouslySetInnerHTML',
],
"insecure_random": [
r'random\.random\(',
r'Math\.random\(',
]
}
def analyze(self, code: str) -> List[Dict]:
"""Scan code for security issues"""
issues = []
for category, patterns in self.PATTERNS.items():
for pattern in patterns:
matches = re.finditer(pattern, code, re.IGNORECASE)
for match in matches:
issues.append({
"category": "security",
"subcategory": category,
"severity": "error",
"line": code[:match.start()].count('\n') + 1,
"message": f"Potential {category.replace('_', ' ')} detected",
"code_snippet": match.group(0)
})
return issues
Challenges & Solutions
Challenge 1: Token Limits
Problem: Large PRs exceeded GPT-4’s token limit.
Solution: Implemented chunking strategy to analyze files separately and aggregate results, with smart prioritization of changed files.
Challenge 2: False Positives
Problem: AI sometimes flagged valid code as problematic.
Solution: Added confidence scoring and allowed developers to mark false positives, which are used to fine-tune the prompts.
Challenge 3: Rate Limiting
Problem: GitHub API rate limits caused issues with high-volume repos.
Solution: Implemented request queuing, caching, and exponential backoff with retry logic.
Results & Impact
- 50% reduction in time spent on initial code reviews
- 30% increase in caught bugs before merge
- Consistent enforcement of coding standards
- Educational value for junior developers through detailed explanations
Lessons Learned
- Prompt engineering is critical for quality AI responses
- Context matters - providing file structure and dependencies improves analysis
- Human oversight is still essential - AI augments, doesn’t replace
- Cost management is important with API-based solutions
- Feedback loops improve the system over time
Future Enhancements
- Support for GitLab and Bitbucket
- Custom rule engine for team-specific standards
- Integration with static analysis tools (ESLint, Pylint)
- Learning from accepted/rejected suggestions
- Multi-file context awareness
- Performance benchmarking suggestions
Try It Out
Try It Out
Check out the live demo or explore the source code on GitHub.