Architecture Overview¶
git-autosquash is designed as a modular system with clear separation of concerns. This document provides a comprehensive overview of the system architecture, component interactions, and design decisions.
System Overview¶
graph TB
subgraph "CLI Layer"
A[main.py] --> B[Argument Parsing]
B --> C[Repository Validation]
end
subgraph "Core Processing"
C --> D[GitOps]
D --> E[HunkParser]
E --> F[BlameAnalyzer]
end
subgraph "User Interface"
F --> G[AutoSquashApp]
G --> H[ApprovalScreen]
H --> I[Widgets]
end
subgraph "Execution Layer"
I --> J[RebaseManager]
J --> K[Interactive Rebase]
K --> L[Conflict Resolution]
end
style A fill:#e1f5fe
style G fill:#f3e5f5
style J fill:#e8f5e8
Core Components¶
1. GitOps (git_ops.py)¶
Purpose: Central interface for all Git operations with proper error handling and subprocess management.
Key Responsibilities:
- Repository validation and branch detection
- Working tree status analysis
- Git command execution with timeout and error handling
- Merge base calculation and commit validation
Design Patterns:
- Facade Pattern: Simplifies complex Git interactions
- Error Handling: Comprehensive subprocess error management
- Caching: Intelligent caching of expensive Git operations
from __future__ import annotations
import subprocess

class GitOps:
    # Interface summary; method bodies are omitted.
    def __init__(self, repo_path: str = ".") -> None: ...
    def is_git_repo(self) -> bool: ...
    def get_current_branch(self) -> str | None: ...
    def get_merge_base_with_main(self, current_branch: str) -> str | None: ...
    def get_working_tree_status(self) -> dict[str, bool]: ...
    def run_git_command(self, args: list[str], env: dict[str, str] | None = None) -> subprocess.CompletedProcess[str]: ...
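To illustrate the timeout and error-handling responsibilities listed above, here is a minimal sketch of how a command runner along these lines might wrap `subprocess.run`. The class name `GitCommandRunner`, the `timeout` and `executable` parameters, the environment-merging behavior, and the 124/127 return codes are assumptions made for this example, not git-autosquash's actual implementation.

```python
from __future__ import annotations

import os
import subprocess


class GitCommandRunner:
    """Hypothetical stand-in for the command-execution part of GitOps."""

    def __init__(self, repo_path: str = ".", timeout: float = 30.0,
                 executable: str = "git") -> None:
        self.repo_path = repo_path
        self.timeout = timeout
        self.executable = executable

    def run(self, args: list[str],
            env: dict[str, str] | None = None) -> subprocess.CompletedProcess[str]:
        # Arguments are passed as a list, so no shell parsing is involved.
        cmd = [self.executable, *args]
        # Assumption: extra variables are layered over the inherited environment.
        full_env = {**os.environ, **env} if env else None
        try:
            return subprocess.run(
                cmd, cwd=self.repo_path, env=full_env,
                capture_output=True, text=True,
                timeout=self.timeout, check=False,
            )
        except subprocess.TimeoutExpired:
            # Report a timeout as a failed result instead of raising.
            return subprocess.CompletedProcess(
                cmd, 124, "", f"timed out after {self.timeout}s")
        except FileNotFoundError as exc:
            # The executable (or the working directory) does not exist.
            return subprocess.CompletedProcess(cmd, 127, "", str(exc))
```

Returning a `CompletedProcess` in every case lets callers branch on `returncode` alone; a real implementation could just as reasonably raise a dedicated exception instead.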
2. HunkParser (hunk_parser.py)¶
Purpose: Parses Git diff output into structured hunk objects for analysis and processing.
Key Responsibilities:
- Parse git diff output into structured DiffHunk objects
- Support both default and line-by-line hunk splitting modes
- Extract file context and line range information
- Handle various diff formats and edge cases
Design Decisions:
- Immutable Data Structures: DiffHunk objects are immutable for safety
- Flexible Parsing: Supports multiple diff modes and contexts
- Line Preservation: Maintains exact line content including whitespace
from dataclasses import dataclass

@dataclass(frozen=True)
class DiffHunk:
    file_path: str
    old_start: int
    old_count: int
    new_start: int
    new_count: int
    lines: list[str]
    context_before: list[str]
    context_after: list[str]
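The `old_start`, `old_count`, `new_start`, and `new_count` fields correspond to the `@@ -a,b +c,d @@` header that opens each hunk in unified diff output. As a rough sketch of how that header could be parsed, assuming a regular-expression approach (the names `HunkRange` and `parse_hunk_header` are hypothetical and not taken from `hunk_parser.py`):

```python
import re
from dataclasses import dataclass

# Unified diff hunk header: "@@ -old_start[,old_count] +new_start[,new_count] @@"
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


@dataclass(frozen=True)
class HunkRange:
    old_start: int
    old_count: int
    new_start: int
    new_count: int


def parse_hunk_header(line: str) -> HunkRange:
    """Extract the line ranges from a single hunk header line."""
    match = HUNK_HEADER.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_count, new_start, new_count = match.groups()
    return HunkRange(
        old_start=int(old_start),
        # The count is omitted from the header when it equals 1.
        old_count=int(old_count) if old_count is not None else 1,
        new_start=int(new_start),
        new_count=int(new_count) if new_count is not None else 1,
    )
```

For example, `parse_hunk_header("@@ -10,3 +10,4 @@ def foo():")` yields a range starting at line 10 with 3 old lines and 4 new lines.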
3. BlameAnalyzer (blame_analyzer.py)¶
Purpose: Analyzes Git blame information to determine target commits for each hunk with confidence scoring.
Key Responsibilities:
- Run git blame analysis on hunk line ranges
- Determine most frequent commit for each hunk (frequency-first algorithm)
- Filter commits to branch scope (merge-base to HEAD)
- Calculate confidence levels based on blame consistency
- Cache commit metadata for performance
Algorithm Design:
- Frequency-First Scoring: Prioritizes commits that modified the most lines
- Recency Tiebreaking: Uses commit timestamps to break frequency ties
- Branch Scoping: Only considers commits on current branch since merge-base
from __future__ import annotations
from typing import List, Set

class BlameAnalyzer:
    # Interface summary; method bodies are omitted.
    def analyze_hunks(self, hunks: List[DiffHunk]) -> List[HunkTargetMapping]: ...
    def _analyze_single_hunk(self, hunk: DiffHunk) -> HunkTargetMapping: ...
    def _get_branch_commits(self) -> Set[str]: ...  # Cached
    def _get_commit_timestamp(self, commit_hash: str) -> int: ...  # Cached
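The frequency-first selection with recency tiebreaking and branch scoping can be sketched roughly as follows. This is an illustration under stated assumptions rather than the project's code: the function name `pick_target_commit`, the choice that the more recent commit wins a tie, and the 0.8/0.5 confidence thresholds are all hypothetical.

```python
from __future__ import annotations

from collections import Counter


def pick_target_commit(
    blamed_commits: list[str],
    branch_commits: set[str],
    timestamps: dict[str, int],
) -> tuple[str | None, str]:
    """Return (target commit, confidence label) for one hunk.

    blamed_commits holds one commit hash per blamed line of the hunk.
    """
    # Branch scoping: ignore commits outside merge-base..HEAD.
    in_scope = [c for c in blamed_commits if c in branch_commits]
    if not in_scope:
        return None, "low"

    counts = Counter(in_scope)
    # Frequency first; on equal frequency the larger timestamp wins
    # (assumption: "recency" means the most recent commit is preferred).
    target = max(counts, key=lambda c: (counts[c], timestamps.get(c, 0)))

    # Hypothetical thresholds: share of all blamed lines owned by the target.
    ratio = counts[target] / len(blamed_commits)
    if ratio >= 0.8:
        confidence = "high"
    elif ratio >= 0.5:
        confidence = "medium"
    else:
        confidence = "low"
    return target, confidence
```

With lines blamed to `["a", "a", "b", "b", "c"]`, branch commits `{"a", "b"}`, and `b` newer than `a`, the tie at two lines each goes to `b`, and the 40% share maps to low confidence under these thresholds.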
4. TUI System (tui/)¶
Purpose: Rich terminal interface using Textual framework for user interaction and approval workflow.
Component Structure:
graph TD
A[AutoSquashApp] --> B[ApprovalScreen]
B --> C[HunkMappingWidget]
B --> D[DiffViewer]
B --> E[ProgressIndicator]
C --> F[Checkbox]
C --> G[Static Text]
D --> H[Syntax Highlighting]
E --> I[Progress Display]
Key Design Principles:
- Reactive UI: Real-time updates based on user interactions
- Safety Defaults: All hunks start unapproved requiring explicit consent
- Keyboard Navigation: Full keyboard control for efficient workflows
- Graceful Fallback: Text-based fallback when TUI unavailable
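The "Safety Defaults" principle can be illustrated independently of any UI toolkit. The sketch below models per-hunk approval state in plain Python; `ApprovalState` and its methods are hypothetical names for this example and do not describe the actual Textual widgets.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ApprovalState:
    """Hypothetical approval model: nothing is approved until toggled."""

    hunk_ids: list[str]
    approved: set[str] = field(default_factory=set)  # empty by default

    def toggle(self, hunk_id: str) -> bool:
        """Flip approval for one hunk and return its new state."""
        if hunk_id not in self.hunk_ids:
            raise KeyError(hunk_id)
        if hunk_id in self.approved:
            self.approved.remove(hunk_id)
        else:
            self.approved.add(hunk_id)
        return hunk_id in self.approved

    def approved_in_order(self) -> list[str]:
        """Approved hunks, preserving the original display order."""
        return [h for h in self.hunk_ids if h in self.approved]
```

Keeping this state separate from the widgets would also let a text-based fallback reuse the same model, although whether the project is structured this way is not stated here.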
5. RebaseManager (rebase_manager.py)¶
Purpose: Orchestrates interactive rebase operations to apply approved hunks to historical commits.
Key Responsibilities:
- Group hunks by target commit for batch processing
- Execute interactive rebase with chronological ordering
- Handle stash/unstash operations for working tree management
- Detect and report conflicts with resolution guidance
- Provide automatic rollback on errors or interruption
Execution Flow:
1. Preparation: Stash uncommitted changes, validate branch state
2. Grouping: Organize hunks by target commit hash
3. Ordering: Sort commits chronologically (oldest first) for history integrity
4. Processing: For each commit:
   - Start interactive rebase to edit the commit
   - Apply hunk patches using git apply
   - Amend the commit with new changes
   - Continue rebase to next commit
5. Cleanup: Restore stash, handle any remaining cleanup
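The grouping and ordering steps (2 and 3) can be sketched as a small pure function. The name `plan_rebase`, the `(hunk_id, target_commit)` pair representation, and sorting by commit timestamp are assumptions for illustration; the real implementation may derive the order from Git history directly.

```python
from __future__ import annotations

from collections import defaultdict


def plan_rebase(
    mappings: list[tuple[str, str]],
    timestamps: dict[str, int],
) -> list[tuple[str, list[str]]]:
    """Group hunks by target commit, then order the commits oldest first.

    mappings is a list of (hunk_id, target_commit) pairs.
    """
    grouped: dict[str, list[str]] = defaultdict(list)
    for hunk_id, commit in mappings:
        grouped[commit].append(hunk_id)

    # Oldest commit first, so earlier history is rewritten before later commits.
    ordered = sorted(grouped, key=lambda commit: timestamps[commit])
    return [(commit, grouped[commit]) for commit in ordered]
```

Each entry in the returned plan then corresponds to one pass through step 4.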
Error Handling Strategy:
- Conflict Detection: Identify merge conflicts and pause with guidance
- Automatic Rollback: Restore repository state on errors or cancellation
- Resource Cleanup: Ensure temporary files and stashes are properly cleaned up
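One common way to get rollback on both errors and interruption is a context manager that snapshots state on entry and restores it if the body raises. The sketch below is generic and hypothetical (`rollback_on_failure`, `save`, and `restore` are illustrative names); in a Git setting, `save` might record the output of `git rev-parse HEAD` and `restore` might run `git rebase --abort`, but that wiring is an assumption.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


@contextmanager
def rollback_on_failure(
    save: Callable[[], T],
    restore: Callable[[T], None],
) -> Iterator[T]:
    """Snapshot state on entry; restore it if the body raises."""
    snapshot = save()
    try:
        yield snapshot
    except BaseException:
        # BaseException also covers KeyboardInterrupt (Ctrl+C),
        # so user cancellation triggers the same rollback path.
        restore(snapshot)
        raise
```

The exception is re-raised after restoring, so callers still see the original failure and can report it.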
Data Flow¶
1. Input Processing¶
sequenceDiagram
participant CLI as main.py
participant Git as GitOps
participant Parser as HunkParser
CLI->>Git: Validate repository
Git->>CLI: Repository status
CLI->>Git: Get working tree changes
Git->>Parser: Raw diff output
Parser->>CLI: Structured DiffHunk objects
2. Analysis Phase¶
sequenceDiagram
participant CLI as main.py
participant Blame as BlameAnalyzer
participant Git as GitOps
CLI->>Blame: analyze_hunks(hunks)
loop For each hunk
Blame->>Git: git blame <lines>
Git->>Blame: Blame output
Blame->>Blame: Find target commit
Blame->>Blame: Calculate confidence
end
Blame->>CLI: HunkTargetMapping list
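The "Blame output" step in the loop above has to turn raw blame text into one commit hash per line. Assuming the analyzer reads `git blame --porcelain` output (the source does not say which format is used), a minimal parser could look like this; `commits_from_porcelain` is a hypothetical name.

```python
import re

# In porcelain output, every blamed line starts with a header of the form
# "<sha> <orig_line> <final_line> [<lines_in_group>]". The line's content
# follows later on a TAB-prefixed line, so content can never match this.
BLAME_HEADER = re.compile(r"^([0-9a-f]{40,64}) \d+ \d+(?: \d+)?$")


def commits_from_porcelain(output: str) -> list[str]:
    """Return one commit hash per blamed line, in line order."""
    commits = []
    for line in output.split("\n"):
        match = BLAME_HEADER.match(line)
        if match:
            commits.append(match.group(1))
    return commits
```

The resulting list is the per-line input that a frequency count over commits would operate on.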
3. User Approval¶
sequenceDiagram
participant CLI as main.py
participant App as AutoSquashApp
participant Screen as ApprovalScreen
participant User as User
CLI->>App: Launch TUI with mappings
App->>Screen: Create approval screen
Screen->>User: Show hunk mappings
User->>Screen: Review and approve hunks
Screen->>App: Approved mappings
App->>CLI: User decisions
4. Execution Phase¶
sequenceDiagram
participant CLI as main.py
participant Rebase as RebaseManager
participant Git as GitOps
CLI->>Rebase: execute_squash(mappings)
Rebase->>Rebase: Group hunks by commit
Rebase->>Rebase: Order commits chronologically
loop For each target commit
Rebase->>Git: Start interactive rebase
Rebase->>Git: Apply hunk patches
Rebase->>Git: Amend commit
Rebase->>Git: Continue rebase
end
Rebase->>CLI: Success/failure result
Design Patterns and Principles¶
1. Separation of Concerns¶
Each component has a single, well-defined responsibility:
- GitOps: Git command interface
- HunkParser: Diff parsing and structure
- BlameAnalyzer: Blame analysis and targeting
- TUI Components: User interface and interaction
- RebaseManager: Rebase orchestration and execution
2. Error Handling Strategy¶
Defensive Programming:
- Validate all inputs at component boundaries
- Handle subprocess failures gracefully
- Provide meaningful error messages to users
- Implement automatic rollback mechanisms
Error Categories:
- User Errors: Invalid repository state, detached HEAD
- Git Errors: Command failures, conflicts, repository issues
- System Errors: File I/O, permissions, resource constraints
- Interruption: User cancellation, keyboard interrupt
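These categories map naturally onto an exception hierarchy. The following is a hypothetical sketch: none of the class names or exit codes below are taken from git-autosquash, and 130 is simply the conventional shell exit status after Ctrl+C.

```python
from __future__ import annotations


class AutoSquashError(Exception):
    """Hypothetical base class for errors the tool reports itself."""


class UserError(AutoSquashError):
    """Invalid repository state, detached HEAD, and similar."""


class GitError(AutoSquashError):
    """A Git command failed or reported a conflict."""

    def __init__(self, message: str, returncode: int | None = None,
                 stderr: str = "") -> None:
        super().__init__(message)
        self.returncode = returncode
        self.stderr = stderr


class SystemResourceError(AutoSquashError):
    """File I/O, permission, or resource problems."""


def exit_code_for(exc: BaseException) -> int:
    """Map an exception to a process exit code (assumed values)."""
    if isinstance(exc, KeyboardInterrupt):
        return 130  # conventional status for termination by SIGINT
    if isinstance(exc, UserError):
        return 2
    return 1
```

A shared base class lets the CLI layer catch `AutoSquashError` once and print a user-facing message, while anything else surfaces as an unexpected failure.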
3. Performance Optimizations¶
Caching Strategy:
- Commit metadata: Timestamps and summaries cached to avoid repeated Git calls
- Branch commits: Expensive commit list operations cached per session
- Blame results: Reuse blame data across multiple hunk analyses
Resource Management:
- Subprocess timeouts: Prevent hanging on Git operations
- Temporary file cleanup: Automatic cleanup of patches and todo files
- Memory efficiency: Stream processing of large diffs when possible
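As a minimal sketch of per-session commit metadata caching, the class below memoizes timestamp lookups behind an injected fetch function. `CommitMetadataCache` and `fetch_timestamp` are hypothetical names; in practice the fetch function might shell out to something like `git show -s --format=%ct <hash>`, which is an assumption about the implementation.

```python
from __future__ import annotations

from typing import Callable


class CommitMetadataCache:
    """Hypothetical per-session cache for commit timestamps."""

    def __init__(self, fetch_timestamp: Callable[[str], int]) -> None:
        self._fetch_timestamp = fetch_timestamp
        self._timestamps: dict[str, int] = {}

    def timestamp(self, commit_hash: str) -> int:
        # Only the first lookup per commit reaches the fetch function.
        if commit_hash not in self._timestamps:
            self._timestamps[commit_hash] = self._fetch_timestamp(commit_hash)
        return self._timestamps[commit_hash]
```

Because commit objects are immutable, a cache like this never needs invalidation for a given hash within a session.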
4. Testing Architecture¶
Test Categories:
- Unit Tests: Individual component functionality with mocking
- Integration Tests: Component interaction with real Git repositories
- TUI Tests: User interface behavior without DOM dependencies
- End-to-End Tests: Complete workflow simulation
Test Infrastructure:
- Mocking Strategy: Mock Git operations for reliable, fast tests
- Test Data: Structured test repositories and diff scenarios
- Edge Case Coverage: Boundary conditions and error scenarios
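A unit test in the mocking style described above might look like the following sketch. The function under test, `describe_branch`, is invented for this example; only `get_current_branch` and its `str | None` return type come from the GitOps interface shown earlier.

```python
from unittest.mock import Mock


def describe_branch(git_ops) -> str:
    """Hypothetical function under test; depends only on get_current_branch()."""
    branch = git_ops.get_current_branch()
    if branch is None:
        return "detached HEAD"
    return f"on branch {branch}"


def test_detached_head_is_reported():
    # No real repository is needed: the Git layer is replaced by a mock.
    git_ops = Mock(spec=["get_current_branch"])
    git_ops.get_current_branch.return_value = None
    assert describe_branch(git_ops) == "detached HEAD"
    git_ops.get_current_branch.assert_called_once_with()


def test_branch_name_is_reported():
    git_ops = Mock(spec=["get_current_branch"])
    git_ops.get_current_branch.return_value = "feature/login"
    assert describe_branch(git_ops) == "on branch feature/login"
```

Passing `spec=[...]` makes the mock reject attribute access outside the listed interface, which catches typos in method names.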
Configuration and Extensibility¶
Future Extension Points¶
- Configuration System:
  - User preferences for approval defaults
  - Custom confidence thresholds
  - Blame analysis parameters
- Plugin Architecture:
  - Custom hunk filtering rules
  - Alternative conflict resolution strategies
  - Integration with external tools
- Output Formats:
  - JSON output for tooling integration
  - Structured logging for automation
  - Custom report generation
Security Considerations¶
Git Command Safety¶
- Command Injection Prevention: All Git arguments properly escaped
- Repository Validation: Verify repository integrity before operations
- Branch Protection: Only operate on feature branches with clear merge-base
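As one illustration of injection-safe argument handling, the sketch below builds a `git blame` argument list without a shell, rejects revisions that could be parsed as options, and places the path after `--`. The helper name `build_blame_args` and its validation rules are assumptions for this example, not the project's code.

```python
def build_blame_args(file_path: str, start: int, count: int,
                     rev: str = "HEAD") -> list[str]:
    """Build arguments for `git blame` over lines start..start+count-1."""
    if rev.startswith("-"):
        # A revision beginning with "-" would be interpreted as an option.
        raise ValueError(f"suspicious revision: {rev!r}")
    if start < 1 or count < 1:
        raise ValueError("start and count must be positive")
    end = start + count - 1
    # "--" ends option parsing, so an unusual file name cannot act as a flag.
    return ["blame", "--porcelain", "-L", f"{start},{end}", rev, "--", file_path]
```

Because the result is a list passed directly to the subprocess layer, no quoting or escaping is needed; the arguments reach Git exactly as given.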
Data Integrity¶
- Atomic Operations: Rebase operations are atomic where possible
- Backup Strategy: Automatic stashing preserves user work
- Rollback Capability: Complete restoration on failure or cancellation
User Safety¶
- Default Deny: All operations require explicit user approval
- Clear Feedback: Detailed progress and error reporting
- Escape Mechanisms: Multiple ways to safely abort operations
This architecture provides a robust, maintainable foundation for git-autosquash while supporting future enhancements and ensuring user safety throughout the workflow.