GitHub Repository Mapper

                 ██████████
                █▓       ░██
                █▒        ██ T H E   P I N A C L E  O F  H A K C I N G   Q U A L I T Y
    █████████████░        █████████████████ ████████████ ████████████      ████████████
   ██         ███░        ███▓▒▒▒▒▒▒▒▒▒▒▒██ █▒▒▒▒▒▒▒▒▓████        █████████▓          ▒█
   ██         ███         ███▒▒▒▒▒▒▒▒▒▒▒▒▓██████████████▓        ███▓▒      ▒▓░       ▒█
   ██         ███        ░██▓▒▒▒▒▒▒▒▒▒▒▒▒▒▓██▓▒▒▒▒▒▒▒▒█▓        ███░       ░██░       ▒█
   ██         ███        ▒██▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒██▓▒▒▒▒▒▒▒▓▒        ██  ▓        ██░       ▓█
   ██         ██▓        ███▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒█▓▒▒▒▒▒▒▒▓▒       ██   █        ██░       ▓
   ██         ██▒        ██▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓▓▒▒▒▒▒▒▒▓▒      ██    █        ▓█████████
   ██                    ██▒▒▒▒▒▒▒▒█▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓▒   ▒███████ █░       ░▓        █
   ██         ░░         ██▒▒▒▒▒▒▒▒██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓█ ▓        ░█ ▓       ░▒       ░█
   ██         ██░       ░█▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓█ █░        ▒ █                ░█
   ██         ██        ▓█▒▒▒▒▒▒▒▒▒██▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓█ █░        ▒ █░               ▒█
    ██████████  ███████████▓██▓▓█▓█  █▓▒▒▒▒▒▒▒▒▒▓██▓██   █▓▓▓▓▓▓▓█    █▓▓▓▓▓▓▓▓▓▓▓▓▓▓██
  .:/====================█▓██▓██=========████▓█▓█ ███======> [ P R E S E N T S ] ====\:.
                           ██▓██           █▓▓▓██ ██  ▄▄▄▄ ▄▄ ▄▄▄▄▄▄ ██▄  ▄██ ▄████▄ █████▄
                             █▓█             ██▓██   ██ ▄▄ ██   ██   ██ ▀▀ ██ ██▄▄██ ██▄▄█▀
                              ██               ███   ▀███▀ ██   ██   ██    ██ ██  ██ ██

Automatically discover, analyze, and categorize all your GitHub repositories by cybersecurity domain and purpose. Generates an evergreen mind map that updates weekly.

Overview

Features

Automatic Discovery: Fetches all repositories from your personal account and organizations
AI-Powered Categorization: Uses Claude AI to analyze README files and categorize repos
Cybersecurity Focus: 14 specialized cybersecurity categories plus 4 general categories
Multiple Output Formats: XMind JSON, Markdown tree view, and statistical reports
Intelligent Caching: Only re-analyzes repos when needed (configurable cache period)
Weekly Automation: GitHub Actions workflow for automatic weekly updates
Evergreen: Scheduled runs keep the repository map up to date

Categories

The system intelligently categorizes your repositories into specialized domains:

---
config:
  theme: base
  themeVariables:
    primaryColor: '#fff'
    primaryTextColor: '#1e40af'
    primaryBorderColor: '#cbd5e1'
    lineColor: '#94a3b8'
    fontSize: 10px
    fontFamily: Lato
  layout: elk
  look: neo
  elk:
    mergeEdges: true
    nodePlacementStrategy: SIMPLE
  flowchart:
    nodeSpacing: 10
    rankSpacing: 15
    curve: linear
    htmlLabels: true
    padding: 8
---
flowchart TB
    subgraph core["Core Security Domains"]
        direction LR
        subgraph row1[" "]
            direction LR
            subgraph c1["Risk Mgmt"]
                direction TB
                C1A(["Lines of Defense"])
                C1B(["Risk Treatment"])
                C1C(["BCP/DR"])
                C1D(["Cyber Insurance"])
            end
            subgraph c2["Governance"]
                direction TB
                C2A(["Laws & Regs"])
                C2B(["Policies"])
                C2C(["Standards"])
                C2D(["Audit"])
            end
            subgraph c3["Architecture"]
                direction TB
                C3A(["Network"])
                C3B(["Crypto"])
                C3C(["Access Control"])
                C3D(["Cloud"])
            end
        end
        subgraph row2[" "]
            direction LR
            subgraph c4["Operations"]
                direction TB
                C4A(["SOC"])
                C4B(["IR"])
                C4C(["SIEM/SOAR"])
                C4D(["Threat Hunt"])
            end
            subgraph c5["AppSec"]
                direction TB
                C5A(["S-SDLC"])
                C5B(["SAST/DAST"])
                C5C(["API"])
                C5D(["Code Scan"])
            end
            subgraph c6["Assessment"]
                direction TB
                C6A(["Pen Test"])
                C6B(["Vuln Scan"])
                C6C(["3rd Party"])
                C6D(["Assets"])
            end
        end
    end

    subgraph extended[" "]
        direction LR
        subgraph intel["Threat Intel"]
            direction TB
            I1(["IOCs"])
            I2(["OSINT"])
            I3(["Malware"])
        end
        subgraph hack["Hacking"]
            direction TB
            H1(["Exploits"])
            H2(["Red Team"])
            H3(["Social Eng"])
        end
        subgraph ai["AI & ML"]
            direction TB
            A1(["LLMs"])
            A2(["AI Security"])
            A3(["ML Models"])
        end
        subgraph research["Research"]
            direction TB
            R1(["Papers"])
            R2(["PoC Code"])
            R3(["Datasets"])
        end
    end

    subgraph growth[" "]
        direction LR
        subgraph edu["Education"]
            direction TB
            E1(["Awareness"])
            E2(["CTF"])
            E3(["Certs"])
        end
        subgraph dev["Development"]
            direction TB
            D1(["Web Dev"])
            D2(["DevOps"])
            D3(["Automation"])
        end
        subgraph proj["Projects"]
            direction TB
            P1(["Personal"])
            P2(["Hardware"])
            P3(["IoT"])
        end
    end

    C4D -.-> I1
    C4B -.-> I3
    C6A -.-> H1
    I3 -.-> A2
    H1 -.-> R2
    E2 -.-> H1
    D2 -.-> C5A
    P1 -.-> R2

    style row1 fill:none,stroke:none
    style row2 fill:none,stroke:none
    style extended fill:none,stroke:none
    style growth fill:none,stroke:none

    style c1 fill:#f0f9ff,stroke:#0284c7,rx:8,ry:8
    style c2 fill:#f0f9ff,stroke:#0284c7,rx:8,ry:8
    style c3 fill:#f0f9ff,stroke:#0284c7,rx:8,ry:8
    style c4 fill:#f0f9ff,stroke:#0284c7,rx:8,ry:8
    style c5 fill:#f0f9ff,stroke:#0284c7,rx:8,ry:8
    style c6 fill:#f0f9ff,stroke:#0284c7,rx:8,ry:8

    style C1A fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C1B fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C1C fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C1D fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C2A fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C2B fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C2C fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C2D fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C3A fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C3B fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C3C fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C3D fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C4A fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C4B fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C4C fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C4D fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C5A fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C5B fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C5C fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C5D fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C6A fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C6B fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C6C fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C6D fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5

    style intel fill:#faf5ff,stroke:#7c3aed,rx:8,ry:8
    style hack fill:#fdf2f8,stroke:#db2777,rx:8,ry:8
    style ai fill:#ecfdf5,stroke:#059669,rx:8,ry:8
    style research fill:#e0f7fa,stroke:#0891b2,rx:8,ry:8
    style edu fill:#fffbeb,stroke:#d97706,rx:8,ry:8
    style dev fill:#fff7ed,stroke:#ea580c,rx:8,ry:8
    style proj fill:#f5f5f4,stroke:#78716c,rx:8,ry:8

    style I1 fill:#f3e8ff,stroke:#7c3aed,rx:5,ry:5
    style I2 fill:#f3e8ff,stroke:#7c3aed,rx:5,ry:5
    style I3 fill:#f3e8ff,stroke:#7c3aed,rx:5,ry:5
    style H1 fill:#fce8f4,stroke:#db2777,rx:5,ry:5
    style H2 fill:#fce8f4,stroke:#db2777,rx:5,ry:5
    style H3 fill:#fce8f4,stroke:#db2777,rx:5,ry:5
    style A1 fill:#d1fae5,stroke:#059669,rx:5,ry:5
    style A2 fill:#d1fae5,stroke:#059669,rx:5,ry:5
    style A3 fill:#d1fae5,stroke:#059669,rx:5,ry:5
    style R1 fill:#cffafe,stroke:#0891b2,rx:5,ry:5
    style R2 fill:#cffafe,stroke:#0891b2,rx:5,ry:5
    style R3 fill:#cffafe,stroke:#0891b2,rx:5,ry:5
    style E1 fill:#fef3c7,stroke:#d97706,rx:5,ry:5
    style E2 fill:#fef3c7,stroke:#d97706,rx:5,ry:5
    style E3 fill:#fef3c7,stroke:#d97706,rx:5,ry:5
    style D1 fill:#ffedd5,stroke:#ea580c,rx:5,ry:5
    style D2 fill:#ffedd5,stroke:#ea580c,rx:5,ry:5
    style D3 fill:#ffedd5,stroke:#ea580c,rx:5,ry:5
    style P1 fill:#e7e5e4,stroke:#78716c,rx:5,ry:5
    style P2 fill:#e7e5e4,stroke:#78716c,rx:5,ry:5
    style P3 fill:#e7e5e4,stroke:#78716c,rx:5,ry:5

    style core fill:#f8fafc,stroke:#0284c7,stroke-width:2px,rx:10,ry:10

    linkStyle 0 stroke:#0284c7,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 1 stroke:#0284c7,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 2 stroke:#0284c7,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 3 stroke:#7c3aed,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 4 stroke:#db2777,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 5 stroke:#d97706,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 6 stroke:#ea580c,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 7 stroke:#78716c,stroke-width:1px,stroke-dasharray:4,fill:none

Cybersecurity Categories (14)

Offensive Security - Penetration Testing, Exploit Development, Red Teaming
Defensive Security - Blue Teaming, Incident Response, Security Monitoring
Vulnerability Management - Scanning, Assessment, Bug Bounty
Application Security - SAST/DAST, Code Review, API Security
Network Security - Firewall, IDS/IPS, Network Monitoring
Cloud Security - AWS, Azure, GCP, Container Security
Identity & Access Management - Authentication, Authorization, SSO, PAM
Cryptography - Encryption, PKI, Hashing, Key Management
Malware Analysis - Reverse Engineering, Forensics, Sandboxing
Security Operations - SIEM, SOAR, Log Management
Compliance & Governance - GRC, Policy, Audit, Risk Management
Data Security - DLP, Privacy, Database Security
Security Tools & Utilities - Automation, Frameworks, Libraries
Education & Training - Tutorials, CTF, Documentation

General Categories (4)

General Development - Web, Backend, Frontend, DevOps
Data Science & Analytics - ML, Data Analysis, Visualization
Infrastructure & Systems - SysAdmin, Monitoring, Deployment
Other/Miscellaneous - Personal Projects, Experiments

Installation

Prerequisites

GitHub CLI (gh): install from https://cli.github.com/
Python 3.8+
Anthropic API key (created in the Anthropic Console)

Quick Setup

sequenceDiagram
    participant User
    participant Installer
    participant Venv
    participant Dependencies

    User->>Installer: ./installer.sh
    Installer->>Venv: Create virtual environment
    Venv-->>Installer: ✓ venv created
    Installer->>Dependencies: Install anthropic + hakcer
    Dependencies-->>Installer: ✓ packages installed
    Installer->>User: ✓ Setup complete!
    Note over User: export ANTHROPIC_API_KEY
    User->>User: ./run.sh

Setup Steps

Clone the repository

git clone https://github.com/haKC-ai/RepoMapper.git
cd RepoMapper

Run the installer

./installer.sh

This will:

Create a Python virtual environment
Install all dependencies (anthropic, hakcer)
Set up directory structure
Make scripts executable

Authenticate with GitHub CLI (if not already done)

gh auth login

Set your Anthropic API key

export ANTHROPIC_API_KEY='your-api-key-here'

Or add it permanently to your shell profile:

echo 'export ANTHROPIC_API_KEY="your-api-key-here"' >> ~/.bashrc
source ~/.bashrc

Usage

Organization Selection (Optional)

Before running the mapper, you can interactively select which organizations to include:

source venv/bin/activate
python3 select_orgs.py

This launches an interactive checkbox interface where:

All organizations are INCLUDED by default
Enter a number to toggle an organization on/off
Enter 'all' to select everything
Enter 'none' to deselect everything
Enter 'done' when finished

Your selections are saved to config.yaml and will be respected by future runs.
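
The toggle behaviour described above can be sketched as a small pure function. This is an illustrative sketch only: `apply_command` and the organization names are hypothetical and are not taken from select_orgs.py, and the 'done' command is assumed to be handled by the calling loop.

```python
from typing import List, Set


def apply_command(command: str, orgs: List[str], selected: Set[str]) -> Set[str]:
    """Return the new selection after one user command (hypothetical helper)."""
    command = command.strip().lower()
    if command == "all":
        return set(orgs)
    if command == "none":
        return set()
    if command.isdecimal():
        index = int(command) - 1  # menu entries are assumed to be numbered from 1
        if 0 <= index < len(orgs):
            # Symmetric difference toggles membership of a single organization.
            return selected ^ {orgs[index]}
    return set(selected)  # unknown input leaves the selection unchanged


orgs = ["org-a", "org-b", "org-c"]  # placeholder names
selected = set(orgs)                # everything is included by default
selected = apply_command("2", orgs, selected)
print(sorted(selected))  # ['org-a', 'org-c']
```

Keeping the command handling free of input/output makes the selection logic easy to test separately from the interactive prompt.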

Quick Start

Run all three steps in sequence:

# Optional: Select which organizations to include
python3 select_orgs.py

# Step 1: Fetch all repositories and READMEs
chmod +x repo_mapper.sh
./repo_mapper.sh

# Step 2: Analyze and categorize repositories
python3 analyze_repos.py

# Step 3: Generate visualizations
python3 generate_xmind.py
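
If you prefer to drive the three steps from Python, a wrapper along these lines stops at the first failing step. It is a sketch, not the project's run.sh: `run_pipeline` and `PIPELINE` are hypothetical names, and the commands are assumed to be run from the repository root with the virtual environment active.

```python
import subprocess
from typing import Callable, List, Sequence

# The three commands from the Quick Start, in order.
PIPELINE: List[List[str]] = [
    ["./repo_mapper.sh"],
    ["python3", "analyze_repos.py"],
    ["python3", "generate_xmind.py"],
]


def run_pipeline(
    steps: Sequence[Sequence[str]] = PIPELINE,
    runner: Callable[..., object] = subprocess.run,
) -> int:
    """Run each step in order and return the number of completed steps."""
    completed = 0
    for step in steps:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later steps never run against stale or missing data.
        runner(list(step), check=True)
        completed += 1
    return completed
```

Calling `run_pipeline()` executes the real scripts; passing a different `runner` lets you dry-run the sequence.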

Viewing Results

After running the scripts, you'll find the following in the output/ directory:

repo_map.xmind.json - Import this into XMind for interactive mind mapping
repo_map.md - Markdown tree view (readable in any text editor)
repo_statistics.json - Statistical breakdown of your repositories

Importing into XMind

Open XMind (8 or later)
Go to File → Import
Select output/repo_map.xmind.json
Explore your repository map!

Alternative mind mapping tools that support JSON import:

MindMeister
Coggle
MindNode (with conversion)

Automation

GitHub Actions (Recommended)

For evergreen weekly updates:

Push this repository to GitHub

git init
git add .
git commit -m "Initial commit: Repository mapper"
gh repo create my-repo-map --private --source=. --push

Add the Anthropic API key as a repository secret

gh secret set ANTHROPIC_API_KEY
# Paste your API key when prompted

Enable GitHub Actions

The workflow is already configured in .github/workflows/weekly-update.yml and will:

Run every Monday at 9 AM UTC
Fetch latest repository data
Analyze new/updated repositories
Generate fresh visualizations
Commit updates automatically
Upload artifacts for download
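
To see when the next scheduled run is due, the "every Monday at 9 AM UTC" rule can be computed directly. The sketch below assumes the workflow's schedule is equivalent to the cron expression `0 9 * * 1`; `next_monday_9_utc` is a hypothetical helper and does not read the workflow file.

```python
from datetime import datetime, timedelta, timezone


def next_monday_9_utc(now: datetime) -> datetime:
    """Return the first Monday 09:00 UTC strictly after `now`."""
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=9, minute=0, second=0, microsecond=0)
    # datetime.weekday() returns 0 for Monday.
    days_ahead = (0 - candidate.weekday()) % 7
    candidate += timedelta(days=days_ahead)
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)  # a Wednesday
print(next_monday_9_utc(now).isoformat())  # 2025-01-06T09:00:00+00:00
```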

Manual Trigger

You can also trigger the workflow manually:

gh workflow run "Weekly Repository Map Update"

Local Cron Job

Alternatively, set up a local cron job:

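# Edit crontab
crontab -e

# cron does not load your shell profile, so define the key in the crontab itself
ANTHROPIC_API_KEY=your-api-key-here

# Add this line to run every Monday at 9 AM (system local time)
0 9 * * 1 cd /path/to/repo_map && . venv/bin/activate && ./repo_mapper.sh && python3 analyze_repos.py && python3 generate_xmind.py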

Configuration

Edit config.yaml to customize:

Cache duration: How long before re-analyzing repos
Model selection: Choose between Claude models
Repository filters: Include/exclude archived, forked, or private repos
Custom categories: Add your own categorization schemes
Output formats: Enable/disable different output types
Notifications: Slack, Discord, or email alerts
Schedule: Change the automation frequency

Data Storage

repo_map/
├── data/                      # Repository metadata
│   ├── orgs.json             # Your organizations
│   ├── all_repos.json        # All discovered repositories
│   └── cache/                # README files (cached)
│       └── owner_repo_README.md
├── output/                    # Generated outputs
│   ├── repo_map.xmind.json   # XMind mind map
│   ├── repo_map.md           # Markdown tree
│   ├── repo_statistics.json  # Statistics
│   └── repo_analysis.json    # Full analysis data
└── .github/workflows/         # Automation
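
The cache file name shown in the tree (owner_repo_README.md) suggests a simple mapping from a repository's full name to a path. The following sketch assumes that naming scheme; `readme_cache_path` is a hypothetical helper, not a function from the project.

```python
from pathlib import Path

CACHE_DIR = Path("data") / "cache"


def readme_cache_path(full_name: str, cache_dir: Path = CACHE_DIR) -> Path:
    """Map 'owner/repo' to data/cache/owner_repo_README.md (assumed naming)."""
    owner, sep, repo = full_name.partition("/")
    if not sep or not owner or not repo or "/" in repo:
        raise ValueError(f"expected 'owner/repo', got {full_name!r}")
    return cache_dir / f"{owner}_{repo}_README.md"


print(readme_cache_path("haKC-ai/RepoMapper").as_posix())
# data/cache/haKC-ai_RepoMapper_README.md
```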

How It Works

graph LR
    subgraph "Phase 1: Discovery"
        A[GitHub API] -->|gh CLI| B[Fetch Orgs]
        B --> C[Fetch Repos]
        C --> D[Download READMEs]
        D --> E[Cache Files]
    end

    subgraph "Phase 2: Analysis"
        E --> F[Load Metadata]
        F --> G[Claude AI]
        G -->|Categorize| H[Assign Category]
        H -->|Tag| I[Add Metadata]
        I --> J[Cache Analysis]
    end

    subgraph "Phase 3: Visualization"
        J --> K[Build Hierarchy]
        K --> L[XMind JSON]
        K --> M[Markdown Tree]
        K --> N[Statistics]
    end

    style G fill:#ff9900,stroke:#ff6600,stroke-width:2px
    style L fill:#00cc88,stroke:#009966,stroke-width:2px
    style M fill:#00cc88,stroke:#009966,stroke-width:2px
    style N fill:#00cc88,stroke:#009966,stroke-width:2px

Phase Breakdown

1. Discovery Phase (repo_mapper.sh)

Fetches all your GitHub organizations using gh api user/orgs
Lists all repositories from each org and your personal account
Downloads README files for each repository
Caches READMEs to avoid redundant fetches (7-day cache)
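
A 7-day cache check of this kind usually compares a file's modification time with the current time. The sketch below shows one way to do that in Python; it is an assumption about the approach, not the logic in repo_mapper.sh, and `is_fresh` is a hypothetical name.

```python
import tempfile
import time
from pathlib import Path
from typing import Optional

CACHE_TTL_SECONDS = 7 * 24 * 60 * 60  # the 7-day cache period


def is_fresh(path: Path, ttl: float = CACHE_TTL_SECONDS, now: Optional[float] = None) -> bool:
    """True if `path` exists and was modified less than `ttl` seconds ago."""
    if not path.is_file():
        return False
    now = time.time() if now is None else now
    return (now - path.stat().st_mtime) < ttl


with tempfile.TemporaryDirectory() as tmp:
    readme = Path(tmp) / "owner_repo_README.md"
    readme.write_text("# demo\n", encoding="utf-8")
    print(is_fresh(readme))  # True: the file was just written
    eight_days_later = time.time() + 8 * 24 * 60 * 60
    print(is_fresh(readme, now=eight_days_later))  # False: older than 7 days
```

A README is fetched again only when `is_fresh` returns False for its cache file.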

2. Analysis Phase (analyze_repos.py)

Loads repository metadata and README content
Sends each repo to Claude AI for intelligent categorization
Claude analyzes the purpose, technology stack, and domain
Assigns primary category, subcategory, confidence level, and tags
Caches analysis results (re-analyzes after 7 days)
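
The four fields named above (primary category, subcategory, confidence level, tags) map naturally onto a small record type. The sketch below validates a JSON reply against that shape. The field names, the fallback behaviour and `parse_analysis` are assumptions made for illustration; the actual schema used by analyze_repos.py may differ.

```python
import json
from dataclasses import dataclass, field
from typing import List

# A few of the category names listed in this README; the full set has 18.
KNOWN_CATEGORIES = {
    "Offensive Security",
    "Defensive Security",
    "Cloud Security",
    "Other/Miscellaneous",
}
FALLBACK_CATEGORY = "Other/Miscellaneous"


@dataclass
class RepoAnalysis:
    primary_category: str
    subcategory: str
    confidence: str
    tags: List[str] = field(default_factory=list)


def parse_analysis(reply_text: str) -> RepoAnalysis:
    """Parse a JSON reply, falling back when the category is unknown."""
    data = json.loads(reply_text)
    category = data.get("primary_category", FALLBACK_CATEGORY)
    if category not in KNOWN_CATEGORIES:
        category = FALLBACK_CATEGORY
    return RepoAnalysis(
        primary_category=category,
        subcategory=str(data.get("subcategory", "")),
        confidence=str(data.get("confidence", "low")),
        tags=[str(tag) for tag in data.get("tags", [])],
    )


reply = (
    '{"primary_category": "Cloud Security", "subcategory": "Container Security",'
    ' "confidence": "high", "tags": ["docker", "k8s"]}'
)
print(parse_analysis(reply))
```

Validating the reply before caching it keeps an unexpected category name from creating a stray branch in the generated map.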

3. Visualization Phase (generate_xmind.py)

Organizes repositories into hierarchical structure
Generates XMind-compatible JSON format
Creates Markdown tree view for text-based browsing
Compiles statistics (by category, language, relevance)
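
Building the hierarchy amounts to grouping repositories by category and subcategory, then walking the groups to emit each output format. The sketch below does this for the Markdown tree only. `build_hierarchy`, `to_markdown` and the sample rows are hypothetical; the layout produced by generate_xmind.py may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Hierarchy = Dict[str, Dict[str, List[str]]]


def build_hierarchy(rows: Iterable[Tuple[str, str, str]]) -> Hierarchy:
    """Group (category, subcategory, repo) rows into a nested mapping."""
    tree: Hierarchy = defaultdict(lambda: defaultdict(list))
    for category, subcategory, repo in rows:
        tree[category][subcategory].append(repo)
    return tree


def to_markdown(tree: Hierarchy) -> str:
    """Render the hierarchy as a nested bullet list, sorted by name."""
    lines: List[str] = []
    for category in sorted(tree):
        lines.append(f"- {category}")
        for subcategory in sorted(tree[category]):
            lines.append(f"  - {subcategory}")
            for repo in sorted(tree[category][subcategory]):
                lines.append(f"    - {repo}")
    return "\n".join(lines)


rows = [  # placeholder data
    ("Cloud Security", "Container Security", "me/k8s-audit"),
    ("Offensive Security", "Red Teaming", "me/c2-lab"),
    ("Cloud Security", "Container Security", "me/docker-scan"),
]
print(to_markdown(build_hierarchy(rows)))
```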

Advanced Usage

Filter Repositories

Edit config.yaml to exclude certain repos:

filters:
  exclude_patterns:
    - "^test-.*"        # Exclude repos starting with "test-"
    - ".*-backup$"      # Exclude backup repos
    - "^archive-.*"     # Exclude archived projects
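
To check what a set of patterns would exclude before running the mapper, you can try them against a list of names. This sketch assumes the patterns are applied to the bare repository name using Python regular-expression semantics; `filter_repos` is a hypothetical helper.

```python
import re
from typing import Iterable, List

EXCLUDE_PATTERNS = [r"^test-.*", r".*-backup$", r"^archive-.*"]


def filter_repos(names: Iterable[str], patterns: Iterable[str] = EXCLUDE_PATTERNS) -> List[str]:
    """Return the names that match none of the exclude patterns."""
    compiled = [re.compile(pattern) for pattern in patterns]
    # re.search honours the ^ and $ anchors written in each pattern.
    return [name for name in names if not any(rx.search(name) for rx in compiled)]


print(filter_repos(["test-api", "scanner", "db-backup", "archive-2019", "contest-notes"]))
# ['scanner', 'contest-notes']
```

Note that "contest-notes" is kept: the `^` anchor restricts the first pattern to names that start with "test-".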

Custom Categories

Add domain-specific categories in config.yaml:

categories:
  custom_categories:
    - key: "blockchain_security"
      name: "Blockchain & Web3 Security"
      subcategories:
        - "Smart Contract Auditing"
        - "DeFi Security"
        - "Wallet Security"

Change AI Model

Use a more powerful model for better analysis:

analysis:
  model: "claude-opus-4-5-20251101"  # More accurate but slower/costlier

Schedule Frequency

Modify the cron expression in .github/workflows/weekly-update.yml:

schedule:
  - cron: '0 9 * * *'  # Daily at 9 AM instead of weekly

Cost Estimates

Using Claude Sonnet 4.5:

Per repository: ~1,000 tokens input + 500 tokens output
100 repositories: ~$0.40 USD per full analysis
Weekly updates: Minimal cost due to caching (only new/updated repos)
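
The estimate follows from token counts multiplied by per-token prices. The sketch below makes that arithmetic explicit so you can plug in the prices that currently apply to your chosen model. `estimate_cost_usd` is a hypothetical helper, and the prices in the example call are placeholders, not a quote of actual pricing.

```python
def estimate_cost_usd(
    repo_count: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
    input_tokens_per_repo: int = 1_000,
    output_tokens_per_repo: int = 500,
) -> float:
    """Estimate the cost in USD of one full analysis pass."""
    input_cost = repo_count * input_tokens_per_repo * input_price_per_mtok / 1_000_000
    output_cost = repo_count * output_tokens_per_repo * output_price_per_mtok / 1_000_000
    return input_cost + output_cost


# Placeholder prices (USD per million tokens), for illustration only.
print(round(estimate_cost_usd(100, input_price_per_mtok=1.0, output_price_per_mtok=2.0), 4))  # 0.2
```

With caching, `repo_count` for a weekly run is only the number of new or updated repositories, which is why recurring costs stay low.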

Troubleshooting

"ANTHROPIC_API_KEY not set"

export ANTHROPIC_API_KEY='your-key-here'

"gh: command not found"

Install GitHub CLI: https://cli.github.com/

Rate Limiting

Increase the delay in config.yaml:

analysis:
  rate_limit_delay: 1.0  # seconds between requests
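
A fixed delay between requests can be implemented as a pause before every call except the first. The sketch below illustrates the idea; `throttled_map` is a hypothetical helper and is not how analyze_repos.py is known to apply `rate_limit_delay`.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def throttled_map(
    func: Callable[[T], R],
    items: Iterable[T],
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Call func on each item, pausing `delay` seconds between consecutive calls."""
    results: List[R] = []
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay)  # no pause before the first request
        results.append(func(item))
    return results
```

For n repositories this adds (n - 1) × delay seconds to a run, so raising the delay trades total run time for fewer rate-limit errors.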

No repositories found

Make sure you're authenticated:

gh auth status
gh auth login

XMind import fails

Try the Markdown output instead: output/repo_map.md

Privacy & Security

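All scripts run locally or in your own GitHub Actions workflow
Repository metadata and README content are sent to the Anthropic Claude API for categorization; no other third party receives data
README content is cached locally
Private repositories are included by default (configurable), so their READMEs are also sent to the API unless you exclude them
API keys are read from the ANTHROPIC_API_KEY environment variable locally and stored as a GitHub secret for automated runs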

Contributing

Suggestions for improvements:

Additional cybersecurity categories
Alternative visualization formats
Integration with other tools
Enhanced analysis prompts

License

MIT License - Feel free to modify and distribute

Support

For issues or questions:

Check the troubleshooting section
Review GitHub Actions logs for automated runs
Examine output/repo_analysis.json for detailed results

Built with:

GitHub CLI (gh)
Claude AI by Anthropic
Python 3
GitHub Actions
