Adversaries who gain access to a GitHub repository may clone it to exfiltrate source code, embedded credentials, or other sensitive data (MITRE ATT&CK T1213, Data from Information Repositories). SOC teams should proactively hunt for unusual clone patterns in Azure Sentinel to detect potential data exfiltration and early-stage compromise.
KQL Query
let min_t = toscalar(GitHubRepo
| summarize min(timestamp_t));
let max_t = toscalar(GitHubRepo
| summarize max(timestamp_t));
GitHubRepo
| where Action == "Clones"
| distinct TimeGenerated, Repository, Count
| make-series num=sum(tolong(Count)) default=0 on TimeGenerated in range(min_t, max_t, 1h) by Repository
| extend (anomalies, score, baseline) = series_decompose_anomalies(num, 1.5, -1, 'linefit')
| render timechart
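The chart shows where spikes occur, but a table of the flagged points is often easier to triage. The following is a minimal sketch, not part of the original rule: it reuses the same series and the same `series_decompose_anomalies` parameters, then expands the arrays back into rows and keeps only positive anomalies. The projected column set and the sort order are illustrative choices.

```kql
// Tabular variant of the hunting query: list anomalous hours instead of charting them.
let min_t = toscalar(GitHubRepo
| summarize min(timestamp_t));
let max_t = toscalar(GitHubRepo
| summarize max(timestamp_t));
GitHubRepo
| where Action == "Clones"
| distinct TimeGenerated, Repository, Count
| make-series num=sum(tolong(Count)) default=0 on TimeGenerated in range(min_t, max_t, 1h) by Repository
| extend (anomalies, score, baseline) = series_decompose_anomalies(num, 1.5, -1, 'linefit')
// Turn the per-repository arrays back into one row per hourly bin.
| mv-expand TimeGenerated to typeof(datetime), num to typeof(long), anomalies to typeof(long), score to typeof(double), baseline to typeof(double)
// anomalies is +1 for a positive anomaly, -1 for a negative one, 0 otherwise.
| where anomalies == 1
| project TimeGenerated, Repository, num, baseline, score
| order by score desc
```

Raising the threshold argument (1.5 here) should flag fewer points; the value that suits a given environment is something to tune against its own clone history.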
id: ccef3c74-4b4f-445b-8109-06d38687e4a4
name: GitHub Repo Clone - Time Series Anomaly
description: |
  'An attacker can exfiltrate data from your GitHub repository by cloning it. This hunting query tracks clone activity for each repository, allowing quick identification of anomalous or excessive clones so you can investigate repo access and permissions.'
description_detailed: |
  'An attacker can exfiltrate data from your GitHub repository after gaining access to it by performing a clone action. This hunting query allows you to track the clone activity for each of your repositories. The visualization allows you to quickly identify anomalous or excessive clones and to further investigate repo access and permissions.'
requiredDataConnectors: []
tactics:
- Collection
relevantTechniques:
- T1213
query: |
  let min_t = toscalar(GitHubRepo
  | summarize min(timestamp_t));
  let max_t = toscalar(GitHubRepo
  | summarize max(timestamp_t));
  GitHubRepo
  | where Action == "Clones"
  | distinct TimeGenerated, Repository, Count
  | make-series num=sum(tolong(Count)) default=0 on TimeGenerated in range(min_t, max_t, 1h) by Repository
  | extend (anomalies, score, baseline) = series_decompose_anomalies(num, 1.5, -1, 'linefit')
  | render timechart
version: 1.0.1
metadata:
  source:
    kind: Community
  author:
    name: itay6588
  support:
    tier: Microsoft
  categories:
    domains: [ "Security - Threat Protection" ]
Scenario: Scheduled Backup Job Cloning a Repository
Description: A legitimate scheduled job (e.g., using cron or Jenkins) clones a GitHub repository as part of a backup process.
Filter/Exclusion: Exclude clones that occur during known backup windows or match specific job names (e.g., backup_script.sh).
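The columns the hunting query relies on (`TimeGenerated`, `Repository`, `Count`) do not identify the job or account behind a clone, so one way to approximate a backup-window exclusion is by time of day. The sketch below is an illustration under stated assumptions: `backup_start_hour` and `backup_end_hour` are hypothetical values, and it assumes `TimeGenerated` reflects when the clones actually happened, which should be checked against how the connector ingests data.

```kql
// Hypothetical backup window in UTC hours; replace with your own schedule.
let backup_start_hour = 2;
let backup_end_hour = 3;
let min_t = toscalar(GitHubRepo
| summarize min(timestamp_t));
let max_t = toscalar(GitHubRepo
| summarize max(timestamp_t));
GitHubRepo
| where Action == "Clones"
| distinct TimeGenerated, Repository, Count
| make-series num=sum(tolong(Count)) default=0 on TimeGenerated in range(min_t, max_t, 1h) by Repository
| extend (anomalies, score, baseline) = series_decompose_anomalies(num, 1.5, -1, 'linefit')
| mv-expand TimeGenerated to typeof(datetime), num to typeof(long), anomalies to typeof(long), score to typeof(double), baseline to typeof(double)
| where anomalies == 1
// Drop flagged points that fall inside the known backup window.
| where not(hourofday(TimeGenerated) between (backup_start_hour .. backup_end_hour))
| project TimeGenerated, Repository, num, baseline, score
```

The window is applied after anomaly scoring rather than before it, so the excluded hours are not turned into artificial zeros in the series.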
Scenario: CI/CD Pipeline Artifact Fetching
Description: A CI/CD tool like GitHub Actions, GitLab CI, or Jenkins clones a repository to fetch code for a build or deployment.
Filter/Exclusion: Exclude clones initiated by known CI/CD agents or with specific environment variables (e.g., GITHUB_ACTIONS=true).
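Where the clone data does not expose the initiating agent, a coarser option is to leave repositories that are known to be cloned heavily by pipelines out of the hunt. The following sketch assumes a hand-maintained allowlist; the repository names in `ci_repos` are hypothetical, and the exact format of the `Repository` value should be confirmed against your own data.

```kql
// Hypothetical allowlist of repositories cloned routinely by CI/CD pipelines.
let ci_repos = dynamic(["contoso/build-tools", "contoso/deploy-scripts"]);
let min_t = toscalar(GitHubRepo
| summarize min(timestamp_t));
let max_t = toscalar(GitHubRepo
| summarize max(timestamp_t));
GitHubRepo
| where Action == "Clones"
// Skip the allowlisted repositories before building the series.
| where Repository !in (ci_repos)
| distinct TimeGenerated, Repository, Count
| make-series num=sum(tolong(Count)) default=0 on TimeGenerated in range(min_t, max_t, 1h) by Repository
| extend (anomalies, score, baseline) = series_decompose_anomalies(num, 1.5, -1, 'linefit')
| render timechart
```

Excluding a repository entirely also hides genuine exfiltration from it, so the allowlist is best kept short and reviewed periodically.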
Scenario: Admin Task for Repository Sync
Description: A system administrator manually clones a repository to synchronize local development environments or for debugging purposes.
Filter/Exclusion: Exclude clones from known admin accounts or those that match specific command-line patterns (e.g., git clone https://github.com/... executed by root or admin user).
Scenario: Internal Tool for Code Sharing
Description: An internal tool or script (e.g., Jenkinsfile, Ansible, or Puppet) clones a GitHub repo to distribute code across multiple servers.
Filter/Exclusion: Exclude clones that originate from internal IP ranges or match known internal tooling patterns (e.g., git clone https://github.com/... executed by jenkins or ansible user).
Scenario: User-Driven Development Environment Sync
Description: A developer clones a repository to their local machine for development or testing, which may appear as a high number of clones.
Filter/Exclusion: Exclude clones from user accounts that have access to the repo and are known developers (e