git2doc

Convert a public GitHub repository into one Markdown document.

What it does

Accepts GitHub input as:
- owner/repo or owner/repo.git (shorthand)
- https://github.com/owner/repo
- https://github.com/owner/repo/tree/<branch-or-path-like-branch>
Resolves HEAD to the repository's default branch.
Fetches the repository tree and raw file content from GitHub.
Collects text files, with optional inclusion of dotfiles and data/log files.
Supports include/exclude path filtering (* and ** patterns).
Optionally detects likely binary files using sampled bytes and skips them.
Produces one Markdown output with:
- repository snapshot
- directory structure
- combined source blocks
- optional file-level TOC
- notes for skipped/failed files
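
As a rough illustration of the accepted input forms listed above, the sketch below normalizes each one to an owner, a repo, and an optional branch. The function name parseRepoInput and its return shape are hypothetical and are not taken from js/repository.js; the real parser may accept more cases (for example extra hosts) and validate more strictly.

```javascript
// Hypothetical sketch: normalize the documented input forms.
// Not the project's actual parser.
function parseRepoInput(input) {
  const trimmed = input.trim();

  // Full URL form: https://github.com/owner/repo[/tree/<branch>]
  if (/^https?:\/\//i.test(trimmed)) {
    const url = new URL(trimmed);
    if (url.hostname !== "github.com") {
      throw new Error(`Unsupported host: ${url.hostname}`);
    }
    const parts = url.pathname.split("/").filter(Boolean);
    if (parts.length < 2) {
      throw new Error("Expected /owner/repo in URL");
    }
    const [owner, rawRepo, marker, ...rest] = parts;
    // Everything after /tree/ is treated as the branch; it may contain slashes.
    const branch =
      marker === "tree" && rest.length > 0
        ? rest.map(decodeURIComponent).join("/")
        : null;
    return { owner, repo: rawRepo.replace(/\.git$/, ""), branch };
  }

  // Shorthand form: owner/repo or owner/repo.git
  const match = /^([^\/\s]+)\/([^\/\s]+?)(?:\.git)?$/.exec(trimmed);
  if (!match) {
    throw new Error(`Unrecognized repository input: ${input}`);
  }
  // A null branch stands for HEAD, i.e. the default branch.
  return { owner: match[1], repo: match[2], branch: null };
}
```

In this sketch a null branch means "use HEAD", which the tool then resolves to the default branch.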

Project layout

Frontend: index.html, styles.css, js/app.js
Core logic: js/github-api.js, js/repository.js, js/document-builder.js
UI rendering/helpers: js/ui.js, js/markdown.js, js/document-utils.js
Cloudflare Functions:
- functions/api.js (/api)
- functions/config.js (/config)
- functions/[owner]/[repo].js (path adapter)
CLI: cli.mjs (calls hosted API)

Run locally

npx wrangler pages dev . --port 8888

Then open http://localhost:8888.

Browser usage

Enter repository input.
Choose options:
- include TOC
- include dotfiles
- include .csv/.tsv/.log/.env
- maximum file size (or disable the limit)
Optional advanced settings:
- allow shorthand on/off
- strict preview mode
- detect binary-like files by byte sample
- include/exclude path patterns (comma/newline)
- temporary GitHub token
- GitHub timeout and max retries
Click Fetch & Build Markdown.
Copy or download output.
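
The "detect binary-like files by byte sample" option can be pictured with a heuristic like the one below. This is only a sketch: the function name looksBinary, the 512-byte sample size, and the 30% threshold are assumptions for illustration, not values read from the project's code.

```javascript
// Hypothetical heuristic; sample size and threshold are assumptions.
function looksBinary(bytes, sampleSize = 512, maxSuspiciousRatio = 0.3) {
  // Inspect only the leading bytes so large files stay cheap to check.
  const sample = bytes.subarray(0, sampleSize);
  if (sample.length === 0) return false; // empty files are treated as text

  let suspicious = 0;
  for (const byte of sample) {
    if (byte === 0) return true; // a NUL byte is a strong binary signal
    // Tab (9), LF (10), and CR (13) are normal in text files.
    const isAllowedControl = byte === 9 || byte === 10 || byte === 13;
    if (byte < 32 && !isAllowedControl) suspicious += 1;
  }
  return suspicious / sample.length > maxSuspiciousRatio;
}
```

A caller would pass the first chunk of a fetched file as a Uint8Array and skip the file when the function returns true.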

Status includes stage counters:

tree: textFiles/treeItems
queue: queued/totalTextFiles (+ skipped large/binary)
fetch: completed/total (+ ok/fail)
build: completed/total

API usage

Supported routes

curl "https://<your-pages-domain>/api?repo=owner/repo.git" -o repo.md
curl "https://<your-pages-domain>/api/owner/repo.git" -o repo.md
curl "https://<your-pages-domain>/owner/repo" -o repo.md
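
For scripted use, the query-string route above can also be composed in JavaScript. The sketch below assumes a helper named buildApiUrl (hypothetical) and a placeholder base URL; the parameter names it sends are the ones documented under "Query parameters".

```javascript
// Sketch: build a request URL for the /api route.
// baseUrl is a placeholder for your own Pages domain.
function buildApiUrl(baseUrl, repo, options = {}) {
  const url = new URL("/api", baseUrl);
  url.searchParams.set("repo", repo);
  for (const [key, value] of Object.entries(options)) {
    if (value === undefined || value === null) continue;
    // Arrays (e.g. includePaths) are sent as comma-separated lists.
    const text = Array.isArray(value) ? value.join(",") : String(value);
    url.searchParams.set(key, text);
  }
  return url.toString();
}
```

The resulting string can be passed to fetch, and the Markdown read with response.text().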

Query parameters

repo=<input> (required unless using path route)
branch=<name>
includeDotfiles=true|false
includeDataFiles=true|false
includeToc=true|false
maxKb=<number>
noMax=true
concurrency=<number>
includePaths=<pattern1,pattern2,...>
excludePaths=<pattern1,pattern2,...>
detectBinaryBySample=true|false
githubToken=<token>
githubTimeoutMs=<number>
githubMaxRetries=<number>
githubCacheTtlMs=<number>
githubCacheSWRMs=<number>
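
To make the includePaths/excludePaths patterns concrete, here is a minimal sketch of * and ** matching. It assumes that * matches within a single path segment, that ** matches across segments, and that an empty include list selects everything; the project's actual matcher may differ on these points. All function names are hypothetical.

```javascript
// Hypothetical sketch of "*" and "**" path matching.
function globToRegExp(pattern) {
  let source = "";
  for (let i = 0; i < pattern.length; i += 1) {
    const ch = pattern[i];
    if (ch !== "*") {
      // Escape regex metacharacters in literal parts of the pattern.
      source += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
    } else if (pattern[i + 1] === "*") {
      i += 1;
      if (pattern[i + 1] === "/") {
        i += 1;
        source += "(?:.*/)?"; // "**/" matches zero or more directories
      } else {
        source += ".*"; // "**" matches across "/" boundaries
      }
    } else {
      source += "[^/]*"; // "*" stays inside one path segment
    }
  }
  return new RegExp(`^${source}$`);
}

// Split a comma- or newline-separated pattern list.
function splitPatterns(value) {
  return value.split(/[,\n]/).map((s) => s.trim()).filter(Boolean);
}

// Exclude patterns win over include patterns in this sketch.
function isPathSelected(path, includePaths, excludePaths) {
  const included =
    includePaths.length === 0 ||
    includePaths.some((p) => globToRegExp(p).test(path));
  const excluded = excludePaths.some((p) => globToRegExp(p).test(path));
  return included && !excluded;
}
```

For example, an include list of src/** with an exclude list of **/*.test.js would keep src/app.js and drop src/lib/util.test.js under these assumptions.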

Optional token headers

X-GitHub-Token: <token>
Authorization: Bearer <token>

Response diagnostics

X-Request-Id
X-Cache-Status: HIT|STALE|MISS
X-Stage-Tree
X-Stage-Queue
X-Stage-Fetch
X-Stage-Build

Error JSON includes requestId and stageCounters.
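
A client can collect these diagnostics from any response. The sketch below reads the headers listed above into one object; readDiagnostics is a hypothetical helper name, and the stage header values are kept as raw strings because their exact format is not specified here.

```javascript
// Sketch: gather the documented diagnostic headers from a response.
// Missing headers come back as null.
function readDiagnostics(headers) {
  return {
    requestId: headers.get("X-Request-Id"),
    cacheStatus: headers.get("X-Cache-Status"),
    stages: {
      tree: headers.get("X-Stage-Tree"),
      queue: headers.get("X-Stage-Queue"),
      fetch: headers.get("X-Stage-Fetch"),
      build: headers.get("X-Stage-Build"),
    },
  };
}
```

With fetch, this would be called as readDiagnostics(response.headers). On a failed request, the requestId and stageCounters fields can instead be read from the error JSON body.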

CLI usage

node cli.mjs <repo-input> [options]

Examples:

node cli.mjs owner/repo --service-url https://<your-pages-domain>/api --out repo.md
GIT2DOC_SERVICE_URL=https://<your-pages-domain>/api node cli.mjs owner/repo.git --out repo.md

Options:

--out <file>
--service-url <url> (or GIT2DOC_SERVICE_URL)
--branch <name>
--max-kb <number>
--no-max-size
--include-dotfiles
--include-data-files
--include-toc
--include-paths <value>
--exclude-paths <value>
--detect-binary-sample
--quiet
--help

CLI note: if input ends with .gi, it is auto-corrected to .git.
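
The correction described in the note amounts to something like the following. The function name fixGitSuffixTypo is hypothetical, and the actual check in cli.mjs may be written differently.

```javascript
// Sketch: turn a trailing ".gi" typo into ".git"; leave other input alone.
function fixGitSuffixTypo(input) {
  return input.endsWith(".gi") ? `${input}t` : input;
}
```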

Environment variables

Server/runtime:

GIT2DOC_EXTRA_INPUT_HOSTS
GIT2DOC_ALLOW_SHORTHAND
GITHUB_TOKEN or GIT2DOC_GITHUB_TOKEN
GIT2DOC_GITHUB_TIMEOUT_MS
GIT2DOC_GITHUB_MAX_RETRIES
GIT2DOC_GITHUB_CACHE_TTL_MS
GIT2DOC_GITHUB_CACHE_SWR_MS

CLI:

GIT2DOC_SERVICE_URL

Defaults from code:

max file size: 250000 bytes (about 250 KB)
fetch concurrency: 6
GitHub timeout: 12000 ms
GitHub max retries: 2
cache TTL: 30000 ms
cache stale-while-revalidate: 120000 ms
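
One way to read the two cache defaults together with the X-Cache-Status values (HIT, STALE, MISS) is sketched below. It assumes the stale-while-revalidate window starts when the TTL ends, which is the usual meaning of the term but is not confirmed by this README; classifyCacheEntry is a hypothetical name.

```javascript
// Defaults copied from the list above.
const CACHE_TTL_MS = 30000;
const CACHE_SWR_MS = 120000;

// Sketch: classify a cached entry by its age in milliseconds.
// Assumes the stale window begins right after the TTL expires.
function classifyCacheEntry(ageMs, ttlMs = CACHE_TTL_MS, swrMs = CACHE_SWR_MS) {
  if (!Number.isFinite(ageMs) || ageMs < 0) return "MISS"; // no usable entry
  if (ageMs <= ttlMs) return "HIT"; // fresh: serve from cache
  if (ageMs <= ttlMs + swrMs) return "STALE"; // serve cached, refresh in background
  return "MISS"; // too old: fetch again
}
```

Under this reading and the default values, an entry is fresh for 30 seconds and may be served stale for a further 120 seconds.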

Tests

Run:

node --test tests/git2doc.test.mjs

Current tests cover:

GitHub URL/shorthand parsing
branch resolution fallback behavior
option parsing (size, TOC, data files, dotfiles, binary detection, include/exclude patterns)

Limits

Public repositories only.
GitHub rate limits still apply (use a token for heavier usage).
Very large repositories can still take time due to tree and content fetch volume.
