Add byok migration script for newly hosted keys by TheodoreSpeaks · Pull Request #3462 · simstudioai/sim

TheodoreSpeaks · 2026-03-07T23:55:19Z

Summary

Currently, hosted keys will override users' currently provided api keys. Instead, we need to create a script to migrate their provided api keys to the BYOK section so that users continue to use their existing keys.

This handles:

Agent tool sub-blocks.
Raw inputted keys
Keys set via secrets

Note: BYOK only allows for a single api key. This could cause different behavior if a user has multiple api keys provided for different blocks for the same service. The dry run flag will detect these cases and raise them so we can look on a case-by-case basis.

Example usage:

# Dry run - detect duplicate api keys, validates 
bun run packages/db/scripts/migrate-block-api-keys-to-byok.ts \
  --map jina=jina --map perplexity=perplexity --map linkup=linkup \
  --map huggingface=huggingface --map firecrawl=firecrawl --map serper=serper \
  --map elevenlabs=elevenlabs --map exa=exa --map browser_use=browser_use \
  --map google_books=google_cloud --map google_maps=google_cloud \
  --map google_pagespeed=google_cloud --map google_translate=google_cloud --dry-run

# Output

Found 2 candidate blocks

Loaded env vars: 1 users, 1 workspaces

Literal API keys: 1
Env var references: 1
Nested tool keys (agent/HITL): 1
Env var resolution failures: 0
Skipped (no workspace): 0
Skipped (empty key): 0
Total resolved key entries: 2

Unique (workspace, provider) pairs to insert: 2

No conflicts detected.

  [DRY RUN] Would insert BYOK key for workspace 438e6738-c3aa-4fe6-907e-e996e87d5681, provider "firecrawl": ••••••••
  [DRY RUN] Would insert BYOK key for workspace 438e6738-c3aa-4fe6-907e-e996e87d5681, provider "exa": ••••••••

[DRY RUN] No changes were made to the database.
Run without --dry-run to apply changes.

# Real migration
bun run packages/db/scripts/migrate-block-api-keys-to-byok.ts \
  --map jina=jina --map perplexity=perplexity --map linkup=linkup \
  --map huggingface=huggingface --map firecrawl=firecrawl --map serper=serper \
  --map elevenlabs=elevenlabs --map exa=exa --map browser_use=browser_use \
  --map google_books=google_cloud --map google_maps=google_cloud \
  --map google_pagespeed=google_cloud --map google_translate=google_cloud

Found 2 candidate blocks

Loaded env vars: 1 users, 1 workspaces

Literal API keys: 1
Env var references: 1
Nested tool keys (agent/HITL): 1
Env var resolution failures: 0
Skipped (no workspace): 0
Skipped (empty key): 0
Total resolved key entries: 2

Unique (workspace, provider) pairs to insert: 2

Insert batch 1 (1-2 of 2)
  [INSERT] BYOK key for workspace 438e6738-c3aa-4fe6-907e-e996e87d5681, provider "firecrawl": ••••••••
  [INSERT] BYOK key for workspace 438e6738-c3aa-4fe6-907e-e996e87d5681, provider "exa": ••••••••

---
Migration Summary:
  BYOK keys inserted: 2
  BYOK keys skipped (already existed): 0
  BYOK insert errors: 0
  Env var resolution failures: 0

Migration completed successfully!

Done!

Type of Change

Testing

Tested with existing api key, migration moves it to byok uses key
Tested with existing env key, migration moves to byok and uses key
Tested with agent subblock, migration moves to byok and uses key
Tested with multiple same keys, migration moves to byok
Tested with different keys, dry run flags as potential issue.

Checklist

Code follows project style guidelines
Self-reviewed my changes
Tests added/updated and passing
No new warnings introduced
[ x I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

Screenshots/Videos

vercel · 2026-03-07T23:55:23Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
docs	Ready	Preview, Comment	Mar 10, 2026 7:21am

TheodoreSpeaks · 2026-03-07T23:57:34Z

packages/db/scripts/migrate-block-api-keys-to-byok.ts

+    console.log(`Found ${rows.length} candidate blocks\n`)
+
+    // 2. Pre-load env vars for resolving {{VAR}} references
+    const personalEnvRows = await db.select().from(environment)


We shouldn't load all env vars into memory. Can we first find the blocks that reference these env vars, then map the env rows needed from that?

cursor · 2026-03-08T00:33:25Z

PR Summary

Medium Risk
Medium risk because it reads/decrypts stored secrets and writes new encrypted BYOK rows; incorrect mappings or conflict resolution could migrate the wrong key for a provider despite dry-run safeguards.

Overview
Adds a new self-contained Bun script, migrate-block-api-keys-to-byok.ts, to migrate block-level apiKey values into workspace_byok_keys without modifying the original blocks.

The script supports --dry-run auditing (detects multiple distinct keys per workspace/provider, logs chosen key, and outputs a workspace-id list) and a gated live mode (--from-file) that encrypts and inserts BYOK keys with onConflictDoNothing. It resolves both literal keys and {{ENV_VAR}} references by decrypting values from workspace_environment and user environment, and also scans nested tool API keys inside agent/human_in_the_loop blocks.

^{Written by Cursor Bugbot for commit efb23da. This will update automatically on new commits. Configure here.}

packages/db/scripts/migrate-block-api-keys-to-byok.ts

greptile-apps · 2026-03-08T00:36:54Z

Greptile Summary

This PR adds a one-shot migration script that reads API keys stored directly on workflow blocks and inserts them into the workspace_byok_keys table, so users continue using their own keys after hosted keys are introduced. The script supports literal key values, environment-variable references ({{VAR_NAME}}), and nested tool API keys inside agent/HITL sub-blocks, and includes a --dry-run mode for auditing conflicts before committing changes.

Key points:

Statistics bug: (result as any).rowCount === 0 will always be false for the postgres-js driver (which exposes count, not rowCount). Every conflict-silenced insert is miscounted as a successful insert, making the summary numbers unreliable. Switching to .returning({ id: ... }) and checking array length (as done in production route.ts) would fix this.
Silent key misses: The sub-block field name apiKey is hardcoded. Block types that store their credential under a different name will be silently skipped with no counter incremented.
Live-run conflict handling: Conflicts are logged but not blocked in the live run. Operators should be aware that running without --dry-run will automatically choose the first resolved key for any provider with multiple distinct values.
Workspace query over-inclusion: Adding agent/human_in_the_loop block types to the initial workspace query may pull in workspaces that only have those block types but no mapped tool keys, resulting in unnecessary DB round-trips and noisy log output.

Confidence Score: 3/5

Safe to merge as a script-only change, but the statistics reported at the end of a live run will be incorrect due to the rowCount bug.
The actual database writes (encrypt + insert with onConflictDoNothing) are correct and idempotent, so re-running the script is safe. However, the rowCount logic bug means operators cannot trust the summary counts (inserted vs. skipped) after a live run, which reduces confidence in the migration's reported outcome. The hardcoded apiKey sub-block field is a secondary concern that could cause silent data gaps.
packages/db/scripts/migrate-block-api-keys-to-byok.ts — specifically the insert result handling around line 451 and the sub-block field name assumption around line 290.

Important Files Changed

Filename	Overview
packages/db/scripts/migrate-block-api-keys-to-byok.ts	New one-shot migration script that reads block-level API keys from workflow sub-blocks and inserts them as workspace BYOK keys. Contains a logic bug where `(result as any).rowCount` is always `undefined` for the postgres-js driver, causing the skip/insert statistics to be miscounted. Also hardcodes `apiKey` as the sub-block field name which could silently miss keys stored under different names.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Parse CLI args\n--map blockType=providerId\n--dry-run] --> B[Query distinct workspaceIds\nwith matching block types]
    B --> C{For each workspace}
    C --> D[Query all candidate blocks\nfor this workspace]
    D --> E[Extract raw API key refs\nper provider]
    E --> F{Has env var\nreferences?}
    F -- Yes --> G[Load workspace env vars\n+ personal env vars]
    F -- No --> H
    G --> H[For each provider:\nresolve all keys]
    H --> I{Multiple distinct\nresolved keys?}
    I -- Yes --> J[Log CONFLICT\nUse first resolved key]
    I -- No --> K[Use single resolved key]
    J --> L{DRY_RUN?}
    K --> L
    L -- Yes --> M[Print preview line\nNo DB write]
    L -- No --> N[encryptSecret key\nInsert into workspace_byok_keys\nonConflictDoNothing]
    N --> O{rowCount check\n⚠️ always undefined}
    O -- undefined === 0 false --> P[Count as inserted\n even if skipped]
    O -- never reached --> Q[Count as skipped]
    M --> R[Print summary]
    P --> R
    Q --> R

_{Last reviewed commit: 353ef31}

greptile-apps · 2026-03-08T00:36:57Z

packages/db/scripts/migrate-block-api-keys-to-byok.ts

+      const blocks = await db
+        .select({
+          blockId: workflowBlocks.id,
+          blockName: workflowBlocks.name,
+          blockType: workflowBlocks.type,
+          subBlocks: workflowBlocks.subBlocks,
+          workflowId: workflow.id,


Hardcoded apiKey sub-block ID may silently miss keys for some providers

The script unconditionally reads API keys via subBlocks?.apiKey?.value. If any of the target block types store their credential under a differently-named sub-block key (e.g., api_key, token, secret), those keys will be silently skipped with no warning — they won't appear in Skipped (empty key) stats either, because the value simply won't be found.

Before running a live migration it is worth verifying (or documenting) that every mapped block type in BLOCK_TYPE_TO_PROVIDER uses exactly apiKey as the sub-block identifier. A quick sanity-check log line listing how many non-empty apiKey values were found per block type would make this more auditable.

greptile-apps · 2026-03-08T00:36:58Z

packages/db/scripts/migrate-block-api-keys-to-byok.ts

+
+        // Detect conflicting values
+        const distinctKeys = new Set(resolved.map((r) => r.key))
+        if (distinctKeys.size > 1) {
+          stats.conflicts++
+          console.log(`  [CONFLICT] provider "${providerId}": ${distinctKeys.size} distinct keys`)
+          for (const { ref, key } of resolved) {
+            const display = isEnvVarReference(ref.rawValue)
+              ? `${ref.rawValue} -> ${maskKey(key)}`
+              : maskKey(ref.rawValue)
+            console.log(`    "${ref.blockName}" in "${ref.workflowName}": ${display}`)
+          }
+          console.log('    Using first resolved key')
+        }
+
+        // Use the first resolved key
+        const chosen = resolved[0]
+
+        if (DRY_RUN) {
+          console.log(
+            `  [DRY RUN] Would insert BYOK for provider "${providerId}": ${maskKey(chosen.key)}`
+          )
+          continue
+        }


Conflicts are silently resolved in live run; no abort option

The PR description states: "The dry run flag will detect these cases and raise them so we can look on a case-by-case basis." However, in the live run the conflict is logged as [CONFLICT] but the migration immediately proceeds to insert the first resolved key without any pause or opt-in confirmation.

For a one-shot production migration, consider adding an --abort-on-conflict flag (or making the live run abort by default when conflicts are found), so operators are not surprised when their conflict review leads to no action. At minimum, a note in the script header documenting this behavior would be helpful.

greptile-apps · 2026-03-08T00:36:59Z

packages/db/scripts/migrate-block-api-keys-to-byok.ts

+
+  const stats = {
+    workspacesProcessed: 0,
+    workspacesSkipped: 0,
+    conflicts: 0,
+    inserted: 0,
+    skippedExisting: 0,
+    errors: 0,
+    envVarFailures: 0,
+  }
+
+  try {
+    // 1. Get distinct workspace IDs that have matching blocks
+    const mappedBlockTypes = Object.keys(BLOCK_TYPE_TO_PROVIDER)
+    const agentTypes = Object.keys(TOOL_INPUT_SUBBLOCK_IDS)
+    const allBlockTypes = [...new Set([...mappedBlockTypes, ...agentTypes])]
+


Initial workspace discovery query may include false-positive workspaces

allBlockTypes is the union of mappedBlockTypes (the providers we care about) and agentTypes (agent, human_in_the_loop). This means any workspace that has any agent or HITL block — even one with no tool configured — will be included in the outer loop and generate a per-workspace log line. For large deployments this is noisy and wastes DB round-trips.

Consider restricting the initial query to only mappedBlockTypes, and querying the agent/HITL blocks only inside the workspace loop (or alongside the mapped blocks in a single per-workspace query). That way a workspace that has agents but no matching tool API keys is excluded upfront rather than found and then immediately skipped as "No API keys found".

greptile-apps · 2026-03-08T00:37:00Z

packages/db/scripts/migrate-block-api-keys-to-byok.ts

+          if ((result as any).rowCount === 0) {
+            console.log(`  [SKIP] BYOK already exists for provider "${providerId}"`)
+            stats.skippedExisting++
+          } else {
+            console.log(`  [INSERT] BYOK for provider "${providerId}": ${maskKey(chosen.key)}`)
+            stats.inserted++


rowCount does not exist on postgres-js insert result

With drizzle-orm + postgres-js (the driver used in this project), calling db.insert(...).onConflictDoNothing() without .returning() returns a postgres.js command-result object whose relevant property is count (a BigInt), not rowCount. Because (result as any).rowCount will always be undefined, the expression undefined === 0 is always false — the skip branch is never entered. Every conflict-silenced insert will be incorrectly counted as stats.inserted++ instead of stats.skippedExisting++.

The safest fix is to add .returning({ id: workspaceBYOKKeys.id }) and check whether the returned array is empty — which is how the production route.ts already handles insert results. This makes the statistics trustworthy, which is especially important for a one-shot data migration.

cursor · 2026-03-08T01:25:17Z

packages/db/scripts/migrate-block-api-keys-to-byok.ts

+          for (const row of personalRows) {
+            personalEnvCache.set(row.userId, (row.variables as Record<string, string>) || {})
+          }
+        }


All env vars loaded into memory per workspace

Low Severity

The PR reviewer specifically flagged that env vars should not all be loaded into memory. While the code now loads per-workspace rather than globally, it still does select() (all columns) from workspaceEnvironment and environment, pulling the entire variables JSON blob for each workspace and user. This includes every env var, not just the ones referenced by blocks. The referenced env var names are known from the providerKeys refs — the script could first collect the needed var names, then only decrypt those specific entries rather than loading all variables into wsEnvVars and personalEnvCache.

icecrasher321 · 2026-03-08T10:41:30Z

packages/db/scripts/migrate-block-api-keys-to-byok.ts

@@ -0,0 +1,598 @@
+#!/usr/bin/env bun


You'd have to include this in pure SQL in an actual migration at the top. We can discuss this on Monday morning. It's hard to tell our self-hosted users to run the script is what we realized.

On second thought, this one only applies for hosted -- so actually a script is fine here

packages/db/scripts/migrate-block-api-keys-to-byok.ts

gitguardian · 2026-03-10T04:33:02Z

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
While these secrets were previously flagged, we no longer have a reference to the
specific commits where they were detected. Once a secret has been leaked into a git
repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.

^{_{🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.}}

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

cursor · 2026-03-10T07:23:29Z

packages/db/scripts/migrate-block-api-keys-to-byok.ts

+    }
+
+    console.log(`  [${index}/${total}] Done with workspace ${workspaceId}\n`)
+    return { stats, shouldWriteWorkspaceId: DRY_RUN }


Dry run writes workspace IDs even without insertable keys

Low Severity

shouldWriteWorkspaceId: DRY_RUN is unconditionally true for every processed workspace during a dry run, even when no keys actually resolved successfully (e.g., all env var lookups failed or all were skipped as non-owner personal keys). This causes the output file to include workspace IDs that have no valid keys to migrate. The live run then wastes time re-processing those workspaces with no result, and the dry-run summary inaccurately reports the count of "workspace IDs (with keys)."

TheodoreSpeaks commented Mar 7, 2026

View reviewed changes

vercel bot deployed to Preview March 8, 2026 00:15 View deployment

TheodoreSpeaks marked this pull request as ready for review March 8, 2026 00:33

cursor bot reviewed Mar 8, 2026

View reviewed changes

packages/db/scripts/migrate-block-api-keys-to-byok.ts Show resolved Hide resolved

vercel bot deployed to Preview March 8, 2026 00:35 View deployment

greptile-apps bot reviewed Mar 8, 2026

View reviewed changes

vercel bot deployed to Preview March 8, 2026 00:44 View deployment

vercel bot deployed to Preview March 8, 2026 01:09 View deployment

vercel bot deployed to Preview March 8, 2026 01:23 View deployment

cursor bot reviewed Mar 8, 2026

View reviewed changes

icecrasher321 reviewed Mar 8, 2026

View reviewed changes

TheodoreSpeaks force-pushed the feat/migrate-byok branch from f6281f9 to 62219a6 Compare March 9, 2026 22:12

vercel bot deployed to Preview March 9, 2026 22:31 View deployment

vercel bot deployed to Preview March 9, 2026 22:49 View deployment

cursor bot reviewed Mar 9, 2026

View reviewed changes

packages/db/scripts/migrate-block-api-keys-to-byok.ts Show resolved Hide resolved

Theodore Li added 9 commits March 9, 2026 21:22

Add byok migration script for newly hosted keys

70e76e8

Change script to iterate on workflow

01b21ed

Refactor to iterate per workspace to avoid overconsuming memory

6e7bdae

Fix row count bug

9a1b681

Add sleep every 1000, dry run writes to files that prod consumes from

a2329e2

Fix lint

e127817

Sleep only between workspaces, not during single workspace

e8ac273

Prefer plain > workspace > personal

bba16f2

Personal key should only be used if its the owner of the workflow

11f402b

TheodoreSpeaks force-pushed the feat/migrate-byok branch from 9df482d to 11f402b Compare March 10, 2026 04:32

TheodoreSpeaks changed the base branch from staging to feat/mothership-copilot March 10, 2026 04:34

vercel bot deployed to Preview March 10, 2026 04:35 View deployment

Make migration concurrent

efb23da

vercel bot deployed to Preview March 10, 2026 07:21 View deployment

cursor bot reviewed Mar 10, 2026

View reviewed changes

Conversation

TheodoreSpeaks commented Mar 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Type of Change

Testing

Checklist

Screenshots/Videos

Uh oh!

vercel bot commented Mar 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TheodoreSpeaks Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

cursor bot commented Mar 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Summary

Uh oh!

Uh oh!

greptile-apps bot commented Mar 8, 2026

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps bot Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

cursor bot Mar 8, 2026

Choose a reason for hiding this comment

All env vars loaded into memory per workspace

Uh oh!

icecrasher321 Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

icecrasher321 Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gitguardian bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

️✅ There are no secrets present in this pull request anymore.

Uh oh!

cursor bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor bot Mar 10, 2026

Choose a reason for hiding this comment

Dry run writes workspace IDs even without insertable keys

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

TheodoreSpeaks commented Mar 7, 2026 •

edited

Loading

vercel bot commented Mar 7, 2026 •

edited

Loading

cursor bot commented Mar 8, 2026 •

edited

Loading

gitguardian bot commented Mar 10, 2026 •

edited

Loading