feat: Multi-tenancy for Parseable server #1518

Closed
parmesant wants to merge 10 commits into parseablehq:main from parmesant:multi-tenancy

Conversation

@parmesant parmesant commented Jan 13, 2026

This is a WIP PR to introduce multi-tenancy to Parseable server.

Description


This PR has:

  • been tested to ensure log ingestion and log query works.
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added documentation for new or modified features or behaviors.

Summary by CodeRabbit

  • New Features

    • Multi-tenant mode: per-tenant isolation, tenant management, and tenant-scoped operations across the platform.
  • Improvements

    • Tenant-aware RBAC, alerts/targets, ingestion, querying, storage, retention, hot-tier, dashboards, filters, correlations, OTEL and LLM flows.
    • Metrics, billing and observability now labeled and aggregated per tenant for accurate per-tenant reporting.
  • Chores

    • CLI/runtime flag to enable multi-tenancy and tenant lifecycle utilities.

coderabbitai bot commented Jan 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Threads an optional tenant_id: Option<String> across the codebase, converting many global maps to per-tenant maps and propagating tenant context through metastore, object storage, parseable/streams, alerts, RBAC, HTTP handlers, query execution, metrics, storage backends, ingest, OTEL, Kafka, and related paths.
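The walkthrough's core idea, a formerly global map partitioned by tenant with a fallback key when no tenant is given, can be sketched as follows. This is a minimal illustration using only the standard library: `PerTenant`, `tenant_key`, and the value of `DEFAULT_TENANT` are assumptions for the example, not the PR's actual types.

```rust
use std::collections::HashMap;

/// Stand-in for the PR's DEFAULT_TENANT constant; the value here is an assumption.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Resolve an optional tenant id to the key used in the outer map.
fn tenant_key(tenant_id: &Option<String>) -> &str {
    tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)
}

/// A formerly global `name -> value` map, now partitioned by tenant.
struct PerTenant<V> {
    inner: HashMap<String, HashMap<String, V>>,
}

impl<V> PerTenant<V> {
    fn new() -> Self {
        Self { inner: HashMap::new() }
    }

    /// Insert under the tenant's bucket, creating the bucket on first use.
    fn insert(&mut self, tenant_id: &Option<String>, name: &str, value: V) {
        self.inner
            .entry(tenant_key(tenant_id).to_owned())
            .or_default()
            .insert(name.to_owned(), value);
    }

    /// Look up a value inside one tenant's bucket only.
    fn get(&self, tenant_id: &Option<String>, name: &str) -> Option<&V> {
        self.inner.get(tenant_key(tenant_id))?.get(name)
    }
}
```

Two tenants can then hold entries with the same name without seeing each other's values, and callers that pass `None` land in the default bucket.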

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Tenancy core & metadata | src/tenants/mod.rs, src/lib.rs, Cargo.toml | Adds TENANT_METADATA, tenant types/errors and APIs; exposes tenants metadata and adds parking_lot dependency. |
| Metastore & object-store | src/metastore/..., src/metastore/metastores/object_store_metastore.rs, src/storage/object_storage.rs, src/storage/{azure_blob.rs,gcs.rs,s3.rs,localfs.rs} | Makes Metastore and ObjectStorage tenant-aware; many metastore methods and storage backends gain a tenant_id parameter and use tenant-prefixed paths and tenant-scoped metrics. |
| Parseable & Streams | src/parseable/mod.rs, src/parseable/streams.rs, src/parseable/staging/mod.rs | Adds DEFAULT_TENANT and tenants registry; Parseable APIs and Streams become tenant-partitioned; stream creation/lookup accept tenant_id. |
| Alerts & Targets | src/alerts/*, src/alerts/target.rs, src/alerts/alert_traits.rs, src/alerts/alert_structs.rs, src/alerts/alert_types.rs, src/alerts/alerts_utils.rs | Threads tenant_id through AlertTrait/manager, AlertConfig, alert/target in-memory maps, path resolution, query execution and metastore persistence. |
| RBAC, users & roles | src/rbac/*, src/rbac/map.rs, src/rbac/user.rs, src/rbac/role.rs | Partitions roles/users/groups/sessions by tenant, adds SuperAdmin and suspension logic, and makes RBAC lookups tenant-scoped. |
| HTTP handlers & middleware | src/handlers/http/**, src/handlers/http/middleware.rs, src/handlers/http/modal/** | Many handlers now accept HttpRequest, extract tenant_id, and pass tenant context into services; middleware validates tenant header and suspension. |
| Query, execution & schema provider | src/query/*, src/query/stream_schema_provider.rs, src/handlers/http/query.rs | QUERY_SESSION and execution paths accept tenant_id; per-tenant schemas, manifest collection and metrics attribution introduced. |
| Storage flows, retention & hot-tier | src/storage/*, src/hottier.rs, src/storage/retention.rs, src/storage/field_stats.rs | Retention, hot-tier and manifest flows threaded with tenant_id; path helpers and orchestration use tenant prefixes and tenant-aware metrics. |
| Ingest, events, OTEL & Kafka | src/event/*, src/handlers/http/ingest.rs, src/otel/*, src/connectors/kafka/* | Event carries tenant_id; ingest/Kafka/OTEL processors and schema/stream creation propagate tenant context. |
| Metrics, stats & metadata | src/metrics/mod.rs, src/stats.rs, src/metadata.rs | Prometheus labels extended to include tenant_id; metric and stats APIs accept tenant context (DEFAULT_TENANT fallback). |
| UI, dashboards, filters, correlations | src/users/*, src/correlation.rs, src/handlers/http/users/* | Dashboards, Filters, Correlations become per-tenant in-memory maps; handlers and metastore interactions accept tenant_id. |
| Utilities & CLI | src/utils/mod.rs, src/utils/header_parsing.rs, src/cli.rs | Adds tenant helpers (get_tenant_id_from_request, get_user_and_tenant_from_request, mutate_request_with_tenant), header parsing errors for tenant, and CLI multi_tenancy flag and tenant-aware path helper. |

Sequence Diagram(s)

sequenceDiagram
  participant C as Client
  participant H as HTTP Handler
  participant M as Middleware
  participant R as RBAC
  participant P as Parseable
  participant MS as Metastore
  participant OS as ObjectStore

  rect rgba(100,149,237,0.5)
  C->>H: HTTP request (tenant header)
  end

  rect rgba(60,179,113,0.5)
  H->>M: forward HttpRequest
  M->>R: extract (user, tenant) & check suspension
  R-->>M: auth result (tenant_id)
  M-->>H: continue with tenant_id
  end

  rect rgba(255,165,0,0.5)
  H->>P: ensure/get stream/schema (tenant_id)
  P->>MS: get_stream_json / schema (tenant_id)
  MS->>OS: read/write tenant-prefixed object (tenant_id)
  OS-->>MS: bytes/ok
  MS-->>P: schema/metadata
  end

  rect rgba(219,112,147,0.5)
  H->>MS: get/put alerts/targets/state (tenant_id)
  MS->>OS: read/write tenant-scoped object (tenant_id)
  OS-->>MS: OK
  MS-->>H: response
  end

  H-->>C: HTTP response

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested labels

for next release

Suggested reviewers

  • de-sh
  • nikhilsinhaparseable

Poem

"I’m a rabbit in the code so spry,
I hop tenant_ids from low to high,
Maps nested snug and paths aligned,
Each tenant’s data neatly signed,
A joyful hop — multi-tenant sky!" 🐇✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The PR description is incomplete. It is marked as WIP, contains only the template with all checklist items unchecked, and provides no actual description of goals, solutions, rationale, or key changes beyond stating the general intent to introduce multi-tenancy. | Complete the PR description by removing the WIP status, filling in the Description section with actual goals and rationale, documenting key changes made, and checking off completed items from the checklist before merging. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 64.52% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat: Multi-tenancy for Parseable server' is directly related to the main change in the changeset, which introduces comprehensive multi-tenant support throughout the codebase. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 20

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (22)
src/hottier.rs (2)

208-220: Critical: delete_hot_tier does not use tenant_id when constructing the delete path.

The function accepts tenant_id and uses it for the existence check at line 213, but line 216 constructs the path without considering tenant_id:

let path = self.hot_tier_path.join(stream);

For multi-tenant deployments, this will delete the wrong directory (the non-tenant path) instead of the tenant-scoped path.

🐛 Proposed fix
 pub async fn delete_hot_tier(
     &self,
     stream: &str,
     tenant_id: &Option<String>,
 ) -> Result<(), HotTierError> {
     if !self.check_stream_hot_tier_exists(stream, tenant_id) {
         return Err(HotTierValidationError::NotFound(stream.to_owned()).into());
     }
-    let path = self.hot_tier_path.join(stream);
+    let path = if let Some(tid) = tenant_id.as_ref() {
+        self.hot_tier_path.join(tid).join(stream)
+    } else {
+        self.hot_tier_path.join(stream)
+    };
     fs::remove_dir_all(path).await?;

     Ok(())
 }
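The path selection in the proposed fix can be factored into one small helper so that the existence check and the delete resolve the same directory. The sketch below uses only `std::path`; the helper name `hot_tier_stream_path` is hypothetical and not part of the PR.

```rust
use std::path::{Path, PathBuf};

/// Build the hot-tier directory for a stream.
/// The tenant segment is added only when a tenant id is present.
fn hot_tier_stream_path(root: &Path, stream: &str, tenant_id: &Option<String>) -> PathBuf {
    match tenant_id.as_deref() {
        Some(tenant) => root.join(tenant).join(stream),
        None => root.join(stream),
    }
}
```

Routing every hot-tier path through a single function of this shape would make it harder for one call site to forget the tenant segment.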

471-497: fetch_hot_tier_dates and get_stream_path_for_date must accept and use tenant_id parameter.

These functions construct paths without tenant awareness, while hot_tier_file_path() is already tenant-scoped. This causes a mismatch: cleanup_hot_tier_old_data() has access to tenant_id but cannot pass it to fetch_hot_tier_dates(), and process_parquet_file() cannot pass tenant_id to get_stream_path_for_date(). In multi-tenant deployments, this will cause incorrect path resolution for hot-tier data. Update both function signatures to accept tenant_id and construct paths as self.hot_tier_path.join(tenant_id).join(stream) when present, consistent with hot_tier_file_path().

src/handlers/http/alerts.rs (1)

209-244: Missing tenant_id in list endpoint - potential cross-tenant alert visibility.

The list handler does not extract tenant_id from the request, unlike all other handlers in this file. The list_alerts_for_user call may return alerts across all tenants instead of filtering by the requesting tenant's context.

🐛 Proposed fix to add tenant context
 pub async fn list(req: HttpRequest) -> Result<impl Responder, AlertError> {
     let session_key = extract_session_key_from_req(&req)?;
+    let tenant_id = get_tenant_id_from_request(&req);
     let query_map = web::Query::<HashMap<String, String>>::from_query(req.query_string())
         .map_err(|_| AlertError::InvalidQueryParameter("malformed query parameters".to_string()))?;
 
     // ... existing code ...
 
     // Fetch alerts for the user
     let alerts = alerts
-        .list_alerts_for_user(session_key, params.tags_list)
+        .list_alerts_for_user(session_key, params.tags_list, &tenant_id)
         .await?;
src/storage/store_metadata.rs (1)

301-323: Missing directory creation for tenant-specific staging path.

When tenant_id is provided, the code constructs a path under a tenant subdirectory (line 309), but doesn't ensure this directory exists. The OpenOptions::open() call will fail with NotFound if the tenant directory hasn't been created yet.

🐛 Proposed fix to ensure tenant directory exists
 pub fn put_staging_metadata(meta: &StorageMetadata, tenant_id: &Option<String>) -> io::Result<()> {
     let mut staging_metadata = meta.clone();
     staging_metadata.server_mode = PARSEABLE.options.mode;
     staging_metadata.staging = PARSEABLE.options.staging_dir().to_path_buf();
     let path = if let Some(tenant_id) = tenant_id.as_ref() {
-        PARSEABLE
+        let tenant_dir = PARSEABLE
             .options
             .staging_dir()
-            .join(tenant_id)
-            .join(PARSEABLE_METADATA_FILE_NAME)
+            .join(tenant_id);
+        fs::create_dir_all(&tenant_dir)?;
+        tenant_dir.join(PARSEABLE_METADATA_FILE_NAME)
     } else {
         PARSEABLE
             .options
             .staging_dir()
             .join(PARSEABLE_METADATA_FILE_NAME)
     };
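A standalone sketch of the same idea, assuming a plain byte payload instead of `StorageMetadata` and a hypothetical file name in place of `PARSEABLE_METADATA_FILE_NAME`: create the tenant directory first, then open the file. `fs::create_dir_all` succeeds when the directory already exists, so it is safe to call on every write.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical file name used only for this sketch.
const METADATA_FILE_NAME: &str = ".parseable.json";

/// Write metadata under `staging_dir`, or under `staging_dir/<tenant>` when a tenant is set.
fn write_staging_metadata(
    staging_dir: &Path,
    tenant_id: &Option<String>,
    contents: &[u8],
) -> io::Result<PathBuf> {
    let dir = match tenant_id.as_deref() {
        Some(tenant) => staging_dir.join(tenant),
        None => staging_dir.to_path_buf(),
    };
    // Without this call, opening the file fails with NotFound for a new tenant.
    fs::create_dir_all(&dir)?;

    let path = dir.join(METADATA_FILE_NAME);
    let mut file = fs::OpenOptions::new()
        .create(true)
        .write(true)
        .truncate(true)
        .open(&path)?;
    file.write_all(contents)?;
    Ok(path)
}
```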
src/handlers/http/targets.rs (2)

35-45: Missing tenant_id in post endpoint - targets created without tenant context.

The post handler doesn't extract tenant_id from the request, unlike list, get, update, and delete. This could result in targets being created without proper tenant association, breaking tenant isolation.

🐛 Proposed fix to add tenant context
 // POST /targets
 pub async fn post(
-    _req: HttpRequest,
+    req: HttpRequest,
     Json(target): Json<Target>,
 ) -> Result<impl Responder, AlertError> {
+    let tenant_id = get_tenant_id_from_request(&req);
     // should check for duplicacy and liveness (??)
     // add to the map
-    TARGETS.update(target.clone()).await?;
+    TARGETS.update(target.clone(), &tenant_id).await?;
 
     // Ok(web::Json(target.mask()))
     Ok(web::Json(target))
 }

72-98: update handler missing tenant_id in TARGETS.update call.

While tenant_id is correctly extracted and used to fetch old_target, the subsequent TARGETS.update(target.clone()) call on line 94 doesn't pass the tenant context. This may cause the updated target to lose tenant association.

🐛 Proposed fix
     // should check for duplicacy and liveness (??)
     // add to the map
-    TARGETS.update(target.clone()).await?;
+    TARGETS.update(target.clone(), &tenant_id).await?;
src/alerts/alerts_utils.rs (1)

77-90: Tenant isolation gap: execute_remote_query does not receive tenant_id parameter.

The execute_local_query path explicitly receives and uses tenant_id for stream creation and query execution (lines 101, 112), but execute_remote_query (line 84) is called without this parameter and does not propagate any tenant context to send_query_request. The Query struct serialized to the remote querier contains no tenant information. If Prism mode requires tenant isolation, either:

  1. Add tenant_id parameter to execute_remote_query and include it in the Query struct or HTTP request, or
  2. Verify that tenant context is derived from the Authorization header on the remote side and document this assumption.
src/handlers/http/ingest.rs (1)

426-445: Pass tenant context through the unchecked event path.

push_logs_unchecked and append_temporary_events hardcode tenant_id: None, but the calling context in airplane.rs has access to tenant information via the key (SessionKey) parameter. Extract tenant_id using get_tenant_id_from_key(&key) and thread it through both functions to maintain consistency with the normal ingest flow.

src/handlers/http/modal/ingest/ingestor_rbac.rs (2)

189-213: Metadata persisted before password hash is updated.

Line 198 calls put_staging_metadata before the password hash is actually updated in the metadata (lines 199-211). This means the old password hash is persisted instead of the new one.

🐛 Proposed fix: Move persistence after the mutation
 pub async fn post_gen_password(
     req: HttpRequest,
     username: web::Path<String>,
 ) -> Result<HttpResponse, RBACError> {
     let username = username.into_inner();
     let tenant_id = get_tenant_id_from_request(&req);
     let mut new_hash = String::default();
     let mut metadata = get_metadata(&tenant_id).await?;

-    let _ = storage::put_staging_metadata(&metadata, &tenant_id);
     if let Some(user) = metadata
         .users
         .iter_mut()
         .filter_map(|user| match user.ty {
             user::UserType::Native(ref mut user) => Some(user),
             _ => None,
         })
         .find(|user| user.username == username)
     {
         new_hash.clone_from(&user.password_hash);
     } else {
         return Err(RBACError::UserDoesNotExist);
     }
+    let _ = storage::put_staging_metadata(&metadata, &tenant_id);
     Users.change_password_hash(&username, &new_hash, &tenant_id);
     Ok(HttpResponse::Ok().status(StatusCode::OK).finish())
 }

98-107: Use tenant_id to access the nested roles HashMap.

The roles().get(r) calls at lines 101 and 145 (in remove_roles_from_user) incorrectly attempt to look up role names directly. The roles() function returns HashMap<tenant_id, HashMap<role_name, privileges>>, so the lookup must first access by tenant_id. Both functions have tenant_id available from the request but don't use it:

Change:

if roles().get(r).is_none()

To:

if roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)).and_then(|r_map| r_map.get(r)).is_none()

This mirrors the pattern used throughout the codebase (e.g., src/rbac/utils.rs, src/rbac/mod.rs).

src/users/dashboards.rs (1)

244-268: Critical: Dashboard creation silently fails for new tenants.

If dashboards.get_mut(tenant) returns None (tenant doesn't exist in the map), the function returns Ok(()) without creating the dashboard. This is a logic error — new tenants would never be able to create dashboards.

     pub async fn create(
         &self,
         user_id: &str,
         dashboard: &mut Dashboard,
         tenant_id: &Option<String>,
     ) -> Result<(), DashboardError> {
         dashboard.created = Some(Utc::now());
         dashboard.set_metadata(user_id, None);
         let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
         let mut dashboards = self.0.write().await;

-        if let Some(dbs) = dashboards.get_mut(tenant) {
-            let has_duplicate = dbs
-                .iter()
-                .any(|d| d.title == dashboard.title && d.dashboard_id != dashboard.dashboard_id);
-            if has_duplicate {
-                return Err(DashboardError::Metadata("Dashboard title must be unique"));
-            }
-            self.save_dashboard(dashboard, tenant_id).await?;
-
-            dbs.push(dashboard.clone());
+        let dbs = dashboards.entry(tenant.to_owned()).or_default();
+        let has_duplicate = dbs
+            .iter()
+            .any(|d| d.title == dashboard.title && d.dashboard_id != dashboard.dashboard_id);
+        if has_duplicate {
+            return Err(DashboardError::Metadata("Dashboard title must be unique"));
         }
+        self.save_dashboard(dashboard, tenant_id).await?;
+        dbs.push(dashboard.clone());

         Ok(())
     }
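The difference between `get_mut` and the entry API is easy to reproduce with a plain `HashMap`. The following sketch drops the async lock and persistence and uses a simplified, hypothetical `Dashboard` struct; it only shows that `entry(..).or_default()` creates the tenant bucket on first use while still enforcing the unique-title rule within a tenant.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the real Dashboard type.
#[derive(Clone, Debug, PartialEq)]
struct Dashboard {
    id: u32,
    title: String,
}

/// Insert a dashboard into its tenant bucket, creating the bucket if it is missing.
fn create(
    dashboards: &mut HashMap<String, Vec<Dashboard>>,
    tenant: &str,
    dashboard: Dashboard,
) -> Result<(), String> {
    let bucket = dashboards.entry(tenant.to_owned()).or_default();
    let has_duplicate = bucket
        .iter()
        .any(|d| d.title == dashboard.title && d.id != dashboard.id);
    if has_duplicate {
        return Err("Dashboard title must be unique".to_owned());
    }
    bucket.push(dashboard);
    Ok(())
}
```

Titles are checked per tenant, so two tenants may each own a dashboard with the same title.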
src/handlers/http/rbac.rs (1)

128-136: Role existence checks are not tenant-aware.

The roles().contains_key(role) checks query the global roles map without tenant scoping. In a multi-tenant system, this could allow:

  1. Validating against roles from other tenants
  2. Assigning roles that exist in another tenant but not in the user's tenant

Consider using tenant-scoped role lookups:

-    for role in &user_roles {
-        if !roles().contains_key(role) {
-            non_existent_roles.push(role.clone());
-        }
-    }
+    let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
+    for role in &user_roles {
+        if !roles().get(tenant).map_or(false, |r| r.contains_key(role)) {
+            non_existent_roles.push(role.clone());
+        }
+    }

Also applies to: 322-333, 378-389
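A self-contained sketch of a tenant-scoped role check, assuming the roles map is keyed by tenant and then by role name (privileges are omitted, and the `DEFAULT_TENANT` value is a placeholder). In this sketch a tenant with no role bucket is treated as having no roles, so every requested role is reported as missing.

```rust
use std::collections::{HashMap, HashSet};

/// Placeholder value; the PR defines its own DEFAULT_TENANT.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// tenant -> set of role names (the real map stores privileges per role).
type Roles = HashMap<String, HashSet<String>>;

/// Return the requested roles that do not exist within the caller's tenant.
fn non_existent_roles(
    roles: &Roles,
    tenant_id: &Option<String>,
    requested: &[String],
) -> Vec<String> {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    let known = roles.get(tenant);
    requested
        .iter()
        // A role counts as existing only if this tenant's bucket contains it.
        .filter(|role| !known.is_some_and(|set| set.contains(role.as_str())))
        .cloned()
        .collect()
}
```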

src/rbac/user.rs (1)

153-164: Use standard SaltString::generate(&mut OsRng) instead of custom salt generation.

RFC 9106 (Argon2 specification) recommends 16 bytes of salt; this implementation uses 32 bytes. While the custom approach with SaltString::encode_b64 is technically compatible with Argon2, it's unnecessarily complex and deviates from the specification without clear justification. The commented-out standard approach (SaltString::generate(&mut OsRng)) handles salt generation correctly and should be used instead for consistency with best practices.

src/catalog/mod.rs (1)

397-490: Avoid failing snapshot/retention flows if the stream isn’t in memory.
Both create_manifest() and remove_manifest_from_snapshot() can error out on PARSEABLE.get_stream(...)?, which can break cleanup on nodes that haven’t loaded that stream. Prefer best-effort in-memory updates, and keep storage updates authoritative.

Proposed fix (best-effort in-memory updates)
- let mut first_event_at = PARSEABLE
-     .get_stream(stream_name, tenant_id)?
-     .get_first_event();
+ let mut first_event_at = PARSEABLE
+     .get_stream(stream_name, tenant_id)
+     .ok()
+     .and_then(|s| s.get_first_event());

  ...
- match PARSEABLE.get_stream(stream_name, tenant_id) {
-     Ok(stream) => stream.set_first_event_at(first_event_at.as_ref().unwrap()),
-     Err(err) => error!(...),
- }
+ if let Some(first_event_at) = first_event_at.as_deref()
+     && let Ok(stream) = PARSEABLE.get_stream(stream_name, tenant_id)
+ {
+     stream.set_first_event_at(first_event_at);
+ }

 // remove_manifest_from_snapshot():
- PARSEABLE.get_stream(stream_name, tenant_id)?.reset_first_event_at();
+ if let Ok(stream) = PARSEABLE.get_stream(stream_name, tenant_id) {
+     stream.reset_first_event_at();
+ }

Also applies to: 492-527
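The best-effort pattern suggested here, converting a lookup error into `None` instead of propagating it with `?`, can be shown in isolation. The types below are hypothetical stand-ins (tenant scoping is left out for brevity); only the `.ok().and_then(..)` shape is the point.

```rust
use std::collections::HashMap;

#[derive(Debug)]
struct StreamNotFound;

/// Minimal stand-in for an in-memory stream.
struct Stream {
    first_event_at: Option<String>,
}

/// Fallible lookup, analogous to a get_stream call that errors for unloaded streams.
fn get_stream<'a>(
    loaded: &'a HashMap<String, Stream>,
    name: &str,
) -> Result<&'a Stream, StreamNotFound> {
    loaded.get(name).ok_or(StreamNotFound)
}

/// Best-effort read: a stream that is not loaded yields None rather than an error.
fn first_event_best_effort(loaded: &HashMap<String, Stream>, name: &str) -> Option<String> {
    get_stream(loaded, name)
        .ok()
        .and_then(|s| s.first_event_at.clone())
}
```

With this shape, a node that has not loaded the stream can still carry on with the storage-side update.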

src/parseable/streams.rs (1)

1188-1725: Tests need updates for new Stream::new(..., tenant_id) + local_stream_data_path(..., tenant_id) signatures.
As written, the test module still uses the old function arity and will fail to compile.

src/rbac/map.rs (1)

201-306: Sessions.user_sessions indexing is inconsistent (will reduce to “always not found”).
track_new() writes user_sessions[user][tenant], but is_session_expired() / remove_session() / remove_user() / remove_expired_session() read it as user_sessions[tenant][user]. Also, remove_expired_session() keeps expired sessions (expiry < now).

Proposed fix (align to user → tenant → sessions, and correct expiry retention)
 pub fn is_session_expired(&self, key: &SessionKey) -> bool {
     let (userid, tenant_id) = if let Some((user, tenant_id, _)) = self.active_sessions.get(key) {
         (user, tenant_id)
     } else {
         return false;
     };

-    let session = if let Some(tenant_sessions) = self.user_sessions.get(tenant_id)
-        && let Some(session) = tenant_sessions.get(userid)
-    {
-        session
-    } else {
-        return false;
-    };
+    let session = self
+        .user_sessions
+        .get(userid)
+        .and_then(|m| m.get(tenant_id));
+    let Some(session) = session else { return false };

     session
         .par_iter()
         .find_first(|(sessionid, expiry)| sessionid.eq(key) && expiry < &Utc::now())
         .is_some()
 }

 pub fn track_new(
     &mut self,
     user: String,
     key: SessionKey,
     expiry: DateTime<Utc>,
     permissions: Vec<Permission>,
     tenant_id: &Option<String>,
 ) {
     let tenant_id = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
     self.remove_expired_session(&user, tenant_id);

-    let sessions = self.user_sessions.entry(user.clone()).or_default();
-    sessions.insert(tenant_id.to_owned(), vec![(key.clone(), expiry)]);
+    self.user_sessions
+        .entry(user.clone())
+        .or_default()
+        .entry(tenant_id.to_owned())
+        .or_default()
+        .push((key.clone(), expiry));

     self.active_sessions
         .insert(key, (user, tenant_id.to_string(), permissions));
 }

 pub fn remove_session(&mut self, key: &SessionKey) -> Option<String> {
     let (user, tenant_id, _) = self.active_sessions.remove(key)?;
-    if let Some(tenant_sessions) = self.user_sessions.get_mut(&tenant_id)
-        && let Some(sessions) = tenant_sessions.get_mut(&user)
+    if let Some(user_sessions) = self.user_sessions.get_mut(&user)
+        && let Some(sessions) = user_sessions.get_mut(&tenant_id)
     {
         sessions.retain(|(session, _)| session != key);
         Some(user)
     } else {
         None
     }
 }

 pub fn remove_user(&mut self, username: &str, tenant_id: &str) {
-    tracing::warn!("removing user- {username}, tenant_id- {tenant_id}");
-    tracing::warn!("active sessions- {:?}", self.active_sessions);
-    tracing::warn!("user sessions- {:?}", self.user_sessions);
-    let sessions = if let Some(tenant_sessions) = self.user_sessions.get_mut(tenant_id) {
-        tenant_sessions.remove(username)
-    } else {
-        None
-    };
+    let sessions = self
+        .user_sessions
+        .get_mut(username)
+        .and_then(|m| m.remove(tenant_id));

     if let Some(sessions) = sessions {
         sessions.into_iter().for_each(|(key, _)| {
             self.active_sessions.remove(&key);
         })
     }
 }

 fn remove_expired_session(&mut self, user: &str, tenant_id: &str) {
     let now = Utc::now();

-    let sessions = if let Some(tenant_sessions) = self.user_sessions.get_mut(tenant_id)
-        && let Some(sessions) = tenant_sessions.get_mut(user)
-    {
-        sessions
-    } else {
-        return;
-    };
-    sessions.retain(|(_, expiry)| expiry < &now);
+    let Some(sessions) = self
+        .user_sessions
+        .get_mut(user)
+        .and_then(|m| m.get_mut(tenant_id))
+    else {
+        return;
+    };
+    // keep only non-expired
+    sessions.retain(|(_, expiry)| expiry >= &now);
 }
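Both points of this comment, a consistent user → tenant → sessions nesting and a `retain` that keeps non-expired entries, can be checked with a reduced model. The sketch below is an assumption-laden simplification: session keys are strings, expiry is a plain `u64` timestamp, and `now` is passed in explicitly so the behaviour is deterministic.

```rust
use std::collections::HashMap;

/// user -> tenant -> [(session key, expiry timestamp)]
#[derive(Default)]
struct Sessions {
    user_sessions: HashMap<String, HashMap<String, Vec<(String, u64)>>>,
}

impl Sessions {
    /// Drop expired sessions for this user and tenant, then append the new one.
    fn track_new(&mut self, user: &str, tenant: &str, key: &str, expiry: u64, now: u64) {
        self.remove_expired(user, tenant, now);
        self.user_sessions
            .entry(user.to_owned())
            .or_default()
            .entry(tenant.to_owned())
            .or_default()
            .push((key.to_owned(), expiry));
    }

    /// Reads use the same user -> tenant order as writes.
    fn remove_expired(&mut self, user: &str, tenant: &str, now: u64) {
        if let Some(sessions) = self
            .user_sessions
            .get_mut(user)
            .and_then(|m| m.get_mut(tenant))
        {
            // Keep only sessions that have not expired yet.
            sessions.retain(|(_, expiry)| *expiry >= now);
        }
    }

    /// Session keys currently stored for this user and tenant.
    fn live_keys(&self, user: &str, tenant: &str) -> Vec<&str> {
        self.user_sessions
            .get(user)
            .and_then(|m| m.get(tenant))
            .map(|v| v.iter().map(|(k, _)| k.as_str()).collect())
            .unwrap_or_default()
    }
}
```

Swapping the two lookup keys finds nothing, which is the "always not found" symptom described above.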
src/storage/object_storage.rs (1)

1149-1182: Inconsistent tenant_id handling across path builder functions.

schema_path(), stream_json_path(), and manifest_path() include empty string segments when tenant_id is None, whereas alert_json_path() and mttr_json_path() in the same file use conditional logic to omit the tenant segment entirely. Standardize all path builders to conditionally include tenant only when present, matching the established pattern.

Proposed fix (conditional segments)
 pub fn schema_path(stream_name: &str, tenant_id: &Option<String>) -> RelativePathBuf {
-    let tenant = tenant_id.as_ref().map_or("", |v| v);
+    let tenant = tenant_id.as_deref();

     if PARSEABLE.options.mode == Mode::Ingest {
         ...
-        RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        if let Some(tenant) = tenant {
+            RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        } else {
+            RelativePathBuf::from_iter([stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        }
     } else {
-        RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, SCHEMA_FILE_NAME])
+        if let Some(tenant) = tenant {
+            RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, SCHEMA_FILE_NAME])
+        } else {
+            RelativePathBuf::from_iter([stream_name, STREAM_ROOT_DIRECTORY, SCHEMA_FILE_NAME])
+        }
     }
 }

 pub fn stream_json_path(stream_name: &str, tenant_id: &Option<String>) -> RelativePathBuf {
-    let tenant = tenant_id.as_ref().map_or("", |v| v);
+    let tenant = tenant_id.as_deref();

     if PARSEABLE.options.mode == Mode::Ingest {
         ...
-        RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        if let Some(tenant) = tenant {
+            RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        } else {
+            RelativePathBuf::from_iter([stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        }
     } else {
-        RelativePathBuf::from_iter([
-            tenant,
-            stream_name,
-            STREAM_ROOT_DIRECTORY,
-            STREAM_METADATA_FILE_NAME,
-        ])
+        if let Some(tenant) = tenant {
+            RelativePathBuf::from_iter([
+                tenant,
+                stream_name,
+                STREAM_ROOT_DIRECTORY,
+                STREAM_METADATA_FILE_NAME,
+            ])
+        } else {
+            RelativePathBuf::from_iter([
+                stream_name,
+                STREAM_ROOT_DIRECTORY,
+                STREAM_METADATA_FILE_NAME,
+            ])
+        }
     }
 }

 pub fn manifest_path(prefix: &str, tenant_id: &Option<String>) -> RelativePathBuf {
-    let tenant = tenant_id.as_ref().map_or("", |v| v);
+    let tenant = tenant_id.as_deref();
     ...
-    RelativePathBuf::from_iter([tenant, prefix, &manifest_file_name])
+    if let Some(tenant) = tenant {
+        RelativePathBuf::from_iter([tenant, prefix, &manifest_file_name])
+    } else {
+        RelativePathBuf::from_iter([prefix, &manifest_file_name])
+    }
 }
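The repeated `if let Some(tenant)` branches can be collapsed by treating the optional tenant as a zero-or-one element iterator and chaining it in front of the fixed segments. The sketch below works on plain strings joined with "/" rather than `RelativePathBuf`, and the segment names in the usage are illustrative; how `RelativePathBuf::from_iter` itself treats an empty component is not verified here.

```rust
/// Join path segments, prepending the tenant only when one is set.
fn scoped_path(tenant_id: &Option<String>, segments: &[&str]) -> String {
    tenant_id
        .as_deref()
        .into_iter() // yields zero or one tenant segment
        .chain(segments.iter().copied())
        .collect::<Vec<_>>()
        .join("/")
}

/// The flagged pattern: a missing tenant becomes an empty leading segment.
fn naive_path(tenant_id: &Option<String>, segments: &[&str]) -> String {
    let tenant = tenant_id.as_deref().unwrap_or("");
    std::iter::once(tenant)
        .chain(segments.iter().copied())
        .collect::<Vec<_>>()
        .join("/")
}
```

With a string join, the naive variant produces a leading separator for `None`, while the chained variant produces the same path as the pre-tenancy layout.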
src/metastore/metastores/object_store_metastore.rs (5)

342-390: put_alert_state does not use tenant_id in path construction.

Similar to get_alert_state_entry, the tenant_id parameter is accepted but not used when calling alert_state_json_path at line 352.

Proposed fix
-        let path = alert_state_json_path(id);
+        let path = alert_state_json_path(id, tenant_id);

1028-1049: get_all_schemas does not use tenant_id in path construction.

The path is constructed as {stream_name}/{STREAM_ROOT_DIRECTORY} without tenant prefix, which would fetch schemas from the wrong location for tenant-scoped streams.

Proposed fix
     async fn get_all_schemas(
         &self,
         stream_name: &str,
         tenant_id: &Option<String>,
     ) -> Result<Vec<Schema>, MetastoreError> {
-        let path_prefix =
-            relative_path::RelativePathBuf::from(format!("{stream_name}/{STREAM_ROOT_DIRECTORY}"));
+        let path_prefix = if let Some(tenant) = tenant_id {
+            relative_path::RelativePathBuf::from(format!("{tenant}/{stream_name}/{STREAM_ROOT_DIRECTORY}"))
+        } else {
+            relative_path::RelativePathBuf::from(format!("{stream_name}/{STREAM_ROOT_DIRECTORY}"))
+        };

864-866: date_path in get_all_manifest_files doesn't include tenant prefix.

While root is correctly constructed with tenant prefix, the date_path on line 865 only uses stream_name without the tenant, which may cause incorrect path resolution.

Proposed fix
         for date in dates {
-            let date_path = object_store::path::Path::from(format!("{}/{}", stream_name, &date));
+            let date_path = object_store::path::Path::from(format!("{}/{}", root, &date));
             let resp = self.storage.list_with_delimiter(Some(date_path)).await?;

323-340: alert_state_json_path function signature must be updated to accept and use tenant_id.

The get_alert_state_entry, put_alert_state, and delete_alert_state methods accept tenant_id but don't use it when constructing paths. This breaks tenant isolation—different tenants can access and modify each other's alert states.

The root cause is that alert_state_json_path(alert_id: Ulid) doesn't accept tenant_id, unlike related functions such as alert_json_path and mttr_json_path which properly scope paths by tenant. The get_alert_states method correctly demonstrates the pattern by constructing tenant-scoped paths: {tenant}/.alerts/.

Update alert_state_json_path to accept tenant_id and include it in the path construction, similar to how alert_json_path handles tenants. Then update all callers to pass tenant_id.


392-403: Unused tenant_id parameter creates inconsistent behavior in delete/put methods.

Methods like delete_alert_state, delete_alert, delete_target, and others accept tenant_id but ignore it when constructing paths. However, the corresponding get_* methods use tenant_id to retrieve the same data (e.g., get_alert_states retrieves from [&tenant, ALERTS_ROOT_DIRECTORY] but delete_alert_state uses the tenant-independent alert_state_json_path(id)). This inconsistency creates cross-tenant data isolation risks.

For example:

  • Target struct has a tenant field, but get_object_path() doesn't include it, yet get_targets() retrieves from tenant-specific paths
  • AlertStateEntry is retrieved with tenant context in get_alert_states() but deleted without it in delete_alert_state()

The pattern affects: delete_alert, delete_alert_state, put_llmconfig, delete_llmconfig, put_dashboard, delete_dashboard, put_chat, delete_chat, put_filter, delete_filter, put_correlation, delete_correlation, put_target, delete_target.

Either remove the unused parameter from the method signature, or ensure the path construction includes tenant context consistently with how data is retrieved.

🤖 Fix all issues with AI agents
In @src/alerts/mod.rs:
- Around line 1244-1254: The update() method (and similarly update_state() and
update_notification_state()) currently ignores writes when
self.alerts.write().await.get_mut(tenant) returns None; change the logic to
ensure a tenant bucket is created when missing before inserting: acquire the
write lock on self.alerts and use an entry-or-insert pattern (or explicitly
insert a default bucket for tenant/DEFAULT_TENANT) so that
alerts.insert(*alert.get_id(), alert.clone_box()) always runs for first-time
tenants or racing initializations; apply the same fix to the other referenced
functions (update_state, update_notification_state) that use get_mut(tenant).

In @src/correlation.rs:
- Around line 203-206: The memory delete is removing from the outer map using
correlation.id (self.write().await.remove(&correlation.id)) which deletes a
tenant entry; instead, acquire the write lock, find the tenant's CorrelationMap
by tenant_id, and remove the correlation.id from that inner map (and optionally
remove the tenant key if the inner map becomes empty). Update the code that
follows PARSEABLE.metastore.delete_correlation to lookup
self.write().await.get_mut(&tenant_id) or equivalent, call
inner_map.remove(&correlation.id), and only fall back to removing the outer
entry if you explicitly intend to delete the tenant when its map is empty.
- Around line 140-144: create() persists correlations but only inserts into the
in-memory map if the tenant bucket already exists, so new tenant buckets are
never created and the in-memory cache is inconsistent; update the write lock
usage to use the HashMap entry API (e.g., on self.write().await) for tenant
(using tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v)) so you call
entry(tenant).or_insert_with(|| /* new map */) and then insert the correlation
(correlation.id / correlation.clone()) into that returned bucket; ensure this
change is applied where the current code checks if let Some(corrs) =
self.write().await.get_mut(tenant) and also verify get_correlation reads from
the same map.
- Around line 176-183: The current update() silently does nothing when the
tenant bucket is missing; change the in-memory update to use the map's entry API
instead of get_mut so a missing bucket is created and the correlation is
inserted. Specifically, replace the get_mut(tenant) branch with an entry for the
tenant (using DEFAULT_TENANT when tenant_id is None) that either returns the
existing corrs map or inserts a new empty map, then put updated_correlation.id
-> updated_correlation.clone() into that map (referencing tenant,
DEFAULT_TENANT, updated_correlation, and corrs.insert).

In @src/handlers/http/cluster/mod.rs:
- Around line 319-322: The sync_streams_with_ingestors call is dropping tenant
context: uncomment and restore the tenant_id parameter in
sync_streams_with_ingestors's signature and update its implementation to attach
tenant_id to the outgoing sync payload/headers, then propagate tenant_id from
put_stream (where it is extracted) into the call; also update
create_update_stream so it preserves and returns headers that include tenant_id
instead of discarding them. Mirror the pattern used by
sync_role_update_with_ingestors and fetch_stats_from_ingestors: add tenant_id
parameter to signatures, forward it through all callers, and ensure ingestor
requests include the tenant_id header/field so stream syncs honor tenant
isolation.

In @src/handlers/http/modal/ingest/ingestor_role.rs:
- Around line 46-52: The tenant validation is inverted: currently it rejects
when a non-default request tenant matches the payload tenant. Update the
condition in the block that uses get_tenant_id_from_request, req_tenant_id,
req_tenant, DEFAULT_TENANT and sync_req.tenant_id to reject only when the
request tenant is not the default AND does not equal the payload tenant (i.e.,
change the second check to a mismatch), and keep returning RoleError::Anyhow
with the same message in that case.
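
The corrected check for ingestor_role.rs can be sketched in isolation. The function name, the `String` error type, and the value of `DEFAULT_TENANT` are assumptions made for this example; the real handler returns `RoleError::Anyhow`.

```rust
// Assumed value; the real constant is defined elsewhere in the codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Rejects a role sync only when a non-default request tenant differs
/// from the tenant named in the payload.
fn validate_sync_tenant(req_tenant: &str, payload_tenant: &str) -> Result<(), String> {
    if req_tenant != DEFAULT_TENANT && req_tenant != payload_tenant {
        return Err(format!(
            "tenant mismatch: request `{req_tenant}` vs payload `{payload_tenant}`"
        ));
    }
    Ok(())
}
```

The inverted version would use `==` in the second comparison, which rejects exactly the requests that should be allowed.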

In @src/handlers/http/modal/ingest/mod.rs:
- Around line 27-36: The SyncRole struct currently has private fields which
break external access in ingestor_role.rs; either make the fields public (change
privileges: Vec<DefaultPrivilege> and tenant_id: String to pub privileges:
Vec<DefaultPrivilege> and pub tenant_id: String in SyncRole) or add accessor
methods (e.g., impl SyncRole { pub fn tenant_id(&self) -> &String {
&self.tenant_id } and pub fn privileges(&self) -> &Vec<DefaultPrivilege> {
&self.privileges } }) and update ingestor_role.rs to use those getters if
chosen.

In @src/handlers/http/modal/query/querier_logstream.rs:
- Around line 73-75: The delete_stream call on PARSEABLE.storage currently omits
tenant scoping and may delete across tenants; modify the deletion to use tenant
context by passing tenant_id to delete_stream (or build the tenant-scoped path
before calling it). Locate PARSEABLE.storage.delete_stream and change the
invocation to include tenant_id and stream_name (or compute path = [tenant_id,
stream_name, ...] and call delete_stream(path)) so the removal is scoped to the
requesting tenant.

In @src/handlers/http/oidc.rs:
- Around line 286-290: Existing OIDC user creation is omitting tenant context:
in the match over (existing_user, final_roles) replace the None tenant passed to
put_user with the extracted tenant_id so new users are created within the tenant
scope (i.e., call put_user(&user_id, roles, user_info, bearer, tenant_id)); if
leaving it intentionally unset, instead add a clear TODO with a tracking issue
reference next to the (None, roles) arm to avoid silent multi-tenant gaps;
update related comments to reflect the chosen approach and ensure
existing_user/lookups use the same tenant_id variable.

In @src/handlers/http/rbac.rs:
- Around line 147-148: The user is created without tenant context by calling
user::User::new_basic(username.clone(), None) which breaks tenant scoping;
change that call to pass the extracted tenant_id (e.g. Some(tenant_id.clone())
or wrap/convert tenant_id to the expected tenant type) so the new user is
affiliated with the correct tenant, and ensure the surrounding code imports/uses
tenant_id and matches the function signature of user::User::new_basic.

In @src/hottier.rs:
- Around line 100-116: The loop filtering logic incorrectly uses separate
inequality checks for stream and tenant; replace the dual checks so we skip only
when both the stream and tenant match the current ones. Concretely, in the
for-loop condition that currently uses stream != current_stream && tenant_id !=
*current_tenant_id, change it to exclude entries only when (stream ==
current_stream && tenant_id == *current_tenant_id) — e.g., use && !(stream ==
current_stream && tenant_id == *current_tenant_id) or equivalent — so
check_stream_hot_tier_exists(&stream, &tenant_id) && !(stream == current_stream
&& tenant_id == *current_tenant_id) before calling get_hot_tier and accumulating
into total_hot_tier_size/total_hot_tier_used_size.
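
The difference between the two hot-tier conditions can be seen with two small predicates. Both function names are hypothetical and exist only to compare the boolean logic quoted above.

```rust
/// Condition as currently written: keeps an entry only when BOTH the
/// stream name and the tenant differ from the current pair.
fn include_current_logic(
    stream: &str,
    tenant: &Option<String>,
    cur_stream: &str,
    cur_tenant: &Option<String>,
) -> bool {
    stream != cur_stream && tenant != cur_tenant
}

/// Proposed condition: skips only the exact (stream, tenant) pair.
fn include_proposed_logic(
    stream: &str,
    tenant: &Option<String>,
    cur_stream: &str,
    cur_tenant: &Option<String>,
) -> bool {
    !(stream == cur_stream && tenant == cur_tenant)
}
```

A stream named `logs` owned by another tenant is dropped by the first predicate and counted by the second, which is the case the review is concerned about.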

In @src/metastore/metastores/object_store_metastore.rs:
- Around line 949-954: The put_manifest (and delete_manifest) implementation
uses manifest_path("", tenant_id) which creates a different hierarchy than
get_manifest_path that calls manifest_path(path.as_str(), tenant_id); update
put_manifest (and delete_manifest) to pass the full partition path into
manifest_path (i.e., build the partition_path string first via
partition_path(stream_name, lower_bound, upper_bound) and use that string as the
first argument to manifest_path) so manifest_path, get_manifest_path,
put_manifest, and delete_manifest all construct the same final object key
consistently.

In @src/parseable/mod.rs:
- Around line 1073-1109: The function load_tenants has an empty branch for
single-tenant mode and silently swallows a poisoned write lock; change the
tenant-meta handling so that when get_parseable_metadata returns Some(...) and
is_multi_tenant is false you return an error (e.g., "Found tenant directory
while not in multi-tenant mode") instead of doing nothing, and replace the last
block that currently does if let Ok(mut t) = self.tenants.write() { ... } else {
Ok(None) } with a match that on Ok(mut t) extends and returns Ok(Some(())) and
on Err(poison) converts the poisoned lock into an anyhow::Error and returns
Err(...) so callers always get an Err on real failures rather than Ok(None); use
the existing symbols load_tenants, PARSEABLE.metastore.get_parseable_metadata,
TENANT_METADATA.insert, and self.tenants.write() to locate and update the logic.
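
A sketch of the lock-handling half of this change, using only the standard library. The `Tenants` struct, the `Vec<String>` payload, and the `String` error type are assumptions; the real code would convert the poison error into an `anyhow::Error`.

```rust
use std::sync::RwLock;

/// Hypothetical holder for the tenant list.
struct Tenants {
    tenants: RwLock<Vec<String>>,
}

impl Tenants {
    /// Extends the tenant list and reports a poisoned lock as an error
    /// instead of returning `Ok(None)`.
    fn extend(&self, found: Vec<String>) -> Result<Option<()>, String> {
        match self.tenants.write() {
            Ok(mut t) => {
                t.extend(found);
                Ok(Some(()))
            }
            // A poisoned lock means another thread panicked while holding it.
            Err(poison) => Err(format!("tenant lock poisoned: {poison}")),
        }
    }
}
```

Callers can then treat `Err` as a real failure and no longer have to guess what `Ok(None)` means.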

In @src/prism/logstream/mod.rs:
- Around line 71-73: The code has a hardcoded fallback—QueriedStats::default()
with a tracing::warn—that replaces the real query result; revert to using the
actual result by restoring the original assignment (replace the fake "let stats
= QueriedStats::default();" and warning with the real "let stats = stats?;" or
equivalent error-propagating handling), remove the debugging warn, and ensure
the enclosing function signature allows propagation of the error (or map the
error to the appropriate error type) so the real stats are returned instead of
defaults.

In @src/query/mod.rs:
- Around line 136-167: The registered per-tenant schemas (via
catalog.register_schema in create_session_context and GlobalSchemaProvider)
don’t give unqualified table names a way to resolve to the current tenant at
query time; fix by applying tenant-specific schema routing per request rather
than only at transform time: either (A) ensure SQL is rewritten to qualify table
names with the tenant (e.g., "tenant"."table") before parsing/execution, or (B)
set the session’s default schema/search path per request using the tenant_id
(update the SessionContext/SessionState before parsing/execution in the request
path that calls execute()) so unqualified names resolve to the tenant’s
registered schema; update code references create_session_context,
GlobalSchemaProvider, catalog.register_schema and the request/execution
entrypoint that passes tenant_id to apply the per-request default schema.

In @src/query/stream_schema_provider.rs:
- Around line 284-291: The borrow-of-temporary and unwrap are present here as in
get_hottier_execution_plan: stop passing a reference to a temporary format!
result and remove unwrap; construct an owned String for object_store_url (e.g.
let object_store_url = if let Some(tenant_id) = self.tenant_id.as_ref() {
format!("file:///{tenant_id}/") } else { "file:///".to_string() }) and then call
ObjectStoreUrl::parse(&object_store_url) handling the Result (propagate with ?
or map_err to a descriptive error) before passing the parsed ObjectStoreUrl into
create_parquet_physical_plan; update the surrounding function signature to
return Result if needed.
- Around line 224-231: The code currently takes a reference to a temporary
String with &format!(...) and then calls ObjectStoreUrl::parse(...).unwrap(),
which risks a borrow-of-temporary and panics on invalid input; change to build
an owned String (e.g., let object_store_url_string = if let Some(tenant_id) =
self.tenant_id.as_ref() { format!("file:///{tenant_id}/") } else {
"file:///".to_string() }) and then call
ObjectStoreUrl::parse(&object_store_url_string) but handle the Result instead of
unwrap (propagate the error, return a Result, or map_err with a descriptive
error) before passing the parsed ObjectStoreUrl into
create_parquet_physical_plan so no temporary borrow or panic occurs; refer to
tenant_id, object_store_url_string, ObjectStoreUrl::parse, and
create_parquet_physical_plan.
- Around line 631-638: Replace the unwraps with proper error propagation in the
scan() flow: when building object_store_url use
glob_storage.store_url().join(tenant_id).map_err(|e|
DataFusionError::Execution(format!("joining tenant id into store URL failed:
{}", e)))? (or propagate with ? after mapping to DataFusionError), and when
converting to ObjectStoreUrl call
ObjectStoreUrl::parse(object_store_url).map_err(|e|
DataFusionError::Execution(format!("parsing ObjectStoreUrl failed: {}", e)))? so
the errors bubble up from the join and parse calls instead of panicking; keep
the call to self.create_parquet_physical_plan(...) but pass the parsed
ObjectStoreUrl result.

In @src/rbac/map.rs:
- Around line 62-104: Remove the sensitive debug logging that can leak BasicAuth
passwords: delete the tracing::warn! calls in users(), mut_users(), roles(), and
mut_roles() (remove the warn lines that print caller info and {by}). Also update
remove_user() (the code that logs active_sessions and user_sessions around lines
276-306) to stop printing SessionKey::BasicAuth passwords — either remove those
session logs or map/redact sessions before logging (e.g., log only session IDs,
types, or a redacted flag, not the password). Ensure no SessionKey or
Display/Debug impl reveals the password in any log emitted by remove_user(),
users(), mut_users(), roles(), or mut_roles().
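
One way to keep passwords out of logs is a hand-written `Debug` impl that redacts them. The enum below is a hypothetical mirror of a session key with two variants; the real `SessionKey` in the codebase may be shaped differently.

```rust
use std::fmt;

/// Hypothetical session key used only for this sketch.
enum SessionKey {
    BasicAuth { username: String, password: String },
    SessionId(String),
}

/// Manual `Debug` that never prints the password.
impl fmt::Debug for SessionKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SessionKey::BasicAuth { username, .. } => f
                .debug_struct("BasicAuth")
                .field("username", username)
                .field("password", &"<redacted>")
                .finish(),
            SessionKey::SessionId(id) => f.debug_tuple("SessionId").field(id).finish(),
        }
    }
}
```

With this in place, any `{:?}` formatting of a session, including inside a `tracing` macro, shows the username but not the secret.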

In @src/users/filters.rs:
- Around line 128-133: The update method currently uses get_mut on self.0 and
silently drops filters when the tenant bucket doesn't exist; change update to
use the map's entry API so a bucket is created if missing (use
entry(tenant_key_or_DEFAULT_TENANT).or_default()), then remove any existing
filter with matching filter_id (retain) and push the cloned filter into that
bucket; reference the update function, DEFAULT_TENANT, self.0, retain, and push
when locating where to replace the get_mut logic.
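
The filter update described for src/users/filters.rs can be sketched as follows. The `Filter` fields, the synchronous `&mut self` receiver, and the value of `DEFAULT_TENANT` are simplifying assumptions; the real type sits behind a lock and carries more data.

```rust
use std::collections::HashMap;

// Assumed value; the real constant is defined elsewhere in the codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Simplified stand-in for the real filter type.
#[derive(Clone, Debug, PartialEq)]
struct Filter {
    filter_id: Option<String>,
    name: String,
}

/// Per-tenant filter buckets.
#[derive(Default)]
struct Filters(HashMap<String, Vec<Filter>>);

impl Filters {
    /// Replaces the filter with the same id, creating the tenant bucket on demand.
    fn update(&mut self, filter: &Filter, tenant_id: &Option<String>) {
        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
        let bucket = self.0.entry(tenant.to_owned()).or_default();
        // Drop any older copy of this filter, then store the new one.
        bucket.retain(|f| f.filter_id != filter.filter_id);
        bucket.push(filter.clone());
    }
}
```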
🟡 Minor comments (7)
src/hottier.rs-596-603 (1)

596-603: Avoid unwrap() on hot_tier_file_path result - could panic on path errors.

hot_tier_file_path returns a Result and can fail (e.g., on invalid path conversion). Using unwrap() here could cause a panic and crash the service. Since this is an existence check, it should gracefully return false on path errors.

Also, remove the commented-out dead code (lines 597-600).

🐛 Proposed fix
 pub fn check_stream_hot_tier_exists(&self, stream: &str, tenant_id: &Option<String>) -> bool {
-    // let path = self
-    //     .hot_tier_path
-    //     .join(stream)
-    //     .join(STREAM_HOT_TIER_FILENAME);
-    let path = self.hot_tier_file_path(stream, tenant_id).unwrap();
-    PathBuf::from(path.to_string()).exists()
+    match self.hot_tier_file_path(stream, tenant_id) {
+        Ok(path) => PathBuf::from(path.to_string()).exists(),
+        Err(_) => false,
+    }
 }
src/utils/mod.rs-79-85 (1)

79-85: Potential panic on invalid UTF-8 header value.

tenant_value.to_str().unwrap() will panic if the tenant header contains non-UTF8 bytes. Consider handling the error gracefully.

🔧 Proposed fix
 pub fn get_tenant_id_from_request(req: &HttpRequest) -> Option<String> {
     if let Some(tenant_value) = req.headers().get("tenant") {
-        Some(tenant_value.to_str().unwrap().to_owned())
+        tenant_value.to_str().ok().map(|s| s.to_owned())
     } else {
         None
     }
 }
src/handlers/http/middleware.rs-167-177 (1)

167-177: Potential panic on invalid tenant_id header value.

HeaderValue::from_str(&tid).unwrap() will panic if tid contains characters that are invalid in HTTP headers (e.g., non-visible ASCII). Consider handling the error gracefully.

🔧 Proposed fix
         let user_and_tenant_id = match get_user_and_tenant_from_request(req.request()) {
             Ok((uid, tid)) => {
-                req.headers_mut().insert(
-                    HeaderName::from_static("tenant"),
-                    HeaderValue::from_str(&tid).unwrap(),
-                );
+                if let Ok(header_value) = HeaderValue::from_str(&tid) {
+                    req.headers_mut().insert(
+                        HeaderName::from_static("tenant"),
+                        header_value,
+                    );
+                }
                 Ok((uid, tid))
             }
             Err(e) => Err(e),
         };
src/rbac/map.rs-95-104 (1)

95-104: Minor: mut_roles() log message says mut_users.
Copy/paste typo makes debugging harder.

src/catalog/mod.rs-455-461 (1)

455-461: warn! for manifest path_url is too loud (and tenant-revealing).
This should likely be debug!/trace! (or removed).

src/storage/object_storage.rs-109-117 (1)

109-117: Remove/reduce warn! debug spam from the hot path.
These look like temporary diagnostics and will flood logs (and may expose tenant IDs and local paths). Prefer debug!/trace! behind a feature flag if needed.

Also applies to: 460-461, 964-973, 1078-1084, 1086-1116

src/metastore/metastore_traits.rs-172-182 (1)

172-182: Remove unused get_chats() method or align return type with other similar methods.

get_chats() returns DashMap<String, Vec<Bytes>> while all comparable methods (get_dashboards(), get_filters(), get_correlations(), get_keystones(), etc.) return HashMap. The DashMap choice is inconsistent with the pattern used throughout the trait. Additionally, this method is not called anywhere in the codebase, making it dead code. Either remove the method if no longer needed, or change the return type to HashMap to match other similar getters.

Comment on lines 1244 to 1254
 /// Update the in-mem vector of alerts
 async fn update(&self, alert: &dyn AlertTrait) {
-    self.alerts
-        .write()
-        .await
-        .insert(*alert.get_id(), alert.clone_box());
+    let tenant = alert.get_tenant_id().as_ref().map_or(DEFAULT_TENANT, |v| v);
+    if let Some(alerts) = self.alerts.write().await.get_mut(tenant) {
+        alerts.insert(*alert.get_id(), alert.clone_box());
+    }
+    // self.alerts
+    //     .write()
+    //     .await
+    //     .insert(*alert.get_id(), alert.clone_box());
 }

⚠️ Potential issue | 🟠 Major

Don’t silently drop alert updates when the tenant bucket doesn’t exist.
update(), the write-back in update_state(), and update_notification_state() only insert if get_mut(tenant) returns Some(_). For first-time tenants (or races with initialization), this can lose writes.

Proposed fix (always create tenant bucket)
 async fn update(&self, alert: &dyn AlertTrait) {
     let tenant = alert.get_tenant_id().as_ref().map_or(DEFAULT_TENANT, |v| v);
-    if let Some(alerts) = self.alerts.write().await.get_mut(tenant) {
-        alerts.insert(*alert.get_id(), alert.clone_box());
-    }
+    let mut guard = self.alerts.write().await;
+    guard
+        .entry(tenant.to_owned())
+        .or_default()
+        .insert(*alert.get_id(), alert.clone_box());
 }

 // In update_state() write-back block:
 {
     let mut write_access = self.alerts.write().await;
     let tenant = alert.get_tenant_id().as_ref().map_or(DEFAULT_TENANT, |v| v);
-    if let Some(alerts) = write_access.get_mut(tenant) {
-        alerts.insert(*alert.get_id(), alert.clone_box());
-    }
+    write_access
+        .entry(tenant.to_owned())
+        .or_default()
+        .insert(*alert.get_id(), alert.clone_box());
 }

 // In update_notification_state():
- if let Some(alerts) = write_access.get_mut(tenant) {
-     alerts.insert(*alert.get_id(), alert.clone_box());
- }
+ write_access
+     .entry(tenant.to_owned())
+     .or_default()
+     .insert(*alert.get_id(), alert.clone_box());

Also applies to: 1333-1342, 1347-1386, 1388-1404
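
The lost-write behaviour is easy to reproduce with plain nested maps. In this sketch the `Alerts` alias, the `u64` ids, and the `String` payload are stand-ins for the real types, and the async lock is left out.

```rust
use std::collections::HashMap;

/// tenant -> (alert id -> alert), simplified for illustration.
type Alerts = HashMap<String, HashMap<u64, String>>;

/// Mirrors the current pattern: the write is skipped when the bucket is missing.
fn update_if_present(alerts: &mut Alerts, tenant: &str, id: u64, alert: &str) {
    if let Some(bucket) = alerts.get_mut(tenant) {
        bucket.insert(id, alert.to_owned());
    }
}

/// Mirrors the proposed pattern: the bucket is created on first use.
fn update_or_create(alerts: &mut Alerts, tenant: &str, id: u64, alert: &str) {
    alerts
        .entry(tenant.to_owned())
        .or_default()
        .insert(id, alert.to_owned());
}
```

Calling `update_if_present` on an empty map leaves it empty, while `update_or_create` stores the alert under a fresh tenant bucket.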


Comment on lines +140 to +144
+        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
         // Update in memory
-        self.write()
-            .await
-            .insert(correlation.id.to_owned(), correlation.clone());
+        if let Some(corrs) = self.write().await.get_mut(tenant) {
+            corrs.insert(correlation.id.to_owned(), correlation.clone());
+        }

⚠️ Potential issue | 🟠 Major

Correlation silently not added to memory when tenant bucket doesn't exist.

In create(), if the tenant bucket doesn't exist in the in-memory map (e.g., first correlation for a tenant after restart), the correlation is persisted to metastore but not added to the in-memory cache. This could cause inconsistencies where get_correlation fails even after successful creation.

🐛 Proposed fix using entry API
         let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
         // Update in memory
-        if let Some(corrs) = self.write().await.get_mut(tenant) {
-            corrs.insert(correlation.id.to_owned(), correlation.clone());
-        }
+        self.write()
+            .await
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(correlation.id.to_owned(), correlation.clone());

Comment on lines +176 to +183
+        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
         // Update in memory
-        self.write().await.insert(
-            updated_correlation.id.to_owned(),
-            updated_correlation.clone(),
-        );
+        if let Some(corrs) = self.write().await.get_mut(tenant) {
+            corrs.insert(
+                updated_correlation.id.to_owned(),
+                updated_correlation.clone(),
+            );
+        }

⚠️ Potential issue | 🟠 Major

Same issue: update() silently fails when tenant bucket doesn't exist.

Apply the same fix using the entry API to ensure the correlation is always inserted.

🐛 Proposed fix
         let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
         // Update in memory
-        if let Some(corrs) = self.write().await.get_mut(tenant) {
-            corrs.insert(
-                updated_correlation.id.to_owned(),
-                updated_correlation.clone(),
-            );
-        }
+        self.write()
+            .await
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(updated_correlation.id.to_owned(), updated_correlation.clone());

Comment on lines 203 to 206
PARSEABLE.metastore.delete_correlation(&correlation, tenant_id).await?;

// Delete from memory
self.write().await.remove(&correlation.id);

⚠️ Potential issue | 🔴 Critical

Critical bug: remove operates on wrong map level, deletes entire tenant's correlations.

Line 206 calls self.write().await.remove(&correlation.id) which removes an entry from the outer HashMap<String, CorrelationMap> using correlation.id as the key. This is incorrect—it should remove the correlation from the inner CorrelationMap for the specific tenant. As written, this could corrupt the correlation store by deleting an unrelated tenant's data (if a tenant_id happens to match a correlation_id) or silently fail.

🐛 Proposed fix
         // Delete from storage
         PARSEABLE.metastore.delete_correlation(&correlation, tenant_id).await?;

         // Delete from memory
-        self.write().await.remove(&correlation.id);
+        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v.as_str());
+        if let Some(corrs) = self.write().await.get_mut(tenant) {
+            corrs.remove(&correlation.id);
+        }

         Ok(())
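
The two map levels can be illustrated with a small standalone function. The `CorrelationMap` alias here maps ids to plain strings and the function name is hypothetical; the real store holds correlation configs behind an async lock.

```rust
use std::collections::HashMap;

/// correlation id -> correlation, simplified to strings for this sketch.
type CorrelationMap = HashMap<String, String>;

/// Removes one correlation from the tenant's inner map.
/// Returns true when something was actually removed.
fn remove_correlation(
    store: &mut HashMap<String, CorrelationMap>,
    tenant: &str,
    correlation_id: &str,
) -> bool {
    match store.get_mut(tenant) {
        Some(corrs) => corrs.remove(correlation_id).is_some(),
        // Unknown tenant: nothing to delete, and other tenants stay untouched.
        None => false,
    }
}
```

Calling `remove` on the outer map with a correlation id either does nothing or removes a whole tenant whose name happens to equal that id, which is why the lookup has to go through the tenant key first.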

Comment on lines +224 to +231
+        let object_store_url = if let Some(tenant_id) = self.tenant_id.as_ref() {
+            &format!("file:///{tenant_id}/")
+        } else {
+            "file:///"
+        };
         self.create_parquet_physical_plan(
             execution_plans,
-            ObjectStoreUrl::parse("file:///").unwrap(),
+            ObjectStoreUrl::parse(object_store_url).unwrap(),

⚠️ Potential issue | 🟠 Major

Potential borrow-of-temporary issue and unhandled parse error.

  1. The pattern &format!("file:///{tenant_id}/") creates a reference to a temporary String that may be dropped before use. This typically causes a compile error ("temporary value dropped while borrowed") or leads to fragile code.

  2. The unwrap() on ObjectStoreUrl::parse() can panic if tenant_id contains characters that produce an invalid URL.

Suggested fix using owned String
-        let object_store_url = if let Some(tenant_id) = self.tenant_id.as_ref() {
-            &format!("file:///{tenant_id}/")
-        } else {
-            "file:///"
-        };
+        let object_store_url = match self.tenant_id.as_ref() {
+            Some(tenant_id) => format!("file:///{tenant_id}/"),
+            None => "file:///".to_string(),
+        };
         self.create_parquet_physical_plan(
             execution_plans,
-            ObjectStoreUrl::parse(object_store_url).unwrap(),
+            ObjectStoreUrl::parse(&object_store_url)
+                .map_err(|e| DataFusionError::Plan(format!("Invalid object store URL: {e}")))?,

Comment on lines +284 to +291
+        let object_store_url = if let Some(tenant_id) = self.tenant_id.as_ref() {
+            &format!("file:///{tenant_id}/")
+        } else {
+            "file:///"
+        };
         self.create_parquet_physical_plan(
             execution_plans,
-            ObjectStoreUrl::parse("file:///").unwrap(),
+            ObjectStoreUrl::parse(object_store_url).unwrap(),

⚠️ Potential issue | 🟠 Major

Same borrow-of-temporary and unwrap issues as hot tier path.

This duplicates the problematic pattern from get_hottier_execution_plan. Apply the same fix using an owned String and proper error handling for ObjectStoreUrl::parse().


Comment on lines +631 to +638
+        let object_store_url = if let Some(tenant_id) = self.tenant_id.as_ref() {
+            glob_storage.store_url().join(tenant_id).unwrap()
+        } else {
+            glob_storage.store_url()
+        };
         self.create_parquet_physical_plan(
             &mut execution_plans,
-            ObjectStoreUrl::parse(glob_storage.store_url()).unwrap(),
+            ObjectStoreUrl::parse(object_store_url).unwrap(),

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's look at the actual file and the specific lines
cat -n src/query/stream_schema_provider.rs | sed -n '625,645p'

Repository: parseablehq/parseable

Length of output: 1004


🏁 Script executed:

#!/bin/bash
# Find the store_url() method definition
rg -n "fn store_url" --type rust -B 2 -A 5

Repository: parseablehq/parseable

Length of output: 2397


🏁 Script executed:

#!/bin/bash
# Check ObjectStoreUrl type and its methods, particularly parse() and join()
rg -n "ObjectStoreUrl" --type rust | head -30

Repository: parseablehq/parseable

Length of output: 1227


🏁 Script executed:

#!/bin/bash
# Look for the actual implementation of join() on the store_url return type
rg -n "pub fn join" --type rust -A 3

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

#!/bin/bash
# Check the url crate's join() method behavior - look for examples in the codebase
rg -n "\.join\(" --type rust src/query/stream_schema_provider.rs -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 585


🏁 Script executed:

#!/bin/bash
# Check how other code in this file handles ObjectStoreUrl::parse errors
rg -n "ObjectStoreUrl::parse" --type rust -B 2 -A 2 src/query/stream_schema_provider.rs

Repository: parseablehq/parseable

Length of output: 735


🏁 Script executed:

#!/bin/bash
# Look at the create_parquet_physical_plan signature to understand what type it expects
rg -n "fn create_parquet_physical_plan" --type rust -A 8 src/query/stream_schema_provider.rs

Repository: parseablehq/parseable

Length of output: 435


🏁 Script executed:

#!/bin/bash
# Check if there are any error handling patterns for url::Url operations elsewhere
rg -n "store_url\(\)\.join" --type rust -B 3 -A 3

Repository: parseablehq/parseable

Length of output: 640


🏁 Script executed:

#!/bin/bash
# Search for where tenant_id is assigned/validated in the scan method
rg -n "tenant_id" --type rust src/query/stream_schema_provider.rs | head -20

Repository: parseablehq/parseable

Length of output: 1225


🏁 Script executed:

#!/bin/bash
# Check the scan method signature and how tenant_id is provided
rg -n "fn scan" --type rust -A 15 src/query/stream_schema_provider.rs | head -50

Repository: parseablehq/parseable

Length of output: 745


Handle error cases in URL construction instead of unwrapping.

The join() call on line 632 and ObjectStoreUrl::parse() on line 638 should propagate errors rather than panic, since the scan() method returns Result<Arc<dyn ExecutionPlan>, DataFusionError>. The parse() call is necessary here for type conversion from url::Url to ObjectStoreUrl, not redundant.

Suggested fix with error handling
         let object_store_url = if let Some(tenant_id) = self.tenant_id.as_ref() {
-            glob_storage.store_url().join(tenant_id).unwrap()
+            glob_storage.store_url().join(tenant_id)
+                .map_err(|e| DataFusionError::Plan(format!("Invalid tenant path: {e}")))?
         } else {
             glob_storage.store_url()
         };
         self.create_parquet_physical_plan(
             &mut execution_plans,
-            ObjectStoreUrl::parse(object_store_url).unwrap(),
+            ObjectStoreUrl::parse(&object_store_url)
+                .map_err(|e| DataFusionError::Plan(format!("Invalid object store URL: {e}")))?,
🤖 Prompt for AI Agents
In @src/query/stream_schema_provider.rs around lines 631 - 638, Replace the
unwraps with proper error propagation in the scan() flow: when building
object_store_url use glob_storage.store_url().join(tenant_id).map_err(|e|
DataFusionError::Execution(format!("joining tenant id into store URL failed:
{}", e)))? (or propagate with ? after mapping to DataFusionError), and when
converting to ObjectStoreUrl call
ObjectStoreUrl::parse(object_store_url).map_err(|e|
DataFusionError::Execution(format!("parsing ObjectStoreUrl failed: {}", e)))? so
the errors bubble up from the join and parse calls instead of panicking; keep
the call to self.create_parquet_physical_plan(...) but pass the parsed
ObjectStoreUrl result.
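
The error-propagation pattern recommended above can be sketched without DataFusion or the `url` crate. This is a minimal, std-only illustration: `PlanError`, `tenant_store_url`, `plan_scan`, and the tenant validation rule are hypothetical stand-ins, not Parseable or DataFusion APIs.

```rust
use std::fmt;

/// Hypothetical error type standing in for `DataFusionError`.
#[derive(Debug, PartialEq)]
pub enum PlanError {
    InvalidTenant(String),
}

impl fmt::Display for PlanError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PlanError::InvalidTenant(t) => write!(f, "invalid tenant path segment: {t:?}"),
        }
    }
}

impl std::error::Error for PlanError {}

/// Builds a tenant-scoped store URL and returns an error instead of panicking.
/// The validation rule (non-empty, ASCII alphanumeric, `-` or `_`) is an
/// assumption made for this sketch only.
pub fn tenant_store_url(base: &str, tenant_id: Option<&str>) -> Result<String, PlanError> {
    let Some(tenant) = tenant_id else {
        return Ok(base.to_string());
    };
    let valid = !tenant.is_empty()
        && tenant
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
    if !valid {
        return Err(PlanError::InvalidTenant(tenant.to_string()));
    }
    let sep = if base.ends_with('/') { "" } else { "/" };
    Ok(format!("{base}{sep}{tenant}/"))
}

/// Caller that propagates the failure with `?`, mirroring a `scan()` that
/// already returns `Result`.
pub fn plan_scan(base: &str, tenant_id: Option<&str>) -> Result<String, PlanError> {
    let url = tenant_store_url(base, tenant_id)?;
    Ok(format!("scan {url}"))
}
```

Because the caller already returns `Result`, a bad tenant value surfaces as a query error rather than a process panic.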

Comment on lines +128 to +133
pub async fn update(&self, filter: &Filter, tenant_id: &Option<String>) {
let mut s = self.0.write().await;
s.retain(|f| f.filter_id != filter.filter_id);
s.push(filter.clone());
if let Some(filters) = s.get_mut(tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v)) {
filters.retain(|f| f.filter_id != filter.filter_id);
filters.push(filter.clone());
}

⚠️ Potential issue | 🟠 Major

Filter silently dropped when tenant bucket doesn't exist.

The update method only modifies existing tenant buckets via get_mut(). If the tenant entry doesn't exist in the map (e.g., when adding the first filter for a new tenant after a server restart), the filter will be silently dropped without insertion.

🐛 Proposed fix using entry API
     pub async fn update(&self, filter: &Filter, tenant_id: &Option<String>) {
         let mut s = self.0.write().await;
-        if let Some(filters) = s.get_mut(tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v)) {
-            filters.retain(|f| f.filter_id != filter.filter_id);
-            filters.push(filter.clone());
-        }
+        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v.as_str());
+        let filters = s.entry(tenant.to_owned()).or_default();
+        filters.retain(|f| f.filter_id != filter.filter_id);
+        filters.push(filter.clone());
     }
🤖 Prompt for AI Agents
In @src/users/filters.rs around lines 128 - 133, The update method currently
uses get_mut on self.0 and silently drops filters when the tenant bucket doesn't
exist; change update to use the map's entry API so a bucket is created if
missing (use entry(tenant_key_or_DEFAULT_TENANT).or_default()), then remove any
existing filter with matching filter_id (retain) and push the cloned filter into
that bucket; reference the update function, DEFAULT_TENANT, self.0, retain, and
push when locating where to replace the get_mut logic.
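
The entry-API approach described above can be tried in isolation. The sketch below uses a plain `HashMap` without the async lock; the `Filter` struct is a reduced stand-in, `upsert_filter` is a hypothetical name, and the value of `DEFAULT_TENANT` is assumed for illustration.

```rust
use std::collections::HashMap;

/// Assumed default bucket name; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Reduced stand-in for the real `Filter` type.
#[derive(Clone, Debug, PartialEq)]
pub struct Filter {
    pub filter_id: Option<String>,
    pub name: String,
}

/// Inserts or replaces `filter` in its tenant bucket, creating the bucket if
/// it does not exist yet.
pub fn upsert_filter(
    store: &mut HashMap<String, Vec<Filter>>,
    filter: &Filter,
    tenant_id: &Option<String>,
) {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    // `entry(..).or_default()` creates an empty Vec for a new tenant,
    // whereas `get_mut` would return `None` and the filter would be lost.
    let filters = store.entry(tenant.to_owned()).or_default();
    filters.retain(|f| f.filter_id != filter.filter_id);
    filters.push(filter.clone());
}
```

Calling `upsert_filter` twice with the same `filter_id` leaves a single entry in the bucket, and the first call for a new tenant creates the bucket.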

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (20)
src/migration/mod.rs (1)

499-504: Missing parent directory creation for tenant-specific staging path.

When tenant_id is Some, the path includes a tenant subdirectory (e.g., staging_dir/tenant_id/.parseable.json). If this tenant directory doesn't exist, OpenOptions::open() will fail with "No such file or directory" since create(true) only creates the file, not parent directories.

Suggested fix: create parent directory if needed
     } else {
         config.options.staging_dir().join(".parseable.json")
     };
+    if let Some(parent) = path.parent() {
+        std::fs::create_dir_all(parent)?;
+    }
     let mut file = OpenOptions::new()
         .create(true)
         .truncate(true)
         .write(true)
         .open(path)?;
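
A self-contained sketch of the same fix, using only the standard library. The helper names `staging_metadata_path` and `write_staging_metadata` are hypothetical; only `create_dir_all` and `OpenOptions` are real std APIs.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: resolves the metadata path, optionally under a
/// tenant subdirectory.
pub fn staging_metadata_path(staging_dir: &Path, tenant_id: Option<&str>) -> PathBuf {
    match tenant_id {
        Some(tenant) => staging_dir.join(tenant).join(".parseable.json"),
        None => staging_dir.join(".parseable.json"),
    }
}

/// Writes `contents` to `path`, creating missing parent directories first.
pub fn write_staging_metadata(path: &Path, contents: &[u8]) -> io::Result<()> {
    // `create(true)` only creates the file itself, so the tenant directory
    // has to exist before `open` is called.
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    let mut file = OpenOptions::new()
        .create(true)
        .truncate(true)
        .write(true)
        .open(path)?;
    file.write_all(contents)
}
```

`create_dir_all` succeeds when the directory already exists, so the call is safe to repeat on every write.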
src/handlers/http/rbac.rs (3)

128-136: Role existence check is not tenant-scoped.

The check roles().contains_key(role) at line 130 doesn't use tenant_id, but later role lookups (lines 218-224, 241-247) use tenant-scoped access via roles().get(tenant). This inconsistency could allow roles from other tenants to pass validation or cause false negatives.

🔧 Suggested fix
+    let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
     let mut non_existent_roles = Vec::new();
     for role in &user_roles {
-        if !roles().contains_key(role) {
+        if !roles().get(tenant).map_or(false, |r| r.contains_key(role)) {
             non_existent_roles.push(role.clone());
         }
     }

322-333: Role existence check not tenant-scoped (same issue as post_user).

Same issue as in post_user: the role existence check roles().contains_key(role) doesn't use tenant context, while subsequent lookups are tenant-scoped.

🔧 Suggested fix
+    let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
     let mut non_existent_roles = Vec::new();
 
     // check if the role exists
     for role in &roles_to_add {
-        if !roles().contains_key(role) {
+        if !roles().get(tenant).map_or(false, |r| r.contains_key(role)) {
             non_existent_roles.push(role.clone());
         }
     }

378-389: Role existence check not tenant-scoped (same pattern).

The remove_roles_from_user handler has the same tenant-scoping gap in role validation.

🔧 Suggested fix
+    let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
     let mut non_existent_roles = Vec::new();
 
     // check if the role exists
     for role in &roles_to_remove {
-        if !roles().contains_key(role) {
+        if !roles().get(tenant).map_or(false, |r| r.contains_key(role)) {
             non_existent_roles.push(role.clone());
         }
     }
src/handlers/http/cluster/mod.rs (1)

600-641: Unused HttpRequest parameter in sync_password_reset_with_ingestors.

The req: HttpRequest parameter is accepted but never used. If tenant context is needed, it should be extracted via get_tenant_id_from_request(&req). Otherwise, remove the unused parameter.

🔧 Suggested fix - either use it or remove it

Option 1 - Use it:

 pub async fn sync_password_reset_with_ingestors(
     req: HttpRequest,
     username: &str,
 ) -> Result<(), RBACError> {
     let username = username.to_owned();
+    let tenant_id = get_tenant_id_from_request(&req);

Option 2 - Remove if not needed:

 pub async fn sync_password_reset_with_ingestors(
-    req: HttpRequest,
     username: &str,
 ) -> Result<(), RBACError> {
src/alerts/alerts_utils.rs (1)

129-153: Pass auth_token to send_query_request or remove the unused parameter.

The auth_token parameter is accepted by execute_remote_query, but the function always passes None to send_query_request (line 148). The caller's auth token is therefore ignored and the request falls back to the querier's token. Either convert auth_token to a HeaderMap and pass it to send_query_request, or remove the parameter if it is not needed.

src/storage/store_metadata.rs (1)

298-320: Add parent directory creation for tenant-scoped staging metadata.

When tenant_id is present, the path includes a tenant subdirectory (line 306). OpenOptions::open will fail with NotFound if the parent directory doesn't exist. The fix is to create the parent directory before opening the file, following the standard Rust pattern.

Note: create_dir_all is already imported at line 21; the implementation can safely call it.

Proposed fix
 pub fn put_staging_metadata(meta: &StorageMetadata, tenant_id: &Option<String>) -> io::Result<()> {
     let mut staging_metadata = meta.clone();
     staging_metadata.server_mode = PARSEABLE.options.mode;
     staging_metadata.staging = PARSEABLE.options.staging_dir().to_path_buf();
     let path = if let Some(tenant_id) = tenant_id.as_ref() {
-        PARSEABLE
+        let tenant_dir = PARSEABLE
             .options
             .staging_dir()
-            .join(tenant_id)
-            .join(PARSEABLE_METADATA_FILE_NAME)
+            .join(tenant_id);
+        create_dir_all(&tenant_dir)?;
+        tenant_dir.join(PARSEABLE_METADATA_FILE_NAME)
     } else {
         PARSEABLE
             .options
             .staging_dir()
             .join(PARSEABLE_METADATA_FILE_NAME)
     };
src/catalog/mod.rs (2)

529-548: Retention cleanup request does not propagate tenant_id to ingestors.

The for_each_live_node call sends retention cleanup requests without including the tenant_id. In a multi-tenant setup, this could cause ingestors to delete data from the wrong tenant or fail to scope the cleanup correctly.

Consider passing tenant_id to the closure and including it in the cleanup request URL or payload.

+    let tenant_for_closure = tenant_id.clone();
     for_each_live_node(move |ingestor| {
         let stream_name = stream_name_clone.clone();
         let dates = dates_clone.clone();
+        let tenant_id = tenant_for_closure.clone();
         async move {
             let url = format!(
-                "{}{}/logstream/{}/retention/cleanup",
+                "{}{}/logstream/{}/retention/cleanup?tenant_id={}",
                 ingestor.domain_name,
                 base_path_without_preceding_slash(),
-                stream_name
+                stream_name,
+                tenant_id.as_deref().unwrap_or("")
             );

556-569: Inconsistent tenant_id handling pattern.

The partition_path function uses map_or("", |v| v) to handle the optional tenant_id, but this deviates from the established pattern in the same codebase. Functions like alert_json_path (line 1209) and alert_config_mttr_json_path (line 1244) explicitly use if let Some(tenant_id) to conditionally build paths without empty segments.

When tenant_id is None, passing an empty string to from_iter is inconsistent with similar functions and less explicit about intent. Align with the established pattern:

Proposed fix
 pub fn partition_path(
     stream: &str,
     lower_bound: DateTime<Utc>,
     upper_bound: DateTime<Utc>,
     tenant_id: &Option<String>,
 ) -> RelativePathBuf {
-    let root = tenant_id.as_ref().map_or("", |v| v);
     let lower = lower_bound.date_naive().format("%Y-%m-%d").to_string();
     let upper = upper_bound.date_naive().format("%Y-%m-%d").to_string();
-    if lower == upper {
-        RelativePathBuf::from_iter([root, stream, &format!("date={lower}")])
+    let date_segment = if lower == upper {
+        format!("date={lower}")
     } else {
-        RelativePathBuf::from_iter([root, stream, &format!("date={lower}:{upper}")])
+        format!("date={lower}:{upper}")
+    };
+    if let Some(tenant) = tenant_id {
+        RelativePathBuf::from_iter([tenant.as_str(), stream, &date_segment])
+    } else {
+        RelativePathBuf::from_iter([stream, &date_segment])
     }
 }
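
The conditional path construction can be illustrated with plain strings. This sketch deliberately avoids `RelativePathBuf` and chrono: `partition_prefix` is a hypothetical name, and the dates are assumed to arrive already formatted as `YYYY-MM-DD`.

```rust
/// Builds `[tenant/]stream/date=...` from pre-formatted dates.
/// The tenant segment is added only when a tenant is present, so no empty
/// leading segment is ever passed to the path builder.
pub fn partition_prefix(
    stream: &str,
    lower: &str,
    upper: &str,
    tenant_id: Option<&str>,
) -> String {
    let date_segment = if lower == upper {
        format!("date={lower}")
    } else {
        format!("date={lower}:{upper}")
    };
    let mut segments: Vec<&str> = Vec::with_capacity(3);
    if let Some(tenant) = tenant_id {
        segments.push(tenant);
    }
    segments.push(stream);
    segments.push(&date_segment);
    segments.join("/")
}
```

Computing the date segment once and branching only on the tenant keeps the two concerns separate, which is the shape the proposed fix takes as well.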
src/handlers/http/modal/ingest/ingestor_rbac.rs (2)

196-222: post_gen_password does not generate a new password.

This handler is documented as "Resets password for the user to a newly generated one" but it:

  1. Calls put_staging_metadata at line 207 before any password change
  2. Reads the existing password_hash from metadata at line 217 instead of generating a new one
  3. Never calls Basic::gen_new_password() or similar

Compare with src/handlers/http/modal/query/querier_rbac.rs lines 302-334 which correctly generates a new password using user::Basic::gen_new_password().

🐛 Proposed fix
 pub async fn post_gen_password(
     req: HttpRequest,
     username: web::Path<String>,
 ) -> Result<HttpResponse, RBACError> {
     let username = username.into_inner();
     let tenant_id = get_tenant_id_from_request(&req);
-    let mut new_hash = String::default();
     let mut metadata = get_metadata(&tenant_id).await?;
 
-    let _ = storage::put_staging_metadata(&metadata, &tenant_id);
+    let user::PassCode { password, hash } = user::Basic::gen_new_password();
     if let Some(user) = metadata
         .users
         .iter_mut()
         .filter_map(|user| match user.ty {
             user::UserType::Native(ref mut user) => Some(user),
             _ => None,
         })
         .find(|user| user.username == username)
     {
-        new_hash.clone_from(&user.password_hash);
+        user.password_hash.clone_from(&hash);
     } else {
         return Err(RBACError::UserDoesNotExist);
     }
-    Users.change_password_hash(&username, &new_hash, &tenant_id);
-    Ok(HttpResponse::Ok().status(StatusCode::OK).finish())
+    let _ = storage::put_staging_metadata(&metadata, &tenant_id);
+    Users.change_password_hash(&username, &hash, &tenant_id);
+    Ok(HttpResponse::Ok().status(StatusCode::OK).body(password))
 }

107-117: Role existence checks not tenant-scoped.

Both add_roles_to_user (line 110) and remove_roles_from_user (line 154) check role existence using roles().get(r).is_none() without tenant scoping. This could allow referencing roles from other tenants.

Also applies to: 151-161

src/handlers/http/query.rs (1)

82-110: Tenant-scoped schema not configured in get_records_and_fields.

The query() function (lines 121-125) configures the session state with a tenant-aware default schema, but get_records_and_fields obtains the session state without this configuration. Since into_query() relies on session_state.config_options().catalog.default_schema for schema resolution, queries executed through get_records_and_fields may resolve to the wrong schema in a multi-tenant environment.

The function has access to tenant_id but does not apply it to the session configuration. Apply the same pattern:

Suggested fix
 pub async fn get_records_and_fields(
     query_request: &Query,
     creds: &SessionKey,
     tenant_id: &Option<String>,
 ) -> Result<(Option<Vec<RecordBatch>>, Option<Vec<String>>), QueryError> {
-    let session_state = QUERY_SESSION.get_ctx().state();
+    let mut session_state = QUERY_SESSION.get_ctx().state();
+    session_state
+        .config_mut()
+        .options_mut()
+        .catalog
+        .default_schema = tenant_id.as_ref().map_or("public".into(), |v| v.to_owned());
src/handlers/http/modal/query/querier_rbac.rs (1)

60-68: Add tenant-scoped lookup for role existence check.

The roles().contains_key(role) and roles().get(r) checks query the outer HashMap level (checking for tenant_id keys) instead of the inner level where role names are stored. In a multi-tenant setup, this allows users to assign non-existent roles without validation.

The data structure is HashMap<String, HashMap<String, Vec<DefaultPrivilege>>> where the outer key is tenant_id. The correct pattern, already used elsewhere in the codebase (e.g., src/rbac/map.rs:478), is:

if let Some(roles) = roles().get(&tenant_id)
    && let Some(privileges) = roles.get(role_name)
{
    // role exists for this tenant
}

Fix this in:

  • post_user() at line 62
  • add_roles_to_user() at line 197
  • remove_roles_from_user() at line 257

The tenant_id is available in all these functions via get_tenant_id_from_request(&req). This same issue also exists in src/handlers/http/modal/ingest/ingestor_rbac.rs.
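
A tenant-scoped existence check over the nested map described above might look like the following sketch. `Privileges` is a simplified stand-in for `Vec<DefaultPrivilege>`, `missing_roles` is a hypothetical helper, and the value of `DEFAULT_TENANT` is assumed.

```rust
use std::collections::HashMap;

/// Simplified stand-in for `Vec<DefaultPrivilege>`.
pub type Privileges = Vec<String>;
/// Outer key: tenant id. Inner key: role name.
pub type RoleMap = HashMap<String, HashMap<String, Privileges>>;

/// Assumed default tenant key; the real constant lives in the codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Returns the requested roles that do not exist for the given tenant.
pub fn missing_roles(
    roles: &RoleMap,
    tenant_id: &Option<String>,
    requested: &[String],
) -> Vec<String> {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    // Resolve the tenant's inner map first; a missing tenant means every
    // requested role is treated as non-existent.
    let tenant_roles = roles.get(tenant);
    requested
        .iter()
        .filter(|role| !tenant_roles.map_or(false, |r| r.contains_key(role.as_str())))
        .cloned()
        .collect()
}
```

A role defined only for another tenant is reported as missing, which is the behavior the three handlers need.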

src/hottier.rs (2)

208-220: delete_hot_tier ignores tenant_id (can delete wrong directory / leave tenant data behind)

You’re scoping the metadata file under {hot_tier_path}/{tenant}/{stream}/.hot_tier.json, but deletion still uses {hot_tier_path}/{stream}. In multi-tenant mode this can delete the wrong tree or fail to delete the right one.

Proposed fix
 pub async fn delete_hot_tier(
     &self,
     stream: &str,
     tenant_id: &Option<String>,
 ) -> Result<(), HotTierError> {
     if !self.check_stream_hot_tier_exists(stream, tenant_id) {
         return Err(HotTierValidationError::NotFound(stream.to_owned()).into());
     }
-    let path = self.hot_tier_path.join(stream);
+    let path = if let Some(t) = tenant_id.as_ref() {
+        self.hot_tier_path.join(t).join(stream)
+    } else {
+        self.hot_tier_path.join(stream)
+    };
     fs::remove_dir_all(path).await?;
 
     Ok(())
 }

186-206: Tenant-scoped metadata storage vs. non-tenant-aware local traversal is inconsistent and breaks hot tier operations

hot_tier_file_path() is tenant-aware and stores metadata with tenant prefix. However, manifest files downloaded via process_manifest() have file_path that includes the tenant prefix (from object store path), so they download to {hot_tier_path}/{tenant}/{stream}/date=.../.... But retrieval and cleanup functions (fetch_hot_tier_dates(), get_stream_path_for_date(), get_oldest_date_time_entry(), delete_hot_tier()) only join stream without tenant, looking for files at {hot_tier_path}/{stream}/.... This mismatch prevents cleanup and oldest-date calculation from finding files, and risks cross-tenant collisions when multiple tenants share the same stream name.

Affected locations:

  • fetch_hot_tier_dates() (line 473): should include tenant when constructing paths
  • get_stream_path_for_date() (line 529): should include tenant
  • delete_hot_tier() (line 216): should include tenant
  • get_oldest_date_time_entry() (line 708): inherits tenant issue via fetch_hot_tier_dates
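
One way to keep these locations consistent is to route every path through a single tenant-aware helper. The sketch below is illustrative only: `hot_tier_stream_dir` and `hot_tier_metadata_file` are hypothetical names, not functions from `src/hottier.rs`.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: the single place that decides where a stream's
/// hot-tier data lives on disk.
pub fn hot_tier_stream_dir(
    hot_tier_root: &Path,
    stream: &str,
    tenant_id: &Option<String>,
) -> PathBuf {
    match tenant_id {
        Some(tenant) => hot_tier_root.join(tenant).join(stream),
        None => hot_tier_root.join(stream),
    }
}

/// Metadata file path derived from the same helper, so metadata, traversal,
/// and deletion always agree on the directory.
pub fn hot_tier_metadata_file(
    hot_tier_root: &Path,
    stream: &str,
    tenant_id: &Option<String>,
) -> PathBuf {
    hot_tier_stream_dir(hot_tier_root, stream, tenant_id).join(".hot_tier.json")
}
```

With this shape, two tenants that share a stream name resolve to different directories, and a delete operation targets the same tree the metadata was written under.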
src/parseable/streams.rs (2)

117-137: Update tests (and any call sites) for the new tenant_id parameter and nested map shape

Stream::new(..., tenant_id) and Streams::get_or_create(..., tenant_id) changed signatures, but the tests still call the old arity and still assume Streams is a flat HashMap<stream_name, ...>. As-is, unit tests won’t compile / assertions won’t match.

Example pattern to apply across tests
 let options = Arc::new(Options::default());
 let staging = Stream::new(
     options.clone(),
     stream_name,
     LogStreamMetadata::default(),
     None,
+    &None,
 );

 assert_eq!(
     staging.data_path,
-    options.local_stream_data_path(stream_name)
+    options.local_stream_data_path(stream_name, &None)
 );

And for Streams assertions (new nested map):

 let guard = streams.read().expect("Failed to acquire read lock");
-assert!(guard.contains_key(stream_name));
+assert!(guard
+    .get(DEFAULT_TENANT)
+    .is_some_and(|m| m.contains_key(stream_name)));

Also applies to: 1200-1725


1046-1078: Remove/downgrade tracing::warn! that logs full metadata/options in get_or_create

This will be extremely noisy and may leak sensitive config (and potentially user-related metadata) into logs. This should be trace!/debug! at most, and avoid dumping structs.

Proposed fix
-        tracing::warn!(
-            "get_or_create\nstream- {stream_name}\ntenant- {tenant_id:?}\nmetadata- {metadata:?}\noptions- {options:?}"
-        );
+        tracing::debug!(stream_name = %stream_name, tenant_id = ?tenant_id, "streams.get_or_create");
 
         let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
src/storage/object_storage.rs (1)

618-710: Remove/downgrade warn-level debug logging in hot paths

There are multiple tracing::warn! that look like debug leftovers (e.g., logging every parquet path, logging fetched schema). This will spam production logs and can leak internal paths. Prefer trace!/debug! with minimal fields.

Also applies to: 925-947

src/metastore/metastores/object_store_metastore.rs (2)

342-390: put_alert_state also ignores tenant_id - data isolation issue.

Like get_alert_state_entry, this method accepts tenant_id but constructs the path without it (line 352). Combined with get_alert_states() which filters by tenant path (line 302), this creates a data isolation issue where alert states may be written globally but read tenant-scoped.


323-340: tenant_id parameter is unused across all alert state methods - inconsistent with tenant-scoped get_alert_states().

The tenant_id parameter is accepted but not used in get_alert_state_entry(), put_alert_state(), and delete_alert_state(). All three call alert_state_json_path() which constructs paths without tenant context (format: .alerts/alert_state_{alert_id}.json).

This conflicts with get_alert_states() (line 302), which constructs a tenant-scoped base path using RelativePathBuf::from_iter([&tenant, ALERTS_ROOT_DIRECTORY]).

Fix: Update alert_state_json_path() to accept and use tenant_id as a path component, or remove the tenant_id parameters from the trait methods if alert states are intentionally global. Ensure consistency across all four alert state methods.

🤖 Fix all issues with AI agents
In `@src/handlers/http/middleware.rs`:
- Around line 168-178: The code uses HeaderValue::from_str(&tid).unwrap() inside
the match for get_user_and_tenant_from_request, which can panic for invalid
header characters; replace the unwrap with proper error handling: call
HeaderValue::from_str(&tid) and match or use map_err to convert the header error
into the existing Err branch (or log and skip inserting the header), then only
call req.headers_mut().insert(...) on Ok(val). Update the user_and_tenant_id
assignment so failures to construct the HeaderValue return an Err (propagated)
or a controlled fallback instead of panicking, referencing
get_user_and_tenant_from_request, HeaderValue::from_str,
req.headers_mut().insert and user_and_tenant_id.
- Around line 309-320: check_suspension currently treats missing or unknown
tenants as Authorized; change it to reject those cases: in function
check_suspension, when the tenant header is missing or tenant.to_str() fails
return rbac::Response::Unauthorized (or another appropriate denial variant)
instead of rbac::Response::Authorized, and in the branch where
TENANT_METADATA.is_action_suspended returns Ok(None) (the "tenant does not
exist" case) return rbac::Response::Unauthorized rather than falling through to
Authorized; keep the existing Suspended return when an actual suspension is
found and optionally add a short debug log mentioning the tenant value on
unauthorized paths.

In `@src/handlers/http/modal/ingest/ingestor_rbac.rs`:
- Around line 52-58: The tenant validation in ingestor_rbac.rs is inverted:
change the condition that currently returns an error when req_tenant equals the
requester's tenant to instead return an error when a non-super-admin (req_tenant
!= DEFAULT_TENANT) is trying to create a user for a different tenant;
specifically, update the check that uses req_tenant, DEFAULT_TENANT and
user.tenant (as_ref().map_or(...)) so it tests for inequality (req_tenant !=
user.tenant.as_ref().map_or(DEFAULT_TENANT, |v| v)) and then return
RBACError::Anyhow(...) when that inequality is true.

In `@src/handlers/http/modal/query/querier_rbac.rs`:
- Line 79: The call to user::User::new_basic uses None for the tenant, creating
users without tenant association; update the call in querier_rbac.rs to pass the
request's tenant_id (e.g., tenant_id.clone()) instead of None so the new user is
associated with the tenant (ensure you pass the same tenant_id variable used
elsewhere in this function when calling user::User::new_basic with username).

In `@src/handlers/http/oidc.rs`:
- Around line 132-162: The cluster sync currently treats any successful TCP
exchange as success because .send().await may return non-2xx responses; update
the closure inside for_each_live_node (the async block using
INTRA_CLUSTER_CLIENT.post(...).send().await) to call .error_for_status() on the
Response (e.g., let resp = INTRA_CLUSTER_CLIENT.post(...).send().await?;
resp.error_for_status()? ) and convert that into the closure Result so non-2xx
becomes Err; additionally catch and log per-node failures with identifying info
(node.domain_name or node.token) before returning Err so tracing shows which
node failed.
- Around line 227-228: get_tenant_id_from_request currently calls
tenant_value.to_str().unwrap(), which can panic on invalid UTF-8; change it to
handle the conversion failure and return None instead of panicking. Update
get_tenant_id_from_request(req: &HttpRequest) to check
req.headers().get("tenant") and call tenant_value.to_str().ok().map(|s|
s.to_owned()) (or equivalent) so malformed header values produce None rather
than causing a process panic.
- Around line 104-118: The basic-auth branch incorrectly uses
get_tenant_id_from_key(&session_key) which yields None for
SessionKey::BasicAuth; replace the tenant lookup inside the
SessionKey::BasicAuth arm to call get_tenant_id_from_request(&req) (or compute a
separate tenant_id_for_basic_auth = get_tenant_id_from_request(&req) before
calling Users.get_user) and pass that tenant_id to Users.get_user(&username,
&tenant_id_for_basic_auth); keep the existing tenant_id usage for non-basic-auth
branches and ensure you only switch tenant source for the SessionKey::BasicAuth
pattern.

In `@src/hottier.rs`:
- Around line 595-603: The helper check_stream_hot_tier_exists currently calls
self.hot_tier_file_path(stream, tenant_id).unwrap() which can panic; change
check_stream_hot_tier_exists to handle the Result/Option from hot_tier_file_path
without unwrapping (e.g., match or if let Ok(path) / Some(path) -> return
path.exists(); Err(_) / None -> return false), avoid unnecessary
to_string/PathBuf::from conversions and ensure any path conversion error or
missing tenant returns false rather than panicking.
- Around line 779-808: In create_pstats_hot_tier, change the StreamHotTier
initialization to use INTERNAL_STREAM_HOT_TIER_SIZE_BYTES (like pmeta) instead
of MIN_STREAM_HOT_TIER_SIZE_BYTES: set StreamHotTier.size and
StreamHotTier.available_size to INTERNAL_STREAM_HOT_TIER_SIZE_BYTES (leave
used_size at 0 and version/oldest_date_time_entry as-is), then call put_hot_tier
as before; this ensures pstats uses the internal-stream default rather than the
user-stream minimum.

In `@src/metastore/metastore_traits.rs`:
- Around line 108-112: get_alert_state_entry currently ignores tenant_id when
building the storage path; update this and the helper so alert state is
tenant-scoped. Change alert_state_json_path signature in object_storage.rs to
accept the tenant_id (e.g., &Option<String> or Option<&str>) and return a path
that includes tenant context (matching the pattern used by mttr_json_path), then
update calls: in object_store_metastore.rs modify get_alert_state_entry to call
alert_state_json_path(alert_id, tenant_id) (or the chosen arg order) and adjust
any other callers (e.g., get_all_alert_states) to use the new signature so all
alert state reads/writes are tenant-isolated. Ensure types/signatures line up
across trait and impls (get_alert_state_entry declaration, its implementation,
and alert_state_json_path).

In `@src/migration/mod.rs`:
- Around line 490-498: The path construction uses a hardcoded ".parseable.json"
instead of the established PARSEABLE_METADATA_FILE_NAME constant, causing
potential mismatches with get_staging_metadata; update the two occurrences to
use PARSEABLE_METADATA_FILE_NAME when building the path (in the branch that uses
tenant_id and the else branch), referencing tenant_id,
config.options.staging_dir(), and PARSEABLE_METADATA_FILE_NAME so the produced
path matches get_staging_metadata.
- Around line 168-199: The loop currently uses
PARSEABLE.metastore.list_streams(&tenant_id).await? which returns early on error
and skips remaining tenants; change this to handle errors per-tenant (e.g.,
match or if let Err(e) = ...) so failures from list_streams are logged/collected
and the loop continues, while successful list_streams still produce the stream
migration futures; keep the existing migration_stream(&stream_name, &*storage,
&id) handling and config.get_or_create_stream(&stream_name,
&id).set_metadata(...) logic unchanged, but ensure you aggregate or return a
composed error result after iterating all tenants instead of propagating
immediately from list_streams.

In `@src/parseable/mod.rs`:
- Around line 1116-1144: delete_tenant currently removes tenant data from
TENANT_METADATA, users and roles but never removes the tenant entry from
self.tenants, so list_tenants() still returns it; update delete_tenant to also
remove the tenant from self.tenants (e.g., by acquiring a mutable borrow of
self.tenants and calling remove(tenant_id) or filtering/retaining entries that
don't match tenant_id), ensuring you reference the same tenant_id string; keep
existing cleanup (mut_users(), Users.delete_user, mut_roles(),
TENANT_METADATA.delete_tenant) and perform the self.tenants removal before
returning Ok(()) so the in-memory tenant list and list_tenants() reflect the
deletion.
- Around line 1057-1076: The add_tenant method has a TOCTOU race: it does a
contains() under a read lock then pushes under a separate write lock; fix by
taking a single write lock once (let mut tenants =
self.tenants.write().unwrap()), perform the contains() check on that guard,
return Err if present, otherwise push the tenant_id and call
TENANT_METADATA.insert_tenant(tenant_id, tenant_meta) while still holding that
write lock so the check-and-insert is atomic.

In `@src/query/mod.rs`:
- Around line 86-121: create_session_context currently ignores errors from
catalog.register_schema by discarding the Result with `let _ = ...`;
update that to handle the Result and log any Err using the project's logging
facility (e.g., tracing::error! or log::error!), e.g. replace the `let _ =
catalog.register_schema(...)` with an `if let Err(e) =
catalog.register_schema(...) { error!("failed to register schema for tenant {}:
{:?}", tenant_id, e); }` pattern so startup schema registration failures are
visible; do not change the existing InMemorySessionContext::add_schema behavior
that uses .expect().

In `@src/query/stream_schema_provider.rs`:
- Around line 529-534: The logging call using tracing::warn! inside the scan
routine is too noisy for per-scan instrumentation; change it to a lower level
(tracing::debug! or tracing::trace!) so it doesn’t flood production logs—locate
the invocation that logs self.tenant_id, self.schema, and self.stream (the
tracing::warn! call in the scan path of the StreamSchemaProvider implementation)
and replace with tracing::debug! (or tracing::trace!) keeping the same message
and fields.
- Line 645: Remove or reduce the noisy warning by deleting or lowering the log
level of the tracing macro call
tracing::warn!(object_store_url=?object_store_url); — either remove it entirely
or change it to tracing::debug! or tracing::trace! (or guard it behind a
verbose/diagnostic flag) so the object_store_url is not logged as a warn on
every scan.
♻️ Duplicate comments (13)
src/query/stream_schema_provider.rs (3)

224-232: Tenant-aware URL construction is disabled; unwrap() remains risky.

The tenant-aware object store URL construction is commented out (lines 224-228), and the current code uses a hardcoded "file:///" with an unwrap() that could panic on parse failure. While ObjectStoreUrl::parse("file:///") is unlikely to fail, the pattern should handle errors gracefully.

When re-enabling tenant support, ensure proper error handling is added.


282-293: Same pattern: commented tenant URL and unwrap() on parse.

This duplicates the issue from get_hottier_exectuion_plan. The tenant-aware URL logic is commented out and unwrap() is used on parse.


639-648: Tenant-aware object store URL not yet implemented for remote storage.

The commented code (lines 639-643) shows the intended tenant-aware URL construction using glob_storage.store_url().join(tenant_id), but it's currently disabled. The active code uses glob_storage.store_url() directly without tenant scoping.

This means queries will not be properly tenant-isolated when reading from object storage. The past review comment about unwrap() on join() and parse() still applies when this is re-enabled.

src/rbac/map.rs (1)

142-168: SessionKey should not derive Debug due to password exposure.

The SessionKey enum (line 187) derives Debug with the BasicAuth variant containing plaintext passwords. The Sessions struct (line 193) also derives Debug and contains active_sessions: HashMap<SessionKey, ...>. If either struct were logged with {:?} formatting, credentials would leak—even though current logging in remove_user (lines 275-286) is commented out.

Either remove Debug from SessionKey or implement a custom Debug impl that redacts the password field to prevent accidental credential exposure if logging is later enabled.
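
As a rough illustration of the second option, a hand-written Debug impl can print the username while masking the secret. This is a minimal sketch, not the project's code: the enum below is a hypothetical stand-in, and the SessionId variant and its String payload are assumptions.

```rust
use std::fmt;

// Hypothetical stand-in for the SessionKey enum discussed above; the
// SessionId variant and its String payload are assumptions for illustration.
#[derive(Clone, PartialEq, Eq, Hash)]
pub enum SessionKey {
    BasicAuth { username: String, password: String },
    SessionId(String),
}

// Manual Debug impl: the username is shown, secrets are replaced by a marker.
impl fmt::Debug for SessionKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SessionKey::BasicAuth { username, .. } => f
                .debug_struct("BasicAuth")
                .field("username", username)
                .field("password", &"<redacted>")
                .finish(),
            // Session identifiers are bearer secrets too, so mask them as well.
            SessionKey::SessionId(_) => f.write_str("SessionId(<redacted>)"),
        }
    }
}
```

Because containers such as HashMap delegate to the key's Debug impl, any struct that derives Debug and holds these keys would pick up the redaction without further changes.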

src/handlers/http/rbac.rs (1)

147-148: User now created with tenant context - previous issue addressed.

The user creation now correctly passes tenant_id.clone() instead of None, ensuring proper tenant affiliation for new users.

src/handlers/http/cluster/mod.rs (2)

327-332: Tenant context must be propagated during stream synchronization.

The tenant_id parameter is commented out (line 331), meaning stream sync requests to ingestors/queriers won't include tenant context. This breaks tenant isolation in multi-tenant deployments.

🔧 Suggested fix
 pub async fn sync_streams_with_ingestors(
     headers: HeaderMap,
     body: Bytes,
     stream_name: &str,
-    // tenant_id: &Option<String>
+    tenant_id: &Option<String>,
 ) -> Result<(), StreamError> {

Then add the tenant header to the request:

+                    .header("tenant", tenant_id.clone().unwrap_or_default())

539-544: sync_user_creation missing tenant_id propagation.

Similar to stream sync, the tenant_id parameter is commented out. User creation sync requests won't include tenant context, breaking tenant isolation.

🔧 Suggested fix
 pub async fn sync_user_creation(
     user: User,
     role: &Option<HashSet<String>>,
-    // tenant_id: &str
+    tenant_id: &Option<String>,
 ) -> Result<(), RBACError> {

And add tenant header to the sync request.

src/handlers/http/modal/ingest/ingestor_role.rs (1)

46-52: Inverted tenant validation logic (previously flagged).

The condition req_tenant.ne(DEFAULT_TENANT) && (req_tenant.eq(&sync_req.tenant_id)) rejects requests when the request tenant matches the payload tenant, which is the opposite of the intended behavior based on the error message.

The second condition should use .ne() to check for a mismatch:

-    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant.eq(&sync_req.tenant_id)) {
+    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant.ne(&sync_req.tenant_id)) {
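
To make the intended behavior of the corrected condition concrete, here is a small standalone sketch. The function name is hypothetical, the value of DEFAULT_TENANT is an assumption, and plain &str arguments stand in for whatever types the real handler uses.

```rust
// Assumed value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Returns true when a caller outside the default tenant targets a tenant
/// other than its own, which is the case the error message describes.
fn is_cross_tenant_request(req_tenant: &str, payload_tenant: &str) -> bool {
    req_tenant != DEFAULT_TENANT && req_tenant != payload_tenant
}
```

With this shape, a request whose tenant matches the payload tenant passes, and only a mismatch from a non-default tenant is rejected.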
src/handlers/http/oidc.rs (1)

216-369: Fix tenantless OAuth user creation (put_user(..., None))

New OIDC users are currently persisted with tenant=None even though tenant_id is extracted from the request. That’s a multi-tenant isolation bug (and the inline comment suggests it’s knowingly incomplete).

Proposed fix (keep tenant_id available and pass it through)
-    let existing_user = find_existing_user(&user_info, tenant_id);
+    let existing_user = find_existing_user(&user_info, &tenant_id);

     let user = match (existing_user, final_roles) {
         (Some(user), roles) => update_user_if_changed(user, roles, user_info, bearer).await?,
-        // LET TENANT BE NONE FOR NOW!!!
-        (None, roles) => put_user(&user_id, roles, user_info, bearer, None).await?,
+        (None, roles) => put_user(&user_id, roles, user_info, bearer, tenant_id.clone()).await?,
     };
-fn find_existing_user(user_info: &user::UserInfo, tenant_id: Option<String>) -> Option<User> {
+fn find_existing_user(user_info: &user::UserInfo, tenant_id: &Option<String>) -> Option<User> {
     if let Some(sub) = &user_info.sub
-        && let Some(user) = Users.get_user(sub, &tenant_id)
+        && let Some(user) = Users.get_user(sub, tenant_id)
         && matches!(user.ty, UserType::OAuth(_))
     {
         return Some(user);
     }
     ...
 }

Also applies to: 499-529

src/hottier.rs (1)

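92-119: Fix get_hot_tiers_size exclusion logic

The current predicate stream != current_stream && tenant_id != *current_tenant_id counts a stream only when both its name and its tenant differ from the current one. That excludes every other stream in the current tenant, and every same-named stream in other tenants, so the totals are undercounted. (The comparison through *current_tenant_id does not move the Option<String>, because != takes its operands by reference, but comparing via as_deref() reads more clearly.) You want to skip only when both stream and tenant match.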

Proposed fix
 for tenant_id in tenants {
     for stream in PARSEABLE.streams.list(&tenant_id) {
-        if self.check_stream_hot_tier_exists(&stream, &tenant_id)
-            && stream != current_stream
-            && tenant_id != *current_tenant_id
-        {
+        if self.check_stream_hot_tier_exists(&stream, &tenant_id)
+            && !(stream == current_stream
+                && tenant_id.as_deref() == current_tenant_id.as_deref())
+        {
             let stream_hot_tier = self.get_hot_tier(&stream, &tenant_id).await?;
-            total_hot_tier_size += &stream_hot_tier.size;
+            total_hot_tier_size += stream_hot_tier.size;
             total_hot_tier_used_size += stream_hot_tier.used_size;
         }
     }
 }
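
The difference between the two predicates can be checked in isolation. The sketch below uses hypothetical free functions with the same parameter shapes as the loop variables; it is not the HotTierManager code itself.

```rust
/// Predicate as currently written: counts a stream only when BOTH the
/// stream name and the tenant differ from the current ones.
fn original_predicate(
    stream: &str,
    tenant_id: &Option<String>,
    current_stream: &str,
    current_tenant_id: &Option<String>,
) -> bool {
    stream != current_stream && tenant_id != current_tenant_id
}

/// Suggested predicate: counts everything except the exact
/// (stream, tenant) pair that is currently being updated.
fn counts_toward_total(
    stream: &str,
    tenant_id: &Option<String>,
    current_stream: &str,
    current_tenant_id: &Option<String>,
) -> bool {
    !(stream == current_stream && tenant_id.as_deref() == current_tenant_id.as_deref())
}
```

For a different stream in the same tenant, the original predicate returns false (the stream is dropped from the total) while the suggested one returns true.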
src/metastore/metastores/object_store_metastore.rs (1)

939-965: Manifest path construction inconsistency persists.

This was flagged in a previous review. The put_manifest and delete_manifest methods use:

partition_path(stream_name, lower_bound, upper_bound, tenant_id).join(&manifest_path("").to_string())

While get_manifest and get_manifest_path use:

manifest_path(partition_path(stream_name, lower_bound, upper_bound, tenant_id).as_str())

These approaches may produce different path structures depending on how manifest_path handles the prefix argument.

src/alerts/mod.rs (1)

1244-1254: Silent write drops persist for missing tenant buckets.

The update() method still silently drops alert updates when the tenant bucket doesn't exist. This was flagged in a previous review. The same issue affects update_state() (lines 1336-1340) and update_notification_state() (lines 1380-1382).

Use entry().or_default() pattern to ensure tenant buckets are created on first write:

 async fn update(&self, alert: &dyn AlertTrait) {
     let tenant = alert.get_tenant_id().as_ref().map_or(DEFAULT_TENANT, |v| v);
-    if let Some(alerts) = self.alerts.write().await.get_mut(tenant) {
-        alerts.insert(*alert.get_id(), alert.clone_box());
-    }
+    self.alerts
+        .write()
+        .await
+        .entry(tenant.to_owned())
+        .or_default()
+        .insert(*alert.get_id(), alert.clone_box());
 }
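
The behavioral difference is easy to reproduce with plain std collections. In this sketch the nested HashMap is a simplified stand-in for the alerts map (u64 ids and String payloads are assumptions), and both function names are hypothetical.

```rust
use std::collections::HashMap;

// tenant -> (alert id -> alert payload); a simplified stand-in for the real map.
type TenantBuckets = HashMap<String, HashMap<u64, String>>;

/// Mirrors the current behavior: the write is dropped if the bucket is missing.
fn update_if_present(buckets: &mut TenantBuckets, tenant: &str, id: u64, alert: &str) {
    if let Some(alerts) = buckets.get_mut(tenant) {
        alerts.insert(id, alert.to_owned());
    }
}

/// Mirrors the suggested behavior: the bucket is created on first write.
fn update_or_create(buckets: &mut TenantBuckets, tenant: &str, id: u64, alert: &str) {
    buckets
        .entry(tenant.to_owned())
        .or_default()
        .insert(id, alert.to_owned());
}
```

Starting from an empty map, the first function leaves it empty while the second ends up with one tenant bucket holding the alert.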
src/parseable/mod.rs (1)

1146-1182: Incomplete logic and silent lock failure in load_tenants.

This was flagged in a previous review:

  1. Empty else branch (line 1168): The else if !is_multi_tenant { } does nothing - unclear what should happen for single-tenant mode with tenant directories.

  2. Silent lock failure (lines 1176-1180): Returns Ok(None) on write lock failure instead of propagating the error, masking potential poisoned lock issues.

Suggested fix
-            } else if !is_multi_tenant {
-            } else {
+            } else if is_multi_tenant {
+                // Tenant directory without metadata is invalid in multi-tenant mode
                 return Err(anyhow::Error::msg(format!(
                     "Found invalid tenant directory with multi-tenant mode- {tenant_id}.\nExiting."
                 )));
             }
+            // Single-tenant mode: directories without .parseable.json are ignored
         }
 
-        if let Ok(mut t) = self.tenants.write() {
-            t.extend(dirs);
-            Ok(Some(()))
-        } else {
-            Ok(None)
-        }
+        let mut t = self.tenants.write().expect("tenants lock poisoned");
+        t.extend(dirs);
+        Ok(Some(()))
🧹 Nitpick comments (22)
src/migration/mod.rs (1)

473-483: Consider renaming to avoid confusion with store_metadata::put_remote_metadata.

This function has the same name as the one in src/storage/store_metadata.rs but accepts serde_json::Value instead of &StorageMetadata. While they serve different contexts (migration vs. normal operation), the naming overlap may cause confusion when importing or maintaining the code.

Consider renaming to something like put_remote_metadata_from_json or making it private since it's only used within this migration module.

src/tenants/mod.rs (3)

57-67: Silent no-op when tenant doesn't exist in suspend/resume operations.

Both suspend_service and resume_service silently do nothing if the tenant doesn't exist. This could mask configuration errors or race conditions where a tenant was deleted but suspension operations are still being attempted.

Consider returning a Result<(), TenantNotFound> to allow callers to handle missing tenants appropriately, similar to how is_action_suspended handles this case.

♻️ Suggested approach
-    pub fn suspend_service(&self, tenant_id: &str, service: Service) {
-        if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
-            tenant.suspended_services.insert(service);
-        }
+    pub fn suspend_service(&self, tenant_id: &str, service: Service) -> Result<(), TenantNotFound> {
+        if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
+            tenant.suspended_services.insert(service);
+            Ok(())
+        } else {
+            Err(TenantNotFound(tenant_id.to_owned()))
+        }
     }

-    pub fn resume_service(&self, tenant_id: &str, service: Service) {
-        if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
-            tenant.suspended_services.remove(&service);
-        }
+    pub fn resume_service(&self, tenant_id: &str, service: Service) -> Result<(), TenantNotFound> {
+        if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
+            tenant.suspended_services.remove(&service);
+            Ok(())
+        } else {
+            Err(TenantNotFound(tenant_id.to_owned()))
+        }
     }

73-92: Minor: Redundant return keyword.

The explicit return on line 90 is unnecessary since it's the last expression in the function.

♻️ Suggested fix
         } else {
-            return Err(TenantNotFound(tenant_id.to_owned()));
+            Err(TenantNotFound(tenant_id.to_owned()))
         }

106-153: Consider removing or tracking commented-out code.

This large commented-out section represents planned functionality that isn't yet implemented. Having substantial commented code in the codebase can:

  1. Become stale as surrounding code evolves
  2. Cause confusion about what's active vs planned
  3. Add noise during code reviews

Consider either removing this and tracking the planned implementation in an issue, or adding a TODO comment explaining when this will be enabled.

src/rbac/map.rs (1)

313-411: Consider removing commented-out debug statements or converting to structured logging.

Multiple commented-out tracing::warn! calls remain in check_auth (lines 320-322, 330, 370, 390, 403-408). While they're disabled, they add noise. Consider either:

  1. Removing them entirely
  2. Converting to proper trace! level logging with structured fields that don't risk leaking sensitive data
src/rbac/mod.rs (3)

72-80: Consider simplifying the nested option handling.

The pattern .filter().map().unwrap() is verbose. Consider using and_then for cleaner code:

♻️ Suggested simplification
     pub fn get_user(&self, userid: &str, tenant_id: &Option<String>) -> Option<User> {
         let tenant_id = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
-        let u = users()
+        users()
             .get(tenant_id)
-            .filter(|users| users.get(userid).is_some())
-            .map(|users| users.get(userid).unwrap().to_owned());
-        u
-        // .get(userid).cloned()
+            .and_then(|users| users.get(userid).cloned())
     }

119-126: Simplify the empty match arm.

The None => {} branch does nothing and can be replaced with if let:

♻️ Suggested fix
     fn remove_user(&mut self, userid: &str, tenant_id: &str) {
-        match mut_users().get_mut(tenant_id) {
-            Some(users) => {
-                users.remove(userid);
-            }
-            None => {}
+        if let Some(users) = mut_users().get_mut(tenant_id) {
+            users.remove(userid);
         }
     }

274-286: Consider caching or indexing for cross-tenant user lookup.

get_user_from_basic performs a full scan across all tenants and users (O(tenants × users)). While necessary for basic auth where tenant context isn't known upfront, this could become a performance bottleneck at scale.

Additionally, the iteration order may leak timing information about which tenant a user belongs to. Consider:

  1. Adding a username → tenant index for faster lookups
  2. Using constant-time comparison to prevent timing attacks
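
Both ideas can be sketched with std only. The index type below is hypothetical and assumes usernames are unique across tenants, which the PR does not state. The comparison helper is illustrative as well; an optimizing compiler gives no timing guarantees for it, so production code would normally rely on a vetted crate such as subtle.

```rust
use std::collections::HashMap;

/// Hypothetical username -> tenant index, assuming usernames are unique
/// across tenants. Replaces the O(tenants x users) scan with one lookup.
#[derive(Default)]
pub struct UserTenantIndex {
    by_username: HashMap<String, String>,
}

impl UserTenantIndex {
    pub fn insert(&mut self, username: &str, tenant: &str) {
        self.by_username
            .insert(username.to_owned(), tenant.to_owned());
    }

    pub fn tenant_of(&self, username: &str) -> Option<&str> {
        self.by_username.get(username).map(String::as_str)
    }
}

/// Compares two byte strings without returning early on the first mismatch.
/// The length check still leaks whether the lengths differ.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // accumulate differences instead of branching
    }
    diff == 0
}
```

The index would have to be kept in sync wherever users are added or removed, which is the main cost of this approach.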
src/handlers/http/middleware.rs (1)

322-330: Simplify suspension check pattern.

The match with _ => {} is verbose for checking a single variant. Consider using if let:

♻️ Suggested fix for all three auth functions
 pub fn auth_no_context(req: &mut ServiceRequest, action: Action) -> Result<rbac::Response, Error> {
     // check if tenant is suspended
-    match check_suspension(req.request(), action) {
-        rbac::Response::Suspended(msg) => return Ok(rbac::Response::Suspended(msg)),
-        _ => {}
+    if let rbac::Response::Suspended(msg) = check_suspension(req.request(), action) {
+        return Ok(rbac::Response::Suspended(msg));
     }
     let creds = extract_session_key(req);
     creds.map(|key| Users.authorize(key, action, None, None))
 }

Apply the same pattern to auth_resource_context and auth_user_context.

src/handlers/http/role.rs (3)

56-61: Remove commented-out dead code.

The commented line // mut_roles().insert(name.clone(), privileges.clone()); is superseded by the tenant-scoped implementation above it. Consider removing to improve readability.


143-147: Remove commented-out dead code.

The commented line // mut_roles().remove(&name); duplicates the functionality of the tenant-scoped removal above.


183-190: Remove commented-out dead code block.

This large commented block in get_default should be removed as it's replaced by the tenant-scoped implementation.

src/handlers/http/cluster/mod.rs (1)

1789-1803: Auth token handling in send_query_request.

The function now accepts an optional HeaderMap for auth. The fallback creates a new map with the querier's token. This is a reasonable pattern, though the commented line on 1803 should be removed.

         .headers(auth.into())
-        // .header(header::AUTHORIZATION, auth)
         .header(header::CONTENT_TYPE, "application/json")
src/catalog/mod.rs (1)

460-460: Remove debug logging before merging.

This tracing::warn! appears to be debug output that should not remain in production code. Either remove it or downgrade to trace! level.

-    tracing::warn!("manifest path_url= {path_url}");
src/handlers/http/modal/query/querier_rbac.rs (1)

115-121: Repeated username lookup pattern.

The same pattern for looking up username by userid appears in delete_user, add_roles_to_user, and remove_roles_from_user. Consider extracting this into a helper function to reduce duplication.

Also applies to: 185-191, 245-251

src/handlers/http/query.rs (1)

117-120: Redundant tenant_id extraction.

tenant_id is extracted at line 118 for create_streams_for_distributed, then extracted again at line 120. Consider extracting once and reusing:

+    let tenant_id = get_tenant_id_from_request(&req);
     // check or load streams in memory
-    create_streams_for_distributed(tables.clone(), &get_tenant_id_from_request(&req)).await?;
-
-    let tenant_id = get_tenant_id_from_request(&req);
+    create_streams_for_distributed(tables.clone(), &tenant_id).await?;
src/parseable/streams.rs (1)

1091-1100: contains() warning on missing tenant bucket is likely a normal case

A tenant can legitimately exist with zero streams (or before the first get_or_create call), so warning here may spam logs. Consider returning false quietly (or debug!).

src/storage/object_storage.rs (1)

168-214: Per-stream metrics may collide across tenants (if stream names are not globally unique)

STORAGE_SIZE, EVENTS_STORAGE_SIZE_DATE, etc. label by stream_name only. In multi-tenant mode with same stream name in different tenants, these metrics will merge tenants. Consider adding tenant as an additional label or prefixing stream_name with tenant.
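
The collision can be illustrated without the real metrics registry. The type below is a hypothetical in-memory stand-in that keys values by (tenant, stream) rather than by stream alone; it is not the Prometheus API used by the project.

```rust
use std::collections::HashMap;

/// Hypothetical gauge keyed by (tenant, stream) so that same-named streams
/// in different tenants do not share a value.
#[derive(Default)]
pub struct StorageSizeByTenant {
    bytes: HashMap<(String, String), u64>,
}

impl StorageSizeByTenant {
    pub fn add(&mut self, tenant: &str, stream: &str, n: u64) {
        *self
            .bytes
            .entry((tenant.to_owned(), stream.to_owned()))
            .or_insert(0) += n;
    }

    pub fn get(&self, tenant: &str, stream: &str) -> u64 {
        self.bytes
            .get(&(tenant.to_owned(), stream.to_owned()))
            .copied()
            .unwrap_or(0)
    }
}
```

With a stream-only key, the two "logs" streams of different tenants would be summed into one series; the tenant component of the key is what keeps them apart.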

src/query/mod.rs (1)

946-1021: PartitionedMetricMonitor.tenant_id is currently unused

Either remove it for now, or (if you intend per-tenant billing/metrics) use it to label the emitted metrics to avoid cross-tenant aggregation.

src/parseable/mod.rs (2)

1184-1191: Consider propagating lock errors instead of returning None.

The list_tenants() method returns None on lock failure, which callers handle with map_or(). However, a poisoned lock typically indicates a panic occurred while holding the lock, which should probably propagate as an error rather than be silently treated as "no tenants."
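
One way to surface the failure is to map the poison error into a Result instead of collapsing it to None. This is a minimal sketch under assumptions: TenantRegistry and TenantLockPoisoned are hypothetical names, and the real method may prefer anyhow::Error or another error type.

```rust
use std::sync::RwLock;

/// Hypothetical error type for a poisoned tenants lock.
#[derive(Debug, PartialEq)]
pub struct TenantLockPoisoned;

/// Hypothetical stand-in for the struct that owns the tenants list.
pub struct TenantRegistry {
    pub tenants: RwLock<Vec<String>>,
}

impl TenantRegistry {
    pub fn new(tenants: Vec<String>) -> Self {
        Self {
            tenants: RwLock::new(tenants),
        }
    }

    /// Returns a snapshot of the tenant ids, or an error if a writer
    /// panicked while holding the lock.
    pub fn list_tenants(&self) -> Result<Vec<String>, TenantLockPoisoned> {
        self.tenants
            .read()
            .map(|guard| (*guard).clone())
            .map_err(|_| TenantLockPoisoned)
    }
}
```

Callers can then distinguish "no tenants" (an empty Vec) from "the lock is unusable" (an Err), which the Option-returning version cannot express.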


250-265: Debug logging uses warn! level - should be trace! or debug!

Lines 255 and 259 use tracing::warn! for what appears to be debug/diagnostic logging during normal operation. This will clutter production logs.

-        tracing::warn!("check or load streams- {stream_name}, {tenant_id:?}");
+        tracing::trace!("check or load streams- {stream_name}, {tenant_id:?}");
         if self.streams.contains(stream_name, tenant_id) {
             return true;
         }
-        tracing::warn!("check or load streams not present, creating {stream_name}, {tenant_id:?}");
+        tracing::trace!("check or load streams not present, creating {stream_name}, {tenant_id:?}");
src/metastore/metastores/object_store_metastore.rs (1)

1200-1273: Remove commented-out debug logging before merge.

The method has extensive commented-out tracing::warn! statements (lines 1220-1227, 1232-1233, 1238, 1245, 1256-1260, 1262-1267). These should be cleaned up or converted to proper tracing::trace! if still needed.

Comment on lines +168 to +178
// append tenant id if present
let user_and_tenant_id = match get_user_and_tenant_from_request(req.request()) {
Ok((uid, tid)) => {
req.headers_mut().insert(
HeaderName::from_static("tenant"),
HeaderValue::from_str(&tid).unwrap(),
);
Ok((uid, tid))
}
Err(e) => Err(e),
};

⚠️ Potential issue | 🟡 Minor

Handle potential panic from HeaderValue::from_str().unwrap().

If tenant_id contains characters not valid in HTTP headers (e.g., control characters), HeaderValue::from_str() will return an error and unwrap() will panic. Consider handling this gracefully:

♻️ Suggested fix
         let user_and_tenant_id = match get_user_and_tenant_from_request(req.request()) {
             Ok((uid, tid)) => {
-                req.headers_mut().insert(
-                    HeaderName::from_static("tenant"),
-                    HeaderValue::from_str(&tid).unwrap(),
-                );
+                if let Ok(header_val) = HeaderValue::from_str(&tid) {
+                    req.headers_mut().insert(
+                        HeaderName::from_static("tenant"),
+                        header_val,
+                    );
+                }
                 Ok((uid, tid))
             }
             Err(e) => Err(e),
         };
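
An alternative to silently skipping the header is to return an error for an unusable tenant id. The sketch below is std-only and deliberately strict (printable ASCII only); it does not reproduce the exact validation rules of HeaderValue::from_str, and both the function and the error type are hypothetical.

```rust
/// Hypothetical error carrying the rejected tenant id.
#[derive(Debug, PartialEq)]
pub struct InvalidTenantId(pub String);

/// Accepts a tenant id only if it is non-empty printable ASCII, so it can
/// safely be placed in a header; otherwise returns an error to propagate.
pub fn tenant_header_value(tenant_id: &str) -> Result<&str, InvalidTenantId> {
    let printable =
        !tenant_id.is_empty() && tenant_id.bytes().all(|b| (0x20..=0x7e).contains(&b));
    if printable {
        Ok(tenant_id)
    } else {
        Err(InvalidTenantId(tenant_id.to_owned()))
    }
}
```

Propagating the error keeps a malformed tenant id from reaching downstream handlers without the tenant header set, at the cost of rejecting the request outright.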
🤖 Prompt for AI Agents
In `@src/handlers/http/middleware.rs` around lines 168 - 178, The code uses
HeaderValue::from_str(&tid).unwrap() inside the match for
get_user_and_tenant_from_request, which can panic for invalid header characters;
replace the unwrap with proper error handling: call HeaderValue::from_str(&tid)
and match or use map_err to convert the header error into the existing Err
branch (or log and skip inserting the header), then only call
req.headers_mut().insert(...) on Ok(val). Update the user_and_tenant_id
assignment so failures to construct the HeaderValue return an Err (propagated)
or a controlled fallback instead of panicking, referencing
get_user_and_tenant_from_request, HeaderValue::from_str,
req.headers_mut().insert and user_and_tenant_id.

Comment on lines +52 to +58
if req_tenant.ne(DEFAULT_TENANT)
&& (req_tenant.eq(user.tenant.as_ref().map_or(DEFAULT_TENANT, |v| v)))
{
return Err(RBACError::Anyhow(anyhow::Error::msg(
"non super-admin user trying to create user for another tenant",
)));
}

⚠️ Potential issue | 🔴 Critical

Inverted tenant validation logic.

Same issue as in ingestor_role.rs: the condition checks for a match when it should check for a mismatch. The error message indicates this should prevent cross-tenant operations, but the current logic prevents same-tenant operations.

-        if req_tenant.ne(DEFAULT_TENANT)
-            && (req_tenant.eq(user.tenant.as_ref().map_or(DEFAULT_TENANT, |v| v)))
+        if req_tenant.ne(DEFAULT_TENANT)
+            && (req_tenant.ne(user.tenant.as_ref().map_or(DEFAULT_TENANT, |v| v)))
🤖 Prompt for AI Agents
In `@src/handlers/http/modal/ingest/ingestor_rbac.rs` around lines 52 - 58, The
tenant validation in ingestor_rbac.rs is inverted: change the condition that
currently returns an error when req_tenant equals the requester's tenant to
instead return an error when a non-super-admin (req_tenant != DEFAULT_TENANT) is
trying to create a user for a different tenant; specifically, update the check
that uses req_tenant, DEFAULT_TENANT and user.tenant (as_ref().map_or(...)) so
it tests for inequality (req_tenant !=
user.tenant.as_ref().map_or(DEFAULT_TENANT, |v| v)) and then return
RBACError::Anyhow(...) when that inequality is true.

Comment on lines +104 to +118
 // if control flow is here then it is most likely basic auth
 // try authorize
 match Users.authorize(session_key.clone(), rbac::role::Action::Login, None, None) {
     rbac::Response::Authorized => (),
-    rbac::Response::UnAuthorized | rbac::Response::ReloadRequired => {
+    rbac::Response::UnAuthorized
+    | rbac::Response::ReloadRequired
+    | rbac::Response::Suspended(_) => {
         return Err(OIDCError::Unauthorized);
     }
 }
+let tenant_id = get_tenant_id_from_key(&session_key);
 match session_key {
     // We can exchange basic auth for session cookie
-    SessionKey::BasicAuth { username, password } => match Users.get_user(&username) {
+    SessionKey::BasicAuth { username, password } => match Users.get_user(&username, &tenant_id)
+    {

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the specific file and lines mentioned
cat -n src/handlers/http/oidc.rs | sed -n '90,125p'

Repository: parseablehq/parseable

Length of output: 186


🏁 Script executed:

# Find the implementation of get_tenant_id_from_key
rg "fn get_tenant_id_from_key" -A 10

Repository: parseablehq/parseable

Length of output: 556


🏁 Script executed:

# Find if get_tenant_id_from_request exists
rg "get_tenant_id_from_request" -B 2 -A 5

Repository: parseablehq/parseable

Length of output: 50377


🏁 Script executed:

# Search for SessionKey definition to understand BasicAuth structure
rg "enum SessionKey|struct SessionKey" -A 20

Repository: parseablehq/parseable

Length of output: 1145


🏁 Script executed:

cat -n src/handlers/http/oidc.rs | sed -n '95,135p'

Repository: parseablehq/parseable

Length of output: 2046


🏁 Script executed:

# Also check if there's any session creation logic for BasicAuth before line 104
rg "SessionKey::BasicAuth" -B 5 -A 10 src/handlers/http/oidc.rs

Repository: parseablehq/parseable

Length of output: 1243


🏁 Script executed:

# Check how Users.get_user works and if it requires tenant
rg "fn get_user" -A 15 src/rbac/

Repository: parseablehq/parseable

Length of output: 4435


Use get_tenant_id_from_request() for tenant selection in basic-auth flow

For SessionKey::BasicAuth, get_tenant_id_from_key() returns None because basic credentials have no pre-established session mapping. This causes Users.get_user(&username, &None) to default to the DEFAULT_TENANT regardless of the actual tenant context, breaking multi-tenant support for basic-auth users. Use get_tenant_id_from_request(&req) instead to allow tenant selection via HTTP header (or make tenant explicit in basic-auth credentials if preferred).

🤖 Prompt for AI Agents
In `@src/handlers/http/oidc.rs` around lines 104 - 118, The basic-auth branch
incorrectly uses get_tenant_id_from_key(&session_key) which yields None for
SessionKey::BasicAuth; replace the tenant lookup inside the
SessionKey::BasicAuth arm to call get_tenant_id_from_request(&req) (or compute a
separate tenant_id_for_basic_auth = get_tenant_id_from_request(&req) before
calling Users.get_user) and pass that tenant_id to Users.get_user(&username,
&tenant_id_for_basic_auth); keep the existing tenant_id usage for non-basic-auth
branches and ensure you only switch tenant source for the SessionKey::BasicAuth
pattern.

Comment on lines +1057 to +1076
pub fn add_tenant(
&self,
tenant_id: String,
tenant_meta: StorageMetadata,
) -> Result<(), anyhow::Error> {
if !self.options.is_multi_tenant() {
return Err(anyhow::Error::msg("P_MULTI_TENANCY is set to false"));
}

if self.tenants.read().unwrap().contains(&tenant_id) {
return Err(anyhow::Error::msg(format!(
"Tenant with id- {tenant_id} already exists"
)));
} else {
self.tenants.write().unwrap().push(tenant_id.clone());
TENANT_METADATA.insert_tenant(tenant_id, tenant_meta);
}

Ok(())
}

⚠️ Potential issue | 🟠 Major

TOCTOU race condition in tenant addition.

The method has a time-of-check-to-time-of-use race: the existence check (line 1066) uses a read lock, but the insertion (line 1071) acquires a separate write lock. Another thread could add the same tenant between these operations.

Fix using single write lock
 pub fn add_tenant(
     &self,
     tenant_id: String,
     tenant_meta: StorageMetadata,
 ) -> Result<(), anyhow::Error> {
     if !self.options.is_multi_tenant() {
         return Err(anyhow::Error::msg("P_MULTI_TENANCY is set to false"));
     }

-    if self.tenants.read().unwrap().contains(&tenant_id) {
-        return Err(anyhow::Error::msg(format!(
-            "Tenant with id- {tenant_id} already exists"
-        )));
-    } else {
-        self.tenants.write().unwrap().push(tenant_id.clone());
+    let mut tenants = self.tenants.write().unwrap();
+    if tenants.contains(&tenant_id) {
+        return Err(anyhow::Error::msg(format!(
+            "Tenant with id- {tenant_id} already exists"
+        )));
+    }
+    tenants.push(tenant_id.clone());
+    drop(tenants); // Release lock before external call
     TENANT_METADATA.insert_tenant(tenant_id, tenant_meta);
-    }

     Ok(())
 }
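
The single-lock pattern can be exercised on its own with std types. This sketch uses a hypothetical Tenants wrapper around RwLock<Vec<String>> and leaves out the metadata insert; whether that call should happen before or after releasing the lock is a separate design choice.

```rust
use std::sync::RwLock;

/// Hypothetical wrapper around the tenants list.
pub struct Tenants {
    ids: RwLock<Vec<String>>,
}

impl Tenants {
    pub fn new() -> Self {
        Self {
            ids: RwLock::new(Vec::new()),
        }
    }

    /// Check and insert under one write guard, so no other thread can add
    /// the same id between the check and the push.
    pub fn add(&self, tenant_id: &str) -> Result<(), String> {
        let mut ids = self
            .ids
            .write()
            .map_err(|_| "tenants lock poisoned".to_string())?;
        if ids.iter().any(|t| t == tenant_id) {
            return Err(format!("Tenant with id- {tenant_id} already exists"));
        }
        ids.push(tenant_id.to_owned());
        Ok(())
    }

    pub fn len(&self) -> usize {
        self.ids.read().map(|ids| ids.len()).unwrap_or(0)
    }
}
```

If several threads race to add the same id, exactly one call returns Ok and the list ends up with a single entry, which is the property the read-then-write version cannot guarantee.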
🤖 Prompt for AI Agents
In `@src/parseable/mod.rs` around lines 1057 - 1076, The add_tenant method has a
TOCTOU race: it does a contains() under a read lock then pushes under a separate
write lock; fix by taking a single write lock once (let mut tenants =
self.tenants.write().unwrap()), perform the contains() check on that guard,
return Err if present, otherwise push the tenant_id and call
TENANT_METADATA.insert_tenant(tenant_id, tenant_meta) while still holding that
write lock so the check-and-insert is atomic.

Comment on lines +1116 to +1144
pub fn delete_tenant(&self, tenant_id: &str) -> Result<(), anyhow::Error> {
    // let mut metadata = get_metadata(&Some(tenant_id.to_owned())).await?;
    // delete users and sessions
    let users = mut_users().remove(tenant_id);
    if let Some(users) = users {
        tracing::warn!("found tenant users, deleting");
        for (userid, user) in users {
            // metadata
            //     .users
            //     .retain(|u| u.tenant.eq(&Some(tenant_id.to_owned())));

            Users.delete_user(&userid, &user.tenant);
        }
    }

    // delete roles
    mut_roles().remove(tenant_id);
    // if let Some(roles) = mut_roles().remove(tenant_id) {
    //     for (role, _) in roles {
    //         // metadata.roles.retain(|r, _| !role.eq(r));
    //     }
    // }

    // delete resources

    // delete from in-mem
    TENANT_METADATA.delete_tenant(&tenant_id);
    Ok(())
}

⚠️ Potential issue | 🔴 Critical

Incomplete tenant deletion - missing tenants list cleanup.

The method removes the tenant from TENANT_METADATA but does not remove it from self.tenants. This leaves the system in an inconsistent state where list_tenants() still returns the deleted tenant.

Add tenant list cleanup
 pub fn delete_tenant(&self, tenant_id: &str) -> Result<(), anyhow::Error> {
     // delete users and sessions
     let users = mut_users().remove(tenant_id);
     if let Some(users) = users {
         tracing::warn!("found tenant users, deleting");
         for (userid, user) in users {
             Users.delete_user(&userid, &user.tenant);
         }
     }

     // delete roles
     mut_roles().remove(tenant_id);

+    // Remove from tenants list
+    if let Ok(mut tenants) = self.tenants.write() {
+        tenants.retain(|t| t != tenant_id);
+    }

     // delete from in-mem
     TENANT_METADATA.delete_tenant(&tenant_id);
     Ok(())
 }
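To illustrate why both structures need updating, here is a small std-only sketch. `TenantState` and its fields are hypothetical stand-ins for `self.tenants` and `TENANT_METADATA`, with metadata reduced to a `String`:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical in-memory tenant state: an id list plus per-tenant metadata.
pub struct TenantState {
    tenants: RwLock<Vec<String>>,
    metadata: RwLock<HashMap<String, String>>,
}

impl TenantState {
    pub fn with_tenants(ids: &[&str]) -> Self {
        let tenants: Vec<String> = ids.iter().map(|id| id.to_string()).collect();
        let metadata: HashMap<String, String> = ids
            .iter()
            .map(|id| (id.to_string(), String::new()))
            .collect();
        Self {
            tenants: RwLock::new(tenants),
            metadata: RwLock::new(metadata),
        }
    }

    /// Remove the tenant from both structures so listings and metadata agree.
    pub fn delete_tenant(&self, tenant_id: &str) {
        self.tenants
            .write()
            .unwrap()
            .retain(|t| t.as_str() != tenant_id);
        self.metadata.write().unwrap().remove(tenant_id);
    }

    pub fn list_tenants(&self) -> Vec<String> {
        self.tenants.read().unwrap().clone()
    }

    pub fn has_metadata(&self, tenant_id: &str) -> bool {
        self.metadata.read().unwrap().contains_key(tenant_id)
    }
}
```

If the `retain` call were omitted, `list_tenants()` would keep returning an id whose metadata no longer exists, which is the inconsistency flagged above.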
🤖 Prompt for AI Agents
In `@src/parseable/mod.rs` around lines 1116 - 1144, delete_tenant currently
removes tenant data from TENANT_METADATA, users and roles but never removes the
tenant entry from self.tenants, so list_tenants() still returns it; update
delete_tenant to also remove the tenant from self.tenants (e.g., by acquiring a
mutable borrow of self.tenants and calling remove(tenant_id) or
filtering/retaining entries that don't match tenant_id), ensuring you reference
the same tenant_id string; keep existing cleanup (mut_users(),
Users.delete_user, mut_roles(), TENANT_METADATA.delete_tenant) and perform the
self.tenants removal before returning Ok(()) so the in-memory tenant list and
list_tenants() reflect the deletion.

Comment on lines +86 to +121
pub static QUERY_SESSION: Lazy<InMemorySessionContext> = Lazy::new(|| {
    let ctx = Query::create_session_context(PARSEABLE.storage());
    InMemorySessionContext {
        session_context: Arc::new(RwLock::new(ctx)),
    }
});

pub struct InMemorySessionContext {
    session_context: Arc<RwLock<SessionContext>>,
}

impl InMemorySessionContext {
    pub fn get_ctx(&self) -> SessionContext {
        let ctx = self
            .session_context
            .read()
            .expect("SessionContext should be readable");
        ctx.clone()
    }

    pub fn add_schema(&self, tenant_id: &str) {
        self.session_context
            .write()
            .expect("SessionContext should be writeable")
            .catalog("datafusion")
            .expect("Default catalog should be available")
            .register_schema(
                tenant_id,
                Arc::new(GlobalSchemaProvider {
                    storage: PARSEABLE.storage().get_object_store(),
                    tenant_id: Some(tenant_id.to_owned()),
                }),
            )
            .expect("Should be able to register new schema");
    }
}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

DataFusion Rust CatalogProvider register_schema behavior duplicate schema error handling

💡 Result:

Short answer:

  • CatalogProvider::register_schema adds or replaces a schema; if a schema with the same name existed it is replaced and the previous schema is returned as Option<Arc<dyn SchemaProvider>>. The provided default implementation returns a "Not Implemented" error, while MemoryCatalogProvider implements the replace-and-return-old behavior. If you want a "duplicate schema" error instead of replace, implement a custom CatalogProvider/register_schema that checks existence and returns an Err(DataFusionError). [1][2]

Sources:

  • CatalogProvider trait docs (register_schema semantics and default behavior). [1]
  • MemoryCatalogProvider / catalog docs and Python wrapper showing register_schema behavior. [2]

Log schema registration errors instead of silently ignoring them during initialization

Lines 191 and 201 in create_session_context silently drop errors with let _ = catalog.register_schema(...). These errors should be logged since failed schema registration at startup may cause query failures.

Note: The .expect() in add_schema (line 119) is safe—DataFusion's register_schema replaces existing schemas rather than erroring on duplicates, so it will only panic on legitimate I/O failures.

Also applies to: 171-205
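The difference between discarding and reporting a registration error can be sketched without DataFusion. `ToyCatalog` below is a hypothetical stand-in that only imitates the replace-and-return-old behaviour described above; it is not the DataFusion API, and `eprintln!` takes the place of `tracing::error!` to keep the sketch std-only:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a catalog; this is not the DataFusion API.
#[derive(Default)]
pub struct ToyCatalog {
    schemas: HashMap<String, String>,
}

impl ToyCatalog {
    /// Registering an existing name replaces it and yields the previous value.
    pub fn register_schema(
        &mut self,
        name: &str,
        provider: &str,
    ) -> Result<Option<String>, String> {
        if name.is_empty() {
            return Err("schema name must not be empty".to_owned());
        }
        Ok(self.schemas.insert(name.to_owned(), provider.to_owned()))
    }
}

/// Registers one schema per tenant and reports failures instead of
/// discarding them with `let _ = ...`. Returns the tenants that failed.
pub fn register_tenant_schemas(catalog: &mut ToyCatalog, tenants: &[&str]) -> Vec<String> {
    let mut failed = Vec::new();
    for tenant in tenants {
        if let Err(e) = catalog.register_schema(tenant, "provider") {
            // In the server this would be a tracing::error! call.
            eprintln!("failed to register schema for tenant {tenant:?}: {e}");
            failed.push(tenant.to_string());
        }
    }
    failed
}
```

Returning the failed tenants is one possible design choice; it lets startup code decide whether a partial registration is acceptable.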

🤖 Prompt for AI Agents
In `@src/query/mod.rs` around lines 86 - 121, create_session_context currently
ignores errors from catalog.register_schema (used around lines referenced) with
`let _ = ...`; update that to handle the Result and log any Err using the
project's logging facility (e.g., tracing::error! or log::error!), e.g. replace
the `let _ = catalog.register_schema(...)` with an `if let Err(e) =
catalog.register_schema(...) { error!("failed to register schema for tenant {}:
{:?}", tenant_id, e); }` pattern so startup schema registration failures are
visible; do not change the existing InMemorySessionContext::add_schema behavior
that uses .expect().


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (20)
src/storage/store_metadata.rs (1)

298-320: Missing parent directory creation for tenant-specific staging path.

When tenant_id is Some, the staging metadata is written to staging_dir/tenant_id/.parseable.json. However, the tenant subdirectory may not exist, causing OpenOptions::open() to fail with NotFound. The migration module's put_staging_metadata (shown in relevant snippets) has the same pattern but also doesn't create the directory.

🐛 Proposed fix to ensure parent directory exists
 pub fn put_staging_metadata(meta: &StorageMetadata, tenant_id: &Option<String>) -> io::Result<()> {
     let mut staging_metadata = meta.clone();
     staging_metadata.server_mode = PARSEABLE.options.mode;
     staging_metadata.staging = PARSEABLE.options.staging_dir().to_path_buf();
     let path = if let Some(tenant_id) = tenant_id.as_ref() {
-        PARSEABLE
+        let tenant_path = PARSEABLE
             .options
             .staging_dir()
-            .join(tenant_id)
-            .join(PARSEABLE_METADATA_FILE_NAME)
+            .join(tenant_id);
+        std::fs::create_dir_all(&tenant_path)?;
+        tenant_path.join(PARSEABLE_METADATA_FILE_NAME)
     } else {
         PARSEABLE
             .options
             .staging_dir()
             .join(PARSEABLE_METADATA_FILE_NAME)
     };
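A self-contained sketch of the same idea, assuming a hypothetical helper `write_staging_metadata` that takes the staging directory and raw bytes directly instead of reading them from `PARSEABLE.options` and `StorageMetadata`:

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// File name taken from the path quoted in the comment above.
const METADATA_FILE_NAME: &str = ".parseable.json";

/// Writes `contents` to `<staging_dir>/[<tenant_id>/]<METADATA_FILE_NAME>`,
/// creating the directory first so `open` cannot fail with NotFound.
pub fn write_staging_metadata(
    staging_dir: &Path,
    tenant_id: &Option<String>,
    contents: &[u8],
) -> io::Result<PathBuf> {
    let dir = match tenant_id {
        Some(tenant) => staging_dir.join(tenant),
        None => staging_dir.to_path_buf(),
    };
    // Succeeds without error if the directory already exists.
    fs::create_dir_all(&dir)?;
    let path = dir.join(METADATA_FILE_NAME);
    let mut file = OpenOptions::new()
        .create(true)
        .truncate(true)
        .write(true)
        .open(&path)?;
    file.write_all(contents)?;
    Ok(path)
}
```

`create_dir_all` is safe to call on every write because it returns `Ok(())` when the directory is already present.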
src/stats.rs (1)

151-168: Inconsistent tenant_id usage in metric labels within update_deleted_stats.

The function accepts tenant_id and uses it when calling get_current_stats (line 169), but the metric updates between lines 151-168 use hardcoded 2-element and 3-element label arrays without tenant_id. This creates a mismatch: metrics are written without tenant context but stats are read with tenant context.

These labels should include tenant_id to match the label structure used elsewhere (e.g., event_labels returns 3 elements, storage_size_labels returns 4 elements).

🐛 Proposed fix
+    let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v.as_str());
     EVENTS_DELETED
-        .with_label_values(&[stream_name, "json"])
+        .with_label_values(&[stream_name, "json", tenant])
         .add(num_row);
     EVENTS_DELETED_SIZE
-        .with_label_values(&[stream_name, "json"])
+        .with_label_values(&[stream_name, "json", tenant])
         .add(ingestion_size);
     DELETED_EVENTS_STORAGE_SIZE
-        .with_label_values(&["data", stream_name, "parquet"])
+        .with_label_values(&["data", stream_name, "parquet", tenant])
         .add(storage_size);
     EVENTS_INGESTED
-        .with_label_values(&[stream_name, "json"])
+        .with_label_values(&[stream_name, "json", tenant])
         .sub(num_row);
     EVENTS_INGESTED_SIZE
-        .with_label_values(&[stream_name, "json"])
+        .with_label_values(&[stream_name, "json", tenant])
         .sub(ingestion_size);
     STORAGE_SIZE
-        .with_label_values(&["data", stream_name, "parquet"])
+        .with_label_values(&["data", stream_name, "parquet", tenant])
         .sub(storage_size);
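One way to keep label arity consistent is to build the arrays in one place. The helpers below are hypothetical re-implementations for illustration only (the real `event_labels` and `storage_size_labels` signatures are not shown in this review), and the value of `DEFAULT_TENANT` is an assumption:

```rust
/// Assumed fallback label; the real DEFAULT_TENANT value is not shown in the review.
const DEFAULT_TENANT: &str = "default";

/// Three labels: stream, format, tenant.
pub fn event_labels<'a>(
    stream: &'a str,
    format: &'a str,
    tenant_id: &'a Option<String>,
) -> [&'a str; 3] {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    [stream, format, tenant]
}

/// Four labels: type, stream, format, tenant.
pub fn storage_size_labels<'a>(stream: &'a str, tenant_id: &'a Option<String>) -> [&'a str; 4] {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    ["data", stream, "parquet", tenant]
}
```

Routing every `with_label_values` call through helpers like these would make a missing tenant label a compile-time array-length mismatch rather than a runtime label-count error.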
src/hottier.rs (1)

208-220: delete_hot_tier doesn't use tenant_id in path construction.

The function accepts tenant_id but line 216 constructs the path as self.hot_tier_path.join(stream) without tenant isolation. This is inconsistent with hot_tier_file_path which includes the tenant subdirectory. In a multi-tenant environment, this could delete another tenant's data.

🐛 Proposed fix
     pub async fn delete_hot_tier(
         &self,
         stream: &str,
         tenant_id: &Option<String>,
     ) -> Result<(), HotTierError> {
         if !self.check_stream_hot_tier_exists(stream, tenant_id) {
             return Err(HotTierValidationError::NotFound(stream.to_owned()).into());
         }
-        let path = self.hot_tier_path.join(stream);
+        let path = if let Some(tid) = tenant_id.as_ref() {
+            self.hot_tier_path.join(tid).join(stream)
+        } else {
+            self.hot_tier_path.join(stream)
+        };
         fs::remove_dir_all(path).await?;

         Ok(())
     }
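The path selection can be factored into a small pure function so that create, read, and delete all resolve the same directory. `hot_tier_stream_dir` is a hypothetical helper name, sketched here with std paths only:

```rust
use std::path::{Path, PathBuf};

/// Resolves the hot-tier directory for a stream, nesting it under the
/// tenant id when one is present.
pub fn hot_tier_stream_dir(
    hot_tier_path: &Path,
    stream: &str,
    tenant_id: &Option<String>,
) -> PathBuf {
    match tenant_id {
        Some(tenant) => hot_tier_path.join(tenant).join(stream),
        None => hot_tier_path.join(stream),
    }
}
```

Two tenants with a stream of the same name then resolve to different directories, so deleting one cannot touch the other.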
src/rbac/user.rs (1)

153-164: Revert to the recommended salt generation approach.

The manual 32-byte salt generation with fill_bytes and encode_b64 is technically valid but not aligned with best practices. The Argon2 ecosystem and password-hash crate recommend using SaltString::generate(&mut OsRng) directly, which provides simpler, safer, and more idiomatic code. This approach handles PHC-safe encoding and appropriate salt length automatically, eliminating the need for manual encoding. Restore the commented line at 158 and remove the manual implementation at lines 154-157.

src/metastore/metastores/object_store_metastore.rs (5)

393-403: tenant_id parameter unused in delete_alert_state.

Similar to put_alert_state, the tenant_id parameter at line 396 is accepted but unused. The path comes from obj.get_object_path() which may not be tenant-aware.


541-566: get_chats is not tenant-aware unlike similar methods.

While get_dashboards, get_filters, and get_correlations iterate over tenants via PARSEABLE.list_tenants(), get_chats only uses a single USERS_ROOT_DIR path without tenant prefixing. This inconsistency means chats won't be properly scoped per tenant.

🐛 Suggested fix for tenant-aware chats
     async fn get_chats(&self) -> Result<DashMap<String, Vec<Bytes>>, MetastoreError> {
         let all_user_chats = DashMap::new();
-
-        let users_dir = RelativePathBuf::from(USERS_ROOT_DIR);
-        for user in self.storage.list_dirs_relative(&users_dir).await? {
-            if user.starts_with(".") {
-                continue;
-            }
-            let mut chats = Vec::new();
-            let chats_path = users_dir.join(&user).join("chats");
-            let user_chats = self
-                .storage
-                .get_objects(
-                    Some(&chats_path),
-                    Box::new(|file_name| file_name.ends_with(".json")),
-                )
-                .await?;
-            for chat in user_chats {
-                chats.push(chat);
+        let base_paths = PARSEABLE.list_tenants().map_or(vec!["".into()], |v| v);
+        for tenant in base_paths {
+            let users_dir = RelativePathBuf::from_iter([&tenant, USERS_ROOT_DIR]);
+            for user in self.storage.list_dirs_relative(&users_dir).await? {
+                if user.starts_with(".") {
+                    continue;
+                }
+                let mut chats = Vec::new();
+                let chats_path = users_dir.join(&user).join("chats");
+                let user_chats = self
+                    .storage
+                    .get_objects(
+                        Some(&chats_path),
+                        Box::new(|file_name| file_name.ends_with(".json")),
+                    )
+                    .await?;
+                for chat in user_chats {
+                    chats.push(chat);
+                }
+                all_user_chats.insert(user, chats);
             }
-
-            all_user_chats.insert(user, chats);
         }
 
         Ok(all_user_chats)
     }

297-321: Tenant path mismatch between get_alert_states and get_alert_state_entry.

get_alert_states constructs a tenant-specific base path at lines 301-302, but get_alert_state_entry (line 328) calls alert_state_json_path(*alert_id) which ignores the tenant_id parameter and returns a non-tenant-prefixed path. This breaks tenant isolation—get_alert_states lists from {tenant}/alerts/ while get_alert_state_entry reads from alerts/. The same issue affects put_alert_state (line 352).


342-390: alert_state_json_path is missing the tenant_id parameter.

The put_alert_state method accepts tenant_id but never uses it. The path construction at line 352 uses alert_state_json_path(id) without tenant context, storing alert states in a global location instead of per-tenant.

This is inconsistent with related path functions:

  • alert_json_path(alert_id, tenant_id) accepts and uses tenant_id
  • mttr_json_path(tenant_id) accepts and uses tenant_id
  • schema_path(stream_name, tenant_id) accepts and uses tenant_id

The get_alert_states method (lines 297–322) demonstrates the correct pattern by manually constructing a tenant-scoped path with tenant_id. The singular get_alert_state_entry method has the same issue.

Update alert_state_json_path signature to accept tenant_id: &Option<String> and pass it in both get_alert_state_entry and put_alert_state calls.


569-594: Unused tenant_id parameters create dead code and path inconsistency.

The methods put_chat, put_filter, put_correlation, put_target, and put_llmconfig accept tenant_id but never use it. Their implementations rely solely on obj.get_object_path(), which bypasses tenant context:

  • Filter: Path uses filter_path(user_id, ...) without tenant scoping
  • CorrelationConfig: Path uses self.path() with user_id only
  • Target: Path uses target_json_path(&self.id) while the object has a pub tenant: Option<String> field that is ignored; target_json_path() includes a TODO comment "Needs to be updated for distributed mode"

This contrasts with put_conversation and put_alert, which correctly build paths using the tenant_id parameter. In a distributed/multi-tenant scenario, this inconsistency could lead to operations on incorrect paths or data isolation issues.

src/alerts/alerts_utils.rs (2)

130-153: The auth_token parameter is received but never used.

The execute_remote_query function accepts auth_token: Option<String> but passes None to send_query_request on line 148. This means the auth context from the caller is completely ignored, and the remote query will always fall back to the internal cluster token.

Looking at the send_query_request signature (from the relevant snippets), it expects Option<HeaderMap>. The auth_token should be converted and passed through.

🐛 Proposed fix to use the auth_token
+use http::header::HeaderValue;
+use reqwest::header::HeaderMap;
+
 /// Execute alert query remotely (Prism mode)
 async fn execute_remote_query(
     auth_token: Option<String>,
     query: &str,
     time_range: &TimeRange,
 ) -> Result<AlertQueryResult, AlertError> {
     let session_state = QUERY_SESSION.get_ctx().state();
     let raw_logical_plan = session_state.create_logical_plan(query).await?;
 
     let query_request = Query {
         query: query.to_string(),
         start_time: time_range.start.to_rfc3339(),
         end_time: time_range.end.to_rfc3339(),
         streaming: false,
         send_null: false,
         fields: false,
         filter_tags: None,
     };
 
-    let (result_value, _) = send_query_request(None,&query_request)
+    let auth_header = auth_token.map(|token| {
+        let mut map = HeaderMap::new();
+        map.insert(
+            http::header::AUTHORIZATION,
+            HeaderValue::from_str(&token).expect("valid auth token"),
+        );
+        map
+    });
+
+    let (result_value, _) = send_query_request(auth_header, &query_request)
         .await
         .map_err(|err| AlertError::CustomError(format!("Failed to send query request: {err}")))?;
 
     convert_result_to_group_results(result_value, raw_logical_plan)
 }

77-91: Pass tenant_id to remote query execution in Prism mode.

The execute_remote_query function does not accept or forward the tenant_id parameter, even though the parent execute_alert_query function receives it. This breaks multi-tenant isolation in Prism deployments. The send_query_request signature also lacks tenant context (note the commented-out tenant_id parameter on line 380 of cluster/mod.rs), preventing proper tenant-scoped query execution on remote nodes. Both functions need to be updated to propagate tenant context.

src/handlers/http/modal/ingest/ingestor_logstream.rs (1)

70-89: Consider consistency in stream validation for the delete handler.

This delete handler uses get_stream (line 74) without an upfront stream existence check, while similar handlers in src/handlers/http/logstream.rs (line 52) and src/handlers/http/modal/query/querier_logstream.rs (line 55) verify stream existence before calling get_or_create_stream.

Using get_stream will error if the stream exists in storage but isn't loaded in memory, whereas the other handlers safeguard against this with explicit checks. If this stricter behavior is intentional for ingestor nodes (where streams should always be in memory), add a comment explaining why the pattern differs from other delete handlers.

src/handlers/http/targets.rs (2)

71-98: Missing tenant_id enforcement in update handler allows potential cross-tenant modifications.

The handler extracts tenant_id from the request and uses it to fetch the existing target, but TARGETS.update(target.clone()) accepts the target object deserialized directly from the JSON request body without enforcing that target.tenant matches the authenticated tenant_id. A user could send a target with a different or missing tenant value, allowing the update to bypass tenant isolation.

Set target.tenant = tenant.clone() before calling TARGETS.update() to enforce tenant isolation, matching the pattern used in get_target_by_id() which properly validates the tenant context.


34-45: Both post and update handlers must set target.tenant before calling TARGETS.update().

The post handler doesn't extract tenant_id, and the update handler extracts it but never assigns it to target.tenant. Since TargetConfigs.update() relies on target.tenant to determine the storage location (defaulting to DEFAULT_TENANT if unset), both handlers will incorrectly store targets under the default tenant, breaking multi-tenant isolation.

Add target.tenant = tenant_id; in both handlers before calling TARGETS.update(), and extract tenant_id in post() using get_tenant_id_from_request().

src/handlers/http/modal/ingest/ingestor_rbac.rs (3)

108-112: Role existence check not tenant-scoped.

The roles() map is now tenant-scoped (HashMap<String, HashMap<String, Vec<DefaultPrivilege>>>), so roles().get(r) where r is a role name will always return None. You need to first get the tenant's role map.

🐛 Proposed fix
     // check if all roles exist
     let mut non_existent_roles = Vec::new();
+    let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
+    let tenant_roles = roles();
+    let tenant_role_map = tenant_roles.get(tenant);
     roles_to_add.iter().for_each(|r| {
-        if roles().get(r).is_none() {
+        if tenant_role_map.map_or(true, |m| m.get(r).is_none()) {
             non_existent_roles.push(r.clone());
         }
     });

150-156: Role existence check not tenant-scoped.

Same issue as in add_roles_to_user - the role lookup needs to be scoped to the tenant's role map.

🐛 Proposed fix
     // check if all roles exist
     let mut non_existent_roles = Vec::new();
+    let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
+    let tenant_roles = roles();
+    let tenant_role_map = tenant_roles.get(tenant);
     roles_to_remove.iter().for_each(|r| {
-        if roles().get(r).is_none() {
+        if tenant_role_map.map_or(true, |m| m.get(r).is_none()) {
             non_existent_roles.push(r.clone());
         }
     });
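Both role checks above reduce to the same nested lookup. A std-only sketch, with privileges simplified to strings and an assumed value for `DEFAULT_TENANT` (`non_existent_roles` is a hypothetical helper, not a function in the PR):

```rust
use std::collections::HashMap;

/// Assumed fallback tenant key; the real constant's value is not shown in the review.
const DEFAULT_TENANT: &str = "default";

/// tenant -> role name -> privileges (privileges simplified to strings here).
pub type TenantRoles = HashMap<String, HashMap<String, Vec<String>>>;

/// Returns the requested roles that do not exist for the given tenant.
/// A tenant with no role map at all is treated as having no roles.
pub fn non_existent_roles(
    roles: &TenantRoles,
    tenant_id: &Option<String>,
    requested: &[String],
) -> Vec<String> {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    let tenant_roles = roles.get(tenant);
    requested
        .iter()
        .filter(|r| tenant_roles.map_or(true, |m| !m.contains_key(r.as_str())))
        .cloned()
        .collect()
}
```

Looking up the role name directly in the outer map, as the current code does, compares a role name against tenant keys and so reports every role as missing.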

197-221: Missing password generation logic - critical bug.

This implementation is fundamentally broken compared to the other post_gen_password implementations in src/handlers/http/rbac.rs (line 180) and src/handlers/http/modal/query/querier_rbac.rs (line 313). Both those functions call user::Basic::gen_new_password() to actually generate a new password and hash, then persist the modified metadata.

In ingestor_rbac.rs:

  • Line 206 writes unmodified metadata to staging (pointless, nothing changed yet)
  • Lines 207-219 copy the existing password hash instead of generating a new one
  • Line 220 updates the in-memory Users table with that existing hash
  • The function returns a success response instead of returning the new generated password

The function comment claims it "Resets password for the user to a newly generated one" but this implementation skips the generation entirely and just syncs an unchanged password. Add the missing call to user::Basic::gen_new_password() before line 207, update metadata with the new hash, and persist the modified metadata (like rbac.rs does at line 196).

src/parseable/streams.rs (1)

1627-1632: Streams tests also missing tenant_id parameter.

Streams::get_or_create calls in tests (lines 1627-1632, 1635-1640, 1665-1670, 1703, 1712) are missing the required tenant_id parameter. Additionally, the assertions at lines 1647 and 1661 use guard.len() and guard.contains_key(), which no longer behave as intended with the nested HashMap structure.

         let stream1 = streams.get_or_create(
             options.clone(),
             stream_name.to_owned(),
             metadata.clone(),
             ingestor_id.clone(),
+            &None,
         );

The assertion logic also needs updating to account for the nested HashMap:

-        assert_eq!(guard.len(), 1);
+        assert_eq!(guard.values().map(|m| m.len()).sum::<usize>(), 1);

Also applies to: 1661-1661, 1676-1678
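For the test assertions, counting and membership over the nested map might look like the sketch below. `StreamsByTenant` is a hypothetical alias with a unit placeholder where the real map would hold stream handles:

```rust
use std::collections::HashMap;

/// tenant -> stream name -> stream handle (a unit placeholder in this sketch).
pub type StreamsByTenant = HashMap<String, HashMap<String, ()>>;

/// Total number of streams across all tenants.
pub fn total_streams(guard: &StreamsByTenant) -> usize {
    guard.values().map(|streams| streams.len()).sum()
}

/// Whether a stream exists under a specific tenant.
pub fn contains_stream(guard: &StreamsByTenant, tenant: &str, stream: &str) -> bool {
    guard
        .get(tenant)
        .map_or(false, |streams| streams.contains_key(stream))
}
```

Note that `guard.len()` on the outer map counts tenants, not streams, so an assertion written against the old flat map can still compile while checking the wrong thing.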

src/handlers/http/rbac.rs (1)

128-136: Role existence check should use tenant context.

The roles().contains_key(role) check at line 130 bypasses tenant-scoping. While tenant_id is available on line 117, the validation doesn't use it. Other handlers like get_role correctly access tenant-scoped roles via roles().get(tenant). The current implementation may accept roles from other tenants or incorrectly reject valid tenant-specific roles if roles are indeed tenant-scoped.

src/handlers/http/logstream.rs (1)

380-392: Pass tenant_id to get_first_and_latest_event_from_storage call.

Event data is stored under tenant-scoped paths (e.g., {tenant}/{stream_name}/date=.../...), but get_first_and_latest_event_from_storage receives only stream_name and calls list_dates(stream_name), which will not find data stored under the tenant prefix. For multi-tenant deployments, this causes retrieval of timestamps from the wrong path or no data at all. Add tenant_id parameter to the function signature and storage method chain to ensure proper path scoping.

🤖 Fix all issues with AI agents
In `@src/handlers/http/cluster/mod.rs`:
- Around line 697-703: The function sync_role_update currently accepts an unused
HttpRequest parameter named req; remove the unused parameter from the signature
(change sync_role_update(req: HttpRequest, ...) to sync_role_update(name:
String, privileges: Vec<DefaultPrivilege>, tenant_id: &str)) and update every
call site to stop passing an HttpRequest, or alternatively rename it to _req to
silence the unused warning if callers cannot be changed; ensure the function
signature and any trait impls or tests referencing sync_role_update are updated
consistently.
- Around line 654-657: The function signature for
sync_password_reset_with_ingestors currently accepts an unused HttpRequest
parameter (req); either remove the unused parameter from the signature and all
call sites (update any invocations of sync_password_reset_with_ingestors) or use
req inside the function to extract and propagate tenant/context info (e.g., read
tenant header or extractor used elsewhere) and forward that context to any
downstream calls; update the function signature and callers consistently and
adjust any RBAC or tenant-related logic to use the extracted context if you
choose to keep req.
- Around line 593-598: post_user currently constructs the user with
user::User::new_basic(username.clone(), None) which drops tenant context; change
the call to pass the extracted tenant_id (e.g.,
user::User::new_basic(username.clone(), Some(tenant_id.clone()))) so the tenant
is preserved when syncing to ingestors/queriers. Ensure the tenant_id variable
extracted earlier in post_user is used and cloned as needed; the User::new_basic
call is the only change required to match the pattern used in rbac.rs.

In `@src/handlers/http/modal/query/querier_rbac.rs`:
- Line 163: The call to sync_user_deletion_with_ingestors(&userid).await? omits
tenant context so ingestors' delete_user reads tenant_id from the incoming
request and may delete from the wrong tenant; update the querier's delete_user
to pass the correct tenant_id into sync_user_deletion_with_ingestors (e.g.
sync_user_deletion_with_ingestors(&tenant_id, &userid).await?) and modify the
ingestor request builder inside sync_user_deletion_with_ingestors to include
tenant_id (preferably as a dedicated HTTP header like "X-Tenant-ID" or an
explicit query parameter) so the ingestor_rbac::delete_user can unambiguously
target the correct tenant.

In `@src/handlers/http/role.rs`:
- Around line 176-190: Remove the leftover commented-out match block and its
surrounding commented lines so only the active let-chains code remains: keep the
existing assignment to res using DEFAULT_ROLE.read().unwrap().get(tenant_id)
with the let-chain and serde_json::Value variants, and delete the old commented
match example that references DEFAULT_ROLE and role to avoid clutter and stale
code.
- Around line 162-168: The code updates the in-memory DEFAULT_ROLE via
DEFAULT_ROLE.write().unwrap() before calling put_metadata, risking inconsistency
if persistence fails and risking panic on lock poisoning; change the order to
call await put_metadata(&metadata, &tenant_id) first and only on Ok update
DEFAULT_ROLE, and replace write().unwrap() with proper error handling (e.g.,
.write().map_err(|e| …) or .write().expect("failed to acquire DEFAULT_ROLE write
lock") or propagate a mapped error) when inserting the tenant key
(tenant_id.map_or(DEFAULT_TENANT, |v| v).to_owned()) and value Some(name) to
ensure no panic and consistency between store and memory.

In `@src/metastore/metastores/object_store_metastore.rs`:
- Around line 487-510: In get_dashboards, the code currently overwrites the
HashMap entry for a tenant each time a new user's dashboards are inserted
(dashboards.insert(tenant.to_owned(), dashboard_bytes)), so change it to
accumulate/merge dashboard_bytes into the existing Vec for that tenant: ensure
you normalize empty tenant to DEFAULT_TENANT before using it, then use
dashboards.entry(tenant.to_owned()).or_insert_with(Vec::new) and extend that Vec
with dashboard_bytes so all users' dashboards for the tenant are preserved
instead of replaced.
- Around line 1229-1244: The code that builds streams from resp.common_prefixes
(using flat_map(|path| path.parts()) and mapping to strings) doesn't remove the
tenant prefix when tenant_id is provided; adjust the logic in the same block
that constructs streams (referencing resp, common_prefixes, path.parts(),
streams, and tenant_id) so that if tenant_id.is_some() you first strip the
"{tenant_id}/" prefix from each path (or only take the last non-empty path
component after splitting) before mapping to a stream name, then apply the
existing filters; this ensures the tenant segment is not included in the
resulting stream names.

In `@src/parseable/mod.rs`:
- Around line 1184-1191: The current list_tenants method silently returns None
when tenants.as_ref().read() fails, masking poisoned lock errors; change the
read() call to unwrap/expect (e.g., self.tenants.as_ref().read().expect("tenants
lock poisoned")) so the function panics consistently on lock poisoning and then
return the cloned Vec<String> (remove the None return branch), preserving the
existing clone and Some(...) return behavior.

In `@src/prism/logstream/mod.rs`:
- Around line 256-260: In get_datasets (around get_tenant_id_from_key and the
call to PARSEABLE.streams.list), remove the debug log call
tracing::warn!(get_datasets_streams=?self.streams); so the method no longer
emits debug/warn output; simply keep the tenant lookup and streams population
logic (self.streams = PARSEABLE.streams.list(&tenant_id)) and delete the
tracing::warn! line.
- Around line 66-70: Remove the debug tracing statements in
src/prism/logstream/mod.rs by deleting the three tracing::warn! calls ("starting
dataset info", "got info", and "got schema") that surround the lines assigning
let info = info?; and let schema = schema?; so production code no longer
contains those temporary debug logs; keep the info and schema assignments intact
and ensure compilation (no unused import of tracing) after removal.
- Around line 115-116: The debug logging call inside stats::get_current_stats
(used here as get_current_stats) should be removed to avoid noisy debug output;
open the get_current_stats implementation and delete any tracing::debug! /
println! / log debug statements (or guard them behind a feature flag if needed),
ensuring the function still returns the stats value unchanged and that
tracing::warn!("starting stats") and the let stats =
stats::get_current_stats(...) call remain as-is.
- Around line 338-348: In build_dataset_response remove the temporary debug
traces: delete the tracing::warn!("got counts"); and
tracing::warn!(prism_logstream_res=?res); lines so the function only constructs
and returns PrismDatasetResponse (res) without emitting development logs; ensure
no other residual debug-only tracing remains in the build_dataset_response code
path.
- Around line 297-303: In process_stream (function process_stream) remove the
debug/tracing logs: delete the tracing::warn!("not authorized for datasets") and
the tracing::warn!("unable to load stream {stream} for tenant {tenant_id:?}")
calls so the function no longer emits those debug warnings; preserve the
existing control flow (the return Ok(None) and the check_or_load_stream call and
its conditional behavior) but eliminate the two tracing::warn invocations (or
replace them with trace-level logging if you prefer quieter diagnostics) to
satisfy the "remove debug logging" request.

In `@src/tenants/mod.rs`:
- Around line 51-55: TenantOverview currently duplicates suspended_services
(stored separately and inside StorageMetadata.meta), causing state divergence
when suspend_service/resume_service update only the HashSet; remove the
duplication by deleting the suspended_services field from TenantOverview and
update insert_tenant to store only meta (no cloned suspensions), then update
suspend_service and resume_service to mutate
StorageMetadata.meta.suspended_services (or helper methods on TenantOverview
that forward to meta) so get_tenants persists the correct suspension state;
update any accessors that relied on the old suspended_services field to read
from meta.suspended_services instead.

In `@src/users/dashboards.rs`:
- Around line 255-267: The code path using dashboards.get_mut(tenant) skips
creation when the tenant key is missing, causing silent no-ops; change to obtain
a mutable bucket with dashboards.entry(tenant).or_default() (or equivalent) so a
Vec is created when missing, then perform the duplicate check using that bucket,
call self.save_dashboard(dashboard, tenant_id).await? and push the dashboard
into the bucket; ensure you still return Err(DashboardError::Metadata(...)) on
duplicate and Ok(()) on success.

In `@src/utils/mod.rs`:
- Around line 79-85: The get_tenant_id_from_request function currently calls
to_str().unwrap() which can panic on non-UTF8 header values; change it to handle
the Result returned by to_str() safely (e.g., use map_ok/ok or match) and return
None when to_str() fails instead of unwrapping, preserving the existing
signature; update the branch that extracts tenant_value to call
tenant_value.to_str().ok().map(|s| s.to_owned()) or equivalent error-safe logic
so malformed header bytes do not cause a panic.
♻️ Duplicate comments (29)
src/users/filters.rs (1)

128-133: Filter silently dropped when tenant bucket doesn't exist.

This is a duplicate of the past review comment. The update method only modifies existing tenant buckets via get_mut(). If the tenant entry doesn't exist in the map (e.g., when adding the first filter for a new tenant after a server restart), the filter will be silently dropped without insertion.

🐛 Proposed fix using entry API
     pub async fn update(&self, filter: &Filter, tenant_id: &Option<String>) {
         let mut s = self.0.write().await;
-        if let Some(filters) = s.get_mut(tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v)) {
-            filters.retain(|f| f.filter_id != filter.filter_id);
-            filters.push(filter.clone());
-        }
+        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v.as_str());
+        let filters = s.entry(tenant.to_owned()).or_default();
+        filters.retain(|f| f.filter_id != filter.filter_id);
+        filters.push(filter.clone());
     }
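
A minimal, standalone sketch of the same entry-API pattern, assuming a plain HashMap in place of the async-locked store. The Filter struct, the upsert_filter name, and the DEFAULT_TENANT value here are placeholders for illustration, not the project's definitions:

```rust
use std::collections::HashMap;

// Placeholder for the real `Filter` type; only the id matters for this sketch.
#[derive(Clone, Debug, PartialEq)]
pub struct Filter {
    pub filter_id: String,
}

// Placeholder value; the real constant lives in the Parseable codebase.
pub const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Upsert a filter into its tenant bucket, creating the bucket when missing.
pub fn upsert_filter(
    store: &mut HashMap<String, Vec<Filter>>,
    filter: &Filter,
    tenant_id: &Option<String>,
) {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    // `entry(..).or_default()` inserts an empty Vec for a new tenant,
    // so the first filter for that tenant is not dropped.
    let filters = store.entry(tenant.to_owned()).or_default();
    // Replace any existing filter with the same id.
    filters.retain(|f| f.filter_id != filter.filter_id);
    filters.push(filter.clone());
}
```

With get_mut(), the first call for a new tenant would be a no-op; with the entry API, the bucket is created on demand and repeated calls with the same filter_id keep a single entry.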
src/correlation.rs (3)

140-144: Correlation silently not added to memory when tenant bucket doesn't exist.

This is a duplicate of the past review comment. In create(), if the tenant bucket doesn't exist in the in-memory map, the correlation is persisted to metastore but not added to the in-memory cache.

🐛 Proposed fix using entry API
         let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
         // Update in memory
-        if let Some(corrs) = self.write().await.get_mut(tenant) {
-            corrs.insert(correlation.id.to_owned(), correlation.clone());
-        }
+        self.write()
+            .await
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(correlation.id.to_owned(), correlation.clone());

176-183: Same issue: update() silently fails when tenant bucket doesn't exist.

This is a duplicate of the past review comment. Apply the same fix using the entry API.

🐛 Proposed fix
         let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
         // Update in memory
-        if let Some(corrs) = self.write().await.get_mut(tenant) {
-            corrs.insert(
-                updated_correlation.id.to_owned(),
-                updated_correlation.clone(),
-            );
-        }
+        self.write()
+            .await
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(updated_correlation.id.to_owned(), updated_correlation.clone());

204-211: Critical bug: remove operates on wrong map level, corrupts correlation store.

This is a duplicate of the past review comment. Line 211 calls self.write().await.remove(&correlation.id) which removes an entry from the outer HashMap<String, CorrelationMap> using correlation.id as the key. This is incorrect—it should remove the correlation from the inner CorrelationMap for the specific tenant. As written, this could delete an unrelated tenant's data (if a tenant_id happens to match a correlation_id) or silently fail.

🐛 Proposed fix
         // Delete from storage
         PARSEABLE
             .metastore
             .delete_correlation(&correlation, tenant_id)
             .await?;

         // Delete from memory
-        self.write().await.remove(&correlation.id);
+        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v.as_str());
+        if let Some(corrs) = self.write().await.get_mut(tenant) {
+            corrs.remove(&correlation.id);
+        }

         Ok(())
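
To illustrate why the map level matters, here is a small sketch that assumes the store is a two-level HashMap (tenant, then correlation id). The String payload and the remove_correlation helper are hypothetical stand-ins:

```rust
use std::collections::HashMap;

// Placeholder inner map: correlation id -> payload.
pub type CorrelationMap = HashMap<String, String>;

/// Remove a correlation from one tenant's inner map only.
/// Returns the removed payload, or None if the tenant or id is unknown.
pub fn remove_correlation(
    store: &mut HashMap<String, CorrelationMap>,
    tenant: &str,
    correlation_id: &str,
) -> Option<String> {
    // Look up the tenant bucket first, then remove inside it.
    // The outer map is never modified here.
    store.get_mut(tenant)?.remove(correlation_id)
}
```

Calling remove on the outer map with a correlation id would either do nothing or drop a whole tenant bucket whose name happens to equal that id; scoping the removal to the inner map avoids both outcomes.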
src/query/stream_schema_provider.rs (2)

529-534: Reduce logging level from warn to debug or trace.

This logging runs on every table scan and will flood production logs. This appears to be debug instrumentation.

♻️ Suggested fix
-        tracing::warn!(
+        tracing::debug!(
             "entered scan with\ntenant- {:?}\nschema- {:?}\nstream- {}",
             self.tenant_id,
             self.schema,
             self.stream
         );

639-648: Same issues: commented tenant URL code, unwrap, and noisy logging.

Lines 639-643 have commented-out tenant-aware URL logic (same pattern as hot tier). Line 645 has warn! logging that should be trace!. Line 648 has ObjectStoreUrl::parse(...).unwrap() that can panic.

♻️ Combined fix
-        // let object_store_url = if let Some(tenant_id) = self.tenant_id.as_ref() {
-        //     glob_storage.store_url().join(tenant_id).unwrap()
-        // } else {
-        //     glob_storage.store_url()
-        // };
         let object_store_url = glob_storage.store_url();
-        tracing::warn!(object_store_url=?object_store_url);
+        tracing::trace!(object_store_url=?object_store_url);
         self.create_parquet_physical_plan(
             &mut execution_plans,
-            ObjectStoreUrl::parse(object_store_url).unwrap(),
+            ObjectStoreUrl::parse(&object_store_url)
+                .map_err(|e| DataFusionError::Plan(format!("Invalid object store URL: {e}")))?,
src/hottier.rs (3)

794-800: create_pstats_hot_tier should use INTERNAL_STREAM_HOT_TIER_SIZE_BYTES like pmeta.

Both pstats and pmeta are internal streams, but pstats allocates MIN_STREAM_HOT_TIER_SIZE_BYTES (10 GiB) while pmeta allocates INTERNAL_STREAM_HOT_TIER_SIZE_BYTES (10 MiB). For consistency, pstats should use the internal stream constant.

♻️ Proposed fix
                     let mut stream_hot_tier = StreamHotTier {
                         version: Some(CURRENT_HOT_TIER_VERSION.to_string()),
-                        size: MIN_STREAM_HOT_TIER_SIZE_BYTES,
+                        size: INTERNAL_STREAM_HOT_TIER_SIZE_BYTES,
                         used_size: 0,
-                        available_size: MIN_STREAM_HOT_TIER_SIZE_BYTES,
+                        available_size: INTERNAL_STREAM_HOT_TIER_SIZE_BYTES,
                         oldest_date_time_entry: None,
                     };

105-116: Logic error: stream/tenant exclusion uses incorrect boolean logic.

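The condition at lines 108-109 chains stream != current_stream and tenant_id != *current_tenant_id with &&, so a stream is counted only when both its name and its tenant differ from the current ones. The intent is to skip only the entry whose stream AND tenant both match the current ones. As written, other streams in the current tenant, and same-named streams in other tenants, are incorrectly skipped.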

🐛 Proposed fix
             for stream in PARSEABLE.streams.list(&tenant_id) {
                 if self.check_stream_hot_tier_exists(&stream, &tenant_id)
-                    && stream != current_stream
-                    && tenant_id != *current_tenant_id
+                    && !(stream == current_stream && tenant_id == *current_tenant_id)
                 {
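
The difference between the two conditions can be checked in isolation. This sketch leaves out the hot tier existence check and uses plain string tenants; both function names are made up for the comparison:

```rust
/// The comparison as written in the PR: counts a stream only when
/// BOTH the stream name and the tenant differ from the current pair.
pub fn counted_as_written(stream: &str, tenant: &str, cur_stream: &str, cur_tenant: &str) -> bool {
    stream != cur_stream && tenant != cur_tenant
}

/// The proposed comparison: skips only the exact (stream, tenant) pair.
/// By De Morgan, this equals `stream != cur_stream || tenant != cur_tenant`.
pub fn counted_as_proposed(stream: &str, tenant: &str, cur_stream: &str, cur_tenant: &str) -> bool {
    !(stream == cur_stream && tenant == cur_tenant)
}
```

For a different stream in the same tenant, the first function returns false while the second returns true, which is the case the review comment is concerned with.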

595-603: Avoid panics in check_stream_hot_tier_exists.

hot_tier_file_path(...).unwrap() can crash on path conversion errors. This is an existence check and should be best-effort, returning false on errors.

🐛 Proposed fix
     pub fn check_stream_hot_tier_exists(&self, stream: &str, tenant_id: &Option<String>) -> bool {
-        let path = self.hot_tier_file_path(stream, tenant_id).unwrap();
-        PathBuf::from(path.to_string()).exists()
+        match self.hot_tier_file_path(stream, tenant_id) {
+            Ok(path) => PathBuf::from(path.to_string()).exists(),
+            Err(err) => {
+                warn!("Failed to resolve hot tier path for stream={stream} tenant={tenant_id:?}: {err}");
+                false
+            }
+        }
     }
src/handlers/http/modal/ingest/mod.rs (1)

27-37: Private fields lack accessors for external usage.

The SyncRole struct has private fields (privileges, tenant_id), but if ingestor_role.rs needs to access these fields directly, either make them public or add getter methods.

♻️ Option 1: Make fields public
 #[derive(Deserialize, Serialize)]
 pub struct SyncRole {
-    privileges: Vec<DefaultPrivilege>,
-    tenant_id: String
+    pub privileges: Vec<DefaultPrivilege>,
+    pub tenant_id: String,
 }
♻️ Option 2: Add getter methods
 impl SyncRole {
     pub fn new(privileges: Vec<DefaultPrivilege>, tenant_id: String) -> Self {
         Self { privileges, tenant_id }
     }
+
+    pub fn privileges(&self) -> &[DefaultPrivilege] {
+        &self.privileges
+    }
+
+    pub fn tenant_id(&self) -> &str {
+        &self.tenant_id
+    }
 }
src/handlers/http/modal/query/querier_logstream.rs (1)

73-75: Critical: delete_stream still lacks tenant context.

The delete_stream call doesn't include tenant_id, while all surrounding operations (stream existence check, local directory cleanup, hot tier deletion, in-memory cleanup) properly use tenant context. Storage paths are tenant-scoped, so deleting only by stream_name could delete data belonging to other tenants.

This needs the same tenant-scoping treatment as other operations in this function.

src/alerts/alert_types.rs (1)

91-123: Auth credential extraction remains incomplete.

This is a known work-in-progress issue. The complex logic to find an admin user for the tenant (lines 91-105) identifies a user but the credential extraction (lines 107-118) always returns None for both Native and OAuth user types. This means execute_alert_query will receive None for auth_token, which may cause remote alert queries in Prism mode to fail authentication.

src/handlers/http/middleware.rs (2)

167-177: Handle potential panic from HeaderValue::from_str().unwrap().

If tenant_id contains characters not valid in HTTP headers (e.g., control characters), HeaderValue::from_str() will return an error and unwrap() will panic. Consider handling this gracefully.


304-315: Security: Consider stricter handling when tenant doesn't exist.

The check_suspension function returns Authorized when:

  1. No tenant header is present (line 314)
  2. Tenant doesn't exist in TENANT_METADATA (line 310-312)

This could allow requests to bypass tenant-level controls. The empty else branch with the "tenant does not exist" comment suggests this needs further handling.

src/handlers/http/modal/ingest/ingestor_role.rs (1)

46-52: Inverted tenant validation logic still present.

The condition on line 48 checks if the request tenant matches the payload tenant (req_tenant.eq(&sync_req.tenant_id)), but the error message indicates this should block cross-tenant operations. The second condition should check for a mismatch (ne) to prevent non-super-admin users from creating roles for other tenants.

🐛 Proposed fix
-    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant.eq(&sync_req.tenant_id)) {
+    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant.ne(&sync_req.tenant_id)) {
         return Err(RoleError::Anyhow(anyhow::Error::msg(
             "non super-admin user trying to create role for another tenant",
         )));
     }
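
The intended rule can be written as a small predicate. This is only a sketch: is_cross_tenant_violation is a hypothetical helper, and the DEFAULT_TENANT value is a placeholder for the project's constant:

```rust
// Placeholder value; the real constant lives in the Parseable codebase.
pub const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Returns true when a non-super-admin request targets a tenant
/// other than its own, which is the case the handler should reject.
pub fn is_cross_tenant_violation(req_tenant: &str, target_tenant: &str) -> bool {
    // Requests from the default tenant are treated as super-admin here.
    req_tenant != DEFAULT_TENANT && req_tenant != target_tenant
}
```

With the equality check from the PR, a same-tenant request would be rejected and a cross-tenant request would pass; the inequality check reverses both outcomes.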
src/handlers/http/modal/ingest/ingestor_rbac.rs (1)

51-57: Inverted tenant validation logic.

Same issue as in ingestor_role.rs: the condition checks for equality when it should check for inequality. This will reject valid same-tenant operations instead of cross-tenant operations.

🐛 Proposed fix
         if req_tenant.ne(DEFAULT_TENANT)
-            && (req_tenant.eq(user.tenant.as_ref().map_or(DEFAULT_TENANT, |v| v)))
+            && (req_tenant.ne(user.tenant.as_ref().map_or(DEFAULT_TENANT, |v| v)))
         {
             return Err(RBACError::Anyhow(anyhow::Error::msg(
                 "non super-admin user trying to create user for another tenant",
             )));
         }
src/migration/mod.rs (2)

168-170: Early return on list_streams failure prevents migration of remaining tenants.

If list_streams fails for one tenant, the ? operator causes an early return, skipping migration for all subsequent tenants. This should handle errors per-tenant to allow other tenants to proceed.

🐛 Suggested fix
     for tenant_id in tenants {
         // Get all stream names
-        let stream_names = PARSEABLE.metastore.list_streams(&tenant_id).await?;
+        let stream_names = match PARSEABLE.metastore.list_streams(&tenant_id).await {
+            Ok(names) => names,
+            Err(e) => {
+                warn!("Failed to list streams for tenant {:?}: {:?}", tenant_id, e);
+                continue;
+            }
+        };
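
The same per-tenant error handling can be sketched without the metastore. Here the listing step is a synchronous closure standing in for the async list_streams call, and migrate_all is a hypothetical name:

```rust
/// Visit every tenant, collecting stream names from the ones that list
/// successfully and recording the ones that fail instead of aborting.
pub fn migrate_all<F>(
    tenants: &[Option<String>],
    list_streams: F,
) -> (Vec<String>, Vec<Option<String>>)
where
    F: Fn(&Option<String>) -> Result<Vec<String>, String>,
{
    let mut migrated = Vec::new();
    let mut failed = Vec::new();
    for tenant_id in tenants {
        let stream_names = match list_streams(tenant_id) {
            Ok(names) => names,
            Err(e) => {
                // Log and move on so later tenants are still processed.
                eprintln!("Failed to list streams for tenant {tenant_id:?}: {e}");
                failed.push(tenant_id.clone());
                continue;
            }
        };
        migrated.extend(stream_names);
    }
    (migrated, failed)
}
```

Returning the failed tenants alongside the results is a design choice of this sketch; it lets the caller decide whether a partial migration should be reported as an error.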

490-498: Use PARSEABLE_METADATA_FILE_NAME constant instead of hardcoded string.

Lines 495 and 497 use the hardcoded string ".parseable.json" while other parts of the codebase use the PARSEABLE_METADATA_FILE_NAME constant. This inconsistency could cause path mismatches.

🐛 Suggested fix
     let path = if let Some(tenant) = tenant_id.as_ref() {
         config
             .options
             .staging_dir()
             .join(tenant)
-            .join(".parseable.json")
+            .join(PARSEABLE_METADATA_FILE_NAME)
     } else {
-        config.options.staging_dir().join(".parseable.json")
+        config.options.staging_dir().join(PARSEABLE_METADATA_FILE_NAME)
     };
src/handlers/http/cluster/mod.rs (1)

376-381: Tenant context not propagated to ingestors during stream synchronization.

The tenant_id parameter is commented out (line 380). Stream sync requests to ingestors won't include tenant context, breaking tenant isolation.

src/handlers/http/modal/query/querier_rbac.rs (1)

79-79: User created without tenant association.

User::new_basic is called with None for the tenant parameter, but tenant_id is available from the request. New users won't be associated with their tenant.

-    let (user, password) = user::User::new_basic(username.clone(), None);
+    let (user, password) = user::User::new_basic(username.clone(), tenant_id.clone());
src/handlers/http/oidc.rs (2)

130-159: Cluster sync should check HTTP response status.

The for_each_live_node call sends login sync requests but doesn't verify that the remote node actually accepted them. reqwest::send() succeeds even on 4xx/5xx responses, so failed syncs go undetected.

Proposed fix
                     async move {
-                        INTRA_CLUSTER_CLIENT
+                        let resp = INTRA_CLUSTER_CLIENT
                             .post(url)
                             .header(header::AUTHORIZATION, node.token)
                             .header(header::CONTENT_TYPE, "application/json")
                             .json(&json!(
                                 {
                                     "sessionCookie": _session,
                                     "user": _user,
                                     "expiry": EXPIRY_DURATION
                                 }
                             ))
                             .send()
                             .await?;
+                        resp.error_for_status()?;
                         Ok::<(), anyhow::Error>(())
                     }

325-326: Address incomplete tenant implementation in OIDC user creation.

New OAuth users are created without tenant association (None passed to put_user), despite tenant_id being extracted at line 229 and used for existing user lookups. This breaks multi-tenant isolation for new OIDC users.

Replace None with the extracted tenant_id, or add a TODO with tracking issue if intentional WIP:

-        // LET TENANT BE NONE FOR NOW!!!
-        (None, roles) => put_user(&user_id, roles, user_info, bearer, None).await?,
+        (None, roles) => put_user(&user_id, roles, user_info, bearer, tenant_id.clone()).await?,
src/prism/logstream/mod.rs (1)

71-73: Critical: Stats are hardcoded to default values.

The actual stats result is commented out and replaced with QueriedStats::default(). This breaks the stats functionality entirely and appears to be debugging code left in.

-    // let stats = stats?;
-    let stats = QueriedStats::default();
-    tracing::warn!("got FAKE stats");
+    let stats = stats?;
src/query/mod.rs (1)

191-205: Log schema registration errors instead of silently ignoring them.

Lines 191 and 201 silently drop errors from catalog.register_schema(...) with let _ = .... Failed schema registration at startup may cause query failures that are difficult to diagnose.

Suggested fix
                     // tracing::warn!("registering_schema- {schema_provider:?}\nwith tenant- {t}");
-                    let _ = catalog.register_schema(t, schema_provider);
+                    if let Err(e) = catalog.register_schema(t, schema_provider) {
+                        tracing::error!("Failed to register schema for tenant {}: {:?}", t, e);
+                    }
                     // tracing::warn!("result=> {r:?}");
src/alerts/mod.rs (1)

1246-1255: Don't silently drop alert updates when the tenant bucket doesn't exist.

update() only inserts if get_mut(tenant) returns Some(_). For first-time tenants or races with initialization, this silently loses writes.

Proposed fix
     async fn update(&self, alert: &dyn AlertTrait) {
-        let tenant = alert.get_tenant_id().as_ref().map_or(DEFAULT_TENANT, |v| v);
-        if let Some(alerts) = self.alerts.write().await.get_mut(tenant) {
-            alerts.insert(*alert.get_id(), alert.clone_box());
-        }
+        let tenant = alert
+            .get_tenant_id()
+            .as_ref()
+            .map_or(DEFAULT_TENANT, |v| v)
+            .to_owned();
+        let mut guard = self.alerts.write().await;
+        guard
+            .entry(tenant)
+            .or_default()
+            .insert(*alert.get_id(), alert.clone_box());
     }
src/parseable/mod.rs (3)

1057-1076: TOCTOU race condition persists.

This issue was previously flagged. The existence check (line 1066) uses a read lock, but the insertion (line 1071) acquires a separate write lock. Another thread could add the same tenant between these operations.
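
One way to close this window is to do the check and the insert under a single write lock. The sketch below assumes the tenant list is a Vec<String> behind a std RwLock; the Tenants struct and its methods are illustrative, not the project's types:

```rust
use std::sync::RwLock;

pub struct Tenants {
    inner: RwLock<Vec<String>>,
}

impl Tenants {
    pub fn new() -> Self {
        Self { inner: RwLock::new(Vec::new()) }
    }

    /// Check for an existing tenant and insert while holding one write
    /// guard, so no other thread can add the same tenant in between.
    pub fn add_tenant(&self, tenant_id: &str) -> Result<(), String> {
        let mut guard = self
            .inner
            .write()
            .map_err(|_| "tenants lock poisoned".to_string())?;
        if guard.iter().any(|t| t == tenant_id) {
            return Err(format!("tenant {tenant_id} already exists"));
        }
        guard.push(tenant_id.to_owned());
        Ok(())
    }

    /// Snapshot of the current tenant ids.
    pub fn list(&self) -> Vec<String> {
        self.inner.read().map(|g| g.clone()).unwrap_or_default()
    }
}
```

The write lock is held slightly longer than in the read-then-write version, but the duplicate check and the insertion become one atomic step.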


1116-1144: Incomplete tenant deletion persists.

This issue was previously flagged. The method removes the tenant from TENANT_METADATA, users, and roles, but does not remove it from self.tenants. This leaves list_tenants() returning the deleted tenant.


1146-1182: Incomplete logic and silent lock failure persist.

This issue was previously flagged:

  1. Empty else if !is_multi_tenant { } block at lines 1168-1169
  2. Returns Ok(None) on write lock failure at lines 1179-1180 instead of propagating the error
src/metastore/metastore_traits.rs (1)

108-112: tenant_id parameter unused in implementation.

This issue was previously flagged. The tenant_id parameter is accepted in the trait signature but the implementation in object_store_metastore.rs (line 328) calls alert_state_json_path(*alert_id) without tenant context, breaking tenant isolation for alert state storage.

Comment on lines +51 to +55
pub fn insert_tenant(&self, tenant_id: String, meta: StorageMetadata) {
let suspensions = meta.suspended_services.clone().unwrap_or_default();
self.tenants
.insert(tenant_id, TenantOverview { suspended_services: suspensions, meta });
}

⚠️ Potential issue | 🟠 Major

Potential data duplication in tenant storage.

suspended_services is stored both in TenantOverview.suspended_services and within TenantOverview.meta.suspended_services. When suspend_service or resume_service is called, only the HashSet is updated, leaving meta.suspended_services stale.

If get_tenants() returns meta for persistence, the suspension state changes won't be reflected.

🐛 Option 1: Sync meta when suspending/resuming
 pub fn suspend_service(&self, tenant_id: &str, service: Service) {
     if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
         tenant.suspended_services.insert(service.clone());
+        // Keep meta in sync
+        let services = tenant.meta.suspended_services.get_or_insert_with(HashSet::new);
+        services.insert(service);
     }
 }
♻️ Option 2: Remove duplication, derive from meta
 pub struct TenantOverview {
-    suspended_services: HashSet<Service>,
     meta: StorageMetadata,
 }
 
+// Access suspended_services through meta
+impl TenantOverview {
+    fn suspended_services(&self) -> Option<&HashSet<Service>> {
+        // Returning Option avoids handing out a reference to a temporary empty set.
+        self.meta.suspended_services.as_ref()
+    }
+}
🤖 Prompt for AI Agents
In `@src/tenants/mod.rs` around lines 51 - 55, TenantOverview currently duplicates
suspended_services (stored separately and inside StorageMetadata.meta), causing
state divergence when suspend_service/resume_service update only the HashSet;
remove the duplication by deleting the suspended_services field from
TenantOverview and update insert_tenant to store only meta (no cloned
suspensions), then update suspend_service and resume_service to mutate
StorageMetadata.meta.suspended_services (or helper methods on TenantOverview
that forward to meta) so get_tenants persists the correct suspension state;
update any accessors that relied on the old suspended_services field to read
from meta.suspended_services instead.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (16)
src/handlers/airplane.rs (2)

106-124: Flight get_schema lacks tenant context extraction.

The tenant_id is hardcoded to &None. In the HTTP query handler (see src/handlers/http/query.rs), the tenant is extracted via get_tenant_id_from_request(&req). For consistent multi-tenancy support, the Flight service should extract tenant context from gRPC request metadata.

Consider extracting tenant from request metadata:

// Example: extract tenant from gRPC metadata
let tenant_id = req.metadata()
    .get("x-p-tenant-id")
    .and_then(|v| v.to_str().ok())
    .map(String::from);

126-248: Propagate tenant context throughout do_get instead of hardcoding &None.

Multiple calls within do_get use hardcoded &None for tenant_id:

  • Line 202: user_auth_for_datasets(&permissions, &streams, &None)
  • Line 209: execute(query, false, &None)
  • Line 238: PARSEABLE.get_or_create_stream(&stream_name, &None).clear()

For consistent multi-tenancy, extract the tenant at the start of do_get and propagate it:

Suggested approach
 async fn do_get(&self, req: Request<Ticket>) -> Result<Response<Self::DoGetStream>, Status> {
     let key = extract_session_key(req.metadata())
         .map_err(|e| Status::unauthenticated(e.to_string()))?;
+    
+    // Extract tenant from request metadata
+    let tenant_id: Option<String> = req.metadata()
+        .get("x-p-tenant-id")
+        .and_then(|v| v.to_str().ok())
+        .map(String::from);

     // ... later in the function ...
     
-    user_auth_for_datasets(&permissions, &streams, &None)
+    user_auth_for_datasets(&permissions, &streams, &tenant_id)
     
-    let (records, _) = execute(query, false, &None)
+    let (records, _) = execute(query, false, &tenant_id)
     
-    PARSEABLE.get_or_create_stream(&stream_name, &None).clear();
+    PARSEABLE.get_or_create_stream(&stream_name, &tenant_id).clear();
src/alerts/alerts_utils.rs (1)

130-153: auth_token parameter is received but never used.

The function accepts auth_token: Option<String> but line 148 passes None to send_query_request instead of the received token. This means remote alert queries in Prism mode will always lack authentication, likely causing failures.

Additionally, there's a type mismatch: send_query_request expects Option<HeaderMap> (per the relevant snippet), but auth_token is Option<String>.

🐛 Suggested fix

Change the parameter type to Option<HeaderMap> and pass it through to send_query_request:

 async fn execute_remote_query(
-    auth_token: Option<String>,
+    auth_token: Option<HeaderMap>,
     query: &str,
     time_range: &TimeRange,
 ) -> Result<AlertQueryResult, AlertError> {
     // ...
-    let (result_value, _) = send_query_request(None,&query_request)
+    let (result_value, _) = send_query_request(auth_token, &query_request)
         .await
         .map_err(|err| AlertError::CustomError(format!("Failed to send query request: {err}")))?;

This also requires updating execute_alert_query to pass the correct type.

src/storage/store_metadata.rs (1)

298-320: Missing directory creation for tenant-specific staging path.

When tenant_id is provided, the code constructs a path under a tenant subdirectory (Line 306), but doesn't ensure the parent directory exists before attempting to write. This will fail with io::ErrorKind::NotFound if the tenant directory hasn't been created.

Compare with src/migration/mod.rs (lines 484-505) which has the same pattern but also doesn't create the directory. Consider adding directory creation:

🐛 Proposed fix
 pub fn put_staging_metadata(meta: &StorageMetadata, tenant_id: &Option<String>) -> io::Result<()> {
     let mut staging_metadata = meta.clone();
     staging_metadata.server_mode = PARSEABLE.options.mode;
     staging_metadata.staging = PARSEABLE.options.staging_dir().to_path_buf();
     let path = if let Some(tenant_id) = tenant_id.as_ref() {
-        PARSEABLE
+        let tenant_dir = PARSEABLE
             .options
             .staging_dir()
-            .join(tenant_id)
-            .join(PARSEABLE_METADATA_FILE_NAME)
+            .join(tenant_id);
+        std::fs::create_dir_all(&tenant_dir)?;
+        tenant_dir.join(PARSEABLE_METADATA_FILE_NAME)
     } else {
         PARSEABLE
             .options
             .staging_dir()
             .join(PARSEABLE_METADATA_FILE_NAME)
     };
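
A standalone sketch of the create-then-write order, using only std::fs. The write_staging_metadata helper and its parameters are hypothetical; the file name matches the ".parseable.json" string quoted elsewhere in this review:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

pub const METADATA_FILE_NAME: &str = ".parseable.json";

/// Write metadata under `<staging>/<tenant>/` when a tenant is set,
/// or directly under `<staging>/` otherwise.
pub fn write_staging_metadata(
    staging_dir: &Path,
    tenant_id: &Option<String>,
    contents: &[u8],
) -> io::Result<PathBuf> {
    let dir = match tenant_id {
        Some(tenant) => staging_dir.join(tenant),
        None => staging_dir.to_path_buf(),
    };
    // `create_dir_all` succeeds if the directory already exists, so it is
    // safe to call before every write.
    fs::create_dir_all(&dir)?;
    let path = dir.join(METADATA_FILE_NAME);
    fs::write(&path, contents)?;
    Ok(path)
}
```

Without the create_dir_all call, the first write for a new tenant would fail with a NotFound error because the tenant subdirectory does not exist yet.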
src/handlers/http/modal/ingest/ingestor_rbac.rs (2)

106-116: Roles lookup should be tenant-scoped to match other operations in these handlers.

The role existence checks at lines 109 and 153 use roles().get(r), which queries a global roles map. However, all other operations in both add_roles_to_user and remove_roles_from_user are tenant-aware: user validation, role updates, and metadata operations all use tenant_id. This inconsistency may allow roles from one tenant to be validated/applied to another.

Consider using a tenant-scoped role lookup function instead of the global roles() map, or document why role validation is intentionally global while role application is tenant-scoped.


197-221: Ingest sync handler doesn't generate a new password and writes metadata prematurely.

Line 206 calls put_staging_metadata() before reading the password hash (lines 207-219), and the metadata hasn't been modified at that point. More critically, the function clones the existing user.password_hash instead of generating a new password—compare with the working implementations in rbac.rs (line 180) and querier_rbac.rs (line 313) which both call user::Basic::gen_new_password(). The sync version should generate a new password, update metadata with the new hash, and then persist the metadata.

src/catalog/mod.rs (1)

556-569: Use conditional logic to exclude empty tenant_id from path, matching patterns elsewhere in the codebase.

The partition_path function uses map_or("", |v| v) to extract the root, passing an empty string to from_iter when tenant_id is None. This differs from similar path-building functions like mttr_json_path() and alert_json_path() at lines 1240-1246 and 1205-1214 of src/storage/object_storage.rs, which use conditional logic to exclude the tenant segment entirely when absent.

Update partition_path to follow the established pattern:

Suggested approach
pub fn partition_path(
    stream: &str,
    lower_bound: DateTime<Utc>,
    upper_bound: DateTime<Utc>,
    tenant_id: &Option<String>,
) -> RelativePathBuf {
    let lower = lower_bound.date_naive().format("%Y-%m-%d").to_string();
    let upper = upper_bound.date_naive().format("%Y-%m-%d").to_string();
    // A single-day range uses date=<day>; a multi-day range uses date=<lower>:<upper>.
    let date_segment = if lower == upper {
        format!("date={lower}")
    } else {
        format!("date={lower}:{upper}")
    };
    // Include the tenant segment only when a tenant is set.
    match tenant_id.as_deref() {
        Some(tenant) => RelativePathBuf::from_iter([tenant, stream, date_segment.as_str()]),
        None => RelativePathBuf::from_iter([stream, date_segment.as_str()]),
    }
}
src/parseable/streams.rs (1)

1200-1270: Tests won't compile due to signature change.

Multiple test functions call Stream::new with 4 arguments, but the signature now requires 5 (adding the tenant_id: &Option<String> parameter). At least 8 test functions are affected (beyond the 4 shown below), totaling 14+ Stream::new calls that need updating.

Update all test calls to include the tenant_id parameter. For tests that don't have a tenant, pass &None::<String>:

Example fix for test_staging_new_with_valid_stream
         let staging = Stream::new(
             options.clone(),
             stream_name,
             LogStreamMetadata::default(),
             None,
+            &None::<String>,
         );
src/alerts/mod.rs (2)

1039-1140: Avoid holding self.alerts.write().await across .await in load() (startup deadlock/latency risk).

load() takes a write lock (Line 1043) and then awaits migrations and channel sends (Line 1064-1127). Even if this is “startup-only”, it can still block other alert operations and is an easy footgun later.

Refactor suggestion: parse/migrate alerts into a local Vec<(tenant_id, Box<dyn AlertTrait>, should_start_task)> without holding the lock; then:

  1. insert into self.alerts under a short write lock, and
  2. send AlertTask::Create outside the lock.
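
The two steps above might look roughly like the following. This is a minimal std-only sketch, not the PR's code: AlertRegistry, Alert and AlertTask are simplified stand-ins, and std::sync::RwLock plus an mpsc channel replace the async lock and channel used in src/alerts/mod.rs.

```rust
use std::collections::HashMap;
use std::sync::mpsc::Sender;
use std::sync::RwLock;

// Simplified stand-ins for the real alert types (assumptions for this sketch).
#[derive(Clone, Debug, PartialEq)]
pub struct Alert {
    pub id: u64,
    pub tenant: String,
    pub should_start_task: bool,
}

#[derive(Debug, PartialEq)]
pub enum AlertTask {
    Create(u64),
}

#[derive(Default)]
pub struct AlertRegistry {
    // tenant -> (alert id -> alert)
    pub alerts: RwLock<HashMap<String, HashMap<u64, Alert>>>,
}

impl AlertRegistry {
    /// `parsed` is assumed to have been parsed and migrated with no lock held.
    pub fn load(&self, parsed: Vec<Alert>, tx: &Sender<AlertTask>) {
        // Step 1: insert everything under one short write lock.
        {
            let mut guard = self.alerts.write().expect("alerts lock poisoned");
            for alert in &parsed {
                guard
                    .entry(alert.tenant.clone())
                    .or_default()
                    .insert(alert.id, alert.clone());
            }
        } // the write guard is dropped here

        // Step 2: notify the task runner with no lock held.
        for alert in parsed.iter().filter(|a| a.should_start_task) {
            // A send error only means the receiver is gone; ignored in this sketch.
            let _ = tx.send(AlertTask::Create(alert.id));
        }
    }
}
```

The inner block scopes the guard so that nothing slow can run while the map is locked.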

733-759: Pass tenant context to alert query parsing functions to ensure correct schema resolution in multi-tenant setups.

Alert queries are parsed without setting the tenant's default_schema, unlike the HTTP query path (line 122–126 in src/handlers/http/query.rs), which explicitly configures it. This affects:

  • get_number_of_agg_exprs() / get_aggregate_projection() in src/alerts/mod.rs (validation)
  • execute_local_query() / execute_remote_query() in src/alerts/alerts_utils.rs (execution)

In multi-tenant mode, unqualified table names like FROM "stream" may resolve incorrectly or fail if the default schema differs from the tenant schema. The validate() method in src/alerts/alert_types.rs has self.tenant_id available (line 39) but doesn't pass it to the parsing functions.

Suggestion: modify parsing functions to accept tenant_id: &Option<String> and set session_state.config_mut().options_mut().catalog.default_schema before calling create_logical_plan(), matching the HTTP handler pattern.

src/storage/object_storage.rs (1)

896-951: Warn-level logging in hot loop is likely too noisy.

tracing::warn!(process_parquet_files_path=?path); (Line 936) and other warn logs around per-file operations will spam logs under normal ingestion. Suggest dropping to trace!/debug! or gating behind a feature.

Also applies to: 953-995

src/parseable/mod.rs (2)

210-264: Reduce warn-level logs and confirm tenant_id validation.

check_or_load_stream() logs at warn on normal control flow (Line 254, 258). This will be noisy at scale; consider debug!/trace!.

Also, tenant ids are used to partition in-memory streams and (elsewhere) object-store paths; please ensure tenant ids are validated (no /, .., etc.) at creation/extraction time.
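
One possible shape for such a check is an allowlist, sketched below. The function name, the accepted character set and the length limit are all assumptions for illustration; the PR does not define them.

```rust
/// Hypothetical tenant id validator. The allowlist (ASCII letters, digits,
/// '-' and '_') and the length limit are assumptions, not rules from the PR.
pub fn is_valid_tenant_id(id: &str) -> bool {
    const MAX_LEN: usize = 64; // illustrative limit

    // An allowlist rejects '/', '\\', '.', whitespace and control characters
    // in one pass, so "..", "a/b" and similar traversal attempts all fail.
    !id.is_empty()
        && id.len() <= MAX_LEN
        && id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}
```

Running this once, where the tenant id is first extracted or created, would keep unsafe values out of both the in-memory maps and the object-store paths.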


462-537: Bug risk: missing tenant header when syncing internal streams to ingestors.

create_internal_stream_if_not_exists() creates internal streams per tenant (Line 464-491), but sync_streams_with_ingestors(...) calls don’t include the "tenant" header (Line 508-533). Since request handlers derive tenant via get_tenant_id_from_request() (header "tenant"), this likely causes ingestors to create/sync these internal streams under the default tenant instead.

Proposed fix
             let mut header_map = HeaderMap::new();
             header_map.insert(
                 HeaderName::from_str(STREAM_TYPE_KEY).unwrap(),
                 HeaderValue::from_str(&StreamType::Internal.to_string()).unwrap(),
             );
             header_map.insert(CONTENT_TYPE, HeaderValue::from_static("application/json"));
+            if let Some(t) = tenant_id.as_deref() {
+                header_map.insert(
+                    HeaderName::from_static("tenant"),
+                    HeaderValue::from_str(t).map_err(|e| StreamError::Anyhow(e.into()))?,
+                );
+            }

             // Sync only the streams that were created successfully
src/metastore/metastores/object_store_metastore.rs (3)

342-390: The tenant_id parameter is unused in put_alert_state path construction.

Line 352 calls alert_state_json_path(id) without passing tenant_id, mirroring the same issue in get_alert_state_entry. This causes all tenants to write to the same alert state file, breaking tenant isolation and causing data corruption across tenants.

🐛 Suggested fix

Update alert_state_json_path in object_storage.rs to accept tenant_id and modify this call:

-        let path = alert_state_json_path(id);
+        let path = alert_state_json_path(id, tenant_id);

541-566: get_chats is not tenant-aware, inconsistent with other similar methods.

Unlike get_dashboards, get_filters, get_correlations, etc., the get_chats method does not iterate over tenants. It reads from a single USERS_ROOT_DIR path without tenant prefix, causing all tenants' chats to be mixed together. This breaks tenant isolation.

🐛 Suggested fix to add tenant awareness
     async fn get_chats(&self) -> Result<DashMap<String, Vec<Bytes>>, MetastoreError> {
         let all_user_chats = DashMap::new();
-
-        let users_dir = RelativePathBuf::from(USERS_ROOT_DIR);
-        for user in self.storage.list_dirs_relative(&users_dir).await? {
-            if user.starts_with(".") {
-                continue;
-            }
-            let mut chats = Vec::new();
-            let chats_path = users_dir.join(&user).join("chats");
-            let user_chats = self
-                .storage
-                .get_objects(
-                    Some(&chats_path),
-                    Box::new(|file_name| file_name.ends_with(".json")),
-                )
-                .await?;
-            for chat in user_chats {
-                chats.push(chat);
+        let base_paths = PARSEABLE.list_tenants().map_or(vec!["".into()], |v| v);
+        for tenant in base_paths {
+            let users_dir = RelativePathBuf::from_iter([&tenant, USERS_ROOT_DIR]);
+            for user in self.storage.list_dirs_relative(&users_dir).await? {
+                if user.starts_with(".") {
+                    continue;
+                }
+                let mut chats = Vec::new();
+                let chats_path = users_dir.join(&user).join("chats");
+                let user_chats = self
+                    .storage
+                    .get_objects(
+                        Some(&chats_path),
+                        Box::new(|file_name| file_name.ends_with(".json")),
+                    )
+                    .await?;
+                for chat in user_chats {
+                    chats.push(chat);
+                }
+                // Consider keying by tenant+user for proper isolation
+                all_user_chats.insert(user, chats);
             }
-
-            all_user_chats.insert(user, chats);
         }
-
         Ok(all_user_chats)
     }

Note: The return type DashMap<String, Vec<Bytes>> may also need to change to support tenant-keyed results like other methods.


852-870: Path construction issues in get_all_manifest_files when tenant_id is provided.

Two issues:

  1. Lines 852-858: The dates extraction filters out stream_name but not the tenant_id. When tenant_id is provided, path.parts() will include the tenant name as the first component, which will incorrectly be treated as a date.

  2. Lines 860-861: The date_path is constructed as format!("{}/{}", stream_name, &date) without including the tenant prefix, but the objects were listed under {tenant}/{stream_name}/.... This path mismatch will cause the second list_with_delimiter call to fail or return empty results.

🐛 Suggested fix
         let dates = resp
             .common_prefixes
             .iter()
             .flat_map(|path| path.parts())
-            .filter(|name| name.as_ref() != stream_name && name.as_ref() != STREAM_ROOT_DIRECTORY)
+            .filter(|name| {
+                let name_str = name.as_ref();
+                name_str != stream_name 
+                    && name_str != STREAM_ROOT_DIRECTORY
+                    && tenant_id.as_ref().map_or(true, |t| name_str != t.as_str())
+            })
             .map(|name| name.as_ref().to_string())
             .collect::<Vec<_>>();

         for date in dates {
-            let date_path = object_store::path::Path::from(format!("{}/{}", stream_name, &date));
+            let date_path = if let Some(tenant) = tenant_id {
+                object_store::path::Path::from(format!("{}/{}/{}", tenant, stream_name, &date))
+            } else {
+                object_store::path::Path::from(format!("{}/{}", stream_name, &date))
+            };
             let resp = self.storage.list_with_delimiter(Some(date_path)).await?;
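
The optional-tenant prefix logic could also be factored into one helper so that listing and lookup paths cannot drift apart. The sketch below is std-only and uses a hypothetical tenant_scoped_path function that returns a plain String rather than an object_store path.

```rust
/// Hypothetical helper: joins an optional tenant with further path segments,
/// omitting the tenant segment entirely when it is absent.
pub fn tenant_scoped_path(tenant_id: Option<&str>, segments: &[&str]) -> String {
    tenant_id
        .into_iter() // yields zero or one leading segment
        .chain(segments.iter().copied())
        .collect::<Vec<_>>()
        .join("/")
}
```

With a helper like this, the date_path above would be built from the same call whether or not a tenant is present.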
🤖 Fix all issues with AI agents
In `@src/handlers/http/cluster/mod.rs`:
- Around line 1890-1895: The insertion using
HeaderValue::from_str(&querier.token).unwrap() can panic on invalid header
characters; replace the unwrap with a fallible conversion
(reqwest::header::HeaderValue::try_from(&querier.token)) and handle the Result
instead of panicking: validate querier.token before inserting into map (the
AUTHORIZATION header), and on Err return or propagate an appropriate error
(e.g., map to an HTTP 400/BadRequest or use the enclosing function's error type)
so the code around map and querier.token safely handles invalid tokens.

In `@src/metastore/metastores/object_store_metastore.rs`:
- Around line 621-661: The code currently calls unwrap() on
filter_value.as_object() after calling migrate_v1_v2, which can panic if
migration yields a non-object; replace these unwraps with safe checks: after
filter_value = migrate_v1_v2(filter_value) do an if let Some(obj) =
filter_value.as_object() and then extract user_id, filter_id, stream_name via
obj.get(...).and_then(...); only proceed to build path with filter_path, call
to_bytes and storage.put_object when all three are Some, otherwise log/skip the
malformed migrated value (or return a controlled error) instead of unwrapping to
avoid panics in the loop that processes filter_bytes and affects methods like
migrate_v1_v2, storage.delete_object, to_bytes, and storage.put_object.

In `@src/parseable/streams.rs`:
- Around line 1091-1100: The contains method logs a warning whenever a tenant is
missing, which noisily floods logs; change the tracing::warn! call inside
contains to a lower level (tracing::debug! or tracing::trace!) so missing
tenants during normal checks aren’t noisy, keeping the same message/context
(tenant_id and stream_name); update the log invocation in the contains function
(where LOCK_EXPECT and DEFAULT_TENANT are used) accordingly.
- Around line 1055-1078: The get_or_create path contains several development
debug logs using tracing::warn! (the multi-line call logging
stream_name/tenant_id/metadata/options, the "creating new stream- {stream_name}"
call, and the "inserted stream in mem" call); replace these with either removed
statements or downgrade them to tracing::trace! to avoid polluting production
logs while keeping low-verbosity traces, keeping the rest of the logic (tenant
lookup, Stream::new, guard.entry(...).or_default().insert(...), and the returned
stream) unchanged.

In `@src/query/mod.rs`:
- Around line 97-120: The add_tenant() flow registers tenant metadata with
TENANT_METADATA.insert_tenant(...) but never registers the tenant schema with
the query session; after calling TENANT_METADATA.insert_tenant(...) in
add_tenant(), call QUERY_SESSION.add_schema(tenant_id) (using the same
tenant_id) so the DataFusion catalog contains the new schema and runtime queries
for that tenant succeed.

In `@src/rbac/map.rs`:
- Line 303: The session expiration check in remove_expired_session is inverted:
sessions.retain(|(_, expiry)| expiry < &now) keeps expired sessions and drops
valid ones; update the predicate to retain only non-expired entries by using
expiry > &now (or expiry >= &now if you want to treat equal as valid) so
sessions.retain(|(_, expiry)| expiry > &now) preserves sessions that have not
yet expired.
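
The retain semantics behind the src/rbac/map.rs item can be checked in isolation. In this sketch, sessions are simplified to (key, expiry) pairs and u64 seconds stand in for the real timestamp type; both are assumptions.

```rust
/// Sessions are (key, expiry) pairs; `u64` seconds stand in for the real
/// timestamp type used in src/rbac/map.rs (an assumption for this sketch).
pub fn remove_expired(sessions: &mut Vec<(String, u64)>, now: u64) {
    // `Vec::retain` keeps the elements for which the closure returns true,
    // so the predicate must describe sessions that are still valid.
    sessions.retain(|(_, expiry)| *expiry > now);
}
```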
♻️ Duplicate comments (34)
src/handlers/http/cluster/mod.rs (4)

377-382: Tenant context not propagated to ingestors during stream sync.

The tenant_id parameter is commented out, meaning stream synchronization to ingestors lacks tenant context. This breaks tenant isolation in multi-tenant deployments. The past review flagged this issue and it remains unaddressed.


594-599: Tenant context not propagated during user creation sync.

The tenant_id parameter is commented out in sync_user_creation. While the User struct contains a tenant field that will be serialized, this was flagged in a past review - the calling code in querier_rbac.rs creates users with None tenant instead of using the extracted tenant_id.


655-658: Unused req parameter in sync_password_reset_with_ingestors.

The HttpRequest parameter is accepted but never used. Either remove it or use it to extract and propagate tenant context.


698-704: Unused req parameter in sync_role_update.

The HttpRequest parameter is accepted but not used. The tenant_id is already passed as a separate parameter, making req redundant.

src/tenants/mod.rs (1)

51-67: Data duplication causes state divergence (previously flagged).

suspended_services is stored both in TenantOverview.suspended_services and within TenantOverview.meta.suspended_services. When suspend_service or resume_service is called, only the HashSet is updated, leaving meta.suspended_services stale. Since get_tenants() returns meta for persistence, suspension state changes won't be persisted correctly.
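
One way to remove the divergence is to keep a single copy of the set inside meta and route every read and write through it. The sketch below uses simplified stand-in types; the field layout and the String service type are assumptions.

```rust
use std::collections::HashSet;

// Simplified stand-ins for the types in src/tenants/mod.rs (assumptions).
#[derive(Clone, Debug, Default)]
pub struct TenantMeta {
    pub suspended_services: HashSet<String>,
}

#[derive(Debug, Default)]
pub struct TenantOverview {
    // The only copy of the suspension state lives inside `meta`,
    // so whatever is persisted from `meta` is always current.
    pub meta: TenantMeta,
}

impl TenantOverview {
    pub fn suspend_service(&mut self, service: &str) {
        self.meta.suspended_services.insert(service.to_owned());
    }

    pub fn resume_service(&mut self, service: &str) {
        self.meta.suspended_services.remove(service);
    }

    pub fn is_suspended(&self, service: &str) -> bool {
        self.meta.suspended_services.contains(service)
    }
}
```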

src/alerts/alert_types.rs (1)

91-123: Auth credential extraction always returns None (previously flagged).

The complex logic to find an admin user for the tenant (lines 91-105) correctly identifies a user with admin privileges, but the credential extraction (lines 107-118) always returns None:

  • Native user: returns None with TODO comments
  • OAuth user: logs warning and returns None

This means execute_alert_query will always receive None for auth_token, which will cause remote alert queries in Prism mode to fail authentication.

src/handlers/http/middleware.rs (2)

167-177: Handle potential panic from HeaderValue::from_str().unwrap().

If tenant_id contains characters not valid in HTTP headers (e.g., control characters), HeaderValue::from_str() will return an error and unwrap() will panic.

♻️ Suggested fix
         let user_and_tenant_id = match get_user_and_tenant_from_request(req.request()) {
             Ok((uid, tid)) => {
-                req.headers_mut().insert(
-                    HeaderName::from_static("tenant"),
-                    HeaderValue::from_str(&tid).unwrap(),
-                );
+                if let Ok(header_val) = HeaderValue::from_str(&tid) {
+                    req.headers_mut().insert(
+                        HeaderName::from_static("tenant"),
+                        header_val,
+                    );
+                }
                 Ok((uid, tid))
             }
             Err(e) => Err(e),
         };

304-315: Security: Consider stricter handling when tenant doesn't exist.

The check_suspension function returns Authorized when the tenant doesn't exist in TENANT_METADATA (Lines 310-312). This could allow requests to bypass tenant-level controls. The empty else branch suggests this needs further handling.

♻️ Suggested approach
 pub fn check_suspension(req: &HttpRequest, action: Action) -> rbac::Response {
     if let Some(tenant) = req.headers().get("tenant")
         && let Ok(tenant) = tenant.to_str()
     {
-        if let Ok(Some(suspension)) = TENANT_METADATA.is_action_suspended(tenant, &action) {
-            return rbac::Response::Suspended(suspension);
-        } else {
-            // tenant does not exist
+        match TENANT_METADATA.is_action_suspended(tenant, &action) {
+            Ok(Some(suspension)) => return rbac::Response::Suspended(suspension),
+            Ok(None) => {} // Tenant exists, action not suspended
+            Err(_) => {
+                tracing::warn!(tenant = tenant, "Tenant not found in metadata");
+                // Consider returning UnAuthorized for unknown tenants
+            }
         }
     }
     rbac::Response::Authorized
 }
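
If unknown tenants should be rejected outright, the decision table could look like the sketch below. It is a fail-closed variant under stated assumptions: the Response enum is a cut-down copy, TenantNotFound is a hypothetical error type, and the lookup argument models the result of TENANT_METADATA.is_action_suspended with Err taken to mean that the tenant is missing.

```rust
// Cut-down copy of rbac::Response for illustration.
#[derive(Debug, PartialEq)]
pub enum Response {
    Authorized,
    Suspended(String),
    UnAuthorized,
}

/// Hypothetical error for a tenant that is missing from the metadata.
#[derive(Debug)]
pub struct TenantNotFound;

/// `lookup` is `None` when the request carries no tenant header; otherwise it
/// holds the (assumed) result of the suspension lookup for that tenant.
pub fn check_suspension(lookup: Option<Result<Option<String>, TenantNotFound>>) -> Response {
    match lookup {
        // No tenant header: nothing tenant-specific to enforce.
        None => Response::Authorized,
        // Tenant exists and the action is suspended.
        Some(Ok(Some(reason))) => Response::Suspended(reason),
        // Tenant exists and the action is allowed.
        Some(Ok(None)) => Response::Authorized,
        // Unknown tenant: fail closed instead of falling through.
        Some(Err(TenantNotFound)) => Response::UnAuthorized,
    }
}
```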
src/correlation.rs (3)

140-144: Correlation silently not added to memory when tenant bucket doesn't exist.

In create(), if the tenant bucket doesn't exist in the in-memory map, the correlation is persisted to metastore but not added to the in-memory cache. This could cause inconsistencies.

🐛 Proposed fix using entry API
         let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
         // Update in memory
-        if let Some(corrs) = self.write().await.get_mut(tenant) {
-            corrs.insert(correlation.id.to_owned(), correlation.clone());
-        }
+        self.write()
+            .await
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(correlation.id.to_owned(), correlation.clone());
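
The difference between the two insertion styles can be reproduced with plain std maps. In this sketch, CorrelationMap is simplified to a String-to-String map and both function names are illustrative.

```rust
use std::collections::HashMap;

// Simplified: correlation id -> payload (the real map holds richer values).
pub type CorrelationMap = HashMap<String, String>;

/// Mirrors the current behaviour: the write is dropped when the bucket is missing.
pub fn insert_if_bucket_exists(
    map: &mut HashMap<String, CorrelationMap>,
    tenant: &str,
    id: &str,
    value: &str,
) {
    if let Some(bucket) = map.get_mut(tenant) {
        bucket.insert(id.to_owned(), value.to_owned());
    }
}

/// Mirrors the proposed fix: the bucket is created on demand.
pub fn insert_with_entry(
    map: &mut HashMap<String, CorrelationMap>,
    tenant: &str,
    id: &str,
    value: &str,
) {
    map.entry(tenant.to_owned())
        .or_default()
        .insert(id.to_owned(), value.to_owned());
}
```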

176-183: Same issue: update() silently fails when tenant bucket doesn't exist.

Apply the same fix using the entry API to ensure the correlation is always inserted.

🐛 Proposed fix
         let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
         // Update in memory
-        if let Some(corrs) = self.write().await.get_mut(tenant) {
-            corrs.insert(
-                updated_correlation.id.to_owned(),
-                updated_correlation.clone(),
-            );
-        }
+        self.write()
+            .await
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(updated_correlation.id.to_owned(), updated_correlation.clone());

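205-211: Critical bug: remove operates on the wrong map level and never removes the correlation from its tenant's map.

Line 211 calls self.write().await.remove(&correlation.id), which removes an entry from the outer HashMap<String, CorrelationMap> using correlation.id as the key. The outer map is keyed by tenant, so this call normally matches nothing and the deleted correlation stays in memory; if an id ever coincided with a tenant key, that tenant's entire CorrelationMap would be dropped. The removal should target the inner CorrelationMap for the specific tenant.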

🐛 Proposed fix
         // Delete from storage
         PARSEABLE
             .metastore
             .delete_correlation(&correlation, tenant_id)
             .await?;

         // Delete from memory
-        self.write().await.remove(&correlation.id);
+        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v.as_str());
+        if let Some(corrs) = self.write().await.get_mut(tenant) {
+            corrs.remove(&correlation.id);
+        }

         Ok(())
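
A std-only sketch of removing from the inner map is shown below. TenantMap and remove_correlation are illustrative names, and the value type is simplified to String.

```rust
use std::collections::HashMap;

// tenant -> (correlation id -> payload); value type simplified for the sketch.
pub type TenantMap = HashMap<String, HashMap<String, String>>;

/// Removes one correlation from one tenant's bucket and leaves every other
/// bucket untouched. Returns the removed value, or `None` if the tenant or
/// the id is unknown.
pub fn remove_correlation(map: &mut TenantMap, tenant: &str, id: &str) -> Option<String> {
    map.get_mut(tenant)?.remove(id)
}
```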
src/handlers/http/modal/ingest/ingestor_rbac.rs (1)

40-67: Inverted tenant validation logic.

The condition at Lines 51-53 checks for equality when it should check for inequality. The error message indicates this should prevent cross-tenant operations, but the current logic prevents same-tenant operations instead.

-        if req_tenant.ne(DEFAULT_TENANT)
-            && (req_tenant.eq(user.tenant.as_ref().map_or(DEFAULT_TENANT, |v| v)))
+        if req_tenant.ne(DEFAULT_TENANT)
+            && (req_tenant.ne(user.tenant.as_ref().map_or(DEFAULT_TENANT, |v| v)))
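
Expressed as a standalone predicate, the intended check reads as follows. This is a sketch: is_cross_tenant is a hypothetical helper, and the DEFAULT_TENANT value is a placeholder rather than the constant's real value.

```rust
// Placeholder value for this sketch; the real constant lives in the codebase.
pub const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Returns true when a non-default tenant tries to act on a different tenant.
pub fn is_cross_tenant(req_tenant: &str, target_tenant: Option<&str>) -> bool {
    let target = target_tenant.unwrap_or(DEFAULT_TENANT);
    req_tenant != DEFAULT_TENANT && req_tenant != target
}
```

Same-tenant requests pass, cross-tenant requests are rejected, and requests from the default tenant are not restricted by this check.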
src/handlers/http/modal/query/querier_rbac.rs (3)

79-79: User created without tenant association.

Despite extracting tenant_id from the request at line 50 and using it for metadata operations, User::new_basic is still called with None for the tenant parameter. This creates users without tenant association, breaking multi-tenant isolation.

🐛 Proposed fix
-    let (user, password) = user::User::new_basic(username.clone(), None);
+    let (user, password) = user::User::new_basic(username.clone(), tenant_id.clone());

163-163: Tenant context missing in user deletion sync.

sync_user_deletion_with_ingestors(&userid) sends only userid to ingestors without tenant context. In multi-tenant deployments, ingestors may delete users from the wrong tenant if the same userid exists across tenants.


223-223: Tenant context missing in role sync operations.

The sync_users_with_roles_with_ingestors calls at lines 223 and 295 don't receive tenant_id. This follows the same pattern as the user deletion sync issue: ingestors won't know which tenant's user to update.

Also applies to: 295-295

src/query/stream_schema_provider.rs (4)

282-293: Same incomplete tenant implementation and unwrap issue in staging path.

This duplicates the pattern from the hot tier path—commented-out tenant logic and unwrap() on ObjectStoreUrl::parse().


529-534: Reduce logging level from warn to debug or trace.

This logging statement runs on every table scan and outputs schema/tenant/stream info. Using warn level will flood logs in production.

♻️ Suggested fix
-        tracing::warn!(
+        tracing::debug!(
             "entered scan with\ntenant- {:?}\nschema- {:?}\nstream- {}",
             self.tenant_id,
             self.schema,
             self.stream
         );

639-648: Incomplete tenant implementation and noisy logging in object store scan path.

The tenant-aware URL construction is commented out (lines 639-644), and tracing::warn! at line 645 will be noisy in production. The unwrap() at line 648 should use proper error handling since scan() returns Result.

🐛 Proposed fix
-        // let object_store_url = if let Some(tenant_id) = self.tenant_id.as_ref() {
-        //     glob_storage.store_url().join(tenant_id).unwrap()
-        // } else {
-        //     glob_storage.store_url()
-        // };
+        // TODO: Enable tenant-aware paths once object store supports multi-tenancy
         let object_store_url = glob_storage.store_url();
-        tracing::warn!(object_store_url=?object_store_url);
+        tracing::trace!(object_store_url=?object_store_url);
         self.create_parquet_physical_plan(
             &mut execution_plans,
-            ObjectStoreUrl::parse(object_store_url).unwrap(),
+            ObjectStoreUrl::parse(&object_store_url)
+                .map_err(|e| DataFusionError::Plan(format!("Invalid object store URL: {e}")))?,

224-232: Incomplete tenant-aware object store URL construction in hot tier and other execution paths.

The struct has a tenant_id field but the hot tier execution plan uses hardcoded "file:///" instead. This pattern repeats in other execution paths (lines 284-293, 631-648). Additionally, ObjectStoreUrl::parse(...).unwrap() should use proper error handling instead of panic-on-error.

  • Apply tenant-aware path construction consistently where self.tenant_id is available, or document if global object store isolation is intentional.
  • Replace .unwrap() with .map_err() to convert parsing errors into Result for proper error propagation.
src/handlers/http/modal/ingest/ingestor_role.rs (1)

46-52: Inverted tenant validation logic.

The condition req_tenant.ne(DEFAULT_TENANT) && (req_tenant.eq(&sync_req.tenant_id)) rejects requests when the request tenant matches the payload tenant. Based on the error message ("non super-admin user trying to create role for another tenant"), the second condition should check for a mismatch instead.

🐛 Proposed fix
-    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant.eq(&sync_req.tenant_id)) {
+    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant.ne(&sync_req.tenant_id)) {
         return Err(RoleError::Anyhow(anyhow::Error::msg(
             "non super-admin user trying to create role for another tenant",
         )));
     }
src/handlers/http/oidc.rs (3)

102-116: BasicAuth flow uses incorrect tenant lookup.

For SessionKey::BasicAuth, get_tenant_id_from_key() returns None because basic credentials have no pre-established session. This causes Users.get_user(&username, &tenant_id) at line 115 to default to DEFAULT_TENANT, breaking multi-tenant support for basic-auth users. Use get_tenant_id_from_request(&req) instead.

🐛 Proposed fix
-    let tenant_id = get_tenant_id_from_key(&session_key);
     match session_key {
         // We can exchange basic auth for session cookie
-        SessionKey::BasicAuth { username, password } => match Users.get_user(&username, &tenant_id)
-        {
+        SessionKey::BasicAuth { username, password } => {
+            let tenant_id = get_tenant_id_from_request(&req);
+            match Users.get_user(&username, &tenant_id)
+        {

130-160: Intra-cluster login sync should treat non-2xx as failure.

A reqwest send() call returns Ok for non-2xx responses, so the sync can silently fail. Consider using .error_for_status() and logging per-node failures.

🐛 Proposed fix
                         INTRA_CLUSTER_CLIENT
                             .post(url)
                             .header(header::AUTHORIZATION, node.token)
                             .header(header::CONTENT_TYPE, "application/json")
                             .json(&json!(
                                 {
                                     "sessionCookie": _session,
                                     "user": _user,
                                     "expiry": EXPIRY_DURATION
                                 }
                             ))
                             .send()
-                            .await?;
+                            .await?
+                            .error_for_status()?;
                         Ok::<(), anyhow::Error>(())

325-326: User created without tenant association.

The comment "LET TENANT BE NONE FOR NOW!!!" at line 325 indicates new OAuth users are intentionally created without tenant context, despite tenant_id being available. This creates a multi-tenant isolation gap.

Replace with a proper TODO comment with tracking reference, or pass tenant_id:

-        // LET TENANT BE NONE FOR NOW!!!
-        (None, roles) => put_user(&user_id, roles, user_info, bearer, None).await?,
+        // Associate the new OIDC user with the tenant extracted from the request
+        (None, roles) => put_user(&user_id, roles, user_info, bearer, tenant_id).await?,
src/handlers/http/role.rs (2)

154-169: Persist default-role metadata before mutating DEFAULT_ROLE (consistency).

Currently DEFAULT_ROLE is updated before put_metadata() (Line 162-168). If persistence fails, in-memory and storage diverge. Also, write().unwrap() can panic on poisoning.

This matches prior feedback.

Proposed fix
 pub async fn put_default(
     req: HttpRequest,
     name: web::Json<String>,
 ) -> Result<impl Responder, RoleError> {
     let name = name.into_inner();
     let tenant_id = get_tenant_id_from_request(&req);
     let mut metadata = get_metadata(&tenant_id).await?;
     metadata.default_role = Some(name.clone());
-    DEFAULT_ROLE.write().unwrap().insert(
-        tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v).to_owned(),
-        Some(name),
-    );
-    // *DEFAULT_ROLE.lock().unwrap() = Some(name);
     put_metadata(&metadata, &tenant_id).await?;
+    DEFAULT_ROLE
+        .write()
+        .expect("failed to acquire DEFAULT_ROLE write lock")
+        .insert(
+            tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v).to_owned(),
+            Some(name),
+        );
     Ok(HttpResponse::Ok().finish())
 }
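
The ordering argument can be exercised without the HTTP layer. In this sketch the persistence step is passed in as a closure so that a failure can be simulated; the function signature and the closure parameter are assumptions for illustration.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Persist first, then update the in-memory map. `persist` stands in for
/// put_metadata (an assumption); it receives the tenant and the role name.
pub fn put_default<E>(
    cache: &RwLock<HashMap<String, Option<String>>>,
    tenant: &str,
    role: String,
    persist: impl FnOnce(&str, &str) -> Result<(), E>,
) -> Result<(), E> {
    // On failure we return early and the in-memory map is left untouched.
    persist(tenant, &role)?;
    cache
        .write()
        .expect("failed to acquire DEFAULT_ROLE write lock")
        .insert(tenant.to_owned(), Some(role));
    Ok(())
}
```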

173-193: Remove commented-out block in get_default.

This matches prior feedback.

src/query/mod.rs (1)

173-218: Don’t ignore register_schema errors (and remove stale commented code).

let _ = catalog.register_schema(...) (Line 191, 201-204) silently drops failures; if registration fails, later queries can fail in surprising ways. Also, the large commented-out block (Line 207-215) should be removed.

This matches prior feedback.

src/alerts/mod.rs (1)

1245-1255: Ensure tenant buckets are created on update/write paths (don’t drop writes).

update(), the write-back in update_state(), and update_notification_state() only update if get_mut(tenant) returns Some(_) (Line 1247-1250, 1338-1341, 1381-1383). For first-time tenants / races with initialization, this loses updates.

This matches prior feedback.

Proposed fix pattern
-        let tenant = alert.get_tenant_id().as_ref().map_or(DEFAULT_TENANT, |v| v);
-        if let Some(alerts) = self.alerts.write().await.get_mut(tenant) {
-            alerts.insert(*alert.get_id(), alert.clone_box());
-        }
+        let tenant = alert.get_tenant_id().as_ref().map_or(DEFAULT_TENANT, |v| v);
+        let mut guard = self.alerts.write().await;
+        guard
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(*alert.get_id(), alert.clone_box());

Also applies to: 1334-1343, 1349-1387

src/parseable/mod.rs (4)

1056-1075: Fix TOCTOU race in add_tenant() (check+insert must be atomic).

This matches prior feedback.
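
A minimal sketch of an atomic check-and-insert follows. It assumes the tenant list can be modelled as a HashSet behind a std RwLock, which may not match the actual field type in src/parseable/mod.rs.

```rust
use std::collections::HashSet;
use std::sync::RwLock;

/// Adds a tenant, failing if it already exists. The `HashSet` model of the
/// tenant list is an assumption for this sketch.
pub fn add_tenant(tenants: &RwLock<HashSet<String>>, tenant_id: &str) -> Result<(), String> {
    // One write lock covers both the existence check and the insert,
    // so no other writer can slip in between the two.
    let mut guard = tenants.write().map_err(|e| e.to_string())?;
    // `HashSet::insert` returns false when the value was already present.
    if guard.insert(tenant_id.to_owned()) {
        Ok(())
    } else {
        Err(format!("tenant {tenant_id} already exists"))
    }
}
```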


1115-1143: delete_tenant() leaves self.tenants inconsistent (tenant still listed).

This matches prior feedback.


1145-1181: load_tenants() logic is incomplete and lock failures are swallowed.

  • The else if !is_multi_tenant {} branch (Line 1167-1168) does nothing, but the function-level comment says startup should fail when multi-tenant traces exist but the flag is off.
  • Returning Ok(None) on poisoned lock (Line 1175-1180) masks a fundamental failure (better to Err/panic). Based on learnings, fail fast on critical state persistence/locking issues.

This matches prior feedback.


1183-1190: Don’t silently return None on tenants lock failure.

This matches prior feedback.
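
Surfacing the lock failure instead of returning None might look like the sketch below. The Vec-based tenant list, the String error type and the function signature are assumptions; the real method may differ.

```rust
use std::sync::RwLock;

/// Lists tenants, reporting a poisoned lock as an error rather than as
/// "no tenants". Types are simplified for this sketch.
pub fn list_tenants(tenants: &RwLock<Vec<String>>) -> Result<Option<Vec<String>>, String> {
    let guard = tenants
        .read()
        .map_err(|_| "tenants lock poisoned".to_string())?;
    Ok(if guard.is_empty() {
        None
    } else {
        Some(guard.to_vec())
    })
}
```

Callers can then distinguish an empty tenant list from a broken one and decide whether to abort startup.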

src/metastore/metastores/object_store_metastore.rs (3)

323-340: The tenant_id parameter is unused in get_alert_state_entry.

Line 328 calls alert_state_json_path(*alert_id) without passing tenant_id. This breaks tenant isolation as all tenants would read from the same alert state path. The alert_state_json_path function should be updated to accept tenant_id and construct tenant-scoped paths, matching the pattern used by mttr_json_path.


487-510: Bug: Dashboard HashMap overwrites entries for each user within same tenant.

The loop at lines 492-506 iterates over users within a tenant, but line 504 always inserts with the same tenant key, overwriting the previous user's dashboards. Only the last user's dashboards will be retained per tenant.

🐛 Suggested fix to accumulate dashboards correctly
     async fn get_dashboards(&self) -> Result<HashMap<String, Vec<Bytes>>, MetastoreError> {
         let mut dashboards = HashMap::new();
         let base_paths = PARSEABLE.list_tenants().map_or(vec!["".into()], |v| v);
         for mut tenant in base_paths {
             let users_dir = RelativePathBuf::from_iter([&tenant, USERS_ROOT_DIR]);
+            let mut tenant_dashboards = Vec::new();
             for user in self.storage.list_dirs_relative(&users_dir).await? {
                 let dashboards_path = users_dir.join(&user).join("dashboards");
                 let dashboard_bytes = self
                     .storage
                     .get_objects(
                         Some(&dashboards_path),
                         Box::new(|file_name| file_name.ends_with(".json")),
                     )
                     .await?;
-                if tenant.eq(&mut "") {
-                    tenant.clone_from(&DEFAULT_TENANT.to_string());
-                }
-                dashboards.insert(tenant.to_owned(), dashboard_bytes);
+                tenant_dashboards.extend(dashboard_bytes);
             }
+            if tenant.eq(&mut "") {
+                tenant.clone_from(&DEFAULT_TENANT.to_string());
+            }
+            dashboards.insert(tenant, tenant_dashboards);
         }
         Ok(dashboards)
     }
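
The accumulate-versus-overwrite difference can be seen in a small std-only sketch. Here group_by_tenant is a hypothetical helper that takes a flat list of (tenant, one user's dashboards) pairs, with Vec<u8> standing in for Bytes.

```rust
use std::collections::HashMap;

/// Groups per-user dashboard lists by tenant. `Vec<u8>` stands in for `Bytes`.
pub fn group_by_tenant(per_user: Vec<(String, Vec<Vec<u8>>)>) -> HashMap<String, Vec<Vec<u8>>> {
    let mut dashboards: HashMap<String, Vec<Vec<u8>>> = HashMap::new();
    for (tenant, user_dashboards) in per_user {
        // `extend` appends to the tenant's list; calling `insert` here would
        // replace the dashboards collected from earlier users.
        dashboards.entry(tenant).or_default().extend(user_dashboards);
    }
    dashboards
}
```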

1229-1244: Missing tenant prefix filter in stream extraction.

When tenant_id is provided, list_with_delimiter returns paths with the full tenant prefix (e.g., "tenant/stream1/"). The flat_map(|path| path.parts()) extracts all path components including the tenant name. The filter at lines 1237-1243 doesn't exclude the tenant prefix, causing the tenant ID to be incorrectly added to the stream list.

🐛 Suggested fix
             let streams = resp
                 .common_prefixes
                 .iter()
                 .flat_map(|path| {
                     path.parts()
                 })
                 .map(|name| name.as_ref().to_string())
                 .filter(|name| {
                     name != PARSEABLE_ROOT_DIRECTORY
                         && name != USERS_ROOT_DIR
                         && name != SETTINGS_ROOT_DIRECTORY
                         && name != ALERTS_ROOT_DIRECTORY
+                        && tenant_id.as_ref().map_or(true, |t| name != t)
                 })
                 .collect::<Vec<_>>();
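
The effect of the extra filter can be checked on plain strings. This sketch models the listed prefixes as &str values and passes the reserved directory names in as a parameter; stream_names is a hypothetical helper, not code from the PR.

```rust
/// Extracts stream names from listed prefixes such as "acme/stream1/",
/// dropping reserved directories and the tenant segment itself.
pub fn stream_names(prefixes: &[&str], tenant_id: Option<&str>, reserved: &[&str]) -> Vec<String> {
    prefixes
        .iter()
        .copied()
        .flat_map(|prefix| prefix.split('/'))
        .filter(|part| !part.is_empty())
        .filter(|part| !reserved.contains(part))
        // Without this filter the tenant name is reported as a stream.
        .filter(|part| tenant_id != Some(*part))
        .map(str::to_owned)
        .collect()
}
```

Note that filtering by name also hides a stream that happens to share the tenant's name; stripping the known prefix before splitting would avoid that edge case.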
🧹 Nitpick comments (22)
src/handlers/airplane.rs (1)

186-198: Consider a more informative error message for suspended users.

The handling of rbac::Response::Suspended is correct for completeness, but the error message "Suspended" is terse. Consider providing more context:

-            rbac::Response::Suspended(_) => return Err(Status::permission_denied("Suspended")),
+            rbac::Response::Suspended(msg) => {
+                return Err(Status::permission_denied(format!("User account suspended: {msg}")));
+            }
src/prism/home/mod.rs (1)

101-110: Consider extracting tenant_id without the reference-to-temporary pattern.

The pattern &get_tenant_id_from_key(key) relies on temporary lifetime extension. While valid, it's more conventional to bind the owned value first:

let tenant_id = get_tenant_id_from_key(key);
// then pass &tenant_id where needed

Also note that get_alerts_summary(key) on line 108 doesn't receive tenant_id. If alerts should be tenant-scoped for consistency with streams, this may need updating.

src/rbac/map.rs (3)

147-155: Remove commented-out code blocks.

Multiple sections of commented-out code remain throughout this file (lines 127-131, 147-155, 446-450, 465-467, 485-487, 496-498, 525-529). These make the code harder to read and should be removed before merging.


62-69: Unnecessary braces around function bodies.

The extra braces in users() and roles() functions serve no purpose and can be removed.

♻️ Suggested cleanup
 pub fn users() -> RwLockReadGuard<'static, Users> {
-    {
-        USERS
-            .get()
-            .expect("map is set")
-            .read()
-            .expect("not poisoned")
-    }
+    USERS
+        .get()
+        .expect("map is set")
+        .read()
+        .expect("not poisoned")
 }

274-290: Remove commented debug logging that previously leaked credentials.

The debug logging was correctly disabled to prevent credential leakage, but the commented-out code should be removed entirely rather than left in place.

♻️ Suggested cleanup
     pub fn remove_user(&mut self, username: &str, tenant_id: &str) {
-        // tracing::warn!("removing user- {username}, tenant_id- {tenant_id}");
-        // tracing::warn!("active sessions- {:?}", self.active_sessions);
-        // tracing::warn!("user sessions- {:?}", self.user_sessions);
         let sessions = if let Some(tenant_sessions) = self.user_sessions.get_mut(tenant_id) {
-            // tracing::warn!("found session for tenant- {tenant_id}");
             tenant_sessions.remove(username)
         } else {
-            // tracing::warn!("not found session for tenant- {tenant_id}");
             None
         };
         if let Some(sessions) = sessions {
-            // tracing::warn!("found active sessions for user {username}-   {sessions:?}");
             sessions.into_iter().for_each(|(key, _)| {
                 self.active_sessions.remove(&key);
             })
         }
     }
src/rbac/mod.rs (1)

119-125: Simplify match with if let or remove empty arm.

The None => {} arm is unnecessary and can be simplified.

♻️ Suggested cleanup
     fn remove_user(&mut self, userid: &str, tenant_id: &str) {
-        match mut_users().get_mut(tenant_id) {
-            Some(users) => {
-                users.remove(userid);
-            }
-            None => {}
+        if let Some(users) = mut_users().get_mut(tenant_id) {
+            users.remove(userid);
         }
     }
src/handlers/http/query.rs (1)

118-125: Duplicate call to get_tenant_id_from_request.

get_tenant_id_from_request(&req) is called at line 118 and again at line 120. Extract it once and reuse.

♻️ Suggested cleanup
+    let tenant_id = get_tenant_id_from_request(&req);
     // check or load streams in memory
-    create_streams_for_distributed(tables.clone(), &get_tenant_id_from_request(&req)).await?;
-
-    let tenant_id = get_tenant_id_from_request(&req);
+    create_streams_for_distributed(tables.clone(), &tenant_id).await?;
     session_state
         .config_mut()
src/handlers/livetail.rs (1)

119-122: Livetail doesn't propagate tenant context.

get_stream is called with &None for tenant_id, meaning livetail operates without tenant awareness. If multi-tenancy requires tenant isolation for livetail, the tenant context should be extracted from the request (similar to query handlers).

Is tenant-aware livetail required for this PR? If so, consider extracting tenant_id from the request metadata similar to how it's done in HTTP handlers.

src/handlers/http/modal/query/querier_role.rs (1)

57-61: Remove commented-out dead code.

Line 61 contains a commented-out line that is no longer needed since the tenant-scoped insertion on lines 57-60 replaces it.

♻️ Suggested cleanup
     mut_roles()
         .entry(tenant.to_owned())
         .or_default()
         .insert(name.clone(), privileges.clone());
-    // mut_roles().insert(name.clone(), privileges.clone());
src/tenants/mod.rs (2)

89-91: Redundant return keyword.

The explicit return on line 90 is unnecessary in Rust when it's the last expression in a branch.

♻️ Suggested fix
         } else {
-            return Err(TenantNotFound(tenant_id.to_owned()));
+            Err(TenantNotFound(tenant_id.to_owned()))
         }

106-153: Remove large block of commented-out code.

This 48-line commented block adds noise and should be removed. If this code is intended for future use, consider tracking it in an issue instead.

src/catalog/mod.rs (1)

460-460: Debug logging at warn level should be reduced.

This tracing::warn! appears to be debug/development logging. Consider changing to trace! or debug! level, or removing it before merge.

-    tracing::warn!("manifest path_url= {path_url}");
+    tracing::debug!("manifest path_url= {path_url}");
src/parseable/streams.rs (1)

1031-1033: Remove commented-out code.

Multiple blocks of commented-out code remain from the refactoring. These should be removed to improve code clarity.

Also applies to: 1066-1069, 1088-1088, 1128-1135, 1179-1185

src/handlers/http/role.rs (1)

41-89: Tenant propagation looks consistent; consider avoiding tenant_id shadowing + validate tenant header.

This handler does the right thing by persisting metadata before updating mut_roles() (Line 54-60). Two nits:

  • Shadowing tenant_id from Option<String> to &str (Line 56) is a bit error-prone; consider tenant_key.
  • get_tenant_id_from_request() (used on Line 47) currently calls unwrap() on the header's UTF-8 conversion, per the snippet, which is a crash vector. It also needs tenant-id validation (no /, .., etc.), since tenant ids are used as storage path prefixes elsewhere.
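
One way to approach the validation mentioned above is an allow-list check, sketched here. The function name `is_valid_tenant_id`, the permitted character set, and the 64-character cap are all assumptions for illustration; the PR does not define any of them.

```rust
/// Hypothetical validator: accepts only IDs that are safe to use as a
/// single storage path component.
pub fn is_valid_tenant_id(id: &str) -> bool {
    // Assumed upper bound on length; adjust to whatever the project decides.
    const MAX_LEN: usize = 64;

    !id.is_empty()
        && id.len() <= MAX_LEN
        // Allow-list: ASCII letters, digits, '-' and '_'. This rejects '/',
        // '.', and therefore ".." and any multi-component path.
        && id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}
```

An allow-list is used instead of rejecting specific sequences so that new problematic characters do not need to be enumerated one by one.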
src/query/mod.rs (3)

76-121: Drop commented-out legacy QUERY_SESSION + consider lock choice for SessionContext.

  • The commented-out QUERY_SESSION (Line 76-78) should be removed before merge.
  • std::sync::RwLock is probably fine here since you don’t hold guards across .await, but it’s worth confirming this won’t become a contention point under query load.

280-376: Minor: avoid repeated get_ctx() calls inside Query::execute().

You can grab let ctx = QUERY_SESSION.get_ctx(); once and reuse it for execute_logical_plan, state(), and task_ctx() to avoid repeated lock+clone.


946-1021: Consider tenant-scoping metrics or drop unused tenant_id field.

PartitionedMetricMonitor stores tenant_id (Line 954-969) but check_if_last_stream() doesn’t use it; either wire it into metrics labels (if desired) or remove the field to keep intent clear.

src/storage/object_storage.rs (2)

1239-1246: Minor: mttr_json_path() can avoid &tenant double-ref.

RelativePathBuf::from_iter([&tenant, ...]) (Line 1242-1243) can just use tenant.as_str() for clarity.


1070-1087: Use conditional pattern to avoid empty string components in path construction.

schema_path() and stream_json_path() build tenant = "" via map_or("", |v| v) then pass it to RelativePathBuf::from_iter([tenant, ...]). While the relative-path crate normalizes empty string components, the safer and more explicit pattern—already used for alert_json_path() and mttr_json_path() in the same file—is to conditionally include the tenant only when Some.

Refactor to match the existing pattern:

Proposed fix
 pub fn schema_path(stream_name: &str, tenant_id: &Option<String>) -> RelativePathBuf {
-    let tenant = tenant_id.as_ref().map_or("", |v| v);
     if PARSEABLE.options.mode == Mode::Ingest {
         ...
-        RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        if let Some(tenant) = tenant_id.as_deref() {
+            RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        } else {
+            RelativePathBuf::from_iter([stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        }
     } else {
-        RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, SCHEMA_FILE_NAME])
+        if let Some(tenant) = tenant_id.as_deref() {
+            RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, SCHEMA_FILE_NAME])
+        } else {
+            RelativePathBuf::from_iter([stream_name, STREAM_ROOT_DIRECTORY, SCHEMA_FILE_NAME])
+        }
     }
 }

Also applies to: stream_json_path() (lines 1166–1182)

src/metastore/metastores/object_store_metastore.rs (2)

460-484: Consider using tenant_id parameter directly in put_llmconfig and delete_llmconfig.

Both methods accept tenant_id but rely entirely on obj.get_object_path() for the storage path. This creates an implicit contract that the object must be constructed with the correct tenant-prefixed path. While functional, this pattern is inconsistent with methods like put_keystone which explicitly construct paths using tenant_id.


1100-1105: Consider validating tenant_id before deletion.

The delete_tenant method accepts any &str including empty string. An empty tenant_id would create a path from "", potentially deleting more than intended depending on how delete_prefix handles empty paths.

🛡️ Suggested validation
     async fn delete_tenant(&self, tenant_id: &str) -> Result<(), MetastoreError> {
+        if tenant_id.is_empty() {
+            return Err(MetastoreError::Error {
+                status_code: StatusCode::BAD_REQUEST,
+                message: "tenant_id cannot be empty".into(),
+                flow: "delete_tenant".into(),
+            });
+        }
         self.storage
             .delete_prefix(&RelativePathBuf::from(tenant_id))
             .await
             .map_err(MetastoreError::ObjectStorageError)
     }
src/metastore/metastore_traits.rs (1)

171-182: Note: get_chats has different return type pattern.

Unlike other similar methods that return HashMap<String, Vec<Bytes>> for tenant-keyed results, get_chats returns DashMap<String, Vec<Bytes>> keyed by user, not tenant. This asymmetry is compounded by the implementation not iterating over tenants (see implementation file review). Consider whether get_chats should follow the same tenant-keyed pattern.

Comment on lines +621 to +661
for filter in filter_bytes {
    // deserialize into Value
    let mut filter_value =
        serde_json::from_slice::<serde_json::Value>(&filter)?;

    if let Some(meta) = filter_value.clone().as_object() {
        let version = meta.get("version").and_then(|version| version.as_str());

        if version == Some("v1") {
            // delete older version of the filter
            self.storage.delete_object(&filters_path).await?;

            filter_value = migrate_v1_v2(filter_value);
            let user_id = filter_value
                .as_object()
                .unwrap()
                .get("user_id")
                .and_then(|user_id| user_id.as_str());
            let filter_id = filter_value
                .as_object()
                .unwrap()
                .get("filter_id")
                .and_then(|filter_id| filter_id.as_str());
            let stream_name = filter_value
                .as_object()
                .unwrap()
                .get("stream_name")
                .and_then(|stream_name| stream_name.as_str());

            // if these values are present, create a new file
            if let (Some(user_id), Some(stream_name), Some(filter_id)) =
                (user_id, stream_name, filter_id)
            {
                let path = filter_path(
                    user_id,
                    stream_name,
                    &format!("{filter_id}.json"),
                );
                let filter_bytes = to_bytes(&filter_value);
                self.storage.put_object(&path, filter_bytes.clone()).await?;
            }

⚠️ Potential issue | 🟡 Minor

Potential panics from .unwrap() calls during filter migration.

Lines 636, 639, 642, 645, and 648 use .unwrap() on filter_value.as_object() results. While these are guarded by the outer if let Some(meta) = filter_value.clone().as_object() check, after migrate_v1_v2 transforms the value, there's no guarantee the result is still a valid object. If migration produces an unexpected structure, this will panic.

🛡️ Suggested defensive approach
                             filter_value = migrate_v1_v2(filter_value);
-                            let user_id = filter_value
-                                .as_object()
-                                .unwrap()
-                                .get("user_id")
-                                .and_then(|user_id| user_id.as_str());
-                            let filter_id = filter_value
-                                .as_object()
-                                .unwrap()
-                                .get("filter_id")
-                                .and_then(|filter_id| filter_id.as_str());
-                            let stream_name = filter_value
-                                .as_object()
-                                .unwrap()
-                                .get("stream_name")
-                                .and_then(|stream_name| stream_name.as_str());
+                            let migrated_obj = filter_value.as_object();
+                            let user_id = migrated_obj
+                                .and_then(|o| o.get("user_id"))
+                                .and_then(|v| v.as_str());
+                            let filter_id = migrated_obj
+                                .and_then(|o| o.get("filter_id"))
+                                .and_then(|v| v.as_str());
+                            let stream_name = migrated_obj
+                                .and_then(|o| o.get("stream_name"))
+                                .and_then(|v| v.as_str());
🤖 Prompt for AI Agents
In `@src/metastore/metastores/object_store_metastore.rs` around lines 621 - 661,
The code currently calls unwrap() on filter_value.as_object() after calling
migrate_v1_v2, which can panic if migration yields a non-object; replace these
unwraps with safe checks: after filter_value = migrate_v1_v2(filter_value) do an
if let Some(obj) = filter_value.as_object() and then extract user_id, filter_id,
stream_name via obj.get(...).and_then(...); only proceed to build path with
filter_path, call to_bytes and storage.put_object when all three are Some,
otherwise log/skip the malformed migrated value (or return a controlled error)
instead of unwrapping to avoid panics in the loop that processes filter_bytes
and affects methods like migrate_v1_v2, storage.delete_object, to_bytes, and
storage.put_object.

Comment on lines +97 to +120
impl InMemorySessionContext {
    pub fn get_ctx(&self) -> SessionContext {
        let ctx = self
            .session_context
            .read()
            .expect("SessionContext should be readable");
        ctx.clone()
    }

    pub fn add_schema(&self, tenant_id: &str) {
        self.session_context
            .write()
            .expect("SessionContext should be writeable")
            .catalog("datafusion")
            .expect("Default catalog should be available")
            .register_schema(
                tenant_id,
                Arc::new(GlobalSchemaProvider {
                    storage: PARSEABLE.storage().get_object_store(),
                    tenant_id: Some(tenant_id.to_owned()),
                }),
            )
            .expect("Should be able to register new schema");
    }

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Search for add_schema invocations and tenant-related functions
rg -n "add_schema|\.add_schema\(" --type rust -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 509


🏁 Script executed:

# Search for tenant addition/creation flows
rg -n "add_tenant|insert_tenant|create_tenant" --type rust -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 1327


🏁 Script executed:

# Find where QUERY_SESSION is used
rg -n "QUERY_SESSION\." --type rust -B 1 -A 1

Repository: parseablehq/parseable

Length of output: 2248


🏁 Script executed:

# Get the full add_tenant function
sed -n '1056,1080p' src/parseable/mod.rs

Repository: parseablehq/parseable

Length of output: 824


🏁 Script executed:

# Search for ALL invocations of add_schema (not just definition)
rg -n "\.add_schema\(" --type rust

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if add_schema is ever called anywhere
rg -c "add_schema" --type rust | grep -v "^0$" | head -20

Repository: parseablehq/parseable

Length of output: 83


🏁 Script executed:

# Find GlobalSchemaProvider implementation
rg -n "struct GlobalSchemaProvider|impl GlobalSchemaProvider" --type rust -A 20

Repository: parseablehq/parseable

Length of output: 1537


🏁 Script executed:

# Check if schema resolution is dynamic or requires pre-registration
rg -n "register_schema|tenant_id" --type rust | grep -E "(GlobalSchemaProvider|schema_provider)" -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 2609


🏁 Script executed:

# Search for how schemas are resolved for tenants in queries
rg -n "get_schema|resolve.*schema" --type rust -B 2 -A 2 | head -50

Repository: parseablehq/parseable

Length of output: 3040


🏁 Script executed:

# Get the initialization context around line 191 where register_schema is called
sed -n '150,210p' src/query/mod.rs

Repository: parseablehq/parseable

Length of output: 2219


🏁 Script executed:

# Check if there's a difference between how QUERY_SESSION is initialized vs QUERY_SESSION_STATE
rg -n "QUERY_SESSION\s*=" --type rust -B 5 -A 5

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Find where QUERY_SESSION is created/initialized
rg -n "lazy_static|QUERY_SESSION\s*:" --type rust -B 2 -A 10 | head -50

Repository: parseablehq/parseable

Length of output: 2824


🏁 Script executed:

# Search for static declarations of QUERY_SESSION
rg -n "^lazy_static|^static.*QUERY_SESSION" --type rust -B 1 -A 10

Repository: parseablehq/parseable

Length of output: 889


🏁 Script executed:

# Check if add_schema is called during app startup or initialization
rg -n "startup|init.*session|bootstrap" --type rust -B 2 -A 5 | grep -A 5 -B 5 "add_schema|QUERY_SESSION"

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Find all calls to add_tenant
rg -n "\.add_tenant\(" --type rust -B 3 -A 3

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if there's any dynamic schema registration after tenant creation
rg -n "add_schema|register_schema" --type rust -B 2 -A 2 | grep -A 5 -B 5 "add_tenant"

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Search more broadly for add_tenant calls with different patterns
rg "add_tenant" --type rust -n

Repository: parseablehq/parseable

Length of output: 113


🏁 Script executed:

# Look for HTTP endpoints that might handle tenant creation
rg -n "tenant.*create|create.*tenant|POST.*tenant" --type rust -B 2 -A 5

Repository: parseablehq/parseable

Length of output: 18934


🏁 Script executed:

# Check all files that reference add_tenant function
rg -l "add_tenant" --type rust

Repository: parseablehq/parseable

Length of output: 85


🏁 Script executed:

# Search for HTTP handler endpoints related to tenant management
rg -n "tenant" --type rust | grep -E "(post|put|get).*handler|route|endpoint" -i -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look for tenant routes in handlers
fd --type f -path "*/handlers/*" | xargs rg -l "tenant" --type rust

Repository: parseablehq/parseable

Length of output: 2131


🏁 Script executed:

# Search for where PARSEABLE.add_tenant might be called
rg -n "PARSEABLE\.add_tenant" --type rust

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Search directly for add_tenant method calls
rg "add_tenant" --type rust

Repository: parseablehq/parseable

Length of output: 108


🏁 Script executed:

# Check the tenants module to see if add_tenant is even exposed
cat src/tenants/mod.rs

Repository: parseablehq/parseable

Length of output: 5083


🏁 Script executed:

# Search in handlers directory for tenant creation endpoints
find src/handlers -type f -name "*.rs" | xargs rg -l "tenant" | head -20

Repository: parseablehq/parseable

Length of output: 717


🏁 Script executed:

# Look for tenant-related HTTP handlers with POST/PUT methods
rg -n "pub.*fn.*tenant|pub.*async.*fn.*tenant" --type rust | grep -v "get_tenant" | head -20

Repository: parseablehq/parseable

Length of output: 1985


🏁 Script executed:

# Check if add_tenant is called in modal handlers
rg -n "add_tenant\|POST\|tenant.*create" src/handlers/http/modal/ --type rust -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check the modal handlers for tenant endpoints
rg -n "tenant" src/handlers/http/modal/mod.rs --type rust -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 1571


🏁 Script executed:

# Look for Parseable.add_tenant usage
rg -n "\.add_tenant|fn add_tenant" src/parseable/mod.rs --type rust -B 5 -A 15

Repository: parseablehq/parseable

Length of output: 853


🏁 Script executed:

# Search for HTTP endpoint routes that might handle tenant creation
rg -n "web::post|web::put|web::get.*tenant" src/handlers/http/mod.rs --type rust -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check all HTTP routes/endpoints in handlers
rg -n "route|post|put|get" src/handlers/http/mod.rs --type rust | head -40

Repository: parseablehq/parseable

Length of output: 319


🏁 Script executed:

# Search for tenant-related HTTP endpoints more broadly
rg -n "tenant" src/handlers/http/mod.rs --type rust

Repository: parseablehq/parseable

Length of output: 280


🏁 Script executed:

# Check if add_tenant is even exposed as a public API endpoint
rg -rn "\/tenant|tenant.*endpoint" src/handlers --type rust

Repository: parseablehq/parseable

Length of output: 156


🏁 Script executed:

# Search for tenant-related HTTP endpoints
rg -n "tenant" src/handlers/http/mod.rs --type rust -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 431


🏁 Script executed:

# Look for where the HTTP routes are configured
rg -n "scope|service|configure" src/handlers/http/mod.rs --type rust | head -30

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if there's a dedicated tenant handler file
find src/handlers -name "*tenant*" -type f

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Get full view of add_tenant function implementation
sed -n '1056,1074p' src/parseable/mod.rs

Repository: parseablehq/parseable

Length of output: 708


🏁 Script executed:

# Check if there's anywhere that calls add_schema after tenant creation
rg -n "add_schema|QUERY_SESSION\.add" --type rust

Repository: parseablehq/parseable

Length of output: 248


🏁 Script executed:

# Verify the initialization flow - check what PARSEABLE.list_tenants returns
rg -n "list_tenants" --type rust -B 2 -A 5

Repository: parseablehq/parseable

Length of output: 13841


🏁 Script executed:

# Check how GlobalSchemaProvider is used in query execution
rg -n "GlobalSchemaProvider\|get_schema\|table_names" src/query/stream_schema_provider.rs --type rust | head -30

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look at the DataFusion catalog/schema resolution
rg -n "catalog\|schema.*resolve" src/query/mod.rs --type rust -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 47


Add schema registration when tenants are created at runtime.

The add_tenant() function registers tenant metadata but does not register the tenant's schema with QUERY_SESSION. When a new tenant is added at runtime, queries for that tenant will fail because the catalog will have no schema for it. Call QUERY_SESSION.add_schema(tenant_id) after TENANT_METADATA.insert_tenant() in the add_tenant() function.

🤖 Prompt for AI Agents
In `@src/query/mod.rs` around lines 97 - 120, The add_tenant() flow registers
tenant metadata with TENANT_METADATA.insert_tenant(...) but never registers the
tenant schema with the query session; after calling
TENANT_METADATA.insert_tenant(...) in add_tenant(), call
QUERY_SESSION.add_schema(tenant_id) (using the same tenant_id) so the DataFusion
catalog contains the new schema and runtime queries for that tenant succeed.
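
The ordering described above can be illustrated with a simplified stand-in. This sketch does not use DataFusion or the real `TENANT_METADATA` and `QUERY_SESSION` types; `TenantRegistry` and its fields are hypothetical and only show metadata insertion and schema registration happening in the same call.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::RwLock;

/// Hypothetical stand-in for tenant metadata plus the query catalog's
/// list of registered schemas.
#[derive(Default)]
pub struct TenantRegistry {
    metadata: RwLock<HashMap<String, String>>,
    schemas: RwLock<HashSet<String>>,
}

impl TenantRegistry {
    /// Adds a tenant and registers its schema in the same step, so a
    /// tenant created at runtime is immediately queryable.
    pub fn add_tenant(&self, tenant_id: &str, meta: &str) -> Result<(), String> {
        let mut metadata = self.metadata.write().map_err(|e| e.to_string())?;
        if metadata.contains_key(tenant_id) {
            return Err(format!("tenant {tenant_id} already exists"));
        }
        metadata.insert(tenant_id.to_owned(), meta.to_owned());

        // The step the review says is missing: register the schema right
        // after the metadata insert.
        self.schemas
            .write()
            .map_err(|e| e.to_string())?
            .insert(tenant_id.to_owned());
        Ok(())
    }

    /// Reports whether a schema has been registered for the tenant.
    pub fn has_schema(&self, tenant_id: &str) -> bool {
        self.schemas
            .read()
            .map(|schemas| schemas.contains(tenant_id))
            .unwrap_or(false)
    }
}
```

Keeping both updates inside one function means callers cannot create a tenant without also making it visible to the query layer.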

        } else {
            return;
        };
        sessions.retain(|(_, expiry)| expiry < &now);

⚠️ Potential issue | 🔴 Critical

Critical bug: Session expiration logic is inverted.

The remove_expired_session method retains sessions where expiry < &now, but this keeps expired sessions and removes valid ones. The condition should be expiry > &now to retain sessions that have not yet expired.

🐛 Proposed fix
-        sessions.retain(|(_, expiry)| expiry < &now);
+        sessions.retain(|(_, expiry)| expiry > &now);
🤖 Prompt for AI Agents
In `@src/rbac/map.rs` at line 303, The session expiration check in
remove_expired_session is inverted: sessions.retain(|(_, expiry)| expiry < &now)
keeps expired sessions and drops valid ones; update the predicate to retain only
non-expired entries by using expiry > &now (or expiry >= &now if you want to
treat equal as valid) so sessions.retain(|(_, expiry)| expiry > &now) preserves
sessions that have not yet expired.
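
Since `retain` keeps the elements for which the predicate returns true, the direction of the comparison is easy to check with a small sketch. It assumes sessions are `(key, expiry)` pairs and uses plain integer timestamps; the real code may use a date-time type, and `remove_expired` is a hypothetical name.

```rust
/// Keeps only sessions whose expiry is still in the future.
/// Timestamps are plain seconds here purely for illustration.
pub fn remove_expired(sessions: &mut Vec<(String, u64)>, now: u64) {
    // `retain` keeps entries where the closure returns true, so the
    // predicate must describe the sessions that are still valid.
    sessions.retain(|(_, expiry)| *expiry > now);
}
```

With `now = 100`, a session expiring at 90 is dropped and one expiring at 110 is kept. Using `<` instead would do the opposite.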

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/handlers/http/correlation.rs (1)

89-96: Same issue: tenant_id discarded in modify operation.

Similar to the post function, tenant_id is extracted but discarded, and CORRELATIONS.update() is called without tenant context. This should be addressed alongside the post function fix for consistent multi-tenant behavior.

🔧 Proposed fix
-    correlation.user_id = get_user_and_tenant_from_request(&req)
-        .map(|(s, _)| get_hash(&s.to_string()))
+    let (user_id, tenant_id) = get_user_and_tenant_from_request(&req)
+        .map(|(s, t)| (get_hash(&s.to_string()), t))
         .map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;
+    correlation.user_id = user_id;

     let session_key = extract_session_key_from_req(&req)
         .map_err(|err| CorrelationError::AnyhowError(anyhow::Error::msg(err.to_string())))?;

-    let correlation = CORRELATIONS.update(correlation, &session_key).await?;
+    let correlation = CORRELATIONS.update(correlation, &session_key, &tenant_id).await?;
src/handlers/http/users/dashboards.rs (1)

78-90: Same tenant isolation concern in get_dashboard.

Similar to list_dashboards, this function extracts tenant_id from the header (line 83) rather than the authenticated session. This could allow cross-tenant dashboard access.

src/parseable/streams.rs (1)

1197-1267: Tests are broken: Stream::new calls missing tenant_id parameter.

Multiple test functions call Stream::new with 4 arguments, but the updated signature requires 5 parameters (including tenant_id). Additionally, calls to local_stream_data_path need to be updated to pass tenant_id as the second parameter. This will cause compilation failures.

🔧 Suggested fix (example for one test)
     fn test_staging_new_with_valid_stream() {
         let stream_name = "test_stream";

         let options = Arc::new(Options::default());
         let staging = Stream::new(
             options.clone(),
             stream_name,
             LogStreamMetadata::default(),
             None,
+            &None,
         );

         assert_eq!(
             staging.data_path,
-            options.local_stream_data_path(stream_name)
+            options.local_stream_data_path(stream_name, &None)
         );
     }

Apply similar fixes to all test functions: test_staging_with_special_characters, test_staging_data_path_initialization, test_staging_with_alphanumeric_stream_name, test_arrow_files_empty_directory, generate_correct_path_with_current_time_and_no_custom_partitioning, generate_correct_path_with_current_time_and_custom_partitioning, test_convert_to_parquet_with_empty_staging, write_log, different_minutes_multiple_arrow_files_to_parquet, same_minute_multiple_arrow_files_to_parquet, miss_current_arrow_file_when_converting_to_parquet, get_or_create_returns_existing_stream, create_and_return_new_stream_when_name_does_not_exist, and get_or_create_stream_concurrently.

🤖 Fix all issues with AI agents
In `@src/handlers/http/correlation.rs`:
- Around line 45-52: The handler currently uses get_tenant_id_from_request
(header) to derive tenant_id which is inconsistent with delete; replace that
header-based extraction with the session-based get_user_and_tenant_from_request
flow: call get_user_and_tenant_from_request(&req, &session_key) (or the
project's equivalent) and use the returned tenant (e.g., from the (user, tenant)
tuple) as tenant_id, propagate errors the same way as delete does, then pass
that tenant_id into CORRELATIONS.get_correlation(&correlation_id, &tenant_id).
Ensure you remove or stop using get_tenant_id_from_request in this function so
tenant is always taken from the authenticated session.

In `@src/handlers/http/users/dashboards.rs`:
- Around line 248-253: list_tags currently uses get_tenant_id_from_request
(header-based) which breaks tenant isolation; change it to extract the tenant id
the same way list_dashboards does (i.e., from the authenticated session/context
rather than a raw header). Locate the list_tags function and replace the call to
get_tenant_id_from_request(&req) with the same tenant-extraction helper used by
list_dashboards (or call into the auth/session object retrieved from the
request), ensure the tenant value passed to DASHBOARDS.list_tags(...) comes from
the authenticated session, and keep the existing return/error handling (same
types: list_tags, DASHBOARDS, HttpRequest, DashboardError).

In `@src/parseable/streams.rs`:
- Around line 1163-1169: The debug tracing statements in flush_and_convert are
left over and should be removed: delete the two tracing::warn! calls that log
flush_and_convert_tenants and parseable_streams_tenants (the lines that
reference tenants from PARSEABLE.list_tenants()/DEFAULT_TENANT and
self.read().unwrap().keys()). Leave the tenants selection logic (using
PARSEABLE.list_tenants and DEFAULT_TENANT) intact and do not replace them with
other logging.
- Line 649: The tracing::warn!(part_path=?part_path) call is a debug artifact
that will flood production logs; either remove it or downgrade it to trace level
by replacing tracing::warn!(part_path=?part_path) with
tracing::trace!(part_path=?part_path) (or delete the statement entirely) in the
same function/scope where the macro appears so logging noise is eliminated.
♻️ Duplicate comments (9)
src/utils/mod.rs (1)

79-85: Potential panic on malformed header value.

Calling .unwrap() on to_str() panics if the tenant header value contains bytes that to_str() rejects (anything other than visible ASCII, which includes non-UTF-8 bytes). This was previously flagged.

Suggested fix
 pub fn get_tenant_id_from_request(req: &HttpRequest) -> Option<String> {
     if let Some(tenant_value) = req.headers().get("tenant") {
-        Some(tenant_value.to_str().unwrap().to_owned())
+        tenant_value.to_str().ok().map(|s| s.to_owned())
     } else {
         None
     }
 }
src/parseable/mod.rs (4)

1066-1085: TOCTOU race condition in tenant addition.

This issue was previously flagged. The existence check (line 1075) uses a read lock, but the insertion (line 1080) acquires a separate write lock. Another thread could add the same tenant between these operations.
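
One way to close this window is to hold a single write lock across both the existence check and the insertion. The sketch below is illustrative only: `TenantRegistry`, its `Vec<String>` field, and the `bool` return value are hypothetical stand-ins, not the types used in src/parseable/mod.rs.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for the server's tenant list; not Parseable's actual type.
#[derive(Default)]
pub struct TenantRegistry {
    tenants: RwLock<Vec<String>>,
}

impl TenantRegistry {
    /// Adds `tenant_id` if absent. Returns `true` when the tenant was newly
    /// added and `false` when it already existed.
    pub fn add_tenant(&self, tenant_id: &str) -> bool {
        // Take the write lock once so that the check and the insert happen
        // under the same guard; no other thread can interleave between them.
        let mut tenants = self.tenants.write().expect("tenants lock poisoned");
        if tenants.iter().any(|t| t == tenant_id) {
            return false;
        }
        tenants.push(tenant_id.to_owned());
        true
    }

    /// Returns a snapshot of the registered tenants.
    pub fn list_tenants(&self) -> Vec<String> {
        self.tenants.read().expect("tenants lock poisoned").clone()
    }
}
```

The trade-off is that every add takes the write lock even when the tenant already exists, which is usually acceptable for an infrequent operation such as tenant creation.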


1125-1153: Incomplete tenant deletion - missing tenants list cleanup.

This issue was previously flagged. The method removes the tenant from TENANT_METADATA but does not remove it from self.tenants. This leaves the system in an inconsistent state where list_tenants() still returns the deleted tenant.
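
A minimal sketch of a deletion that keeps both structures in sync follows. `TenantState`, its fields, and the string-valued metadata are hypothetical simplifications for illustration; the real TENANT_METADATA and self.tenants types may differ.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical container pairing per-tenant metadata with the tenant list.
#[derive(Default)]
pub struct TenantState {
    metadata: RwLock<HashMap<String, String>>,
    tenants: RwLock<Vec<String>>,
}

impl TenantState {
    /// Registers a tenant in both the metadata map and the tenant list.
    pub fn add_tenant(&self, tenant_id: &str, meta: &str) {
        self.metadata
            .write()
            .expect("metadata lock poisoned")
            .insert(tenant_id.to_owned(), meta.to_owned());
        let mut tenants = self.tenants.write().expect("tenants lock poisoned");
        if !tenants.iter().any(|t| t == tenant_id) {
            tenants.push(tenant_id.to_owned());
        }
    }

    /// Removes the tenant from the metadata map AND the tenant list.
    /// Returns `true` if anything was removed.
    pub fn delete_tenant(&self, tenant_id: &str) -> bool {
        let had_metadata = self
            .metadata
            .write()
            .expect("metadata lock poisoned")
            .remove(tenant_id)
            .is_some();
        let mut tenants = self.tenants.write().expect("tenants lock poisoned");
        let before = tenants.len();
        tenants.retain(|t| t != tenant_id);
        had_metadata || tenants.len() != before
    }

    /// Returns a snapshot of the registered tenants.
    pub fn list_tenants(&self) -> Vec<String> {
        self.tenants.read().expect("tenants lock poisoned").clone()
    }
}
```

After `delete_tenant` returns, `list_tenants()` no longer reports the deleted tenant, which is the invariant the comment above asks for.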


1155-1191: Incomplete logic in load_tenants and silent lock failure.

This issue was previously flagged. The empty else if !is_multi_tenant { } block (lines 1177-1178) does nothing, and the silent lock failure at lines 1185-1190 masks potential issues.


1193-1201: Silent lock failure masks potential issues.

This issue was previously flagged. Consider using expect("tenants lock poisoned") for consistency with lock handling elsewhere.

src/handlers/http/middleware.rs (2)

167-180: Handle potential panic from HeaderValue::from_str().unwrap().

If tenant_id contains characters not valid in HTTP headers (e.g., control characters), HeaderValue::from_str() will return an error and unwrap() will panic. This is a duplicate of a prior review comment that should be addressed.

🔧 Suggested fix
         let user_and_tenant_id = match get_user_and_tenant_from_request(req.request()) {
             Ok((uid, tid)) => {
-                if tid.is_some() {
-                    req.headers_mut().insert(
-                        HeaderName::from_static("tenant"),
-                        HeaderValue::from_str(&tid.as_ref().unwrap()).unwrap(),
-                    );
+                if let Some(ref tenant) = tid {
+                    if let Ok(header_val) = HeaderValue::from_str(tenant) {
+                        req.headers_mut().insert(
+                            HeaderName::from_static("tenant"),
+                            header_val,
+                        );
+                    }
                 }
-
                 Ok((uid, tid))
             }
             Err(e) => Err(e),
         };

316-327: Security: Consider stricter handling when tenant doesn't exist.

The check_suspension function returns Authorized when:

  1. No tenant header is present (lines 325-326)
  2. Tenant doesn't exist in TENANT_METADATA (empty else block at lines 322-324)

This could allow requests to bypass tenant-level controls. The empty else branch with the "tenant does not exist" comment suggests this needs further handling. This is a duplicate of a prior review comment that should be addressed.

🔧 Suggested approach
 pub fn check_suspension(req: &HttpRequest, action: Action) -> rbac::Response {
     if let Some(tenant) = req.headers().get("tenant")
         && let Ok(tenant) = tenant.to_str()
     {
-        if let Ok(Some(suspension)) = TENANT_METADATA.is_action_suspended(tenant, &action) {
-            return rbac::Response::Suspended(suspension);
-        } else {
-            // tenant does not exist
+        match TENANT_METADATA.is_action_suspended(tenant, &action) {
+            Ok(Some(suspension)) => return rbac::Response::Suspended(suspension),
+            Ok(None) => {} // Tenant exists, action not suspended - continue
+            Err(_) => {
+                tracing::warn!(tenant = tenant, "Tenant not found in metadata during suspension check");
+                // Consider returning UnAuthorized for unknown tenants in production
+            }
         }
     }
     rbac::Response::Authorized
 }
src/parseable/streams.rs (2)

1056-1079: Remove debug logging statements.

Multiple tracing::warn! calls (lines 1056-1058, 1073, 1078) are used for debugging during development. These will flood production logs and should be removed or reduced to trace level. This is a duplicate of a prior review comment.

🔧 Suggested fix
     pub fn get_or_create(
         ...
     ) -> StreamRef {
         let mut guard = self.write().expect(LOCK_EXPECT);
-        tracing::warn!(
-            "get_or_create\nstream- {stream_name}\ntenant- {tenant_id:?}\nmetadata- {metadata:?}\noptions- {options:?}"
-        );
         let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);

         if let Some(tenant_streams) = guard.get(tenant)
             && let Some(stream) = tenant_streams.get(&stream_name)
         {
             return stream.clone();
         }

-        // if let Some(stream) = guard.get(&stream_name) {
-        //     return stream.clone();
-        // }
-        // guard.insert(stream_name, stream.clone());
-
         let stream = Stream::new(options, &stream_name, metadata, ingestor_id, tenant_id);
-        tracing::warn!("creating new stream- {stream_name}");
         guard
             .entry(tenant.to_owned())
             .or_default()
             .insert(stream_name, stream.clone());
-        tracing::warn!("inserted stream in mem");
         stream
     }

1092-1101: Noisy warning for missing tenant.

The warn! at lines 1097-1099 triggers on every contains check when a tenant doesn't exist. This could flood logs during normal operation (e.g., checking streams before tenant is registered). Consider using debug or trace level. This is a duplicate of a prior review comment.

🔧 Suggested fix
         } else {
-            tracing::warn!(
+            tracing::debug!(
                 "Tenant with id {tenant_id} does not exist! Shouldn't happen (stream- {stream_name})"
             );
             false
         }
🧹 Nitpick comments (16)
src/utils/mod.rs (2)

61-77: Inconsistent tenant_id handling between BasicAuth and SessionId paths.

The BasicAuth path (line 68) returns user.tenant.clone() which is already Option<String>, while the SessionId path (line 76) wraps the tenant_id in Some(). This creates inconsistent behavior:

  • BasicAuth users may have tenant_id = None if user.tenant is None
  • SessionId users always have tenant_id = Some(...) even if the underlying value represents "no tenant"

Consider normalizing the return value to ensure consistent semantics across both authentication methods.
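
As a rough sketch of such normalization, both authentication paths could pass their tenant value through one helper before returning. The helper name `normalize_tenant_id` is hypothetical, and the assumption that "no tenant" is represented by an empty string or a sentinel is a guess; the `DEFAULT_TENANT` value below is invented for illustration and may not match the constant in the codebase.

```rust
/// Hypothetical sentinel for "no tenant"; the real value in the codebase may differ.
pub const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Maps every representation of "no tenant" (None, an empty string, or the
/// assumed sentinel) to `None`, and leaves real tenant ids untouched.
pub fn normalize_tenant_id(raw: Option<String>) -> Option<String> {
    raw.filter(|t| !t.is_empty() && t != DEFAULT_TENANT)
}
```

With a helper like this, the BasicAuth path would return `normalize_tenant_id(user.tenant.clone())` and the SessionId path `normalize_tenant_id(Some(tenant_id))`, so callers see the same semantics regardless of how the user authenticated.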


87-93: Unnecessary clone on tenant_id.

The tenant_id returned from get_userid_from_session is already a String. The .clone() on line 89 is redundant since you're returning by value.

Suggested fix
 pub fn get_tenant_id_from_key(key: &SessionKey) -> Option<String> {
     if let Some((_, tenant_id)) = Users.get_userid_from_session(key) {
-        Some(tenant_id.clone())
+        Some(tenant_id)
     } else {
         None
     }
 }
src/storage/object_storage.rs (6)

116-117: Remove commented-out debug logging.

These commented debug statements should be removed before merging.

-    // tracing::warn!("upload single stream_relative_path- {stream_relative_path:?}");
-    // tracing::warn!("upload single path- {path:?}");

634-636: Consider reducing log level for production paths.

The tracing::warn! statement here logs on every stream lookup miss, which could generate excessive logs in production. Consider using tracing::debug! or tracing::trace! instead.

-        tracing::warn!(
-            "unable to find stream- {stream_name} with tenant- {tenant_id:?} in PARSEABLE.get_stream"
-        );
+        tracing::debug!(
+            "Stream {stream_name} with tenant {tenant_id:?} not found in memory, loading from storage"
+        );

939-939: Remove debug logging artifact.

-        tracing::warn!(process_parquet_files_path=?path);

971-979: Remove commented debug code.

-    // tracing::warn!("spawn parquet file name- {filename}");
...
-    // tracing::warn!("spawn parquet stream_relative_path- {stream_relative_path}");

1061-1061: Remove or reduce debug logging in production path.

-        tracing::warn!(upload_context_schema_files=?path);
+        tracing::trace!(upload_context_schema_files=?path);

1156-1169: Consider consistent tenant handling in path functions.

The schema_path and stream_json_path functions unconditionally include an empty string when tenant_id is None (via .map_or("", |v| v)), creating paths like ["", stream_name, ...]. While the relative-path crate normalizes empty path segments and this doesn't cause runtime issues, this pattern is inconsistent with similar functions like alert_json_path and alert_state_json_path, which use conditional inclusion (if let Some(tenant_id)). For consistency and clarity, consider adopting the same pattern: only include the tenant component when it's present.
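
The conditional-inclusion pattern can be sketched with the standard library alone. The function name `stream_scoped_path`, the `/` separator, and the file names used in the usage note are assumptions for illustration; the actual functions build relative-path values with their own file-name constants.

```rust
/// Builds a path of the form `[tenant/]stream/file`, including the tenant
/// component only when one is present. Hypothetical helper, not the real API.
pub fn stream_scoped_path(
    tenant_id: &Option<String>,
    stream_name: &str,
    file_name: &str,
) -> String {
    let mut parts: Vec<&str> = Vec::with_capacity(3);
    // Push the tenant segment only if it exists, instead of pushing "".
    if let Some(tenant) = tenant_id.as_deref() {
        parts.push(tenant);
    }
    parts.push(stream_name);
    parts.push(file_name);
    parts.join("/")
}
```

Under these assumptions, `stream_scoped_path(&None, "logs", ".schema")` yields `logs/.schema` with no leading empty segment, while a tenant of `acme` yields `acme/logs/.schema`.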

src/parseable/mod.rs (3)

252-267: Consider reducing log level for check_or_load_stream.

The tracing::warn! calls at lines 257 and 261 will fire frequently in normal operation. Consider using tracing::debug! instead.

-        tracing::warn!("check or load streams- {stream_name}, {tenant_id:?}");
+        tracing::debug!("check or load streams- {stream_name}, {tenant_id:?}");
...
-        tracing::warn!("check or load streams not present, creating {stream_name}, {tenant_id:?}");
+        tracing::debug!("check or load streams not present, creating {stream_name}, {tenant_id:?}");

821-831: Reduce debug logging in stream creation path.

Multiple tracing::warn! statements in the stream creation success path will generate excessive logs in production.

-                tracing::warn!(created_stream_at=?created_at);
-                tracing::warn!(stream_name=?stream_name);
-                tracing::warn!(schema=?schema);
-                tracing::warn!(tenant_id=?tenant_id);
+                tracing::debug!(created_stream_at=?created_at, stream_name=?stream_name, tenant_id=?tenant_id);

460-462: Reduce log level for schema commit.

-        tracing::warn!(commit_schema=?schema);
+        tracing::trace!(commit_schema=?schema);
src/handlers/http/middleware.rs (1)

329-337: Simplify suspension check with if let instead of match.

The match statement only handles the Suspended variant explicitly; consider using if let for cleaner code.

♻️ Suggested refactor
 pub fn auth_no_context(req: &mut ServiceRequest, action: Action) -> Result<rbac::Response, Error> {
     // check if tenant is suspended
-    match check_suspension(req.request(), action) {
-        rbac::Response::Suspended(msg) => return Ok(rbac::Response::Suspended(msg)),
-        _ => {}
+    if let rbac::Response::Suspended(msg) = check_suspension(req.request(), action) {
+        return Ok(rbac::Response::Suspended(msg));
     }
     let creds = extract_session_key(req);
     creds.map(|key| Users.authorize(key, action, None, None))
 }

The same pattern applies to auth_resource_context (lines 343-347) and auth_user_context (lines 370-374).

src/parseable/streams.rs (2)

413-421: Minor: Unnecessary variable binding.

The dirs variable is immediately returned. Consider returning the expression directly.

♻️ Suggested fix
-        let dirs = dir
+        dir
             .flatten()
             .map(|file| file.path())
             .filter(|file| {
                 file.extension().is_some_and(|ext| ext.eq("parquet"))
                     && Self::is_valid_parquet_file(file, &self.stream_name)
             })
-            .collect();
-        dirs
+            .collect()
     }

1032-1036: Breaking change: Streams type now requires nested tenant lookup.

The type change from HashMap<String, StreamRef> to HashMap<String, HashMap<String, StreamRef>> is a significant structural change. The commented-out old definition should be removed.

♻️ Suggested fix
-// #[derive(Deref, DerefMut, Default)]
-// pub struct Streams(RwLock<HashMap<String, StreamRef>>);
-
 #[derive(Deref, DerefMut, Default)]
 pub struct Streams(RwLock<HashMap<String, HashMap<String, StreamRef>>>);
src/rbac/user.rs (2)

153-164: Salt generation updated to use 32-byte random value.

The change from SaltString::generate(&mut OsRng) to manual 32-byte generation with encode_b64 serves the same purpose (a random salt drawn from OsRng) but makes the salt length explicit. The commented-out line should be removed.

♻️ Suggested fix
 fn gen_hash(password: &str) -> String {
     let mut bytes = [0u8; 32];
     let r = &mut OsRng;
     r.fill_bytes(&mut bytes);
     let salt = SaltString::encode_b64(&bytes).unwrap();
-    // let salt = SaltString::generate(&mut OsRng);
     let argon2 = Argon2::default();
     argon2
         .hash_password(password.as_bytes(), &salt)
         .expect("can hash random alphanumeric")
         .to_string()
 }

472-478: Remove commented-out code.

The commented-out update_in_metadata method should be removed or implemented. Leaving dead code makes maintenance harder.

♻️ Suggested fix
-    // pub async fn update_in_metadata(&self, tenant_id: &Option<String>) -> Result<(), RBACError> {
-    //     let mut metadata = get_metadata(tenant_id).await?;
-    //     metadata.user_groups.retain(|x| x.name != self.name);
-    //     metadata.user_groups.push(self.clone());
-    //     put_metadata(&metadata).await?;
-    //     Ok(())
-    // }
 }
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9802cbf and be1a48b.

📒 Files selected for processing (9)
  • src/handlers/http/correlation.rs
  • src/handlers/http/middleware.rs
  • src/handlers/http/users/dashboards.rs
  • src/handlers/http/users/filters.rs
  • src/parseable/mod.rs
  • src/parseable/streams.rs
  • src/rbac/user.rs
  • src/storage/object_storage.rs
  • src/utils/mod.rs
🧰 Additional context used
🧠 Learnings (26)
📚 Learning: 2025-05-01T10:27:56.858Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1305
File: src/handlers/http/users/dashboards.rs:0-0
Timestamp: 2025-05-01T10:27:56.858Z
Learning: The `add_tile()` function in `src/handlers/http/users/dashboards.rs` should use `get_dashboard_by_user(dashboard_id, &user_id)` instead of `get_dashboard(dashboard_id)` to ensure proper authorization checks when modifying a dashboard.

Applied to files:

  • src/handlers/http/users/filters.rs
  • src/handlers/http/users/dashboards.rs
  • src/handlers/http/middleware.rs
📚 Learning: 2025-09-05T09:27:12.659Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1424
File: src/users/filters.rs:116-121
Timestamp: 2025-09-05T09:27:12.659Z
Learning: The Filters::load() function in src/users/filters.rs is only called once at server initialization, so there's no risk of duplicate entries from repeated invocations.

Applied to files:

  • src/handlers/http/users/filters.rs
📚 Learning: 2025-02-14T09:49:25.818Z
Learnt from: de-sh
Repo: parseablehq/parseable PR: 1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.

Applied to files:

  • src/utils/mod.rs
  • src/parseable/mod.rs
  • src/parseable/streams.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-05-01T12:22:42.363Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1305
File: src/users/dashboards.rs:154-165
Timestamp: 2025-05-01T12:22:42.363Z
Learning: Title validation for dashboards is performed in the `create_dashboard` HTTP handler function rather than in the `DASHBOARDS.create` method, avoiding redundant validation.

Applied to files:

  • src/handlers/http/users/dashboards.rs
📚 Learning: 2025-05-01T10:33:51.767Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1305
File: src/handlers/http/users/dashboards.rs:125-148
Timestamp: 2025-05-01T10:33:51.767Z
Learning: When adding a tile to a dashboard in `add_tile()` function, the tile ID must be provided by the client and should not be generated by the server. If the tile ID is missing (nil), the API should fail the operation with an appropriate error message.

Applied to files:

  • src/handlers/http/users/dashboards.rs
📚 Learning: 2025-08-25T01:31:41.786Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metadata.rs:63-68
Timestamp: 2025-08-25T01:31:41.786Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metadata.rs and src/storage/object_storage.rs are designed to track total events across all streams, not per-stream. They use labels [origin, parsed_date] to aggregate by format and date, while per-stream metrics use [stream_name, origin, parsed_date] labels.

Applied to files:

  • src/parseable/mod.rs
  • src/parseable/streams.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-08-25T01:32:25.980Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:163-173
Timestamp: 2025-08-25T01:32:25.980Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metrics/mod.rs are intentionally designed to track global totals across all streams for a given date, using labels ["format", "date"] rather than per-stream labels. This is the correct design for global aggregation purposes.

Applied to files:

  • src/parseable/mod.rs
  • src/parseable/streams.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-10-28T02:10:41.140Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1453
File: src/parseable/mod.rs:397-400
Timestamp: 2025-10-28T02:10:41.140Z
Learning: In Parseable enterprise deployments with multiple query nodes, hot tier configuration must be persisted in object storage so that newly started query nodes can fetch and synchronize the hot tier settings at startup (file: src/parseable/mod.rs, function: create_stream_and_schema_from_storage).

Applied to files:

  • src/parseable/mod.rs
  • src/parseable/streams.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-09-18T09:52:07.554Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/storage/object_storage.rs:173-177
Timestamp: 2025-09-18T09:52:07.554Z
Learning: In Parseable's upload system (src/storage/object_storage.rs), the update_storage_metrics function can safely use path.metadata().map_err() to fail on local file metadata read failures because parquet validation (validate_uploaded_parquet_file) ensures file integrity before this step, and the system guarantees local staging files remain accessible throughout the upload flow.

Applied to files:

  • src/parseable/mod.rs
  • src/parseable/streams.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-09-06T04:26:17.191Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1424
File: src/enterprise/utils.rs:65-72
Timestamp: 2025-09-06T04:26:17.191Z
Learning: In Parseable's metastore implementation, MetastoreError::to_detail() returns a MetastoreErrorDetail struct (not a string), which contains structured error information including operation, message, stream_name, and other contextual fields. This struct is designed to be boxed in ObjectStorageError::MetastoreError(Box<MetastoreErrorDetail>).

Applied to files:

  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-10-21T02:22:24.403Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/parseable/mod.rs:419-432
Timestamp: 2025-10-21T02:22:24.403Z
Learning: In Parseable's internal stream creation (`create_internal_stream_if_not_exists` in `src/parseable/mod.rs`), errors should not propagate to fail server initialization. The function creates both pmeta and pbilling internal streams, and failures are logged but the function always returns `Ok(())` to ensure server startup resilience. Individual stream creation failures should not prevent syncing of successfully created streams.

Applied to files:

  • src/parseable/mod.rs
  • src/parseable/streams.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-08-18T14:56:18.463Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1405
File: src/storage/object_storage.rs:997-1040
Timestamp: 2025-08-18T14:56:18.463Z
Learning: In Parseable's staging upload system (src/storage/object_storage.rs), failed parquet file uploads should remain in the staging directory for retry in the next sync cycle, while successful uploads remove their staged files immediately. Early return on first error in collect_upload_results is correct behavior as concurrent tasks handle their own cleanup and failed files need to stay for retry.

Applied to files:

  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-03-26T06:13:48.898Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1271
File: src/prism/home/mod.rs:207-224
Timestamp: 2025-03-26T06:13:48.898Z
Learning: In the Parseable codebase, if a stream is found, the stream_jsons array will always have at least one element. Additionally, for any valid stream_json object, the log_source array will always have at least one element. This is a design invariant that makes additional null checks unnecessary.

Applied to files:

  • src/parseable/mod.rs
📚 Learning: 2025-08-18T12:37:47.732Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1405
File: src/parseable/mod.rs:528-533
Timestamp: 2025-08-18T12:37:47.732Z
Learning: In Parseable, the validate_time_partition function in src/utils/json/flatten.rs already provides a default time partition limit of 30 days using `map_or(30, |days| days.get() as i64)` when time_partition_limit is None, so no additional defaulting is needed in the stream creation logic in src/parseable/mod.rs.

Applied to files:

  • src/parseable/mod.rs
  • src/parseable/streams.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-08-18T19:10:11.941Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1405
File: src/handlers/http/ingest.rs:163-164
Timestamp: 2025-08-18T19:10:11.941Z
Learning: Field statistics calculation in src/storage/field_stats.rs uses None for the time_partition parameter when calling flatten_and_push_logs(), as field stats generation does not require time partition functionality.

Applied to files:

  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-08-21T11:47:01.279Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1410
File: src/storage/object_storage.rs:0-0
Timestamp: 2025-08-21T11:47:01.279Z
Learning: In Parseable's object storage implementation (src/storage/object_storage.rs), the hour and minute directory prefixes (hour=XX, minute=YY) are generated from arrow file timestamps following proper datetime conventions, so they are guaranteed to be within valid ranges (0-23 for hours, 0-59 for minutes) and don't require additional range validation.

Applied to files:

  • src/parseable/mod.rs
📚 Learning: 2025-08-21T14:41:55.462Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1410
File: src/storage/object_storage.rs:876-916
Timestamp: 2025-08-21T14:41:55.462Z
Learning: In Parseable's object storage system (src/storage/object_storage.rs), date directories (date=YYYY-MM-DD) are only created when there's actual data to store, which means they will always contain corresponding hour and minute subdirectories. There can be no case where a date directory exists without hour or minute subdirectories.

Applied to files:

  • src/parseable/mod.rs
📚 Learning: 2025-03-26T06:44:53.362Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1263
File: src/handlers/http/ingest.rs:300-310
Timestamp: 2025-03-26T06:44:53.362Z
Learning: In Parseable, every stream is always associated with a log_source - no stream can exist without a log_source. For otel-traces and otel-metrics, strict restrictions are implemented where ingestion is rejected if a stream already has a different log_source format. However, regular logs from multiple log_sources can coexist in a single stream.

Applied to files:

  • src/parseable/streams.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-09-18T09:59:20.177Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:700-756
Timestamp: 2025-09-18T09:59:20.177Z
Learning: In src/event/mod.rs, the parsed_timestamp used in increment_events_ingested_by_date() is correctly UTC-normalized: for dynamic streams it remains Utc::now(), and for streams with time partition enabled it uses the time partition value. Both cases result in proper UTC date strings for metrics labeling, preventing double-counting issues.

Applied to files:

  • src/parseable/streams.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-09-09T14:08:45.809Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1427
File: resources/ingest_demo_data.sh:440-440
Timestamp: 2025-09-09T14:08:45.809Z
Learning: In the resources/ingest_demo_data.sh demo script, hardcoded stream names like "demodata" in alert queries should be ignored and not flagged for replacement with $P_STREAM variables.

Applied to files:

  • src/parseable/streams.rs
📚 Learning: 2025-10-20T17:48:53.444Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/handlers/http/cluster/mod.rs:1370-1400
Timestamp: 2025-10-20T17:48:53.444Z
Learning: In src/handlers/http/cluster/mod.rs, the billing metrics processing logic should NOT accumulate counter values from multiple Prometheus samples with the same labels. The intended behavior is to convert each received counter from nodes into individual events for ingestion, using `.insert()` to store the counter value directly.

Applied to files:

  • src/parseable/streams.rs
📚 Learning: 2025-07-28T17:10:39.448Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1392
File: src/migration/stream_metadata_migration.rs:303-322
Timestamp: 2025-07-28T17:10:39.448Z
Learning: In Parseable's migration system (src/migration/stream_metadata_migration.rs), each migration function updates the metadata to the current latest format using CURRENT_OBJECT_STORE_VERSION and CURRENT_SCHEMA_VERSION constants, rather than producing incremental versions. For example, v5_v6 function produces v7 format output when these constants are set to "v7", not v6 format.

Applied to files:

  • src/parseable/streams.rs
📚 Learning: 2025-09-14T15:17:59.234Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1432
File: src/storage/object_storage.rs:124-132
Timestamp: 2025-09-14T15:17:59.234Z
Learning: In Parseable's upload validation system (src/storage/object_storage.rs), the validate_uploaded_parquet_file function should not include bounded retries for metadata consistency issues. Instead, failed validations rely on the 30-second sync cycle for natural retries, with staging files preserved when manifest_file is set to None.

Applied to files:

  • src/storage/object_storage.rs
📚 Learning: 2025-08-20T17:01:25.791Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1409
File: src/storage/field_stats.rs:429-456
Timestamp: 2025-08-20T17:01:25.791Z
Learning: In Parseable's field stats calculation (src/storage/field_stats.rs), the extract_datetime_from_parquet_path_regex function correctly works with filename-only parsing because Parseable's server-side filename generation guarantees the dot-separated format date=YYYY-MM-DD.hour=HH.minute=MM pattern in parquet filenames.

Applied to files:

  • src/storage/object_storage.rs
📚 Learning: 2025-08-18T18:01:22.834Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1405
File: src/handlers/http/modal/utils/ingest_utils.rs:271-292
Timestamp: 2025-08-18T18:01:22.834Z
Learning: In Parseable's ingestion validation, validate_stream_for_ingestion is designed to prevent regular log ingestion endpoints (ingest() and post_event()) from ingesting into streams that exclusively contain OTEL traces or metrics. The function allows mixed streams (regular logs + OTEL) but blocks ingestion into OTEL-only streams, maintaining architectural separation between regular log and OTEL ingestion pathways.

Applied to files:

  • src/storage/object_storage.rs
📚 Learning: 2025-06-16T09:50:38.636Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1346
File: src/parseable/streams.rs:319-331
Timestamp: 2025-06-16T09:50:38.636Z
Learning: In Parseable's Ingest or Query mode, the node_id is always available because it's generated during server initialization itself, before the get_node_id_string() function in streams.rs would be called. This makes the .expect() calls on QUERIER_META.get() and INGESTOR_META.get() safe in this context.

Applied to files:

  • src/storage/object_storage.rs
🧬 Code graph analysis (7)
src/handlers/http/correlation.rs (1)
src/utils/mod.rs (4)
  • get_hash (95-100)
  • get_tenant_id_from_request (79-85)
  • get_user_and_tenant_from_request (61-77)
  • user_auth_for_datasets (114-180)
src/handlers/http/users/filters.rs (1)
src/utils/mod.rs (3)
  • get_hash (95-100)
  • get_user_and_tenant_from_request (61-77)
  • is_admin (182-199)
src/utils/mod.rs (1)
src/utils/actix.rs (2)
  • req (31-31)
  • extract_session_key_from_req (51-71)
src/handlers/http/users/dashboards.rs (1)
src/utils/mod.rs (4)
  • get_hash (95-100)
  • get_tenant_id_from_request (79-85)
  • get_user_and_tenant_from_request (61-77)
  • is_admin (182-199)
src/handlers/http/middleware.rs (2)
src/rbac/mod.rs (1)
  • roles_to_permission (313-328)
src/utils/mod.rs (1)
  • get_user_and_tenant_from_request (61-77)
src/rbac/user.rs (2)
src/rbac/mod.rs (1)
  • add_roles (142-150)
src/rbac/map.rs (2)
  • roles (80-88)
  • mut_sessions (106-112)
src/parseable/mod.rs (5)
src/handlers/http/modal/utils/rbac_utils.rs (1)
  • get_metadata (24-34)
src/rbac/map.rs (3)
  • mut_roles (90-96)
  • mut_users (72-78)
  • users (62-70)
src/storage/store_metadata.rs (1)
  • put_remote_metadata (290-296)
src/metastore/metastore_traits.rs (1)
  • delete_tenant (306-306)
src/tenants/mod.rs (1)
  • delete_tenant (69-71)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build Default x86_64-pc-windows-msvc
  • GitHub Check: Build Default aarch64-apple-darwin
  • GitHub Check: Build Kafka x86_64-unknown-linux-gnu
  • GitHub Check: Build Default aarch64-unknown-linux-gnu
  • GitHub Check: Build Default x86_64-unknown-linux-gnu
  • GitHub Check: Build Kafka aarch64-apple-darwin
  • GitHub Check: coverage
  • GitHub Check: Quest Smoke and Load Tests for Standalone deployments
  • GitHub Check: Quest Smoke and Load Tests for Distributed deployments
🔇 Additional comments (33)
src/handlers/http/correlation.rs (3)

26-28: LGTM!

The new imports for tenant-related utilities are appropriate for the multi-tenancy changes.


101-114: LGTM!

The delete function correctly extracts and propagates both user_id and tenant_id, implementing proper tenant-scoped deletion. This pattern should be applied consistently to the other endpoints (get, post, modify).


73-78: Tenant isolation is maintained via session_key; the current implementation is correct.

The create() method does not require an explicit tenant_id parameter because it internally derives the tenant context from the session_key via get_tenant_id_from_key(session_key) (line 134 in src/correlation.rs). This tenant_id is then correctly passed to the metastore and used for in-memory storage, ensuring tenant isolation.

The extracted tenant_id in the post function is unused, but this does not represent a bug—it's vestigial code. The design intentionally passes session_key to create(), which handles tenant derivation internally (unlike delete(), which takes an explicit tenant_id parameter for a different purpose).

src/utils/mod.rs (2)

102-180: LGTM - tenant context properly threaded through authorization.

The user_auth_for_query and user_auth_for_datasets functions correctly propagate tenant context through check_or_load_stream and get_stream calls. The permission matching logic properly handles tenant-scoped stream access.


182-199: LGTM - admin check updated for tenant-aware permissions.

The is_admin function now correctly matches Some(ParseableResourceType::All) to align with the updated RBAC model.

src/handlers/http/users/filters.rs (4)

45-62: LGTM - tenant context properly integrated into filter retrieval.

The get handler correctly extracts both user_id and tenant_id, then passes tenant_id to FILTERS.get_filter for tenant-scoped filter access.


64-78: LGTM - filter creation with tenant context.

The post handler correctly propagates tenant_id to both metastore persistence (put_filter) and in-memory state update (FILTERS.update).


80-107: LGTM - filter update with tenant context.

The update handler correctly uses tenant_id for both authorization check (get_filter) and persistence operations.


109-131: LGTM - filter deletion with tenant context.

The delete handler correctly propagates tenant_id through authorization, metastore deletion, and in-memory state cleanup.

src/handlers/http/users/dashboards.rs (5)

92-106: LGTM - create_dashboard uses auth-based tenant extraction.

The create_dashboard handler correctly uses get_user_and_tenant_from_request to derive tenant context from the authenticated user.


108-195: LGTM - update_dashboard with proper tenant context.

The handler correctly uses get_user_and_tenant_from_request and passes tenant_id through both authorization (get_dashboard_by_user) and update operations.


197-212: LGTM - delete_dashboard with proper tenant context.

The handler correctly uses auth-based tenant extraction for deletion operations.


214-246: LGTM - add_tile properly checks ownership with tenant context.

Based on learnings, get_dashboard_by_user is correctly used for authorization before modification, and tenant_id is properly propagated.


36-76: No security issue. The list_dashboards function correctly uses the tenant ID from the request. All dashboard endpoints are protected by the .authorize() middleware, which invokes the AuthMiddleware that validates the tenant from the authenticated user session and overwrites the request tenant header with the authenticated user's actual tenant before the handler executes. This prevents header-based tenant spoofing attacks, ensuring consistent tenant isolation across all operations.

Likely an incorrect or invalid review comment.

src/storage/object_storage.rs (3)

1074-1093: LGTM - tenant-prefixed path construction.

The stream_relative_path function correctly handles tenant prefixing, using DEFAULT_TENANT comparison to determine whether to include tenant in the path.


1095-1123: LGTM - multi-tenant stream sync orchestration.

The sync_all_streams function correctly iterates over tenants (or uses None for single-tenant mode) and spawns upload tasks per tenant/stream combination.


1211-1221: LGTM - tenant-aware alert path construction.

The alert_json_path function correctly handles optional tenant prefix for alert storage paths.

src/parseable/mod.rs (3)

44-45: LGTM - DEFAULT_TENANT constant.

Good practice to define a constant for the default tenant identifier.


213-227: LGTM - tenant-aware stream lookup.

The get_stream function correctly falls back to DEFAULT_TENANT when no tenant is specified, maintaining backward compatibility.


467-541: LGTM - per-tenant internal stream creation.

The create_internal_stream_if_not_exists function correctly iterates over all tenants (or uses None for single-tenant mode) to create internal streams per tenant.

src/handlers/http/middleware.rs (4)

200-216: Tenant-aware user lookup during session refresh looks correct.

The code correctly uses map_or(DEFAULT_TENANT, |v| v) to handle the Option<String> tenant_id and retrieves the user from the tenant-scoped map. The nested if let pattern appropriately handles both the tenant and user lookups.
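
The fallback described above can be sketched in isolation. This is a minimal illustration, not the project's code: the value of `DEFAULT_TENANT` and the helper name `resolve_tenant` are assumptions, and `as_deref().unwrap_or(..)` is used as an equivalent spelling of the `map_or(DEFAULT_TENANT, |v| v)` pattern.

```rust
/// Hypothetical stand-in for the project's `DEFAULT_TENANT` constant;
/// the real value is not shown in this review.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Resolve an optional tenant ID to the key used for tenant-scoped maps.
/// `None` falls back to the default tenant (hypothetical helper).
fn resolve_tenant(tenant_id: &Option<String>) -> &str {
    tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)
}
```

Centralizing the fallback in one helper would keep the read and mutation paths from drifting apart.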


251-264: Consistent tenant handling in user mutation path.

The mutable user lookup mirrors the read path with the same tenant resolution pattern. The early return with an appropriate error message when user is not found is correct.


266-289: Tenant propagation to track_new and roles_to_permission is correct.

The tenant_id is properly passed through to both roles_to_permission (which expects &str per the relevant snippet) and track_new. The fallback to DEFAULT_TENANT is consistent with the rest of the codebase.


305-308: New Suspended response mapped to BadRequest.

The suspension response is properly handled and converted to an HTTP 400 error. This aligns with the pattern of informing clients about the specific suspension reason.

src/parseable/streams.rs (1)

123-126: Tenant-aware data path derivation.

The Stream::new function correctly accepts tenant_id and passes it to local_stream_data_path for tenant-scoped storage paths. This aligns with the multi-tenancy design.

src/rbac/user.rs (8)

48-48: New tenant field added to User struct.

The addition of the optional tenant field properly supports multi-tenancy. The field is correctly propagated through constructors.


53-67: new_basic correctly accepts and stores tenant.

The constructor properly initializes the new tenant field from the parameter.


69-86: new_oauth correctly accepts and stores tenant.

The OAuth user constructor properly handles the tenant parameter.


131-131: Password length increased from 16 to 32 characters.

This is a security improvement, providing more entropy for generated passwords.


246-247: New tenant_id field added to GroupUser.

The addition supports tenant-scoped user group operations. The field is correctly propagated in from_user.


374-384: add_roles now requires tenant_id: &str parameter.

The signature change aligns with the relevant code snippet from src/rbac/mod.rs which shows add_roles accepting tenant_id: &Option<String>. However, here it takes &str directly. Ensure callers resolve Option<String> to &str before calling.


393-397: Tenant resolution in add_users uses GroupUser.tenant_id.

The code correctly uses the tenant_id from each GroupUser with DEFAULT_TENANT as fallback for session removal.


171-185: No action required. All callers of the renamed function and role definitions have been updated consistently. The function get_admin_user has been completely removed with no orphaned references, and the single caller in src/rbac/map.rs:138 correctly uses get_super_admin_user. The role definition in src/rbac/map.rs:135 is consistent with the new "super-admin" name.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +45 to +52
let tenant_id = get_tenant_id_from_request(&req);
let correlation_id = correlation_id.into_inner();
let session_key = extract_session_key_from_req(&req)
.map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;

let correlation = CORRELATIONS.get_correlation(&correlation_id).await?;
let correlation = CORRELATIONS
.get_correlation(&correlation_id, &tenant_id)
.await?;

⚠️ Potential issue | 🟠 Major

Inconsistent tenant_id extraction creates potential security risk.

The get function extracts tenant_id from an HTTP header via get_tenant_id_from_request, while delete extracts it from the authenticated session via get_user_and_tenant_from_request. This inconsistency could allow a user to access correlations from other tenants by manipulating the tenant header.

For consistency and security, consider using the session-based approach here as well:

🔒 Proposed fix
 pub async fn get(
     req: HttpRequest,
     correlation_id: Path<String>,
 ) -> Result<impl Responder, CorrelationError> {
-    let tenant_id = get_tenant_id_from_request(&req);
     let correlation_id = correlation_id.into_inner();
     let session_key = extract_session_key_from_req(&req)
         .map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;
+    let tenant_id = get_user_and_tenant_from_request(&req)
+        .map(|(_, t)| t)
+        .map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;

     let correlation = CORRELATIONS
         .get_correlation(&correlation_id, &tenant_id)
         .await?;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let tenant_id = get_tenant_id_from_request(&req);
let correlation_id = correlation_id.into_inner();
let session_key = extract_session_key_from_req(&req)
.map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;
let correlation = CORRELATIONS.get_correlation(&correlation_id).await?;
let correlation = CORRELATIONS
.get_correlation(&correlation_id, &tenant_id)
.await?;
let correlation_id = correlation_id.into_inner();
let session_key = extract_session_key_from_req(&req)
.map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;
let tenant_id = get_user_and_tenant_from_request(&req)
.map(|(_, t)| t)
.map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;
let correlation = CORRELATIONS
.get_correlation(&correlation_id, &tenant_id)
.await?;
🤖 Prompt for AI Agents
In `@src/handlers/http/correlation.rs` around lines 45 - 52, The handler currently
uses get_tenant_id_from_request (header) to derive tenant_id which is
inconsistent with delete; replace that header-based extraction with the
session-based get_user_and_tenant_from_request flow: call
get_user_and_tenant_from_request(&req) (or the project's
equivalent) and use the returned tenant (e.g., from the (user, tenant) tuple) as
tenant_id, propagate errors the same way as delete does, then pass that
tenant_id into CORRELATIONS.get_correlation(&correlation_id, &tenant_id). Ensure
you remove or stop using get_tenant_id_from_request in this function so tenant
is always taken from the authenticated session.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
src/handlers/http/correlation.rs (1)

89-96: Missing tenant_id propagation in correlation update.

Same issue as post — the tenant_id is extracted but discarded. If CORRELATIONS.update supports tenant scoping, it should be passed here for consistency with the delete handler.

src/handlers/http/users/dashboards.rs (2)

36-69: Inconsistent tenant_id extraction in read operations.

The list_dashboards function extracts tenant_id from an HTTP header via get_tenant_id_from_request (line 37), while write operations like create_dashboard (line 99) use get_user_and_tenant_from_request which derives tenant from the authenticated session.

This inconsistency could allow users to view dashboards from other tenants by manipulating the tenant header. For proper tenant isolation, use session-based extraction:

🔒 Suggested fix
 pub async fn list_dashboards(req: HttpRequest) -> Result<impl Responder, DashboardError> {
-    let tenant_id = get_tenant_id_from_request(&req);
+    let (_, tenant_id) = get_user_and_tenant_from_request(&req)?;
     let query_map = web::Query::<HashMap<String, String>>::from_query(req.query_string())

78-90: Inconsistent tenant_id extraction in get_dashboard.

Same issue as list_dashboards — uses header-based get_tenant_id_from_request instead of session-based extraction, which could allow cross-tenant data access.

🔒 Suggested fix
 pub async fn get_dashboard(
     req: HttpRequest,
     dashboard_id: Path<String>,
 ) -> Result<impl Responder, DashboardError> {
     let dashboard_id = validate_dashboard_id(dashboard_id.into_inner())?;
-    let tenant_id = get_tenant_id_from_request(&req);
+    let (_, tenant_id) = get_user_and_tenant_from_request(&req)?;
     let dashboard = DASHBOARDS
src/rbac/user.rs (1)

241-305: Fix GroupUser equality/hash to include tenant_id (prevents cross-tenant collisions).
Now that GroupUser has tenant_id (Line 246), keeping Eq/Hash based only on userid risks treating two different-tenant users as the same set element.

Proposed diff
 impl PartialEq for GroupUser {
     fn eq(&self, other: &Self) -> bool {
-        self.userid == other.userid
+        self.userid == other.userid && self.tenant_id == other.tenant_id
     }
 }
 impl Eq for GroupUser {}
 impl std::hash::Hash for GroupUser {
     fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
-        self.userid.hash(state)
+        self.userid.hash(state);
+        self.tenant_id.hash(state)
     }
 }
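
To see why the tenant must participate in both `Eq` and `Hash`, here is a self-contained sketch. The `GroupUser` below is a simplified stand-in with only two fields (the real struct has more), and `distinct_members` is a hypothetical helper added for illustration.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Simplified stand-in for `GroupUser`; only the fields relevant here.
#[derive(Debug, Clone)]
struct GroupUser {
    userid: String,
    tenant_id: Option<String>,
}

impl PartialEq for GroupUser {
    fn eq(&self, other: &Self) -> bool {
        // Two users are the same only if both ID and tenant match.
        self.userid == other.userid && self.tenant_id == other.tenant_id
    }
}
impl Eq for GroupUser {}

impl Hash for GroupUser {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash exactly the fields compared in `eq` so Eq and Hash agree.
        self.userid.hash(state);
        self.tenant_id.hash(state);
    }
}

/// Count distinct set members (hypothetical helper).
fn distinct_members(users: Vec<GroupUser>) -> usize {
    users.into_iter().collect::<HashSet<_>>().len()
}
```

With this definition, two users named `alice` in different tenants occupy two slots in a `HashSet`, while an exact duplicate is still collapsed.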
src/parseable/streams.rs (1)

117-137: Update 14 test call sites for the new Stream::new(..., tenant_id) signature.

Stream::new now requires a 5th parameter tenant_id: &Option<String> (line 123), but 14 test calls in the file are still using the old 4-parameter signature. These need to be updated to pass the missing parameter:

  • Lines 1202, 1220, 1238, 1256, 1277, 1300, 1334, 1363, 1420, 1469, 1519 (multi-line calls with 4 parameters)
  • Lines 1446, 1496, 1550 (single-line calls with 4 parameters)

Production code at line 1072 already uses the correct signature. Test calls should add &None as the 5th argument.

src/handlers/http/query.rs (1)

112-162: Validate tenant headers before using them as DataFusion schema identifiers.

The tenant header is extracted via get_tenant_id_from_request() (which calls .to_str().unwrap() on the raw header value) and then passed directly to default_schema on lines 121–125 without any format validation or normalization. Since tenant IDs are used as SQL schema identifiers and as partitioning keys throughout the system, an attacker-controlled tenant header could contain characters that violate SQL identifier rules (e.g., spaces, special characters) or cause tenant-isolation issues.

This pattern is also repeated in src/storage/field_stats.rs:108. Add explicit validation (e.g., alphanumeric + underscore only, length limits) and normalize tenant IDs before using them as schema names or auth scope. Ideally, also verify that the tenant exists in your system.
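
One possible shape for the validation suggested above, as a sketch only. The length limit, the function names, and the error type are assumptions; the review does not prescribe them, and checking that the tenant actually exists would be a separate step.

```rust
/// Hypothetical upper bound; the review does not specify a limit.
const MAX_TENANT_ID_LEN: usize = 64;

/// Accept only non-empty ASCII alphanumeric/underscore IDs within the limit.
fn is_valid_tenant_id(id: &str) -> bool {
    !id.is_empty()
        && id.len() <= MAX_TENANT_ID_LEN
        && id.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Turn a raw header value into a validated tenant ID.
/// A missing header is allowed (single-tenant mode); a malformed one is an error.
fn parse_tenant_header(raw: Option<&str>) -> Result<Option<String>, String> {
    match raw {
        None => Ok(None),
        Some(v) if is_valid_tenant_id(v) => Ok(Some(v.to_owned())),
        Some(v) => Err(format!("invalid tenant id: {v:?}")),
    }
}
```

Rejecting malformed values at the boundary means downstream code, such as the schema name passed to DataFusion, only ever sees identifiers from a known-safe alphabet.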

src/storage/object_storage.rs (1)

1073-1093: Unify tenant prefix rules across path builders (avoid empty/DEFAULT_TENANT path segments).

schema_path and stream_json_path unconditionally include a tenant element using map_or("", ...), resulting in empty string path components when tenant_id is None. This diverges from stream_relative_path (skips tenant for None/DEFAULT_TENANT), alert_json_path, and mttr_json_path (conditional inclusion). The metastore code explicitly works around this by converting empty tenant strings back to DEFAULT_TENANT when reading—evidence the inconsistency causes actual path mismatches. Standardize all path builders to either skip tenant components when None or uniformly include the tenant value without empty segments.
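
A single shared helper is one way to standardize the rule. The sketch below assumes the "skip the tenant segment for `None` or `DEFAULT_TENANT`" convention used by `stream_relative_path`; the helper name, the constant's value, and the string-based path representation are all illustrative rather than taken from the codebase.

```rust
/// Hypothetical stand-in for the project's `DEFAULT_TENANT` constant.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Build a '/'-joined path, prefixing the tenant only when it is a real,
/// non-default tenant. Never emits an empty leading segment.
fn tenant_scoped_path(tenant_id: &Option<String>, parts: &[&str]) -> String {
    let mut segments: Vec<&str> = Vec::with_capacity(parts.len() + 1);
    match tenant_id.as_deref() {
        Some(t) if !t.is_empty() && t != DEFAULT_TENANT => segments.push(t),
        _ => {} // None, empty, or default tenant: no prefix
    }
    segments.extend_from_slice(parts);
    segments.join("/")
}
```

If every path builder delegated to a helper like this, the metastore would no longer need to map empty tenant strings back to the default tenant when reading.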

🤖 Fix all issues with AI agents
In `@src/handlers/http/query.rs`:
- Around line 126-131: The local binding schema_names is computed from
session_state.catalog_list().catalog("datafusion").unwrap().schema_names() but
never used; remove the unused variable and its computation or, if you intended
to use it later, reference it appropriately—specifically delete the let
schema_names = ... statement (or rename to _schema_names if you want to suppress
the unused warning temporarily) so the build won’t fail under deny(warnings).

In `@src/rbac/user.rs`:
- Around line 374-383: The add_roles method extends self.roles then refreshes
sessions using the caller-provided tenant_id for every user, which can
invalidate the wrong tenant; update the session invalidation loop in
UserGroup::add_roles to call mut_sessions().remove_user(group_user.userid(),
group_user.tenant_id()) (i.e., use each group_user.tenant_id instead of the
function parameter) so session invalidation uses each user’s actual tenant.

In `@src/storage/object_storage.rs`:
- Around line 939-949: Replace noisy per-file warning logs with lower-severity
debug/trace logs: change the tracing::warn! that logs process_parquet_files_path
in process_parquet_files to tracing::debug! (or tracing::trace! if extremely
verbose) and likewise change the tracing::warn!/warn! usage around schema
processing in the same module (the block that iterates schema files ~lines
1055–1069) to tracing::debug!/trace!. Update any similar per-file warn calls
passed into spawn_parquet_upload_task or nearby helpers so only actionable
operator issues remain at warn level; keep error-level logs for actual failures.
♻️ Duplicate comments (12)
src/handlers/http/correlation.rs (1)

45-52: Inconsistent tenant_id extraction creates potential security risk.

The get function extracts tenant_id from an HTTP header via get_tenant_id_from_request, while delete (line 106) extracts it from the authenticated session via get_user_and_tenant_from_request. This inconsistency could allow a user to access correlations from other tenants by manipulating the tenant header.

For consistency and security, use the session-based approach here as well.

src/handlers/http/middleware.rs (2)

167-180: Handle potential panic from HeaderValue::from_str().unwrap().

If tenant_id contains characters not valid in HTTP headers (e.g., control characters), HeaderValue::from_str() will return an error and unwrap() will panic.

♻️ Suggested fix
         let user_and_tenant_id = match get_user_and_tenant_from_request(req.request()) {
             Ok((uid, tid)) => {
-                if tid.is_some() {
-                    req.headers_mut().insert(
-                        HeaderName::from_static("tenant"),
-                        HeaderValue::from_str(&tid.as_ref().unwrap()).unwrap(),
-                    );
-                }
+                if let Some(ref tenant) = tid {
+                    if let Ok(header_val) = HeaderValue::from_str(tenant) {
+                        req.headers_mut().insert(
+                            HeaderName::from_static("tenant"),
+                            header_val,
+                        );
+                    }
+                }
 
                 Ok((uid, tid))
             }
             Err(e) => Err(e),
         };

316-327: Security: Consider stricter handling when tenant doesn't exist.

The check_suspension function returns Authorized when:

  1. No tenant header is present (line 326)
  2. Tenant doesn't exist in TENANT_METADATA (lines 322-324, empty else branch)

This could allow requests to bypass tenant-level controls. Consider whether non-existent tenants should return Unauthorized rather than Authorized.
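
The stricter variant could look roughly like the following. This is a sketch under assumptions: the `Response` enum and the map of tenant metadata are simplified stand-ins for `rbac::Response` and `TENANT_METADATA`, and whether a missing header should stay authorized depends on how single-tenant mode is meant to behave.

```rust
use std::collections::HashMap;

/// Simplified stand-in for `rbac::Response`.
#[derive(Debug, PartialEq)]
enum Response {
    Authorized,
    UnAuthorized,
    Suspended(String),
}

/// `tenants` maps a tenant ID to `Some(reason)` when that tenant is suspended
/// (hypothetical representation of the tenant metadata).
fn check_suspension(tenant: Option<&str>, tenants: &HashMap<String, Option<String>>) -> Response {
    match tenant {
        // No tenant header: treated as single-tenant mode in this sketch.
        None => Response::Authorized,
        Some(id) => match tenants.get(id) {
            // Unknown tenant: reject instead of falling through to Authorized.
            None => Response::UnAuthorized,
            Some(Some(reason)) => Response::Suspended(reason.clone()),
            Some(None) => Response::Authorized,
        },
    }
}
```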

src/utils/mod.rs (1)

84-90: Potential panic on malformed header value.

Using .unwrap() on to_str() can panic if the tenant header contains non-UTF8 bytes.

♻️ Suggested fix
 pub fn get_tenant_id_from_request(req: &HttpRequest) -> Option<String> {
     if let Some(tenant_value) = req.headers().get("tenant") {
-        Some(tenant_value.to_str().unwrap().to_owned())
+        tenant_value.to_str().ok().map(|s| s.to_owned())
     } else {
         None
     }
 }
src/handlers/http/users/dashboards.rs (1)

248-253: Tenant isolation concern in list_tags.

Similar to list_dashboards, this extracts tenant from header rather than authenticated session, which could allow cross-tenant data access.

🔒 Suggested fix
 pub async fn list_tags(req: HttpRequest) -> Result<impl Responder, DashboardError> {
+    let (_, tenant_id) = get_user_and_tenant_from_request(&req)?;
     let tags = DASHBOARDS
-        .list_tags(&get_tenant_id_from_request(&req))
+        .list_tags(&tenant_id)
         .await;
     Ok((web::Json(tags), StatusCode::OK))
 }
src/parseable/streams.rs (2)

1055-1079: Remove/downgrade the new tracing::warn! debug logs (noise + possible secrets).
The warn logs added in get_or_create/contains/flush_and_convert are development-level verbosity and include options / metadata formatting.

Also applies to: 1092-1101, 1163-1169


648-650: Drop tracing::warn!(part_path=?part_path) debug artifact.
This is likely to flood logs during normal parquet conversion.

src/parseable/mod.rs (3)

1066-1085: Make tenant add atomic (avoid TOCTOU between read and write locks).
Current add_tenant does contains() under a read lock then pushes under a write lock (Line 1075-1081).
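
A minimal sketch of the atomic version, assuming the tenant list is a `RwLock<Vec<String>>`. The struct name and error type are hypothetical; the point is that the existence check and the push happen under one write guard.

```rust
use std::sync::RwLock;

/// Hypothetical holder for the in-memory tenant list.
struct Tenants {
    tenants: RwLock<Vec<String>>,
}

impl Tenants {
    /// Check and insert under a single write lock so no other writer can
    /// add the same tenant between the check and the push.
    fn add_tenant(&self, tenant_id: &str) -> Result<(), String> {
        let mut guard = self
            .tenants
            .write()
            .map_err(|e| format!("tenant lock poisoned: {e}"))?;
        if guard.iter().any(|t| t == tenant_id) {
            return Err(format!("tenant {tenant_id} already exists"));
        }
        guard.push(tenant_id.to_owned());
        Ok(())
    }
}
```

The trade-off is that readers are blocked for the duration of the check, which is usually acceptable for an operation as rare as tenant creation.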


1125-1153: delete_tenant must also remove the tenant from self.tenants (in-memory list).
Right now it cleans users/roles and TENANT_METADATA, but list_tenants() can still return the deleted tenant.
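
The missing step might be as small as a `retain` on the in-memory list. The sketch below assumes a `RwLock<Vec<String>>` and uses hypothetical names; cleanup of users, roles, and tenant metadata is left out.

```rust
use std::sync::RwLock;

/// Hypothetical holder for the in-memory tenant list.
struct TenantRegistry {
    tenants: RwLock<Vec<String>>,
}

impl TenantRegistry {
    /// Remove the tenant from the in-memory list; returns whether it was present.
    fn delete_tenant(&self, tenant_id: &str) -> bool {
        let mut guard = self.tenants.write().expect("tenant lock poisoned");
        let before = guard.len();
        guard.retain(|t| t != tenant_id);
        guard.len() != before
    }

    /// Snapshot of the current tenants.
    fn list_tenants(&self) -> Vec<String> {
        self.tenants.read().expect("tenant lock poisoned").clone()
    }
}
```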


1155-1191: load_tenants / list_tenants shouldn’t silently swallow poisoned locks or have empty branches.
load_tenants has an empty else if !is_multi_tenant {} (Line 1177-1178) and returns Ok(None) on write-lock failure (Line 1185-1190); list_tenants similarly returns None on lock failure.

Also applies to: 1193-1201

src/alerts/mod.rs (2)

1331-1339: Silent write drop still possible in update_state().

The write-back block still uses get_mut(tenant) which silently skips the insert if the tenant bucket doesn't exist. This was flagged in a previous review and should use the same entry().or_default() pattern as update().

🐛 Proposed fix
         {
             let mut write_access = self.alerts.write().await;
-
             let tenant = alert.get_tenant_id().as_ref().map_or(DEFAULT_TENANT, |v| v);
-            if let Some(alerts) = write_access.get_mut(tenant) {
-                alerts.insert(*alert.get_id(), alert.clone_box());
-            }
-            // write_access.insert(*alert.get_id(), alert.clone_box());
+            write_access
+                .entry(tenant.to_owned())
+                .or_default()
+                .insert(*alert.get_id(), alert.clone_box());
         }
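
The difference between the two patterns is easy to reproduce with plain `HashMap`s. In this sketch the alert is reduced to a `String` and the ID to a `u64`; both type aliases and function names are illustrative only.

```rust
use std::collections::HashMap;

type AlertId = u64;
/// Tenant name -> (alert ID -> alert), with alerts reduced to strings.
type TenantAlerts = HashMap<String, HashMap<AlertId, String>>;

/// `get_mut` pattern: the write is silently dropped if the bucket is missing.
fn insert_if_bucket_exists(map: &mut TenantAlerts, tenant: &str, id: AlertId, alert: String) {
    if let Some(alerts) = map.get_mut(tenant) {
        alerts.insert(id, alert);
    }
}

/// `entry().or_default()` pattern: the bucket is created on demand.
fn insert_creating_bucket(map: &mut TenantAlerts, tenant: &str, id: AlertId, alert: String) {
    map.entry(tenant.to_owned()).or_default().insert(id, alert);
}
```

Calling the first function on an empty map leaves it empty, whereas the second always stores the alert.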

1377-1381: Silent write drop still possible in update_notification_state().

Same issue as update_state() - uses get_mut(tenant) which silently skips the insert if the tenant bucket doesn't exist.

🐛 Proposed fix
         alert
             .update_notification_state(new_notification_state)
             .await?;
-        if let Some(alerts) = write_access.get_mut(tenant) {
-            alerts.insert(*alert.get_id(), alert.clone_box());
-        }
-        // write_access.insert(*alert.get_id(), alert.clone_box());
+        write_access
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(*alert.get_id(), alert.clone_box());

         Ok(())
🧹 Nitpick comments (9)
src/handlers/http/middleware.rs (1)

329-337: Consider simplifying the suspension check pattern.

The match statement can be simplified using if let for better readability:

♻️ Suggested refactor
 pub fn auth_no_context(req: &mut ServiceRequest, action: Action) -> Result<rbac::Response, Error> {
     // check if tenant is suspended
-    match check_suspension(req.request(), action) {
-        rbac::Response::Suspended(msg) => return Ok(rbac::Response::Suspended(msg)),
-        _ => {}
+    if let rbac::Response::Suspended(msg) = check_suspension(req.request(), action) {
+        return Ok(rbac::Response::Suspended(msg));
     }
     let creds = extract_session_key(req);
     creds.map(|key| Users.authorize(key, action, None, None))
 }

The same pattern applies to auth_resource_context (lines 343-347) and auth_user_context (lines 370-374).

src/handlers/http/query.rs (1)

423-473: Consider propagating distributed stream-load failures instead of always returning Ok(()).

create_streams_for_distributed logs failures (Line 458-463) but the caller proceeds; that can turn “stream couldn’t be loaded” into opaque execution errors later. If callers require these streams to exist, consider collecting task results and returning an error if any load fails.
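
As a rough illustration of "collect task results and fail if any load fails", here is a thread-based analogue using only the standard library. The real function is async, so this is not a drop-in replacement; `load_streams` and `demo_loader` are hypothetical names.

```rust
use std::thread;

/// Hypothetical loader used for demonstration: names starting with "bad" fail.
fn demo_loader(name: &str) -> Result<(), String> {
    if name.starts_with("bad") {
        Err(format!("failed to load {name}"))
    } else {
        Ok(())
    }
}

/// Run one load task per stream, wait for all of them, and return every
/// failure instead of only logging it.
fn load_streams(
    streams: &[String],
    load: fn(&str) -> Result<(), String>,
) -> Result<(), Vec<String>> {
    let results: Vec<Result<(), String>> = thread::scope(|s| {
        // Spawn all tasks first so they run concurrently.
        let handles: Vec<_> = streams
            .iter()
            .map(|name| s.spawn(move || load(name)))
            .collect();
        // Join in order; a panicked task is reported as a failure.
        handles
            .into_iter()
            .map(|h| h.join().unwrap_or_else(|_| Err("load task panicked".to_string())))
            .collect()
    });
    let failures: Vec<String> = results.into_iter().filter_map(Result::err).collect();
    if failures.is_empty() { Ok(()) } else { Err(failures) }
}
```

Returning the full list of failures lets the caller decide whether a partial load is acceptable or the query should be rejected up front.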

src/parseable/mod.rs (1)

252-267: Downgrade/remove tracing::warn! debug logs that can flood production and expose internals.
Examples include check_or_load_stream (Line 257/261), create_stream_and_schema_from_storage (Line 385/392/393/460), and create_stream (Line 821/827-831).

Also applies to: 376-465, 645-871

src/alerts/mod.rs (6)

1045-1050: Consider simplifying tenant Option construction.

The current logic converts empty string to &None and non-empty to &Some(tenant_id.clone()). This works but could be cleaner.

♻️ Suggested simplification
-        for (tenant_id, raw_bytes) in raw_objects {
-            let tenant = if tenant_id.is_empty() {
-                &None
-            } else {
-                &Some(tenant_id.clone())
-            };
+        for (tenant_id, raw_bytes) in raw_objects {
+            let tenant: Option<String> = (!tenant_id.is_empty()).then(|| tenant_id.clone());

Then use &tenant where needed. This avoids the awkward reference-to-temporary pattern.
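
The `bool::then` idiom from the suggestion, isolated into a small function for clarity (the function name is hypothetical):

```rust
/// Map an empty tenant key to `None` and a non-empty one to `Some(owned)`.
fn tenant_from_key(tenant_id: &str) -> Option<String> {
    (!tenant_id.is_empty()).then(|| tenant_id.to_owned())
}
```

Because the result is an owned `Option<String>`, callers can borrow it as `&tenant` for as long as needed.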


1165-1169: Remove commented-out code.

This commented-out code block appears to be leftover from refactoring. It should be removed to maintain code clarity.

♻️ Proposed cleanup
             } else {
                 vec![]
             }
-            // alerts_guard
-            //     .values()
-            //     .map(|alert| alert.to_alert_config())
-            //     .collect()
         };

1395-1399: Remove commented-out code.

♻️ Proposed cleanup
         } else {
             warn!("Alert ID- {alert_id} not found in memory!");
         }
-        // if self.alerts.write().await.remove(&alert_id).is_some() {
-        //     trace!("removed alert from memory");
-        // } else {
-        //     warn!("Alert ID- {alert_id} not found in memory!");
-        // }
         Ok(())

1443-1458: Remove commented-out code and approve tenant-scoped logic.

The tenant-scoped tag listing is correct. Remove the commented line for cleanliness.

♻️ Proposed cleanup
     async fn list_tags(&self, tenant_id: &Option<String>) -> Vec<String> {
         let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
-        // let alerts = self.alerts.read().await;
         let mut tags = if let Some(alerts) = self.alerts.read().await.get(tenant) {

1470-1472: Remove commented-out code.

♻️ Proposed cleanup
         } else {
             HashMap::new()
         }
-        // let alerts = self.alerts.read().await;
-        // alerts.iter().map(|(k, v)| (*k, v.clone_box())).collect()
     }

1282-1298: Remove large commented-out code block.

This entire block of commented-out code should be removed. It shows the old implementation that has been replaced by the tenant-aware version above.

♻️ Proposed cleanup
             } else {
                 return Err(AlertError::CustomError(format!(
                     "No alert found for the given ID- {alert_id}"
                 )));
             };
-            // let alert = if let Some(alert) = read_access.get(&alert_id) {
-            //     match &alert.get_alert_type() {
-            //         AlertType::Threshold => Box::new(ThresholdAlert::from(alert.to_alert_config()))
-            //             as Box<dyn AlertTrait>,
-            //         AlertType::Anomaly(_) => {
-            //             return Err(AlertError::NotPresentInOSS("anomaly"));
-            //         }
-            //         AlertType::Forecast(_) => {
-            //             return Err(AlertError::NotPresentInOSS("forecast"));
-            //         }
-            //     }
-            // } else {
-            //     return Err(AlertError::CustomError(format!(
-            //         "No alert found for the given ID- {alert_id}"
-            //     )));
-            // };

             let current_state = *alert.get_state();
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between be1a48b and ca421da.

📒 Files selected for processing (11)
  • src/alerts/mod.rs
  • src/handlers/http/correlation.rs
  • src/handlers/http/middleware.rs
  • src/handlers/http/query.rs
  • src/handlers/http/users/dashboards.rs
  • src/handlers/http/users/filters.rs
  • src/parseable/mod.rs
  • src/parseable/streams.rs
  • src/rbac/user.rs
  • src/storage/object_storage.rs
  • src/utils/mod.rs
🧰 Additional context used
🧠 Learnings (30)
📚 Learning: 2025-05-01T10:27:56.858Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1305
File: src/handlers/http/users/dashboards.rs:0-0
Timestamp: 2025-05-01T10:27:56.858Z
Learning: The `add_tile()` function in `src/handlers/http/users/dashboards.rs` should use `get_dashboard_by_user(dashboard_id, &user_id)` instead of `get_dashboard(dashboard_id)` to ensure proper authorization checks when modifying a dashboard.

Applied to files:

  • src/handlers/http/correlation.rs
  • src/handlers/http/middleware.rs
  • src/handlers/http/users/filters.rs
  • src/handlers/http/users/dashboards.rs
📚 Learning: 2025-09-05T09:27:12.659Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1424
File: src/users/filters.rs:116-121
Timestamp: 2025-09-05T09:27:12.659Z
Learning: The Filters::load() function in src/users/filters.rs is only called once at server initialization, so there's no risk of duplicate entries from repeated invocations.

Applied to files:

  • src/handlers/http/users/filters.rs
📚 Learning: 2025-02-14T09:49:25.818Z
Learnt from: de-sh
Repo: parseablehq/parseable PR: 1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.

Applied to files:

  • src/utils/mod.rs
  • src/handlers/http/query.rs
  • src/parseable/streams.rs
  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-06-18T06:39:04.775Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1340
File: src/query/mod.rs:64-66
Timestamp: 2025-06-18T06:39:04.775Z
Learning: In src/query/mod.rs, QUERY_SESSION_STATE and QUERY_SESSION serve different architectural purposes: QUERY_SESSION_STATE is used for stats calculation and allows dynamic registration of individual parquet files from the staging path (files created every minute), while QUERY_SESSION is used for object store queries with the global schema provider. Session contexts with schema providers don't support registering individual tables/parquets, so both session objects are necessary for their respective use cases.

Applied to files:

  • src/handlers/http/query.rs
  • src/alerts/mod.rs
📚 Learning: 2025-06-18T12:44:31.983Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1347
File: src/handlers/http/query.rs:0-0
Timestamp: 2025-06-18T12:44:31.983Z
Learning: The counts API in src/handlers/http/query.rs does not currently support group_by functionality in COUNT queries, so the hard-coded fields array ["start_time", "end_time", "count"] is appropriate for the current scope.

Applied to files:

  • src/handlers/http/query.rs
📚 Learning: 2025-10-28T02:10:41.140Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1453
File: src/parseable/mod.rs:397-400
Timestamp: 2025-10-28T02:10:41.140Z
Learning: In Parseable enterprise deployments with multiple query nodes, hot tier configuration must be persisted in object storage so that newly started query nodes can fetch and synchronize the hot tier settings at startup (file: src/parseable/mod.rs, function: create_stream_and_schema_from_storage).

Applied to files:

  • src/handlers/http/query.rs
  • src/parseable/streams.rs
  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-10-21T02:22:24.403Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/parseable/mod.rs:419-432
Timestamp: 2025-10-21T02:22:24.403Z
Learning: In Parseable's internal stream creation (`create_internal_stream_if_not_exists` in `src/parseable/mod.rs`), errors should not propagate to fail server initialization. The function creates both pmeta and pbilling internal streams, and failures are logged but the function always returns `Ok(())` to ensure server startup resilience. Individual stream creation failures should not prevent syncing of successfully created streams.

Applied to files:

  • src/handlers/http/query.rs
  • src/parseable/streams.rs
  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-05-01T12:22:42.363Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1305
File: src/users/dashboards.rs:154-165
Timestamp: 2025-05-01T12:22:42.363Z
Learning: Title validation for dashboards is performed in the `create_dashboard` HTTP handler function rather than in the `DASHBOARDS.create` method, avoiding redundant validation.

Applied to files:

  • src/handlers/http/users/dashboards.rs
📚 Learning: 2025-05-01T10:33:51.767Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1305
File: src/handlers/http/users/dashboards.rs:125-148
Timestamp: 2025-05-01T10:33:51.767Z
Learning: When adding a tile to a dashboard in `add_tile()` function, the tile ID must be provided by the client and should not be generated by the server. If the tile ID is missing (nil), the API should fail the operation with an appropriate error message.

Applied to files:

  • src/handlers/http/users/dashboards.rs
📚 Learning: 2025-03-26T06:44:53.362Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1263
File: src/handlers/http/ingest.rs:300-310
Timestamp: 2025-03-26T06:44:53.362Z
Learning: In Parseable, every stream is always associated with a log_source - no stream can exist without a log_source. For otel-traces and otel-metrics, strict restrictions are implemented where ingestion is rejected if a stream already has a different log_source format. However, regular logs from multiple log_sources can coexist in a single stream.

Applied to files:

  • src/parseable/streams.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-08-25T01:31:41.786Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metadata.rs:63-68
Timestamp: 2025-08-25T01:31:41.786Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metadata.rs and src/storage/object_storage.rs are designed to track total events across all streams, not per-stream. They use labels [origin, parsed_date] to aggregate by format and date, while per-stream metrics use [stream_name, origin, parsed_date] labels.

Applied to files:

  • src/parseable/streams.rs
  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-09-18T09:59:20.177Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:700-756
Timestamp: 2025-09-18T09:59:20.177Z
Learning: In src/event/mod.rs, the parsed_timestamp used in increment_events_ingested_by_date() is correctly UTC-normalized: for dynamic streams it remains Utc::now(), and for streams with time partition enabled it uses the time partition value. Both cases result in proper UTC date strings for metrics labeling, preventing double-counting issues.

Applied to files:

  • src/parseable/streams.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-07-28T17:10:39.448Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1392
File: src/migration/stream_metadata_migration.rs:303-322
Timestamp: 2025-07-28T17:10:39.448Z
Learning: In Parseable's migration system (src/migration/stream_metadata_migration.rs), each migration function updates the metadata to the current latest format using CURRENT_OBJECT_STORE_VERSION and CURRENT_SCHEMA_VERSION constants, rather than producing incremental versions. For example, v5_v6 function produces v7 format output when these constants are set to "v7", not v6 format.

Applied to files:

  • src/parseable/streams.rs
📚 Learning: 2025-08-25T01:32:25.980Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:163-173
Timestamp: 2025-08-25T01:32:25.980Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metrics/mod.rs are intentionally designed to track global totals across all streams for a given date, using labels ["format", "date"] rather than per-stream labels. This is the correct design for global aggregation purposes.

Applied to files:

  • src/parseable/streams.rs
  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-09-09T14:08:45.809Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1427
File: resources/ingest_demo_data.sh:440-440
Timestamp: 2025-09-09T14:08:45.809Z
Learning: In the resources/ingest_demo_data.sh demo script, hardcoded stream names like "demodata" in alert queries should be ignored and not flagged for replacement with $P_STREAM variables.

Applied to files:

  • src/parseable/streams.rs
📚 Learning: 2025-10-20T17:48:53.444Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/handlers/http/cluster/mod.rs:1370-1400
Timestamp: 2025-10-20T17:48:53.444Z
Learning: In src/handlers/http/cluster/mod.rs, the billing metrics processing logic should NOT accumulate counter values from multiple Prometheus samples with the same labels. The intended behavior is to convert each received counter from nodes into individual events for ingestion, using `.insert()` to store the counter value directly.

Applied to files:

  • src/parseable/streams.rs
📚 Learning: 2025-09-18T09:52:07.554Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/storage/object_storage.rs:173-177
Timestamp: 2025-09-18T09:52:07.554Z
Learning: In Parseable's upload system (src/storage/object_storage.rs), the update_storage_metrics function can safely use path.metadata().map_err() to fail on local file metadata read failures because parquet validation (validate_uploaded_parquet_file) ensures file integrity before this step, and the system guarantees local staging files remain accessible throughout the upload flow.

Applied to files:

  • src/parseable/streams.rs
  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-08-18T12:37:47.732Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1405
File: src/parseable/mod.rs:528-533
Timestamp: 2025-08-18T12:37:47.732Z
Learning: In Parseable, the validate_time_partition function in src/utils/json/flatten.rs already provides a default time partition limit of 30 days using `map_or(30, |days| days.get() as i64)` when time_partition_limit is None, so no additional defaulting is needed in the stream creation logic in src/parseable/mod.rs.

Applied to files:

  • src/parseable/streams.rs
  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-09-06T04:26:17.191Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1424
File: src/enterprise/utils.rs:65-72
Timestamp: 2025-09-06T04:26:17.191Z
Learning: In Parseable's metastore implementation, MetastoreError::to_detail() returns a MetastoreErrorDetail struct (not a string), which contains structured error information including operation, message, stream_name, and other contextual fields. This struct is designed to be boxed in ObjectStorageError::MetastoreError(Box<MetastoreErrorDetail>).

Applied to files:

  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-08-18T14:56:18.463Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1405
File: src/storage/object_storage.rs:997-1040
Timestamp: 2025-08-18T14:56:18.463Z
Learning: In Parseable's staging upload system (src/storage/object_storage.rs), failed parquet file uploads should remain in the staging directory for retry in the next sync cycle, while successful uploads remove their staged files immediately. Early return on first error in collect_upload_results is correct behavior as concurrent tasks handle their own cleanup and failed files need to stay for retry.

Applied to files:

  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-03-26T06:13:48.898Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1271
File: src/prism/home/mod.rs:207-224
Timestamp: 2025-03-26T06:13:48.898Z
Learning: In the Parseable codebase, if a stream is found, the stream_jsons array will always have at least one element. Additionally, for any valid stream_json object, the log_source array will always have at least one element. This is a design invariant that makes additional null checks unnecessary.

Applied to files:

  • src/parseable/mod.rs
📚 Learning: 2025-08-18T19:10:11.941Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1405
File: src/handlers/http/ingest.rs:163-164
Timestamp: 2025-08-18T19:10:11.941Z
Learning: Field statistics calculation in src/storage/field_stats.rs uses None for the time_partition parameter when calling flatten_and_push_logs(), as field stats generation does not require time partition functionality.

Applied to files:

  • src/parseable/mod.rs
  • src/storage/object_storage.rs
📚 Learning: 2025-08-21T11:47:01.279Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1410
File: src/storage/object_storage.rs:0-0
Timestamp: 2025-08-21T11:47:01.279Z
Learning: In Parseable's object storage implementation (src/storage/object_storage.rs), the hour and minute directory prefixes (hour=XX, minute=YY) are generated from arrow file timestamps following proper datetime conventions, so they are guaranteed to be within valid ranges (0-23 for hours, 0-59 for minutes) and don't require additional range validation.

Applied to files:

  • src/parseable/mod.rs
📚 Learning: 2025-08-21T14:41:55.462Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1410
File: src/storage/object_storage.rs:876-916
Timestamp: 2025-08-21T14:41:55.462Z
Learning: In Parseable's object storage system (src/storage/object_storage.rs), date directories (date=YYYY-MM-DD) are only created when there's actual data to store, which means they will always contain corresponding hour and minute subdirectories. There can be no case where a date directory exists without hour or minute subdirectories.

Applied to files:

  • src/parseable/mod.rs
📚 Learning: 2025-09-14T15:17:59.234Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1432
File: src/storage/object_storage.rs:124-132
Timestamp: 2025-09-14T15:17:59.234Z
Learning: In Parseable's upload validation system (src/storage/object_storage.rs), the validate_uploaded_parquet_file function should not include bounded retries for metadata consistency issues. Instead, failed validations rely on the 30-second sync cycle for natural retries, with staging files preserved when manifest_file is set to None.

Applied to files:

  • src/storage/object_storage.rs
📚 Learning: 2025-08-20T17:01:25.791Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1409
File: src/storage/field_stats.rs:429-456
Timestamp: 2025-08-20T17:01:25.791Z
Learning: In Parseable's field stats calculation (src/storage/field_stats.rs), the extract_datetime_from_parquet_path_regex function correctly works with filename-only parsing because Parseable's server-side filename generation guarantees the dot-separated format date=YYYY-MM-DD.hour=HH.minute=MM pattern in parquet filenames.

Applied to files:

  • src/storage/object_storage.rs
📚 Learning: 2025-06-16T09:50:38.636Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1346
File: src/parseable/streams.rs:319-331
Timestamp: 2025-06-16T09:50:38.636Z
Learning: In Parseable's Ingest or Query mode, the node_id is always available because it's generated during server initialization itself, before the get_node_id_string() function in streams.rs would be called. This makes the .expect() calls on QUERIER_META.get() and INGESTOR_META.get() safe in this context.

Applied to files:

  • src/storage/object_storage.rs
📚 Learning: 2025-08-18T18:01:22.834Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1405
File: src/handlers/http/modal/utils/ingest_utils.rs:271-292
Timestamp: 2025-08-18T18:01:22.834Z
Learning: In Parseable's ingestion validation, validate_stream_for_ingestion is designed to prevent regular log ingestion endpoints (ingest() and post_event()) from ingesting into streams that exclusively contain OTEL traces or metrics. The function allows mixed streams (regular logs + OTEL) but blocks ingestion into OTEL-only streams, maintaining architectural separation between regular log and OTEL ingestion pathways.

Applied to files:

  • src/storage/object_storage.rs
📚 Learning: 2025-07-24T11:09:21.781Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1388
File: src/alerts/mod.rs:88-104
Timestamp: 2025-07-24T11:09:21.781Z
Learning: In the Parseable alert system (src/alerts/mod.rs), alert versions are server-generated and controlled via CURRENT_ALERTS_VERSION constant, not user input. The AlertVerison enum's From<&str> implementation correctly defaults unknown versions to V2 since the server only generates known versions (v1, v2). Unknown versions would only occur in exceptional cases like file corruption, making the current fallback approach appropriate.

Applied to files:

  • src/alerts/mod.rs
📚 Learning: 2025-08-14T10:14:50.453Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1398
File: src/alerts/mod.rs:712-718
Timestamp: 2025-08-14T10:14:50.453Z
Learning: In the alerts module, get_number_of_agg_exprs() function validates that exactly 1 aggregate expression is present in SQL queries before other aggregate-related processing, which prevents empty aggr_expr scenarios in downstream functions like _get_aggregate_projection().

Applied to files:

  • src/alerts/mod.rs
🧬 Code graph analysis (9)
src/handlers/http/correlation.rs (1)
src/utils/mod.rs (4)
  • get_hash (102-107)
  • get_tenant_id_from_request (84-90)
  • get_user_and_tenant_from_request (61-82)
  • user_auth_for_datasets (121-187)
src/handlers/http/middleware.rs (2)
src/rbac/mod.rs (1)
  • roles_to_permission (313-328)
src/utils/mod.rs (1)
  • get_user_and_tenant_from_request (61-82)
src/handlers/http/users/filters.rs (1)
src/utils/mod.rs (3)
  • get_hash (102-107)
  • get_user_and_tenant_from_request (61-82)
  • is_admin (189-206)
src/utils/mod.rs (6)
src/utils/actix.rs (2)
  • req (31-31)
  • extract_session_key_from_req (51-71)
src/handlers/http/rbac.rs (1)
  • Users (74-74)
src/rbac/user.rs (1)
  • username (262-264)
src/handlers/http/cluster/mod.rs (3)
  • stream (1036-1039)
  • stream (1199-1202)
  • stream (1574-1577)
src/cli.rs (1)
  • format (669-670)
src/parseable/mod.rs (1)
  • storage (338-340)
src/handlers/http/query.rs (3)
src/utils/mod.rs (1)
  • get_tenant_id_from_request (84-90)
src/handlers/http/mod.rs (1)
  • fetch_schema (89-94)
src/event/mod.rs (1)
  • commit_schema (136-157)
src/handlers/http/users/dashboards.rs (2)
src/utils/mod.rs (4)
  • get_hash (102-107)
  • get_tenant_id_from_request (84-90)
  • get_user_and_tenant_from_request (61-82)
  • is_admin (189-206)
src/users/dashboards.rs (4)
  • dashboards (435-439)
  • dashboards (462-466)
  • get_dashboard (364-383)
  • list_tags (458-478)
src/parseable/streams.rs (2)
src/parseable/mod.rs (1)
  • new (196-211)
src/rbac/mod.rs (1)
  • contains (163-169)
src/storage/object_storage.rs (3)
src/storage/field_stats.rs (1)
  • calculate_field_stats (88-178)
src/parseable/streams.rs (3)
  • new (118-136)
  • len (1105-1111)
  • v (606-608)
src/catalog/mod.rs (1)
  • update_snapshot (108-130)
src/rbac/user.rs (2)
src/rbac/mod.rs (1)
  • add_roles (142-150)
src/rbac/map.rs (2)
  • roles (80-88)
  • mut_sessions (106-112)
🔇 Additional comments (22)
src/handlers/http/correlation.rs (2)

106-112: LGTM!

The delete handler correctly extracts both user_id and tenant_id from the authenticated session and passes them to CORRELATIONS.delete. This is the correct pattern that should be followed by the other handlers.


73-78: No action needed. The tenant_id is properly handled through the session_key mechanism. The CORRELATIONS.create() method extracts the tenant_id directly from the session_key (via get_tenant_id_from_key()) and uses it to store the correlation in both the metastore and in-memory cache. Discarding the request-level tenant_id is intentional and correct—it ensures the correlation is created under the authenticated user's tenant derived from the session, not from untrusted request data.

Likely an incorrect or invalid review comment.

src/handlers/http/users/filters.rs (4)

49-54: LGTM!

The get handler correctly extracts tenant_id from the authenticated session and passes it to get_filter for proper tenant isolation.


68-75: LGTM!

The post handler correctly propagates tenant_id to both the metastore (put_filter) and the in-memory cache (FILTERS.update).


85-104: LGTM!

The update handler correctly uses tenant_id for both authorization (checking filter ownership) and persistence operations.


113-128: LGTM!

The delete handler correctly implements tenant-aware deletion with proper authorization checks.

src/handlers/http/middleware.rs (2)

200-290: LGTM!

The token refresh logic correctly threads tenant_id through user lookups, permission resolution, and session tracking with appropriate DEFAULT_TENANT fallback.


305-307: LGTM!

Properly maps the Suspended response to ErrorBadRequest with the suspension message.

src/utils/mod.rs (4)

61-82: LGTM!

The get_user_and_tenant_from_request function correctly handles both BasicAuth and SessionId authentication methods, and properly normalizes DEFAULT_TENANT to None for consistent downstream handling.
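
The normalization described above can be sketched as follows. This is a minimal illustration, not the PR's code: the value of `DEFAULT_TENANT` and the helper name `normalize_tenant` are assumptions made for the example.

```rust
/// Hypothetical stand-in for the server's default-tenant marker;
/// the real constant's value is not shown in this review.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Hypothetical helper: map the default tenant to `None` so that
/// "no tenant" and "default tenant" take the same downstream path.
fn normalize_tenant(raw: &str) -> Option<String> {
    if raw == DEFAULT_TENANT {
        None
    } else {
        Some(raw.to_owned())
    }
}
```

Normalizing once at the boundary means later code only has to match on `Option<String>` instead of comparing against the marker string in several places.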


92-100: LGTM!

The get_tenant_id_from_key function correctly extracts tenant context from a SessionKey with consistent DEFAULT_TENANT normalization.


109-119: LGTM!

The user_auth_for_query function correctly derives tenant_id from the session key and propagates it to dataset authorization.


121-157: LGTM!

The user_auth_for_datasets function correctly accepts tenant_id and uses it for tenant-scoped stream existence checks and metadata lookups.

src/handlers/http/users/dashboards.rs (4)

99-104: LGTM!

The create_dashboard handler correctly uses session-based tenant extraction and propagates tenant_id to the create operation.


113-191: LGTM!

The update_dashboard handler correctly uses session-based tenant extraction and propagates tenant_id through authorization and update operations.


201-208: LGTM!

The delete_dashboard handler correctly uses session-based tenant extraction.


223-242: LGTM!

The add_tile handler correctly uses session-based tenant extraction and get_dashboard_by_user for proper authorization checks. Based on learnings, this aligns with the expected pattern.

src/alerts/mod.rs (6)

107-150: LGTM!

The migrate_from_v1 function correctly propagates tenant_id through the migration flow - passing it to build_query_from_v1, storing it in the migrated alert config, and using it when persisting via put_alert.


187-207: LGTM!

Tenant context is correctly threaded through build_query_from_v1 → add_where_conditions → fetch_schema, ensuring schema lookups during v1 migration are tenant-aware.

Also applies to: 274-301


620-624: LGTM!

Target lookup during notification dispatch correctly uses the alert's tenant_id.


1224-1240: LGTM!

The get_alert_by_id method correctly handles tenant-scoped lookups with appropriate error handling when the alert is not found.


1242-1251: Previous issue addressed.

The update() method now correctly uses entry(tenant.to_owned()).or_default() to ensure the tenant bucket exists before insertion, preventing silent write drops for first-time tenants.
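
To illustrate why `entry(...).or_default()` matters here, the sketch below contrasts it with a `get_mut`-based write on a nested per-tenant map. The `AlertStore` type and its fields are hypothetical simplifications, not the types in src/alerts/mod.rs.

```rust
use std::collections::HashMap;

/// Hypothetical simplified store: tenant -> (alert id -> alert name).
#[derive(Default)]
struct AlertStore {
    by_tenant: HashMap<String, HashMap<u64, String>>,
}

impl AlertStore {
    /// Silently drops the write when the tenant bucket does not exist yet.
    fn update_lossy(&mut self, tenant: &str, id: u64, name: &str) {
        if let Some(bucket) = self.by_tenant.get_mut(tenant) {
            bucket.insert(id, name.to_owned());
        }
    }

    /// Creates the tenant bucket on demand, so first-time tenants are kept.
    fn update(&mut self, tenant: &str, id: u64, name: &str) {
        self.by_tenant
            .entry(tenant.to_owned())
            .or_default()
            .insert(id, name.to_owned());
    }

    /// Number of alerts stored for a tenant (0 if the tenant is unknown).
    fn count(&self, tenant: &str) -> usize {
        self.by_tenant.get(tenant).map_or(0, |bucket| bucket.len())
    }
}
```

With an empty store, `update_lossy("acme", ...)` leaves the count at 0, while `update("acme", ...)` creates the bucket and stores the alert.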


1404-1420: LGTM!

The get_state() method correctly handles tenant-scoped lookups with appropriate error messaging.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +126 to +131
let schema_names = session_state
.catalog_list()
.catalog("datafusion")
.unwrap()
.schema_names();


⚠️ Potential issue | 🟡 Minor

Remove unused schema_names (can fail builds with deny(warnings)).
Lines 126-131 compute schema_names and never use it.

Proposed diff
-    let schema_names = session_state
-        .catalog_list()
-        .catalog("datafusion")
-        .unwrap()
-        .schema_names();
-    
🤖 Prompt for AI Agents
In `@src/handlers/http/query.rs` around lines 126 - 131, The local binding
schema_names is computed from
session_state.catalog_list().catalog("datafusion").unwrap().schema_names() but
never used; remove the unused variable and its computation or, if you intended
to use it later, reference it appropriately—specifically delete the let
schema_names = ... statement (or rename to _schema_names if you want to suppress
the unused warning temporarily) so the build won’t fail under deny(warnings).

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (18)
src/handlers/http/users/dashboards.rs (1)

36-69: Derive tenant from authenticated session, not raw header.

Using the tenant header here lets a client spoof tenancy, which can leak dashboards across tenants. The codebase already establishes the secure pattern in create_dashboard (line 99); apply the same approach here.

🔧 Suggested fix
-    let tenant_id = get_tenant_id_from_request(&req);
+    let (_user_id, tenant_id) = get_user_and_tenant_from_request(&req)?;

Note: get_dashboard (line 83) and other functions in this file also use the insecure get_tenant_id_from_request and should be updated similarly.

src/handlers/http/demo_data.rs (1)

19-24: Avoid panic on malformed tenant header.

get_tenant_id_from_request unwraps header parsing; a non‑UTF8 tenant header will panic and can take down the worker. Please make header parsing fallible and treat invalid values as None or a 400. One option is to make this call site safe (or update the helper similarly).

🛠️ Suggested fix (safe extraction)
 use crate::{
     handlers::http::{cluster::get_demo_data_from_ingestor, ingest::PostError},
     option::Mode,
     parseable::PARSEABLE,
-    utils::get_tenant_id_from_request,
 };
@@
-    let tenant_id = get_tenant_id_from_request(&req);
+    let tenant_id = req
+        .headers()
+        .get("tenant")
+        .and_then(|value| value.to_str().ok())
+        .map(|value| value.to_owned());

Also applies to: 52-52
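
The same idea can be shown without the web framework. The sketch below is a standard-library-only illustration in which the header value is modeled as raw bytes; `tenant_from_header` is a hypothetical helper, not a function in the codebase.

```rust
/// Hypothetical helper: turn an optional raw header value into a tenant id.
/// Invalid UTF-8 yields `None` instead of panicking.
fn tenant_from_header(raw: Option<&[u8]>) -> Option<String> {
    raw.and_then(|bytes| std::str::from_utf8(bytes).ok())
        .map(str::to_owned)
}
```

Whether an invalid header should map to `None` or to a 400 response is a policy choice; this sketch takes the `None` route, matching the suggested fix above.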

src/handlers/http/modal/query/querier_logstream.rs (1)

93-110: Add tenant header to stream delete request sent to ingestors.

send_stream_delete_request only sets CONTENT_TYPE and AUTHORIZATION headers. The ingestor delete handler relies on the tenant header (extracted via get_tenant_id_from_request) to identify which tenant's stream to delete. Without it, non-default tenant deletes will operate on the default tenant or fail silently. Pass tenant_id to send_stream_delete_request and have it set the tenant header, similar to how sync_streams_with_ingestors propagates request headers.

src/rbac/role.rs (1)

191-233: Potential breaking change in DefaultPrivilege JSON shape.

Switching Writer/Reader/Ingestor to struct variants changes serialized field names (e.g., "0" → "resource"). This can break existing clients of /user/roles. Consider adding serde aliases for backward compatibility or clearly versioning/migrating the API.

🛠️ Backward-compatibility shim (serde alias)
-        Writer {
-            resource: ParseableResourceType,
-        },
+        Writer {
+            #[serde(alias = "0")]
+            resource: ParseableResourceType,
+        },
-        Ingestor {
-            resource: Option<ParseableResourceType>,
-        },
+        Ingestor {
+            #[serde(alias = "0")]
+            resource: Option<ParseableResourceType>,
+        },
-        Reader {
-            resource: ParseableResourceType,
-        },
+        Reader {
+            #[serde(alias = "0")]
+            resource: ParseableResourceType,
+        },
src/hottier.rs (2)

208-218: delete_hot_tier ignores tenant_id in the path.

The delete path is still hot_tier_path/stream, which will miss tenant-scoped directories (or delete the wrong tenant’s data).

🐛 Proposed fix
-        let path = self.hot_tier_path.join(stream);
+        let path = if let Some(tenant_id) = tenant_id.as_ref() {
+            self.hot_tier_path.join(tenant_id).join(stream)
+        } else {
+            self.hot_tier_path.join(stream)
+        };

410-466: Local hot-tier paths are still tenant-agnostic.

parquet_path, manifest paths, and get_stream_path_for_date remain tenant-agnostic, so hot-tier files for different tenants will collide if stream names overlap. Please thread tenant_id into the local-path helpers (e.g., get_stream_path_for_date, fetch_hot_tier_dates, manifest path construction) and include the tenant segment in the local hot-tier directory.

🐛 Directional fix (call site)
-        let path = self.get_stream_path_for_date(stream, &date);
+        let path = self.get_stream_path_for_date(stream, &date, tenant_id);
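
A minimal sketch of a tenant-scoped local path helper follows, assuming a layout of `<root>/<tenant>/<stream>` for non-default tenants and `<root>/<stream>` otherwise. The function name `tenant_stream_path` is hypothetical; the real helpers in src/hottier.rs take more parameters.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: build a stream directory that includes the tenant
/// segment when a tenant is present, so equal stream names do not collide.
fn tenant_stream_path(root: &Path, tenant_id: &Option<String>, stream: &str) -> PathBuf {
    match tenant_id {
        Some(tenant) => root.join(tenant).join(stream),
        None => root.join(stream),
    }
}
```

Routing every local path (delete, manifest, per-date directories) through one helper like this keeps the tenant segment consistent across call sites.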
src/storage/store_metadata.rs (2)

124-144: Staging metadata read is still global despite tenant-scoped writes.
resolve_parseable_metadata now receives tenant_id (Line 126) but get_staging_metadata() still reads the global .parseable.json. For non-default tenants this can mis-detect environment changes and overwrite tenant metadata. Consider making get_staging_metadata tenant-aware and wiring the parameter through.

🛠️ Proposed fix: read staging metadata per tenant
 pub async fn resolve_parseable_metadata(
     parseable_metadata: &Option<Bytes>,
     tenant_id: &Option<String>,
 ) -> Result<StorageMetadata, ObjectStorageError> {
-    let staging_metadata = get_staging_metadata()?;
+    let staging_metadata = get_staging_metadata(tenant_id)?;
     let remote_metadata = parseable_metadata
         .as_ref()
         .map(|meta| serde_json::from_slice(meta).expect("parseable config is valid json"));
@@
-pub fn get_staging_metadata() -> io::Result<Option<StorageMetadata>> {
-    let path = RelativePathBuf::from(PARSEABLE_METADATA_FILE_NAME)
-        .to_path(PARSEABLE.options.staging_dir());
+pub fn get_staging_metadata(tenant_id: &Option<String>) -> io::Result<Option<StorageMetadata>> {
+    let path = if let Some(tenant_id) = tenant_id.as_ref() {
+        PARSEABLE
+            .options
+            .staging_dir()
+            .join(tenant_id)
+            .join(PARSEABLE_METADATA_FILE_NAME)
+    } else {
+        PARSEABLE
+            .options
+            .staging_dir()
+            .join(PARSEABLE_METADATA_FILE_NAME)
+    };

307-327: Ensure tenant staging directory exists before writing.
When tenant_id is Some (Line 311), the path includes a tenant subdirectory but no create_dir_all, so writes will fail if the directory isn’t pre-created.

🛠️ Proposed fix: create tenant staging directory
-    let path = if let Some(tenant_id) = tenant_id.as_ref() {
-        PARSEABLE
-            .options
-            .staging_dir()
-            .join(tenant_id)
-            .join(PARSEABLE_METADATA_FILE_NAME)
-    } else {
-        PARSEABLE
-            .options
-            .staging_dir()
-            .join(PARSEABLE_METADATA_FILE_NAME)
-    };
+    let path = if let Some(tenant_id) = tenant_id.as_ref() {
+        let tenant_dir = PARSEABLE.options.staging_dir().join(tenant_id);
+        create_dir_all(&tenant_dir)?;
+        tenant_dir.join(PARSEABLE_METADATA_FILE_NAME)
+    } else {
+        PARSEABLE
+            .options
+            .staging_dir()
+            .join(PARSEABLE_METADATA_FILE_NAME)
+    };
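
The create-then-write order can be sketched with the standard library alone. The function `write_staging_metadata` and the `METADATA_FILE` constant are assumptions for illustration; the real code uses `PARSEABLE.options.staging_dir()` and `PARSEABLE_METADATA_FILE_NAME`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for PARSEABLE_METADATA_FILE_NAME.
const METADATA_FILE: &str = ".parseable.json";

/// Hypothetical helper: write metadata under the tenant's staging directory,
/// creating that directory first so the write cannot fail on a missing parent.
fn write_staging_metadata(
    staging_dir: &Path,
    tenant_id: &Option<String>,
    contents: &[u8],
) -> io::Result<PathBuf> {
    let dir = match tenant_id {
        Some(tenant) => staging_dir.join(tenant),
        None => staging_dir.to_path_buf(),
    };
    // No-op if the directory already exists.
    fs::create_dir_all(&dir)?;
    let path = dir.join(METADATA_FILE);
    fs::write(&path, contents)?;
    Ok(path)
}
```

Because `create_dir_all` succeeds when the directory is already present, calling it on every write is safe and avoids a separate existence check.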
src/migration/mod.rs (1)

50-155: Per-tenant staging metadata isn’t loaded yet.
put_staging_metadata now writes under staging_dir/<tenant>/…, but run_metadata_migration still calls get_staging_metadata without tenant context, so tenant-specific staging metadata won’t be migrated. Consider making get_staging_metadata tenant-aware and passing tenant_id through.

🔧 Suggested call-site adjustment
-    let staging_metadata = get_staging_metadata(config)?;
+    let staging_metadata = get_staging_metadata(config, tenant_id)?;
src/parseable/streams.rs (1)

117-127: Update tests/callers for new Stream::new signature.
Stream::new now requires tenant_id; tests in this file still call with four args and will fail to compile. Also note that staging metrics now require a tenant_id label in assertions.

src/storage/gcs.rs (1)

179-438: Record object‑store metrics only after successful operations.
In _get_object, _put_object, _delete_prefix, _list_dates, _upload_file, and _upload_multipart, counters are incremented before confirming success, which can over‑count failures.

♻️ Example fix (apply pattern to similar helpers)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
-        match resp {
-            Ok(resp) => {
+        let resp = self.client.get(&to_object_store_path(path)).await;
+        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
+        match resp {
+            Ok(resp) => {
+                increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
                 let body: Bytes = resp.bytes().await?;
                 increment_files_scanned_in_object_store_calls_by_date(
                     "GET",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
Based on learnings, increment_object_store_calls_by_date should only run after successful operations to keep metrics accurate.
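The success-only pattern can be illustrated without the object-store client. This is a sketch under assumptions: `CallCounter` is a hypothetical in-memory stand-in for the per-tenant metric, and `record_on_success` is an illustrative wrapper, not a function in the PR.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory counter standing in for the per-tenant call metric.
#[derive(Default)]
struct CallCounter {
    calls: HashMap<(String, String), u64>,
}

impl CallCounter {
    fn increment(&mut self, method: &str, tenant: &str) {
        *self
            .calls
            .entry((method.to_string(), tenant.to_string()))
            .or_insert(0) += 1;
    }

    fn get(&self, method: &str, tenant: &str) -> u64 {
        self.calls
            .get(&(method.to_string(), tenant.to_string()))
            .copied()
            .unwrap_or(0)
    }
}

/// Run `op` and record the call only if it succeeded.
fn record_on_success<T, E>(
    counter: &mut CallCounter,
    method: &str,
    tenant: &str,
    op: impl FnOnce() -> Result<T, E>,
) -> Result<T, E> {
    // A failure returns here, before any counter is touched.
    let value = op()?;
    counter.increment(method, tenant);
    Ok(value)
}
```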
src/storage/s3.rs (3)

342-401: Move object‑store call metrics to the success path.
increment_object_store_calls_by_date is executed before error handling, so failed GET/PUT/HEAD/DELETE attempts get counted and skew per‑tenant usage.

🔧 Example fix (apply same pattern to other calls in this file)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date(
-            "GET",
-            &Utc::now().date_naive().to_string(),
-            tenant_str,
-        );
-
-        match resp {
-            Ok(resp) => {
-                let body = resp.bytes().await?;
+        let resp = self.client.get(&to_object_store_path(path)).await?;
+        increment_object_store_calls_by_date(
+            "GET",
+            &Utc::now().date_naive().to_string(),
+            tenant_str,
+        );
+        let body = resp.bytes().await?;
                 increment_files_scanned_in_object_store_calls_by_date(
                     "GET",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant_str,
                 );
                 increment_bytes_scanned_in_object_store_calls_by_date(
                     "GET",
                     body.len() as u64,
                     &Utc::now().date_naive().to_string(),
                     tenant_str,
                 );
-                Ok(body)
-            }
-            Err(err) => Err(err.into()),
-        }
+        Ok(body)

Based on learnings, metrics should be recorded only after successful operations.

Also applies to: 504-625, 673-694, 826-910


462-502: Tenant prefix is missing when listing date/hour/minute partitions.
_list_dates, list_hours, and list_minutes use stream_name only, which can list default‑tenant partitions when tenant_id is non‑default.

🔧 Suggested fix for `_list_dates` (apply analogous prefixing in list_hours/list_minutes)
-        let resp: Result<object_store::ListResult, object_store::Error> = self
-            .client
-            .list_with_delimiter(Some(&(stream.into())))
-            .await;
+        let prefix = tenant_id
+            .as_deref()
+            .filter(|t| *t != DEFAULT_TENANT)
+            .map(|t| format!("{t}/{stream}"))
+            .unwrap_or_else(|| stream.to_string());
+        let resp: Result<object_store::ListResult, object_store::Error> = self
+            .client
+            .list_with_delimiter(Some(&prefix.clone().into()))
+            .await;
@@
-            .filter_map(|path| path.as_ref().strip_prefix(&format!("{stream}/")))
+            .filter_map(|path| path.as_ref().strip_prefix(&format!("{prefix}/")))

Also applies to: 978-1059
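As a rough sketch of the prefixing idea, the listing prefix and the prefix stripped from returned paths can come from the same helper. The value of `DEFAULT_TENANT` and the function names `partition_prefix` and `date_partitions` are assumptions made for illustration.

```rust
// Value assumed for illustration; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Build the listing prefix: "<tenant>/<stream>" for a real tenant,
/// plain "<stream>" for a missing, empty, or default tenant.
fn partition_prefix(tenant_id: Option<&str>, stream: &str) -> String {
    match tenant_id.filter(|t| !t.is_empty() && *t != DEFAULT_TENANT) {
        Some(tenant) => format!("{tenant}/{stream}"),
        None => stream.to_string(),
    }
}

/// Keep only the entries under `prefix` and return them relative to it.
fn date_partitions(prefix: &str, listed: &[&str]) -> Vec<String> {
    let with_slash = format!("{prefix}/");
    listed
        .iter()
        .filter_map(|p| p.strip_prefix(with_slash.as_str()))
        .map(|rest| rest.trim_end_matches('/').to_string())
        .collect()
}
```

Using one prefix for both the list call and the strip step avoids the case where a tenant-scoped listing is filtered with a stream-only prefix and silently yields nothing.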


944-953: Use a consistent tenant label for list_old_streams HEAD metrics.
The HEAD metric inside the per‑dir task uses "", which creates a separate label bucket.

🔧 Suggested fix
-                increment_object_store_calls_by_date(
-                    "HEAD",
-                    &Utc::now().date_naive().to_string(),
-                    "",
-                );
+                increment_object_store_calls_by_date(
+                    "HEAD",
+                    &Utc::now().date_naive().to_string(),
+                    tenant_str,
+                );
src/storage/object_storage.rs (1)

1131-1150: Normalize empty/default tenant before prefixing stream paths.
stream_relative_path prefixes when tenant_id is Some and not DEFAULT_TENANT. If an empty tenant slips through, this produces "/{stream}/..." and breaks object paths.

🔧 Suggested fix
-    if let Some(tenant) = tenant_id
-        && !tenant.eq(DEFAULT_TENANT)
-    {
-        format!("{tenant}/{stream_name}/{file_suffix}")
-    } else {
-        format!("{stream_name}/{file_suffix}")
-    }
+    if let Some(tenant) = tenant_id
+        .as_deref()
+        .filter(|t| !t.is_empty() && *t != DEFAULT_TENANT)
+    {
+        format!("{tenant}/{stream_name}/{file_suffix}")
+    } else {
+        format!("{stream_name}/{file_suffix}")
+    }
src/storage/azure_blob.rs (2)

215-266: Only count successful object‑store calls in metrics.
Metrics are incremented before error handling, so failed GET/PUT/HEAD/DELETE are counted as successes.

🔧 Example fix (apply similarly to other calls)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
-
-        match resp {
-            Ok(resp) => {
-                let body: Bytes = resp.bytes().await?;
+        let resp = self.client.get(&to_object_store_path(path)).await?;
+        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
+        let body: Bytes = resp.bytes().await?;
                 increment_files_scanned_in_object_store_calls_by_date(
                     "GET",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
                 increment_bytes_scanned_in_object_store_calls_by_date(
                     "GET",
                     body.len() as u64,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
-                Ok(body)
-            }
-            Err(err) => Err(err.into()),
-        }
+        Ok(body)

Based on learnings, metrics should be recorded only after successful operations.

Also applies to: 359-472, 499-717


321-357: Tenant prefix is missing when listing date/hour/minute partitions.
Listing uses stream_name only, which can read default‑tenant partitions for non‑default tenants.

🔧 Suggested fix for `_list_dates` (apply analogous prefixing in list_hours/list_minutes)
-        let resp: Result<object_store::ListResult, object_store::Error> = self
-            .client
-            .list_with_delimiter(Some(&(stream.into())))
-            .await;
+        let prefix = tenant_id
+            .as_deref()
+            .filter(|t| *t != DEFAULT_TENANT)
+            .map(|t| format!("{t}/{stream}"))
+            .unwrap_or_else(|| stream.to_string());
+        let resp: Result<object_store::ListResult, object_store::Error> = self
+            .client
+            .list_with_delimiter(Some(&prefix.clone().into()))
+            .await;
@@
-            .filter_map(|path| path.as_ref().strip_prefix(&format!("{stream}/")))
+            .filter_map(|path| path.as_ref().strip_prefix(&format!("{prefix}/")))

Also applies to: 775-858

src/metastore/metastores/object_store_metastore.rs (1)

316-414: Alert state paths are tenant‑inconsistent (read/write mismatch).
get_alert_states builds a tenant‑prefixed base path, but get_alert_state_entry / put_alert_state still use the non‑tenant alert_state_json_path. This makes tenant alert states unreadable and risks cross‑tenant collisions.

🔧 Suggested fix (compute tenant‑aware alert_state path consistently)
-        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
-        let base_path = RelativePathBuf::from_iter([&tenant, ALERTS_ROOT_DIRECTORY]);
+        let base_path = if let Some(tenant) = tenant_id.as_deref().filter(|t| !t.is_empty() && *t != DEFAULT_TENANT) {
+            RelativePathBuf::from_iter([tenant, ALERTS_ROOT_DIRECTORY])
+        } else {
+            RelativePathBuf::from_iter([ALERTS_ROOT_DIRECTORY])
+        };
@@
-        let path = alert_state_json_path(*alert_id);
+        let path = if let Some(tenant) = tenant_id.as_deref().filter(|t| !t.is_empty() && *t != DEFAULT_TENANT) {
+            RelativePathBuf::from_iter([tenant, ALERTS_ROOT_DIRECTORY, format!("alert_state_{alert_id}.json").as_str()])
+        } else {
+            alert_state_json_path(*alert_id)
+        };
@@
-        let path = alert_state_json_path(id);
+        let path = if let Some(tenant) = tenant_id.as_deref().filter(|t| !t.is_empty() && *t != DEFAULT_TENANT) {
+            RelativePathBuf::from_iter([tenant, ALERTS_ROOT_DIRECTORY, format!("alert_state_{id}.json").as_str()])
+        } else {
+            alert_state_json_path(id)
+        };
🤖 Fix all issues with AI agents
In `@src/alerts/target.rs`:
- Around line 56-73: In TargetConfigs::load (the async load method) change how
t.tenant is set so that the sentinel DEFAULT_TENANT string is converted to None
instead of Some(DEFAULT_TENANT); specifically, inside the map over targets (the
closure that mutates t) set t.tenant = None when tenant_id == DEFAULT_TENANT,
otherwise set t.tenant = Some(tenant_id.clone()), then continue to collect and
insert as before.

In `@src/catalog/mod.rs`:
- Around line 567-576: The partition_path code currently maps tenant_id None to
an empty string which yields a path component ["", ...]; update partition_path
to match codebase patterns by using DEFAULT_TENANT when tenant_id is None (or
alternatively only include the tenant component when tenant_id.is_some());
modify the mapping of tenant_id (used when building the RelativePathBuf via
RelativePathBuf::from_iter) to reference DEFAULT_TENANT instead of "" so the
produced path components are consistent with the rest of the codebase.

In `@src/handlers/airplane.rs`:
- Around line 113-115: The Flight handler is using &None for tenant context
(e.g., PARSEABLE.get_stream(&table_name, &None)), which causes all Flight
queries to use the global/default tenant; derive the tenant_id from the session
key or ticket metadata in the Flight request and thread it through every
tenant-aware call: replace &None with &tenant_id when calling
PARSEABLE.get_stream, pass tenant_id into ingestor lookup, authorization checks,
execution calls, and staging cleanup routines so each of those code paths use
the extracted tenant_id rather than the default; update the same pattern in the
other Flight-related areas (the ingestor/authorization/execution/staging cleanup
call sites) to accept and use the tenant_id parameter.

In `@src/handlers/http/health_check.rs`:
- Around line 120-128: get_tenant_id_from_request currently calls
HeaderValue::to_str().unwrap(), which panics on malformed UTF-8 and crashes the
readiness handler; change that unwrap to to_str().ok() so header parsing is
fallible (returning Option) and update callers like the readiness(HttpRequest)
function to handle the None case safely (e.g., treat missing/invalid tenant as
None or return a controlled error response) — locate and modify
get_tenant_id_from_request and the readiness usage to accept an Option<String>
(or handle Option returned) instead of assuming a valid String.

In `@src/handlers/http/query.rs`:
- Around line 82-101: In get_records_and_fields: before calling into_query or
execute, set the DataFusion default schema on the current session context so
tenant-scoped SQL resolves to the tenant schema instead of "public"; obtain the
schema name from tenant_id (fall back to "public" if None), and call the session
context/state method that sets the default schema on
QUERY_SESSION.get_ctx().state() (i.e., update the session_state's
default_schema) so subsequent into_query(...) and execute(...) run under the
tenant schema.

In `@src/handlers/http/users/dashboards.rs`:
- Around line 78-86: The handler get_dashboard currently calls
get_tenant_id_from_request(&req) which reads an untrusted header and allows
cross-tenant access; change it to derive and validate the tenant from the
authenticated session (e.g., call the session-based accessor used elsewhere such
as get_tenant_id_from_session or the project’s session validation helper),
ensure the session tenant is present and return an auth/validation error if not,
then pass that session-validated tenant to
DASHBOARDS.get_dashboard(dashboard_id, &tenant_id). Also remove or ignore any
tenant value derived from headers in this function and reuse
validate_dashboard_id(dashboard_id) as already used.

In `@src/metastore/metastore_traits.rs`:
- Around line 182-192: The get_chats implementation needs to mirror
get_dashboards by iterating tenants via PARSEABLE.list_tenants() and collecting
chats for each tenant; update the get_chats function to loop over tenants first,
then for each tenant call PARSEABLE.list_dirs_relative(USERS_ROOT_DIR,
Some(tenant_id)) (or equivalent) to list users, then call
PARSEABLE.get_objects(..., Some(tenant_id)) when fetching chat objects so
tenant_id is passed through; ensure the DashMap aggregation still groups Bytes
by user key across all tenants.
- Around line 320-330: The get_node_metadata signature is inconsistent because
NodeMetadata is global but it currently takes tenant_id; change the trait method
signature async fn get_node_metadata(&self, node_type: NodeType) ->
Result<Vec<Bytes>, MetastoreError> (remove tenant_id) and update all
implementations and call sites (including any calls that currently pass
tenant_id and the storage-layer invocation) to stop passing tenant_id and
instead use the same global behavior as put_node_metadata/delete_node_metadata
(i.e., treat tenant as None). Ensure trait impls, tests, and any storage adapter
methods invoked by get_node_metadata are updated to the new signature.

In `@src/otel/traces.rs`:
- Around line 159-181: The test call site fails to compile because
flatten_otel_traces now requires a tenant_id; update the test
test_flatten_otel_traces_complete_structure to pass a tenant_id string (e.g.,
"test-tenant" or reuse any existing tenant_str in the test) when calling
flatten_otel_traces(&traces_data, tenant_id), matching how the production call
passes tenant_id (see flatten_otel_traces and process_resource_spans usage).

In `@src/query/listing_table_builder.rs`:
- Around line 101-102: Update the misleading comment above the call to
storage.list_dirs_relative(&prefix, &None).await in listing_table_builder.rs to
state that tenant_id is passed as None to intentionally use the default tenant
(DEFAULT_TENANT) for historical listing, and that multi-tenancy filtering is
handled at a higher level; reference storage.list_dirs_relative and the prefix
variable so reviewers can locate the call.

In `@src/storage/localfs.rs`:
- Around line 442-460: delete_stream and the other stream-scoped methods
(list_dirs, list_dates, list_hours, list_minutes) build paths using
self.root.join(stream_name) which ignores tenant_id and will mix tenants when
filesystem layout is tenant-prefixed; update each of these functions to derive
tenant_str = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT) and scope paths by
joining tenant_str before the stream (e.g.,
self.root.join(tenant_str).join(stream_name) or equivalent using existing path
building helpers), then use that scoped path for fs operations and metrics;
apply the same change to the other affected methods referenced in the comment
(the blocks around lines ~555-593, ~630-663, ~665-702) so all stream-scoped
filesystem operations consistently include the tenant prefix.
♻️ Duplicate comments (36)
src/prism/logstream/mod.rs (1)

66-73: Restore real stats instead of a default placeholder.

The stats result is computed but then discarded and replaced with QueriedStats::default(), which silently returns fake data and hides failures. This breaks dataset stats.

🐛 Proposed fix
-    tracing::warn!("starting dataset info");
     let info = info?;
-    tracing::warn!("got info");
     let schema = schema?;
-    tracing::warn!("got schema");
-    // let stats = stats?;
-    let stats = QueriedStats::default();
-    tracing::warn!("got FAKE stats");
+    let stats = stats?;
src/handlers/http/users/dashboards.rs (1)

248-251: Tenant scoping should not rely on raw header.

This is the same tenant-spoofing risk as in list_dashboards; use the session-derived tenant, or validate the header against the session, before listing tags.

🔧 Suggested fix
-    let tags = DASHBOARDS
-        .list_tags(&get_tenant_id_from_request(&req))
-        .await;
+    let (_user_id, tenant_id) = get_user_and_tenant_from_request(&req)?;
+    let tags = DASHBOARDS.list_tags(&tenant_id).await;
src/handlers/http/llm.rs (1)

92-104: Same tenant header parsing risk here.
This path also relies on get_tenant_id_from_request, which currently unwraps to_str() and can panic on malformed headers.
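A minimal sketch of fallible tenant extraction, using only the standard library. Here `raw` stands in for the header bytes and UTF-8 validation stands in for `HeaderValue::to_str()`; the function name `tenant_from_header_bytes` is hypothetical.

```rust
/// Hypothetical stand-in for reading the `tenant` header.
/// Returns None for a missing, malformed, or empty value instead of panicking.
fn tenant_from_header_bytes(raw: Option<&[u8]>) -> Option<String> {
    raw.and_then(|bytes| std::str::from_utf8(bytes).ok())
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(str::to_owned)
}
```

Callers then decide explicitly what a `None` means (default tenant or a 4xx response) rather than crashing the handler.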

src/tenants/mod.rs (1)

36-104: Suspension state isn’t persisted (duplicate storage).
suspended_services is stored both on TenantOverview and inside meta.suspended_services, but suspend/resume only mutates the HashSet. Since get_tenants() returns meta, suspension changes won’t be reflected/persisted. Consider syncing meta on updates or removing the duplicate field.

🛠️ Option: keep meta in sync
 pub fn suspend_service(&self, tenant_id: &str, service: Service) {
     if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
         tenant.suspended_services.insert(service.clone());
+        let services = tenant.meta.suspended_services.get_or_insert_with(HashSet::new);
+        services.insert(service);
     }
 }

 pub fn resume_service(&self, tenant_id: &str, service: Service) {
     if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
         tenant.suspended_services.remove(&service);
+        if let Some(services) = tenant.meta.suspended_services.as_mut() {
+            services.remove(&service);
+        }
     }
 }
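The sync option can be sketched on simplified types. `TenantOverview` and `TenantMeta` below are reduced, hypothetical versions of the PR's structs, with services modeled as plain strings.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the persisted tenant metadata.
#[derive(Default)]
struct TenantMeta {
    suspended_services: Option<HashSet<String>>,
}

/// Simplified stand-in for the in-memory tenant entry.
#[derive(Default)]
struct TenantOverview {
    suspended_services: HashSet<String>,
    meta: TenantMeta,
}

impl TenantOverview {
    /// Suspend a service in both the fast-path set and the metadata copy.
    fn suspend(&mut self, service: &str) {
        self.suspended_services.insert(service.to_string());
        self.meta
            .suspended_services
            .get_or_insert_with(HashSet::new)
            .insert(service.to_string());
    }

    /// Resume a service in both places.
    fn resume(&mut self, service: &str) {
        self.suspended_services.remove(service);
        if let Some(services) = self.meta.suspended_services.as_mut() {
            services.remove(service);
        }
    }
}
```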
src/handlers/http/middleware.rs (2)

167-175: Avoid panic when inserting tenant header.
HeaderValue::from_str(...).unwrap() will panic on invalid header bytes. Handle the error and return a controlled failure instead.

♻️ Suggested fix
- if tid.is_some() {
-     req.headers_mut().insert(
-         HeaderName::from_static("tenant"),
-         HeaderValue::from_str(&tid.as_ref().unwrap()).unwrap(),
-     );
- }
+ if let Some(tid) = tid {
+     let header_val =
+         HeaderValue::from_str(&tid).map_err(|_| ErrorBadRequest("invalid tenant header"))?;
+     req.headers_mut()
+         .insert(HeaderName::from_static("tenant"), header_val);
+ }

167-175: Suspension bypass for default/missing/unknown tenants.
Because the header is only inserted when tid.is_some(), the default tenant never gets a tenant header. check_suspension then treats missing/unknown tenants as Authorized, so suspensions won’t apply to default (or unknown) tenants. Consider always setting the header (including DEFAULT_TENANT) or treating missing/unknown tenants as unauthorized.

💡 Possible fix
- if let Some(tid) = tid {
-     req.headers_mut().insert(
-         HeaderName::from_static("tenant"),
-         HeaderValue::from_str(&tid).unwrap(),
-     );
- }
+ let tenant_value = tid.unwrap_or_else(|| DEFAULT_TENANT.to_string());
+ let header_val =
+     HeaderValue::from_str(&tenant_value).map_err(|_| ErrorBadRequest("invalid tenant header"))?;
+ req.headers_mut()
+     .insert(HeaderName::from_static("tenant"), header_val);

 pub fn check_suspension(req: &HttpRequest, action: Action) -> rbac::Response {
-    if let Some(tenant) = req.headers().get("tenant")
-        && let Ok(tenant) = tenant.to_str()
-    {
-        if let Ok(Some(suspension)) = TENANT_METADATA.is_action_suspended(tenant, &action) {
-            return rbac::Response::Suspended(suspension);
-        } else {
-            // tenant does not exist
-        }
-    }
-    rbac::Response::Authorized
+    let tenant = req
+        .headers()
+        .get("tenant")
+        .and_then(|t| t.to_str().ok())
+        .unwrap_or(DEFAULT_TENANT);
+    match TENANT_METADATA.is_action_suspended(tenant, &action) {
+        Ok(Some(msg)) => rbac::Response::Suspended(msg),
+        Ok(None) => rbac::Response::Authorized,
+        Err(_) => rbac::Response::UnAuthorized,
+    }
 }

Also applies to: 316-327
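The fallback behavior proposed above can be sketched with plain collections. `Decision` and the map-based lookup are hypothetical stand-ins for `rbac::Response` and `TENANT_METADATA`, and the value of `DEFAULT_TENANT` is assumed.

```rust
use std::collections::{HashMap, HashSet};

// Value assumed for illustration.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

#[derive(Debug, PartialEq)]
enum Decision {
    Authorized,
    Suspended,
    Unauthorized,
}

/// `suspended` maps each known tenant to its suspended actions.
/// A missing header falls back to the default tenant; an unknown tenant is rejected.
fn check_suspension(
    suspended: &HashMap<String, HashSet<String>>,
    tenant_header: Option<&str>,
    action: &str,
) -> Decision {
    let tenant = tenant_header.unwrap_or(DEFAULT_TENANT);
    match suspended.get(tenant) {
        None => Decision::Unauthorized,
        Some(actions) if actions.contains(action) => Decision::Suspended,
        Some(_) => Decision::Authorized,
    }
}
```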

src/handlers/http/modal/query/querier_rbac.rs (1)

79-80: Associate new users with the request tenant.

User::new_basic(..., None) drops the tenant context, so newly created users won’t be scoped to the tenant. This breaks isolation in multi-tenant setups.

🛠️ Suggested fix
-    let (user, password) = user::User::new_basic(username.clone(), None);
+    let (user, password) = user::User::new_basic(username.clone(), tenant_id.clone());
src/handlers/http/oidc.rs (3)

112-116: Use request tenant for BasicAuth lookups.

get_tenant_id_from_key yields None for BasicAuth, so lookups default to DEFAULT_TENANT even when a tenant header is present.

🛠️ Suggested fix
-    let tenant_id = get_tenant_id_from_key(&session_key);
+    let tenant_id = get_tenant_id_from_key(&session_key);
+    let tenant_id_from_req = get_tenant_id_from_request(&req);
...
-        SessionKey::BasicAuth { username, password } => match Users.get_user(&username, &tenant_id)
+        SessionKey::BasicAuth { username, password } => match Users.get_user(&username, &tenant_id_from_req)

132-155: Treat non-2xx intra-cluster login sync as failure.

send().await succeeds on non-2xx responses, so failed syncs are silently ignored.

🛠️ Suggested fix
-                        INTRA_CLUSTER_CLIENT
+                        let resp = INTRA_CLUSTER_CLIENT
                             .post(url)
                             .header(header::AUTHORIZATION, node.token)
                             .header(header::CONTENT_TYPE, "application/json")
                             .json(&json!(
                                 {
                                     "sessionCookie": _session,
                                     "user": _user,
                                     "expiry": EXPIRY_DURATION
                                 }
                             ))
                             .send()
                             .await?;
+                        resp.error_for_status()?;
                         Ok::<(), anyhow::Error>(())

323-327: Create new OIDC users within tenant scope.

New users are still created with None, which breaks tenant isolation.

🛠️ Suggested fix
-        // LET TENANT BE NONE FOR NOW!!!
-        (None, roles) => put_user(&user_id, roles, user_info, bearer, None).await?,
+        (None, roles) => put_user(&user_id, roles, user_info, bearer, tenant_id.clone()).await?,
src/rbac/user.rs (1)

377-385: Invalidate sessions using each user’s tenant.

add_roles removes sessions using the caller-provided tenant_id, which can invalidate the wrong tenant for mixed-tenant groups.

🛠️ Suggested fix
-        for group_user in &self.users {
-            mut_sessions().remove_user(group_user.userid(), tenant_id);
-        }
+        for group_user in &self.users {
+            mut_sessions().remove_user(
+                group_user.userid(),
+                group_user.tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v),
+            );
+        }
src/hottier.rs (3)

92-115: Fix stream/tenant exclusion logic to avoid cross-tenant leakage.

The current condition excludes a stream only when both stream and tenant differ, which can still include the current stream under other tenants. This reintroduces the bug previously flagged.

🐛 Proposed fix
-                if self.check_stream_hot_tier_exists(&stream, &tenant_id)
-                    && stream != current_stream
-                    && tenant_id != *current_tenant_id
+                if self.check_stream_hot_tier_exists(&stream, &tenant_id)
+                    && !(stream == current_stream && tenant_id == *current_tenant_id)
                 {
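To make the difference between the two predicates concrete, here is a small sketch comparing them. Both function names are hypothetical, and tenants are modeled as `Option<&str>` for brevity.

```rust
/// Original condition: true only when BOTH the stream and the tenant differ.
fn both_differ(
    stream: &str,
    tenant: Option<&str>,
    current_stream: &str,
    current_tenant: Option<&str>,
) -> bool {
    stream != current_stream && tenant != current_tenant
}

/// Proposed condition: true for every (stream, tenant) pair except the current one.
fn is_other_stream(
    stream: &str,
    tenant: Option<&str>,
    current_stream: &str,
    current_tenant: Option<&str>,
) -> bool {
    !(stream == current_stream && tenant == current_tenant)
}
```

The two disagree whenever exactly one of the stream name or the tenant matches the current pair, for example a stream with the same name under a different tenant.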

595-602: Avoid panics in an “exists” check.

hot_tier_file_path(...).unwrap() can panic on path conversion errors; this should be a best-effort check.

🐛 Proposed fix
-        let path = self.hot_tier_file_path(stream, tenant_id).unwrap();
-        PathBuf::from(path.to_string()).exists()
+        match self.hot_tier_file_path(stream, tenant_id) {
+            Ok(path) => PathBuf::from(path.to_string()).exists(),
+            Err(err) => {
+                warn!(
+                    "Failed to resolve hot tier file path for stream={stream} tenant={tenant_id:?}: {err}"
+                );
+                false
+            }
+        }

779-808: Use internal stream sizing for pstats hot tier.

pstats is an internal stream and should use INTERNAL_STREAM_HOT_TIER_SIZE_BYTES, consistent with pmeta.

🐛 Proposed fix
-                    size: MIN_STREAM_HOT_TIER_SIZE_BYTES,
+                    size: INTERNAL_STREAM_HOT_TIER_SIZE_BYTES,
                     used_size: 0,
-                    available_size: MIN_STREAM_HOT_TIER_SIZE_BYTES,
+                    available_size: INTERNAL_STREAM_HOT_TIER_SIZE_BYTES,
src/handlers/http/modal/ingest/mod.rs (1)

27-38: SyncRole fields are private but used externally.

This still looks like a compilation issue if other modules access sync_req.tenant_id / sync_req.privileges directly.

🐛 Option: make fields public
 pub struct SyncRole {
-    privileges: Vec<DefaultPrivilege>,
-    tenant_id: String,
+    pub privileges: Vec<DefaultPrivilege>,
+    pub tenant_id: String,
 }
src/alerts/alert_types.rs (1)

91-123: Auth credential extraction still resolves to None.
The admin lookup path (lines 107-117) currently returns None for both Native and OAuth users, so tenant-scoped alert queries may still run unauthenticated.

src/handlers/http/modal/ingest/ingestor_role.rs (1)

49-55: Inverted tenant validation logic remains unfixed.

The condition req_tenant.ne(DEFAULT_TENANT) && (req_tenant.eq(&sync_req.tenant_id)) rejects when the request tenant matches the payload tenant (same tenant), but the error message indicates it should reject cross-tenant operations. The second condition should use .ne() instead of .eq().

🐛 Proposed fix
-    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant.eq(&sync_req.tenant_id)) {
+    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant.ne(&sync_req.tenant_id)) {
         return Err(RoleError::Anyhow(anyhow::Error::msg(
             "non super-admin user trying to create role for another tenant",
         )));
     }
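The corrected check reduces to a two-clause predicate. A minimal sketch, assuming a `DEFAULT_TENANT` value and a hypothetical function name:

```rust
// Value assumed for illustration.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// True when a non-default (non super-admin) tenant tries to act on another tenant.
fn is_cross_tenant_request(req_tenant: &str, payload_tenant: &str) -> bool {
    req_tenant != DEFAULT_TENANT && req_tenant != payload_tenant
}
```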
src/migration/mod.rs (2)

161-199: Avoid aborting all tenants on single list_streams failure.
The ? returns early and skips remaining tenants. This matches a previously flagged issue.

🔧 Suggested fix (continue on per-tenant errors)
-        let stream_names = PARSEABLE.metastore.list_streams(&tenant_id).await?;
+        let stream_names = match PARSEABLE.metastore.list_streams(&tenant_id).await {
+            Ok(names) => names,
+            Err(e) => {
+                warn!("Failed to list streams for tenant {:?}: {:?}", tenant_id, e);
+                continue;
+            }
+        };
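A sketch of the continue-on-error loop in isolation. The function `migrate_all` and its closure parameter are hypothetical; the closure stands in for `PARSEABLE.metastore.list_streams`.

```rust
/// Visit every tenant; a listing failure for one tenant is recorded and skipped
/// instead of aborting the whole run.
/// Returns (migrated "<tenant>/<stream>" entries, tenants that failed).
fn migrate_all<F>(tenants: &[&str], mut list_streams: F) -> (Vec<String>, Vec<String>)
where
    F: FnMut(&str) -> Result<Vec<String>, String>,
{
    let mut migrated = Vec::new();
    let mut failed = Vec::new();
    for tenant in tenants {
        let streams = match list_streams(tenant) {
            Ok(names) => names,
            Err(_) => {
                failed.push(tenant.to_string());
                continue;
            }
        };
        migrated.extend(streams.into_iter().map(|s| format!("{tenant}/{s}")));
    }
    (migrated, failed)
}
```

Returning the failed tenants lets the caller decide whether a partial migration is acceptable or should be surfaced as a startup error.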

494-507: Use PARSEABLE_METADATA_FILE_NAME instead of hardcoded .parseable.json.
This matches an earlier review item and avoids path mismatches.

🔧 Suggested fix
-            .join(".parseable.json")
+            .join(PARSEABLE_METADATA_FILE_NAME)
     } else {
-        config.options.staging_dir().join(".parseable.json")
+        config.options.staging_dir().join(PARSEABLE_METADATA_FILE_NAME)
     };
src/parseable/streams.rs (4)

657-658: Remove or downgrade debug warn! for part_path.
This is a noisy debug artifact (previously flagged).

🔧 Suggested fix
-            tracing::warn!(part_path=?part_path);
+            tracing::trace!(part_path=?part_path);

1068-1090: Downgrade/remove verbose warn logs in get_or_create.
These were previously flagged and will flood logs at scale.

🔧 Suggested fix
-        tracing::warn!(
+        tracing::trace!(
             "get_or_create\nstream- {stream_name}\ntenant- {tenant_id:?}\nmetadata- {metadata:?}\noptions- {options:?}"
         );
@@
-        tracing::warn!("creating new stream- {stream_name}");
+        tracing::trace!("creating new stream- {stream_name}");
@@
-        tracing::warn!("inserted stream in mem");
+        tracing::trace!("inserted stream in mem");

1109-1111: Noisy warn on missing tenant in contains.
This was previously flagged; consider debug/trace instead.

🔧 Suggested fix
-            tracing::warn!(
+            tracing::debug!(
                 "Tenant with id {tenant_id} does not exist! Shouldn't happen (stream- {stream_name})"
             );

1180-1181: Remove debug warns in flush_and_convert.
These were previously flagged.

🔧 Suggested fix
-        tracing::warn!(flush_and_convert_tenants=?tenants);
-        tracing::warn!(parseable_streams_tenants=?self.read().unwrap().keys());
src/handlers/http/cluster/mod.rs (1)

1875-1959: Avoid unwrap() on header conversion to prevent panics.
If the token contains invalid header bytes, this will panic. Prefer a fallible conversion and map to QueryError.

🐛 Suggested fix
-        map.insert(
-            reqwest::header::AUTHORIZATION,
-            reqwest::header::HeaderValue::from_str(&querier.token).unwrap(),
-        );
+        match reqwest::header::HeaderValue::from_str(&querier.token) {
+            Ok(val) => { map.insert(reqwest::header::AUTHORIZATION, val); }
+            Err(e) => {
+                mark_querier_available(&domain_name).await;
+                return Err(QueryError::Anyhow(anyhow::anyhow!("Invalid token header: {}", e)));
+            }
+        }
src/query/mod.rs (1)

173-205: Don’t silently drop schema‑registration errors.
register_schema failures should be logged to surface startup issues.

✅ Suggested fix
-                    let _ = catalog.register_schema(t, schema_provider);
+                    if let Err(e) = catalog.register_schema(t, schema_provider) {
+                        tracing::error!("Failed to register schema for tenant {}: {:?}", t, e);
+                    }
src/parseable/mod.rs (4)

1071-1090: Tenant creation still has a TOCTOU race and misses schema registration.
The read‑then‑write lock split allows duplicate inserts, and queries won’t work for newly added tenants without registering their schema.

🔧 Suggested fix (atomic insert + schema registration)
 pub fn add_tenant(
     &self,
     tenant_id: String,
     tenant_meta: StorageMetadata,
 ) -> Result<(), anyhow::Error> {
     if !self.options.is_multi_tenant() {
         return Err(anyhow::Error::msg("P_MULTI_TENANCY is set to false"));
     }

-    if self.tenants.read().unwrap().contains(&tenant_id) {
-        return Err(anyhow::Error::msg(format!(
-            "Tenant with id- {tenant_id} already exists"
-        )));
-    } else {
-        self.tenants.write().unwrap().push(tenant_id.clone());
-        TENANT_METADATA.insert_tenant(tenant_id, tenant_meta);
-    }
+    let mut tenants = self.tenants.write().unwrap();
+    if tenants.contains(&tenant_id) {
+        return Err(anyhow::Error::msg(format!(
+            "Tenant with id- {tenant_id} already exists"
+        )));
+    }
+    tenants.push(tenant_id.clone());
+    drop(tenants);
+
+    TENANT_METADATA.insert_tenant(tenant_id.clone(), tenant_meta);
+    crate::query::QUERY_SESSION.add_schema(&tenant_id);
 
     Ok(())
 }
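The check-then-insert race goes away when both steps happen under one write guard. A minimal sketch on a hypothetical `TenantRegistry` type, leaving out metadata and schema registration:

```rust
use std::sync::RwLock;

/// Hypothetical, reduced stand-in for the tenant list held by the server.
struct TenantRegistry {
    tenants: RwLock<Vec<String>>,
}

impl TenantRegistry {
    fn new() -> Self {
        Self {
            tenants: RwLock::new(Vec::new()),
        }
    }

    /// Check and insert while holding a single write guard, so two concurrent
    /// callers cannot both pass the existence check.
    fn add_tenant(&self, tenant_id: &str) -> Result<(), String> {
        let mut tenants = self
            .tenants
            .write()
            .map_err(|_| "tenants lock poisoned".to_string())?;
        if tenants.iter().any(|t| t == tenant_id) {
            return Err(format!("Tenant with id- {tenant_id} already exists"));
        }
        tenants.push(tenant_id.to_string());
        Ok(())
    }

    fn list(&self) -> Vec<String> {
        self.tenants.read().map(|t| t.clone()).unwrap_or_default()
    }
}
```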

1130-1158: Tenant deletion leaves stale entries in the in‑memory tenant list.
self.tenants is never updated, so list_tenants() will still return deleted tenants.

🧹 Suggested fix
 pub fn delete_tenant(&self, tenant_id: &str) -> Result<(), anyhow::Error> {
     // delete users and sessions
     let users = mut_users().remove(tenant_id);
@@
     // delete roles
     mut_roles().remove(tenant_id);

+    // Remove tenant from in-memory list
+    if let Ok(mut tenants) = self.tenants.write() {
+        tenants.retain(|t| t != tenant_id);
+    }
+
     // delete from in-mem
     TENANT_METADATA.delete_tenant(&tenant_id);
     Ok(())
 }

1160-1196: load_tenants still masks invalid state and lock failures.
Single‑tenant mode silently ignores tenant directories, and poisoned lock failures return Ok(None).

✅ Suggested fix
-            } else if !is_multi_tenant {
-            } else {
+            } else if is_multi_tenant {
                 return Err(anyhow::Error::msg(format!(
                     "Found invalid tenant directory with multi-tenant mode- {tenant_id}.\nExiting."
                 )));
             }
+            // In single-tenant mode, ignore directories without tenant metadata
         }
 
-        if let Ok(mut t) = self.tenants.write() {
-            t.extend(dirs);
-            Ok(Some(()))
-        } else {
-            Ok(None)
-        }
+        let mut t = self.tenants.write().expect("tenants lock poisoned");
+        t.extend(dirs);
+        Ok(Some(()))

1198-1207: list_tenants still hides lock poisoning.
Returning None on lock failure masks real errors.

✅ Suggested fix
 pub fn list_tenants(&self) -> Option<Vec<String>> {
-    if let Ok(t) = self.tenants.as_ref().read()
-        && !t.is_empty()
-    {
-        let t = t.clone();
-        Some(t)
-    } else {
-        None
-    }
+    let t = self.tenants.read().expect("tenants lock poisoned");
+    if t.is_empty() { None } else { Some(t.clone()) }
 }
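
If panicking on a poisoned lock is not acceptable, one alternative is to surface the failure as an error. A minimal sketch, assuming a free function over an `RwLock<Vec<String>>` and a plain `String` error; both are illustrative choices, not the PR's signature.

```rust
use std::sync::RwLock;

/// Returns `Ok(None)` for an empty list and `Err` when the lock is poisoned,
/// so callers can tell "no tenants" apart from "state is corrupted".
pub fn list_tenants(tenants: &RwLock<Vec<String>>) -> Result<Option<Vec<String>>, String> {
    let guard = tenants
        .read()
        .map_err(|_| "tenants lock poisoned".to_string())?;
    Ok(if guard.is_empty() {
        None
    } else {
        Some(guard.clone())
    })
}
```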
src/storage/object_storage.rs (1)

997-1007: Per‑file warn logs during sync are still noisy.

Also applies to: 1114-1123

src/alerts/mod.rs (1)

1331-1337: Ensure tenant bucket exists before write‑back.

Also applies to: 1374-1379

src/metastore/metastores/object_store_metastore.rs (4)

516-541: Dashboards are still overwritten per user within a tenant.


666-693: Potential panic from unwrap() after filter migration.


997-1025: Manifest path construction inconsistency persists.


1274-1344: Tenant prefix is still included in stream names from list_streams.

src/metastore/metastore_traits.rs (1)

115-119: The tenant_id parameter in get_alert_state_entry is unused in path construction (implementation issue).

This issue has been previously identified. The implementation in object_store_metastore.rs calls alert_state_json_path(*alert_id) without passing tenant_id, breaking tenant isolation for alert state retrieval.

🧹 Nitpick comments (10)
src/storage/retention.rs (3)

48-52: Handle empty tenant list explicitly to avoid skipping retention.

If PARSEABLE.list_tenants() returns Some(vec![]), the scheduler will skip all streams. Please confirm that no tenantless/default streams exist in that state; otherwise, consider falling back to None or logging a clear message.

Possible adjustment
-        let tenants = if let Some(tenants) = PARSEABLE.list_tenants() {
-            tenants.into_iter().map(|v| Some(v)).collect()
-        } else {
-            vec![None]
-        };
+        let tenants = match PARSEABLE.list_tenants() {
+            Some(tenants) if !tenants.is_empty() => tenants.into_iter().map(Some).collect(),
+            _ => vec![None],
+        };
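
The fallback can be checked in isolation. A minimal sketch, where `tenant_scopes` is a hypothetical helper name that takes the result of `list_tenants()` as an argument:

```rust
/// Maps the tenant listing to the scopes the scheduler should iterate.
/// Both `None` and `Some(vec![])` fall back to the single tenantless scope.
pub fn tenant_scopes(listed: Option<Vec<String>>) -> Vec<Option<String>> {
    match listed {
        Some(tenants) if !tenants.is_empty() => tenants.into_iter().map(Some).collect(),
        _ => vec![None],
    }
}
```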

71-73: Include tenant_id in retention-load warnings.

In multi-tenant mode this warning is hard to attribute without tenant context. Consider adding tenant_id to the log fields.

Example tweak
-                    Err(err) => {
-                        warn!("failed to load retention config for {stream_name} due to {err:?}")
-                    }
+                    Err(err) => {
+                        warn!(
+                            "failed to load retention config for stream={stream_name} tenant={tenant_id:?} due to {err:?}"
+                        )
+                    }

188-222: Add tenant context to retention delete logs and failures.

With tenant scoping, it’s hard to trace delete actions without tenant info. Consider adding tenant_id to the info/error logs and logging list_dates failures.

Example tweak
     pub(super) async fn delete(stream_name: String, days: u32, tenant_id: &Option<String>) {
-        info!("running retention task - delete for stream={stream_name}");
+        info!(
+            "running retention task - delete for stream={stream_name} tenant={tenant_id:?}"
+        );
         let store = PARSEABLE.storage.get_object_store();

         let Ok(mut dates) = store.list_dates(&stream_name, tenant_id).await else {
+            tracing::error!(
+                "failed to list dates for retention delete stream={stream_name} tenant={tenant_id:?}"
+            );
             return;
         };
src/analytics.rs (1)

209-232: Simplify .map(|v| Some(v)) to .map(Some).

The tenant iteration logic is correct—when tenants exist, iterate per-tenant; otherwise fall back to vec![None] for non-multi-tenant deployments. However, line 210 can be simplified.

✨ Suggested simplification
     let tenants = if let Some(tenants) = PARSEABLE.list_tenants() {
-        tenants.into_iter().map(|v| Some(v)).collect()
+        tenants.into_iter().map(Some).collect()
     } else {
         vec![None]
     };
src/handlers/http/modal/mod.rs (1)

632-684: Tenant-aware hot-tier bootstrap looks good; remove commented legacy block.
The tenant iteration and DEFAULT_TENANT mapping (lines 632-684) look solid. Consider deleting the commented-out legacy block to keep the function clean.

♻️ Suggested cleanup
-        // let streams_guard = PARSEABLE.streams.read().unwrap();
-        // streams_guard
-        //     .iter()
-        //     .filter_map(|(stream_name, stream)| {
-        //         // Skip if hot tier metadata file already exists for this stream
-        //         if hot_tier_manager.check_stream_hot_tier_exists(stream_name) {
-        //             return None;
-        //         }
-
-        //         // Get the hot tier configuration from the in-memory stream metadata
-        //         stream
-        //             .get_hot_tier()
-        //             .map(|config| (stream_name.clone(), config))
-        //     })
-        //     .collect()
src/catalog/mod.rs (1)

467-467: Remove debug logging artifact.

This warn! log for manifest path appears to be a debugging artifact that should be removed or downgraded to trace! level before merging.

♻️ Suggested fix
-    tracing::warn!("manifest path_url= {path_url}");
+    tracing::trace!("manifest path_url= {path_url}");
src/storage/localfs.rs (1)

494-498: Prefer DEFAULT_TENANT constant over "default" literal.
Keeps labels consistent if the default tenant name ever changes.

♻️ Suggested tweak
-                    "default",
+                    DEFAULT_TENANT,
@@
-                    "default",
+                    DEFAULT_TENANT,

Also applies to: 530-534

src/handlers/http/logstream.rs (1)

416-477: Consider removing the stale “TODO tenants” note.
tenant_id is already threaded through hot‑tier operations, so the TODO is misleading.

src/metastore/metastore_traits.rs (2)

316-317: Commented-out code should be removed or tracked.

The commented-out method declaration put_tenant_metadata should either be removed if no longer needed, or tracked via a TODO/issue if planned for future implementation.

🧹 Suggested cleanup
     async fn delete_tenant(&self, tenant_id: &str) -> Result<(), MetastoreError>;
-    // async fn put_tenant_metadata(&self, obj: &dyn MetastoreObject) -> Result<(), MetastoreError>;

46-335: Consider using Option<&str> instead of &Option<String> for ergonomics.

Throughout this trait, the pattern tenant_id: &Option<String> is used consistently. However, Option<&str> is generally more ergonomic in Rust as it:

  • Avoids borrowing an owned container
  • Allows callers to pass Some("literal") directly without creating a String
  • Is a common Rust idiom for optional string parameters

This is a minor suggestion given the scope of changes, but worth considering for API polish if breaking changes are acceptable at this stage.
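
A small sketch of the ergonomic difference, using hypothetical helper names and a hypothetical "default" label. `Option::as_deref` converts `&Option<String>` to `Option<&str>`, so existing owned values can still be passed without cloning.

```rust
/// Hypothetical label used when no tenant is set.
pub const DEFAULT_LABEL: &str = "default";

/// Borrowed form: callers may pass `Some("literal")` directly.
pub fn tenant_label(tenant_id: Option<&str>) -> &str {
    tenant_id.unwrap_or(DEFAULT_LABEL)
}

/// Adapter for call sites that still hold an `Option<String>`.
pub fn tenant_label_from_owned(tenant_id: &Option<String>) -> &str {
    tenant_label(tenant_id.as_deref())
}
```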

Comment on lines 56 to 73
 impl TargetConfigs {
     /// Loads alerts from disk, blocks
     pub async fn load(&self) -> anyhow::Result<()> {
         let targets = PARSEABLE.metastore.get_targets().await?;
         let mut map = self.target_configs.write().await;
-        for target in targets {
-            map.insert(target.id, target);
+        for (tenant_id, targets) in targets {
+            let inner = targets
+                .into_iter()
+                .map(|mut t| {
+                    t.tenant = Some(tenant_id.clone());
+                    (t.id, t)
+                })
+                .collect();
+            map.insert(tenant_id, inner);
         }

         Ok(())
     }

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, find the metastore implementation of get_targets
find . -type f -name "*.rs" | xargs grep -l "get_targets" | head -20

Repository: parseablehq/parseable

Length of output: 311


🏁 Script executed:

# Search for the metastore module and get_targets implementation
rg "fn get_targets" -A 20 -B 2

Repository: parseablehq/parseable

Length of output: 7924


🏁 Script executed:

# Find DEFAULT_TENANT constant definition
rg "DEFAULT_TENANT" -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 50377


🏁 Script executed:

# Check the target.rs file structure to understand the context better
cat -n src/alerts/target.rs | head -80

Repository: parseablehq/parseable

Length of output: 3093


🏁 Script executed:

# Get the full get_targets implementation from object_store_metastore.rs
rg "async fn get_targets" -A 30 src/metastore/metastores/object_store_metastore.rs

Repository: parseablehq/parseable

Length of output: 1272


🏁 Script executed:

# Check how metastore handles default tenant in keys
rg "list_tenants\|DEFAULT_TENANT" src/metastore/metastores/object_store_metastore.rs -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look for how Target struct and tenant field are used elsewhere
rg "impl.*Target\|struct Target" -A 15 src/alerts/alert_types.rs | head -50

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check the Target struct definition
rg "struct Target" -A 20 src/alerts/

Repository: parseablehq/parseable

Length of output: 3362


🏁 Script executed:

# Look for other usages of Target.tenant to see how it's handled
rg "\.tenant\s*=" src/alerts/target.rs -B 3 -A 3

Repository: parseablehq/parseable

Length of output: 289


🏁 Script executed:

# Check how other code paths set target.tenant
rg "target\.tenant\s*=" -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if there's any code that converts DEFAULT_TENANT string back to None
rg "DEFAULT_TENANT.*None\|None.*DEFAULT_TENANT" -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look at how alerts load() function handles tenant_id (if it exists)
fd "alert" --type f -name "*.rs" | xargs rg "impl.*AlertConfigs\|pub async fn load" | head -20

Repository: parseablehq/parseable

Length of output: 295


🏁 Script executed:

# Search for other load() implementations that might handle DEFAULT_TENANT
rg "pub async fn load.*->.*Result" -A 25 | head -100

Repository: parseablehq/parseable

Length of output: 5646


🏁 Script executed:

# Check how target.tenant field is accessed/used after being loaded
rg "\.tenant\s*==" -B 2 -A 2 src/alerts/

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look at how get_tenant_id_from_key converts tenant fields
cat -n src/utils/mod.rs | grep -A 10 "get_tenant_id_from_key"

Repository: parseablehq/parseable

Length of output: 894


🏁 Script executed:

# Check if there are any tests for target loading
find . -name "*test*" -o -name "*spec*" | xargs grep -l "target.*load\|TargetConfigs" 2>/dev/null

Repository: parseablehq/parseable

Length of output: 49


Convert DEFAULT_TENANT string to None in Target.tenant field for consistency.

When get_targets() returns targets keyed by the DEFAULT_TENANT sentinel, the current code stores Some(DEFAULT_TENANT) in t.tenant. This violates the codebase convention where None represents the default tenant and Some(String) represents a specific tenant (as seen in get_tenant_id_from_key() and throughout the codebase).

Update the mapping to convert the DEFAULT_TENANT string to None:

Suggested fix
let inner = targets
    .into_iter()
    .map(|mut t| {
        t.tenant = if tenant_id == DEFAULT_TENANT {
            None
        } else {
            Some(tenant_id.clone())
        };
        (t.id, t)
    })
    .collect();
🤖 Prompt for AI Agents
In `@src/alerts/target.rs` around lines 56 - 73, In TargetConfigs::load (the async
load method) change how t.tenant is set so that the sentinel DEFAULT_TENANT
string is converted to None instead of Some(DEFAULT_TENANT); specifically,
inside the map over targets (the closure that mutates t) set t.tenant = None
when tenant_id == DEFAULT_TENANT, otherwise set t.tenant =
Some(tenant_id.clone()), then continue to collect and insert as before.
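
The sentinel-to-None mapping can be factored into one helper so every load path applies it the same way. A minimal sketch; `normalize_tenant` is a hypothetical name, and the value given to `DEFAULT_TENANT` here is an assumption for illustration only.

```rust
/// Assumed sentinel value for illustration; the real constant lives in the codebase.
pub const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Converts a storage-level tenant key into the in-memory convention:
/// `None` for the default tenant, `Some(id)` for any other tenant.
pub fn normalize_tenant(raw: &str) -> Option<String> {
    if raw == DEFAULT_TENANT {
        None
    } else {
        Some(raw.to_owned())
    }
}
```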

Comment on lines 113 to 115
     let schema = PARSEABLE
-        .get_stream(&table_name)
+        .get_stream(&table_name, &None)
         .map_err(|err| Status::failed_precondition(err.to_string()))?

⚠️ Potential issue | 🟠 Major

Flight queries still default to the global tenant.
Using &None for stream access, ingestor lookup, authorization, execution, and staging cleanup means Flight requests for non-default tenants will hit the default tenant, risking data leakage or incorrect auth. Please derive tenant_id from the session key (or ticket metadata) and pass it through these calls.

💡 Suggested approach
  let session_state = QUERY_SESSION.get_ctx().state();

  let key = extract_session_key(req.metadata())
      .map_err(|e| Status::unauthenticated(e.to_string()))?;

+ let (_, tenant_raw) =
+     Users.get_userid_from_session(&key).ok_or_else(|| Status::unauthenticated("invalid session"))?;
+ let tenant_id = if tenant_raw == DEFAULT_TENANT { None } else { Some(tenant_raw) };

- let schema = PARSEABLE.get_stream(&table_name, &None) ...
+ let schema = PARSEABLE.get_stream(&table_name, &tenant_id) ...

- let ingester_metadatas = get_node_info(NodeType::Ingestor, &None).await?;
+ let ingester_metadatas = get_node_info(NodeType::Ingestor, &tenant_id).await?;

- user_auth_for_datasets(&permissions, &streams, &None).await?;
+ user_auth_for_datasets(&permissions, &streams, &tenant_id).await?;

- let (records, _) = execute(query, false, &None).await?;
+ let (records, _) = execute(query, false, &tenant_id).await?;

- PARSEABLE.get_or_create_stream(&stream_name, &None).clear();
+ PARSEABLE.get_or_create_stream(&stream_name, &tenant_id).clear();

Also applies to: 169-210, 238-238

🤖 Prompt for AI Agents
In `@src/handlers/airplane.rs` around lines 113 - 115, The Flight handler is using
&None for tenant context (e.g., PARSEABLE.get_stream(&table_name, &None)), which
causes all Flight queries to use the global/default tenant; derive the tenant_id
from the session key or ticket metadata in the Flight request and thread it
through every tenant-aware call: replace &None with &tenant_id when calling
PARSEABLE.get_stream, pass tenant_id into ingestor lookup, authorization checks,
execution calls, and staging cleanup routines so each of those code paths use
the extracted tenant_id rather than the default; update the same pattern in the
other Flight-related areas (the ingestor/authorization/execution/staging cleanup
call sites) to accept and use the tenant_id parameter.

Comment on lines +120 to +128
pub async fn readiness(req: HttpRequest) -> HttpResponse {
let tenant_id = get_tenant_id_from_request(&req);
// Check the object store connection
if PARSEABLE.storage.get_object_store().check().await.is_ok() {
if PARSEABLE
.storage
.get_object_store()
.check(&tenant_id)
.await
.is_ok()

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's find and examine the get_tenant_id_from_request function
find . -name "*.rs" -type f | head -20

Repository: parseablehq/parseable

Length of output: 610


🏁 Script executed:

# Search for get_tenant_id_from_request definition
rg "fn get_tenant_id_from_request" -A 10

Repository: parseablehq/parseable

Length of output: 654


🏁 Script executed:

# Also check the readiness function in health_check.rs
rg "pub async fn readiness" -A 20

Repository: parseablehq/parseable

Length of output: 977


🏁 Script executed:

# Search for all usages of get_tenant_id_from_request
rg "get_tenant_id_from_request" -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 38932


Fix panic on invalid tenant header.
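get_tenant_id_from_request unwraps HeaderValue::to_str(), so a malformed tenant header causes a panic in the readiness endpoint. Replace .unwrap() with .ok() so that header values that cannot be read as a string (to_str() rejects anything outside visible ASCII) are handled without panicking: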

Safe fallible header parsing
pub fn get_tenant_id_from_request(req: &HttpRequest) -> Option<String> {
-    if let Some(tenant_value) = req.headers().get("tenant") {
-        Some(tenant_value.to_str().unwrap().to_owned())
-    } else {
-        None
-    }
+    match req.headers().get("tenant") {
+        Some(tenant_value) => tenant_value.to_str().ok().map(|s| s.to_owned()),
+        None => None,
+    }
}
🤖 Prompt for AI Agents
In `@src/handlers/http/health_check.rs` around lines 120 - 128,
get_tenant_id_from_request currently calls HeaderValue::to_str().unwrap(), which
panics on malformed UTF-8 and crashes the readiness handler; change that unwrap
to to_str().ok() so header parsing is fallible (returning Option) and update
callers like the readiness(HttpRequest) function to handle the None case safely
(e.g., treat missing/invalid tenant as None or return a controlled error
response) — locate and modify get_tenant_id_from_request and the readiness usage
to accept an Option<String> (or handle Option returned) instead of assuming a
valid String.
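
The same fallible pattern can be shown without the web framework. A minimal sketch over raw header bytes, using `std::str::from_utf8` as a stand-in for `HeaderValue::to_str()`; the helper name is hypothetical.

```rust
/// Returns the tenant id when the header is present and decodes cleanly;
/// a missing or undecodable header yields `None` instead of a panic.
pub fn tenant_from_header_bytes(raw: Option<&[u8]>) -> Option<String> {
    raw.and_then(|bytes| std::str::from_utf8(bytes).ok())
        .map(str::to_owned)
}
```

Whether an invalid header should be treated as "no tenant" or rejected with a 400 response is a policy choice this sketch does not settle.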

Comment on lines +101 to +102
// None because no new data will be created using this method
match storage.list_dirs_relative(&prefix, &None).await {

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find list_dirs_relative implementations and trait definitions
echo "=== Searching for list_dirs_relative implementations ==="
rg -n "fn list_dirs_relative" --type rust -B 2 -A 10

Repository: parseablehq/parseable

Length of output: 4304


🏁 Script executed:

#!/bin/bash
# Also search for the trait definition
echo "=== Searching for trait definition ==="
rg -n "list_dirs_relative" --type rust | head -30

Repository: parseablehq/parseable

Length of output: 1527


🏁 Script executed:

#!/bin/bash
# Check the context around lines 101-102 in listing_table_builder.rs
echo "=== Context around lines 101-102 ==="
sed -n '95,110p' src/query/listing_table_builder.rs

Repository: parseablehq/parseable

Length of output: 716


🏁 Script executed:

#!/bin/bash
# Search for other uses of list_dirs_relative to understand tenant_id usage patterns
echo "=== Other uses of list_dirs_relative ==="
rg -n "list_dirs_relative" --type rust -B 1 -A 1

Repository: parseablehq/parseable

Length of output: 3493


Clarify the comment to reflect correct multi-tenancy behavior.

The comment "None because no new data will be created using this method" is misleading. This is a read/listing operation, not a data creation operation. More importantly, passing None for tenant_id defaults to DEFAULT_TENANT (as seen in the implementations across s3.rs, gcs.rs, and azure_blob.rs).

The comment should explain why the default tenant is appropriate for this historical data listing context, for example: "tenant_id is None to list from the default tenant (multi-tenant queries handled at a higher level)" or similar.

Suggested comment improvement
-            // None because no new data will be created using this method
+            // tenant_id is None to use the default tenant for this historical data listing
             match storage.list_dirs_relative(&prefix, &None).await {
🤖 Prompt for AI Agents
In `@src/query/listing_table_builder.rs` around lines 101 - 102, Update the
misleading comment above the call to storage.list_dirs_relative(&prefix,
&None).await in listing_table_builder.rs to state that tenant_id is passed as
None to intentionally use the default tenant (DEFAULT_TENANT) for historical
listing, and that multi-tenancy filtering is handled at a higher level;
reference storage.list_dirs_relative and the prefix variable so reviewers can
locate the call.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
src/storage/s3.rs (1)

944-952: Use tenant label consistently in list_old_streams HEAD checks.

The HEAD metrics use "" as tenant, while the surrounding LIST metrics use DEFAULT_TENANT. This makes labels inconsistent and can fragment metrics.

🔧 Suggested fix
-                increment_object_store_calls_by_date(
-                    "HEAD",
-                    &Utc::now().date_naive().to_string(),
-                    "",
-                );
+                increment_object_store_calls_by_date(
+                    "HEAD",
+                    &Utc::now().date_naive().to_string(),
+                    tenant_str,
+                );
src/parseable/streams.rs (1)

118-126: Update all Stream::new call sites to pass the required tenant_id parameter.

The signature requires tenant_id: &Option<String> as the 5th parameter. Multiple test call sites pass only 4 arguments and will fail to compile. Update all usages at lines 1216, 1234, 1252, 1270, 1291, 1314, 1348, 1377, 1434, 1460, 1483, 1510, 1533, and 1564 to pass &None as the final argument.

src/storage/azure_blob.rs (1)

214-266: Move object-store call metrics to the success path.

increment_object_store_calls_by_date is invoked before the result is validated, so failed GET/PUT calls are still counted. Please move the increment into the Ok branch (and mirror this pattern in the other storage ops in this file) so metrics only reflect successful operations. Based on learnings, this is the intended behavior.

🛠️ Suggested fix (apply same pattern to other ops)
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
-
         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
                 increment_files_scanned_in_object_store_calls_by_date(
                     "GET",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
                 increment_bytes_scanned_in_object_store_calls_by_date(
                     "GET",
                     body.len() as u64,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
                 Ok(body)
             }
             Err(err) => Err(err.into()),
         }
-        increment_object_store_calls_by_date("PUT", &Utc::now().date_naive().to_string(), tenant);
         match resp {
             Ok(_) => {
+                increment_object_store_calls_by_date(
+                    "PUT",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 increment_files_scanned_in_object_store_calls_by_date(
                     "PUT",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
                 Ok(())
             }
             Err(err) => Err(err.into()),
         }
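
The success-path pattern can be expressed as a small wrapper. A minimal sketch using a plain atomic counter in place of the real metrics functions; `counted` is a hypothetical helper, not part of the codebase.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Increments `counter` only when `result` is `Ok`, then passes the result through,
/// so failed calls never show up in the call count.
pub fn counted<T, E>(counter: &AtomicU64, result: Result<T, E>) -> Result<T, E> {
    if result.is_ok() {
        counter.fetch_add(1, Ordering::Relaxed);
    }
    result
}
```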
src/event/format/mod.rs (1)

221-235: Fix missing tenant_id parameter in Kafka processor call site.

The into_event trait definition now requires a tenant_id parameter, but the Kafka processor in src/connectors/kafka/processor.rs:84 is missing this argument. Three other call sites (ingest.rs, modal/utils/ingest_utils.rs, field_stats.rs) correctly pass the parameter. Update the Kafka processor call to include tenant_id as the final argument.

src/alerts/alerts_utils.rs (1)

131-150: Auth token is ignored in remote query path.
auth_token is accepted but never forwarded, so Prism alerts will always run with the cluster token. That can break RBAC expectations or tenant isolation. Please pass it into send_query_request.

🔧 Suggested fix
-    let (result_value, _) = send_query_request(None, &query_request, tenant_id)
+    let auth_header = auth_token.map(|token| {
+        let mut map = reqwest::header::HeaderMap::new();
+        map.insert(
+            reqwest::header::AUTHORIZATION,
+            reqwest::header::HeaderValue::from_str(&token).unwrap(),
+        );
+        map
+    });
+    let (result_value, _) = send_query_request(auth_header, &query_request, tenant_id)
src/storage/store_metadata.rs (1)

307-329: Missing directory creation for tenant-specific staging path.

When tenant_id is provided, the staging metadata is written to staging_dir/{tenant_id}/.parseable.json, but the tenant subdirectory may not exist. The OpenOptions::new().create(true) only creates the file, not intermediate directories.

🐛 Suggested fix
 pub fn put_staging_metadata(meta: &StorageMetadata, tenant_id: &Option<String>) -> io::Result<()> {
     let mut staging_metadata = meta.clone();
     staging_metadata.server_mode = PARSEABLE.options.mode;
     staging_metadata.staging = PARSEABLE.options.staging_dir().to_path_buf();
     let path = if let Some(tenant_id) = tenant_id.as_ref() {
-        PARSEABLE
+        let tenant_dir = PARSEABLE
             .options
             .staging_dir()
-            .join(tenant_id)
-            .join(PARSEABLE_METADATA_FILE_NAME)
+            .join(tenant_id);
+        create_dir_all(&tenant_dir)?;
+        tenant_dir.join(PARSEABLE_METADATA_FILE_NAME)
     } else {
         PARSEABLE
             .options
             .staging_dir()
             .join(PARSEABLE_METADATA_FILE_NAME)
     };
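
A standalone sketch of the directory-then-file order, using only the standard library. The function name and parameters are hypothetical; the point is that `create_dir_all` runs before the file is opened, and that it is a no-op when the directory already exists.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes `contents` to `<staging_dir>/<tenant>/<file_name>` (or directly under
/// `staging_dir` when no tenant is given), creating missing directories first.
pub fn write_tenant_metadata(
    staging_dir: &Path,
    tenant_id: Option<&str>,
    file_name: &str,
    contents: &[u8],
) -> io::Result<PathBuf> {
    let dir = match tenant_id {
        Some(tenant) => staging_dir.join(tenant),
        None => staging_dir.to_path_buf(),
    };
    // `create(true)` on the file does not create parent directories.
    fs::create_dir_all(&dir)?;
    let path = dir.join(file_name);
    let mut file = OpenOptions::new()
        .create(true)
        .write(true)
        .truncate(true)
        .open(&path)?;
    file.write_all(contents)?;
    Ok(path)
}
```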
src/alerts/alert_structs.rs (1)

291-373: Add tenantId and tenant_id to RESERVED_FIELDS to prevent client-supplied values from spoofing the server-set tenant.

Since other_fields is flattened during serialization and tenant_id is now a first-class field in AlertConfig, a client can submit tenantId/tenant_id in the request payload where it will be captured in other_fields. These values are not currently stripped during sanitization (lines 302–306), which means they will be serialized alongside the explicit tenant_id field, potentially creating duplicate keys or confusing the server's tenant assignment. Add both variants to RESERVED_FIELDS to ensure they are removed.

🔧 Proposed fix
 const RESERVED_FIELDS: &[&str] = &[
     "lastTriggeredAt",
     "last_triggered_at",
+    "tenantId",
+    "tenant_id",
 ];
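
The stripping step can be sketched over a plain map. This is an illustration only: it models `other_fields` as a `HashMap<String, String>` and lists just the four keys named above, whereas the real constant may hold more entries.

```rust
use std::collections::HashMap;

/// Illustrative subset of reserved keys; the real list may be longer.
const RESERVED_FIELDS: &[&str] = &[
    "lastTriggeredAt",
    "last_triggered_at",
    "tenantId",
    "tenant_id",
];

/// Drops any client-supplied key that collides with a server-owned field.
pub fn strip_reserved(other_fields: &mut HashMap<String, String>) {
    other_fields.retain(|key, _| !RESERVED_FIELDS.contains(&key.as_str()));
}
```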
src/metastore/metastores/object_store_metastore.rs (1)

896-941: Include tenant prefix when constructing the nested date_path.

The date_path must include the tenant segment to maintain consistency with the initial root path. When tenant_id is present, date_path should be constructed as {tenant}/{stream_name}/{date}, not {stream_name}/{date}. See how list_streams correctly handles this pattern with conditional path construction.

Suggested fix
for date in dates {
    let date_path = if let Some(tenant) = tenant_id {
        object_store::path::Path::from_iter([
            tenant,
            stream_name,
            &date
        ])
    } else {
        object_store::path::Path::from(format!("{}/{}", stream_name, &date))
    };
    let resp = self.storage.list_with_delimiter(Some(date_path)).await?;
    // ... rest of code
}
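
The conditional prefix can be checked with plain strings. A minimal sketch with a hypothetical helper that returns the path as a `String` rather than an `object_store` path:

```rust
/// Builds `{tenant}/{stream}/{date}` when a tenant is set, else `{stream}/{date}`.
pub fn date_path(tenant_id: Option<&str>, stream_name: &str, date: &str) -> String {
    match tenant_id {
        Some(tenant) => format!("{tenant}/{stream_name}/{date}"),
        None => format!("{stream_name}/{date}"),
    }
}
```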
🤖 Fix all issues with AI agents
In `@src/handlers/http/cluster/mod.rs`:
- Around line 918-920: The call to get_tenant_id_from_request within
get_cluster_info can panic because get_tenant_id_from_request currently uses
to_str().unwrap(); change get_tenant_id_from_request (in src/utils/mod.rs) to
return a Result<String, Error> or Option<String> instead of unwrapping, validate
the header with to_str()? and map any error to a clear Err/None, and then update
get_cluster_info to handle that return (e.g., return an HTTP 400/BadRequest or
other appropriate StreamError when the tenant header is invalid) rather than
taking a &str from an unwrap—this prevents server panics on malformed tenant
headers and centralizes error handling.

In `@src/handlers/http/middleware.rs`:
- Around line 171-210: The non-ingest branch incorrectly treats Ok((_, None))
from get_user_and_tenant_from_request as an InvalidTenantId; instead, treat None
as the DEFAULT_TENANT and insert that into headers. Update the non-ingest logic
inside the PARSEABLE.options.is_multi_tenant() branch (the block handling
Action::Ingest == false) so that when
get_user_and_tenant_from_request(req.request()) returns Ok((_, tenant)) you
compute tid = tenant.as_ref().map_or(DEFAULT_TENANT.to_string(), |v| v.clone()),
insert that tid into req.headers_mut(), and do not set header_error; only set
header_error for the Err(_) case. Ensure references to DEFAULT_TENANT,
get_user_and_tenant_from_request, header_error, PostError::Header and
ParseHeaderError::InvalidTenantId are used to locate and modify the code.

In `@src/handlers/http/query.rs`:
- Around line 112-126: Replace the spoofable tenant extraction in query by
calling get_user_and_tenant_from_request(req) instead of
get_tenant_id_from_request(&req), map any error to QueryError::Unauthorized,
extract the tenant from that result and pass it to
create_streams_for_distributed(tables.clone(), &tenant) and to set
session_state.config_mut().options_mut().catalog.default_schema (use
tenant.as_ref().map_or("public".into(), |v| v.to_owned())); apply the same
change in get_counts so both paths derive tenant from the authenticated session
rather than the tenant header.

In `@src/storage/object_storage.rs`:
- Around line 1214-1226: schema_path currently always inserts the tenant segment
when tenant_id is Some(...), which causes divergence from stream_relative_path
and existing layout for DEFAULT_TENANT; update schema_path (and the analogous
stream_json_path) to treat DEFAULT_TENANT the same as None by skipping the
tenant segment when tenant_id.as_deref() == Some(DEFAULT_TENANT) (i.e., compute
tenant = tenant_id.as_ref().filter(|t| t.as_str() != DEFAULT_TENANT).map_or("",
|v| v) or equivalent) so both the ingest branch (file_name =
format!(".ingestor.{id}{SCHEMA_FILE_NAME}")) and the normal branch build paths
without the tenant prefix for the default tenant, matching stream_relative_path
and parquet paths.

In `@src/storage/s3.rs`:
- Around line 341-353: The metric increment is currently called before knowing
operation success in S3::_get_object; move the call to
increment_object_store_calls_by_date into the success (Ok) branch after the
client.get completes successfully, and apply the same change pattern to
S3::_put_object, S3::_delete_prefix (list/delete), S3::_list_dates,
S3::_upload_file, the small-file branch in S3::_upload_multipart, and the
helpers head/delete_object/check/list so that
increment_object_store_calls_by_date("GET"/"PUT"/"DELETE"/etc., ...) is only
invoked after confirming the operation returned Ok, not before.
♻️ Duplicate comments (51)
src/query/listing_table_builder.rs (1)

101-102: Misleading comment for tenant_id parameter.

This issue was already flagged in a previous review. The comment "None because no new data will be created using this method" is misleading since this is a read/listing operation. The None actually causes the code to use DEFAULT_TENANT. The comment should clarify that tenant_id is None to use the default tenant for historical data listing.

src/migration/mod.rs (1)

499-507: Inconsistent filename: hardcoded string vs constant.

Lines 504 and 506 use the hardcoded string ".parseable.json" while the rest of the codebase (including get_staging_metadata at line 468-469) uses the PARSEABLE_METADATA_FILE_NAME constant. This inconsistency could lead to path mismatches if the constant value ever changes.

Suggested fix: use the constant
     let path = if let Some(tenant) = tenant_id.as_ref() {
         config
             .options
             .staging_dir()
             .join(tenant)
-            .join(".parseable.json")
+            .join(PARSEABLE_METADATA_FILE_NAME)
     } else {
-        config.options.staging_dir().join(".parseable.json")
+        config.options.staging_dir().join(PARSEABLE_METADATA_FILE_NAME)
     };
src/handlers/http/middleware.rs (1)

424-435: Missing/unknown tenant still yields Authorized.

check_suspension returns Authorized when the tenant header is missing or the tenant isn’t found, which can bypass tenant suspension. Please confirm this is intended.

src/handlers/http/modal/ingest/ingestor_role.rs (1)

45-51: Tenant validation condition is inverted.

The guard currently rejects when the request tenant matches the payload tenant, but the error message indicates it should reject only when they differ. This blocks valid same-tenant updates.

🐛 Proposed fix
-    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant.eq(&sync_req.tenant_id)) {
+    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant.ne(&sync_req.tenant_id)) {
         return Err(RoleError::Anyhow(anyhow::Error::msg(
             "non super-admin user trying to create role for another tenant",
         )));
     }
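
As a rough illustration of the intended guard, the check can be reduced to a small predicate. This is a sketch only: `rejects_cross_tenant` is a hypothetical helper, and the value given to `DEFAULT_TENANT` here is a placeholder rather than the constant defined in the codebase.

```rust
/// Assumed placeholder value; the real constant is defined in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Hypothetical helper: returns true when a role sync request must be rejected,
/// i.e. the caller is not the default (super-admin) tenant and targets another tenant.
fn rejects_cross_tenant(req_tenant: &str, payload_tenant: &str) -> bool {
    req_tenant != DEFAULT_TENANT && req_tenant != payload_tenant
}
```

With this shape, a same-tenant update passes, a cross-tenant update is rejected, and the default tenant may act on any tenant.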
src/handlers/airplane.rs (1)

113-115: Flight queries still default to global tenant — data isolation risk.

All tenant-aware calls in this file pass &None for tenant_id:

  • Line 114: get_stream(&table_name, &None)
  • Line 169: get_node_info(NodeType::Ingestor, &None)
  • Line 202: user_auth_for_datasets(&permissions, &streams, &None)
  • Line 209: execute(query, false, &None)
  • Line 238: get_or_create_stream(&stream_name, &None)

In a multi-tenant deployment, Flight requests will query/modify the default tenant's data regardless of the authenticated user's tenant, risking data leakage or incorrect authorization. Derive tenant_id from the session key or ticket metadata and thread it through these calls.

Also applies to: 169-171, 202-206, 209-211, 238-238

src/handlers/http/modal/query/querier_rbac.rs (1)

79-79: User created without tenant association.

user::User::new_basic(username.clone(), None) passes None for the tenant parameter even though tenant_id is available from the request (line 50). New users won't be associated with their tenant.

-    let (user, password) = user::User::new_basic(username.clone(), None);
+    let (user, password) = user::User::new_basic(username.clone(), tenant_id.clone());
src/rbac/map.rs (1)

311-322: Critical bug: Session expiration logic is inverted.

Line 321 retains sessions where expiry < &now, which keeps expired sessions and removes valid ones. The condition should be expiry > &now to retain sessions that have not yet expired.

-        sessions.retain(|(_, expiry)| expiry < &now);
+        sessions.retain(|(_, expiry)| expiry > &now);
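
A minimal, std-only sketch of the corrected retention rule follows. The `Sessions` alias and `remove_expired` are hypothetical stand-ins (the real code likely uses a different timestamp type); the point is only that `retain` keeps the entries for which the predicate is true.

```rust
use std::time::SystemTime;

/// Hypothetical stand-in for a user's session list: (session key, expiry).
type Sessions = Vec<(String, SystemTime)>;

/// Drops every session whose expiry is not strictly in the future.
fn remove_expired(sessions: &mut Sessions, now: SystemTime) {
    // `retain` keeps elements for which the closure returns true,
    // so the predicate must describe the sessions that are still valid.
    sessions.retain(|(_, expiry)| *expiry > now);
}
```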
src/storage/localfs.rs (1)

442-460: Tenant_id isn't applied to LocalFS paths for stream-scoped operations.

While tenant_id is now used for metrics, the filesystem paths still operate on self.root.join(stream_name) without tenant prefixing:

  • delete_stream (line 447): let path = self.root.join(stream_name);
  • list_dirs (line 560): fs::read_dir(&self.root)
  • list_dates (line 635): let path = self.root.join(stream_name);

In a multi-tenant deployment with tenant-prefixed filesystem layout, these operations will mix tenant data. Either confirm that upstream code already prefixes stream_name with tenant, or scope paths by tenant:

-        let path = self.root.join(stream_name);
+        let tenant_str = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let path = self.root.join(tenant_str).join(stream_name);

Also applies to: 555-593, 630-663

src/hottier.rs (3)

105-116: Logic error in filtering: uses && instead of correct exclusion logic.

Lines 108-109 use && stream != current_stream && tenant_id != *current_tenant_id, which counts a stream only when BOTH its name and its tenant differ from the current ones. This wrongly skips every other stream in the current tenant, as well as same-named streams in other tenants.

The intent is to exclude only the current stream for the current tenant:

 for tenant_id in tenants {
     for stream in PARSEABLE.streams.list(&tenant_id) {
         if self.check_stream_hot_tier_exists(&stream, &tenant_id)
-            && stream != current_stream
-            && tenant_id != *current_tenant_id
+            && !(stream == current_stream && tenant_id == *current_tenant_id)
         {
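
The exclusion rule can be checked in isolation with a small sketch. `StreamKey` and `other_streams` are hypothetical names, and the hot-tier existence check is left out; only the (tenant, stream) comparison is modeled.

```rust
/// Hypothetical (tenant, stream) pair; `None` stands for the default tenant.
type StreamKey = (Option<String>, String);

/// Returns every key except the one matching BOTH the current tenant and stream.
fn other_streams<'a>(all: &'a [StreamKey], current: &StreamKey) -> Vec<&'a StreamKey> {
    all.iter()
        .filter(|(tenant, stream)| !(tenant == &current.0 && stream == &current.1))
        .collect()
}
```

A stream in the same tenant with a different name, or a same-named stream in another tenant, stays in the result.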

596-603: Avoid panics in check_stream_hot_tier_exists.

Line 601 calls self.hot_tier_file_path(stream, tenant_id).unwrap() which can panic on path conversion errors. Since this is an existence check, it should return false on errors instead:

 pub fn check_stream_hot_tier_exists(&self, stream: &str, tenant_id: &Option<String>) -> bool {
-    let path = self.hot_tier_file_path(stream, tenant_id).unwrap();
-    PathBuf::from(path.to_string()).exists()
+    match self.hot_tier_file_path(stream, tenant_id) {
+        Ok(path) => PathBuf::from(path.to_string()).exists(),
+        Err(_) => false,
+    }
 }

779-808: create_pstats_hot_tier should use INTERNAL_STREAM_HOT_TIER_SIZE_BYTES.

Line 796 allocates MIN_STREAM_HOT_TIER_SIZE_BYTES (10 GiB) for the pstats internal stream, while pmeta (line 767-769) uses INTERNAL_STREAM_HOT_TIER_SIZE_BYTES (10 MiB). For consistency with other internal streams:

                     let mut stream_hot_tier = StreamHotTier {
                         version: Some(CURRENT_HOT_TIER_VERSION.to_string()),
-                        size: MIN_STREAM_HOT_TIER_SIZE_BYTES,
+                        size: INTERNAL_STREAM_HOT_TIER_SIZE_BYTES,
                         used_size: 0,
-                        available_size: MIN_STREAM_HOT_TIER_SIZE_BYTES,
+                        available_size: INTERNAL_STREAM_HOT_TIER_SIZE_BYTES,
                         oldest_date_time_entry: None,
                     };
src/parseable/streams.rs (4)

657-658: Downgrade per-file warn log in parquet conversion.

tracing::warn!(part_path=...) is per-file and will spam logs in normal operation; this should be trace or removed.


1068-1090: Remove debug warn logs in get_or_create.

The multi-line warn! plus the “creating new stream” and “inserted stream” warn logs are noisy for normal flows.


1104-1112: Reduce missing-tenant warning verbosity.

This warn fires on normal “tenant not registered yet” checks. Prefer debug/trace to avoid log noise.


1175-1181: Remove debug warn logs from flush_and_convert.

The tenant list and keyset warnings are debug artifacts and will be noisy in production.

src/handlers/http/cluster/mod.rs (2)

709-715: Remove or use the unused req parameter in sync_role_update.

req isn’t used; this will trigger unused warnings. Either remove it (and update callers) or rename to _req.


1900-1914: Prevent panic on invalid querier token header.

HeaderValue::from_str(&querier.token).unwrap() can panic on invalid characters. Handle the Result and return an error instead.

src/storage/object_storage.rs (1)

996-999: Downgrade per-file warning logs in the sync pipeline.

Both process_parquet_files and process_schema_files warn on every file path; this should be debug/trace to avoid log spam.

Also applies to: 1118-1120

src/handlers/http/correlation.rs (1)

41-63: Use session-derived tenant in get for consistency and safety.

get still derives tenant_id from the request header while delete uses the authenticated session. This inconsistency can allow cross-tenant access if headers are manipulated. Recommend switching get to session-based tenant extraction.

🔒 Suggested fix
 pub async fn get(
     req: HttpRequest,
     correlation_id: Path<String>,
 ) -> Result<impl Responder, CorrelationError> {
-    let tenant_id = get_tenant_id_from_request(&req);
     let correlation_id = correlation_id.into_inner();
     let session_key = extract_session_key_from_req(&req)
         .map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;
+    let tenant_id = get_user_and_tenant_from_request(&req)
+        .map(|(_, t)| t)
+        .map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;

     let correlation = CORRELATIONS
         .get_correlation(&correlation_id, &tenant_id)
         .await?;
src/otel/traces.rs (1)

922-923: Update test call site to pass tenant_id.
flatten_otel_traces now requires tenant_id, but the test still calls it without one.

🔧 Suggested fix
-        let result = flatten_otel_traces(&traces_data);
+        let result = flatten_otel_traces(&traces_data, "test-tenant");
src/query/stream_schema_provider.rs (2)

533-538: Reduce logging level from warn to debug or trace.

This logging runs on every table scan and will flood production logs. It appears to be debug instrumentation.

♻️ Suggested fix
-        tracing::warn!(
+        tracing::trace!(
             "entered scan with\ntenant- {:?}\nschema- {:?}\nstream- {}",
             self.tenant_id,
             self.schema,
             self.stream
         );

649-649: Remove or reduce debug logging.

This warn! log on every scan is too noisy for production.

♻️ Suggested fix
-        tracing::warn!(object_store_url=?object_store_url);
+        tracing::trace!(object_store_url=?object_store_url);
src/tenants/mod.rs (1)

36-40: Data duplication causes state divergence when suspending/resuming services.

TenantOverview stores suspended_services in two places: the HashSet<Service> field and inside meta.suspended_services. When suspend_service or resume_service is called, only the HashSet is updated, leaving meta.suspended_services stale. Since get_tenants() returns meta for persistence, suspension state changes won't be reflected.

Either sync both locations on mutation, or derive the suspension state from meta only.

Also applies to: 51-72
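
One way to rule out the divergence is to keep a single copy of the suspension state and derive every answer from it. The sketch below assumes simplified shapes: the `Service` variants, `TenantMeta`, and the method signatures are illustrative, not the definitions in src/tenants/mod.rs.

```rust
use std::collections::HashSet;

/// Illustrative service kinds; the real enum may differ.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Service {
    Ingest,
    Query,
}

/// Persisted tenant metadata: the only place suspension state is stored.
#[derive(Debug, Default)]
struct TenantMeta {
    suspended_services: HashSet<Service>,
}

#[derive(Debug, Default)]
struct TenantOverview {
    meta: TenantMeta,
}

impl TenantOverview {
    fn suspend_service(&mut self, service: Service) {
        self.meta.suspended_services.insert(service);
    }

    fn resume_service(&mut self, service: &Service) {
        self.meta.suspended_services.remove(service);
    }

    /// Derived from `meta`, so it cannot go stale relative to what is persisted.
    fn is_suspended(&self, service: &Service) -> bool {
        self.meta.suspended_services.contains(service)
    }
}
```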

src/alerts/alert_types.rs (1)

91-123: Auth credential extraction is incomplete - always returns None.

The logic to find an admin user for the tenant (lines 91-105) correctly identifies a user with admin privileges, but the credential extraction (lines 107-118) always returns None:

  • Native user: returns None with TODO comments
  • OAuth user: logs warning and returns None

This means execute_alert_query will receive None for auth_token, which may cause remote alert queries in Prism mode to fail authentication.

src/handlers/http/users/dashboards.rs (1)

36-39: Use session‑derived tenant for dashboard reads (header is spoofable).

get_tenant_id_from_request trusts the raw tenant header, allowing cross‑tenant listing. Prefer session‑derived tenant (discard the user_id). Apply the same change to get_dashboard/list_tags if they still read the header.

🔧 Proposed fix
-    let tenant_id = get_tenant_id_from_request(&req);
+    let (_user_id, tenant_id) = get_user_and_tenant_from_request(&req)?;
src/alerts/target.rs (1)

61-69: Normalize DEFAULT_TENANT to None on load for consistency.

Default-tenant targets currently load as Some("DEFAULT_TENANT"), which diverges from the established None == default convention and can create mismatches downstream.

🔧 Proposed fix
-            let inner = targets
-                .into_iter()
-                .map(|mut t| {
-                    t.tenant = Some(tenant_id.clone());
-                    (t.id, t)
-                })
-                .collect();
+            let inner = targets
+                .into_iter()
+                .map(|mut t| {
+                    t.tenant = if tenant_id == DEFAULT_TENANT {
+                        None
+                    } else {
+                        Some(tenant_id.clone())
+                    };
+                    (t.id, t)
+                })
+                .collect();
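
The None == default convention can be captured in a pair of small helpers, sketched here under the assumption that `DEFAULT_TENANT` is a string constant (its value below is a placeholder). `normalize_tenant` and `tenant_label` are hypothetical names.

```rust
/// Assumed placeholder value; the real constant is defined in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Maps a stored tenant id to the in-memory convention: default tenant becomes `None`.
fn normalize_tenant(tenant_id: &str) -> Option<String> {
    if tenant_id == DEFAULT_TENANT {
        None
    } else {
        Some(tenant_id.to_owned())
    }
}

/// Inverse direction: yields the label to use for metrics or paths.
fn tenant_label(tenant_id: &Option<String>) -> &str {
    tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)
}
```

Routing every load and store through one such pair keeps the two representations from drifting apart.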
src/handlers/http/query.rs (1)

82-100: Set DataFusion default schema for tenant in get_records_and_fields.

query() sets the schema, but this helper still runs with the global default. Counts routed through this path can resolve against public instead of the tenant schema.

🔧 Proposed fix
-    let session_state = QUERY_SESSION.get_ctx().state();
+    let mut session_state = QUERY_SESSION.get_ctx().state();
+    session_state
+        .config_mut()
+        .options_mut()
+        .catalog
+        .default_schema = tenant_id.as_ref().map_or("public".into(), |v| v.to_owned());
src/catalog/mod.rs (1)

563-576: Avoid empty tenant path segment in partition_path.
Using "" as the root yields a leading empty path component; elsewhere the codebase falls back to a tenant name or omits the segment. Align here to avoid path-matching inconsistencies.

♻️ Suggested fix
 pub fn partition_path(
     stream: &str,
     lower_bound: DateTime<Utc>,
     upper_bound: DateTime<Utc>,
     tenant_id: &Option<String>,
 ) -> RelativePathBuf {
-    let root = tenant_id.as_ref().map_or("", |v| v);
     let lower = lower_bound.date_naive().format("%Y-%m-%d").to_string();
     let upper = upper_bound.date_naive().format("%Y-%m-%d").to_string();
-    if lower == upper {
-        RelativePathBuf::from_iter([root, stream, &format!("date={lower}")])
-    } else {
-        RelativePathBuf::from_iter([root, stream, &format!("date={lower}:{upper}")])
-    }
+    let date_segment = if lower == upper {
+        format!("date={lower}")
+    } else {
+        format!("date={lower}:{upper}")
+    };
+    if let Some(tenant) = tenant_id.as_ref() {
+        RelativePathBuf::from_iter([tenant.as_str(), stream, &date_segment])
+    } else {
+        RelativePathBuf::from_iter([stream, &date_segment])
+    }
 }
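
To make the path behaviour easy to verify, here is a std-only sketch of the same idea. It returns a `String` instead of a `RelativePathBuf` and takes preformatted `YYYY-MM-DD` dates, both simplifications made only so the example has no external dependencies.

```rust
/// Builds "[tenant/]stream/date=..." without emitting an empty leading segment.
/// `lower` and `upper` are assumed to be preformatted "YYYY-MM-DD" strings.
fn partition_path(stream: &str, lower: &str, upper: &str, tenant_id: &Option<String>) -> String {
    let date_segment = if lower == upper {
        format!("date={lower}")
    } else {
        format!("date={lower}:{upper}")
    };

    let mut parts: Vec<&str> = Vec::new();
    // Only add the tenant segment when a tenant is actually present.
    if let Some(tenant) = tenant_id.as_deref() {
        parts.push(tenant);
    }
    parts.push(stream);
    parts.push(&date_segment);
    parts.join("/")
}
```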
src/rbac/mod.rs (1)

241-265: Basic-auth tenant ambiguity still exists.
get_user_tenant_from_basic/mutate_request_with_basic_user select the first matching tenant when credentials are duplicated across tenants, which can mis-route access. Consider enforcing username uniqueness across tenants or require explicit tenant selection (header/subdomain/credential field).

Also applies to: 305-369

src/rbac/user.rs (1)

377-385: Use each group user’s tenant when invalidating sessions.
add_roles uses the caller-provided tenant_id for all users, which can invalidate sessions in the wrong tenant if a group contains mixed-tenant users (even accidentally).

🔧 Suggested fix
-    for group_user in &self.users {
-        mut_sessions().remove_user(group_user.userid(), tenant_id);
-    }
+    for group_user in &self.users {
+        mut_sessions().remove_user(
+            group_user.userid(),
+            group_user.tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v),
+        );
+    }
src/handlers/http/oidc.rs (3)

112-116: Basic-auth tenant selection still defaults to None.
get_tenant_id_from_key() yields None for SessionKey::BasicAuth, so Users.get_user falls back to DEFAULT_TENANT even when the request specifies a tenant. Use get_tenant_id_from_request(&req) (or equivalent) in the BasicAuth branch.


141-155: Treat non-2xx cluster sync responses as failures.
reqwest::send() does not error on non-2xx status, so sync can silently fail.

🔧 Suggested fix
-                        INTRA_CLUSTER_CLIENT
+                        let resp = INTRA_CLUSTER_CLIENT
                             .post(url)
                             .header(header::AUTHORIZATION, node.token)
                             .header(header::CONTENT_TYPE, "application/json")
                             .json(&json!(
                                 {
                                     "sessionCookie": _session,
                                     "user": _user,
                                     "expiry": EXPIRY_DURATION
                                 }
                             ))
                             .send()
-                            .await?;
+                            .await?;
+                        resp.error_for_status()?;
                         Ok::<(), anyhow::Error>(())

323-327: New OIDC users are still created without tenant scope.
put_user(..., None) bypasses tenant isolation even though tenant_id is already derived.

src/query/mod.rs (1)

176-205: Don’t silently ignore schema registration errors.
register_schema failures during startup are dropped; log them so tenant query failures are diagnosable.

🔧 Suggested change
-                    let _ = catalog.register_schema(t, schema_provider);
+                    if let Err(e) = catalog.register_schema(t, schema_provider) {
+                        tracing::error!("failed to register schema for tenant {t}: {e:?}");
+                    }
src/prism/logstream/mod.rs (5)

111-117: Remove noisy debug logging in stats path.
The tracing::warn! here looks like temporary debug output.


240-249: Drop debug logging for dataset listing.
The tracing::warn! appears to be debug-only noise.


277-292: Avoid debug warnings for authorization/stream load checks.
These warn logs are likely debug artifacts.


324-335: Remove debug logging in dataset response assembly.
The tracing::warn! calls look like temporary debug output.


57-73: Restore real stats instead of defaults.
Returning QueriedStats::default() drops real stats and breaks dataset accuracy.

src/parseable/mod.rs (4)

1071-1086: TOCTOU race in add_tenant.
The read-then-write lock pattern can admit duplicate tenants under concurrency.


1130-1158: delete_tenant does not remove from self.tenants.
Leaves stale tenant entries in memory.


1160-1196: load_tenants has an empty branch and swallows lock failures.
This can hide errors and produce inconsistent tenant state.


1198-1207: list_tenants silently ignores poisoned locks.
Consider failing loudly instead of returning None.

src/metastore/metastores/object_store_metastore.rs (6)

344-360: get_alert_state_entry still ignores tenant context.
Alert state paths remain tenant-agnostic.


515-541: Dashboards are overwritten per user.
Each user insert replaces the tenant’s previous dashboards.


575-592: get_chats still ignores tenant iteration.
Multi-tenant chats from non-default tenants are skipped.


666-708: Potential panic after filter migration.
as_object().unwrap() can panic if migration yields a non-object.


997-1026: Manifest path construction still uses empty prefix helper.
Please keep put_manifest/delete_manifest/get_manifest_path aligned.


1274-1344: Tenant prefix is not filtered out of stream names.
With tenant prefixes present, list_streams can return tenant IDs as streams.

src/metastore/metastore_traits.rs (1)

320-324: get_node_metadata tenant_id parameter looks inconsistent with global scope.
Node metadata is stored globally in practice.

src/alerts/mod.rs (1)

1331-1337: Updates can still be dropped if tenant bucket is missing.
update_state and update_notification_state only insert when a bucket exists.

Also applies to: 1377-1379
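
The standard-library entry API is one way to make sure an update is never dropped for lack of a tenant bucket. In this sketch the nested map shape and the string-typed state are assumptions chosen for brevity, not the types used in src/alerts/mod.rs.

```rust
use std::collections::HashMap;

/// Assumed shape: tenant -> alert id -> state.
type AlertStates = HashMap<String, HashMap<String, String>>;

/// Creates the tenant bucket on demand, then records the state.
fn update_state(states: &mut AlertStates, tenant: &str, alert_id: &str, state: &str) {
    states
        .entry(tenant.to_owned())
        .or_default()
        .insert(alert_id.to_owned(), state.to_owned());
}
```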

🧹 Nitpick comments (10)
src/migration/mod.rs (2)

161-165: Consider idiomatic simplification.

The closure .map(|t| Some(t)) can be simplified to .map(Some).

Suggested simplification
     let tenants = if let Some(tenants) = PARSEABLE.list_tenants() {
-        tenants.into_iter().map(|t| Some(t)).collect()
+        tenants.into_iter().map(Some).collect()
     } else {
         vec![None]
     };

418-428: Consider more idiomatic Option handling.

The pattern tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v) used twice can be simplified to the more idiomatic tenant_id.as_deref().unwrap_or(DEFAULT_TENANT).

Suggested simplification
     fetch_stats_from_storage(
         stream,
         stats,
-        tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v),
+        tenant_id.as_deref().unwrap_or(DEFAULT_TENANT),
     )
     .await;
     load_daily_metrics(
         &snapshot.manifest_list,
         stream,
-        tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v),
+        tenant_id.as_deref().unwrap_or(DEFAULT_TENANT),
     );
src/rbac/map.rs (1)

137-145: Remove commented-out code before merge.

Multiple blocks of commented-out code remain (lines 137-145, 170-187). These appear to be old implementations that should be removed for clarity.

Also applies to: 170-187

src/hottier.rs (2)

100-104: Consider extracting tenant iteration pattern to a helper.

The pattern for getting tenants with a None fallback is duplicated multiple times (lines 100-104, 293-297, 757-761, 781-785):

let tenants = if let Some(tenants) = PARSEABLE.list_tenants() {
    tenants.into_iter().map(|v| Some(v)).collect()
} else {
    vec![None]
};

Consider extracting to a helper like PARSEABLE.list_tenants_or_default() that returns Vec<Option<String>>.
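
Such a helper could look roughly like the free function below, which takes the result of `list_tenants()` as an argument so the sketch stays self-contained. The name `tenants_or_default` is hypothetical.

```rust
/// Converts an optional tenant list into the iteration set used by the callers:
/// every tenant wrapped in `Some`, or a single `None` when multi-tenancy is off.
fn tenants_or_default(tenants: Option<Vec<String>>) -> Vec<Option<String>> {
    match tenants {
        Some(list) => list.into_iter().map(Some).collect(),
        None => vec![None],
    }
}
```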


237-259: LGTM: Tenant-aware hot tier path construction.

hot_tier_file_path correctly creates tenant-scoped paths when tenant_id is provided, with a fallback to the non-tenant path for the default case.

Minor: Remove the commented-out code at lines 242-245.

src/storage/object_storage.rs (1)

758-760: Avoid warn-level logging of full schemas.

Logging the entire schema at warn can be noisy and may expose sensitive field names; debug/trace is safer.

src/tenants/mod.rs (1)

111-158: Consider removing or tracking commented-out code.

This large block of commented-out code appears to be scaffolding for future serialization/deserialization of tenant metadata. Consider either removing it (it's in version control if needed later) or converting it to a TODO issue to track the implementation.

src/catalog/mod.rs (1)

467-467: Lower the log level for normal manifest path logging.
This looks like a normal code path; warn! will be noisy in production.

🔧 Suggested change
-    tracing::warn!("manifest path_url= {path_url}");
+    tracing::debug!("manifest path_url= {path_url}");
src/handlers/http/logstream.rs (1)

444-446: Clarify or resolve the TODO in hot-tier update.
The // TODO tenants note suggests incomplete work, but the method already threads tenant_id. If anything is still missing, please add a tracking issue or expand the comment; otherwise consider removing it.

src/prism/logstream/mod.rs (1)

321-323: hottier is computed but unused.
Either use it in the response or prefix with _ to avoid warnings.

Comment on lines 112 to 126
 pub async fn query(req: HttpRequest, query_request: Query) -> Result<HttpResponse, QueryError> {
-    let session_state = QUERY_SESSION.state();
+    let mut session_state = QUERY_SESSION.get_ctx().state();
     let time_range =
         TimeRange::parse_human_time(&query_request.start_time, &query_request.end_time)?;
     let tables = resolve_stream_names(&query_request.query)?;
-    //check or load streams in memory
-    create_streams_for_distributed(tables.clone()).await?;
+    // check or load streams in memory
+    create_streams_for_distributed(tables.clone(), &get_tenant_id_from_request(&req)).await?;
+
+    let tenant_id = get_tenant_id_from_request(&req);
+    session_state
+        .config_mut()
+        .options_mut()
+        .catalog
+        .default_schema = tenant_id.as_ref().map_or("public".into(), |v| v.to_owned());


⚠️ Potential issue | 🟠 Major

Derive tenant from the authenticated session, not the tenant header.

Using get_tenant_id_from_request makes tenant scoping spoofable. Use get_user_and_tenant_from_request (and map its error to QueryError::Unauthorized) and thread that tenant through. Consider mirroring this in get_counts as well.

🔧 Proposed fix (query path)
-use crate::utils::{get_tenant_id_from_request, user_auth_for_datasets};
+use crate::utils::{get_user_and_tenant_from_request, get_tenant_id_from_request, user_auth_for_datasets};
@@
-    create_streams_for_distributed(tables.clone(), &get_tenant_id_from_request(&req)).await?;
-
-    let tenant_id = get_tenant_id_from_request(&req);
+    let (_user_id, tenant_id) =
+        get_user_and_tenant_from_request(&req).map_err(|_| QueryError::Unauthorized)?;
+    create_streams_for_distributed(tables.clone(), &tenant_id).await?;
🔧 Apply the same pattern in get_counts (recommended)
-    let tenant_id = get_tenant_id_from_request(&req);
+    let (_user_id, tenant_id) =
+        get_user_and_tenant_from_request(&req).map_err(|_| QueryError::Unauthorized)?;
🤖 Prompt for AI Agents
In `@src/handlers/http/query.rs` around lines 112 - 126, Replace the spoofable
tenant extraction in query by calling get_user_and_tenant_from_request(req)
instead of get_tenant_id_from_request(&req), map any error to
QueryError::Unauthorized, extract the tenant from that result and pass it to
create_streams_for_distributed(tables.clone(), &tenant) and to set
session_state.config_mut().options_mut().catalog.default_schema (use
tenant.as_ref().map_or("public".into(), |v| v.to_owned())); apply the same
change in get_counts so both paths derive tenant from the authenticated session
rather than the tenant header.
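
The difference between header-derived and session-derived tenancy can be shown with a toy model. Everything here is hypothetical: `Request`, the session map, and `tenant_from_session` only mimic the idea behind get_user_and_tenant_from_request and do not reflect its real signature.

```rust
use std::collections::HashMap;

/// Toy request: the tenant header is client-controlled and therefore untrusted.
struct Request {
    session_key: Option<String>,
    tenant_header: Option<String>,
}

/// Resolves the tenant from the server-side session store (session key -> tenant).
/// The tenant header is deliberately never consulted.
fn tenant_from_session(
    req: &Request,
    sessions: &HashMap<String, Option<String>>,
) -> Result<Option<String>, &'static str> {
    let key = req.session_key.as_ref().ok_or("missing session key")?;
    sessions.get(key).cloned().ok_or("unknown session")
}
```

A request that carries a forged tenant header still resolves to the tenant bound to its session, and a request without a valid session is rejected rather than falling back to the header.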

Comment on lines 341 to +353
 impl S3 {
-    async fn _get_object(&self, path: &RelativePath) -> Result<Bytes, ObjectStorageError> {
+    async fn _get_object(
+        &self,
+        path: &RelativePath,
+        tenant_id: &Option<String>,
+    ) -> Result<Bytes, ObjectStorageError> {
+        let tenant_str = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
         let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string());
+        increment_object_store_calls_by_date(
+            "GET",
+            &Utc::now().date_naive().to_string(),
+            tenant_str,
+        );

⚠️ Potential issue | 🟠 Major

Only emit object-store metrics after successful calls.

increment_object_store_calls_by_date is executed before success is known, so failed operations are counted. This breaks billing/usage accuracy. Please move these increments into the Ok branches (and apply the same pattern to _put_object, _delete_prefix list/delete, _list_dates, _upload_file, and the small-file branch in _upload_multipart, plus head/delete_object/check/list helpers). Based on learnings, metrics should only be captured after successful operations.

🐛 Example fix (apply across similar methods)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date(
-            "GET",
-            &Utc::now().date_naive().to_string(),
-            tenant_str,
-        );
+        let resp = self.client.get(&to_object_store_path(path)).await;
 
         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant_str,
+                );
                 let body = resp.bytes().await?;
🤖 Prompt for AI Agents
In `@src/storage/s3.rs` around lines 341 - 353, The metric increment is currently
called before knowing operation success in S3::_get_object; move the call to
increment_object_store_calls_by_date into the success (Ok) branch after the
client.get completes successfully, and apply the same change pattern to
S3::_put_object, S3::_delete_prefix (list/delete), S3::_list_dates,
S3::_upload_file, the small-file branch in S3::_upload_multipart, and the
helpers head/delete_object/check/list so that
increment_object_store_calls_by_date("GET"/"PUT"/"DELETE"/etc., ...) is only
invoked after confirming the operation returned Ok, not before.
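
The success-only counting pattern can be sketched without any storage backend. `fetch` below is a hypothetical stand-in for a client call, and a plain `AtomicU64` replaces the real metrics function; neither is part of the Parseable code.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for an object-store client call: fails on an empty path.
fn fetch(path: &str) -> Result<Vec<u8>, String> {
    if path.is_empty() {
        Err("empty path".to_string())
    } else {
        Ok(path.as_bytes().to_vec())
    }
}

/// Counts a GET only once the call is known to have succeeded.
fn get_object(path: &str, get_calls: &AtomicU64) -> Result<Vec<u8>, String> {
    match fetch(path) {
        Ok(body) => {
            // The increment lives in the Ok branch, so failures are never billed.
            get_calls.fetch_add(1, Ordering::Relaxed);
            Ok(body)
        }
        Err(err) => Err(err),
    }
}
```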


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (15)
src/handlers/http/users/dashboards.rs (1)

36-76: Tenant isolation bypass in list_dashboards.

get_tenant_id_from_request reads the tenant from an untrusted HTTP header, allowing an attacker to forge the header and list dashboards from other tenants. This is the same vulnerability pattern as get_dashboard and list_tags.

Use session-validated tenant extraction consistent with the write operations (create_dashboard, update_dashboard, etc.):

🔒 Suggested fix
 pub async fn list_dashboards(req: HttpRequest) -> Result<impl Responder, DashboardError> {
-    let tenant_id = get_tenant_id_from_request(&req);
+    let (_user_id, tenant_id) = get_user_and_tenant_from_request(&req)?;
     let query_map = web::Query::<HashMap<String, String>>::from_query(req.query_string())
src/storage/s3.rs (1)

427-438: DELETE metric recorded before confirming success.

Inside the for_each_concurrent closure, increment_object_store_calls_by_date("DELETE", ...) at lines 428-432 is called before checking whether delete_resp.is_err() at line 433. Failed deletes are still counted.

🐛 Proposed fix
                     files_deleted.fetch_add(1, Ordering::Relaxed);
                     let delete_resp = self.client.delete(&obj.location).await;
-                    increment_object_store_calls_by_date(
-                        "DELETE",
-                        &Utc::now().date_naive().to_string(),
-                        &tenant_str_clone,
-                    );
                     if delete_resp.is_err() {
                         error!(
                             "Failed to delete object during delete stream: {:?}",
                             delete_resp
                         );
+                    } else {
+                        increment_object_store_calls_by_date(
+                            "DELETE",
+                            &Utc::now().date_naive().to_string(),
+                            &tenant_str_clone,
+                        );
                     }
src/storage/gcs.rs (2)

180-205: Record object-store metrics only after successful operations.

Several ops increment increment_object_store_calls_by_date(...) before checking the result, so failed GET/PUT/HEAD/DELETE calls are counted. Please move those increments into the success branches (or after ?) for _get_object, _put_object, _upload_file, the small-file branch in _upload_multipart, get_buffered_reader, head, delete_object, check, try_delete_node_meta, and the per-object delete in _delete_prefix.

🔧 Example fix for `_get_object`
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
-        match resp {
-            Ok(resp) => {
+        match resp {
+            Ok(resp) => {
+                increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
                 let body: Bytes = resp.bytes().await?;

Based on learnings, metrics should only be captured on successful operations.
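
A minimal sketch of the success-only pattern, independent of the Parseable code: `OBJECT_STORE_CALLS` and `counted` are hypothetical stand-ins for the real counter and for `increment_object_store_calls_by_date`. The counter moves only when the wrapped operation returns `Ok`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for the per-date, per-tenant call counter.
static OBJECT_STORE_CALLS: AtomicU64 = AtomicU64::new(0);

/// Passes `result` through unchanged, incrementing the counter only on `Ok`.
fn counted<T, E>(result: Result<T, E>) -> Result<T, E> {
    if result.is_ok() {
        OBJECT_STORE_CALLS.fetch_add(1, Ordering::Relaxed);
    }
    result
}
```

Wrapping the awaited response in a helper like this would keep the increment next to the success check instead of repeating the `match` in every method.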

Also applies to: 209-230, 232-266, 323-345, 347-389, 443-465, 481-499, 621-699


285-312: Only emit LIST metrics when list_with_delimiter succeeds.

In _list_dates, list_hours, list_minutes, list_dirs, and list_dirs_relative, the LIST metric is emitted even if the list call fails. Move the increment into the Ok(resp) branch.

🔧 Example fix for `_list_dates`
-        increment_object_store_calls_by_date("LIST", &Utc::now().date_naive().to_string(), tenant);
-        let resp = match resp {
-            Ok(resp) => resp,
+        let resp = match resp {
+            Ok(resp) => {
+                increment_object_store_calls_by_date("LIST", &Utc::now().date_naive().to_string(), tenant);
+                resp
+            }
             Err(err) => {
                 return Err(err.into());
             }
         };

Based on learnings, metrics should only be captured on successful operations.

Also applies to: 767-783, 804-821, 869-885, 902-918

src/event/mod.rs (1)

142-163: Stream/tenant must exist before commit_schema is called.

Currently, commit_schema is called on the first event (line 77) before get_or_create_stream (line 80). Since Streams starts as an empty HashMap with no pre-initialized DEFAULT_TENANT entry, commit_schema will always fail with TenantNotFound on the first event due to the nested get_mut() lookups. Either:

  1. Move commit_schema to execute after get_or_create_stream creates the tenant/stream entry, or
  2. Modify commit_schema to create the tenant/stream entry if it doesn't exist, or
  3. Add a check in Event::process to skip schema commit on first event if the stream doesn't yet exist.
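
The failure mode and option 2 can be sketched with plain maps. The `Streams` alias and both functions below are hypothetical simplifications (schemas reduced to a list of field names), not the actual Parseable types.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the nested tenant -> stream -> schema map.
type Streams = HashMap<String, HashMap<String, Vec<String>>>;

/// Mirrors the failing shape: nested `get_mut` lookups that never create entries.
fn commit_schema_strict(
    streams: &mut Streams,
    tenant: &str,
    stream: &str,
    field: &str,
) -> Result<(), String> {
    let tenant_streams = streams
        .get_mut(tenant)
        .ok_or_else(|| "TenantNotFound".to_string())?;
    let schema = tenant_streams
        .get_mut(stream)
        .ok_or_else(|| "StreamNotFound".to_string())?;
    schema.push(field.to_string());
    Ok(())
}

/// Option 2: create the tenant and stream entries on demand before committing.
fn commit_schema_or_create(streams: &mut Streams, tenant: &str, stream: &str, field: &str) {
    streams
        .entry(tenant.to_string())
        .or_default()
        .entry(stream.to_string())
        .or_default()
        .push(field.to_string());
}
```

On an empty map the strict variant fails for the first event, while the `entry` variant succeeds and makes later strict commits succeed as well.
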
src/parseable/streams.rs (1)

1216-1262: Update tests for new tenant-aware signatures and metric labels.
Several tests still call Stream::new and convert_disk_files_to_parquet without the new tenant_id parameter, and assert metrics using the old label sets. These tests will fail to compile or run until they are updated.

🔧 Example adjustments (apply similarly to other tests)
+        let tenant_id: Option<String> = None;
         let staging = Stream::new(
             options.clone(),
             stream_name,
             LogStreamMetadata::default(),
             None,
+            &tenant_id,
         );
@@
-        .convert_disk_files_to_parquet(None, None, false, false)?;
+        .convert_disk_files_to_parquet(None, None, false, false, &tenant_id)?;
@@
-        let staging_files = metrics::STAGING_FILES.with_label_values(&[&stream]).get();
+        let staging_files =
+            metrics::STAGING_FILES.with_label_values(&[&stream, DEFAULT_TENANT]).get();
@@
-        let storage_size_arrows = metrics::STORAGE_SIZE
-            .with_label_values(&["staging", &stream, "arrows"])
+        let storage_size_arrows = metrics::STORAGE_SIZE
+            .with_label_values(&["staging", &stream, "arrows", DEFAULT_TENANT])
             .get();

Also applies to: 1376-1394

src/hottier.rs (2)

192-205: Oldest-date lookup ignores tenant context.

get_oldest_date_time_entry is still tenant-agnostic, so oldest_date_time_entry can be computed from another tenant’s data. Thread tenant_id through the helper (and its path resolution) to keep per-tenant isolation.

🔧 Suggested direction
-        stream_hot_tier.oldest_date_time_entry = self.get_oldest_date_time_entry(stream).await?;
+        stream_hot_tier.oldest_date_time_entry =
+            self.get_oldest_date_time_entry(stream, tenant_id).await?;

208-218: Delete path ignores tenant_id.

The delete path should include the tenant subdirectory; otherwise deleting a tenant’s hot tier can remove the default tenant’s directory (or leave the tenant’s data intact).

🐛 Proposed fix
-        let path = self.hot_tier_path.join(stream);
+        let path = if let Some(tenant) = tenant_id.as_ref() {
+            self.hot_tier_path.join(tenant).join(stream)
+        } else {
+            self.hot_tier_path.join(stream)
+        };
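
Since the same path rule is needed in several hot-tier code paths, it could live in one helper. The sketch below is illustrative only; `hot_tier_stream_path` is a hypothetical name and not part of the PR.

```rust
use std::path::{Path, PathBuf};

/// Resolves the hot-tier directory for a stream, nesting it under the tenant
/// directory when a tenant is present. Hypothetical helper.
fn hot_tier_stream_path(root: &Path, tenant_id: &Option<String>, stream: &str) -> PathBuf {
    match tenant_id.as_ref() {
        Some(tenant) => root.join(tenant).join(stream),
        None => root.join(stream),
    }
}
```

Routing delete, date discovery, and oldest-entry lookup through one function would keep the tenant segment consistent across all of them.
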
src/handlers/http/logstream.rs (1)

416-457: Resolve the TODO for tenant-specific hot tier updates.

The inline // TODO tenants suggests incomplete behavior. Either implement the missing tenant-specific handling or remove the TODO with a rationale to avoid ambiguity.

src/handlers/http/demo_data.rs (1)

47-66: Demo data ingestion ignores tenant context in local mode.

tenant_id is extracted from the request but not passed to execute_demo_script when Mode::Ingest | Mode::All, creating an inconsistency. The Mode::Query | Mode::Prism path correctly passes tenant_id to get_demo_data_from_ingestor, but the local script execution path ignores it entirely, allowing demo data to be ingested into the default tenant regardless of which tenant made the request.

Pass tenant_id to execute_demo_script and set it as an environment variable (P_TENANT) so the bash script can include it in request headers. This requires updating both the function signature and the bash script to handle the tenant context.
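
One way this could look, as a sketch under assumptions: `demo_script_command` and `script_path` are hypothetical, and the command is only built, not run. `P_TENANT` is the variable name suggested above.

```rust
use std::process::Command;

/// Builds (but does not run) the demo-script command, exporting the tenant
/// as `P_TENANT` when one is present. Hypothetical helper.
fn demo_script_command(script_path: &str, tenant_id: &Option<String>) -> Command {
    let mut cmd = Command::new("bash");
    cmd.arg(script_path);
    if let Some(tenant) = tenant_id {
        // Only set for tenant-scoped requests; the default tenant exports nothing.
        cmd.env("P_TENANT", tenant);
    }
    cmd
}
```

The script would still need to read `P_TENANT` and add it to its request headers, which this sketch does not cover.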

src/alerts/alerts_utils.rs (1)

130-154: Pass auth_token to send_query_request with proper type conversion.

The auth_token parameter is received by execute_remote_query but not used—Line 149 hardcodes None instead of passing the value. This prevents authentication when executing remote queries in Prism mode.

Additionally, there's a type mismatch: execute_remote_query receives auth_token: Option<String>, while send_query_request expects Option<HeaderMap>. The token needs to be converted before passing.

Fix
-    let (result_value, _) = send_query_request(None, &query_request, tenant_id)
+    let auth_header = auth_token.map(|token| {
+        let mut headers = reqwest::header::HeaderMap::new();
+        if let Ok(val) = reqwest::header::HeaderValue::from_str(&token) {
+            headers.insert(reqwest::header::AUTHORIZATION, val);
+        }
+        headers
+    });
+    let (result_value, _) = send_query_request(auth_header, &query_request, tenant_id)

Note: This fix inserts the token verbatim as the Authorization header value, so it assumes the token is already formatted (for example, with its scheme prefix). Consider whether the token creation in alert_types.rs (which currently always returns None) should be completed first.

src/handlers/http/modal/query/querier_logstream.rs (1)

101-111: Pass tenant context in stream deletion request to ingestors.

The DELETE request to /logstream/{stream_name}/sync doesn't include the tenant header. The ingestor's delete handler extracts tenant from the tenant header (line 77 of src/handlers/http/modal/ingest/ingestor_logstream.rs), but send_stream_delete_request only sets CONTENT_TYPE and AUTHORIZATION headers. The tenant_id is available at the call site but not passed to the function. This breaks multi-tenant isolation—the ingestor won't know which tenant's stream to delete.

Add the tenant_id parameter to send_stream_delete_request and include it as a tenant header in the DELETE request.

src/storage/store_metadata.rs (1)

307-329: Add directory creation for tenant-specific staging paths.

When tenant_id is provided, the staging metadata is written to staging_dir/tenant_id/.parseable.json. The tenant subdirectory is never created before the write attempt, causing an io::Error::NotFound if it doesn't already exist. The root staging directory is created during initialization, but tenant subdirectories are not.

🐛 Proposed fix
 pub fn put_staging_metadata(meta: &StorageMetadata, tenant_id: &Option<String>) -> io::Result<()> {
     let mut staging_metadata = meta.clone();
     staging_metadata.server_mode = PARSEABLE.options.mode;
     staging_metadata.staging = PARSEABLE.options.staging_dir().to_path_buf();
     let path = if let Some(tenant_id) = tenant_id.as_ref() {
-        PARSEABLE
+        let tenant_path = PARSEABLE
             .options
             .staging_dir()
-            .join(tenant_id)
-            .join(PARSEABLE_METADATA_FILE_NAME)
+            .join(tenant_id);
+        fs::create_dir_all(&tenant_path)?;
+        tenant_path.join(PARSEABLE_METADATA_FILE_NAME)
     } else {
         PARSEABLE
             .options
             .staging_dir()
             .join(PARSEABLE_METADATA_FILE_NAME)
     };
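
The same idea in a self-contained form, using only the standard library. `write_staging_file` and its parameters are hypothetical stand-ins for the real metadata writer.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes `contents` to `<staging_dir>/[<tenant>/]<file_name>`, creating the
/// tenant subdirectory first. Hypothetical helper.
fn write_staging_file(
    staging_dir: &Path,
    tenant_id: &Option<String>,
    file_name: &str,
    contents: &[u8],
) -> io::Result<PathBuf> {
    let dir = match tenant_id.as_ref() {
        Some(tenant) => staging_dir.join(tenant),
        None => staging_dir.to_path_buf(),
    };
    // Succeeds without error if the directory already exists.
    fs::create_dir_all(&dir)?;
    let path = dir.join(file_name);
    fs::write(&path, contents)?;
    Ok(path)
}
```

Because `create_dir_all` is idempotent, calling it on every write is safe and avoids a separate tenant-initialization step.
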
src/storage/azure_blob.rs (2)

215-239: Count object‑store calls only on success.

increment_object_store_calls_by_date runs even when the GET fails, which inflates metrics. Move it into the success branch (and apply the same pattern across PUT/DELETE/LIST/HEAD).

📈 Suggested fix
         let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
                 ...
             }
             Err(err) => Err(err.into()),
         }

Based on learnings, metrics should be recorded only for successful operations.


321-357: Tenant prefix missing from list_dates, list_hours, and list_minutes path construction.

These methods receive tenant_id as a parameter but never use it when constructing the object store prefix. They only use stream_name to build paths like stream/ and stream/date/hour/, ignoring tenant isolation. This is inconsistent with metrics recording, which properly uses tenant_id. For proper tenant isolation, include the tenant prefix in both the list prefix and the strip_prefix logic.

🧭 Suggested fix (apply similarly to list_hours/list_minutes)
-        let resp: Result<object_store::ListResult, object_store::Error> = self
-            .client
-            .list_with_delimiter(Some(&(stream.into())))
-            .await;
+        let prefix = if let Some(t) = tenant_id.as_ref() {
+            format!("{}/{}/", t, stream)
+        } else {
+            format!("{}/", stream)
+        };
+        let resp: Result<object_store::ListResult, object_store::Error> = self
+            .client
+            .list_with_delimiter(Some(&prefix.clone().into()))
+            .await;
 ...
         let dates: Vec<_> = common_prefixes
             .iter()
-            .filter_map(|path| path.as_ref().strip_prefix(&format!("{stream}/")))
+            .filter_map(|path| path.as_ref().strip_prefix(&prefix))
             .map(String::from)
             .collect();

Also applies to: lines 785–839
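
To illustrate why the list prefix and the strip prefix must match, here is a sketch on plain strings. `list_prefix` and `dates_under` are hypothetical helpers, and the sample paths used to exercise them are made up.

```rust
/// Builds the listing prefix, adding the tenant segment when present.
fn list_prefix(tenant_id: &Option<String>, stream: &str) -> String {
    match tenant_id {
        Some(tenant) => format!("{tenant}/{stream}/"),
        None => format!("{stream}/"),
    }
}

/// Strips `prefix` from each listed path, dropping paths outside the prefix.
fn dates_under<'a>(common_prefixes: &[&'a str], prefix: &str) -> Vec<&'a str> {
    common_prefixes
        .iter()
        .filter_map(|path| path.strip_prefix(prefix))
        .collect()
}
```

If the listing used the tenant prefix but `strip_prefix` still used only the stream name, every entry would be filtered out, so both sides need the same value.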

🤖 Fix all issues with AI agents
In `@src/alerts/mod.rs`:
- Around line 1045-1050: When constructing the tenant context in
src/alerts/mod.rs (inside the loop over raw_objects), treat the normalized
DEFAULT_TENANT value as equivalent to None instead of Some(DEFAULT_TENANT);
update the tenant assignment used by functions like load() to check for
tenant_id.is_empty() || tenant_id == DEFAULT_TENANT and return &None in that
case, otherwise return &Some(tenant_id.clone()). Apply the same change at the
other occurrence around the 1097–1100 block so migrated alerts use the default
(None) storage path consistently.

In `@src/alerts/target.rs`:
- Around line 75-85: The Target object from the request must not be allowed to
override the authenticated tenant: before calling TARGETS.update(target) in the
HTTP handler (the handler that deserializes into Target) set the target.tenant
to the authenticated tenant (e.g., target.tenant = Some(tenant_id.to_owned()))
so the update() path and PARSEABLE.metastore.put_target use the server-provided
tenant; alternatively, remove tenant from request deserialization on the Target
struct (#[serde(skip)] / accept tenant only from path/request) so Target::tenant
cannot be supplied by clients.

In `@src/parseable/mod.rs`:
- Around line 1130-1157: The delete_tenant function currently only clears RBAC
and in-memory metadata (mut_users(), Users.delete_user, mut_roles(),
TENANT_METADATA.delete_tenant) but does not remove the tenant’s object-store
prefix; convert delete_tenant to an async fn returning Result and, before
calling TENANT_METADATA.delete_tenant, call
metastore.delete_tenant(tenant_id).await (or the appropriate metastore client
method) and propagate or log errors so storage deletion completes (or fails)
before clearing in-memory state; ensure callers are updated to await the new
async delete_tenant and handle the Result.

In `@src/rbac/map.rs`:
- Around line 437-443: The call in mutate_request_with_tenant that constructs a
HeaderValue uses from_bytes(...).unwrap(), which can panic; replace it with
HeaderValue::from_str(tenant) and handle the Result instead of unwrapping. In
the mutate_request_with_tenant method (referencing SessionKey, ServiceRequest
and self.active_sessions), attempt HeaderValue::from_str(tenant) and on Err
simply skip inserting the header (or log the error using the existing logger
pattern in this module) so the function no longer panics when tenant contains
invalid header characters.

In `@src/storage/s3.rs`:
- Around line 948-952: The HEAD metrics call uses an empty string for the tenant
label which will fragment metrics; update the call to
increment_object_store_calls_by_date in src/storage/s3.rs to pass the same
tenant_str (or DEFAULT_TENANT) used earlier instead of "", e.g., replace the
third argument with tenant_str so the "HEAD" metric uses the consistent tenant
label; ensure tenant_str is in scope where
increment_object_store_calls_by_date("HEAD", ..., tenant_str) is invoked.

In `@src/utils/mod.rs`:
- Around line 63-77: The function mutate_request_with_tenant is calling
basic.password().as_deref().unwrap(), which can panic if the BasicAuth password
is omitted; change this to safely default to an empty string (like using
as_deref().unwrap_or("")) before passing to
Users.mutate_request_with_basic_user. Update the BasicAuth handling in
mutate_request_with_tenant (the BasicAuth::extract result branch) to supply a
non-panicking password string and mirror the safe pattern used in
extract_session_key_from_req.
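
The tenant normalization described for `src/alerts/mod.rs` above can be sketched as follows. The value of `DEFAULT_TENANT` is an assumption made for this example (the real constant's value is not shown in this review), and `normalize_tenant` is a hypothetical helper that returns an owned `Option` rather than the reference form the prompt describes.

```rust
/// Assumed placeholder value; the real constant is defined elsewhere in the codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Maps an empty or default tenant id to `None`, anything else to `Some`.
fn normalize_tenant(tenant_id: &str) -> Option<String> {
    if tenant_id.is_empty() || tenant_id == DEFAULT_TENANT {
        None
    } else {
        Some(tenant_id.to_owned())
    }
}
```
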

Comment on lines +1130 to +1157
pub fn delete_tenant(&self, tenant_id: &str) -> Result<(), anyhow::Error> {
// let mut metadata = get_metadata(&Some(tenant_id.to_owned())).await?;
// delete users and sessions
let users = mut_users().remove(tenant_id);
if let Some(users) = users {
tracing::warn!("found tenant users, deleting");
for (userid, user) in users {
// metadata
// .users
// .retain(|u| u.tenant.eq(&Some(tenant_id.to_owned())));

Users.delete_user(&userid, &user.tenant);
}
}

// delete roles
mut_roles().remove(tenant_id);
// if let Some(roles) = mut_roles().remove(tenant_id) {
// for (role, _) in roles {
// // metadata.roles.retain(|r, _| !role.eq(r));
// }
// }

// delete resources

// delete from in-mem
TENANT_METADATA.delete_tenant(&tenant_id);
Ok(())

⚠️ Potential issue | 🟠 Major

Tenant storage isn’t deleted.

delete_tenant clears RBAC and in‑memory metadata but never deletes the tenant’s object‑store prefix. That leaves data behind after “delete,” which is a privacy/compliance risk. Consider making this async and calling metastore.delete_tenant before clearing TENANT_METADATA.

🧹 Possible direction (async cleanup)
-pub fn delete_tenant(&self, tenant_id: &str) -> Result<(), anyhow::Error> {
+pub async fn delete_tenant(&self, tenant_id: &str) -> Result<(), anyhow::Error> {
     // delete users and sessions
     let users = mut_users().remove(tenant_id);
     ...
-    TENANT_METADATA.delete_tenant(&tenant_id);
+    self.metastore.delete_tenant(tenant_id).await?;
+    TENANT_METADATA.delete_tenant(tenant_id);
     Ok(())
 }
🤖 Prompt for AI Agents
In `@src/parseable/mod.rs` around lines 1130 - 1157, The delete_tenant function
currently only clears RBAC and in-memory metadata (mut_users(),
Users.delete_user, mut_roles(), TENANT_METADATA.delete_tenant) but does not
remove the tenant’s object-store prefix; convert delete_tenant to an async fn
returning Result and, before calling TENANT_METADATA.delete_tenant, call
metastore.delete_tenant(tenant_id).await (or the appropriate metastore client
method) and propagate or log errors so storage deletion completes (or fails)
before clearing in-memory state; ensure callers are updated to await the new
async delete_tenant and handle the Result.
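
The ordering matters more than the async conversion itself. The sketch below is synchronous and uses hypothetical in-memory stand-ins (`Server`, `stored_prefixes`, `storage_online`) purely to show that in-memory state is cleared only after storage deletion succeeds.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory stand-ins for the object store and tenant registry.
struct Server {
    stored_prefixes: HashSet<String>,
    tenant_metadata: HashMap<String, String>,
    storage_online: bool,
}

impl Server {
    /// Deletes stored data first; in-memory state is cleared only on success.
    fn delete_tenant(&mut self, tenant_id: &str) -> Result<(), String> {
        self.delete_storage_prefix(tenant_id)?;
        self.tenant_metadata.remove(tenant_id);
        Ok(())
    }

    /// Simulated storage deletion that fails while storage is offline.
    fn delete_storage_prefix(&mut self, tenant_id: &str) -> Result<(), String> {
        if !self.storage_online {
            return Err(format!("storage unavailable, kept tenant {tenant_id}"));
        }
        self.stored_prefixes.remove(tenant_id);
        Ok(())
    }
}
```

With this order, a failed storage deletion leaves the tenant visible in memory, so the operation can be retried instead of orphaning data.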

Comment on lines +437 to +443
pub fn mutate_request_with_tenant(&self, key: &SessionKey, req: &mut ServiceRequest) {
if let Some((_, tenant, _)) = self.active_sessions.get(key) {
req.headers_mut().insert(
HeaderName::from_static("tenant"),
HeaderValue::from_bytes(tenant.as_bytes()).unwrap(),
);
}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's check the actual code in the file
cat -n src/rbac/map.rs | sed -n '430,450p'

Repository: parseablehq/parseable

Length of output: 1065


🏁 Script executed:

# Let's also see more context around this function to understand where tenant comes from
cat -n src/rbac/map.rs | sed -n '420,460p'

Repository: parseablehq/parseable

Length of output: 1885


🏁 Script executed:

# Check what type tenant is and how it's used
rg -A 20 "pub fn mutate_request_with_tenant" src/rbac/map.rs

Repository: parseablehq/parseable

Length of output: 935


🏁 Script executed:

# Check the imports to see which crate HeaderValue comes from
head -50 src/rbac/map.rs | grep -E "use|import"

Repository: parseablehq/parseable

Length of output: 800


🏁 Script executed:

# Search for where active_sessions is defined and populated to understand tenant's origin
rg "active_sessions" src/rbac/map.rs -B 3 -A 3

Repository: parseablehq/parseable

Length of output: 3899


🏁 Script executed:

# Find DEFAULT_TENANT definition
rg "DEFAULT_TENANT" src/ -B 2 -A 2 | head -30

Repository: parseablehq/parseable

Length of output: 1300


🏁 Script executed:

# Search for where User.tenant is set and what constraints exist
rg "struct User" src/rbac/ -A 10

Repository: parseablehq/parseable

Length of output: 3864


🏁 Script executed:

# Check what types can be stored as tenant
rg "tenant:" src/rbac/ -B 3 -A 1 | head -40

Repository: parseablehq/parseable

Length of output: 1050


🏁 Script executed:

# Let's understand HeaderValue::from_bytes behavior - check if it validates anything beyond UTF-8
web_search
HeaderValue::from_bytes HTTP header validation rules actix-web hyper

Repository: parseablehq/parseable

Length of output: 177


🏁 Script executed:

# Find DEFAULT_TENANT constant definition
rg "const DEFAULT_TENANT\|let DEFAULT_TENANT" src/ -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Also check in parseable.rs which was mentioned in imports
fd "parseable.rs" src/ -type f

Repository: parseablehq/parseable

Length of output: 235


🏁 Script executed:

# Check what constraints might exist on tenant values
rg "tenant.*=" src/rbac/user.rs | head -20

Repository: parseablehq/parseable

Length of output: 47


🌐 Web query:

HTTP HeaderValue::from_bytes validation rules allowed bytes RFC 7230

💡 Result:

Short answer

  • RFC 7230 allows header field octets except CTLs (0x00–0x1F and 0x7F). In practice a field-value may contain SP (0x20), HTAB (0x09), VCHAR (0x21–0x7E) and obs-text (0x80–0xFF); obs-fold (CRLF + SP/HTAB) is deprecated. [1]
  • hyper's HeaderValue::from_bytes currently accepts only bytes 0x20–0xFF (inclusive) but rejects 0x7F (DEL) — i.e. it does not allow CTLs and also does not accept HTAB (0x09). [2]

Sources

  • RFC 7230 (field-value / field-vchar / obs-text / obs-fold). [1]
  • hyper HeaderValue::from_bytes documentation (validation: only 32–255 allowed, excluding 127). [2]

🏁 Script executed:

# Find DEFAULT_TENANT definition
rg "DEFAULT_TENANT\s*=" src/ -A 1

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if there's validation on tenant field during User creation/deserialization
rg "is_valid.*tenant\|validate.*tenant" src/rbac/

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look at User struct serialization/deserialization to see if tenant is validated
cat -n src/rbac/user.rs | sed -n '1,100p'

Repository: parseablehq/parseable

Length of output: 3737


🏁 Script executed:

# Check if DEFAULT_TENANT is imported from parseable module
cat -n src/parseable.rs | grep -A 5 -B 5 "DEFAULT_TENANT"

Repository: parseablehq/parseable

Length of output: 113


🏁 Script executed:

# Find where DEFAULT_TENANT is defined
find src -name "*.rs" -type f | xargs grep -l "const DEFAULT_TENANT\|static DEFAULT_TENANT" | head -5

Repository: parseablehq/parseable

Length of output: 85


🏁 Script executed:

# Try broader search for DEFAULT_TENANT assignment
rg "DEFAULT_TENANT.*=" src/ --max-count 5

Repository: parseablehq/parseable

Length of output: 136


🏁 Script executed:

# Check the file structure
fd -type f -name "*.rs" src/parseable/ | head -10

Repository: parseablehq/parseable

Length of output: 235


🏁 Script executed:

# Search for where parseable module is defined
rg "pub.*const DEFAULT_TENANT\|pub.*static DEFAULT_TENANT" src/

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if there are any validation constraints on tenant values during User creation or parsing
rg "tenant" src/rbac/user.rs -B 2 -A 2 | grep -E "validate|check|constraint|regex|allow"

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look at how tenant is used elsewhere to see if there's implicit validation
rg "HeaderValue::from" src/ -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 4863


🏁 Script executed:

# Check if there are any tests or examples showing what tenant values are expected
rg "tenant.*=" src/ | grep -i "test\|example\|default" | head -10

Repository: parseablehq/parseable

Length of output: 942


Replace unwrap with error handling when creating HeaderValue from tenant.

Line 441 uses HeaderValue::from_bytes(tenant.as_bytes()).unwrap(), which will panic if tenant contains bytes outside the valid HTTP header range (e.g., control characters). While unlikely in typical usage, this should be handled gracefully. Use HeaderValue::from_str() with proper error handling to follow the pattern used elsewhere in the codebase.

🛠️ Proposed fix
-        if let Some((_, tenant, _)) = self.active_sessions.get(key) {
-            req.headers_mut().insert(
-                HeaderName::from_static("tenant"),
-                HeaderValue::from_bytes(tenant.as_bytes()).unwrap(),
-            );
-        }
+        if let Some((_, tenant, _)) = self.active_sessions.get(key) {
+            if let Ok(value) = HeaderValue::from_str(tenant) {
+                req.headers_mut()
+                    .insert(HeaderName::from_static("tenant"), value);
+            }
+        }
🤖 Prompt for AI Agents
In `@src/rbac/map.rs` around lines 437 - 443, The call in
mutate_request_with_tenant that constructs a HeaderValue uses
from_bytes(...).unwrap(), which can panic; replace it with
HeaderValue::from_str(tenant) and handle the Result instead of unwrapping. In
the mutate_request_with_tenant method (referencing SessionKey, ServiceRequest
and self.active_sessions), attempt HeaderValue::from_str(tenant) and on Err
simply skip inserting the header (or log the error using the existing logger
pattern in this module) so the function no longer panics when tenant contains
invalid header characters.
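
For intuition, the skip-on-invalid behaviour can be sketched without the `http` types. `is_valid_header_value` only approximates the documented rule quoted in the web query result above (bytes 32 to 255, excluding 127); it is an assumption for illustration and not the crate's implementation. `tenant_header` is likewise hypothetical.

```rust
/// Approximates the documented rule: bytes 32..=255 are allowed, except 127.
fn is_valid_header_value(value: &str) -> bool {
    value.bytes().all(|b| b >= 32 && b != 127)
}

/// Returns the header pair to insert, or `None` instead of panicking.
fn tenant_header(tenant: &str) -> Option<(&'static str, String)> {
    if is_valid_header_value(tenant) {
        Some(("tenant", tenant.to_string()))
    } else {
        None
    }
}
```

A tenant id containing a newline would yield `None` here, which corresponds to the "skip inserting the header" branch in the proposed fix.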

Comment on lines +63 to +77
pub fn mutate_request_with_tenant(req: &mut ServiceRequest) {
let creds = BasicAuth::extract(req.request()).into_inner();

if let Ok(basic) = &creds {
Users.mutate_request_with_basic_user(
basic.user_id(),
basic.password().as_deref().unwrap(),
req,
);
} else if let Some(cookie) = req.cookie("session") {
if let Ok(ulid) = ulid::Ulid::from_string(cookie.value()) {
let key = SessionKey::SessionId(ulid);
sessions().mutate_request_with_tenant(&key, req);
}
};

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the file
fd -t f "mod.rs" src/utils/ --exec cat -n {} \;

Repository: parseablehq/parseable

Length of output: 36039


🏁 Script executed:

# Find the src/utils/mod.rs file and examine it
find . -path "*src/utils/mod.rs" -type f

Repository: parseablehq/parseable

Length of output: 83


🏁 Script executed:

# Check directory structure
ls -la src/ 2>/dev/null || echo "src not found"

Repository: parseablehq/parseable

Length of output: 2401


🏁 Script executed:

# Search for BasicAuth usage and understand its type
rg "BasicAuth" -A 5 -B 5 --max-count=20

Repository: parseablehq/parseable

Length of output: 11080


🏁 Script executed:

rg "fn mutate_request_with_basic_user" -A 5

Repository: parseablehq/parseable

Length of output: 308


Avoid panic when BasicAuth omits a password.

Line 69 uses unwrap() on an optional password; the HTTP Basic Authentication specification allows passwords to be omitted, so this will panic on such requests. Other authentication paths in the codebase (e.g., extract_session_key_from_req in src/utils/actix.rs) correctly handle this with unwrap_or("").

🛠️ Proposed fix
-        Users.mutate_request_with_basic_user(
-            basic.user_id(),
-            basic.password().as_deref().unwrap(),
-            req,
-        );
+        Users.mutate_request_with_basic_user(
+            basic.user_id(),
+            basic.password().as_deref().unwrap_or(""),
+            req,
+        );
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 pub fn mutate_request_with_tenant(req: &mut ServiceRequest) {
     let creds = BasicAuth::extract(req.request()).into_inner();
     if let Ok(basic) = &creds {
         Users.mutate_request_with_basic_user(
             basic.user_id(),
-            basic.password().as_deref().unwrap(),
+            basic.password().as_deref().unwrap_or(""),
             req,
         );
     } else if let Some(cookie) = req.cookie("session") {
         if let Ok(ulid) = ulid::Ulid::from_string(cookie.value()) {
             let key = SessionKey::SessionId(ulid);
             sessions().mutate_request_with_tenant(&key, req);
         }
     };
🤖 Prompt for AI Agents
In `@src/utils/mod.rs` around lines 63 - 77, The function
mutate_request_with_tenant is calling basic.password().as_deref().unwrap(),
which can panic if the BasicAuth password is omitted; change this to safely
default to an empty string (like using as_deref().unwrap_or("")) before passing
to Users.mutate_request_with_basic_user. Update the BasicAuth handling in
mutate_request_with_tenant (the BasicAuth::extract result branch) to supply a
non-panicking password string and mirror the safe pattern used in
extract_session_key_from_req.
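
A tiny standalone illustration of the difference, using a hypothetical `BasicCreds` struct in place of the real `BasicAuth` extractor:

```rust
/// Hypothetical stand-in for parsed Basic credentials; the password is optional.
struct BasicCreds {
    user_id: String,
    password: Option<String>,
}

/// Returns the password, or an empty string when none was supplied.
fn password_or_empty(creds: &BasicCreds) -> &str {
    creds.password.as_deref().unwrap_or("")
}
```

With `unwrap()` in place of `unwrap_or("")`, the `None` case would panic instead of producing an empty password that simply fails authentication.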


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (17)
src/handlers/http/modal/query/querier_rbac.rs (1)

197-201: Role existence checks must be tenant-scoped.

These checks query roles() without scoping to the request tenant, which can falsely reject valid roles (or accept roles from another tenant) in multi-tenant setups.

🔧 Proposed fix
-    roles_to_add.iter().for_each(|r| {
-        if roles().get(r).is_none() {
-            non_existent_roles.push(r.clone());
-        }
-    });
+    let tenant_roles = roles().get(tenant);
+    roles_to_add.iter().for_each(|r| {
+        if tenant_roles.and_then(|tr| tr.get(r)).is_none() {
+            non_existent_roles.push(r.clone());
+        }
+    });
-    roles_to_remove.iter().for_each(|r| {
-        if roles().get(r).is_none() {
-            non_existent_roles.push(r.clone());
-        }
-    });
+    let tenant_roles = roles().get(tenant);
+    roles_to_remove.iter().for_each(|r| {
+        if tenant_roles.and_then(|tr| tr.get(r)).is_none() {
+            non_existent_roles.push(r.clone());
+        }
+    });

Also applies to: 258-260
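
The tenant-scoped lookup can be sketched on plain maps. The `Roles` alias (tenant, then role name, then privileges) and `non_existent_roles` are hypothetical simplifications of the real RBAC structures.

```rust
use std::collections::HashMap;

/// Hypothetical shape: tenant -> role name -> privileges.
type Roles = HashMap<String, HashMap<String, Vec<String>>>;

/// Returns the requested roles that do not exist for the given tenant.
fn non_existent_roles(roles: &Roles, tenant: &str, requested: &[String]) -> Vec<String> {
    let tenant_roles = roles.get(tenant);
    requested
        .iter()
        .filter(|r| tenant_roles.and_then(|tr| tr.get(*r)).is_none())
        .cloned()
        .collect()
}
```

A role defined only for another tenant is reported as missing, and an unknown tenant reports every requested role as missing.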

src/handlers/http/correlation.rs (2)

67-81: Tenant context discarded during correlation creation.

The post handler extracts both user and tenant from the request but discards tenant_id. If correlations should be tenant-scoped (as suggested by the changes to get and delete), the correlation creation should also include tenant context.

🔧 Suggested approach
 pub async fn post(
     req: HttpRequest,
     Json(mut correlation): Json<CorrelationConfig>,
 ) -> Result<impl Responder, CorrelationError> {
     let session_key = extract_session_key_from_req(&req)
         .map_err(|err| CorrelationError::AnyhowError(anyhow::Error::msg(err.to_string())))?;
-    let user_id = get_user_and_tenant_from_request(&req)
-        .map(|(s, _)| get_hash(&s.to_string()))
+    let (user_id, tenant_id) = get_user_and_tenant_from_request(&req)
+        .map(|(s, t)| (get_hash(&s.to_string()), t))
         .map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;
     correlation.user_id = user_id;
+    // TODO: Pass tenant_id to CORRELATIONS.create if tenant scoping is required

     let correlation = CORRELATIONS.create(correlation, &session_key).await?;

83-99: Same issue: tenant context discarded during correlation modification.

Similar to post, the modify handler discards tenant_id. Ensure tenant context is propagated to CORRELATIONS.update if correlations should be tenant-scoped.

src/handlers/http/modal/ingest/ingestor_logstream.rs (3)

37-70: Security concern: header-based tenant extraction for destructive operation.

retention_cleanup extracts tenant_id from an HTTP header via get_tenant_id_from_request. This header can be spoofed by clients. For operations that modify or delete data, consider using session-based extraction (get_user_and_tenant_from_request) to ensure the tenant is derived from the authenticated session rather than a client-controlled header.

🔒 Suggested approach
+use crate::utils::get_user_and_tenant_from_request;

 pub async fn retention_cleanup(
     req: HttpRequest,
     stream_name: Path<String>,
     Json(date_list): Json<Vec<String>>,
 ) -> Result<impl Responder, StreamError> {
     let stream_name = stream_name.into_inner();
     let storage = PARSEABLE.storage().get_object_store();
-    let tenant_id = get_tenant_id_from_request(&req);
+    let (_, tenant_id) = get_user_and_tenant_from_request(&req)
+        .map_err(|e| StreamError::Custom {
+            msg: e.to_string(),
+            status: StatusCode::UNAUTHORIZED,
+        })?;

72-94: Security concern: header-based tenant extraction for stream deletion.

Similar to retention_cleanup, the delete handler uses header-based tenant extraction. Stream deletion is a destructive operation that should derive tenant context from the authenticated session to prevent unauthorized cross-tenant deletions.

🔒 Suggested approach
 pub async fn delete(
     req: HttpRequest,
     stream_name: Path<String>,
 ) -> Result<impl Responder, StreamError> {
     let stream_name = stream_name.into_inner();
-    let tenant_id = get_tenant_id_from_request(&req);
+    let (_, tenant_id) = get_user_and_tenant_from_request(&req)
+        .map_err(|e| StreamError::Custom {
+            msg: e.to_string(),
+            status: StatusCode::UNAUTHORIZED,
+        })?;

96-108: Consider session-based tenant extraction for stream creation.

While put_stream is less destructive than delete operations, using header-based tenant extraction could allow users to create streams under arbitrary tenants. Consider whether session-based extraction would be more appropriate for your security model.
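
The difference between the two extraction strategies discussed for these handlers can be sketched with the standard library alone. Everything here is a hypothetical stand-in: `Request`, `SessionStore`, `tenant_from_header`, and `tenant_from_session` are illustration names, not Parseable's actual types or the real `get_tenant_id_from_request` / `get_user_and_tenant_from_request` helpers.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified request: headers plus an optional session key.
pub struct Request {
    pub headers: HashMap<String, String>,
    pub session_key: Option<String>,
}

/// Hypothetical server-side session table: session key -> (user, tenant).
pub struct SessionStore {
    sessions: HashMap<String, (String, Option<String>)>,
}

impl SessionStore {
    pub fn new() -> Self {
        Self { sessions: HashMap::new() }
    }

    pub fn insert(&mut self, key: &str, user: &str, tenant: Option<&str>) {
        self.sessions
            .insert(key.to_owned(), (user.to_owned(), tenant.map(str::to_owned)));
    }
}

/// Header-based lookup: trusts whatever value the client sent.
pub fn tenant_from_header(req: &Request) -> Option<String> {
    req.headers.get("tenant").cloned()
}

/// Session-based lookup: the tenant comes from server-side state keyed by
/// the authenticated session, so a forged header has no effect on it.
pub fn tenant_from_session(
    req: &Request,
    store: &SessionStore,
) -> Result<(String, Option<String>), String> {
    let key = req.session_key.as_ref().ok_or("missing session")?;
    store
        .sessions
        .get(key)
        .cloned()
        .ok_or_else(|| "unknown session".to_owned())
}
```

In this sketch a request carrying a forged `tenant` header still resolves to the tenant recorded for its session, and a request without a session is rejected rather than falling back to a default tenant.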

src/metadata.rs (1)

37-64: Rename origin parameter to format for semantic clarity.

The origin parameter receives origin_format values (e.g., "json") and maps to the metric label "format", not "origin". The parameter name should match the label name and semantic meaning to avoid confusion.

src/hottier.rs (1)

208-218: Tenant ID is ignored when deleting hot-tier data.
Line 216 removes hot_tier_path/<stream> regardless of tenant, which can delete or mix hot-tier data when stream names overlap. Now that metadata is tenant-scoped, the hot-tier data paths (delete, date discovery, manifest paths, cleanup, and oldest entry) should be tenant-scoped too.

🔧 Suggested fix (apply pattern across hot-tier data paths)
-        let path = self.hot_tier_path.join(stream);
+        let path = match tenant_id.as_ref() {
+            Some(t) => self.hot_tier_path.join(t).join(stream),
+            None => self.hot_tier_path.join(stream),
+        };
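
Since the comment asks for the same pattern across several hot-tier paths (delete, date discovery, manifest paths, cleanup, oldest entry), one option is to centralize the path construction in a helper. The following is a minimal sketch under that assumption; `hot_tier_stream_dir` is a hypothetical name and is not part of the PR.

```rust
use std::path::{Path, PathBuf};

/// Builds the on-disk directory for a stream's hot-tier data.
/// With a tenant the layout is `<base>/<tenant>/<stream>`; without one it
/// stays `<base>/<stream>`, matching the suggested fix above.
pub fn hot_tier_stream_dir(base: &Path, tenant_id: &Option<String>, stream: &str) -> PathBuf {
    match tenant_id.as_deref() {
        Some(tenant) => base.join(tenant).join(stream),
        None => base.join(stream),
    }
}
```

Routing every hot-tier operation through one function like this would keep two tenants with the same stream name from ever resolving to the same directory.
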
src/storage/gcs.rs (1)

179-205: Only emit object-store metrics after successful operations.
Line 186 increments metrics before success is known; failed calls are counted. Move increment_object_store_calls_by_date (and related metrics) into Ok branches for _get_object, _put_object, _delete_prefix, _list_dates, _upload_file, _upload_multipart, head, delete_object, check, list_dirs, and list_dirs_relative. Based on learnings, metrics should be recorded only on success.

🔧 Example fix
         let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
 
         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
                 let body: Bytes = resp.bytes().await?;
src/metastore/metastores/object_store_metastore.rs (1)

576-602: Inconsistency: get_chats does not iterate over tenants unlike similar methods.

Other collection methods (get_dashboards, get_filters, get_correlations) iterate over all tenants using PARSEABLE.list_tenants(), but get_chats only looks at the root USERS_ROOT_DIR without tenant prefix and passes &None for tenant_id. This will miss chats stored under tenant-specific paths.

🔧 Suggested fix to make tenant-aware
     async fn get_chats(&self) -> Result<DashMap<String, Vec<Bytes>>, MetastoreError> {
         let all_user_chats = DashMap::new();
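+        let base_paths = PARSEABLE.list_tenants().map_or(vec!["".into()], |v| v);
+        for tenant in base_paths {
+            // The empty string stands for the tenant-less root, so map it back to None
+            let tenant_id = &(!tenant.is_empty()).then(|| tenant.clone());
+            let users_dir = RelativePathBuf::from_iter([tenant.as_str(), USERS_ROOT_DIR]);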
-        let users_dir = RelativePathBuf::from(USERS_ROOT_DIR);
-        for user in self.storage.list_dirs_relative(&users_dir, &None).await? {
+            for user in self.storage.list_dirs_relative(&users_dir, tenant_id).await? {
                 if user.starts_with(".") {
                     continue;
                 }
                 // ... rest of logic with tenant_id instead of &None
             }
+        }
         Ok(all_user_chats)
     }
src/handlers/http/modal/query/querier_logstream.rs (1)

101-111: Pass tenant context to ingestors during stream deletion.

The ingestor delete endpoint extracts tenant_id from the request header via get_tenant_id_from_request(), but send_stream_delete_request does not accept or forward the tenant_id parameter. This causes the ingestor to receive a DELETE request without the tenant header, breaking multi-tenant stream deletion.

Update send_stream_delete_request to accept tenant_id as a parameter and include it as a request header:

pub async fn send_stream_delete_request(
    url: &str,
    ingestor: IngestorMetadata,
    tenant_id: &Option<String>,  // Add this
) -> Result<(), StreamError> {
    // ...
    let mut req = INTRA_CLUSTER_CLIENT.delete(url);
    if let Some(tenant) = tenant_id {
        req = req.header("tenant", tenant);
    }
    req.header(header::AUTHORIZATION, ingestor.token)
        .send()
        .await

Then update the caller in querier_logstream.rs:110 to pass &tenant_id.
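
The header-forwarding rule described above can be isolated into a small pure function, which also makes it easy to test. This is a sketch only: `Headers` and `forwarded_headers` are hypothetical names, and a plain vector of pairs stands in for the HTTP client's real header type. The `"tenant"` header name is the one used in the snippet above.

```rust
/// Hypothetical header list for an outgoing intra-cluster request.
pub type Headers = Vec<(String, String)>;

/// Builds the headers for a request forwarded to an ingestor. The tenant
/// header is added only when a tenant is present, so single-tenant
/// deployments send exactly what they sent before.
pub fn forwarded_headers(auth_token: &str, tenant_id: &Option<String>) -> Headers {
    let mut headers = vec![("authorization".to_owned(), auth_token.to_owned())];
    if let Some(tenant) = tenant_id {
        headers.push(("tenant".to_owned(), tenant.clone()));
    }
    headers
}
```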

src/alerts/alerts_utils.rs (1)

129-154: auth_token parameter is not forwarded to send_query_request and requires type conversion from Option<String> to Option<HeaderMap>.

The function receives auth_token: Option<String> but passes None as the first argument to send_query_request, which expects Option<HeaderMap>. Additionally, there is a related issue in alert_types.rs (lines 108-112) where the admin user's basic auth credentials are extracted but not converted to an auth token—the code currently returns None with a comment "use that to send request" but does not implement it.

To fix execute_remote_query, convert the token string to a HeaderMap before passing it to send_query_request. Based on the codebase patterns (e.g., target.rs, ingest_server.rs), the token format is "Basic {base64_encoded(username:password)}", so ensure the conversion accounts for this format.

Also address the incomplete implementation in alert_types.rs to actually build and pass the token from the extracted Basic credentials.
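
As a rough illustration of the `Option<String>` to `Option<HeaderMap>` conversion, here is a standard-library sketch. The `HeaderMap` alias and `auth_headers` function are hypothetical; the real code would use the HTTP library's own header types, and the token is assumed to arrive already formatted as `Basic <base64>`.

```rust
use std::collections::BTreeMap;

/// Stand-in for an HTTP header map (an assumption for this sketch; the
/// real code would use the HTTP client's own header type).
pub type HeaderMap = BTreeMap<String, String>;

/// Converts an optional, already formatted token such as `Basic <base64>`
/// into an optional header map. Blank tokens are treated as absent.
pub fn auth_headers(auth_token: Option<String>) -> Option<HeaderMap> {
    let token = auth_token?;
    if token.trim().is_empty() {
        return None;
    }
    let mut map = HeaderMap::new();
    map.insert("authorization".to_owned(), token);
    Some(map)
}
```

Keeping the conversion in one place means the caller can pass its result straight through instead of hard-coding `None`.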

src/storage/store_metadata.rs (1)

124-144: Tenant-specific staging reads are still global

resolve_parseable_metadata now accepts tenant_id, but get_staging_metadata() still reads the default file. For non-default tenants this can ignore existing tenant metadata and overwrite it with default state.

🐛 Suggested fix: make staging reads tenant-aware
 pub async fn resolve_parseable_metadata(
     parseable_metadata: &Option<Bytes>,
     tenant_id: &Option<String>,
 ) -> Result<StorageMetadata, ObjectStorageError> {
-    let staging_metadata = get_staging_metadata()?;
+    let staging_metadata = get_staging_metadata(tenant_id)?;
     let remote_metadata = parseable_metadata
         .as_ref()
         .map(|meta| serde_json::from_slice(meta).expect("parseable config is valid json"));
@@
-pub fn get_staging_metadata() -> io::Result<Option<StorageMetadata>> {
-    let path = RelativePathBuf::from(PARSEABLE_METADATA_FILE_NAME)
-        .to_path(PARSEABLE.options.staging_dir());
+pub fn get_staging_metadata(tenant_id: &Option<String>) -> io::Result<Option<StorageMetadata>> {
+    let path = if let Some(tenant_id) = tenant_id.as_ref() {
+        RelativePathBuf::from_iter([tenant_id.as_str(), PARSEABLE_METADATA_FILE_NAME])
+            .to_path(PARSEABLE.options.staging_dir())
+    } else {
+        RelativePathBuf::from(PARSEABLE_METADATA_FILE_NAME)
+            .to_path(PARSEABLE.options.staging_dir())
+    };
src/parseable/streams.rs (1)

1211-1227: Tests won't compile: missing tenant_id parameter.

All test functions calling Stream::new are missing the required tenant_id parameter added at line 123. This applies to tests at lines 1216, 1234, 1252, 1270, 1291, 1314, 1348, 1377, 1434, 1460, 1483, 1510, 1533, 1564, and the Streams::get_or_create calls at lines 1638, 1646, 1676, 1714, 1722.

🐛 Example fix for test_staging_new_with_valid_stream
     fn test_staging_new_with_valid_stream() {
         let stream_name = "test_stream";
 
         let options = Arc::new(Options::default());
         let staging = Stream::new(
             options.clone(),
             stream_name,
             LogStreamMetadata::default(),
             None,
+            &None,
         );
 
         assert_eq!(
             staging.data_path,
-            options.local_stream_data_path(stream_name)
+            options.local_stream_data_path(stream_name, &None)
         );
     }

Similarly, Streams::get_or_create calls need the tenant_id parameter:

         let stream1 = streams.get_or_create(
             options.clone(),
             stream_name.to_owned(),
             metadata.clone(),
             ingestor_id.clone(),
+            &None,
         );
src/metrics/mod.rs (1)

38-168: Reconsider the high-cardinality labels approach.

While the tenant_id additions are consistently implemented across metrics with proper label ordering, this approach violates Prometheus best practices. Adding unbounded tenant_id values as metric labels creates cardinality explosion—each unique tenant generates new time series, causing memory pressure and query performance degradation. Prometheus documentation explicitly warns against high-cardinality labels.

Consider alternatives:

  • Use bounded label categories (e.g., tenant_tier: free/premium) instead of raw tenant IDs
  • Move tenant-specific debugging data to logs (Loki) or traces (Jaeger)
  • Use metric relabeling at scrape time to drop or remap tenant_id
  • Implement per-tenant Prometheus instances or remote-write destinations for isolation
  • Aggregate per-tenant metrics via recording rules rather than storing per-tenant time series

Also note: This is a breaking change for existing Prometheus queries and Grafana dashboards.
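
To make the first alternative concrete, the sketch below keys a counter by a bounded tier instead of a raw tenant ID. It is purely illustrative: `TenantTier` and `EventsByTier` are hypothetical names, and a plain map stands in for a real metrics registry.

```rust
use std::collections::HashMap;

/// Hypothetical bounded label: a small, fixed set of values.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum TenantTier {
    Free,
    Premium,
}

impl TenantTier {
    /// Label value that would be attached to the metric.
    pub fn as_label(self) -> &'static str {
        match self {
            TenantTier::Free => "free",
            TenantTier::Premium => "premium",
        }
    }
}

/// Counter keyed by tier rather than tenant ID, so the number of series
/// is bounded by the number of tiers, not the number of tenants.
#[derive(Default)]
pub struct EventsByTier {
    counts: HashMap<TenantTier, u64>,
}

impl EventsByTier {
    pub fn record(&mut self, tier: TenantTier, events: u64) {
        *self.counts.entry(tier).or_insert(0) += events;
    }

    pub fn get(&self, tier: TenantTier) -> u64 {
        self.counts.get(&tier).copied().unwrap_or(0)
    }

    pub fn series_count(&self) -> usize {
        self.counts.len()
    }
}
```

The trade-off is that per-tenant detail is lost at the metric level, which matters if the per-tenant billing and reporting mentioned in the PR summary depends on these metrics.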

src/migration/mod.rs (1)

50-61: Staging metadata should be read with tenant scope to match write pattern.

get_staging_metadata does not accept tenant_id, but put_staging_metadata explicitly scopes metadata files by tenant (at staging_dir/tenant_id/.parseable.json). The read at line 60 must also be tenant-scoped, or the migration will read from the wrong location for multi-tenant setups.

Update get_staging_metadata to accept tenant_id and apply the same tenant-scoped path logic as put_staging_metadata:

Suggested fix
-pub fn get_staging_metadata(config: &Parseable) -> anyhow::Result<Option<serde_json::Value>> {
-    let path =
-        RelativePathBuf::from(PARSEABLE_METADATA_FILE_NAME).to_path(config.options.staging_dir());
+pub fn get_staging_metadata(
+    config: &Parseable,
+    tenant_id: &Option<String>,
+) -> anyhow::Result<Option<serde_json::Value>> {
+    let path = if let Some(tenant) = tenant_id.as_ref() {
+        config
+            .options
+            .staging_dir()
+            .join(tenant)
+            .join(".parseable.json")
+    } else {
+        config.options.staging_dir().join(".parseable.json")
+    };

Then update the call at line 60:

-    let staging_metadata = get_staging_metadata(config)?;
+    let staging_metadata = get_staging_metadata(config, tenant_id)?;
src/storage/azure_blob.rs (1)

214-318: Count object-store calls only on success.

increment_object_store_calls_by_date is invoked before verifying results in several operations, so failed calls are counted. Move the increment into the Ok path (and only count DELETEs when they succeed in _delete_prefix). Based on learnings, metrics should only be emitted on successful operations.

🛠️ Example adjustment (apply similarly to other ops)
-        let resp = self.client.put(&to_object_store_path(path), resource).await;
-        increment_object_store_calls_by_date("PUT", &Utc::now().date_naive().to_string(), tenant);
-        match resp {
+        match self.client.put(&to_object_store_path(path), resource).await {
             Ok(_) => {
+                increment_object_store_calls_by_date(
+                    "PUT",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 increment_files_scanned_in_object_store_calls_by_date(
                     "PUT",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
                 Ok(())
             }
             Err(err) => Err(err.into()),
         }

Also applies to: 321-357, 359-381, 499-517, 639-660, 663-679, 692-717

🤖 Fix all issues with AI agents
In `@src/handlers/http/cluster/mod.rs`:
- Around line 1900-1915: The outgoing request header map (variable map)
currently only inserts AUTHORIZATION when auth_token is None and iterates auth
when Some, but never propagates tenant_id; modify the logic around
auth_token/querier to always insert the tenant header when tenant_id is present:
check the tenant_id (querier.tenant_id or local tenant_id variable) and, if
Some, create a HeaderName (matching your project's tenant header constant or
e.g., "x-tenant-id") and a HeaderValue from the tenant_id string and insert it
into map (similar to how AUTHORIZATION is inserted); ensure this runs both in
the auth_token Some branch and the else branch so tenant context is always added
to map.

In `@src/metastore/metastores/object_store_metastore.rs`:
- Around line 317-332: The tenant fallback in get_alert_states is incorrect: it
uses DEFAULT_TENANT which makes the base path
RelativePathBuf::from_iter([&tenant, ALERTS_ROOT_DIRECTORY]) point at
DEFAULT_TENANT/.alerts/ instead of the tenant-less path used elsewhere; change
the fallback to the empty string (use tenant_id.as_ref().map_or("", |v| v) or
as_deref().unwrap_or("")) so the RelativePathBuf matches other methods that use
an empty string for non-tenant paths, ensuring get_objects searches the same
/.alerts/ location when tenant_id is None.

In `@src/query/stream_schema_provider.rs`:
- Around line 84-96: The table() method currently checks table_exist(name) then
calls PARSEABLE.get_stream(name, &self.tenant_id).expect(STREAM_EXISTS), which
can panic if the stream is deleted between the two calls; remove the TOCTOU by
calling PARSEABLE.get_stream(...) directly inside table(), match on its result
(Err(_) => return Ok(None), Ok(stream) => construct and return the
StandardTableProvider wrapped in Arc), and avoid using expect(STREAM_EXISTS);
ensure any errors from get_stream are converted to a DataFusionResult error or
mapped to Ok(None) as appropriate so table() no longer panics.

In `@src/storage/store_metadata.rs`:
- Around line 296-322: put_staging_metadata currently constructs a
tenant-specific path but never creates the tenant subdirectory, causing open to
fail for new tenants; before opening/writing the metadata file in
put_staging_metadata, call std::fs::create_dir_all on the directory portion
(e.g., PARSEABLE.options.staging_dir() or the tenant-specific parent via
path.parent()) to ensure the staging directory exists, propagate any io::Error
(matching the function's io::Result<()>), and then proceed to write the file
(retaining use of PARSEABLE_METADATA_FILE_NAME and keeping adjustments to
staging_metadata.server_mode and .staging).
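
The staging-metadata points raised in this review (tenant-scoped reads, tenant-scoped writes, and creating the tenant subdirectory before writing) can be sketched together with the standard library. The function names below are hypothetical, the staging directory is passed in explicitly rather than read from `PARSEABLE.options`, and the `.parseable.json` file name is taken from the review text.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// File name taken from the review text; the real code uses a constant.
const METADATA_FILE_NAME: &str = ".parseable.json";

/// Single place that decides where a tenant's staging metadata lives,
/// shared by the read and write paths so they cannot drift apart.
pub fn staging_metadata_path(staging_dir: &Path, tenant_id: &Option<String>) -> PathBuf {
    match tenant_id.as_deref() {
        Some(tenant) => staging_dir.join(tenant).join(METADATA_FILE_NAME),
        None => staging_dir.join(METADATA_FILE_NAME),
    }
}

/// Writes the metadata, creating the (possibly tenant-specific) parent
/// directory first so the write does not fail for a new tenant.
pub fn write_staging_metadata(
    staging_dir: &Path,
    tenant_id: &Option<String>,
    contents: &[u8],
) -> io::Result<()> {
    let path = staging_metadata_path(staging_dir, tenant_id);
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(path, contents)
}

/// Reads the metadata back; a missing file is reported as `None`.
pub fn read_staging_metadata(
    staging_dir: &Path,
    tenant_id: &Option<String>,
) -> io::Result<Option<Vec<u8>>> {
    match fs::read(staging_metadata_path(staging_dir, tenant_id)) {
        Ok(bytes) => Ok(Some(bytes)),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(err) => Err(err),
    }
}
```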

Comment on lines 84 to 96
     async fn table(&self, name: &str) -> DataFusionResult<Option<Arc<dyn TableProvider>>> {
         if self.table_exist(name) {
             Ok(Some(Arc::new(StandardTableProvider {
                 schema: PARSEABLE
-                    .get_stream(name)
+                    .get_stream(name, &self.tenant_id)
                     .expect(STREAM_EXISTS)
                     .get_schema(),
                 stream: name.to_owned(),
                 tenant_id: self.tenant_id.clone(),
             })))
         } else {
             Ok(None)
         }

⚠️ Potential issue | 🟡 Minor

Potential TOCTOU race between table_exist and get_stream.

There's a small race window between the table_exist check (line 85) and the get_stream call (line 88). If the stream is deleted between these calls, the .expect(STREAM_EXISTS) will panic.

Consider handling the get_stream result directly instead of relying on the prior existence check:

Suggested fix
     async fn table(&self, name: &str) -> DataFusionResult<Option<Arc<dyn TableProvider>>> {
-        if self.table_exist(name) {
-            Ok(Some(Arc::new(StandardTableProvider {
-                schema: PARSEABLE
-                    .get_stream(name, &self.tenant_id)
-                    .expect(STREAM_EXISTS)
-                    .get_schema(),
-                stream: name.to_owned(),
-                tenant_id: self.tenant_id.clone(),
-            })))
-        } else {
-            Ok(None)
+        match PARSEABLE.get_stream(name, &self.tenant_id) {
+            Ok(stream) => Ok(Some(Arc::new(StandardTableProvider {
+                schema: stream.get_schema(),
+                stream: name.to_owned(),
+                tenant_id: self.tenant_id.clone(),
+            }))),
+            Err(_) => Ok(None),
         }
     }
🤖 Prompt for AI Agents
In `@src/query/stream_schema_provider.rs` around lines 84 - 96, The table() method
currently checks table_exist(name) then calls PARSEABLE.get_stream(name,
&self.tenant_id).expect(STREAM_EXISTS), which can panic if the stream is deleted
between the two calls; remove the TOCTOU by calling PARSEABLE.get_stream(...)
directly inside table(), match on its result (Err(_) => return Ok(None),
Ok(stream) => construct and return the StandardTableProvider wrapped in Arc),
and avoid using expect(STREAM_EXISTS); ensure any errors from get_stream are
converted to a DataFusionResult error or mapped to Ok(None) as appropriate so
table() no longer panics.
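
The check-then-use race can be reproduced in miniature with a lock-protected map. This sketch uses a hypothetical `Registry` type in place of `PARSEABLE` and a string in place of the schema; it only illustrates why a single lookup is safer than an existence check followed by a second lookup.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a stream registry keyed by stream name.
#[derive(Default)]
pub struct Registry {
    streams: RwLock<HashMap<String, Arc<String>>>,
}

impl Registry {
    pub fn insert(&self, name: &str, schema: &str) {
        self.streams
            .write()
            .unwrap()
            .insert(name.to_owned(), Arc::new(schema.to_owned()));
    }

    pub fn remove(&self, name: &str) {
        self.streams.write().unwrap().remove(name);
    }

    pub fn contains(&self, name: &str) -> bool {
        self.streams.read().unwrap().contains_key(name)
    }

    pub fn get(&self, name: &str) -> Result<Arc<String>, String> {
        self.streams
            .read()
            .unwrap()
            .get(name)
            .cloned()
            .ok_or_else(|| format!("stream {} not found", name))
    }
}

/// Racy: the lock is released between `contains` and `get`, so another
/// thread can call `remove` in between and the `expect` then panics.
pub fn schema_check_then_get(reg: &Registry, name: &str) -> Option<Arc<String>> {
    if reg.contains(name) {
        Some(reg.get(name).expect("stream exists"))
    } else {
        None
    }
}

/// Single lookup: existence and retrieval happen under one lock
/// acquisition, and a missing stream simply becomes `None`.
pub fn schema_single_lookup(reg: &Registry, name: &str) -> Option<Arc<String>> {
    reg.get(name).ok()
}
```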

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (17)
src/storage/retention.rs (1)

215-224: Include tenant context in path construction.

The path on line 216 should include tenant_id as the first element when present, consistent with the pattern used elsewhere for tenant-aware storage operations. Construct the path as [tenant_id, &stream_name, &date] when a tenant exists, or [&stream_name, &date] otherwise.

Example pattern from codebase
// Current (retention.rs:216)
let path = RelativePathBuf::from_iter([&stream_name, &date]);

// Should be (aligns with object_store_metastore.rs patterns)
let path = if let Some(tenant_id) = tenant_id.as_ref() {
    RelativePathBuf::from_iter([tenant_id, &stream_name, &date])
} else {
    RelativePathBuf::from_iter([&stream_name, &date])
};
src/prism/logstream/mod.rs (2)

200-214: Missing hottier field in response struct.

The build_dataset_response method fetches hot tier info (line 322) but this struct has no hottier field to include it. Either add the field or remove the unused fetch.

🛠️ If hot tier info should be included:
 #[derive(Serialize, Debug)]
 pub struct PrismDatasetResponse {
     /// Name of the stream
     stream: String,
     /// Basic information about the stream
     info: StreamInfo,
     /// Schema of the stream
     schema: Arc<Schema>,
     /// Statistics for the queried timeframe
     stats: QueriedStats,
     /// Retention policy details
     retention: Retention,
     /// Count of records in the specified time range
     counts: CountsResponse,
+    /// Hot tier configuration (if enabled)
+    #[serde(skip_serializing_if = "Option::is_none")]
+    hottier: Option<StreamHotTier>,
 }

408-408: Incorrect HTTP status code for TimeParse error.

TimeParseError indicates malformed input (a client error), not a missing resource. BAD_REQUEST (400) is more appropriate than NOT_FOUND (404).

-            PrismLogstreamError::TimeParse(_) => StatusCode::NOT_FOUND,
+            PrismLogstreamError::TimeParse(_) => StatusCode::BAD_REQUEST,
src/otel/traces.rs (1)

922-922: Test will not compile: missing tenant_id parameter.

The flatten_otel_traces function now requires a tenant_id: &str parameter, but the test at line 922 still calls flatten_otel_traces(&traces_data) without it.

🔧 Suggested fix
-        let result = flatten_otel_traces(&traces_data);
+        let result = flatten_otel_traces(&traces_data, "test-tenant");
src/hottier.rs (1)

208-218: Delete path ignores tenant_id.
When a tenant is provided, deleting hot_tier_path/stream can wipe the wrong tenant or leave the correct one behind. Use a tenant-aware base path.

🐛 Suggested fix
-        let path = self.hot_tier_path.join(stream);
+        let path = if let Some(tenant) = tenant_id.as_ref() {
+            self.hot_tier_path.join(tenant).join(stream)
+        } else {
+            self.hot_tier_path.join(stream)
+        };
         fs::remove_dir_all(path).await?;
src/parseable/streams.rs (1)

118-127: Update all Stream::new call sites to pass tenant_id.
The signature changed (Line 118), but several callers/tests in this file still pass only 4 args, which will fail to compile.

🔧 Example fix (apply across call sites)
         let staging = Stream::new(
             options.clone(),
             stream_name,
             LogStreamMetadata::default(),
             None,
+            &None,
         );
src/alerts/alerts_utils.rs (1)

130-154: auth_token parameter is unused in remote query execution.

The auth_token parameter is accepted at line 131 but send_query_request is called with None at line 149 instead. This appears to be an oversight - if authentication is needed for remote queries in Prism mode, the token should be forwarded.

🔧 Proposed fix
-    let (result_value, _) = send_query_request(None, &query_request, tenant_id)
+    let (result_value, _) = send_query_request(auth_token, &query_request, tenant_id)
src/alerts/alert_structs.rs (1)

873-881: AlertStateEntry should follow the same tenant-scoping pattern as AlertConfig and MTTRHistory.

AlertStateEntry lacks a tenant_id field and does not include tenant information in its get_object_path() implementation, unlike AlertConfig (line 889) and MTTRHistory (line 899) which both incorporate tenant IDs into their storage paths. While tenant isolation currently works through the storage layer's parameter handling, this creates an inconsistent pattern. Consider either adding tenant_id to the AlertStateEntry struct and using it in the path (matching the pattern of AlertConfig), or updating alert_state_json_path() to accept a tenant parameter (like alert_json_path() does).

src/rbac/user.rs (1)

346-354: User existence check in validate is not tenant-scoped.

Line 350 checks users().contains_key(group_user.userid()) but users() returns a tenant-keyed map based on the PR's multi-tenant design. This check will likely fail or check the wrong scope since it doesn't use tenant_id to look up the correct tenant's user map.

🔧 Suggested fix
         let mut non_existent_users = Vec::new();
         if !self.users.is_empty() {
             // validate that the users exist
             for group_user in &self.users {
-                if !users().contains_key(group_user.userid()) {
+                let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v.as_str());
+                let user_exists = users()
+                    .get(tenant)
+                    .map(|tenant_users| tenant_users.contains_key(group_user.userid()))
+                    .unwrap_or(false);
+                if !user_exists {
                     non_existent_users.push(group_user.userid().to_string());
                 }
             }
         }
src/storage/gcs.rs (3)

180-207: Metrics increment should occur after successful response, not before.

Line 187 increments increment_object_store_calls_by_date before checking if the response succeeded. Per the learnings, metrics should only be captured when operations succeed. The Azure Blob implementation shows the correct pattern where metrics are recorded after the await?.

Based on learnings, metrics should only increment on successful operations.

🔧 Suggested fix
     async fn _get_object(
         &self,
         path: &RelativePath,
         tenant_id: &Option<String>,
     ) -> Result<Bytes, ObjectStorageError> {
         let resp = self.client.get(&to_object_store_path(path)).await;
         let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
         match resp {
             Ok(resp) => {
                 let body: Bytes = resp.bytes().await?;
+                increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
                 increment_files_scanned_in_object_store_calls_by_date(

209-230: _put_object and _delete_prefix have same metrics timing issue.

Lines 217 and 242 increment metrics before confirming success. Apply the same fix pattern — move the call count increment inside the success branch.

Also applies to: 232-283


702-707: list_streams unimplemented across all cloud storage backends.

This method returns Err(ObjectStorageError::Custom(...)) for GCS, S3, and Azure Blob Storage. This breaks the stream listing API endpoint (/api/v1/logstream) for all cloud-based deployments. LocalFS has a complete implementation, indicating this functionality is needed. Either implement list_streams for cloud backends or provide an alternative mechanism for querying available streams.

src/storage/azure_blob.rs (1)

215-472: Record object-store metrics only after successful operations.
increment_object_store_calls_by_date (and related file/bytes counters) are executed even when GET/PUT/LIST/DELETE fail, which will skew billing metrics. Move increments into the Ok(...) paths (and after deletes succeed).

🔧 Example fix pattern
         let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
@@
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
@@
         let resp = self.client.put(&to_object_store_path(path), resource).await;
-        increment_object_store_calls_by_date("PUT", &Utc::now().date_naive().to_string(), tenant);
         match resp {
             Ok(_) => {
+                increment_object_store_calls_by_date(
+                    "PUT",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 increment_files_scanned_in_object_store_calls_by_date(

Based on learnings, metrics should only be recorded for successful object-store operations.
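
One way to avoid repeating the "increment only in the `Ok` branch" pattern in every storage method is a small wrapper. The sketch below is synchronous and uses an atomic counter as a stand-in for the real metric; `CallCounter` and `counted` are hypothetical names, and the actual storage methods are async.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Stand-in for a metrics counter (an assumption for this sketch).
pub struct CallCounter {
    calls: AtomicU64,
}

impl CallCounter {
    pub const fn new() -> Self {
        Self { calls: AtomicU64::new(0) }
    }

    pub fn get(&self) -> u64 {
        self.calls.load(Ordering::Relaxed)
    }
}

/// Runs `op` and bumps the counter only if it returned `Ok`, so failed
/// calls never reach the metric.
pub fn counted<T, E>(
    counter: &CallCounter,
    op: impl FnOnce() -> Result<T, E>,
) -> Result<T, E> {
    let result = op();
    if result.is_ok() {
        counter.calls.fetch_add(1, Ordering::Relaxed);
    }
    result
}
```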

src/migration/mod.rs (1)

50-118: Staging metadata lookup should be tenant-aware.
run_metadata_migration now accepts tenant_id but still calls get_staging_metadata(config) without tenant context. In multi-tenant mode, this will read only the root staging metadata and skip tenant subdirs.

Consider changing get_staging_metadata to accept tenant_id and read from staging_dir()/tenant_id/.parseable.json (matching put_staging_metadata).

src/handlers/http/cluster/mod.rs (2)

501-707: Add tenant header to user/role sync requests to prevent multi-tenant collisions.

These sync requests omit the tenant context despite having access to it. The receiving handlers extract tenant from the "tenant" request header via get_tenant_id_from_request(), and without it they default to DEFAULT_TENANT. In multi-tenant deployments, this causes operations on users/roles to potentially affect the wrong tenant when usernames collide.

Include the tenant header in all four sync functions:

  • sync_users_with_roles_with_ingestors
  • sync_user_deletion_with_ingestors
  • sync_user_creation
  • sync_password_reset_with_ingestors
🔧 Example fix (apply similarly to deletion/password-reset)
-            let res = INTRA_CLUSTER_CLIENT
-                .patch(url)
-                .header(header::AUTHORIZATION, &ingestor.token)
-                .header(header::CONTENT_TYPE, "application/json")
-                .body(role_data)
-                .send()
-                .await
+            let mut req = INTRA_CLUSTER_CLIENT
+                .patch(url)
+                .header(header::AUTHORIZATION, &ingestor.token)
+                .header(header::CONTENT_TYPE, "application/json");
+            if let Some(tenant) = tenant_id.as_ref() {
+                req = req.header("tenant", tenant);
+            }
+            let res = req.body(role_data).send().await
                 .map_err(|err| {

379-417: Add tenant header to sync requests.

The function uses tenant_id to select target ingestors but doesn't forward it in the HTTP request headers. In multi-tenant mode, syncs can default to the default tenant unless the tenant context is propagated. Add the tenant header when tenant_id is present.

🔧 Suggested fix
     let mut reqwest_headers = reqwest::header::HeaderMap::new();
@@
     for (key, value) in headers.iter() {
         // Convert actix header name/value to reqwest header name/value
         if let Ok(name) = reqwest::header::HeaderName::from_bytes(key.as_str().as_bytes())
             && let Ok(val) = reqwest::header::HeaderValue::from_bytes(value.as_bytes())
         {
             reqwest_headers.insert(name, val);
         }
     }
+    if let Some(tenant) = tenant_id.as_ref() {
+        if let Ok(val) = reqwest::header::HeaderValue::from_str(tenant) {
+            reqwest_headers.insert(reqwest::header::HeaderName::from_static("tenant"), val);
+        }
+    }
src/parseable/mod.rs (1)

467-547: Add tenant header when syncing internal streams.

When sync_streams_with_ingestors forwards requests to ingestors, the ingestor endpoint extracts tenant context via get_tenant_id_from_request(&req), which reads the "tenant" header. The header_map created in this function lacks this header, causing internal streams to sync into the default tenant regardless of the actual tenant_id being processed in the loop.

🔧 Suggested fix
             let mut header_map = HeaderMap::new();
             header_map.insert(
                 HeaderName::from_str(STREAM_TYPE_KEY).unwrap(),
                 HeaderValue::from_str(&StreamType::Internal.to_string()).unwrap(),
             );
             header_map.insert(CONTENT_TYPE, HeaderValue::from_static("application/json"));
+            if let Some(tenant) = tenant_id.as_ref() {
+                if let Ok(val) = HeaderValue::from_str(tenant) {
+                    header_map.insert(HeaderName::from_static("tenant"), val);
+                }
+            }
🤖 Fix all issues with AI agents
In `@src/handlers/http/correlation.rs`:
- Around line 89-91: The current mapping of
get_user_and_tenant_from_request(&req) into only a hashed user discards tenant
context; modify the flow so the closure returns both hashed user and tenant
(e.g., map(|(user, tenant)| (get_hash(&user.to_string()), tenant.clone())) or
otherwise extract both values, then set correlation.user_id and
correlation.tenant_id accordingly, preserving the same error mapping to
CorrelationError::AnyhowError(Error::msg(err.to_string())). Use the same symbols
get_user_and_tenant_from_request, get_hash, correlation.user_id,
correlation.tenant_id, and CorrelationError::AnyhowError in the fix.

In `@src/handlers/http/modal/ingest/ingestor_rbac.rs`:
- Line 60: The call to storage::put_staging_metadata is currently ignored (let _
= ...), causing silent failures and state divergence; replace these with proper
error propagation/handling by unwrapping the Result (e.g., use the ? operator)
or mapping the error into the handler's error type and returning it so
persistence failures bubble up; apply the same change for every invocation of
storage::put_staging_metadata in this file (the occurrences around the other
call sites) and ensure you log the error context before returning if your
handler pattern requires logging.

In `@src/handlers/http/modal/ingest/ingestor_role.rs`:
- Line 58: The call to storage::put_staging_metadata(&metadata, req_tenant_id)
is being ignored; change it to propagate and handle failures instead of
discarding them. Replace the `let _ = ...` with a proper Result check (e.g., use
the `?` operator or map_err to convert into the handler's error type) and ensure
the enclosing function (in ingestor_role.rs) returns a compatible Result so
storage errors bubble up; alternatively perform the storage write before
mutating any in-memory state or roll back the in-memory update on error so state
stays consistent.

In `@src/handlers/http/rbac.rs`:
- Around line 72-75: The handler list_users (and the other handlers noted)
currently calls get_tenant_id_from_request(&req) which reads a spoofable header;
change them to derive tenant_id from the authenticated session/claims instead
(e.g., fetch the authenticated user/claims injected by the auth middleware such
as get_authenticated_user / get_user_from_session or
req.extensions().get::<AuthClaims>()), then use that user's tenant field when
calling Users.collect_user::<User>(&tenant_id); remove reliance on request
headers and return an unauthorized/error if no authenticated session is present.

In `@src/handlers/http/users/dashboards.rs`:
- Around line 36-38: The code currently derives tenant_id from request headers
using get_tenant_id_from_request; change it to derive tenant from the
authenticated session/token instead (e.g., call the session-based helper such as
get_tenant_id_from_session or extract tenant from the request's auth principal)
inside list_dashboards so the tenant is not taken from forgeable headers; update
the same pattern in the other handlers in this file (the functions around lines
59-69) to replace any use of get_tenant_id_from_request with the session-derived
tenant extraction and ensure downstream logic uses that value.

In `@src/prism/logstream/mod.rs`:
- Around line 321-334: The local variable `hottier` returned by
get_hot_tier_info is fetched but never used; update the PrismDatasetResponse
construction to include the hot-tier info by adding a hottier (or hot_tier)
field populated with the hottier value, i.e. set the response's hot tier field
to hottier in the PrismDatasetResponse initializer; if PrismDatasetResponse
currently lacks a matching field, add a suitably typed field (e.g., hottier:
HotTierInfo) to the PrismDatasetResponse definition so the value can be stored
and propagated.
♻️ Duplicate comments (64)
src/handlers/airplane.rs (1)

113-115: Propagate tenant_id instead of defaulting to the global tenant.

These paths still pass &None into tenant-aware APIs, so Flight requests for non-default tenants will read/execute/clean up against the default tenant. That’s a cross-tenant data and auth boundary violation. Please derive tenant_id from the session key (or ticket metadata for get_schema) and thread it through all these calls.

🔧 Example fix pattern (derive tenant_id and pass it through)
  let key = extract_session_key(req.metadata())
     .map_err(|e| Status::unauthenticated(e.to_string()))?;

+ let (_, tenant_raw) = Users
+     .get_userid_from_session(&key)
+     .ok_or_else(|| Status::unauthenticated("invalid session"))?;
+ let tenant_id = if tenant_raw == DEFAULT_TENANT { None } else { Some(tenant_raw) };

- let schema = PARSEABLE.get_stream(&table_name, &None) ...
+ let schema = PARSEABLE.get_stream(&table_name, &tenant_id) ...

- let ingester_metadatas = get_node_info(NodeType::Ingestor, &None).await?;
+ let ingester_metadatas = get_node_info(NodeType::Ingestor, &tenant_id).await?;

- user_auth_for_datasets(&permissions, &streams, &None).await?;
+ user_auth_for_datasets(&permissions, &streams, &tenant_id).await?;

- let (records, _) = execute(query, false, &None).await?;
+ let (records, _) = execute(query, false, &tenant_id).await?;

- PARSEABLE.get_or_create_stream(&stream_name, &None).clear();
+ PARSEABLE.get_or_create_stream(&stream_name, &tenant_id).clear();

Also applies to: 169-171, 202-206, 209-211, 238-239

src/prism/logstream/mod.rs (2)

66-73: Debug artifacts and fake stats already flagged.

The debug tracing::warn! statements and the hardcoded QueriedStats::default() bypassing real stats have been raised in previous reviews. Please address those comments.


111-158: Stats retrieval logic is correct; debug log previously flagged.

The tenant-aware stats retrieval and merging logic is properly implemented. The tracing::warn!("starting stats") on line 115 was flagged in a previous review.

src/handlers/http/users/dashboards.rs (2)

78-85: Same tenant isolation issue here—use session-derived tenant.
The header-based tenant can be spoofed to read other tenants’ dashboards.

🔒 Proposed fix
-    let tenant_id = get_tenant_id_from_request(&req);
+    let (_user_id, tenant_id) = get_user_and_tenant_from_request(&req)?;

248-251: Derive tenant from session to avoid cross-tenant tag disclosure.
Header-based tenant selection bypasses isolation.

🔒 Proposed fix
-    let tags = DASHBOARDS
-        .list_tags(&get_tenant_id_from_request(&req))
-        .await;
+    let (_user_id, tenant_id) = get_user_and_tenant_from_request(&req)?;
+    let tags = DASHBOARDS.list_tags(&tenant_id).await;
src/query/listing_table_builder.rs (1)

101-102: Clarify the comment to reflect correct multi-tenancy behavior.

The comment "None because no new data will be created using this method" is misleading. This is a read/listing operation, not a data creation operation. Passing None for tenant_id defaults to DEFAULT_TENANT (as seen in the storage implementations across s3.rs, gcs.rs, and azure_blob.rs).

The comment should explain why the default tenant is appropriate for this historical data listing context.

Suggested comment improvement
-            // None because no new data will be created using this method
+            // tenant_id is None to use the default tenant for this historical data listing
             match storage.list_dirs_relative(&prefix, &None).await {
src/storage/store_metadata.rs (1)

307-329: Ensure tenant staging directory exists before write.

The function constructs a tenant-specific path but never creates the tenant subdirectory. For new tenants, this will fail with NotFound when attempting to open the file.

🐛 Suggested fix
 pub fn put_staging_metadata(meta: &StorageMetadata, tenant_id: &Option<String>) -> io::Result<()> {
     let mut staging_metadata = meta.clone();
     staging_metadata.server_mode = PARSEABLE.options.mode;
     staging_metadata.staging = PARSEABLE.options.staging_dir().to_path_buf();
     let path = if let Some(tenant_id) = tenant_id.as_ref() {
-        PARSEABLE
-            .options
-            .staging_dir()
-            .join(tenant_id)
-            .join(PARSEABLE_METADATA_FILE_NAME)
+        let tenant_dir = PARSEABLE.options.staging_dir().join(tenant_id);
+        create_dir_all(&tenant_dir)?;
+        tenant_dir.join(PARSEABLE_METADATA_FILE_NAME)
     } else {
         PARSEABLE
             .options
             .staging_dir()
             .join(PARSEABLE_METADATA_FILE_NAME)
     };
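
The same idea in isolation, as a minimal sketch: create the tenant directory before writing into it. The function name, its parameters, and the use of plain bytes are assumptions for illustration, not the PR's actual `put_staging_metadata` signature.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: writes `bytes` to `<staging_dir>/<tenant>/<file_name>`
/// (or `<staging_dir>/<file_name>` when no tenant is given).
fn write_tenant_metadata(
    staging_dir: &Path,
    tenant_id: Option<&str>,
    file_name: &str,
    bytes: &[u8],
) -> io::Result<PathBuf> {
    let dir = match tenant_id {
        Some(tenant) => staging_dir.join(tenant),
        None => staging_dir.to_path_buf(),
    };
    // Without this call, the write below fails with NotFound for a
    // tenant whose subdirectory has never been created.
    fs::create_dir_all(&dir)?;
    let path = dir.join(file_name);
    fs::write(&path, bytes)?;
    Ok(path)
}
```

`create_dir_all` is a no-op when the directory already exists, so calling it on every write is safe.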
src/handlers/http/modal/query/querier_rbac.rs (1)

79-81: User created without tenant association.

User::new_basic is called with None for the tenant parameter, but tenant_id is available from the request context. This means new users won't be properly associated with their tenant in a multi-tenant deployment.

-    let (user, password) = user::User::new_basic(username.clone(), None);
+    let (user, password) = user::User::new_basic(username.clone(), tenant_id.clone());
src/storage/s3.rs (2)

341-401: Record object-store metrics only after success.
Metrics are still incremented before the result is known across the GET/PUT/DELETE/LIST paths, so failed calls inflate the counts.

🐛 Example fix (apply across similar methods)
         let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date(
-            "GET",
-            &Utc::now().date_naive().to_string(),
-            tenant_str,
-        );

         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant_str,
+                );
                 let body = resp.bytes().await?;

Based on learnings, metrics should be recorded only after successful operations.

Also applies to: 403-460, 462-531, 533-626, 629-695, 826-848, 850-872, 884-909, 1088-1122, 1124-1160
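
A minimal sketch of the pattern, reduced to standard-library types. The `get_with_metrics` name and the plain `u64` counter are assumptions standing in for the storage client and `increment_object_store_calls_by_date`.

```rust
/// Hypothetical wrapper: runs `fetch` and bumps `calls` only when it succeeds.
fn get_with_metrics<F>(calls: &mut u64, fetch: F) -> Result<Vec<u8>, String>
where
    F: FnOnce() -> Result<Vec<u8>, String>,
{
    // `?` returns early on failure, so the counter below is never
    // touched for a failed call.
    let body = fetch()?;
    *calls += 1;
    Ok(body)
}
```

Placing the increment after the `?` keeps the success-only rule in one place instead of repeating it in every match arm.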


948-952: Use tenant_str in HEAD metrics.
Passing "" fragments tenant-scoped metrics.

src/query/stream_schema_provider.rs (3)

84-96: Avoid TOCTOU between table_exist and get_stream.
This still risks a panic if the stream disappears between checks (Line 85–89).
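
One way to close the gap is a single lookup that returns an `Option`, so a stream that disappears shows up as `None` rather than a panic. A minimal sketch with a hypothetical `Streams` registry (the type, its field, and the `String` schema are stand-ins, not the PR's actual types):

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical stream registry mapping stream name to a schema string.
struct Streams {
    inner: RwLock<HashMap<String, String>>,
}

impl Streams {
    fn new() -> Self {
        Streams { inner: RwLock::new(HashMap::new()) }
    }

    fn insert(&self, name: &str, schema: &str) {
        self.inner
            .write()
            .expect("lock should not be poisoned in this sketch")
            .insert(name.to_owned(), schema.to_owned());
    }

    /// Existence check and fetch happen under one read lock, so there is
    /// no window in which the stream can vanish between the two steps.
    fn schema_for(&self, name: &str) -> Option<String> {
        self.inner.read().ok()?.get(name).cloned()
    }
}
```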


533-538: Downgrade per-scan logging from warn.
This is noisy on every scan and should be debug/trace.


648-650: Remove or downgrade warn! on object_store_url.
Per-scan warn logging will flood production logs.

src/hottier.rs (3)

92-110: Fix stream/tenant exclusion logic in hot-tier size aggregation.
The current condition still excludes the wrong entries when tenant differs.


595-603: Avoid panics in check_stream_hot_tier_exists.
hot_tier_file_path(...).unwrap() can panic on invalid paths.


779-803: pstats hot-tier sizing should use internal stream defaults.
This still uses MIN_STREAM_HOT_TIER_SIZE_BYTES.

src/parseable/streams.rs (4)

1068-1091: Reduce noisy warn! logs in get_or_create.
These appear to be debug-only and will flood production logs.


1104-1112: Missing-tenant warnings are too noisy.
Use debug/trace to avoid log spam on normal lookups.


657-659: Remove or downgrade warn!(part_path=…).
This is a hot path log and should not be at warn level.


1175-1181: Drop debug warn! logs in flush_and_convert.
These will spam logs during normal operation.

src/handlers/http/health_check.rs (1)

120-128: Guard against invalid tenant headers.
get_tenant_id_from_request still unwraps to_str(), so a malformed header can panic readiness.

src/handlers/http/correlation.rs (1)

45-52: Inconsistent tenant_id extraction creates potential security risk.

The get function extracts tenant_id from an HTTP header via get_tenant_id_from_request, while delete (lines 106-108) extracts it from the authenticated session via get_user_and_tenant_from_request. This inconsistency could allow a user to access correlations from other tenants by manipulating the tenant header.

src/tenants/mod.rs (1)

51-72: Data duplication persists - suspend/resume don't sync with meta.

As noted in the past review, suspended_services is stored both in TenantOverview.suspended_services (the HashSet) and within TenantOverview.meta.suspended_services. When suspend_service or resume_service is called, only the HashSet is updated, leaving meta.suspended_services stale.

This causes get_tenants() (line 99-104) to return potentially outdated suspension state since it returns meta.clone().

🐛 Option: Sync meta when suspending/resuming
 pub fn suspend_service(&self, tenant_id: &str, service: Service) {
     if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
         tenant.suspended_services.insert(service.clone());
+        // Keep meta in sync
+        let services = tenant.meta.suspended_services.get_or_insert_with(HashSet::new);
+        services.insert(service);
     }
 }

 pub fn resume_service(&self, tenant_id: &str, service: Service) {
     if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
         tenant.suspended_services.remove(&service);
+        // Keep meta in sync
+        if let Some(services) = tenant.meta.suspended_services.as_mut() {
+            services.remove(&service);
+        }
     }
 }
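
The sync rule can be checked in isolation. This is a sketch with simplified stand-ins for `TenantOverview`, its `meta`, and `Service`; the field layout is assumed from the description above, and the `DashMap` wrapper is omitted.

```rust
use std::collections::HashSet;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Service {
    Ingest,
    Query,
}

#[derive(Default)]
struct TenantMeta {
    suspended_services: Option<HashSet<Service>>,
}

#[derive(Default)]
struct TenantOverview {
    suspended_services: HashSet<Service>,
    meta: TenantMeta,
}

impl TenantOverview {
    /// Updates the live set and the persisted meta copy together.
    fn suspend(&mut self, service: Service) {
        self.suspended_services.insert(service.clone());
        self.meta
            .suspended_services
            .get_or_insert_with(HashSet::new)
            .insert(service);
    }

    fn resume(&mut self, service: &Service) {
        self.suspended_services.remove(service);
        if let Some(services) = self.meta.suspended_services.as_mut() {
            services.remove(service);
        }
    }
}
```

A longer-term alternative would be to keep only one copy and derive the other on read, which removes the need to sync at all.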
src/alerts/alert_types.rs (1)

91-123: Auth credential extraction is incomplete - always returns None.

As noted in the past review, the admin user lookup logic (lines 91-105) correctly identifies an admin user for the tenant, but the credential extraction (lines 107-118) always returns None:

  • Native user: returns None with TODO comments
  • OAuth user: logs warning and returns None

This means execute_alert_query will receive None for the auth token, potentially causing remote alert queries in Prism mode to fail authentication.

src/utils/mod.rs (2)

63-78: Potential panic when BasicAuth password is omitted.

As flagged in a past review, line 69 uses basic.password().as_deref().unwrap() which can panic if the password is absent. HTTP Basic Auth spec allows omitted passwords. Other paths in the codebase (e.g., extract_session_key_from_req in src/utils/actix.rs) handle this with unwrap_or("").

🛠️ Proposed fix
         Users.mutate_request_with_basic_user(
             basic.user_id(),
-            basic.password().as_deref().unwrap(),
+            basic.password().as_deref().unwrap_or(""),
             req,
         );

126-132: Potential panic on malformed header value.

As flagged in a past review, line 128 uses .to_str().unwrap(), which can panic if the tenant header contains bytes outside visible ASCII (HeaderValue::to_str returns an error for them). Consider handling this gracefully.

💡 Suggested fix
 pub fn get_tenant_id_from_request(req: &HttpRequest) -> Option<String> {
     if let Some(tenant_value) = req.headers().get("tenant") {
-        Some(tenant_value.to_str().unwrap().to_owned())
+        tenant_value.to_str().ok().map(|s| s.to_owned())
     } else {
         None
     }
 }
src/handlers/http/modal/ingest/ingestor_role.rs (1)

47-51: Inverted tenant validation logic.

The condition req_tenant.eq(&sync_req.tenant_id) checks for equality when it should check for inequality. Based on the error message, this should reject when a non-super-admin tries to create a role for a different tenant, not the same tenant.

🐛 Proposed fix
-    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant.eq(&sync_req.tenant_id)) {
+    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant.ne(&sync_req.tenant_id)) {
         return Err(RoleError::Anyhow(anyhow::Error::msg(
             "non super-admin user trying to create role for another tenant",
         )));
     }
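
The intended rule is easier to verify as a standalone predicate. A minimal sketch, assuming `DEFAULT_TENANT` is a string constant and that tenants are compared as plain strings; the function name is hypothetical.

```rust
/// Assumed value; stands in for the constant used in the PR.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Returns true when the request must be rejected: the caller is not in the
/// default (super-admin) tenant and targets a tenant other than its own.
fn is_cross_tenant_write(req_tenant: &str, target_tenant: &str) -> bool {
    req_tenant != DEFAULT_TENANT && req_tenant != target_tenant
}
```

With the original `eq` in place of the second `!=`, same-tenant requests would be rejected and cross-tenant ones allowed, which is the opposite of what the error message describes.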
src/handlers/http/modal/ingest/ingestor_rbac.rs (1)

51-57: Inverted tenant validation logic.

Same issue as in ingestor_role.rs: the condition checks for a match when it should check for a mismatch. This allows cross-tenant user creation while blocking valid same-tenant operations.

🐛 Proposed fix
-        if req_tenant.ne(DEFAULT_TENANT)
-            && (req_tenant.eq(user.tenant.as_ref().map_or(DEFAULT_TENANT, |v| v)))
+        if req_tenant.ne(DEFAULT_TENANT)
+            && (req_tenant.ne(user.tenant.as_ref().map_or(DEFAULT_TENANT, |v| v)))
src/alerts/target.rs (2)

61-69: Convert DEFAULT_TENANT string to None for consistency.

When tenant_id equals DEFAULT_TENANT, storing Some("DEFAULT_TENANT") in t.tenant violates the codebase convention where None represents the default tenant.

♻️ Suggested fix
         for (tenant_id, targets) in targets {
             let inner = targets
                 .into_iter()
                 .map(|mut t| {
-                    t.tenant = Some(tenant_id.clone());
+                    t.tenant = if tenant_id == DEFAULT_TENANT {
+                        None
+                    } else {
+                        Some(tenant_id.clone())
+                    };
                     (t.id, t)
                 })
                 .collect();
             map.insert(tenant_id, inner);
         }
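
Since the same normalization recurs in several places, it may be worth a small shared helper. A sketch, assuming the constant's value and a hypothetical function name:

```rust
/// Assumed value; stands in for the constant used in the PR.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Maps the default tenant (or an empty id) to `None`, following the
/// convention that `None` represents the default tenant.
fn normalize_tenant(tenant_id: &str) -> Option<String> {
    if tenant_id.is_empty() || tenant_id == DEFAULT_TENANT {
        None
    } else {
        Some(tenant_id.to_owned())
    }
}
```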

75-87: Security: Override target.tenant with authenticated tenant before persisting.

Looking at src/handlers/http/targets.rs (lines 71-97), the update handler deserializes the JSON payload into a Target without overwriting the tenant field. Since Target.tenant is publicly deserialized, a client can inject an arbitrary tenant value and bypass tenant validation.

Fix in src/handlers/http/targets.rs: Set target.tenant = tenant_id.clone() before calling TARGETS.update(target).

src/handlers/http/middleware.rs (1)

199-202: Potential panic from HeaderValue::from_str().unwrap().

If tid contains characters not valid in HTTP headers (e.g., control characters), this will panic. Consider handling gracefully.

♻️ Suggested fix
-                        req.headers_mut().insert(
-                            HeaderName::from_static("tenant"),
-                            HeaderValue::from_str(&tid).unwrap(),
-                        );
-                        tenant
+                        if let Ok(header_val) = HeaderValue::from_str(&tid) {
+                            req.headers_mut().insert(
+                                HeaderName::from_static("tenant"),
+                                header_val,
+                            );
+                        }
+                        tenant
src/catalog/mod.rs (1)

563-576: Avoid empty tenant path component in partition_path.

map_or("") produces a leading empty path segment when tenant_id is None, which is inconsistent with other DEFAULT_TENANT fallbacks and can create mismatched paths.

♻️ Suggested fix
-    let root = tenant_id.as_ref().map_or("", |v| v);
+    let root = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);

Please confirm expected behavior with the relative_path crate’s handling of empty components.

src/handlers/http/oidc.rs (3)

112-116: Basic-auth tenant selection still ignores request tenant context.

get_tenant_id_from_key is None for BasicAuth unless a session exists, so this can still default to the wrong tenant. Use get_tenant_id_from_request(&req) for BasicAuth.


132-155: Intra-cluster login sync should treat non‑2xx as failure.

send() succeeds on non‑2xx and silently swallows failed syncs.

🛠️ Suggested fix
-                        INTRA_CLUSTER_CLIENT
+                        let resp = INTRA_CLUSTER_CLIENT
                             .post(url)
                             .header(header::AUTHORIZATION, node.token)
                             .header(header::CONTENT_TYPE, "application/json")
                             .json(&json!(
                                 {
                                     "sessionCookie": _session,
                                     "user": _user,
                                     "expiry": EXPIRY_DURATION
                                 }
                             ))
                             .send()
                             .await?;
+                        resp.error_for_status()?;

323-327: New OIDC users are still created without tenant association.

This leaves a multi-tenant isolation gap for new users.

src/rbac/map.rs (2)

311-322: Session expiry logic is inverted.

retain(|(_, expiry)| expiry < &now) keeps expired sessions.

🐛 Suggested fix
-        sessions.retain(|(_, expiry)| expiry < &now);
+        sessions.retain(|(_, expiry)| expiry > &now);
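
`Vec::retain` keeps the elements for which the closure returns true, so the predicate has to describe live sessions. A minimal sketch, using plain integer timestamps instead of the PR's actual session and expiry types (an assumption made for brevity):

```rust
/// Drops expired sessions in place.
/// `sessions` holds hypothetical (session_key, expiry_timestamp) pairs.
fn prune_expired(sessions: &mut Vec<(String, u64)>, now: u64) {
    // Keep only entries whose expiry is still in the future.
    sessions.retain(|(_, expiry)| *expiry > now);
}
```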

437-443: Avoid panics on invalid tenant header values.

HeaderValue::from_bytes(...).unwrap() can panic on invalid bytes.

♻️ Suggested fix
-            req.headers_mut().insert(
-                HeaderName::from_static("tenant"),
-                HeaderValue::from_bytes(tenant.as_bytes()).unwrap(),
-            );
+            if let Ok(value) = HeaderValue::from_str(tenant) {
+                req.headers_mut()
+                    .insert(HeaderName::from_static("tenant"), value);
+            }
src/rbac/mod.rs (2)

354-366: Basic-auth tenant resolution still picks the first match across tenants.

If credentials exist in multiple tenants, the chosen tenant is nondeterministic.


328-331: Avoid panics when constructing tenant header.

HeaderValue::from_bytes(...).unwrap() can panic on invalid tenant values.

♻️ Suggested fix
-            req.headers_mut().insert(
-                HeaderName::from_static("tenant"),
-                HeaderValue::from_bytes(tenant.as_bytes()).unwrap(),
-            );
+            if let Ok(value) = HeaderValue::from_str(tenant) {
+                req.headers_mut()
+                    .insert(HeaderName::from_static("tenant"), value);
+            }
src/rbac/user.rs (1)

380-390: add_roles uses caller-provided tenant_id for all users instead of each user's tenant_id.

Line 387 invalidates sessions using the function parameter tenant_id for every user in the group. However, add_users (line 399-402), remove_roles (line 421-424), and remove_users (line 440-443) all correctly use group_user.tenant_id. This inconsistency could invalidate sessions for the wrong tenant if a group contains users from different tenants.

🔧 Suggested fix
     pub fn add_roles(&mut self, roles: HashSet<String>, tenant_id: &str) -> Result<(), RBACError> {
         if roles.is_empty() {
             return Ok(());
         }
         self.roles.extend(roles);
         // also refresh all user sessions
         for group_user in &self.users {
-            mut_sessions().remove_user(group_user.userid(), tenant_id);
+            mut_sessions().remove_user(
+                group_user.userid(),
+                group_user.tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v),
+            );
         }
         Ok(())
     }
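
To make the per-user behavior concrete, here is a sketch that only computes which (user, tenant) pairs would be invalidated. `GroupUser` is reduced to two fields and the function name is hypothetical; the real code calls `mut_sessions().remove_user` instead of returning a list.

```rust
/// Assumed value; stands in for the constant used in the PR.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Simplified stand-in for the group member type.
struct GroupUser {
    userid: String,
    tenant_id: Option<String>,
}

/// Each user's own tenant decides which session map entry is cleared;
/// users without a tenant fall back to the default tenant.
fn sessions_to_invalidate(users: &[GroupUser]) -> Vec<(String, String)> {
    users
        .iter()
        .map(|u| {
            let tenant = u.tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
            (u.userid.clone(), tenant.to_owned())
        })
        .collect()
}
```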
src/query/mod.rs (2)

191-205: Schema registration errors are silently ignored during initialization.

Lines 191 and 201 use let _ = catalog.register_schema(...) which discards any errors. Failed schema registration at startup will cause query failures for affected tenants without any diagnostic information.

🔧 Suggested fix
-                    let _ = catalog.register_schema(t, schema_provider);
+                    if let Err(e) = catalog.register_schema(t, schema_provider) {
+                        tracing::error!("Failed to register schema for tenant {}: {:?}", t, e);
+                    }

97-121: InMemorySessionContext::add_schema panics on registration failure.

Line 119 uses .expect("Should be able to register new schema") which will panic if schema registration fails. This could crash the server when adding a new tenant at runtime. Consider returning a Result or logging the error instead.

🔧 Suggested fix
-    pub fn add_schema(&self, tenant_id: &str) {
+    pub fn add_schema(&self, tenant_id: &str) -> Result<(), datafusion::error::DataFusionError> {
         self.session_context
             .write()
             .expect("SessionContext should be writeable")
             .catalog("datafusion")
             .expect("Default catalog should be available")
             .register_schema(
                 tenant_id,
                 Arc::new(GlobalSchemaProvider {
                     storage: PARSEABLE.storage().get_object_store(),
                     tenant_id: Some(tenant_id.to_owned()),
                 }),
-            )
-            .expect("Should be able to register new schema");
+            )?;
+        Ok(())
     }
src/handlers/http/query.rs (1)

112-131: Tenant context derived from header is spoofable.

Line 120 extracts tenant_id using get_tenant_id_from_request(&req) which reads from a header. Combined with the schema routing fix at lines 121-125 that sets default_schema based on this tenant, a malicious user could query another tenant's data by spoofing the header. The tenant should be derived from authenticated credentials.

src/storage/object_storage.rs (2)

986-1127: Lower per-file warn logs during sync.
Per-file warn! logging in process_parquet_files and process_schema_files is very noisy. Consider downgrading to debug/trace.


1214-1246: Skip tenant prefix for DEFAULT_TENANT.
schema_path / stream_json_path still include the tenant segment for Some(DEFAULT_TENANT), which can diverge from existing default-tenant layout and from stream_relative_path.

src/migration/mod.rs (2)

157-215: Per-tenant migration still aborts on first list_streams error.
Using ? inside the tenant loop exits early and skips remaining tenants. Handle errors per-tenant and continue.
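
A minimal sketch of the per-tenant handling, assuming a hypothetical `migrate_one` callback and string errors in place of the real migration and error types:

```rust
/// Runs `migrate_one` for every tenant and collects failures instead of
/// returning on the first one. Returns (tenant, error) pairs.
fn migrate_all<F>(tenants: &[&str], mut migrate_one: F) -> Vec<(String, String)>
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut failures = Vec::new();
    for tenant in tenants {
        // No `?` here: a failing tenant is recorded and the loop continues.
        if let Err(e) = migrate_one(tenant) {
            failures.push((tenant.to_string(), e));
        }
    }
    failures
}
```

The caller can then log the collected failures or turn a non-empty list into a single error once every tenant has been attempted.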


494-507: Use PARSEABLE_METADATA_FILE_NAME instead of hardcoding.
put_staging_metadata still uses the hardcoded string ".parseable.json", which can drift from the constant used elsewhere.

src/handlers/http/cluster/mod.rs (4)

709-715: Drop the unused req: HttpRequest parameter.
It’s still unused in sync_role_update and will keep triggering warnings.


918-920: Avoid panic on malformed tenant header.
get_tenant_id_from_request still relies on to_str().unwrap() in the utils layer; malformed headers can panic. Please make tenant extraction fallible and handle errors here.


1900-1910: Missing tenant propagation in query forwarding.
The outgoing request map never adds the tenant header, so queries can default to the wrong tenant. Please add the tenant header when tenant_id is present.


1911-1914: Avoid unwrap() on auth header conversion.
HeaderValue::from_str(...).unwrap() can panic on invalid header values. Use a fallible conversion and map to QueryError instead.

src/parseable/mod.rs (4)

1071-1090: Fix TOCTOU race in add_tenant.
The read-then-write lock pattern is still present and allows a race between contains() and push(). Use a single write lock for the check/insert.
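
A sketch of the single-lock version, assuming the tenant list is an `RwLock<Vec<String>>`; the `Tenants` wrapper and the `Result<bool, String>` return type are illustrative only.

```rust
use std::sync::RwLock;

/// Hypothetical holder for the tenant id list.
struct Tenants {
    ids: RwLock<Vec<String>>,
}

impl Tenants {
    fn new() -> Self {
        Tenants { ids: RwLock::new(Vec::new()) }
    }

    /// Returns Ok(true) if the tenant was added, Ok(false) if it already
    /// existed, and Err if the lock is poisoned.
    fn add_tenant(&self, tenant_id: &str) -> Result<bool, String> {
        // The write guard is held across both the check and the push, so
        // no other writer can insert the same id in between.
        let mut ids = self.ids.write().map_err(|e| e.to_string())?;
        if ids.iter().any(|t| t == tenant_id) {
            return Ok(false);
        }
        ids.push(tenant_id.to_owned());
        Ok(true)
    }

    fn len(&self) -> usize {
        self.ids.read().map(|ids| ids.len()).unwrap_or(0)
    }
}
```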


1130-1158: Tenant deletion is still incomplete.
delete_tenant doesn’t remove the tenant from self.tenants, and storage prefixes are not deleted. This leaves state and data behind.


1160-1196: load_tenants still has incomplete logic and silent lock failure.
The empty else if !is_multi_tenant {} branch and Ok(None) on lock failure remain. Please handle these explicitly.


1198-1207: Avoid silent lock failure in list_tenants.
Returning None on lock failure masks poisoned lock errors.

src/metastore/metastores/object_store_metastore.rs (7)

344-374: Alert state paths ignore tenant_id (breaks isolation).
alert_state_json_path must incorporate tenant scope for reads/writes.

🛠️ Proposed fix (also update alert_state_json_path signature and all callers)
-        let path = alert_state_json_path(*alert_id);
+        let path = alert_state_json_path(*alert_id, tenant_id);
-        let path = alert_state_json_path(id);
+        let path = alert_state_json_path(id, tenant_id);

Also applies to: 363-412


575-599: get_chats isn’t tenant-aware, so non-default tenants are skipped.

🛠️ Proposed fix
-        let users_dir = RelativePathBuf::from(USERS_ROOT_DIR);
-        for user in self.storage.list_dirs_relative(&users_dir, &None).await? {
-            if user.starts_with(".") {
-                continue;
-            }
-            let mut chats = Vec::new();
-            let chats_path = users_dir.join(&user).join("chats");
-            let user_chats = self
-                .storage
-                .get_objects(
-                    Some(&chats_path),
-                    Box::new(|file_name| file_name.ends_with(".json")),
-                    &None,
-                )
-                .await?;
-            for chat in user_chats {
-                chats.push(chat);
-            }
-
-            all_user_chats.insert(user, chats);
-        }
+        let base_paths = PARSEABLE.list_tenants().map_or(vec!["".into()], |v| v);
+        for tenant in base_paths {
+            let tenant_id = &Some(tenant.clone());
+            let users_dir = RelativePathBuf::from_iter([&tenant, USERS_ROOT_DIR]);
+            for user in self.storage.list_dirs_relative(&users_dir, tenant_id).await? {
+                if user.starts_with(".") {
+                    continue;
+                }
+                let mut chats = Vec::new();
+                let chats_path = users_dir.join(&user).join("chats");
+                let user_chats = self
+                    .storage
+                    .get_objects(
+                        Some(&chats_path),
+                        Box::new(|file_name| file_name.ends_with(".json")),
+                        tenant_id,
+                    )
+                    .await?;
+                chats.extend(user_chats);
+                all_user_chats.insert(user, chats);
+            }
+        }

1203-1217: Node metadata scope is inconsistent with put/delete (global vs tenant).
Currently read is tenant-scoped while write/delete are global, so reads may miss data.

🛠️ Proposed fix (align reads to global scope)
         let metadata = self
             .storage
             .get_objects(
                 Some(&root_path),
                 Box::new(move |file_name| file_name.starts_with(&prefix_owned)), // Use the owned copy
-                tenant_id,
+                &None,
             )
             .await?

317-322: Default-tenant fallback should not use DEFAULT_TENANT here.
This makes tenantless alert states look under DEFAULT_TENANT/.alerts/ instead of /.alerts/.

🛠️ Proposed fix
-        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
-        let base_path = RelativePathBuf::from_iter([&tenant, ALERTS_ROOT_DIRECTORY]);
+        let base_path = if let Some(tenant) = tenant_id.as_ref() {
+            RelativePathBuf::from_iter([tenant, ALERTS_ROOT_DIRECTORY])
+        } else {
+            RelativePathBuf::from(ALERTS_ROOT_DIRECTORY)
+        };

515-540: Bug: dashboards overwrite per-user entries within a tenant.
Only the last user’s dashboards remain for each tenant.

🛠️ Proposed fix
         for mut tenant in base_paths {
             let tenant_id = &Some(tenant.clone());
             let users_dir = RelativePathBuf::from_iter([&tenant, USERS_ROOT_DIR]);
+            let mut tenant_dashboards = Vec::new();
             for user in self
                 .storage
                 .list_dirs_relative(&users_dir, tenant_id)
                 .await?
             {
                 let dashboards_path = users_dir.join(&user).join("dashboards");
                 let dashboard_bytes = self
                     .storage
                     .get_objects(
                         Some(&dashboards_path),
                         Box::new(|file_name| file_name.ends_with(".json")),
                         tenant_id,
                     )
                     .await?;
-                if tenant.eq(&mut "") {
-                    tenant.clone_from(&DEFAULT_TENANT.to_string());
-                }
-                dashboards.insert(tenant.to_owned(), dashboard_bytes);
+                tenant_dashboards.extend(dashboard_bytes);
             }
+            if tenant.eq(&mut "") {
+                tenant.clone_from(&DEFAULT_TENANT.to_string());
+            }
+            dashboards.insert(tenant, tenant_dashboards);
         }
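
The overwrite comes from calling `insert` once per user with the tenant as key. A standalone sketch of the accumulate-then-store behavior, using `Vec<u8>` as a stand-in for the dashboard bytes and a hypothetical function name:

```rust
use std::collections::HashMap;

/// Input: (tenant, user, dashboards of that user).
/// Output: all dashboards of a tenant merged under one key.
fn group_by_tenant(
    per_user: Vec<(String, String, Vec<Vec<u8>>)>,
) -> HashMap<String, Vec<Vec<u8>>> {
    let mut out: HashMap<String, Vec<Vec<u8>>> = HashMap::new();
    for (tenant, _user, dashboards) in per_user {
        // `entry().or_default().extend()` appends to the tenant's list
        // instead of replacing it, which a plain `insert` would do.
        out.entry(tenant).or_default().extend(dashboards);
    }
    out
}
```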

678-693: Avoid unwraps after migration to prevent panics on malformed data.

🛠️ Proposed fix
-                                let user_id = filter_value
-                                    .as_object()
-                                    .unwrap()
-                                    .get("user_id")
-                                    .and_then(|user_id| user_id.as_str());
-                                let filter_id = filter_value
-                                    .as_object()
-                                    .unwrap()
-                                    .get("filter_id")
-                                    .and_then(|filter_id| filter_id.as_str());
-                                let stream_name = filter_value
-                                    .as_object()
-                                    .unwrap()
-                                    .get("stream_name")
-                                    .and_then(|stream_name| stream_name.as_str());
+                                let migrated_obj = filter_value.as_object();
+                                let user_id = migrated_obj
+                                    .and_then(|o| o.get("user_id"))
+                                    .and_then(|v| v.as_str());
+                                let filter_id = migrated_obj
+                                    .and_then(|o| o.get("filter_id"))
+                                    .and_then(|v| v.as_str());
+                                let stream_name = migrated_obj
+                                    .and_then(|o| o.get("stream_name"))
+                                    .and_then(|v| v.as_str());

1303-1318: Tenant prefix can leak into stream names.
When tenant_id is set, the tenant segment is still included in path.parts().

🛠️ Proposed fix
-            let streams = resp
-                .common_prefixes
-                .iter()
-                .flat_map(|path| {
-                    path.parts()
-                })
-                .map(|name| name.as_ref().to_string())
+            let streams = resp
+                .common_prefixes
+                .iter()
+                .flat_map(|path| {
+                    let mut parts = path
+                        .parts()
+                        .map(|p| p.as_ref().to_string())
+                        .collect::<Vec<_>>();
+                    if tenant_id.is_some() && !parts.is_empty() {
+                        parts.remove(0);
+                    }
+                    parts
+                })
+                .filter(|name| !name.is_empty())
                 .filter(|name| {
                     name != PARSEABLE_ROOT_DIRECTORY
                         && name != USERS_ROOT_DIR
                         && name != SETTINGS_ROOT_DIRECTORY
                         && name != ALERTS_ROOT_DIRECTORY
                 })
                 .collect::<Vec<_>>();
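The prefix-stripping step in the fix above can be sketched in isolation. This is a minimal illustration using only the standard library; `stream_names` is a hypothetical helper, not a function from the Parseable codebase, and the extra check that the first segment actually equals the tenant is an added assumption that the proposed fix does not make.

```rust
/// Hypothetical helper: drop the leading tenant segment from the parts of a
/// listed prefix so that only stream names remain.
fn stream_names(parts: Vec<String>, tenant_id: &Option<String>) -> Vec<String> {
    let mut parts = parts;
    // Assumption: only strip when the first segment really is the tenant.
    // The proposed fix strips unconditionally whenever tenant_id is set.
    if let Some(tenant) = tenant_id {
        if parts.first() == Some(tenant) {
            parts.remove(0);
        }
    }
    // Discard empty segments, as the proposed fix does.
    parts.into_iter().filter(|name| !name.is_empty()).collect()
}
```

Called with parts `["acme", "web-logs"]` and tenant `acme`, the helper returns only `web-logs`; without a tenant the parts pass through unchanged.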
src/alerts/mod.rs (2)

1045-1050: Normalize DEFAULT_TENANT to None during load to avoid default-tenant path mismatches.
Otherwise default-tenant alerts can be stored under tenant-prefixed paths and re-migrated repeatedly.

🛠️ Proposed fix
-            let tenant = if tenant_id.is_empty() {
-                &None
-            } else {
-                &Some(tenant_id.clone())
-            };
+            let tenant = if tenant_id.is_empty() || tenant_id == DEFAULT_TENANT {
+                &None
+            } else {
+                &Some(tenant_id.clone())
+            };

Also applies to: 1097-1099
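The normalization rule can be expressed as a small pure function. The sketch below is illustrative only: `normalize_tenant` is a hypothetical name, and the value of `DEFAULT_TENANT` is a placeholder because the real constant is not shown in this review.

```rust
// Placeholder value; the real constant is defined in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Hypothetical helper: map both the empty string and the default tenant
/// name to `None`, so default-tenant data is never stored under a
/// tenant-prefixed path.
fn normalize_tenant(tenant_id: &str) -> Option<String> {
    if tenant_id.is_empty() || tenant_id == DEFAULT_TENANT {
        None
    } else {
        Some(tenant_id.to_owned())
    }
}
```

Returning an owned `Option<String>` also avoids taking references to temporaries such as `&None` and `&Some(...)`.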


1331-1337: Avoid dropping updates when tenant bucket is missing.
Both writebacks no-op if the tenant bucket doesn’t exist (first-time tenant/race), losing updates.

🛠️ Proposed fix
-            if let Some(alerts) = write_access.get_mut(tenant) {
-                alerts.insert(*alert.get_id(), alert.clone_box());
-            }
+            write_access
+                .entry(tenant.to_owned())
+                .or_default()
+                .insert(*alert.get_id(), alert.clone_box());
-        if let Some(alerts) = write_access.get_mut(tenant) {
-            alerts.insert(*alert.get_id(), alert.clone_box());
-        }
+        write_access
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(*alert.get_id(), alert.clone_box());

Also applies to: 1377-1379
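The difference between the two writeback styles can be reproduced with plain `HashMap`s. The sketch below is a simplified model: the `Alerts` alias, the `u32` ids, and the `String` payloads are stand-ins for the real alert types, which are not shown here.

```rust
use std::collections::HashMap;

// Simplified model: tenant -> (alert id -> alert payload).
type Alerts = HashMap<String, HashMap<u32, String>>;

/// Mirrors the original writeback: silently does nothing when the tenant
/// bucket is absent.
fn write_if_present(map: &mut Alerts, tenant: &str, id: u32, alert: &str) {
    if let Some(alerts) = map.get_mut(tenant) {
        alerts.insert(id, alert.to_owned());
    }
}

/// Mirrors the proposed fix: creates the tenant bucket on first write.
fn write_or_create(map: &mut Alerts, tenant: &str, id: u32, alert: &str) {
    map.entry(tenant.to_owned())
        .or_default()
        .insert(id, alert.to_owned());
}
```

With an empty map, `write_if_present` leaves the map empty while `write_or_create` stores the alert, which is the lost-update case the comment describes.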

Comment on lines +89 to 91
correlation.user_id = get_user_and_tenant_from_request(&req)
.map(|(s, _)| get_hash(&s.to_string()))
.map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;

⚠️ Potential issue | 🟡 Minor

Same issue: tenant context discarded during modification.

Similar to post, the modify handler discards tenant_id. Ensure consistency with how create and update handle tenant scoping.

🤖 Prompt for AI Agents
In `@src/handlers/http/correlation.rs` around lines 89 - 91, The current mapping
of get_user_and_tenant_from_request(&req) into only a hashed user discards
tenant context; modify the flow so the closure returns both hashed user and
tenant (e.g., map(|(user, tenant)| (get_hash(&user.to_string()),
tenant.clone())) or otherwise extract both values, then set correlation.user_id
and correlation.tenant_id accordingly, preserving the same error mapping to
CorrelationError::AnyhowError(Error::msg(err.to_string())). Use the same symbols
get_user_and_tenant_from_request, get_hash, correlation.user_id,
correlation.tenant_id, and CorrelationError::AnyhowError in the fix.

}
let req_tenant_id = &user.tenant;
let metadata = get_metadata(req_tenant_id).await?;
let _ = storage::put_staging_metadata(&metadata, req_tenant_id);

⚠️ Potential issue | 🟠 Major

Silently ignoring storage error.

The result of put_staging_metadata is discarded. If persistence fails, in-memory and storage states diverge. This pattern repeats at lines 86, 133, 192, and 210.

🐛 Proposed fix for all occurrences
-        let _ = storage::put_staging_metadata(&metadata, req_tenant_id);
+        storage::put_staging_metadata(&metadata, req_tenant_id)?;

Apply similar fix at lines 86, 133, 192, and 210.

🤖 Prompt for AI Agents
In `@src/handlers/http/modal/ingest/ingestor_rbac.rs` at line 60, The call to
storage::put_staging_metadata is currently ignored (let _ = ...), causing silent
failures and state divergence; replace these with proper error
propagation/handling by unwrapping the Result (e.g., use the ? operator) or
mapping the error into the handler's error type and returning it so persistence
failures bubble up; apply the same change for every invocation of
storage::put_staging_metadata in this file (the occurrences around the other
call sites) and ensure you log the error context before returning if your
handler pattern requires logging.


let _ = storage::put_staging_metadata(&metadata);
mut_roles().insert(name.clone(), privileges);
let _ = storage::put_staging_metadata(&metadata, req_tenant_id);

⚠️ Potential issue | 🟠 Major

Silently ignoring storage error.

The result of put_staging_metadata is being discarded with let _ = .... If metadata persistence fails, the in-memory state will be updated but storage will be inconsistent. Consider propagating this error.

🐛 Proposed fix
-    let _ = storage::put_staging_metadata(&metadata, req_tenant_id);
+    storage::put_staging_metadata(&metadata, req_tenant_id)?;
🤖 Prompt for AI Agents
In `@src/handlers/http/modal/ingest/ingestor_role.rs` at line 58, The call to
storage::put_staging_metadata(&metadata, req_tenant_id) is being ignored; change
it to propagate and handle failures instead of discarding them. Replace the `let
_ = ...` with a proper Result check (e.g., use the `?` operator or map_err to
convert into the handler's error type) and ensure the enclosing function (in
ingestor_role.rs) returns a compatible Result so storage errors bubble up;
alternatively perform the storage write before mutating any in-memory state or
roll back the in-memory update on error so state stays consistent.
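The "persist first, then mutate" ordering mentioned above can be sketched as follows. Everything here is a stand-in: `StorageError`, the `fail` flag, and `put_role` are hypothetical, and `put_staging_metadata` is a stub rather than the real `storage::put_staging_metadata`.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct StorageError(String);

/// Stub for the storage write; fails when `fail` is true.
fn put_staging_metadata(fail: bool) -> Result<(), StorageError> {
    if fail {
        Err(StorageError("disk full".into()))
    } else {
        Ok(())
    }
}

/// Hypothetical handler body: the storage write runs first and its error is
/// propagated with `?`, so the in-memory map is only touched on success.
fn put_role(
    roles: &mut HashMap<String, Vec<String>>,
    name: &str,
    privileges: Vec<String>,
    fail: bool,
) -> Result<(), StorageError> {
    put_staging_metadata(fail)?;
    roles.insert(name.to_owned(), privileges);
    Ok(())
}
```

Because the write happens before the insert, a storage failure leaves the role map unchanged and no rollback is needed.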

Comment on lines +72 to 75
pub async fn list_users(req: HttpRequest) -> impl Responder {
let tenant_id = get_tenant_id_from_request(&req);
web::Json(Users.collect_user::<User>(&tenant_id))
}

@coderabbitai coderabbitai bot Jan 19, 2026


⚠️ Potential issue | 🔴 Critical

Tenant derived from request header is spoofable — use authenticated session instead.

All handlers extract tenant_id using get_tenant_id_from_request(&req) which reads from a header. This allows any authenticated user to spoof their tenant context by setting the header, bypassing tenant isolation. The tenant should be derived from the authenticated user's session/credentials.

#!/bin/bash
# Check if there's a secure tenant extraction method available
rg -n "get_user.*tenant|tenant.*from.*session|extract.*tenant.*auth" src/ --type rust -A 3

Also applies to: 79-87, 91-108

🤖 Prompt for AI Agents
In `@src/handlers/http/rbac.rs` around lines 72 - 75, The handler list_users (and
the other handlers noted) currently calls get_tenant_id_from_request(&req) which
reads a spoofable header; change them to derive tenant_id from the authenticated
session/claims instead (e.g., fetch the authenticated user/claims injected by
the auth middleware such as get_authenticated_user / get_user_from_session or
req.extensions().get::<AuthClaims>()), then use that user's tenant field when
calling Users.collect_user::<User>(&tenant_id); remove reliance on request
headers and return an unauthorized/error if no authenticated session is present.

Contributor Author


the header is added in middleware so how is it spoofable?


@parmesant Great question. If the middleware always canonicalizes the tenant and overwrites any client-provided header before the handler runs, then the header value seen by handlers is effectively trusted and not spoofable by clients. The risk exists only if either:

  • the middleware “adds when missing” or forwards an existing tenant header without verifying it against the authenticated user, or
  • some routes bypass the middleware.

Two concrete hardening options (either is fine):

  1. Prefer session-derived tenant in handlers
  • Use the authenticated context instead of the header on the RBAC endpoints (Lines 72–75, 79–87, 91–108). Example for list_users:
    • Before:
      • let tenant_id = get_tenant_id_from_request(&req);
    • After:
      • let (_, tenant_id) = utils::get_user_and_tenant_from_request(&req)?; // derives from session/basic-auth
      • return Ok(web::Json(Users.collect_user::<User>(&tenant_id)));
    • This makes the trust chain explicit and avoids any dependency on inbound headers. You’ll need to change the signature to Result<impl Responder, RBACError> like other RBAC handlers already do.
  2. Make the middleware the single source of truth
  • In middleware:
    • Remove any incoming tenant header and set the canonical tenant derived from the authenticated user/basic auth:
      • req.headers_mut().remove("X-P-Tenant");
      • req.headers_mut().insert("X-P-Tenant", canonical_tenant.parse()?);
    • Alternatively (better), store it in req.extensions() and have get_tenant_id_from_request read from extensions first, falling back to header only on ingest/login paths that are intentionally unauthenticated.
  • This guarantees that even if a client or proxy sets the header, the server’s value wins.

If your middleware already strips/overwrites the header unconditionally for all authenticated routes (including these RBAC endpoints), then we’re aligned and this is secure. If it only injects when missing, I recommend switching to one of the two approaches above to eliminate any header-origin ambiguity.
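The "server's value wins" behavior can be modeled without a web framework. In this sketch a `HashMap` stands in for the request headers (real HTTP header names are case-insensitive, which this model ignores), `canonicalize_tenant_header` is a hypothetical name, and `session_tenant` represents whatever tenant the authentication layer resolved.

```rust
use std::collections::HashMap;

const TENANT_HEADER: &str = "X-P-Tenant";

/// Hypothetical middleware step: always discard the client-supplied tenant
/// header, then set it from the authenticated session if there is one.
fn canonicalize_tenant_header(
    headers: &mut HashMap<String, String>,
    session_tenant: Option<&str>,
) {
    // Remove unconditionally so a forged value can never reach a handler.
    headers.remove(TENANT_HEADER);
    if let Some(tenant) = session_tenant {
        headers.insert(TENANT_HEADER.to_owned(), tenant.to_owned());
    }
}
```

The key property is the unconditional removal: an "add when missing" variant would let a client-supplied value through.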

Comment on lines 36 to 38
pub async fn list_dashboards(req: HttpRequest) -> Result<impl Responder, DashboardError> {
let tenant_id = get_tenant_id_from_request(&req);
let query_map = web::Query::<HashMap<String, String>>::from_query(req.query_string())

⚠️ Potential issue | 🔴 Critical

Derive tenant from authenticated session, not the request header.
Header-based tenant IDs are forgeable and allow cross-tenant listing. Use the session-derived tenant instead.

🔒 Proposed fix
-    let tenant_id = get_tenant_id_from_request(&req);
+    let (_user_id, tenant_id) = get_user_and_tenant_from_request(&req)?;

Also applies to: 59-69

🤖 Prompt for AI Agents
In `@src/handlers/http/users/dashboards.rs` around lines 36 - 38, The code
currently derives tenant_id from request headers using
get_tenant_id_from_request; change it to derive tenant from the authenticated
session/token instead (e.g., call the session-based helper such as
get_tenant_id_from_session or extract tenant from the request's auth principal)
inside list_dashboards so the tenant is not taken from forgeable headers; update
the same pattern in the other handlers in this file (the functions around lines
59-69) to replace any use of get_tenant_id_from_request with the session-derived
tenant extraction and ensure downstream logic uses that value.

Comment on lines +321 to +334
// Get hot tier info
let hottier = self.get_hot_tier_info(&stream, tenant_id).await?;

-Ok(PrismDatasetResponse {
// Get counts
let counts = self.get_counts(&stream, tenant_id).await?;
tracing::warn!("got counts");
let res = PrismDatasetResponse {
stream,
info: info.info,
schema: info.schema,
stats: info.stats,
retention: info.retention,
counts,
-})
};

⚠️ Potential issue | 🟠 Major

hottier variable is fetched but never used.

The hot tier info is retrieved on line 322 but is not included in the PrismDatasetResponse construction. This appears to be incomplete implementation.

         let counts = self.get_counts(&stream, tenant_id).await?;
-        tracing::warn!("got counts");
         let res = PrismDatasetResponse {
             stream,
             info: info.info,
             schema: info.schema,
             stats: info.stats,
             retention: info.retention,
             counts,
+            hottier,  // Add after updating the struct
         };
-        tracing::warn!(prism_logstream_res=?res);
         Ok(res)
🤖 Prompt for AI Agents
In `@src/prism/logstream/mod.rs` around lines 321 - 334, The local variable
`hottier` returned by get_hot_tier_info is fetched but never used; update the
PrismDatasetResponse construction to include the hot-tier info by adding a
hottier (or hot_tier) field populated with the hottier value, i.e. set the
response's hot tier field to hottier in the PrismDatasetResponse initializer; if
PrismDatasetResponse currently lacks a matching field, add a suitably typed
field (e.g., hottier: HotTierInfo) to the PrismDatasetResponse definition so the
value can be stored and propagated.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (15)
src/storage/store_metadata.rs (1)

124-144: Tenant-specific staging metadata read is missing.
put_staging_metadata writes to a tenant subdirectory, but resolve_parseable_metadata still reads the global staging file. That can cause tenant-scoped staging state to be ignored and trigger incorrect env-change decisions. Consider reading from the same tenant-scoped path.

🛠️ Suggested fix
 pub async fn resolve_parseable_metadata(
     parseable_metadata: &Option<Bytes>,
     tenant_id: &Option<String>,
 ) -> Result<StorageMetadata, ObjectStorageError> {
-    let staging_metadata = get_staging_metadata()?;
+    let staging_metadata = get_staging_metadata(tenant_id)?;
     let remote_metadata = parseable_metadata
         .as_ref()
         .map(|meta| serde_json::from_slice(meta).expect("parseable config is valid json"));
     ...
 }
 
-pub fn get_staging_metadata() -> io::Result<Option<StorageMetadata>> {
-    let path = RelativePathBuf::from(PARSEABLE_METADATA_FILE_NAME)
-        .to_path(PARSEABLE.options.staging_dir());
+pub fn get_staging_metadata(tenant_id: &Option<String>) -> io::Result<Option<StorageMetadata>> {
+    let path = if let Some(tenant_id) = tenant_id.as_ref() {
+        PARSEABLE
+            .options
+            .staging_dir()
+            .join(tenant_id)
+            .join(PARSEABLE_METADATA_FILE_NAME)
+    } else {
+        PARSEABLE.options.staging_dir().join(PARSEABLE_METADATA_FILE_NAME)
+    };
     let bytes = match fs::read(path) {
         Ok(bytes) => bytes,
         Err(err) => match err.kind() {
             io::ErrorKind::NotFound => return Ok(None),
             _ => return Err(err),
         },
     };
src/rbac/user.rs (1)

328-344: Role validation is skipped when the tenant has no role map.

The validation logic at lines 338-342 checks whether each role exists in the tenant's role map, but it only adds a role to non_existent_roles when the tenant map exists and lacks that role. If roles().get(...) returns None (the tenant has no roles at all), no role is flagged as missing, so non-existent roles pass validation and can cause silent failures later.

Treat a missing tenant role map as "role does not exist":

 for role in &self.roles {
-    if let Some(tenant_roles) =
-        roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-        && !tenant_roles.contains_key(role)
-    {
+    let tenant_key = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+    let role_exists = roles()
+        .get(tenant_key)
+        .map_or(false, |tenant_roles| tenant_roles.contains_key(role));
+    if !role_exists {
         non_existent_roles.push(role.clone());
     }
 }
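The corrected check can be exercised on its own with standard collections. The sketch below is an assumption-laden model: `missing_roles` is a hypothetical function, the role store is reduced to a `HashMap` of `HashSet`s, and the `DEFAULT_TENANT` value is a placeholder.

```rust
use std::collections::{HashMap, HashSet};

// Placeholder value; the real constant is defined in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Hypothetical validation step: a role counts as missing when the tenant
/// has no role map at all or when the map does not contain the role.
fn missing_roles(
    all_roles: &HashMap<String, HashSet<String>>,
    tenant_id: &Option<String>,
    requested: &[String],
) -> Vec<String> {
    let tenant_key = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    requested
        .iter()
        .filter(|role| {
            // `map_or(false, ..)` makes an absent tenant map mean "not found".
            !all_roles
                .get(tenant_key)
                .map_or(false, |tenant_roles| tenant_roles.contains(*role))
        })
        .cloned()
        .collect()
}
```

For a tenant with no entry in the map, every requested role is reported as missing, which is the case the original condition let through.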
src/handlers/http/correlation.rs (1)

67-81: Tenant context discarded during correlation creation.

The post handler extracts (user_id, tenant_id) but discards the tenant with (s, _). The correlation should be associated with the tenant:

-    let user_id = get_user_and_tenant_from_request(&req)
-        .map(|(s, _)| get_hash(&s.to_string()))
+    let (user_id, tenant_id) = get_user_and_tenant_from_request(&req)
+        .map(|(s, t)| (get_hash(&s.to_string()), t))
         .map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;
     correlation.user_id = user_id;
+    correlation.tenant_id = tenant_id;

Without this, correlations created by tenanted users won't be properly scoped.

src/alerts/alerts_utils.rs (1)

129-154: auth_token parameter is accepted but not forwarded to send_query_request.

The function accepts auth_token: Option<String> but passes None to send_query_request on line 149, effectively ignoring the provided token:

-    let (result_value, _) = send_query_request(None, &query_request, tenant_id)
+    let auth_headers = auth_token.map(|token| {
+        let mut map = reqwest::header::HeaderMap::new();
+        map.insert(
+            reqwest::header::AUTHORIZATION,
+            reqwest::header::HeaderValue::from_str(&token).unwrap(),
+        );
+        map
+    });
+    let (result_value, _) = send_query_request(auth_headers, &query_request, tenant_id)

Based on the relevant code snippet at src/handlers/http/cluster/mod.rs:1876-1960, send_query_request expects auth_token: Option<HeaderMap>, so the Option<String> needs conversion.

src/otel/traces.rs (1)

922-922: Add missing tenant_id parameter to test call.

Line 922 calls flatten_otel_traces(&traces_data) but the function signature requires two parameters: flatten_otel_traces(message: &TracesData, tenant_id: &str). The test is missing the tenant_id argument.

Required fix
-        let result = flatten_otel_traces(&traces_data);
+        let result = flatten_otel_traces(&traces_data, "test-tenant");
src/handlers/http/users/filters.rs (1)

38-43: Inconsistent tenant handling in list handler.

The list function still uses extract_session_key_from_req and does not extract or pass tenant_id, while all other handlers (get, post, update, delete) have been updated to use get_user_and_tenant_from_request and pass tenant_id to the underlying methods.

This inconsistency means list_filters may return filters from all tenants rather than only the current tenant's filters.

🐛 Proposed fix
 pub async fn list(req: HttpRequest) -> Result<impl Responder, FiltersError> {
-    let key =
-        extract_session_key_from_req(&req).map_err(|e| FiltersError::Custom(e.to_string()))?;
-    let filters = FILTERS.list_filters(&key).await;
+    let (_user_id, tenant_id) = get_user_and_tenant_from_request(&req)?;
+    let key =
+        extract_session_key_from_req(&req).map_err(|e| FiltersError::Custom(e.to_string()))?;
+    let filters = FILTERS.list_filters(&key, &tenant_id).await;
     Ok((web::Json(filters), StatusCode::OK))
 }
src/hottier.rs (1)

208-220: delete_hot_tier path construction ignores tenant_id.

At line 216, the path is constructed as self.hot_tier_path.join(stream) without incorporating tenant_id. This will delete the wrong directory for tenant-scoped streams where the hot tier path should include the tenant prefix.

🐛 Proposed fix
 pub async fn delete_hot_tier(
     &self,
     stream: &str,
     tenant_id: &Option<String>,
 ) -> Result<(), HotTierError> {
     if !self.check_stream_hot_tier_exists(stream, tenant_id) {
         return Err(HotTierValidationError::NotFound(stream.to_owned()).into());
     }
-    let path = self.hot_tier_path.join(stream);
+    let path = if let Some(tenant) = tenant_id.as_ref() {
+        self.hot_tier_path.join(tenant).join(stream)
+    } else {
+        self.hot_tier_path.join(stream)
+    };
     fs::remove_dir_all(path).await?;

     Ok(())
 }
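The tenant-aware path construction can be checked independently of the hot tier manager. `hot_tier_stream_path` below is a hypothetical free function, and the `root/tenant/stream` layout is taken from the proposed fix rather than confirmed against the rest of the codebase.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: build the hot tier directory for a stream, placing
/// it under the tenant directory when a tenant is set.
fn hot_tier_stream_path(
    hot_tier_root: &Path,
    stream: &str,
    tenant_id: &Option<String>,
) -> PathBuf {
    match tenant_id {
        Some(tenant) => hot_tier_root.join(tenant).join(stream),
        None => hot_tier_root.join(stream),
    }
}
```

Centralizing the join in one helper would keep the existence check and the delete pointed at the same directory.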
src/parseable/streams.rs (2)

118-126: Update Stream::new call sites for the new tenant_id parameter.

Lines 118-126 add tenant_id, but tests in this file (e.g., lines 1215, 1234, 1251) still call Stream::new with 4 args, which will not compile. Update all call sites to pass &None (or &Some(...)) as appropriate.

✅ Example fix (apply across all call sites)
-        let staging = Stream::new(
-            options.clone(),
-            stream_name,
-            LogStreamMetadata::default(),
-            None,
-        );
+        let staging = Stream::new(
+            options.clone(),
+            stream_name,
+            LogStreamMetadata::default(),
+            None,
+            &None,
+        );

587-621: Update test code at lines 1386-1393 to include tenant labels matching the metric definitions.

The test test_convert_to_parquet_with_empty_staging calls with_label_values() with mismatched label counts:

  • STAGING_FILES requires ["stream", "tenant_id"] (2 labels) but receives only [&stream] (1 label)
  • STORAGE_SIZE requires ["type", "stream", "format", "tenant_id"] (4 labels) but receives only 3

Replace these calls to include DEFAULT_TENANT:

let staging_files = metrics::STAGING_FILES.with_label_values(&[&stream, DEFAULT_TENANT]).get();
let storage_size_arrows = metrics::STORAGE_SIZE.with_label_values(&["staging", &stream, "arrows", DEFAULT_TENANT]).get();
let storage_size_parquet = metrics::STORAGE_SIZE.with_label_values(&["staging", &stream, "parquet", DEFAULT_TENANT]).get();
src/storage/gcs.rs (1)

180-205: Don’t increment object-store metrics on failed operations.

Lines 185-187 increment GET metrics before checking the result, so failed GETs are counted. Move the increments into the success branch and apply the same pattern to _put_object, _upload_file, _delete_prefix, etc., to keep billing metrics accurate.

🐛 Proposed fix (pattern for `_get_object`)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
-        match resp {
+        let resp = self.client.get(&to_object_store_path(path)).await;
+        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v);
+        match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
                 increment_files_scanned_in_object_store_calls_by_date(
                     "GET",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
Based on learnings, metrics should only be recorded after successful operations.
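The "count only successes" pattern can be illustrated with a toy counter. Nothing here uses the real metrics functions: `CallCounter` and `get_object` are hypothetical, and the already-resolved `Result` argument stands in for the awaited object-store response.

```rust
use std::collections::HashMap;

/// Toy per-(method, tenant) call counter standing in for the real metrics.
#[derive(Default)]
struct CallCounter {
    calls: HashMap<(String, String), u64>,
}

impl CallCounter {
    fn increment(&mut self, method: &str, tenant: &str) {
        *self
            .calls
            .entry((method.to_owned(), tenant.to_owned()))
            .or_insert(0) += 1;
    }

    fn get(&self, method: &str, tenant: &str) -> u64 {
        self.calls
            .get(&(method.to_owned(), tenant.to_owned()))
            .copied()
            .unwrap_or(0)
    }
}

/// Hypothetical GET wrapper: the error is propagated before the counter is
/// touched, so failed calls are never counted.
fn get_object(
    result: Result<Vec<u8>, String>,
    tenant: &str,
    counter: &mut CallCounter,
) -> Result<Vec<u8>, String> {
    let body = result?;
    counter.increment("GET", tenant);
    Ok(body)
}
```

Placing the increment after the `?` is equivalent to moving it into the `Ok` branch of a `match`, as the proposed fix does.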
src/storage/azure_blob.rs (1)

215-266: Record object-store metrics only after successful responses.
Several methods increment metrics before the call result is known, so failed operations are counted. Move increments into the success paths (or after await?) for _get_object, _put_object, _delete_prefix (LIST/DELETE), _list_dates, _upload_file, the small-file branch in _upload_multipart, head, get_ingestor_meta_file_paths, delete_object, check, try_delete_node_meta, list_old_streams HEAD tasks, list_dirs, and list_dirs_relative.

🐛 Example fix (apply same pattern to other methods)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
+        let resp = self.client.get(&to_object_store_path(path)).await;
 
         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;

Based on learnings, metrics should only be captured after successful operations.

Also applies to: 268-319, 321-381, 383-427, 499-517, 581-613, 639-717, 726-770, 891-954

src/migration/mod.rs (1)

60-61: Read staging metadata from the tenant directory when tenant_id is set.
run_metadata_migration receives tenant_id, but get_staging_metadata always reads the global staging file. This skips per-tenant staging metadata during migration.

🔧 Proposed fix
-    let staging_metadata = get_staging_metadata(config)?;
+    let staging_metadata = get_staging_metadata(config, tenant_id)?;
-pub fn get_staging_metadata(config: &Parseable) -> anyhow::Result<Option<serde_json::Value>> {
-    let path =
-        RelativePathBuf::from(PARSEABLE_METADATA_FILE_NAME).to_path(config.options.staging_dir());
+pub fn get_staging_metadata(
+    config: &Parseable,
+    tenant_id: &Option<String>,
+) -> anyhow::Result<Option<serde_json::Value>> {
+    let path = if let Some(tenant) = tenant_id.as_ref() {
+        config
+            .options
+            .staging_dir()
+            .join(tenant)
+            .join(PARSEABLE_METADATA_FILE_NAME)
+    } else {
+        config
+            .options
+            .staging_dir()
+            .join(PARSEABLE_METADATA_FILE_NAME)
+    };

Also applies to: 467-477

src/metastore/metastores/object_store_metastore.rs (3)

344-361: Inconsistent tenant handling: alert_state_json_path ignores tenant_id.

The alert_state_json_path(*alert_id) call at line 349 constructs the path without considering tenant_id, but the get_object call at line 350 passes tenant_id to storage. This creates a mismatch—the path lacks the tenant prefix while storage operations expect tenant-scoped access. The same issue exists in put_alert_state at line 373.

Compare with alert_json_path(id, tenant_id) at line 295 which correctly accepts tenant_id.

#!/bin/bash
# Verify alert_state_json_path signature and check if it should accept tenant_id
rg -n "fn alert_state_json_path" --type rust -A 3

918-940: Missing tenant prefix in date_path construction.

Line 919 constructs date_path using only stream_name without the tenant prefix:

let date_path = object_store::path::Path::from(format!("{}/{}", stream_name, &date));

When tenant_id is Some, this path should include the tenant prefix to match the structure established in lines 903-908. Currently, manifest retrieval will fail for tenant-scoped streams because it looks in the wrong location.

🐛 Suggested fix
         for date in dates {
-            let date_path = object_store::path::Path::from(format!("{}/{}", stream_name, &date));
+            let date_path = object_store::path::Path::from(format!("{}/{}", &root, &date));
             let resp = self.storage.list_with_delimiter(Some(date_path)).await?;

576-602: get_chats is inconsistent with multi-tenant patterns used by other retrieval methods.

This method always uses hardcoded &None for storage operations (lines 580, 591), returning chats only from the non-tenant root path. However, put_chat and delete_chat both accept tenant_id parameters and pass them to storage. More critically, other similar retrieval methods like get_keystones, get_conversations, and get_alerts iterate over all tenants via PARSEABLE.list_tenants() and pass the tenant ID explicitly. As a result, chats stored by put_chat under a tenant prefix are invisible to get_chats, so they appear to be lost in multi-tenant deployments.

Update get_chats to follow the same pattern: iterate over tenants and pass each tenant ID to storage operations.
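The visibility gap can be shown with a simplified in-memory store. This is a sketch under several assumptions: the `Store` alias, both function names, and the choice to return root chats followed by tenant chats are illustrative, and the real return type of get_chats is not shown in this review.

```rust
use std::collections::HashMap;

// Simplified store: `None` is the non-tenant root, `Some(t)` a tenant prefix.
type Store = HashMap<Option<String>, Vec<String>>;

/// Mirrors the current behavior: only the root path is read.
fn get_chats_root_only(store: &Store) -> Vec<String> {
    store.get(&None).cloned().unwrap_or_default()
}

/// Sketch of the suggested pattern: read the root, then every listed tenant.
fn get_chats_all_tenants(store: &Store, tenants: &[String]) -> Vec<String> {
    let mut chats = store.get(&None).cloned().unwrap_or_default();
    for tenant in tenants {
        if let Some(tenant_chats) = store.get(&Some(tenant.clone())) {
            chats.extend(tenant_chats.iter().cloned());
        }
    }
    chats
}
```

With one chat stored under the root and one under a tenant, the root-only version returns a single chat while the tenant-aware version returns both.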

🤖 Fix all issues with AI agents
In `@src/handlers/http/middleware.rs`:
- Around line 424-435: check_suspension currently treats any error from
TENANT_METADATA.is_action_suspended the same as Ok(None) and returns
rbac::Response::Authorized for missing tenants; update the call to
TENANT_METADATA.is_action_suspended(tenant, &action) to match on its Result and
handle all cases explicitly: return rbac::Response::Suspended(suspension) for
Ok(Some(suspension)), do nothing for Ok(None), and return the appropriate error
response (e.g., rbac::Response::UnAuthorized or the project-standard
tenant-not-found response) for Err(...) so non-existent tenants are not treated
as Authorized.

In `@src/handlers/http/modal/ingest/ingestor_role.rs`:
- Around line 59-66: The tenant used to update mut_roles() is derived from
req_tenant_id, while metadata operations use sync_req.tenant_id; make them
consistent by using sync_req.tenant_id as the tenant source for the mut_roles()
update (i.e., derive tenant_id from sync_req.tenant_id with the same fallback to
DEFAULT_TENANT) and then call
mut_roles().entry(tenant_id).or_default().insert(name.clone(),
sync_req.privileges) so session refresh and metadata operations target the same
tenant.

In `@src/handlers/http/modal/query/querier_rbac.rs`:
- Around line 260-266: In remove_roles_from_user, replace the combined `if let
Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
&& tenant_roles.get(r).is_none()` pattern with an explicit nested check: first
`if let Some(tenant_roles) =
roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)) { if
tenant_roles.get(r).is_none() { non_existent_roles.push(r.clone()); } }` (or
equivalently check presence with `match`/`map_or`) so `roles()` lookup is
performed correctly and non_existent_roles is populated as intended for each `r`
in `roles_to_remove`.
- Around line 198-204: The current check around roles_to_add uses if let
Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
&& tenant_roles.get(r).is_none() which lets roles pass when the tenant is
missing; change the logic so missing tenant counts as non-existent role:
explicitly attempt to lookup tenant_roles from roles() for the tenant (using
tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)) and if the lookup yields None or
tenant_roles.get(r).is_none() then push r.clone() into non_existent_roles;
update the block around roles_to_add iteration (same area as in post_user) to
reflect this explicit both-missing-or-missing-role condition.
- Around line 62-70: The role existence check currently skips roles when
roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)) returns None, so
roles for a non-existent tenant are treated as present; update the loop that
builds non_existent_roles to treat a missing tenant as "no roles" and therefore
mark the role missing: use the result of roles().get(...) and if it is None OR
the returned map does not contain the role, push role.clone() into
non_existent_roles (i.e., replace the current if let Some(...) &&
!contains_key(...) guard with a check that handles None explicitly or use map_or
to evaluate to true when tenant is missing); keep
RBACError::RolesDoNotExist(non_existent_roles) behavior unchanged.

In `@src/storage/localfs.rs`:
- Around line 493-498: The metrics call is using a hardcoded tenant string
"default" in list_streams and list_old_streams; update those calls to use the
configured default tenant constant or derive tenant_id from context (e.g., use
DEFAULT_TENANT or the function's tenant parameter) instead of the literal
"default". Locate the increment_object_store_calls_by_date invocations in
list_streams and list_old_streams and replace the third argument with the
appropriate tenant variable/constant so metrics remain tenant-aware and
consistent with other code paths.

In `@src/tenants/mod.rs`:
- Around line 78-97: The is_action_suspended function should treat a missing
tenant as "action not suspended" instead of returning Err; in the tenant lookup
else branch replace the Err(TenantNotFound(...)) return with Ok(None) so the
function returns Ok(None) for non-existent tenants. Update the else branch in
is_action_suspended (in the tenants module) to return Ok(None) while leaving the
signature Result<Option<String>, TenantNotFound> unchanged so callers (e.g., the
middleware) no longer need to handle TenantNotFound.

Comment on lines +424 to +435
pub fn check_suspension(req: &HttpRequest, action: Action) -> rbac::Response {
    if let Some(tenant) = req.headers().get("tenant")
        && let Ok(tenant) = tenant.to_str()
    {
        if let Ok(Some(suspension)) = TENANT_METADATA.is_action_suspended(tenant, &action) {
            return rbac::Response::Suspended(suspension);
        } else {
            // tenant does not exist
        }
    }
    rbac::Response::Authorized
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how TENANT_METADATA.is_action_suspended handles unknown tenants
rg -n "is_action_suspended" --type rust -A 10 -B 2

Repository: parseablehq/parseable

Length of output: 1857


🏁 Script executed:

#!/bin/bash
# Get the complete is_action_suspended implementation
rg -n "is_action_suspended" --type rust -A 20 src/tenants/mod.rs

Repository: parseablehq/parseable

Length of output: 995


🏁 Script executed:

#!/bin/bash
# Check tests or usage patterns for check_suspension
rg -n "check_suspension" --type rust -B 3 -A 3

Repository: parseablehq/parseable

Length of output: 2073


🏁 Script executed:

#!/bin/bash
# Check what happens in the calling functions when check_suspension returns Authorized
rg -n "auth_no_context|auth_with_context" --type rust -A 15 | head -60

Repository: parseablehq/parseable

Length of output: 2186


🏁 Script executed:

#!/bin/bash
# Look for tests related to check_suspension or TenantNotFound
rg -n "TenantNotFound|check_suspension" --type rust -B 2 -A 5 | grep -A 10 "test\|#\[test"

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

#!/bin/bash
# Look for documentation or comments about tenant handling
rg -n "tenant" --type rust src/handlers/http/middleware.rs | head -30

Repository: parseablehq/parseable

Length of output: 2062


🏁 Script executed:

#!/bin/bash
# Check how non-existent tenants are handled in authorization checks
rg -n "TenantNotFound" --type rust -B 5 -A 5

Repository: parseablehq/parseable

Length of output: 9314


🏁 Script executed:

#!/bin/bash
# Check if there are any error handling mechanisms in place after check_suspension
rg -n "Users.authorize" --type rust -B 5 -A 10 | head -80

Repository: parseablehq/parseable

Length of output: 5172


🏁 Script executed:

#!/bin/bash
# Check if there's multi-tenant vs single-tenant mode handling relevant to this
rg -n "is_multi_tenant" --type rust -B 2 -A 2 | head -40

Repository: parseablehq/parseable

Length of output: 2509


Handle missing tenants explicitly in check_suspension.

The is_action_suspended method returns Err(TenantNotFound) when a tenant doesn't exist (line 95 in src/tenants/mod.rs), but check_suspension silently treats this error the same as Ok(None) via the else branch at lines 430-432, allowing the function to return Authorized for non-existent tenants. This bypasses tenant-level suspension checks.

Either return an error response for non-existent tenants (consistent with how TenantNotFound is handled elsewhere in the codebase), or explicitly check for the error case:

match TENANT_METADATA.is_action_suspended(tenant, &action) {
    Ok(Some(suspension)) => return rbac::Response::Suspended(suspension),
    Ok(None) => {},
    Err(_) => return rbac::Response::UnAuthorized, // or appropriate response
}
🤖 Prompt for AI Agents
In `@src/handlers/http/middleware.rs` around lines 424 - 435, check_suspension
currently treats any error from TENANT_METADATA.is_action_suspended the same as
Ok(None) and returns rbac::Response::Authorized for missing tenants; update the
call to TENANT_METADATA.is_action_suspended(tenant, &action) to match on its
Result and handle all cases explicitly: return
rbac::Response::Suspended(suspension) for Ok(Some(suspension)), do nothing for
Ok(None), and return the appropriate error response (e.g.,
rbac::Response::UnAuthorized or the project-standard tenant-not-found response)
for Err(...) so non-existent tenants are not treated as Authorized.
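
The three-way handling described above can be sketched in isolation. This is a minimal illustration, not the project's code: `Response`, `TenantNotFound`, and `TenantMetadata` below are hypothetical stand-ins for `rbac::Response`, the real error type, and `TENANT_METADATA`, and mapping the error case to `UnAuthorized` is only one of the options the comment mentions.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the project's `rbac::Response`.
#[derive(Debug, PartialEq)]
enum Response {
    Authorized,
    UnAuthorized,
    Suspended(String),
}

/// Hypothetical stand-in for the tenant-not-found error.
#[derive(Debug, PartialEq)]
struct TenantNotFound(String);

/// Hypothetical registry: tenant id -> (suspended action -> suspension message).
struct TenantMetadata {
    tenants: HashMap<String, HashMap<String, String>>,
}

impl TenantMetadata {
    /// Errors for unknown tenants, mirroring the behaviour described in the review.
    fn is_action_suspended(
        &self,
        tenant: &str,
        action: &str,
    ) -> Result<Option<String>, TenantNotFound> {
        let suspensions = self
            .tenants
            .get(tenant)
            .ok_or_else(|| TenantNotFound(tenant.to_owned()))?;
        Ok(suspensions.get(action).cloned())
    }
}

/// Every arm of the `Result<Option<_>, _>` is handled explicitly,
/// so an unknown tenant can never fall through to `Authorized`.
fn check_suspension(meta: &TenantMetadata, tenant: Option<&str>, action: &str) -> Response {
    let Some(tenant) = tenant else {
        // No tenant header: nothing to check at the tenant level.
        return Response::Authorized;
    };
    match meta.is_action_suspended(tenant, action) {
        Ok(Some(message)) => Response::Suspended(message),
        Ok(None) => Response::Authorized,
        Err(_) => Response::UnAuthorized,
    }
}
```

Using `match` instead of `if let ... else` makes the compiler enforce that the error case is considered, which is the point of the suggestion.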

Comment on lines +59 to +66
    let tenant_id = req_tenant_id
        .as_ref()
        .map_or(DEFAULT_TENANT, |v| v)
        .to_owned();
    mut_roles()
        .entry(tenant_id.clone())
        .or_default()
        .insert(name.clone(), sync_req.privileges);

⚠️ Potential issue | 🟠 Major

Inconsistent tenant_id usage.

tenant_id is derived from req_tenant_id (lines 59-62), but sync_req.tenant_id is used for metadata operations (lines 53, 58). This inconsistency could cause role updates to target the wrong tenant in session refresh logic.

🐛 Proposed fix - use consistent tenant source
-    let tenant_id = req_tenant_id
-        .as_ref()
-        .map_or(DEFAULT_TENANT, |v| v)
-        .to_owned();
+    let tenant_id = sync_req.tenant_id
+        .as_ref()
+        .map_or(DEFAULT_TENANT, |v| v.as_str())
+        .to_owned();
     mut_roles()
         .entry(tenant_id.clone())
🤖 Prompt for AI Agents
In `@src/handlers/http/modal/ingest/ingestor_role.rs` around lines 59 - 66, The
tenant used to update mut_roles() is derived from req_tenant_id, while metadata
operations use sync_req.tenant_id; make them consistent by using
sync_req.tenant_id as the tenant source for the mut_roles() update (i.e., derive
tenant_id from sync_req.tenant_id with the same fallback to DEFAULT_TENANT) and
then call mut_roles().entry(tenant_id).or_default().insert(name.clone(),
sync_req.privileges) so session refresh and metadata operations target the same
tenant.

Comment on lines +62 to 70
        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
            && !tenant_roles.contains_key(role)
        {
            non_existent_roles.push(role.clone());
        }
    }
    if !non_existent_roles.is_empty() {
        return Err(RBACError::RolesDoNotExist(non_existent_roles));
    }

⚠️ Potential issue | 🟠 Major

Role existence check may silently pass for non-existent tenants.

When roles().get(...) returns None (tenant not in roles map), the if let Some(...) guard fails and the loop continues without adding the role to non_existent_roles. This means roles specified for a non-existent tenant silently pass validation.

🐛 Proposed fix
     let mut non_existent_roles = Vec::new();
     for role in &user_roles {
-        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            && !tenant_roles.contains_key(role)
-        {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let role_exists = roles()
+            .get(tenant)
+            .is_some_and(|tenant_roles| tenant_roles.contains_key(role));
+        if !role_exists {
             non_existent_roles.push(role.clone());
         }
     }
🤖 Prompt for AI Agents
In `@src/handlers/http/modal/query/querier_rbac.rs` around lines 62 - 70, The role
existence check currently skips roles when
roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)) returns None, so
roles for a non-existent tenant are treated as present; update the loop that
builds non_existent_roles to treat a missing tenant as "no roles" and therefore
mark the role missing: use the result of roles().get(...) and if it is None OR
the returned map does not contain the role, push role.clone() into
non_existent_roles (i.e., replace the current if let Some(...) &&
!contains_key(...) guard with a check that handles None explicitly or use map_or
to evaluate to true when tenant is missing); keep
RBACError::RolesDoNotExist(non_existent_roles) behavior unchanged.
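
The "missing tenant means missing role" rule can be checked on its own with plain standard-library maps. The sketch below is an assumption-laden illustration: the `Roles` alias, the `missing_roles` helper, and the placeholder value of `DEFAULT_TENANT` are hypothetical and only model the shape of the per-tenant role map.

```rust
use std::collections::{HashMap, HashSet};

// Placeholder value; the real constant lives in the project.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Hypothetical shape of the role registry: tenant id -> set of role names.
type Roles = HashMap<String, HashSet<String>>;

/// Returns every requested role that does not exist for the tenant.
/// A tenant that is absent from the map is treated as having no roles,
/// so all of its requested roles are reported as missing.
fn missing_roles(roles: &Roles, tenant_id: Option<&str>, requested: &[String]) -> Vec<String> {
    let tenant = tenant_id.unwrap_or(DEFAULT_TENANT);
    requested
        .iter()
        .filter(|role| {
            let exists = roles
                .get(tenant)
                .is_some_and(|tenant_roles| tenant_roles.contains(*role));
            !exists
        })
        .cloned()
        .collect()
}
```

`Option::is_some_and` evaluates to `false` when the tenant lookup returns `None`, which is what flips the original behaviour from "silently pass" to "report as missing".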

@parmesant force-pushed the multi-tenancy branch 2 times, most recently from 256c50f to 9914d5c (January 20, 2026 08:21)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (12)
src/storage/field_stats.rs (1)

93-109: Guard tenant_id against empty strings before using it as the DataFusion default schema.

The current pattern tenant_id.as_ref().map_or("public".into(), |v| v.to_owned()) handles None but allows empty strings through (e.g., Some("") would set default_schema to an empty string). This could cause schema validation failures at runtime. Normalize the value to fall back to "public" when the tenant_id is None or empty.

🔧 Suggested fix
-        session_state
-            .config_mut()
-            .options_mut()
-            .catalog
-            .default_schema = tenant_id.as_ref().map_or("public".into(), |v| v.to_owned());
+        let default_schema = tenant_id
+            .as_deref()
+            .map(|s| s.trim())
+            .filter(|s| !s.is_empty())
+            .unwrap_or("public");
+        session_state
+            .config_mut()
+            .options_mut()
+            .catalog
+            .default_schema = default_schema.to_string();
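
The normalization in the suggested fix can be pulled into a small pure function and tested without DataFusion. This is a sketch only; the helper name `default_schema_for` is hypothetical, and the `"public"` fallback is taken from the comment above.

```rust
/// Picks the schema name for a tenant, falling back to "public"
/// when the tenant id is absent, empty, or only whitespace.
fn default_schema_for(tenant_id: &Option<String>) -> String {
    tenant_id
        .as_deref()
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .unwrap_or("public")
        .to_string()
}
```

Keeping the fallback in one helper would also avoid repeating the `map_or("public".into(), ...)` pattern at each place a session state is configured.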
src/handlers/http/correlation.rs (1)

67-81: Tenant context discarded during correlation creation.

The post handler extracts both user and tenant from get_user_and_tenant_from_request but only uses the user component (line 74: map(|(s, _)| ...)). The tenant context is discarded and not assigned to correlation.tenant_id, which breaks tenant isolation for newly created correlations.

🔒 Proposed fix
-    let user_id = get_user_and_tenant_from_request(&req)
-        .map(|(s, _)| get_hash(&s.to_string()))
+    let (user_id, tenant_id) = get_user_and_tenant_from_request(&req)
+        .map(|(s, t)| (get_hash(&s.to_string()), t))
         .map_err(|err| CorrelationError::AnyhowError(Error::msg(err.to_string())))?;
     correlation.user_id = user_id;
+    correlation.tenant_id = tenant_id;
src/parseable/streams.rs (2)

118-127: Update all Stream::new call sites to pass the tenant_id argument.

Stream::new requires five parameters: options, stream_name, metadata, ingestor_id, and tenant_id. However, multiple call sites throughout the file pass only four arguments, omitting tenant_id, which will cause compilation errors. Update every call site to pass &None (or a real tenant value) as the fifth argument. Affected locations include test functions and staging stream creation calls (at least 11 call sites in this file).


585-620: Fix missing tenant_id labels in test code at line 1384.

Test assertions at lines 1384–1394 are missing the tenant_id label parameter in all metric calls:

  • Line 1384: STAGING_FILES.with_label_values(&[&stream]) should be &[&stream, &DEFAULT_TENANT] (or actual tenant_id)
  • Lines 1388–1389: STORAGE_SIZE.with_label_values(&["staging", &stream, "arrows"]) should be &["staging", &stream, "arrows", &DEFAULT_TENANT]
  • Lines 1391–1392: STORAGE_SIZE.with_label_values(&["staging", &stream, "parquet"]) should be &["staging", &stream, "parquet", &DEFAULT_TENANT]

This will cause a Prometheus label cardinality mismatch panic when tests run, since the metric definitions expect all labels.

src/prism/logstream/mod.rs (1)

293-299: Use tenant_id when resolving stream type.
Using get_stream(&stream) bypasses tenant scoping and can leak data across tenants.

🔐 Proposed fix
-        let is_internal = PARSEABLE.get_stream(&stream).is_ok_and(|stream| {
+        let is_internal = PARSEABLE
+            .get_stream(&stream, tenant_id)
+            .is_ok_and(|stream| {
             stream
                 .get_stream_type()
                 .eq(&crate::storage::StreamType::Internal)
         });
src/alerts/alert_structs.rs (1)

292-375: Block tenant_id injection via other_fields.
AlertConfig now has a tenant_id field, but RESERVED_FIELDS doesn’t include tenant_id/tenantId. A client can place these keys in other_fields, which may override or conflict when flattened/serialized. Add both variants to the reserved list and sanitize them.

🔒 Proposed fix
 const RESERVED_FIELDS: &[&str] = &[
     "version",
     "id",
+    "tenantId",
+    "tenant_id",
     "severity",
     "title",
     "query",
     "datasets",
src/handlers/http/mod.rs (1)

104-137: Forward tenant header on intra-cluster queries.
send_query_request_to_ingestor now receives tenant_id but doesn’t propagate it in the HTTP request. If this is used for multi-tenant queries, the ingestor will default to the wrong tenant. Add the tenant header when present.

✅ Proposed fix
-        let reqw = INTRA_CLUSTER_CLIENT
-            .post(uri)
-            .json(query)
-            .header(http::header::AUTHORIZATION, im.token.clone())
-            .header(http::header::CONTENT_TYPE, "application/json")
-            .send()
-            .await;
+        let mut req = INTRA_CLUSTER_CLIENT
+            .post(uri)
+            .json(query)
+            .header(http::header::AUTHORIZATION, im.token.clone())
+            .header(http::header::CONTENT_TYPE, "application/json");
+
+        if let Some(tenant) = tenant_id.as_ref() {
+            req = req.header("tenant", tenant);
+        }
+
+        let reqw = req.send().await;
src/handlers/http/targets.rs (1)

26-105: Tenant should come from authenticated context, not a raw header.

These handlers trust the tenant header directly, which is user-controllable and can lead to cross-tenant access unless middleware overwrites it. Prefer a session-derived tenant (or ensure middleware always overrides the header before handlers run).

src/storage/azure_blob.rs (1)

214-266: Record object-store metrics only after successful operations.

Several methods increment metrics before confirming success (e.g., _get_object, _put_object, _delete_prefix list/delete, _list_dates, _upload_file, head, delete_object, check, list_dirs*). This counts failed attempts and skews billing/usage metrics. Consider moving increments into the Ok paths (or after ?) so only successful operations are recorded.

🛠️ Example fix (apply similarly across methods)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
+        let resp = self.client.get(&to_object_store_path(path)).await;

         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
                 increment_files_scanned_in_object_store_calls_by_date(
                     "GET",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );

Based on learnings, metrics should only be recorded after successful operations.

Also applies to: 268-318, 321-381, 499-517, 639-680, 891-913
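
The ordering issue is easier to see in a stripped-down model. The sketch below assumes a hypothetical `CallCounter` in place of the real Prometheus-backed helpers and a closure in place of the object-store client; it only demonstrates incrementing after the fallible call has succeeded.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical per-tenant call counter standing in for the metrics helpers.
#[derive(Default)]
struct CallCounter {
    by_tenant: RefCell<HashMap<String, u64>>,
}

impl CallCounter {
    fn increment(&self, tenant: &str) {
        *self
            .by_tenant
            .borrow_mut()
            .entry(tenant.to_owned())
            .or_insert(0) += 1;
    }

    fn get(&self, tenant: &str) -> u64 {
        self.by_tenant.borrow().get(tenant).copied().unwrap_or(0)
    }
}

/// Runs `fetch` and records the call only if it succeeded.
/// The `?` returns early on failure, so the increment is never reached.
fn get_object<F>(counter: &CallCounter, tenant: &str, fetch: F) -> Result<Vec<u8>, String>
where
    F: FnOnce() -> Result<Vec<u8>, String>,
{
    let body = fetch()?;
    counter.increment(tenant);
    Ok(body)
}
```

With this ordering a failed GET leaves the tenant's counter untouched, so billing and usage figures reflect only completed operations.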

src/storage/gcs.rs (1)

179-345: Record object-store metrics only after successful operations.

Multiple methods increment metrics before success is known (e.g., _get_object, _put_object, _delete_prefix list/delete, _list_dates, _upload_file, head, delete_object, check, list_dirs*). This will count failed attempts. Move increments into Ok branches (or after ?) so only successful calls are counted.

🛠️ Example fix (apply similarly across methods)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
+        let resp = self.client.get(&to_object_store_path(path)).await;
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);

         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;

Based on learnings, metrics should only be recorded after successful operations.

Also applies to: 232-283, 285-345, 481-499, 621-663, 869-901

src/handlers/http/cluster/mod.rs (1)

503-709: Propagate tenant header in RBAC sync requests.

sync_users_with_roles_with_ingestors, sync_user_deletion_with_ingestors, sync_user_creation, and sync_password_reset_with_ingestors send requests without the tenant header, but downstream handlers use get_tenant_id_from_request which extracts tenant context from the "tenant" header. In multi-tenant deployments, when the header is missing, handlers default to DEFAULT_TENANT, potentially applying changes to the wrong tenant. Attach the tenant header whenever tenant_id is present.

🛠️ Example fix (apply across all sync_* calls)
-        async move {
-            let res = INTRA_CLUSTER_CLIENT
-                .post(url)
-                .header(header::AUTHORIZATION, &ingestor.token)
-                .header(header::CONTENT_TYPE, "application/json")
-                .send()
-                .await
+        async move {
+            let mut req = INTRA_CLUSTER_CLIENT
+                .post(url)
+                .header(header::AUTHORIZATION, &ingestor.token)
+                .header(header::CONTENT_TYPE, "application/json");
+            if let Some(tenant) = tenant_id.as_deref() {
+                req = req.header("tenant", tenant);
+            }
+            let res = req.send().await
                 .map_err(|err| {
                     error!(
                         "Fatal: failed to forward request to ingestor: {}\n Error: {:?}",
                         ingestor.domain_name, err
                     );
                     RBACError::Network(err)
                 })?;
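
The conditional-header pattern in the example fix does not depend on the HTTP client used. The sketch below uses a hypothetical, simplified `RequestBuilder` (not the real intra-cluster client) purely to show how the `tenant` header is attached only when a tenant id is present.

```rust
/// Hypothetical, simplified request builder used only for illustration.
#[derive(Debug, Default)]
struct RequestBuilder {
    headers: Vec<(String, String)>,
}

impl RequestBuilder {
    /// Consumes and returns the builder, like typical HTTP client builders.
    fn header(mut self, name: &str, value: &str) -> Self {
        self.headers.push((name.to_owned(), value.to_owned()));
        self
    }
}

/// Builds a sync request and forwards the tenant header when one is set.
fn build_sync_request(token: &str, tenant_id: &Option<String>) -> RequestBuilder {
    let mut req = RequestBuilder::default()
        .header("authorization", token)
        .header("content-type", "application/json");
    if let Some(tenant) = tenant_id.as_deref() {
        req = req.header("tenant", tenant);
    }
    req
}
```

Because the builder is consumed by each call, it has to be rebound (`req = req.header(...)`) inside the `if let`, which is why the fix introduces a mutable binding before `send()`.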
src/metastore/metastores/object_store_metastore.rs (1)

575-602: get_chats is not tenant-aware, inconsistent with other similar methods.

Unlike get_dashboards, get_filters, and get_correlations which iterate over all tenants via PARSEABLE.list_tenants(), this method only looks at the root USERS_ROOT_DIR with tenant_id = None. This will miss chats stored under tenant-specific paths.

🔧 Suggested fix to align with other tenant-aware methods
     async fn get_chats(&self) -> Result<DashMap<String, Vec<Bytes>>, MetastoreError> {
         let all_user_chats = DashMap::new();
-
-        let users_dir = RelativePathBuf::from(USERS_ROOT_DIR);
-        for user in self.storage.list_dirs_relative(&users_dir, &None).await? {
-            if user.starts_with(".") {
-                continue;
-            }
-            let mut chats = Vec::new();
-            let chats_path = users_dir.join(&user).join("chats");
-            let user_chats = self
-                .storage
-                .get_objects(
-                    Some(&chats_path),
-                    Box::new(|file_name| file_name.ends_with(".json")),
-                    &None,
-                )
-                .await?;
-            for chat in user_chats {
-                chats.push(chat);
+        let base_paths = PARSEABLE.list_tenants().map_or(vec!["".into()], |v| v);
+        for tenant in base_paths {
+            let tenant_id = &Some(tenant.clone());
+            let users_dir = RelativePathBuf::from_iter([&tenant, USERS_ROOT_DIR]);
+            for user in self.storage.list_dirs_relative(&users_dir, tenant_id).await? {
+                if user.starts_with(".") {
+                    continue;
+                }
+                let mut chats = Vec::new();
+                let chats_path = users_dir.join(&user).join("chats");
+                let user_chats = self
+                    .storage
+                    .get_objects(
+                        Some(&chats_path),
+                        Box::new(|file_name| file_name.ends_with(".json")),
+                        tenant_id,
+                    )
+                    .await?;
+                for chat in user_chats {
+                    chats.push(chat);
+                }
+                all_user_chats.insert(user, chats);
             }
-
-            all_user_chats.insert(user, chats);
         }
 
         Ok(all_user_chats)
     }
🤖 Fix all issues with AI agents
In `@src/alerts/alert_types.rs`:
- Around line 90-101: The code currently uses
HeaderValue::from_str(&header).unwrap() which can panic; replace the unwrap with
proper error handling by matching or using .map_err() on
HeaderValue::from_str(&header) inside the auth-building block (the code
referencing self.tenant_id, TENANT_METADATA.get_global_query_auth, HeaderMap and
HeaderValue::from_str). Either propagate the error from the surrounding function
(turning the caller into a Result and returning an Err with a descriptive
message) or handle the parse failure gracefully (log the malformed header and
skip adding the Authorization header by returning None for auth); implement the
match or map_err flow so no panic can occur.

In `@src/handlers/http/ingest.rs`:
- Around line 62-67: The handler is currently vulnerable to a panic because
get_tenant_id_from_request (which uses to_str().unwrap()) can fail on non‑UTF8
tenant headers; change get_tenant_id_from_request to return a Result<String,
HeaderParseError> (or Option) instead of unwrapping, then update the ingest
handler to handle that error—when extracting the tenant id after
STREAM_NAME_HEADER_KEY lookup, map a malformed/non‑UTF8 tenant header into a
PostError::Header(ParseHeaderError::InvalidTenantHeader) (or other appropriate
ParseHeaderError) and return Err(...) rather than allowing a panic; refer to
get_tenant_id_from_request and PostError/ParseHeaderError when making these
changes.

In `@src/handlers/http/llm.rs`:
- Around line 92-103: get_tenant_id_from_request currently unwraps
header.to_str() which can panic on non‑UTF8 tenant headers; change
get_tenant_id_from_request (in utils) to return a Result<String, _> or
Option<String> instead of panicking, then update make_llm_request to handle the
error/None case by returning an appropriate LLMError (e.g.,
LLMError::InvalidTenantHeader or map to a 400) before calling
PARSEABLE.get_stream; reference get_tenant_id_from_request and make_llm_request
and ensure no direct to_str().unwrap() remains.

In `@src/handlers/livetail.rs`:
- Around line 119-121: The code calls PARSEABLE.get_stream(stream, &None) which
forces the default tenant; change it to resolve the tenant from the current
session and pass that instead. Locate the call to PARSEABLE.get_stream and
replace the &None tenant argument with the session's tenant (e.g.
&Some(session.tenant()) or &Some(session.tenant_id()) depending on the session
API) so get_stream uses the session's tenant context rather than the default.

In `@src/metastore/metastores/object_store_metastore.rs`:
- Around line 1167-1172: The delete_tenant implementation is passing &None for
tenant attribution to storage.delete_prefix which causes metrics to be
attributed to DEFAULT_TENANT; update the delete_tenant function to call
self.storage.delete_prefix(&RelativePathBuf::from(tenant_id),
&Some(tenant_id.to_string())).await.map_err(MetastoreError::ObjectStorageError)
so the actual tenant_id is supplied (consistent with how delete_object uses
tenant attribution).

In `@src/storage/store_metadata.rs`:
- Around line 126-131: The resolve_parseable_metadata function reads staging
metadata using get_staging_metadata() but now receives tenant_id; update the
call to pass tenant_id through so tenant-specific staging metadata is read
(thread tenant_id into get_staging_metadata). Modify get_staging_metadata (or
its wrapper) to accept the same tenant_id type used by
resolve_parseable_metadata (e.g., &Option<String> or &str) and use that value
when loading staging metadata; update the call site in
resolve_parseable_metadata to call get_staging_metadata(tenant_id) instead of
the parameterless version.

Comment on lines +90 to +101
        let auth = if let Some(tenant) = self.tenant_id.as_ref()
            && let Some(header) = TENANT_METADATA.get_global_query_auth(tenant)
        {
            let mut map = HeaderMap::new();
            map.insert(
                HeaderName::from_static("authorization"),
                HeaderValue::from_str(&header).unwrap(),
            );
            Some(map)
        } else {
            None
        };

⚠️ Potential issue | 🟠 Major

Avoid panics when building the Authorization header.

HeaderValue::from_str(&header).unwrap() can panic on malformed values. Convert this into a handled error instead.

🛠️ Proposed fix
-        let auth = if let Some(tenant) = self.tenant_id.as_ref()
-            && let Some(header) = TENANT_METADATA.get_global_query_auth(tenant)
-        {
-            let mut map = HeaderMap::new();
-            map.insert(
-                HeaderName::from_static("authorization"),
-                HeaderValue::from_str(&header).unwrap(),
-            );
-            Some(map)
-        } else {
-            None
-        };
+        let auth = if let Some(tenant) = self.tenant_id.as_ref()
+            && let Some(header) = TENANT_METADATA.get_global_query_auth(tenant)
+        {
+            let mut map = HeaderMap::new();
+            let value = HeaderValue::from_str(&header)
+                .map_err(|e| AlertError::CustomError(format!("Invalid auth header: {e}")))?;
+            map.insert(HeaderName::from_static("authorization"), value);
+            Some(map)
+        } else {
+            None
+        };
🤖 Prompt for AI Agents
In `@src/alerts/alert_types.rs` around lines 90 - 101, The code currently uses
HeaderValue::from_str(&header).unwrap() which can panic; replace the unwrap with
proper error handling by matching or using .map_err() on
HeaderValue::from_str(&header) inside the auth-building block (the code
referencing self.tenant_id, TENANT_METADATA.get_global_query_auth, HeaderMap and
HeaderValue::from_str). Either propagate the error from the surrounding function
(turning the caller into a Result and returning an Err with a descriptive
message) or handle the parse failure gracefully (log the malformed header and
skip adding the Authorization header by returning None for auth); implement the
match or map_err flow so no panic can occur.

Comment on lines 62 to +67
     let Some(stream_name) = req.headers().get(STREAM_NAME_HEADER_KEY) else {
         return Err(PostError::Header(ParseHeaderError::MissingStreamName));
     };

+    let tenant_id = get_tenant_id_from_request(&req);
     let stream_name = stream_name.to_str().unwrap().to_owned();
-    let internal_stream_names = PARSEABLE.streams.list_internal_streams();
+    let internal_stream_names = PARSEABLE.streams.list_internal_streams(&tenant_id);

⚠️ Potential issue | 🟠 Major

Avoid panic on malformed tenant header.

Line 65 calls get_tenant_id_from_request, which currently uses to_str().unwrap() (see src/utils/mod.rs). A non‑UTF8 tenant header will panic and crash the handler. Please harden the helper to be tolerant (or return a typed error) and handle it here as a bad request.

🛠️ Proposed fix (src/utils/mod.rs)
 pub fn get_tenant_id_from_request(req: &HttpRequest) -> Option<String> {
     req.headers()
-        .get("tenant")
-        .map(|tenant_value| tenant_value.to_str().unwrap().to_owned())
+        .get("tenant")
+        .and_then(|tenant_value| tenant_value.to_str().ok())
+        .map(|tenant_value| tenant_value.to_owned())
 }
🤖 Prompt for AI Agents
In `@src/handlers/http/ingest.rs` around lines 62 - 67, The handler is currently
vulnerable to a panic because get_tenant_id_from_request (which uses
to_str().unwrap()) can fail on non‑UTF8 tenant headers; change
get_tenant_id_from_request to return a Result<String, ParseHeaderError> (or
Option) instead of unwrapping, then update the ingest handler to handle that
error—when extracting the tenant id after STREAM_NAME_HEADER_KEY lookup, map a
malformed/non‑UTF8 tenant header into a
PostError::Header(ParseHeaderError::InvalidTenantHeader) (or other appropriate
ParseHeaderError) and return Err(...) rather than allowing a panic; refer to
get_tenant_id_from_request and PostError/ParseHeaderError when making these
changes.
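
As a rough illustration of the typed-error option, here is a minimal sketch that works on raw header bytes so it does not depend on actix-web. `TenantHeaderError` and `parse_tenant_header` are hypothetical names, not part of the PR, and UTF-8 validation stands in for `HeaderValue::to_str()`, which is stricter.

```rust
use std::str;

/// Hypothetical error for this sketch; the PR would more likely add a
/// variant to its existing ParseHeaderError.
#[derive(Debug, PartialEq)]
pub enum TenantHeaderError {
    Malformed,
}

/// Parses the raw bytes of a tenant header.
/// `None` means the header was absent, which is not an error.
pub fn parse_tenant_header(raw: Option<&[u8]>) -> Result<Option<String>, TenantHeaderError> {
    match raw {
        None => Ok(None),
        Some(bytes) => str::from_utf8(bytes)
            .map(|tenant| Some(tenant.to_owned()))
            .map_err(|_| TenantHeaderError::Malformed),
    }
}
```

With this shape the handler can map `Malformed` to a 400 response, while an absent header still falls back to the default tenant.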

Comment on lines +92 to +103
pub async fn make_llm_request(
req: HttpRequest,
body: web::Json<AiPrompt>,
) -> Result<HttpResponse, LLMError> {
let api_key = match &PARSEABLE.options.open_ai_key {
Some(api_key) if api_key.len() > 3 => api_key,
_ => return Err(LLMError::InvalidAPIKey),
};

let stream_name = &body.stream;
-let schema = PARSEABLE.get_stream(stream_name)?.get_schema();
+let tenant_id = get_tenant_id_from_request(&req);
+let schema = PARSEABLE.get_stream(stream_name, &tenant_id)?.get_schema();

⚠️ Potential issue | 🟠 Major

Guard against panic on malformed tenant header.
get_tenant_id_from_request unwraps to_str(), so a non‑UTF8 tenant header can panic and crash the handler. Prefer returning None (or a 400) instead of panicking, then handle it here.

Proposed fix (src/utils/mod.rs)
 pub fn get_tenant_id_from_request(req: &HttpRequest) -> Option<String> {
-    req.headers()
-        .get("tenant")
-        .map(|tenant_value| tenant_value.to_str().unwrap().to_owned())
+    req.headers()
+        .get("tenant")
+        .and_then(|tenant_value| tenant_value.to_str().ok())
+        .map(|tenant_value| tenant_value.to_owned())
 }
🤖 Prompt for AI Agents
In `@src/handlers/http/llm.rs` around lines 92 - 103, get_tenant_id_from_request
currently unwraps header.to_str() which can panic on non‑UTF8 tenant headers;
change get_tenant_id_from_request (in utils) to return a Result<String, _> or
Option<String> instead of panicking, then update make_llm_request to handle the
error/None case by returning an appropriate LLMError (e.g.,
LLMError::InvalidTenantHeader or map to a 400) before calling
PARSEABLE.get_stream; reference get_tenant_id_from_request and make_llm_request
and ensure no direct to_str().unwrap() remains.

Comment on lines 119 to 121
let schema = PARSEABLE
-.get_stream(stream)
+.get_stream(stream, &None)
.map_err(|err| Status::failed_precondition(err.to_string()))?

⚠️ Potential issue | 🔴 Critical

Resolve streams using the session’s tenant.
Passing &None forces the default tenant and can expose the wrong tenant’s data.

🔐 Proposed fix
-        let schema = PARSEABLE
-            .get_stream(stream, &None)
+        let tenant_id = utils::get_tenant_id_from_key(&key);
+        let schema = PARSEABLE
+            .get_stream(stream, &tenant_id)
             .map_err(|err| Status::failed_precondition(err.to_string()))?
             .get_schema();
🤖 Prompt for AI Agents
In `@src/handlers/livetail.rs` around lines 119 - 121, The code calls
PARSEABLE.get_stream(stream, &None) which forces the default tenant; change it
to resolve the tenant from the current session and pass that instead. Locate the
call to PARSEABLE.get_stream and replace the &None tenant argument with the
session's tenant (e.g. &Some(session.tenant()) or &Some(session.tenant_id())
depending on the session API) so get_stream uses the session's tenant context
rather than the default.

Comment on lines +1167 to +1172
async fn delete_tenant(&self, tenant_id: &str) -> Result<(), MetastoreError> {
self.storage
.delete_prefix(&RelativePathBuf::from(tenant_id), &None)
.await
.map_err(MetastoreError::ObjectStorageError)
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -n "async fn delete_prefix" --type rust -A 10

Repository: parseablehq/parseable

Length of output: 3152


🏁 Script executed:

rg -n "_delete_prefix" --type rust -B 2 -A 15 | head -80

Repository: parseablehq/parseable

Length of output: 4514


🏁 Script executed:

rg -n "delete_prefix" src/metastore/metastores/object_store_metastore.rs -B 3 -A 3

Repository: parseablehq/parseable

Length of output: 355


🏁 Script executed:

rg -n "async fn delete_stream" src/metastore/metastores/object_store_metastore.rs -A 8

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

rg -n "delete_stream" src/metastore/metastores/object_store_metastore.rs -B 2 -A 5

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

rg -n "self.storage" src/metastore/metastores/object_store_metastore.rs | head -20

Repository: parseablehq/parseable

Length of output: 1381


🏁 Script executed:

sed -n '1190,1210p' src/metastore/metastores/object_store_metastore.rs

Repository: parseablehq/parseable

Length of output: 744


🏁 Script executed:

sed -n '1160,1180p' src/metastore/metastores/object_store_metastore.rs

Repository: parseablehq/parseable

Length of output: 739


Fix delete_tenant to pass the actual tenant_id to delete_prefix.

The delete_tenant function receives tenant_id: &str but passes &None to delete_prefix. This causes metrics attribution to DEFAULT_TENANT instead of the tenant being deleted. The function should pass &Some(tenant_id.to_string()) to align with how other storage operations (e.g., delete_object at line 141) handle tenant_id.

🤖 Prompt for AI Agents
In `@src/metastore/metastores/object_store_metastore.rs` around lines 1167 - 1172,
The delete_tenant implementation is passing &None for tenant attribution to
storage.delete_prefix which causes metrics to be attributed to DEFAULT_TENANT;
update the delete_tenant function to call
self.storage.delete_prefix(&RelativePathBuf::from(tenant_id),
&Some(tenant_id.to_string())).await.map_err(MetastoreError::ObjectStorageError)
so the actual tenant_id is supplied (consistent with how delete_object uses
tenant attribution).

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (10)
src/catalog/mod.rs (1)

563-576: Fix partition_path to use DEFAULT_TENANT when tenant_id is None.

The function currently uses an empty string for the root path component when tenant_id is None. This is inconsistent with the established pattern throughout the codebase, which uses DEFAULT_TENANT for all similar cases (stats.rs, correlation.rs, parseable/streams.rs, storage/localfs.rs, etc.). Update line 569 to:

let root = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);

This ensures manifest paths follow the same convention as the rest of the application.
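
A minimal sketch of the fallback, for illustration only. The value of `DEFAULT_TENANT` below is a placeholder, and `tenant_root` and `stream_prefix` are hypothetical helpers rather than functions from the PR.

```rust
/// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Resolves the root path component, never yielding an empty string.
pub fn tenant_root(tenant_id: &Option<String>) -> &str {
    tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)
}

/// Hypothetical helper: joins the tenant root with a stream name.
pub fn stream_prefix(tenant_id: &Option<String>, stream: &str) -> String {
    format!("{}/{}", tenant_root(tenant_id), stream)
}
```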

src/migration/mod.rs (1)

50-61: Staging metadata read still ignores tenant_id.

put_staging_metadata writes to a tenant-scoped path when tenant_id is Some, but run_metadata_migration still reads only the global staging file. This skips staging migration for non-default tenants.

🔧 Proposed fix
-    let staging_metadata = get_staging_metadata(config)?;
+    let staging_metadata = get_staging_metadata(config, tenant_id)?;

-pub fn get_staging_metadata(config: &Parseable) -> anyhow::Result<Option<serde_json::Value>> {
-    let path =
-        RelativePathBuf::from(PARSEABLE_METADATA_FILE_NAME).to_path(config.options.staging_dir());
+pub fn get_staging_metadata(
+    config: &Parseable,
+    tenant_id: &Option<String>,
+) -> anyhow::Result<Option<serde_json::Value>> {
+    let path = if let Some(tenant) = tenant_id.as_ref() {
+        RelativePathBuf::from_iter([tenant, PARSEABLE_METADATA_FILE_NAME])
+            .to_path(config.options.staging_dir())
+    } else {
+        RelativePathBuf::from(PARSEABLE_METADATA_FILE_NAME)
+            .to_path(config.options.staging_dir())
+    };
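
The path selection in the fix above can be tried in isolation. This is a standalone sketch using `std::path` instead of `RelativePathBuf`; `METADATA_FILE` is a placeholder for `PARSEABLE_METADATA_FILE_NAME` and `staging_metadata_path` is a hypothetical name.

```rust
use std::path::{Path, PathBuf};

/// Placeholder for PARSEABLE_METADATA_FILE_NAME; the real value may differ.
const METADATA_FILE: &str = ".parseable.json";

/// Returns the tenant-scoped staging metadata path when a tenant is given,
/// and the global path otherwise, so reads and writes agree on the location.
pub fn staging_metadata_path(staging_dir: &Path, tenant_id: &Option<String>) -> PathBuf {
    match tenant_id {
        Some(tenant) => staging_dir.join(tenant).join(METADATA_FILE),
        None => staging_dir.join(METADATA_FILE),
    }
}
```

Sharing one helper between the reader and put_staging_metadata would keep the two paths from drifting apart again.
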
src/metastore/metastores/object_store_metastore.rs (2)

344-413: Alert state paths ignore tenant_id, causing cross-tenant leakage.

get_alert_states reads from the tenant-scoped .alerts/ directory, but get_alert_state_entry and put_alert_state read and write a tenant-less alert_state_json_path, so tenant states won’t be found (and can collide across tenants). Align these paths with the tenant-aware base.

🔧 Proposed fix (inline path construction)
-        let path = alert_state_json_path(*alert_id);
+        let path = if let Some(tenant) = tenant_id.as_ref() {
+            RelativePathBuf::from_iter([
+                tenant,
+                ALERTS_ROOT_DIRECTORY,
+                &format!("alert_state_{alert_id}.json"),
+            ])
+        } else {
+            alert_state_json_path(*alert_id)
+        };
@@
-        let path = alert_state_json_path(id);
+        let path = if let Some(tenant) = tenant_id.as_ref() {
+            RelativePathBuf::from_iter([
+                tenant,
+                ALERTS_ROOT_DIRECTORY,
+                &format!("alert_state_{id}.json"),
+            ])
+        } else {
+            alert_state_json_path(id)
+        };

903-916: Tenant prefix leaks into dates in get_all_manifest_files.

When tenant_id is Some, common_prefixes parts include the tenant segment and can be treated as a date. Use the last path component or strip the tenant prefix.

🔧 Suggested fix
-        let dates = resp
-            .common_prefixes
-            .iter()
-            .flat_map(|path| path.parts())
-            .filter(|name| name.as_ref() != stream_name && name.as_ref() != STREAM_ROOT_DIRECTORY)
-            .map(|name| name.as_ref().to_string())
-            .collect::<Vec<_>>();
+        let dates = resp
+            .common_prefixes
+            .iter()
+            .filter_map(|path| path.parts().last().map(|p| p.as_ref().to_string()))
+            .filter(|name| name != stream_name && name != STREAM_ROOT_DIRECTORY)
+            .collect::<Vec<_>>();
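
To illustrate why the last component is the safer choice, here is a small sketch over plain string prefixes instead of object_store paths. `dates_from_prefixes` is a hypothetical name and the `date=...` directory layout in the usage note is assumed for the example.

```rust
/// Extracts the trailing path component of each listed prefix, skipping
/// the stream name and the stream root directory. Because only the last
/// component is inspected, a leading tenant segment cannot be mistaken
/// for a date.
pub fn dates_from_prefixes(
    prefixes: &[&str],
    stream_name: &str,
    stream_root: &str,
) -> Vec<String> {
    prefixes
        .iter()
        .filter_map(|p| p.trim_end_matches('/').rsplit('/').next())
        .filter(|name| !name.is_empty() && *name != stream_name && *name != stream_root)
        .map(str::to_string)
        .collect()
}
```

For a prefix such as `acme/logs/date=2026-01-12/` this keeps only `date=2026-01-12`, whereas flattening every component would also emit `acme`.
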
src/parseable/streams.rs (1)

118-126: Tests lack tenant_id parameter for Stream::new and Options::local_stream_data_path calls—update all test instantiations.

Multiple test functions fail to pass the new tenant_id parameter introduced in both Stream::new() and Options::local_stream_data_path(). Examples:

  • Line 1214–1222: Stream::new(options.clone(), stream_name, LogStreamMetadata::default(), None) missing final tenant_id arg
  • Line 1222: options.local_stream_data_path(stream_name) missing tenant_id arg

Production call sites in src/parseable/mod.rs (lines 240, 448, 864) correctly pass tenant_id to get_or_create, which then properly forwards it to Stream::new. All test calls throughout lines 1209–1562 must be updated to include &None as the final argument.

src/rbac/user.rs (1)

332-379: UserGroup::validate() has incomplete tenant awareness.

Line 336 checks read_user_groups().contains_key(&self.name) without tenant context - this checks if the group name exists in ANY tenant's groups, not the specific tenant. Line 355 similarly checks users().contains_key(group_user.userid()) at the top level instead of within the tenant's user map.

Suggested fix for tenant-aware validation
     pub fn validate(&self, tenant_id: &Option<String>) -> Result<(), RBACError> {
         let valid_name = is_valid_group_name(&self.name);
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
 
-        if read_user_groups().contains_key(&self.name) {
+        if let Some(groups) = read_user_groups().get(tenant)
+            && groups.contains_key(&self.name)
+        {
             return Err(RBACError::UserGroupExists(self.name.clone()));
         }
         // ... role validation (already tenant-aware) ...
         
         if !self.users.is_empty() {
             // validate that the users exist
             for group_user in &self.users {
-                if !users().contains_key(group_user.userid()) {
+                let user_exists = users()
+                    .get(tenant)
+                    .map(|u| u.contains_key(group_user.userid()))
+                    .unwrap_or(false);
+                if !user_exists {
                     non_existent_users.push(group_user.userid().to_string());
                 }
             }
         }
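
The tenant-scoped lookup can be sketched on its own with a nested map. The types here are simplified stand-ins (the real maps hold user and group structs), `exists_in_tenant` is a hypothetical helper, and the `DEFAULT_TENANT` value is a placeholder.

```rust
use std::collections::{HashMap, HashSet};

/// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Simplified stand-in: tenant name -> set of group (or user) names.
pub type TenantMap = HashMap<String, HashSet<String>>;

/// Checks for `name` only inside the resolved tenant's inner map.
/// A name that exists under a different tenant does not count.
pub fn exists_in_tenant(map: &TenantMap, tenant_id: &Option<String>, name: &str) -> bool {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    map.get(tenant)
        .map(|names| names.contains(name))
        .unwrap_or(false)
}
```
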
src/handlers/http/logstream.rs (1)

272-291: Avoid panics when date metrics are missing or label sets change.

get_metric_with_label_values(...).unwrap() will panic if a tenant/date label set isn’t registered (or if global metrics remain without tenant labels). Prefer a fallible path and return a StreamError instead of crashing.

🐛 Proposed fix (apply to all three metrics)
-    let events_ingested = EVENTS_INGESTED_DATE
-        .get_metric_with_label_values(&event_labels)
-        .unwrap()
-        .get() as u64;
+    let events_ingested = EVENTS_INGESTED_DATE
+        .get_metric_with_label_values(&event_labels)
+        .map_err(|_| StreamError::Custom {
+            msg: "stats not available for requested date/tenant".to_string(),
+            status: StatusCode::NOT_FOUND,
+        })?
+        .get() as u64;

Based on learnings, TOTAL_EVENTS_*_DATE metrics are intended to be global and label sets must remain consistent.

src/storage/azure_blob.rs (1)

215-233: Count object-store metrics only on successful calls.

increment_object_store_calls_by_date currently runs even when get fails, which can skew billing/usage metrics. Move it inside the Ok branch (apply similarly in _put_object, _delete_prefix, etc.).

🛠️ Proposed fix (example for GET)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
+        let resp = self.client.get(&to_object_store_path(path)).await;
         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;

Based on learnings, object-store metrics should be recorded only after successful operations.

src/alerts/mod.rs (1)

626-635: Filter SSE recipients by alert tenant to prevent cross-tenant leaks.
user_auth_for_query authorizes by session permissions, but the broadcast list isn’t constrained to the alert’s tenant. If dataset names overlap across tenants, this can leak alert events. Gate by tenant match before adding to broadcast_to.

🔒 Proposed fix
-        for (session, _, _) in active_session {
-            if user_auth_for_query(&session, &self.query).await.is_ok()
-                && let SessionKey::SessionId(id) = &session
-            {
-                broadcast_to.push(*id);
-            }
-        }
+        let alert_tenant = self.tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        for (session, _, _) in active_session {
+            let session_tenant = get_tenant_id_from_key(&session);
+            let session_tenant = session_tenant.as_deref().unwrap_or(DEFAULT_TENANT);
+            if session_tenant != alert_tenant {
+                continue;
+            }
+            if user_auth_for_query(&session, &self.query).await.is_ok()
+                && let SessionKey::SessionId(id) = &session
+            {
+                broadcast_to.push(*id);
+            }
+        }
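
A reduced sketch of the tenant gate, leaving out the permission check. `Session` and `recipients_for_alert` are hypothetical simplifications of the session key and broadcast logic, and the `DEFAULT_TENANT` value is a placeholder.

```rust
/// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Simplified stand-in for an active session.
pub struct Session {
    pub id: u64,
    pub tenant: Option<String>,
}

/// Keeps only sessions whose tenant matches the alert's tenant.
/// Sessions and alerts without a tenant both resolve to DEFAULT_TENANT.
pub fn recipients_for_alert(sessions: &[Session], alert_tenant: &Option<String>) -> Vec<u64> {
    let alert_tenant = alert_tenant.as_deref().unwrap_or(DEFAULT_TENANT);
    sessions
        .iter()
        .filter(|s| s.tenant.as_deref().unwrap_or(DEFAULT_TENANT) == alert_tenant)
        .map(|s| s.id)
        .collect()
}
```

In the real code the authorization check would still run after this filter, so the tenant match narrows the candidate list rather than replacing the permission check.
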
src/storage/gcs.rs (1)

179-229: Record object-store metrics only on success.
Several methods call increment_object_store_calls_by_date (and some file counters) before confirming the operation succeeded. This inflates metrics on failures.

✅ Example fix (apply similarly in other methods)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
-        match resp {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let resp = self.client.get(&to_object_store_path(path)).await;
+        match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
                 increment_files_scanned_in_object_store_calls_by_date(
                     "GET",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
                 increment_bytes_scanned_in_object_store_calls_by_date(
                     "GET",
                     body.len() as u64,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
                 Ok(body)
             }
             Err(err) => Err(err.into()),
         }
@@
-        let resp = self.client.put(&to_object_store_path(path), resource).await;
-        increment_object_store_calls_by_date("PUT", &Utc::now().date_naive().to_string(), tenant);
-        match resp {
+        let resp = self.client.put(&to_object_store_path(path), resource).await;
+        match resp {
             Ok(_) => {
+                increment_object_store_calls_by_date(
+                    "PUT",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 increment_files_scanned_in_object_store_calls_by_date(
                     "PUT",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
                 Ok(())
             }
             Err(err) => Err(err.into()),
         }

Based on learnings, metrics should only be emitted after successful operations.

Also applies to: 232-283, 441-499, 621-699, 869-933
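
One way to avoid repeating the success check in every method is a small wrapper. The sketch below uses a plain atomic counter in place of the Prometheus metric, and `record_on_success` is a hypothetical helper, not something in the codebase.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Runs `op` and bumps `calls` only when it returns Ok.
/// The atomic counter stands in for the real per-tenant metric.
pub fn record_on_success<T, E>(
    calls: &AtomicU64,
    op: impl FnOnce() -> Result<T, E>,
) -> Result<T, E> {
    let result = op();
    if result.is_ok() {
        calls.fetch_add(1, Ordering::Relaxed);
    }
    result
}
```

The real storage methods are async, so an equivalent wrapper there would take a future rather than a closure.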

🤖 Fix all issues with AI agents
In `@src/handlers/http/logstream.rs`:
- Around line 442-446: The "TODO tenants" marker is ambiguous—either remove it
if hot-tier handling is already tenant-aware or implement tenant scoping; choose
one: if hot tiers are tenant-aware, delete the line "// TODO tenants" and add a
brief comment confirming tenant scoping is handled around validator::hot_tier,
hottier, stream.set_hot_tier, and HotTierManager::global(); otherwise, implement
tenant-aware behavior by obtaining the current tenant id/context,
validating/declaring hottier per-tenant (e.g., via validator::hot_tier with
tenant info), and pass tenant-scoped hottier to stream.set_hot_tier and
HotTierManager::global() so the hot-tier is correctly associated with the tenant
before removing the TODO.

In `@src/handlers/http/modal/ingest/ingestor_rbac.rs`:
- Around line 154-160: In remove_roles_from_user, the closure that checks and
collects non_existent_roles uses a combined pattern with if let ... && ... which
mis-validates roles; replace it with a two-step check: first use if let
Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
{ if tenant_roles.get(r).is_none() { non_existent_roles.push(r.clone()); } } so
you explicitly resolve tenant_roles from roles() and then test
tenant_roles.get(r).is_none() (mirroring the same fix applied elsewhere).
- Around line 108-114: The current loop over roles_to_add uses
roles().get(...).and_then logic that lets roles for a non-existent tenant pass
silently; change the check in ingestor_rbac.rs so you first lookup the tenant
entry from roles() using tenant_id.as_deref().unwrap_or(DEFAULT_TENANT) (e.g.,
let tenant_roles_opt = roles().get(...)) and then: if tenant_roles_opt is None,
treat the tenant as missing and push all roles_to_add (or otherwise mark them as
non-existent) into non_existent_roles; else inspect tenant_roles
(tenant_roles_opt.unwrap()) and push only those r where
tenant_roles.get(r).is_none(). This ensures non-existent tenants cause
validation failures rather than silent passes.

In `@src/rbac/mod.rs`:
- Around line 299-345: The current search in mutate_request_with_basic_user (and
get_user_tenant_from_basic) uses parallel find_first across tenants, so if
identical basic credentials exist in multiple tenants the chosen tenant is
non-deterministic; fix by either enforcing username uniqueness across tenants at
user-creation time (validate in the user creation function that no other tenant
has the same basic.username/password and reject/merge) or make the lookup
deterministic by replacing the parallel find_first with a serial, deterministic
scan (e.g., users().iter() or a sorted tenant order) and returning the first
match consistently; apply the chosen approach to both
mutate_request_with_basic_user and get_user_tenant_from_basic and add a short
doc comment describing the enforced behavior.

Comment on lines 442 to 446
validator::hot_tier(&hottier.size.to_string())?;

// TODO tenants
stream.set_hot_tier(Some(hottier.clone()));
let Some(hot_tier_manager) = HotTierManager::global() else {

⚠️ Potential issue | 🟡 Minor

Resolve or remove the “TODO tenants” marker.

This reads like an incomplete requirement. If hot-tier is already tenant-aware, remove it to avoid confusion; otherwise, please implement or track the missing work.

🤖 Prompt for AI Agents
In `@src/handlers/http/logstream.rs` around lines 442 - 446, The "TODO tenants"
marker is ambiguous—either remove it if hot-tier handling is already
tenant-aware or implement tenant scoping; choose one: if hot tiers are
tenant-aware, delete the line "// TODO tenants" and add a brief comment
confirming tenant scoping is handled around validator::hot_tier, hottier,
stream.set_hot_tier, and HotTierManager::global(); otherwise, implement
tenant-aware behavior by obtaining the current tenant id/context,
validating/declaring hottier per-tenant (e.g., via validator::hot_tier with
tenant info), and pass tenant-scoped hottier to stream.set_hot_tier and
HotTierManager::global() so the hot-tier is correctly associated with the tenant
before removing the TODO.

Comment on lines 154 to 160
roles_to_remove.iter().for_each(|r| {
-if roles().get(r).is_none() {
+if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
+    && tenant_roles.get(r).is_none()
+{
non_existent_roles.push(r.clone());
}
});

⚠️ Potential issue | 🟠 Major

Same role validation issue in remove_roles_from_user.

Apply the same fix pattern here.

🤖 Prompt for AI Agents
In `@src/handlers/http/modal/ingest/ingestor_rbac.rs` around lines 154 - 160, In
remove_roles_from_user, the closure that checks and collects non_existent_roles
uses a combined pattern with if let ... && ... which mis-validates roles;
replace it with a two-step check: first use if let Some(tenant_roles) =
roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)) { if
tenant_roles.get(r).is_none() { non_existent_roles.push(r.clone()); } } so you
explicitly resolve tenant_roles from roles() and then test
tenant_roles.get(r).is_none() (mirroring the same fix applied elsewhere).
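
One caveat on the role check: a nested `if let` followed by an inner `if` behaves the same as the let-chain, so the case that needs separate handling appears to be a tenant with no entry in the roles map. The sketch below treats a missing tenant as "every requested role is non-existent". The map type is a simplified stand-in and `non_existent_roles` is a hypothetical name.

```rust
use std::collections::{HashMap, HashSet};

/// Returns the requested roles that do not exist for `tenant`.
/// If the tenant has no roles entry at all, every requested role is
/// reported, so validation fails instead of passing silently.
pub fn non_existent_roles(
    roles: &HashMap<String, HashSet<String>>,
    tenant: &str,
    requested: &[String],
) -> Vec<String> {
    match roles.get(tenant) {
        None => requested.to_vec(),
        Some(tenant_roles) => requested
            .iter()
            .filter(|r| !tenant_roles.contains(*r))
            .cloned()
            .collect(),
    }
}
```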

Comment on lines +299 to +345
pub fn mutate_request_with_basic_user(
&self,
username: &str,
password: &str,
req: &mut ServiceRequest,
) {
if let Some((tenant, _)) = users().par_iter().find_first(|(_, usermap)| {
usermap
.par_iter()
.find_first(|(_, user)| {
if let UserType::Native(basic) = &user.ty
&& basic.username.eq(username)
&& basic.verify_password(password)
&& let Some(_) = &user.tenant
{
// req.headers_mut().insert(
// HeaderName::from_static("tenant"),
// HeaderValue::from_bytes(tenant.as_bytes()).unwrap(),
// );
true
} else {
false
}
})
.is_some()
}) {
req.headers_mut().insert(
HeaderName::from_static("tenant"),
HeaderValue::from_bytes(tenant.as_bytes()).unwrap(),
);
};
// for (_, usermap) in users().iter() {
// for (_, user) in usermap.iter() {
// if let UserType::Native(basic) = &user.ty
// && basic.username.eq(username)
// && basic.verify_password(password)
// && let Some(tenant) = &user.tenant
// {
// req.headers_mut().insert(
// HeaderName::from_static("tenant"),
// HeaderValue::from_bytes(tenant.as_bytes()).unwrap(),
// );
// return;
// }
// }
// }
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's find the user creation and modification functions
rg -n "fn new_basic|fn put_user|fn create.*user" --type rust -A 10 | head -100

Repository: parseablehq/parseable

Length of output: 2044


🏁 Script executed:

# Let's also check the users() function and data structure
rg -n "fn users|struct.*User|HashMap.*User" --type rust -B 2 -A 5 | head -100

Repository: parseablehq/parseable

Length of output: 5066


🏁 Script executed:

# Check for any existing uniqueness validation
rg -n "unique|exists|already.*user|duplicate" --type rust -i -B 2 -A 2 | head -80

Repository: parseablehq/parseable

Length of output: 4374


🏁 Script executed:

# Check the Users::insert method implementation
sed -n '460,480p' src/rbac/map.rs

Repository: parseablehq/parseable

Length of output: 820


🏁 Script executed:

# Look at the Basic struct definition
rg -n "struct Basic" --type rust -A 10

Repository: parseablehq/parseable

Length of output: 1209


🏁 Script executed:

# Check how userid is generated/used in authentication
rg -n "userid|username" src/rbac/user.rs --type rust -B 2 -A 2 | head -80

Repository: parseablehq/parseable

Length of output: 2644


🏁 Script executed:

# Check if there's any validation when creating users via API
rg -n "create_user|add_user" src/handlers/http --type rust -B 5 -A 15

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if there's any endpoint that creates basic users and whether it validates uniqueness
rg -n "UserCreate|create.*user" src/handlers/http --type rust -B 5 -A 20 | head -150

Repository: parseablehq/parseable

Length of output: 11502


🏁 Script executed:

# Look for any comments or documentation about tenant scoping
rg -n "tenant.*user|username.*unique|basic.*auth" src/rbac --type rust -i -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 9734


🏁 Script executed:

# Check the entire flow of basic auth authentication
rg -n "verify_password|mutate_request_with_basic" src --type rust -B 3 -A 10 | head -100

Repository: parseablehq/parseable

Length of output: 5640


Document or enforce tenant-scoped username uniqueness in basic auth.

mutate_request_with_basic_user and get_user_tenant_from_basic both iterate across all tenants searching for matching credentials. If identical credentials exist across tenants (currently not prevented), the tenant that gets assigned is not well defined: find_first() returns the first match in iteration order, and that order is not guaranteed to be stable if the users map is hash-based. Either enforce username uniqueness across tenants at user creation time, or explicitly document that matching the first tenant-user pair is the expected behavior.

🤖 Prompt for AI Agents
In `@src/rbac/mod.rs` around lines 299 - 345, The current search in
mutate_request_with_basic_user (and get_user_tenant_from_basic) uses parallel
find_first across tenants, so if identical basic credentials exist in multiple
tenants the chosen tenant is non-deterministic; fix by either enforcing username
uniqueness across tenants at user-creation time (validate in the user creation
function that no other tenant has the same basic.username/password and
reject/merge) or make the lookup deterministic by replacing the parallel
find_first with a serial, deterministic scan (e.g., users().iter() or a sorted
tenant order) and returning the first match consistently; apply the chosen
approach to both mutate_request_with_basic_user and get_user_tenant_from_basic
and add a short doc comment describing the enforced behavior.
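
For the deterministic-lookup option, a scan in sorted tenant order is one possibility. This is only a sketch: `BasicUser` is a simplified stand-in, the plain-text comparison replaces `verify_password`, and `tenant_for_credentials` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a native user; real code stores a password hash.
pub struct BasicUser {
    pub username: String,
    pub password: String,
}

/// Returns the first tenant, in sorted name order, that has a user with
/// matching credentials. Sorting makes the result independent of the
/// hash map's iteration order.
pub fn tenant_for_credentials<'a>(
    users: &'a HashMap<String, Vec<BasicUser>>,
    username: &str,
    password: &str,
) -> Option<&'a str> {
    let mut tenants: Vec<&'a String> = users.keys().collect();
    tenants.sort();
    tenants
        .into_iter()
        .find(|tenant| {
            users[*tenant]
                .iter()
                .any(|u| u.username == username && u.password == password)
        })
        .map(|tenant| tenant.as_str())
}
```

Sorting only makes the choice repeatable. Whether duplicate credentials across tenants should be allowed at all remains a separate decision.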

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
src/correlation.rs (1)

167-168: Logic error: validates wrong correlation and overwrites new values with old.

Two issues here:

  1. Line 167 validates correlation (the old stored version) instead of validating the incoming updated_correlation with new values.
  2. Line 168 calls updated_correlation.update(correlation) which, per the update() method implementation (lines 264-271), copies fields FROM correlation (old) INTO updated_correlation (new), overwriting the user's intended changes.

The result is that updates are effectively no-ops—user-provided values are replaced by stored values before persistence.

🐛 Proposed fix
-        correlation.validate(session_key).await?;
-        updated_correlation.update(correlation);
+        // Preserve id and user_id from stored correlation
+        updated_correlation.id = correlation.id.clone();
+        updated_correlation.user_id = correlation.user_id.clone();
+        
+        // Validate the new configuration before persisting
+        updated_correlation.validate(session_key).await?;
src/storage/localfs.rs (1)

595-628: Add a comment explaining why tenant_id is unused in filesystem storage.

The _tenant_id parameter is intentionally unused in list_dirs_relative and similar methods (list_hours, list_minutes) because local filesystem storage does not implement per-tenant metrics tracking, unlike cloud backends (S3, Azure, GCS). Consider adding a brief comment above the method signature or as an inline comment to clarify this is by design and part of the trait interface.

src/storage/azure_blob.rs (1)

214-266: Record object-store metrics only on successful operations.
increment_object_store_calls_by_date is executed regardless of success in several methods, which will overcount failures. Please move these increments into the success path (and only after the call succeeds).

✅ Example fix (apply similarly to other ops)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
-
-        match resp {
-            Ok(resp) => {
+        let resp = self.client.get(&to_object_store_path(path)).await;
+        match resp {
+            Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
                 increment_files_scanned_in_object_store_calls_by_date(
                     "GET",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
                 increment_bytes_scanned_in_object_store_calls_by_date(
                     "GET",
                     body.len() as u64,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
                 Ok(body)
             }
             Err(err) => Err(err.into()),
         }

Based on learnings, metrics should only be recorded on successful operations.

Also applies to: 321-347, 359-379, 499-516, 639-660, 663-679, 692-716, 891-913, 923-946

src/hottier.rs (1)

208-218: Use tenant-aware hot-tier paths when deleting.
delete_hot_tier still removes hot_tier_path/<stream> even when a tenant is provided, which can delete the wrong tenant’s data or leave the intended data behind.

✅ Suggested fix
-        let path = self.hot_tier_path.join(stream);
+        let path = if let Some(tenant) = tenant_id.as_ref() {
+            self.hot_tier_path.join(tenant).join(stream)
+        } else {
+            self.hot_tier_path.join(stream)
+        };
         fs::remove_dir_all(path).await?;
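
The path selection in the suggested fix can be isolated into a small pure function, which makes it easy to test without touching the filesystem. This is only a sketch: `hot_tier_stream_dir` is a hypothetical helper name, and the layout `<root>/<tenant>/<stream>` is assumed from the diff above.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: resolves the hot-tier directory for a stream,
/// nesting it under the tenant directory when a tenant id is present.
pub fn hot_tier_stream_dir(root: &Path, stream: &str, tenant_id: &Option<String>) -> PathBuf {
    match tenant_id.as_deref() {
        // Tenant-scoped layout: <root>/<tenant>/<stream>
        Some(tenant) => root.join(tenant).join(stream),
        // Legacy single-tenant layout: <root>/<stream>
        None => root.join(stream),
    }
}
```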
src/handlers/http/modal/ingest/ingestor_rbac.rs (1)

201-225: Missing password generation—endpoint returns existing hash instead of new password.

The function is documented to reset the password to a newly generated one (lines 199-200), but never calls Basic::gen_new_password(). Instead, it copies the existing hash at line 220 and persists it unchanged. The put_staging_metadata call at line 210 occurs before any password is generated.

Compare to the correct implementation in src/handlers/http/rbac.rs:180 which calls user::Basic::gen_new_password() to obtain a new hash. The ingestor_rbac.rs version is missing this critical step, breaking the endpoint's intended functionality.

Additionally, the storage call result is discarded.

src/handlers/http/cluster/mod.rs (1)

503-560: Tenant header missing in intra-cluster user/role sync requests.

These calls forward auth but don’t propagate tenant context. Downstream handlers that derive tenant from headers will default to DEFAULT_TENANT, causing cross-tenant updates. Add the tenant header (or include tenant in the payload) for each sync request.

🔧 Directional fix (example)
-            let res = INTRA_CLUSTER_CLIENT
+            let mut req = INTRA_CLUSTER_CLIENT
                 .patch(url)
                 .header(header::AUTHORIZATION, &ingestor.token)
                 .header(header::CONTENT_TYPE, "application/json");
+            if let Some(t) = tenant_id.as_deref() {
+                req = req.header("tenant", t);
+            }
+            let res = req
                 .body(role_data)
                 .send()
                 .await

Also applies to: 565-605, 668-707

src/parseable/streams.rs (2)

585-620: Metrics label cardinality mismatch — fix test code that uses old label sets.

The metrics label changes are real: STAGING_FILES expects 2 labels (["stream", "tenant_id"]) and STORAGE_SIZE expects 4 labels (["type", "stream", "format", "tenant_id"]). The code in lines 585-620 is correct, but lines 1384-1392 in the same file still use old label sets and will panic at runtime:

  • Line 1384: STAGING_FILES.with_label_values(&[&stream]) provides 1 label instead of 2
  • Lines 1386-1388, 1390-1392: STORAGE_SIZE.with_label_values(...) each provide 3 labels instead of 4

Update these test assertions to include tenant_id label values.


118-127: Update Stream::new call sites for tenant-aware signature.

Line 118 adds a tenant_id: &Option<String> parameter and changes data_path to be tenant-aware via options.local_stream_data_path(&stream_name, tenant_id). Multiple test call sites in this file still use the old 4-argument signature instead of the new 5-argument signature. Tests on lines 1214–1217, 1232–1235, 1250–1253, 1268–1271, 1289–1292, 1312–1315, 1346–1349, 1375–1378, 1432–1435, 1458, 1481–1484, 1508, 1531–1534, and 1562 are missing the tenant_id parameter and will not compile. Additionally, assertions on lines 1223, 1241, 1259, and 1277 call options.local_stream_data_path(stream_name) with only one argument; they must pass a second argument for tenant_id.

Update all test call sites to pass &None and adjust assertions accordingly.

🤖 Fix all issues with AI agents
In `@src/correlation.rs`:
- Around line 274-276: In validate(), ensure table resolution is tenant-scoped:
when obtaining the SessionContext via QUERY_SESSION.get_ctx() and before calling
ctx.table(), pass the tenant_id (from get_tenant_id_from_key(session_key)) or
explicitly select the tenant schema so DataFusion resolves the table inside the
tenant's schema (same pattern used in Query.execute()). Update the
ctx.table(...) invocation in validate() to use the tenant-scoped lookup (or set
the active schema for ctx to tenant_id) so authorization at line 308 cannot be
bypassed by cross-tenant table name collisions.

In `@src/rbac/mod.rs`:
- Around line 270-289: The function validate_basic_user_tenant_id currently
returns true whenever the tenant exists because it uses
users().get(tenant_id).map(...).is_some() which only checks tenant presence;
change the logic to return whether a matching user was found for that tenant —
e.g. replace the outer .map(...).is_some() with
users().get(tenant_id).is_some_and(|tenant_users|
tenant_users.values().par_bridge().any(|user| matches!(&user.ty,
UserType::Native(basic) if basic.username.eq(username) &&
basic.verify_password(password)))) or equivalently use
.map(...).unwrap_or(false) and keep the inner find_any(...).is_some(); update
validate_basic_user_tenant_id accordingly so the boolean reflects the user
match, not just tenant existence.

In `@src/rbac/role.rs`:
- Around line 188-230: The DefaultPrivilege enum currently has Reader {
resource: Option<...> } while Ingestor { resource: Option<...> } is treated as
global when None; fix the mismatch by deciding intended behavior and
implementing one of two fixes: (A) Make Reader require a resource (change Reader
to Reader { resource: ParseableResourceType }) and update the
From<&DefaultPrivilege> for RoleBuilder (the reader_perm_builder() match arm)
and any serde usages accordingly; or (B) Keep Reader optional but update the
authorization matching logic in src/rbac/map.rs (the code that special-cases
Action::Ingest for resource_type == None) to also handle global Reader access
when resource_type is None so reader permissions are granted when
Reader.resource is None; ensure RoleBuilder conversion
(reader_perm_builder().with_resource(...)) stays consistent with the chosen
semantics.
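
To illustrate the `validate_basic_user_tenant_id` item in the prompt above, here is a reduced sketch of why `.map(...).is_some()` reports tenant presence rather than a credential match. The `Users` alias, the function names, and the plaintext password comparison are illustrative assumptions only; the real code verifies a password hash.

```rust
use std::collections::HashMap;

/// Illustrative shape only: tenant -> (username -> password).
type Users = HashMap<String, HashMap<String, String>>;

/// Buggy shape: `map(..).is_some()` is true whenever the tenant exists,
/// because the boolean computed inside the closure is discarded.
pub fn tenant_exists_only(users: &Users, tenant: &str, name: &str, pass: &str) -> bool {
    users
        .get(tenant)
        .map(|t| t.get(name).is_some_and(|p| p.as_str() == pass))
        .is_some()
}

/// Fixed shape: the result reflects whether the credentials matched.
pub fn credentials_match(users: &Users, tenant: &str, name: &str, pass: &str) -> bool {
    users
        .get(tenant)
        .is_some_and(|t| t.get(name).is_some_and(|p| p.as_str() == pass))
}
```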
♻️ Duplicate comments (69)
src/users/filters.rs (1)

128-134: Filter silently dropped when tenant bucket doesn't exist.

This issue was previously identified: when adding the first filter for a new tenant (or after a restart where the tenant bucket wasn't initialized), get_mut() returns None and the filter is silently dropped. Use the entry API as demonstrated in load().

🐛 Proposed fix using entry API
     pub async fn update(&self, filter: &Filter, tenant_id: &Option<String>) {
         let mut s = self.0.write().await;
-        if let Some(filters) = s.get_mut(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)) {
-            filters.retain(|f| f.filter_id != filter.filter_id);
-            filters.push(filter.clone());
-        }
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let filters = s.entry(tenant.to_owned()).or_default();
+        filters.retain(|f| f.filter_id != filter.filter_id);
+        filters.push(filter.clone());
     }
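
A standalone sketch of the entry-API upsert, assuming a simplified `Filter` with only an id and a query string (the real struct has more fields) and a plain `HashMap` in place of the lock-guarded store:

```rust
use std::collections::HashMap;

/// Simplified stand-in for the real filter type.
#[derive(Clone, Debug, PartialEq)]
pub struct Filter {
    pub filter_id: String,
    pub query: String,
}

/// Inserts or replaces a filter, creating the tenant bucket on first use.
pub fn upsert_filter(store: &mut HashMap<String, Vec<Filter>>, tenant: &str, filter: &Filter) {
    // `entry(..).or_default()` creates an empty Vec when the tenant is new,
    // so the first filter for a tenant is never dropped.
    let filters = store.entry(tenant.to_owned()).or_default();
    filters.retain(|f| f.filter_id != filter.filter_id);
    filters.push(filter.clone());
}
```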
src/prism/logstream/mod.rs (5)

63-67: Remove temporary debug warns in dataset info flow.

These look like leftover debugging noise and were previously flagged.

🧹 Proposed cleanup
-    tracing::warn!("starting dataset info");
     let info = info?;
-    tracing::warn!("got info");
     let schema = schema?;
-    tracing::warn!("got schema");

109-114: Drop the “starting stats” warn.

This appears to be a debug artifact and was already noted previously.

🧹 Proposed cleanup
-    tracing::warn!("starting stats");
     let stats = stats::get_current_stats(stream_name, "json", tenant_id)

238-246: Remove stream-list debug logging.

This warning is noisy and was previously called out.

🧹 Proposed cleanup
-        tracing::warn!(get_datasets_streams=?self.streams);

275-290: Drop the temporary warnings in process_stream.

These were already flagged before and add noise without actionable signal.

🧹 Proposed cleanup
         if !self.is_authorized(&stream, &key) {
-            tracing::warn!("not authorized for datasets");
             return Ok(None);
         }
@@
         if !PARSEABLE.check_or_load_stream(&stream, tenant_id).await {
-            tracing::warn!("unable to load stream {stream} for tenant {tenant_id:?}");
             return Ok(None);
         }

63-71: Stop returning fake stats in Prism logstream info.

The real stats are discarded and replaced with defaults, which breaks dataset statistics.

🩹 Proposed fix
-    tracing::warn!(real_stats=?stats);
-    let stats = QueriedStats::default();
-    tracing::warn!("got FAKE stats");
+    let stats = stats?;
src/handlers/http/role.rs (2)

154-167: Persist before in-memory update + avoid unwrap() on lock.
The in-memory DEFAULT_ROLE is updated before persistence. If put_metadata fails, in-memory state diverges. Also, write().unwrap() can panic on a poisoned lock. This was already raised in a previous review.

🐛 Suggested fix
     let mut metadata = get_metadata(&tenant_id).await?;
     metadata.default_role = Some(name.clone());
-    DEFAULT_ROLE.write().unwrap().insert(
-        tenant_id.as_deref().unwrap_or(DEFAULT_TENANT).to_owned(),
-        Some(name),
-    );
-    // *DEFAULT_ROLE.lock().unwrap() = Some(name);
-    put_metadata(&metadata, &tenant_id).await?;
+    put_metadata(&metadata, &tenant_id).await?;
+    DEFAULT_ROLE
+        .write()
+        .expect("failed to acquire DEFAULT_ROLE write lock")
+        .insert(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT).to_owned(), Some(name));

173-190: Remove commented-out block.
The legacy commented-out match block should be removed to avoid stale code. This was already raised in a previous review.

🧹 Suggested cleanup
-    // let res = match DEFAULT_ROLE
-    //     .read()
-    //     .unwrap()
-    //     .get()
-    // {
-    //     Some(role) => serde_json::Value::String(role),
-    //     None => serde_json::Value::Null,
-    // };
src/query/stream_schema_provider.rs (3)

282-293: Same incomplete tenant-aware URL pattern as hot tier.

This duplicates the commented-out code pattern from get_hottier_exectuion_plan. Consider addressing both locations together when completing the multi-tenancy implementation.


84-96: TOCTOU race between table_exist and get_stream remains unaddressed.

There's a race window between the table_exist check (line 85) and the get_stream call (line 88). If the stream is deleted between these calls, the .expect(STREAM_EXISTS) will panic.

Consider handling the get_stream result directly:

Suggested fix
     async fn table(&self, name: &str) -> DataFusionResult<Option<Arc<dyn TableProvider>>> {
-        if self.table_exist(name) {
-            Ok(Some(Arc::new(StandardTableProvider {
-                schema: PARSEABLE
-                    .get_stream(name, &self.tenant_id)
-                    .expect(STREAM_EXISTS)
-                    .get_schema(),
-                stream: name.to_owned(),
-                tenant_id: self.tenant_id.clone(),
-            })))
-        } else {
-            Ok(None)
+        match PARSEABLE.get_stream(name, &self.tenant_id) {
+            Ok(stream) => Ok(Some(Arc::new(StandardTableProvider {
+                schema: stream.get_schema(),
+                stream: name.to_owned(),
+                tenant_id: self.tenant_id.clone(),
+            }))),
+            Err(_) => Ok(None),
         }
     }
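
The idea behind the suggested fix is to replace "check, then fetch" with a single lookup whose result carries both answers. A reduced sketch follows; the `Streams` registry and its string schema placeholder are hypothetical, not Parseable types.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical registry: stream name -> schema placeholder.
pub struct Streams {
    inner: RwLock<HashMap<String, String>>,
}

impl Streams {
    pub fn new() -> Self {
        Self { inner: RwLock::new(HashMap::new()) }
    }

    pub fn insert(&self, name: &str, schema: &str) {
        if let Ok(mut guard) = self.inner.write() {
            guard.insert(name.to_owned(), schema.to_owned());
        }
    }

    /// One lookup under one lock acquisition: there is no window in which
    /// the stream can disappear between an "exists" check and the fetch.
    pub fn schema_of(&self, name: &str) -> Option<String> {
        let guard = self.inner.read().ok()?;
        guard.get(name).cloned()
    }
}
```

A caller then maps `None` to "table not found" instead of panicking.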

637-646: Handle ObjectStoreUrl::parse error instead of unwrapping.

The unwrap() on line 646 can panic if the URL is malformed. Since scan() returns Result<_, DataFusionError>, propagate the error:

Suggested fix
         let object_store_url = glob_storage.store_url();
-
         self.create_parquet_physical_plan(
             &mut execution_plans,
-            ObjectStoreUrl::parse(object_store_url).unwrap(),
+            ObjectStoreUrl::parse(object_store_url)
+                .map_err(|e| DataFusionError::Plan(format!("Invalid object store URL: {e}")))?,
             partitioned_files,
src/query/mod.rs (1)

106-120: Call add_schema when new tenants are created at runtime to register them with the DataFusion SessionContext.

When a new tenant is added via add_tenant() in src/parseable/mod.rs, its schema must be registered with DataFusion. Currently, schemas are only registered during SessionContext initialization for existing tenants (lines 181–194 in src/query/mod.rs). New tenants added at runtime have no schema registration, which will cause queries on those tenants to fail.

Additionally, consider error handling instead of panics in add_schema (lines 109, 111, 119). Returning a Result would provide more graceful handling if the catalog is in an unexpected state.

♻️ Suggested improvement
-    pub fn add_schema(&self, tenant_id: &str) {
+    pub fn add_schema(&self, tenant_id: &str) -> Result<(), datafusion::error::DataFusionError> {
+        let mut ctx = self.session_context
             .write()
-            .expect("SessionContext should be writeable")
-            .catalog("datafusion")
-            .expect("Default catalog should be available")
-            .register_schema(
-                tenant_id,
-                Arc::new(GlobalSchemaProvider {
-                    storage: PARSEABLE.storage().get_object_store(),
-                    tenant_id: Some(tenant_id.to_owned()),
-                }),
-            )
-            .expect("Should be able to register new schema");
+            .map_err(|_| datafusion::error::DataFusionError::Internal("SessionContext lock poisoned".to_string()))?;
+        
+        let catalog = ctx.catalog("datafusion")
+            .ok_or_else(|| datafusion::error::DataFusionError::Internal("Default catalog not available".to_string()))?;
+        
+        catalog.register_schema(
+            tenant_id,
+            Arc::new(GlobalSchemaProvider {
+                storage: PARSEABLE.storage().get_object_store(),
+                tenant_id: Some(tenant_id.to_owned()),
+            }),
+        )?;
+        Ok(())
     }

Then call QUERY_SESSION.add_schema(&tenant_id)?; in add_tenant() after successfully registering the tenant.

src/users/dashboards.rs (1)

255-265: Creation becomes a no-op for new tenants.
If the tenant key is missing, get_mut returns None, so the dashboard is neither saved nor cached while returning Ok(()). Consider entry(...).or_default() before duplicate checks, and add/confirm a test that creates a dashboard for a tenant with no preloaded dashboards.

src/correlation.rs (3)

141-144: Correlation silently not added to memory when tenant bucket doesn't exist.

When creating the first correlation for a tenant (after restart or for a new tenant), the tenant bucket won't exist in the in-memory map. The correlation is persisted to metastore but never added to the cache, causing get_correlation to fail until the next full reload.

🐛 Proposed fix using entry API
         let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
         // Update in memory
-        if let Some(corrs) = self.write().await.get_mut(tenant) {
-            corrs.insert(correlation.id.to_owned(), correlation.clone());
-        }
+        self.write()
+            .await
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(correlation.id.to_owned(), correlation.clone());

176-183: Same issue: update() silently fails when tenant bucket doesn't exist.

Apply the same fix using the entry API to ensure the correlation is always inserted.

🐛 Proposed fix
         let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
         // Update in memory
-        if let Some(corrs) = self.write().await.get_mut(tenant) {
-            corrs.insert(
-                updated_correlation.id.to_owned(),
-                updated_correlation.clone(),
-            );
-        }
+        self.write()
+            .await
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(updated_correlation.id.to_owned(), updated_correlation.clone());

210-211: Critical bug: remove operates on wrong map level, deletes entire tenant's correlations or silently fails.

Line 211 calls self.write().await.remove(&correlation.id) which removes from the outer HashMap<String, CorrelationMap> using correlation.id as the key. The outer map is keyed by tenant_id, not correlation_id. This either silently fails (if no tenant_id matches the correlation_id) or corrupts data by deleting an unrelated tenant's entire correlation map.

🐛 Proposed fix
         // Delete from memory
-        self.write().await.remove(&correlation.id);
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        if let Some(corrs) = self.write().await.get_mut(tenant) {
+            corrs.remove(&correlation.id);
+        }
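
The two map levels can be made explicit in a small sketch. `CorrelationMap` here is a placeholder alias (id to payload string), and `remove_correlation` is a hypothetical helper; only the outer/inner keying mirrors the description above.

```rust
use std::collections::HashMap;

/// Placeholder inner map: correlation id -> payload.
type CorrelationMap = HashMap<String, String>;

/// Removes one correlation from one tenant's bucket and leaves every
/// other tenant (and the bucket itself) in place.
pub fn remove_correlation(
    all: &mut HashMap<String, CorrelationMap>,
    tenant: &str,
    id: &str,
) -> Option<String> {
    // Outer key is the tenant; inner key is the correlation id.
    all.get_mut(tenant)?.remove(id)
}
```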
src/storage/store_metadata.rs (2)

309-331: Ensure tenant staging directory exists before write.

put_staging_metadata writes into a tenant subdirectory (lines 313-318) but never creates it. For new tenants, OpenOptions::open at line 329 will fail with NotFound.

🐛 Suggested fix
 pub fn put_staging_metadata(meta: &StorageMetadata, tenant_id: &Option<String>) -> io::Result<()> {
     let mut staging_metadata = meta.clone();
     staging_metadata.server_mode = PARSEABLE.options.mode;
     staging_metadata.staging = PARSEABLE.options.staging_dir().to_path_buf();
     let path = if let Some(tenant_id) = tenant_id.as_ref() {
-        PARSEABLE
-            .options
-            .staging_dir()
-            .join(tenant_id)
-            .join(PARSEABLE_METADATA_FILE_NAME)
+        let tenant_dir = PARSEABLE.options.staging_dir().join(tenant_id);
+        create_dir_all(&tenant_dir)?;
+        tenant_dir.join(PARSEABLE_METADATA_FILE_NAME)
     } else {
         PARSEABLE
             .options
             .staging_dir()
             .join(PARSEABLE_METADATA_FILE_NAME)
     };

126-131: Tenant-specific staging metadata isn't read.

resolve_parseable_metadata now receives tenant_id but still calls get_staging_metadata() without it at line 130. This means non-default tenants will read the default tenant's staging metadata, potentially causing incorrect environment change detection.

🛠️ Suggested fix
-    let staging_metadata = get_staging_metadata()?;
+    let staging_metadata = get_staging_metadata(tenant_id)?;

And update get_staging_metadata to accept and use tenant_id:

-pub fn get_staging_metadata() -> io::Result<Option<StorageMetadata>> {
-    let path = RelativePathBuf::from(PARSEABLE_METADATA_FILE_NAME)
-        .to_path(PARSEABLE.options.staging_dir());
+pub fn get_staging_metadata(tenant_id: &Option<String>) -> io::Result<Option<StorageMetadata>> {
+    let path = if let Some(tenant_id) = tenant_id.as_ref() {
+        PARSEABLE.options.staging_dir().join(tenant_id).join(PARSEABLE_METADATA_FILE_NAME)
+    } else {
+        PARSEABLE.options.staging_dir().join(PARSEABLE_METADATA_FILE_NAME)
+    };
src/tenants/mod.rs (2)

86-105: Consider returning Ok(None) for missing tenants.

The single call site treats Err(TenantNotFound) and Ok(None) identically. Semantically, if a tenant doesn't exist, the action cannot be suspended, making Ok(None) more appropriate and simplifying error handling at call sites.

♻️ Suggested change
         } else {
-            Err(TenantNotFound(tenant_id.to_owned()))
+            Ok(None) // Tenant doesn't exist, so action is not suspended
         }

36-40: Potential data duplication causes state divergence.

TenantOverview stores suspended_services both as a direct field and within meta.suspended_services. When suspend_service or resume_service is called (lines 70-80), only the HashSet is updated, leaving meta.suspended_services stale. This means get_tenants() returns outdated suspension state.

♻️ Option 1: Sync meta when suspending/resuming
 pub fn suspend_service(&self, tenant_id: &str, service: Service) {
     if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
         tenant.suspended_services.insert(service.clone());
+        // Keep meta in sync
+        let services = tenant.meta.suspended_services.get_or_insert_with(HashSet::new);
+        services.insert(service);
     }
 }

 pub fn resume_service(&self, tenant_id: &str, service: Service) {
     if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
         tenant.suspended_services.remove(&service);
+        if let Some(services) = tenant.meta.suspended_services.as_mut() {
+            services.remove(&service);
+        }
     }
 }
♻️ Option 2: Remove duplication, derive from meta
 pub struct TenantOverview {
-    suspended_services: HashSet<Service>,
     meta: StorageMetadata,
 }

+impl TenantOverview {
+    fn suspended_services(&self) -> &HashSet<Service> {
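+        // HashSet::new() is not a const fn, so the empty set is built lazily.
+        static EMPTY: std::sync::LazyLock<HashSet<Service>> =
+            std::sync::LazyLock::new(HashSet::new);
+        self.meta.suspended_services.as_ref().unwrap_or(&*EMPTY)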
+    }
+}

Also applies to: 51-60, 70-80

src/handlers/http/logstream.rs (1)

444-444: Resolve or remove the "TODO tenants" marker.

This marker is ambiguous. If hot-tier is already tenant-aware (it receives tenant_id on lines 450, 456, 462, 470), remove it. Otherwise, track the missing work.

Would you like me to open an issue to track this?

src/handlers/http/modal/ingest/ingestor_role.rs (3)

45-51: Inverted tenant validation logic.

The condition req_tenant.ne(DEFAULT_TENANT) && (req_tenant_id.eq(&sync_req.tenant_id)) checks if request tenant is NOT the default AND EQUALS the payload tenant. Based on the error message, this should reject cross-tenant operations, meaning the second condition should check for a mismatch (ne), not equality (eq).

🐛 Proposed fix
-    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant_id.eq(&sync_req.tenant_id)) {
+    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant_id.ne(&sync_req.tenant_id)) {
         return Err(RoleError::Anyhow(anyhow::Error::msg(
             "non super-admin user trying to create role for another tenant",
         )));
     }
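
Pulling the condition into a named predicate makes the intended truth table easy to check. This is a sketch under assumptions: the literal value of `DEFAULT_TENANT` and the helper name `is_cross_tenant_request` are made up for illustration.

```rust
/// Assumed placeholder value; the real constant lives in the server crate.
pub const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// True when a non-default (non super-admin) tenant tries to act on a
/// different tenant than its own.
pub fn is_cross_tenant_request(
    req_tenant: &Option<String>,
    payload_tenant: &Option<String>,
) -> bool {
    let req = req_tenant.as_deref().unwrap_or(DEFAULT_TENANT);
    // Reject only on a mismatch; same-tenant requests must pass.
    req != DEFAULT_TENANT && req_tenant != payload_tenant
}
```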

58-58: Silently ignoring storage error.

The result of put_staging_metadata is discarded with let _ = .... If metadata persistence fails, the in-memory state will be updated but storage will be inconsistent. Propagate this error.

🐛 Proposed fix
-    let _ = storage::put_staging_metadata(&metadata, &sync_req.tenant_id);
+    storage::put_staging_metadata(&metadata, &sync_req.tenant_id)?;

59-66: Inconsistent tenant_id source for role map update.

tenant_id is derived from req_tenant_id (lines 59-62) but sync_req.tenant_id is used for metadata operations (lines 53, 58). Use a consistent tenant source throughout.

🐛 Proposed fix
-    let tenant_id = req_tenant_id
-        .as_deref()
-        .unwrap_or(DEFAULT_TENANT)
-        .to_owned();
+    let tenant_id = sync_req.tenant_id
+        .as_deref()
+        .unwrap_or(DEFAULT_TENANT)
+        .to_owned();
     mut_roles()
         .entry(tenant_id.clone())
src/rbac/map.rs (2)

318-328: Session expiration logic is inverted.

Line 328 uses expiry < &now which retains sessions where expiry is in the past (i.e., expired sessions). This should be expiry > &now to retain valid (non-expired) sessions.

🐛 Proposed fix
-        sessions.retain(|(_, expiry)| expiry < &now);
+        sessions.retain(|(_, expiry)| expiry > &now);
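
Since `retain` keeps the elements for which the closure returns `true`, the comparison must describe the sessions to keep. A minimal sketch, using `std::time::SystemTime` and a hypothetical `prune_expired` helper rather than the session types in `src/rbac/map.rs`:

```rust
use std::time::{Duration, SystemTime};

/// Drops every session whose expiry is at or before `now`.
pub fn prune_expired(sessions: &mut Vec<(String, SystemTime)>, now: SystemTime) {
    // Keep only sessions that expire strictly in the future.
    sessions.retain(|(_, expiry)| *expiry > now);
}
```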

444-451: Replace unwrap with error handling when creating HeaderValue.

HeaderValue::from_bytes(tenant.as_bytes()).unwrap() can panic if tenant contains invalid HTTP header bytes. Use HeaderValue::from_str() with proper error handling.

🛠️ Proposed fix
     pub fn mutate_request_with_tenant(&self, key: &SessionKey, req: &mut ServiceRequest) {
         if let Some((_, tenant, _)) = self.active_sessions.get(key) {
-            req.headers_mut().insert(
-                HeaderName::from_static("tenant"),
-                HeaderValue::from_bytes(tenant.as_bytes()).unwrap(),
-            );
+            if let Ok(value) = HeaderValue::from_str(tenant) {
+                req.headers_mut()
+                    .insert(HeaderName::from_static("tenant"), value);
+            }
         }
     }
src/migration/mod.rs (1)

499-507: Inconsistent filename: hardcoded string vs constant.

Lines 504 and 506 use hardcoded ".parseable.json" while get_staging_metadata (line 469) uses PARSEABLE_METADATA_FILE_NAME constant. This inconsistency could lead to path mismatches.

🐛 Proposed fix
     let path = if let Some(tenant) = tenant_id.as_ref() {
         config
             .options
             .staging_dir()
             .join(tenant)
-            .join(".parseable.json")
+            .join(PARSEABLE_METADATA_FILE_NAME)
     } else {
-        config.options.staging_dir().join(".parseable.json")
+        config.options.staging_dir().join(PARSEABLE_METADATA_FILE_NAME)
     };
src/storage/object_storage.rs (1)

1212-1244: Default-tenant path consistency issue remains in schema/stream.json helpers.
The DEFAULT_TENANT prefix is still included here, which diverges from stream_relative_path and prior layout expectations.

src/hottier.rs (3)

106-110: Stream/tenant exclusion logic still wrong.
Same issue as noted previously: the stream/tenant exclusion should skip only when both match, not via separate inequality checks.


595-602: Avoid panics in check_stream_hot_tier_exists.
hot_tier_file_path(...).unwrap() can still panic on path errors; the earlier recommendation to handle this defensively still applies.


794-799: Use internal-stream hot tier size for pstats.
create_pstats_hot_tier still uses MIN_STREAM_HOT_TIER_SIZE_BYTES instead of INTERNAL_STREAM_HOT_TIER_SIZE_BYTES, as previously flagged.

src/metastore/metastores/object_store_metastore.rs (6)

1005-1008: Manifest path construction inconsistency persists.
manifest_path("") is still used for put/delete, diverging from get_manifest_path.

Also applies to: 1022-1025


515-540: Dashboards overwrite per-user entries within a tenant.
Still inserting with the same tenant key for each user, losing earlier dashboards.


671-693: Potential panics from unwraps after filter migration.
migrate_v1_v2 output isn’t guaranteed to be an object; the unwraps remain risky.


1301-1316: Tenant prefix not stripped when listing streams.
The tenant segment can still be treated as a stream name.


317-323: get_alert_states still uses DEFAULT_TENANT for None.
This continues to point at DEFAULT_TENANT/.alerts/ instead of /.alerts/.


1167-1171: delete_tenant still attributes metrics to None.
delete_prefix is called with &None instead of the actual tenant id.

src/alerts/alert_types.rs (1)

90-101: Avoid panics when building the Authorization header.

HeaderValue::from_str(&header).unwrap() on line 96 can panic on malformed values. Convert this into a handled error instead.

🛠️ Proposed fix
-        let auth = if let Some(tenant) = self.tenant_id.as_ref()
-            && let Some(header) = TENANT_METADATA.get_global_query_auth(tenant)
-        {
-            let mut map = HeaderMap::new();
-            map.insert(
-                HeaderName::from_static("authorization"),
-                HeaderValue::from_str(&header).unwrap(),
-            );
-            Some(map)
-        } else {
-            None
-        };
+        let auth = if let Some(tenant) = self.tenant_id.as_ref()
+            && let Some(header) = TENANT_METADATA.get_global_query_auth(tenant)
+        {
+            let mut map = HeaderMap::new();
+            let value = HeaderValue::from_str(&header)
+                .map_err(|e| AlertError::CustomError(format!("Invalid auth header: {e}")))?;
+            map.insert(HeaderName::from_static("authorization"), value);
+            Some(map)
+        } else {
+            None
+        };
src/utils/mod.rs (2)

63-74: Avoid panic when BasicAuth omits a password.

Line 67 uses .unwrap() on the optional password. The HTTP Basic Authentication specification allows passwords to be omitted, so this will panic on such requests.

🛠️ Proposed fix
-        Users.mutate_request_with_basic_user(basic.user_id(), basic.password().unwrap(), req);
+        Users.mutate_request_with_basic_user(basic.user_id(), basic.password().unwrap_or(""), req);

122-126: Potential panic on malformed header value.

Using .unwrap() on to_str() can panic if the tenant header contains non-UTF8 bytes. The past review indicated this was addressed, but the code still shows the unwrap().

🛠️ Proposed fix
 pub fn get_tenant_id_from_request(req: &HttpRequest) -> Option<String> {
     req.headers()
         .get("tenant")
-        .map(|tenant_value| tenant_value.to_str().unwrap().to_owned())
+        .and_then(|tenant_value| tenant_value.to_str().ok().map(|s| s.to_owned()))
 }
src/handlers/http/middleware.rs (2)

196-209: Handle potential panic from HeaderValue::from_str().unwrap().

Line 201 uses HeaderValue::from_str(tid).unwrap() which can panic if tid contains characters not valid in HTTP headers.

🛠️ Proposed fix
-                    if let Ok((_, tenant)) = get_user_and_tenant_from_request(req.request())
-                        && let Some(tid) = tenant.as_ref()
-                    {
-                        req.headers_mut().insert(
-                            HeaderName::from_static("tenant"),
-                            HeaderValue::from_str(tid).unwrap(),
-                        );
-                        tenant
+                    match get_user_and_tenant_from_request(req.request()) {
+                        Ok((_, Some(tid))) => {
+                            if let Ok(header_val) = HeaderValue::from_str(&tid) {
+                                req.headers_mut().insert(
+                                    HeaderName::from_static("tenant"),
+                                    header_val,
+                                );
+                            }
+                            Some(tid)
+                        }
+                        Ok((_, None)) => {
+                            // Use DEFAULT_TENANT for non-tenant users
+                            if let Ok(header_val) = HeaderValue::from_str(DEFAULT_TENANT) {
+                                req.headers_mut().insert(
+                                    HeaderName::from_static("tenant"),
+                                    header_val,
+                                );
+                            }
+                            Some(DEFAULT_TENANT.to_string())
+                        }
+                        Err(_) => {
+                            header_error = Some(actix_web::Error::from(PostError::Header(
+                                crate::utils::header_parsing::ParseHeaderError::InvalidTenantId,
+                            )));
+                            None
+                        }
+                    }

375-387: Handle missing tenants explicitly in check_suspension.

The is_action_suspended method returns Err(TenantNotFound) when a tenant doesn't exist, but check_suspension silently treats this error the same as Ok(None) via the else branch, allowing the function to return Authorized for non-existent tenants.

🛠️ Proposed fix
 pub fn check_suspension(req: &HttpRequest, action: Action) -> rbac::Response {
     if let Some(tenant) = req.headers().get("tenant")
         && let Ok(tenant) = tenant.to_str()
     {
-        if let Ok(Some(suspension)) = TENANT_METADATA.is_action_suspended(tenant, &action) {
-            return rbac::Response::Suspended(suspension);
-        } else {
-            // tenant does not exist
+        match TENANT_METADATA.is_action_suspended(tenant, &action) {
+            Ok(Some(suspension)) => return rbac::Response::Suspended(suspension),
+            Ok(None) => {},
+            Err(_) => return rbac::Response::UnAuthorized,
         }
     }
     rbac::Response::Authorized
 }
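
The three outcomes of a `Result<Option<_>, _>` lookup can be spelled out with an exhaustive `match`, so the compiler flags any case that is later forgotten. The `Decision` enum, the `TenantNotFound` unit struct, and `decide` are simplified stand-ins for the types in the diff above, not the actual definitions.

```rust
/// Simplified stand-in for the RBAC response.
#[derive(Debug, PartialEq)]
pub enum Decision {
    Authorized,
    Suspended(String),
    Unauthorized,
}

/// Simplified stand-in for the lookup error.
#[derive(Debug)]
pub struct TenantNotFound;

/// Maps each lookup outcome to exactly one decision.
pub fn decide(lookup: Result<Option<String>, TenantNotFound>) -> Decision {
    match lookup {
        Ok(Some(reason)) => Decision::Suspended(reason),
        Ok(None) => Decision::Authorized,
        // An unknown tenant is rejected instead of falling through.
        Err(TenantNotFound) => Decision::Unauthorized,
    }
}
```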
src/handlers/http/modal/query/querier_rbac.rs (4)

62-70: Role existence check may silently pass for non-existent tenants.

When roles().get(...) returns None (tenant not in roles map), the if let Some(...) guard fails and the loop continues without adding the role to non_existent_roles. This means roles specified for a non-existent tenant silently pass validation.

🛠️ Proposed fix
     let mut non_existent_roles = Vec::new();
     for role in &user_roles {
-        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            && !tenant_roles.contains_key(role)
-        {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let role_exists = roles()
+            .get(tenant)
+            .is_some_and(|tenant_roles| tenant_roles.contains_key(role));
+        if !role_exists {
             non_existent_roles.push(role.clone());
         }
     }
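
To see why `is_some_and` closes the gap, here is a small self-contained sketch. The `RoleMap` type, the `missing_roles` helper, and the value of `DEFAULT_TENANT` are assumptions made for illustration; the real `roles()` map stores richer role definitions.

```rust
use std::collections::{HashMap, HashSet};

// Assumed placeholder; the real constant's value is not shown in this review.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Hypothetical shape: tenant id -> set of role names.
type RoleMap = HashMap<String, HashSet<String>>;

/// Returns every requested role that does not exist for the tenant.
/// A tenant that is absent from the map owns no roles, so all of its
/// requested roles are reported as missing.
fn missing_roles(roles: &RoleMap, tenant_id: Option<&str>, requested: &[&str]) -> Vec<String> {
    let tenant = tenant_id.unwrap_or(DEFAULT_TENANT);
    requested
        .iter()
        .filter(|role| {
            !roles
                .get(tenant)
                .is_some_and(|tenant_roles| tenant_roles.contains(**role))
        })
        .map(|role| role.to_string())
        .collect()
}
```

With the original `if let Some(...) && ...` guard, the unknown-tenant case would have produced an empty list here.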

198-204: Same role validation issue as in post_user.

Apply the same fix pattern here - the if let Some(...) guard silently passes when tenant doesn't exist.

🛠️ Proposed fix
     roles_to_add.iter().for_each(|r| {
-        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            && tenant_roles.get(r).is_none()
-        {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let role_exists = roles()
+            .get(tenant)
+            .is_some_and(|tenant_roles| tenant_roles.contains_key(r));
+        if !role_exists {
             non_existent_roles.push(r.clone());
         }
     });

260-266: Same role validation issue in remove_roles_from_user.

Apply the same fix pattern here.

🛠️ Proposed fix
     roles_to_remove.iter().for_each(|r| {
-        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            && tenant_roles.get(r).is_none()
-        {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let role_exists = roles()
+            .get(tenant)
+            .is_some_and(|tenant_roles| tenant_roles.contains_key(r));
+        if !role_exists {
             non_existent_roles.push(r.clone());
         }
     });

81-81: User created without tenant association.

user::User::new_basic is called with None for the tenant parameter (second argument), even though tenant_id is available from the request. This means new users won't be associated with their tenant.

-    let (user, password) = user::User::new_basic(username.clone(), None, false);
+    let (user, password) = user::User::new_basic(username.clone(), tenant_id.clone(), false);
src/handlers/http/modal/ingest/ingestor_rbac.rs (4)

51-57: Inverted tenant validation logic.

The condition checks for equality when it should check for inequality. The error message indicates this should prevent cross-tenant operations, but the current logic prevents same-tenant operations instead.

-        if req_tenant.ne(DEFAULT_TENANT)
-            && (req_tenant.eq(user.tenant.as_deref().unwrap_or(DEFAULT_TENANT)))
+        if req_tenant.ne(DEFAULT_TENANT)
+            && (req_tenant.ne(user.tenant.as_deref().unwrap_or(DEFAULT_TENANT)))
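
The intended predicate is easier to check when pulled out into a function. This is a sketch only; `is_cross_tenant` is a hypothetical helper and the value of `DEFAULT_TENANT` is assumed.

```rust
// Assumed placeholder; the real constant's value is not shown in this review.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// True when a request scoped to a non-default tenant targets a user
/// that belongs to a different tenant.
fn is_cross_tenant(req_tenant: &str, user_tenant: Option<&str>) -> bool {
    req_tenant != DEFAULT_TENANT && req_tenant != user_tenant.unwrap_or(DEFAULT_TENANT)
}
```

With `eq` in place of the second `!=`, the same-tenant case would return `true` and the cross-tenant case `false`, which is the inversion described above.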

60-60: Silently ignoring storage error.

The result of put_staging_metadata is discarded. If persistence fails, in-memory and storage states diverge. Same issue at lines 86, 133, 192, and 210.

-        let _ = storage::put_staging_metadata(&metadata, req_tenant_id);
+        storage::put_staging_metadata(&metadata, req_tenant_id)?;

108-114: Role validation silently passes for non-existent tenants.

If the tenant doesn't exist in the roles map, roles silently pass validation because the if let Some(...) guard fails.

     roles_to_add.iter().for_each(|r| {
-        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            && tenant_roles.get(r).is_none()
-        {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let role_exists = roles()
+            .get(tenant)
+            .is_some_and(|tenant_roles| tenant_roles.contains_key(r));
+        if !role_exists {
             non_existent_roles.push(r.clone());
         }
     });

154-160: Same role validation issue in remove_roles_from_user.

Apply the same fix pattern here to prevent roles from silently passing validation when the tenant doesn't exist.

src/alerts/target.rs (2)

61-69: Convert DEFAULT_TENANT string to None in Target.tenant field for consistency.

When loading targets, the code sets t.tenant = Some(tenant_id.clone()) for all tenants, including when tenant_id == DEFAULT_TENANT. This violates the codebase convention where None represents the default tenant.

             let inner = targets
                 .into_iter()
                 .map(|mut t| {
-                    t.tenant = Some(tenant_id.clone());
+                    t.tenant = if tenant_id == DEFAULT_TENANT {
+                        None
+                    } else {
+                        Some(tenant_id.clone())
+                    };
                     (t.id, t)
                 })
                 .collect();
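
One way to keep this convention in a single place is a small normalization helper. The sketch below is an illustration under assumptions: `normalize_tenant` is a hypothetical name, the value of `DEFAULT_TENANT` is a placeholder, and treating the empty string as the default tenant is an added assumption.

```rust
// Assumed placeholder; the real constant's value is not shown in this review.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Maps the empty string and the default tenant name to `None`,
/// and every other tenant id to `Some(id)`.
fn normalize_tenant(tenant_id: &str) -> Option<String> {
    if tenant_id.is_empty() || tenant_id == DEFAULT_TENANT {
        None
    } else {
        Some(tenant_id.to_owned())
    }
}
```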

75-86: Override target.tenant with authenticated tenant before persisting.

The update method persists the target using target.tenant directly from the deserialized object. Since Target.tenant is publicly deserialized, a client can inject an arbitrary tenant value in the JSON payload, bypassing tenant validation. The handler should set target.tenant from the authenticated request context before calling this method.
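
A minimal sketch of that override, assuming a simplified `Target` struct and a hypothetical `bind_to_authenticated_tenant` helper (neither is the PR's actual type or function):

```rust
/// Simplified, hypothetical version of the deserialized target.
#[derive(Debug, Clone, PartialEq)]
struct Target {
    id: u32,
    name: String,
    /// Client-controlled when deserialized from a request body.
    tenant: Option<String>,
}

/// Discards whatever tenant the payload carried and replaces it with
/// the tenant resolved from the authenticated request context.
fn bind_to_authenticated_tenant(mut target: Target, authenticated_tenant: Option<&str>) -> Target {
    target.tenant = authenticated_tenant.map(str::to_owned);
    target
}
```

Calling such a helper in the handler, before the target reaches `update`, means the persisted tenant never depends on the request body.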

src/handlers/http/oidc.rs (3)

112-116: Use get_tenant_id_from_request() for tenant selection in basic-auth flow.

For SessionKey::BasicAuth, get_tenant_id_from_key() returns None because basic credentials have no pre-established session mapping. This causes Users.get_user(&username, &None) to default to DEFAULT_TENANT regardless of the actual tenant context.


141-159: Intra-cluster login sync should treat non-2xx as failure.

reqwest::send() succeeds on non-2xx responses. Call .error_for_status() on the response and log per-node failures.

-                        INTRA_CLUSTER_CLIENT
+                        let resp = INTRA_CLUSTER_CLIENT
                             .post(url)
                             .header(header::AUTHORIZATION, node.token)
                             .header(header::CONTENT_TYPE, "application/json")
                             .json(&json!(...))
                             .send()
                             .await?;
+                        resp.error_for_status()?;
                         Ok::<(), anyhow::Error>(())

325-326: Address incomplete tenant implementation in OIDC user creation.

The comment "LET TENANT BE NONE FOR NOW!!!" indicates new OAuth users are created without tenant association, despite tenant_id being extracted from the request. This creates a multi-tenant isolation gap.

Replace None with tenant_id or add a TODO with a tracking issue reference.

src/rbac/mod.rs (1)

291-310: Non-deterministic tenant selection for basic-auth with duplicate credentials.

mutate_request_with_basic_user uses par_iter().find_any() which returns a non-deterministic result if identical credentials exist across tenants. Either enforce username uniqueness across tenants or use deterministic iteration.
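
As an illustration of the deterministic-iteration option, the sketch below walks tenants in sorted order using a `BTreeMap`. The `Users` shape, the `resolve_tenant` helper, and the comparison of stored hash strings are all assumptions for the example, not the project's data model.

```rust
use std::collections::BTreeMap;

/// Hypothetical shape: tenant id -> (username -> password hash).
/// `BTreeMap` iterates in key order, so the result is reproducible.
type Users = BTreeMap<String, BTreeMap<String, String>>;

/// Returns the first tenant, in sorted order, whose user matches.
fn resolve_tenant<'a>(users: &'a Users, username: &str, password_hash: &str) -> Option<&'a str> {
    users
        .iter()
        .find(|(_, tenant_users)| {
            tenant_users
                .get(username)
                .is_some_and(|stored| stored.as_str() == password_hash)
        })
        .map(|(tenant, _)| tenant.as_str())
}
```

Deterministic ordering only makes the outcome repeatable. If identical credentials in two tenants should never resolve silently, enforcing uniqueness remains the stricter option.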

src/handlers/http/query.rs (2)

118-125: Tenant derived from header is spoofable.

Using get_tenant_id_from_request makes tenant scoping spoofable via the tenant header. Consider deriving tenant from the authenticated session using get_user_and_tenant_from_request (see src/utils/mod.rs for get_tenant_id_from_key).


370-380: Same tenant spoofing issue in get_counts.

Apply the same fix to derive tenant from the authenticated session rather than the request header.

src/handlers/http/cluster/mod.rs (1)

1876-1917: Propagate tenant header and avoid unwrap() for querier token.

Line 1913 uses HeaderValue::from_str(...).unwrap() which can panic on invalid header characters, and tenant_id is never forwarded to the querier. This can both crash and break tenant isolation for forwarded queries.

🔧 Suggested fix
     let mut map = reqwest::header::HeaderMap::new();

+    if let Some(tenant) = tenant_id.as_deref() {
+        map.insert(
+            reqwest::header::HeaderName::from_static("tenant"),
+            reqwest::header::HeaderValue::from_str(tenant)
+                .map_err(QueryError::from)?,
+        );
+    }
+
     if let Some(auth) = auth_token {
         // always basic auth
         for (key, value) in auth.iter() {
@@
         }
     } else {
-        map.insert(
-            reqwest::header::AUTHORIZATION,
-            reqwest::header::HeaderValue::from_str(&querier.token).unwrap(),
-        );
+        match reqwest::header::HeaderValue::from_str(&querier.token) {
+            Ok(val) => { map.insert(reqwest::header::AUTHORIZATION, val); }
+            Err(e) => {
+                mark_querier_available(&domain_name).await;
+                return Err(QueryError::Anyhow(anyhow::anyhow!("Invalid token header: {e}")));
+            }
+        }
     };
src/rbac/user.rs (1)

385-393: Use each user's tenant when invalidating sessions.

Line 392 uses the caller-provided tenant_id for all users; if the group ever contains mixed-tenant users (or corrupted data), the wrong tenant’s sessions can be invalidated. Prefer each user’s tenant_id.

🔧 Suggested fix
-        for group_user in &self.users {
-            mut_sessions().remove_user(group_user.userid(), tenant_id);
-        }
+        for group_user in &self.users {
+            mut_sessions().remove_user(
+                group_user.userid(),
+                group_user.tenant_id.as_deref().unwrap_or(DEFAULT_TENANT),
+            );
+        }
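
The effect of using each user's own tenant can be shown with a toy session store. In this sketch `Sessions`, `GroupUser`, and `invalidate_group_sessions` are hypothetical simplifications, and the value of `DEFAULT_TENANT` is assumed.

```rust
use std::collections::HashSet;

// Assumed placeholder; the real constant's value is not shown in this review.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Hypothetical group member.
struct GroupUser {
    userid: String,
    tenant_id: Option<String>,
}

/// Hypothetical session store keyed by (tenant, userid).
type Sessions = HashSet<(String, String)>;

/// Removes sessions using each member's own tenant rather than a
/// single caller-provided tenant.
fn invalidate_group_sessions(sessions: &mut Sessions, users: &[GroupUser]) {
    for user in users {
        let tenant = user.tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
        sessions.remove(&(tenant.to_owned(), user.userid.clone()));
    }
}
```
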
src/handlers/http/rbac.rs (1)

70-75: Tenant derived from request header is spoofable — use authenticated session instead.

Line 73 reads tenant_id from a header, which any authenticated user can spoof. Tenant should be derived from the authenticated session/claims. This affects all RBAC handlers using get_tenant_id_from_request.

src/parseable/mod.rs (4)

1072-1087: TOCTOU race in add_tenant.

Line 1081 checks for the tenant under a read lock and then inserts under a separate write lock. Two concurrent callers can both pass the check before either one writes, so the same tenant can be added twice. Use a single write lock for check-and-insert.

🔒 Suggested fix
-        if self.tenants.read().unwrap().contains(&tenant_id) {
-            return Err(anyhow::Error::msg(format!(
-                "Tenant with id- {tenant_id} already exists"
-            )));
-        } else {
-            self.tenants.write().unwrap().push(tenant_id.clone());
-            TENANT_METADATA.insert_tenant(tenant_id, tenant_meta);
-        }
+        let mut tenants = self.tenants.write().unwrap();
+        if tenants.contains(&tenant_id) {
+            return Err(anyhow::Error::msg(format!(
+                "Tenant with id- {tenant_id} already exists"
+            )));
+        }
+        tenants.push(tenant_id.clone());
+        TENANT_METADATA.insert_tenant(tenant_id, tenant_meta);
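
The single-lock pattern can be reduced to a few lines. This is a sketch with a hypothetical `Tenants` wrapper and a `String` error; the metadata insert from the real method is omitted.

```rust
use std::sync::RwLock;

/// Hypothetical holder for the tenant id list.
struct Tenants {
    ids: RwLock<Vec<String>>,
}

impl Tenants {
    /// Check and insert under one write guard so no other writer can
    /// slip in between the two steps.
    fn add_tenant(&self, tenant_id: &str) -> Result<(), String> {
        let mut ids = self.ids.write().expect("tenants lock poisoned");
        if ids.iter().any(|id| id.as_str() == tenant_id) {
            return Err(format!("Tenant with id {tenant_id} already exists"));
        }
        ids.push(tenant_id.to_owned());
        Ok(())
    }
}
```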

1131-1157: delete_tenant leaves stale tenant references and data behind.

Line 1156 removes TENANT_METADATA but never removes the tenant from self.tenants, and it doesn’t delete the tenant’s object-store prefix. This leaves stale tenants in memory and residual data on disk (privacy/compliance risk).

Consider:

  • Removing the tenant from self.tenants.
  • Making this async and calling metastore.delete_tenant(tenant_id) before clearing in-memory state.
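
The in-memory half of that cleanup might look like the sketch below. `TenantState` and its fields are hypothetical, and the object-store deletion is deliberately left out because its API is not shown here.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical in-memory tenant state.
struct TenantState {
    ids: RwLock<Vec<String>>,
    metadata: RwLock<HashMap<String, String>>,
}

impl TenantState {
    /// Removes the tenant from both the id list and the metadata map.
    /// Returns true if the tenant was present in the id list.
    fn delete_tenant(&self, tenant_id: &str) -> bool {
        let mut ids = self.ids.write().expect("tenants lock poisoned");
        let before = ids.len();
        ids.retain(|id| id.as_str() != tenant_id);
        self.metadata
            .write()
            .expect("metadata lock poisoned")
            .remove(tenant_id);
        ids.len() != before
    }
}
```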

1160-1196: load_tenants has empty branch and silent lock failure.

The else if !is_multi_tenant {} branch is a no-op, and the write-lock failure path returns Ok(None) (masking poisoned lock issues).

🔧 Suggested fix
-            } else if !is_multi_tenant {
-            } else {
+            } else if is_multi_tenant {
                 return Err(anyhow::Error::msg(format!(
                     "Found invalid tenant directory with multi-tenant mode- {tenant_id}.\nExiting."
                 )));
             }
+            // In single-tenant mode, directories without metadata are ignored
         }
 
-        if let Ok(mut t) = self.tenants.write() {
-            t.extend(dirs);
-            Ok(Some(()))
-        } else {
-            Ok(None)
-        }
+        let mut t = self.tenants.write().expect("tenants lock poisoned");
+        t.extend(dirs);
+        Ok(Some(()))

1198-1207: list_tenants silently swallows lock poisoning.

Returning None when the read lock fails masks poisoned-lock failures and hides real issues.

🔧 Suggested fix
-        if let Ok(t) = self.tenants.as_ref().read()
-            && !t.is_empty()
-        {
-            let t = t.clone();
-            Some(t)
-        } else {
-            None
-        }
+        let t = self.tenants.read().expect("tenants lock poisoned");
+        if t.is_empty() { None } else { Some(t.clone()) }
src/catalog/mod.rs (1)

563-576: Empty string tenant root creates inconsistent path format.

The partition_path function uses an empty string as the fallback when tenant_id is None, which creates paths like ["", stream, "date=..."]. This differs from the rest of the codebase which uses DEFAULT_TENANT and may cause path matching issues.

The past review suggested conditionally building the path to omit the root component when no tenant exists:

Proposed fix - conditionally include tenant root
 pub fn partition_path(
     stream: &str,
     lower_bound: DateTime<Utc>,
     upper_bound: DateTime<Utc>,
     tenant_id: &Option<String>,
 ) -> RelativePathBuf {
-    let root = tenant_id.as_deref().unwrap_or("");
     let lower = lower_bound.date_naive().format("%Y-%m-%d").to_string();
     let upper = upper_bound.date_naive().format("%Y-%m-%d").to_string();
-    if lower == upper {
-        RelativePathBuf::from_iter([root, stream, &format!("date={lower}")])
+    let date_segment = if lower == upper {
+        format!("date={lower}")
     } else {
-        RelativePathBuf::from_iter([root, stream, &format!("date={lower}:{upper}")])
+        format!("date={lower}:{upper}")
+    };
+    if let Some(tenant) = tenant_id.as_ref() {
+        RelativePathBuf::from_iter([tenant.as_str(), stream, &date_segment])
+    } else {
+        RelativePathBuf::from_iter([stream, &date_segment])
     }
 }
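
The conditional path construction can be tried out without the `relative_path` or `chrono` crates. The sketch below is a simplified analogue: it takes pre-formatted date strings and returns a `/`-joined `String`, both of which are assumptions made to keep the example dependency-free.

```rust
/// Builds `[tenant/]stream/date=...`, omitting the tenant component
/// entirely when no tenant is set (instead of emitting an empty one).
fn partition_path(
    stream: &str,
    lower_date: &str,
    upper_date: &str,
    tenant_id: Option<&str>,
) -> String {
    let date_segment = if lower_date == upper_date {
        format!("date={lower_date}")
    } else {
        format!("date={lower_date}:{upper_date}")
    };
    let mut parts: Vec<&str> = Vec::new();
    if let Some(tenant) = tenant_id {
        parts.push(tenant);
    }
    parts.push(stream);
    parts.push(&date_segment);
    parts.join("/")
}
```
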
src/alerts/mod.rs (3)

1377-1379: Apply same fix for tenant bucket creation in update_notification_state().

Same issue as update_state() - the get_mut(tenant) pattern can silently drop writes.

Proposed fix
-        if let Some(alerts) = write_access.get_mut(tenant) {
-            alerts.insert(*alert.get_id(), alert.clone_box());
-        }
+        write_access
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(*alert.get_id(), alert.clone_box());

1331-1339: Don't silently drop state updates when tenant bucket doesn't exist.

The write-back block in update_state() uses get_mut(tenant) which silently skips the insert if the tenant bucket doesn't exist. This is inconsistent with the update() method which correctly uses entry().or_default().

Proposed fix
         {
             let mut write_access = self.alerts.write().await;

             let tenant = alert.get_tenant_id().as_deref().unwrap_or(DEFAULT_TENANT);
-            if let Some(alerts) = write_access.get_mut(tenant) {
-                alerts.insert(*alert.get_id(), alert.clone_box());
-            }
-            // write_access.insert(*alert.get_id(), alert.clone_box());
+            write_access
+                .entry(tenant.to_owned())
+                .or_default()
+                .insert(*alert.get_id(), alert.clone_box());
         }
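
The difference between the two patterns shows up as soon as the tenant bucket is missing. The following sketch uses a hypothetical `Alerts` map with `String` states in place of boxed alert objects.

```rust
use std::collections::HashMap;

/// Hypothetical shape: tenant id -> (alert id -> state).
type Alerts = HashMap<String, HashMap<u64, String>>;

/// Mirrors the `get_mut` pattern: the write is skipped when the
/// tenant bucket does not exist yet.
fn upsert_lossy(alerts: &mut Alerts, tenant: &str, id: u64, state: &str) {
    if let Some(bucket) = alerts.get_mut(tenant) {
        bucket.insert(id, state.to_owned());
    }
}

/// Mirrors the `entry().or_default()` pattern: the bucket is created
/// on demand, so the write always lands.
fn upsert(alerts: &mut Alerts, tenant: &str, id: u64, state: &str) {
    alerts
        .entry(tenant.to_owned())
        .or_default()
        .insert(id, state.to_owned());
}
```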

1045-1050: Verify tenant normalization handles DEFAULT_TENANT consistently.

The current logic only checks for empty string when determining tenant context. If tenant_id equals DEFAULT_TENANT, it should also be treated as None to maintain consistent storage semantics throughout the codebase.

-            let tenant = if tenant_id.is_empty() {
-                &None
-            } else {
-                &Some(tenant_id.clone())
+            let tenant = if tenant_id.is_empty() || tenant_id == DEFAULT_TENANT {
+                None
+            } else {
+                Some(tenant_id.clone())
             };

Comment on lines 274 to +276
     pub async fn validate(&self, session_key: &SessionKey) -> Result<(), CorrelationError> {
-        let ctx = &QUERY_SESSION;
+        let ctx = &QUERY_SESSION.get_ctx();
+        let tenant_id = get_tenant_id_from_key(session_key);

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's find where QUERY_SESSION is defined
rg -n "QUERY_SESSION\s*=" --type rust -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Search for the get_ctx() implementation
rg -n "fn get_ctx" --type rust -A 5

Repository: parseablehq/parseable

Length of output: 384


🏁 Script executed:

# Look at the validate() method in detail to see tenant handling
rg -n "pub async fn validate" src/correlation.rs -A 70 | head -100

Repository: parseablehq/parseable

Length of output: 2869


🏁 Script executed:

# Check how ctx.table() is implemented
rg -n "fn table\(" --type rust -A 5 | head -50

Repository: parseablehq/parseable

Length of output: 611


🏁 Script executed:

# Find QUERY_SESSION initialization to see which schema provider it uses
rg -n "QUERY_SESSION\s*:" --type rust -B 5 -A 10 src/query/mod.rs

Repository: parseablehq/parseable

Length of output: 1064


🏁 Script executed:

# Check what schema provider QUERY_SESSION is initialized with
rg -n "SessionContext::new\|SchemaProvider" src/query/mod.rs -B 3 -A 5

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look for the schema provider setup and initialization
rg -n "pub struct.*SessionContext\|impl.*SessionContext" src/query/mod.rs -A 15 | head -40

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Search for global schema provider to see how it's implemented
rg -n "GlobalSchemaProvider\|global.*schema" --type rust -B 2 -A 5 src/

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Find the create_session_context method implementation
rg -n "fn create_session_context" --type rust -A 20 src/query/mod.rs

Repository: parseablehq/parseable

Length of output: 1200


🏁 Script executed:

# Also check if there are other files in src/query/ that might have this method
fd . src/query --type f -name "*.rs" | head -20

Repository: parseablehq/parseable

Length of output: 295


🏁 Script executed:

# Search more broadly for create_session_context
rg -n "create_session_context" --type rust -B 2 -A 15

Repository: parseablehq/parseable

Length of output: 2799


🏁 Script executed:

# Let's see more of the create_session_context to understand default schema setup
rg -n "create_session_context" src/query/mod.rs -A 40 | head -60

Repository: parseablehq/parseable

Length of output: 2458


🏁 Script executed:

# Check how SessionContext.table() is used elsewhere in the codebase
rg -n "ctx\.table\(" --type rust -B 3 -A 3 | head -80

Repository: parseablehq/parseable

Length of output: 508


🏁 Script executed:

# Search for DEFAULT_TENANT usage and how schemas are selected
rg -n "DEFAULT_TENANT" --type rust -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 50377


🏁 Script executed:

# Search for how SessionContext handles multiple schemas and table lookup
rg -n "set_default_schema\|current_schema\|\.table\(" src/query/mod.rs -B 2 -A 5 | head -60

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if there's documentation or usage of SessionContext schema selection
rg -n "SessionContext" src/correlation.rs -B 3 -A 5

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look for how the context is used with specific schemas/tables
rg -n "catalog\(" src/query/mod.rs -B 2 -A 3 | head -40

Repository: parseablehq/parseable

Length of output: 899


🏁 Script executed:

# Check if there's any schema resolution or selection in the validate method
rg -n "fn validate" src/correlation.rs -A 70 | grep -E "schema|catalog|tenant" -i

Repository: parseablehq/parseable

Length of output: 206


🏁 Script executed:

# Search for how tables are looked up in stream_schema_provider
rg -n "fn table" src/query/stream_schema_provider.rs -B 2 -A 15 | head -60

Repository: parseablehq/parseable

Length of output: 1785


🏁 Script executed:

# Look for table resolution or search logic
rg -n "table_exist\|get_table" src/query/stream_schema_provider.rs -B 2 -A 8

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if there's any handling of qualified table names (schema.table)
rg -n "split\|qualify" src/correlation.rs -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look for other usages of ctx.table to see if there's a pattern
rg -n "\.table\(" src/ --type rust -B 2 -A 2 | grep -v test | head -80

Repository: parseablehq/parseable

Length of output: 626


🏁 Script executed:

# Let's check how DataFusion resolves unqualified table names
# First, check if table names are qualified with schema
rg -n "select_columns\|table_name" src/correlation.rs -B 2 -A 2 | head -40

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if there's schema qualification in table lookups
rg -n "catalog\|schema" src/correlation.rs

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look for any qualified table access pattern (schema.table)
rg -n '"\w+\.\w+"' src/correlation.rs

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check how other code handles multi-tenant table access with QUERY_SESSION
rg -n "QUERY_SESSION" src/ --type rust -B 3 -A 3 | grep -E "tenant|schema|table" -i -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 4297


🏁 Script executed:

# Check how final_logical_plan uses tenant_id to ensure tenant safety
rg -n "final_logical_plan" src/query/mod.rs -A 30 | head -80

Repository: parseablehq/parseable

Length of output: 2927


🏁 Script executed:

# Look for create_streams_for_distributed to see how other code handles tenant context
rg -n "create_streams_for_distributed\|create_stream" src/ --type rust -B 2 -A 8 | head -100

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if there's a way to set current schema on SessionContext before calling table()
rg -n "use_schema\|set_schema\|use_catalog" src/ --type rust

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check how GlobalSchemaProvider determines which tenant's tables to return
rg -n "struct GlobalSchemaProvider" src/ --type rust -A 20

Repository: parseablehq/parseable

Length of output: 1537


🏁 Script executed:

# Look for how PARSEABLE.get_stream works with tenant_id
rg -n "fn get_stream" src/parseable.rs -A 10 | head -40

Repository: parseablehq/parseable

Length of output: 121


🏁 Script executed:

# Check if there's any explicit schema selection before calling ctx.table()
rg -n "validate\|\.table\(" src/correlation.rs -B 5 -A 10 | grep -A 20 "fn validate"

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look for how other code using QUERY_SESSION ensures tenant safety
rg -n "execute_logical_plan\|tenant_id" src/query/mod.rs -B 2 -A 5 | head -100

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Find where parseable module or streams are defined
find src -name "*.rs" -type f | xargs grep -l "struct Parseable\|impl Parseable" | head -5

Repository: parseablehq/parseable

Length of output: 226


🏁 Script executed:

# Check how PARSEABLE.streams.list and PARSEABLE.get_stream work
rg -n "pub streams:" src/ --type rust -B 2 -A 5

Repository: parseablehq/parseable

Length of output: 1590


🏁 Script executed:

# Look for how DataFusion resolves unqualified table names when multiple schemas exist
rg -n "register_schema\|table(" src/query/mod.rs -B 2 -A 5 | head -60

Repository: parseablehq/parseable

Length of output: 161


🏁 Script executed:

# Verify if validate() is always called in single-tenant context or multi-tenant
rg -n "\.validate\(" src/ --type rust -B 3 -A 3

Repository: parseablehq/parseable

Length of output: 2975


🏁 Script executed:

# Let's look at how DataFusion SessionContext handles table lookup with multiple schemas
# Check if there's a default schema or search order
rg -n "use_schema\|set_default_schema" src/ --type rust

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# See how streams are stored and if table names are unique per tenant or global
rg -n "fn list" src/parseable/streams.rs -A 10 | head -40

Repository: parseablehq/parseable

Length of output: 1112


🏁 Script executed:

# Look at the broader context of validation to see if tenant context is set elsewhere
cat -n src/correlation.rs | sed -n '270,280p'

Repository: parseablehq/parseable

Length of output: 656


🏁 Script executed:

# Check how Alert validation works (since it also validates)
rg -n "impl.*Alert\|pub async fn validate" src/alerts/mod.rs -A 30 | head -80

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if DataFusion SessionContext has a way to set current schema or search path
rg -n "set_default_schema\|use_schema\|current_schema\|search_path" src/ --type rust

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look at how ctx.table() would resolve unqualified table names - check schema registration order
rg -n "register_schema" src/query/mod.rs -B 3 -A 3

Repository: parseablehq/parseable

Length of output: 1433


🏁 Script executed:

# Check if table names are truly unique globally or if they could collide across tenants
rg -n "get_stream\|streams.get" src/parseable/streams.rs -A 5 | head -60

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Look at what happens if a table is looked up without specifying schema explicitly
rg -n "\.table\(" src/ --type rust -B 5 -A 2 | grep -E "tenant|schema|ctx" -B 3 -A 2 | head -80

Repository: parseablehq/parseable

Length of output: 452


🏁 Script executed:

# Let me check the exact behavior by looking at how other parts use tenant_id with table access
rg -n "QUERY_SESSION_STATE\|execute_logical_plan" src/correlation.rs -B 5 -A 10

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if there's any mechanism that sets the default schema before table lookup
rg -n "execute\|execute_logical_plan" src/correlation.rs -B 5 -A 15

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Verify if table names must be qualified when multiple schemas exist
rg -n "schema\.\|\".\w+\"\." src/correlation.rs

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check how Query.execute() handles tenant_id to see if there's a pattern we should follow
rg -n "pub async fn execute" src/query/mod.rs -A 20 | head -50

Repository: parseablehq/parseable

Length of output: 1826


🏁 Script executed:

# Look at the full execute implementation to see how tenant_id is used
rg -n "pub async fn execute" src/query/mod.rs -A 50 | head -80

Repository: parseablehq/parseable

Length of output: 2917


🏁 Script executed:

# Check how final_logical_plan uses tenant_id
rg -n "fn final_logical_plan" src/query/mod.rs -A 35 | head -60

Repository: parseablehq/parseable

Length of output: 1796


🏁 Script executed:

# Look at the transform function to see how tenant_id is applied to the logical plan
rg -n "fn transform" src/query/mod.rs -B 2 -A 20 | head -80

Repository: parseablehq/parseable

Length of output: 1103


🏁 Script executed:

# Check if there's a use_schema or set default schema mechanism by looking at how logical plans are created
rg -n "LogicalPlan\|use_schema" src/query/mod.rs -B 2 -A 5 | head -100

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Now let's understand the critical issue - check if DataFusion's default table resolution respects tenant boundaries
# Look for how table names are resolved when multiple schemas exist
rg -n "table_name\|TableScan" src/query/mod.rs -B 2 -A 5 | head -60

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Check if there's a schema prefix used when accessing tables
rg -n "table_name\|qualified" src/correlation.rs -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 47


🏁 Script executed:

# Verify if the validate() method is supposed to use a different approach for multi-tenant
rg -n "validate" src/ --type rust -B 5 -A 20 | grep -E "tenant|schema|QUERY_SESSION|ctx" -B 2 -A 2 | head -80

Repository: parseablehq/parseable

Length of output: 4999


🏁 Script executed:

# Check if there's any test or documentation about expected tenant isolation
find src -name "*test*" -o -name "*spec*" | xargs grep -l "validate\|correlation\|tenant" 2>/dev/null | head -5

Repository: parseablehq/parseable

Length of output: 47


Use tenant-scoped schema when validating table configurations.

The ctx.table() call at line 314 searches unqualified table names across all registered schemas in the SessionContext. When multiple tenants exist with tables sharing the same name, DataFusion could return a table from the wrong tenant's schema, bypassing the authorization check at line 308.

Use the same pattern as Query.execute() (see src/query/mod.rs line 806) by passing tenant_id to ensure tables are resolved within the correct tenant context. Alternatively, explicitly select the tenant's schema before calling table().

🤖 Prompt for AI Agents
In `@src/correlation.rs` around lines 274 - 276, In validate(), ensure table
resolution is tenant-scoped: when obtaining the SessionContext via
QUERY_SESSION.get_ctx() and before calling ctx.table(), pass the tenant_id (from
get_tenant_id_from_key(session_key)) or explicitly select the tenant schema so
DataFusion resolves the table inside the tenant's schema (same pattern used in
Query.execute()). Update the ctx.table(...) invocation in validate() to use the
tenant-scoped lookup (or set the active schema for ctx to tenant_id) so
authorization at line 308 cannot be bypassed by cross-tenant table name
collisions.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (11)
src/prism/logstream/mod.rs (1)

399-399: Verify status code for TimeParseError.

NOT_FOUND seems semantically incorrect for a time parse error. Typically, parse errors map to BAD_REQUEST (400) since they indicate malformed input rather than a missing resource.

-            PrismLogstreamError::TimeParse(_) => StatusCode::NOT_FOUND,
+            PrismLogstreamError::TimeParse(_) => StatusCode::BAD_REQUEST,
src/metastore/metastores/object_store_metastore.rs (2)

903-917: Bug: Tenant prefix may be incorrectly treated as a date.

When tenant_id is provided, path.parts() at line 913 extracts all path components including the tenant name. The filter at line 914 only excludes stream_name and STREAM_ROOT_DIRECTORY, but not the tenant prefix. This causes the tenant name to be treated as a date in the loop at line 918.

             let dates = resp
                 .common_prefixes
                 .iter()
                 .flat_map(|path| path.parts())
-                .filter(|name| name.as_ref() != stream_name && name.as_ref() != STREAM_ROOT_DIRECTORY)
+                .filter(|name| {
+                    let n = name.as_ref();
+                    n != stream_name
+                        && n != STREAM_ROOT_DIRECTORY
+                        && tenant_id.as_ref().map_or(true, |t| n != t.as_str())
+                })
                 .map(|name| name.as_ref().to_string())
                 .collect::<Vec<_>>();
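
The filtering logic can be exercised on plain strings. In this sketch the prefixes are `/`-separated `&str` values rather than object-store paths, `extract_dates` is a hypothetical helper, and the value of `STREAM_ROOT_DIRECTORY` is assumed.

```rust
// Assumed placeholder; the real constant's value is not shown in this review.
const STREAM_ROOT_DIRECTORY: &str = ".stream";

/// Splits each prefix into components and keeps only those that are
/// neither the stream name, the stream root directory, nor the tenant.
fn extract_dates(prefixes: &[&str], stream_name: &str, tenant_id: Option<&str>) -> Vec<String> {
    prefixes
        .iter()
        .flat_map(|path| path.split('/'))
        .filter(|part| !part.is_empty())
        .filter(|part| {
            *part != stream_name
                && *part != STREAM_ROOT_DIRECTORY
                && tenant_id.map_or(true, |t| *part != t)
        })
        .map(str::to_owned)
        .collect()
}
```

Without the tenant check, `"acme"` in the first test prefix below would be returned as if it were a date directory.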

576-602: get_chats is not tenant-aware and creates a data isolation inconsistency.

The get_chats method doesn't iterate over tenants using PARSEABLE.list_tenants() and always passes &None for tenant_id. This is inconsistent with similar methods like get_dashboards, get_filters, and get_correlations, which all aggregate data across tenants using the same pattern.

More critically, the related put_chat and delete_chat methods both accept tenant_id: &Option<String> and pass it to storage operations, creating a functional mismatch: data can be written and deleted per-tenant but read globally. This could lead to data leakage or incorrect isolation in multi-tenant deployments. Update get_chats to iterate over tenants consistently:

async fn get_chats(&self) -> Result<DashMap<String, Vec<Bytes>>, MetastoreError> {
    let all_user_chats = DashMap::new();
    let base_paths = PARSEABLE.list_tenants().unwrap_or_else(|| vec!["".into()]);
    
    for tenant in base_paths {
        // An empty tenant string denotes the default tenant; storage expects None for it.
        let tenant_id = (!tenant.is_empty()).then(|| tenant.clone());
        let users_dir = RelativePathBuf::from_iter([tenant.as_str(), USERS_ROOT_DIR]);
        for user in self.storage.list_dirs_relative(&users_dir, &tenant_id).await? {
            if user.starts_with(".") {
                continue;
            }
            let mut chats = Vec::new();
            let chats_path = users_dir.join(&user).join("chats");
            let user_chats = self
                .storage
                .get_objects(
                    Some(&chats_path),
                    Box::new(|file_name| file_name.ends_with(".json")),
                    &tenant_id,
                )
                )
                .await?;
            chats.extend(user_chats);
            all_user_chats.insert(format!("{}/{}", tenant, user), chats);
        }
    }
    Ok(all_user_chats)
}
src/hottier.rs (4)

186-205: Oldest hot-tier timestamp is still tenant-agnostic.

Line 203 calls get_oldest_date_time_entry(stream) which reads from the non-tenant hot tier path. For tenants sharing stream names, this will return the wrong timestamp or None. Please thread tenant_id through the date/path helpers.

🐛 Suggested change (plus propagate tenant_id through helpers)
-        stream_hot_tier.oldest_date_time_entry = self.get_oldest_date_time_entry(stream).await?;
+        stream_hot_tier.oldest_date_time_entry =
+            self.get_oldest_date_time_entry(stream, tenant_id).await?;

208-218: Delete uses non-tenant path; can remove the wrong hot tier.

Line 216 removes hot_tier_path/stream even when tenant_id is Some. This can delete another tenant’s hot tier or fail to delete the intended one.

🐛 Proposed fix
-        let path = self.hot_tier_path.join(stream);
+        let path = match tenant_id.as_ref() {
+            Some(tenant) => self.hot_tier_path.join(tenant).join(stream),
+            None => self.hot_tier_path.join(stream),
+        };

236-257: Validate/sanitize tenant_id before using it in filesystem paths.

tenant_id is now used as a path segment (Line 246–251). If it can contain .. or path separators, it could escape the hot tier root and access unintended locations. Please confirm it’s sanitized or add a guard.
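
One possible shape for such a guard, as a minimal sketch only: the names is_safe_tenant_segment and hot_tier_stream_path are hypothetical and do not exist in the PR, and the sketch assumes a tenant ID must be exactly one ordinary path component.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical guard: accept a tenant id only if it is a single, ordinary path segment.
fn is_safe_tenant_segment(tenant_id: &str) -> bool {
    // Reject empty ids and anything containing a path separator.
    if tenant_id.is_empty() || tenant_id.contains(|c: char| c == '/' || c == '\\') {
        return false;
    }
    // Exactly one component, and it must be a normal name (not `.`, `..`, a root, or a prefix).
    let mut components = Path::new(tenant_id).components();
    matches!(
        (components.next(), components.next()),
        (Some(Component::Normal(_)), None)
    )
}

/// Hypothetical helper: build `<root>/<tenant>/<stream>` or `<root>/<stream>`.
/// Returns None when the tenant id is not a safe path segment.
fn hot_tier_stream_path(
    root: &Path,
    stream: &str,
    tenant_id: &Option<String>,
) -> Option<PathBuf> {
    match tenant_id.as_deref() {
        Some(tenant) if is_safe_tenant_segment(tenant) => Some(root.join(tenant).join(stream)),
        Some(_) => None,
        None => Some(root.join(stream)),
    }
}
```

Returning None for an unsafe tenant ID leaves the error mapping to the caller; whether that should become a HotTierError variant is a decision for the PR author.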


410-465: Make local hot-tier path helpers tenant-aware to prevent stream-name collisions.

Lines 454–463 write manifests via get_stream_path_for_date(stream, &date), which produces hot_tier_path/stream/date={date} without tenant prefix. Manifests from different tenants sharing the same stream name will overwrite each other. Similarly, cleanup_hot_tier_old_data receives tenant_id but never uses it for path construction; get_hot_tier_parquet_files and fetch_hot_tier_dates also omit tenant separation.

Update path helpers—get_stream_path_for_date, fetch_hot_tier_dates, and get_hot_tier_parquet_files—to accept and include tenant_id following the existing pattern in hot_tier_file_path.

src/rbac/user.rs (1)

332-359: Tenant scoping is missing in UserGroup::validate.

Group existence and user existence checks ignore tenant buckets, and role validation still skips missing tenants.

🐛 Proposed fix
 pub fn validate(&self, tenant_id: &Option<String>) -> Result<(), RBACError> {
+    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
     let valid_name = is_valid_group_name(&self.name);

-    if read_user_groups().contains_key(&self.name) {
+    if read_user_groups()
+        .get(tenant)
+        .is_some_and(|groups| groups.contains_key(&self.name))
+    {
         return Err(RBACError::UserGroupExists(self.name.clone()));
     }
     let mut non_existent_roles = Vec::new();
     if !self.roles.is_empty() {
         // validate that the roles exist
         for role in &self.roles {
-            if let Some(tenant_roles) =
-                roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-                && !tenant_roles.contains_key(role)
-            {
+            let role_exists = roles()
+                .get(tenant)
+                .is_some_and(|tenant_roles| tenant_roles.contains_key(role));
+            if !role_exists {
                 non_existent_roles.push(role.clone());
             }
         }
     }
     let mut non_existent_users = Vec::new();
     if !self.users.is_empty() {
         // validate that the users exist
         for group_user in &self.users {
-            if !users().contains_key(group_user.userid()) {
+            let user_exists = users()
+                .get(tenant)
+                .is_some_and(|tenant_users| tenant_users.contains_key(group_user.userid()));
+            if !user_exists {
                 non_existent_users.push(group_user.userid().to_string());
             }
         }
     }
src/parseable/streams.rs (1)

118-127: Update all Stream::new call sites/tests for the new tenant_id parameter.

Line 123 adds tenant_id to Stream::new, but tests in this file (and other callers) still use the old 4‑arg signature, which will not compile.

🛠️ Example update (apply globally)
-        let staging = Stream::new(
-            options.clone(),
-            stream_name,
-            LogStreamMetadata::default(),
-            None,
-        );
+        let tenant_id: Option<String> = None;
+        let staging = Stream::new(
+            options.clone(),
+            stream_name,
+            LogStreamMetadata::default(),
+            None,
+            &tenant_id,
+        );
src/storage/azure_blob.rs (1)

214-266: Move object-store call metrics after successful operations.

Right now increment_object_store_calls_by_date fires before the success path in several methods (GET/PUT/DELETE/HEAD/upload), so failed calls are counted as successes. Please move the counter into the Ok branch (or after ?) to keep metrics accurate.

💡 Example adjustment (apply similarly to other ops)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
-
-        match resp {
-            Ok(resp) => {
+        match self.client.get(&to_object_store_path(path)).await {
+            Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
                 ...
             }
             Err(err) => Err(err.into()),
         }

Based on learnings, metrics should only be captured on successful operations.

Also applies to: 268-311, 359-381, 383-472, 499-517, 639-661, 663-680, 692-716

src/storage/gcs.rs (1)

180-229: Move object-store call metrics after successful operations.

Several methods increment increment_object_store_calls_by_date before confirming success (GET/PUT/DELETE/HEAD/multipart). This counts failures as successes.

💡 Example adjustment (apply similarly elsewhere)
-        let resp = self.client.put(&to_object_store_path(path), resource).await;
-        increment_object_store_calls_by_date("PUT", &Utc::now().date_naive().to_string(), tenant);
-        match resp {
+        match self.client.put(&to_object_store_path(path), resource).await {
             Ok(_) => {
+                increment_object_store_calls_by_date(
+                    "PUT",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 increment_files_scanned_in_object_store_calls_by_date(
                     "PUT",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
                 Ok(())
             }
             Err(err) => Err(err.into()),
         }

Based on learnings, metrics should only be captured on successful operations.

Also applies to: 232-283, 323-345, 347-438, 441-499, 621-643, 645-663, 675-699

🤖 Fix all issues with AI agents
In `@src/metastore/metastores/object_store_metastore.rs`:
- Around line 145-167: The tenant string is being normalized only for the
HashMap key but not for storage operations, causing Some("") to be passed to
storage.get_objects; change get_keystones to compute a tenant_for_storage
Option<String> (e.g., let tenant_for_storage = if tenant.is_empty() { None }
else { Some(tenant.clone()) }) and pass &tenant_for_storage to
storage.get_objects, then normalize the key inserted into keystones to
DEFAULT_TENANT when tenant.is_empty(); apply the same fix pattern in
get_conversations() and any other methods that iterate tenants and call
storage.get_* so storage sees None for default-tenant cases instead of Some("").

In `@src/parseable/streams.rs`:
- Around line 585-607: The change added a tenant label to metrics::STAGING_FILES
and metrics::STORAGE_SIZE (see reset_staging_metrics and update_staging_metrics
using &self.stream_name and tenant_str with DEFAULT_TENANT), so update all
callers and tests that call .with_label_values(...) for these metrics to include
the tenant label (use DEFAULT_TENANT when tenant_id is None) and adjust any
assertions/expected label arrays accordingly to prevent label-count panics;
search for usages of STAGING_FILES and STORAGE_SIZE in tests and code and add
the extra tenant entry to the label slices and expected values.

@parmesant parmesant force-pushed the multi-tenancy branch 2 times, most recently from b5380d2 to b9366dc on January 20, 2026 10:26

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (9)
src/hottier.rs (4)

208-220: delete_hot_tier ignores tenant_id when constructing the deletion path.

Line 216 uses self.hot_tier_path.join(stream) without considering tenant_id. For tenant-scoped streams, the path should be hot_tier_path/tenant_id/stream, matching the logic in hot_tier_file_path.

🐛 Proposed fix
 pub async fn delete_hot_tier(
     &self,
     stream: &str,
     tenant_id: &Option<String>,
 ) -> Result<(), HotTierError> {
     if !self.check_stream_hot_tier_exists(stream, tenant_id) {
         return Err(HotTierValidationError::NotFound(stream.to_owned()).into());
     }
-    let path = self.hot_tier_path.join(stream);
+    let path = if let Some(tid) = tenant_id.as_ref() {
+        self.hot_tier_path.join(tid).join(stream)
+    } else {
+        self.hot_tier_path.join(stream)
+    };
     fs::remove_dir_all(path).await?;

     Ok(())
 }

471-496: fetch_hot_tier_dates doesn't account for tenant-scoped paths.

This function uses self.hot_tier_path.join(stream) without tenant awareness. For multi-tenancy, tenant streams are stored at hot_tier_path/tenant_id/stream, so this function will fail to find dates for tenant-scoped streams.

The following functions have the same issue and need tenant_id added:

  • get_stream_path_for_date (line 528)
  • get_hot_tier_manifest_files (line 533)
  • get_hot_tier_parquet_files (line 565)
  • get_oldest_date_time_entry (line 708)
🐛 Proposed fix for fetch_hot_tier_dates
-    pub async fn fetch_hot_tier_dates(&self, stream: &str) -> Result<Vec<NaiveDate>, HotTierError> {
+    pub async fn fetch_hot_tier_dates(&self, stream: &str, tenant_id: &Option<String>) -> Result<Vec<NaiveDate>, HotTierError> {
         let mut date_list = Vec::new();
-        let path = self.hot_tier_path.join(stream);
+        let path = if let Some(tid) = tenant_id.as_ref() {
+            self.hot_tier_path.join(tid).join(stream)
+        } else {
+            self.hot_tier_path.join(stream)
+        };
         if !path.exists() {
             return Ok(date_list);
         }

410-468: Inconsistent tenant handling in local path operations.

While tenant_id is correctly passed to remote operations (line 444: get_object, line 451: put_hot_tier), the local path functions called at lines 454 and 461-462 (get_stream_path_for_date) don't account for tenant-scoped paths. This relates to the broader issue flagged for fetch_hot_tier_dates and related functions.


611-683: Local path operations in cleanup don't use tenant_id.

fetch_hot_tier_dates (line 620) and get_stream_path_for_date (line 622) are called without tenant awareness, while put_hot_tier (line 666) correctly passes tenant_id. This inconsistency will cause cleanup to operate on incorrect paths for tenant-scoped streams.

src/migration/mod.rs (1)

60-117: Staging metadata migration ignores tenant-scoped files.

put_staging_metadata now writes per-tenant metadata, but run_metadata_migration still calls get_staging_metadata(config) without tenant context. For non-default tenants, this will likely miss their staging metadata and skip migration.

💡 Suggested fix
-    let staging_metadata = get_staging_metadata(config)?;
+    let staging_metadata = get_staging_metadata(config, tenant_id)?;

@@
-pub fn get_staging_metadata(config: &Parseable) -> anyhow::Result<Option<serde_json::Value>> {
-    let path =
-        RelativePathBuf::from(PARSEABLE_METADATA_FILE_NAME).to_path(config.options.staging_dir());
+pub fn get_staging_metadata(
+    config: &Parseable,
+    tenant_id: &Option<String>,
+) -> anyhow::Result<Option<serde_json::Value>> {
+    let path = if let Some(tenant) = tenant_id.as_ref() {
+        RelativePathBuf::from(PARSEABLE_METADATA_FILE_NAME)
+            .to_path(config.options.staging_dir().join(tenant))
+    } else {
+        RelativePathBuf::from(PARSEABLE_METADATA_FILE_NAME)
+            .to_path(config.options.staging_dir())
+    };
src/rbac/user.rs (1)

333-357: Tenant-scoped validation is incomplete.

Group existence and user existence checks are still global; with multi-tenant maps this can incorrectly validate cross-tenant state. Also treat missing tenant roles as non-existent.

🐛 Proposed fix
-        if read_user_groups().contains_key(&self.name) {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        if read_user_groups()
+            .get(tenant)
+            .is_some_and(|groups| groups.contains_key(&self.name))
+        {
             return Err(RBACError::UserGroupExists(self.name.clone()));
         }
         let mut non_existent_roles = Vec::new();
         if !self.roles.is_empty() {
             // validate that the roles exist
             for role in &self.roles {
-                if let Some(tenant_roles) =
-                    roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-                    && !tenant_roles.contains_key(role)
-                {
+                let role_exists = roles()
+                    .get(tenant)
+                    .is_some_and(|tenant_roles| tenant_roles.contains_key(role));
+                if !role_exists {
                     non_existent_roles.push(role.clone());
                 }
             }
         }
         let mut non_existent_users = Vec::new();
         if !self.users.is_empty() {
             // validate that the users exist
             for group_user in &self.users {
-                if !users().contains_key(group_user.userid()) {
+                if !users()
+                    .get(tenant)
+                    .is_some_and(|tenant_users| tenant_users.contains_key(group_user.userid()))
+                {
                     non_existent_users.push(group_user.userid().to_string());
                 }
             }
         }
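
The "missing tenant means missing role" rule in the fix above can be illustrated in isolation. This is a sketch under stated assumptions: TenantRoles and missing_roles are hypothetical stand-ins for the PR's roles() map, not its real types.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory role registry: tenant name -> role names.
type TenantRoles = HashMap<String, HashSet<String>>;

/// Return the requested roles that do not exist for `tenant`.
/// A tenant with no entry at all is treated as having no roles,
/// so every requested role is reported as missing.
fn missing_roles(registry: &TenantRoles, tenant: &str, requested: &[&str]) -> Vec<String> {
    requested
        .iter()
        .filter(|role| {
            !registry
                .get(tenant)
                .is_some_and(|roles| roles.contains(**role))
        })
        .map(|role| role.to_string())
        .collect()
}
```

With the original `if let Some(tenant_roles) = ... && !tenant_roles.contains_key(role)` form, an unknown tenant skips the check entirely, so a group referencing arbitrary roles would validate.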
src/alerts/mod.rs (1)

616-635: Filter SSE broadcasts by alert tenant to avoid cross-tenant leakage.

trigger_notifications walks all active sessions without checking tenant, so users in other tenants with identically named streams could receive alert SSEs. Filter sessions by self.tenant_id before authorizing the query.

🔧 Suggested fix
-        for (session, _, _) in active_session {
+        let alert_tenant = self.tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        for (session, _, session_tenant) in active_session {
+            if session_tenant != alert_tenant {
+                continue;
+            }
             if user_auth_for_query(&session, &self.query).await.is_ok()
                 && let SessionKey::SessionId(id) = &session
             {
                 broadcast_to.push(*id);
             }
         }
src/storage/azure_blob.rs (1)

214-266: Only emit object-store metrics on successful operations.

increment_object_store_calls_by_date currently runs even on failed GET/PUT. Move the increment into the success path (or place it after an .await? that has already propagated the error) so metrics reflect successful calls.

🛠️ Suggested fix
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
-
-        match resp {
+        let resp = self.client.get(&to_object_store_path(path)).await;
+        match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
                 increment_files_scanned_in_object_store_calls_by_date(
                     "GET",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
-        let resp = self.client.put(&to_object_store_path(path), resource).await;
-        increment_object_store_calls_by_date("PUT", &Utc::now().date_naive().to_string(), tenant);
-        match resp {
+        let resp = self.client.put(&to_object_store_path(path), resource).await;
+        match resp {
             Ok(_) => {
+                increment_object_store_calls_by_date(
+                    "PUT",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 increment_files_scanned_in_object_store_calls_by_date(
                     "PUT",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
Based on learnings, metrics should only be captured when operations succeed.
src/storage/gcs.rs (1)

179-230: Only emit object-store metrics on successful operations.

increment_object_store_calls_by_date runs even when GET/PUT fails. Move it into the success path (or place it after an .await? that has already propagated the error) to avoid counting failed operations.

🛠️ Suggested fix
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
-        match resp {
+        let resp = self.client.get(&to_object_store_path(path)).await;
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
                 increment_files_scanned_in_object_store_calls_by_date(
                     "GET",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
-        let resp = self.client.put(&to_object_store_path(path), resource).await;
-        increment_object_store_calls_by_date("PUT", &Utc::now().date_naive().to_string(), tenant);
-        match resp {
+        let resp = self.client.put(&to_object_store_path(path), resource).await;
+        match resp {
             Ok(_) => {
+                increment_object_store_calls_by_date(
+                    "PUT",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 increment_files_scanned_in_object_store_calls_by_date(
                     "PUT",
                     1,
                     &Utc::now().date_naive().to_string(),
                     tenant,
                 );
Based on learnings, metrics should only be captured when operations succeed.
🤖 Fix all issues with AI agents
In `@src/connectors/kafka/processor.rs`:
- Around line 53-57: The current batch uses the tenant_id from only the first
record (variable tenant_id) which can misassign tenant context; update the batch
handling in processor.rs to either (a) validate that all items in records share
the same tenant_id before proceeding — iterate records and compare each
r.tenant_id to the first and return/log an error if any differ — or (b) group
records by r.tenant_id and process each tenant group separately (i.e., create
per-tenant sub-batches) so processing and context use the correct tenant for
functions that currently rely on tenant_id.

In `@src/rbac/mod.rs`:
- Around line 304-307: Replace the panicking
HeaderValue::from_bytes(...).unwrap() call with fallible handling: attempt to
create the header value from tenant.as_bytes() (e.g., using match or if let
Ok(hv) = HeaderValue::from_bytes(...)) and only call
req.headers_mut().insert(HeaderName::from_static("tenant"), hv) on success; on
Err, skip the insert (optionally log or trace the invalid tenant) so the code
never panics when tenant contains invalid header bytes. Ensure you update the
code paths surrounding req.headers_mut(), HeaderValue::from_bytes, and the
tenant variable accordingly.

In `@src/rbac/role.rs`:
- Line 63: The Action::SuperAdmin enum variant is dead code—either remove the
Action::SuperAdmin variant and its match arm in build(), or change
super_admin_perm_builder() to construct Action::SuperAdmin (instead of
Action::All) so the build() match arm is reachable; update any pattern matches
or tests accordingly (targets: Action::SuperAdmin, super_admin_perm_builder(),
and build()) to keep variants and handlers consistent.

In `@src/stats.rs`:
- Around line 116-127: The code is not normalizing Some("") tenant IDs which
creates split metric series; add a small helper (e.g., normalize_tenant or
normalized_tenant_label) and use it in update_deleted_stats (where tenant is
computed from tenant_id) and the other affected function (around lines 235-255)
to convert Option<String> values so that Some("") or empty &str becomes
DEFAULT_TENANT and Some(non-empty) stays as-is; update places that currently do
tenant_id.as_deref().unwrap_or(DEFAULT_TENANT) to call this helper so all
metrics consistently use the normalized tenant label.

Comment on lines +53 to +57
        let tenant_id = if let Some(r) = records.first() {
            &r.tenant_id
        } else {
            &None
        };

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get the file to understand context around lines 53-57
wc -l src/connectors/kafka/processor.rs

Repository: parseablehq/parseable

Length of output: 102


🏁 Script executed:

#!/bin/bash
# Show the code around the problematic lines with more context
sed -n '40,75p' src/connectors/kafka/processor.rs

Repository: parseablehq/parseable

Length of output: 1265


🏁 Script executed:

#!/bin/bash
# Search for batching, consumer, and tenant-related logic in Kafka connector
rg -n "chunks_timeout|ConsumerRecord|tenant|batch" src/connectors/kafka/ -B 2 -A 2 | head -100

Repository: parseablehq/parseable

Length of output: 5959


🏁 Script executed:

#!/bin/bash
# Check how records are consumed and if there's tenant-based filtering upstream
rg -n "record_stream|from_borrowed_msg|ReceiverStream" src/connectors/kafka/ -B 3 -A 3

Repository: parseablehq/parseable

Length of output: 6521


🏁 Script executed:

#!/bin/bash
# Look for any validation or assertion about tenant consistency
rg -n "tenant_id|assert|validate" src/connectors/kafka/processor.rs -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 742


🏁 Script executed:

#!/bin/bash
# Check how StreamWorker is initialized and how records flow to it
sed -n '120,186p' src/connectors/kafka/processor.rs

Repository: parseablehq/parseable

Length of output: 2132


🏁 Script executed:

#!/bin/bash
# Check if there's any tenant-based partitioning or routing logic
rg -n "partition|tenant" src/connectors/kafka/ --type-list | head -20

Repository: parseablehq/parseable

Length of output: 454


🏁 Script executed:

#!/bin/bash
# Check the consumer configuration and how partitions are assigned
fd "\.rs" src/connectors/kafka/ -x grep -l "partition_assignment\|rebalance\|partition" {} \;

Repository: parseablehq/parseable

Length of output: 365


🏁 Script executed:

#!/bin/bash
# Look at consumer setup and any partition handling logic
sed -n '1,100p' src/connectors/kafka/consumer.rs

Repository: parseablehq/parseable

Length of output: 4136


🏁 Script executed:

#!/bin/bash
# Check the rebalance listener to understand partition assignment
cat src/connectors/kafka/rebalance_listener.rs

Repository: parseablehq/parseable

Length of output: 3726


🏁 Script executed:

#!/bin/bash
# Check if records are filtered or grouped by tenant before processing
sed -n '150,190p' src/connectors/kafka/consumer.rs

Repository: parseablehq/parseable

Length of output: 1493


🏁 Script executed:

#!/bin/bash
# Look for any validation of tenant consistency within build_event_from_chunk or process method
sed -n '100,120p' src/connectors/kafka/processor.rs

Repository: parseablehq/parseable

Length of output: 539


Verify that all records in a batch belong to the same tenant.

The tenant_id is extracted from only the first record, but batching is purely time/size-based with no tenant awareness. If a Kafka partition contains messages from multiple tenants (which is possible if the producer doesn't enforce tenant-based partitioning), all records in a batch would be processed under the wrong tenant context for all but the first record.

Add validation to ensure all records in a batch share the same tenant_id, or group records by tenant before processing.
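
Option (b), grouping by tenant, could look roughly like the sketch below. The Record struct is a simplified, hypothetical stand-in for the connector's consumer record type; only the tenant_id field is taken from the snippet under review.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a consumed Kafka record (hypothetical fields).
#[derive(Debug, Clone, PartialEq)]
struct Record {
    tenant_id: Option<String>,
    payload: String,
}

/// Split one batch into per-tenant sub-batches.
/// Record order is preserved within each tenant's sub-batch.
fn group_by_tenant(records: Vec<Record>) -> HashMap<Option<String>, Vec<Record>> {
    let mut groups: HashMap<Option<String>, Vec<Record>> = HashMap::new();
    for record in records {
        groups
            .entry(record.tenant_id.clone())
            .or_default()
            .push(record);
    }
    groups
}
```

Each resulting sub-batch can then be processed with its own key as the tenant context, instead of applying the first record's tenant_id to the whole batch.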

🤖 Prompt for AI Agents
In `@src/connectors/kafka/processor.rs` around lines 53 - 57, The current batch
uses the tenant_id from only the first record (variable tenant_id) which can
misassign tenant context; update the batch handling in processor.rs to either
(a) validate that all items in records share the same tenant_id before
proceeding — iterate records and compare each r.tenant_id to the first and
return/log an error if any differ — or (b) group records by r.tenant_id and
process each tenant group separately (i.e., create per-tenant sub-batches) so
processing and context use the correct tenant for functions that currently rely
on tenant_id.

Comment on lines 116 to 127
 pub async fn update_deleted_stats(
-    storage: Arc<dyn ObjectStorage>,
+    storage: &Arc<dyn ObjectStorage>,
     stream_name: &str,
     meta: ObjectStoreFormat,
     dates: Vec<String>,
+    tenant_id: &Option<String>,
 ) -> Result<(), ObjectStorageError> {
     let mut num_row: i64 = 0;
     let mut storage_size: i64 = 0;
     let mut ingestion_size: i64 = 0;

+    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
     let mut manifests = meta.snapshot.manifest_list;

⚠️ Potential issue | 🟠 Major

Normalize empty tenant IDs to avoid split metric series.
If callers pass Some("") (common for default-tenant paths), labels become "" instead of DEFAULT_TENANT, creating a parallel metric namespace and inconsistent deletes/lookups. Recommend normalizing empty to DEFAULT_TENANT in one helper and reusing it everywhere.

🔧 Suggested fix
+fn normalize_tenant<'a>(tenant_id: &'a Option<String>) -> &'a str {
+    tenant_id
+        .as_deref()
+        .filter(|t| !t.is_empty())
+        .unwrap_or(DEFAULT_TENANT)
+}
...
-    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+    let tenant = normalize_tenant(tenant_id);
...
 pub fn event_labels<'a>(
     stream_name: &'a str,
     format: &'static str,
     tenant_id: &'a Option<String>,
 ) -> [&'a str; 3] {
-    if let Some(tenant_id) = tenant_id.as_ref() {
-        [stream_name, format, tenant_id]
-    } else {
-        [stream_name, format, DEFAULT_TENANT]
-    }
+    let tenant = normalize_tenant(tenant_id);
+    [stream_name, format, tenant]
 }
...
 pub fn storage_size_labels<'a>(
     stream_name: &'a str,
     tenant_id: &'a Option<String>,
 ) -> [&'a str; 4] {
-    if let Some(tenant_id) = tenant_id.as_ref() {
-        ["data", stream_name, "parquet", tenant_id]
-    } else {
-        ["data", stream_name, "parquet", DEFAULT_TENANT]
-    }
+    let tenant = normalize_tenant(tenant_id);
+    ["data", stream_name, "parquet", tenant]
 }

Based on learnings, default-tenant paths may use empty-string segments, so metrics should normalize empty to DEFAULT_TENANT to avoid label fragmentation.

Also applies to: 235-255
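
For reference, a standalone version of the suggested helper that can be compiled and checked on its own. The value assigned to DEFAULT_TENANT here is an assumed placeholder; the real constant is defined in the Parseable codebase.

```rust
/// Assumed placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Map `None` and `Some("")` to the default tenant label; keep any other id as-is.
fn normalize_tenant(tenant_id: &Option<String>) -> &str {
    tenant_id
        .as_deref()
        .filter(|t| !t.is_empty())
        .unwrap_or(DEFAULT_TENANT)
}

/// Label set for the storage-size metric, using the normalized tenant label.
fn storage_size_labels<'a>(stream_name: &'a str, tenant_id: &'a Option<String>) -> [&'a str; 4] {
    ["data", stream_name, "parquet", normalize_tenant(tenant_id)]
}
```

With this helper, None and Some("") produce the same label, so both land in a single metric series.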

🤖 Prompt for AI Agents
In `@src/stats.rs` around lines 116 - 127, The code is not normalizing Some("")
tenant IDs which creates split metric series; add a small helper (e.g.,
normalize_tenant or normalized_tenant_label) and use it in update_deleted_stats
(where tenant is computed from tenant_id) and the other affected function
(around lines 235-255) to convert Option<String> values so that Some("") or
empty &str becomes DEFAULT_TENANT and Some(non-empty) stays as-is; update places
that currently do tenant_id.as_deref().unwrap_or(DEFAULT_TENANT) to call this
helper so all metrics consistently use the normalized tenant label.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (10)
src/storage/gcs.rs (1)

180-207: Metrics are recorded for all API call attempts, not just successful operations.

The current code calls increment_object_store_calls_by_date after await but before checking if the response succeeded, meaning it records metrics regardless of success or failure. This pattern is consistent across all storage implementations (S3, Azure, GCS).

However, per documented best practices in related learnings, metrics should only be captured on successful operations. To align with this pattern, move increment_object_store_calls_by_date inside the Ok branch of the match statement, so it's only called when the operation actually succeeds.
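
The pattern can be shown without any object-store dependency. In this minimal sketch, GET_CALLS and fake_get are hypothetical stand-ins for increment_object_store_calls_by_date and the real client call.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Stand-in for a per-tenant call counter (hypothetical; the PR uses
/// `increment_object_store_calls_by_date`).
static GET_CALLS: AtomicU64 = AtomicU64::new(0);

/// Hypothetical fallible storage call, used only for illustration.
fn fake_get(path: &str) -> Result<Vec<u8>, String> {
    if path.is_empty() {
        Err("empty path".to_string())
    } else {
        Ok(path.as_bytes().to_vec())
    }
}

/// Count the call only once the operation is known to have succeeded.
fn get_object(path: &str) -> Result<Vec<u8>, String> {
    match fake_get(path) {
        Ok(body) => {
            // The increment sits inside the Ok arm, so failures are never counted.
            GET_CALLS.fetch_add(1, Ordering::Relaxed);
            Ok(body)
        }
        Err(err) => Err(err),
    }
}
```

If failed attempts also need to be visible, a separate error counter keeps the two series distinct rather than mixing them into one.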

src/hottier.rs (5)

208-220: Critical: delete_hot_tier ignores tenant_id in path resolution.

The function checks existence using the tenant-aware path (via check_stream_hot_tier_exists), but then deletes self.hot_tier_path.join(stream) without considering tenant_id. For tenanted streams, this would delete the wrong directory or fail silently.

🐛 Proposed fix
 pub async fn delete_hot_tier(
     &self,
     stream: &str,
     tenant_id: &Option<String>,
 ) -> Result<(), HotTierError> {
     if !self.check_stream_hot_tier_exists(stream, tenant_id) {
         return Err(HotTierValidationError::NotFound(stream.to_owned()).into());
     }
-    let path = self.hot_tier_path.join(stream);
+    let path = if let Some(tid) = tenant_id.as_ref() {
+        self.hot_tier_path.join(tid).join(stream)
+    } else {
+        self.hot_tier_path.join(stream)
+    };
     fs::remove_dir_all(path).await?;

     Ok(())
 }

470-497: Missing tenant awareness and potential panic in fetch_hot_tier_dates.

Two issues:

  1. Line 473 uses self.hot_tier_path.join(stream) without considering tenant_id, which would look in the wrong directory for tenanted streams.
  2. Line 491 uses .unwrap() on date parsing, which will panic if a directory name doesn't match the expected date=YYYY-MM-DD format.
🐛 Proposed fix
-    pub async fn fetch_hot_tier_dates(&self, stream: &str) -> Result<Vec<NaiveDate>, HotTierError> {
+    pub async fn fetch_hot_tier_dates(&self, stream: &str, tenant_id: &Option<String>) -> Result<Vec<NaiveDate>, HotTierError> {
         let mut date_list = Vec::new();
-        let path = self.hot_tier_path.join(stream);
+        let path = if let Some(tid) = tenant_id.as_ref() {
+            self.hot_tier_path.join(tid).join(stream)
+        } else {
+            self.hot_tier_path.join(stream)
+        };
         if !path.exists() {
             return Ok(date_list);
         }

         let directories = fs::read_dir(&path).await?;
         let mut dates = ReadDirStream::new(directories);
         while let Some(date) = dates.next().await {
             let date = date?;
             if !date.path().is_dir() {
                 continue;
             }
-            let date = NaiveDate::parse_from_str(
+            let Ok(date) = NaiveDate::parse_from_str(
                 date.file_name()
                     .to_string_lossy()
                     .trim_start_matches("date="),
                 "%Y-%m-%d",
-            )
-            .unwrap();
-            date_list.push(date);
+            ) else {
+                warn!("Skipping directory with invalid date format: {:?}", date.file_name());
+                continue;
+            };
+            date_list.push(date);
         }

527-530: Missing tenant awareness in get_stream_path_for_date.

This function constructs paths without considering tenant_id, which would produce incorrect paths for tenanted streams. It's called from process_parquet_file and cleanup_hot_tier_old_data, propagating the issue.

🐛 Proposed fix
-    pub fn get_stream_path_for_date(&self, stream: &str, date: &NaiveDate) -> PathBuf {
-        self.hot_tier_path.join(stream).join(format!("date={date}"))
+    pub fn get_stream_path_for_date(&self, stream: &str, date: &NaiveDate, tenant_id: &Option<String>) -> PathBuf {
+        if let Some(tid) = tenant_id.as_ref() {
+            self.hot_tier_path.join(tid).join(stream).join(format!("date={date}"))
+        } else {
+            self.hot_tier_path.join(stream).join(format!("date={date}"))
+        }
     }

Note: This will require updating all call sites (process_parquet_file, cleanup_hot_tier_old_data, get_oldest_date_time_entry, fetch_hot_tier_dates) to pass tenant_id.
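Since the same `if let Some(tid)` branch recurs in each of the hot-tier fixes above, one option is a shared path helper. The sketch below is an assumption about how that could look: `tenant_stream_path` and `stream_path_for_date` are hypothetical free functions, the root is passed in explicitly instead of read from `self.hot_tier_path`, and the date is taken as a pre-formatted `YYYY-MM-DD` string to keep the example free of external crates.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: `<root>/<tenant>/<stream>` for tenanted streams,
/// `<root>/<stream>` when no tenant is set.
pub fn tenant_stream_path(root: &Path, stream: &str, tenant_id: &Option<String>) -> PathBuf {
    match tenant_id.as_deref() {
        Some(tid) => root.join(tid).join(stream),
        None => root.join(stream),
    }
}

/// Date partition directory under the tenant-aware stream path.
/// `date` is assumed to already be formatted as `YYYY-MM-DD`.
pub fn stream_path_for_date(
    root: &Path,
    stream: &str,
    date: &str,
    tenant_id: &Option<String>,
) -> PathBuf {
    tenant_stream_path(root, stream, tenant_id).join(format!("date={date}"))
}
```

With a helper like this, delete_hot_tier, fetch_hot_tier_dates and get_stream_path_for_date would resolve paths in one place, so a future layout change cannot drift between them.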


708-754: get_oldest_date_time_entry needs tenant awareness.

This function calls fetch_hot_tier_dates and get_stream_path_for_date, both of which need tenant awareness. Once those are fixed, this function should also accept tenant_id and pass it through.


564-592: get_hot_tier_parquet_files needs tenant awareness.

This function calls fetch_hot_tier_dates (line 570) and get_stream_path_for_date (line 577), both of which need tenant awareness. Update this function to accept tenant_id and propagate it.

src/metastore/metastores/object_store_metastore.rs (1)

344-374: Add tenant scoping to alert_state_json_path calls.

The alert_state_json_path helper at line 1292 in src/storage/object_storage.rs does not accept a tenant parameter, but get_alert_states (line 317-322) reads alert state files from tenant-prefixed directories. Lines 349 and 373 call alert_state_json_path without tenant context, causing a mismatch: state is written to and read from ALERTS_ROOT_DIRECTORY/alert_state_{id}.json while get_alert_states searches in [tenant]/ALERTS_ROOT_DIRECTORY/alert_state_*.json, resulting in missing state for tenant-scoped reads and potential cross-tenant state sharing.

Update alert_state_json_path to accept tenant_id: &Option<String> and follow the tenant-scoping pattern already used by alert_json_path and mttr_json_path in the same file, then pass tenant_id at both call sites.
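A rough sketch of a tenant-scoped variant, under stated assumptions: the real helper returns a `RelativePathBuf`, while this version returns a plain `String` to stay dependency-free, and the `.alerts` value for `ALERTS_ROOT_DIRECTORY` is assumed from the path mentioned elsewhere in this review. An empty tenant string is treated like the default tenant.

```rust
/// Assumed value; only the `.alerts` directory name appears in this review.
const ALERTS_ROOT_DIRECTORY: &str = ".alerts";

/// Builds `[tenant/]<alerts root>/alert_state_<id>.json`.
/// `None` and `Some("")` both map to the un-prefixed default-tenant path.
pub fn alert_state_json_path(alert_id: &str, tenant_id: &Option<String>) -> String {
    let file = format!("alert_state_{}.json", alert_id);
    match tenant_id.as_deref() {
        Some(tenant) if !tenant.is_empty() => {
            format!("{}/{}/{}", tenant, ALERTS_ROOT_DIRECTORY, file)
        }
        _ => format!("{}/{}", ALERTS_ROOT_DIRECTORY, file),
    }
}
```

Writers and readers then agree on the location as long as both call sites pass the same `tenant_id`.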

src/rbac/user.rs (1)

333-349: Role validation skips missing tenants.
If the tenant entry is absent in roles(), roles pass validation incorrectly.

🐛 Suggested fix
-                if let Some(tenant_roles) =
-                    roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-                    && !tenant_roles.contains_key(role)
-                {
+                let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+                let role_exists = roles()
+                    .get(tenant)
+                    .is_some_and(|tenant_roles| tenant_roles.contains_key(role));
+                if !role_exists {
                     non_existent_roles.push(role.clone());
                 }
src/alerts/mod.rs (1)

616-635: Filter alert broadcasts by tenant to avoid cross-tenant leakage.

active_sessions includes tenant ids, but the broadcast loop ignores them. This can deliver alert notifications to sessions from other tenants.

🔒 Suggested fix
-        let active_session = sessions().get_active_sessions();
+        let active_session = sessions().get_active_sessions();
+        let alert_tenant = self.tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
         let mut broadcast_to = vec![];
-        for (session, _, _) in active_session {
+        for (session, _, tenant) in active_session {
+            if tenant != alert_tenant {
+                continue;
+            }
             if user_auth_for_query(&session, &self.query).await.is_ok()
                 && let SessionKey::SessionId(id) = &session
             {
                 broadcast_to.push(*id);
             }
         }
src/migration/mod.rs (1)

50-118: get_staging_metadata does not support per-tenant reads.

run_metadata_migration receives tenant_id but calls get_staging_metadata(config), which always reads from the shared staging directory. When tenant_id is provided, the staging metadata should be read from the tenant-specific location, not the shared root. As written, the metadata is read from the shared location and then written to per-tenant locations, so every tenant's migration starts from the same shared staging metadata.

Update get_staging_metadata to accept and use tenant_id parameter, similar to how put_staging_metadata already does.

🤖 Fix all issues with AI agents
In `@src/parseable/streams.rs`:
- Around line 1698-1702: The code panics because
guard.get(DEFAULT_TENANT).unwrap() is called on an empty map; replace the unwrap
with a safe check such as using map_or or is_none to assert the tenant bucket
doesn't contain the stream. For example, change the assertion to use
guard.get(DEFAULT_TENANT).map_or(true, |m| !m.contains_key(stream_name)) or
assert!(guard.get(DEFAULT_TENANT).is_none() ||
!guard.get(DEFAULT_TENANT).unwrap().contains_key(stream_name)); operate on the
result of streams.read(), DEFAULT_TENANT, and stream_name to avoid unwrapping a
missing entry.
♻️ Duplicate comments (60)
src/hottier.rs (3)

107-109: Logic error in stream/tenant exclusion filter.

The condition stream != current_stream && tenant_id != *current_tenant_id skips entries only when both differ. This incorrectly includes entries where the stream matches but the tenant differs. The intent is to exclude the exact (stream, tenant) pair.
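To exclude exactly one (stream, tenant) pair, the predicate has to be the negation of "both match", which by De Morgan's law is an `||` of the two inequalities rather than an `&&`. The sketch below illustrates that; the function names and the tuple layout of `entries` are illustrative and not taken from src/hottier.rs.

```rust
/// True when `(stream, tenant)` is NOT the exact pair being excluded.
/// Equivalent to `stream != current_stream || tenant != current_tenant`.
pub fn is_other_hot_tier(
    stream: &str,
    tenant: &Option<String>,
    current_stream: &str,
    current_tenant: &Option<String>,
) -> bool {
    !(stream == current_stream && tenant == current_tenant)
}

/// Sums the sizes of every entry except the excluded pair.
/// Each entry is (stream name, tenant id, size in bytes).
pub fn total_other_size(
    entries: &[(&str, Option<String>, u64)],
    current_stream: &str,
    current_tenant: &Option<String>,
) -> u64 {
    entries
        .iter()
        .filter(|(s, t, _)| is_other_hot_tier(s, t, current_stream, current_tenant))
        .map(|(_, _, size)| *size)
        .sum()
}
```

A same-named stream owned by a different tenant is counted here, and so is a different stream owned by the same tenant; only the exact pair is left out.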


595-603: Avoid panics in check_stream_hot_tier_exists and remove dead code.

The .unwrap() at line 601 can panic on path conversion errors. Additionally, the commented-out code at lines 597-600 should be removed.


794-800: Fix create_pstats_hot_tier to use INTERNAL_STREAM_HOT_TIER_SIZE_BYTES instead of MIN_STREAM_HOT_TIER_SIZE_BYTES.

The function explicitly documents that it creates a hot tier for the pstats internal stream, but allocates 10 GiB (MIN_STREAM_HOT_TIER_SIZE_BYTES) instead of 10 MiB (INTERNAL_STREAM_HOT_TIER_SIZE_BYTES). Since pstats is an internal system stream like pmeta, it should use the internal size allocation at lines 796 and 798.

src/correlation.rs (4)

134-144: In-memory insert skips new tenant buckets.

Line 142 only inserts when the tenant map already exists, so the first correlation for a tenant isn’t cached.


155-183: Update path still drops inserts when the tenant bucket is missing.


193-211: Delete removes from the wrong map level.
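The three correlation comments above share one shape: a two-level map (tenant, then correlation id) where inserts must create the tenant bucket and deletes must act on the inner map. A minimal sketch of that shape follows; the `Correlations` type, the `String` payload, and the `DEFAULT_TENANT` value are placeholders and not the actual types in src/correlation.rs.

```rust
use std::collections::HashMap;

/// Placeholder value; the real constant lives elsewhere in the codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// tenant -> correlation id -> payload (payload type is a placeholder).
#[derive(Default)]
pub struct Correlations {
    by_tenant: HashMap<String, HashMap<String, String>>,
}

impl Correlations {
    /// Insert or update; creates the tenant bucket on first use.
    pub fn upsert(&mut self, tenant_id: &Option<String>, id: &str, value: &str) {
        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
        self.by_tenant
            .entry(tenant.to_owned())
            .or_default()
            .insert(id.to_owned(), value.to_owned());
    }

    /// Removes the correlation from the tenant's inner map, not the tenant itself.
    pub fn delete(&mut self, tenant_id: &Option<String>, id: &str) -> bool {
        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
        self.by_tenant
            .get_mut(tenant)
            .is_some_and(|bucket| bucket.remove(id).is_some())
    }

    pub fn get(&self, tenant_id: &Option<String>, id: &str) -> Option<&String> {
        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
        self.by_tenant.get(tenant)?.get(id)
    }
}
```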


275-315: Table validation still isn’t tenant-scoped.

Line 314 uses ctx.table() without selecting the tenant schema, so table resolution can cross tenants when names collide.

src/handlers/http/role.rs (2)

154-168: Persist before in-memory update (already flagged).

Same issue as prior review: DEFAULT_ROLE is updated before put_metadata, and the lock uses unwrap(). Please address as noted earlier.
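One way to read the requested ordering, sketched with hypothetical types: `MetadataStore` stands in for the `put_metadata` call, `InMemoryStore` exists only to simulate a storage failure, and the cache is a plain `RwLock<HashMap>` rather than the real `DEFAULT_ROLE` static. Errors are `String` for brevity.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical persistence interface standing in for `put_metadata`.
pub trait MetadataStore {
    fn put_default_role(&mut self, tenant: &str, role: &str) -> Result<(), String>;
}

/// Toy store used only to exercise the ordering; `fail` simulates a storage error.
#[derive(Default)]
pub struct InMemoryStore {
    pub fail: bool,
    pub saved: HashMap<String, String>,
}

impl MetadataStore for InMemoryStore {
    fn put_default_role(&mut self, tenant: &str, role: &str) -> Result<(), String> {
        if self.fail {
            return Err("storage unavailable".to_string());
        }
        self.saved.insert(tenant.to_owned(), role.to_owned());
        Ok(())
    }
}

pub fn set_default_role<S: MetadataStore>(
    store: &mut S,
    cache: &RwLock<HashMap<String, String>>,
    tenant: &str,
    role: &str,
) -> Result<(), String> {
    // 1. Persist first; on failure the in-memory cache is left untouched.
    store.put_default_role(tenant, role)?;
    // 2. Only then update the cache, mapping a poisoned lock to an error
    //    instead of unwrapping it.
    let mut guard = cache
        .write()
        .map_err(|_| "default-role lock poisoned".to_string())?;
    guard.insert(tenant.to_owned(), role.to_owned());
    Ok(())
}
```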


173-191: Remove commented-out block (already flagged).

src/rbac/role.rs (1)

63-63: Action::SuperAdmin variant appears unused.

The SuperAdmin variant is defined here and handled in build() at line 171, but super_admin_perm_builder() at lines 232-237 constructs Action::All instead. Unless there's an external instantiation path, this variant is dead code.

#!/bin/bash
# Verify if Action::SuperAdmin is instantiated anywhere
rg -n "Action::SuperAdmin" --type rust -B 2 -A 2
src/handlers/http/query.rs (1)

118-131: Consider deriving tenant from authenticated session instead of header.

Using get_tenant_id_from_request extracts the tenant from an HTTP header, which could be spoofable. For stronger isolation, consider using get_user_and_tenant_from_request (or similar session-based extraction) to tie tenant scoping to the authenticated user's session.

src/storage/store_metadata.rs (2)

126-131: Tenant-specific staging metadata isn't read.

resolve_parseable_metadata now receives tenant_id but still calls get_staging_metadata() without it. This means non-default tenants can read the default tenant's staging metadata and mis-detect environment changes.


309-324: Ensure tenant staging directory exists before write.

put_staging_metadata writes into a tenant subdirectory but never creates it. For new tenants this will fail with NotFound on open.
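Both staging-metadata comments above come down to resolving one tenant-aware path and making sure its parent exists before writing. A sketch using only the standard library; the function names, the explicit `staging_dir` parameter, and the `.parseable.json` file name are assumptions for illustration, not the actual signatures in src/storage/store_metadata.rs.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Assumed file name, for illustration only.
const STAGING_METADATA_FILE: &str = ".parseable.json";

/// `<staging>/<tenant>/<file>` for tenants, `<staging>/<file>` otherwise.
pub fn staging_metadata_path(staging_dir: &Path, tenant_id: &Option<String>) -> PathBuf {
    match tenant_id.as_deref() {
        Some(tenant) => staging_dir.join(tenant).join(STAGING_METADATA_FILE),
        None => staging_dir.join(STAGING_METADATA_FILE),
    }
}

/// Creates the tenant directory if needed, then writes the metadata bytes.
pub fn write_staging_metadata(
    staging_dir: &Path,
    tenant_id: &Option<String>,
    bytes: &[u8],
) -> io::Result<PathBuf> {
    let path = staging_metadata_path(staging_dir, tenant_id);
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(&path, bytes)?;
    Ok(path)
}

/// Reads the metadata for one tenant; a missing file is `Ok(None)`, not an error.
pub fn read_staging_metadata(
    staging_dir: &Path,
    tenant_id: &Option<String>,
) -> io::Result<Option<Vec<u8>>> {
    match fs::read(staging_metadata_path(staging_dir, tenant_id)) {
        Ok(bytes) => Ok(Some(bytes)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

Because reads and writes share `staging_metadata_path`, a non-default tenant can no longer pick up the default tenant's file by accident.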

src/handlers/http/modal/ingest/ingestor_role.rs (3)

47-51: Inverted tenant validation logic.

The condition req_tenant.ne(DEFAULT_TENANT) && (req_tenant_id.eq(&sync_req.tenant_id)) checks if the request tenant is NOT the default AND matches the payload tenant. This would reject valid same-tenant operations. Based on the error message, the second condition should check for a mismatch.

🐛 Proposed fix
-    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant_id.eq(&sync_req.tenant_id)) {
+    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant_id.ne(&sync_req.tenant_id)) {
         return Err(RoleError::Anyhow(anyhow::Error::msg(
             "non super-admin user trying to create role for another tenant",
         )));
     }

58-58: Silently ignoring storage error.

The result of put_staging_metadata is discarded with let _ = .... If metadata persistence fails, the in-memory state will be updated but storage will be inconsistent.

🐛 Proposed fix
-    let _ = storage::put_staging_metadata(&metadata, &sync_req.tenant_id);
+    storage::put_staging_metadata(&metadata, &sync_req.tenant_id)?;

53-66: Inconsistent tenant_id usage between metadata and in-memory operations.

Lines 53 and 58 use sync_req.tenant_id for metadata operations, but lines 59-62 derive tenant_id from req_tenant_id for in-memory mut_roles() updates. This inconsistency could cause role updates to target the wrong tenant. Compare with src/handlers/http/role.rs (lines 55-59 in relevant snippets) which consistently uses the same tenant source for both operations.

🐛 Proposed fix - use consistent tenant source
-    let tenant_id = req_tenant_id
-        .as_deref()
-        .unwrap_or(DEFAULT_TENANT)
-        .to_owned();
+    let tenant_id = sync_req.tenant_id
+        .as_deref()
+        .unwrap_or(DEFAULT_TENANT)
+        .to_owned();
     mut_roles()
         .entry(tenant_id.clone())
src/handlers/http/oidc.rs (3)

112-116: Basic-auth tenant extraction may return None incorrectly.

For SessionKey::BasicAuth, get_tenant_id_from_key() returns None because basic credentials have no pre-established session mapping. This causes Users.get_user(&username, &None) to default to the DEFAULT_TENANT regardless of the actual tenant context. Consider using get_tenant_id_from_request(&req) for the basic-auth branch.


130-159: Intra-cluster login sync should check response status.

The sync request uses .send().await? which succeeds on non-2xx responses. A node returning 4xx/5xx would be silently treated as success.

🐛 Proposed fix
                         INTRA_CLUSTER_CLIENT
                             .post(url)
                             .header(header::AUTHORIZATION, node.token)
                             .header(header::CONTENT_TYPE, "application/json")
                             .json(&json!(
                                 {
                                     "sessionCookie": _session,
                                     "user": _user,
                                     "expiry": EXPIRY_DURATION
                                 }
                             ))
                             .send()
-                            .await?;
+                            .await?
+                            .error_for_status()?;
                         Ok::<(), anyhow::Error>(())

323-327: New OAuth users created without tenant association.

The comment "LET TENANT BE NONE FOR NOW!!!" at line 325 indicates that new OAuth users are intentionally created without tenant association, despite tenant_id being extracted and used throughout the function. This creates a multi-tenant isolation gap.

🐛 Proposed fix (if tenant should be used)
-        // LET TENANT BE NONE FOR NOW!!!
-        (None, roles) => put_user(&user_id, roles, user_info, bearer, None).await?,
+        (None, roles) => put_user(&user_id, roles, user_info, bearer, tenant_id.clone()).await?,
src/parseable/streams.rs (1)

1389-1401: Tests still use old metrics label sets (missing tenant label).

STAGING_FILES and STORAGE_SIZE now include tenant labels, but tests still use the pre-tenant label list, causing label-count panics.

🛠️ Example test fix
-        let staging_files = metrics::STAGING_FILES.with_label_values(&[&stream]).get();
+        let tenant = DEFAULT_TENANT;
+        let staging_files = metrics::STAGING_FILES.with_label_values(&[&stream, tenant]).get();
...
-        let storage_size_arrows = metrics::STORAGE_SIZE
-            .with_label_values(&["staging", &stream, "arrows"])
+        let storage_size_arrows = metrics::STORAGE_SIZE
+            .with_label_values(&["staging", &stream, "arrows", tenant])
             .get();
...
-        let storage_size_parquet = metrics::STORAGE_SIZE
-            .with_label_values(&["staging", &stream, "parquet"])
+        let storage_size_parquet = metrics::STORAGE_SIZE
+            .with_label_values(&["staging", &stream, "parquet", tenant])
             .get();
src/query/mod.rs (1)

181-204: Log schema registration failures instead of dropping them.

Schema registration errors are currently ignored with let _ = ..., which can hide startup issues and lead to hard-to-debug query failures. Please log the error (and keep startup alive if that’s intended).

🔧 Suggested fix
-                    let _ = catalog.register_schema(t, schema_provider);
+                    if let Err(e) = catalog.register_schema(t, schema_provider) {
+                        tracing::error!("failed to register schema for tenant {t}: {e}");
+                    }
...
-            let _ = catalog.register_schema(
+            if let Err(e) = catalog.register_schema(
                 &state.config_options().catalog.default_schema,
                 schema_provider,
-            );
+            ) {
+                tracing::error!("failed to register default schema: {e}");
+            }
src/rbac/mod.rs (2)

304-307: Avoid panicking on invalid tenant header values.

HeaderValue::from_bytes(...).unwrap() can panic if tenant contains invalid header bytes. Prefer fallible handling.

🔧 Safer insertion
-            req.headers_mut().insert(
-                HeaderName::from_static("tenant"),
-                HeaderValue::from_bytes(tenant.as_bytes()).unwrap(),
-            );
+            if let Ok(value) = HeaderValue::from_str(tenant) {
+                req.headers_mut()
+                    .insert(HeaderName::from_static("tenant"), value);
+            }

290-327: Basic-auth tenant selection is nondeterministic across tenants.

Parallel find_any across tenant maps can pick an arbitrary tenant when credentials collide. Enforce uniqueness or make the lookup deterministic / tenant-scoped.
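One possible shape for a deterministic lookup, as a sketch only: tenants are walked in sorted order and a credential that matches in more than one tenant is reported as ambiguous instead of silently picking one. The `TenantLookup` enum and the nested `BTreeMap` layout are hypothetical, and secrets are compared as plain strings purely for illustration; real code would verify a password hash.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
pub enum TenantLookup {
    NotFound,
    Unique(String),
    /// Same credentials matched in several tenants (listed in sorted order).
    Ambiguous(Vec<String>),
}

/// `users` maps tenant -> (username -> secret). BTreeMap iteration is ordered
/// by key, so the result does not depend on scheduling or hash order.
pub fn find_tenant_for_credentials(
    users: &BTreeMap<String, BTreeMap<String, String>>,
    username: &str,
    secret: &str,
) -> TenantLookup {
    let mut matches: Vec<String> = users
        .iter()
        .filter(|(_, tenant_users)| {
            tenant_users
                .get(username)
                .is_some_and(|stored| stored.as_str() == secret)
        })
        .map(|(tenant, _)| tenant.clone())
        .collect();

    match matches.len() {
        0 => TenantLookup::NotFound,
        1 => TenantLookup::Unique(matches.remove(0)),
        _ => TenantLookup::Ambiguous(matches),
    }
}
```

Whether an ambiguous match should be rejected or resolved by an explicit tenant hint is a policy decision this sketch leaves open.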

src/users/dashboards.rs (1)

255-265: Dashboard creation silently no-ops for new tenants.

When the tenant bucket doesn’t exist, creation returns Ok(()) without persisting anything. Use entry().or_default() to ensure tenant storage exists.

🐛 Suggested fix
-        if let Some(dbs) = dashboards.get_mut(tenant) {
-            let has_duplicate = dbs
-                .iter()
-                .any(|d| d.title == dashboard.title && d.dashboard_id != dashboard.dashboard_id);
-            if has_duplicate {
-                return Err(DashboardError::Metadata("Dashboard title must be unique"));
-            }
-            self.save_dashboard(dashboard, tenant_id).await?;
-
-            dbs.push(dashboard.clone());
-        }
+        let dbs = dashboards.entry(tenant.to_owned()).or_default();
+        let has_duplicate = dbs
+            .iter()
+            .any(|d| d.title == dashboard.title && d.dashboard_id != dashboard.dashboard_id);
+        if has_duplicate {
+            return Err(DashboardError::Metadata("Dashboard title must be unique"));
+        }
+        self.save_dashboard(dashboard, tenant_id).await?;
+        dbs.push(dashboard.clone());
src/prism/logstream/mod.rs (1)

63-71: Stats are hardcoded to defaults (real stats ignored).

The actual stats result is discarded and replaced with QueriedStats::default(), which breaks dataset stats. Use the real stats? result.

🐛 Suggested fix
-    // let stats = stats?;
-    tracing::warn!(real_stats=?stats);
-    let stats = QueriedStats::default();
-    tracing::warn!("got FAKE stats");
+    let stats = stats?;
src/metastore/metastores/object_store_metastore.rs (4)

317-323: Use empty-string default for alert-state base path.

Line 321 falls back to DEFAULT_TENANT, but default-tenant alert state files live under /.alerts/ (no tenant directory). This will miss default-tenant states. Align with the empty-string path used elsewhere for default tenants.

🛠️ Suggested fix
-        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
-        let base_path = RelativePathBuf::from_iter([tenant, ALERTS_ROOT_DIRECTORY]);
+        let base_path = if let Some(tenant) = tenant_id.as_deref() {
+            RelativePathBuf::from_iter([tenant, ALERTS_ROOT_DIRECTORY])
+        } else {
+            RelativePathBuf::from(ALERTS_ROOT_DIRECTORY)
+        };

Based on learnings, default-tenant paths are intentionally built with empty string segments.


516-541: Accumulate dashboards per tenant instead of overwriting.

Line 539 inserts dashboards per user, overwriting prior users within the same tenant. Collect per-tenant dashboards and insert once after iterating users.

🛠️ Suggested fix
-        for mut tenant in base_paths {
-            let tenant_id = &Some(tenant.clone());
-            let users_dir = RelativePathBuf::from_iter([&tenant, USERS_ROOT_DIR]);
-            for user in self
-                .storage
-                .list_dirs_relative(&users_dir, tenant_id)
-                .await?
-            {
-                let dashboards_path = users_dir.join(&user).join("dashboards");
-                let dashboard_bytes = self
-                    .storage
-                    .get_objects(
-                        Some(&dashboards_path),
-                        Box::new(|file_name| file_name.ends_with(".json")),
-                        tenant_id,
-                    )
-                    .await?;
-                if tenant.is_empty() {
-                    tenant.clone_from(&DEFAULT_TENANT.to_string());
-                }
-                dashboards.insert(tenant.to_owned(), dashboard_bytes);
-                // dashboards.extend(dashboard_bytes);
-            }
-        }
+        for mut tenant in base_paths {
+            let tenant_id = &Some(tenant.clone());
+            let users_dir = RelativePathBuf::from_iter([&tenant, USERS_ROOT_DIR]);
+            let mut tenant_dashboards = Vec::new();
+            for user in self
+                .storage
+                .list_dirs_relative(&users_dir, tenant_id)
+                .await?
+            {
+                let dashboards_path = users_dir.join(&user).join("dashboards");
+                let dashboard_bytes = self
+                    .storage
+                    .get_objects(
+                        Some(&dashboards_path),
+                        Box::new(|file_name| file_name.ends_with(".json")),
+                        tenant_id,
+                    )
+                    .await?;
+                tenant_dashboards.extend(dashboard_bytes);
+            }
+            if tenant.is_empty() {
+                tenant.clone_from(&DEFAULT_TENANT.to_string());
+            }
+            dashboards.insert(tenant, tenant_dashboards);
+        }

674-693: Avoid unwraps after filter migration.

Lines 679-692 unwrap filter_value.as_object() after migrate_v1_v2. If migration produces a non-object, this will panic. Guard the extracted object before reading fields.

🛠️ Suggested fix
-                                let user_id = filter_value
-                                    .as_object()
-                                    .unwrap()
-                                    .get("user_id")
-                                    .and_then(|user_id| user_id.as_str());
-                                let filter_id = filter_value
-                                    .as_object()
-                                    .unwrap()
-                                    .get("filter_id")
-                                    .and_then(|filter_id| filter_id.as_str());
-                                let stream_name = filter_value
-                                    .as_object()
-                                    .unwrap()
-                                    .get("stream_name")
-                                    .and_then(|stream_name| stream_name.as_str());
+                                let migrated_obj = filter_value.as_object();
+                                let user_id = migrated_obj
+                                    .and_then(|o| o.get("user_id"))
+                                    .and_then(|v| v.as_str());
+                                let filter_id = migrated_obj
+                                    .and_then(|o| o.get("filter_id"))
+                                    .and_then(|v| v.as_str());
+                                let stream_name = migrated_obj
+                                    .and_then(|o| o.get("stream_name"))
+                                    .and_then(|v| v.as_str());

1167-1171: Pass tenant_id to delete_prefix for accurate attribution.

Line 1169 passes &None, so storage metrics/labels can be attributed to the default tenant instead of the deleted tenant. Use the actual tenant_id.

🛠️ Suggested fix
-        self.storage
-            .delete_prefix(&RelativePathBuf::from(tenant_id), &None)
+        self.storage
+            .delete_prefix(&RelativePathBuf::from(tenant_id), &Some(tenant_id.to_string()))
src/alerts/alert_types.rs (1)

90-97: Avoid panic when building Authorization header.

Line 96 uses HeaderValue::from_str(&header).unwrap(), which can panic if stored auth is malformed. Convert this to a handled error (or skip the header) so alert evaluation can fail gracefully.

🛠️ Suggested fix
-            map.insert(
-                HeaderName::from_static("authorization"),
-                HeaderValue::from_str(&header).unwrap(),
-            );
+            let value = HeaderValue::from_str(&header)
+                .map_err(|e| AlertError::CustomError(format!("Invalid auth header: {e}")))?;
+            map.insert(HeaderName::from_static("authorization"), value);
src/tenants/mod.rs (2)

36-79: Keep suspension state in sync with StorageMetadata.

Line 38 duplicates suspended_services outside StorageMetadata, but suspend_service/resume_service only update the HashSet. This leaves meta.suspended_services stale, so get_tenants() returns outdated suspension state.

🛠️ Suggested fix (sync meta on updates)
     pub fn suspend_service(&self, tenant_id: &str, service: Service) {
         if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
-            tenant.suspended_services.insert(service);
+            tenant.suspended_services.insert(service.clone());
+            let services = tenant.meta.suspended_services.get_or_insert_with(HashSet::new);
+            services.insert(service);
         }
     }
 
     pub fn resume_service(&self, tenant_id: &str, service: Service) {
         if let Some(mut tenant) = self.tenants.get_mut(tenant_id) {
             tenant.suspended_services.remove(&service);
+            if let Some(services) = tenant.meta.suspended_services.as_mut() {
+                services.remove(&service);
+                if services.is_empty() {
+                    tenant.meta.suspended_services = None;
+                }
+            }
         }
     }

86-104: Return Ok(None) for missing tenants.

Line 103 returns Err(TenantNotFound) for absent tenants, but the effective behavior is “not suspended.” Returning Ok(None) simplifies callers and matches semantics.

🛠️ Suggested fix
-        } else {
-            Err(TenantNotFound(tenant_id.to_owned()))
-        }
+        } else {
+            Ok(None)
+        }
src/utils/mod.rs (2)

29-68: Avoid panic when BasicAuth omits a password.
Basic auth allows empty passwords; unwrap() will panic. This should mirror the safe handling in src/utils/actix.rs (Lines 50-70).

🐛 Proposed fix
-    if let Ok(basic) = &creds {
-        Users.mutate_request_with_basic_user(basic.user_id(), basic.password().unwrap(), req);
+    if let Ok(basic) = &creds {
+        Users.mutate_request_with_basic_user(
+            basic.user_id(),
+            basic.password().unwrap_or(""),
+            req,
+        );

122-135: Handle non‑UTF8 tenant header values safely.
to_str() returns an error for any value containing bytes outside visible ASCII, so to_str().unwrap() can panic on a malformed tenant header.

🐛 Proposed fix
 pub fn get_tenant_id_from_request(req: &HttpRequest) -> Option<String> {
     req.headers()
         .get("tenant")
-        .map(|tenant_value| tenant_value.to_str().unwrap().to_owned())
+        .and_then(|tenant_value| tenant_value.to_str().ok())
+        .map(|s| s.to_owned())
 }
src/query/stream_schema_provider.rs (2)

61-110: Avoid TOCTOU between table_exist and get_stream.
If the stream is deleted between checks, the .expect(STREAM_EXISTS) can panic.

🐛 Proposed fix
-        if self.table_exist(name) {
-            Ok(Some(Arc::new(StandardTableProvider {
-                schema: PARSEABLE
-                    .get_stream(name, &self.tenant_id)
-                    .expect(STREAM_EXISTS)
-                    .get_schema(),
-                stream: name.to_owned(),
-                tenant_id: self.tenant_id.clone(),
-            })))
-        } else {
-            Ok(None)
-        }
+        match PARSEABLE.get_stream(name, &self.tenant_id) {
+            Ok(stream) => Ok(Some(Arc::new(StandardTableProvider {
+                schema: stream.get_schema(),
+                stream: name.to_owned(),
+                tenant_id: self.tenant_id.clone(),
+            }))),
+            Err(_) => Ok(None),
+        }

636-647: Avoid panicking on ObjectStoreUrl parsing.
ObjectStoreUrl::parse(...).unwrap() will crash on invalid configuration; this path already returns Result, so propagate the error.

🐛 Proposed fix
-        let object_store_url = glob_storage.store_url();
-
-        self.create_parquet_physical_plan(
-            &mut execution_plans,
-            ObjectStoreUrl::parse(object_store_url).unwrap(),
+        let object_store_url = glob_storage.store_url();
+        let object_store_url = ObjectStoreUrl::parse(object_store_url)
+            .map_err(|e| DataFusionError::Plan(format!("Invalid object store URL: {e}")))?;
+
+        self.create_parquet_physical_plan(
+            &mut execution_plans,
+            object_store_url,
             partitioned_files,
             statistics,
             projection,
             filters,
             limit,
             state,
             time_partition.clone(),
         )
src/handlers/http/middleware.rs (2)

171-210: Default‑tenant sessions are rejected in non‑ingest multi‑tenant flow.
get_user_and_tenant_from_request returns None for DEFAULT_TENANT sessions (src/utils/mod.rs, Lines 102‑106), but the current branch treats None as InvalidTenantId. Also, HeaderValue::from_str(...).unwrap() can panic on invalid header characters.

🐛 Proposed fix
-                    if let Ok((_, tenant)) = get_user_and_tenant_from_request(req.request())
-                        && let Some(tid) = tenant.as_ref()
-                    {
-                        req.headers_mut().insert(
-                            HeaderName::from_static("tenant"),
-                            HeaderValue::from_str(tid).unwrap(),
-                        );
-                        tenant
-                    } else {
-                        header_error = Some(actix_web::Error::from(PostError::Header(
-                            crate::utils::header_parsing::ParseHeaderError::InvalidTenantId,
-                        )));
-                        None
-                    }
+                    match get_user_and_tenant_from_request(req.request()) {
+                        Ok((_, tenant)) => {
+                            let tid = tenant
+                                .as_ref()
+                                .map_or(DEFAULT_TENANT.to_string(), |v| v.clone());
+                            match HeaderValue::from_str(&tid) {
+                                Ok(val) => {
+                                    req.headers_mut()
+                                        .insert(HeaderName::from_static("tenant"), val);
+                                    Some(tid)
+                                }
+                                Err(_) => {
+                                    header_error = Some(actix_web::Error::from(PostError::Header(
+                                        crate::utils::header_parsing::ParseHeaderError::InvalidTenantId,
+                                    )));
+                                    None
+                                }
+                            }
+                        }
+                        Err(_) => {
+                            header_error = Some(actix_web::Error::from(PostError::Header(
+                                crate::utils::header_parsing::ParseHeaderError::InvalidTenantId,
+                            )));
+                            None
+                        }
+                    }

375-387: Don’t treat unknown tenants as Authorized.
TENANT_METADATA.is_action_suspended returning Err should not fall through to Authorized.

🐛 Proposed fix
-    if let Some(tenant) = req.headers().get("tenant")
-        && let Ok(tenant) = tenant.to_str()
-    {
-        if let Ok(Some(suspension)) = TENANT_METADATA.is_action_suspended(tenant, &action) {
-            return rbac::Response::Suspended(suspension);
-        } else {
-            // tenant does not exist
-        }
-    }
+    if let Some(tenant) = req.headers().get("tenant")
+        && let Ok(tenant) = tenant.to_str()
+    {
+        match TENANT_METADATA.is_action_suspended(tenant, &action) {
+            Ok(Some(suspension)) => return rbac::Response::Suspended(suspension),
+            Ok(None) => {}
+            Err(_) => return rbac::Response::UnAuthorized,
+        }
+    }
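
A minimal sketch of the fall-through problem, assuming a simplified registry: `Gate`, `TenantRegistry`, and `sample_registry` are hypothetical stand-ins for `rbac::Response` and `TENANT_METADATA`, not the project's actual types. It only illustrates why each `Result` arm needs its own outcome.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `rbac::Response`.
#[derive(Debug, PartialEq)]
enum Gate {
    Authorized,
    Suspended(String),
    Unauthorized,
}

/// Hypothetical registry: tenant id -> optional suspension reason.
struct TenantRegistry(HashMap<String, Option<String>>);

impl TenantRegistry {
    /// `Err(())` means the tenant is unknown (assumed to mirror a "tenant not found" error).
    fn is_action_suspended(&self, tenant: &str) -> Result<Option<String>, ()> {
        self.0.get(tenant).cloned().ok_or(())
    }
}

fn check_suspension(registry: &TenantRegistry, tenant: Option<&str>) -> Gate {
    let Some(tenant) = tenant else {
        // No tenant header: nothing to check in this sketch.
        return Gate::Authorized;
    };
    match registry.is_action_suspended(tenant) {
        Ok(Some(reason)) => Gate::Suspended(reason),
        Ok(None) => Gate::Authorized,
        // An unknown tenant must not fall through to Authorized.
        Err(()) => Gate::Unauthorized,
    }
}

fn sample_registry() -> TenantRegistry {
    let mut tenants = HashMap::new();
    tenants.insert("acme".to_string(), None);
    tenants.insert("frozen".to_string(), Some("billing".to_string()));
    TenantRegistry(tenants)
}
```

With an `if let Ok(Some(..))` check, the `Err` case shares a branch with `Ok(None)`; the exhaustive `match` forces a decision for each.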
src/users/filters.rs (1)

128-133: Filter updates are dropped when tenant bucket is missing.
get_mut() skips insertion for new tenants, so the update is silently lost.

🐛 Proposed fix
     pub async fn update(&self, filter: &Filter, tenant_id: &Option<String>) {
         let mut s = self.0.write().await;
-        if let Some(filters) = s.get_mut(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)) {
-            filters.retain(|f| f.filter_id != filter.filter_id);
-            filters.push(filter.clone());
-        }
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let filters = s.entry(tenant.to_owned()).or_default();
+        filters.retain(|f| f.filter_id != filter.filter_id);
+        filters.push(filter.clone());
     }
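
The difference between `get_mut()` and `entry().or_default()` can be sketched on a plain `HashMap`. The `Filter` struct and the value of `DEFAULT_TENANT` below are simplified assumptions for illustration, and the async lock from the real code is omitted.

```rust
use std::collections::HashMap;

// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Simplified, hypothetical filter type.
#[derive(Clone, Debug, PartialEq)]
struct Filter {
    filter_id: String,
    query: String,
}

/// Upsert: creates the tenant bucket on first use, then replaces by id.
fn upsert(
    map: &mut HashMap<String, Vec<Filter>>,
    tenant_id: &Option<String>,
    filter: &Filter,
) {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    // entry().or_default() inserts an empty Vec when the key is missing,
    // so the first filter for a new tenant is not dropped.
    let filters = map.entry(tenant.to_owned()).or_default();
    filters.retain(|f| f.filter_id != filter.filter_id);
    filters.push(filter.clone());
}
```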
src/stats.rs (1)

126-175: Normalize empty tenant IDs to avoid split metric series.
If callers pass Some(""), labels become "" instead of DEFAULT_TENANT, creating parallel namespaces and inconsistent deletes/lookups.

🔧 Suggested fix
+fn normalize_tenant<'a>(tenant_id: &'a Option<String>) -> &'a str {
+    tenant_id
+        .as_deref()
+        .filter(|t| !t.is_empty())
+        .unwrap_or(DEFAULT_TENANT)
+}
...
-    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+    let tenant = normalize_tenant(tenant_id);
...
-    if let Some(tenant_id) = tenant_id.as_ref() {
-        [stream_name, format, tenant_id]
-    } else {
-        [stream_name, format, DEFAULT_TENANT]
-    }
+    let tenant = normalize_tenant(tenant_id);
+    [stream_name, format, tenant]
...
-    if let Some(tenant_id) = tenant_id.as_ref() {
-        ["data", stream_name, "parquet", tenant_id]
-    } else {
-        ["data", stream_name, "parquet", DEFAULT_TENANT]
-    }
+    let tenant = normalize_tenant(tenant_id);
+    ["data", stream_name, "parquet", tenant]

Based on learnings, default-tenant paths can use empty-string segments, so metrics should normalize empty to DEFAULT_TENANT.

Also applies to: 235-255
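
As a standalone sketch of the normalization suggested above: the value of `DEFAULT_TENANT` is a placeholder, and `storage_labels` is a hypothetical helper that only mirrors the shape of the label arrays in the diff.

```rust
// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "default";

/// Maps both `None` and `Some("")` to the default tenant label.
fn normalize_tenant(tenant_id: &Option<String>) -> &str {
    tenant_id
        .as_deref()
        .filter(|t| !t.is_empty())
        .unwrap_or(DEFAULT_TENANT)
}

/// Hypothetical label builder: every caller goes through the same
/// normalization, so writes and deletes address the same metric series.
fn storage_labels<'a>(stream: &'a str, tenant_id: &'a Option<String>) -> [&'a str; 4] {
    ["data", stream, "parquet", normalize_tenant(tenant_id)]
}
```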

src/alerts/target.rs (1)

61-69: Normalize DEFAULT_TENANT to None when loading targets.
Line 65 currently stores Some(DEFAULT_TENANT) in t.tenant, which conflicts with the rest of the codebase where default tenant is represented as None.

🔧 Suggested fix
-                .map(|mut t| {
-                    t.tenant = Some(tenant_id.clone());
+                .map(|mut t| {
+                    t.tenant = if tenant_id == DEFAULT_TENANT {
+                        None
+                    } else {
+                        Some(tenant_id.clone())
+                    };
                     (t.id, t)
                 })
src/handlers/http/modal/ingest/ingestor_rbac.rs (3)

49-57: Tenant validation condition is inverted.
Lines 51–53 reject same-tenant requests instead of cross-tenant ones.

🐛 Suggested fix
-        if req_tenant.ne(DEFAULT_TENANT)
-            && (req_tenant.eq(user.tenant.as_deref().unwrap_or(DEFAULT_TENANT)))
+        if req_tenant.ne(DEFAULT_TENANT)
+            && (req_tenant.ne(user.tenant.as_deref().unwrap_or(DEFAULT_TENANT)))
         {

59-60: Do not ignore put_staging_metadata failures.
Swallowing these errors can leave in-memory and on-disk state diverged.

🔧 Suggested fix (all occurrences)
-        let _ = storage::put_staging_metadata(&metadata, req_tenant_id);
+        storage::put_staging_metadata(&metadata, req_tenant_id)?;

Also applies to: 86-86, 133-133, 192-192, 210-210


108-114: Role validation passes for non-existent tenants.
If the tenant key is missing in roles(), these loops silently accept roles.

🐛 Suggested fix
-        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            && tenant_roles.get(r).is_none()
-        {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let role_exists = roles()
+            .get(tenant)
+            .is_some_and(|tenant_roles| tenant_roles.contains_key(r));
+        if !role_exists {
             non_existent_roles.push(r.clone());
         }

Also applies to: 154-160
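
The role check can be sketched in isolation. This assumes a simplified `Roles` map (tenant to a set of role names) rather than the real `roles()` structure, and `non_existent_roles` is a hypothetical helper name.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified assumption: tenant id -> set of role names.
type Roles = HashMap<String, HashSet<String>>;

/// Returns every requested role that is missing, treating a missing
/// tenant bucket the same as a missing role.
fn non_existent_roles(roles: &Roles, tenant: &str, requested: &[&str]) -> Vec<String> {
    requested
        .iter()
        .filter(|r| {
            // is_some_and is false when the tenant key is absent, so the
            // role is reported instead of being silently accepted.
            !roles
                .get(tenant)
                .is_some_and(|tenant_roles| tenant_roles.contains(**r))
        })
        .map(|r| r.to_string())
        .collect()
}
```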

src/handlers/http/modal/query/querier_rbac.rs (3)

60-66: Role validation skips missing tenant entries.
Roles for a non-existent tenant silently pass.

🐛 Suggested fix
-        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            && !tenant_roles.contains_key(role)
-        {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let role_exists = roles()
+            .get(tenant)
+            .is_some_and(|tenant_roles| tenant_roles.contains_key(role));
+        if !role_exists {
             non_existent_roles.push(role.clone());
         }

81-81: User is created without tenant association.
Line 81 passes None, so the user is not bound to the request tenant.

🐛 Suggested fix
-    let (user, password) = user::User::new_basic(username.clone(), None, false);
+    let (user, password) = user::User::new_basic(username.clone(), tenant_id.clone(), false);

198-204: Role validation still skips missing tenants (add/remove).
Same logic hole as in post_user.

🐛 Suggested fix
-        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            && tenant_roles.get(r).is_none()
-        {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let role_exists = roles()
+            .get(tenant)
+            .is_some_and(|tenant_roles| tenant_roles.contains_key(r));
+        if !role_exists {
             non_existent_roles.push(r.clone());
         }

Also applies to: 260-266

src/rbac/user.rs (1)

385-393: Session invalidation should use each user’s tenant.
Line 392 uses the caller’s tenant for all users; this can invalidate the wrong tenant if mixed-tenant membership ever occurs.

🔧 Suggested fix
-        for group_user in &self.users {
-            mut_sessions().remove_user(group_user.userid(), tenant_id);
-        }
+        for group_user in &self.users {
+            mut_sessions().remove_user(
+                group_user.userid(),
+                group_user.tenant_id.as_deref().unwrap_or(DEFAULT_TENANT),
+            );
+        }
src/alerts/mod.rs (1)

1332-1379: Ensure tenant buckets are created on state/notification updates.

get_mut(tenant) drops updates when the tenant bucket is missing (first-time tenant or race). Use entry(...).or_default() like update() does.

♻️ Suggested fix
-        {
-            let mut write_access = self.alerts.write().await;
-
-            let tenant = alert.get_tenant_id().as_deref().unwrap_or(DEFAULT_TENANT);
-            if let Some(alerts) = write_access.get_mut(tenant) {
-                alerts.insert(*alert.get_id(), alert.clone_box());
-            }
-        }
+        {
+            let mut write_access = self.alerts.write().await;
+            let tenant = alert.get_tenant_id().as_deref().unwrap_or(DEFAULT_TENANT);
+            write_access
+                .entry(tenant.to_owned())
+                .or_default()
+                .insert(*alert.get_id(), alert.clone_box());
+        }
...
-        if let Some(alerts) = write_access.get_mut(tenant) {
-            alerts.insert(*alert.get_id(), alert.clone_box());
-        }
+        write_access
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(*alert.get_id(), alert.clone_box());
src/rbac/map.rs (2)

318-329: Fix expired-session retention predicate.

retain(|(_, expiry)| expiry < &now) keeps expired sessions and drops valid ones. That inverts the intended behavior.

🐛 Proposed fix
-        sessions.retain(|(_, expiry)| expiry < &now);
+        sessions.retain(|(_, expiry)| expiry > &now);
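
Since `retain` keeps the elements for which the closure returns `true`, the predicate has to describe the sessions that survive. A small sketch, assuming `std::time::Instant` expiries and a hypothetical `prune_expired` helper (the real code may use a different time type):

```rust
use std::time::{Duration, Instant};

/// Drops sessions whose expiry is at or before `now`.
fn prune_expired(sessions: &mut Vec<(String, Instant)>, now: Instant) {
    // retain keeps elements where the predicate is true,
    // so the predicate must select the still-valid sessions.
    sessions.retain(|(_, expiry)| *expiry > now);
}

/// Builds one already-expired and one still-valid session relative to `now`.
fn sample_sessions(base: Instant) -> (Vec<(String, Instant)>, Instant) {
    let now = base + Duration::from_secs(60);
    let sessions = vec![
        ("stale".to_string(), base),
        ("live".to_string(), base + Duration::from_secs(120)),
    ];
    (sessions, now)
}
```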

446-451: Avoid panic when injecting tenant header.

HeaderValue::from_bytes(...).unwrap() can panic on invalid header bytes. Handle the error instead of unwrapping.

🛠️ Suggested fix
-        if let Some((_, tenant, _)) = self.active_sessions.get(key) {
-            req.headers_mut().insert(
-                HeaderName::from_static("tenant"),
-                HeaderValue::from_bytes(tenant.as_bytes()).unwrap(),
-            );
-        }
+        if let Some((_, tenant, _)) = self.active_sessions.get(key) {
+            if let Ok(value) = HeaderValue::from_str(tenant) {
+                req.headers_mut()
+                    .insert(HeaderName::from_static("tenant"), value);
+            }
+        }
src/handlers/http/logstream.rs (1)

442-445: Resolve or remove the “TODO tenants” marker.

Either confirm hot-tier handling is already tenant-aware and remove the TODO, or complete the missing tenant scoping.

src/parseable/mod.rs (4)

1072-1087: Make add_tenant atomic to avoid TOCTOU races.

The read-then-write lock pattern allows another thread to insert the tenant between checks.

🐛 Suggested fix
-        if self.tenants.read().unwrap().contains(&tenant_id) {
-            return Err(anyhow::Error::msg(format!(
-                "Tenant with id- {tenant_id} already exists"
-            )));
-        } else {
-            self.tenants.write().unwrap().push(tenant_id.clone());
-            TENANT_METADATA.insert_tenant(tenant_id, tenant_meta);
-        }
+        let mut tenants = self.tenants.write().unwrap();
+        if tenants.contains(&tenant_id) {
+            return Err(anyhow::Error::msg(format!(
+                "Tenant with id- {tenant_id} already exists"
+            )));
+        }
+        tenants.push(tenant_id.clone());
+        TENANT_METADATA.insert_tenant(tenant_id, tenant_meta);
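
The check-then-insert under one write guard can be sketched as follows. `Tenants` is a hypothetical reduced type; the real `add_tenant` also updates `TENANT_METADATA`, which is left out here.

```rust
use std::sync::RwLock;

/// Hypothetical reduced tenant list.
struct Tenants {
    ids: RwLock<Vec<String>>,
}

impl Tenants {
    /// Check and insert while holding a single write guard, so no other
    /// thread can add the same tenant between the two steps.
    fn add(&self, tenant_id: &str) -> Result<(), String> {
        let mut ids = self
            .ids
            .write()
            .map_err(|_| "tenants lock poisoned".to_string())?;
        if ids.iter().any(|t| t == tenant_id) {
            return Err(format!("tenant {tenant_id} already exists"));
        }
        ids.push(tenant_id.to_owned());
        Ok(())
    }
}
```

Mapping the poisoned-lock case to an error is one option; the suggested fix above unwraps instead, which is a design choice for the maintainers.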

1131-1157: Complete tenant deletion (tenants list + storage cleanup).

delete_tenant clears RBAC and TENANT_METADATA but leaves self.tenants and object-store prefixes intact, causing stale state and data retention.

🧹 Minimal in-memory cleanup
-        // delete from in-mem
-        TENANT_METADATA.delete_tenant(tenant_id);
+        // delete from in-mem
+        if let Ok(mut tenants) = self.tenants.write() {
+            tenants.retain(|t| t != tenant_id);
+        }
+        TENANT_METADATA.delete_tenant(tenant_id);

Consider also deleting tenant storage via the metastore (likely requires making this async).


1160-1195: Handle invalid tenant dirs and lock poisoning explicitly.

The else if !is_multi_tenant {} branch is a no-op, and poisoned lock handling returns Ok(None) silently.

🛠️ Suggested fix
-            } else if !is_multi_tenant {
-            } else {
+            } else if is_multi_tenant {
                 return Err(anyhow::Error::msg(format!(
                     "Found invalid tenant directory with multi-tenant mode- {tenant_id}.\nExiting."
                 )));
             }
+            // In single-tenant mode, ignore extra tenant dirs explicitly (or return Err).
 ...
-        if let Ok(mut t) = self.tenants.write() {
-            t.extend(dirs);
-            Ok(Some(()))
-        } else {
-            Ok(None)
-        }
+        let mut t = self.tenants.write().expect("tenants lock poisoned");
+        t.extend(dirs);
+        Ok(Some(()))

1198-1206: Avoid silently ignoring poisoned locks in list_tenants.

Returning None on lock failure hides a real fault. Prefer expect() or propagate an error.

src/migration/mod.rs (2)

168-171: Don’t abort all migrations when one tenant’s list_streams fails.

list_streams(&tenant_id).await? returns early and skips remaining tenants.

♻️ Suggested fix
-        let stream_names = PARSEABLE.metastore.list_streams(&tenant_id).await?;
+        let stream_names = match PARSEABLE.metastore.list_streams(&tenant_id).await {
+            Ok(names) => names,
+            Err(e) => {
+                warn!("Failed to list streams for tenant {:?}: {:?}", tenant_id, e);
+                continue;
+            }
+        };
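
A sketch of the per-tenant error handling, assuming a hypothetical `migrate_all` that takes the listing function as a parameter so it can be exercised without a metastore. Recording failed tenants is an addition for illustration; the suggested fix above only logs.

```rust
/// Runs a (stubbed) migration per tenant and keeps going when one
/// tenant's stream listing fails. Returns (migrated, failed tenants).
fn migrate_all(
    tenants: &[&str],
    list_streams: impl Fn(&str) -> Result<Vec<String>, String>,
) -> (Vec<String>, Vec<String>) {
    let mut migrated = Vec::new();
    let mut failed = Vec::new();
    for &tenant in tenants {
        let streams = match list_streams(tenant) {
            Ok(names) => names,
            Err(e) => {
                // Log and continue instead of returning early with `?`.
                eprintln!("failed to list streams for tenant {tenant}: {e}");
                failed.push(tenant.to_string());
                continue;
            }
        };
        for stream in streams {
            migrated.push(format!("{tenant}/{stream}"));
        }
    }
    (migrated, failed)
}
```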

499-507: Use PARSEABLE_METADATA_FILE_NAME instead of a hardcoded filename.

This should align with get_staging_metadata and other staging metadata writers.

🛠️ Suggested fix
-            .join(".parseable.json")
+            .join(PARSEABLE_METADATA_FILE_NAME)
 ...
-        config.options.staging_dir().join(".parseable.json")
+        config.options.staging_dir().join(PARSEABLE_METADATA_FILE_NAME)
src/storage/object_storage.rs (1)

1212-1224: Default-tenant paths can drift from existing layout.

schema_path/stream_json_path always prefix a tenant segment when tenant_id is Some(...). If callers pass Some(DEFAULT_TENANT), metadata will land under default/... while parquet paths skip the tenant prefix, splitting metadata/data across directories. Treat DEFAULT_TENANT like None (or empty) when building these paths.

🔧 Suggested fix
 pub fn schema_path(stream_name: &str, tenant_id: &Option<String>) -> RelativePathBuf {
-    let tenant = tenant_id.as_deref().unwrap_or("");
+    let tenant = tenant_id
+        .as_deref()
+        .filter(|t| *t != DEFAULT_TENANT)
+        .unwrap_or("");
     if PARSEABLE.options.mode == Mode::Ingest {
         let id = INGESTOR_META
             .get()
             .unwrap_or_else(|| panic!("{}", INGESTOR_EXPECT))
             .get_node_id();
         let file_name = format!(".ingestor.{id}{SCHEMA_FILE_NAME}");

         RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, &file_name])
     } else {
         RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, SCHEMA_FILE_NAME])
     }
 }

 #[inline(always)]
 pub fn stream_json_path(stream_name: &str, tenant_id: &Option<String>) -> RelativePathBuf {
-    let tenant = tenant_id.as_deref().unwrap_or("");
+    let tenant = tenant_id
+        .as_deref()
+        .filter(|t| *t != DEFAULT_TENANT)
+        .unwrap_or("");
     if PARSEABLE.options.mode == Mode::Ingest {
         let id = INGESTOR_META
             .get()
             .unwrap_or_else(|| panic!("{}", INGESTOR_EXPECT))
             .get_node_id();
         let file_name = format!(".ingestor.{id}{STREAM_METADATA_FILE_NAME}",);
         RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, &file_name])
     } else {
         RelativePathBuf::from_iter([
             tenant,
             stream_name,
             STREAM_ROOT_DIRECTORY,
             STREAM_METADATA_FILE_NAME,
         ])
     }
 }

Based on learnings, default-tenant paths should avoid an extra tenant directory segment.

Also applies to: 1228-1243
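
The intended path behaviour can be sketched with `std::path::PathBuf` standing in for `RelativePathBuf`. `stream_dir` is a hypothetical helper and the `DEFAULT_TENANT` value is a placeholder; the point is only that `None`, an empty id, and the default tenant all resolve to the same un-prefixed layout.

```rust
use std::path::PathBuf;

// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "default";

/// Builds `<tenant>/<stream>` for real tenants and `<stream>` for the
/// default tenant, however the default is represented.
fn stream_dir(stream: &str, tenant_id: &Option<String>) -> PathBuf {
    let mut path = PathBuf::new();
    if let Some(tenant) = tenant_id
        .as_deref()
        .filter(|t| !t.is_empty() && *t != DEFAULT_TENANT)
    {
        path.push(tenant);
    }
    path.push(stream);
    path
}
```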

src/handlers/http/cluster/mod.rs (1)

1901-1917: Tenant context not forwarded in query requests.

send_query_request builds headers without the tenant header, so the querier can default to the wrong tenant. Add the tenant header when tenant_id is present (use the same header name expected by get_tenant_id_from_request).

🔧 Suggested fix
     if let Some(auth) = auth_token {
         // always basic auth
         for (key, value) in auth.iter() {
             if let Ok(name) = reqwest::header::HeaderName::from_bytes(key.as_str().as_bytes())
                 && let Ok(val) = reqwest::header::HeaderValue::from_bytes(value.as_bytes())
             {
                 map.insert(name, val);
             }
         }
     } else {
         map.insert(
             reqwest::header::AUTHORIZATION,
             reqwest::header::HeaderValue::from_str(&querier.token).unwrap(),
         );
     };
+    if let Some(tenant) = tenant_id.as_deref()
+        && let Ok(val) = reqwest::header::HeaderValue::from_str(tenant)
+    {
+        map.insert(reqwest::header::HeaderName::from_static("tenant"), val);
+    }
🧹 Nitpick comments (8)
src/storage/localfs.rs (2)

595-628: Consider adding metrics for consistency.

The list_dirs_relative method accepts _tenant_id but doesn't record any metrics, unlike list_dirs. This creates an inconsistency in observability. Consider adding metrics calls here if this operation should be tracked.


665-702: Consider adding metrics for consistency with list_dates.

Both list_hours and list_minutes accept _tenant_id but don't record any metrics, while list_dates records metrics for the same type of listing operation. Consider adding increment_object_store_calls_by_date calls to maintain observability consistency across listing operations.

src/storage/field_stats.rs (1)

88-109: This follows the existing pattern in src/handlers/http/query.rs where tenant_id.as_deref().unwrap_or("public") is already used to set DataFusion's default_schema.

The concern about tenant_id being Some("") causing an empty default schema is valid in theory, but the identical pattern is already in production code (query.rs) without reported issues. If this is indeed a correctness problem, it should be addressed consistently across both locations rather than only in this new code. The retrieved learning confirms the codebase intentionally uses Some("") for default-tenant representations in other layers.

src/hottier.rs (2)

242-255: Remove commented-out code.

The commented-out lines 242-245 are dead code from the pre-tenant implementation. Consider removing them to improve readability.

♻️ Proposed cleanup
     pub fn hot_tier_file_path(
         &self,
         stream: &str,
         tenant_id: &Option<String>,
     ) -> Result<object_store::path::Path, HotTierError> {
-        // let path = self
-        //     .hot_tier_path
-        //     .join(stream)
-        //     .join(STREAM_HOT_TIER_FILENAME);
         let path = if let Some(tenant_id) = tenant_id.as_ref() {

100-104: Consider extracting repeated tenant enumeration pattern.

The pattern for enumerating tenants (including None for non-tenanted mode) is repeated at lines 100-104, 293-297, 757-761, and 781-785. Consider extracting this into a helper method to reduce duplication.

♻️ Suggested helper
// In PARSEABLE or a utility module:
pub fn enumerate_tenants() -> Vec<Option<String>> {
    if let Some(tenants) = PARSEABLE.list_tenants() {
        tenants.into_iter().map(Some).collect()
    } else {
        vec![None]
    }
}
src/rbac/role.rs (1)

232-237: super_admin_perm_builder is identical to admin_perm_builder.

Both functions return the same RoleBuilder with Action::All and ParseableResourceType::All. Consider whether SuperAdmin should have distinct semantics (e.g., using Action::SuperAdmin) or if the duplication is intentional for future differentiation.

src/event/mod.rs (1)

142-152: Consider normalizing empty tenant IDs.
If any caller passes Some(""), this will look up a non-existent tenant and fail.

💡 Suggested refactor
-    let tenant_id = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+    let tenant_id = tenant_id
+        .as_deref()
+        .filter(|t| !t.is_empty())
+        .unwrap_or(DEFAULT_TENANT);
src/storage/object_storage.rs (1)

759-761: Lower the schema dump log level.

Line 760 logs the full schema at warn, which is noisy and can leak field names. Consider debug/trace instead.

🔧 Suggested tweak
-        tracing::warn!("fetched schema- {schema:?}");
+        tracing::debug!("fetched schema- {schema:?}");

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
src/rbac/user.rs (1)

332-359: UserGroup::validate queries the wrong map level after tenant partitioning.

Line 336 and Line 355 check the outer tenant map, which will (a) miss duplicate groups within the tenant and (b) mark all users as non-existent. Use the tenant bucket before checking group/user existence.

🛠️ Suggested fix
-        if read_user_groups().contains_key(&self.name) {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        if read_user_groups()
+            .get(tenant)
+            .is_some_and(|groups| groups.contains_key(&self.name))
+        {
             return Err(RBACError::UserGroupExists(self.name.clone()));
         }
...
-                if !users().contains_key(group_user.userid()) {
+                if !users()
+                    .get(tenant)
+                    .is_some_and(|tenant_users| tenant_users.contains_key(group_user.userid()))
+                {
                     non_existent_users.push(group_user.userid().to_string());
                 }
src/handlers/http/modal/ingest/ingestor_rbac.rs (1)

201-225: post_gen_password has same ignored storage error pattern.

Line 210 ignores the put_staging_metadata result. Additionally, line 210 writes metadata before the password hash is actually retrieved/modified (lines 211-223), which seems like incorrect ordering.

src/alerts/mod.rs (1)

627-634: Filter SSE broadcasts to sessions in the alert’s tenant.

Right now, any session authorized for the same dataset name in another tenant can receive this alert. Gate by tenant match before user_auth_for_query.

✅ Example guard
-        for (session, _, _) in active_session {
-            if user_auth_for_query(&session, &self.query).await.is_ok()
+        for (session, _, _) in active_session {
+            if get_tenant_id_from_key(&session) == self.tenant_id
+                && user_auth_for_query(&session, &self.query).await.is_ok()
             && let SessionKey::SessionId(id) = &session
             {
                 broadcast_to.push(*id);
             }
         }
src/metastore/metastores/object_store_metastore.rs (1)

896-941: Tenant path is dropped when listing manifest date partitions.

When tenant_id is Some, the date_path constructed at line 919 omits the tenant prefix. Since manifests are stored at paths like {tenant}/{stream_name}/date={date}/... (per put_manifest using partition_path), this causes list_with_delimiter to list from the wrong location.

🐛 Proposed fix
         for date in dates {
-            let date_path = object_store::path::Path::from(format!("{}/{}", stream_name, &date));
+            let date_path = if let Some(tenant) = tenant_id {
+                object_store::path::Path::from(format!("{}/{}/{}", tenant, stream_name, &date))
+            } else {
+                object_store::path::Path::from(format!("{}/{}", stream_name, &date))
+            };
             let resp = self.storage.list_with_delimiter(Some(date_path)).await?;
src/hottier.rs (2)

208-217: Use tenant-aware paths when deleting hot tiers.

delete_hot_tier always uses self.hot_tier_path.join(stream) (Line 216), so deleting a tenant-scoped hot tier can wipe default-tenant data and leave the target tenant behind. Build the directory path with tenant_id.

🐛 Proposed fix
-        let path = self.hot_tier_path.join(stream);
+        let path = if let Some(tenant) = tenant_id.as_ref() {
+            self.hot_tier_path.join(tenant).join(stream)
+        } else {
+            self.hot_tier_path.join(stream)
+        };
         fs::remove_dir_all(path).await?;

187-205: Make hot-tier data/manifest paths tenant-aware.

get_hot_tier derives oldest_date_time_entry via get_oldest_date_time_entry(stream) (Line 203), and process_parquet_file/cleanup_hot_tier_old_data read/write manifests via get_stream_path_for_date(stream, ...) and fetch_hot_tier_dates(stream) (Lines 454 and 620). These helpers ignore tenant_id, so non-default tenants will read/write the default path and can delete or misreport another tenant’s data. Thread tenant_id through those helpers and their call sites to keep per-tenant hot-tier data isolated.

Also applies to: 454-465, 611-668

🤖 Fix all issues with AI agents
In `@src/handlers/http/role.rs`:
- Around lines 41-48: The code can panic because
tenant_value.to_str().unwrap() is used in get_tenant_id_from_request; replace
the unwrap with a defensive Option path: call tenant_value.to_str().ok() and
handle the None case instead of panicking (e.g., return None or a safe
default/error upstream consistent with existing get_tenant_id_from_request
return type). Locate tenant_value.to_str().unwrap() in the
get_tenant_id_from_request implementation in src/utils/mod.rs and change it to
tenant_value.to_str().ok() and update the surrounding logic to propagate or
handle the Option safely.
♻️ Duplicate comments (65)
src/handlers/http/role.rs (2)

160-168: Persist default role before mutating in-memory state.

Updating DEFAULT_ROLE before put_metadata risks inconsistency if persistence fails, and .write().unwrap() can panic on a poisoned lock.


183-190: Remove commented-out code.

The commented block is stale and should be removed before merge.

src/storage/store_metadata.rs (2)

126-130: Tenant-specific staging metadata still not read.

resolve_parseable_metadata receives tenant_id but get_staging_metadata() at line 130 is called without it, so non-default tenants may read incorrect staging metadata.


309-324: Tenant staging directory not created before write.

put_staging_metadata constructs a tenant-specific path at lines 313-318 but never creates the tenant subdirectory. For new tenants, open() will fail with NotFound.
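
A minimal sketch of creating the tenant subdirectory before writing, using only `std::fs`. `write_staging_metadata` is a hypothetical helper, not the project's `put_staging_metadata`, and the directory layout is assumed from the description above.

```rust
use std::fs;
use std::io::Write;
use std::path::{Path, PathBuf};

/// Writes metadata under `<staging>/<tenant>/.parseable.json`, or directly
/// under `<staging>` when no tenant is given.
fn write_staging_metadata(
    staging_dir: &Path,
    tenant_id: &Option<String>,
    bytes: &[u8],
) -> std::io::Result<PathBuf> {
    let dir = match tenant_id.as_deref() {
        Some(tenant) => staging_dir.join(tenant),
        None => staging_dir.to_path_buf(),
    };
    // Without this, opening the file fails with NotFound for a new tenant.
    fs::create_dir_all(&dir)?;
    let path = dir.join(".parseable.json");
    let mut file = fs::OpenOptions::new()
        .create(true)
        .truncate(true)
        .write(true)
        .open(&path)?;
    file.write_all(bytes)?;
    Ok(path)
}
```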

src/handlers/http/middleware.rs (2)

382-394: Handle missing tenants explicitly in check_suspension.

The else branch at lines 389-391 silently ignores TenantNotFound errors from is_action_suspended, allowing the function to return Authorized for non-existent tenants. This bypasses tenant-level suspension checks.


204-217: Handle HeaderValue::from_str error and remove dead code.

Line 206 uses .unwrap() which can panic if tid contains invalid header characters. Lines 211-216 contain commented-out code that should be removed.

🐛 Suggested fix
                 else {
                     // tenant header should not be present, modify request to add
                     let mut t = None;
                     if let Ok((_, tenant)) = get_user_and_tenant_from_request(req.request())
                         && let Some(tid) = tenant.as_ref()
+                        && let Ok(header_val) = HeaderValue::from_str(tid)
                     {
                         req.headers_mut().insert(
                             HeaderName::from_static("tenant"),
-                            HeaderValue::from_str(tid).unwrap(),
+                            header_val,
                         );
                         t = tenant;
                     }
                     t
-                    // else {
-                    //     header_error = Some(actix_web::Error::from(PostError::Header(
-                    //         crate::utils::header_parsing::ParseHeaderError::InvalidTenantId,
-                    //     )));
-                    //     None
-                    // }
                 };
src/users/dashboards.rs (1)

244-265: Creating dashboards for a new tenant becomes a silent no-op.

Line 255 only proceeds when a tenant bucket already exists, so new tenants never get their first dashboard saved. Use entry().or_default() to create the bucket before checking duplicates.

🛠️ Suggested fix
-        if let Some(dbs) = dashboards.get_mut(tenant) {
-            let has_duplicate = dbs
-                .iter()
-                .any(|d| d.title == dashboard.title && d.dashboard_id != dashboard.dashboard_id);
-            if has_duplicate {
-                return Err(DashboardError::Metadata("Dashboard title must be unique"));
-            }
-            self.save_dashboard(dashboard, tenant_id).await?;
-
-            dbs.push(dashboard.clone());
-        }
+        let dbs = dashboards.entry(tenant.to_owned()).or_default();
+        let has_duplicate = dbs
+            .iter()
+            .any(|d| d.title == dashboard.title && d.dashboard_id != dashboard.dashboard_id);
+        if has_duplicate {
+            return Err(DashboardError::Metadata("Dashboard title must be unique"));
+        }
+        self.save_dashboard(dashboard, tenant_id).await?;
+        dbs.push(dashboard.clone());
src/parseable/streams.rs (2)

1388-1399: Tests still use old metric label sets.

Lines 1391–1399 call .with_label_values(...) without the tenant label, but the metric now expects it. This will panic at runtime.

🛠️ Suggested fix
-        let staging_files = metrics::STAGING_FILES.with_label_values(&[&stream]).get();
+        let tenant = DEFAULT_TENANT;
+        let staging_files = metrics::STAGING_FILES.with_label_values(&[&stream, tenant]).get();
...
-        let storage_size_arrows = metrics::STORAGE_SIZE
-            .with_label_values(&["staging", &stream, "arrows"])
+        let storage_size_arrows = metrics::STORAGE_SIZE
+            .with_label_values(&["staging", &stream, "arrows", tenant])
             .get();
...
-        let storage_size_parquet = metrics::STORAGE_SIZE
-            .with_label_values(&["staging", &stream, "parquet"])
+        let storage_size_parquet = metrics::STORAGE_SIZE
+            .with_label_values(&["staging", &stream, "parquet", tenant])
             .get();

1697-1701: Test panics by unwrapping a missing tenant bucket.

Line 1700 unwraps guard.get(DEFAULT_TENANT) while the map is empty, which will panic. Use a safe check.

🛠️ Suggested fix
-        assert!(!guard.get(DEFAULT_TENANT).unwrap().contains_key(stream_name));
+        assert!(guard.get(DEFAULT_TENANT).is_none());
src/query/mod.rs (1)

181-205: Log schema registration failures instead of discarding them.

Line 191 and Line 201 ignore register_schema errors, so startup can succeed with missing schemas and later queries will fail without a clear signal. Please log or surface the error.

🛠️ Suggested fix
-                    let _ = catalog.register_schema(t, schema_provider);
+                    if let Err(e) = catalog.register_schema(t, schema_provider) {
+                        tracing::error!("failed to register schema for tenant {t}: {e}");
+                    }
...
-            let _ = catalog.register_schema(
+            if let Err(e) = catalog.register_schema(
                 &state.config_options().catalog.default_schema,
                 schema_provider,
-            );
+            ) {
+                tracing::error!("failed to register default schema: {e}");
+            }
src/rbac/map.rs (2)

318-329: Session cleanup keeps expired entries and drops valid ones.

Line 328 retains sessions where expiry < now, which leaves only expired sessions. It should retain non-expired ones.

🛠️ Suggested fix
-        sessions.retain(|(_, expiry)| expiry < &now);
+        sessions.retain(|(_, expiry)| expiry > &now);

446-452: Avoid panics when inserting the tenant header.

Line 450 uses HeaderValue::from_bytes(...).unwrap(), which panics on invalid header bytes. Handle the Result instead.

🛠️ Suggested fix
-        if let Some((_, tenant, _)) = self.active_sessions.get(key) {
-            req.headers_mut().insert(
-                HeaderName::from_static("tenant"),
-                HeaderValue::from_bytes(tenant.as_bytes()).unwrap(),
-            );
-        }
+        if let Some((_, tenant, _)) = self.active_sessions.get(key) {
+            if let Ok(value) = HeaderValue::from_str(tenant) {
+                req.headers_mut()
+                    .insert(HeaderName::from_static("tenant"), value);
+            }
+        }
src/migration/mod.rs (2)

168-194: A single tenant’s list_streams failure aborts all migrations.

Line 170 uses ?, which stops the whole migration on one tenant error. Prefer per-tenant error handling and continue with others.

🛠️ Suggested fix
-        let stream_names = PARSEABLE.metastore.list_streams(&tenant_id).await?;
+        let stream_names = match PARSEABLE.metastore.list_streams(&tenant_id).await {
+            Ok(names) => names,
+            Err(e) => {
+                warn!("Failed to list streams for tenant {:?}: {:?}", tenant_id, e);
+                continue;
+            }
+        };

494-507: Use the metadata filename constant instead of a hardcoded string.

Lines 504 and 506 hardcode ".parseable.json" while the rest of the module uses PARSEABLE_METADATA_FILE_NAME, so the two can drift apart.

🛠️ Suggested fix
-            .join(".parseable.json")
+            .join(PARSEABLE_METADATA_FILE_NAME)
     } else {
-        config.options.staging_dir().join(".parseable.json")
+        config.options.staging_dir().join(PARSEABLE_METADATA_FILE_NAME)
     };
src/connectors/kafka/processor.rs (1)

53-57: Verify that all records in a batch belong to the same tenant.

The tenant_id is extracted from only the first record in the batch, but batching is time/size-based without tenant awareness. If a Kafka partition contains messages from multiple tenants, records beyond the first would be processed under an incorrect tenant context.

Consider validating tenant consistency across all records or grouping by tenant before processing.
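One way to group by tenant before processing is sketched below. The `Record` struct and its fields are hypothetical and stand in for whatever the Kafka consumer record exposes in this PR.

```rust
use std::collections::HashMap;

/// Hypothetical record shape: only the fields needed for the illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub tenant_id: Option<String>,
    pub payload: String,
}

/// Splits a mixed batch into per-tenant batches, preserving record order within each tenant.
pub fn group_by_tenant(batch: Vec<Record>) -> HashMap<Option<String>, Vec<Record>> {
    let mut groups: HashMap<Option<String>, Vec<Record>> = HashMap::new();
    for record in batch {
        groups
            .entry(record.tenant_id.clone())
            .or_default()
            .push(record);
    }
    groups
}
```

Each resulting group can then be processed under its own tenant context instead of inheriting the tenant of the first record.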

src/tenants/mod.rs (1)

50-60: Potential data duplication in tenant storage.

suspended_services is stored both in TenantOverview.suspended_services and within TenantOverview.meta.suspended_services. When suspend_service or resume_service is called, only the HashSet is updated, leaving meta.suspended_services stale.

If get_tenants() returns meta for persistence, the suspension state changes won't be reflected.

src/utils/mod.rs (2)

63-74: Avoid panic when BasicAuth omits a password.

Line 67 uses unwrap() on an optional password, which can panic if the password is omitted (valid per HTTP Basic Auth spec).

🛠️ Proposed fix
-        Users.mutate_request_with_basic_user(basic.user_id(), basic.password().unwrap(), req);
+        Users.mutate_request_with_basic_user(basic.user_id(), basic.password().unwrap_or(""), req);

126-130: Potential panic on malformed header value.

Using .unwrap() on to_str() can panic if the tenant header contains non-UTF8 bytes.

🛠️ Proposed fix
 pub fn get_tenant_id_from_request(req: &HttpRequest) -> Option<String> {
-    req.headers()
-        .get("tenant")
-        .map(|tenant_value| tenant_value.to_str().unwrap().to_owned())
+    req.headers()
+        .get("tenant")
+        .and_then(|tenant_value| tenant_value.to_str().ok().map(|s| s.to_owned()))
 }
src/users/filters.rs (1)

128-134: Filter silently dropped when tenant bucket doesn't exist.

The update method only modifies existing tenant buckets via get_mut(). If the tenant entry doesn't exist (e.g., first filter for a new tenant after restart), the filter is silently dropped.

🐛 Proposed fix using entry API
     pub async fn update(&self, filter: &Filter, tenant_id: &Option<String>) {
         let mut s = self.0.write().await;
-        if let Some(filters) = s.get_mut(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT)) {
-            filters.retain(|f| f.filter_id != filter.filter_id);
-            filters.push(filter.clone());
-        }
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let filters = s.entry(tenant.to_owned()).or_default();
+        filters.retain(|f| f.filter_id != filter.filter_id);
+        filters.push(filter.clone());
     }
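The behavioral difference between `get_mut` and the entry API shows up on the very first write for a tenant. A standalone sketch, assuming a hypothetical `Filter` struct and a placeholder value for `DEFAULT_TENANT`:

```rust
use std::collections::HashMap;

// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Hypothetical, trimmed-down filter type.
#[derive(Debug, Clone, PartialEq)]
pub struct Filter {
    pub filter_id: String,
    pub query: String,
}

/// Inserts or replaces a filter, creating the tenant bucket if it is missing.
pub fn upsert_filter(
    store: &mut HashMap<String, Vec<Filter>>,
    filter: &Filter,
    tenant_id: &Option<String>,
) {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    // `entry(..).or_default()` never drops the write, unlike `get_mut`.
    let filters = store.entry(tenant.to_owned()).or_default();
    filters.retain(|f| f.filter_id != filter.filter_id);
    filters.push(filter.clone());
}
```

Calling `upsert_filter` on an empty map creates the bucket, and a second call with the same `filter_id` replaces the entry rather than duplicating it.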
src/correlation.rs (4)

275-315: Table validation is not tenant-scoped.

ctx.table() resolves unqualified names across all schemas, so a collision can validate against another tenant’s table. Use a tenant-scoped schema/provider or set the schema before resolving tables (same pattern as query execution).


140-144: Create the per-tenant cache entry on insert.

If the tenant bucket doesn’t exist yet (e.g., first correlation after restart), the correlation is persisted but not cached. This causes reads to fail despite a successful create.

🐛 Proposed fix
-        if let Some(corrs) = self.write().await.get_mut(tenant) {
-            corrs.insert(correlation.id.to_owned(), correlation.clone());
-        }
+        self.write()
+            .await
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(correlation.id.to_owned(), correlation.clone());

176-183: Same cache-miss issue in update().

If the tenant bucket is missing, the update is silently skipped in memory, leaving stale cache state.

🐛 Proposed fix
-        if let Some(corrs) = self.write().await.get_mut(tenant) {
-            corrs.insert(
-                updated_correlation.id.to_owned(),
-                updated_correlation.clone(),
-            );
-        }
+        self.write()
+            .await
+            .entry(tenant.to_owned())
+            .or_default()
+            .insert(updated_correlation.id.to_owned(), updated_correlation.clone());

210-211: Delete removes the wrong map level.

remove(&correlation.id) is called on the outer tenant map, which can delete an entire tenant entry or do nothing. This should delete from the inner map for the resolved tenant.

🐛 Proposed fix
-        self.write().await.remove(&correlation.id);
+        let tenant = tenant_id.as_ref().map_or(DEFAULT_TENANT, |v| v.as_str());
+        if let Some(corrs) = self.write().await.get_mut(tenant) {
+            corrs.remove(&correlation.id);
+        }
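To illustrate the two map levels, here is a small sketch with a hypothetical cache type (tenant, then correlation id, then a string payload) and a placeholder `DEFAULT_TENANT` value. The delete touches only the inner map and leaves the tenant entry in place.

```rust
use std::collections::HashMap;

// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Hypothetical cache shape: tenant -> (correlation id -> payload).
pub type CorrelationCache = HashMap<String, HashMap<String, String>>;

/// Removes one correlation from the resolved tenant's bucket.
/// Returns true if something was actually removed.
pub fn delete_correlation(
    cache: &mut CorrelationCache,
    tenant_id: &Option<String>,
    correlation_id: &str,
) -> bool {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    cache
        .get_mut(tenant)
        .is_some_and(|corrs| corrs.remove(correlation_id).is_some())
}
```

Returning a `bool` is a choice made for this sketch only; it makes a no-op delete visible to the caller.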
src/alerts/alert_types.rs (1)

90-97: Avoid panic when building Authorization header.

HeaderValue::from_str(&header).unwrap() can panic on malformed input. Convert this into a handled error (or skip the header) instead.

🛠️ Proposed fix
-            map.insert(
-                HeaderName::from_static("authorization"),
-                HeaderValue::from_str(&header).unwrap(),
-            );
+            let value = HeaderValue::from_str(&header)
+                .map_err(|e| AlertError::CustomError(format!("Invalid auth header: {e}")))?;
+            map.insert(HeaderName::from_static("authorization"), value);
src/alerts/target.rs (1)

61-67: Normalize DEFAULT_TENANT to None when loading targets.

Storing Some(DEFAULT_TENANT) breaks the convention where None represents the default tenant and can cause mismatches elsewhere.

♻️ Suggested fix
-                .map(|mut t| {
-                    t.tenant = Some(tenant_id.clone());
+                .map(|mut t| {
+                    t.tenant = if tenant_id == DEFAULT_TENANT {
+                        None
+                    } else {
+                        Some(tenant_id.clone())
+                    };
                     (t.id, t)
                 })
src/query/stream_schema_provider.rs (1)

84-93: Avoid TOCTOU between table_exist and get_stream.

The stream can be deleted between the existence check and the expect, causing a panic. Use a single get_stream and handle Err as Ok(None).

🛠️ Proposed fix
-        if self.table_exist(name) {
-            Ok(Some(Arc::new(StandardTableProvider {
-                schema: PARSEABLE
-                    .get_stream(name, &self.tenant_id)
-                    .expect(STREAM_EXISTS)
-                    .get_schema(),
-                stream: name.to_owned(),
-                tenant_id: self.tenant_id.clone(),
-            })))
-        } else {
-            Ok(None)
-        }
+        match PARSEABLE.get_stream(name, &self.tenant_id) {
+            Ok(stream) => Ok(Some(Arc::new(StandardTableProvider {
+                schema: stream.get_schema(),
+                stream: name.to_owned(),
+                tenant_id: self.tenant_id.clone(),
+            }))),
+            Err(_) => Ok(None),
+        }
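The single-lookup pattern can be shown without DataFusion. The sketch below uses a hypothetical `StreamRegistry` in place of `PARSEABLE`; a missing stream becomes `None` rather than a panic because existence and retrieval happen in one step under one lock.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical stand-in for the stream registry: stream name -> schema field names.
pub struct StreamRegistry {
    streams: RwLock<HashMap<String, Vec<String>>>,
}

impl StreamRegistry {
    pub fn new(streams: HashMap<String, Vec<String>>) -> Self {
        Self { streams: RwLock::new(streams) }
    }

    /// Simulates a concurrent stream deletion.
    pub fn remove(&self, name: &str) {
        if let Ok(mut guard) = self.streams.write() {
            guard.remove(name);
        }
    }

    /// One lookup under one read lock: there is no window between
    /// "does it exist?" and "give it to me" in which the stream can vanish.
    pub fn schema(&self, name: &str) -> Option<Vec<String>> {
        let guard = self.streams.read().ok()?;
        let schema = guard.get(name).cloned();
        schema
    }
}
```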
src/handlers/http/modal/ingest/ingestor_role.rs (3)

45-51: Fix inverted tenant validation.

The current check rejects same-tenant operations. It should reject only when the request tenant is non-default and does not match the payload tenant.

🐛 Proposed fix
-    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant_id.eq(&sync_req.tenant_id)) {
+    if req_tenant.ne(DEFAULT_TENANT) && (req_tenant_id.ne(&sync_req.tenant_id)) {
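The intended rule is easier to check as a small predicate. This is a sketch only: `is_cross_tenant` is a hypothetical helper name and the `DEFAULT_TENANT` value is a placeholder.

```rust
// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Returns true when the request must be rejected: the caller belongs to a
/// non-default tenant and the payload targets a different tenant.
pub fn is_cross_tenant(
    req_tenant_id: &Option<String>,
    payload_tenant_id: &Option<String>,
) -> bool {
    let req_tenant = req_tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    req_tenant != DEFAULT_TENANT && req_tenant_id != payload_tenant_id
}
```

Same-tenant requests and requests from the default tenant pass; only a non-default tenant touching another tenant's payload is rejected.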

53-58: Don’t ignore staging metadata write failures.

let _ = ... hides persistence failures and can leave in-memory state diverged from disk.

🐛 Proposed fix
-    let _ = storage::put_staging_metadata(&metadata, &sync_req.tenant_id);
+    storage::put_staging_metadata(&metadata, &sync_req.tenant_id)?;

59-66: Use a consistent tenant source for in-memory updates.

Metadata is written using sync_req.tenant_id, but in-memory roles are keyed by req_tenant_id. This can split a single update across two tenants.

🐛 Proposed fix
-    let tenant_id = req_tenant_id
-        .as_deref()
-        .unwrap_or(DEFAULT_TENANT)
-        .to_owned();
+    let tenant_id = sync_req
+        .tenant_id
+        .as_deref()
+        .unwrap_or(DEFAULT_TENANT)
+        .to_owned();
src/handlers/http/modal/query/querier_rbac.rs (4)

60-66: Treat missing tenant role map as “role does not exist.”

When roles().get(tenant) is None, the current check silently passes. This should fail the role validation.

🐛 Proposed fix
-        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            && !tenant_roles.contains_key(role)
-        {
-            non_existent_roles.push(role.clone());
-        }
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let role_exists = roles()
+            .get(tenant)
+            .is_some_and(|tenant_roles| tenant_roles.contains_key(role));
+        if !role_exists {
+            non_existent_roles.push(role.clone());
+        }
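The same check can be written as a standalone helper, which also makes the missing-tenant case testable. The `TenantRoles` alias and the placeholder `DEFAULT_TENANT` value are assumptions for the sketch, not the types used in this PR.

```rust
use std::collections::{HashMap, HashSet};

// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Hypothetical shape of the role map: tenant -> set of role names.
pub type TenantRoles = HashMap<String, HashSet<String>>;

/// Returns the requested roles that do not exist for the tenant.
/// A tenant with no role map at all has no roles, so everything is reported.
pub fn non_existent_roles(
    roles: &TenantRoles,
    tenant_id: &Option<String>,
    requested: &[String],
) -> Vec<String> {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    requested
        .iter()
        .filter(|role| {
            !roles
                .get(tenant)
                .is_some_and(|tenant_roles| tenant_roles.contains(*role))
        })
        .cloned()
        .collect()
}
```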

81-81: Associate new users with the tenant.

User::new_basic is called with None, creating users without tenant association. Pass the request’s tenant.

🐛 Proposed fix
-    let (user, password) = user::User::new_basic(username.clone(), None, false);
+    let (user, password) = user::User::new_basic(username.clone(), tenant_id.clone(), false);

198-203: Role existence check still skips missing tenant.

Same validation gap as in post_user—missing tenant map should fail the role check.

🐛 Proposed fix
-        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            && tenant_roles.get(r).is_none()
-        {
-            non_existent_roles.push(r.clone());
-        }
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let role_exists = roles()
+            .get(tenant)
+            .is_some_and(|tenant_roles| tenant_roles.contains_key(r));
+        if !role_exists {
+            non_existent_roles.push(r.clone());
+        }

260-264: Same role validation issue in remove_roles_from_user.

A missing tenant map should mark roles as non-existent instead of silently passing.

🐛 Proposed fix
-        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            && tenant_roles.get(r).is_none()
-        {
-            non_existent_roles.push(r.clone());
-        }
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let role_exists = roles()
+            .get(tenant)
+            .is_some_and(|tenant_roles| tenant_roles.contains_key(r));
+        if !role_exists {
+            non_existent_roles.push(r.clone());
+        }
src/stats.rs (2)

116-127: Normalize empty tenant IDs to avoid split metric series.

The tenant derivation at line 126 doesn't handle Some(""): if callers pass an empty string, metrics are labeled with "" instead of DEFAULT_TENANT, fragmenting the metric series.

Based on learnings, default-tenant paths may use empty-string segments, so metrics should normalize empty to DEFAULT_TENANT.

🔧 Suggested helper
fn normalize_tenant(tenant_id: &Option<String>) -> &str {
    tenant_id
        .as_deref()
        .filter(|t| !t.is_empty())
        .unwrap_or(DEFAULT_TENANT)
}

235-256: Same empty-string normalization issue in label helpers.

event_labels and storage_size_labels have the same issue: Some("") passes the if let Some(tenant_id) check but produces empty-string labels instead of DEFAULT_TENANT.

Apply the same normalization pattern here.
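A sketch of how a label helper could reuse the normalization. The signature of `event_labels` here, including the label order, is an assumption for illustration and may not match the real helper in src/stats.rs; `DEFAULT_TENANT` is a placeholder value.

```rust
// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Maps both `None` and `Some("")` to the default tenant label.
fn normalize_tenant(tenant_id: &Option<String>) -> &str {
    tenant_id
        .as_deref()
        .filter(|t| !t.is_empty())
        .unwrap_or(DEFAULT_TENANT)
}

/// Hypothetical label helper: (stream, format, tenant) in that order.
pub fn event_labels<'a>(
    stream_name: &'a str,
    format: &'a str,
    tenant_id: &'a Option<String>,
) -> [&'a str; 3] {
    [stream_name, format, normalize_tenant(tenant_id)]
}
```

Routing every label helper through one function keeps `None` and `Some("")` in the same metric series.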

src/handlers/http/oidc.rs (4)

102-116: Use get_tenant_id_from_request() for tenant selection in basic-auth flow.

For SessionKey::BasicAuth, get_tenant_id_from_key() returns None because basic credentials have no pre-established session mapping. This causes Users.get_user(&username, &None) to default to DEFAULT_TENANT regardless of the actual tenant context, breaking multi-tenant support for basic-auth users.

Use get_tenant_id_from_request(&req) instead to allow tenant selection via HTTP header.


130-159: Intra-cluster login sync should treat non-2xx as failure.

reqwest::send() succeeds on non-2xx responses, so the sync can silently fail. Consider .error_for_status() and logging per-node failures.

🔧 Proposed fix
                     async move {
-                        INTRA_CLUSTER_CLIENT
+                        let resp = INTRA_CLUSTER_CLIENT
                             .post(url)
                             .header(header::AUTHORIZATION, node.token)
                             .header(header::CONTENT_TYPE, "application/json")
                             .json(&json!(
                                 {
                                     "sessionCookie": _session,
                                     "user": _user,
                                     "expiry": EXPIRY_DURATION
                                 }
                             ))
                             .send()
                             .await?;
+                        resp.error_for_status()?;
                         Ok::<(), anyhow::Error>(())
                     }

229-229: Replace unwrap() on header conversion to prevent panics on malformed tenant header.

Per the relevant snippet in src/utils/mod.rs, get_tenant_id_from_request() uses .to_str().unwrap() on the tenant header value. A malformed header with invalid UTF-8 bytes will panic the process.

🔧 Suggested fix in src/utils/mod.rs
pub fn get_tenant_id_from_request(req: &HttpRequest) -> Option<String> {
    req.headers()
        .get("tenant")
        .and_then(|v| v.to_str().ok())
        .map(|s| s.to_owned())
}

323-327: Address incomplete tenant implementation in OIDC user creation.

The comment "LET TENANT BE NONE FOR NOW!!!" at line 325 indicates that new OAuth users are intentionally created without tenant association, despite tenant_id being extracted from the request and used throughout the function for existing user lookups.

This creates a multi-tenant isolation gap: existing OIDC users are correctly looked up within tenant context, but new OIDC users are created without it.

Replace None with tenant_id in the put_user call, or if this is intentional WIP, add a TODO comment with a tracking issue reference.

src/handlers/http/query.rs (2)

118-145: Derive tenant from authenticated session, not the tenant header.

Using get_tenant_id_from_request makes tenant scoping spoofable via header manipulation. An authenticated user could potentially query another tenant's data by setting the tenant header.

Consider using get_user_and_tenant_from_request (deriving tenant from the session) and mapping errors to QueryError::Unauthorized.

🔧 Proposed fix
-    create_streams_for_distributed(tables.clone(), &get_tenant_id_from_request(&req)).await?;
-
-    let tenant_id = get_tenant_id_from_request(&req);
+    let (_user_id, tenant_id) =
+        get_user_and_tenant_from_request(&req).map_err(|_| QueryError::Unauthorized)?;
+    create_streams_for_distributed(tables.clone(), &tenant_id).await?;

370-434: Same tenant spoofing concern in get_counts.

Line 376 uses get_tenant_id_from_request(&req) which is header-based and spoofable. Apply the same session-derived tenant pattern as recommended for query().

src/handlers/http/modal/ingest/ingestor_rbac.rs (4)

40-67: Inverted tenant validation logic in post_user.

Lines 51-52: The condition checks if req_tenant.eq(user.tenant...) when it should check for inequality. The error message indicates this should prevent cross-tenant operations, but the current logic prevents same-tenant operations instead.

🔧 Proposed fix
-        if req_tenant.ne(DEFAULT_TENANT)
-            && (req_tenant.eq(user.tenant.as_deref().unwrap_or(DEFAULT_TENANT)))
+        if req_tenant.ne(DEFAULT_TENANT)
+            && (req_tenant.ne(user.tenant.as_deref().unwrap_or(DEFAULT_TENANT)))

60-60: Silently ignoring storage error.

The result of put_staging_metadata is discarded with let _ = .... If persistence fails, in-memory and storage states diverge, potentially causing data loss or inconsistent state on restart.

🔧 Proposed fix
-        let _ = storage::put_staging_metadata(&metadata, req_tenant_id);
+        storage::put_staging_metadata(&metadata, req_tenant_id)?;

Apply the same fix at lines 86, 133, 192, and 210.


108-114: Role validation silently passes for non-existent tenants.

If the tenant doesn't exist in the roles() map, the if let Some(tenant_roles) = roles().get(...) check fails, and roles are not added to non_existent_roles. This means validation passes silently when it should fail.

🔧 Proposed fix
     roles_to_add.iter().for_each(|r| {
-        if let Some(tenant_roles) = roles().get(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            && tenant_roles.get(r).is_none()
-        {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        let role_exists = roles()
+            .get(tenant)
+            .is_some_and(|tenant_roles| tenant_roles.contains_key(r));
+        if !role_exists {
             non_existent_roles.push(r.clone());
         }
     });

154-160: Same role validation issue in remove_roles_from_user.

Apply the same fix pattern as add_roles_to_user.

src/handlers/http/logstream.rs (2)

442-445: Resolve the hot-tier tenant TODO.

Either remove the TODO if hot-tier handling is already tenant-aware, or complete the tenant scoping in this flow.


52-61: Derive tenant from authenticated session, not request header.

get_tenant_id_from_request reads a user-controlled header. A user can target another tenant’s stream (same stream name) by spoofing the header. Derive tenant_id from the authenticated session/claims instead and reject if unavailable. This applies across these handlers.

🔒 Example approach
-    let tenant_id = get_tenant_id_from_request(&req);
+    let key = extract_session_key_from_req(&req)
+        .map_err(|err| StreamError::Anyhow(anyhow::Error::msg(err.to_string())))?;
+    let tenant_id = get_tenant_id_from_key(&key);
src/prism/logstream/mod.rs (1)

68-71: Restore real stats instead of placeholder defaults.

QueriedStats::default() masks real results and breaks dataset stats. Please use the actual query result and propagate errors.

🐛 Proposed fix
-    // let stats = stats?;
-    tracing::warn!(real_stats=?stats);
-    let stats = QueriedStats::default();
-    tracing::warn!("got FAKE stats");
+    let stats = stats?;
src/rbac/mod.rs (1)

304-307: Avoid panic on invalid tenant header bytes.

HeaderValue::from_bytes(...).unwrap() can panic; handle the error and skip insertion instead.

🔧 Safer insertion
-            req.headers_mut().insert(
-                HeaderName::from_static("tenant"),
-                HeaderValue::from_bytes(tenant.as_bytes()).unwrap(),
-            );
+            if let Ok(value) = HeaderValue::from_str(tenant) {
+                req.headers_mut()
+                    .insert(HeaderName::from_static("tenant"), value);
+            }
src/handlers/http/rbac.rs (1)

72-75: Do not trust the tenant header for RBAC scoping.

These handlers derive tenant_id from a user-supplied header, which is spoofable. Use the authenticated session/claims to resolve tenant_id and reject requests without a valid session-derived tenant.

src/alerts/mod.rs (2)

1045-1050: Normalize DEFAULT_TENANT to None during alert load.

Treating DEFAULT_TENANT as Some(...) writes/reads default alerts via tenant-prefixed paths. Normalize to None to keep default-tenant storage semantics consistent.

🔁 Proposed fix
-            let tenant = if tenant_id.is_empty() {
-                &None
-            } else {
-                &Some(tenant_id.clone())
-            };
+            let tenant = if tenant_id.is_empty() || tenant_id == DEFAULT_TENANT {
+                &None
+            } else {
+                &Some(tenant_id.clone())
+            };
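The normalization can be pulled into a tiny helper so that every loader applies the same rule. `to_tenant_option` is a hypothetical name and the `DEFAULT_TENANT` value is a placeholder.

```rust
// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Converts a tenant directory name into the `Option<String>` convention,
/// where `None` represents the default tenant.
pub fn to_tenant_option(tenant_id: &str) -> Option<String> {
    if tenant_id.is_empty() || tenant_id == DEFAULT_TENANT {
        None
    } else {
        Some(tenant_id.to_owned())
    }
}
```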

1333-1337: Ensure tenant buckets exist before inserting alert updates.

get_mut(tenant) drops updates when a tenant bucket is missing. Use entry(...).or_default() in both update paths.

♻️ Suggested fix
-            if let Some(alerts) = write_access.get_mut(tenant) {
-                alerts.insert(*alert.get_id(), alert.clone_box());
-            }
+            write_access
+                .entry(tenant.to_owned())
+                .or_default()
+                .insert(*alert.get_id(), alert.clone_box());

Also applies to: 1377-1379

src/handlers/http/cluster/mod.rs (2)

1876-1922: Propagate tenant header and avoid panic on invalid token.

Line 1913 drops tenant context, and HeaderValue::from_str(..).unwrap() can panic on invalid characters.

🐛 Proposed fix
     let mut map = reqwest::header::HeaderMap::new();

     if let Some(auth) = auth_token {
         // always basic auth
         for (key, value) in auth.iter() {
             if let Ok(name) = reqwest::header::HeaderName::from_bytes(key.as_str().as_bytes())
                 && let Ok(val) = reqwest::header::HeaderValue::from_bytes(value.as_bytes())
             {
                 map.insert(name, val);
             }
         }
     } else {
-        map.insert(
-            reqwest::header::AUTHORIZATION,
-            reqwest::header::HeaderValue::from_str(&querier.token).unwrap(),
-        );
+        let auth_val = reqwest::header::HeaderValue::from_str(&querier.token)
+            .map_err(|e| QueryError::Anyhow(anyhow::anyhow!("Invalid token header: {}", e)))?;
+        map.insert(reqwest::header::AUTHORIZATION, auth_val);
     };
+
+    if let Some(tenant) = tenant_id.as_deref() {
+        let tenant_val = reqwest::header::HeaderValue::from_str(tenant)
+            .map_err(|e| QueryError::Anyhow(anyhow::anyhow!("Invalid tenant header: {}", e)))?;
+        map.insert(reqwest::header::HeaderName::from_static("tenant"), tenant_val);
+    }

668-674: Avoid panics on invalid tenant headers.

get_tenant_id_from_request still uses to_str().unwrap(), so a malformed header can crash these handlers. Prefer fallible parsing and surface a 4xx or None.

💡 Suggested fix (in src/utils/mod.rs)
 pub fn get_tenant_id_from_request(req: &HttpRequest) -> Option<String> {
-    req.headers()
-        .get("tenant")
-        .map(|tenant_value| tenant_value.to_str().unwrap().to_owned())
+    req.headers()
+        .get("tenant")
+        .and_then(|tenant_value| tenant_value.to_str().ok())
+        .map(|v| v.to_owned())
 }

Also applies to: 919-921, 1065-1067

src/metastore/metastores/object_store_metastore.rs (4)

317-332: Default-tenant path fallback is inconsistent for alert states.

Using DEFAULT_TENANT here searches DEFAULT_TENANT/.alerts/ instead of the tenant-less path used elsewhere, so default-tenant alert states won’t be found.

🐛 Proposed fix
-        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
-        let base_path = RelativePathBuf::from_iter([tenant, ALERTS_ROOT_DIRECTORY]);
+        let base_path = if let Some(tenant) = tenant_id.as_ref() {
+            RelativePathBuf::from_iter([tenant, ALERTS_ROOT_DIRECTORY])
+        } else {
+            RelativePathBuf::from(ALERTS_ROOT_DIRECTORY)
+        };

Based on learnings, default-tenant paths should use the empty segment to avoid an extra tenant directory.


516-545: Dashboard aggregation overwrites prior users within the same tenant.

The per-user loop always inserts with the same tenant key, so only the last user’s dashboards remain.

🐛 Suggested fix to accumulate dashboards per tenant
     async fn get_dashboards(&self) -> Result<HashMap<String, Vec<Bytes>>, MetastoreError> {
         let mut dashboards = HashMap::new();
         let base_paths = PARSEABLE.list_tenants().unwrap_or_else(|| vec!["".into()]);
         for mut tenant in base_paths {
+            let mut tenant_dashboards = Vec::new();
             let tenant_id = &Some(tenant.clone());
             let users_dir = RelativePathBuf::from_iter([&tenant, USERS_ROOT_DIR]);
             for user in self
                 .storage
                 .list_dirs_relative(&users_dir, tenant_id)
                 .await?
             {
                 let dashboards_path = users_dir.join(&user).join("dashboards");
                 let dashboard_bytes = self
                     .storage
                     .get_objects(
                         Some(&dashboards_path),
                         Box::new(|file_name| file_name.ends_with(".json")),
                         tenant_id,
                     )
                     .await?;
-                if tenant.is_empty() {
-                    tenant.clone_from(&DEFAULT_TENANT.to_string());
-                }
-                dashboards.insert(tenant.to_owned(), dashboard_bytes);
+                tenant_dashboards.extend(dashboard_bytes);
             }
+            if tenant.is_empty() {
+                tenant.clone_from(&DEFAULT_TENANT.to_string());
+            }
+            dashboards.insert(tenant, tenant_dashboards);
         }

         Ok(dashboards)
     }
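The overwrite-versus-accumulate difference can be reproduced with plain collections. In this sketch the input is a hypothetical list of `(tenant, user, dashboards)` tuples and `DEFAULT_TENANT` is a placeholder value. It uses the entry API instead of a per-tenant temporary vector, which should give the same result as the fix above.

```rust
use std::collections::HashMap;

// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Collects dashboards from all users of a tenant under a single tenant key.
/// Input tuples are (tenant, user, dashboards); an empty tenant means default.
pub fn aggregate_dashboards(
    per_user: &[(&str, &str, Vec<String>)],
) -> HashMap<String, Vec<String>> {
    let mut dashboards: HashMap<String, Vec<String>> = HashMap::new();
    for (tenant, _user, user_dashboards) in per_user {
        let key = if tenant.is_empty() { DEFAULT_TENANT } else { *tenant };
        // Extend the tenant's list; a plain `insert` would discard earlier users.
        dashboards
            .entry(key.to_owned())
            .or_default()
            .extend(user_dashboards.iter().cloned());
    }
    dashboards
}
```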

634-723: Guard against panics when migration yields non-object filters.

filter_value.as_object().unwrap() can panic after migrate_v1_v2 if the payload becomes non-object.

🛡️ Safer extraction
-                                let user_id = filter_value
-                                    .as_object()
-                                    .unwrap()
-                                    .get("user_id")
-                                    .and_then(|user_id| user_id.as_str());
-                                let filter_id = filter_value
-                                    .as_object()
-                                    .unwrap()
-                                    .get("filter_id")
-                                    .and_then(|filter_id| filter_id.as_str());
-                                let stream_name = filter_value
-                                    .as_object()
-                                    .unwrap()
-                                    .get("stream_name")
-                                    .and_then(|stream_name| stream_name.as_str());
+                                let migrated_obj = filter_value.as_object();
+                                let user_id = migrated_obj
+                                    .and_then(|o| o.get("user_id"))
+                                    .and_then(|v| v.as_str());
+                                let filter_id = migrated_obj
+                                    .and_then(|o| o.get("filter_id"))
+                                    .and_then(|v| v.as_str());
+                                let stream_name = migrated_obj
+                                    .and_then(|o| o.get("stream_name"))
+                                    .and_then(|v| v.as_str());

1167-1172: delete_tenant drops tenant attribution.

delete_prefix is called with &None, so metrics (and any tenant-scoped behavior in storage adapters) get attributed to default tenant instead of the one being deleted.

🐛 Proposed fix
     async fn delete_tenant(&self, tenant_id: &str) -> Result<(), MetastoreError> {
         self.storage
-            .delete_prefix(&RelativePathBuf::from(tenant_id), &None)
+            .delete_prefix(&RelativePathBuf::from(tenant_id), &Some(tenant_id.to_string()))
             .await
             .map_err(MetastoreError::ObjectStorageError)
     }
src/storage/object_storage.rs (1)

1212-1245: Treat DEFAULT_TENANT like None in schema/stream.json paths.

schema_path / stream_json_path always include a tenant segment when tenant_id is Some(...) (Line 1212-1239). If callers pass Some(DEFAULT_TENANT), these paths diverge from stream_relative_path and the existing default-tenant layout.

🐛 Proposed fix
-pub fn schema_path(stream_name: &str, tenant_id: &Option<String>) -> RelativePathBuf {
-    let tenant = tenant_id.as_deref().unwrap_or("");
+pub fn schema_path(stream_name: &str, tenant_id: &Option<String>) -> RelativePathBuf {
+    let tenant = tenant_id.as_deref().filter(|t| *t != DEFAULT_TENANT);
     if PARSEABLE.options.mode == Mode::Ingest {
         let id = INGESTOR_META
             .get()
             .unwrap_or_else(|| panic!("{}", INGESTOR_EXPECT))
             .get_node_id();
         let file_name = format!(".ingestor.{id}{SCHEMA_FILE_NAME}");
-        RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        if let Some(tenant) = tenant {
+            RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        } else {
+            RelativePathBuf::from_iter([stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        }
     } else {
-        RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, SCHEMA_FILE_NAME])
+        if let Some(tenant) = tenant {
+            RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, SCHEMA_FILE_NAME])
+        } else {
+            RelativePathBuf::from_iter([stream_name, STREAM_ROOT_DIRECTORY, SCHEMA_FILE_NAME])
+        }
     }
 }
 
-pub fn stream_json_path(stream_name: &str, tenant_id: &Option<String>) -> RelativePathBuf {
-    let tenant = tenant_id.as_deref().unwrap_or("");
+pub fn stream_json_path(stream_name: &str, tenant_id: &Option<String>) -> RelativePathBuf {
+    let tenant = tenant_id.as_deref().filter(|t| *t != DEFAULT_TENANT);
     if PARSEABLE.options.mode == Mode::Ingest {
         let id = INGESTOR_META
             .get()
             .unwrap_or_else(|| panic!("{}", INGESTOR_EXPECT))
             .get_node_id();
         let file_name = format!(".ingestor.{id}{STREAM_METADATA_FILE_NAME}",);
-        RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        if let Some(tenant) = tenant {
+            RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        } else {
+            RelativePathBuf::from_iter([stream_name, STREAM_ROOT_DIRECTORY, &file_name])
+        }
     } else {
-        RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, STREAM_METADATA_FILE_NAME])
+        if let Some(tenant) = tenant {
+            RelativePathBuf::from_iter([tenant, stream_name, STREAM_ROOT_DIRECTORY, STREAM_METADATA_FILE_NAME])
+        } else {
+            RelativePathBuf::from_iter([stream_name, STREAM_ROOT_DIRECTORY, STREAM_METADATA_FILE_NAME])
+        }
     }
 }
Based on learnings, default-tenant paths should map to an empty segment rather than a tenant directory.
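The nested `if let` branches in the fix could also be collapsed by building the segment list once. The sketch below returns a plain `String` instead of `RelativePathBuf`, ignores the ingest-mode file name, and uses placeholder values for all three constants.

```rust
// Placeholder values; the real constants live in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";
const STREAM_ROOT_DIRECTORY: &str = ".stream";
const SCHEMA_FILE_NAME: &str = ".schema";

/// Builds the schema path, adding a tenant segment only for non-default tenants.
pub fn schema_path(stream_name: &str, tenant_id: &Option<String>) -> String {
    // `Some(DEFAULT_TENANT)` is treated exactly like `None`.
    let tenant = tenant_id.as_deref().filter(|t| *t != DEFAULT_TENANT);
    let mut segments: Vec<&str> = Vec::new();
    segments.extend(tenant); // adds zero or one segment
    segments.extend([stream_name, STREAM_ROOT_DIRECTORY, SCHEMA_FILE_NAME]);
    segments.join("/")
}
```

With this shape, `None` and `Some(DEFAULT_TENANT)` produce the same path, and a real tenant gets its own leading directory.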
src/hottier.rs (2)

105-113: Exclude only the current stream+tenant in the hot-tier total.

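At lines 107-109, the condition stream != current_stream && tenant_id != *current_tenant_id filters too aggressively: a stream is counted only when both its name and its tenant differ from the current one, so streams that share just the name or just the tenant are wrongly skipped. Skip only the exact (stream, tenant) pair.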

🐛 Proposed fix
-                if self.check_stream_hot_tier_exists(&stream, &tenant_id)
-                    && stream != current_stream
-                    && tenant_id != *current_tenant_id
+                if self.check_stream_hot_tier_exists(&stream, &tenant_id)
+                    && !(stream == current_stream && tenant_id == *current_tenant_id)
                 {
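The two predicates differ exactly on streams that share one component with the current stream. A small sketch with hypothetical function names makes the cases explicit:

```rust
/// Corrected predicate: every stream counts except the exact (stream, tenant) pair.
pub fn counts_toward_total(
    stream: &str,
    tenant_id: &Option<String>,
    current_stream: &str,
    current_tenant_id: &Option<String>,
) -> bool {
    !(stream == current_stream && tenant_id == current_tenant_id)
}

/// Original predicate, kept only for comparison: it requires BOTH parts to differ.
pub fn old_counts_toward_total(
    stream: &str,
    tenant_id: &Option<String>,
    current_stream: &str,
    current_tenant_id: &Option<String>,
) -> bool {
    stream != current_stream && tenant_id != current_tenant_id
}
```

For a stream with the same name under another tenant, the corrected predicate counts it and the original one does not.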

595-602: Avoid panics when resolving hot-tier paths.

hot_tier_file_path(...).unwrap() (Line 601) can panic for invalid paths or tenant input. This is an existence check; return false on errors and log.

🐛 Proposed fix
-        let path = self.hot_tier_file_path(stream, tenant_id).unwrap();
-        PathBuf::from(path.to_string()).exists()
+        match self.hot_tier_file_path(stream, tenant_id) {
+            Ok(path) => PathBuf::from(path.to_string()).exists(),
+            Err(err) => {
+                warn!(
+                    "Failed to resolve hot tier file path for stream={stream} tenant={tenant_id:?}: {err}"
+                );
+                false
+            }
+        }
src/parseable/mod.rs (4)

1072-1088: Make add_tenant check+insert atomic.

The read-lock contains check and the write-lock push are separated (Line 1081-1087), which introduces a TOCTOU race. Use a single write lock for the check and insert.

🐛 Proposed fix
-        if self.tenants.read().unwrap().contains(&tenant_id) {
+        let mut tenants = self.tenants.write().unwrap();
+        if tenants.contains(&tenant_id) {
             return Err(anyhow::Error::msg(format!(
                 "Tenant with id- {tenant_id} already exists"
             )));
-        } else {
-            self.tenants.write().unwrap().push(tenant_id.clone());
-            TENANT_METADATA.insert_tenant(tenant_id, tenant_meta);
-        }
+        }
+        tenants.push(tenant_id.clone());
+        TENANT_METADATA.insert_tenant(tenant_id, tenant_meta);
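
As a minimal sketch of the single-guard pattern, assuming a hypothetical `TenantRegistry` wrapper rather than the real `Parseable` struct:

```rust
use std::sync::RwLock;

/// Hypothetical registry standing in for the `tenants` list.
struct TenantRegistry {
    tenants: RwLock<Vec<String>>,
}

impl TenantRegistry {
    fn new() -> Self {
        Self { tenants: RwLock::new(Vec::new()) }
    }

    /// Holds one write guard across both the membership check and the push,
    /// so no other writer can insert the same id in between.
    fn add_tenant(&self, tenant_id: &str) -> Result<(), String> {
        let mut tenants = self
            .tenants
            .write()
            .map_err(|_| "tenants lock poisoned".to_string())?;
        if tenants.iter().any(|t| t == tenant_id) {
            return Err(format!("Tenant with id {tenant_id} already exists"));
        }
        tenants.push(tenant_id.to_owned());
        Ok(())
    }

    fn len(&self) -> usize {
        self.tenants.read().map(|t| t.len()).unwrap_or(0)
    }
}
```

The trade-off is that readers are blocked for the duration of the check, which is usually acceptable for an infrequent operation such as tenant creation.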

1131-1157: Ensure delete_tenant fully cleans tenant state and storage.

Right now it clears RBAC and TENANT_METADATA but leaves self.tenants untouched, so list_tenants() can still return deleted tenants (Line 1155-1156). Also, tenant data in object storage isn’t removed, which is a privacy/compliance risk. Consider removing the tenant from the in-memory list and making this async to call metastore.delete_tenant.

🧹 Proposed fix (in‑mem cleanup)
         // delete roles
         mut_roles().remove(tenant_id);
 
+        // remove from tenant list
+        if let Ok(mut tenants) = self.tenants.write() {
+            tenants.retain(|t| t != tenant_id);
+        }
+
         // delete from in-mem
         TENANT_METADATA.delete_tenant(tenant_id);
         Ok(())

1160-1195: Handle single-tenant directories and lock failures explicitly.

load_tenants has an empty else if !is_multi_tenant {} branch (Line 1182-1183), and it returns Ok(None) on poisoned write-lock (Line 1190-1194), which hides real failures. Either explicitly ignore directories in single-tenant mode or raise a clear error, and propagate lock poisoning.

🛠 Proposed fix
-            } else if !is_multi_tenant {
-            } else {
+            } else if is_multi_tenant {
                 return Err(anyhow::Error::msg(format!(
                     "Found invalid tenant directory with multi-tenant mode- {tenant_id}.\nExiting."
                 )));
             }
+            // In single-tenant mode, ignore directories without tenant metadata
         }
 
-        if let Ok(mut t) = self.tenants.write() {
-            t.extend(dirs);
-            Ok(Some(()))
-        } else {
-            Ok(None)
-        }
+        let mut t = self.tenants.write().expect("tenants lock poisoned");
+        t.extend(dirs);
+        Ok(Some(()))

1198-1205: Avoid swallowing poisoned lock in list_tenants.

Returning None when read() fails hides poisoned-lock errors (Line 1198-1204). Prefer expect/propagate and only return None when the list is empty.

🛠 Proposed fix
-        if let Ok(t) = self.tenants.as_ref().read()
-            && !t.is_empty()
-        {
-            let t = t.clone();
-            Some(t)
-        } else {
-            None
-        }
+        let t = self.tenants.read().expect("tenants lock poisoned");
+        if t.is_empty() {
+            None
+        } else {
+            Some(t.clone())
+        }
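
To illustrate why swallowing the error is risky, the sketch below poisons a lock on purpose and compares the two shapes. All function names are hypothetical, and returning a `Result` is one alternative to the `expect` used in the proposed fix.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Builds a lock that is poisoned by a thread panicking while it holds
/// the write guard.
fn poisoned_tenants() -> Arc<RwLock<Vec<String>>> {
    let lock = Arc::new(RwLock::new(vec!["acme".to_string()]));
    let clone = Arc::clone(&lock);
    let _ = thread::spawn(move || {
        let _guard = clone.write().unwrap();
        panic!("simulated failure while holding the tenants lock");
    })
    .join();
    lock
}

/// Original shape: a poisoned lock is indistinguishable from "no tenants".
fn list_tenants_swallowing(lock: &RwLock<Vec<String>>) -> Option<Vec<String>> {
    let guard = lock.read().ok()?;
    if guard.is_empty() { None } else { Some(guard.clone()) }
}

/// Alternative shape: poisoning is reported, emptiness maps to `None`.
fn list_tenants_checked(lock: &RwLock<Vec<String>>) -> Result<Option<Vec<String>>, String> {
    let guard = lock.read().map_err(|_| "tenants lock poisoned".to_string())?;
    Ok(if guard.is_empty() { None } else { Some(guard.clone()) })
}
```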
🧹 Nitpick comments (7)
src/connectors/kafka/mod.rs (1)

153-168: Consider direct header iteration for efficiency.

Building the full HashMap via extract_headers just to retrieve a single header is wasteful. Consider iterating headers directly:

♻️ Suggested refactor
-        let tenant_id = if let Some(headers) = extract_headers(&msg)
-            && let Some(tenant_id) = headers.get("tenant")
-        {
-            tenant_id.clone()
-        } else {
-            None
-        };
+        let tenant_id = msg.headers().and_then(|headers| {
+            headers.iter().find_map(|h| {
+                (h.key == "tenant").then(|| h.value.map(|v| String::from_utf8_lossy(v).to_string()))
+            })
+        }).flatten();

Also, the #[allow(unused)] attribute on extract_headers (line 190) is now stale since this code uses it—consider removing that attribute or adopting the direct iteration above.
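
The single-pass lookup can be sketched without the Kafka client. `Header` below is a hypothetical stand-in with a key and optional byte value, not the rdkafka type, and `tenant_from_headers` is an illustrative name.

```rust
/// Hypothetical stand-in for a Kafka record header.
struct Header<'a> {
    key: &'a str,
    value: Option<&'a [u8]>,
}

/// Scans the headers once, stops at the first `tenant` key, and decodes
/// its value lossily as UTF-8. No intermediate map is built.
fn tenant_from_headers(headers: &[Header<'_>]) -> Option<String> {
    headers
        .iter()
        .find(|h| h.key == "tenant")
        .and_then(|h| h.value)
        .map(|v| String::from_utf8_lossy(v).into_owned())
}
```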

src/rbac/utils.rs (2)

54-67: Remove dead commented-out code and simplify the filter/map chain.

Lines 57-62 contain dead code that should be removed. The .filter(...).map(...) pattern can be simplified using .and_then():

♻️ Suggested refactor
     let direct_roles: HashMap<String, Vec<DefaultPrivilege>> = Users
         .get_role(id, &user.tenant)
         .iter()
         .filter_map(|role_name| {
             roles()
                 .get(tenant_id)
-                .filter(|roles| roles.get(role_name).is_some())
-                // .map(|roles| {
-                //     if let Some(role) = roles.get(role_name) {
-                //         (role_name.to_owned(), role.clone())
-                //     }
-                // })
-                // .get(role_name)
-                .map(|roles| {
-                    let role = roles.get(role_name).unwrap();
-                    (role_name.to_owned(), role.clone())
-                })
+                .and_then(|roles| {
+                    roles.get(role_name).map(|role| (role_name.to_owned(), role.clone()))
+                })
         })
         .collect();
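
A standalone sketch of the `and_then` form on a nested map; `Privileges`, `TenantRoles`, and `resolve_roles` are hypothetical simplifications of the RBAC types.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `Vec<DefaultPrivilege>`.
type Privileges = Vec<String>;
/// tenant id -> role name -> privileges.
type TenantRoles = HashMap<String, HashMap<String, Privileges>>;

/// Resolves the named roles for one tenant. Missing tenants and missing
/// roles are skipped, with no `unwrap` on the inner lookup.
fn resolve_roles(
    all_roles: &TenantRoles,
    tenant_id: &str,
    role_names: &[&str],
) -> HashMap<String, Privileges> {
    role_names
        .iter()
        .filter_map(|role_name| {
            all_roles.get(tenant_id).and_then(|roles| {
                roles
                    .get(*role_name)
                    .map(|role| (role_name.to_string(), role.clone()))
            })
        })
        .collect()
}
```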

80-88: Apply the same simplification to group role extraction.

The same .filter(...).map(...).unwrap() pattern is used here; refactor to .and_then() for consistency:

♻️ Suggested refactor
                 .filter_map(|role_name| {
                     roles()
                         .get(tenant_id)
-                        .filter(|roles| roles.get(role_name).is_some())
-                        .map(|roles| {
-                            let role = roles.get(role_name).unwrap();
-                            (role_name.to_owned(), role.clone())
-                        })
+                        .and_then(|roles| {
+                            roles.get(role_name).map(|role| (role_name.to_owned(), role.clone()))
+                        })
                 })
src/utils/mod.rs (1)

93-94: Remove debug logging or change log level.

The tracing::warn! calls appear to be leftover debugging code. Either remove them or change to trace!/debug! level.

♻️ Proposed fix
             if let Some(tenant) = Users.get_user_tenant_from_basic(&username, &password) {
-                tracing::warn!(tenant=?tenant);
-                tracing::warn!(user=?username);
+                tracing::debug!(tenant=?tenant, user=?username, "Resolved tenant from basic auth");
                 Ok((username.clone(), Some(tenant)))
src/users/filters.rs (1)

136-142: Consider consistent behavior for delete_filter when tenant bucket is missing.

Similar to update(), delete_filter silently does nothing if the tenant bucket doesn't exist. While less critical (deleting from non-existent bucket is a no-op), consider logging or returning a result to indicate the operation's outcome.

src/storage/azure_blob.rs (1)

214-239: Only increment object-store call metrics on success.

increment_object_store_calls_by_date runs before checking the result (Line 221). That counts failed calls as successes. Move the increment into the Ok branch (and apply the same pattern to PUT/LIST/DELETE here).

♻️ Example fix (GET)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
+        let resp = self.client.get(&to_object_store_path(path)).await;
 
         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
Based on learnings, object-store metrics should only be emitted on successful operations.
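
A minimal sketch of the success-only pattern, using a hypothetical in-memory `CallMetrics` counter and a fake `backend_get` instead of the real object-store client and Prometheus metrics. The `DEFAULT_TENANT` value here is a placeholder.

```rust
use std::collections::HashMap;

/// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "default";

/// Hypothetical per-(method, tenant) call counter.
#[derive(Default)]
struct CallMetrics {
    counts: HashMap<(String, String), u64>,
}

impl CallMetrics {
    fn increment(&mut self, method: &str, tenant: &str) {
        *self
            .counts
            .entry((method.to_owned(), tenant.to_owned()))
            .or_insert(0) += 1;
    }

    fn get(&self, method: &str, tenant: &str) -> u64 {
        self.counts
            .get(&(method.to_owned(), tenant.to_owned()))
            .copied()
            .unwrap_or(0)
    }
}

/// Fake backend call: fails on an empty path.
fn backend_get(path: &str) -> Result<Vec<u8>, String> {
    if path.is_empty() {
        Err("empty path".to_string())
    } else {
        Ok(path.as_bytes().to_vec())
    }
}

/// The `?` returns early on failure, so the counter only sees successes.
fn get_object(
    metrics: &mut CallMetrics,
    path: &str,
    tenant: Option<&str>,
) -> Result<Vec<u8>, String> {
    let tenant = tenant.unwrap_or(DEFAULT_TENANT);
    let body = backend_get(path)?;
    metrics.increment("GET", tenant);
    Ok(body)
}
```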
src/storage/gcs.rs (1)

180-205: Only increment object-store call metrics on success.

increment_object_store_calls_by_date runs before checking the result (Line 186-188), so failures are counted as successes. Move the increment into the Ok branch (and apply similarly to PUT/LIST/DELETE).

♻️ Example fix (GET)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
+        let resp = self.client.get(&to_object_store_path(path)).await;
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
Based on learnings, object-store metrics should only be emitted on successful operations.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
src/hottier.rs (2)

208-217: delete_hot_tier ignores tenant_id in the path.

This deletes the non-tenant directory even when a tenant is supplied, risking cross-tenant data removal.

🐛 Suggested fix
-        let path = self.hot_tier_path.join(stream);
+        let path = if let Some(tenant) = tenant_id.as_ref() {
+            self.hot_tier_path.join(tenant).join(stream)
+        } else {
+            self.hot_tier_path.join(stream)
+        };
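
The path selection can be checked on its own. `hot_tier_stream_dir` is a hypothetical helper name, and the layout (`base/tenant/stream`) follows the suggested fix rather than a confirmed on-disk format.

```rust
use std::path::{Path, PathBuf};

/// Builds the directory to delete: tenant-scoped when a tenant is given,
/// otherwise the legacy non-tenant location.
fn hot_tier_stream_dir(base: &Path, stream: &str, tenant_id: &Option<String>) -> PathBuf {
    match tenant_id {
        Some(tenant) => base.join(tenant).join(stream),
        None => base.join(stream),
    }
}
```

Two tenants with the same stream name then resolve to different directories, so deleting one cannot touch the other.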

410-466: Hot tier manifests/cleanup paths aren’t tenant-scoped.

get_stream_path_for_date and related helpers still use hot_tier_path/stream/.... With duplicate stream names across tenants, manifests and cleanup can collide or miss data. Consider threading tenant_id into get_stream_path_for_date, fetch_hot_tier_dates, and get_oldest_date_time_entry, then use the tenant-scoped base path in process_parquet_file and cleanup flows.

src/alerts/alert_structs.rs (1)

292-375: Add tenantId and tenant_id to RESERVED_FIELDS to prevent tenant ID override attacks.

AlertConfig serializes tenant_id as tenantId due to #[serde(rename_all = "camelCase")], but these keys are not stripped from other_fields during validation. With #[serde(flatten)] on both AlertRequest and AlertConfig, a client can inject tenantId, which survives sanitization and persists through serialization/deserialization cycles, potentially compromising tenant isolation on reload.

Proposed fix
 const RESERVED_FIELDS: &[&str] = &[
     "version",
     "id",
     "severity",
@@
     "lastTriggeredAt",
     "last_triggered_at",
+    "tenantId",
+    "tenant_id",
 ];
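
The effect of the extra entries can be sketched with a plain string map standing in for the flattened `other_fields` JSON object. The list below is an abbreviated, hypothetical subset of `RESERVED_FIELDS`, and `strip_reserved` is an illustrative helper.

```rust
use std::collections::HashMap;

/// Abbreviated subset for illustration; the real list is longer.
const RESERVED_FIELDS: &[&str] = &["id", "severity", "tenantId", "tenant_id"];

/// Drops every client-supplied key that collides with a reserved field,
/// so a request cannot smuggle in its own tenant id.
fn strip_reserved(other_fields: &mut HashMap<String, String>) {
    other_fields.retain(|key, _| !RESERVED_FIELDS.contains(&key.as_str()));
}
```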
src/rbac/user.rs (1)

332-359: Tenant-scoped group/user validation is using the wrong map keys.

read_user_groups().contains_key(&self.name) and users().contains_key(group_user.userid()) now check tenant IDs, not group/user IDs. This will miss duplicates and mark valid users as missing.

🛠️ Tenant-aware checks
-        if read_user_groups().contains_key(&self.name) {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        if read_user_groups()
+            .get(tenant)
+            .is_some_and(|groups| groups.contains_key(&self.name))
+        {
             return Err(RBACError::UserGroupExists(self.name.clone()));
         }
@@
-            for group_user in &self.users {
-                if !users().contains_key(group_user.userid()) {
+            for group_user in &self.users {
+                if !users()
+                    .get(tenant)
+                    .is_some_and(|tenant_users| tenant_users.contains_key(group_user.userid()))
+                {
                     non_existent_users.push(group_user.userid().to_string());
                 }
             }
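
Why the outer `contains_key` misses duplicates can be seen in a reduced sketch. `GroupsByTenant` and `group_exists` are hypothetical simplifications of the RBAC maps.

```rust
use std::collections::{HashMap, HashSet};

/// tenant id -> set of group names.
type GroupsByTenant = HashMap<String, HashSet<String>>;

/// Looks up the tenant bucket first, then the group inside it.
fn group_exists(groups: &GroupsByTenant, tenant: &str, group: &str) -> bool {
    groups.get(tenant).is_some_and(|g| g.contains(group))
}
```

Calling `contains_key` on the outer map with a group name compares it against tenant ids, so it returns `false` even when the group exists.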
src/migration/mod.rs (1)

60-118: Tenant-scoped staging metadata is read from the global path.

run_metadata_migration always calls get_staging_metadata(config) (root .parseable.json), while migrate_staging now writes tenant-scoped paths. In multi-tenant mode this skips per-tenant staging metadata (or rewrites the default file for each tenant). Make the read path tenant-aware and pass tenant_id through so read/write are symmetric.

🛠️ Suggested fix (call site)
-    let staging_metadata = get_staging_metadata(config)?;
+    let staging_metadata = get_staging_metadata(config, tenant_id)?;
🛠️ Suggested fix (new helper signature)
pub fn get_staging_metadata(
    config: &Parseable,
    tenant_id: &Option<String>,
) -> anyhow::Result<Option<serde_json::Value>> {
    let path = if let Some(tenant) = tenant_id.as_ref() {
        config
            .options
            .staging_dir()
            .join(tenant)
            .join(PARSEABLE_METADATA_FILE_NAME)
    } else {
        config.options.staging_dir().join(PARSEABLE_METADATA_FILE_NAME)
    };
    // existing read/parse logic...
}

Also applies to: 123-151

src/storage/azure_blob.rs (1)

214-266: Record object-store call metrics only on success.

Several methods increment increment_object_store_calls_by_date before confirming success (e.g., _get_object, _put_object, _list_dates, _upload_file, head, delete_object, check, try_delete_node_meta, and list_* helpers). Move the increments into the Ok branches (or after ?) so failed operations don’t inflate metrics. Based on learnings, metrics should only be incremented on successful operations.

🛠️ Example fix (apply similarly elsewhere)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
+        let resp = self.client.get(&to_object_store_path(path)).await;
         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
                 ...

Also applies to: 321-357, 359-381, 383-425, 499-517, 639-661, 663-680, 692-717, 785-820, 822-858, 891-921, 923-954

src/storage/gcs.rs (1)

180-207: Record object-store call metrics only on success.

Several methods increment increment_object_store_calls_by_date before confirming success (e.g., _get_object, _put_object, _list_dates, _upload_file, _upload_multipart, get_buffered_reader, head, delete_object, check, try_delete_node_meta, and list_* helpers). Move the increments into Ok branches (or after ?) so failed operations don’t inflate metrics. Based on learnings, metrics should only be incremented on successful operations.

🛠️ Example fix (apply similarly elsewhere)
-        let resp = self.client.get(&to_object_store_path(path)).await;
-        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
-        increment_object_store_calls_by_date("GET", &Utc::now().date_naive().to_string(), tenant);
+        let resp = self.client.get(&to_object_store_path(path)).await;
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
         match resp {
             Ok(resp) => {
+                increment_object_store_calls_by_date(
+                    "GET",
+                    &Utc::now().date_naive().to_string(),
+                    tenant,
+                );
                 let body: Bytes = resp.bytes().await?;
                 ...

Also applies to: 209-230, 285-321, 323-345, 347-437, 443-469, 481-499, 621-643, 645-663, 675-700, 767-801, 804-840, 869-900, 902-933

🤖 Fix all issues with AI agents
In `@src/handlers/http/middleware.rs`:
- Around line 176-191: When handling Action::Ingest in the tenant header parsing
block, treat an empty or all-whitespace tenant string as invalid/missing: after
obtaining tenant.to_str(), trim the value and if trimmed is empty set
header_error to PostError::Header(ParseHeaderError::MissingTenantId) (or
InvalidTenantId if you prefer) and return None; also keep the existing check
that exact DEFAULT_TENANT still sets ParseHeaderError::InvalidTenantId. Update
the logic around the tenant header extraction in middleware.rs (the
Action::Ingest branch) to use tenant_str.trim().is_empty() and set header_error
accordingly while still using header_error, PostError::Header,
ParseHeaderError::{MissingTenantId,InvalidTenantId} and the tenant variable
names already present.

In `@src/query/stream_schema_provider.rs`:
- Around line 224-232: The parquet-file retrieval ignores tenant scoping causing
cross-tenant collisions; update the hot-tier path construction to propagate
tenant_id instead of using the hardcoded object_store_url: thread tenant_id into
get_hot_tier_manifest_files -> get_hot_tier_parquet_files ->
fetch_hot_tier_dates (and callers such as create_parquet_physical_plan) and use
it to build tenant-scoped paths/ObjectStoreUrl (or, if sharing is intentional,
add a clear code comment and documentation explaining why hot-tier data is
global); ensure functions and signatures referencing
get_hot_tier_manifest_files, get_hot_tier_parquet_files, fetch_hot_tier_dates,
create_parquet_physical_plan, and tenant_id are updated accordingly.

In `@src/rbac/role.rs`:
- Line 94: The authorization currently constructs Permission::Resource(action,
None) for many actions in the Role::build() code but src/rbac/map.rs only
matches None for Ingest and Query, causing other actions (QueryLLM, AddLLM,
DeleteLLM, UpdateStream, GetSchema, DetectSchema, GetStats, GetRetention,
PutRetention, All, SuperAdmin) to be denied; update the match in map.rs (the
logic that matches Permission::Resource(action, resource_type)) to handle
Permission::Resource(action, None) for those remaining actions with the correct
semantics, or alternatively change Role::build() to never emit
Permission::Resource(..., None) for actions that lack None-handling—adjust the
Permission::Resource matching or Role::build() so that every action with None is
handled consistently.

In `@src/storage/retention.rs`:
- Around line 111-120: Task::new currently calls NonZeroU32::new(days).unwrap(),
which will panic if days == 0; change Task::new to return Result<Task,
RetentionError> (or Result<Self, E> using your project's error type), replace
the unwrap with NonZeroU32::new(days).ok_or(RetentionError::InvalidDays)? (or
map to an appropriate error variant), construct and return Ok(Self {
description, days, action: Action::Delete }), and update callers to handle the
Result; alternatively, if you decide zero should be forbidden with a clear
panic, replace unwrap with .expect("days must be non-zero") to provide a
descriptive message.

In `@src/utils/mod.rs`:
- Around line 92-95: The warn-level logs in the Users.get_user_tenant_from_basic
branch leak user/tenant identifiers (tenant and username); remove the
tracing::warn! calls or change them to tracing::debug! and ensure identifiers
are redacted (e.g., mask or log only suffix/prefix) before logging so PII is not
emitted; update the block around Users.get_user_tenant_from_basic to return
Ok((username.clone(), Some(tenant))) without warn-level PII logging.

Comment on lines +176 to +191
let tenant = if self.action.eq(&Action::Ingest) {
    if let Some(tenant) = req.headers().get("tenant")
        && let Ok(tenant) = tenant.to_str()
    {
        if tenant.eq(DEFAULT_TENANT) {
            header_error = Some(actix_web::Error::from(PostError::Header(
                crate::utils::header_parsing::ParseHeaderError::InvalidTenantId,
            )));
        }
        Some(tenant.to_owned())
    } else {
        // tenant header is not present, error out
        header_error = Some(actix_web::Error::from(PostError::Header(
            crate::utils::header_parsing::ParseHeaderError::MissingTenantId,
        )));
        None

⚠️ Potential issue | 🟠 Major

Reject empty tenant IDs on ingest.

An empty tenant header currently passes validation (it’s not DEFAULT_TENANT), so ingestion can proceed with a blank tenant value. Treat empty/whitespace as missing or invalid to avoid default/empty-tenant bypass.

🐛 Suggested fix
-                    if let Some(tenant) = req.headers().get("tenant")
-                        && let Ok(tenant) = tenant.to_str()
-                    {
-                        if tenant.eq(DEFAULT_TENANT) {
+                    if let Some(tenant) = req.headers().get("tenant")
+                        && let Ok(tenant) = tenant.to_str()
+                    {
+                        let tenant = tenant.trim();
+                        if tenant.is_empty() {
+                            header_error = Some(actix_web::Error::from(PostError::Header(
+                                crate::utils::header_parsing::ParseHeaderError::MissingTenantId,
+                            )));
+                            None
+                        } else if tenant.eq(DEFAULT_TENANT) {
                             header_error = Some(actix_web::Error::from(PostError::Header(
                                 crate::utils::header_parsing::ParseHeaderError::InvalidTenantId,
                             )));
-                        }
-                        Some(tenant.to_owned())
+                            Some(tenant.to_owned())
+                        } else {
+                            Some(tenant.to_owned())
+                        }
                     } else {
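
The validation rules can also be expressed as a small pure function, which is easier to unit-test than the middleware branch. `TenantHeaderError` and `validate_ingest_tenant` are hypothetical names, and the `DEFAULT_TENANT` value is a placeholder.

```rust
/// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Hypothetical error type mirroring MissingTenantId / InvalidTenantId.
#[derive(Debug, PartialEq)]
enum TenantHeaderError {
    Missing,
    Invalid,
}

/// Treats an absent, empty, or all-whitespace header as missing and
/// rejects the reserved default tenant id.
fn validate_ingest_tenant(header: Option<&str>) -> Result<String, TenantHeaderError> {
    let tenant = header.map(str::trim).unwrap_or("");
    if tenant.is_empty() {
        return Err(TenantHeaderError::Missing);
    }
    if tenant == DEFAULT_TENANT {
        return Err(TenantHeaderError::Invalid);
    }
    Ok(tenant.to_owned())
}
```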

Comment on lines +224 to +232
+        // let object_store_url = if let Some(tenant_id) = self.tenant_id.as_ref() {
+        //     &format!("file:///{tenant_id}/")
+        // } else {
+        //     "file:///"
+        // };
+        let object_store_url = "file:///";
         self.create_parquet_physical_plan(
             execution_plans,
-            ObjectStoreUrl::parse("file:///").unwrap(),
+            ObjectStoreUrl::parse(object_store_url).unwrap(),

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for tenant-aware file path patterns in the codebase
echo "=== Searching for tenant-aware file paths ==="
rg "file://.*tenant|tenant.*file://" --type rust -B 2 -A 2

echo ""
echo "=== Searching for object_store_url patterns ==="
rg "object_store_url.*=|ObjectStoreUrl" --type rust -B 1 -A 1 | head -100

echo ""
echo "=== Checking the exact file for context ==="
wc -l src/query/stream_schema_provider.rs

Repository: parseablehq/parseable

Length of output: 5356


🏁 Script executed:

#!/bin/bash
# Look at the broader context of the stream_schema_provider file
echo "=== Lines around 224-232 in stream_schema_provider.rs ==="
sed -n '210,250p' src/query/stream_schema_provider.rs

echo ""
echo "=== Look for other tenant_id usages in this file ==="
rg -n "tenant_id" src/query/stream_schema_provider.rs -B 1 -A 1

Repository: parseablehq/parseable

Length of output: 3989


🏁 Script executed:

#!/bin/bash
# Check how tenant_id is used in object_store_metastore as per the learning
echo "=== Checking object_store_metastore.rs for tenant path patterns ==="
sed -n '140,170p' src/metastore/metastores/object_store_metastore.rs

echo ""
echo "=== Search for similar commented-out tenant patterns in other files ==="
rg "tenant_id.*file:///|file:///.*tenant_id" --type rust -B 3 -A 3

Repository: parseablehq/parseable

Length of output: 1910


🏁 Script executed:

#!/bin/bash
# Search for hot_tier_storage_path definition and usage
echo "=== Searching for hot_tier_storage_path definition ==="
rg -n "hot_tier_storage_path" --type rust -B 2 -A 2

echo ""
echo "=== Check HotTierManager for tenant handling ==="
rg -n "check_stream_hot_tier_exists|HotTierManager" --type rust | head -30

Repository: parseablehq/parseable

Length of output: 4601


🏁 Script executed:

#!/bin/bash
# Check how hot tier files are passed to this function
echo "=== Looking at get_hot_tier_execution_plan caller context ==="
sed -n '600,625p' src/query/stream_schema_provider.rs

echo ""
echo "=== Check hot_tier_files source ==="
rg -n "hot_tier_files|HotTierManager" src/query/stream_schema_provider.rs -B 1 -A 1

Repository: parseablehq/parseable

Length of output: 1892


🏁 Script executed:

#!/bin/bash
# Check if hot tier file paths already include tenant context
echo "=== Search for hot tier file path construction ==="
rg -n "hot_tier.*path|hot_tier.*file|File.*tenant" --type rust | grep -i "hot_tier" | head -20

echo ""
echo "=== Check stream definition and tenant relationship ==="
sed -n '100,120p' src/query/stream_schema_provider.rs

Repository: parseablehq/parseable

Length of output: 2212


🏁 Script executed:

#!/bin/bash
# Look at hot_tier_file_path implementation
echo "=== hot_tier_file_path implementation ==="
sed -n '237,260p' src/hottier.rs

echo ""
echo "=== get_hot_tier_manifest_files implementation ==="
rg -n "get_hot_tier_manifest_files" src/hottier.rs -A 30 | head -60

Repository: parseablehq/parseable

Length of output: 2233


🏁 Script executed:

#!/bin/bash
# Check how file paths flow through the hot tier system
echo "=== Check File struct and file_path usage ==="
rg -n "struct File|file_path" --type rust | grep -E "(File|file_path)" | head -20

echo ""
echo "=== Look at how files are stored/retrieved from hot tier ==="
sed -n '375,410p' src/hottier.rs

Repository: parseablehq/parseable

Length of output: 3273


🏁 Script executed:

#!/bin/bash
# Look at get_hot_tier_parquet_files implementation
echo "=== get_hot_tier_parquet_files implementation ==="
rg -n "get_hot_tier_parquet_files" src/hottier.rs -A 40 | head -80

echo ""
echo "=== Check how manifest files are created with tenant context ==="
rg -n "File {|file_path.*=" --type rust src/hottier.rs | head -20

Repository: parseablehq/parseable

Length of output: 3241


🏁 Script executed:

#!/bin/bash
# Understand the File struct and how it includes/excludes tenant context
echo "=== Search for File struct definition ==="
rg -n "struct File|pub.*file_path" --type rust | grep -B 2 -A 2 "file_path" | head -40

Repository: parseablehq/parseable

Length of output: 629


🏁 Script executed:

#!/bin/bash
# Check get_stream_path_for_date implementation
echo "=== get_stream_path_for_date implementation ==="
rg -n "get_stream_path_for_date" src/hottier.rs -A 10

echo ""
echo "=== Check fetch_hot_tier_dates ==="
rg -n "fetch_hot_tier_dates" src/hottier.rs -A 15 | head -40

echo ""
echo "=== File struct definition ==="
sed -n '57,70p' src/catalog/manifest.rs

Repository: parseablehq/parseable

Length of output: 5449


🏁 Script executed:

#!/bin/bash
# Check all callers of get_hot_tier_manifest_files
echo "=== Callers of get_hot_tier_manifest_files ==="
rg -n "get_hot_tier_manifest_files" --type rust -B 3 -A 3

echo ""
echo "=== Check if get_hot_tier_parquet_files ever receives tenant context ==="
rg -n "get_hot_tier_parquet_files" --type rust -B 3 -A 3

Repository: parseablehq/parseable

Length of output: 1955


Multi-tenant isolation issue in hot tier file retrieval.

The hot tier file retrieval chain (get_hot_tier_manifest_files → get_hot_tier_parquet_files → fetch_hot_tier_dates) ignores tenant_id while building file paths. This causes different tenants to access parquet files from the same hot_tier_path/stream/date/ directory structure, creating a cross-tenant collision risk. While the hot tier metadata file path (hot_tier_file_path) correctly incorporates tenant scoping, the parquet file retrieval does not, leaving the system vulnerable to multi-tenant isolation violations.

The commented-out object store URL construction is a symptom of this deeper issue. Either:

  1. Pass tenant_id through get_hot_tier_manifest_files → get_hot_tier_parquet_files → fetch_hot_tier_dates to build tenant-scoped paths, or
  2. Document why hot tier data is intentionally shared across tenants and provide a clear explanation in the code.

Comment on lines +111 to +120
impl Task {
    pub fn new(description: String, days: u32) -> Self {
        let days = NonZeroU32::new(days).unwrap();
        Self {
            description,
            days,
            action: Action::Delete,
        }
    }
}

⚠️ Potential issue | 🟡 Minor

Task::new will panic if days is 0.

NonZeroU32::new(0) returns None, and the .unwrap() will panic. Consider returning a Result or validating the input.

🐛 Suggested fix
 impl Task {
-    pub fn new(description: String, days: u32) -> Self {
-        let days = NonZeroU32::new(days).unwrap();
-        Self {
+    pub fn new(description: String, days: u32) -> Option<Self> {
+        let days = NonZeroU32::new(days)?;
+        Some(Self {
             description,
             days,
             action: Action::Delete,
-        }
+        })
     }
 }

Alternatively, if callers are trusted to never pass 0:

+    /// Creates a new Task. Panics if `days` is 0.
     pub fn new(description: String, days: u32) -> Self {
+        assert!(days > 0, "days must be greater than 0");
         let days = NonZeroU32::new(days).unwrap();
🤖 Prompt for AI Agents
In `@src/storage/retention.rs` around lines 111 - 120, Task::new currently calls
NonZeroU32::new(days).unwrap(), which will panic if days == 0; change Task::new
to return Result<Task, RetentionError> (or Result<Self, E> using your project's
error type), replace the unwrap with
NonZeroU32::new(days).ok_or(RetentionError::InvalidDays)? (or map to an
appropriate error variant), construct and return Ok(Self { description, days,
action: Action::Delete }), and update callers to handle the Result;
alternatively, if you decide zero should be forbidden with a clear panic,
replace unwrap with .expect("days must be non-zero") to provide a descriptive
message.
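
The Result-returning variant described in the prompt above could look roughly like the sketch below. `RetentionError::InvalidDays` is a hypothetical error variant introduced only for illustration; the project's actual error type may differ, and the `Task` and `Action` definitions are reduced to what the diff shows.

```rust
use std::fmt;
use std::num::NonZeroU32;

/// Hypothetical error type for this sketch; not taken from the PR.
#[derive(Debug, PartialEq)]
pub enum RetentionError {
    InvalidDays,
}

impl fmt::Display for RetentionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RetentionError::InvalidDays => write!(f, "days must be greater than 0"),
        }
    }
}

impl std::error::Error for RetentionError {}

#[derive(Debug, PartialEq)]
pub enum Action {
    Delete,
}

#[derive(Debug, PartialEq)]
pub struct Task {
    description: String,
    days: NonZeroU32,
    action: Action,
}

impl Task {
    /// Returns an error instead of panicking when `days` is 0.
    pub fn new(description: String, days: u32) -> Result<Self, RetentionError> {
        // NonZeroU32::new yields None for 0; map that to a typed error.
        let days = NonZeroU32::new(days).ok_or(RetentionError::InvalidDays)?;
        Ok(Self {
            description,
            days,
            action: Action::Delete,
        })
    }
}
```

Callers then decide how to surface the error, for example by mapping it to a 400 response when `days` comes from a request body.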

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/prism/logstream/mod.rs (1)

393-393: Incorrect HTTP status code for TimeParse error.

StatusCode::NOT_FOUND (404) is semantically incorrect for time parsing errors. These are typically caused by malformed input in the request and should return StatusCode::BAD_REQUEST (400).

🐛 Suggested fix
-            PrismLogstreamError::TimeParse(_) => StatusCode::NOT_FOUND,
+            PrismLogstreamError::TimeParse(_) => StatusCode::BAD_REQUEST,
src/handlers/http/modal/ingest/ingestor_rbac.rs (1)

203-227: Password reset reads metadata then immediately writes unchanged metadata.

In post_gen_password, line 212 calls put_staging_metadata before any modifications are made to metadata (lines 213-225 only read from it). The password hash is then updated in memory via Users.change_password_hash but the updated hash is never persisted back to storage.

🐛 Proposed fix - persist after modification
     let mut metadata = get_metadata(&tenant_id).await?;
-    let _ = storage::put_staging_metadata(&metadata, &tenant_id);
     if let Some(user) = metadata
         .users
         .iter_mut()
         .filter_map(|user| match user.ty {
             user::UserType::Native(ref mut user) => Some(user),
             _ => None,
         })
         .find(|user| user.username == username)
     {
         new_hash.clone_from(&user.password_hash);
     } else {
         return Err(RBACError::UserDoesNotExist);
     }
+    // Persist the metadata after password hash is set
+    storage::put_staging_metadata(&metadata, &tenant_id)?;
     Users.change_password_hash(&username, &new_hash, &tenant_id);
src/handlers/http/cluster/mod.rs (1)

1880-1925: Tenant context not propagated in outgoing query headers.

When constructing the request headers in send_query_request, the tenant_id parameter is accepted but never added to the outgoing request headers. The remote querier will lose tenant context and may default to the wrong tenant.

🐛 Proposed fix - add tenant header
     let mut map = reqwest::header::HeaderMap::new();

+    // Add tenant header if present
+    if let Some(tenant) = tenant_id {
+        if let Ok(val) = reqwest::header::HeaderValue::from_str(tenant) {
+            map.insert(
+                reqwest::header::HeaderName::from_static("tenant"),
+                val,
+            );
+        }
+    }
+
     if let Some(auth) = auth_token {
🤖 Fix all issues with AI agents
In `@src/rbac/user.rs`:
- Around line 352-354: The existence check for a user group is currently global;
update the logic in the creation/validation path (where read_user_groups() is
used and RBACError::UserGroupExists(self.name.clone()) is returned) to scope the
check to the tenant: fetch the tenant entry from read_user_groups() (the
top-level UserGroups HashMap) using the tenant identifier on the current struct
(e.g., self.tenant) and then call contains_key(&self.name) on that tenant's
inner HashMap; return RBACError::UserGroupExists only if the tenant map exists
and contains the group name, otherwise allow creation.

In `@src/storage/store_metadata.rs`:
- Around line 71-72: The struct fields suspended_services and global_query_auth
lack serde defaults which causes deserialization to fail for older metadata; add
#[serde(default)] above each of these fields (i.e., annotate suspended_services:
Option<HashSet<Service>> and global_query_auth: Option<String>) so missing keys
deserialize to None (or an empty collection), ensuring existing metadata loads
correctly.
🧹 Nitpick comments (4)
src/prism/logstream/mod.rs (2)

91-91: Avoid unnecessary Vec allocation.

The vec! macro allocates on the heap, but update_schema_when_distributed accepts a slice reference. A stack-allocated array would be more efficient.

♻️ Suggested fix
-    match update_schema_when_distributed(&vec![stream_name.to_owned()], tenant_id).await {
+    match update_schema_when_distributed(&[stream_name.to_owned()], tenant_id).await {

180-181: Consider handling poisoned lock.

Using .unwrap() on read() will panic if the RwLock is poisoned (i.e., a thread panicked while holding the lock). While this is often acceptable, you could use .read().unwrap_or_else(|e| e.into_inner()) to recover from poisoned state if resilience is desired.
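
As a minimal sketch of the recovery pattern mentioned above, using only the standard library: `read_streams` and `poison` are hypothetical helpers, and the `Vec<String>` payload stands in for whatever the real lock protects.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Returns a snapshot of the data, recovering the guard if the lock is poisoned.
fn read_streams(lock: &RwLock<Vec<String>>) -> Vec<String> {
    // A PoisonError still carries the guard; into_inner() hands it back.
    let guard = lock.read().unwrap_or_else(|poisoned| poisoned.into_inner());
    guard.clone()
}

/// Poisons the lock by panicking in a writer thread (demonstration only).
fn poison(lock: &Arc<RwLock<Vec<String>>>) {
    let writer = Arc::clone(lock);
    let _ = thread::spawn(move || {
        let _guard = writer.write().unwrap();
        panic!("simulated writer panic");
    })
    .join();
}
```

Recovering this way assumes the protected data is still usable after the writer panicked; if a half-finished write could leave it inconsistent, propagating the error may be the safer choice.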

src/storage/store_metadata.rs (1)

318-331: Consider extracting tenant-aware path resolution into a helper.

The path resolution logic for tenant vs non-tenant is duplicated between get_staging_metadata (lines 282-291) and put_staging_metadata (lines 322-331). Extracting this into a helper function would reduce duplication and make future changes easier.

♻️ Proposed refactor: extract helper function
fn tenant_staging_metadata_path(tenant_id: &Option<String>) -> io::Result<PathBuf> {
    let path = if let Some(tenant_id) = tenant_id.as_ref() {
        let tenant_dir = PARSEABLE.options.staging_dir().join(tenant_id);
        create_dir_all(&tenant_dir)?;
        tenant_dir.join(PARSEABLE_METADATA_FILE_NAME)
    } else {
        PARSEABLE
            .options
            .staging_dir()
            .join(PARSEABLE_METADATA_FILE_NAME)
    };
    Ok(path)
}

Then use it in both functions:

 pub fn get_staging_metadata(tenant_id: &Option<String>) -> io::Result<Option<StorageMetadata>> {
-    let path = if let Some(tenant_id) = tenant_id.as_ref() {
-        let tenant_dir = PARSEABLE.options.staging_dir().join(tenant_id);
-        create_dir_all(&tenant_dir)?;
-        tenant_dir.join(PARSEABLE_METADATA_FILE_NAME)
-    } else {
-        PARSEABLE
-            .options
-            .staging_dir()
-            .join(PARSEABLE_METADATA_FILE_NAME)
-    };
+    let path = tenant_staging_metadata_path(tenant_id)?;
src/rbac/user.rs (1)

402-411: add_roles uses caller-provided tenant_id for session invalidation.

The method uses the function parameter tenant_id for all users in the group. If the group contains users from different tenants (even accidentally), this would invalidate sessions for the wrong tenant. Consider using each user's group_user.tenant_id instead, similar to the pattern in add_users, remove_roles, and remove_users.

♻️ Proposed fix for consistency
     pub fn add_roles(&mut self, roles: HashSet<String>, tenant_id: &str) -> Result<(), RBACError> {
         if roles.is_empty() {
             return Ok(());
         }
         self.roles.extend(roles);
         // also refresh all user sessions
         for group_user in &self.users {
-            mut_sessions().remove_user(group_user.userid(), tenant_id);
+            mut_sessions().remove_user(
+                group_user.userid(),
+                group_user.tenant_id.as_deref().unwrap_or(tenant_id),
+            );
         }
         Ok(())
     }

Comment on lines 352 to 354
if read_user_groups().contains_key(&self.name) {
return Err(RBACError::UserGroupExists(self.name.clone()));
}

⚠️ Potential issue | 🟠 Major

UserGroup existence check is not tenant-scoped.

The validation read_user_groups().contains_key(&self.name) checks globally rather than within the tenant context. With the new per-tenant UserGroups structure (HashMap<String, HashMap<String, UserGroup>>), this should first lookup the tenant entry.

🐛 Proposed fix
-        if read_user_groups().contains_key(&self.name) {
+        let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+        if read_user_groups()
+            .get(tenant)
+            .is_some_and(|groups| groups.contains_key(&self.name))
+        {
             return Err(RBACError::UserGroupExists(self.name.clone()));
         }
🤖 Prompt for AI Agents
In `@src/rbac/user.rs` around lines 352 - 354, The existence check for a user
group is currently global; update the logic in the creation/validation path
(where read_user_groups() is used and
RBACError::UserGroupExists(self.name.clone()) is returned) to scope the check to
the tenant: fetch the tenant entry from read_user_groups() (the top-level
UserGroups HashMap) using the tenant identifier on the current struct (e.g.,
self.tenant) and then call contains_key(&self.name) on that tenant's inner
HashMap; return RBACError::UserGroupExists only if the tenant map exists and
contains the group name, otherwise allow creation.
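
The tenant-scoped lookup can be illustrated with a small standalone sketch. The `UserGroups` alias mirrors the nested map shape described in the comment; `group_exists`, the reduced `UserGroup` struct, and the value of `DEFAULT_TENANT` are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

#[derive(Debug, Default)]
struct UserGroup {
    name: String,
}

/// tenant id -> (group name -> group)
type UserGroups = HashMap<String, HashMap<String, UserGroup>>;

/// True only if the group exists inside the given tenant's map.
fn group_exists(groups: &UserGroups, tenant_id: Option<&str>, name: &str) -> bool {
    let tenant = tenant_id.unwrap_or(DEFAULT_TENANT);
    groups
        .get(tenant)
        .is_some_and(|tenant_groups| tenant_groups.contains_key(name))
}
```

With this shape, two tenants can each own a group named `admins` without tripping the existence check for the other.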

Comment on lines +71 to +72
pub suspended_services: Option<HashSet<Service>>,
pub global_query_auth: Option<String>,

⚠️ Potential issue | 🔴 Critical

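Missing #[serde(default)] may affect deserialization of existing metadata.

Existing metadata files won't have suspended_services or global_query_auth fields. Note that serde's derived Deserialize normally maps a missing Option<T> field to None, so loading should only fail if these fields go through a custom deserializer (for example deserialize_with); confirm by loading a pre-existing metadata file. Adding #[serde(default)] still makes the backward-compatibility intent explicit and keeps the fields consistent with default_role.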

🐛 Proposed fix: add serde(default) attributes
     #[serde(default)]
     pub default_role: Option<String>,
+    #[serde(default)]
     pub suspended_services: Option<HashSet<Service>>,
+    #[serde(default)]
     pub global_query_auth: Option<String>,
 }
🤖 Prompt for AI Agents
In `@src/storage/store_metadata.rs` around lines 71 - 72, The struct fields
suspended_services and global_query_auth lack serde defaults which causes
deserialization to fail for older metadata; add #[serde(default)] above each of
these fields (i.e., annotate suspended_services: Option<HashSet<Service>> and
global_query_auth: Option<String>) so missing keys deserialize to None (or an
empty collection), ensuring existing metadata loads correctly.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (5)
src/storage/object_storage.rs (4)

116-117: Remove commented-out debug logs.

These commented-out trace/warn statements are debug artifacts that should be cleaned up before merging.

🧹 Suggested cleanup
-    // tracing::warn!("upload single stream_relative_path- {stream_relative_path:?}");
-    // tracing::warn!("upload single path- {path:?}");

690-692: Downgrade debug logs from warn to trace level.

These logs are useful for debugging but don't indicate actionable operator issues. They should be tracing::trace! or tracing::debug! to avoid log noise in production.

📝 Suggested fix
-        tracing::warn!(
-            "unable to find stream- {stream_name} with tenant- {tenant_id:?} in PARSEABLE.get_stream"
-        );
+        tracing::debug!(
+            "unable to find stream- {stream_name} with tenant- {tenant_id:?} in PARSEABLE.get_stream"
+        );

Also applies to lines 708 and 749 where similar tracing::warn! calls should be downgraded.


758-761: Remove commented-out debug log.

🧹 Suggested cleanup
         let schema = fetch_schema(stream_name, tenant_id).await?;
-        // tracing::warn!("fetched schema for stream {stream_name} - {schema:?}");
         let schema_bytes = Bytes::from(serde_json::to_vec(&schema)?);

1028-1036: Remove commented-out debug logs.

These debug artifacts should be cleaned up before merging.

🧹 Suggested cleanup
     let filename = path
         .file_name()
         .expect("only parquet files are returned by iterator")
         .to_str()
         .expect("filename is valid string");
-    // tracing::warn!("spawn parquet file name- {filename}");

     let stream_relative_path = stream_relative_path(
         stream_name,
         filename,
         &upload_context.custom_partition,
         &tenant_id,
     );
-    // tracing::warn!("spawn parquet stream_relative_path- {stream_relative_path}");
src/cli.rs (1)

159-165: Clarify the CLI help for multi-tenancy.

The help text is vague for users; make it explicit that this flag enables multi-tenant behavior.

💡 Suggested tweak
-        help = "Mode of parseable instance"
+        help = "Enable multi-tenant mode (tenant-aware storage, RBAC, and APIs)"

Comment on lines +524 to +529
pub fn local_stream_data_path(&self, stream_name: &str, tenant_id: &Option<String>) -> PathBuf {
if let Some(tenant_id) = tenant_id.as_ref() {
self.local_staging_path.join(tenant_id).join(stream_name)
} else {
self.local_staging_path.join(stream_name)
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Search for tenant_id validation, sanitization, or normalization patterns
echo "=== Searching for tenant_id validation/normalization ==="
rg -nP --type=rust '(validate|sanitize|normalize).*tenant' -A 2 -B 2

echo -e "\n=== Searching for TenantId strong type definition ==="
rg -nP --type=rust '\bstruct\s+TenantId\b|\benum\s+TenantId\b|type\s+TenantId\b' -A 5

echo -e "\n=== Searching for tenant_id type annotations in method signatures ==="
rg -nP --type=rust 'fn.*tenant_id.*:' -A 2

echo -e "\n=== Finding all call sites of local_stream_data_path ==="
rg -nP --type=rust '\blocal_stream_data_path\s*\(' -B 3 -A 3

echo -e "\n=== Check src/cli.rs around the method in question ==="
sed -n '500,560p' src/cli.rs

Repository: parseablehq/parseable

Length of output: 29220


🏁 Script executed:

#!/bin/bash

# Check for any downstream validation of the path returned by local_stream_data_path
echo "=== Checking where local_stream_data_path results are used ==="
rg -nP --type=rust 'local_stream_data_path' -B 2 -A 8 | head -60

# Look for any path canonicalization or validation in LogStream/staging code
echo -e "\n=== Checking for path validation in stream creation ==="
rg -nP --type=rust '(canonicalize|validate.*path|normalize.*path)' -B 2 -A 2

# Check src/parseable/streams.rs around line 126 for context on how data_path is used
echo -e "\n=== Checking LogStream implementation and data_path usage ==="
sed -n '100,200p' src/parseable/streams.rs

Repository: parseablehq/parseable

Length of output: 9318


Validate tenant_id before using in filesystem paths to prevent directory traversal.

tenant_id is extracted directly from the HTTP "tenant" header (src/utils/mod.rs:132) without validation and passed to local_stream_data_path(), which uses it unsanitized in PathBuf::join(). A malicious tenant_id like ../ or an absolute path like /tmp will escape the staging directory when create_dir_all(&self.data_path) is called in Stream::push() (line ~155 in src/parseable/streams.rs). This bypasses the intended staging directory isolation.

Add validation to ensure tenant_id is a single safe path segment (or empty), or change the method signature to return Result<PathBuf, Error> and validate the result before file operations:

pub fn local_stream_data_path(&self, stream_name: &str, tenant_id: &Option<String>) -> PathBuf {
    if let Some(tenant_id) = tenant_id.as_deref() {
+       if !tenant_id.is_empty() {
+           let mut comps = std::path::Path::new(tenant_id).components();
+           let valid = matches!(comps.next(), Some(std::path::Component::Normal(_)))
+               && comps.next().is_none();
+           if !valid {
+               panic!("Invalid tenant_id `{tenant_id}`: must be a single path segment");
+           }
+       }
        self.local_staging_path.join(tenant_id).join(stream_name)
    } else {
        self.local_staging_path.join(stream_name)
    }
}
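
The Result-returning alternative mentioned above avoids panicking on attacker-controlled header values. The following is a sketch only: `InvalidTenantId`, `validate_tenant_id`, and the free-function form of `local_stream_data_path` (taking the staging directory as a parameter) are assumptions for illustration. Unlike the diff above, it also rejects the empty string.

```rust
use std::ffi::OsStr;
use std::path::{Component, Path, PathBuf};

/// Hypothetical error type carrying the rejected tenant id.
#[derive(Debug, PartialEq)]
pub struct InvalidTenantId(pub String);

/// Accepts only a tenant id that is exactly one normal path segment.
fn validate_tenant_id(tenant_id: &str) -> Result<&str, InvalidTenantId> {
    let mut comps = Path::new(tenant_id).components();
    match (comps.next(), comps.next()) {
        // Comparing against the raw input also rejects forms like "acme/".
        (Some(Component::Normal(seg)), None) if seg == OsStr::new(tenant_id) => Ok(tenant_id),
        _ => Err(InvalidTenantId(tenant_id.to_owned())),
    }
}

/// Builds the staging path, refusing tenant ids that could escape `staging`.
fn local_stream_data_path(
    staging: &Path,
    stream_name: &str,
    tenant_id: Option<&str>,
) -> Result<PathBuf, InvalidTenantId> {
    match tenant_id {
        Some(tenant) => Ok(staging.join(validate_tenant_id(tenant)?).join(stream_name)),
        None => Ok(staging.join(stream_name)),
    }
}
```

Validating once where the tenant header is parsed, rather than in every path helper, would keep the check in a single place; the sketch does not validate `stream_name`, which may need similar treatment.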

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/prism/home/mod.rs (1)

362-388: Potential panic from unwrap() on dashboard_id.

Line 371 uses unwrap() on dashboard.dashboard_id.as_ref(). If a dashboard somehow has None for its ID, this will panic at runtime. Consider using filter_map with a guard or logging a warning instead of panicking.

🛡️ Suggested defensive handling
     let dashboard_titles = DASHBOARDS
         .list_dashboards(0, tenant_id)
         .await
         .iter()
         .filter_map(|dashboard| {
-            let dashboard_id = *dashboard.dashboard_id.as_ref().unwrap();
-            let dashboard_id = dashboard_id.to_string();
+            let dashboard_id = dashboard.dashboard_id.as_ref()?;
+            let dashboard_id = dashboard_id.to_string();
             if dashboard.title.to_lowercase().contains(query_value)
                 || dashboard_id.to_lowercase().contains(query_value)
             {
                 Some(Resource {
                     id: dashboard_id,
                     name: dashboard.title.clone(),
                     resource_type: ResourceType::Dashboard,
                 })
             } else {
                 None
             }
         })
         .collect_vec();
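
For readers unfamiliar with `?` inside a `filter_map` closure, here is a self-contained sketch of the same idea. The `Dashboard` and `Resource` structs are reduced stand-ins, and `u64` replaces the real ID type purely for illustration.

```rust
#[derive(Debug, PartialEq)]
struct Resource {
    id: String,
    name: String,
}

struct Dashboard {
    dashboard_id: Option<u64>, // stand-in for the real ID type
    title: String,
}

/// Returns dashboards whose title or id contains `query`, skipping any without an id.
fn matching_dashboards(dashboards: &[Dashboard], query: &str) -> Vec<Resource> {
    let query = query.to_lowercase();
    dashboards
        .iter()
        .filter_map(|dashboard| {
            // `?` returns None from the closure, so the entry is skipped, not a panic.
            let id = dashboard.dashboard_id.as_ref()?.to_string();
            if dashboard.title.to_lowercase().contains(&query)
                || id.to_lowercase().contains(&query)
            {
                Some(Resource {
                    id,
                    name: dashboard.title.clone(),
                })
            } else {
                None
            }
        })
        .collect()
}
```

Silently skipping entries can hide data problems, so logging a warning for dashboards without an ID may be worth adding in the real code.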
src/metastore/metastores/object_store_metastore.rs (1)

317-342: Fix tenant fallback in get_alert_states for consistency.

Line 321 incorrectly uses DEFAULT_TENANT as fallback:

let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);

This causes a path mismatch when retrieving default-tenant alert states. Change to empty string like get_alerts:

let tenant = tenant_id.as_deref().unwrap_or("");
🤖 Fix all issues with AI agents
In `@src/metastore/metastore_traits.rs`:
- Around line 319-334: get_node_metadata accepts a tenant_id but
put_node_metadata and delete_node_metadata do not, causing API inconsistency
since implementations treat node metadata as globally scoped (they call storage
with &None and NodeMetadata.file_path() omits tenant info); make the API
consistent by either (A) adding tenant_id: &Option<String> to put_node_metadata
and delete_node_metadata and update their implementations to pass the tenant_id
through to storage and to include tenant when building paths in
NodeMetadata.file_path(), or (B) removing tenant_id from get_node_metadata and
its callers so all three methods operate globally; update all usages of
get_node_metadata / put_node_metadata / delete_node_metadata and
NodeMetadata.file_path() accordingly to match the chosen approach.
🧹 Nitpick comments (8)
src/storage/object_storage.rs (5)

116-117: Remove commented-out debug statements.

These commented-out tracing statements should be removed before merging. If debugging is needed in the future, they can be re-added or a proper debug/trace level log can be used.

-    // tracing::warn!("upload single stream_relative_path- {stream_relative_path:?}");
-    // tracing::warn!("upload single path- {path:?}");

690-692: Consider downgrading warn to debug for expected operational condition.

This warning fires whenever a stream isn't found in memory during create_stream_from_ingestor. In distributed deployments where streams may not yet be synced, this could generate significant log noise. Consider using tracing::debug! unless this indicates an unexpected state that operators should investigate.


708-708: Consider downgrading metadata count log to debug level.

Logging the count of inserted stream metadata at warn level may be too verbose for normal operations. This appears to be a success case rather than a warning condition.

-            tracing::warn!("inserted {} stream metadata", all_log_sources.len());
+            tracing::debug!("inserted {} stream metadata", all_log_sources.len());

749-749: Consider downgrading "returning empty bytes" to debug level.

This log at warn level fires when no stream metadata is found, which may be a normal condition during stream initialization. Consider using tracing::debug! to reduce noise.


1028-1028: Remove commented-out debug statements.

These commented-out tracing statements should be cleaned up before merging.

-    // tracing::warn!("spawn parquet file name- {filename}");
     ...
-    // tracing::warn!("spawn parquet stream_relative_path- {stream_relative_path}");

Also applies to: 1036-1036

src/metastore/metastores/object_store_metastore.rs (3)

906-906: Remove commented-out debug statement.

-        // tracing::warn!(put_stream_json_path=?path);

1025-1025: Remove commented-out debug statement.

-        // tracing::warn!(put_manifest_path=?path);

1309-1316: Remove commented-out debug statements in list_streams.

Multiple commented-out tracing statements should be cleaned up before merging:

  • Lines 1309-1316
  • Line 1327
  • Line 1334
  • Lines 1345-1357

Also applies to: 1327-1327, 1334-1334, 1345-1357

Comment on lines 319 to +334
/// node metadata
async fn get_node_metadata(&self, node_type: NodeType) -> Result<Vec<Bytes>, MetastoreError>;
async fn get_node_metadata(
&self,
node_type: NodeType,
tenant_id: &Option<String>,
) -> Result<Vec<Bytes>, MetastoreError>;
async fn delete_node_metadata(
&self,
domain_name: &str,
node_type: NodeType,
) -> Result<bool, MetastoreError>;
async fn put_node_metadata(&self, obj: &dyn MetastoreObject) -> Result<(), MetastoreError>;
async fn list_streams(&self) -> Result<HashSet<String>, MetastoreError>;
async fn list_streams(
&self,
tenant_id: &Option<String>,
) -> Result<HashSet<String>, MetastoreError>;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if NodeMetadata is tenant-scoped or global
rg -n "struct NodeMetadata|impl.*NodeMetadata" --type rust -A 10 | head -50
# Check how node metadata file paths are constructed
rg -n "fn file_path" --type rust -A 5 | head -50

Repository: parseablehq/parseable

Length of output: 3934


🏁 Script executed:

#!/bin/bash
# Read the trait definitions in metastore_traits.rs around lines 319-334
cat -n src/metastore/metastore_traits.rs | sed -n '310,340p'

Repository: parseablehq/parseable

Length of output: 1503


🏁 Script executed:

#!/bin/bash
# Find implementations of get_node_metadata to see if tenant_id is used
rg -n "fn get_node_metadata" --type rust -A 15 | head -100

Repository: parseablehq/parseable

Length of output: 2654


🏁 Script executed:

#!/bin/bash
# Find implementations of delete_node_metadata and put_node_metadata
rg -n "fn delete_node_metadata|fn put_node_metadata" --type rust -A 20 | head -150

Repository: parseablehq/parseable

Length of output: 4437


API inconsistency: get_node_metadata accepts tenant_id parameter, but put_node_metadata and delete_node_metadata don't.

At line 320, get_node_metadata accepts tenant_id, while put_node_metadata (line 330) and delete_node_metadata (line 325) do not. The implementations show that node metadata is globally scoped—put_node_metadata and delete_node_metadata both hardcode &None for tenant_id when interacting with storage, and NodeMetadata.file_path() constructs paths without any tenant identifier. Either all three methods should consistently accept tenant_id (with implementations that use it), or none should accept it since the metadata is globally scoped.

🤖 Prompt for AI Agents
In `@src/metastore/metastore_traits.rs` around lines 319 - 334, get_node_metadata
accepts a tenant_id but put_node_metadata and delete_node_metadata do not,
causing API inconsistency since implementations treat node metadata as globally
scoped (they call storage with &None and NodeMetadata.file_path() omits tenant
info); make the API consistent by either (A) adding tenant_id: &Option<String>
to put_node_metadata and delete_node_metadata and update their implementations
to pass the tenant_id through to storage and to include tenant when building
paths in NodeMetadata.file_path(), or (B) removing tenant_id from
get_node_metadata and its callers so all three methods operate globally; update
all usages of get_node_metadata / put_node_metadata / delete_node_metadata and
NodeMetadata.file_path() accordingly to match the chosen approach.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/storage/object_storage.rs (1)

1130-1148: Handle empty-tenant IDs to avoid leading slashes in object keys.

If tenant_id is Some(""), the current branch builds "/{stream_name}/...", which can create distinct object keys for default tenant uploads. Treat empty strings the same as None/DEFAULT_TENANT.

🛠️ Proposed fix
-    if let Some(tenant) = tenant_id
-        && !tenant.eq(DEFAULT_TENANT)
-    {
-        format!("{tenant}/{stream_name}/{file_suffix}")
-    } else {
-        format!("{stream_name}/{file_suffix}")
-    }
+    let tenant = tenant_id
+        .as_deref()
+        .filter(|t| !t.is_empty() && *t != DEFAULT_TENANT);
+    if let Some(tenant) = tenant {
+        format!("{tenant}/{stream_name}/{file_suffix}")
+    } else {
+        format!("{stream_name}/{file_suffix}")
+    }
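
The effect of the proposed filter can be checked in isolation with a sketch like the one below. The function signature is simplified (the real `stream_relative_path` takes more arguments), and the value of `DEFAULT_TENANT` is a placeholder.

```rust
/// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Builds an object key, prefixing the tenant only when it is a real, non-default tenant.
fn stream_relative_path(
    stream_name: &str,
    file_suffix: &str,
    tenant_id: &Option<String>,
) -> String {
    // Empty strings and the default tenant both collapse to the unprefixed form.
    let tenant = tenant_id
        .as_deref()
        .filter(|t| !t.is_empty() && *t != DEFAULT_TENANT);
    match tenant {
        Some(tenant) => format!("{tenant}/{stream_name}/{file_suffix}"),
        None => format!("{stream_name}/{file_suffix}"),
    }
}
```

All three "no tenant" spellings (None, empty string, default tenant) then map to the same key, so default-tenant uploads cannot diverge into a separate prefix.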
🧹 Nitpick comments (4)
src/users/dashboards.rs (4)

208-220: Remove commented-out code before merge.

These lines appear to be remnants from the previous single-tenant implementation. Keeping dead code harms readability.

♻️ Proposed cleanup
                     tenant_dbs.push(dashboard);
-
-                    // this.retain(|d: &Dashboard| d.dashboard_id != dashboard.dashboard_id);
-                    // this.push(dashboard);
                 } else {
                     tracing::warn!("Failed to deserialize dashboard: {:?}", dashboard_value);
                 }
@@ -216,8 +213,7 @@
             this.entry(tenant_id).or_default().extend(tenant_dbs);
         }

         let mut s = self.0.write().await;
-        // s.append(&mut this);
         s.clone_from(&this);

314-323: Remove commented-out code.

This dead code from the previous implementation should be removed.

♻️ Proposed cleanup
         dbs.push(dashboard.clone());
         } else {
             return Err(DashboardError::Metadata(
                 "Dashboard does not exist or you do not have permission to access it",
             ));
         };
-
-        // let existing_dashboard = dashboards
-        //     .iter()
-        //     .find(|d| d.dashboard_id == Some(dashboard_id) && d.author == Some(user_id.to_string()))
-        //     .cloned()
-        //     .ok_or_else(|| {
-        //         DashboardError::Metadata(
-        //             "Dashboard does not exist or you do not have permission to access it",
-        //         )
-        //     })?;

         Ok(())

350-356: Simplify the in-memory deletion logic.

The current pattern using .map(...).iter().for_each(drop) is confusing. Using map() for side effects and then calling .iter().for_each(drop) on an Option<()> is non-idiomatic and obscures the intent.

♻️ Proposed simplification
         // delete from in-memory
-        self.0
-            .write()
-            .await
-            .get_mut(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
-            .map(|dashboards| dashboards.retain(|d| d.dashboard_id != Some(dashboard_id)))
-            .iter()
-            .for_each(drop);
+        if let Some(dashboards) = self
+            .0
+            .write()
+            .await
+            .get_mut(tenant_id.as_deref().unwrap_or(DEFAULT_TENANT))
+        {
+            dashboards.retain(|d| d.dashboard_id != Some(dashboard_id));
+        }
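
Stripped of the async lock, the simplified deletion reduces to the pattern sketched below. `delete_in_memory`, the reduced `Dashboard` struct, the `u64` ID type, and the `DEFAULT_TENANT` value are assumptions for this example.

```rust
use std::collections::HashMap;

/// Placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

#[derive(Debug, Clone, PartialEq)]
struct Dashboard {
    dashboard_id: Option<u64>, // stand-in for the real ID type
    title: String,
}

/// Removes one dashboard from a tenant's list; a missing tenant is a no-op.
fn delete_in_memory(
    store: &mut HashMap<String, Vec<Dashboard>>,
    tenant_id: &Option<String>,
    dashboard_id: u64,
) {
    let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
    if let Some(dashboards) = store.get_mut(tenant) {
        dashboards.retain(|d| d.dashboard_id != Some(dashboard_id));
    }
}
```

The `if let` form states the side effect directly, whereas the `map(...).iter().for_each(drop)` chain exists only to discard an `Option<()>`.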

408-418: Remove remaining commented-out code blocks.

The commented code in get_dashboard_by_user(), list_dashboards(), and list_tags() should be removed before merge. These are remnants of the single-tenant implementation and add noise to the codebase.

Also applies to: 443-447, 469-473

add `protected` field to user
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
src/prism/logstream/mod.rs (1)

389-389: ⚠️ Potential issue | 🟡 Minor

Incorrect HTTP status code for TimeParse error.

TimeParseError indicates the client provided an invalid time format, which should return 400 BAD_REQUEST, not 404 NOT_FOUND. NOT_FOUND implies the requested resource doesn't exist, whereas this is a validation failure.

Proposed fix
-            PrismLogstreamError::TimeParse(_) => StatusCode::NOT_FOUND,
+            PrismLogstreamError::TimeParse(_) => StatusCode::BAD_REQUEST,
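
The distinction can be sketched without any web framework. The enum below is a hypothetical stand-in (only the `TimeParse` variant name comes from the diff), and status codes are plain `u16` values rather than the real `StatusCode` type.

```rust
// Hypothetical error type for illustration; `StreamNotFound` is an assumed variant.
#[derive(Debug)]
enum LogstreamError {
    TimeParse(String),
    StreamNotFound(String),
}

/// Maps each error to an HTTP status code, expressed as a plain number.
fn status_code(err: &LogstreamError) -> u16 {
    match err {
        // The client sent a malformed time value: a validation failure.
        LogstreamError::TimeParse(_) => 400,
        // The requested resource does not exist.
        LogstreamError::StreamNotFound(_) => 404,
    }
}
```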
src/rbac/map.rs (1)

257-273: ⚠️ Potential issue | 🟠 Major

Preserve existing sessions instead of overwriting.

insert(user, vec![...]) replaces any existing sessions for the user, even though a user is meant to hold one or more sessions. Append to the existing list instead.

🛠️ Suggested fix
-        let sessions = self.user_sessions.entry(tenant_id.to_owned()).or_default();
-        sessions.insert(user.clone(), vec![(key.clone(), expiry)]);
+        let tenant_sessions = self.user_sessions.entry(tenant_id.to_owned()).or_default();
+        tenant_sessions
+            .entry(user.clone())
+            .or_default()
+            .push((key.clone(), expiry));
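
A standalone sketch of the append behaviour using nested `entry` calls. The `Sessions` struct, the `track_new` method name, and the simplified `String`/`u64` key and expiry types are assumptions for illustration.

```rust
use std::collections::HashMap;

// Hypothetical simplified types: the session key and expiry are plain values here.
type SessionKey = String;
type Expiry = u64;

#[derive(Default)]
struct Sessions {
    // tenant -> user -> list of (session key, expiry)
    user_sessions: HashMap<String, HashMap<String, Vec<(SessionKey, Expiry)>>>,
}

impl Sessions {
    /// Appends a session for `user` under `tenant_id`, keeping earlier sessions.
    fn track_new(&mut self, tenant_id: &str, user: &str, key: SessionKey, expiry: Expiry) {
        self.user_sessions
            .entry(tenant_id.to_owned())
            .or_default()
            .entry(user.to_owned())
            .or_default()
            .push((key, expiry));
    }
}
```

Calling `track_new` twice for the same user leaves two entries in that user's list, whereas `insert` would have kept only the second.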
src/handlers/http/logstream.rs (1)

399-414: ⚠️ Potential issue | 🟡 Minor

Inconsistent tenant_id handling in get_stream_info.

At line 399, tenant_id is converted from Option<String> to &str using unwrap_or(DEFAULT_TENANT), but the storage call at line 386 uses the original &tenant_id (which is &Option<String>). This creates inconsistency: the storage layer receives &Option<String> while the in-memory lookup uses &str. Consider normalizing the tenant_id earlier in the function to ensure consistent behavior.

Suggested fix
+    let tenant_id_str = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
+    
     // Get first and latest event timestamps from storage
     let (stream_first_event_at, stream_latest_event_at) = match storage
         .get_first_and_latest_event_from_storage(&stream_name, &tenant_id)
         .await
     ...
     
-    let tenant_id = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
     let hash_map = PARSEABLE.streams.read().unwrap();
     let stream_meta = hash_map
-        .get(tenant_id)
-        .ok_or_else(|| TenantNotFound(tenant_id.to_owned()))?
+        .get(tenant_id_str)
+        .ok_or_else(|| TenantNotFound(tenant_id_str.to_owned()))?
src/metastore/metastores/object_store_metastore.rs (1)

920-958: ⚠️ Potential issue | 🟡 Minor

Tenant prefix included in stream names in get_all_manifest_files.

At lines 927-933, when listing dates, the filter excludes stream_name and STREAM_ROOT_DIRECTORY from path parts, but when tenant_id is provided, the path includes the tenant prefix which isn't filtered out. This could include the tenant ID as a "date" entry.

Suggested fix
         let dates = resp
             .common_prefixes
             .iter()
             .flat_map(|path| path.parts())
-            .filter(|name| name.as_ref() != stream_name && name.as_ref() != STREAM_ROOT_DIRECTORY)
+            .filter(|name| {
+                let name_str = name.as_ref();
+                name_str != stream_name 
+                    && name_str != STREAM_ROOT_DIRECTORY
+                    && tenant_id.as_ref().map_or(true, |t| name_str != t)
+            })
             .map(|name| name.as_ref().to_string())
             .collect::<Vec<_>>();
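
The effect of the extra filter clause can be sketched on plain string paths. The `date_prefixes` helper, the slash-separated path format, and the value of `STREAM_ROOT_DIRECTORY` are assumptions for illustration. The real code works on object-store path parts.

```rust
// Assumed value for illustration.
const STREAM_ROOT_DIRECTORY: &str = ".stream";

/// Collects path segments that are neither the stream name, the stream root
/// directory, nor the tenant prefix (when a tenant is given).
fn date_prefixes(paths: &[&str], stream_name: &str, tenant_id: &Option<String>) -> Vec<String> {
    paths
        .iter()
        .flat_map(|p| p.split('/'))
        .filter(|part| !part.is_empty())
        .filter(|part| {
            *part != stream_name
                && *part != STREAM_ROOT_DIRECTORY
                && tenant_id.as_deref().map_or(true, |t| *part != t)
        })
        .map(str::to_string)
        .collect()
}
```

Without the tenant clause, a prefix such as `acme/app/date=2026-01-13` would yield `acme` alongside the date segment.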
src/storage/object_storage.rs (2)

176-202: ⚠️ Potential issue | 🟡 Minor

Normalize empty tenant IDs in storage metrics.

If default tenant is represented as an empty string, current labels will split metrics across "" and DEFAULT_TENANT. Normalize before recording.

🛠️ Suggested fix
 fn update_storage_metrics(
     path: &std::path::Path,
     stream_name: &str,
     filename: &str,
     tenant_id: &str,
 ) -> Result<(), ObjectStorageError> {
+    let tenant = if tenant_id.is_empty() { DEFAULT_TENANT } else { tenant_id };
     let mut file_date_part = filename.split('.').collect::<Vec<&str>>()[0];
     file_date_part = file_date_part.split('=').collect::<Vec<&str>>()[1];
     let compressed_size = path
         .metadata()
         .map(|m| m.len())
         .map_err(|e| ObjectStorageError::Custom(format!("metadata failed for {filename}: {e}")))?;
     STORAGE_SIZE
-        .with_label_values(&["data", stream_name, "parquet", tenant_id])
+        .with_label_values(&["data", stream_name, "parquet", tenant])
         .add(compressed_size as i64);
     EVENTS_STORAGE_SIZE_DATE
-        .with_label_values(&["data", stream_name, "parquet", file_date_part, tenant_id])
+        .with_label_values(&["data", stream_name, "parquet", file_date_part, tenant])
         .inc_by(compressed_size);
     LIFETIME_EVENTS_STORAGE_SIZE
-        .with_label_values(&["data", stream_name, "parquet", tenant_id])
+        .with_label_values(&["data", stream_name, "parquet", tenant])
         .add(compressed_size as i64);

     // billing metrics for parquet storage
-    increment_parquets_stored_by_date(file_date_part, tenant_id);
-    increment_parquets_stored_size_by_date(compressed_size, file_date_part, tenant_id);
+    increment_parquets_stored_by_date(file_date_part, tenant);
+    increment_parquets_stored_size_by_date(compressed_size, file_date_part, tenant);

Based on learnings, default-tenant paths may be represented with an empty string segment; the main concern is metrics labeling rather than path resolution.


1120-1138: ⚠️ Potential issue | 🟡 Minor

Treat empty tenant IDs like the default tenant in stream_relative_path for consistency.

The function can receive Some("") as tenant_id (used intentionally for default-tenant paths in the metastore layer). The current check !tenant.eq(DEFAULT_TENANT) will be true for empty strings, producing "/{stream_name}/..." paths. While empty strings are intentionally used in storage operations, they should be handled consistently: treat Some("") the same as None to avoid formatting empty tenant segments.

The suggested fix aligns the path construction with how the metastore handles default-tenant paths (converting empty strings for metrics consistency).

Suggested fix
 fn stream_relative_path(
     stream_name: &str,
     filename: &str,
     custom_partition: &Option<String>,
     tenant_id: &Option<String>,
 ) -> String {
     let mut file_suffix = str::replacen(filename, ".", "/", 3);

     if let Some(custom_partition_fields) = custom_partition {
         let custom_partition_list = custom_partition_fields.split(',').collect::<Vec<&str>>();
         file_suffix = str::replacen(filename, ".", "/", 3 + custom_partition_list.len());
     }
-    if let Some(tenant) = tenant_id
-        && !tenant.eq(DEFAULT_TENANT)
-    {
-        format!("{tenant}/{stream_name}/{file_suffix}")
-    } else {
-        format!("{stream_name}/{file_suffix}")
-    }
+    let tenant = tenant_id
+        .as_deref()
+        .filter(|t| !t.is_empty() && *t != DEFAULT_TENANT);
+    if let Some(tenant) = tenant {
+        format!("{tenant}/{stream_name}/{file_suffix}")
+    } else {
+        format!("{stream_name}/{file_suffix}")
+    }
 }
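
A self-contained version of the suggested function, written so the three tenant cases can be checked side by side. The value of `DEFAULT_TENANT` and the sample filename layout in the usage note are assumptions. The logic mirrors the diff above.

```rust
// Assumed placeholder value; the real constant lives in the Parseable codebase.
const DEFAULT_TENANT: &str = "DEFAULT_TENANT";

/// Builds the object-store relative path for a staged file.
/// `None`, `Some("")`, and the default tenant all produce a path without a tenant segment.
fn stream_relative_path(
    stream_name: &str,
    filename: &str,
    custom_partition: &Option<String>,
    tenant_id: &Option<String>,
) -> String {
    // Each custom partition field adds one more dot-separated segment to expand.
    let extra = custom_partition
        .as_deref()
        .map_or(0, |fields| fields.split(',').count());
    let file_suffix = filename.replacen('.', "/", 3 + extra);

    let tenant = tenant_id
        .as_deref()
        .filter(|t| !t.is_empty() && *t != DEFAULT_TENANT);
    match tenant {
        Some(tenant) => format!("{tenant}/{stream_name}/{file_suffix}"),
        None => format!("{stream_name}/{file_suffix}"),
    }
}
```

With a filename such as `date=2026-01-13.hour=10.minute=05.host.data.parquet`, passing `Some(String::new())` yields the same path as `None`, with no leading empty segment.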
🤖 Fix all issues with AI agents
In `@src/utils/mod.rs`:
- Around line 240-251: The function create_intracluster_auth_headermap currently
calls HeaderValue::from_bytes(...).unwrap(), which can panic on invalid header
bytes; change it to handle the Result instead of unwrapping: for each header
retrieval (AUTHORIZATION and COOKIE) call
HeaderValue::from_bytes(auth.as_bytes()) and match or if let Ok(hv) = ... then
map.insert(reqwest::header::AUTHORIZATION or COOKIE, hv) and otherwise skip the
header or log a warning; follow the same safe pattern used in
src/handlers/http/cluster/mod.rs so invalid header bytes do not cause a panic.
🧹 Nitpick comments (4)
src/query/mod.rs (2)

76-77: Remove commented-out code.

These commented lines are dead code that should be removed to maintain code cleanliness.

🧹 Proposed fix
-// pub static QUERY_SESSION: Lazy<SessionContext> =
-//     Lazy::new(|| Query::create_session_context(PARSEABLE.storage()));
-
 pub static QUERY_SESSION_STATE: Lazy<SessionState> =

295-335: Consider caching get_ctx() result to reduce lock acquisitions.

QUERY_SESSION.get_ctx() is called multiple times (lines 295, 312, 319, 335), each acquiring a read lock. While the lock is shared and the clone is cheap, caching the context at the start of the function would reduce lock contention under high concurrency.

♻️ Proposed optimization
+        let ctx = QUERY_SESSION.get_ctx();
-        let df = QUERY_SESSION
-            .get_ctx()
+        let df = ctx
             .execute_logical_plan(self.final_logical_plan(tenant_id))
             .await?;
         // ... later in the function ...
-        let plan = QUERY_SESSION
-            .get_ctx()
+        let plan = ctx
             .state()
             .create_physical_plan(df.logical_plan())
             .await?;
         // ... and so on for other usages
src/handlers/http/logstream.rs (1)

272-299: Consider handling missing metric labels gracefully.

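The get_stats_date function uses .unwrap() on metric label lookups (lines 281, 285, 289), which panics if a lookup returns an error. In the prometheus crate, get_metric_with_label_values creates a missing metric on first access and returns an error when the number of label values does not match the metric's label names, so the panic would surface on a label-count mismatch. Consider using get_metric_with_label_values(...).map(|m| m.get() as u64).unwrap_or(0) for defensive handling.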

Suggested defensive approach
     let events_ingested = EVENTS_INGESTED_DATE
         .get_metric_with_label_values(&event_labels)
-        .unwrap()
-        .get() as u64;
+        .map(|m| m.get() as u64)
+        .unwrap_or(0);
     let ingestion_size = EVENTS_INGESTED_SIZE_DATE
         .get_metric_with_label_values(&event_labels)
-        .unwrap()
-        .get() as u64;
+        .map(|m| m.get() as u64)
+        .unwrap_or(0);
     let storage_size = EVENTS_STORAGE_SIZE_DATE
         .get_metric_with_label_values(&storage_size_labels)
-        .unwrap()
-        .get() as u64;
+        .map(|m| m.get() as u64)
+        .unwrap_or(0);
src/metastore/metastore_traits.rs (1)

71-82: Inconsistent API: bulk getters return per-tenant maps but don't accept tenant_id.

Methods like get_keystones, get_conversations, get_alerts return HashMap<String, Vec<Bytes>> (tenant → data) but don't accept a tenant_id parameter, meaning they always fetch all tenants. Meanwhile, get_alert_states accepts tenant_id and returns Vec<AlertStateEntry> for a single tenant. This inconsistency may confuse API consumers. Consider documenting this design choice or aligning the APIs.

Comment on lines +240 to +251
pub fn create_intracluster_auth_headermap(req: &HttpRequest) -> reqwest::header::HeaderMap {
    let mut map = reqwest::header::HeaderMap::new();
    if let Some(auth) = req.headers().get(actix_web::http::header::AUTHORIZATION) {
        map.insert(
            reqwest::header::AUTHORIZATION,
            reqwest::header::HeaderValue::from_bytes(auth.as_bytes()).unwrap(),
        );
    } else if let Some(auth) = req.headers().get(actix_web::http::header::COOKIE) {
        map.insert(
            reqwest::header::COOKIE,
            reqwest::header::HeaderValue::from_bytes(auth.as_bytes()).unwrap(),
        );

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -name "mod.rs" -path "*/src/utils/*" -type f

Repository: parseablehq/parseable

Length of output: 132


🏁 Script executed:

# Check if src/utils/mod.rs exists and examine the relevant lines
if [ -f "src/utils/mod.rs" ]; then
  wc -l src/utils/mod.rs
fi

Repository: parseablehq/parseable

Length of output: 85


🏁 Script executed:

# Get the function and surrounding context
sed -n '230,260p' src/utils/mod.rs

Repository: parseablehq/parseable

Length of output: 1007


🏁 Script executed:

rg "HeaderValue::from_bytes" --type rust -B 2 -A 2

Repository: parseablehq/parseable

Length of output: 2086


🏁 Script executed:

# Also check the imports to understand the source of HeaderValue
head -50 src/utils/mod.rs | grep -E "^use|^extern"

Repository: parseablehq/parseable

Length of output: 843


Don't unwrap HeaderValue::from_bytes() when mirroring headers.

HeaderValue::from_bytes() returns a Result and will panic if the bytes are invalid. Handle the error instead of unwrapping. The codebase already demonstrates the correct pattern elsewhere (e.g., in src/handlers/http/cluster/mod.rs).

🛠️ Suggested fix
-        map.insert(
-            reqwest::header::AUTHORIZATION,
-            reqwest::header::HeaderValue::from_bytes(auth.as_bytes()).unwrap(),
-        );
+        if let Ok(value) = reqwest::header::HeaderValue::from_bytes(auth.as_bytes()) {
+            map.insert(reqwest::header::AUTHORIZATION, value);
+        }
@@
-        map.insert(
-            reqwest::header::COOKIE,
-            reqwest::header::HeaderValue::from_bytes(auth.as_bytes()).unwrap(),
-        );
+        if let Ok(value) = reqwest::header::HeaderValue::from_bytes(auth.as_bytes()) {
+            map.insert(reqwest::header::COOKIE, value);
+        }

@nitisht nitisht closed this Mar 17, 2026