
[Performance Issue][macOS] Severe UI Freezing During AI Streaming Output in Long Conversations, No Abnormal Hardware Usage, Resolved After Response Completion #300

@NiaGao

Description


Basic Environment Information

  • Device Model: MacBook Air M1
  • Bitfun Client Version: 0.2.1
  • Running Architecture: Apple Silicon Native

Issue Summary

On the macOS Bitfun client, severe UI freezing and unresponsive interaction occur exclusively while the AI is streaming its response token by token in long conversations. CPU, memory, GPU, and disk I/O usage remain normal throughout, and the UI becomes fully responsive the moment the AI finishes its response. The longer the chat context and the more conversation turns, the worse the freezing becomes. All other applications on the system run smoothly, which rules out a system-wide performance problem.

Stable Reproduction Steps

  1. Open the Bitfun macOS client v0.2.1 and create a new blank chat session.
  2. Have continuous multi-turn conversations with the AI, accumulating at least 20 turns of dialogue to build up a long chat context.
  3. Send a new prompt to trigger the AI's streaming token-by-token response.
  4. During the AI's streaming output, try scrolling the chat window, clicking UI buttons, or typing in the input box, and observe the UI responsiveness.

Actual Behavior

  1. Freezing occurs only during the AI's streaming token-by-token output. Symptoms: choppy page scrolling, unresponsive mouse clicks, delayed or dropped keyboard input, and at times completely blocked UI interaction.
  2. The UI becomes fully responsive the instant the AI finishes its response and stops streaming, with no residual lag.
  3. Freezing severity increases sharply with the length of the chat context and the number of historical messages: conversations under 10 turns show almost no noticeable lag, while conversations over 50 turns exhibit frequent, long unresponsive periods.
  4. Monitoring with macOS Activity Monitor throughout shows no abnormal peak or sustained high load in Bitfun's CPU, memory, GPU, or disk I/O usage, ruling out a hardware performance bottleneck.
  5. The issue occurs only in the Bitfun client; all other applications on the system run smoothly, ruling out a system-wide anomaly.
  6. The current client version (v0.2.1) offers no settings to mitigate the issue (e.g. toggling off real-time Markdown rendering, enabling virtual scrolling, or disabling the typing animation), so end users cannot work around it on their own.

Root Cause Speculation (for the development team's reference)

This freezing matches the typical symptoms of a frontend long-list rendering bottleneck, most likely caused by one or more of the following:

  1. A full re-render of the entire chat list on every new token received during streaming, instead of an incremental update of only the currently generating message. With the many DOM nodes of a long context, this triggers frequent full-page reflow and repaint, blocking the main thread.
  2. Time-consuming operations such as Markdown parsing, code syntax highlighting, and token counting are executed synchronously on the main thread and triggered frequently during streaming, causing intermittent main-thread blocking.
  3. No virtual scrolling (virtual list) is enabled for the chat history, so DOM nodes for all historical messages are rendered at once, and rendering overhead rises sharply in long contexts.
  4. Synchronous local cache writes, IPC calls, or other disk I/O performed during streaming further exacerbate main-thread blocking.
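The incremental-update idea in point 1 can be sketched as follows: coalesce incoming tokens into a buffer that is flushed at most once per animation frame, so the list updates roughly 60 times per second instead of once per token. `TokenBatcher` and the injectable `schedule` parameter are illustrative names, not Bitfun's actual code; in the browser, `schedule` would be `requestAnimationFrame`.

```typescript
type Scheduler = (cb: () => void) => void;

class TokenBatcher {
  private buffer: string[] = [];
  private scheduled = false;

  constructor(
    // Called once per flush with the concatenated batch (one DOM update).
    private render: (chunk: string) => void,
    // requestAnimationFrame in the browser; injectable so it can be tested.
    private schedule: Scheduler,
  ) {}

  push(token: string): void {
    this.buffer.push(token);
    if (!this.scheduled) {
      this.scheduled = true;
      this.schedule(() => this.flush());
    }
  }

  private flush(): void {
    this.scheduled = false;
    if (this.buffer.length === 0) return;
    const chunk = this.buffer.join("");
    this.buffer = [];
    this.render(chunk); // single update for the whole batch
  }
}

// Demo: a synchronous "frame" queue stands in for requestAnimationFrame.
const frames: Array<() => void> = [];
const renders: string[] = [];
const batcher = new TokenBatcher(
  (chunk) => renders.push(chunk),
  (cb) => frames.push(cb),
);

for (const t of ["Hel", "lo", ", ", "wor", "ld"]) batcher.push(t);
frames.shift()!(); // simulate one animation frame

console.log(renders); // five tokens arrived, but only one render ran
```

The same pattern applies whether the renderer is React state, a direct DOM write, or anything else: the cost per frame becomes independent of token arrival rate.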

Expected Behavior & Enhancement Requests

  1. Fix the main-thread blocking during streaming output in long conversations by optimizing the rendering logic, e.g. incremental updates and virtual scrolling to reduce rendering overhead.
  2. Add optional settings that let users toggle off real-time Markdown rendering, disable the typing animation, and batch-render AI output, to mitigate freezing in long conversations.
  3. Offload time-consuming operations such as Markdown parsing, syntax highlighting, and token counting to Web Worker threads, so they no longer block main-thread UI rendering.
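For the virtual-scrolling request, the core of a virtual list is a pure windowing calculation: given the scroll position, render only the rows intersecting the viewport plus a small overscan, so DOM size stays constant as history grows. This is a minimal sketch assuming fixed-height rows (real chat messages vary in height, which needs measured or estimated heights on top of this); `visibleRange` and `RowWindow` are illustrative names.

```typescript
interface RowWindow {
  start: number; // first row index to render (inclusive)
  end: number;   // one past the last row index to render (exclusive)
}

function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan = 3, // extra rows above/below to avoid blank flashes while scrolling
): RowWindow {
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(rowCount, last + overscan),
  };
}

// Demo: 500 messages of 80px each, a 600px viewport scrolled deep into history.
const w = visibleRange(20000, 600, 80, 500);
console.log(w); // only w.end - w.start = 14 rows rendered instead of 500
```

Libraries such as react-window or TanStack Virtual implement this (including variable row heights), so adopting one may be cheaper than building it in-house.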

Thank you for your time and effort in addressing this issue! Please let me know if you need any additional information, logs, or test cases to debug this problem.
