
Add Spring RabbitMQ plugin #796

Open
liuhaolong10 wants to merge 4 commits into apache:main from liuhaolong10:spring_rabbit


@liuhaolong10

Fix RabbitMQ trace disconnection and incomplete consumer trace stack (closes #13720)

  • Add a unit test to verify that the fix works.
  • Explain briefly why the bug exists and how to fix it.

Bug Root Cause

The method instrumented by the original RabbitMQ plugin runs in RabbitMQ's dedicated message-processing thread pool, not in the thread where the consumer executes its business logic. As a result, trace information is lost:

  • The plugin can only capture "message consumed" events but cannot track subsequent business operations (e.g., MySQL/Redis calls).
  • The traceId cannot be connected between producer and consumer services.
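
The thread handoff described above can be illustrated without any RabbitMQ or SkyWalking code. The sketch below is a minimal, hypothetical model: `TRACE_ID` stands in for whatever per-thread context an agent keeps, and none of the class or method names come from this PR. It only shows why state bound to the delivery thread is invisible on a different thread unless it is handed over explicitly.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class TraceContextHandoffDemo {
    // Hypothetical stand-in for an agent's per-thread tracing context.
    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    /** Sets a trace id on the calling thread, then reads it from a worker thread. */
    static String readFromWorkerWithoutPropagation(String traceId) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            TRACE_ID.set(traceId); // bound to the calling ("delivery") thread only
            Callable<String> businessLogic = () -> TRACE_ID.get();
            return worker.submit(businessLogic).get(); // worker thread sees no context
        } finally {
            TRACE_ID.remove();
            worker.shutdown();
        }
    }

    /** Same handoff, but the id is captured and re-bound on the worker thread. */
    static String readFromWorkerWithPropagation(String traceId) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            final String captured = traceId; // explicit handoff across threads
            Callable<String> businessLogic = () -> {
                TRACE_ID.set(captured);
                try {
                    return TRACE_ID.get();
                } finally {
                    TRACE_ID.remove();
                }
            };
            return worker.submit(businessLogic).get();
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readFromWorkerWithoutPropagation("trace-1")); // null
        System.out.println(readFromWorkerWithPropagation("trace-1"));    // trace-1
    }
}
```

In this model, instrumenting on the thread that runs the business logic (or carrying the context across explicitly) is what keeps later operations attached to the same trace.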

Fix Solution

  1. Implement a new instrumentation method for RabbitMQ that aligns with the business logic thread, ensuring traceId continuity between producer and consumer.
  2. Modify the original RabbitMQ plugin code to avoid data collection conflicts with the new spring-rabbit plugin.

Additional Notes for New Plugin

The new spring-rabbit plugin addresses the original RabbitMQ plugin's inability to trace messages consumed through the @RabbitListener annotation, so that traces are collected completely in spring-rabbit based RabbitMQ consumption scenarios.

  • If this pull request closes/resolves/fixes an existing issue, replace the issue number. Closes #13720.
  • Update the CHANGES log.

@wu-sheng added the enhancement and plugin labels Mar 8, 2026
@wu-sheng added this to the 9.7.0 milestone Mar 8, 2026
@liuhaolong10
Author

@wu-sheng
When I run the test script locally with the command bash ./test/plugin/run.sh -f spring-rabbitmq-3.x-4.x-scenario, it consistently fails. I have identified two main reasons:

  1. Timeout when connecting to GitHub.
  2. Failure to pull the rabbitmq-server image via Docker.

I have a proxy enabled locally, and accessing GitHub via the browser works without any issues. Despite spending the entire afternoon troubleshooting this test, I still haven't been able to make it run successfully.

Could you please share some tips or best practices that you use when running this test locally? Any guidance would be greatly appreciated.

@wu-sheng
Member

wu-sheng commented Mar 8, 2026

GitHub is on Azure US, so it should not have image-pulling issues. Have you rechecked whether this image exists on Docker Hub?

@wu-sheng
Member

wu-sheng commented Mar 8, 2026

From my check of the logs, the server indeed never boots, because HEAD:/spring-rabbitmq-3.x-4.x-scenario/case/healthcheck never responds. Is your app waiting for the RabbitMQ server to boot properly?
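
If the scenario application does need to wait for the broker, one common approach is to poll the broker port before reporting healthy. The following is a minimal sketch, assuming a plain TCP check is enough for the test setup. `BrokerReadinessProbe` and `waitForPort` are hypothetical names, not part of the scenario code, and an open port does not by itself prove that the AMQP service has finished starting.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class BrokerReadinessProbe {
    /**
     * Polls host:port until a TCP connection succeeds or the timeout elapses.
     * Returns true if the port became reachable in time, false otherwise.
     */
    static boolean waitForPort(String host, int port, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (System.nanoTime() < deadline) {
            try (Socket socket = new Socket()) {
                // Short per-attempt timeout so one slow attempt cannot eat the budget.
                socket.connect(new InetSocketAddress(host, port), 500);
                return true;
            } catch (IOException notReadyYet) {
                Thread.sleep(200); // back off briefly before retrying
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed defaults for illustration; 5672 is the default AMQP port.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 5672;
        System.out.println("broker reachable: " + waitForPort(host, port, 30_000));
    }
}
```

A health check endpoint could call such a probe once at startup and only answer successfully afterwards, which keeps the test harness from sending traffic too early.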

@liuhaolong10
Author

The health check for spring-rabbitmq-2.x-scenario failed when I ran it around 2 PM.
The health check for spring-rabbitmq-3.x-4.x-scenario passed, but the expected data was incorrect.

In my local runs, most of the health checks don't pass at all.
I am using the rabbitmq:3.8.18 image, which I copied directly from the scenario code in the rabbitmq-scenario test module.
I believe the image itself should not be the problem.

@wu-sheng
Member

wu-sheng commented Mar 8, 2026

> In my local runs, most of the health checks don't pass at all.

The health check must pass, as it verifies whether the service is ready for traffic to be sent to it.

@wu-sheng
Member

  1. The 2.x test scenario uses spring-boot-starter-amqp versions 2.0.0.RELEASE through 2.4.0, all of which are EOL Spring Boot 2 versions. Given that 2.x is no longer maintained, would it be better to remove that scenario?
  2. The thread name check in RabbitMQConsumerInterceptor is fragile and unreliable
    (rabbitmq-plugin/.../RabbitMQConsumerInterceptor.java):
  if (Thread.currentThread().getName().toLowerCase().contains("springframework")) {
      return;
  }

This is the biggest problem in the PR. Relying on thread names to determine behavior is extremely brittle:

  • Thread names are not a contract and can be customized by users or framework versions.
  • If a non-Spring application happens to name threads with "springframework", the original RabbitMQ plugin silently breaks.
  • If Spring changes thread naming conventions, both plugins break (duplicate spans or no spans).
  • This couples the existing rabbitmq-plugin to Spring-specific knowledge, violating separation of concerns.
  3. Consider adopting plugin v2 APIs.
  4. Supported-list.md should be updated.
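
To make the concern in point 2 concrete, here is a small sketch that runs the same name-based test on threads with user-chosen names. `ThreadNameCheckDemo` and the thread names are invented for illustration. The check itself mirrors the condition quoted above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

final class ThreadNameCheckDemo {
    /** The name-based condition from the interceptor, extracted for illustration. */
    static boolean looksLikeSpringThread(Thread thread) {
        return thread.getName().toLowerCase().contains("springframework");
    }

    /** Evaluates the check on a pool thread that carries the given name. */
    static boolean checkOnThreadNamed(String threadName) throws Exception {
        ThreadFactory factory = runnable -> new Thread(runnable, threadName);
        ExecutorService pool = Executors.newSingleThreadExecutor(factory);
        try {
            Callable<Boolean> task = () -> looksLikeSpringThread(Thread.currentThread());
            return pool.submit(task).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // A non-Spring application with an unlucky thread name is treated as Spring.
        System.out.println(checkOnThreadNamed("acme-springframework-bridge-1")); // true
        // A listener thread with a customized name is not recognized.
        System.out.println(checkOnThreadNamed("orders-listener-1"));             // false
    }
}
```

Both outcomes depend only on a string that application code is free to choose, which is why the check cannot be relied on to separate the two plugins.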

@liuhaolong10
Author

  1. When adding the new plugin, my intention was to verify and cover as many versions as possible, so I validated the spring-boot-starter-amqp versions from 2.0.0.RELEASE to 2.4.0. Although these versions are no longer maintained by Spring, they are still widely used in many companies' projects. I verified these versions to clearly indicate which versions the new plugin supports. If the test scenarios involving EOL (End of Life) versions introduce security risks, I can remove the corresponding test scenario code.

  2. You are absolutely right that the thread name check is problematic; this was an oversight on my part. After sorting out the code logic, I plan to check the consumer's class instead:

    Consumer consumer = (Consumer) allArguments[6];
    if (consumer != null && "org.springframework.amqp.rabbit.listener.BlockingQueueConsumer$InternalConsumer".equals(consumer.getClass().getName())) {
        return;
    }
  3. I will refactor the plugin code to use the plugin v2 APIs in my next commit.

  4. I will update the Supported-list.md file in my next commit as well.
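
The class-name comparison proposed in point 2 can be tried in isolation. The sketch below uses hypothetical stand-in classes (`QueueConsumer`, `InternalConsumer`, `SubConsumer`, `PlainConsumer`) rather than the real Spring AMQP types, and `isExactly` is an illustrative helper, not code from this PR. It shows two properties of the approach: nested classes are named with `$` in their binary name, and an exact string match does not cover subclasses.

```java
final class ConsumerTypeCheckDemo {
    // Hypothetical stand-ins for BlockingQueueConsumer and its nested InternalConsumer.
    static class QueueConsumer {
        static class InternalConsumer { }
    }

    static class SubConsumer extends QueueConsumer.InternalConsumer { }

    static class PlainConsumer { }

    // Binary names of nested classes join each nesting level with '$'.
    static final String SKIPPED_TYPE =
        ConsumerTypeCheckDemo.class.getName() + "$QueueConsumer$InternalConsumer";

    /** True only when the object's runtime class has exactly the given binary name. */
    static boolean isExactly(Object candidate, String binaryClassName) {
        return candidate != null
            && binaryClassName.equals(candidate.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(isExactly(new QueueConsumer.InternalConsumer(), SKIPPED_TYPE)); // true
        System.out.println(isExactly(new PlainConsumer(), SKIPPED_TYPE));                  // false
        System.out.println(isExactly(new SubConsumer(), SKIPPED_TYPE));                    // false
        System.out.println(isExactly(null, SKIPPED_TYPE));                                 // false
    }
}
```

Comparing names as strings avoids loading the Spring class from the plugin, at the cost of missing subclasses and of depending on an internal class name that a library release could change.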
