CASSCPP-3 Crash on Pure virtual function while the destructor of the Session object is called #583

Open
absurdfarce wants to merge 1 commit into apache:trunk from absurdfarce:casscpp3
@absurdfarce
Contributor

This PR takes multiple steps to keep us out of the crash situation. First: remove the call to the callback from the handler's on_close() function. The callback isn't especially useful when a connection is in the process of closing down like this; when that happens, host up or host add events (the objective of the callback in question) should be considered unreliable anyway. This change alone is probably enough to fix the problem, but its effect is entirely one of timing: by not having to handle the extra event delivery logic we spend less time in the callback, so there's less opportunity to overlap with other resources being destructed. That said, minimizing the amount of work here is still a pretty good idea.

Second: don't make the on_close() function of the handler a pure virtual function. There's no obvious reason this function has to be pure virtual, so the hope is that by giving it a (no-op) body we'll just execute that body if we wind up with a conversion from a PrepareHostHandler to a ConnectionListener object while the event loop is shutting down.
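To illustrate the second step, here is a minimal sketch of what happens when a virtual function is called on an object whose derived part has already been destroyed. The names `Listener`, `Handler`, `notify_close()`, and `run_demo()` are hypothetical stand-ins, not the driver's actual classes, and the sketch triggers the call from a base-class destructor to make the dispatch deterministic rather than reproducing the event-loop timing of the real crash. The base `on_close()` records a log entry only so the dispatch is visible; a real default would presumably be an empty body.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the driver's
// ConnectionListener / PrepareHostHandler classes.
class Listener {
public:
  explicit Listener(std::vector<std::string>& log) : log_(log) {}

  virtual ~Listener() {
    // Handler's destructor has already run, so the object's dynamic type is
    // Listener again and this call dispatches to Listener::on_close().
    // If on_close() were pure virtual ("= 0"), this call would be undefined
    // behavior, typically ending in a "pure virtual method called" abort.
    notify_close();
  }

  // Default body standing in for the no-op; it logs only to show the dispatch.
  virtual void on_close() { log_.push_back("Listener::on_close"); }

  // Non-virtual entry point that performs the virtual call.
  void notify_close() { on_close(); }

protected:
  std::vector<std::string>& log_;
};

class Handler : public Listener {
public:
  using Listener::Listener;
  void on_close() override { log_.push_back("Handler::on_close"); }
};

// Runs one call while the Handler is alive and one during its destruction,
// returning the sequence of on_close() implementations that were executed.
std::vector<std::string> run_demo() {
  std::vector<std::string> log;
  {
    Handler handler(log);
    handler.notify_close();  // object fully alive: Handler::on_close runs
  }                          // destruction: Listener::on_close runs
  return log;
}
```

Calling `run_demo()` returns two entries: `Handler::on_close` from the call on the live object, then `Listener::on_close` from the call made during destruction. With a default body in the base class, the late call is harmless instead of fatal, which is the behavior the second step is hoping for.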

Also avoid the callback in shutdown situations in order to avoid taking longer than we need to.
}

void PrepareHostHandler::on_close(Connection* connection) {
  callback_(this);
Contributor Author

Long-term the better answer is probably to not make HOST_UP/HOST_READY/HOST_ADD event distribution a function of the callback in use here. Those ops should be handled by logic in the relevant functions of cluster.cpp, specifically as some kind of "after" action from the prepare op. But this fix allows us to address the immediate problem without having to refactor the entirety of message delivery.

@absurdfarce
Contributor Author

@yifan-c If you have a sec I'd love a review of this one!

@absurdfarce absurdfarce requested a review from yifan-c March 12, 2026 19:45
@absurdfarce
Contributor Author

The IBM Jenkins run for this change was clean. Well, there were some weird failures around cloud connectivity on Rocky Linux 9 only, but I'm quite sure those were environmental/runner issues and not anything related to this change.
