perf: Use LinkedHashSet instead of array backed collection in ObjectsStore #98

Open
wchill wants to merge 1 commit into REAndroid:main from wchill:perf/use-linkedhashset

Conversation

@wchill
@wchill wchill commented Mar 12, 2026

Fixes #97

ObjectsStore does not provide an API for random access, so there is no point in using a list or array because we get no benefit from constant-time random access, as long as we can guarantee ordering (which LinkedHashSet does). This speeds up add, remove, and contains operations from O(n) to O(1), which prevents the degenerate case mentioned in #97.
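A minimal sketch of the property relied on here, using only the JDK: `LinkedHashSet` keeps insertion order while `add`, `remove`, and `contains` are hash lookups rather than scans. The class name `OrderedSetSketch` and the string items are illustrative and are not taken from the ObjectsStore code.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class OrderedSetSketch {
    // Adds, re-adds, and removes a few items, then returns the iteration order.
    static List<String> demo() {
        Set<String> store = new LinkedHashSet<>();
        store.add("c");
        store.add("a");
        store.add("b");
        store.add("a");    // already present: ignored, position unchanged
        store.remove("c"); // hash lookup, no linear scan
        return new ArrayList<>(store);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a, b]
    }
}
```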

The underlying data structure is not exposed (because all accesses are proxied through ObjectsStore), so we don't need to worry about the extra functions that ObjectsList/ArrayCollection provides.

Speedup: approximately 4 minutes spent for decode/modify/encode/zipalign/sign total -> 26 seconds on a Ryzen AI 9 HX 370.

@REAndroid
Owner

REAndroid commented Mar 14, 2026

Thank you for this PR!
The purpose of ObjectsStore is to store non-duplicate objects by identity hash. In most cases the entry size is expected to be 1, so no array-based collection is created. However, LinkedHashSet does not store elements based on identity hash. For example:

  if-eqz v0, :cond_0
  if-nez v1, :cond_0
  ....
  :cond_0
  :cond_0

In smali we don't print duplicate :cond_0 labels, but the two labels are equal according to equals and hashCode, so LinkedHashSet will drop one of the :cond_0 labels. This may cause a critical error during live dex editing.
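The concern can be reproduced with a small sketch. The `Label` class below is a hypothetical stand-in whose `equals` and `hashCode` depend only on the label name; it is not the project's label type. An identity-based set is built here with `Collections.newSetFromMap(new IdentityHashMap<>())`, which is one possible JDK option and does not preserve insertion order.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashSet;
import java.util.Set;

final class LabelIdentitySketch {
    // Hypothetical label: two instances with the same name are "equal".
    static final class Label {
        final String name;

        Label(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Label && ((Label) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Counts elements kept by an equals/hashCode based set.
    static int sizeWithEquals(Label... labels) {
        Set<Label> set = new LinkedHashSet<>();
        Collections.addAll(set, labels);
        return set.size();
    }

    // Counts elements kept by a reference-identity based set.
    static int sizeWithIdentity(Label... labels) {
        Set<Label> set = Collections.newSetFromMap(new IdentityHashMap<>());
        Collections.addAll(set, labels);
        return set.size();
    }

    public static void main(String[] args) {
        Label first = new Label(":cond_0");
        Label second = new Label(":cond_0"); // distinct object, same name
        System.out.println(sizeWithEquals(first, second));   // prints 1
        System.out.println(sizeWithIdentity(first, second)); // prints 2
    }
}
```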

You are creating an ArrayList for every sort, which may not be disposed of quickly during garbage collection.

I must make a test for this and merge later.

BTW: The root cause of this problem is most likely that r8 does not strip dead debug line numbers. Sometimes you may find several hundred line numbers for a single instruction. Try with -no-dex-debug and you will surely see significant improvements. I plan to make a dropper for damaged debug line numbers.

@wchill
Author

wchill commented Mar 14, 2026

I did originally use an IdentityHashMap to implement this, but the issue is that the contains() check on ArrayCollection actually checks for both reference equality and equality using equals(). So while IdentityHashMap might work, I did not explore it further because the semantics are not the same.
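To illustrate the difference in semantics, here is a rough sketch. `linearContains` is a guess at the behaviour described above (a match on either reference equality or `equals()`); it is not copied from ArrayCollection. `identityContains` uses a JDK identity set for comparison.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class ContainsSemanticsSketch {
    // Assumed array-style check: same reference OR equals() match.
    static boolean linearContains(List<?> items, Object target) {
        for (Object item : items) {
            if (item == target || (target != null && target.equals(item))) {
                return true;
            }
        }
        return false;
    }

    // Identity-only check: matches only the very same object.
    static boolean identityContains(List<?> items, Object target) {
        Set<Object> set = Collections.newSetFromMap(new IdentityHashMap<>());
        set.addAll(items);
        return set.contains(target);
    }

    public static void main(String[] args) {
        String stored = new String("cond_0");
        String probe = new String("cond_0"); // equal, but a different object
        List<String> items = Collections.singletonList(stored);
        System.out.println(linearContains(items, probe));    // prints true
        System.out.println(identityContains(items, probe));  // prints false
        System.out.println(identityContains(items, stored)); // prints true
    }
}
```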

Creation of an ArrayList on each sort did not seem to be an issue in practice, unless you are attempting to sort every time something is added/removed to the data structure (which would be a problem in itself).
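For reference, sorting a `LinkedHashSet` through a temporary list looks roughly like the sketch below. This is an assumed shape of the approach, not the code from this PR; `SortSketch` and `sort` are illustrative names. The temporary `ArrayList` is the allocation being discussed, and it is created once per sort call.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;

final class SortSketch {
    // Sorts the set's iteration order by copying to a short-lived list.
    static <T> void sort(LinkedHashSet<T> set, Comparator<? super T> cmp) {
        List<T> tmp = new ArrayList<>(set); // one temporary list per call
        tmp.sort(cmp);
        set.clear();
        set.addAll(tmp); // re-insert in sorted order
    }

    public static void main(String[] args) {
        LinkedHashSet<Integer> set = new LinkedHashSet<>();
        set.add(3);
        set.add(1);
        set.add(2);
        sort(set, Comparator.naturalOrder());
        System.out.println(set); // prints [1, 2, 3]
    }
}
```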

@REAndroid
Owner

I have to study it further. Please allow me some time.

Development

Successfully merging this pull request may close these issues.

Poor performance during APK decoding when APK contains many duplicate spec names