
(improvement) Optimize VectorType deserialization with struct.unpack and numpy #730

Draft
mykaul wants to merge 2 commits into scylladb:master from mykaul:vector-struct-numpy-deser

Conversation

@mykaul

@mykaul mykaul commented Mar 7, 2026

Summary

  • Replace element-by-element VectorType deserialization with bulk struct.unpack for known numeric types (float, double, int32, int64, short), caching a struct.Struct object at type-creation time
  • Add numpy fast-path (np.frombuffer().tolist()) for vectors with >= 32 elements, delivering ~4x speedup for 768/1536-dimension float vectors

Performance (pure Python path, CASS_DRIVER_NO_CYTHON=1)

| Vector Config | Before | After (struct) | After (numpy) | Total Speedup |
|---|---|---|---|---|
| Vector<float, 3> | 0.88 µs | 0.25 µs | — (uses struct) | 3.58x |
| Vector<float, 128> | 4.72 µs | 4.06 µs | 1.87 µs | 2.5x |
| Vector<float, 768> | 32.43 µs | 30.72 µs | 8.45 µs | 3.8x |
| Vector<float, 1536> | 63.74 µs | 63.24 µs | 15.77 µs | 4.0x |

Details

Commit 1 — struct.unpack optimization:

  • At apply_parameters() time, cache a struct.Struct for the vector's subtype and dimension (e.g. '>768f' for Vector<float, 768>)
  • deserialize() becomes a single C-level bulk unpack: list(cached_struct.unpack(byts))
  • Serialization is likewise optimized via cached_struct.pack(*v)
  • Fallback for non-numeric fixed-size types uses pre-allocated result list + cached method reference
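The caching idea above can be sketched in isolation. This is a minimal illustration, not the driver's actual class layout: `FloatVectorType` here is a hypothetical stand-in that caches one `struct.Struct` per dimension and (un)packs in a single call.

```python
import struct

class FloatVectorType:
    """Hypothetical sketch of a fixed-dimension float vector codec."""

    def __init__(self, dimension):
        # '>%df' = big-endian, <dimension> IEEE-754 single-precision floats,
        # compiled once at type-creation time and reused for every row
        self._vector_struct = struct.Struct('>%df' % dimension)

    def serialize(self, values):
        # one bulk pack instead of per-element serialization
        return self._vector_struct.pack(*values)

    def deserialize(self, byts):
        # one C-level bulk unpack instead of N per-element calls
        return list(self._vector_struct.unpack(byts))

vec = FloatVectorType(3)
data = vec.serialize([1.0, 2.0, 3.0])
print(vec.deserialize(data))  # [1.0, 2.0, 3.0]
```

Compiling the `Struct` once moves format-string parsing out of the per-row hot path, which is where the small-vector wins come from.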

Commit 2 — numpy for large vectors:

  • For vectors >= 32 elements with a known numeric dtype, use np.frombuffer(byts, dtype='>f4', count=N).tolist()
  • numpy avoids intermediate Python object creation during unpacking; .tolist() batch-converts with better cache locality
  • Threshold of 32 chosen empirically: below this, struct.unpack is faster due to lower fixed overhead
  • _numpy_dtype cached on the class at type-creation time
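The hybrid dispatch described above can be condensed into a few lines. A hedged sketch, assuming the float case only: `'>f4'` is big-endian float32, matching the wire format of Vector<float, N>, and the 32-element threshold mirrors the one quoted in this PR.

```python
import struct
import numpy as np

NUMPY_THRESHOLD = 32  # below this, struct.unpack wins on lower fixed overhead

def deserialize_float_vector(byts, n):
    """Sketch of the hybrid fast path for a float vector of n elements."""
    if n >= NUMPY_THRESHOLD:
        # frombuffer reads the bytes without copying; tolist() batch-converts
        # to Python floats with better cache locality than a Python loop
        return np.frombuffer(byts, dtype='>f4', count=n).tolist()
    # small vectors: single bulk unpack
    return list(struct.unpack('>%df' % n, byts))

payload = struct.pack('>64f', *range(64))
print(deserialize_float_vector(payload, 64)[:4])  # [0.0, 1.0, 2.0, 3.0]
```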

Both commits modify only cassandra/cqltypes.py. No Cython dependency.

mykaul added 2 commits March 7, 2026 12:00
…ct.unpack

Add bulk deserialization using struct.unpack for common numeric vector types
instead of element-by-element deserialization. This provides significant
performance improvements, especially for small vectors and integer types.

Optimized types:
- FloatType  ('>Nf' format)
- DoubleType ('>Nd' format)
- Int32Type  ('>Ni' format)
- LongType   ('>Nq' format)
- ShortType  ('>Nh' format)
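The subtype-to-format mapping above can be expressed as a small lookup table. A sketch under the assumption that type names are keyed as strings; `_STRUCT_FORMATS` and `make_vector_struct` are illustrative names, not the driver's.

```python
import struct

# Big-endian (network order) struct format characters for the fixed-size
# numeric subtypes listed in the commit message.
_STRUCT_FORMATS = {
    'FloatType':  'f',  # 4-byte IEEE-754 single
    'DoubleType': 'd',  # 8-byte IEEE-754 double
    'Int32Type':  'i',  # 4-byte signed int
    'LongType':   'q',  # 8-byte signed int
    'ShortType':  'h',  # 2-byte signed int
}

def make_vector_struct(subtype_name, dimension):
    """Return a cached-style Struct for a supported subtype, else None."""
    fmt = _STRUCT_FORMATS.get(subtype_name)
    return struct.Struct('>%d%s' % (dimension, fmt)) if fmt else None

print(make_vector_struct('Int32Type', 4).size)  # 16
```

Returning None for unsupported subtypes lets the caller fall through to the element-by-element path.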

Performance improvements (measured with CASS_DRIVER_NO_CYTHON=1):

Small vectors (3-4 elements):
  Vector<float, 3>  : 0.88 μs → 0.25 μs  (3.58x faster)
  Vector<float, 4>  : 0.78 μs → 0.28 μs  (2.79x faster)

Medium vectors (128 elements):
  Vector<float, 128>  : 4.72 μs → 4.06 μs  (1.16x faster)
  Vector<double, 128> : 4.83 μs → 4.01 μs  (1.20x faster)
  Vector<int, 128>    : 2.27 μs → 1.25 μs  (1.82x faster)

Large vectors (384-1536 elements):
  Vector<float, 384>  : 15.38 μs → 14.67 μs  (1.05x faster)
  Vector<float, 768>  : 32.43 μs → 30.72 μs  (1.06x faster)
  Vector<float, 1536> : 63.74 μs → 63.24 μs  (1.01x faster)

The optimization is most effective for:
- Small vectors (3-4 elements): 2.8-3.6x speedup
- Integer vectors: 1.8x speedup
- Medium-sized float/double vectors: ~1.2x speedup

For very large vectors (384+ elements), the benefit is minimal as the
deserialization time is dominated by data copying rather than function
call overhead.

Variable-size subtypes and other numeric types continue to use the
element-by-element fallback path.
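The fallback path reads a length prefix per element and slices accordingly. A hedged sketch: for simplicity, a 1-byte length prefix stands in for the driver's unsigned-vint encoding (uvint_unpack), and `deserialize_variable` is an illustrative name.

```python
def deserialize_variable(byts, n, element_deser):
    """Element-by-element fallback for variable-size subtypes.

    Each element is preceded by its encoded length; a 1-byte prefix is
    used here as a stand-in for the driver's uvint encoding.
    """
    rv = []
    idx = 0
    for i in range(n):
        try:
            size = byts[idx]          # stand-in for uvint_unpack(byts[idx:])
            idx += 1
            rv.append(element_deser(byts[idx:idx + size]))
            idx += size
        except (IndexError, KeyError):
            raise ValueError(
                "Error reading additional data during vector "
                "deserialization after successfully adding %d elements" % i)
    return rv

data = bytes([3]) + b'foo' + bytes([2]) + b'hi'
print(deserialize_variable(data, 2, lambda b: b.decode()))  # ['foo', 'hi']
```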

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

For vectors with 32 or more elements, use numpy.frombuffer() which provides
1.3-1.5x speedup for large vectors (128+ elements) compared to struct.unpack.

The hybrid approach:
- Small vectors (< 32 elements): struct.unpack (2.8-3.6x faster than baseline)
- Large vectors (>= 32 elements): numpy.frombuffer().tolist() (1.3-1.5x faster than struct.unpack)

Threshold of 32 elements balances code complexity with performance gains.

Benchmark results:
- float[128]:  2.15 μs → 1.87 μs (1.15x faster)
- float[384]:  6.17 μs → 4.44 μs (1.39x faster)
- float[768]: 12.25 μs → 8.45 μs (1.45x faster)
- float[1536]: 24.44 μs → 15.77 μs (1.55x faster)
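Numbers like these can be reproduced with a timeit micro-benchmark. A sketch, assuming float[768]; absolute timings will differ by machine, so only the relative ordering is meaningful.

```python
import struct
import timeit
import numpy as np

N = 768
payload = struct.pack('>%df' % N, *([0.5] * N))
s = struct.Struct('>%df' % N)  # cached, as in commit 1

# per-call time in µs: total seconds / 10_000 calls * 1e6
t_struct = timeit.timeit(lambda: list(s.unpack(payload)), number=10_000)
t_numpy = timeit.timeit(
    lambda: np.frombuffer(payload, dtype='>f4', count=N).tolist(),
    number=10_000)
print('struct: %.2f µs/call, numpy: %.2f µs/call' % (t_struct * 100, t_numpy * 100))
```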

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
Copilot AI left a comment

Pull request overview

This PR optimizes VectorType (de)serialization in cassandra/cqltypes.py by introducing bulk numeric (de)serialization via a cached struct.Struct, and an optional numpy-based deserialization fast path for larger vectors.

Changes:

  • Cache a per-parameterized-vector struct.Struct to bulk unpack/pack common numeric vector subtypes.
  • Add an optional numpy frombuffer(...).tolist() deserialization fast-path for vectors with vector_size >= 32.
  • Refactor variable-size vector deserialization to a fixed-iteration loop with stricter bounds checks.


Comment on lines +56 to 57

    import numpy as np
Comment on lines 1500 to +1504

        try:
            size, bytes_read = uvint_unpack(byts[idx:])
            idx += bytes_read
            rv.append(cls.subtype.deserialize(byts[idx:idx + size], protocol_version))
            idx += size
    -   except:
    +   except (IndexError, KeyError):
            raise ValueError("Error reading additional data during vector deserialization after successfully adding {} elements"\
    -           .format(len(rv)))
    +           .format(i))
Comment on lines +1476 to +1479

    if cls._vector_struct is not None:
        if HAVE_NUMPY and cls.vector_size >= 32 and cls._numpy_dtype is not None:
            return np.frombuffer(byts, dtype=cls._numpy_dtype, count=cls.vector_size).tolist()
        return list(cls._vector_struct.unpack(byts))
