@tigrannajaryan
Collaborator

Don't review.

Continue 64x2 SIMD implementation

Start 64x2 SIMD implementation

Continue 64x2 implementation

Start 64x2 implementation

Faster varint

Cleanup

Fix file names

Cleanup LUTs

Improve writeUvar32x4_AVX2

Implement writeUvar32x4_AVX2

Begin writing writeUvar32x4_AVX2

Change to 0,1,2,4 schema for 32 bit

Add WriteUvar4x32 and ReadUvar4x32

Implement ReadUvar4x64SIMD amd64

Implement WriteUvar4x64 and ReadUvar4x64
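
Note on the "0,1,2,4 schema" mentioned in the commits above: the PR page does not show the code, so the following is only a scalar sketch of one plausible reading, not the implementation in this branch. It assumes a group-varint style layout where four 32-bit values share one control byte, and each 2-bit field of that byte selects a payload length of 0, 1, 2 or 4 bytes. The names `appendUvar4x32` and `readUvar4x32`, the little-endian payload order, and the bit layout of the control byte are all hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// lenCodes maps a 2-bit code to a payload length in bytes.
// Assumption: this is what "0,1,2,4 schema" refers to.
var lenCodes = [4]int{0, 1, 2, 4}

var errShort = errors.New("uvar4x32: input too short")

// codeFor returns the smallest 2-bit code whose length can hold v.
// Values that need 3 bytes are stored in 4, since 3 is not in the scheme.
func codeFor(v uint32) byte {
	switch {
	case v == 0:
		return 0
	case v <= 0xFF:
		return 1
	case v <= 0xFFFF:
		return 2
	default:
		return 3
	}
}

// appendUvar4x32 appends one control byte followed by the
// little-endian payload bytes of the four values (hypothetical layout).
func appendUvar4x32(dst []byte, vals [4]uint32) []byte {
	ctrlPos := len(dst)
	dst = append(dst, 0) // placeholder for the control byte
	var ctrl byte
	for i, v := range vals {
		code := codeFor(v)
		ctrl |= code << uint(2*i) // value i uses bits 2i and 2i+1
		var tmp [4]byte
		binary.LittleEndian.PutUint32(tmp[:], v)
		dst = append(dst, tmp[:lenCodes[code]]...)
	}
	dst[ctrlPos] = ctrl
	return dst
}

// readUvar4x32 decodes four values and reports how many bytes it consumed.
func readUvar4x32(src []byte) (vals [4]uint32, n int, err error) {
	if len(src) < 1 {
		return vals, 0, errShort
	}
	ctrl := src[0]
	n = 1
	for i := range vals {
		l := lenCodes[(ctrl>>uint(2*i))&3]
		if len(src) < n+l {
			return vals, 0, errShort
		}
		var tmp [4]byte // zero-extends payloads shorter than 4 bytes
		copy(tmp[:], src[n:n+l])
		vals[i] = binary.LittleEndian.Uint32(tmp[:])
		n += l
	}
	return vals, n, nil
}

func main() {
	buf := appendUvar4x32(nil, [4]uint32{0, 7, 300, 70000})
	fmt.Printf("% x\n", buf) // e4 07 2c 01 70 11 01 00
	vals, n, err := readUvar4x32(buf)
	fmt.Println(vals, n, err)
}
```

In this sketch the four example values take 8 bytes in total: one control byte plus 0, 1, 2 and 4 payload bytes. Restricting lengths to 0, 1, 2 and 4 wastes a byte on values that would fit in 3, but power-of-two widths tend to be easier to handle with lookup tables and byte shuffles, which may be why a SIMD variant would prefer them; that motivation is a guess, not something stated in the PR.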
@github-actions
github-actions bot commented Oct 1, 2025

Benchmark Result

Benchmark diff with base branch
goos: linux
goarch: amd64
pkg: github.com/splunk/stef/benchmarks
cpu: AMD EPYC 7763 64-Core Processor                
                                                 │ bench-main.txt │           bench-new.txt            │
                                                 │     sec/op     │    sec/op     vs base              │
SerializeNative/STEF/serialize-4                     6.655m ±  3%   7.099m ±  7%  +6.68% (p=0.002 n=6)
SerializeNative/STEFU/serialize-4                    33.51m ±  1%   33.67m ±  1%       ~ (p=0.240 n=6)
DeserializeNative/STEF/deser-4                       2.581m ±  1%   2.608m ±  2%       ~ (p=0.093 n=6)
DeserializeNative/STEFU/deser-4                      7.448m ±  1%   7.426m ±  1%       ~ (p=0.589 n=6)
SerializeFromPdata/STEF/serialize-4                  131.6m ±  3%   133.8m ± 22%       ~ (p=0.240 n=6)
SerializeFromPdata/STEFU/serialize-4                 33.65m ±  1%   33.85m ±  1%       ~ (p=0.093 n=6)
DeserializeToPdata/STEF/deserialize-4                39.60m ±  2%   39.88m ±  1%       ~ (p=0.394 n=6)
DeserializeToPdata/STEFU/deserialize-4               56.67m ±  2%   56.78m ±  0%       ~ (p=0.818 n=6)
STEFReaderRead-4                                     2.626m ±  1%   2.678m ±  1%  +1.96% (p=0.002 n=6)
STEFSerializeMultipart/astronomy-otelmetrics-4        3.286 ± 21%    3.255 ± 23%       ~ (p=0.589 n=6)
STEFDeserializeMultipart/astronomy-otelmetrics-4     75.00m ± 12%   78.79m ± 12%       ~ (p=0.394 n=6)
ReadSTEF-4                                           2.773m ±  1%   2.729m ±  3%       ~ (p=0.065 n=6)
ReadSTEFZ-4                                          4.435m ±  1%   4.065m ±  1%  -8.35% (p=0.002 n=6)
ReadSTEFZWriteSTEF-4                                 7.931m ±  4%   7.973m ±  0%       ~ (p=0.589 n=6)
geomean                                              20.72m         20.82m        +0.50%

                                                 │ bench-main.txt │            bench-new.txt            │
                                                 │   sec/point    │   sec/point    vs base              │
SerializeNative/STEF/serialize-4                     99.52n ±  3%   106.20n ±  7%  +6.71% (p=0.002 n=6)
SerializeNative/STEFU/serialize-4                    501.2n ±  1%    503.6n ±  1%       ~ (p=0.240 n=6)
DeserializeNative/STEF/deser-4                       38.59n ±  1%    39.01n ±  2%       ~ (p=0.093 n=6)
DeserializeNative/STEFU/deser-4                      111.4n ±  1%    111.0n ±  0%       ~ (p=0.675 n=6)
SerializeFromPdata/STEF/serialize-4                  1.968µ ±  3%    2.002µ ± 22%       ~ (p=0.229 n=6)
SerializeFromPdata/STEFU/serialize-4                 503.3n ±  1%    506.3n ±  2%       ~ (p=0.082 n=6)
DeserializeToPdata/STEF/deserialize-4                592.3n ±  2%    596.5n ±  1%       ~ (p=0.331 n=6)
DeserializeToPdata/STEFU/deserialize-4               847.8n ±  2%    849.4n ±  0%       ~ (p=0.818 n=6)
STEFReaderRead-4                                     39.28n ±  1%    40.04n ±  1%  +1.95% (p=0.002 n=6)
STEFSerializeMultipart/astronomy-otelmetrics-4       4.176µ ± 21%    4.137µ ± 23%       ~ (p=0.589 n=6)
STEFDeserializeMultipart/astronomy-otelmetrics-4     95.32n ± 12%   100.14n ± 12%       ~ (p=0.370 n=6)
ReadSTEF-4                                           41.49n ±  1%    40.84n ±  3%       ~ (p=0.061 n=6)
ReadSTEFZ-4                                          66.38n ±  1%    60.84n ±  1%  -8.35% (p=0.002 n=6)
ReadSTEFZWriteSTEF-4                                 118.7n ±  4%    119.3n ±  1%       ~ (p=0.608 n=6)
geomean                                              217.9n          219.0n        +0.51%

                                                 │ bench-main.txt │            bench-new.txt             │
                                                 │      B/op      │     B/op      vs base                │
SerializeNative/STEF/serialize-4                     3.338Mi ± 0%   3.340Mi ± 0%  +0.04% (p=0.026 n=6)
SerializeNative/STEFU/serialize-4                    7.530Mi ± 0%   7.530Mi ± 0%       ~ (p=0.576 n=6)
DeserializeNative/STEF/deser-4                       934.2Ki ± 0%   934.2Ki ± 0%       ~ (p=1.000 n=6) ¹
DeserializeNative/STEFU/deser-4                      1.470Mi ± 0%   1.470Mi ± 0%       ~ (p=1.000 n=6) ¹
SerializeFromPdata/STEF/serialize-4                  74.82Mi ± 0%   74.82Mi ± 0%       ~ (p=0.288 n=6)
SerializeFromPdata/STEFU/serialize-4                 7.530Mi ± 0%   7.530Mi ± 0%       ~ (p=0.587 n=6)
DeserializeToPdata/STEF/deserialize-4                29.91Mi ± 0%   29.91Mi ± 0%       ~ (p=0.323 n=6)
DeserializeToPdata/STEFU/deserialize-4               36.53Mi ± 0%   36.53Mi ± 0%       ~ (p=0.422 n=6)
STEFReaderRead-4                                     935.9Ki ± 0%   935.9Ki ± 0%       ~ (p=1.000 n=6) ¹
STEFSerializeMultipart/astronomy-otelmetrics-4       3.361Gi ± 0%   3.363Gi ± 0%       ~ (p=0.394 n=6)
STEFDeserializeMultipart/astronomy-otelmetrics-4     20.40Mi ± 0%   20.40Mi ± 0%       ~ (p=0.502 n=6)
ReadSTEF-4                                           935.9Ki ± 0%   935.9Ki ± 0%       ~ (p=0.455 n=6)
ReadSTEFZ-4                                          10.27Mi ± 0%   10.27Mi ± 0%       ~ (p=0.818 n=6)
ReadSTEFZWriteSTEF-4                                 13.43Mi ± 0%   13.43Mi ± 0%       ~ (p=0.180 n=6)
geomean                                              10.38Mi        10.38Mi       +0.01%
¹ all samples are equal

                                                 │ bench-main.txt │            bench-new.txt            │
                                                 │   allocs/op    │  allocs/op   vs base                │
SerializeNative/STEF/serialize-4                      2.645k ± 0%   2.647k ± 1%  +0.09% (p=0.037 n=6)
SerializeNative/STEFU/serialize-4                      884.0 ± 0%    884.0 ± 0%       ~ (p=1.000 n=6)
DeserializeNative/STEF/deser-4                         465.0 ± 0%    465.0 ± 0%       ~ (p=1.000 n=6) ¹
DeserializeNative/STEFU/deser-4                        469.0 ± 0%    469.0 ± 0%       ~ (p=1.000 n=6) ¹
SerializeFromPdata/STEF/serialize-4                   134.7k ± 0%   134.7k ± 0%       ~ (p=0.589 n=6)
SerializeFromPdata/STEFU/serialize-4                   886.0 ± 0%    886.0 ± 0%       ~ (p=1.000 n=6)
DeserializeToPdata/STEF/deserialize-4                 622.5k ± 0%   622.5k ± 0%       ~ (p=1.000 n=6)
DeserializeToPdata/STEFU/deserialize-4                811.2k ± 0%   811.2k ± 0%       ~ (p=1.000 n=6) ¹
STEFReaderRead-4                                       465.0 ± 0%    465.0 ± 0%       ~ (p=1.000 n=6) ¹
STEFSerializeMultipart/astronomy-otelmetrics-4        13.15M ± 0%   13.15M ± 0%       ~ (p=1.000 n=6)
STEFDeserializeMultipart/astronomy-otelmetrics-4      2.293k ± 0%   2.293k ± 0%       ~ (p=1.000 n=6) ¹
ReadSTEF-4                                             465.0 ± 0%    465.0 ± 0%       ~ (p=1.000 n=6) ¹
ReadSTEFZ-4                                            501.0 ± 0%    501.0 ± 0%       ~ (p=1.000 n=6) ¹
ReadSTEFZWriteSTEF-4                                  1.232k ± 0%   1.232k ± 0%       ~ (p=1.000 n=6)
geomean                                               6.199k        6.199k       +0.01%
¹ all samples are equal
Benchmark result
benchstat bench-new.txt
goos: linux
goarch: amd64
pkg: github.com/splunk/stef/benchmarks
cpu: AMD EPYC 7763 64-Core Processor                
                                                 │ bench-new.txt │
                                                 │    sec/op     │
SerializeNative/STEF/serialize-4                    7.099m ±  7%
SerializeNative/STEFU/serialize-4                   33.67m ±  1%
DeserializeNative/STEF/deser-4                      2.608m ±  2%
DeserializeNative/STEFU/deser-4                     7.426m ±  1%
SerializeFromPdata/STEF/serialize-4                 133.8m ± 22%
SerializeFromPdata/STEFU/serialize-4                33.85m ±  1%
DeserializeToPdata/STEF/deserialize-4               39.88m ±  1%
DeserializeToPdata/STEFU/deserialize-4              56.78m ±  0%
STEFReaderRead-4                                    2.678m ±  1%
STEFSerializeMultipart/astronomy-otelmetrics-4       3.255 ± 23%
STEFDeserializeMultipart/astronomy-otelmetrics-4    78.79m ± 12%
ReadSTEF-4                                          2.729m ±  3%
ReadSTEFZ-4                                         4.065m ±  1%
ReadSTEFZWriteSTEF-4                                7.973m ±  0%
geomean                                             20.82m

                                                 │ bench-new.txt │
                                                 │   sec/point   │
SerializeNative/STEF/serialize-4                    106.2n ±  7%
SerializeNative/STEFU/serialize-4                   503.6n ±  1%
DeserializeNative/STEF/deser-4                      39.01n ±  2%
DeserializeNative/STEFU/deser-4                     111.0n ±  0%
SerializeFromPdata/STEF/serialize-4                 2.002µ ± 22%
SerializeFromPdata/STEFU/serialize-4                506.3n ±  2%
DeserializeToPdata/STEF/deserialize-4               596.5n ±  1%
DeserializeToPdata/STEFU/deserialize-4              849.4n ±  0%
STEFReaderRead-4                                    40.04n ±  1%
STEFSerializeMultipart/astronomy-otelmetrics-4      4.137µ ± 23%
STEFDeserializeMultipart/astronomy-otelmetrics-4    100.1n ± 12%
ReadSTEF-4                                          40.84n ±  3%
ReadSTEFZ-4                                         60.84n ±  1%
ReadSTEFZWriteSTEF-4                                119.3n ±  1%
geomean                                             219.0n

                                                 │ bench-new.txt │
                                                 │     B/op      │
SerializeNative/STEF/serialize-4                    3.340Mi ± 0%
SerializeNative/STEFU/serialize-4                   7.530Mi ± 0%
DeserializeNative/STEF/deser-4                      934.2Ki ± 0%
DeserializeNative/STEFU/deser-4                     1.470Mi ± 0%
SerializeFromPdata/STEF/serialize-4                 74.82Mi ± 0%
SerializeFromPdata/STEFU/serialize-4                7.530Mi ± 0%
DeserializeToPdata/STEF/deserialize-4               29.91Mi ± 0%
DeserializeToPdata/STEFU/deserialize-4              36.53Mi ± 0%
STEFReaderRead-4                                    935.9Ki ± 0%
STEFSerializeMultipart/astronomy-otelmetrics-4      3.363Gi ± 0%
STEFDeserializeMultipart/astronomy-otelmetrics-4    20.40Mi ± 0%
ReadSTEF-4                                          935.9Ki ± 0%
ReadSTEFZ-4                                         10.27Mi ± 0%
ReadSTEFZWriteSTEF-4                                13.43Mi ± 0%
geomean                                             10.38Mi

                                                 │ bench-new.txt │
                                                 │   allocs/op   │
SerializeNative/STEF/serialize-4                     2.647k ± 1%
SerializeNative/STEFU/serialize-4                     884.0 ± 0%
DeserializeNative/STEF/deser-4                        465.0 ± 0%
DeserializeNative/STEFU/deser-4                       469.0 ± 0%
SerializeFromPdata/STEF/serialize-4                  134.7k ± 0%
SerializeFromPdata/STEFU/serialize-4                  886.0 ± 0%
DeserializeToPdata/STEF/deserialize-4                622.5k ± 0%
DeserializeToPdata/STEFU/deserialize-4               811.2k ± 0%
STEFReaderRead-4                                      465.0 ± 0%
STEFSerializeMultipart/astronomy-otelmetrics-4       13.15M ± 0%
STEFDeserializeMultipart/astronomy-otelmetrics-4     2.293k ± 0%
ReadSTEF-4                                            465.0 ± 0%
ReadSTEFZ-4                                           501.0 ± 0%
ReadSTEFZWriteSTEF-4                                 1.232k ± 0%
geomean                                              6.199k

@tigrannajaryan changed the title from "Experiment with SIMD varint encoding" to "Draft: Experiment with SIMD varint encoding" on Oct 8, 2025