The following are some initial benchmark results. At this point this is mostly an exploration of some ideas I have, with a few parts optimized, but overall there is not much optimization in place yet. There may be mistakes in places that I have missed.
Specs:
Intel Core i3-6100 CPU 3.70GHz (Skylake), 1 CPU, 4 logical and 2 physical cores
Frequency=3609482 Hz, Resolution=277.0481 ns, Timer=TSC
.NET Core SDK=2.1.801
[Host] : .NET Core 2.1.12 (CoreCLR 4.6.27817.01, CoreFX 4.6.27818.01), 64bit RyuJIT
Job=InProcess Toolchain=InProcessEmitToolchain
Case 1. Verifying P2PKH scripts
This is from my long comment above. Initially I was doing some boxing/unboxing inside the optimizer, which slowed things down a lot; fixing that gives a much better result. CheckSigOp is mocked to keep things simple, so no tx is needed and the ECC work is skipped.
| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---------- |---------:|----------:|-----------:|------:|--------:|--------:|------:|------:|----------:|
| SimpleRun | 439.5 us | 9.2282 us | 10.6272 us | 1.00 | 0.00 | 91.7969 | - | - | 141.23 KB |
| Optimized | 373.3 us | 0.2264 us | 0.2118 us | 0.85 | 0.02 | 75.1953 | - | - | 115.79 KB |
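To make clearer what "mocking CheckSigOp" means for this case, here is a minimal, language-neutral sketch in Python. It is not the C# code that was benchmarked. The names `run_p2pkh`, `mock_check_sig` and `hash160_stub` are hypothetical, and `hash160_stub` is only a stand-in for the real RIPEMD160(SHA256(x)) so that the sketch has no dependency on RIPEMD-160 being available.

```python
import hashlib


def hash160_stub(data: bytes) -> bytes:
    # ASSUMPTION: stand-in for the real HASH160 (RIPEMD160 of SHA256).
    # Truncated double SHA-256 is used only to keep the sketch portable.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:20]


def mock_check_sig(sig: bytes, pubkey: bytes) -> bool:
    # Mocked signature check: always succeeds, so no transaction
    # and no elliptic curve math are needed.
    return True


def run_p2pkh(sig: bytes, pubkey: bytes, expected_hash: bytes, check_sig) -> bool:
    stack = [sig, pubkey]                     # scriptSig: <sig> <pubkey>
    stack.append(stack[-1])                   # OP_DUP
    stack.append(hash160_stub(stack.pop()))   # OP_HASH160 (stubbed)
    stack.append(expected_hash)               # push <pubKeyHash>
    if stack.pop() != stack.pop():            # OP_EQUALVERIFY
        return False
    pub = stack.pop()                         # OP_CHECKSIG pops pubkey, then sig
    s = stack.pop()
    return check_sig(s, pub)
```

With the mock in place, the run measures only the script machinery (stack operations and hashing), not signature verification.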
Case 2. Many pushes
OP_CHECKSIG (200 times)
OP_1
push [ 0 ]
OP_IF
OP_0 (9601 times)
OP_ENDIF
| Method | Mean | Error | StdDev | Median | Ratio |
|---------- |---------:|---------:|---------:|---------:|------:|
| SimpleRun | 786.1 us | 10.88 us | 60.57 us | 782.4 us | 1.00 |
| Optimized | 743.4 us | 10.38 us | 58.15 us | 723.1 us | 0.95 |
Case 3. Many OP_IFs
OP_IF { 100 times }
0 { 9798 times }
OP_ENDIF { 100 times }
1
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------- |-----------:|---------:|---------:|---------:|---------:|--------:|-----------:|
| SimpleRun | 677.3 us | 13.42 us | 15.98 us | 124.0234 | 61.5234 | 41.0156 | 686.45 KB |
| ReadAndRun | 1,343.7 us | 18.92 us | 16.77 us | 248.0469 | 123.0469 | 82.0313 | 1372.67 KB |
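For readers unfamiliar with how nested OP_IFs are evaluated, the following Python sketch shows one common approach: a stack of execution flags, one per open OP_IF. It is an illustration under my own assumptions, not the code that was benchmarked. The function name `run_nested_ifs` and the representation of opcodes as strings and pushes as integers are hypothetical.

```python
def run_nested_ifs(ops, initial_stack):
    """Evaluate a script made of OP_IF, OP_ENDIF and integer pushes."""
    stack = list(initial_stack)
    exec_flags = []  # one bool per currently open OP_IF
    for op in ops:
        # A branch executes only if every enclosing OP_IF was true.
        executing = all(exec_flags)
        if op == "OP_IF":
            if executing:
                if not stack:
                    raise ValueError("OP_IF on empty stack")
                cond = stack.pop() != 0
            else:
                cond = False  # nested inside a skipped branch
            exec_flags.append(cond)
        elif op == "OP_ENDIF":
            if not exec_flags:
                raise ValueError("OP_ENDIF without OP_IF")
            exec_flags.pop()
        elif executing:
            stack.append(op)  # integer push
    if exec_flags:
        raise ValueError("unbalanced OP_IF")
    return stack
```

Note that `all(exec_flags)` rescans every open flag on each opcode, so with 100 open OP_IFs and thousands of pushes the cost adds up. Tracking the number of false flags with a counter would avoid the rescan.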
Case 4. Many OP_ROLLs
998 OP_ROLL { 200 times }
| Method | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------- |---------:|----------:|----------:|------:|---------:|-------:|------:|----------:|
| SimpleRun | 122.0 us | 1.5537 us | 1.4534 us | 1.00 | 333.1299 | 0.1221 | - | 511.5 KB |
| ReadAndRun | 145.6 us | 0.5623 us | 0.4985 us | 1.19 | 385.4980 | - | - | 594.83 KB |
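As a reminder of what each `998 OP_ROLL` does, here is a small Python sketch of OP_ROLL on a list-backed stack. It is an illustration only, not the benchmarked implementation, and the function name `op_roll` is hypothetical. I am assuming the stack already holds 999 items when the script starts, so that depth 998 is valid.

```python
def op_roll(stack):
    """Pop n, then move the item n positions below the top to the top."""
    n = stack.pop()
    if n < 0 or n >= len(stack):
        raise ValueError("OP_ROLL depth out of range")
    # n == 0 is the top item; n == len(stack) - 1 is the bottom item.
    item = stack.pop(-1 - n)
    stack.append(item)
```

With depth 998 on a 999-item stack, every OP_ROLL pulls out the bottom element, so a contiguous array has to shift all the remaining items each time. That is what makes this a useful stress case.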
Future plans:
I intend to keep working on this and to compare the results against actual timings measured with Bitcoin Core. For now, though, I'll move on to optimizing my asymmetric cryptography code and start the comparison there.