Mining for Tor v3 onions in the cloud
Tor has supported a new hidden service protocol since v0.3.2.1-alpha, released back in October 2017, and that protocol is now in the stable branches. Dubbed the "v3" onion service protocol, it replaces SHA1/DH/RSA1024 with SHA3/ed25519/curve25519, among other changes, for much improved cryptographic security.
I already had a v2 onion site up at tbrindus6tjv6wpi.onion, so I thought it would be an interesting exercise to mine a v3 vanity domain prefixed with tbrindus. For this, I set up 15 servers to mine for a matching prefix (more on this below!).
It took well over a week of mining, but as of today, this site can also be accessed through the v3 hidden service tbrindusxnnqwmzov5qof56hyion6usmciqwykffxqsawswhk73aq5yd.onion!
A bit of background
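Tor hidden service domain "names" aren't really domain names as most people are used to them. You can enter them in your (Tor) browser, but you can't buy a particular domain you want: a hidden service hostname is derived from the service's public key. A v2 hostname is the base32 encoding of a truncated SHA-1 hash of the key, while a v3 hostname is the base32 encoding of the ed25519 public key itself, followed by a checksum and a version byte.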
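To make the v3 case concrete, here is a minimal Python sketch of how a v3 hostname is put together from a 32-byte public key, as I understand the v3 onion service specification. The function names are my own, and the all-zero key is a dummy input for illustration, not a real ed25519 key.

```python
import base64
import hashlib

VERSION = b"\x03"  # v3 onion services use version byte 3


def v3_onion_address(pubkey: bytes) -> str:
    """Encode a 32-byte ed25519 public key as a v3 .onion hostname."""
    if len(pubkey) != 32:
        raise ValueError("ed25519 public keys are 32 bytes long")
    # checksum = first 2 bytes of SHA3-256(".onion checksum" || pubkey || version)
    digest = hashlib.sha3_256(b".onion checksum" + pubkey + VERSION).digest()
    checksum = digest[:2]
    # 32 + 2 + 1 = 35 bytes, which base32-encodes to exactly 56 characters
    label = base64.b32encode(pubkey + checksum + VERSION).decode("ascii").lower()
    return label + ".onion"


def split_v3_address(address: str):
    """Split a v3 hostname back into (pubkey, checksum, version byte)."""
    label = address[:-6] if address.endswith(".onion") else address
    raw = base64.b32decode(label.upper())
    return raw[:32], raw[32:34], raw[34]


if __name__ == "__main__":
    # Dummy key bytes, purely for illustration.
    print(v3_onion_address(bytes(32)))
```

Since base32 turns every 5 bytes into 8 characters, an 8-letter prefix depends only on the first 5 bytes of the public key, which is why a miner can check candidates without computing the checksum.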
If you want a particular onion, you must randomly generate billions of keys until one happens to produce a hostname starting with the prefix you're looking for. In the case of tbrindus, an 8-letter prefix, there are $32^8 = 1\,099\,511\,627\,776$ possible combinations. Every additional letter increases the space (and hence expected computation time) by a factor of 32.
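The growth is easy to tabulate. The sketch below is illustrative only: the helper names are mine, and the 1,000,000 keys per second used in the example is an assumed round number, not a measurement.

```python
BASE32_ALPHABET = "abcdefghijklmnopqrstuvwxyz234567"  # 32 symbols


def search_space(prefix: str) -> int:
    """Number of equally likely strings of the same length as `prefix`."""
    for ch in prefix:
        if ch not in BASE32_ALPHABET:
            raise ValueError(f"{ch!r} can never appear in an onion hostname")
    return len(BASE32_ALPHABET) ** len(prefix)


def expected_days(prefix: str, keys_per_second: float) -> float:
    """Expected time to the first match, in days, at a constant key rate."""
    return search_space(prefix) / keys_per_second / 86_400


if __name__ == "__main__":
    assumed_rate = 1_000_000  # hypothetical keys per second
    for length in range(5, 11):
        prefix = "tbrindusxn"[:length]
        print(f"{length:>2} chars  {search_space(prefix):>20,}  "
              f"{expected_days(prefix, assumed_rate):>14,.2f} days")
```

Note that the digits 0, 1, 8 and 9 are not part of the base32 alphabet, so a prefix containing them can never be mined.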
V2 onions have been around for a long time, so there exist GPU-based miners like Scallion which can hash at frightening rates (several gigahashes a second). In fact, Scallion was used to brute force 32-bit GPG key ids to demonstrate that 32-bit ids are insecure (see evil32.com for more on that).
Tor's switch to ed25519 means that existing tools for generating vanity names like Scallion can't be used — at the time of writing, the best bet for v3 vanity names is mkp224o, a CPU-based miner.
I expected mkp224o to be orders of magnitude slower than GPU-based mining, so I spun up 15 servers across several providers (I'm looking for a new host, and thought this would be a good opportunity to test some new ones out).
Setting up the servers
Getting mkp224o set up and running is fairly simple. On most development machines you'd probably have everything required preinstalled, with perhaps the exception of libsodium-dev.
On a typical Debian-based distro, you can get everything you need to get running with:
$ apt install autoconf build-essential git libsodium-dev
$ git clone https://github.com/cathugger/mkp224o.git
$ cd mkp224o
$ ./autogen.sh
$ ./configure # see below
$ make
For ARM servers, I passed --enable-donna to configure, while for x86_64 boxes I used either --enable-amd64-51-30k or --enable-amd64-64-24k, whichever provided the greatest hashrate.
For mining, I specified a filter for tbrindus:
$ ./mkp224o -s -T tbrindus
…and waited. I waited a long time.
Mining results
V2 onions can be hashed incredibly fast on common GPUs with Scallion, with many cards capable of several gigahashes per second. On my laptop's GTX 960M, Scallion pulled in 1 GH/s, and mined tbrindus6tjv6wpi.onion in under 10 minutes.
For comparison, the 15 servers I ran mkp224o on for 6 days pulled in an aggregate of roughly 5 MH/s, or about 0.5% of what my fairly standard laptop graphics card can compute.
Below, I've put together a table of the setups I ran to compute tbrindusxnnqwmzov5qof56hyion6usmciqwykffxqsawswhk73aq5yd.onion.
Host | Plan | OS | CPU | RAM | Hashes/s | Contrib. |
---|---|---|---|---|---|---|
Scaleway [1] | C2S | Debian 9.0 | 4x Intel Atom C2550 @ 2.3GHz | 8GB | 229,400 | 4.76% |
Scaleway | ARM64-16GB | Debian 9.0 | 16x ARMv8 Cavium ThunderX | 16GB | 1,300,000 | 26.97% |
Scaleway | ARM64-8GB | Ubuntu 16.04 | 8x ARMv8 Cavium ThunderX | 8GB | 626,000 | 12.99% |
Scaleway | ARM64-2GB | Ubuntu 16.04 | 4x ARMv8 Cavium ThunderX | 2GB | 314,000 | 6.51% |
Scaleway [2] | ARM64-2GB | Debian 9.3 | 4x ARMv8 Cavium ThunderX | 2GB | 218,000 | 4.52% |
Scaleway | C1 | Debian 9.0 | 2x Intel Atom C2750 @ 2.3GHz | 2GB | 113,500 | 2.35% |
DigitalOcean | Compute 4GB | Debian 9.4 | 2x Intel Xeon E5-2697A v4 @ 2.5GHz | 4GB | 470,000 | 9.75% |
Azure | Standard B2s | Ubuntu 16.04 | 2x Intel Xeon E5-2673 v4 @ 2.294GHz | 4GB | 68,000 | 1.41% |
Azure | Standard B2s | Debian 9.3 | 2x Intel Xeon E5-2673 v4 @ 2.294GHz | 4GB | 80,000 | 1.66% |
Azure | Standard B2s | FreeBSD 11.1 | 2x Intel Xeon E5-2673 v4 @ 2.294GHz | 4GB | 69,000 | 1.43% |
SSDNodes [3] | 8GB KVM | Debian 9.3 | 2x Intel (Skylake, IBRS) @ 2.299GHz | 8GB | 274,500 | 5.69% |
SSDNodes [3] | 16GB KVM | Debian 9.3 | 4x Intel (Skylake, IBRS) @ 2.299GHz | 16GB | 540,000 | 11.20% |
SSDNodes | 8GB Container | Debian 9.4 | 4x Intel Xeon E5-2697 v3 @ 766MHz | 8GB | 78,000 | 1.62% |
— [4] | Raspberry Pi 3 | Raspbian 9.1 | 4x ARM Cortex-A53 @ 1.2GHz | 1GB | 70,000 | 1.45% |
— [4] | Optiplex 960 | Ubuntu 16.04 | 4x Intel Core 2 Quad Q9400 @ 2.659GHz | 4GB | 370,000 | 7.68% |
— | — | — | — | — | 4,820,400 | 100.00% |
1. This was a dedicated machine.
2. This machine was provisioned with the same specs as the other ARM64-2GB instance, but was also running a Tor relay, which explains the difference in hashrate.
3. CPU steal time on these machines was constantly at 20% or higher.
4. I ran these machines uninterrupted at home.
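The Contrib. column is just each machine's share of the aggregate hashrate. As a cross-check, here is a small sketch that recomputes the shares from the table above and also groups them by provider. The grouping, and the "home" label for the two machines I ran myself, are additions for illustration.

```python
from collections import defaultdict

# (provider, plan, hashes per second), copied from the table above
MACHINES = [
    ("Scaleway", "C2S", 229_400),
    ("Scaleway", "ARM64-16GB", 1_300_000),
    ("Scaleway", "ARM64-8GB", 626_000),
    ("Scaleway", "ARM64-2GB", 314_000),
    ("Scaleway", "ARM64-2GB (also a Tor relay)", 218_000),
    ("Scaleway", "C1", 113_500),
    ("DigitalOcean", "Compute 4GB", 470_000),
    ("Azure", "Standard B2s (Ubuntu)", 68_000),
    ("Azure", "Standard B2s (Debian)", 80_000),
    ("Azure", "Standard B2s (FreeBSD)", 69_000),
    ("SSDNodes", "8GB KVM", 274_500),
    ("SSDNodes", "16GB KVM", 540_000),
    ("SSDNodes", "8GB Container", 78_000),
    ("home", "Raspberry Pi 3", 70_000),
    ("home", "Optiplex 960", 370_000),
]

TOTAL = sum(rate for _, _, rate in MACHINES)


def share(rate: int) -> float:
    """Fraction of the aggregate hashrate contributed by one machine."""
    return rate / TOTAL


def by_provider() -> dict:
    """Sum hashrates per provider."""
    totals = defaultdict(int)
    for provider, _, rate in MACHINES:
        totals[provider] += rate
    return dict(totals)


if __name__ == "__main__":
    for provider, plan, rate in MACHINES:
        print(f"{provider:<13} {plan:<30} {rate:>9,} {share(rate):7.2%}")
    print(f"{'total':<44} {TOTAL:>9,}")
    for provider, rate in sorted(by_provider().items(), key=lambda kv: -kv[1]):
        print(f"{provider:<13} {rate:>9,} {share(rate):7.2%}")
```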
A quick statistical analysis
OK, so it took a long time. I accumulated far more in server expenses than I had originally planned on, but at least I got a sense of pride and accomplishment from it.
The search for a hash prefix of tbrindus is probabilistic and memoryless: you never get "closer" to mining a hash; every hash has an equal probability $\frac 1 {32^{\text{length(prefix)}}} = \frac 1 {32^8}$ of matching. Since it's essentially a Poisson process, we can use an exponential distribution to estimate how long it takes, on average, for a match to be found.
The CDF of an exponential distribution has the form $1 - e^{-\lambda x}$.
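For anyone wondering where the exponential form comes from, here is a short sketch of the reasoning. The symbols $p$ (the per-hash match probability, $\frac 1 {32^8}$) and $r$ (the number of hashes computed per day) are introduced here for illustration only.

```latex
\begin{align*}
P(\text{no match in } n \text{ hashes}) &= (1 - p)^n \approx e^{-pn}
  \qquad (p \ll 1), \\
P(\text{match within } x \text{ days}) &= 1 - (1 - p)^{rx}
  \approx 1 - e^{-prx} = 1 - e^{-\lambda x},
  \qquad \lambda = pr .
\end{align*}
```

In other words, $\lambda$ is the expected number of matches per day, and the approximation is extremely tight for a $p$ this small.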
We can perform 4,820,400 hashes per second, there are 86,400 seconds in a day, and each hash has a probability of $\frac 1 {32^8}$ of matching. We can therefore determine the probability that we'll find a match within $x$ days (let's call it $f(x)$ for simplicity) by taking $\lambda = \frac{86\,400 \times 4\,820\,400}{32^8} \approx 0.379$ per day, so that $f(x) = 1 - e^{-\lambda x}$.
Since I like graphs, let's graph this function.
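As a plain-text stand-in for the plot, this minimal sketch tabulates $f(x)$ for the first ten days using the aggregate hashrate from the table. The constant and function names are my own.

```python
import math

HASHES_PER_SECOND = 4_820_400   # aggregate rate from the table
SECONDS_PER_DAY = 86_400
SEARCH_SPACE = 32 ** 8          # 8-character base32 prefix

# Expected number of matches per day
LAMBDA = HASHES_PER_SECOND * SECONDS_PER_DAY / SEARCH_SPACE


def f(days: float) -> float:
    """Probability of having found at least one match within `days` days."""
    return 1.0 - math.exp(-LAMBDA * days)


def median_days() -> float:
    """Time by which a match has been found with probability 1/2."""
    return math.log(2) / LAMBDA


if __name__ == "__main__":
    print(f"lambda = {LAMBDA:.4f} per day, median = {median_days():.2f} days")
    for days in range(1, 11):
        bar = "#" * round(50 * f(days))
        print(f"{days:>2} d  {f(days):6.1%}  {bar}")
```

Under this model the curve rises quickly: $f(6)$ is roughly 0.9, meaning about one run in ten would still be empty-handed after six days at this hashrate.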
The expected value of an exponential distribution is given by $\frac 1 \lambda$, so we can take this and plug in our $\lambda$ to find the expected number of days for generating a prefix of 8 characters: $\frac 1 \lambda = \frac{32^8}{86\,400 \times 4\,820\,400} \approx 2.64$ days.
Alright, so I definitely overshot that.
Bonus: UnixBench of the servers
Since I had all these servers up and running already, I figured it'd be interesting to compare UnixBench scores to see how they correlated to hashrate. In the table below, I've included the hashrate of several servers I was particularly interested in, as well as their single core and multi-core performance determined by running UnixBench on an unloaded system.
Host | Plan | OS | Hashes/s | Num. Cores | Single core perf. | Multi-core perf. |
---|---|---|---|---|---|---|
Scaleway | ARM64-16GB | Debian 9.0 | 1,300,000 | 16 | 401.2 | 1641.6 |
Scaleway | ARM64-8GB | Ubuntu 16.04 LTS | 626,000 | 8 | 380.5 | 1514.1 |
Scaleway | ARM64-2GB | Ubuntu 16.04 LTS | 314,000 | 4 | 400.9 | 1020.3 |
Scaleway | C1 | Debian 9.0 | 113,500 | 2 | 621.0 | 1047.7 |
Azure | Standard B2s | Ubuntu 16.04 | 68,000 | 2 | 472.2 | 340.0 |
SSDNodes | 16GB KVM | Debian 9.3 | 540,000 | 4 | 472.3 | 1363.2 |
SSDNodes | 8GB KVM | Debian 9.3 | 274,500 | 2 | 616.8 | 1382.8 |
I've also attached the raw UnixBench logs below, for convenience.
Scaleway — ARM64-16GB
========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)
System: redacted: GNU/Linux
OS: GNU/Linux -- 4.9.23-std-1 -- #1 SMP Mon Apr 24 13:18:14 UTC 2017
Machine: aarch64 (unknown)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
05:13:38 up 3 days, 1:08, 1 user, load average: 11.74, 15.14, 15.76; runlevel 2018-03-15
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:13:38 - 05:41:33
16 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 8372406.5 lps (10.0 s, 7 samples)
Double-Precision Whetstone 1825.0 MWIPS (9.9 s, 7 samples)
Execl Throughput 1014.4 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 181638.7 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 51750.8 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 422317.9 KBps (30.0 s, 2 samples)
Pipe Throughput 476739.6 lps (10.0 s, 7 samples)
Pipe-based Context Switching 29308.4 lps (10.0 s, 7 samples)
Process Creation 2046.2 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 2597.0 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 1107.5 lpm (60.0 s, 2 samples)
System Call Overhead 863802.9 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 8372406.5 717.4
Double-Precision Whetstone 55.0 1825.0 331.8
Execl Throughput 43.0 1014.4 235.9
File Copy 1024 bufsize 2000 maxblocks 3960.0 181638.7 458.7
File Copy 256 bufsize 500 maxblocks 1655.0 51750.8 312.7
File Copy 4096 bufsize 8000 maxblocks 5800.0 422317.9 728.1
Pipe Throughput 12440.0 476739.6 383.2
Pipe-based Context Switching 4000.0 29308.4 73.3
Process Creation 126.0 2046.2 162.4
Shell Scripts (1 concurrent) 42.4 2597.0 612.5
Shell Scripts (8 concurrent) 6.0 1107.5 1845.8
System Call Overhead 15000.0 863802.9 575.9
========
System Benchmarks Index Score 401.2
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:41:33 - 06:09:37
16 CPUs in system; running 16 parallel copies of tests
Dhrystone 2 using register variables 132993486.5 lps (10.0 s, 7 samples)
Double-Precision Whetstone 29057.8 MWIPS (10.0 s, 7 samples)
Execl Throughput 7995.3 lps (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 137360.5 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 29373.3 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 630759.5 KBps (30.0 s, 2 samples)
Pipe Throughput 7424668.0 lps (10.0 s, 7 samples)
Pipe-based Context Switching 401144.7 lps (10.0 s, 7 samples)
Process Creation 10546.1 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 15213.0 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 2003.4 lpm (60.2 s, 2 samples)
System Call Overhead 1277419.8 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 132993486.5 11396.2
Double-Precision Whetstone 55.0 29057.8 5283.2
Execl Throughput 43.0 7995.3 1859.4
File Copy 1024 bufsize 2000 maxblocks 3960.0 137360.5 346.9
File Copy 256 bufsize 500 maxblocks 1655.0 29373.3 177.5
File Copy 4096 bufsize 8000 maxblocks 5800.0 630759.5 1087.5
Pipe Throughput 12440.0 7424668.0 5968.4
Pipe-based Context Switching 4000.0 401144.7 1002.9
Process Creation 126.0 10546.1 837.0
Shell Scripts (1 concurrent) 42.4 15213.0 3588.0
Shell Scripts (8 concurrent) 6.0 2003.4 3339.0
System Call Overhead 15000.0 1277419.8 851.6
========
System Benchmarks Index Score 1641.6
Scaleway — ARM64-8GB
BYTE UNIX Benchmarks (Version 5.1.3)
System: redacted: GNU/Linux
OS: GNU/Linux -- 4.4.121-mainline-rev1 -- #1 SMP Sun Mar 11 16:44:34 UTC 2018
Machine: aarch64 (aarch64)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
05:13:17 up 2 days, 53 min, 1 user, load average: 5.56, 7.47, 7.82; runlevel 2018-03-16
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:13:17 - 05:41:24
8 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 8502417.0 lps (10.0 s, 7 samples)
Double-Precision Whetstone 1741.0 MWIPS (10.1 s, 7 samples)
Execl Throughput 1112.8 lps (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 165427.5 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 54377.8 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 343939.2 KBps (30.0 s, 2 samples)
Pipe Throughput 462211.7 lps (10.0 s, 7 samples)
Pipe-based Context Switching 14746.0 lps (10.0 s, 7 samples)
Process Creation 2370.8 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 2677.5 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 1050.2 lpm (60.0 s, 2 samples)
System Call Overhead 998124.5 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 8502417.0 728.6
Double-Precision Whetstone 55.0 1741.0 316.6
Execl Throughput 43.0 1112.8 258.8
File Copy 1024 bufsize 2000 maxblocks 3960.0 165427.5 417.7
File Copy 256 bufsize 500 maxblocks 1655.0 54377.8 328.6
File Copy 4096 bufsize 8000 maxblocks 5800.0 343939.2 593.0
Pipe Throughput 12440.0 462211.7 371.6
Pipe-based Context Switching 4000.0 14746.0 36.9
Process Creation 126.0 2370.8 188.2
Shell Scripts (1 concurrent) 42.4 2677.5 631.5
Shell Scripts (8 concurrent) 6.0 1050.2 1750.4
System Call Overhead 15000.0 998124.5 665.4
========
System Benchmarks Index Score 380.5
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:41:24 - 06:09:38
8 CPUs in system; running 8 parallel copies of tests
Dhrystone 2 using register variables 67785992.9 lps (10.0 s, 7 samples)
Double-Precision Whetstone 13990.1 MWIPS (10.1 s, 7 samples)
Execl Throughput 5098.5 lps (29.8 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 285233.4 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 73046.0 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 1005166.1 KBps (30.0 s, 2 samples)
Pipe Throughput 3663311.5 lps (10.0 s, 7 samples)
Pipe-based Context Switching 222918.1 lps (10.0 s, 7 samples)
Process Creation 8125.0 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 10717.2 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 1391.7 lpm (60.2 s, 2 samples)
System Call Overhead 3636949.3 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 67785992.9 5808.6
Double-Precision Whetstone 55.0 13990.1 2543.7
Execl Throughput 43.0 5098.5 1185.7
File Copy 1024 bufsize 2000 maxblocks 3960.0 285233.4 720.3
File Copy 256 bufsize 500 maxblocks 1655.0 73046.0 441.4
File Copy 4096 bufsize 8000 maxblocks 5800.0 1005166.1 1733.0
Pipe Throughput 12440.0 3663311.5 2944.8
Pipe-based Context Switching 4000.0 222918.1 557.3
Process Creation 126.0 8125.0 644.8
Shell Scripts (1 concurrent) 42.4 10717.2 2527.6
Shell Scripts (8 concurrent) 6.0 1391.7 2319.6
System Call Overhead 15000.0 3636949.3 2424.6
========
System Benchmarks Index Score 1514.1
Scaleway — ARM64-2GB
========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)
System: redacted: GNU/Linux
OS: GNU/Linux -- 4.4.121-mainline-rev1 -- #1 SMP Sun Mar 11 16:44:34 UTC 2018
Machine: aarch64 (aarch64)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
05:14:10 up 3 days, 7:45, 1 user, load average: 2.75, 3.74, 3.91; runlevel 2018-03-14
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:14:10 - 05:42:12
4 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 8555429.5 lps (10.0 s, 7 samples)
Double-Precision Whetstone 1747.9 MWIPS (10.1 s, 7 samples)
Execl Throughput 1224.4 lps (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 184524.9 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 58246.7 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 438788.5 KBps (30.0 s, 2 samples)
Pipe Throughput 465226.2 lps (10.0 s, 7 samples)
Pipe-based Context Switching 14792.3 lps (10.0 s, 7 samples)
Process Creation 2629.9 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 3095.2 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 884.2 lpm (60.0 s, 2 samples)
System Call Overhead 1011139.0 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 8555429.5 733.1
Double-Precision Whetstone 55.0 1747.9 317.8
Execl Throughput 43.0 1224.4 284.7
File Copy 1024 bufsize 2000 maxblocks 3960.0 184524.9 466.0
File Copy 256 bufsize 500 maxblocks 1655.0 58246.7 351.9
File Copy 4096 bufsize 8000 maxblocks 5800.0 438788.5 756.5
Pipe Throughput 12440.0 465226.2 374.0
Pipe-based Context Switching 4000.0 14792.3 37.0
Process Creation 126.0 2629.9 208.7
Shell Scripts (1 concurrent) 42.4 3095.2 730.0
Shell Scripts (8 concurrent) 6.0 884.2 1473.6
System Call Overhead 15000.0 1011139.0 674.1
========
System Benchmarks Index Score 400.9
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:42:12 - 06:10:18
4 CPUs in system; running 4 parallel copies of tests
Dhrystone 2 using register variables 34136207.1 lps (10.0 s, 7 samples)
Double-Precision Whetstone 6989.1 MWIPS (10.2 s, 7 samples)
Execl Throughput 3526.3 lps (29.6 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 218968.8 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 61412.5 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 830973.8 KBps (30.0 s, 2 samples)
Pipe Throughput 1848545.0 lps (10.0 s, 7 samples)
Pipe-based Context Switching 121851.3 lps (10.0 s, 7 samples)
Process Creation 6271.4 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 7046.2 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 955.8 lpm (60.1 s, 2 samples)
System Call Overhead 3570647.2 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 34136207.1 2925.1
Double-Precision Whetstone 55.0 6989.1 1270.7
Execl Throughput 43.0 3526.3 820.1
File Copy 1024 bufsize 2000 maxblocks 3960.0 218968.8 553.0
File Copy 256 bufsize 500 maxblocks 1655.0 61412.5 371.1
File Copy 4096 bufsize 8000 maxblocks 5800.0 830973.8 1432.7
Pipe Throughput 12440.0 1848545.0 1486.0
Pipe-based Context Switching 4000.0 121851.3 304.6
Process Creation 126.0 6271.4 497.7
Shell Scripts (1 concurrent) 42.4 7046.2 1661.8
Shell Scripts (8 concurrent) 6.0 955.8 1593.0
System Call Overhead 15000.0 3570647.2 2380.4
========
System Benchmarks Index Score 1020.3
Scaleway — C1
========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)
System: redacted: GNU/Linux
OS: GNU/Linux -- 4.9.20-std-1 -- #1 SMP Tue Apr 4 12:56:17 UTC 2017
Machine: x86_64 (unknown)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Atom(TM) CPU C2750 @ 2.40GHz (4787.8 bogomips)
x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 1: Intel(R) Atom(TM) CPU C2750 @ 2.40GHz (4787.8 bogomips)
x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
05:14:11 up 3 days, 1:28, 1 user, load average: 2.01, 2.14, 2.06; runlevel 2018-03-15
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:14:12 - 05:42:08
2 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 12323865.3 lps (10.0 s, 7 samples)
Double-Precision Whetstone 2014.1 MWIPS (9.9 s, 7 samples)
Execl Throughput 1223.1 lps (29.8 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 415672.5 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 120361.9 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 985611.5 KBps (30.0 s, 2 samples)
Pipe Throughput 1170708.3 lps (10.0 s, 7 samples)
Pipe-based Context Switching 46541.0 lps (10.0 s, 7 samples)
Process Creation 3049.4 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 3348.8 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 685.8 lpm (60.1 s, 2 samples)
System Call Overhead 1446516.0 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 12323865.3 1056.0
Double-Precision Whetstone 55.0 2014.1 366.2
Execl Throughput 43.0 1223.1 284.4
File Copy 1024 bufsize 2000 maxblocks 3960.0 415672.5 1049.7
File Copy 256 bufsize 500 maxblocks 1655.0 120361.9 727.3
File Copy 4096 bufsize 8000 maxblocks 5800.0 985611.5 1699.3
Pipe Throughput 12440.0 1170708.3 941.1
Pipe-based Context Switching 4000.0 46541.0 116.4
Process Creation 126.0 3049.4 242.0
Shell Scripts (1 concurrent) 42.4 3348.8 789.8
Shell Scripts (8 concurrent) 6.0 685.8 1142.9
System Call Overhead 15000.0 1446516.0 964.3
========
System Benchmarks Index Score 621.0
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:42:08 - 06:10:06
2 CPUs in system; running 2 parallel copies of tests
Dhrystone 2 using register variables 24552470.3 lps (10.0 s, 7 samples)
Double-Precision Whetstone 4016.2 MWIPS (10.0 s, 7 samples)
Execl Throughput 2918.3 lps (29.8 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 485532.6 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 131304.9 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 1365028.4 KBps (30.0 s, 2 samples)
Pipe Throughput 2329059.7 lps (10.0 s, 7 samples)
Pipe-based Context Switching 116038.9 lps (10.0 s, 7 samples)
Process Creation 7104.8 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 5589.5 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 722.6 lpm (60.1 s, 2 samples)
System Call Overhead 2260798.6 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 24552470.3 2103.9
Double-Precision Whetstone 55.0 4016.2 730.2
Execl Throughput 43.0 2918.3 678.7
File Copy 1024 bufsize 2000 maxblocks 3960.0 485532.6 1226.1
File Copy 256 bufsize 500 maxblocks 1655.0 131304.9 793.4
File Copy 4096 bufsize 8000 maxblocks 5800.0 1365028.4 2353.5
Pipe Throughput 12440.0 2329059.7 1872.2
Pipe-based Context Switching 4000.0 116038.9 290.1
Process Creation 126.0 7104.8 563.9
Shell Scripts (1 concurrent) 42.4 5589.5 1318.3
Shell Scripts (8 concurrent) 6.0 722.6 1204.3
System Call Overhead 15000.0 2260798.6 1507.2
========
System Benchmarks Index Score 1047.7
Azure — Standard B2s
========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)
System: redacted: GNU/Linux
OS: GNU/Linux -- 4.13.0-1011-azure -- #14-Ubuntu SMP Thu Feb 15 16:15:39 UTC 2018
Machine: x86_64 (x86_64)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz (4589.4 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 1: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz (4589.4 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
05:22:33 up 6 days, 8:37, 1 user, load average: 0.08, 0.62, 1.38; runlevel 2018-03-11
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:22:33 - 05:50:38
2 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 28065805.5 lps (10.0 s, 7 samples)
Double-Precision Whetstone 3310.3 MWIPS (8.7 s, 7 samples)
Execl Throughput 2546.1 lps (29.8 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 257690.1 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 55889.7 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 535177.7 KBps (30.0 s, 2 samples)
Pipe Throughput 315663.7 lps (10.0 s, 7 samples)
Pipe-based Context Switching 25281.3 lps (10.0 s, 7 samples)
Process Creation 3911.9 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 2343.0 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 862.3 lpm (60.0 s, 2 samples)
System Call Overhead 268361.9 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 28065805.5 2405.0
Double-Precision Whetstone 55.0 3310.3 601.9
Execl Throughput 43.0 2546.1 592.1
File Copy 1024 bufsize 2000 maxblocks 3960.0 257690.1 650.7
File Copy 256 bufsize 500 maxblocks 1655.0 55889.7 337.7
File Copy 4096 bufsize 8000 maxblocks 5800.0 535177.7 922.7
Pipe Throughput 12440.0 315663.7 253.7
Pipe-based Context Switching 4000.0 25281.3 63.2
Process Creation 126.0 3911.9 310.5
Shell Scripts (1 concurrent) 42.4 2343.0 552.6
Shell Scripts (8 concurrent) 6.0 862.3 1437.2
System Call Overhead 15000.0 268361.9 178.9
========
System Benchmarks Index Score 472.2
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:50:38 - 06:18:55
2 CPUs in system; running 2 parallel copies of tests
Dhrystone 2 using register variables 12561408.5 lps (10.0 s, 7 samples)
Double-Precision Whetstone 1364.4 MWIPS (10.5 s, 7 samples)
Execl Throughput 1285.0 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 108284.8 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 29067.9 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 813617.5 KBps (30.0 s, 2 samples)
Pipe Throughput 195193.3 lps (10.0 s, 7 samples)
Pipe-based Context Switching 59307.3 lps (10.0 s, 7 samples)
Process Creation 2751.5 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 3681.4 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 322.3 lpm (60.1 s, 2 samples)
System Call Overhead 280762.9 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 12561408.5 1076.4
Double-Precision Whetstone 55.0 1364.4 248.1
Execl Throughput 43.0 1285.0 298.8
File Copy 1024 bufsize 2000 maxblocks 3960.0 108284.8 273.4
File Copy 256 bufsize 500 maxblocks 1655.0 29067.9 175.6
File Copy 4096 bufsize 8000 maxblocks 5800.0 813617.5 1402.8
Pipe Throughput 12440.0 195193.3 156.9
Pipe-based Context Switching 4000.0 59307.3 148.3
Process Creation 126.0 2751.5 218.4
Shell Scripts (1 concurrent) 42.4 3681.4 868.3
Shell Scripts (8 concurrent) 6.0 322.3 537.1
System Call Overhead 15000.0 280762.9 187.2
========
System Benchmarks Index Score 340.0
SSDNodes — KVM 16GB
========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)
System: redacted: GNU/Linux
OS: GNU/Linux -- 4.9.0-5-amd64 -- #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04)
Machine: x86_64 (unknown)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel Core Processor (Skylake, IBRS) (4600.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 1: Intel Core Processor (Skylake, IBRS) (4600.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 2: Intel Core Processor (Skylake, IBRS) (4600.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 3: Intel Core Processor (Skylake, IBRS) (4600.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
05:32:38 up 24 days, 9:44, 2 users, load average: 0.86, 0.95, 2.01; runlevel 2018-02-21
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:32:39 - 06:00:50
4 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 18638854.5 lps (10.0 s, 7 samples)
Double-Precision Whetstone 3603.5 MWIPS (9.3 s, 7 samples)
Execl Throughput 543.0 lps (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 326203.0 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 107831.8 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 782124.6 KBps (30.0 s, 2 samples)
Pipe Throughput 772372.4 lps (10.0 s, 7 samples)
Pipe-based Context Switching 27040.7 lps (10.0 s, 7 samples)
Process Creation 1912.9 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 1867.0 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 685.7 lpm (60.1 s, 2 samples)
System Call Overhead 603214.3 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 18638854.5 1597.2
Double-Precision Whetstone 55.0 3603.5 655.2
Execl Throughput 43.0 543.0 126.3
File Copy 1024 bufsize 2000 maxblocks 3960.0 326203.0 823.7
File Copy 256 bufsize 500 maxblocks 1655.0 107831.8 651.6
File Copy 4096 bufsize 8000 maxblocks 5800.0 782124.6 1348.5
Pipe Throughput 12440.0 772372.4 620.9
Pipe-based Context Switching 4000.0 27040.7 67.6
Process Creation 126.0 1912.9 151.8
Shell Scripts (1 concurrent) 42.4 1867.0 440.3
Shell Scripts (8 concurrent) 6.0 685.7 1142.9
System Call Overhead 15000.0 603214.3 402.1
========
System Benchmarks Index Score 472.3
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 06:00:50 - 06:29:14
4 CPUs in system; running 4 parallel copies of tests
Dhrystone 2 using register variables 63227839.0 lps (10.0 s, 7 samples)
Double-Precision Whetstone 14671.5 MWIPS (9.4 s, 7 samples)
Execl Throughput 4394.5 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 347374.8 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 109273.0 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 830966.2 KBps (30.0 s, 2 samples)
Pipe Throughput 2702931.3 lps (10.0 s, 7 samples)
Pipe-based Context Switching 307088.8 lps (10.0 s, 7 samples)
Process Creation 4009.3 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 6331.9 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 1825.1 lpm (60.1 s, 2 samples)
System Call Overhead 2090415.5 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 63227839.0 5418.0
Double-Precision Whetstone 55.0 14671.5 2667.5
Execl Throughput 43.0 4394.5 1022.0
File Copy 1024 bufsize 2000 maxblocks 3960.0 347374.8 877.2
File Copy 256 bufsize 500 maxblocks 1655.0 109273.0 660.3
File Copy 4096 bufsize 8000 maxblocks 5800.0 830966.2 1432.7
Pipe Throughput 12440.0 2702931.3 2172.8
Pipe-based Context Switching 4000.0 307088.8 767.7
Process Creation 126.0 4009.3 318.2
Shell Scripts (1 concurrent) 42.4 6331.9 1493.4
Shell Scripts (8 concurrent) 6.0 1825.1 3041.8
System Call Overhead 15000.0 2090415.5 1393.6
========
System Benchmarks Index Score 1363.2
SSDNodes — KVM 8GB
========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)
System: redacted: GNU/Linux
OS: GNU/Linux -- 4.9.0-5-amd64 -- #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04)
Machine: x86_64 (unknown)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel Core Processor (Skylake, IBRS) (4600.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 1: Intel Core Processor (Skylake, IBRS) (4600.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
05:27:18 up 24 days, 9:39, 2 users, load average: 1.83, 2.76, 2.63; runlevel 2018-02-21
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:27:18 - 05:55:29
2 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 20712375.2 lps (10.0 s, 7 samples)
Double-Precision Whetstone 4089.5 MWIPS (10.0 s, 7 samples)
Execl Throughput 869.8 lps (29.6 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 414717.4 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 118528.4 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 1037781.8 KBps (30.0 s, 2 samples)
Pipe Throughput 839599.1 lps (10.0 s, 7 samples)
Pipe-based Context Switching 39673.2 lps (10.0 s, 7 samples)
Process Creation 2367.3 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 3917.3 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 1015.1 lpm (60.0 s, 2 samples)
System Call Overhead 646058.8 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 20712375.2 1774.8
Double-Precision Whetstone 55.0 4089.5 743.5
Execl Throughput 43.0 869.8 202.3
File Copy 1024 bufsize 2000 maxblocks 3960.0 414717.4 1047.3
File Copy 256 bufsize 500 maxblocks 1655.0 118528.4 716.2
File Copy 4096 bufsize 8000 maxblocks 5800.0 1037781.8 1789.3
Pipe Throughput 12440.0 839599.1 674.9
Pipe-based Context Switching 4000.0 39673.2 99.2
Process Creation 126.0 2367.3 187.9
Shell Scripts (1 concurrent) 42.4 3917.3 923.9
Shell Scripts (8 concurrent) 6.0 1015.1 1691.8
System Call Overhead 15000.0 646058.8 430.7
========
System Benchmarks Index Score 616.8
------------------------------------------------------------------------
Benchmark Run: Sun Mar 18 2018 05:55:29 - 06:23:42
2 CPUs in system; running 2 parallel copies of tests
Dhrystone 2 using register variables 38935462.3 lps (10.0 s, 7 samples)
Double-Precision Whetstone 8156.1 MWIPS (10.0 s, 7 samples)
Execl Throughput 4726.3 lps (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 692577.9 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 203840.1 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 1799195.8 KBps (30.0 s, 2 samples)
Pipe Throughput 1621602.5 lps (10.0 s, 7 samples)
Pipe-based Context Switching 211656.3 lps (10.0 s, 7 samples)
Process Creation 9135.5 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 7138.5 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 1195.9 lpm (60.1 s, 2 samples)
System Call Overhead 1202392.1 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 38935462.3 3336.4
Double-Precision Whetstone 55.0 8156.1 1482.9
Execl Throughput 43.0 4726.3 1099.1
File Copy 1024 bufsize 2000 maxblocks 3960.0 692577.9 1748.9
File Copy 256 bufsize 500 maxblocks 1655.0 203840.1 1231.7
File Copy 4096 bufsize 8000 maxblocks 5800.0 1799195.8 3102.1
Pipe Throughput 12440.0 1621602.5 1303.5
Pipe-based Context Switching 4000.0 211656.3 529.1
Process Creation 126.0 9135.5 725.0
Shell Scripts (1 concurrent) 42.4 7138.5 1683.6
Shell Scripts (8 concurrent) 6.0 1195.9 1993.2
System Call Overhead 15000.0 1202392.1 801.6
========
System Benchmarks Index Score 1382.8
These benchmarks should be taken with a grain of salt, since UnixBench tests a fair bit more than just CPU throughput. However, what appears to be fairly clear is that though the ARMv8 cores are 20-30% slower than the mixture of competing x86_64 cores in a contest of single core performance, they win out in multi-core hashrate simply due to their number.
I suppose this isn't really a thrilling discovery — it makes immediate sense — but I found it fairly interesting that it's cheaper to scale out in number of cores rather than up in per-core performance… at least when it comes to mining vanity Tor domains.
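One way to look at the scale-out point is hashrate per core. The sketch below divides each benchmarked server's hashrate by its core count, taking the counts from the "N CPUs in system" lines of the UnixBench logs. It is a rough comparison only, since it ignores CPU steal time and whatever else was running on each box.

```python
# (server, hashes per second, cores reported by UnixBench)
SERVERS = [
    ("Scaleway ARM64-16GB", 1_300_000, 16),
    ("Scaleway ARM64-8GB", 626_000, 8),
    ("Scaleway ARM64-2GB", 314_000, 4),
    ("Scaleway C1", 113_500, 2),
    ("Azure Standard B2s", 68_000, 2),
    ("SSDNodes 16GB KVM", 540_000, 4),
    ("SSDNodes 8GB KVM", 274_500, 2),
]


def per_core(rate: int, cores: int) -> float:
    """Average hashes per second contributed by each core."""
    return rate / cores


def ranked_by_core():
    """Servers sorted from fastest to slowest per-core hashrate."""
    return sorted(SERVERS, key=lambda s: per_core(s[1], s[2]), reverse=True)


if __name__ == "__main__":
    for name, rate, cores in ranked_by_core():
        print(f"{name:<22} {cores:>2} cores  {rate:>9,} H/s  "
              f"{per_core(rate, cores):>9,.0f} H/s per core")
```

By this measure the Skylake cores hash fastest individually, yet the 16-core ARM box still posts the highest total, which is the scale-out effect described above.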
Conclusion
Overall, this was a larger undertaking than I would have assumed at first, and I spent a long time monitoring (nonexistent) progress. In the end, it was fun to do, so hopefully it was fun to read about too!