Benchmarking GeoIP gems for Ruby
On a recent project, I once again needed GeoIP (by the lovely people at MaxMind). I decided to take some time to benchmark a handful of GeoIP gems, both for GeoIP v1 (legacy) and the newer GeoIP v2.
I decided to benchmark the extraction of 3 pieces of data, using the GeoIP Lite City database:
- 2-character ISO country code (eg: US)
- Full region name (eg: Colorado)
- Timezone (eg: America/Denver)
Entries are labeled according to the installed gem name. Gems with “1” use the v1 GeoIP database. Those with “2” use the v2 database.
Gems labeled with “rb” are pure ruby. Those with “c” have C extensions.
3 of the tested gems do not return a timezone. Those 3 libraries turned out to be the 3 fastest, strongly suggesting that retrieving a more complete set of data from the GeoIP dataset does come with increased overhead. These 3 gems are labeled with “-tz” in the results below.
Lastly, the geoip gem has the option of preloading the entire GeoIP database into memory. It was benchmarked without and with preloading.
Without further ado, here are the results, based on 100,000 iterations (ordered by real time):
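For readers who want to reproduce this kind of measurement, here is a minimal sketch of a harness built on Ruby's standard Benchmark module, which prints the same user/system/total/real columns as the results below. It is not the published benchmark code: the lookups are hypothetical stubs backed by a made-up in-memory hash, and the sample address is an arbitrary documentation IP. In a real run, each lambda would wrap one gem's own lookup call.

```ruby
require "benchmark"

ITERATIONS = 100_000
SAMPLE_IP  = "192.0.2.1" # documentation-range address, chosen only for illustration

# Hypothetical stand-in for a GeoIP database; the record below is made-up data.
FAKE_DB = {
  SAMPLE_IP => { country: "US", region: "Colorado", timezone: "America/Denver" }
}.freeze

# Each entry maps a report label to a callable that performs one lookup.
# Replace these stubs with calls into the gem under test.
LOOKUPS = {
  "stub (full)" => ->(ip) { r = FAKE_DB.fetch(ip); [r[:country], r[:region], r[:timezone]] },
  "stub -tz"    => ->(ip) { r = FAKE_DB.fetch(ip); [r[:country], r[:region]] }
}.freeze

# Runs every lookup `iterations` times and prints one timing row per label.
# Returns the array of Benchmark::Tms objects, one per report.
def run_benchmarks(lookups, iterations: ITERATIONS, ip: SAMPLE_IP)
  width = lookups.keys.map(&:length).max
  Benchmark.bm(width) do |x|
    lookups.each do |label, lookup|
      x.report(label) { iterations.times { lookup.call(ip) } }
    end
  end
end

results = run_benchmarks(LOOKUPS)
```

Keeping database setup outside the timed block means only the lookups themselves are measured, which matters when comparing gems that differ in how they load the database.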
                              user     system      total        real
geoip-c (1;c) -tz         0.520000   0.010000   0.530000 (  0.533592)
geoip2_compat (2;c) -tz   0.520000   0.020000   0.540000 (  0.534711)
mmdb (2;c) -tz            0.700000   0.010000   0.710000 (  0.725789)
hive_geoip2 (2;c)         3.470000   0.010000   3.480000 (  3.495051)
geoip (1;rb;preload)      3.580000   0.000000   3.580000 (  3.583469)
geoip (1;rb)              4.780000   1.740000   6.520000 (  6.542248)
maxmind_geoip2 (2;c)     12.420000   4.670000  17.090000 ( 17.092035)
maxminddb (2;rb)         27.730000   0.030000  27.760000 ( 27.781552)
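To put the totals in per-lookup terms, each real time can be divided by the 100,000 iterations. The sketch below does that arithmetic on the numbers from the table. The constant and method names are just illustrative. The fastest gem works out to roughly 5.3 microseconds per lookup and the slowest to roughly 278 microseconds.

```ruby
TOTAL_LOOKUPS = 100_000

# Real (wall-clock) seconds copied from the results table.
REAL_SECONDS = {
  "geoip-c (1;c) -tz"       => 0.533592,
  "geoip2_compat (2;c) -tz" => 0.534711,
  "mmdb (2;c) -tz"          => 0.725789,
  "hive_geoip2 (2;c)"       => 3.495051,
  "geoip (1;rb;preload)"    => 3.583469,
  "geoip (1;rb)"            => 6.542248,
  "maxmind_geoip2 (2;c)"    => 17.092035,
  "maxminddb (2;rb)"        => 27.781552
}.freeze

# Average time for a single lookup, in microseconds.
def microseconds_per_lookup(seconds, lookups = TOTAL_LOOKUPS)
  seconds / lookups * 1_000_000
end

# Average throughput, in lookups per second.
def lookups_per_second(seconds, lookups = TOTAL_LOOKUPS)
  lookups / seconds
end

REAL_SECONDS.each do |label, seconds|
  puts format("%-24s %8.1f us/lookup %10.0f lookups/s",
              label, microseconds_per_lookup(seconds), lookups_per_second(seconds))
end
```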
As noted earlier, the 3 fastest libraries (geoip2_compat, geoip-c, and mmdb) all have C extensions and all skip timezone loading.
The next fastest, and also the overall fastest to return a full response from the database, is hive_geoip2. The API for this gem is rather basic, but it’s arguably a very worthwhile tradeoff given its high performance.
Right behind it is the preload-flavor of the pure Ruby geoip gem, which is quite impressive. Equally impressive is that the non-preload flavor of the same gem comes in right behind, taking only about twice as long to run.
maxmind_geoip2 is just slow. Digging into the code, it appears to open and close the database file descriptor on every lookup, resulting in a substantial performance penalty.
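To illustrate why reopening a file on every lookup is costly, here is a rough sketch that compares the two access patterns on a throwaway temp file. This is not the gem's actual code. The fixed-width record layout, file contents, and helper names are all invented for the demonstration, and the measured gap will vary by operating system and filesystem.

```ruby
require "benchmark"
require "tempfile"

RECORD_SIZE  = 16      # bytes per made-up record
RECORD_COUNT = 1_000
LOOKUP_COUNT = 20_000

# Build a small fake "database" of fixed-width records.
# The Tempfile is removed automatically when the process exits.
data_file = Tempfile.new("fake-geo-db")
data_file.binmode
RECORD_COUNT.times { |i| data_file.write(format("%016d", i)) }
data_file.flush
path = data_file.path

# Reads one record from an already-open IO handle.
def read_record(io, index)
  io.seek(index * RECORD_SIZE)
  io.read(RECORD_SIZE)
end

# Opens and closes the file around every single lookup.
def lookup_reopening(path, index)
  File.open(path, "rb") { |io| read_record(io, index) }
end

Benchmark.bm(18) do |x|
  x.report("reopen per lookup") do
    LOOKUP_COUNT.times { |i| lookup_reopening(path, i % RECORD_COUNT) }
  end
  x.report("shared handle") do
    File.open(path, "rb") do |io|
      LOOKUP_COUNT.times { |i| read_record(io, i % RECORD_COUNT) }
    end
  end
end
```

Both variants return identical records. The only difference is how many open and close system calls are made, which is also where the extra system time would be expected to show up.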
maxminddb has a friendly API to work with, but I wonder if that’s perhaps the cause of its slower performance. That said, unless you’re processing a rather high rate of GeoIP lookups, this one is still worth a look.
If you want a pure Ruby solution, geoip for GeoIP v1 and maxminddb for GeoIP v2 are both viable for modest volumes of lookups.
For higher volumes, definitely use a gem with a C extension. If you need timezones, then hive_geoip2 is your best bet. If not, then you have several choices. Do remember that you’ll also need the underlying C library and headers installed before installing the gem. That’ll be libgeoip for v1 and libmaxminddb for v2.
I’ve published the benchmark code if you’re interested. There are a couple of gem namespace conflicts. To run the benchmarks, set PASS==1 first and install the matching gems. Then set PASS==2 and install those gems, being careful to uninstall geoip. If you want to return to pass 1, uninstall geoip-c.