Benchmarking Perl - normal taint support vs. NO_TAINT_SUPPORT

Title: Benchmarking Perl - normal taint support vs. NO_TAINT_SUPPORT
Author: Steffen Schwigon (renormalist, Dresden Perl Mongers)
Date: 2021-09-16
Version: 1

Introduction

In this evaluation we compare the performance of a "normal" Perl, which includes taint support, to a Perl built with -DNO_TAINT_SUPPORT/-DSILENT_NO_TAINT_SUPPORT.
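
The queries later in this notebook separate the two builds by the field perlconfig_derived_notaintsupport ('1' or '0'). How Perl::Formance derives that field is not shown here; the following is only a sketch of one plausible derivation, assuming the value comes from the compiler flags as printed by `perl -V:ccflags`. The function name derive_notaintsupport is hypothetical.

```python
import re

def derive_notaintsupport(ccflags):
    """Return '1' if the flags define (SILENT_)NO_TAINT_SUPPORT, else '0'.

    Assumption: the field is derived from the ccflags string; the real
    Perl::Formance logic may differ.
    """
    pattern = r'(?:^|\s)-D(?:SILENT_)?NO_TAINT_SUPPORT(?:=\S*)?(?=\s|$)'
    return '1' if re.search(pattern, ccflags) else '0'

# Hypothetical flag strings, for illustration only
print(derive_notaintsupport('-fwrapv -O2 -DNO_TAINT_SUPPORT'))
print(derive_notaintsupport('-fwrapv -O2'))
```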

The approach is described here.

The toolchain and philosophy are those of Perl::Formance [1], [2]. The results are stored in a BenchmarkAnything database, and the evaluation is done in Jupyter with the BenchmarkAnything support libraries.

Executive Summary

With very few exceptions, Perl built with -DNO_TAINT_SUPPORT is faster than Perl with taint support, by about 1% to 5% for average real-world code. More specialized benchmarks show more extreme results:

  • Some algorithmic micro benchmarks are barely affected or even slower.
  • Some micro benchmarks of particular Perl features are up to 20% faster.
  • The complex text-processing benchmark SpamAssassin is roughly 10% to 15% faster ([1]).

Footnote:

  • [1] Obviously, in real life SpamAssassin would use taint mode for good reason.

Prepare data

In [1]:
import os
import sys
import pprint  

from IPython.display import display, Markdown

module_path = os.path.abspath(os.path.join('../lib'))
if module_path not in sys.path:
    sys.path.append(module_path)
from benchmarkanything import Benchmark, Query, Spec
from benchmarkvis import MatplotlibPlotter, BokehPlotter

pp = pprint.PrettyPrinter(indent=4)  # pretty printer
In [2]:
QA = 'http://qa:7360/api/v1/search'
HW = {'ss5z' : 'i7-10610U CPU @ 1.80GHz'}
QUALID = 'notaint-2021-a'
HOST = 'ss5z'
NOTAINT = '1'
WITHTAINT = '0'
LIMIT = 10000
STATS_SPEC = ['min', 'max', 'mean', 'stddev', 'ci_95l', 'ci_95u']

METRICS = [
    # start with fastest
    "perlformance.perl5.SpamAssassin.salearn.ham",
    "perlformance.perl5.SpamAssassin.salearn.ham2",
    "perlformance.perl5.SpamAssassin.salearn.spam",
    "perlformance.perl5.SpamAssassin.salearn.spam2",
    "perlformance.perl5.PerlStone2015.binarytrees",
    "perlformance.perl5.PerlStone2015.mandelbrot",
    "perlformance.perl5.Mandelbrot.withthreads",
    "perlformance.perl5.Mandelbrot.withmce",
    "perlformance.perl5.PerlStone2015.fib",
    "perlformance.perl5.Fib",
    "perlformance.perl5.FibMoose",
    "perlformance.perl5.FibMouse",
    "perlformance.perl5.FibOO",
    "perlformance.perl5.FibOOSig",
    "perlformance.perl5.AccessorsHash.set",
    "perlformance.perl5.AccessorsArray.set",
    "perlformance.perl5.AccessorsMoose.set",
    "perlformance.perl5.AccessorsMouse.set",
    "perlformance.perl5.AccessorsClassAccessor.set",
    "perlformance.perl5.DPath.dpath",
    "perlformance.perl5.PerlStone2015.fannkuch",
    "perlformance.perl5.PerlStone2015.fasta",
    "perlformance.perl5.PerlStone2015.regexdna",
    "perlformance.perl5.Threads.threadstorm",
    "perlformance.perl5.ThreadsShared.threadstorm",
    "perlformance.perl5.Mem.allocate",
    "perlformance.perl5.Mem.copy",

    # fast but micro benchmarks
    "perlformance.perl5.PerlStone2015.09data.a_alloc",
    "perlformance.perl5.PerlStone2015.09data.a_copy",
    "perlformance.perl5.PerlStone2015.07lists.unshift",
    "perlformance.perl5.PerlStone2015.07lists.push",

    # slower
    "perlformance.perl5.PerlStone2015.spectralnorm",
    "perlformance.perl5.PerlStone2015.nbody",
    
    "perlformance.perl5.PerlStone2015.01overview.opmix1",
    "perlformance.perl5.PerlStone2015.01overview.opmix2",
    "perlformance.perl5.PerlStone2015.04control.blocks1",
    "perlformance.perl5.PerlStone2015.04control.blocks2",
    "perlformance.perl5.PerlStone2015.05regex.fixedstr",
    "perlformance.perl5.PerlStone2015.regex.backtrack",
    "perlformance.perl5.PerlStone2015.regex.code_literal",
    "perlformance.perl5.PerlStone2015.regex.code_runtime",
    "perlformance.perl5.PerlStone2015.regex.precomp_access",
    "perlformance.perl5.PerlStone2015.regex.runtime_comp",
    "perlformance.perl5.PerlStone2015.regex.runtime_comp_nocache",
    "perlformance.perl5.PerlStone2015.regex.split1",
    "perlformance.perl5.PerlStone2015.regex.split2",
    "perlformance.perl5.PerlStone2015.regex.splitratio",
    "perlformance.perl5.PerlStone2015.regex.trie_limit",
    "perlformance.perl5.MatrixReal.matrix_times_itself.030"
]

METRIC_VARIANTS = [
 ('withtaint',        [['=', 'perlconfig_derived_notaintsupport', WITHTAINT]]),
 ('NO_TAINT_SUPPORT', [['=', 'perlconfig_derived_notaintsupport', NOTAINT  ]])
]

base_query = Query(
    select=[
        'NAME',
        'VALUE',
        'CREATED',
        'perlconfig_derived_notaintsupport',
        'env_perlformance_qualid',
        'sysinfo_hostname'
    ],
    where = [
             ['=', 'env_perlformance_qualid', QUALID],
            ],
    limit = LIMIT
)

vis2 = {}
bmx = {}
stats = {}

def fetchdata():
    for metric in METRICS:
        bmx[metric] = Benchmark(metric)
        stats[metric] = {}

        for v in METRIC_VARIANTS:
            variant_name        = v[0]
            variant_constraints = v[1]

            query = base_query.variant(
                where=[
                   ['=', 'NAME', metric],
                   ['=', 'sysinfo_hostname', HOST]
                ]
                + variant_constraints
            )
            query_data = query.post(QA)
            bmx[metric].add_data_series(variant_name, query_data)

            # --- statistical summary ---
            stats_spec = Spec(STATS_SPEC, [])
            try:
                s = stats_spec.evaluate(query_data)
            except Exception as e:
                # format with a tuple; str(a, b, c) would raise a TypeError
                pp.pprint("# Error while getting stats (%s, %s): %s" %
                          (metric, variant_name, e))
                continue
            stats[metric][variant_name] = s
            #pp.pprint (s)

def visualize(metric, type):
    vis2 = MatplotlibPlotter()
    vis2.start()
    if type == 'boxplot':
        vis2.boxplot(bmx[metric], \
                     label = 'duration (seconds, smaller=better)')
    if type == 'percentiles':
        vis2.percentiles(bmx[metric], \
                         label = 'duration (seconds, smaller=better)')
    if type == 'histogram':
        vis2.histogram(bmx[metric], \
                       label = 'duration (seconds, smaller=better)')
    if type == 'timeline':
        vis2.timeline(bmx[metric], \
                      label = 'duration (seconds, smaller=better)')
    vis2.show()
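
The six statistics requested via STATS_SPEC can be approximated by hand as a sanity check. How the BenchmarkAnything library computes them is not shown in this notebook, so the sketch below makes two assumptions: the standard deviation is the sample standard deviation, and the 95% confidence interval of the mean uses the normal approximation (mean ± 1.96 · stddev / √n). The function name summarize and the sample values are hypothetical. The result is returned in STATS_SPEC order, matching the s[0] .. s[5] indexing used in the evaluation cell.

```python
import math
import statistics

def summarize(values, z=1.96):
    """Return [min, max, mean, stddev, ci_95l, ci_95u] for a list of durations."""
    n = len(values)
    mean = statistics.fmean(values)
    # sample standard deviation; undefined for a single value, so fall back to 0
    stddev = statistics.stdev(values) if n > 1 else 0.0
    half_width = z * stddev / math.sqrt(n)
    return [min(values), max(values), mean, stddev,
            mean - half_width, mean + half_width]

# Hypothetical durations in seconds, not taken from the benchmark database
s = summarize([20.4, 20.5, 20.6, 20.5])
print('%.2f .. %.2f .. %.2f .. %.2f .. %.2f (stdev: %.2f)' %
      (s[0], s[4], s[2], s[5], s[1], s[3]))
```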

Evaluate

In [3]:
fetchdata()
display(Markdown('## ' + HW[HOST]))
for metric in METRICS:        
    variants = METRIC_VARIANTS
    display(Markdown('### ' + metric))
    text = ''
    text = text + '* min .. ci95l .. **mean** .. ci95u .. max (stdev)\n'
    
    for v in METRIC_VARIANTS:
        variant_name        = v[0]
        variant_constraints = v[1]
        try:
            s = stats[metric][variant_name]
        except Exception as e:
            # format with a tuple; str(a, b, c) would raise a TypeError
            pp.pprint("# No data for stats (%s, %s): %s" %
                      (metric, variant_name, e))
            continue
        t2 = '* %.2f .. %.2f .. **%.2f** .. %.2f .. %.2f (stdev: %.2f) -- %s\n' % \
               (s[0],   s[4],     s[2],     s[5],   s[1],        s[3], variant_name)
        text = text + t2

    try:
        rel_diff = 100 * \
          stats[metric]['NO_TAINT_SUPPORT'][2] / stats[metric]['withtaint'][2]
        improvement = 100 - rel_diff
        faster_or_slower = '*faster*'
        if improvement < 0:
            faster_or_slower = '**slower!**'
        t3 = '* <u>%.2f%% (%s)</u>\n' % (improvement, faster_or_slower)
        text = text + t3
    except Exception as e:
        pp.pprint("# No stats for diff (%s): %s" % (metric, e))

    display(Markdown(text))

    visualize(metric, 'boxplot')
    visualize(metric, 'percentiles')
    #visualize(metric, 'histogram')
    #visualize(metric, 'timeline')
    #display(Markdown('<span style="page-break-before: always !important">⚀</span>')) # doesn't help

i7-10610U CPU @ 1.80GHz

perlformance.perl5.SpamAssassin.salearn.ham

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 20.39 .. 20.51 .. 20.53 .. 20.55 .. 20.71 (stdev: 0.06) -- withtaint
  • 17.45 .. 17.51 .. 17.52 .. 17.53 .. 17.59 (stdev: 0.04) -- NO_TAINT_SUPPORT
  • 14.65% (*faster*)
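
The percentage in the last bullet is the relative reduction of the mean duration, as computed in the evaluation cell. A minimal sketch that reproduces it from the rounded means printed above (the function name improvement_percent is hypothetical); because the notebook works with unrounded means, the result agrees with the reported value only to within rounding.

```python
def improvement_percent(mean_withtaint, mean_notaint):
    """Percentage by which the NO_TAINT_SUPPORT mean duration is lower.

    Negative values mean the NO_TAINT_SUPPORT build was slower.
    """
    rel_diff = 100 * mean_notaint / mean_withtaint
    return 100 - rel_diff

# Rounded means of salearn.ham: withtaint 20.53 s, NO_TAINT_SUPPORT 17.52 s
print('%.2f%%' % improvement_percent(20.53, 17.52))
```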

perlformance.perl5.SpamAssassin.salearn.ham2

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 13.75 .. 13.82 .. 13.83 .. 13.84 .. 13.94 (stdev: 0.03) -- withtaint
  • 11.80 .. 11.85 .. 11.86 .. 11.86 .. 11.93 (stdev: 0.03) -- NO_TAINT_SUPPORT
  • 14.27% (*faster*)

perlformance.perl5.SpamAssassin.salearn.spam

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 8.43 .. 8.50 .. 8.50 .. 8.51 .. 8.57 (stdev: 0.03) -- withtaint
  • 7.54 .. 7.58 .. 7.61 .. 7.64 .. 8.18 (stdev: 0.09) -- NO_TAINT_SUPPORT
  • 10.51% (*faster*)

perlformance.perl5.SpamAssassin.salearn.spam2

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 22.58 .. 22.68 .. 22.73 .. 22.77 .. 23.66 (stdev: 0.15) -- withtaint
  • 20.20 .. 20.31 .. 20.34 .. 20.36 .. 20.57 (stdev: 0.07) -- NO_TAINT_SUPPORT
  • 10.52% (*faster*)

perlformance.perl5.PerlStone2015.binarytrees

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 38.13 .. 38.43 .. 38.52 .. 38.61 .. 38.90 (stdev: 0.22) -- withtaint
  • 37.70 .. 38.12 .. 38.25 .. 38.38 .. 39.29 (stdev: 0.31) -- NO_TAINT_SUPPORT
  • 0.70% (*faster*)

perlformance.perl5.PerlStone2015.mandelbrot

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 24.55 .. 25.09 .. 25.25 .. 25.40 .. 25.96 (stdev: 0.37) -- withtaint
  • 23.26 .. 23.84 .. 24.05 .. 24.25 .. 25.47 (stdev: 0.48) -- NO_TAINT_SUPPORT
  • 4.76% (*faster*)

perlformance.perl5.Mandelbrot.withthreads

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 23.58 .. 24.03 .. 24.15 .. 24.27 .. 24.85 (stdev: 0.33) -- withtaint
  • 22.52 .. 23.10 .. 23.25 .. 23.40 .. 24.19 (stdev: 0.41) -- NO_TAINT_SUPPORT
  • 3.74% (*faster*)

perlformance.perl5.Mandelbrot.withmce

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 9.77 .. 10.02 .. 10.07 .. 10.12 .. 10.56 (stdev: 0.14) -- withtaint
  • 9.72 .. 9.97 .. 10.04 .. 10.11 .. 10.63 (stdev: 0.19) -- NO_TAINT_SUPPORT
  • 0.27% (*faster*)
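
For small differences such as this one, it helps to check whether the two 95% confidence intervals of the mean overlap. This is only a rough heuristic, not a formal significance test: overlapping intervals suggest the measured difference may be within the noise. The helper name intervals_overlap is hypothetical; the numbers are the ci95l and ci95u values printed above.

```python
def intervals_overlap(a, b):
    """True if the closed intervals a=(low, high) and b=(low, high) overlap."""
    return max(a[0], b[0]) <= min(a[1], b[1])

# Mandelbrot.withmce: withtaint vs. NO_TAINT_SUPPORT
print(intervals_overlap((10.02, 10.12), (9.97, 10.11)))

# SpamAssassin.salearn.ham, for comparison
print(intervals_overlap((20.51, 20.55), (17.51, 17.53)))
```

With these inputs the first check reports an overlap and the second does not, so the 0.27% difference for Mandelbrot.withmce should be read with more caution than the SpamAssassin result.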

perlformance.perl5.PerlStone2015.fib

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 18.62 .. 18.83 .. 18.93 .. 19.03 .. 19.48 (stdev: 0.24) -- withtaint
  • 18.07 .. 18.41 .. 18.52 .. 18.62 .. 19.05 (stdev: 0.24) -- NO_TAINT_SUPPORT
  • 2.18% (*faster*)

perlformance.perl5.Fib

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 18.09 .. 18.27 .. 18.38 .. 18.49 .. 19.29 (stdev: 0.30) -- withtaint
  • 17.66 .. 17.84 .. 17.91 .. 17.98 .. 18.52 (stdev: 0.19) -- NO_TAINT_SUPPORT
  • 2.53% (*faster*)

perlformance.perl5.FibMoose

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 24.95 .. 25.13 .. 25.23 .. 25.33 .. 25.82 (stdev: 0.27) -- withtaint
  • 24.27 .. 24.54 .. 24.66 .. 24.78 .. 25.51 (stdev: 0.33) -- NO_TAINT_SUPPORT
  • 2.25% (*faster*)

perlformance.perl5.FibMouse

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 25.02 .. 25.41 .. 25.59 .. 25.77 .. 26.71 (stdev: 0.48) -- withtaint
  • 24.31 .. 24.62 .. 24.75 .. 24.88 .. 25.57 (stdev: 0.36) -- NO_TAINT_SUPPORT
  • 3.28% (*faster*)

perlformance.perl5.FibOO

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 25.00 .. 25.27 .. 25.41 .. 25.56 .. 26.49 (stdev: 0.40) -- withtaint
  • 24.29 .. 24.64 .. 24.78 .. 24.91 .. 25.64 (stdev: 0.36) -- NO_TAINT_SUPPORT
  • 2.51% (*faster*)

perlformance.perl5.FibOOSig

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 24.10 .. 24.29 .. 24.41 .. 24.54 .. 25.34 (stdev: 0.34) -- withtaint
  • 23.31 .. 23.85 .. 24.24 .. 24.64 .. 29.77 (stdev: 1.07) -- NO_TAINT_SUPPORT
  • 0.70% (*faster*)

perlformance.perl5.AccessorsHash.set

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 1.85 .. 1.86 .. 1.87 .. 1.88 .. 1.96 (stdev: 0.03) -- withtaint
  • 1.78 .. 1.81 .. 1.83 .. 1.84 .. 1.89 (stdev: 0.03) -- NO_TAINT_SUPPORT
  • 2.60% (*faster*)

perlformance.perl5.AccessorsArray.set

  • min .. ci95l .. mean .. ci95u .. max (stdev)
  • 1.68 .. 1.69 .. 1.70 .. 1.71 .. 1.76 (stdev: 0.02) -- withtaint
  • 1.64 .. 1.66 .. 1.66 .. 1.67 .. 1.71 (stdev: 0.02) -- NO_TAINT_SUPPORT
  • 2.18% (*faster*)