GHC runtime performance comparison

  • January 2017 (?) original version
  • April 2021 update (includes source links, and data for some older and newer compilers)

This nice example, originally due to Knuth, was suggested recently as a performance test case.

My version of the source (actual file):

            import Control.Monad.ST
            import Data.STRef

            -- k is kept in an STRef so that every call of b decrements the same counter
            a k x1 x2 x3 x4 x5 =
               do kk <- newSTRef k
                  let b = do k <- Main.modifySTRef' kk pred >> readSTRef kk; a k b x1 x2 x3 x4
                  if k <= 0 then do x3' <- x3; x4' <- x4; let x = (x3' + x4') in x `seq` return x
                            else do x5' <- x5; b' <- b;   let x = (x5' + b') in x `seq` return x

            main = putStr $ show (runST (a 22 (return 1) (return (-1)) (return (-1)) (return 1) (return 0)))

            -- strict modify, defined here because old versions of base do not export it
            modifySTRef' ref f = do { x <- readSTRef ref; let { x' = f x } ; x' `seq` writeSTRef ref x' }

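where I inlined the definition of modifySTRef', since old versions of base don’t have it. The call is qualified as Main.modifySTRef' so that it does not clash with the function of the same name that newer versions of Data.STRef export.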
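To see the inlined helper in isolation, here is a minimal sketch. The names strictModify and countDown are made up for this illustration (the rename avoids the clash with newer versions of Data.STRef), and the counter loop only imitates what the closure b does with kk; it is not part of the benchmark.

```haskell
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

-- Same body as the inlined modifySTRef' above, under a hypothetical name:
-- the new value is forced with seq before it is written back.
strictModify :: STRef s a -> (a -> a) -> ST s ()
strictModify ref f = do
  x <- readSTRef ref
  let x' = f x
  x' `seq` writeSTRef ref x'

-- Decrement a counter n times and return the final value.
-- (Hypothetical helper, only here to exercise strictModify.)
countDown :: Int -> Int -> Int
countDown start n = runST $ do
  ref <- newSTRef start
  mapM_ (\_ -> strictModify ref pred) [1 .. n]
  readSTRef ref

main :: IO ()
main = print (countDown 22 5)  -- prints 17
```

Forcing the value before the write means the reference never holds a chain of unevaluated pred applications, which is the point of the primed variant.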

I measured runtimes of this program when compiled by various ghc versions:

            ghc-6.10.4        : -14254067 1.65 user 0.01 system 0:01.66 elapsed 100% CPU (0 text+0 data 11600 max)k
            ghc-6.12.3        : -14254067 1.56 user 0.00 system 0:01.56 elapsed 100% CPU (0 text+0 data 12464 max)k
            ghc-7.4.2         : -14254067 1.73 user 0.00 system 0:01.74 elapsed 100% CPU (0 text+0 data 12112 max)k
            ghc-7.6.3         : -14254067 1.89 user 0.00 system 0:01.89 elapsed 100% CPU (0 text+0 data 12000 max)k
            ghc-7.8.4         : -14254067 1.76 user 0.01 system 0:01.77 elapsed 100% CPU (0 text+0 data 12800 max)k
            ghc-7.10.1        : -14254067 2.08 user 0.00 system 0:02.09 elapsed 100% CPU (0 text+0 data 14208 max)k
            ghc-7.10.2        : -14254067 2.11 user 0.00 system 0:02.11 elapsed 100% CPU (0 text+0 data 14272 max)k
            ghc-7.10.3        : -14254067 2.12 user 0.00 system 0:02.13 elapsed 100% CPU (0 text+0 data 14304 max)k
            ghc-8.0.1         : -14254067 2.29 user 0.00 system 0:02.29 elapsed 100% CPU (0 text+0 data 14080 max)k
            ghc-8.0.2         : -14254067 2.30 user 0.00 system 0:02.30 elapsed 100% CPU (0 text+0 data 14192 max)k
            ghc-8.1.20170128  : -14254067 2.14 user 0.00 system 0:02.15 elapsed 100% CPU (0 text+0 data 12816 max)k

(the number “-14254067” is the actual output of the program)
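To read these numbers as relative slowdowns, one can divide each user time by a baseline. The following is only a sketch: the four rows are copied from the table above, while the choice of ghc-6.12.3 as baseline and the names userTimes and relativeTo are my own for this illustration.

```haskell
import Text.Printf (printf)

-- (compiler, user time in seconds), copied from the table above
userTimes :: [(String, Double)]
userTimes =
  [ ("ghc-6.12.3", 1.56)
  , ("ghc-7.8.4",  1.76)
  , ("ghc-7.10.3", 2.12)
  , ("ghc-8.0.2",  2.30)
  ]

-- Divide every time by the baseline time.
relativeTo :: Double -> [(String, Double)] -> [(String, Double)]
relativeTo base rows = [ (name, t / base) | (name, t) <- rows ]

main :: IO ()
main = mapM_ (\(name, r) -> printf "%-12s %.2f\n" name r)
             (relativeTo 1.56 userTimes)
```

With this baseline, ghc-8.0.2 comes out at roughly 1.47 times the user time of ghc-6.12.3.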

Update April 2021: re-running this on a different machine, and with more compilers (run.log):

            /opt/ghc/ghc-6.8.3  : -14254067 1.07 user 0.00 system 0:01.08 elapsed 99% CPU (0 text+0 data 4028 max)k
            /opt/ghc/ghc-6.10.4  : -14254067 1.07 user 0.00 system 0:01.07 elapsed 99% CPU (0 text+0 data 4280 max)k
            /opt/ghc/ghc-6.12.3  : -14254067 1.13 user 0.00 system 0:01.13 elapsed 99% CPU (0 text+0 data 4732 max)k
            /opt/ghc/ghc-7.0.4  : -14254067 1.38 user 0.00 system 0:01.39 elapsed 99% CPU (0 text+0 data 5392 max)k
            /opt/ghc/ghc-7.4.2  : -14254067 1.13 user 0.00 system 0:01.14 elapsed 99% CPU (0 text+0 data 5012 max)k
            /opt/ghc/ghc-7.6.3  : -14254067 1.09 user 0.00 system 0:01.10 elapsed 99% CPU (0 text+0 data 4988 max)k
            /opt/ghc/ghc-7.8.4  : -14254067 1.12 user 0.00 system 0:01.13 elapsed 99% CPU (0 text+0 data 5028 max)k
            /opt/ghc/ghc-7.10.3  : -14254067 1.43 user 0.00 system 0:01.44 elapsed 99% CPU (0 text+0 data 5404 max)k
            /opt/ghc/ghc-8.0.2  : -14254067 1.61 user 0.00 system 0:01.61 elapsed 99% CPU (0 text+0 data 5540 max)k
            /opt/ghc/ghc-8.2.2  : -14254067 1.65 user 0.00 system 0:01.66 elapsed 99% CPU (0 text+0 data 4980 max)k
            /opt/ghc/ghc-8.6.2  : -14254067 1.51 user 0.00 system 0:01.52 elapsed 99% CPU (0 text+0 data 4844 max)k
            /opt/ghc/ghc-8.6.3  : -14254067 1.47 user 0.00 system 0:01.47 elapsed 99% CPU (0 text+0 data 5000 max)k
            /opt/ghc/ghc-8.6.4  : -14254067 1.45 user 0.00 system 0:01.46 elapsed 99% CPU (0 text+0 data 4800 max)k
            /opt/ghc/ghc-8.6.5  : -14254067 1.45 user 0.00 system 0:01.45 elapsed 99% CPU (0 text+0 data 5004 max)k
            /opt/ghc/ghc-8.8.1  : -14254067 1.44 user 0.00 system 0:01.45 elapsed 99% CPU (0 text+0 data 5172 max)k
            /opt/ghc/ghc-8.8.3  : -14254067 1.41 user 0.00 system 0:01.42 elapsed 99% CPU (0 text+0 data 5124 max)k
            /opt/ghc/ghc-8.8.4  : -14254067 1.47 user 0.00 system 0:01.47 elapsed 99% CPU (0 text+0 data 5156 max)k
            /opt/ghc/ghc-8.10.1  : -14254067 1.36 user 0.00 system 0:01.36 elapsed 99% CPU (0 text+0 data 5268 max)k
            /opt/ghc/ghc-8.10.2  : -14254067 1.36 user 0.00 system 0:01.37 elapsed 99% CPU (0 text+0 data 5492 max)k
            /opt/ghc/ghc-8.10.3  : -14254067 1.59 user 0.00 system 0:01.59 elapsed 99% CPU (0 text+0 data 5496 max)k
            /opt/ghc/ghc-8.10.4  : -14254067 1.47 user 0.00 system 0:01.48 elapsed 99% CPU (0 text+0 data 5500 max)k
            /opt/ghc/ghc-9.0.1  : -14254067 1.30 user 0.00 system 0:01.31 elapsed 99% CPU (0 text+0 data 5228 max)k
            /opt/ghc/ghc-9.2.0.20210331  : -14254067 1.35 user 0.00 system 0:01.35 elapsed 99% CPU (0 text+0 data 8300 max)k
            

I used roughly this script (actual file):

            for VERSION
            in ghc-6.10.4  ghc-6.12.3  \
               ghc-7.4.2 \
               ghc-7.6.3 \
               ghc-7.8.4 \
               ghc-7.10.1 ghc-7.10.2 ghc-7.10.3 \
               ghc-8.0.1 ghc-8.0.2
            do
                exec=./mob-$VERSION
                /opt/ghc/$VERSION/bin/ghc -O2 -fforce-recomp -o $exec mob.hs 2>/dev/null 1>/dev/null
                echo -n $VERSION " : "
                /usr/bin/time -f " %U user %S system %E elapsed %P CPU (%X text+%D data %M max)k" $exec
            done

And, of course, nobody noticed (?) that the program uses Integer: without a type signature, the numeric literals default to Integer. Writing a type annotation (actual file)

a :: Int -> ST s Int -> ST s Int -> ST s Int -> ST s Int -> ST s Int -> ST s Int

we get these results

            ghc-6.10.4        : -14254067 1.28 user 0.00 system 0:01.28 elapsed 100% CPU (0 text+0 data 10496 max)k
            ghc-6.12.3        : -14254067 1.28 user 0.00 system 0:01.28 elapsed 100% CPU (0 text+0 data 11552 max)k
            ghc-7.4.2         : -14254067 1.25 user 0.00 system 0:01.25 elapsed 100% CPU (0 text+0 data 11216 max)k
            ghc-7.6.3         : -14254067 1.26 user 0.00 system 0:01.25 elapsed 100% CPU (0 text+0 data 11216 max)k
            ghc-7.8.4         : -14254067 1.28 user 0.00 system 0:01.28 elapsed 100% CPU (0 text+0 data 12224 max)k
            ghc-7.10.1        : -14254067 1.37 user 0.01 system 0:01.38 elapsed 100% CPU (0 text+0 data 13584 max)k
            ghc-7.10.2        : -14254067 1.34 user 0.02 system 0:01.35 elapsed 100% CPU (0 text+0 data 13648 max)k
            ghc-7.10.3        : -14254067 1.32 user 0.02 system 0:01.33 elapsed 100% CPU (0 text+0 data 13648 max)k
            ghc-8.0.1         : -14254067 1.27 user 0.00 system 0:01.27 elapsed 100% CPU (0 text+0 data 12656 max)k
            ghc-8.0.2         : -14254067 1.30 user 0.00 system 0:01.30 elapsed 100% CPU (0 text+0 data 12704 max)k
            ghc-8.1.20170128  : -14254067 1.34 user 0.00 system 0:01.34 elapsed 100% CPU (0 text+0 data 12768 max)k

Now, ghc-8.0 looks good again. Well, much better: 1.27 to 1.30 seconds of user time instead of 2.29 to 2.30, about the same as the older compilers.
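The Integer in the unannotated program comes from Haskell's defaulting rule: when nothing else fixes the type of a numeric literal, Integer is chosen. Here is a small sketch that makes the difference visible. It is independent of the benchmark, and the names defaultedPower and intPower are made up for this illustration.

```haskell
-- Nothing here fixes the numeric type, so defaulting picks Integer
-- and the result is exact.
defaultedPower :: String
defaultedPower = show (2 ^ 64)

-- With an annotation the same expression is computed in fixed-width Int
-- and wraps around.
intPower :: String
intPower = show (2 ^ 64 :: Int)

main :: IO ()
main = do
  putStrLn defaultedPower  -- prints 18446744073709551616
  putStrLn intPower        -- prints 0
```

Arbitrary-precision Integer arithmetic is more general than machine Int arithmetic, so it is plausible that some of the differences between the two sets of measurements come from how each compiler version handles Integer, but that is my reading of the numbers, not something I have verified.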

Update April 2021 (run-int.log):

            /opt/ghc/ghc-6.8.3  : -14254067 1.01 user 0.00 system 0:01.01 elapsed 99% CPU (0 text+0 data 3712 max)k
            /opt/ghc/ghc-6.10.4  : -14254067 0.94 user 0.00 system 0:00.94 elapsed 99% CPU (0 text+0 data 4028 max)k
            /opt/ghc/ghc-6.12.3  : -14254067 0.91 user 0.00 system 0:00.92 elapsed 99% CPU (0 text+0 data 4464 max)k
            /opt/ghc/ghc-7.0.4  : -14254067 1.00 user 0.00 system 0:01.00 elapsed 99% CPU (0 text+0 data 5072 max)k
            /opt/ghc/ghc-7.4.2  : -14254067 0.95 user 0.00 system 0:00.96 elapsed 99% CPU (0 text+0 data 4916 max)k
            /opt/ghc/ghc-7.6.3  : -14254067 0.94 user 0.00 system 0:00.95 elapsed 99% CPU (0 text+0 data 4884 max)k
            /opt/ghc/ghc-7.8.4  : -14254067 0.96 user 0.00 system 0:00.97 elapsed 99% CPU (0 text+0 data 4984 max)k
            /opt/ghc/ghc-7.10.3  : -14254067 0.99 user 0.00 system 0:01.00 elapsed 99% CPU (0 text+0 data 5240 max)k
            /opt/ghc/ghc-8.0.2  : -14254067 0.96 user 0.00 system 0:00.97 elapsed 99% CPU (0 text+0 data 5236 max)k
            /opt/ghc/ghc-8.2.2  : -14254067 0.92 user 0.00 system 0:00.93 elapsed 99% CPU (0 text+0 data 5044 max)k
            /opt/ghc/ghc-8.6.2  : -14254067 0.90 user 0.00 system 0:00.90 elapsed 99% CPU (0 text+0 data 4820 max)k
            /opt/ghc/ghc-8.6.3  : -14254067 0.92 user 0.00 system 0:00.92 elapsed 99% CPU (0 text+0 data 4984 max)k
            /opt/ghc/ghc-8.6.4  : -14254067 0.91 user 0.00 system 0:00.92 elapsed 99% CPU (0 text+0 data 4976 max)k
            /opt/ghc/ghc-8.6.5  : -14254067 0.90 user 0.00 system 0:00.90 elapsed 99% CPU (0 text+0 data 4820 max)k
            /opt/ghc/ghc-8.8.1  : -14254067 0.84 user 0.00 system 0:00.84 elapsed 99% CPU (0 text+0 data 5056 max)k
            /opt/ghc/ghc-8.8.3  : -14254067 0.86 user 0.00 system 0:00.86 elapsed 99% CPU (0 text+0 data 5060 max)k
            /opt/ghc/ghc-8.8.4  : -14254067 0.88 user 0.00 system 0:00.88 elapsed 99% CPU (0 text+0 data 5140 max)k
            /opt/ghc/ghc-8.10.1  : -14254067 0.90 user 0.00 system 0:00.90 elapsed 99% CPU (0 text+0 data 5168 max)k
            /opt/ghc/ghc-8.10.2  : -14254067 0.90 user 0.00 system 0:00.91 elapsed 99% CPU (0 text+0 data 5508 max)k
            /opt/ghc/ghc-8.10.3  : -14254067 1.19 user 0.00 system 0:01.19 elapsed 99% CPU (0 text+0 data 5420 max)k
            /opt/ghc/ghc-8.10.4  : -14254067 0.93 user 0.00 system 0:00.93 elapsed 99% CPU (0 text+0 data 5452 max)k
            /opt/ghc/ghc-9.0.1  : -14254067 0.90 user 0.00 system 0:00.90 elapsed 99% CPU (0 text+0 data 5276 max)k
            /opt/ghc/ghc-9.2.0.20210331  : -14254067 0.93 user 0.00 system 0:00.93 elapsed 99% CPU (0 text+0 data 8400 max)k