This nice example, originally due to Knuth, was suggested recently as a performance test case.
My version of the source is (actual file)
import Control.Monad.ST
import Data.STRef

a k x1 x2 x3 x4 x5 =
  do kk <- newSTRef k
     let b = do k <- Main.modifySTRef' kk pred >> readSTRef kk; a k b x1 x2 x3 x4
     if k <= 0 then do x3' <- x3; x4' <- x4; let x = (x3' + x4') in x `seq` return x
               else do x5' <- x5; b' <- b; let x = (x5' + b') in x `seq` return x

main = putStr $ show (runST (a 22 (return 1) (return (-1)) (return (-1)) (return 1) (return 0)))

modifySTRef' ref f = do { x <- readSTRef ref; let { x' = f x } ; x' `seq` writeSTRef ref x' }
where I inlined the definition of modifySTRef', since old versions of base don’t have it.
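To see what the inlined helper does in isolation, here is a minimal sketch of the decrement-and-read step that b performs on kk. The names modifyStrict and decrementAndRead are made up for this illustration and are not part of the program above; the starting value 3 is arbitrary.

```haskell
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

-- Hypothetical name; same shape as the inlined helper: force the new
-- value before storing it, so the reference never accumulates a chain
-- of unevaluated `pred` applications.
modifyStrict :: STRef s a -> (a -> a) -> ST s ()
modifyStrict ref f = do
  x <- readSTRef ref
  let x' = f x
  x' `seq` writeSTRef ref x'

-- Hypothetical name; decrement the counter and return its new value,
-- which is what `b` does with `kk` before recursing.
decrementAndRead :: STRef s Int -> ST s Int
decrementAndRead ref = modifyStrict ref pred >> readSTRef ref

main :: IO ()
main = print (runST demo)   -- prints (2,1)
  where
    demo :: ST s (Int, Int)
    demo = do
      ref <- newSTRef 3
      k1 <- decrementAndRead ref
      k2 <- decrementAndRead ref
      return (k1, k2)
```

Each call sees the value left behind by the previous one, which is the shared mutable state the test is designed to exercise.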
I measured runtimes of this program when compiled by various ghc versions:
ghc-6.10.4 : -14254067 1.65 user 0.01 system 0:01.66 elapsed 100% CPU (0 text+0 data 11600 max)k
ghc-6.12.3 : -14254067 1.56 user 0.00 system 0:01.56 elapsed 100% CPU (0 text+0 data 12464 max)k
ghc-7.4.2 : -14254067 1.73 user 0.00 system 0:01.74 elapsed 100% CPU (0 text+0 data 12112 max)k
ghc-7.6.3 : -14254067 1.89 user 0.00 system 0:01.89 elapsed 100% CPU (0 text+0 data 12000 max)k
ghc-7.8.4 : -14254067 1.76 user 0.01 system 0:01.77 elapsed 100% CPU (0 text+0 data 12800 max)k
ghc-7.10.1 : -14254067 2.08 user 0.00 system 0:02.09 elapsed 100% CPU (0 text+0 data 14208 max)k
ghc-7.10.2 : -14254067 2.11 user 0.00 system 0:02.11 elapsed 100% CPU (0 text+0 data 14272 max)k
ghc-7.10.3 : -14254067 2.12 user 0.00 system 0:02.13 elapsed 100% CPU (0 text+0 data 14304 max)k
ghc-8.0.1 : -14254067 2.29 user 0.00 system 0:02.29 elapsed 100% CPU (0 text+0 data 14080 max)k
ghc-8.0.2 : -14254067 2.30 user 0.00 system 0:02.30 elapsed 100% CPU (0 text+0 data 14192 max)k
ghc-8.1.20170128 : -14254067 2.14 user 0.00 system 0:02.15 elapsed 100% CPU (0 text+0 data 12816 max)k
(the number “-14254067” is the actual output of the program)
Update April 2021: I re-ran this on a different machine and with more compilers (run.log):
/opt/ghc/ghc-6.8.3 : -14254067 1.07 user 0.00 system 0:01.08 elapsed 99% CPU (0 text+0 data 4028 max)k
/opt/ghc/ghc-6.10.4 : -14254067 1.07 user 0.00 system 0:01.07 elapsed 99% CPU (0 text+0 data 4280 max)k
/opt/ghc/ghc-6.12.3 : -14254067 1.13 user 0.00 system 0:01.13 elapsed 99% CPU (0 text+0 data 4732 max)k
/opt/ghc/ghc-7.0.4 : -14254067 1.38 user 0.00 system 0:01.39 elapsed 99% CPU (0 text+0 data 5392 max)k
/opt/ghc/ghc-7.4.2 : -14254067 1.13 user 0.00 system 0:01.14 elapsed 99% CPU (0 text+0 data 5012 max)k
/opt/ghc/ghc-7.6.3 : -14254067 1.09 user 0.00 system 0:01.10 elapsed 99% CPU (0 text+0 data 4988 max)k
/opt/ghc/ghc-7.8.4 : -14254067 1.12 user 0.00 system 0:01.13 elapsed 99% CPU (0 text+0 data 5028 max)k
/opt/ghc/ghc-7.10.3 : -14254067 1.43 user 0.00 system 0:01.44 elapsed 99% CPU (0 text+0 data 5404 max)k
/opt/ghc/ghc-8.0.2 : -14254067 1.61 user 0.00 system 0:01.61 elapsed 99% CPU (0 text+0 data 5540 max)k
/opt/ghc/ghc-8.2.2 : -14254067 1.65 user 0.00 system 0:01.66 elapsed 99% CPU (0 text+0 data 4980 max)k
/opt/ghc/ghc-8.6.2 : -14254067 1.51 user 0.00 system 0:01.52 elapsed 99% CPU (0 text+0 data 4844 max)k
/opt/ghc/ghc-8.6.3 : -14254067 1.47 user 0.00 system 0:01.47 elapsed 99% CPU (0 text+0 data 5000 max)k
/opt/ghc/ghc-8.6.4 : -14254067 1.45 user 0.00 system 0:01.46 elapsed 99% CPU (0 text+0 data 4800 max)k
/opt/ghc/ghc-8.6.5 : -14254067 1.45 user 0.00 system 0:01.45 elapsed 99% CPU (0 text+0 data 5004 max)k
/opt/ghc/ghc-8.8.1 : -14254067 1.44 user 0.00 system 0:01.45 elapsed 99% CPU (0 text+0 data 5172 max)k
/opt/ghc/ghc-8.8.3 : -14254067 1.41 user 0.00 system 0:01.42 elapsed 99% CPU (0 text+0 data 5124 max)k
/opt/ghc/ghc-8.8.4 : -14254067 1.47 user 0.00 system 0:01.47 elapsed 99% CPU (0 text+0 data 5156 max)k
/opt/ghc/ghc-8.10.1 : -14254067 1.36 user 0.00 system 0:01.36 elapsed 99% CPU (0 text+0 data 5268 max)k
/opt/ghc/ghc-8.10.2 : -14254067 1.36 user 0.00 system 0:01.37 elapsed 99% CPU (0 text+0 data 5492 max)k
/opt/ghc/ghc-8.10.3 : -14254067 1.59 user 0.00 system 0:01.59 elapsed 99% CPU (0 text+0 data 5496 max)k
/opt/ghc/ghc-8.10.4 : -14254067 1.47 user 0.00 system 0:01.48 elapsed 99% CPU (0 text+0 data 5500 max)k
/opt/ghc/ghc-9.0.1 : -14254067 1.30 user 0.00 system 0:01.31 elapsed 99% CPU (0 text+0 data 5228 max)k
/opt/ghc/ghc-9.2.0.20210331 : -14254067 1.35 user 0.00 system 0:01.35 elapsed 99% CPU (0 text+0 data 8300 max)k
I used roughly this script (actual file)
for VERSION
in ghc-6.10.4 ghc-6.12.3 \
   ghc-7.4.2 \
   ghc-7.6.3 \
   ghc-7.8.4 \
   ghc-7.10.1 ghc-7.10.2 ghc-7.10.3 \
   ghc-8.0.1 ghc-8.0.2
do
    exec=./mob-$VERSION
    /opt/ghc/$VERSION/bin/ghc -O2 -fforce-recomp -o $exec mob.hs 2>/dev/null 1>/dev/null
    echo -n $VERSION " : "
    /usr/bin/time -f " %U user %S system %E elapsed %P CPU (%X text+%D data %M max)k" $exec
done
And, of course, nobody noticed (?) that the program uses Integer: the function a has no type signature, so its numeric types default to Integer. Writing a type annotation (actual file)
a :: Int -> ST s Int -> ST s Int -> ST s Int -> ST s Int -> ST s Int -> ST s Int
we get these results:
ghc-6.10.4 : -14254067 1.28 user 0.00 system 0:01.28 elapsed 100% CPU (0 text+0 data 10496 max)k
ghc-6.12.3 : -14254067 1.28 user 0.00 system 0:01.28 elapsed 100% CPU (0 text+0 data 11552 max)k
ghc-7.4.2 : -14254067 1.25 user 0.00 system 0:01.25 elapsed 100% CPU (0 text+0 data 11216 max)k
ghc-7.6.3 : -14254067 1.26 user 0.00 system 0:01.25 elapsed 100% CPU (0 text+0 data 11216 max)k
ghc-7.8.4 : -14254067 1.28 user 0.00 system 0:01.28 elapsed 100% CPU (0 text+0 data 12224 max)k
ghc-7.10.1 : -14254067 1.37 user 0.01 system 0:01.38 elapsed 100% CPU (0 text+0 data 13584 max)k
ghc-7.10.2 : -14254067 1.34 user 0.02 system 0:01.35 elapsed 100% CPU (0 text+0 data 13648 max)k
ghc-7.10.3 : -14254067 1.32 user 0.02 system 0:01.33 elapsed 100% CPU (0 text+0 data 13648 max)k
ghc-8.0.1 : -14254067 1.27 user 0.00 system 0:01.27 elapsed 100% CPU (0 text+0 data 12656 max)k
ghc-8.0.2 : -14254067 1.30 user 0.00 system 0:01.30 elapsed 100% CPU (0 text+0 data 12704 max)k
ghc-8.1.20170128 : -14254067 1.34 user 0.00 system 0:01.34 elapsed 100% CPU (0 text+0 data 12768 max)k
With the Int annotation, ghc-8.0 looks good again. Well, much better: ghc-8.0.1 takes 1.27 s user time instead of 2.29 s.
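The Integer in the unannotated program comes from Haskell's defaulting rule: when a numeric type is constrained only by standard classes such as Num and Show and nothing else pins it down, the compiler picks Integer. The following sketch is not taken from the original files; it only makes the effect visible on a value that would not fit into a 64-bit Int.

```haskell
main :: IO ()
main = do
  -- No annotation: the constraints (Num a, Show a) are ambiguous, so the
  -- defaulting rule picks Integer, which has arbitrary precision.
  putStrLn (show (2 ^ 100))      -- prints 1267650600228229401496703205376
  -- With an annotation the value is a fixed-width machine Int, which is
  -- what the signature for `a` achieves for the counter and the results.
  let k = 22 :: Int
  putStrLn (show (pred k))       -- prints 21
```

In the benchmark the values stay small, so the answer is the same in both versions; only the cost of the arithmetic changes.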
Update April 2021, with the Int annotation (run-int.log):
/opt/ghc/ghc-6.8.3 : -14254067 1.01 user 0.00 system 0:01.01 elapsed 99% CPU (0 text+0 data 3712 max)k
/opt/ghc/ghc-6.10.4 : -14254067 0.94 user 0.00 system 0:00.94 elapsed 99% CPU (0 text+0 data 4028 max)k
/opt/ghc/ghc-6.12.3 : -14254067 0.91 user 0.00 system 0:00.92 elapsed 99% CPU (0 text+0 data 4464 max)k
/opt/ghc/ghc-7.0.4 : -14254067 1.00 user 0.00 system 0:01.00 elapsed 99% CPU (0 text+0 data 5072 max)k
/opt/ghc/ghc-7.4.2 : -14254067 0.95 user 0.00 system 0:00.96 elapsed 99% CPU (0 text+0 data 4916 max)k
/opt/ghc/ghc-7.6.3 : -14254067 0.94 user 0.00 system 0:00.95 elapsed 99% CPU (0 text+0 data 4884 max)k
/opt/ghc/ghc-7.8.4 : -14254067 0.96 user 0.00 system 0:00.97 elapsed 99% CPU (0 text+0 data 4984 max)k
/opt/ghc/ghc-7.10.3 : -14254067 0.99 user 0.00 system 0:01.00 elapsed 99% CPU (0 text+0 data 5240 max)k
/opt/ghc/ghc-8.0.2 : -14254067 0.96 user 0.00 system 0:00.97 elapsed 99% CPU (0 text+0 data 5236 max)k
/opt/ghc/ghc-8.2.2 : -14254067 0.92 user 0.00 system 0:00.93 elapsed 99% CPU (0 text+0 data 5044 max)k
/opt/ghc/ghc-8.6.2 : -14254067 0.90 user 0.00 system 0:00.90 elapsed 99% CPU (0 text+0 data 4820 max)k
/opt/ghc/ghc-8.6.3 : -14254067 0.92 user 0.00 system 0:00.92 elapsed 99% CPU (0 text+0 data 4984 max)k
/opt/ghc/ghc-8.6.4 : -14254067 0.91 user 0.00 system 0:00.92 elapsed 99% CPU (0 text+0 data 4976 max)k
/opt/ghc/ghc-8.6.5 : -14254067 0.90 user 0.00 system 0:00.90 elapsed 99% CPU (0 text+0 data 4820 max)k
/opt/ghc/ghc-8.8.1 : -14254067 0.84 user 0.00 system 0:00.84 elapsed 99% CPU (0 text+0 data 5056 max)k
/opt/ghc/ghc-8.8.3 : -14254067 0.86 user 0.00 system 0:00.86 elapsed 99% CPU (0 text+0 data 5060 max)k
/opt/ghc/ghc-8.8.4 : -14254067 0.88 user 0.00 system 0:00.88 elapsed 99% CPU (0 text+0 data 5140 max)k
/opt/ghc/ghc-8.10.1 : -14254067 0.90 user 0.00 system 0:00.90 elapsed 99% CPU (0 text+0 data 5168 max)k
/opt/ghc/ghc-8.10.2 : -14254067 0.90 user 0.00 system 0:00.91 elapsed 99% CPU (0 text+0 data 5508 max)k
/opt/ghc/ghc-8.10.3 : -14254067 1.19 user 0.00 system 0:01.19 elapsed 99% CPU (0 text+0 data 5420 max)k
/opt/ghc/ghc-8.10.4 : -14254067 0.93 user 0.00 system 0:00.93 elapsed 99% CPU (0 text+0 data 5452 max)k
/opt/ghc/ghc-9.0.1 : -14254067 0.90 user 0.00 system 0:00.90 elapsed 99% CPU (0 text+0 data 5276 max)k
/opt/ghc/ghc-9.2.0.20210331 : -14254067 0.93 user 0.00 system 0:00.93 elapsed 99% CPU (0 text+0 data 8400 max)k