LMBench 3.0 GCC vs ICC - Kernel 2.6.33

Compiler CFLAGS Configuration

GCC 4.4.2:

-O3 -march=core2 -mtune=core2 -msse4 -funroll-loops -fprefetch-loop-arrays -fvariable-expansion-in-unroller -ffast-math -fno-tree-vectorize

ICC 11.1.064:

-O3 -xSSE4.2 -ip -fp-model fast=2 -unroll-aggressive -vec-guard-write

Kernel Configuration

2.6.33-rc8 Kernel Config

Hardware Configuration

Intel Core i7 920 @ 2784 MHz - 1066 MHz Memory Frequency - 6 GB Memory (4183 MB allocated to LMBench)

Benchmark Results

Basic system parameters
------------------------------------------------------------------------------
Host                 OS Description              Mhz  tlb  cache  mem   scal
                                                     pages line   par   load
                                                           bytes
--------- ------------- ----------------------- ---- ----- ----- ------ ----
gcc      Linux 2.6.33-        x86_64-linux-gnu 2784                       1
icc      Linux 2.6.33-        x86_64-linux-gnu 2784                       1

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
gcc      Linux 2.6.33- 2784 0.05 0.08 0.61 0.94 2.37 0.13 0.87 71.4 258. 874.
icc      Linux 2.6.33- 2784 0.05 0.08 0.55 0.87 2.19 0.13 0.92 76.0 265. 887.
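
The two rows above can be compared column by column as a relative difference. The following is a minimal sketch, not part of LMBench: the names `percent_change` and `compare` are hypothetical helpers, and the numbers are copied by hand from the gcc and icc rows of this table. GCC is taken as the baseline, so a negative value means the ICC build was faster in that column.

```python
# Values copied from the "Processor, Processes" rows above (microseconds).
# Column names follow the two-line table header.
columns = ["null call", "null I/O", "stat", "open clos", "slct TCP",
           "sig inst", "sig hndl", "fork proc", "exec proc", "sh proc"]
gcc = [0.05, 0.08, 0.61, 0.94, 2.37, 0.13, 0.87, 71.4, 258.0, 874.0]
icc = [0.05, 0.08, 0.55, 0.87, 2.19, 0.13, 0.92, 76.0, 265.0, 887.0]


def percent_change(baseline, other):
    """Return (other - baseline) / baseline as a percentage."""
    return (other - baseline) / baseline * 100.0


def compare(names, baseline_row, other_row):
    """Map each column name to the percent change of other_row vs baseline_row."""
    return {
        name: percent_change(b, o)
        for name, b, o in zip(names, baseline_row, other_row)
    }


if __name__ == "__main__":
    # Negative means ICC took less time than GCC (smaller is better here).
    for name, delta in compare(columns, gcc, icc).items():
        print(f"{name:<10} {delta:+6.1f}%")
```

The same helper applies to the other "smaller is better" tables. For the bandwidth table, where bigger is better, the sign reads the other way. These are differences between single reported figures and say nothing about run-to-run variance.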

Basic integer operations - times in nanoseconds - smaller is better
-------------------------------------------------------------------
Host                 OS  intgr intgr  intgr  intgr  intgr
                          bit   add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
gcc      Linux 2.6.33- 0.3600 0.1800 0.1100 8.6000 8.2400
icc      Linux 2.6.33- 0.3600 0.1800 0.1100 8.6100 8.2600

Basic uint64 operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host                 OS int64  int64  int64  int64  int64
                         bit    add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
gcc      Linux 2.6.33-  0.360        0.1100   15.9   15.1
icc      Linux 2.6.33-  0.360        0.1100   15.9   15.1

Basic float operations - times in nanoseconds - smaller is better
-----------------------------------------------------------------
Host                 OS  float  float  float  float
                         add    mul    div    bogo
--------- ------------- ------ ------ ------ ------
gcc      Linux 2.6.33- 1.0800 1.4400 5.3500 5.0300
icc      Linux 2.6.33- 1.0800 1.4400 5.3500 5.0300

Basic double operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host                 OS  double double double double
                         add    mul    div    bogo
--------- ------------- ------ ------ ------ ------
gcc      Linux 2.6.33- 1.0800 1.8000 8.2300 7.9000
icc      Linux 2.6.33- 1.0800 1.8000 8.2300 7.9000

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host                 OS  2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                         ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
gcc      Linux 2.6.33- 0.4300 0.5600 0.5900 0.8900 1.3400 1.02000 1.38000
icc      Linux 2.6.33- 0.4700 0.5100 0.4900 0.9000 1.3100 1.08000 1.35000

*Local* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
gcc      Linux 2.6.33- 0.430 1.664 2.50 5.564       6.393        18.
icc      Linux 2.6.33- 0.470 1.596 3.20 5.497       6.826        20.

*Remote* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host                 OS   UDP  RPC/  TCP   RPC/ TCP
                               UDP         TCP  conn
--------- ------------- ----- ----- ----- ----- ----
gcc      Linux 2.6.33-
icc      Linux 2.6.33-

File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host                 OS   0K File      10K File     Mmap    Prot   Page   100fd
                        Create Delete Create Delete Latency Fault  Fault  selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
gcc      Linux 2.6.33-  120.9  149.9  128.2  145.3   294.0 0.294 0.01440 1.047
icc      Linux 2.6.33-  116.4  170.3  132.9  165.6   296.0 0.286 0.01450 0.927

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------------------------
Host                OS  Pipe AF    TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
gcc      Linux 2.6.33- 5093 5321 4415 4584.5 7715.2 4111.3 3850.1 5840 8076.
icc      Linux 2.6.33- 5009 5325 4300 4577.8 7707.1 4172.8 3830.4 5827 8129.

Memory latencies in nanoseconds - smaller is better
    (WARNING - may not be correct, check graphs)
------------------------------------------------------------------------------
Host                 OS   Mhz   L1 $   L2 $    Main mem    Rand mem    Guesses
--------- -------------   ---   ----   ----    --------    --------    -------
gcc      Linux 2.6.33-  2784 1.4370 3.5920        21.6        89.1
icc      Linux 2.6.33-  2784 1.4370 3.5930        21.6        87.4