Change history

"Intel processors have dedicated performance counters that can count events on the ring bus (for example, UNC_H_RING_AD_USED, which counts usage of the AD (address) ring). They can be read with the perf or likwid-perfctr utilities."

# likwid-perfctr -a
    Group name  Description
--------------------------------------------------------------------------------
   UOPS_RETIRE  UOPs retirement
    UOPS_ISSUE  UOPs issueing
     UOPS_EXEC  UOPs execution
          UOPS  UOPs execution info
           TMA  Top down cycle allocation
     TLB_INSTR  L1 Instruction TLB miss rate/ratio
      TLB_DATA  L2 data TLB miss rate/ratio
          SBOX  Ring Transfer bandwidth
      RECOVERY  Recovery duration
           QPI  QPI Link Layer data
          NUMA  Local and remote memory accesses
           MEM  Main memory bandwidth in MBytes/s
       L3CACHE  L3 cache miss rate/ratio
            L3  L3 cache bandwidth in MBytes/s
       L2CACHE  L2 cache miss rate/ratio
            L2  L2 cache bandwidth in MBytes/s
        ICACHE  Instruction cache miss rate/ratio
            HA  Main memory bandwidth in MBytes/s seen from Home agent
     FLOPS_AVX  Packed AVX MFLOP/s
   FALSE_SHARE  False sharing
        ENERGY  Power and Energy consumption
        DIVIDE  Divide unit information
          DATA  Load to store ratio
  CYCLE_STALLS  Cycle Activities (Stalls)
CYCLE_ACTIVITY  Cycle Activities
         CLOCK  Power and Energy consumption
          CBOX  CBOX related data and metrics
        CACHES  Cache bandwidth in MBytes/s
        BRANCH  Branch prediction miss rate/ratio
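
As a rough illustration of working with the listing above, the sketch below parses `likwid-perfctr -a` style output into a dictionary and picks out groups whose description mentions the ring. The function name `parse_likwid_groups` and the shortened `SAMPLE` text are assumptions made for this example; they are not part of LIKWID.

```python
def parse_likwid_groups(text):
    """Parse the table printed by `likwid-perfctr -a` into {group: description}."""
    groups = {}
    in_table = False
    for line in text.splitlines():
        stripped = line.strip()
        if not in_table:
            # The table body starts after the dashed separator line.
            if stripped and set(stripped) == {"-"}:
                in_table = True
            continue
        if not stripped:
            continue
        # Group names contain no spaces; the rest of the line is the description.
        name, _, description = stripped.partition(" ")
        groups[name] = description.strip()
    return groups


# Shortened sample copied from the listing above (hypothetical subset).
SAMPLE = """\
    Group name  Description
--------------------------------------------------------------------------------
           SBOX  Ring Transfer bandwidth
            QPI  QPI Link Layer data
           CBOX  CBOX related data and metrics
"""

groups = parse_likwid_groups(SAMPLE)
# Keep only groups whose description mentions the ring.
ring_groups = [name for name, desc in groups.items() if "ring" in desc.lower()]
print(ring_groups)
```

To parse live output instead of the sample, the text could come from `subprocess.run(["likwid-perfctr", "-a"], capture_output=True, text=True).stdout`, assuming LIKWID is installed. A group found this way is then selected with `-g`, for example `likwid-perfctr -C 0 -g SBOX ./app`, where `./app` is a placeholder for the program being measured.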

"Each logical processor (each hardware thread of a core) has a unique hardware-level number called the APIC ID (Advanced Programmable Interrupt Controller ID)."
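
On Linux x86 systems, the kernel reports the APIC ID of each logical CPU in the `apicid` field of `/proc/cpuinfo`. The sketch below shows one way to extract that mapping. The function name `parse_apic_ids` and the `SAMPLE_CPUINFO` values are made up for illustration and do not come from a real machine.

```python
def parse_apic_ids(cpuinfo_text):
    """Return {logical CPU number: APIC ID} from /proc/cpuinfo-style text."""
    mapping = {}
    current_cpu = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key, value = key.strip(), value.strip()
        if key == "processor":
            current_cpu = int(value)
        elif key == "apicid" and current_cpu is not None:
            # "initial apicid" is a different key and is skipped on purpose.
            mapping[current_cpu] = int(value)
    return mapping


# Illustrative input only (hypothetical values).
SAMPLE_CPUINFO = """\
processor : 0
core id : 0
apicid : 0
initial apicid : 0

processor : 1
core id : 1
apicid : 2
initial apicid : 2
"""

apic_ids = parse_apic_ids(SAMPLE_CPUINFO)
print(apic_ids)
```

On a real system the same function could be fed the contents of `/proc/cpuinfo`, for example `parse_apic_ids(open("/proc/cpuinfo").read())`. The field is only present on x86 kernels.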
