The reason was that the invocation of the cpuid_tool through the makefile
was using LD_LIBRARY_PATH=. to force link&use of the latest-build libcpuid
library. This doesn't seem to work, so using LD_PRELOAD to explicitly load
libcpuid.so into cpuid_tool.
The committed approach of course doesn't work on Mac OS X, where
make test-old should be used.
msrdriver.c was forgotten from Makefile.am, under the assumption that
autotools are used under *nix only. This is not true, as one could use
MinGW or cygwin.
The reason for the apparent slowness was that the "cpuid_tool" that
was being called from run_tests.py is actually a shell script - a
wrapper, installed by libtool. The script was heavy enough to cause
substantial overhead. Just bypassing it (by using the real
cpuid_tool binary) reduced "make test" running time from 4.1 to 0.7s
on my machine.
It is a bit hacky, though, so "make test-old" is retained, which
uses the old invocation.
On Windows, we don't have the API that Linux provides, which can be used
to query MSRs of particular CPU cores. However, the same behaviour can be
emulated. Say that the driver handle object also stores 'dedicated thread
index'. When you call 'cpu_msr_driver_open()', this index is set to -1,
so further API functions do not force which core should be executing
RDMSR code. I.e. "I don't care on which core I run".
However, if this is non-negative number, the subsequent
functions like cpu_rdmsr() are forced to pass through this core by
using temporary affinity mask.
This function is similar to cpu_msr_driver_open(), but with core number as parameter
For Linux, core number was always 0. To be able to get all core temp/voltage, we need to set the core number.
They have L3 cache, and the detection code incorrectly assumed this is a Xeon
Irwindale variant due to an old and no longer valid classification check.
Correctly handle the XEON_IRWIN subcode and add an entry in the matchtable
to fix Sandy Bridge-E Xeon.
- move the INLINE_ASM_SUPPORTED guards outside the body of exec_cpuid, as
suggested by Genoil;
- copy the asm code of busy_sse_loop to masm-x64.asm. Some fixup was
required, because the microsoft calling convention doesn't expect
xmm6 & xmm7 to be clobbered in functions.
Confirmed that --clock-ic from cpuid_tool works with the resulting library.