One question which I often see asked on the Web is how to find out certain stats about our GPU under Linux. Under Windows, we have GUI-based programs such as ‘GPU-Z’, but under Linux the information can be a bit harder to find.
I think that one thing which helps is to have ‘OpenCL’ installed, together with the command-line utility ‘clinfo’. That utility is provided by any one of several packages, each of which installs a command of the same name.
If we’re serious about programming our GPU, then a GUI won’t help us much. In that case we need to get our hands dirty with code, and text-based tools are the suitable ones. But if we’re just spectators in this sport, then two stats we may nevertheless want to know are:
- How many GPU-Core-Groups do we have – since GPU-Cores are organized as Groups, and
- How many actual Shader-Cores do we have in each Group?
Interestingly, the number of shader-core groups is also the number of vector-processors (‘compute units’) which GPU-computing tools such as OpenCL see. On the computer which I name ‘Klystron’, which is running Debian / Jessie, typing the following commands as an ordinary user gives me these results:
dirk@Klystron:~$ clinfo | grep units
Max compute units: 4
Max compute units: 6
dirk@Klystron:~$ clinfo | grep multiple
Kernel Preferred work group size multiple: 1
Kernel Preferred work group size multiple: 64
dirk@Klystron:~$
This needs some explaining. On ‘Klystron’, I have the proprietary AMD packages for OpenCL installed, since that computer has both an AMD CPU and a Radeon GPU. This means that OpenCL is able to carry out computations on both, and so I get the stats for both: the first line of each pair describes the CPU, and the second describes the GPU.
In this case, the second entries reveal that I have 6×64 = 384 cores on the GPU.
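The multiplication can also be left to a short script. What follows is only a minimal sketch in Python, assuming the colon-separated format which ‘Klystron’ printed above, and assuming that the two kinds of line appear in the same device order. The function name `estimate_cores` is made up for illustration. Strictly speaking, the preferred multiple is a performance hint reported by the driver, so the product is an estimate in the same sense as the figure above.

```python
import re

# The four lines which 'clinfo | grep ...' printed on 'Klystron'.
SAMPLE = """\
Max compute units: 4
Max compute units: 6
Kernel Preferred work group size multiple: 1
Kernel Preferred work group size multiple: 64
"""

def estimate_cores(clinfo_text):
    """Return one (units, multiple, units * multiple) tuple per device."""
    units = [int(n) for n in
             re.findall(r"Max compute units:\s*(\d+)", clinfo_text)]
    multiples = [int(n) for n in
                 re.findall(r"Preferred work group size multiple:\s*(\d+)",
                            clinfo_text)]
    # Pair the n-th 'units' line with the n-th 'multiple' line.
    return [(u, m, u * m) for u, m in zip(units, multiples)]

for units, multiple, cores in estimate_cores(SAMPLE):
    print(f"{units} x {multiple} = {cores}")
```

With the sample text, the first line printed belongs to the CPU and the second to the GPU. To use live data, the text of the full ‘clinfo’ output could be passed to the function in place of `SAMPLE`.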
But if I try the same experiment on the recently-installed computer ‘Plato’, which is running Debian / Stretch, I only get partial results, along with the error-message ‘invalid source’:
dirk@Plato:~$ clinfo | grep units
=== CL_PROGRAM_BUILD_LOG ===
invalid source Max compute units 7
dirk@Plato:~$ clinfo | grep multiple
=== CL_PROGRAM_BUILD_LOG ===
invalid source Preferred work group size multiple invalid source
dirk@Plato:~$
It can tell me how many Groups I have (7), but not how many Cores there are in each Group…
The first-order explanation is that the ‘clinfo’ command is itself a mini-OpenCL program: it contains source-code written in C for the GPU, even though all that code does is help fetch the stats. Apparently, on ‘Plato’, some of the source-code in ‘clinfo’ is not recognized by the OpenCL framework.
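Any script which reads this output should therefore treat the per-group figure as optional. Here is a minimal sketch in Python, assuming the literal text which ‘Plato’ printed; the helper name `read_stats` is hypothetical, and the pattern also tolerates the colon-separated format seen on ‘Klystron’.

```python
import re

# The relevant lines as 'Plato' printed them.
SAMPLE = """\
Max compute units 7
=== CL_PROGRAM_BUILD_LOG ===
invalid source Preferred work group size multiple invalid source
"""

def read_stats(clinfo_text):
    """Return (compute_units, multiple); either is None if not reported."""
    units = re.search(r"Max compute units:?\s+(\d+)", clinfo_text)
    # A number must follow the label; 'invalid source' will not match.
    multiple = re.search(
        r"Preferred work group size multiple:?\s+(\d+)", clinfo_text)
    return (int(units.group(1)) if units else None,
            int(multiple.group(1)) if multiple else None)

units, multiple = read_stats(SAMPLE)
if multiple is None:
    print(f"{units} compute units, cores per unit unknown")
else:
    print(f"{units} x {multiple} = {units * multiple}")
```

Returning `None` rather than 0 keeps a failed query from silently turning into a core count of zero.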
A closer look at the problem explains it better. On ‘Klystron’, my OpenCL installation is the proprietary AMD version of the packages, including ‘amd-clinfo’, while on ‘Plato’ I only have the Mesa drivers installed, even for OpenCL.
For this type of thing, it’s generally better to be using the proprietary drivers. But in order to install the NVIDIA OpenCL drivers on ‘Plato’, I’d also need to have several G.P. graphics drivers installed side by side, the proprietary as well as the generic ones, as I do on ‘Klystron’. This means I’d also need to install the ‘…alternatives…’ packages on ‘Plato’, which on ‘Klystron’ presently allow me to switch back and forth between drivers.
For the moment, I think that would put what I have going with ‘Plato’, which has been a very stable experience, at too much risk. But this also means that with the Mesa drivers alone, I cannot rely on doing any OpenCL computing, since even the demands of the generic ‘clinfo’ package go beyond what those OpenCL drivers can compile.
For now:
dirk@Plato:~$ clinfo
Number of platforms 1
Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 Mesa 13.0.6
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA
Platform Name Clover
Number of devices 1
Device Name NVC4
Device Vendor NVIDIA
Device Vendor ID 0x10de
Device Version OpenCL 1.1 Mesa 13.0.6
Driver Version 13.0.6
Device OpenCL C Version OpenCL C 1.1
Device Type GPU
Device Profile FULL_PROFILE
Max compute units 7
Max clock frequency 512MHz
Max work item dimensions 3
Max work item sizes 1024x1024x64
Max work group size 1024
=== CL_PROGRAM_BUILD_LOG ===
invalid source Preferred work group size multiple invalid source
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 2 / 2
half 0 / 0 (n/a)
float 4 / 4
double 2 / 2 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Address bits 64, Little-Endian
Global memory size 1099511627776 (1024GiB)
Error Correction support No
Max memory allocation 1099511627776 (1024GiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type None
Image support No
Local memory type Local
Local memory size 49152 (48KiB)
Max constant buffer size 65536 (64KiB)
Max number of constant args 15
Max size of kernel argument 4096 (4KiB)
Queue properties
Out-of-order execution No
Profiling Yes
Profiling timer resolution 0ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Device Available Yes
Compiler Available Yes
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_fp64
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) Clover
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [MESA]
clCreateContext(NULL, ...) [default] Success [MESA]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Clover
Device Name NVC4
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Clover
Device Name NVC4
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.11
ICD loader Profile OpenCL 2.1
dirk@Plato:~$
Dirk