Configuration Files
Format of counter configuration file (argument passed to --counterfile)
To specify a set of performance counters to enable when profiling from the
command line, pass the name of a configuration file to the --counterfile
option. The format of this configuration file is one counter name per line.
Counter names are case-sensitive. You can generate a set of counter files
containing every available counter using the following command line:
rcprof --list --outputfile counters.txt
An example of the contents of this file is given below:
Wavefronts
VALUInsts
SALUInsts
VFetchInsts
SFetchInsts
VWriteInsts
LDSInsts
GDSInsts
VALUUtilization
VALUBusy
SALUBusy
FetchSize
WriteSize
L1CacheHit
L2CacheHit
MemUnitBusy
MemUnitStalled
WriteUnitStalled
LDSBankConflict
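As a minimal sketch (not part of the profiler), the following Python helper produces text in the one-name-per-line format described above. The function name `format_counter_file` is hypothetical, and skipping blank entries and exact repeats is an assumption made for illustration; how the profiler itself treats repeated names is not stated here.

```python
def format_counter_file(counters):
    """Return counter-file text: one counter name per line, case preserved."""
    seen = set()
    lines = []
    for name in counters:
        name = name.strip()
        # Assumption: blank entries and exact repeats are dropped. Names that
        # differ only in case are kept, since counter names are case-sensitive.
        if not name or name in seen:
            continue
        seen.add(name)
        lines.append(name)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the text that would be saved and passed to --counterfile.
    print(format_counter_file(["Wavefronts", "VALUInsts", "Wavefronts"]), end="")
```

Saving the returned text to a file gives something that can be passed to --counterfile.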
Format of kernel list configuration file (argument passed to --kernellistfile)
To specify a set of kernels to profile when collecting performance counters
from the command line, pass the name of a configuration file to the
--kernellistfile option. The format of this configuration file is one kernel
name per line. Kernel names are case-sensitive. When specified, any kernels
dispatched by the application that are not contained in the kernel list
configuration file will not be profiled.
An example of the contents of this file is given below:
MatrixMultiplyKernel
binarySearch
binomial_options
Format of API rules configuration file (argument passed to --apirulesfile)
To specify a set of rules to use when generating the summary pages from a trace
file when using the command line, pass the name of a configuration file to the
--apirulesfile option. The format of this file is one rule per line in the
NAME=VALUE format, where VALUE is either True or False.
An example of the contents of this file is given below:
APITrace.APIRules.RefTracker=True
APITrace.APIRules.BlockingWrite=False
APITrace.APIRules.BadWorkGroupSize=True
APITrace.APIRules.RetCodeAnalyzer=True
APITrace.APIRules.DataTransferAnalyzer=True
APITrace.APIRules.SyncAnalyzer=True
APITrace.APIRules.DeprecatedFunctionAnalyzer=True
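A rules file can be checked before it is handed to the profiler. The sketch below is a hypothetical validator, not the profiler's own parser: `parse_api_rules` is an illustrative name, and treating the values as case-sensitive and ignoring blank lines are assumptions.

```python
def parse_api_rules(text):
    """Parse NAME=VALUE lines into a dict mapping rule name -> bool."""
    rules = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        name, sep, value = line.partition("=")
        if not sep or not name:
            raise ValueError(f"line {lineno}: expected NAME=VALUE, got {raw!r}")
        # Assumption: only the exact spellings "True" and "False" are accepted.
        if value not in ("True", "False"):
            raise ValueError(
                f"line {lineno}: VALUE must be True or False, got {value!r}"
            )
        rules[name] = value == "True"
    return rules
```

Calling `parse_api_rules` on the example above would yield a dictionary in which only `APITrace.APIRules.BlockingWrite` maps to `False`.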
Format of API filter configuration file (argument passed to --apifilterfile)
To ignore a set of APIs when collecting an API trace using the command line,
pass the name of a configuration file to the --apifilterfile option. The
format of this file is one API name per line.
An example of the contents of this file for an OpenCL application is given below:
clGetPlatformIDs
clGetPlatformInfo
clGetDeviceIDs
clGetDeviceInfo
clGetContextInfo
clGetCommandQueueInfo
clGetSupportedImageFormats
clGetMemObjectInfo
clGetImageInfo
clGetSamplerInfo
clGetProgramInfo
clGetProgramBuildInfo
clGetKernelInfo
clGetKernelWorkGroupInfo
clGetEventInfo
clGetEventProfilingInfo
Format of environment variable file (argument passed to --envvarfile)
To specify a set of environment variables to be defined for the application
being profiled, pass the name of a configuration file to the --envvarfile
option. The format of this file is one environment variable per line in the
NAME=VALUE format.
An example of the contents of this file is given below:
APPLICATION_DATA_DIR=c:\path\to\app\data
DEBUG_FLAG=True
LOG_FILE=c:\temp\logfile.log
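When the variables already live in a script, the file can be generated rather than typed by hand. The following is a sketch under stated assumptions: `format_env_file` is a hypothetical helper, values are written verbatim (no quoting or escaping, as in the example above), and names containing "=" are rejected because they could not be read back unambiguously.

```python
def format_env_file(variables):
    """Return environment-variable file text, one NAME=VALUE pair per line."""
    lines = []
    for name, value in variables.items():
        value = str(value)
        if not name or "=" in name:
            raise ValueError(f"invalid variable name: {name!r}")
        # The format is line-based, so embedded newlines cannot be represented.
        if "\n" in name or "\n" in value:
            raise ValueError(f"newline in entry for {name!r}")
        lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_env_file({"DEBUG_FLAG": True,
                           "LOG_FILE": r"c:\temp\logfile.log"}), end="")
```

Backslashes in Windows paths pass through unchanged, which is why the usage example uses a raw string.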
Format of occupancy display configuration file (argument passed to --occupancydisplay)
A Kernel Occupancy HTML display file can be generated in one of two ways. Both
involve passing a file to the --occupancydisplay switch.
The first way to generate the HTML file is to pass a previously generated
.occupancy file to --occupancydisplay. This must be used in conjunction with
the --occupancyindex switch to specify which occupancy data from the specified
.occupancy file should be used to generate the display file. The argument
passed to --occupancyindex is a zero-based index.
The second way is a legacy path, which involves passing a file manually
generated from an .occupancy file. The format of this configuration file is one
parameter per line in the NAME=VALUE format. The values are taken from a
generated .occupancy file for a particular kernel.
An example of the contents of this file is given below:
ThreadID=3364
CallIndex=101
KernelName=reduce
DeviceName=Capeverde
ComputeUnits=10
MaxWavesPerComputeUnit=40
MaxWorkGroupPerComputeUnit=16
MaxVGPRs=256
MaxSGPRs=512
MaxLDS=32768
UsedVGPRs=11
UsedSGPRs=20
UsedLDS=4096
WavefrontSize=64
WorkGroupSize=256
WavesPerWorkGroup=4
MaxWorkGroupSize=256
MaxWavesPerWorkGroup=4
GlobalWorkSize=256
MaxGlobalWorkSize=16777216
WavesLimitedByVGPR=40
WavesLimitedBySGPR=40
WavesLimitedByLDS=32
WavesLimitedByWorkgroup=40
Occupancy=80
DeviceGfxIpVer=6
SimdsPerCU=4
This second method is currently used by the CodeXL UI. It is much easier to use the first method when manually generating Occupancy Display files using the profiler command line.
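For a hand-built file on the legacy path, a quick consistency check can catch copy errors. In the sample above, the smallest WavesLimitedBy* value (32, from LDS) divided by MaxWavesPerComputeUnit (40) gives 80%, which matches Occupancy=80. That relationship is inferred from the sample values only; it is an assumption, not a formula stated in this document. The function names below are hypothetical.

```python
def parse_occupancy_params(text):
    """Parse NAME=VALUE lines into a dict of strings."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        name, sep, value = line.partition("=")
        if not sep or not name:
            raise ValueError(f"expected NAME=VALUE, got {raw!r}")
        params[name] = value
    return params


def estimated_occupancy(params):
    """Estimate occupancy (%) as min(WavesLimitedBy*) / MaxWavesPerComputeUnit.

    Assumption: this ratio is inferred from the sample file, not documented.
    """
    limits = [int(v) for k, v in params.items() if k.startswith("WavesLimitedBy")]
    if not limits:
        raise ValueError("no WavesLimitedBy* entries found")
    return 100.0 * min(limits) / int(params["MaxWavesPerComputeUnit"])
```

Comparing `estimated_occupancy(params)` with `int(params["Occupancy"])` flags files whose limit values and reported occupancy disagree.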