Issue
I'm using perf for profiling on Ubuntu 20.04 (though I can use any other free tool). It accepts a delay on the command line, so that event collection starts a certain time after program launch. However, the time I need to skip varies a lot (by 20 seconds out of 1000), and there are also tail computations that I am not interested in.
So it would be great to call some API from my program to start perf event collection for the fragment of code I'm interested in, and then stop collection after that code finishes.
Running the code in a loop is not really an option, because there is a ~30-second initialization phase and a 10-second measurement phase, and I'm only interested in the latter.
Solution
There is an inter-process communication mechanism for this between the program being profiled (or a controlling process) and the perf process: use the --control option, in the form --control=fifo:ctl-fifo[,ack-fifo] or --control=fd:ctl-fd[,ack-fd], as described in the perf-stat(1) manpage. The option takes either a pair of pathnames of FIFO files (named pipes) or a pair of file descriptors. The first is used to issue commands that enable or disable all events in any perf process listening on it. The second, which is optional, lets you confirm that perf has actually executed the command.
There is an example in the manpage that shows how to use this option to control a perf process from a bash script, which you can easily translate to C/C++:
ctl_dir=/tmp/
ctl_fifo=${ctl_dir}perf_ctl.fifo
test -p ${ctl_fifo} && unlink ${ctl_fifo}
mkfifo ${ctl_fifo}
exec {ctl_fd}<>${ctl_fifo} # open for read+write; bash stores the new FD number in ctl_fd
This first checks whether /tmp/perf_ctl.fifo exists and is a named pipe, and only then deletes it. It's not a problem if the file doesn't exist, but if it exists and is not a named pipe, it should not be deleted and mkfifo should fail instead. mkfifo then creates a named pipe with the pathname /tmp/perf_ctl.fifo. The next command opens the file for reading and writing and stores the file descriptor number in ctl_fd. The equivalent calls in C are stat, unlink, mkfifo, and open. Note that the named pipe is written to by the shell script (the controlling process) or by the process being profiled, and is read by the perf process. The same commands are repeated for a second named pipe, opened as ctl_fd_ack, which is used to receive acknowledgements from perf.
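A minimal C sketch of the shell steps above, for readers translating the script. The helper name open_control_fifo and the 0600 mode are my own choices, not something from the manpage; the descriptor is deliberately opened without O_CLOEXEC so that a perf child started later can inherit it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: (re)create a FIFO at `path` and open it read+write.
   Returns the file descriptor, or -1 on error (errno is set). */
int open_control_fifo(const char *path)
{
    struct stat st;

    assert(path != NULL);

    /* Like `test -p path && unlink path`: remove only a stale FIFO. */
    if (stat(path, &st) == 0 && S_ISFIFO(st.st_mode)) {
        if (unlink(path) != 0)
            return -1;
    }

    /* Fails with EEXIST if a non-FIFO file is still in the way. */
    if (mkfifo(path, 0600) != 0)
        return -1;

    /* O_RDWR mirrors `exec {fd}<>path`; on Linux, opening a FIFO
       read+write does not block waiting for a peer. */
    return open(path, O_RDWR);
}
```

You would call it twice, once for the control pipe and once for the acknowledgement pipe, and keep both paths around for cleanup.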
perf stat -D -1 -e cpu-cycles -a -I 1000 \
--control fd:${ctl_fd},${ctl_fd_ack} \
-- sleep 30 &
perf_pid=$!
This forks the current process and runs perf stat in the child process, which inherits the same file descriptors. The -D -1 option tells perf to start with all events disabled. You probably need to change the perf options as follows:
perf stat -D -1 -e <your event list> --control fd:${ctl_fd},${ctl_fd_ack} -p pid
In this case, the program to be profiled is the same as the controlling process, so tell perf to profile your already running program using -p. The equivalent syscalls are fork followed by execv in the child process.
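As a rough C sketch of that fork-and-exec step, assuming the two descriptors are already open in the calling process: the helper names format_control_arg and spawn_perf_stat are hypothetical, and I use execlp rather than execv only so that perf is looked up in PATH. The perf options are the ones shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: format the value for --control, e.g. "fd:5,6".
   Returns 0 on success, -1 if the buffer is too small. */
int format_control_arg(char *buf, size_t size, int ctl_fd, int ack_fd)
{
    assert(buf != NULL);
    int n = snprintf(buf, size, "fd:%d,%d", ctl_fd, ack_fd);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}

/* Hypothetical helper: start `perf stat` attached to the calling process,
   with events initially disabled. Returns perf's PID, or -1 on error. */
pid_t spawn_perf_stat(const char *events, int ctl_fd, int ack_fd)
{
    char control[64];
    char target[32];

    if (format_control_arg(control, sizeof control, ctl_fd, ack_fd) != 0)
        return -1;

    /* Take our PID before forking; in the child, getpid() would be perf's. */
    snprintf(target, sizeof target, "%ld", (long)getpid());

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: the descriptors are inherited as long as they were
           opened without O_CLOEXEC. */
        execlp("perf", "perf", "stat", "-D", "-1", "-e", events,
               "--control", control, "-p", target, (char *)NULL);
        _exit(127); /* only reached if exec failed */
    }
    return pid; /* parent: perf's PID, or -1 if fork failed */
}
```

A call might look like spawn_perf_stat("cpu-cycles", ctl_fd, ack_fd); keep the returned PID so you can signal perf during cleanup.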
sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
The example script sleeps for about 5 seconds, writes 'enable' to the ctl_fd pipe, and then reads the response from perf to ensure that the events have been enabled. After about 10 more seconds it disables the events in the same way. The equivalent syscalls are write and read.
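The write-then-read handshake could look roughly like this in C. The helper name perf_ctl_command is hypothetical; like the script, it sends the command followed by a newline and treats any reply on the acknowledgement descriptor as confirmation.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `cmd` ("enable" or "disable") plus a newline
   to the control descriptor, then block until perf replies on the ack
   descriptor. Returns 0 on success, -1 on error. */
int perf_ctl_command(int ctl_fd, int ack_fd, const char *cmd)
{
    char line[32];
    char ack[16];

    assert(cmd != NULL);

    size_t len = strlen(cmd);
    if (len + 1 >= sizeof line)
        return -1;
    memcpy(line, cmd, len);
    line[len] = '\n'; /* same framing as `echo 'enable'` */

    if (write(ctl_fd, line, len + 1) != (ssize_t)(len + 1))
        return -1;

    /* Like `read -u`: block until perf has answered. */
    ssize_t n = read(ack_fd, ack, sizeof ack);
    return n > 0 ? 0 : -1;
}
```

Call perf_ctl_command(ctl_fd, ack_fd, "enable") immediately before the code you want to measure and perf_ctl_command(ctl_fd, ack_fd, "disable") right after it. Waiting for the acknowledgement is what keeps the measured region from starting before perf is actually counting.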
The rest of the script closes the file descriptors and deletes the pipe files.
Putting it all together now, your program should look like this:
/* PART 1
Initialization code.
*/
/* PART 2
Create named pipes and fds.
Fork perf with disabled events.
perf is running now but nothing is being measured.
You can redirect perf output to a file if you wish.
*/
/* PART 3
Enable events.
*/
/* PART 4
The code you want to profile goes here.
*/
/* PART 5
Disable events.
perf is still running but nothing is being measured.
*/
/* PART 6
Cleanup.
Let this process terminate, which would cause the perf process to terminate as well.
Alternatively, use `kill(pid, SIGINT)` to gracefully kill perf.
perf stat outputs the results when it terminates.
*/
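For PART 6, one possible shape of the cleanup is sketched below, using the kill(pid, SIGINT) route. The helper name stop_perf is hypothetical, and waiting for perf with waitpid is my addition so that the caller does not continue before perf has exited and printed its results.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: ask perf to stop, wait for it to exit, then close
   the descriptors and remove the FIFOs. Either path may be NULL if the
   descriptor did not come from a named pipe.
   Returns 0 on success, -1 if perf could not be signalled or reaped. */
int stop_perf(pid_t perf_pid, int ctl_fd, int ack_fd,
              const char *ctl_path, const char *ack_path)
{
    int status = 0;
    int rc = 0;

    assert(perf_pid > 0);

    if (kill(perf_pid, SIGINT) != 0)
        rc = -1;
    else if (waitpid(perf_pid, &status, 0) != perf_pid)
        rc = -1;

    close(ctl_fd);
    close(ack_fd);
    if (ctl_path != NULL)
        unlink(ctl_path);
    if (ack_path != NULL)
        unlink(ack_path);
    return rc;
}
```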
Answered By - Hadi Brais