Member flamegraphs?

I was recently profiling a simple C++ program and came across flame graphs. For future reference, here are the steps I followed to generate one.

First, install FlameGraph by cloning the GitHub repository:

git clone https://github.com/brendangregg/FlameGraph.git

Run perf to profile your program. The basic syntax for profiling a program with perf is as follows:

perf record ./program [arguments]

This starts profiling and runs your program. When the program finishes, perf writes the recorded samples to a file named perf.data in the current directory.

Analyze the profile file generated by perf. The basic syntax for analyzing the profile file is as follows:

perf report

This opens a report that summarizes where your program spent its time. You can customize it with options such as --sort, which chooses the keys the entries are sorted by, or --stdio, which prints the report as plain text instead of opening the interactive interface, so the output can be piped to another command.

You can also pass other options to perf record to customize the profiling. For example, -e selects the events to monitor, such as CPU cycles or cache misses, and --call-graph records call stacks so that you can see which call paths consume the most resources. Run perf help or perf help <command> for more information about the available options.

Use perf script to dump the samples stored in perf.data (the file written by perf record) as text:

perf script > perf.script

This command writes the text dump to a file named perf.script in the current directory.

Use the stackcollapse-perf.pl script provided by FlameGraph to collapse each multi-line stack in perf.script into a single line, which is the folded format that flamegraph.pl reads:

./FlameGraph/stackcollapse-perf.pl perf.script > perf.folded

This command will generate a folded stack file named perf.folded in the current directory.
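To make the folded format concrete, here is a small sketch that writes a hand-made folded file and totals its sample counts. The file name sample.folded, the function names inside it, and the helper folded_total are all made up for illustration. A real perf.folded should have the same shape: one line per unique stack, frames separated by semicolons, then a space and the number of samples.

```shell
# Hypothetical folded stacks: "frame;frame;... count" on each line.
cat > sample.folded <<'EOF'
program;main;parse_input 3
program;main;compute;multiply 5
program;main;compute 2
EOF

# folded_total FILE: sum the last field (the sample count) over all lines.
folded_total() {
    awk '{ total += $NF } END { print total + 0 }' "$1"
}

folded_total sample.folded   # prints 10
```

Running folded_total on your own perf.folded is a quick sanity check: a result of 0 means there is nothing for flamegraph.pl to draw.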

Use the flamegraph.pl script provided by FlameGraph to generate the flame graph from the folded stack file:

./FlameGraph/flamegraph.pl perf.folded > perf.svg

This command will generate a flame graph named perf.svg in the current directory.

The resulting flame graph shows the call stacks of your application. The width of each block is proportional to the number of samples in which that function was on the stack, which approximates the time spent in the function and everything it calls. The output is an SVG file: opening it in a web browser lets you hover over blocks for details and click on them to zoom in, while an image viewer shows only the static picture.
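The link between block width and sample counts can be checked by hand on a folded file. The sketch below is only an illustration under assumptions: demo.folded and its function names are invented, and frame_share is a hypothetical helper, not part of FlameGraph. It reports the percentage of samples whose stack contains a given function, which is roughly the total width that function occupies in the graph.

```shell
# Hypothetical folded stacks, 10 samples in total.
cat > demo.folded <<'EOF'
program;main;parse_input 3
program;main;compute;multiply 5
program;main;compute 2
EOF

# frame_share FUNCTION FILE: percent of samples whose stack contains FUNCTION.
frame_share() {
    awk -v name="$1" '
        {
            count = $NF
            total += count
            # Drop the trailing count, then split the stack into frames.
            stack = $0
            sub(/ [0-9]+$/, "", stack)
            n = split(stack, frames, ";")
            # Count each stack at most once, even if FUNCTION recurses.
            for (i = 1; i <= n; i++) {
                if (frames[i] == name) { hits += count; break }
            }
        }
        END { if (total > 0) printf "%.1f\n", 100 * hits / total }
    ' "$2"
}

frame_share compute demo.folded    # prints 70.0
frame_share multiply demo.folded   # prints 50.0
```

Here compute is on the stack in 7 of the 10 samples, so its block would span about 70% of the width, with multiply stacked on top of it at about 50%.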

If no stack counts are found

The error message “ERROR: No stack counts found” typically occurs when FlameGraph cannot find any stack samples in the input file. This can happen if there are no samples recorded in the perf.data file, or if the perf command was not configured to record call stacks.

If there are no samples recorded in the perf.data file, FlameGraph will not be able to generate a flame graph. To confirm that there are samples recorded in the perf.data file, you can use the perf report command to generate a text report:

perf report -i perf.data

If the report shows no samples at all, the program may have finished too quickly to be sampled; try a longer-running workload or raise the sampling frequency with the -F option of perf record. If there are samples but no call stacks, the reason is that perf record does not record call stacks by default. To collect them, pass the --call-graph dwarf option to perf record:

perf record --call-graph dwarf ./program

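This instructs perf to collect call stacks using DWARF-based unwinding. Other unwinding methods exist, such as fp (frame pointers, which requires the program to be compiled with -fno-omit-frame-pointer) and lbr (on CPUs that support it); pass whichever fits your setup to --call-graph.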

Note that flamegraph.pl expects its input in the folded format: one line per unique stack, with the frames separated by semicolons and followed by a sample count. If the input is not in this format, it cannot generate a flame graph. To confirm that call stacks were actually recorded, dump perf.data with perf script and inspect the result:

perf script -i perf.data > perf.script
less perf.script

In perf.script, each sample should be a header line followed by one indented line per stack frame. If the samples have no frame lines under them, call stacks were not recorded. After running stackcollapse-perf.pl, each line of perf.folded should hold a whole stack, with the frames separated by semicolons and a sample count at the end.
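As a rough check of whether a perf script dump contains call stacks, the sketch below counts the indented frame lines in it. It is a heuristic under assumptions: the two excerpts are hand-written to resemble typical perf script output (a sample header, then indented "address symbol (file)" lines when stacks were recorded), and count_frame_lines is a hypothetical helper. The exact layout can differ between perf versions, so treat the result as a hint.

```shell
# Hypothetical excerpt of a dump recorded WITH call stacks.
{
    printf '%s\n' 'program 4242 1000.000100: 250000 cycles:'
    printf '\t%s\n' '55d5c8a0116d compute+0x14 (/home/user/program)'
    printf '\t%s\n' '55d5c8a011a2 main+0x1b (/home/user/program)'
    printf '\n'
} > with_stacks.script

# Hypothetical excerpt recorded WITHOUT call stacks: one line per sample.
printf '%s\n' \
    '         program 4242 1000.000100: 250000 cycles: 55d5c8a0116d compute+0x14 (/home/user/program)' \
    > no_stacks.script

# count_frame_lines FILE: count indented lines that start with a hex address
# and do not carry a sample timestamp such as " 1000.000100: ".
count_frame_lines() {
    awk '/^[ \t]+[0-9a-f]+ / && !/ [0-9]+\.[0-9]+: / { n++ }
         END { print n + 0 }' "$1"
}

count_frame_lines with_stacks.script   # prints 2
count_frame_lines no_stacks.script     # prints 0
```

A count of 0 on your own perf.script suggests re-running perf record with a --call-graph option before trying stackcollapse-perf.pl again.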

Summary

To recap, the sequence of commands that worked for me is:

perf record --call-graph dwarf ./program.out
perf script > perf.script
/opt/FlameGraph/stackcollapse-perf.pl perf.script > perf.folded
/opt/FlameGraph/flamegraph.pl perf.folded > perf.svg
open perf.svg
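
The same sequence can be wrapped in a small shell function that stops early when an intermediate file is empty, which is the situation behind the "No stack counts found" error. This is a sketch under assumptions, not a tested tool: make_flamegraph is a hypothetical name, and the PERF and FLAMEGRAPH_DIR variables are introduced here only so the paths can be overridden (the default matches the /opt/FlameGraph location used above).

```shell
# Assumed defaults; override them in the environment if your setup differs.
PERF="${PERF:-perf}"
FLAMEGRAPH_DIR="${FLAMEGRAPH_DIR:-/opt/FlameGraph}"

# make_flamegraph OUTPUT.svg COMMAND [ARGS...]
make_flamegraph() {
    out="$1"
    shift
    # Remove stale results so the checks below reflect this run only.
    rm -f perf.data perf.script perf.folded

    # Record with call stacks; "--" separates perf options from the command.
    "$PERF" record --call-graph dwarf -o perf.data -- "$@"
    if [ ! -s perf.data ]; then
        echo "perf.data is missing or empty: nothing was recorded" >&2
        return 1
    fi

    "$PERF" script -i perf.data > perf.script || return 1
    "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" perf.script > perf.folded || return 1

    # An empty folded file means there are no stacks for flamegraph.pl to draw.
    if [ ! -s perf.folded ]; then
        echo "perf.folded is empty: no call stacks were collapsed" >&2
        return 1
    fi

    "$FLAMEGRAPH_DIR/flamegraph.pl" perf.folded > "$out"
}

# Example call (not run here): make_flamegraph perf.svg ./program.out
```

The function only reports which step produced nothing; it does not try to fix the recording options for you.
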
Written on October 22, 2023