Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstrProf] Add frontend temporal profiling flag #122385

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 32 additions & 0 deletions clang/docs/UsersManual.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3035,6 +3035,38 @@ indexed format, regardeless whether it is produced by frontend or the IR pass.
overhead. ``prefer-atomic`` will be transformed to ``atomic`` when supported
by the target, or ``single`` otherwise.

.. option:: -fprofile-generate-temporal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mild suggestion: -fgenerate-temporal-profile or -ftemporal-profile-generate seems less awkward to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I somewhat agree, but I think it's more important to keep consistent with the other flags. What do others think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. Maybe -fprofile-temporal-generate? Putting temporal closer to profile seems like an improvement. But I'm fine with leaving things as-is if others feel strongly about it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this option is used together with the primary option -fprofile-generate, the naming consistency is not important. I suggest making it even simpler: -ftemporal-profile


Enables the temporal profiling extension for IRPGO to improve startup time by
reducing ``.text`` section page faults. To do this, we instrument function
timestamps to measure when each function is called for the first time and use
this data to generate a function order to improve startup.

The profile is generated as normal.

.. code-block:: console

$ clang++ -O2 -fprofile-generate -fprofile-generate-temporal code.cc -o code
$ ./code
$ llvm-profdata merge -o code.profdata yyy/zzz

Using the resulting profile, we can either generate a function order to pass
to the linker.

.. code-block:: console

$ llvm-profdata order code.profdata -o code.orderfile
$ clang++ -O2 -Wl,--symbol-ordering-file=code.orderfile code.cc -o code

Or it can be passed to the linker directly.

.. code-block:: console

$ clang++ -O2 -Wl,--irpgo-profile=code.profdata,--bp-startup-sort=function code.cc -o code
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-fuse-ld=lld for ELF. (Other linkers will unlikely get the feature in the short time)

You probably want to mention the Mach-O way as well


For more information, please read the RFC:
https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068

Fine Tuning Profile Collection
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Expand Down
5 changes: 4 additions & 1 deletion clang/include/clang/Driver/Options.td
Original file line number Diff line number Diff line change
Expand Up @@ -1796,6 +1796,9 @@ def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-co
def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">,
Group<f_Group>, Visibility<[ClangOption, CLOption]>, MetaVarName<"<directory>">,
HelpText<"Generate instrumented code to collect coverage info for cold functions into <directory>/default.profraw (overridden by LLVM_PROFILE_FILE env var)">;
def fprofile_generate_temporal : Flag<["-"], "fprofile-generate-temporal">,
Group<f_Group>, Visibility<[ClangOption, CLOption]>,
HelpText<"Generate instrumented code to collect temporal information">;
def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
Group<f_Group>, Visibility<[ClangOption, CLOption]>,
HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">;
Expand Down Expand Up @@ -1891,7 +1894,7 @@ defm pseudo_probe_for_profiling : BoolFOption<"pseudo-probe-for-profiling",
" pseudo probes for sample profiling">>;
def forder_file_instrumentation : Flag<["-"], "forder-file-instrumentation">,
Group<f_Group>, Visibility<[ClangOption, CC1Option, CLOption]>,
HelpText<"Generate instrumented code to collect order file into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var). Deprecated, please use temporal profiling.">;
HelpText<"Generate instrumented code to collect order file into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var). Deprecated, please use -fprofile-generate-temporal">;
def fprofile_list_EQ : Joined<["-"], "fprofile-list=">,
Group<f_Group>, Visibility<[ClangOption, CC1Option, CLOption]>,
HelpText<"Filename defining the list of functions/files to instrument. "
Expand Down
10 changes: 9 additions & 1 deletion clang/lib/Driver/ToolChains/Clang.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -662,6 +662,14 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C,
CmdArgs.push_back("--pgo-function-entry-coverage");
}

if (auto *A = Args.getLastArg(options::OPT_fprofile_generate_temporal)) {
if (!PGOGenerateArg && !CSPGOGenerateArg)
D.Diag(clang::diag::err_drv_argument_only_allowed_with)
<< A->getSpelling() << "-fprofile-generate or -fcs-profile-generate";
CmdArgs.push_back("-mllvm");
CmdArgs.push_back("--pgo-temporal-instrumentation");
}

Arg *PGOGenArg = nullptr;
if (PGOGenerateArg) {
assert(!CSPGOGenerateArg);
Expand Down Expand Up @@ -8050,7 +8058,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
Args.getLastArg(options::OPT_forder_file_instrumentation)) {
D.Diag(diag::warn_drv_deprecated_arg)
<< A->getAsString(Args) << /*hasReplacement=*/true
<< "-mllvm -pgo-temporal-instrumentation";
<< "-fprofile-generate-temporal";
CmdArgs.push_back("-forder-file-instrumentation");
// Enable order file instrumentation when ThinLTO is not on. When ThinLTO is
// on, we need to pass these flags as linker flags and that will be handled
Expand Down
2 changes: 1 addition & 1 deletion clang/test/Driver/clang_f_opts.c
Original file line number Diff line number Diff line change
Expand Up @@ -424,7 +424,7 @@
// CHECK-WARNING-DAG: optimization flag '-fno-devirtualize-speculatively' is not supported
// CHECK-WARNING-DAG: the flag '-fslp-vectorize-aggressive' has been deprecated and will be ignored
// CHECK-WARNING-DAG: the flag '-fno-slp-vectorize-aggressive' has been deprecated and will be ignored
// CHECK-WARNING-DAG: argument '-forder-file-instrumentation' is deprecated, use '-mllvm -pgo-temporal-instrumentation' instead
// CHECK-WARNING-DAG: argument '-forder-file-instrumentation' is deprecated, use '-fprofile-generate-temporal' instead

// Test that we mute the warning on these
// RUN: %clang -### -finline-limit=1000 -Wno-invalid-command-line-argument \
Expand Down
7 changes: 7 additions & 0 deletions clang/test/Driver/fprofile-generate-temporal.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
// RUN: %clang -### -c -fprofile-generate -fprofile-generate-temporal %s 2>&1 | FileCheck %s
// RUN: %clang -### -c -fcs-profile-generate -fprofile-generate-temporal %s 2>&1 | FileCheck %s
// RUN: not %clang -### -c -fprofile-generate-temporal %s 2>&1 | FileCheck %s --check-prefix=ERR

// CHECK: "-mllvm" "--pgo-temporal-instrumentation"

// ERR: '-fprofile-generate-temporal' only allowed with '-fprofile-generate or -fcs-profile-generate'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Include error: for driver error messages

Loading