[m-rev.] for review: include program name and date/time in deep prof data files' names

Julien Fischer jfischer at opturion.com
Sat Oct 5 15:06:02 AEST 2024


On Fri, 4 Oct 2024 at 21:37, Zoltan Somogyi <zoltan.somogyi at runbox.com> wrote:
>
> better names than Deep.data for profiling data.

I assume the rest of that was accidently not included.

> runtime/mercury_deep_profiling.c:
>     Replace the "Deep" in "Deep.data" and "Deep.procrep" with
>
>     - the name of the executable program, and
>     - the date and time of the profiling run,
>
>     yielding names such as
>
>     mercury_compile_2024_10_4_13_6_22.data, and
>     mercury_compile_2024_10_4_13_6_22.procrep.
>
> doc/user_guide.texi:
>     Document the new filenames.
>
> NEWS.md:
>     Announce the change.
>
> diff --git a/NEWS.md b/NEWS.md
> index 9111c9e3f..cda47c312 100644
> --- a/NEWS.md
> +++ b/NEWS.md
> @@ -1397,6 +1397,15 @@ Changes to the Mercury debugger
>  * The `stack` command now identifies cliques by drawing boxes next to them
>    on the left hand side of the screen.
>
> +Changes to the Mercury deep profiler
> +-------------------------------
> +
> +* Programs that have been compiled in a deep profiling grade will now
> +  put profiling data in files whose names contain both the name of the program
> +  being profiled, and the date and time of the profiling run. So instead
> +  of Deep.data and Deep.procrep, the output will go into files whose names
> +  have the form <prog>_<datetime>.data <prog>_<datetime>.procrep.

Add the word "and" between the two file name templates.

...

> diff --git a/doc/user_guide.texi b/doc/user_guide.texi
> index b7ab0e2a1..3d4ac0628 100644
> --- a/doc/user_guide.texi
> +++ b/doc/user_guide.texi
> @@ -5986,7 +5986,9 @@ Executables compiled with @samp{--memory-profiling}
>  will use two of those files (@file{Prof.Decl} and @file{Prof.CallPair})
>  and two others: @file{Prof.MemoryWords} and @file{Prof.MemoryCells}.
>  Executables compiled with @samp{--deep-profiling}
> -save profiling data in a single file, @file{Deep.data}.
> +save profiling data in two files whose names will have form
> + at file{@var{programname}_ at var{date}_ at var{time}.data} and
> + at file{@var{programname}_ at var{date}_ at var{time}.procrep}.

You should add that on Windows the ".exe" extension will be omitted from
programname.

(Creating deep profiles on Windows should work, for at least some of the
profiling metrics; the machinery for viewing deep profiles won't work, but it
should be possible to view those profiles on another machine.)

>  Executables compiled with the @samp{--threadscope} option write profiling data
>  to a file whose name is that of the program being profiled with the extension
>  @samp{.eventlog}.
> @@ -6335,7 +6337,9 @@ of the program may be significantly higher than in non-memory profiling grades.
>
>  The user interface of the deep profiler is a browser.
>  To display the information contained in a deep profiling data file
> -(which will be called @file{Deep.data} unless you renamed it),
> +(whose name will have the form
> + at file{@var{programname}_ at var{date}_ at var{time}.data}
> +unless you renamed it),
>  start up your browser and give it a URL of the form
>  @file{http://server.domain.name/cgi-bin/mdprof_cgi?/full/path/name/Deep.data}.
>  The @file{server.domain.name} part should be the name of a machine
> @@ -6344,11 +6348,13 @@ it should have a web server running on it,
>  and it should have the @samp{mdprof_cgi} program installed in
>  the web server's CGI program directory.
>  (On many Linux systems, this directory is @file{/usr/lib/cgi-bin}.)
> -The @file{/full/path/name/Deep.data} part
> +The @file{/full/path/name/@var{programname}_ at var{date}_ at var{time}.data} part
>  should be the full path name of the deep profiling data file
>  whose data you wish to explore.
>  The name of this file must not have percent signs in it,
>  and it must end in the suffix @file{.data}.
> +(The deep profiler will replace this suffix with @file{.procrep}
> +to get access to the other file generated by the profiling run.)
>
>  When you start up @samp{mdprof} using the command above,
>  you will see a list of the usual places

...

> diff --git a/runtime/mercury_deep_profiling.c b/runtime/mercury_deep_profiling.c
> index 0f60a94c9..798d0ef6f 100644
> --- a/runtime/mercury_deep_profiling.c
> +++ b/runtime/mercury_deep_profiling.c
> @@ -249,7 +250,9 @@ MR_deep_profile_update_method_history(void)
>
>  // Functions for writing out the data at the end of the execution.
>
> -static  void    MR_deep_data_output_error(const char *msg, const char *file);
> +static  void    MR_deep_data_output_error(const char *msg,
> +                    const char *bad_filename, const char *data_filename,
> +                    const char *procrep_filename);
>  static  void    MR_write_out_profiling_tree_check_unwritten(FILE *check_fp);
>
>  static  void    MR_write_out_deep_id_string(FILE *fp);
> @@ -366,6 +369,8 @@ static  FILE        *debug_fp;
>  #define MR_MDPROF_DATA_FILENAME     "Deep.data"
>  #define MR_MDPROF_PROCREP_FILENAME  "Deep.procrep"
>
> +#define MR_FILENAME_BUF_LEN     1024
> +
>  void
>  MR_write_out_profiling_tree(void)
>  {
> @@ -377,22 +382,80 @@ MR_write_out_profiling_tree(void)
>      unsigned                num_call_seqs;
>      int64_t                 table_sizes_offset;
>      char                    errbuf[MR_STRERROR_BUF_SIZE];
> +    char                    prog_name[MR_FILENAME_BUF_LEN];
> +    char                    date_name[MR_FILENAME_BUF_LEN];
> +    static char             data_filename[MR_FILENAME_BUF_LEN];
> +    static char             procrep_filename[MR_FILENAME_BUF_LEN];
> +    time_t                  seconds_since_epoch;
> +    struct tm               *tm;
>
>  #ifdef MR_DEEP_PROFILING_STATISTICS
>      int                     i;
>  #endif
>
> -    deep_fp = fopen(MR_MDPROF_DATA_FILENAME, "wb+");
> +    if (MR_progname != NULL) {

By design, MR_progname will never be NULL after the runtime has finished
processing argv (see MR_process_args() in runtime/mercury_wrapper.c).
You should check the value of the variable MR_progname_is_known.

> +        // Put the profiling output into the current directory.
> +        const char  *p;
> +        const char  *last_slash_ptr;
> +
> +        // A multi-byte UTF-8 character cannot contain any ASCII bytes,
> +        // and that includes both '0' and '/'.
> +        last_slash_ptr = NULL;
> +        for (p = MR_progname; *p != '\0'; p++) {
> +            if (*p == '/') {
> +                last_slash_ptr = p;
> +            }
> +        }

Use MR_get_progame_basename from runtime/mercury_runtime_util.h for this,
since that handles Windows correctly.

> +        if (last_slash_ptr != NULL) {
> +            strncpy(prog_name, last_slash_ptr + 1, MR_FILENAME_BUF_LEN);
> +        } else {
> +            strncpy(prog_name, MR_progname, MR_FILENAME_BUF_LEN);
> +        }
> +    } else {
> +        strncpy(prog_name, "Deep", MR_FILENAME_BUF_LEN);
> +    }
> +
> +    strncpy(data_filename, prog_name, MR_FILENAME_BUF_LEN);
> +    strncpy(procrep_filename, prog_name, MR_FILENAME_BUF_LEN);
> +
> +    tm = NULL;
> +    seconds_since_epoch = time(NULL);
> +    if (seconds_since_epoch > 0) {
> +        tm = localtime(&seconds_since_epoch);
> +        if (tm != NULL) {
> +            // XXX Is there a way to format this data+time that
> +            // - more readily identifies each component,
> +            // - but does not use characters that may not occur
> +            //   in filenames, such as '/', and
> +            // - also does not use characters that would cause problems
> +            //   in shell commands?
> +            // For example, would A-B-C_D:E:F be better?

Given that ":" is a drive specifier on Windows, please don't use that.

> +            //
> +            // And should the separation char between the program name and
> +            // the date and time be something other than an underscore?
> +            snprintf(date_name, MR_FILENAME_BUF_LEN, "_%d_%d_%d_%d_%d_%d",
> +                tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday,
> +                tm->tm_hour, tm->tm_min, tm->tm_sec);
> +            strncat(data_filename, date_name, MR_FILENAME_BUF_LEN);
> +            strncat(procrep_filename, date_name, MR_FILENAME_BUF_LEN);

I would be inclined to borrow the 'T' separator used in the ISO 8061 formats
and use that to separate the date and time components. The timestamp could
either be:

   YYYMMDD'T'HHMMSS (e.g. 20241005T145900)

which is compact,

or you could use '-' to separate the date and time components.

   YYYY-MM-DD'T'HH-MM-SS (e.g. 2024-10-05T14-59-00)

which is a bit more readable.

(Given that ':' has a special meaning in some file systems I would avoid
it entirely.)

Regardless of which, the month, day, hour, minutes and second fields
should have a minimum width
of two and be zero padded, e.g. 02 for February, rather than just 2.

> +    }
> +
> +    strncat(data_filename, ".data", MR_FILENAME_BUF_LEN);
> +    strncat(procrep_filename, ".procrep", MR_FILENAME_BUF_LEN);
> +
> +    deep_fp = fopen(data_filename, "wb+");

Put an XXX WINDOWS against this; given that data_filename can now potentially
contain non-ASCII characters, using fopen() is not the appropriate thing to do.
(I will take a look at anything marked XXX WINDOWS, since I fixed a bunch of
similar problems back in January.)

>      if (deep_fp == NULL) {
>          MR_fatal_error("cannot open `%s' for writing: %s",
> -            MR_MDPROF_DATA_FILENAME,
> +            data_filename,
>              MR_strerror(errno, errbuf, sizeof(errbuf)));
>      }
>
> -    procrep_fp = fopen(MR_MDPROF_PROCREP_FILENAME, "wb+");
> +    procrep_fp = fopen(procrep_filename, "wb+");
>      if (procrep_fp == NULL) {
>          MR_fatal_error("cannot open `%s' for writing: %s",
> -            MR_MDPROF_PROCREP_FILENAME,
> +            procrep_filename,
>              MR_strerror(errno, errbuf, sizeof(errbuf)));
>      }
>
> @@ -414,7 +477,7 @@ MR_write_out_profiling_tree(void)
>      table_sizes_offset = MR_ftell(deep_fp);
>      if (table_sizes_offset == -1) {
>          MR_deep_data_output_error("ftell failed for ",
> -            MR_MDPROF_DATA_FILENAME);
> +            data_filename, data_filename, procrep_filename);
>      }
>      MR_write_fixed_size_int(deep_fp, 0);
>      MR_write_fixed_size_int(deep_fp, 0);
> @@ -468,7 +531,7 @@ MR_write_out_profiling_tree(void)
>
>      if (MR_fseek(deep_fp, table_sizes_offset, SEEK_SET) != 0) {
>          MR_deep_data_output_error("cannot seek to header of",
> -            MR_MDPROF_DATA_FILENAME);
> +            data_filename, data_filename, procrep_filename);
>      }
>
>      MR_write_fixed_size_int(deep_fp, MR_call_site_dynamic_table->last_id);
> @@ -477,13 +540,14 @@ MR_write_out_profiling_tree(void)
>      MR_write_fixed_size_int(deep_fp, MR_proc_layout_table->last_id);
>
>      if (fclose(deep_fp) != 0) {
> -        MR_deep_data_output_error("cannot close", MR_MDPROF_DATA_FILENAME);
> +        MR_deep_data_output_error("cannot close",
> +            data_filename, data_filename, procrep_filename);
>      }
>
>      putc(MR_no_more_modules, procrep_fp);
>      if (fclose(procrep_fp) != 0) {
>          MR_deep_data_output_error("cannot close",
> -            MR_MDPROF_PROCREP_FILENAME);
> +            procrep_filename, data_filename, procrep_filename);
>      }
>
>  #ifdef MR_DEEP_PROFILING_STATISTICS
> @@ -656,7 +720,8 @@ MR_write_out_profiling_tree(void)
>  }
>
>  static void
> -MR_deep_data_output_error(const char *op, const char *filename)
> +MR_deep_data_output_error(const char *op, const char *filename,
> +    const char *data_filename, const char *procrep_filename)
>  {
>      char    errbuf[MR_STRERROR_BUF_SIZE];
>
> @@ -667,15 +732,15 @@ MR_deep_data_output_error(const char *op, const char *filename)
>      // misunderstandings about that, and may also cure a disk-full condition,
>      // if the close failure was caused by that.
>
> -    if (remove(MR_MDPROF_DATA_FILENAME) != 0) {
> +    if (remove(data_filename) != 0) {

Add an XXX WINDOWS: this probably wants to be _wremove() on Windows.

>          MR_warning("cannot remove %s: %s",
> -            MR_MDPROF_DATA_FILENAME,
> +            data_filename,
>              MR_strerror(errno, errbuf, sizeof(errbuf)));
>      }
>
> -    if (remove(MR_MDPROF_PROCREP_FILENAME) != 0) {
> +    if (remove(procrep_filename) != 0) {

Ditto.

The rest looks fine.

Julien.


More information about the reviews mailing list