[m-dev.] Mercury to HTML

Fergus Henderson fjh at cs.mu.OZ.AU
Thu Apr 4 22:51:26 AEST 2002


----- Forwarded message from Florian Haber <fhaber at rationalizer.com> -----
[...]

3.
just to relax i do some _real_ programming every now and then, and so i can send
you this little perl script (attached).
it turns a system of mercury modules into rather nicely cross-linked and indexed
html pages, and i found it quite useful. it doesn't analyze the source code too
deeply, but i have put some effort into mercury-specific things, typeclasses
e.g. (an overloaded identifier will lead you to the typeclass declaration, where
you will find a list of all of its instantiations...). i usually start with a
handwritten file 'intro.txt', supposed to be the title page of my documentation
and mentioning all the module names (well ordered), then run 'mercurydoc
intro.txt' in the directory where my mercury source files are. if you don't want
to write a title page, 'ls | mercurydoc' will do. you'll get more info with
'mercurydoc -h'.
i didn't know where to place this little contribution, so i just send it to you.
if you find it useful, please pass it on or tell me where to send it.
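
A minimal sketch of the title-page workflow described above. The module names (main.m, tree.m, list.m) and the contents of intro.txt are made up for illustration, and the grep line only approximates the way the script picks module names out of the title page. The script itself is run only if it happens to be installed as 'mercurydoc'.

```shell
#!/bin/sh
# Hypothetical example project: three empty placeholder modules.
workdir=$(mktemp -d)
cd "$workdir"
touch main.m tree.m list.m

# A plain-text title page naming the modules in the order they should appear.
cat > intro.txt <<'EOF'
My system
  main.m   entry point
  tree.m   search trees
  list.m   list helpers
EOF

# Preview the module names mentioned in the title page, in order.
# (Approximation of the script's own pattern, not the script itself.)
modules=$(grep -oE '[A-Za-z0-9_]+\.m' intro.txt | sed 's/\.m$//' | paste -sd' ' -)
echo "modules named in intro.txt: $modules"

# Generate the pages only if the script is available under the name used in the mail.
if command -v mercurydoc >/dev/null 2>&1; then
    mercurydoc intro.txt    # default output directory is ./html
fi
```

The order of the names in intro.txt matters, since it becomes the order of the module index; with 'ls | mercurydoc' the order is whatever ls prints.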

so much for today
sincerely
 - florian

-- 
Florian Haber
fhaber at rationalizer.com
The Rationalizer Intelligent Software AG
Rudower Chaussee 29, 12489 Berlin, Germany
Tel: +49 30 678060-16 
Fax: +49 30 678060-29

This e-mail contains confidential information which is the property
of THE RATIONALIZER. If you are not the proper recipient of the message,
or if you have doubts about being the proper recipient, please delete
the message immediately. In this case, we would be delighted if you
could inform us under mail at rationalizer.com.



#! /usr/bin/perl

#-------------------------------------------------------------------------------
# Copyright (C) 2002 The Rationalizer.
# Author: Florian Haber
# This file may only be copied under the terms of the GNU General
# Public License - see http://www.gnu.org/licenses/gpl.txt
#-------------------------------------------------------------------------------


#   P A R T   I :   P R O C E S S   C O M M A N D   L I N E

#--- process options
while($ARGV[0]=~/^-+(.)(.*)/)
{    $options{$1} = $2||'on';
     shift;
}
$dir = $options{'d'}||'html';

#--- table of module files from the 'intro file' ---
if (! $ARGV[0]){ print STDERR "getting module names from stdin.. (ctrl-d for help)\n" };
undef $/;
$introdoc=<> unless($options{'h'}||$options{'?'});
while($introdoc =~ /(\w+)\.m\b/g)   # \b: skip names like foo.md or foo.mk
{    push @modules,$1;
} 

#--- generate help screen ---
if(!@modules)
{   open  HELP, "|less >&2";
    print HELP <<"[HELP SCREEN]";
________________________________________________________________________________
DESCRIPTION
    This little script reads your mercury sources and turns them into html
    format with hyperlinked cross references and indices.
    It does not parse your code too deeply, but the results are still useful.

SYNTAX
    mercurydoc [OPTIONS] INTROFILE

OPTIONS
    -i       process interface parts only 
    -d<name> write html files into this directory (default './html')
    -a       analyze module structure for unnecessary imports and exports
    -h, -?   display this message

INPUT
    The INTROFILE should be an html or plaintext description of your system and
    contain all the module names (including the '.m') in a meaningful order.
    If you don't want to write one, pipe an ls instead: "ls|mercurydoc" will do.

OUTPUT
    Frames are the module index, the identifier index and the source code.
    The system view (start page) contains your INTROFILE and an index of all
    exported identifiers, the module views give an index of local identifiers.
    All imported identifiers are linked with their definitions, the definitions
    with the implementations (the latter must start in column 1!).
    Overloaded identifiers are linked with their typeclass specification.
    For every typeclass, backreferences are inserted to their instantiations.
    Also backreferences to all importing modules are inserted.

    Module structure analysis detects unnecessarily imported modules and unused
    exported identifiers and writes them to standard output.
________________________________________________________________________________
[HELP SCREEN]
    close HELP;
    exit;
}


#   P A R T   I I :   G A T H E R   S O U R C E   I N F O R M A T I O N

#--- go through all modules building global tables ---
for my $modname (@modules)
{   print STDERR '+';
    $_ = `cat $modname.m`;
    if($options{'i'}) {s/:-\s*implementation.*//s;}
    
    #--- table of imported modules: imported module -> [this module,..] ---
    while(/import_module(.*?)\./sg)
    {   for(split /[,\s]+/, $1)
        {  push @{$globtab_exportedto{$_}}, $modname;
        }
    }

    #--- delete implementation part ---
    s/:-\s*implementation.*//s;
    study;
    
    #--- table of exported names: this module -> list of exported ids ---
    #--- table of all exported names:list of exported id / name of this module ---
    while(/:-\s*((pred|func|type|inst|typeclass)\s+(\w+))/g)
    {  push @{$globtab_exports{$modname}}, $3;
       push @globtab_allexports, "$3/$modname";       
    }
    while(/:-\s*typeclass.*?\bwhere\s*\[(.*?)\]/sg)
    {  my $tmp = $1;
       while ($tmp =~ /(pred|func)\s+(\w+)/g)
       {  push @{$globtab_exports{$modname}}, $2;
          push @globtab_allexports, "$2/$modname";
       }
    }

    #--- table of overloaded names: classname -> link to thismodule#typename ---
    while(/:-\s*instance\s+(\w+)\(.*?(\w+)/g )
    {  push @{$globtab_overload{$1}}, "<a hReF='mod_$modname.html#inst_$1_$2'>$2</a>";
    }
}


#   P A R T   I I I :   O U T P U T   T O C   F I L E S

mkdir $dir,0755;
print STDERR "\n";

#--- generate index file ---
$html= <<"[HTML]";
<html><frameset cols='200,*'>
              <frameset rows='33%,*'>
                      <frame name='tocmod' src='toc_modules.html'>
                      <frame name='tocids' src='toc_expids.html'>
              </frameset>
              <frame name='code' src='intro.html'>
      </frameset></html>
[HTML]
qx|echo "$html" >$dir/index.html|;

#--- generate toc of all modules ---
$tocm = join "", map "<a href='mod_$_.html' target='code'>$_</a><br>", @modules;
$html = <<"[HTML]";
<html><body bgcolor='white' style='font: small sans-serif'>
      <a href='intro.html' target='code'>SYSTEM VIEW</a><br>
      $tocm</body></html>
[HTML]
qx|echo "$html" >$dir/toc_modules.html|;

#--- generate toc of all exported identifiers ---
$toca = join "", map {m|(.*)/(.*)|; $_="<a href='mod_$2.html' target='code'>$1</a><br>";}
             (sort @globtab_allexports);
$html = "<html><body bgcolor='white' style='font: small sans-serif'>EXPORTS:<BR>$toca</body></html>";
open  TOCEXP, ">$dir/toc_expids.html";
print TOCEXP $html;
close TOCEXP;

#--- insert links in 'structure doc' and generate start page ---
$introdoc =~ s|(\w+)\.m\b|<a href='mod_$1.html'>$1.m</a>|g;
unless($introdoc =~ /<html>/)
{   $introdoc = <<"[HTML]";
<html><script>open('toc_expids.html','tocids');</script>
      <body bgcolor='white'><pre wrap>$introdoc</pre></body></html>
[HTML]
}
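#--- write via a filehandle: the intro text may contain shell metacharacters ---
open  INTRO, ">$dir/intro.html" or die "cannot write $dir/intro.html: $!\n";
print INTRO $introdoc;
close INTRO;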


#   P A R T   I V :   G E N E R A T E   M O D U L E   F I L E S

#--- go through all modules again ---
for my $modname (@modules)
{   print STDERR '*';
    my $thismod = `cat $modname.m`;
    if($options{'i'}) {$thismod =~ s/:-\s*implementation.*//s;}
    my @loctab_decls;
    my @loctab_imports;
    my @loctab_used;

    #--- table of all the imported module names ---
    while($thismod =~ /import_module(.*?)\./sg)
    {   push @loctab_imports, split /[,\s]+/, $1;
    }

    #--- insert links to imported modules ---
    sub impsubst
    {   $_ = shift;
        for my $m (@modules)
        {   s|\b\Q$m\E\b|<a hReF='mod_$m.html'>$m</a>|;
        }
        return $_;
    }
    $thismod =~ s/import_module.*?\./impsubst($&)/seg;

    #--- insert backreference to the importing modules ---
    $imps = (join ", ", map "<a hReF='mod_$_.html'>$_</a>", @{$globtab_exportedto{$modname}})
            || "nowhere imported";
    $thismod =~ s/:-\s*module.*/$&\n% imported by: $imps./;

    #--- insert anchors/links for the contents of typeclass declarations ---
    #--- table of local declarations: identifiers ---
    while($thismod =~ /:-\s*typeclass.*?\bwhere\s*\[(.*?)\]/sg)
    {  my $tmp = $1;
       while ($tmp =~ /(pred|func)\s+(\w+)/g)
       {  push @loctab_decls, $2;
       }
    }
    sub tclsubst
    {   $_ = shift;
        s%((\[|,)[\s\n]*(pred|func)\s+)(\w+)%$1<a nAmE='decl_$4' hReF='#decl_$4'>$4</a>%g;
        return $_;
    }
    $thismod =~ s/:-\s*typeclass.*?\bwhere\s*\[.*?\]/tclsubst($&)/seg;
 
    #--- insert backreferences for overloadings ---
    $thismod =~ s|:-\s*typeclass\s+(\w+).*|
                  "$&\n% instantiated by: ".(join ", ", @{$globtab_overload{$1}} )|eg;

    #--- insert anchors for instantiations ---
    $thismod =~ s%:-\s*instance\s+(\w+)\(.*?(\w+)\)[\s\n]*?where%<a nAmE='inst_$1_$2'>$&</a>%g;

    #--- insert anchors/links in declarations of the module and their implementations ---
    #--- table of local declarations: identifiers ---
    $tmp = $thismod;
    while($tmp =~ /(:-\s*(pred|func|type|inst|typeclass)\s+)(\w+)/g)
    {   $id = $3;
        $thismod =~ s%^$id\b%<a nAmE='impl_$id' hReF='#decl_$id'>$id</a>%m;
        push @loctab_decls, $id;
    }
    $thismod =~ s%(:-\s*(pred|func|type|inst|typeclass)\s+)(\w+)%$1<a nAmE='decl_$3' hReF='#impl_$3'>$3</a>%g;

    #--- insert links at references of all imported identifiers ---
    #--- tabulate usage of imported modules ---
    if($options{'a'})
    {   for my $impmod (@loctab_imports)
	{   for (@{$globtab_exports{$impmod}})
	    {   if($thismod =~ s%([^>])\b$_\b%$1<a hReF='mod_$impmod.html#decl_$_'>$_</a>%g)
		{   push @loctab_used,$impmod;
		    $analysis_usage{"$impmod-$_"}=1;
		}
	    }
	}
	for $m (@modules)
	{   if ( (grep {$m eq $_} @loctab_imports) && ! (grep {$m eq $_} @loctab_used) )
	    {   push @analysis_imports, "   $modname <- $m\n";
	    }
	}
    }
    else
    {   for my $impmod (@loctab_imports)
	{   for (@{$globtab_exports{$impmod}})
	    {   $thismod =~ s%([^>])\b$_\b%$1<a hReF='mod_$impmod.html#decl_$_'>$_</a>%g;
	    }
	}
    }

    #--- nice formatting and generate module html file ---
    $thismod =~ s%(\n.*)?(:-\s*(module|interface|implementation).*)\n%</pre><h2>$2</h2><pre wrap>%g;
    open  MODHTML, ">$dir/mod_$modname.html";
    print MODHTML <<"[HTML]";
<html><head><script>open('toc_ids_$modname.html','tocids');</script></head>
      <body bgcolor='white'><pre>$thismod</pre></body></html>
[HTML]
    close MODHTML;

    #--- generate toc of identifiers for this module ---
    $toct = join "", map {$_="<a href='mod_$modname.html#decl_$_' target='code'>$_</a><br>";}
                 (sort @loctab_decls);
    $html = "<html><body bgcolor='white' style='font: small sans-serif'>$toct</body></html>";
    qx|echo "$html" >$dir/toc_ids_$modname.html|;
}

print STDERR "\n";


#   P A R T   V :   G E N E R A T E   A N A L Y S I S

#--- generate analysis output ---
if($options{'a'})
{   print "MODULE STRUCTURE ANALYSIS\n";
    print "module .. unnecessarily imports module ..\n";
    print @analysis_imports;
    print "module .. unnecessarily exports identifier ..\n";
    for $m (@modules)
    {   for $id (@{$globtab_exports{$m}})
	{   unless($analysis_usage{"$m-$id"}) { print "   $m -> $id\n" };
	}
    }
}


----- End forwarded message -----

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------


