# src/perl_checker.html.pl
$s = <<'EOF';
<head>
  <title>perl_checker</title>
  <style> body { max-width: 900px; } </style>
</head>


<h1>Quick Start</h1>

To use perl_checker, simply run "perl_checker a_file.pl".
<p>
To use it under emacs, add the following line to your .emacs;
then, when you visit a perl file, you can press Ctrl-Return to run perl_checker
on that file:

<pre>
  (global-set-key [(control return)] (lambda () (interactive) (save-some-buffers 1) (compile (concat "perl_checker --restrict-to-files " (buffer-file-name (current-buffer))))))
</pre>

<p>
To use with vim, use something like:
<pre>
  perl_checker --restrict-to-files scanner.pm > errors.err ; vim -c ':copen 4' -c ':so /usr/share/vim/ftplugin/perl_checker.vim' -q
</pre>
where /usr/share/vim/ftplugin/perl_checker.vim is

<pre>
" Error formats
setlocal efm=
  \%EFile\ \"%f\"\\,\ line\ %l\\,\ characters\ %c-%*\\d:,
  \%EFile\ \"%f\"\\,\ line\ %l\\,\ character\ %c:%m,
  \%+EReference\ to\ unbound\ regexp\ name\ %m,
  \%Eocamlyacc:\ e\ -\ line\ %l\ of\ \"%f\"\\,\ %m,
  \%Wocamlyacc:\ w\ -\ %m,
  \%-Zmake%.%#,
  \%C%m
</pre>


<h1>Goals of perl_checker</h1>

<ul>
<li> for beginners in perl:
  based on what the programmer is writing,
 <ul>
  <li> suggest better or more standard ways to do the same
  <li> detect wrong code
  <br>
  =&gt; a kind of automatic teacher
 </ul>

<li> for senior programmers:
  detect typos and unused variables, check the number
  of parameters, run a global analysis to check method calls...

<li> enforce a consistent perl style by restricting code to a subset of perl features.
     In perl <a href="http://c2.com/cgi/wiki?ThereIsMoreThanOneWayToDoIt">There is more than one way to do it</a>. 
     In perl_checker's subset of Perl, there are not too many ways to do it.
     This is especially useful for big projects.
     (NB: the subset is chosen to stay expressive)
</ul>

<h1>Compared to <a href="http://www.perl.com/pub/a/2005/06/09/ppi.html">PPI</a> and <a href="http://perlcritic.tigris.org/">Perl-Critic</a></h1>

<ul>
<li>perl_checker uses its own parser, written in OCaml.
  This parser only handles a subset of perl, 
    whereas one of PPI's goals is to be able to parse unfinished perl documents.
  <p>perl_checker is a checker: it is not a big deal if it dies horribly on a weird perl expression, since it tells the programmer what to write instead.
   The issue is that perl_checker includes inter-module analysis, which implies being able to parse modules that are not perl_checker compliant.
   A solution for this is perl_checker <i>fake</i> modules. There is no perfect solution though.

<li>PPI doesn't handle operator priorities: <tt>1 + 2 &lt;&lt; 3</tt> is parsed as
    <ul><li>PPI: a list [ Number(<tt>1</tt>), Operator(<tt>+</tt>), Number(<tt>2</tt>), Operator(<tt>&lt;&lt;</tt>), Number(<tt>3</tt>) ]
        <li>perl_checker: a tree Operator(<tt>&lt;&lt;</tt>, [ Operator(<tt>+</tt>, [ Number(<tt>1</tt>), Number(<tt>2</tt>) ]), Number(<tt>3</tt>) ])
    </ul>
    Since perlcritic is built on PPI, this limits its checks to the syntax level.

<li>perl_checker is <b>much</b> faster (more than 100 times) (ML pattern matching rulez)

<li>perl_checker checks a lot more things than perlcritic: undeclared variables, unknown functions, unknown methods...

<li>and of course perl_checker's checks are different from those in Conway's <a href="http://www.oreilly.com/catalog/perlbp/">Perl Best Practices</a>
</ul>
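
<p>
As a small illustration of why the tree form matters (a sketch written for this page, not taken from perl_checker or PPI):
in Perl, <tt>+</tt> binds tighter than <tt>&lt;&lt;</tt>, so the two groupings below give different results,
and only a parser that knows the priorities can tell which one the programmer wrote.
<pre>
  use strict;
  use warnings;

  my $implicit = 1 + 2 &lt;&lt; 3;      # parsed as (1 + 2) &lt;&lt; 3
  my $explicit = 1 + (2 &lt;&lt; 3);    # parentheses force the other grouping

  print "$implicit $explicit\n";    # prints "24 17"
</pre>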

<h1>Get it</h1>

<a href="http://svnweb.mageia.org/packages/cauldron/perl_checker/current/SOURCES/">tarball</a>
<br>
<a href="http://svnweb.mageia.org/soft/perl_checker/">SVN source</a>
<br>
<a href="http://svnweb.mageia.org/packages/cauldron/perl-MDK-Common/current/SOURCES/">MDK::Common tarball</a>

<h1>Implemented features</h1>

<dl>

 <dt>detect some Perl traps
 <dd>some Perl expressions are stupid, and one gets a warning when running
 them with <tt>perl -w</tt>. The drawback of <tt>perl -w</tt> is the lack of
 code coverage: it only detects expressions which are actually evaluated.

 TESTS=various_errors.t

 </dd>

 <dt>context checks
 <dd>Perl has types associated with variable names, the so-called "context".
 Some expressions mixing contexts are stupid, and perl_checker detects them.

 TESTS=context.t

 </dd>

 <dt>suggest simpler expressions
 <dd>when there is a simpler way to write an expression, suggest it. It can
 also help detect errors.

 TESTS=suggest_better.t

 </dd>

 <dt>function call check
 <dd>detection of unknown functions or mismatching prototypes (warning: since
  perl is a dynamic language, some spurious warnings may occur when a function
  is defined using stashes).

 TESTS=prototype.t

 </dd>

 <dt>method call check
 <dd>detection of unknown methods or mismatching prototypes. perl_checker
 doesn't have any idea what the object type is, it simply checks if a method
 with that name and that number of parameters exists.

 TESTS=method.t

 </dd>

 <dt>return value check
 <dd>dropping the result of a functionally <i>pure</i> function is stupid.
 Using the result of a function returning void is stupid too.
 <br>(nb: perl_checker enforces that <tt>&amp;&amp;</tt> and <tt>||</tt> are used as boolean operators 
      whereas <tt>and</tt> and <tt>or</tt> are used for control flow)

 TESTS=return_value.t

 </dd>

 <dt>white space normalization
 <dd>enforce a similar coding style. In many languages you can find a coding
 style document (eg: <a href="http://www.gnu.org/prep/standards/standards.html#Writing-C">the GNU one</a>).

 TESTS=force_layout.t

 </dd>

 <dt>disallow <i>complex</i> expressions
 <dd>perl_checker tries to ban some weird, rarely used features.

 TESTS=syntax_restrictions.t

 </dd>

</dl>

<h1>Todo</h1>

Features that would be nice to have:
<ul>
 <li> add flow analysis
 <li> maybe a "soft typing" type analysis
 <li> detect places where imperative code can be replaced with
   functional code (already done for some <b>simple</b> loops)
 <li> check the number of returned values when checking prototype compliance
</ul>
EOF

my $_rationale = <<'EOF';
<h1>Rationale</h1>

Perl is a big language: <a
href="http://c2.com/cgi/wiki?ThereIsMoreThanOneWayToDoIt">ThereIsMoreThanOneWayToDoIt</a>.
This has advantages, but also some drawbacks for team projects:
<ul>
 <li> it is hard to learn every special rule. Automatically enforced
 coding rules help with learning incrementally
EOF

use lib ('test', '..');
use read_t;
# Build an HTML table for one test file: one row per test case that has
# log messages, with the code on the left and the logs on the right.
sub get_example {
    my ($file) = @_;
    my @tests = read_t::read_t("test/$file");
    $file =~ s|test/||;
    qq(<p><a name="$file"><table border=1 cellpadding=3>\n) .
      join('', map { 
          my $lines = join("<br>", map { "<tt>" . html_quote($_) . "</tt>" } @{$_->{lines}});
          my $logs = join("<br>", map { html_quote($_) } @{$_->{logs}});
          # skip test cases that have no log message
          $logs ? "  <tr><td>\n" . $lines . "</td><td>" . $logs . "</td></tr>\n" : '';
      } @tests) .
      "</table></a>\n";
}

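# Replace each TESTS=file marker with a link to the matching examples table.
sub anchor_to_examples {
    my ($s) = @_;
    $s =~ s!TESTS=(\S+)!(<a href="#$1">examples</a>)!g;
    $s;
}
# Replace each TESTS=file marker with the examples table itself.
sub fill_in_examples {
    my ($s) = @_;
    $s =~ s!TESTS=(\S+)!get_example($1)!ge;
    $s;
}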

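# Emit the "Implemented features" section twice: once with links to the
# examples, and once more under "Examples" with the example tables expanded.
$s =~ s!<h1>Implemented features</h1>(.*)<h1>!
        "<h1>Implemented features</h1>" . anchor_to_examples($1) .
        "<h1>Examples</h1>" . fill_in_examples($1) .
        "<h1>"!se;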

print $s;

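# Escape HTML special characters and keep leading indentation visible.
sub html_quote {
    local $_ = $_[0];
    s/&/&amp;/g;    # must come first, before the entities below are introduced
    s/</&lt;/g;
    s/>/&gt;/g;
    s/^(\s*)/"&nbsp;" x length($1)/e;
    $_;
}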
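
# Illustrative sketch only, not part of the original script: a hand-run check
# of what html_quote returns for a typical test line. The sub name
# html_quote_example is hypothetical and nothing in this script calls it.
sub html_quote_example {
    my $in  = '  $a < $b';          # two leading spaces, one '<'
    my $out = html_quote($in);
    # leading spaces become &nbsp; entities and '<' becomes &lt;
    $out eq '&nbsp;&nbsp;$a &lt; $b'
      or die "unexpected html_quote output: $out\n";
    $out;
}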