perl-MDK-Common tutorial v0.1

Guillaume Cottenceau (maintainer: Pixel)

Introduction

This document is meant for people who want to learn more about perl-MDK-Common, a Perl library used extensively in Mandriva's in-house software development.

The library adds some convenient "basic" functions to Perl, allows easier functional-style programming, and also provides some better system-related operations. It can be seen as an extension to the standard Perl library, adding missing helpful functions. It is divided as follows:

MDK::Common::DataStructure: some useful list/hash functions
MDK::Common::File: useful functions to work on files
MDK::Common::Func: functions suited to functional-style programming
MDK::Common::Math: some math functions
MDK::Common::String: functions to perform various formatting on strings
MDK::Common::System: system-related useful functions
MDK::Common::Various: other useful functions

Thanks to perl-MDK-Common's own documentation, an easy way to directly access information about the provided functions is to use perldoc. For example, perldoc MDK::Common::Func will list the functions of the Func sub-module. Use perldoc MDK::Common to view information on all the available functions.

Additionally, perl-MDK-Common provides a binary called perl_checker, which is a Perl compiler aiming at enforcing the use of a subset of Perl, so that all Mandriva Perl programs roughly follow the same code style. It will also help the programmer to remove unneeded parentheses and conditionals.

Prerequisites

The reader needs at least a basic familiarity with the Perl language. The following can be a good Perl tutorial (although there are many others on the web): http://www.comp.leeds.ac.uk/Perl/.

Programming with perl-MDK-Common also emphasizes the following quality properties in your code:

no code duplication: at the grassroots, this library aims at helping you with the simple operations you have to deal with over and over in usual programs; this includes reading the contents of a file, finding the maximum numeric element of an array, etc; in order to be efficient with perl-MDK-Common, you need to always keep in mind not to duplicate the code that performs a single action

functional-style programming: this is not a very common technique among programmers, and maybe even less so among Perl programmers; but functional-style programs are often clearer, more expressive, more reusable, and more maintainable than traditional programs

strict code style: Perl is known to be a language with which "there is more than one way to do it"; actually, nearly every Perl program uses a different code style; that's nice for the programmer's freedom, and that's awful for code maintenance; perl_checker will ask you to follow a specific code style

We can't discuss Perl programming without referring to two excellent books from O'Reilly. The first one is called "The Perl Cookbook", and covers many daily problems a Perl programmer will face, in a recipe-like fashion. All Perl programmers should own this book :). The second one can be a good resource for more skillful programmers, and is called "Advanced Perl Programming"; it covers interesting advanced features of Perl.

Structure of this document

This document first presents the most useful functions of the perl-MDK-Common library, i.e. the simplest and most commonly used ones. Then, some functions whose use is not trivial are explained. The last part introduces the code style that perl_checker expects.

Most useful functions

Note: many function names are suffixed with the underscore character (_) when the function extends the capabilities of an existing function or is its functional counterpart; for example, chomp_ is the semantic equivalent of chomp, but returns the chomp'ed results instead of modifying its argument.

cat_(FILENAME): returns the file contents; in scalar context it returns a single string, in array context it returns the lines. If the file doesn't exist, it returns undef.
Perl I/O operations are verbose and the API is cluttered. There are many situations in which you want to read the contents of a file, either to put it in a scalar or to loop over its lines. cat_ lets you do that easily:
```
  printf "Mandriva release:\n%s\n", cat_('/etc/mandriva-release');

  foreach (cat_('/proc/mounts')) {
      my ($dev, $where, $type) = split;
      print "$dev is mounted on $where (type $type)\n";
  }
  
```
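The context-dependent return value described above can be sketched in plain Perl with wantarray. The cat_ below is a stand-in written for this example so that it runs without perl-MDK-Common installed; it only assumes the behaviour described in this tutorial and is not the library's code. The temporary file is hypothetical sample data.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for cat_ (an assumption based on the description above,
# not the library's implementation).
sub cat_ {
    my ($file) = @_;
    # Bare return: undef in scalar context, empty list in list context.
    open(my $fh, '<', $file) or return;
    my @lines = <$fh>;
    close $fh;
    return wantarray ? @lines : join('', @lines);
}

# Hypothetical sample file so the example is self-contained.
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "first\nsecond\n";
close $tmp;

my $content = cat_($path);    # scalar context: one string
my @lines   = cat_($path);    # list context: one element per line
printf "%d characters, %d lines\n", length $content, scalar @lines;
```

Keep in mind that function arguments are evaluated in list context, so a cat_ call placed directly in an argument list yields lines, not a single string.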
output(FILENAME, LIST): creates a file and outputs the list (if the file exists, it is clobbered)
Counterpart of cat_:
```
  output('/tmp/resolv.conf', 
         "search $domain\n", 
         map { "nameserver $_\n" } @name_servers);
  
```
member(SCALAR, LIST): is the value in the list?
Returns true if the value is stringwise equivalent to an element of the list:
```
  if (!member($driver, @modules)) {
      print "Sorry, the driver is not available in our modules.\n"
  }
  
```
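What "stringwise equivalent" means in practice can be shown with a small stand-in. The member below is written for this example only, assuming the comparison is done with the eq operator as described above; the module names are hypothetical.

```perl
use strict;
use warnings;

# Stand-in for member (assumption: elements are compared with eq).
sub member {
    my ($value, @list) = @_;
    foreach my $item (@list) {
        return 1 if $item eq $value;
    }
    return 0;
}

my @modules = ('e1000', '8139too', 'tg3');   # hypothetical module names
print member('tg3', @modules) ? "found\n" : "missing\n";

# Stringwise comparison: the number 1.0 stringifies to "1" and matches '1',
# whereas the string '1.0' does not.
print member(1.0, '1', '2')   ? "found\n" : "missing\n";
print member('1.0', '1', '2') ? "found\n" : "missing\n";
```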
difference2(ARRAY REF, ARRAY REF): returns the first list without the elements of the second list
Performs a set-wise subtraction, i.e. removes from the first list the elements that are members of the second one:
```
  my @types = difference2(\@available_types, \@bad_types);
  print "Please select a type from: @types\n";
  
```
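A minimal sketch of the set-wise subtraction, using a hash for the lookups. This difference2 is a stand-in written for the example (an assumption about the behaviour, not the library's code), and the filesystem names are hypothetical data.

```perl
use strict;
use warnings;

# Stand-in for difference2: keep the elements of the first list
# that do not appear in the second one.
sub difference2 {
    my ($first, $second) = @_;
    my %exclude = map { $_ => 1 } @$second;
    return grep { !$exclude{$_} } @$first;
}

my @available_types = qw(ext3 vfat proc ext3);   # hypothetical data
my @bad_types       = qw(proc sysfs);

my @types = difference2(\@available_types, \@bad_types);
print "@types\n";
```

In this sketch the order of the first list is preserved and its duplicates are kept; only the elements found in the second list are dropped.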
uniq(LIST): returns the list with no duplicates
Removes duplicates from the list, preserving the order of the list and keeping the first occurrence of each duplicated element.
```
  my @types = uniq map { (split)[2] } cat_('/proc/mounts');
  print "Filesystem types in use: @types\n"
  
```
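The order-preserving behaviour can be sketched with the usual %seen idiom. The uniq below is a stand-in for this example, assuming the semantics described above; the input list is hypothetical.

```perl
use strict;
use warnings;

# Stand-in for uniq: an element passes the grep only the first time
# it is seen, so the original order is kept.
sub uniq {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @types = uniq(qw(ext3 proc ext3 tmpfs proc));   # hypothetical data
print "@types\n";
```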
min(LIST): returns the minimum number from a list
max(LIST): returns the maximum number from a list
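A short sketch of why numeric comparison matters here. The min and max below are stand-ins written for this example, assuming they compare with the numeric operators < and >; the sizes are hypothetical.

```perl
use strict;
use warnings;

# Stand-ins for min and max (assumption: numeric comparison).
sub min { my $m = shift; foreach (@_) { $m = $_ if $_ < $m } return $m }
sub max { my $m = shift; foreach (@_) { $m = $_ if $_ > $m } return $m }

my @sizes = (9, 10, 2);   # hypothetical data
printf "smallest %d, largest %d\n", min(@sizes), max(@sizes);

# For contrast: Perl's default sort compares strings, so "9" sorts after "10".
my $stringwise_last = (sort @sizes)[-1];
print "last element of a default sort: $stringwise_last\n";
```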
chomp_(STRING): non-mutating version of chomp: it does not modify its argument and returns the chomp'ed value.
Very useful for simple functional expressions.

Note: it also works on lists: chomp_($a, $b) returns the same values as chomp($a); chomp($b); ($a, $b), but leaves $a and $b untouched.
```
  my $pid = chomp_(cat_('/var/run/cardmgr.pid'));
  
```
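The difference with the builtin chomp can be sketched as follows. This chomp_ is a stand-in for the example, assuming it works on copies of its arguments as described above; the sample values are hypothetical.

```perl
use strict;
use warnings;

# Stand-in for chomp_: chomp copies and return them.
sub chomp_ {
    my @copies = @_;      # work on copies, leave the arguments alone
    chomp @copies;
    return wantarray ? @copies : $copies[0];
}

my $line = "1234\n";                  # hypothetical file contents
my $pid  = chomp_($line);             # scalar context: first value
my @both = chomp_("a\n", "b\n");      # list context: every value

print "pid=$pid, list=@both\n";
print "original still ends with a newline\n" if $line =~ /\n\z/;
```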

Other interesting functions

The following describes functions whose use is not trivial.

if_(BOOL, LIST)
Returns LIST if the BOOL condition is true, else an empty list.

Note: it is equivalent to BOOL ? LIST : (), except that since it's a function, LIST is evaluated even if BOOL is false. It's useful because it's shorter and more readable than the ternary ?:.

A first convenient use is when you want to loop on a list and conditionally on another:
```
  foreach (@types, if_($arch eq 'ppc', @ppc_types)) {
      # ...
  }
```
It's also useful to select elements from a list and modify them on-the-fly, i.e. to perform the equivalent of a grep followed by a map. It works because Perl automatically concatenates lists.
```
  my @md_devices = map { if_(/^(md\d+)/, $1) } cat_('/proc/mdstat');

      # the if_ form above is equivalent to (but much more elegant than):

  my @md_devices = map { /^(md\d+)/ } grep { /^md\d+/ } cat_('/proc/mdstat');
  
```
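The remark that LIST is evaluated even when BOOL is false can be observed with a helper that counts its own calls. The if_ below is a stand-in for this example (an assumption based on the description above, only meant for list context), and ppc_types is a hypothetical helper.

```perl
use strict;
use warnings;

# Stand-in for if_ (assumption: returns LIST when the condition is true).
sub if_ {
    my ($cond, @list) = @_;
    return $cond ? @list : ();
}

my $calls = 0;
# Hypothetical helper that counts how often it is called.
sub ppc_types { $calls++; return ('hfs', 'hfsplus') }

my $arch = 'i586';

# Function call: ppc_types() runs while the argument list is built,
# even though the condition is false.
my @with_if_ = ('ext3', if_($arch eq 'ppc', ppc_types()));
my $calls_after_if_ = $calls;

# Ternary: the false branch is taken and ppc_types() is not called.
my @with_ternary = ('ext3', $arch eq 'ppc' ? ppc_types() : ());
my $calls_after_ternary = $calls;

print "if_: @with_if_ (calls so far: $calls_after_if_)\n";
print "ternary: @with_ternary (calls so far: $calls_after_ternary)\n";
```

Both forms produce the same list; the only difference is the extra call made for if_. When LIST is expensive to compute or has side effects, the ternary may be the better choice.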
substInFile { CODE } FILENAME: executes the code for each line of the file; use eof to detect that the last line of the file has been reached
Typically used to change a parameter in a file:
```
  substInFile { s/^FORWARD_IPV4.*\n//; $_ .= "FORWARD_IPV4=true\n" if eof } '/etc/sysconfig/network';
  
```
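One way to picture what happens is Perl's own in-place editing mechanism ($^I together with the <> operator). The substInFile below is a stand-in built on that mechanism for this example; it is an assumption about the behaviour, not necessarily how the library does it, and the temporary file is hypothetical sample data standing in for /etc/sysconfig/network.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for substInFile: run the block on every line (in $_)
# and write the result back into the same file.
sub substInFile(&$) {
    my ($code, $file) = @_;
    local @ARGV = ($file);   # the file read by <>
    local $^I   = '';        # in-place editing, no backup copy
    local $_;
    while (<>) {
        $code->();           # the block may rewrite $_ and test eof
        print;               # goes back into the file while $^I is set
    }
    return;
}

# Hypothetical sample file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "NETWORKING=yes\nFORWARD_IPV4=false\nHOSTNAME=box\n";
close $tmp;

# Drop any existing setting, then append the new one after the last line.
substInFile { s/^FORWARD_IPV4.*\n//; $_ .= "FORWARD_IPV4=true\n" if eof } $path;

open(my $in, '<', $path) or die "cannot read $path: $!";
my $result = do { local $/; <$in> };
close $in;
print $result;
```

Appending on eof rather than substituting in place means the setting is added even when the file did not contain it before.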
each_index { CODE } LIST: iterates over a list, executing code that needs the index of the current element (available in $::i)
Useful when you need to perform an action on each element of a list and also need the index of each element:
```
  each_index { printf "%s mountpoint: $_", $::i == 2 ? 'third' : 'other' } cat_('/proc/mounts');
  
```
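How $::i and $_ are made available to the block can be sketched as follows. This each_index is a stand-in written for the example, assuming the index starts at 0 and that the element is passed in $_; the mount points are hypothetical.

```perl
use strict;
use warnings;

# Stand-in for each_index: call the block once per element, with the
# element in $_ and its index in the package variable $::i.
sub each_index(&@) {
    my ($code, @list) = @_;
    local $::i = 0;          # restored when the function returns
    foreach (@list) {
        $code->();
        $::i++;
    }
    return;
}

my @out;
each_index { push @out, sprintf('%d: %s', $::i, $_) } qw(/ /home /usr);
print "$_\n" foreach @out;
```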

perl_checker

Let's now examine the code style perl_checker wants you to adopt. Consider the following naive code example:

  1: sub calc {
  2:     my ($x,$y) = @_;
  3:     $_ = $y;
  4:     ($x==0 && $y==0) and return -1;
  5:     my @tab = (1, 2, 3);
  6: 			
  7:     /sysconfig\/i18n/ and return 1;
  8: }

The following problems are reported:

```
line 2, character 12-12
you should have a space here
```
Good: my ($x, $y) = @_;
Why: you should put a space after the comma when specifying a list.
```
line 3, character 5-7
undeclared variable $_
```
Good: local $_ = $y;
Why: you should always localize $_ when you set it, because it's a global variable.
```
line 4, character 8-8
you should have a space here
```
Good: ($x == 0 && $y == 0) and return -1;
Why: you should put spaces before and after operators.
```
line 4, character 5-21
unneeded parentheses
```
Good: $x == 0 && $y == 0 and return -1;
Why: because of operator precedence, the parentheses are unneeded (if unsure about precedence, see perlop(1)).
```
line 5, character 8-12
unused variable @tab
```
Why: Assigning to unused variables is (typically) useless. If you really need to assign to an unused variable, prefix its name with `_' and perl_checker will stop bothering you (for example, @_tab).
```
line 7, character 20-21
change the delimit character / to get rid of this escape
```
Good: m|sysconfig/i18n|
Why: / is not the only regexp delimiter! If you want to specify a slash in your regexp, use another delimiter so that your regexp will be more readable.
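
The reason given above for localizing $_ can be illustrated with a caller that is itself looping with $_. The two functions below are hypothetical and exist only for this sketch: one assigns to $_ directly, the other localizes it first.

```perl
use strict;
use warnings;

# Hypothetical functions: both set $_, only one localizes it.
sub clobber { $_ = 'changed'; return }
sub polite  { local $_ = 'changed'; return }

my @items = ('a', 'b');

# foreach aliases $_ to each element; polite() restores $_ on return,
# so the elements are left alone.
foreach (@items) { polite() }
my @after_polite = @items;

# clobber() assigns to the caller's $_, which overwrites each element.
foreach (@items) { clobber() }
my @after_clobber = @items;

print "after polite:  @after_polite\n";
print "after clobber: @after_clobber\n";
```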

Finally, the correct code looks like:

  sub calc {
      my ($x, $y) = @_;
      local $_ = $y;
      $x == 0 && $y == 0 and return -1;
      my @_tab = (1, 2, 3);
  			
      m|sysconfig/i18n| and return 1;
  }

Under Emacs, you might want to add the following to your .emacs and then be able to validate your code with C-Enter:

  (defmacro ilam (&rest body) `(lambda () (interactive) ,@body))
  (add-hook 'cperl-mode-hook 
	    '(lambda ()
	       (local-set-key [(control return)] 
			      (ilam (save-some-buffers 1) (compile (concat "perl_checker --restrict-to-files " (buffer-file-name (current-buffer))))))
	       ))

Last update: Wed Apr 30 18:05:40 2003