Running a Perl program:

From the command line, assuming your program is named test.pl, run the program via the command:

perl -w test.pl

The -w flag tells the interpreter to print warning messages. This is very useful while debugging, but it can be omitted once you are sure the program is correct.

Command interpretation:

The first line of your program should probably be something like:

#!/usr/bin/perl -w

The /usr/bin/perl is the pathname to the Perl interpreter on your machine. (The -w turns on warnings, which obviates the need for the -w flag on the command line.) This line allows the program to be executed from a Unix or Linux command line as if it were an executable, provided the file has execute permission (e.g., after chmod +x test.pl) and its directory is on your PATH (otherwise, type ./test.pl). I.e., just

test.pl

rather than

perl -w test.pl

Output:

The print operator sends its arguments as a concatenated string to standard output. E.g.:

print "foo", 5, 1.5;

prints foo51.5 to the console.

The binding operator, =~

Apply the operation on the right to the string on the left hand side. Here is an example using the substitution operator, assuming that $RNA holds a string that contains some instances of the substring "T":

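$RNA =~ s/T/U/g;

In this case T is actually a regular expression pattern, and U is the replacement string. The g suffix makes the substitution apply to every match rather than just the first. (We'll talk about regular expressions later!)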
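As a minimal sketch of the substitution above, the following transcribes a short strand. The sample sequence and the variable $changes are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical sample sequence; any string containing "T" would do.
my $DNA = "ATGCTTGA";

# Copy first so that the original string is left untouched.
my $RNA = $DNA;

# s/T/U/g replaces every T with U and returns the number of substitutions made.
my $changes = ($RNA =~ s/T/U/g);

print "$DNA -> $RNA ($changes substitutions)\n";
```

Without the g suffix, only the first T would be replaced.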

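Other operators often employed with the binding operator are the match operator (m//, usually written as just //) and the transliteration operator (tr///).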

Here is an example use of the substitution operator. Unfortunately, it is buggy, as we shall see. The idea is to calculate the reverse complement of a DNA strand: A<-->T, G<-->C. (The reversal is necessary because DNA strands have an orientation, from the 5' end to the 3' end. See chapter 1 of the text.) The example code also shows a use of the transliteration operator that fixes the bug.
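The sketch below shows one way such a bug can arise and how tr avoids it. It is an illustration under assumptions, not necessarily the original example program: the sample strand and all variable names are hypothetical.

```perl
use strict;
use warnings;

my $dna = "ACGGT";   # hypothetical sample strand

# Reverse the strand first (in a scalar context, reverse acts on the string).
my $rev = reverse $dna;

# Buggy approach: four substitutions applied one after another.
my $buggy = $rev;
$buggy =~ s/A/T/g;   # every A becomes T ...
$buggy =~ s/T/A/g;   # ... and is then turned back into A, along with the original Ts
$buggy =~ s/G/C/g;
$buggy =~ s/C/G/g;   # the same problem occurs for G and C

# Fix: tr maps all four characters in a single pass over the string.
(my $revcom = $rev) =~ tr/ACGT/TGCA/;

print "buggy:   $buggy\n";
print "correct: $revcom\n";
```

The sequential substitutions fail because each one also rewrites the characters produced by the one before it. The tr operator examines each character of the string exactly once, so no character is translated twice.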

Reading data from a file

File handles are identifiers not preceded by a $, @, or %. We use the open command to bind a file handle to a file with a particular pathname. We use the angle operator, <FILEHANDLE>, to read the contents of a handle's file. Here's a very simple example. Note that each time we use the angle operator, we read the next line of the file, as is demonstrated in this example.

We can load all of the contents of the file with one assignment operation by making use of an array: @protein = <FILEHANDLE>. You should be careful with this operation, however, if the file is large. You can easily exceed the memory capacity of your machine!
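A minimal sketch of both styles of reading follows. The file name, its contents, and the handle names are assumptions; the file is written first only so that the sketch is self-contained.

```perl
use strict;
use warnings;

# Hypothetical file name; created here only so the sketch can run on its own.
my $filename = "sample_protein.txt";
open(OUT, '>', $filename) or die "Cannot write $filename: $!";
print OUT "MNIDDKL\n", "EGLFLKC\n", "GGIDEMQ\n";
close(OUT);

# Bind the handle PROTEINFILE to the file, then read one line per use of <...>.
open(PROTEINFILE, '<', $filename) or die "Cannot open $filename: $!";
my $first  = <PROTEINFILE>;   # first line, newline included
my $second = <PROTEINFILE>;   # second line
close(PROTEINFILE);

# In a list context, <...> returns all of the remaining lines at once.
open(PROTEINFILE, '<', $filename) or die "Cannot open $filename: $!";
my @protein = <PROTEINFILE>;
close(PROTEINFILE);

print "First line: $first";
print "Lines in file: ", scalar(@protein), "\n";

unlink $filename;   # remove the scratch file
```

Checking the result of open with "or die" is a common precaution: without it, a missing file goes unnoticed until the reads return nothing.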

Arrays

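push (@arr, $bob); Appends $bob to the end of the array.

$bob = pop @arr; Removes the last element of the array and returns it.

@bob = reverse @arr; Returns the elements in reverse order; @arr itself is unchanged. (In a scalar context, as in $bob = reverse @arr, reverse instead concatenates the elements and reverses the resulting string.)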

splice (@arr, 2, 0, 'X'); Inserts 'X' into array at position 2 (i.e., after the 2nd element in the array). The 0 indicates that we aren't replacing anything. So if @arr held (a,b,c), then after the above splice operation it would hold (a,b,X,c).

The assignment operator can work with arrays (and lists) to simultaneously assign many variables to the elements of an array:

($a, $b, $c) = @arr;

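What, then, is the difference between $a = @arr and ($a) = @arr? The answer lies in scalar and list contexts. The first evaluates @arr in a scalar context, so $a receives the number of elements in the array. The second evaluates it in a list context, so $a receives the first element.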
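The array operations and the two contexts can be sketched together as follows; the array contents and variable names are chosen purely for illustration.

```perl
use strict;
use warnings;

my @arr = ('a', 'b', 'c');

push(@arr, 'd');            # @arr is now (a, b, c, d)
my $last = pop @arr;        # removes and returns 'd'; @arr is (a, b, c) again
my @rev  = reverse @arr;    # (c, b, a); @arr itself is unchanged
splice(@arr, 2, 0, 'X');    # @arr is now (a, b, X, c)

my $count   = @arr;         # scalar context: the number of elements
my ($first) = @arr;         # list context: the first element

print "count = $count, first = $first\n";
```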

$glommed = join ('', @array); Gloms the contents of an array into a string.

The split function goes the other way, breaking a string into a list. With an empty pattern it yields the individual characters:

@arr = split (//, $string);
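As a small illustration of moving between arrays and strings, the sketch below uses Perl's built-in join and split functions. The sample data and variable names are assumptions.

```perl
use strict;
use warnings;

my @bases = ('A', 'C', 'G', 'T');

# join glues the elements together with the given separator.
my $glommed = join('', @bases);     # "ACGT"
my $listed  = join(',', @bases);    # "A,C,G,T"

# split goes the other way; an empty pattern splits between every character.
my @chars  = split(//, $glommed);   # ('A', 'C', 'G', 'T')
my @fields = split(/,/, $listed);   # ('A', 'C', 'G', 'T')

print "$glommed has ", scalar(@chars), " characters\n";
```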

Conditionals:

Boolean operators: The usual numeric ones (<, >=, ==, etc.). For stringwise comparisons use eq, ne, lt, gt, le and ge instead of ==, !=, <, >, <= and >=, and use cmp instead of the numeric three-way comparison <=>.
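The difference between the numeric and stringwise operators matters whenever a value can be read either way. A minimal sketch, with values chosen only to make the difference visible:

```perl
use strict;
use warnings;

my $x = "10";
my $y = "10.0";

my $num_equal = ($x == $y) ? 1 : 0;       # 1: both have the numeric value 10
my $str_equal = ($x eq $y) ? 1 : 0;       # 0: the strings differ

my $num_less  = (9 < 10) ? 1 : 0;         # 1: numerically, 9 is less than 10
my $str_less  = ("9" lt "10") ? 1 : 0;    # 0: as strings, "9" sorts after "1..."

my $order = ("apple" cmp "banana");       # -1: "apple" sorts first

print "numeric equal: $num_equal, string equal: $str_equal\n";
```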

For cascading if-else statements, Perl provides an unusual "elsif" keyword. Here's an example program demonstrating this type of statement. Perl also provides an unless statement, which is equivalent to a negated if statement. I.e., the statement is executed if the condition is false, rather than true. So

if ($bob != 0) { ... }

is equivalent to

unless ($bob == 0) { ... }
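A short sketch of a cascading if-elsif-else statement and of the if/unless equivalence follows. The variable names and sample values are hypothetical.

```perl
use strict;
use warnings;

my $base = 'G';   # hypothetical sample input
my $name;

# Cascading tests: the first condition that is true selects the block to run.
if ($base eq 'A') {
  $name = 'adenine';
} elsif ($base eq 'C') {
  $name = 'cytosine';
} elsif ($base eq 'G') {
  $name = 'guanine';
} elsif ($base eq 'T') {
  $name = 'thymine';
} else {
  $name = 'unknown';
}

# The two statements below run their blocks under exactly the same condition.
my $bob = 3;
my ($via_if, $via_unless) = (0, 0);
if ($bob != 0)     { $via_if = 1; }
unless ($bob == 0) { $via_unless = 1; }

print "$base is $name\n";
```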

Perl's strong string manipulation and pattern matching capabilities can also be used in the context of conditionals.

if ($bob =~ /$patternToMatch/) { ... }

The block is executed if the string in $bob matches the regular expression held in $patternToMatch.
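For instance, a conditional of this kind could test whether a sequence contains a short motif. The sequence and the pattern below are assumptions made up for illustration.

```perl
use strict;
use warnings;

my $bob            = "ATGGCGTAA";   # hypothetical sequence
my $patternToMatch = "GC[GT]";      # hypothetical motif: G, C, then G or T

my $found = 0;
if ($bob =~ /$patternToMatch/) {
  $found = 1;                       # runs because "GCG" occurs in $bob
}

my $missing = 1;
if ("AAAA" =~ /$patternToMatch/) {
  $missing = 0;                     # not reached: there is no match here
}

print "motif found: $found\n";
```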

Keyboard input:

Keyboard input uses the same mechanism that we use to access the contents of files. There is a predefined file handle, STDIN, that is bound to the input from the console. So to access the next line of input from the console, use

$input = <STDIN>;

Note that the newline at the end of the line will be included in $input at this point. It is common to remove it via:

chomp $input;

which removes the trailing newline from $input, if there is one. (The related chop function deletes the last character, whatever it is.)
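The behavior of chomp, and how it differs from the related chop function, can be sketched as follows. A literal string stands in for a line read via <STDIN>, so that the sketch runs without keyboard input.

```perl
use strict;
use warnings;

my $input = "ACGT\n";    # stands in for a line read via <STDIN>

chomp $input;            # removes the trailing newline: "ACGT"

my $again = $input;
chomp $again;            # nothing left to remove, so still "ACGT"

my $chopped = $input;
chop $chopped;           # chop removes the last character regardless: "ACG"

print "chomp: $input, chop: $chopped\n";
```

Because chomp removes only a trailing newline, it is safe to call on a string that has already been chomped.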

Loops:

Perl has all the standard C/Java iteration constructs, using the same syntax: while, do-while, for. In addition, Perl provides a do-until construct, which is like a do-while loop, but iterates only for so long as the test condition is false. Thus, do-while is analogous to do-until in the same way that if is analogous to the unless statement.
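A minimal sketch comparing do-while, do-until and for follows; the loop bounds and variable names are arbitrary choices made for illustration.

```perl
use strict;
use warnings;

# do-while: the body runs first, then repeats while the condition is true.
my ($i, $sum_while) = (0, 0);
do {
  $sum_while += $i;
  ++$i;
} while ($i < 5);

# do-until: the same loop, written with the negated condition.
my ($j, $sum_until) = (0, 0);
do {
  $sum_until += $j;
  ++$j;
} until ($j >= 5);

# A C-style for loop.
my $product = 1;
for (my $k = 1; $k <= 4; ++$k) {
  $product *= $k;
}

print "sums: $sum_while and $sum_until, product: $product\n";
```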

Here's an example program demonstrating a do-while loop and keyboard input.

Here is a program for counting nucleotides (the A, C, T and G of DNA). It uses the foreach looping statement, borrowed from various Unix shell scripting languages. Notice the use of the split function, which breaks a string into a list (the inverse of join).
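A minimal sketch of such a counting program is given below. It is not the original example: the sample sequence is hypothetical, and the counter names are assumptions chosen to match the $count_of_A style used in these notes.

```perl
use strict;
use warnings;

my $dna = "AACGTTTGAC";       # hypothetical sample sequence

# Break the string into an array of single characters.
my @DNA = split(//, $dna);

my ($count_of_A, $count_of_C, $count_of_G, $count_of_T, $errors) = (0, 0, 0, 0, 0);

# Visit each base in turn and bump the matching counter.
foreach my $base (@DNA) {
  if ($base eq 'A') {
    ++$count_of_A;
  } elsif ($base eq 'C') {
    ++$count_of_C;
  } elsif ($base eq 'G') {
    ++$count_of_G;
  } elsif ($base eq 'T') {
    ++$count_of_T;
  } else {
    ++$errors;                # anything that is not A, C, G or T
  }
}

print "A=$count_of_A C=$count_of_C G=$count_of_G T=$count_of_T errors=$errors\n";
```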

The foreach loop there could be replaced with another version that makes use of the implicit $_ variable:

foreach (@DNA) { 
  if ( /A/ ) { 
    ++$count_of_A;
  } elsif ( /C/ ) ... 

Here's another version of the program that uses the substr function in a normal for loop.
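A sketch of the substr approach, under the same assumptions (hypothetical sample sequence and counter names), might look like this. It avoids building an array by pulling one character at a time out of the string.

```perl
use strict;
use warnings;

my $dna = "AACGTTTGAC";       # hypothetical sample sequence

my ($count_of_A, $count_of_C, $count_of_G, $count_of_T, $errors) = (0, 0, 0, 0, 0);

# Walk the string by position; substr($dna, $pos, 1) is the character at $pos.
for (my $pos = 0; $pos < length($dna); ++$pos) {
  my $base = substr($dna, $pos, 1);
  if ($base eq 'A') {
    ++$count_of_A;
  } elsif ($base eq 'C') {
    ++$count_of_C;
  } elsif ($base eq 'G') {
    ++$count_of_G;
  } elsif ($base eq 'T') {
    ++$count_of_T;
  } else {
    ++$errors;                # anything that is not A, C, G or T
  }
}

print "A=$count_of_A C=$count_of_C G=$count_of_G T=$count_of_T errors=$errors\n";
```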

Here's yet another version, this one using a nifty side-effect of string matching operators to do the counting. (See the while loops.) In a scalar context, with the 'g' suffix, the matching operator returns true once for each match within the string, so the loop body runs once per match. This version also places its output in a file. The '>' prefix to the filename indicates that the file is to be used for output.

The while loops here are actually a bit slow. We can do better using the transliteration operator, tr:

$a = ($dna =~ tr/Aa//);

This works because the tr operator returns the number of characters it transliterated. Because the replacement character set is empty, the string is not actually altered: when the replacement set is empty, the search set is used in its place, so each character is replaced with itself.
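The two counting techniques can be compared side by side in a short sketch. The sample sequence and the variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $dna = "AACGTTTGACa";   # hypothetical sample; note the lowercase a at the end

# Counting with a global match in a while loop: each test finds the next match.
my $count_by_match = 0;
while ($dna =~ /[Aa]/g) {
  ++$count_by_match;
}

# Counting with tr: a single pass with no loop, and $dna is left unchanged.
my $count_by_tr = ($dna =~ tr/Aa//);

print "A count: $count_by_match (match loop), $count_by_tr (tr)\n";
```

Both approaches give the same count; tr simply avoids restarting the regular expression engine for every match.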