
Still More Perl

Volume Number: 19 (2003)
Issue Number: 1
Column Tag: Section 7

Munging Mail and Media...

by Rich Morin

Perl's "whipitupitude" is legendary. This column looks at a couple of small scripts I've recently been "whipping up", showing how Perl can work in and around more formal OSX tools. One script, fmmf, Finds Monster Mail Files; I use it to keep track of mailing list (and other) mail files which may be getting out of hand. The other script, cfwc.d, is a daemon (background process) which helps me operate an experimental webcam.

Finding Monster Mail Files

I'm on quite a few mailing lists and I don't always get to the associated mailboxes regularly to keep them under control. I'm also trying to track the efficacy of my spam filtering system (based on SpamAssassin and Eudora), which drops suspected spam into one of several mailboxes, depending on its numeric spam rating, etc. I have written a short script which helps me keep on top of these issues.

The mainline code, below, is quite simple. Using finddepth, from the File::Find module (bundled with Perl; see also the CPAN, cpan.perl.org), it performs a depth-first examination of my email folder. The callback function, wanted, is invoked for each node (e.g., file, directory) in the tree. Using the lists produced by this traversal, the remaining code prints out the results for spam and miscellaneous email, sorting each list in a case-insensitive manner.

#!/usr/bin/env perl
#
# fmmf - find monster mail files
#
# Written by Rich Morin, CFCL, 2002.11
use File::Find;
$monster = 2000000;
{
  $eu = '/Users/rdm/Mail/Eudora Folder';
  finddepth(\&wanted, "$eu/Mail Folder");
  for $line (sort {lc($a) cmp lc($b)} (@spam)) {
    print $line;
  }
  print "\n";
  for $line (sort {lc($a) cmp lc($b)} (@misc)) {
    print $line;
  }
}
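The traversal can be tried safely on a scratch tree before pointing it at a real mail folder. The following is a minimal sketch, not the author's code: the directory layout and file names are made up, and File::Temp and File::Path are used only to build the throwaway tree. It shows the callback seeing the short name in $_ and the full path in $File::Find::name.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Build a small scratch tree (hypothetical names, for illustration only).
my $top = tempdir(CLEANUP => 1);
make_path("$top/Lists/FreeBSD");
for my $rel ('Inbox', 'Inbox.toc', 'Lists/FreeBSD/Ports') {
  open(my $fh, '>', "$top/$rel") or die "can't create $rel: $!";
  print $fh "placeholder\n";
  close($fh);
}

my @found;

# finddepth calls this once per node; $_ holds the node's name relative
# to the directory being visited, $File::Find::name the full path.
sub wanted {
  return unless -f $_;            # skip directories and other non-files
  return if $_ =~ m|\.toc$|;      # skip table-of-contents files
  my $path = $File::Find::name;
  $path =~ s|^\Q$top\E/||;        # strip the scratch-tree prefix
  push(@found, $path);
}

finddepth(\&wanted, $top);
print "$_\n" for sort @found;
```

The \Q...\E pair quotes any regex metacharacters in the scratch directory's name, which matters whenever the prefix being stripped comes from a variable rather than a literal.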

The tricky parts of this script, such as they are, lie in the "wanted" callback function. As it traverses the tree, finddepth changes the "current directory" and sets $_ to the relative name of the node. This makes it easy to skip over items that aren't plain files, as well as Eudora's "table of contents" (*.toc) files.

sub wanted {
  return unless (-f $_);
  return if ($_ =~ m|\.toc$|);

For the next part, however, we need the "full path name" of the node. Getting this from the $File::Find::name variable, we can strip off the first part of the path and test the remainder in assorted ways. Perl's regular expressions are very useful for this sort of name handling.

  $path = $File::Find::name;
  $path =~ s|^.*/Eudora Folder/Mail Folder/||;
  return if ($path =~ m|_Inactive/ Save/|);

After picking up the size of the file (in bytes), the script opens each mailbox in the "spam" area and counts the number of "From:" headers (i.e., messages). Eudora uses carriage returns (rather than the conventional BSD newlines) for line termination, but setting Perl's $/ (input record separator) variable handles that quite easily. The strings containing the formatted output are pushed into a list, for use by the mainline code.

  $size = -s $_;
  if ($path =~ m|!Spam|) {
    open(MBOX, $_) or die "can't open mailbox($_): $!";
    local $/ = "\r";    # carriage-return records, for this block only
    $fcnt = 0;
    while (defined($line = <MBOX>)) {
      $fcnt++ if ($line =~ m|^From:|) ;
    } 
    close(MBOX);
    push(@spam, sprintf("%-35s  %9d  %4d\n",
      $path, $size, $fcnt));
    return;
  }
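The record-separator trick is easy to check without a real Eudora mailbox. Here is a minimal sketch, assuming a made-up, carriage-return-terminated fragment held in memory; the sub name count_from_headers is hypothetical. It uses "local" so that the changed $/ is restored as soon as the sub returns.

```perl
use strict;
use warnings;

# A made-up, CR-terminated mailbox fragment (illustration only).
my $mbox = join("\r",
  'From: alice@example.com', 'Subject: one', '', 'body text',
  'From: bob@example.com',   'Subject: two', '', 'Not From: a header',
) . "\r";

# Count lines that begin with "From:", reading CR-terminated records.
sub count_from_headers {
  my ($data) = @_;
  local $/ = "\r";                 # restored when the sub returns
  open(my $fh, '<', \$data) or die "can't open in-memory file: $!";
  my $count = 0;
  while (defined(my $line = <$fh>)) {
    $count++ if $line =~ m|^From:|;
  }
  close($fh);
  return $count;
}

my $fcnt = count_from_headers($mbox);
print "$fcnt\n";
```

Note that any body line which happens to start with "From:" would be counted as well, so the figure is best treated as an estimate.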

The code for miscellaneous mailboxes is comparatively simple. After ensuring that the mailbox is large enough to qualify as a "monster", it formats and saves the output lines. Perl's "x" operator comes in handy for creating a "quick and dirty" histogram.

  return if ($size < $monster);
  $isiz = int($size/$monster);
  push(@misc, sprintf("%-35s  %9d  %s\n",
    $path, $size, '*' x $isiz));
}

This sort of "personalized" script is quite common in BSD circles. Clearly, it isn't suitable for use by others, as is, but it is short and simple enough that it can easily be customized to meet the needs of different users. Here is some sample output, from my own system:

!Spam/?? Junk (Eudora)                9041     5
!Spam/?? Junk (SA 1)                 39192     6
!Spam/?? Junk (SA 2)                 11467     2
!Spam/?? Junk (SA 3)                420538    60
_Lists/DocBook                     3231686  *
_Lists/FreeBSD/FreeBSD-Ports       6431902  ***
_Lists/FreeBSD/FreeBSD-Questions   2666962  *
...
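The histogram column can be reproduced in isolation. This is a small sketch using two of the sizes from the sample output above; the sub name histogram_line is hypothetical, while the threshold and format string are copied from fmmf.

```perl
use strict;
use warnings;

my $monster = 2_000_000;   # threshold, as in fmmf

# One asterisk per whole multiple of $monster.
sub histogram_line {
  my ($path, $size) = @_;
  my $stars = '*' x int($size / $monster);
  return sprintf("%-35s  %9d  %s\n", $path, $size, $stars);
}

print histogram_line('_Lists/DocBook',               3231686);
print histogram_line('_Lists/FreeBSD/FreeBSD-Ports', 6431902);
```

Because int truncates, a mailbox earns its next asterisk only when it crosses another full multiple of the threshold.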

A WebCam Daemon

I recently started playing with an iBOT, a FireWire-based camera made by Orange Micro (www.orangemicro.com). My initial goal was to create a simple "security camera" app that would display a set of recent images on a web page.

After downloading the OSX driver for the iBOT, I started looking around for image capture software. One package, EvoCam (www.evological.com), captures images, based on elapsed time and/or software-based motion detection. It can also upload the image files (via FTP) to a web server and/or save numbered copies on the local disk.

Unfortunately, this wasn't exactly what I wanted. The FTP upload feature simply refreshed the same file; turning this into a time history would be tricky. The numbered image files would do, however, if I could get them over to the web server. All told, it was a good start on what I wanted. All I needed to do was create a little plumbing...

The first part of the plumbing had to do with getting the files from my desktop Mac onto the (FreeBSD-based) local web server. FreeBSD provides NFS, but getting OSX to mount the provided volumes can be quite a trial. Fortunately, Marcel Bresink's NFS Manager (www.bresink.de/osx/NFSManager.html) eases the pain considerably.

Once I got the files sifting into a directory on the web server, I merely had to rename them (for convenience) and build up a web page to display a selected subset. The following script, while still a "work in progress", accomplishes these tasks quite handily.

#!/usr/bin/env perl
#
# cfwc.d - Canta Forda WebCam Daemon
#
# Written by Rich Morin, CFCL, 2002.11
$imgs = '/.../iBOT';   # adjust to taste...
$html = '/.../cfwc';   # adjust to taste...
{
  for (;;) {

As mentioned above, EvoCam generates a unique name (e.g., 123456789.jpg) for each image file. In writing these to the NFS-mounted FreeBSD machine, OSX also generates a companion file (e.g., ._123456789.jpg) for the resource fork. The code below creates a new name for the image file, based on the file's modification time, and discards the companion file.

    # Clean out incoming directory.
    opendir(IN, "$imgs/incoming")
      or die "can't open $imgs/incoming";
    @in = grep(!/^\./, readdir(IN));
    chomp(@in);
    closedir(IN);
    for $in (sort(@in)) {
      @stat = stat("$imgs/incoming/$in");
      $mtime = $stat[9];
      ($sec, $min, $hour, $mday, $mon, $year,
       $wday, $yday, $isdst) = localtime($mtime);
      $out = sprintf("%d.%02d%02d.%02d%02d%02d.jpg",
        $year+1900, $mon+1, $mday, $hour, $min, $sec);
      rename("$imgs/incoming/$in",
             "$imgs/i.queue/$out");
      unlink("$imgs/incoming/._$in"); 
    }
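The name-building step can be exercised on its own. The sketch below is not part of cfwc.d: the sub names and the sample epoch value are made up. It puts the sprintf approach beside an alternative that uses strftime from the standard POSIX module, which takes care of the "+1900" and "+1" adjustments.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Build a YYYY.MMDD.HHMMSS.jpg name from an epoch time, by hand...
sub name_by_sprintf {
  my ($mtime) = @_;
  my ($sec, $min, $hour, $mday, $mon, $year) = localtime($mtime);
  return sprintf("%d.%02d%02d.%02d%02d%02d.jpg",
    $year + 1900, $mon + 1, $mday, $hour, $min, $sec);
}

# ...and with strftime, which hides the 1900 and month offsets.
sub name_by_strftime {
  my ($mtime) = @_;
  return strftime("%Y.%m%d.%H%M%S.jpg", localtime($mtime));
}

my $mtime = 1038631155;   # an arbitrary epoch value in late November 2002
print name_by_sprintf($mtime), "\n";
print name_by_strftime($mtime), "\n";
```

Both subs use local time, so the exact name printed depends on the machine's time zone; the two results should always agree with each other.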

Perl's approach to reading directories is rather messy, but it isn't all that difficult. The code below gets a list of filenames, discarding any that don't match the desired format, and sorts them. Because the names were crafted with this in mind, the list is now in chronological order.

    # Get list of images to display.
    opendir(IN, "$imgs/i.queue")
      or die "can't open $imgs/i.queue";
    @in  = sort(grep(/^\d{4}\.\d{4}\.\d{6}\.jpg$/,
                     readdir(IN)));
    chomp(@in);
    closedir(IN);

Using Perl's "slice" syntax, we grab the last (i.e., most recent) nine file names.

    @show = @in[-9 .. -1];
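A fixed slice such as -9 .. -1 assumes that at least nine images are already in the queue; with fewer, the slice includes undefined entries. One way to guard against that, sketched here with a hypothetical helper named last_n and made-up file names:

```perl
use strict;
use warnings;

# Return up to $max items from the end of a list.
sub last_n {
  my ($max, @items) = @_;
  my $n = @items < $max ? scalar(@items) : $max;
  return $n ? @items[-$n .. -1] : ();
}

my @in   = map { "img$_.jpg" } 1 .. 12;   # hypothetical file names
my @show = last_n(9, @in);
print "@show\n";
```

With a guard like this, the page-building code would also need to cope with a shorter @show, for example by leaving the unused table cells empty.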

Now we start generating a web page. The META tag tells the user's browser to refresh the page every 15 seconds. I am rather compulsive about formatting the HTML; the web browser doesn't care, but it sure makes debugging less painful for humans!

    # Make up a new web page.
    open(OUT, ">$html/index.temp")
      or die "can't open index.temp";
    print OUT <<EOT;
<HTML>
  <HEAD>
    <META HTTP-EQUIV="Refresh" content="15">
    <TITLE>Canta Forda WebCam</TITLE>
  </HEAD>
  <BODY>
    <TABLE>
EOT

The code below generates a 3x3 table of images, each followed by a centered label. I could have used the file names (e.g., 2002.1129.203915.jpg) as labels, but that would have been a bit ugly. Why not parse the names and reformat the values into a more readable format?

Note the multi-line regular expression that is used to break up the file name. When REs get long and complex, breaking them up in this manner can make them much easier to follow.

    $cnt = 0;
    for ($i=0; $i<9; $i+=3) {
      print OUT "      <TR>\n";
      for ($j=0; $j<3; $j++) {
        print OUT "        <TD>\n";
        $k = $i + $j;
        $tmp1 = $show[$k];
        $tmp1 =~
          m|^(\d{4})\.            # (YYYY).
             (\d\d)(\d\d)\.       # (MM)(DD).
             (\d\d)(\d\d)(\d\d)\. # (HH)(MM)(SS).
             jpg$|x;              # jpg
        $tmp2 = sprintf("%s/%s/%s at %s:%s:%s",
                        $1, $2, $3, $4, $5, $6);
        print OUT "          <CENTER>\n";
        print OUT "            ",
                  "<IMG SRC=\"iq/$tmp1\"><BR>\n";
        print OUT "            $tmp2\n";
        print OUT "          </CENTER>\n";
        $cnt++;
        print OUT "        </TD>\n";
      }
      print OUT "      </TR>\n";
    }
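The name-parsing step is worth testing separately, since $1 through $6 keep their values from an earlier successful match whenever a later match fails. The following sketch wraps the same multi-line expression in a hypothetical sub, label_for, that checks the match before using the captures; the sample file name is made up.

```perl
use strict;
use warnings;

# Turn "YYYY.MMDD.HHMMSS.jpg" into a readable label, or undef on mismatch.
sub label_for {
  my ($name) = @_;
  return undef unless $name =~
    m|^(\d{4})\.            # (YYYY).
       (\d\d)(\d\d)\.       # (MM)(DD).
       (\d\d)(\d\d)(\d\d)\. # (HH)(MM)(SS).
       jpg$|x;              # jpg
  return sprintf("%s/%s/%s at %s:%s:%s", $1, $2, $3, $4, $5, $6);
}

print label_for('2002.1129.203915.jpg'), "\n";
```

The /x modifier is what allows the whitespace and "#" comments inside the pattern; without it, they would be treated as literal characters to match.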

Finally, we push out the last of the HTML, close the file and (Oh, yes!) move it into place for Apache to find. Then, after a second's repose, we go back up and do the whole exercise again.

    print OUT <<EOT;
    </TABLE>
  </BODY>
</HTML>
EOT
    close(OUT);
    rename("$html/index.temp",
           "$html/index.html");
    sleep(1);
  } 
}
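Writing to index.temp and then renaming it means that a browser should never fetch a half-written page, because the rename swaps in the finished file in a single step (at least on a local POSIX filesystem; behavior over NFS may differ). A minimal sketch of the pattern, with a hypothetical sub named publish_page and a scratch directory standing in for $html:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write $content to "$dir/index.temp", then rename it over "$dir/index.html".
sub publish_page {
  my ($dir, $content) = @_;
  my $temp  = "$dir/index.temp";
  my $final = "$dir/index.html";
  open(my $out, '>', $temp) or die "can't open $temp: $!";
  print $out $content;
  close($out)               or die "can't close $temp: $!";
  rename($temp, $final)     or die "can't rename $temp: $!";
  return $final;
}

my $dir  = tempdir(CLEANUP => 1);   # scratch directory standing in for $html
my $page = publish_page($dir, "<HTML><BODY>test</BODY></HTML>\n");
print "wrote $page\n";
```

Checking the results of close and rename is optional in a quick script, but in a long-running daemon it helps a full disk or a permissions problem show up promptly.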

Lessons Learned

As we all know, the Mac and BSD universes aren't a perfect fit. Perl is a very good "glue language", however, allowing us to deal smoothly with issues such as line termination, extra (e.g., resource fork) files, etc.

Similarly, there is a wealth of useful apps which can perform small tasks, fill in gaps between operating systems, and generally make our lives easier. If a $20 shareware package can save me hours of frustration, the purchase decision is a no-brainer.

Unfortunately, some issues are still difficult to resolve. For instance, although it's easy to scan a Eudora mail file for header lines, editing Eudora mailboxes would be far trickier. Aside from file locking problems, there is the small issue of the (binary, undocumented) format of the TOC files. In short, choose your challenges carefully...


Rich Morin has been using computers since 1970, Unix since 1983, and Mac-based Unix since 1986 (when he helped Apple create A/UX 1.0). When he isn't writing this column, Rich runs Prime Time Freeware (www.ptf.com), a publisher of books and CD-ROMs for the Free and Open Source software community. Feel free to write to Rich at rdm@ptf.com.

 

All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.