TweetFollow Us on Twitter

Mac in the Shell: Data Manipulation with PHP

Volume Number: 24 (2008)
Issue Number: 04
Column Tag: Mac in the Shell

Mac in the Shell: Data Manipulation with PHP

PHP provides easy access to MySQL on OS X

by Edward Marczak

Introduction

Last month, we covered some of the basics of scripting with PHP. This month, we'll continue and talk about database access and string manipulation with PHP. Thanks to a pretty full install of PHP as the default in OS X, we don't have to worry about fetching, configuring, compiling and installing PHP - we can just get on with writing our script.

Working Environment

Last month, I mentioned that there are plenty of OS X-based editors that 'understand' PHP. This comes in the form of syntax coloring, appropriate code snippets, code completion and auto indentation. Before we continue, there are two more things that I should mention.

Coda, from Panic Software, is one of those editors - one that I should have mentioned last month. It's actually quite a bit more than a simple text editor: It's a full web development environment. I really mention it here because it has a way to access PHP documentation built right in (along with HTML, Javascript and more). Since PHP is such a popular web development language, as a web development tool, this made sense. At last check, though, access to this documentation was only available while you have an Internet connection.

I, on the other hand, am very much of an offline person. I want/need to accomplish things while on a plane, train, or wherever I may not have a good signal from some source, for whatever reason. Due to that, I've come to rely on the PHP Function Reference widget for Dashboard. Downloadable from here: http://code.google.com/p/phpfr/ (it has been "open sourced" since its introduction, and now lives as a Google Code project). Weighing in at close to 40MB, it's a little larger than your average widget, but that's the price you pay to carry around the entire PHP function set with descriptions. As a Dashboard widget, it lives out of sight until you need it. With a search function built-in, I tend to reach for this a lot. Highly recommended.

The Setup

I do a lot of data manipulation and migration work, and scripting languages are an essential part of the process. Perl, Ruby and Python are all good contenders for that position, but this particular column is about PHP! Our sample project this month will extract data from a MySQL database and give us a CSV file, ready for importing into another system. "But Ed," you say, "you can do that with MySQL alone!" While this is true, we aren't allowed to manipulate the data too drastically in that process.

Let's imagine that the data in our source system is stored with a single column for full name, but our target system wants separate first name and last name. Or that we need to go gather more data (like a zip+4 based on address) per record. There are hundreds of reasons you may need to filter the data as it goes from the database to a CSV file.

Connecting to the Database

One variable type that I left out of our discussion last month is resource. We looked at string, integer and other built-in types, but 'resource' really warrants its own discussion.

A variable of type "resource" holds a reference to some external resource. This may be a file on disk, a curl session to fetch a URL, a database connection or more. Again, it's just a reference, and you'll need to use functions that know how to access those resources via the resource variable that you pass to them.

First thing that our script needs to do is open a connection to the MySQL server. This happens with the mysql_connect command:

$connection = mysql_connect("server.example.com","user","password");

The mysql_connect command passes back a resource if successful, and FALSE if there's an error. Let's get into good habits right away: check the value of $connection for FALSE. Thankfully, we have access to the error that the server passes back to us in the form of two functions: mysql_errno() and mysql_error(). These functions return the error code from the previous MySQL operation. We should follow our connect, or any database operation with something similar to this:

if (! $connection) {
   print "Connection to database failed with ".mysql_errno()." - ".mysql_error()."\n";
   die();
}

Simply, if $connection is not TRUE (zero, or, FALSE), we print out a message containing the MySQL error information, and then we halt operation of the script and exit with die().

Also, looking at our initial connect string, what's wrong, or at least, what should send shivers up your spine? Yes, a password in a script. Unfortunately, that's just life in the big city, and you'll need to take precautions that allow only the right people to have access to the script. While you can put this information in external files and pull it in, or make your script get it from somewhere, the person that can read your file can also (typically) read how and where to get the password. Here be dragons, so, be conscious of this and plan for protection.

Once connected, we need to specify which database we're going to be working with. That's done with the mysql_select_db() function:

$select = mysql_select_db("customers");
...error checking here ...

All subsequent MySQL functions operate on the selected database. If you're working with more than one server, you can optionally pass in the connection resource that you obtained from mysql_connect():

$conn1=mysql_connect("server1","bob","s3kret");
$conn2=mysql_connect("server2","jane","h1dd3n");
$select1=mysql_select_db("sales",$conn1);
$select2=mysql_select_db("aggregates",$conn2);

...with the proper error checking, of course!

Once those preliminaries are complete, we can now query our database. While you can stuff a query into fewer lines, we're going to split it up in a more conventional manner. The mysql_query() function takes a string as an argument. Assigning the query to a variable makes for easier reading, easier variable substitution, and allows one to build dynamic queries. Let's start with a simple example:

$query="select uid,fname,lname from users";
$result=mysql_query($query);

$query is a simple string, and $result is of type resource. As with our earlier queries, if the execution fails, mysql_query() will return a Boolean FALSE, and our errors can be retrieved with mysql_errno() and mysql_error().

$result is simply a resource, and points to the result set. We now have to use it to fetch the actual data, database row by database row. If you purposefully limited your query to a single row, you could now go fetch it with one line. However, more often than not, you're expecting multiple rows - sometimes thousands. A perfect job for the while loop that we introduced last month:

while ($row=mysql_fetch_assoc($result)) { print "Scanning ".$row['uid'].": ".$row['fname']." ".$row['lname']."\n"; }

That's a lot packed into a small space, so, let's break it down. The data fetching happens with this line:

$row=mysql_fetch_assoc($result)

You could manually call that over and over and get a new row from the database each call. Of course, for large result sets, a loop is the only feasible method. The mysql_fetch_assoc() function returns an associative array. Unlike a "plain" array, you can access values with a key that is a string. In our case, each column from the database becomes an entry in the array. We asked for the fields, "uid", "fname" and "lname", which makes $row become:

$row['uid']
$row['fname']
$row['lname']

..filled with the values from the current row retrieved from the database.

Within the body of the loop, there's a single line: a print statement. It's fairly ugly, though, isn't it? It happens to be pretty standard fare in PHP. The concatenation operator (".") 'glues' two strings together. So, now it should be pretty easy to break down, as we're printing some static text (the bits in quotes) and some dynamic text, filled in by the values in the $row array. Sample output would look like this:

Scanning 1052: Mike Bollinger
Scanning 1053: Laura Wilkinson
Scanning 1055: Joanne Moss
...

Now that we have access to our data, let's get it into a format usable by almost all other apps: CSV.

Data Manipulation

If you simply wanted a CSV file of a MySQL database table, there are many simple ways to do so. However, when moving data from one system to another, we often need to massage the original data into a new format. We get that opportunity in our while loop, where we're accessing each row of the table and before writing it out. PHP has many string manipulation functions to help us do anything our hearts (and brains) desire.

The first thing we may want to do is to simply re-assign a cleaned-up version of a string before writing it out. Consider this:

switch ($row['state']) {
   case "NY":
   case "new york":
   case "noo yrk":
      $state="New York";
      break;
   case "CA":
   case "calif.":
   case "cal":
      $state="California";
      break;
   default:
      $state="Unknown";
}

Rather than writing six separate if statements, I chose to use a single switch statement. This is a) cleaner code and b) a nice introduction to the switch statement, which I did not cover last month. Generally, the switch statement will test a single expression for the results you choose to test for. In the preceding example, we're examining the variable "$row['state']". Each case statement within the switch block acts as an "if" statement, and all code through the following break statement is executed. The code in the default block matches when no other case statements match. While the switch statement may contain virtually any expression, the match in each case statement can only be a simple type: string, integer or floating-point. You cannot evaluate arrays or objects here. For these more complex cases, you still must rely on an if statement.

The simple line doing all of the work here is this: $state="...". We're just assigning a literal to our variable. We're even free to re-assign $row['state'] if it suits our needs and style.

One of the coolest ways to tear up a string is the explode() function. As always, an example is best:

$animal="bear cat dog elephant";
$the_animals=explode(" ",$animal);
After this runs, $the_animals will be an array containing:
$the_animals[0]="bear"
$the_animals[1]="cat"
$the_animals[2]="dog"
$the_animals[3]="elephant"

explode() can split based on any delimiter, not just a space. However, this is extremely useful for splitting up a full name. If you know that all entries in the database are in firstname-space-lastname format, then this will do what you want:

list($fname, $lname) = explode(" ",$fullname);

However, the reason you're probably involved is due to a more complex nature of the data. If most names are in two-name format, but some may contain a middle initial, too, then we have a few options. One would be to toss the middle initial, or include it in the first name. In that scenario we need to test how many elements were returned:

$fullname="William S. Fatire";
$names=explode(" ",$fullname);
if (count($names) > 2) {
   $fname=$names[0]." ".$names[1];
   $lname=$names[2];
} else {
   $fname=$names[0];
   $lname=$names[1];
}
print "The names are:\n";
print "Firstname: $fname\n";
print "Lastname: $lname\n";

The count() function is introduced here and simply returns the number of elements in an array.

Naturally, our target database may have a field for middle initial, in which case, we can retain it and assign it appropriately.

Let's now imagine that we want to create a default user id for our new users. We can base this id on the user's real name. PHP includes some nice string slicing functions. If we want to base the user ID on the user's first initial plus last name, that's simple:

$uid=$fname[0].$lname;

PHP can access string elements by character when you supply the zero-based offset in square brackets. So, in this example, we're just grabbing the first character of $fname.

If you had a more complex uid-creation scheme, it'd be easy to handle, too. If you needed a uid that contained the first three letters from the family name, and the first two from the first name, PHP includes a nice sub-string function. Generically, it looks like this:

substr ($string, $start [,$length ])

This will return a string that starts at '$start' in $string, and runs until the end of the string supplied, or optionally, for $length. Back to our first-three, first-two uid scheme:

$uid=substr($lname,0,3).substr($fname,0,2);

Nice, right? Of course, if you were really implementing this, you'd need to look for duplicates and also check for last names that may be shorter than 3 characters in total (like "Li").

All Together Now

I'd like to put all of this together to create a code snippet that takes data from a database, gives space to manipulate the data, and output a CSV file. Some new elements will also be introduced here.

Listing 1: db2csv.php

01 <?php
02 
03 mysql_connect("127.0.0.1", "dbuser", "dbpass") or
04     die("Could not connect: " . mysql_error());
05 mysql_select_db("mydb");
06 
07 $q="select * from user_profiles order by fid";
08 $r=mysql_query($q);
09 while ($row=mysql_fetch_assoc($r)) {
10    $pf[$row['fid']]=$row['title'];
11 }
12 
13 // Create the header
14 $curcsv="\"uid\",";
15 foreach ($pf as $key=>$val) {
16    $curcsv=$curcsv."\"".$val."\",";
17 }
18 $curcsv=substr($curcsv,0,strlen($curcsv)-1);
19 print "$curcsv\n";
20 
21 $q="select * from users where status=1";
22 $r=mysql_query($q);
23 while ($row=mysql_fetch_assoc($r)) {
24 // Build csv for current user
25    $curcsv="\"".$row['uid']."\",";
26    foreach ($pf as $i=>$val) {
27       $q2="select value from user_profiles where fid=$i and uid=".$row['uid'];
28       $r2=mysql_query($q2);
29       $row2=mysql_fetch_assoc($r2);
30       $curcsv=$curcsv."\"".$row2['value']."\",";
31    }
32    $curcsv=substr($curcsv,0,strlen($curcsv)-1);
33    print "$curcsv\n";
34 }
35 
36 ?>

Newly introduced here is the foreach loop. foreach gives the programmer (you) an easy way to iterate over arrays. Generically, foreach looks like this:

foreach ($array as $value)

The loop interates once for each element of the array, and $value is updated accordingly. A variation to this (seen on line 26 in listing 1) also includes the current key - very useful for associative arrays where the key is text or other representation of the index. That variant is simply:

foreach ($array as $key=>$value)

Breaking down listing 1, lines 1-5 should look familiar: the opening PHP tag, and then try to connect to the database. Line 7 defines the first query to the first table, and line 8 executes that query against the selected database. The while loop starting at line 9 is pretty standard fare, but what's going on in the loop body?

On line 10, we're creating an associative array ($pf) using a variable as the key ("index"). In this case, we're using the result of the database fetch $row['fid']. Pretty slick.

Lines 14 through 17 take case of one small but significant part of writing out a CSV file: making the first line a header that describes the remaining columns. In our case, we can do that with the contents of the array we just created. Another foreach loop neatly solves that. In the body of this loop, we're creating a variable that will hold the current line of the CSV file. Each field is wrapped in quotes, and then followed by a comma.

Line 18 removes the trailing comma from $curcsv, and line 19 prints out $curcsv. Now for the bulk of work.

Lines 21 through 33 handle the main load of this program. Line 21 and 22 set up a new query and execute it. Line 23 brings back our now familiar while loop, fetching one database row at a time. Line 25 starts $curcsv anew each iteration, wrapping the proper value ($row['uid'] in this case) in quotes followed by a comma.

Line 26 gets interesting: we use a new foreach loop to obtain more information about this particular uid for each of the values in $pf - by running a new query. We query and fetch again, and line 30 adds each field retrieved to $curcsv (wrapped and comma'd).

Finally, line 32 takes care of the trailing comma on $curcsv (because we blindly add it after each value), and line 33 prints the row.

Running this, or similar program, will simply dump all output to standard out - we're only using a print statement. You thought we were creating a CSV file? Well, we are! Everything in Unix is a file. Remember our shell redirectors? We can run this program as is and have the chance to visually inspect the output, and, when we're ready, simply redirect to a file:

php db2csv.php > users.csv

This is also a nice euphemism for, "I'm going to cover file manipulation next month!"

Conclusion

I'll continue to say: don't discount PHP as a general scripting language. It's fairly rapid development, has broad capabilities, and will typically be found across platforms (including versions for Windows - but you'll need to install it yourself). As mentioned, I'll cover some more aspects of scripting with PHP next month, including file manipulation.

Media of the month: Inside the Machine: An Illustrated Introduction to Microprocessors and Computer Architecture, by Jon Stokes. It's a very readable introduction to microprocessor architecture, and it's even current enough that it covers up through Intel's Core 2 Duo chips.

Until next time, enjoy your new scripting prowess!

References

Official PHP Documentation: http://www.php.net/docs.php


Ed Marczak is the Executive Editor for MacTech Magazine, and has been lucky enough to have ridden the computing and technology wave from early on. From teletype computing to MVS to Netware to modern OS X, his interest has always been piqued. He has also been fortunate enough to come into contact with some of the best minds in the business. Ed spends his non-compute time with his wife and two daughters.

 

Community Search:
MacTech Search:

Software Updates via MacUpdate

Duplicate Annihilator 5.7.5 - Find and d...
Duplicate Annihilator takes on the time-consuming task of comparing the images in your iPhoto library using effective algorithms to make sure that no duplicate escapes. Duplicate Annihilator... Read more
BusyContacts 1.0.2 - Fast, efficient con...
BusyContacts is a contact manager for OS X that makes creating, finding, and managing contacts faster and more efficient. It brings to contact management the same power, flexibility, and sharing... Read more
Capture One Pro 8.2.0.82 - RAW workflow...
Capture One Pro 8 is a professional RAW converter offering you ultimate image quality with accurate colors and incredible detail from more than 300 high-end cameras -- straight out of the box. It... Read more
Backblaze 4.0.0.872 - Online backup serv...
Backblaze is an online backup service designed from the ground-up for the Mac.With unlimited storage available for $5 per month, as well as a free 15-day trial, peace of mind is within reach with... Read more
Little Snitch 3.5.2 - Alerts you about o...
Little Snitch gives you control over your private outgoing data. Track background activity As soon as your computer connects to the Internet, applications often have permission to send any... Read more
Monolingual 1.6.4 - Remove unwanted OS X...
Monolingual is a program for removing unnecesary language resources from OS X, in order to reclaim several hundred megabytes of disk space. If you use your computer in only one (human) language, you... Read more
CleanApp 5.0 - Application deinstaller a...
CleanApp is an application deinstaller and archiver.... Your hard drive gets fuller day by day, but do you know why? CleanApp 5 provides you with insights how to reclaim disk space. There are... Read more
Fantastical 2.0 - Create calendar events...
Fantastical is the Mac calendar you'll actually enjoy using. Creating an event with Fantastical is quick, easy, and fun: Open Fantastical with a single click or keystroke Type in your event details... Read more
Cocktail 8.2 - General maintenance and o...
Cocktail is a general purpose utility for OS X that lets you clean, repair and optimize your Mac. It is a powerful digital toolset that helps hundreds of thousands of Mac users around the world get... Read more
Direct Mail 4.0.4 - Create and send grea...
Direct Mail is an easy-to-use, fully-featured email marketing app purpose-built for OS X. It lets you create and send great looking email campaigns. Start your newsletter by selecting from a gallery... Read more

These are All the Apple Watch Apps and G...
The Apple Watch is less than a month from hitting store shelves, and once you get your hands on it you're probably going to want some apps and games to install. Fear not! We've compiled a list of all the Apple Watch apps and games we've been able to... | Read more »
Appy to Have Known You - Lee Hamlet Look...
Being at 148Apps these past 2 years has been an awesome experience that has taught me a great deal, and working with such a great team has been a privilege. Thank you to Rob Rich, and to both Rob LeFebvre and Jeff Scott before him, for helping me... | Read more »
Hands-On With Allstar Heroes - A Promisi...
Let’s get this out of the way quickly. Allstar Heroes looks a lot like a certain other recent action RPG release, but it turns out that while it’s not yet available here, Allstar Heroes has been around for much longer than that other title. Now that... | Read more »
Macho Man and Steve Austin Join the Rank...
WWE Immortals, by Warner Bros. Interactive Entertainment and WWE, has gotten a superstar update. You'll now have access to Macho Man Randy Savage and Steve Austin. Both characters have two different versions: Macho Man Randy Savage Renegade or Macho... | Read more »
Fearless Fantasy is Fantastic for the iF...
I actually had my first look at Fearless Fantasy last year at E3, but it was on a PC so there wasn't much for me to talk about. But now that I've been able to play with a pre-release version of the iOS build, there's quite a bit for me to talk... | Read more »
MLB Manager 2015 (Games)
MLB Manager 2015 5.0.14 Device: iOS Universal Category: Games Price: $4.99, Version: 5.0.14 (iTunes) Description: Guide your favorite MLB franchise to glory! MLB Manager 2015, officially licensed by MLB.com and based on the award-... | Read more »
Breath of Light (Games)
Breath of Light 1.0.1421 Device: iOS Universal Category: Games Price: $2.99, Version: 1.0.1421 (iTunes) Description: Hold a quiet moment. Breath of Light is a meditative and beautiful puzzle game with a hypnotic soundtrack by... | Read more »
WWE WrestleMania Tags into the App Store
Are You ready to rumble? The official WWE WrestleMania app, by World Wrestling Entertainment, is now available. Now you can get all your WrestleMania info in one place before anyone else. The app offers details on superstar signings, interactive... | Read more »
Bio Inc's New Expansion is Infectin...
Bio Inc., by DryGin Studios, is the real time strategy game where you infect a human body with the worst virus your evil brain can design. Recently, the game was updated to add a whole lot of new features. Now you can play the new “Lethal”... | Read more »
The Monocular Minion is Here! Despicable...
Despicable Me: Minion Rush, by Gameloft, is introducing a new runner to the mix in their latest update. Now you can play as Carl, the prankster minion. Carl has a few new abilities to play with, including running at a higher speed from the start.... | Read more »

Price Scanner via MacPrices.net

13-inch 2.5GHz MacBook Pro (refurbished) avai...
The Apple Store has Apple Certified Refurbished 13″ 2.5GHz MacBook Pros available for $829, or $270 off the cost of new models. Apple’s one-year warranty is standard, and shipping is free: - 13″ 2.... Read more
Save up to $80 on iPad Air 2s, NY tax only, f...
 B&H Photo has iPad Air 2s on sale for $80 off MSRP including free shipping plus NY sales tax only: - 16GB iPad Air 2 WiFi: $469.99 $30 off - 64GB iPad Air 2 WiFi: $549.99 $50 off - 128GB iPad... Read more
iMacs on sale for up to $205 off MSRP
B&H Photo has 21″ and 27″ iMacs on sale for up to $205 off MSRP including free shipping plus NY sales tax only: - 21″ 1.4GHz iMac: $1019 $80 off - 21″ 2.7GHz iMac: $1189 $110 off - 21″ 2.9GHz... Read more
Färbe Technik Offers iPhone Battery Charge LI...
Färbe Technik, which manufactures and markets of mobile accessories for Apple, Blackberry and Samsung mobile devices, is offering tips on how to keep your iPhone charged while in the field: •... Read more
Electronic Recyclers International CEO Urges...
Citing a recent story on CNBC about concerns some security professionals have about the forthcoming Apple Watch, John Shegerian, Chairman and CEO of Electronic Recyclers International (ERI), the... Read more
Save up to $380 with Apple refurbished iMacs
The Apple Store has Apple Certified Refurbished iMacs available for up to $380 off the cost of new models. Apple’s one-year warranty is standard, and shipping is free: - 27″ 3.5GHz 5K iMac – $2119 $... Read more
Mac minis on sale for up to $75 off, starting...
MacMall has Mac minis on sale for up to $75 off MSRP including free shipping. Their prices are the lowest available for these models from any reseller: - 1.4GHz Mac mini: $459.99 $40 off - 2.6GHz Mac... Read more
College Student Deals: Additional $50 off Mac...
Take an additional $50 off all MacBooks and iMacs at Best Buy Online with their College Students Deals Savings, valid through April 11, 2015. Anyone with a valid .EDU email address can take advantage... Read more
Mac Pros on sale for up to $260 off MSRP
B&H Photo has Mac Pros on sale for up to $260 off MSRP. Shipping is free, and B&H charges sales tax in NY only: - 3.7GHz 4-core Mac Pro: $2799, $200 off MSRP - 3.5GHz 6-core Mac Pro: $3719.99... Read more
13-inch 2.5GHz MacBook Pro on sale for $100 o...
B&H Photo has the 13″ 2.5GHz MacBook Pro on sale for $999 including free shipping plus NY sales tax only. Their price is $100 off MSRP. Read more

Jobs Board

DevOps Software Engineer - *Apple* Pay, iOS...
**Job Summary** Imagine what you could do here. At Apple , great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring Read more
*Apple* Retail - Multiple Positions (US) - A...
Sales Specialist - Retail Customer Service and Sales Transform Apple Store visitors into loyal Apple customers. When customers enter the store, you're also the Read more
Sr. Technical Services Consultant, *Apple*...
**Job Summary** Apple Professional Services (APS) has an opening for a senior technical position that contributes to Apple 's efforts for strategic and transactional Read more
Lead *Apple* Solutions Consultant - Retail...
**Job Summary** Job Summary The Lead ASC is an Apple employee who serves as the Apple business manager and influencer in a hyper-business critical Reseller's store Read more
*Apple* Pay - Site Reliability Engineer - Ap...
**Job Summary** Imagine what you could do here. At Apple , great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring Read more
All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.