1/7/07

PHP on Mac OS X

source: apple.com

PHP 4.3 is an exciting release for web programmers working on Mac OS X. Previous versions ran on Mac OS X, but not without some difficulty. Installation was especially challenging, and most opted for a binary release, like the one distributed by entropy.ch.

But PHP 4.3 offers full support for Mac OS X. Thanks to some very hard work by members of the PHP development team, anyone familiar with UNIX installations should be able to get PHP up and running. In this article I’ll show you the basics of Mac OS X installation, then demonstrate PHP’s flexibility by describing three distinct ways to parse XML files.

Installation Choices

PHP offers many installation options, far more than I could cover in this article. You can install it as a stand-alone binary, for example, or get it to run as an Apache module. Plus there are dozens of APIs that can be added to PHP, and most require additional libraries to be downloaded and installed.

For the purposes of this tutorial, I’ll assume you’ll be installing PHP as an apxs module that works with the Apache server that shipped with Mac OS X. I’ll show you how I installed a few modules that were of particular interest to me, though other modules can be installed using similar methods.

Initial Steps

First, get the most recent source code of PHP. There may be a later version since the time of this publication. Download the file to a convenient directory. (I’ve created a directory called /apps for PHP and other UNIX applications I need to install.) Decompress the downloaded file with the following commands:

shell> gunzip php-4.3.4.tar.gz
shell> tar xf php-4.3.4.tar

You’ll now have a directory called /apps/php-4.3.4. At this point you can do a basic, no-frills PHP installation:

shell> cd php-4.3.4
shell> ./configure --with-apxs
shell> make
shell> sudo make install

As you add more modules to your installation, you’ll also need more flags in the ./configure command. For example, if you followed Apple Internet Developer’s instructions for installing MySQL or PostgreSQL, you could add either of the following flags to ./configure to make PHP aware of your preferred database:

shell> ./configure --with-mysql=/usr/local/mysql \
--with-pgsql=/usr/local

(Note that PHP assumes most libraries will be in /usr/local. In the above command, I had to make PHP aware that MySQL was not installed in the default directory.)

I decided to add three other modules to my installation: the GD module for image processing, the Expat module for XML processing, and the DOM XML module for XML parsing through the Document Object Model.

The Expat module is a piece of cake. All you need to do is add the --with-xml flag. The required libraries are already available.

The GD image processing module is a bit trickier, however. The GD libraries are bundled with PHP 4.3, but GD is dependent on a few other libraries: zlib, libjpeg, libpng, and libtiff. You could go through the process of downloading, configuring, and compiling these yourself, but, if you’re using Fink, there’s really no need. Fink automates the installation of dozens of UNIX packages and is extremely easy to work with.

Follow Fink’s installation instructions and you’ll end up with a new directory called /sw with a great deal of software in it. You can now use Fink to install libraries with a single command. For example, here’s how you install libjpeg, libpng, and libtiff:

shell> sudo /sw/bin/fink install libjpeg
shell> sudo /sw/bin/fink install libtiff
shell> sudo /sw/bin/fink install libpng

(Note that Fink uses a base directory of /sw for all its installs.)

At this time, Fink can’t install zlib, which GD needs in order to run, on Mac OS X 10.2. For that you’ll need to compile from the source. That’s no big deal: start by downloading the source, then decompress it:

shell> gunzip zlib-1.1.4.tar.gz
shell> tar -xf zlib-1.1.4.tar

After you cd into the directory, you can install zlib with just two commands:

shell> cd zlib-1.1.4
shell> make
shell> sudo make install

Now you can configure PHP with everything you need to run GD:

shell> cd /apps/php-4.3.4
shell> ./configure --with-zlib-dir=/usr/local \
--with-libjpeg=/sw \
--with-libtiff=/sw \
--with-libpng=/sw \
--with-gd \
--with-pgsql=/usr/local \
--with-mysql=/usr/local/mysql \
--with-xml \
--with-apxs

I also want to add the XML DOM API, which requires the GNOME XML library, libxml2. Conveniently enough, Fink can handle that, too:

shell> sudo /sw/bin/fink install libxml2

The Fink install of libxml2 didn’t work with PHP at first, however. I had to change line 30 of /sw/include/libxml2/libxml/encoding.h to the following:

#include 

After this change, the following commands installed PHP with everything I needed:

shell> cd /apps/php-4.3.4
shell> ./configure --with-zlib-dir=/usr/local \
--with-libjpeg=/sw \
--with-libtiff=/sw \
--with-libpng=/sw \
--with-gd \
--with-pgsql=/usr/local \
--with-mysql=/usr/local/mysql \
--with-xml \
--with-dom=/sw \
--with-apxs

shell> make
shell> sudo make install

To finish the installation, you need to add the following line to your /etc/httpd/httpd.conf file:

AddType application/x-httpd-php .php

Then copy your php.ini to the /usr/local/lib directory:

shell> sudo cp /apps/php-4.3.4/php.ini-dist /usr/local/lib/php.ini
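
Before moving on, it can be worth asking PHP which modules actually made it into the build. The following is only a minimal sketch and is not one of the article's files: the helper name check_extensions() is hypothetical, and the extension names in $wanted are assumptions based on the configure flags used above.

```php
<?php
// Hypothetical helper: reports which of the given extensions are loaded.
function check_extensions($names)
{
    $report = array();
    foreach ($names as $name) {
        // extension_loaded() is TRUE when the extension is available to PHP
        $report[$name] = extension_loaded($name);
    }
    return $report;
}

// Assumed extension names, matching the configure flags used above.
$wanted = array("gd", "xml", "domxml", "mysql", "pgsql");
foreach (check_extensions($wanted) as $name => $loaded) {
    echo $name . ": " . ($loaded ? "loaded" : "missing") . "\n";
}
```

Anything reported as missing usually points back to a ./configure flag that was left out or a library that could not be found.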

Parsing XML Files

To show how PHP can parse XML files on Mac OS X, I decided to work with the XML file that holds information on iTunes songs and playlists. You can find this file in ~/Music/iTunes/iTunes Music Library.xml. The PHP files I created to work with the XML are included in this tar file.

To work with the XML file, you can make a symlink of it to your web tree. I copied the file directly into my webserver’s directory:

shell> mkdir /Library/WebServer/Documents/itunes
shell> cp ~/Music/iTunes/iTunes\ Music\ Library.xml \
/Library/WebServer/Documents/itunes/library.xml

Here’s an excerpt of the file:




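<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
<key>Major Version</key><integer>1</integer>
<key>Minor Version</key><integer>1</integer>
<key>Application Version</key><string>3.0.1</string>
<key>Tracks</key>
<dict>
<key>115</key>
<dict>
<key>Track ID</key><integer>115</integer>
<key>Name</key><string>Baby Goes To Eleven</string>
<key>Artist</key><string>Superdrag</string>
<key>Album</key><string>Last Call For Vitriol</string>
<key>Genre</key><string>Alternative &#38; Punk</string>
<key>Kind</key><string>MPEG audio file</string>
<key>Size</key><integer>4828597</integer>
<key>Total Time</key><integer>241319</integer>
<key>Disc Number</key><integer>1</integer>
<key>Disc Count</key><integer>1</integer>
<key>Track Number</key><integer>1</integer>
<key>Track Count</key><integer>12</integer>
<key>Year</key><integer>2002</integer>
<key>Date Modified</key><date>2003-01-08T13:55:19Z</date>
<key>Date Added</key><date>2003-01-08T13:53:18Z</date>
<key>Bit Rate</key><integer>160</integer>
<key>Sample Rate</key><integer>44100</integer>
<key>File Type</key><integer>1297106739</integer>
<key>File Creator</key><integer>1752133483</integer>
<key>File Folder Count</key><integer>4</integer>
<key>Library Folder Count</key><integer>1</integer>
</dict>
<key>120</key>
<dict>
<key>Track ID</key><integer>120</integer>
<key>Name</key><string>I Can't Wait</string>
<key>Artist</key><string>Superdrag</string>
<key>Album</key><string>Last Call For Vitriol</string>
<key>Genre</key><string>Alternative &#38; Punk</string>
<key>Kind</key><string>MPEG audio file</string>
<key>Size</key><integer>3978043</integer>
<key>Total Time</key><integer>198791</integer>
<key>Disc Number</key><integer>1</integer>
<key>Disc Count</key><integer>1</integer>
<key>Track Number</key><integer>2</integer>
<key>Track Count</key><integer>12</integer>
<key>Year</key><integer>2002</integer>
<key>Date Modified</key><date>2003-01-08T13:56:20Z</date>
<key>Date Added</key><date>2003-01-08T13:55:24Z</date>
<key>Bit Rate</key><integer>160</integer>
<key>Sample Rate</key><integer>44100</integer>
<key>File Type</key><integer>1297106739</integer>
<key>File Creator</key><integer>1752133483</integer>
<key>File Folder Count</key><integer>4</integer>
<key>Library Folder Count</key><integer>1</integer>
</dict>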

iTunes stores similar information for each song in the library between a set of <dict> tags. My immediate goal in PHP is to turn the provided XML into a PHP structure. I’ve opted to turn this data into a two-dimensional array that will look like this:

Array
(
[0] => Array
(
[Track ID] => 115
[Name] => Baby Goes To Eleven
[Artist] => Superdrag
[Album] => Last Call For Vitriol
[Genre] => Alternative & Punk
[Kind] => MPEG audio file
[Size] => 4828597
[Total Time] => 241319
[Disc Number] => 1
[Disc Count] => 1
[Track Number] => 1
[Track Count] => 12
[Year] => 2002
[Date Modified] => 2003-01-08T13:55:19Z
[Date Added] => 2003-01-08T13:53:18Z
[Bit Rate] => 160
[Sample Rate] => 44100
[File Type] => 1297106739
[File Creator] => 1752133483
[File Folder Count] => 4
[Library Folder Count] => 1
)

[1] => Array
(
[Track ID] => 120
[Name] => I Can't Wait
[Artist] => Superdrag
[Album] => Last Call For Vitriol
[Genre] => Alternative & Punk
[Kind] => MPEG audio file
[Size] => 3978043
[Total Time] => 198791
[Disc Number] => 1
[Disc Count] => 1
[Track Number] => 2
[Track Count] => 12
[Year] => 2002
[Date Modified] => 2003-01-08T13:56:20Z
[Date Added] => 2003-01-08T13:55:24Z
[Bit Rate] => 160
[Sample Rate] => 44100
[File Type] => 1297106739
[File Creator] => 1752133483
[File Folder Count] => 4
[Library Folder Count] => 1
)

)

PHP has a powerful set of array functions, so once you have this type of structure, you’ll be able to sort and lay out the iTunes data on a webpage any way you want.

I’m going to present three methods for moving the XML into the array. To make things easy on myself, I created a few functions that will work with the array I create. These functions are in the parse_itunes_functions.php file.

I want to be able to sort the data by any category—song name, artist name, album name, or any array key available. For that, I’ll use PHP’s usort() function, which takes two arguments: the array to be sorted, and the name of a user-defined function that the array will be sent to. Here’s my array sorting code:

function cmp ($a, $b) {
    //sort by the key named in the query string; default to "Artist"
    $sort = !empty($_GET["sort_by"]) ? $_GET["sort_by"] : "Artist";
    return strcmp($a[$sort], $b[$sort]);
}

usort($songs, "cmp");

The cmp() function looks in the GET values for a sort_by variable. If one exists, the array will be sorted by that value. The default sort_by value is “Artist.”
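
To try the sorting idea away from a web request, here is a small self-contained sketch. It is a variation rather than the code from parse_itunes_functions.php: the helpers pick_sort_key() and make_cmp() are hypothetical names, the two-row $songs array is sample data, and the closure syntax assumes PHP 5.3 or later rather than the PHP 4.3 discussed in this article.

```php
<?php
// Hypothetical helper: only accept sort keys we know exist in each row.
function pick_sort_key($requested, $allowed, $default)
{
    return in_array($requested, $allowed, true) ? $requested : $default;
}

// Hypothetical helper: builds a comparison callback for one array key.
function make_cmp($sort)
{
    return function ($a, $b) use ($sort) {
        return strcmp($a[$sort], $b[$sort]);
    };
}

// Two sample rows in the same shape as the $songs array.
$songs = array(
    array("Name" => "I Can't Wait", "Artist" => "Superdrag"),
    array("Name" => "Baby Goes To Eleven", "Artist" => "Superdrag"),
);

// In a web page $requested would come from $_GET["sort_by"].
$requested = "Name";
$sort = pick_sort_key($requested, array("Name", "Artist"), "Artist");
usort($songs, make_cmp($sort));

echo $songs[0]["Name"] . "\n"; // prints "Baby Goes To Eleven"
```

Checking the requested key against a list of known keys keeps a mistyped sort_by value from reaching strcmp() with an array index that doesn't exist.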

Once an array is sorted, you’ll need to lay it out on a webpage. The array_to_table() function takes care of that:

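function array_to_table($array, $printable)
{
    //expects multi-dimensional array, all with the same keys
    $first_time=TRUE;
    $str = "<table>\n";
    foreach($array as $elem_key=>$element){
        if($first_time){
            //header row: each header links back to this page with a
            //sort_by value, which cmp() reads from the GET values
            $str .= "<tr>\n";
            $header_items=array_keys($element);
            foreach($header_items as $header){
                if(in_array($header, $printable)){
                    $str .= "<th><a href=\"?sort_by=" . urlencode($header)
                        . "\">" . htmlspecialchars($header) . "</a></th>\n";
                }
            }
            $str .= "</tr>\n";
            $first_time=FALSE;
        }
        //one table row per song
        $str .= "<tr>\n";
        foreach($element as $k => $v){
            if(in_array($k, $printable)){
                $str .= "<td>" . htmlspecialchars($v) . "</td>\n";
            }
        }
        $str .= "</tr>\n";
    }
    $str .= "</table>";
    return $str;
}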

Note that the second argument, $printable, expects an array of key names that you want included in the table. The following two lines would print out a table with four columns:

$printable= array("Name", "Artist", "Album", "Size");
echo array_to_table($songs, $printable);

Using this code, your webpage will look something like this:

[Figure: iTunes webpage using PHP]

Clicking on the header items will sort the table by that item.

Preparing to Parse the XML

The format of the iTunes XML file is not the easiest thing to work with. Life would’ve been a little easier if the tags looked something like this:




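<library>
<song>
<name>Funk with XML</name>
<artist>Tech band</artist>
</song>
<song>
<name>My Blue Heaven</name>
<artist>Lovely Sounds Around</artist>
</song>
</library>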


But in the XML format that iTunes uses (see the above excerpt), the tag names themselves are not terribly descriptive. For example, <dict> elements appear in several contexts: surrounding every element in the file, each major segment of the file (e.g., library items and playlists), and each song within the library. To get a list of the songs in the library, the script needs to examine everything between the second opening <dict> tag and its matching closing </dict> tag, which happens to immediately precede the <key>Playlists</key> tag.

The tags for individual library items don’t offer much information. They alternate between <key> and a variable type (string, integer, date). PHP isn’t a strictly typed language, so this kind of information isn’t particularly useful here. As the script parses the file, the value within the <key> tag will need to become an array key, and the value of the next element will become the array value.

Parsing with PCRE

PHP has outstanding text-handling abilities, and the first method I’ll use for parsing XML relies on one of the handiest sets of text-handling functions available in PHP, the Perl Compatible Regular Expression (PCRE) functions. The code for my PCRE example is included in the pcre_parse.php file in the tarball.

PCRE offers nearly all the power that Perl scripters have grown to expect from their regular expressions. They’re faster than the standard PHP regular expression functions and they offer features like non-greedy matching, which you can’t use with the ereg() functions.
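
The difference between greedy and non-greedy matching is easiest to see side by side. The snippet below is just an illustration with a made-up one-line sample string; it is not taken from pcre_parse.php.

```php
<?php
// Made-up sample: two key/value pairs on a single line.
$xml = "<key>Name</key><string>I Can't Wait</string>"
     . "<key>Artist</key><string>Superdrag</string>";

// Greedy: .* runs on to the LAST </key> in the string.
preg_match("/<key>(.*)<\/key>/s", $xml, $greedy);

// Non-greedy: .*? stops at the FIRST </key> it can.
preg_match("/<key>(.*?)<\/key>/s", $xml, $lazy);

echo $greedy[1] . "\n"; // Name</key><string>I Can't Wait</string><key>Artist
echo $lazy[1] . "\n";   // Name
```

The non-greedy form is what makes it practical to pull individual tags out of a long string in which the same tag repeats many times.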

Here’s a quick function that moves each library item into an array with the structure I described above:

function parse_via_pcre($contents)
{
    //get everything between the <key>Tracks</key> and <key>Playlists</key>
    //tags, which will include each library element.
    preg_match("/<key>Tracks<\/key>.*?<dict>(.*)<key>Playlists/s",
        $contents, $whole_match);

    //the text between each <dict> tag will be a library item
    preg_match_all("/<dict>(.*?)<\/dict>/s",
        $whole_match[1], $items);
    $songs=array();
    //$j is a generic counter.
    $j=0;
    foreach($items[1] as $value){

        //this function creates two needed arrays
        //$elements[1], which stores the values within
        //the <key> tags, and
        //$elements[2], which stores the values within
        //the tags that follow
        preg_match_all("/<key>(.*?)<\/key>.*?<.*?>(.*?)<\/.*?>/s",
            $value, $elements);

        //for each element assign a key and value to $songs array
        for($i=0; $i < count($elements[1]); $i++){
            $key=$elements[1][$i];
            $value=$elements[2][$i];
            $songs[$j][$key]=$value;
        }
        $j++;
    }
    return $songs;
}

Three reasonably simple regular expressions are all it took to get at the needed information. To use this function, all you need to do is open the file and pass the function a string containing the file contents:

$fp=fopen("library.xml", "r");
$contents=fread($fp, filesize("library.xml"));
fclose($fp);
$songs=parse_via_pcre($contents);
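
If you don't have an iTunes library at hand, the key/value pairing can be tried on an inline string. The sketch below is a variation on the idea, not the code from pcre_parse.php: the helper name dict_to_array() is hypothetical, the sample string is made up, and two choices are my own assumptions, namely matching each closing tag to its opening tag with a backreference and decoding entities such as &#38; with html_entity_decode().

```php
<?php
// Hypothetical helper: turns the inside of one song <dict> into an array.
function dict_to_array($dict_xml)
{
    $song = array();
    // $m[1] holds the <key> text, $m[2] the name of the tag that follows,
    // $m[3] the text inside that tag. \2 makes the closing tag match.
    preg_match_all('/<key>(.*?)<\/key>\s*<(\w+)>(.*?)<\/\2>/s', $dict_xml, $m);
    foreach ($m[1] as $i => $key) {
        // decode entities such as &#38; back into plain characters
        $song[$key] = html_entity_decode($m[3][$i], ENT_QUOTES, "UTF-8");
    }
    return $song;
}

// Made-up sample in the shape of one library item.
$sample = "<key>Track ID</key><integer>115</integer>\n"
        . "<key>Name</key><string>Baby Goes To Eleven</string>\n"
        . "<key>Genre</key><string>Alternative &#38; Punk</string>\n";

print_r(dict_to_array($sample));
```

Note the single-quoted pattern: inside double quotes PHP would treat \2 as an octal escape rather than passing it to the regular expression engine.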

Parsing with Expat

The next approach to parsing the file uses the Expat functions. Expat is an event-based parser in the style of SAX (Simple API for XML): it can be told to perform an action when a particular condition is encountered. For instance, when the parser comes across an opening tag, that’s an event. Other events include encountering a closing tag or processing instructions. What happens when these events occur is entirely your business; in PHP, these events will trigger functions that you’ll have to write yourself.

Here’s some code that opens a file, creates a parser, and sets the needed callback functions (You can find this code in the xml_parse.php file):

$xml_parser = xml_parser_create();
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, 1);
xml_set_element_handler($xml_parser, "start_element", "end_element");
xml_set_character_data_handler($xml_parser, "character_data");

if (!($fp = @fopen($file, "r"))) {
return false;
}

while ($data = fread($fp, 4096)) {
if (!xml_parse($xml_parser, $data, feof($fp))) {
die(sprintf("XML error: %s at line %d\n",
xml_error_string(xml_get_error_code($xml_parser)),
xml_get_current_line_number($xml_parser)));
}
}

xml_parser_free($xml_parser);

This code creates a new parser with the xml_parser_create() function, and then defines some callback functions to run when events occur. In this example, I’ve indicated that the user-defined functions start_element(), end_element(), and character_data() will run when opening tags, closing tags, and character data are encountered, respectively. You’ll need to use these functions to create your internal PHP data structure.

Working with the callback functions can be a bit tricky. It’ll be easiest to begin by looking at the start_element() function:

function start_element($parser, $name, $attribs) {
global $current_element, $number_dicts;
if($name=="DICT"){
$number_dicts++;
}
if ($number_dicts>2){
$current_element=$name;
}
}

From examining the iTunes library file, I know that I only want the contents of elements that start after the third opening <dict>, where the song listing begins. So I keep a counter that tracks the number of opening <dict> tags encountered. I only begin tracking element names once the counter reaches the third one. It’s important to note that the counter and the element name both need to be globals, so that the values are available to this and the other callback functions as the parser calls them.

The character_data() function doesn’t have to do much. It stores an element’s contents in a global variable and stays on the lookout for an element value of “Playlists.” When that value turns up, we know we’re done processing the song library:

function character_data($parser, $data) {
global $number_dicts, $current_data, $end_of_songs;
if($data=="Playlists") {
$end_of_songs=TRUE;
}
$current_data=trim($data);
}

Now let’s look at the end_element() function, where most of the heavy lifting is done. As you can see from the brief code below, all it takes is a few lines to construct the array of songs:

function end_element($parser, $name) {
    global $songs, $current_element, $current_data;
    global $number_dicts, $array_key, $end_of_songs;
    if($end_of_songs){
        return;
    }
    //skip closing </dict> tags: by then $current_data holds only the
    //whitespace before the tag and would overwrite the last value stored
    if(!empty($current_element) && $name!="DICT") {
        if($current_element=="KEY"){
            $array_key=$current_data;
        }else{
            $songs[$number_dicts][$array_key]=$current_data;
        }
    }
}

XML processing with Expat can get pretty complicated if your XML files are more sophisticated or if you need to track additional information. But the previous example should provide a basic introduction to working with Expat.
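
One detail worth knowing is that the parser may deliver the text of a single element in several pieces, for example on either side of an entity such as &#38;. The sketch below shows one way to allow for that by appending to a buffer. It is an illustration under stated assumptions, not the code from xml_parse.php: the helper name parse_pairs() is hypothetical, the sample string is made up, and closures are used in place of globals, which assumes PHP 5.3 or later.

```php
<?php
// Hypothetical helper: collects key/value pairs from one <dict> of elements.
function parse_pairs($xml)
{
    $pairs = array();
    $buffer = "";   // character data seen since the last opening tag
    $key = null;    // text of the most recent <key>

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 1);

    xml_set_element_handler(
        $parser,
        function ($p, $name, $attribs) use (&$buffer) {
            $buffer = "";                 // start collecting afresh
        },
        function ($p, $name) use (&$buffer, &$key, &$pairs) {
            if ($name == "KEY") {
                $key = trim($buffer);
            } elseif ($name != "DICT" && $key !== null) {
                $pairs[$key] = trim($buffer);
                $key = null;
            }
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$buffer) {
            $buffer .= $data;             // append: data may arrive in pieces
        }
    );

    // third argument TRUE: this string is the whole document
    if (!xml_parse($parser, $xml, true)) {
        return false;
    }
    return $pairs;
}

// Made-up sample in the shape of one library item.
$xml = "<dict><key>Name</key><string>I Can't Wait</string>"
     . "<key>Genre</key><string>Alternative &#38; Punk</string></dict>";
print_r(parse_pairs($xml));
```

Clearing the buffer on every opening tag and reading it on the matching closing tag keeps stray whitespace between elements out of the stored values.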

XML Processing with DOM XML

The Document Object Model, or DOM, is a W3C standard that maps the structure of an XML document into a series of objects. You can access the document’s structure or manipulate the XML with a series of methods and properties. For example, working with the DOM, XML’s root element is considered a node, and from the root you can get a list of immediate children, add a child element, or remove a specific child. DOM methods also allow you to traverse an XML tree with great specificity, letting you get to the exact area of the document that you’re interested in.

Unlike the Expat functions, the DOM does not require you to build an internal data structure to store your XML: the structure of your XML tree is available within the DOM.

In my opinion, the DOM functions are easier to work with, though they do require more resources than the very lean Expat functions. The function set is also very new; the PHP online manual urges caution when using this API.

It took very little code to manipulate the iTunes playlist with the DOM functions. (You can find this code in the dom_parse.php file.)

$dom=domxml_open_file("library.xml");

//$dicts will contain an array of <dict> elements
$dicts=$dom->get_elements_by_tagname("dict");

//I'm interested in everything within the second dict tag
//that's where the entire music library is
//$song_nodes will contain an array of <dict> elements.
//The sub-elements of each <dict> contain all song data.

$song_nodes = $dicts[1]->get_elements_by_tagname("dict");

// get the first "dict" object (the first object with song data)
$song_node = $song_nodes[0];

// now iterate through the song objects and their data
// using the next_sibling() method to move
$i = 0;

while ($song_node){
$data_node = $song_node->first_child();
while($data_node)
{
if($data_node->node_name() != "#text") {
if($data_node->node_name() == "key"){ // found a key
$array_key=$data_node->get_content();
}else{ // found the value for the current key
$songs[$i][$array_key]=$data_node->get_content();
}
}
// now advance to the next data node
//at this level (within a song)
$data_node = $data_node->next_sibling();
}
$i++;
// advance to the next song node
$song_node = $song_node->next_sibling();
}

Once you’ve run through this code, you can send $songs to the array processing functions shown earlier.
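
The domxml functions used above belong to PHP 4. From PHP 5 onward the bundled DOM extension (the DOMDocument class) took their place, so on a newer PHP the same walk might look like the sketch below. This swaps in a different API from the one in dom_parse.php: the helper name songs_from_plist() is hypothetical and the short XML string is a made-up sample in the shape of the iTunes file.

```php
<?php
// Hypothetical helper: reads each song <dict> under Tracks into an array.
function songs_from_plist($xml)
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return false;
    }
    $songs = array();
    // the second <dict> in the document holds the whole track list
    $tracks = $doc->getElementsByTagName("dict")->item(1);
    if ($tracks === null) {
        return $songs;
    }
    foreach ($tracks->childNodes as $node) {
        // direct <dict> children are songs; skip text and <key> nodes
        if ($node->nodeName != "dict") {
            continue;
        }
        $song = array();
        $array_key = null;
        foreach ($node->childNodes as $data_node) {
            if ($data_node->nodeType != XML_ELEMENT_NODE) {
                continue;
            }
            if ($data_node->nodeName == "key") {        // found a key
                $array_key = $data_node->textContent;
            } elseif ($array_key !== null) {            // value for that key
                $song[$array_key] = $data_node->textContent;
            }
        }
        $songs[] = $song;
    }
    return $songs;
}

// Made-up sample: two tracks, each with a single Name entry.
$xml = '<plist version="1.0"><dict><key>Tracks</key><dict>'
     . '<key>115</key><dict><key>Name</key><string>Baby Goes To Eleven</string></dict>'
     . '<key>120</key><dict><key>Name</key><string>I Can\'t Wait</string></dict>'
     . '</dict><key>Playlists</key><array/></dict></plist>';
print_r(songs_from_plist($xml));
```

As with the domxml version, the resulting array has the same shape as $songs and can be handed to the sorting and table functions.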

Conclusion

PHP is an easy-to-learn and flexible language. With dozens of extensions to choose from, PHP can provide all the power you need to create web applications on Mac OS X. And now that version 4.3 is here, it’s even easier to get PHP up and running on your system.
