Blog

Notes from Facebook's HipHop for PHP Debut

Posted Feb 2, 2010 by Kris Jordan | Comments ( 2 ) | Filed in: PHP

This evening the developers behind HipHop dove into more depth on the technology over a Live UStream cast. In case you missed it, here are my notes from the presentation.

Performance compared to PHP with APC opcode cache turned on:

  • 30% less CPU time serving traffic in API usage (shorter requests pulling data from fewer sources)
  • 50% less CPU time serving traffic in WEB page rendering usage

PHP, being a scripting language, has a certain amount of 'non-magic' code that looks and feels much like static code you might write in C/C++. This code and logic can be compiled down to very fast static function calls, static variable look-ups, and static types. This is where HipHop's performance gains come from. The 'magic' / dynamic features of PHP are much harder to optimize and run roughly as fast as they do on the Zend engine. Magic includes:

$$x = $$y;
eval($x . $y);
$$$$foo();
function foo($x) { include $x; }

There are some tricks that can be applied for this style of dynamic code, things like prehashing dynamic variable look-ups, efficient jump tables, etc.
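To make the static/dynamic distinction concrete, here is a small sketch in plain PHP. It does not use HipHop and says nothing about what HipHop actually generates; the function names sumOfSquares, square, and sumOfSquaresDynamic are made up for illustration. The first function is the kind of 'non-magic' code where every variable and call target can be read off the source, while the second computes the same value but only learns its variable and function names at runtime.

```php
<?php
// "Non-magic" code: every variable and call target is visible in the source.
function sumOfSquares($n) {
    $total = 0;
    for ($i = 1; $i <= $n; $i++) {
        $total += $i * $i;
    }
    return $total;
}

function square($x) { return $x * $x; }

// "Magic" code: same result, but the names are only known at runtime.
function sumOfSquaresDynamic($n, $variableName, $functionName) {
    $$variableName = 0;                         // variable variable
    for ($i = 1; $i <= $n; $i++) {
        $$variableName += $functionName($i);    // dynamic function call
    }
    return $$variableName;
}

$static  = sumOfSquares(4);
$dynamic = sumOfSquaresDynamic(4, 'total', 'square');
```

Both calls produce the same number; the difference is only in how much a compiler can know before the script runs.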
 "If the code has a lot of dynamicness it will execute at roughly the same speed."

Transformation Process:

1. Static analysis
* Collect info on who declares what, dependencies
2. Type inference
* Pick most specific type for every variable possible
** C++ scalars, string, array, classes, object, and variant
* Type hints in code are analyzed
3. Code Generation

Facebook hand-wrote the runtime libraries for the string/array/object classes, similar to the Zend engine's. They rewrote the Zend engine because it primarily deals with variant types. In HipHop there are special cases for static types made possible by type inference: code for "what if this variable is an integer", etc. Another reason to rewrite the runtime is that once the whole runtime is in C++ it should be easier to author extensions based on primitive C++/Facebook types than on their dynamic Zend engine counterparts.

"We want people to write PHP like they're used to writing PHP."

Supported magical PHP features:

  • dynamic function call including call_user_func()
  • dynamic object properties and methods
  • dynamic variables, including extract()
  • dynamic includes
  • redeclared functions
  • redeclared classes
  • redeclared constants
  • magic methods __toString(), __get(), __set(), __call()

Features not supported:
1. eval(), create_function(), preg_replace() with the /e modifier
2. "if(function_exists('foo')) { print 'foo'; }
function foo() {}"
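For code that leans on the unsupported features, standard PHP 5.3 already offers alternatives. The sketch below is not from the presentation; it only illustrates, as an assumption about how one might port such code, a closure standing in for create_function() and preg_replace_callback() standing in for preg_replace() with the /e modifier. The sample text and variable names are invented.

```php
<?php
// Instead of: $double = create_function('$x', 'return $x * 2;');
$double = function($x) { return $x * 2; };

// Instead of: preg_replace('/\d+/e', '$0 * 2', $text);
$text = 'buy 2 apples and 10 pears';
$doubled = preg_replace_callback(
    '/\d+/',
    function($matches) use ($double) {
        // $matches[0] holds the full matched number as a string
        return (string) $double((int) $matches[0]);
    },
    $text
);
```

Neither replacement evaluates a string as code, so nothing has to be compiled at request time.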

HipHop runs as one process with multiple threads. No downtimes during restarts (port takeover). Uses libevent internally.

In the next month the goal is to bring it up to speed with 5.2.12. After that the focus switches to PHP 5.3+. HipHop may potentially support some of the type suggestion and typing features that did not make the 5.3 cut.

Some Q&A
Do you support old PHP extensions?
We've converted extensions because they have their own data structures. They've converted a lot of them because Facebook's codebase uses a lot of extensions. The conversion keeps the same logic and switches out the data structures.

There are some PHP extensions that are not thread-safe, how do we handle them?
We've fixed the problems they had and made them thread-safe. Once the source is opened you will get the full list of supported extensions.

How did you know it "worked"?
- We counted the number of function calls and compared with PHP; it has to be identical.
- We compared network traffic between machines running HipHop and running PHP and made sure the in and out were practically identical.
- High enough volume that every machine gets roughly the same kind of traffic making this analysis possible.

Have you tried to support other compilers like Intel's?
We want to try Intel's, which has great support for memcpy.

Why did you choose not to go the just-in-time / LLVM route?
V8 has great optimizations for JavaScript, but it was not available at the time we started the project. We absolutely considered LLVM. Subconsciously I was thinking about having a tool that automatically converts code to C++. We may be able to borrow some of the optimization techniques being used by V8.

This is really exciting technology. Once the source is released we'll be working out the details on how to compile Recess on HipHop as soon as possible.

New Recess Website & Plans for 2010

Posted Jan 20, 2010 | Comments ( 8 )

Recess has a redesigned website thanks to Josh Lockhart and Eli van Zoeren of New Media Campaigns. The new site will allow for better organization of documentation and resources. It's an exciting way to kick off Recess' second year.

Recess' plans for 2010 will build on the lessons of 2009 and the developments seen across the PHP community at large.

  1. Interoperability. The next Recess release will adhere to the class loading and namespacing rules the PHP community has come to agreement over. Now that PHP has proper namespace support interoperability does not come at the cost of crazy class naming conventions based on directory structure.
  2. Focus. The next Recess release will be Recess Core for 5.3. By focusing one component at a time, rather than a shotgun approach, we can ensure not only high quality, properly tested code but thorough documentation, as well. Documentation was our weakest spot last year. Work on this began in September and will be completed soon. After Core the focus will quickly shift to modularizing the HTTP request handling system.
  3. Scriptable. Recess' Tools GUI provides an easy-to-use rapid development front-end for kicking off CRUDs, inspecting classes, and reading routes. Where tools, and Recess, will improve is in scriptability. Common development tasks will be more straightforward to script and access through PHP APIs.

These are our new year resolutions. Recess' commitment to RESTful, enjoyable, and best-practice PHP development is unwavering. Here's to a big 2010!

Recess Community Updates

Posted Oct 6, 2009 by Kris | Comments ( 0 ) | Filed in: Community, PHP

In the month since the new Recess Forums have launched the number of posts and interesting topics has been flourishing. Lots of great things are bubbling up:

PostgreSQL support is being developed on the database stack, led by Ryan Day. You can pull Ryan's postgres bits from Github here. Ryan has also put up two great, introductory tutorials on his blog. The first is on modifying the generated scaffolding code. The second is on beefing up date input fields. Awesome work.

Christiaan has been working on a Recess project that changes the Recess flow-of-control policy to achieve a more nested MVC approach. Check out his project alive.

Issue tracking has moved from lighthouse to Github to keep our source and bugs closer together.

Some other interesting threads from the forums:

If you haven't joined the forums or checked back in on them in a while, you should head that way.

Dynamic Invocation in PHP: is_callable, call_user_func, and call_user_func_array - Functional PHP 5.3 Part III

Posted Aug 31, 2009 by Kris Jordan | Comments ( 3 ) | Filed in: PHP

In PHP you can dynamically invoke functions, lambdas, and methods in a number of ways. In this third part of the series we're going to explore callables in PHP 5.3. Check out Part I of the series if you're unfamiliar with anonymous functions, lambdas and closures in PHP 5.3. Part II followed up with a look at higher order PHP functions by implementing map and reduce.

What is Dynamic Invocation?

We've seen a flavor of dynamic invocation in the previous posts of this series using higher-order methods. The functions map and reduce receive a lambda as an argument and dynamically invoke it at runtime. PHP doesn't know (or care) in advance which lambda it is passed and it won't know until runtime:

$lambda = rand(0,1) ? function() { echo "Heads!"; } :
                      function() { echo "Tails!"; };
$lambda(); // <= Dynamic Invocation!

It's not until we run the script and flip the coin that PHP will know which anonymous function is being invoked. PHP has actually had the ability to do dynamic invocation for quite some time. Let's take a look at a similar example using code that runs pre- PHP 5.3:

function heads() { echo "Heads!"; }
function tails() { echo "Tails!"; }
$function = rand(0,1) ? 'heads' : 'tails';
$function(); // <= Dynamic Invocation!

We start by defining two functions named heads and tails, respectively. We then flip a coin and assign the string "heads" or "tails" to $function. Finally we invoke $function as if it were a lambda and the correct function gets called! Under its breath PHP is saying "looks like a function call, but this is actually a string whose value is 'heads'. Do I know of a function named 'heads'? Yes! Initiate dynamic invocation thrusters in 5...4...3...2..."

What if $function's value was a string with no corresponding function name, or an integer, or a boolean? You would murder our friend PHP. You should know that death by dysfunctional dynamic invocation is a most cruel and unusual death.

$fatality = 'Finish Him!';
$fatality();
// Fatal error: Call to undefined function Finish Him!()

is_callable to PHP's defense

How can you stop yourself from murdering PHP with dysfunctional dynamic invocations? Enter: is_callable. If safety is your concern, is_callable is your defense. Pass any variable you want to is_callable and it will tell you true or false whether or not it can be dynamically invoked.

function heads() { echo "Heads!"; }
$function = 'heads';
is_callable($function); // true

$lambda = function() { echo "Tails!"; };
is_callable($lambda); // true

$kimJungIl = "fireNukesAndSingAboutLoneliness";
is_callable($kimJungIl); // false

With the powers of is_callable we can learn whether making a dynamic call to the contents of a variable makes sense. Notice that is_callable works on both the function name string variable $function, and the anonymous function $lambda. We can call them just the same, too, so they can be used interchangeably:

function heads() { echo "Heads!"; }
$callable = rand(0,1) ? 'heads' : function() { echo "Tails!"; };
if(is_callable($callable)) {
    $callable();
}
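The same guard can be folded into a small helper. The sketch below is only an illustration: invokeOr is a hypothetical name, and it returns values instead of echoing so the results are easy to inspect. It uses call_user_func, which is covered later in this post, so that every kind of callable is handled.

```php
<?php
// Hypothetical helper: invoke $candidate if it is callable, else return $default.
function invokeOr($candidate, $default) {
    if (!is_callable($candidate)) {
        return $default;
    }
    return call_user_func($candidate);
}

function heads() { return 'Heads!'; }

$fromName   = invokeOr('heads', 'n/a');                             // named function
$fromLambda = invokeOr(function() { return 'Tails!'; }, 'n/a');     // anonymous function
$fromBogus  = invokeOr('Finish Him!', 'n/a');                       // no such function
$fromInt    = invokeOr(42, 'n/a');                                  // not callable at all
```

The last two calls fall back to the default rather than ending in a fatal error.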

This is really sweet because with higher-order functions like map and reduce we can pass either a lambda as demonstrated in part II, or string that is the name of a first-class function, and it will work just the same. Dynamic invocation enables flexibility through indirection and late-binding. Now that we've seen dynamic invocations of functions and anonymous functions, what about methods and static methods?

Dynamically Invoking Instance and Static Methods

While we practice our powers of dynamically invoking functions and lambdas, an elder PHP programmer presents a challenge: dynamically invoke an object's method without our script becoming a statistic, just another PHP fatality. With a mischievous grin she hints, "Try not with a string, you will. Do with a two element array, you must."

class Object { 
	function method() { echo "Great success!"; }
}
$callableArray = array(new Object, 'method');
if(is_callable($callableArray)) {
    $callableArray();
}

Taking the hint we place an instance of the object as the first element of our array, and the name of the method as the second. We protect ourselves by only committing to the dynamic invocation if the array is callable. We're safe, right? So, we run the script... and we murder PHP. "Fatal error: Function name must be a string." The elder chuckles and walks away...

Use call_user_func and call_user_func_array for Robust Dynamic Invocation in PHP

PHP's Achilles heel is its inconsistencies. It may never live them down. C'est la vie. It turns out is_callable doesn't mean directly callable, but rather callable using the utility functions call_user_func or call_user_func_array. Let's try that example again:

class Object { 
	function method($greatWho, $greatWhat) { 
		echo "$greatWho are Great $greatWhat!";
	}
}
$callableArray = array(new Object, 'method');
if(is_callable($callableArray)) {
    call_user_func($callableArray, "You", "Success");
    call_user_func_array($callableArray, array("You", "Success"));
}

"You are Great Success!" says the script. Notice the subtle variation in how each is called. call_user_func takes a list of arguments following the $callable just like any other function call, whereas call_user_func_array takes an array that will be used as arguments. In practice this means you can use call_user_func when you know exactly the arguments you want to send to the callable, and call_user_func_array when you don't know how many arguments there are or when the number varies. Can you think of a time when you wouldn't know the exact arguments to use in a dynamic invocation?

Thought of an example yet? How about if we wanted a generic way to wrap new behavior around a function? A simple example...

function wrapWithEcho($callable) {
    return function() use ($callable) {
        echo "Calling!";
        call_user_func_array($callable, func_get_args()); // !!!
        echo "Done calling!"; 
    };
}

function aDeliciousFunction($subject) { echo "Mmm, $subject!"; }
$aDeliciousFunction = 'aDeliciousFunction';
$aDeliciousFunction('candy'); 
// Mmm, candy!
$aDeliciousFunction = wrapWithEcho($aDeliciousFunction);
$aDeliciousFunction('candy');
// Calling! Mmm, candy! Done calling!

function anotherDeliciousFunction($subject, $verb) {
	echo "I love to $verb $subject!";
}
$anotherDeliciousFunction = 'anotherDeliciousFunction';
$anotherDeliciousFunction('candy','eat'); 
// I love to eat candy!
$anotherDeliciousFunction = wrapWithEcho($anotherDeliciousFunction);
$anotherDeliciousFunction('candy','unwrap'); 
// Calling! I love to unwrap candy! Done calling!

The focal point of this example is near the three bangs (!!!). Our wrapWithEcho function returns an anonymous function that closes over the callable and does not know, or care, how many arguments the callable should be passed. In fact, we wrap two different functions that take two different parameter counts and it all just works! Without call_user_func_array, in a situation like this, we'd have to resort to a huge switch statement that counted the number of arguments and then invoked call_user_func with exactly the right count.
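One thing wrapWithEcho does not do is hand back whatever the wrapped callable returns. As a sketch of the same idea with the return value preserved, the hypothetical wrapWithLog below records what happened in an array instead of echoing it; the names wrapWithLog, add, and shout are invented for the example.

```php
<?php
// Hypothetical variant of wrapWithEcho: logs each call and passes the result through.
function wrapWithLog($callable, array &$log) {
    return function() use ($callable, &$log) {
        $args = func_get_args();
        $log[] = 'calling with ' . count($args) . ' argument(s)';
        $result = call_user_func_array($callable, $args);
        $log[] = 'done';
        return $result;
    };
}

function add($a, $b) { return $a + $b; }
function shout($word) { return strtoupper($word) . '!'; }

$log = array();
$loggedAdd   = wrapWithLog('add', $log);
$loggedShout = wrapWithLog('shout', $log);

$sum  = $loggedAdd(2, 3);      // two arguments
$loud = $loggedShout('candy'); // one argument
```

The wrapper still neither knows nor cares how many arguments each function takes; it just forwards them and returns the result.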

There is another way to dynamically invoke callables, using reflection, which will be the subject of a short follow-up post.

Calling all Callables! (Directly)

Let's have a little fun and tie together concepts from each post in the series thus far to create a uniform means for doing direct dynamic invocation. To accomplish this we'll use PHP's closures and anonymous functions (part I) to create a higher-order PHP function (part II) that will return a directly invokable value for all callables  (part III). We're going to totally unify our dynamic invocations so we don't have to use call_user_func in our code, but can instead do this:

function aFunFunction() { echo "Radical."; }
$function = callable('aFunFunction');
$function();

$lambda = callable(function() { echo "Awesome!"; });
$lambda();

class Object { 
    function method() { echo "Great success!"; }
    static function staticMethod() { echo "More great success!"; }
}
$method = callable(array(new Object, 'method'));
$method();

$staticMethod = callable(array('Object','staticMethod'));
$staticMethod();

Now we just need to write this callable function. If you think back to our wrapper example, it's got the hints we need. Our callable function will return an anonymous function that closes over the callable. Ready?

function callable($aCallable) {
    if(!is_callable($aCallable)) {
        throw new Exception("Callable only works on is_callable's!");
    }
    // Return our anonymous function
    return function() use ($aCallable) {
        return call_user_func_array($aCallable, func_get_args());
    };
}

Pretty awesome, but I know what you're thinking. You're right, you can do better. Remember what we started out learning in this post? There are two kinds of callables that are directly callable: strings and anonymous functions. (Aside: there are actually three, the third is an object that implements the magical __invoke method, we'll get there later in this series.) Right, strings and anonymous functions can be called directly, it's the callables in the form of an array that need call_user_func. We can make our code a lot faster by simply returning callable non-arrays because they're already directly callable!

function callable($aCallable) {
    if(!is_callable($aCallable)) {
        throw new Exception("Callable only works on is_callable's!");
    }
    // Strings and anonymous functions are already directly callable
    if(!is_array($aCallable)) return $aCallable;
    // Array callables need the call_user_func_array wrapper
    return function() use ($aCallable) {
        return call_user_func_array($aCallable, func_get_args());
    };
}

Awesome. Just awesome. Using a closure we've brought unification to direct dynamic invocation in PHP 5.3 with our shiny new callable function.
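One caveat if you try this on a newer interpreter: PHP 5.4 added callable as a reserved word for type hints, so a function with that exact name no longer parses there. The sketch below is the same idea under the hypothetical name toInvokable, with a made-up Greeter class, and it returns strings rather than echoing so the results can be collected.

```php
<?php
// Hypothetical name: `callable` itself is reserved on PHP 5.4 and later.
function toInvokable($aCallable) {
    if (!is_callable($aCallable)) {
        throw new InvalidArgumentException('Expected something is_callable() accepts.');
    }
    // Strings and closures can already be invoked directly.
    if (!is_array($aCallable)) {
        return $aCallable;
    }
    // Array callables get wrapped so they can be invoked like a lambda.
    return function() use ($aCallable) {
        return call_user_func_array($aCallable, func_get_args());
    };
}

class Greeter {
    function greet($name) { return "Hello, $name!"; }
    static function wave($name) { return "Bye, $name!"; }
}

$greet = toInvokable(array(new Greeter, 'greet')); // instance method
$wave  = toInvokable(array('Greeter', 'wave'));    // static method
$upper = toInvokable('strtoupper');                // plain function name

$results = array($greet('Kris'), $wave('Kris'), $upper('recess'));
```

All three values are invoked with the same `$name(...)` syntax, which is the uniformity the callable function was after.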

Just a simple example of the power and flexibility functional PHP 5.3 brings to the table. Are you enjoying this series? Let me know in the comments. If so, you should subscribe to our RSS feed as parts IV and V are on their way...

You can also follow along on Twitter to get updates as this series continues...

Around the Recess PHP Community

Posted Aug 24, 2009 by Kris | Comments ( 0 ) | Filed in: Community, News

Just wanted to call out a couple of news items that have been bubbling up around the Recess developer community over the past week or so.

  • zdk put together a plugin for Smarty Views that is compatible with Recess 0.20. Download zdk's Recess Smarty Plugin here and check out the README for instructions in smarty/README.txt. He's also put together a simple Smarty / Recess demo app. Thanks zdk!
  • Improvements to the database stack were brought into Recess Edge on Github thanks to commits from KevBurnsJr's groupBy and midnightmonster's exists.
  • Preview scripts for the Recess Sandbox, a virtual Ubuntu development environment preconfigured with all the best open-source tools to do first class PHP software engineering are now available on Github. Setup instructions can be found on the Recess PHP Wiki (that just came online this past week and needs some help!). Expect some full-blown articles on Sandbox soon, but if you can't wait to get your hands dirty feel free to play. The Recess Sandbox setup comes loaded with:
    • Server Software
      • Apache 2.2
      • PHP 5.3 w/ a Lot of Extensions
      • MySQL
    • PHP IDE
      • Eclipse PDT 2.1
      • Integrated Debugging
      • Syntax auto-completion with Recess
    • Interactive Debugging
      • XDebug Installed and Configured
      • Ready to hook-up IDE and XDebug
    • Profiling
      • Profile any script with an additional query string
      • Visualize and inspect call graphs with KCacheGrind
    • Unit Testing
      • PHPUnit 3.4 setup
      • Run Test Coverage reports
  • Work on Recess 5.3, a branch of Recess that leverages many of PHP 5.3's new capabilities for an even more enjoyable PHP development experience, has begun in earnest. More news on this will start trickling in as our series on Functional PHP continues. If you're interested in the new functional PHP features be sure to check out the first two installments: anonymous functions, closures, and lambdas in PHP 5.3 and understanding and implementing map and reduce in PHP.

Exciting times in the land of Recess and PHP. What will this week have in store? Stay tuned via RSS.

Understanding and Implementing the famous map and reduce functions - (Functional PHP 5.3 Part II)

Posted Aug 21, 2009 by Kris | Comments ( 2 ) | Filed in: PHP

To demonstrate some uses of PHP 5.3's fun new anonymous functions let's implement the famous functions: map and reduce.

In Part I we deconstructed the differences between anonymous functions, lambdas, and closures in PHP. You may want to get comfy with those terms before continuing here. As a refresher, lambdas and anonymous functions are essentially the same idea: functions that are values just like integers and strings. Closures are what enable lambdas to refer to or use variables defined outside of the lambda, even after those variables have fallen out of scope. Now, onto map reduce.

We should chat about the butterflies, the birds, and the bees...

By now you've heard the cool kids at school raving about the euphoria of recursion and lambdas with higher order functions like map and reduce. We'll get there, we will, but first we need to have a little talk about butterflies, birds, and bees.

There's a moment in every little caterpillar's life when it becomes a butterfly. Metamorphosis is a transformative process. It's a transformation function, caterpillar in, butterfly out. (Caterpillar) -> (Butterfly) What is important to understand is not how metamorphosis works but that it is a function, a special kind of function that transforms a caterpillar into a butterfly.

Can you think of other functions that transform an input into an output? Sure: function square($x) { return $x * $x; } or strtolower or function filter($dirty) { /* clean up */ return $clean; }. There are a lot of transformer functions! Some, like square, may take an input and give you the same type back: int -> int. Others, like metamorphosis, will take one type of input and give you another: caterpillar -> butterfly. For our purposes let's classify all of these types of functions simply as transformer functions.

Transformer!

Are there other types of functions too? You bet. This is the point where we talk about the birds and the bees. On second thought, I've never really understood how birds and bees combine to make anything, so let's stick to butterflies. When two butterflies come together in a certain way they create a new butterfly (yes, I know this analogy also has holes, just play along). Let's refrain from thinking about the details of how they come together, and focus on the beauty of what is happening. Two butterflies combining to make one! (Butterfly, Butterfly) -> (Butterfly). It's another function, a combiner function, that takes two like things and results in another like thing.

Combiner!

So, what does this have to do with map / reduce and functional programming in PHP?

map and reduce are higher-order functions. This means each has a parameter that is a function, but not just any function. Each takes a special type of function. map uses a transformer function and reduce uses a combiner function. map transforms a list of things, like say a bunch of caterpillars, into a list of other things, like butterflies. reduce reduces a list of things like butterflies until there's only one left. Here's a visual of how map reduce works:

The Code! For crying out loud, show us The Code!

Ok, ok. You get it! map transforms a list of things into a list of other things, reduce combines a list of things into one thing. What does this look like in PHP?

PHP has two built-in functions: array_map and array_reduce. (Unfortunately the two functions take the array and the callback parameters in different orders. We won't make that mistake when we implement map/reduce will we?) A callback has a lot of different meanings in PHP which we'll cover in the next post in this series. For now take a callback to mean an "anonymous function".
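As a quick illustration of that difference in parameter order, here is a minimal sketch that squares a few numbers and then sums them. The variable names are arbitrary.

```php
<?php
$numbers = array(1, 2, 3, 4);

// array_map: callback first, array second
$squares = array_map(function($n) { return $n * $n; }, $numbers);

// array_reduce: array first, callback second, initial value third
$sum = array_reduce($squares, function($carry, $n) { return $carry + $n; }, 0);
```

Here $squares holds the squared values and $sum their total.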

The canonical map/reduce example is word counting. So let's take on the challenge in PHP: given an array of strings we must count the occurrences of each word in all strings. Here is our input and desired output:

<?php
$lines = array(
             'one two three four',
             'two three four',
             'three four',
             'four',
             );
// Desired Output, array of type word => count
// array ( 'one' => 1, 'two' => 2, 'three' => 3, 'four' => 4, ) 
?>

Let's break down the problem into what we know:

  • Inputs: (String of Words)
  • Output: Array(Word => Count)

We need to get from our input type to our output type. How can we use a transformer function to get us half-way there? And a reducer to bring us home? Think of the caterpillars and the butterflies.

Our caterpillar is a plain old, space delimited, lowercase string. Our transformer function must take a single string and metamorphosize it into our butterfly, the output type: a single array where keys are words and values are counts. (Line of Words) -> (Array: Word => Count) Let's code this anonymous transformer function up (we'll cheat and use PHP's built-in array_count_values to count the words):

<?php
// Transforms (Line of Words) -> (Array: Word => Count)
$lineToWordCounts = 
    function($line) { 
        return array_count_values(explode(' ', $line));
    };
    
// Test on a single line:
var_export($lineToWordCounts('one two three four'));
// Output: array ( 'one' => 1, 'two' => 1, 'three' => 1, 'four' => 1, )

// Test with array_map:
$counts = array_map($lineToWordCounts, $lines);
var_export($counts);
// Output: array ( 0 => array ( 'one' => 1, 'two' => 1, 'three' => 1, 'four' => 1, ), 
//                 1 => array ( 'two' => 1, 'three' => 1, 'four' => 1, ),
//                 2 => array ( 'three' => 1, 'four' => 1, ),
//                 3 => array ( 'four' => 1, ), )
?>

Sweet mother nature! The strings have transformed into beautiful butterflies! Now all we need to do is combine them down into a single array. If we can write a combiner function that takes two word count arrays and combines them into one we can use reduce to combine all of them into one for a total word count. Let's take a stab:

<?php 
// Combiner (Array:Word=>Count,Array:Word=>Count)->(Array:Word=>Count)
$sumWordCounts =
    function($countsL, $countsR) {
        // Get all the words
        $words = array_merge(array_keys($countsL), array_keys($countsR));
        $out = array();
        // Put them in a new (Array: Word => Count)
        foreach($words as $word) {
            // Sum their counts
            $out[$word] = isset($countsL[$word]) ? $countsL[$word] : 0;
            $out[$word] += isset($countsR[$word]) ? $countsR[$word] : 0;
        }
        return $out;
    };
$totals = array_reduce($counts, $sumWordCounts, array());
var_export($totals);
// Output: array ( 'one' => 1, 'two' => 2, 'three' => 3, 'four' => 4, ) 
?>

Just like that we've implemented a multi-line word count using PHP's built-in array_map and array_reduce functions and our very own transformer and combiner lambdas. The map step transforms each line from a string into an array of word counts for that line. The reduce step combines those arrays of word counts into a single total count for all lines.

Now that we understand map's use of a transform function, and reduce's use of combine function, let's implement map and reduce on our own.

So, you're ready to implement map/reduce in PHP?

Map's implementation is so simple, it barely needs an explanation. All we're doing is calling the transform function on every input element and storing each result in an array to be returned. Here it is:

<?php
/**
 * $transformer lambda(caterpillar) -> butterfly
 * $in array of caterpillars
 */
function map($transformer, $in) {
    $out = array();
    foreach($in as $item) {
        $out[] = $transformer($item);
    }
    return $out;
}
?>

Not too bad, eh? Somewhere John McCarthy's just stooled himself, I know. Hang tight, John, we'll get to recursion! Reduce is a little trickier because of edge cases where arrays with no or only 1 element are provided. Don't get caught in those details, just focus on the snippet where we've got more than 1 element to reduce:

<?php
/**
 * $combiner lambda(butterfly, butterfly) -> butterfly
 * $in array of butterflies
 */
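function reduce($combiner, $in, $identity) {
    if(empty($in)) {
        // Nothing to combine: fall back to the identity value
        return $identity;
    }
    // Start with the first butterfly (a single element is returned as-is)...
    $out = array_shift($in);
    // ...and combine it with each remaining butterfly in turn
    while(!empty($in)) {
        $next = array_shift($in);
        $out = $combiner($out, $next);
    }
    return $out;
}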
$totals = reduce($sumWordCounts, map($lineToWordCounts, $lines), array());
var_export($totals);
// Output: array ( 'one' => 1, 'two' => 2, 'three' => 3, 'four' => 4, ) 
?>

With reduce we begin by shifting the first butterfly off the list. We combine it with the next butterfly to make a super butterfly. We then combine the super butterfly with the next butterfly on the list, and so on until we have the ultimate single butterfly. This is a little trippy at first so refer back to the diagram for a visual explanation.
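Another common way to write the loop, shown here only as a sketch, is to seed the running result with the identity value and combine every element into it. The edge cases then need no special handling. The name reduceFromIdentity is hypothetical, chosen so it does not clash with the reduce above.

```php
<?php
// Hypothetical variant: seed the accumulator with the identity value.
function reduceFromIdentity($combiner, $in, $identity) {
    $out = $identity;
    foreach ($in as $item) {
        $out = $combiner($out, $item);
    }
    return $out;
}

$add = function($a, $b) { return $a + $b; };

$none = reduceFromIdentity($add, array(), 0);        // no elements
$one  = reduceFromIdentity($add, array(5), 0);       // one element
$many = reduceFromIdentity($add, array(1, 2, 3), 0); // several elements
```

This version only gives sensible results when $identity really is an identity for the combiner, as 0 is for addition or an empty array is for merging word counts.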

Map/Reduce, Take 2: Recursive Implementations, Please

Remember those crazy trips to lambdaland full of recursion all the cool kids on Reddit and HN are taking? Well, your time has come, too. Our initial implementations of map and reduce were done imperatively with loops. The functional folks believe loops are, well, garbage. (Don't worry, they're not.) Why do you need loops when you have recursive functions? How would map and reduce look if we got rid of the loops?

Before we get there, let's implement a few helper functions to make our recursive implementations beautiful. Here they are:

<?php
// First element of an array
function first($in) { 
    return empty($in) ? null : $in[0];
}

// Everything after the first element of an array
function rest($in) {
    $out = $in;
    if(!empty($out)) { array_shift($out); }
    return $out;
}

// Take an element and an array
//  and fuse them together so that the element
//  is at the front of the array
function construct($first, $rest) {
    array_unshift($rest, $first);
    return $rest;
}
?>

Now that we've got those helpers out of the way, allowing us to work with arrays like they're lists, let's do recursive map and reduce functions in PHP.

<?php
/**
 * $transformer lambda(caterpillar) -> butterfly
 * $in array of caterpillars
 */
function map($transformer, $in) {
    return !empty($in) ?    construct(  $transformer(first($in)), 
                                        map($transformer,rest($in)))
                        :    array();
}

/**
 * $combiner lambda(butterfly, butterfly) -> butterfly
 * $in array of butterflies
 */
function reduce($combiner, $in, $identity) {
    return !empty($in) ?    $combiner(first($in),
                                      reduce($combiner, rest($in), $identity))
                         :  $identity;
}

$totals = reduce($sumWordCounts, map($lineToWordCounts, $lines), array());
var_export($totals);
// Output: array ( 'one' => 1, 'two' => 2, 'three' => 3, 'four' => 4, ) 
?>

If you're more comfortable with for loops than recursion this is going to look trippy. Chew on it a little longer.

Some things that may help you reason about the recursive version of map:

  1. If there is at least one caterpillar, it will be transformed. map then calls itself with any remaining caterpillars.
  2. Once there are no caterpillars left, this is our base case: an empty array is returned.
  3. Once the empty array is returned, the previous map will construct a new array with its butterfly and return the array.
  4. The construction of the butterflies array happens one-by-one until every butterfly is in it, and the top-level map returns.

How did that feel for you? Beautifully functional? No loops, yet, we're iterating through a list!

I'll leave the interpretation of reduce for you to mull over. We've now written higher-order functions by implementing map and reduce, imperatively and functionally, in PHP 5.3 with anonymous functions.
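If you'd like a nudge, one way to see what the recursive reduce does is to give it a combiner that records how pairs are grouped. The sketch below repeats the first and rest helpers in compact form and renames the function to recursiveReduce (a hypothetical name) so it can run on its own.

```php
<?php
function first($in) { return empty($in) ? null : $in[0]; }
function rest($in) { array_shift($in); return $in; }

// Same shape as the recursive reduce above, renamed to keep this sketch standalone.
function recursiveReduce($combiner, $in, $identity) {
    return !empty($in)
        ? $combiner(first($in), recursiveReduce($combiner, rest($in), $identity))
        : $identity;
}

// A combiner that shows the order in which pairs are combined.
$parenthesize = function($left, $right) { return "($left $right)"; };

$trace = recursiveReduce($parenthesize, array('a', 'b', 'c'), 'id');
// $trace is "(a (b (c id)))"
```

The grouping shows that the recursion runs to the end of the list first and combines on the way back, starting from the identity. For word counts that order does not matter, because summing counts gives the same totals either way.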

So, That Was Fun. What's Next?

We've deciphered lambdas, anonymous functions, and closures, now we've seen higher order functions with callbacks by implementing map/reduce, so what's next? Follow the RSS feed for the upcoming articles in this series on functional PHP 5.3.

You should follow me on Twitter to get updates as this series continues to evolve.

Functional PHP 5.3 Part I - What are Anonymous Functions and Closures?

Posted Aug 18, 2009 by Kris | Comments ( 3 ) | Filed in: PHP

One of the most exciting features of PHP 5.3 is the first-class support for anonymous functions. You may have heard them referred to as closures or lambdas as well. There's a lot of meaning behind these terms so let's straighten it all out.

What is the difference between Anonymous Functions, Lambdas, and Closures?

You'll see the terms "anonymous functions", "lambdas", and "closures" thrown around in reference to the new features of PHP 5.3. Even the URL php.net/closures redirects to php.net/manual/functions.anonymous.php. The difference between 'lambda' and 'anonymous function'? None. For all intents and purposes they are two names for the same concept, which descends from lambda calculus. Languages with anonymous functions treat functions as first-class values, just like integers or booleans. Anonymous functions can thus be passed as arguments to another function or even returned by a function. Let's make this concrete in PHP:

<?php
$lambda = function() { 
            echo "I am an anonymous function, 
                  aka a lambda!<br />";
            };
$anonymousFunction = $lambda;
$anonymousFunction(); 
// Output: I am an anonymous function, aka a lambda!

function nCallsTo($n, $function) {
    for($i = 0; $i < $n; $i++) {
        $function();
    }
    return function() { echo "I am also an anonymous function!<br />"; };
}

$anotherAnon = nCallsTo(3, $anonymousFunction);
// Output:
// I am an anonymous function, aka a lambda!
// I am an anonymous function, aka a lambda!
// I am an anonymous function, aka a lambda!

$anotherAnon();
// Output: I am also an anonymous function!
?>

Notice that we did not assign a name to the function; we assigned a function as the value of a variable, just like a string or any other primitive. We then assign it to another variable. The function is just a value and has no name, hence the term "anonymous function". We then create a regular function named nCallsTo that takes two arguments: $n, the number of times to make the call, and $function, the anonymous function to call.

nCallsTo is a higher-order function on two accounts: 1) it takes a function as an argument, and 2) it returns a function as a value. The existence of higher-order functions opens the door to techniques like map/reduce, which deserve a post of their own (the next post). The point is that lambdas and anonymous functions are the same thing: functions that are values.

If anonymous functions are values, what does PHP consider their type to be? Let's find out:

<?php 
$lambda = function() { echo "anonymous function"; };
echo gettype($lambda) . '<br />';
// Output: object
echo get_class($lambda) . '<br />';
// Output: Closure
?>

On the Closure object in PHP 5.3

What is a closure? So far the name is a misnomer: PHP assigns every anonymous function the type Closure, but we haven't actually spotted a closure yet. Let's take a look at one:

<?php
function letMeSeeAClosure() {
    $aLocalNum = 10;
    return function() use (&$aLocalNum) { return ++$aLocalNum; };
}
$aClosure = letMeSeeAClosure();
echo $aClosure();
// Output: 11
echo $aClosure();
// Output: 12
$anotherClosure = letMeSeeAClosure();
echo $anotherClosure();
// Output: 11
echo $aClosure();
// Output: 13
echo $aLocalNum;
// Notice: Undefined Variable: aLocalNum
?>

Chew on that for a minute. Do you spot the funny business? $aLocalNum is a local variable defined within the scope of the plain-old function letMeSeeAClosure. With the new use syntax the variable $aLocalNum is bound or closed over to create the closure. This allows the returned Closure to retain a reference to $aLocalNum, even after $aLocalNum falls out of lexical scope when the function returns. The notice error occurs when trying to reference $aLocalNum directly from outside of the function's scope.
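
The ampersand in use (&$aLocalNum) matters. As a small sketch with made-up names (makeCounters, $byValue, $byReference), compare a closure that captures a variable by value with one that captures it by reference:

```php
<?php
function makeCounters() {
    $count = 10;
    // Captured by value: the closure keeps a copy taken when it is created,
    // and every call starts again from that copy.
    $byValue = function() use ($count) { return ++$count; };
    // Captured by reference: the closure shares the variable itself,
    // so changes persist between calls.
    $byReference = function() use (&$count) { return ++$count; };
    return array($byValue, $byReference);
}

list($byValue, $byReference) = makeCounters();
echo $byValue(), "\n";      // 11
echo $byValue(), "\n";      // 11 again, the copy is never updated
echo $byReference(), "\n";  // 11
echo $byReference(), "\n";  // 12
```

Without the ampersand the closure still outlives the function that created it, but it cannot accumulate state the way $aClosure does above.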

To recap, the terms 'lambda' and 'anonymous function' refer to the same concept: functions that are values. Closures refer to a related but different concept: the lifetime of a variable that is 'closed over' (in PHP 5.3, use'd) by a closure is bound to the lifetime, or extent, of the closure. Anonymous functions that are constructed with the use keyword are also closures. As mentioned, in PHP 5.3 anonymous functions are typed as Closure, and the three terms have, so far, been thrown about interchangeably. The high-order bit to take away is that PHP 5.3 now includes language features for both anonymous functions and closures.

More on Functional PHP 5.3

If you're interested in PHP software development you should subscribe to our feed. Upcoming parts to this series:

Follow this series on RSS.

Screencast: Recess with Multiple Sites

Posted Aug 14, 2009 by Kris | Comments ( 0 ) | Filed in: News

Kevin Burns put together a great screencast demonstrating how to work with multiple sites using one copy of Recess. Check it...

Screencast: Multiple Sites with Recess

Kevin Burns is a web developer from Menlo Park, CA. He has been doing web design and development for 6 years and is available for work in the SF Bay Area. http://kevburnsjr.com

Recess .2 Release

Posted Aug 7, 2009 by Kris | Comments ( 3 ) | Filed in: Release

It's been a long journey since the release of .12 in April. With the exciting release of PHP 5.3 we wanted to push out the bulk of the planned .2 changes as soon as possible, including fixes so that a couple of PHP features deprecated in 5.3 no longer cause breakage. Other features that are in this release:

Get the bits in zip or tarball form while they're hot!

We learned a big lesson in the 4-month development process moving Recess from 0.12 to 0.20: we were ambitious and took a big bite, perhaps a bit more than we could chew. The new controls/validation model still isn't ready for this release. (If you're adventurous you can find some prototyping in 0.20's bits!) Lesson learned: moving forward, development will be focused on fewer features at a time, released as they are completed.

With the release of 5.3 now nearly a month past it is clear that namespaces, late static binding, lambdas, and the dramatic performance improvements are going to make 5.3 the PHP version of choice for new PHP projects. Our next focus is on a 5.3 Recess branch that leverages these powerful, fundamental language features to make developing in PHP more enjoyable.

Next week there will be more discussion on what to expect with Recess 5.3, the launch of a Recess wiki (finally!), and some tutorials on new features in PHP 5.3.

 

Introducing the 0.20 Layouts System - Part I

Posted Jun 30, 2009 by Kris Jordan | Comments ( 5 ) | Filed in: Views

One of the biggest features going into Recess 0.20 is the new layouts system for views and display logic largely inspired by Joshua Paine's original work. It has been in the making since February and, with the latest changeset on GitHub, the view system is finally there after going through 3 major iterations. For those who have been following along and familiarized yourselves with the existing slots/blocks system I want to take some time to explain how the final system works. It is a simplification of the slots/blocks scheme with some additional new functionality.

In Recess 0.20 and beyond view code will revolve around 2 fundamental concepts: blocks and assertive templates which drive layouts and parts.

Buffering to Blocks

Blocks are objects that contain unrendered output. Think of a Block as a hunk of HTML you can pass around and manipulate before rendering. Block is an abstract class with multiple sub-class implementations. You can create a simple block by instantiating HtmlBlock:

<?
$block = new HtmlBlock(
    "<h1>Hello World</h1>" .
    "<p>Welcome to Recess 0.20</p>"
);
echo $block;
?>

The result of this code would be the contents sent to output. If you saw the previous iteration of blocks and slots you may be thinking, "Why write HTML in strings? Isn't the point of Blocks not to do that?" One of the realizations in the last view iteration was that Blocks were confounding two orthogonal concerns: 1) conveniently buffering output and 2) holding onto the output for later use. We broke out these two responsibilities so now Blocks are concerned with retaining output and the Buffer helper is concerned with buffering output to a block. The Buffer fills Blocks. Let's take a look:

<? Buffer::to($block) ?>
   <h1>Hello World</h1>
   <p>Welcome to Recess 0.20</p>
<? Buffer::end() ?>
<?= $block ?>

With more content being buffered and/or more complex display logic, the benefits of buffering output to blocks become much more apparent. You can append to, prepend to, and overwrite the content of the resulting blocks with methods on the block instance or using the Buffer helpers.
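
If you're curious how a helper like this could work, here is a minimal sketch built on PHP's output buffering functions, ob_start() and ob_get_clean(). SketchBlock and SketchBuffer are hypothetical names made up for illustration; this is not the Recess implementation and the real classes may differ.

```php
<?php
// Hypothetical classes for illustration only, not the Recess source.
class SketchBlock {
    private $content = '';
    public function append($html)  { $this->content .= $html; }
    public function prepend($html) { $this->content = $html . $this->content; }
    public function set($html)     { $this->content = $html; }
    public function __toString()   { return $this->content; }
}

class SketchBuffer {
    private static $target = null;

    // Start capturing output destined for $block, creating it if needed.
    public static function to(&$block) {
        if (!($block instanceof SketchBlock)) {
            $block = new SketchBlock();
        }
        self::$target = $block;
        ob_start();
    }

    // Stop capturing and hand the captured output to the block.
    public static function end() {
        self::$target->append(ob_get_clean());
        self::$target = null;
    }
}

SketchBuffer::to($greeting);
echo "<h1>Hello World</h1>";
SketchBuffer::end();

// The block holds the output until we decide to print it.
$greeting->prepend("<!-- header -->");
echo $greeting, "\n";
```

The split mirrors the two responsibilities described above: the buffer only captures output, and the block only holds onto it.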

Layout, an Assertive Template

Most web applications share HTML between many different views. PHP has a simple way of including common HTML between scripts: the include statement. It is not uncommon to see include('header.php'), include('footer.php'), etc. The downside to using this method of including scripts is that it introduces structural dependencies throughout all of your view code. Recess inverts this dependency problem with layouts. A view script can extend a single layout.

What is a layout? A layout is a template with defined inputs. A layout uses the inputs, provided by a child template extending the layout, to fill in slots of HTML. Let's take a look at a simple example (simple.layout.php):

<?php
/* Input Variable       Type      Default Value
--------------------------------------------------------*/
Layout::input($title,   'string', 'The Resilient Layout');
Layout::input($sidebar, 'Block');
Layout::input($body,    'Block');
?>
<html>
	<head>
		<title><?= $title ?></title>
	</head>
	<body>
		<div id="container">
			<div id="sidebar">
			<? if(!$sidebar->draw()): ?>
				<ul>
				  <li>Default Link A</li>
				  <li>Default Link B</li>
				  <li>Default Link C</li>
				</ul>
			<? endif ?>
			</div>
			<div id="content">
			<? if(!$body->draw()): ?>
				<p>Default content.</p>
			<? endif ?>
			</div>
		</div>
	</body>
<? 
if(isset($die)) {
	die('By accident or affliction, a child template cannot kill me'
		. ', for I will never take $die as an input.');
}
?>
</html>

Notice that Blocks have a draw method. If the Block contains output, draw prints it and returns true; otherwise it returns false. For Recess Edge followers, the if(!$block->draw()){} construct is equivalent to the previous Layout::slot('block'); Layout::slotEnd(); mechanism.
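
As a rough sketch of that contract, here is a stand-in class with a draw method that behaves as described. DrawableBlock and renderSidebar are hypothetical names for this example, not Recess classes.

```php
<?php
// Hypothetical block class; only the draw() contract described above is modelled.
class DrawableBlock {
    private $content;

    public function __construct($content = '') {
        $this->content = $content;
    }

    // Print the content and report whether anything was printed.
    public function draw() {
        if ($this->content === '') {
            return false;
        }
        echo $this->content;
        return true;
    }
}

// The layout pattern: fall back to default markup when the block is empty.
function renderSidebar(DrawableBlock $sidebar) {
    if (!$sidebar->draw()) {
        echo "<ul><li>Default Link A</li></ul>";
    }
}

renderSidebar(new DrawableBlock('<p>Custom sidebar</p>'));
echo "\n";
renderSidebar(new DrawableBlock());
echo "\n";
```

The first call prints the custom sidebar; the second finds an empty block and prints the default list instead.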

Also notice in the layout we define our input variables, their respective types, and any default values. This makes the layout an Assertive Template. Assertive templates declare their inputs. You may be wondering, why does the parent layout not simply inherit the entire context of the child? What justifies the extra typing?

  • Design by Contract - Layouts define an input contract. If your child template does not fulfill the contract Recess will immediately fail and tell you the error is in the child template not providing an input. Compare this to the obscure '$foo is not defined in parent.php at line 305' error message you would otherwise receive: Is the bug in the child script or the parent layout? You would have to understand the logic of both to pinpoint the problem.
  • Fewer Surprises - The only variables that exist in a parent layout are those declared as inputs. If the input requirements aren't met the child template is to blame, if the input requirements are met, then the parent layout is to blame. To reason about the behavior of an assertive template you only need to reason about one script at a time.
  • Self Documenting - What variables does a layout expect? With assertive templates this is easy: just look at the inputs. Without assertive templates this can be a pain: you must completely understand all of the code in another script.
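
To make the contract idea concrete, here is a small sketch of what checking declared inputs might look like. The function checkInputs, the class ExampleBlock, and the array format for declarations are all assumptions made for this example; the Layout::input internals in Recess may work quite differently.

```php
<?php
// Hypothetical helper and class names; this is not the Recess API.
class ExampleBlock {}

// Resolve a layout's declared inputs from the variables a child provides.
function checkInputs(array $declared, array $provided, $childName) {
    $resolved = array();
    foreach ($declared as $name => $spec) {
        if (array_key_exists($name, $provided)) {
            $value = $provided[$name];
        } elseif (array_key_exists('default', $spec)) {
            $value = $spec['default'];
        } else {
            // Blame the child template by name, not a line in the parent.
            throw new InvalidArgumentException(
                "$childName does not provide required input \$$name");
        }
        $type = $spec['type'];
        $matches = ($type === 'string') ? is_string($value)
                                        : ($value instanceof $type);
        if (!$matches) {
            throw new InvalidArgumentException(
                "$childName provides \$$name with the wrong type, expected $type");
        }
        $resolved[$name] = $value;
    }
    return $resolved;
}

$declared = array(
    'title' => array('type' => 'string', 'default' => 'The Resilient Layout'),
    'body'  => array('type' => 'ExampleBlock'),
);

try {
    // This child sets a title but never fills $body.
    checkInputs($declared, array('title' => 'Home'), 'home.php');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}
```

The error names the child template and the missing input, which is the kind of message the Design by Contract bullet argues for.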

We can extend the simple layout in a view template with the following code:

<?
Layout::extend('simple');
$title = 'A Simple Layout';
$die = true; // Has no effect.
?>

<? Buffer::to($sidebar) ?>
<ul>
 <li class="selected">
  <?=html::anchor(url::action('Controller::method'),'Link 1')?>
 </li>
 <li>
  <?=html::anchor(url::action('Controller::method2'),'Link 2')?>
 </li>
 <li>
  <?=html::anchor(url::action('Controller::method3'),'Link 3')?>
 </li>
</ul>
<? Buffer::end() ?>

<h1>Body</h1>
<p>$body is a special Block that is created implicitly 
   if not defined explicitly.</p>
 

By using the Layout helper's extend method we declare that we're inheriting from simple.layout.php. Once the child template has executed, the variables declared as inputs to the parent layout will be extracted from the child template and passed to the parent. There are two Block inputs in the layout: $sidebar and $body. The child template uses the Buffer technique described above to fill the $sidebar block. The $body block is filled by a special case: any output in a child template will automatically fill a block named $body, unless the child template defines $body explicitly.

Notice that we assign true to variable $die in the child, and in the parent there is a conditional that will kill the script if $die is set. The parent layout will not die, though, because $die is not an input of the layout. Layouts only acquire the variables they declare as inputs! This is, admittedly, a contrived example.
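
One way to picture this scoping rule is a function that filters the child's variables down to the declared names before bringing them into scope. This is only a sketch under that assumption: renderLayout is a hypothetical function, and Recess may pass inputs to layouts by a different mechanism.

```php
<?php
// Hypothetical sketch: render a "layout" with only its declared inputs in scope.
function renderLayout(array $declaredNames, array $childVariables) {
    // Keep only the variables whose names were declared as inputs.
    $inputs = array_intersect_key($childVariables, array_flip($declaredNames));
    // Bring the surviving inputs into this function's local scope.
    extract($inputs);
    // Undeclared child variables, such as $die, do not exist in here.
    return isset($die) ? 'layout died' : 'layout rendered: ' . $title;
}

$childVariables = array('title' => 'A Simple Layout', 'die' => true);
echo renderLayout(array('title', 'sidebar', 'body'), $childVariables), "\n";
// prints "layout rendered: A Simple Layout"
```

Because $die was never declared, it is dropped before the layout body runs, so the isset check cannot see it.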

In Part II of this look at Recess 0.20's new view system we will cover Parts, Recess' spin on partial templates, and the powerful & mystical Block that is a Part: PartBlock.