Monday, May 28, 2007

What's new in PHP V5.2

I read this article on IBM developerWorks. It's a nice one from them, so I'm posting it here for others who may need it.

What's new in PHP V5.2, Part 1: Using the new memory manager

Track and monitor PHP memory use like an uber-nerd

Level: Intermediate

Tracy Peterson, Freelance Writer, Consultant

13 Mar 2007

In Part 1 of this "What's new in PHP V5.2" series, learn how to use the new memory manager introduced in PHP V5.2 and become proficient at memory usage tracking and monitoring. This will enable you to use memory in PHP V5.2 more efficiently.

PHP V5.2: In the beginning

In November 2006, PHP V5.2 was released with many new features and bug fixes. It obsoletes the 5.1 release and is a recommended upgrade for any PHP V5 users. My favorite lab environment -- Windows®, Apache, MySQL, PHP (WAMP) -- is rolled into a new package for V5.2 already (see Resources). You will find an application there that will set up PHP V5.2, MySQL, and Apache on a Windows® XP or 2003 machine. It's a piece of cake to install, has lots of nice little management goodies, and I recommend it wholeheartedly.

While this is the easiest package for Windows users, you need to add the following option when configuring PHP on Linux: --enable-memory-limit (in addition to any other options appropriate for your server). For prebuilt Windows binaries that lack this option, a workaround function is provided later in this article.

There are many improvements that have taken place in PHP V5.2, and one critical area is that of memory management. The exact quote from README.ZEND_MM states: "The goal of the new memory manager (PHP5.2 and later) is reducing memory allocation overhead and speeding up memory management."

Here are some of the key items from the V5.2 release notes:

  • Removed unnecessary --disable-zend-memory-manager configure option
  • Added --enable-malloc-mm configure option, which is enabled by default in debug builds to allow using internal and external memory debuggers
  • Allows tweaking the memory manager with ZEND_MM_MEM_TYPE and ZEND_MM_SEG_SIZE environment variables

To understand the implications of these new features, we need to delve into the fine art of memory management a bit and consider why allocation overhead and speed are a big deal.

Why memory management?

One of the fastest-developing technologies in computing is memory and data storage, driven by the constant need for increases in speed and storage size. Early computers used cards as memory before moving to chip technology. Can you imagine working on a computer with only 1 KB of RAM? Many early computer programmers did. These pioneers realized very rapidly that to work within the constraints of the technology, they would have to be diligent to avoid overloading their systems with frivolous commands.

As PHP developers, we live in a much more convenient world to code in than our colleagues who code in C++ or other stricter languages. In our world, we do not have to concern ourselves with the handling of system memory because PHP handles that for us. In the rest of the programming world, however, responsible coders use various functions to ensure that executed commands do not overwrite some other program data -- thus, crippling that running program.

Memory management is usually handled by requests from the coder to allocate and release blocks of memory. Allocated blocks can hold data of any type, and this process blocks off a certain amount of memory for just that data and gives the program a method of addressing this data for when it needs to be accessed for operations. The program is expected to release allocated memory when it has completed any operations, and let the system and other programs use that memory. When a program does not release the memory back to the system, it is called a leak.

Leaks are a common problem in any running program, and a certain amount is usually acceptable, especially when we know the program will terminate soon, at which point all memory allocated to it is released by default.

With programs you run and terminate arbitrarily, like almost all client applications, this is the case. Server applications are expected to run indefinitely without termination or restart, making memory management absolutely vital to server daemon programming. Even a small leak would eventually grow into a system-debilitating problem on a long-running program as memory blocks are used and never released.

Long-term thinking

There are many potential uses for a persistent server daemon written in PHP, as with any language. But when we begin to use PHP for these purposes, we must also consider our memory usage.

Scripts that parse a great deal of data or which may be hiding an infinite loop have a tendency to consume large amounts of memory. Obviously, once the memory is exhausted, the performance of the server decreases, so we must also pay attention to how much memory we're using when we execute our scripts. While we can simply watch the amount of memory used by a script by turning the system monitor on, it will not tell us anything more useful than the status of the entire system memory. Sometimes we need to do a bit more than that to help us troubleshoot or optimize. Sometimes we just need more detail.
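From inside the script itself, one way to get that detail is to sample memory_get_usage() as the work proceeds and watch for steady growth. The following is a minimal sketch, not a production monitor: handle_job() and the $kept array are hypothetical stand-ins that deliberately hold on to every result to imitate a leak.

```php
<?php
// Hypothetical unit of work: returns roughly 1 KB of data per call.
function handle_job($n)
{
    return str_repeat('x', 1024);
}

$kept = array();                     // imitates a leak: results are never released
$baseline = memory_get_usage();

for ($i = 1; $i <= 1000; $i++) {
    $kept[] = handle_job($i);

    // Sample memory use every 250 jobs; steady growth suggests a leak.
    if ($i % 250 === 0) {
        printf("after %d jobs: +%d bytes\n", $i, memory_get_usage() - $baseline);
    }
}

$growth = memory_get_usage() - $baseline;
```

In a long-running daemon, the same sampling could be written to a log so that growth over hours or days becomes visible.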

One way to get transparency into what our script is doing is to use an internal or external debugger. An internal debugger is one that runs within the same process that executes the script. Debuggers that are a separate process from the perspective of the OS are external. Memory analysis using a debugger is similar in either case, but the memory is accessed in different ways. Internally, a debugger has direct access to the same memory space as the running process, while an external debugger will access the memory via a socket.

There are many methods and available debugging servers (external) and libraries (internal) you can use to aid your development. In order to prepare your PHP installation for debugging, you can use the newly provided --enable-malloc-mm, which is enabled by default in a DEBUG build. This makes the environment variable USE_ZEND_ALLOC available to allow selection of malloc or emalloc memory allocations at runtime. Using malloc-type memory allocations will allow external debuggers to observe memory use while emalloc allocations will use the Zend memory manager abstraction, requiring internal debugging.
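A script can report which allocator it expects to be running under by inspecting that environment variable. This is a rough sketch only: describe_allocator() is a hypothetical helper, and it assumes the convention that a value evaluating to 0 (for example USE_ZEND_ALLOC=0) selects plain malloc, while an unset or non-zero value leaves the Zend memory manager (emalloc) in charge.

```php
<?php
// Hypothetical helper: map the raw USE_ZEND_ALLOC value to an allocator name.
// $value is what getenv() returns: a string, or false when the variable is unset.
function describe_allocator($value)
{
    // Assumption: a value that evaluates to 0 turns the Zend memory manager off.
    if ($value !== false && (int) $value === 0) {
        return 'malloc';
    }
    return 'emalloc';
}

echo 'Allocator in use (by convention): '
    . describe_allocator(getenv('USE_ZEND_ALLOC')) . "\n";
```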

Memory management functions in PHP

In addition to making the memory manager more flexible and transparent, PHP V5.2 provides a new parameter for memory_get_usage() and memory_get_peak_usage(), the functions that report the amount of memory in use. The new Boolean mentioned in the notes is real_size. By invoking the function memory_get_usage($real); where $real = true, the result will be the real size of memory allocated from the system, including the memory-manager overhead, at the moment of invocation. Without the flag set, the data returned would be only the memory used within the running script, minus the memory-manager overhead.

memory_get_usage() and memory_get_peak_usage() differ in that the latter returns the peak memory usage so far for the running process that invokes it, while the former returns only the usage at the moment of execution.
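The difference the flag makes can be seen by calling the functions both ways. This is a small sketch; the variable names are illustrative, and the actual numbers depend on the PHP version and system.

```php
<?php
// Flag omitted: memory used within the running script.
$scriptBytes = memory_get_usage();

// Flag set to true: real size allocated from the system,
// including memory-manager overhead.
$systemBytes = memory_get_usage(true);

// Peak of the real allocation so far for this process.
$peakSystemBytes = memory_get_peak_usage(true);

printf("script: %d bytes\n", $scriptBytes);
printf("system: %d bytes\n", $systemBytes);
printf("peak (system): %d bytes\n", $peakSystemBytes);
```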

For memory_get_usage(), php.net provides the code snippet in Listing 1.

Listing 1. A memory_get_usage() example
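A minimal sketch of such an example, consistent with the description that follows. It assumes str_repeat() is used to fill $a; the variables $before and $after are added here for illustration, and the byte counts will differ by system and PHP version.

```php
<?php
// Usage before anything is allocated (the article reports 36640 here).
$before = memory_get_usage();
echo $before . "\n";

// Load $a with 4,242 copies of "Hello".
$a = str_repeat("Hello", 4242);

// Usage after the allocation (the article reports 57960 here).
$after = memory_get_usage();
echo $after . "\n";
```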

In this simple example, we first echo the results of a straight up invocation of memory_get_usage(), which by the code annotation seems to have had a common result of 36640 bytes on the author's system. We then load up $a with 4,242 copies of "Hello" and run the function again. The output of this simple usage can be seen in Figure 1.

Figure 1. Example output of memory_get_usage()

There is no example of memory_get_peak_usage() as the two are so similar. The syntax is identical. For the example code in Listing 1, there would be only one result, however, which is the peak memory usage at that moment. Let's take a look in Listing 2.

Listing 2. A memory_get_peak_usage() example
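A minimal sketch of the peak-usage variant, under the same assumptions as before (str_repeat() fills $a, and the variable names are illustrative). The unset() step is an addition here, included to show that the peak does not fall when memory is released.

```php
<?php
// Peak usage before the allocation.
$peakBefore = memory_get_peak_usage();
echo $peakBefore . "\n";

// Load $a with 4,242 copies of "Hello".
$a = str_repeat("Hello", 4242);

// Peak usage after the allocation; it can only stay level or rise.
$peakAfter = memory_get_peak_usage();
echo $peakAfter . "\n";

// Releasing $a lowers current usage, but not the recorded peak.
unset($a);
$peakAfterUnset = memory_get_peak_usage();
echo $peakAfterUnset . "\n";
```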

The code in Listing 2 is identical to Listing 1, but memory_get_usage() has been swapped for memory_get_peak_usage(). Nothing much changes in output until we populate $a with the 4,242 iterations of "Hello." Our memory jumps to 57960, representing our peak so far. When we check the memory usage peak, we get the highest value so far, so every further invocation will result in 57960 until we do something to use more memory than we did with $a (see Figure 2).

Figure 2. Example output of memory_get_peak_usage()

Limiting memory usage

One way to make sure we do not overtax the server hosting our application is to limit the amount of memory used by any script PHP executes. This isn't something we should have to do at all, but because PHP is a loosely typed language that is parsed at runtime, poorly written scripts sometimes get unleashed upon our production applications. These scripts might execute a runaway loop, or perhaps open a long list of files, forgetting to close the current file before opening the next one. Whatever the case, a poorly written script can end up chewing up a ton of memory before you know it.

In php.ini, you can use the configuration parameter memory_limit to specify the maximum amount of memory any script is able to use on the system. This is not a change specific to V5.2, but any discussion of the memory manager and its uses bears at least a quick look at this feature. It also leads me nicely to the last new features of the memory manager: environment variables.
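The limit can also be inspected, and tightened for a single script, at runtime. The sketch below uses the standard ini_get() and ini_set() functions; the 64M value is an arbitrary illustration, not a recommendation, and it assumes the server configuration permits scripts to change this setting.

```php
<?php
// Read the limit currently in effect: a shorthand string such as "128M",
// or "-1" when no limit is set.
$configured = ini_get('memory_limit');
echo "memory_limit is currently {$configured}\n";

// Tighten the limit for this script only (illustrative value).
ini_set('memory_limit', '64M');

$effective = ini_get('memory_limit');
echo "memory_limit is now {$effective}\n";
```

The equivalent server-wide setting in php.ini would be a line of the form memory_limit = 64M.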

Tweaking the memory manager

Finally, what would programming be without being able to be a perfectionist and get it exactly right for your purposes? The new environment variables ZEND_MM_MEM_TYPE and ZEND_MM_SEG_SIZE allow you to do just that.

When the memory manager allocates large memory blocks, it does so in segments of a predetermined size, set by the environment variable ZEND_MM_SEG_SIZE. The default size of these memory segments is 256 KB, but you can adjust it to suit your particular needs. For instance, if you were aware that the operations in one of your most common scripts were wasting a large amount of memory, you could adjust this size to more closely match the needs of the script, reducing the amount of memory that is allocated but remains empty. In the right conditions, this kind of careful configuration tweaking can make a huge difference.
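To get a feel for the numbers, the arithmetic can be sketched with a deliberately simplified model: memory is handed out in whole segments, and whatever the script does not use in the last segment sits empty. The helper segment_slack() is hypothetical, and the model ignores the manager's own bookkeeping overhead and any reuse of freed space.

```php
<?php
// Hypothetical helper: bytes allocated but left empty when $neededBytes
// is served in whole segments of $segmentBytes each (simplified model).
function segment_slack($neededBytes, $segmentBytes)
{
    $segments  = (int) ceil($neededBytes / $segmentBytes);
    $allocated = $segments * $segmentBytes;
    return $allocated - $neededBytes;
}

$needed = 300 * 1024; // a script that needs 300 KB

// Default 256 KB segments: two segments are required.
printf("256 KB segments: %d bytes empty\n", segment_slack($needed, 256 * 1024));

// Smaller 64 KB segments: five segments are required.
printf("64 KB segments: %d bytes empty\n", segment_slack($needed, 64 * 1024));
```

Under this model the 300 KB script leaves 217,088 bytes empty with the default segment size and 20,480 bytes empty with 64 KB segments, at the cost of more segment allocations.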

Retrieving memory usage on Windows

If you have a prebuilt PHP Windows binary that was built without the --enable-memory-limit option, you need to go through this section before moving on. For Linux®, pass the --enable-memory-limit option when you configure your PHP build.

To retrieve memory usage using Windows binaries, create the following function.

Listing 3. Getting memory usage under Windows
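A sketch of one way such a function could look, offered as an assumption rather than the original listing. It shells out to the Windows tasklist command for the current process ID and converts the reported figure from KB to bytes. The helper parse_tasklist_mem_usage() is hypothetical and assumes English-locale tasklist output with a "Mem Usage:" line.

```php
<?php
// Hypothetical helper: find the "Mem Usage" line in tasklist's list-format
// output (a figure in KB) and convert it to bytes. Returns 0 if not found.
function parse_tasklist_mem_usage($lines)
{
    foreach ($lines as $line) {
        if (preg_match('/^Mem Usage:\s*([\d.,]+)\s*K/i', $line, $m)) {
            // Strip thousands separators, then convert KB to bytes.
            return (int) preg_replace('/\D/', '', $m[1]) * 1024;
        }
    }
    return 0;
}

// Define the fallback only when the built-in function is missing.
if (!function_exists('memory_get_usage')) {
    function memory_get_usage()
    {
        $output = array();
        // Ask Windows for this process's entry, one "Key: value" per line.
        exec('tasklist /FI "PID eq ' . getmypid() . '" /FO LIST', $output);
        return parse_tasklist_mem_usage($output);
    }
}
```

On builds where memory_get_usage() already exists, the function_exists() guard leaves the built-in untouched.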

Save this in a file called function.php. Now you only have to include this file in scripts you wish to use it in.
