Solving Problems With Profiling post

I was presented with a problem that was occurring in the virtual machine I was using for client development work -- the PHP-based acceptance test suite was running extremely slowly. Normally it takes 12-13 minutes to run outside of the virtual machine but it was taking...54 minutes!

Because I am almost never afraid to ask for help, I bugged Marco Pivetta to give me a hand, since he is working on the same client project. I figured if anyone knew of where to START diagnosing what the problem is, it would be Marco.

Marco's suggestion after watching a smaller test suite run both in his local environment and in my VM was that we should run the test suite with a debugger enabled so we can see what is going on terms of resources being consumed. For PHP, this usually means using Xdebug.

What Xdebug allows you to do is:

  • step debugging
  • see better var_dump() information
  • write every function call to disk for later summarizing and reporting
  • profile your code to look for performance bottlenecks
  • generate code coverage when using PHPUnit (not sure if it works with other testing frameworks)

I've used the step debugging feature a lot on unfamiliar code bases but the profiling feature was definitely what we needed.

To ask Xdebug to profile the code we're testing, you need to have the Xdebug extension installed and then tell PHPUnit that you want to use it. The command to do it from your shell looks something like this:

XDEBUG_MODE=profile vendor/bin/phpunit --testsuite=unit

Because our test environment was configured to run these tests using a specific Docker container, I had to access the container directly via docker-compose exec php-fpm and then execute this command inside the container.

This ran the test suite and generated a large number of cachegrind files. These files contain profiling data but you need a specialized tool to read them and get information out of them that makes sense. For Linux users you would likely want to use KCachegrind but luckily for me you can read these files using PhpStorm.

The first step was to figure out which of these cachegrind files to examine. Unfortunately this is more intuition than science: our test suite uses @runInSeparateProcess annotations so all the small ones represent single tests. These are likely not to return any meaningful information. "Just pick the biggest one and let's see what happens."

So, we both opened up cachegrind files of similar sizes and took a look at the data. What exactly where we looking for? In terms of bottlenecks we can place things in either "network" or "CPU" categories. Is the application waiting a lot for external resources (say, a service in a different container) or is it waiting for the CPU to finishing doing something before it can continue.

Sadly, I cannot share the cachegrind output here as I have NDA's surrounding the client work but the approach was:

  • sort the calls by how much time was being spent on executing them
  • figure out if it is network or CPU

For network issues, we were looking for things like time spent connecting to a MySQL database in another container. As we scrolled through the list at my end together we started noticing a few things:

  • network access wasn't the problem
  • we were spending an awful lot of time continually parsing a configuration file written using TOML during bootstrap (ticket filed to fix this)
  • a lot of very simple PHP calls were taking significant amounts of CPU time

The next step was to look at how much memory and CPU power I was giving to the virtual machine. I was giving it half my processing cores and half the available memory. So that should not have been an issue.

Marco did some searching and found some forum posts of folks complaining about how slow some VM's were in the latest version of VMWare Fusion but their situation didn't seem to be the same as ours.

"Hrm, Chris, open up that 'Advanced Options' section in the 'Processors & Memory' configuration section. Aha!"

In that section were two disabled options, both dealing with running containers inside the virtual machine. Given that we are heavily relying on Docker it definitely made sense to enable those.

So I shut down the virtual machine, enabled those two options, and started it up. Much to my surprise, the acceptance test suite now ran in 10 minutes instead of 54 minutes! Huge improvement and is also faster than how long it takes outside of the virtual machine.

Afterwards, Marco was explaining to me how much Docker relies on having direct memory access to things so not forcing those connections to go through a different path in the VM would yield a huge gain. Now I'm happier with the performance of the test suite.

So, in summary:

  • the test suite was much slower than expected
  • a decision was made to run the test suite with Xdebug profiling enabled
  • we made an educated guess as to which profile output file to analyze
  • the profiling output led us to believe that there was a CPU-related bottleneck
  • the virtual machine had adequate memory and processor resources allocated to it
  • the VM was not configured to run containerized applications optimally
  • the VM has stopped and options pertaining to running containers inside the VM were enabled
  • re-running the test suite saw a huge increase in performance and execution time

Without the ability to profile the code to get a better idea of where there might be problems, it would've taken a lot longer to come to an effective solution.

Categories: development, PHP