I recently had some performance issues on a client's server. The machine is an absolute beast and should easily handle the single website hosted on it. Alas, pages (Drupal 7) took forever to load. A simple top showed a load of about 60, with CPU utilization at 10% for user and ... 95% for system. Wait, what?
Two things are odd here. First, almost all of the CPU time went into kernel tasks rather than into user code. Second, the normal culprit for high system CPU usage, I/O, was low (checked using iostat).
I used strace to look into a process and find out where that time was going. For the curious, you can use
# strace -c -p [pid]
to get a nice summary of the syscalls your process makes, with call counts and the time spent in each (interrupt strace with Ctrl-C to print it). I applied that to a LiteSpeed process, and found the following (truncated):
[root@server:/root]> strace -c -p 97276
Process 97276 attached - interrupt to quit
Process 97276 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- --------------
 93.25    0.395625           0   6338926     38539 lstat
  2.72    0.011539           0    225123           getcwd
  1.51    0.006415           0    101411      5977 stat
  0.61    0.002593           0     25156           getdents
  0.48    0.002039           0     25621        21 open
  0.27    0.001136           0     25625           close
  0.26    0.001091           0     12307     12307 readlink
  0.22    0.000913           0      9239         1 read
  0.16    0.000698           0     12887           munmap
  0.15    0.000634           0     24547           fstat
  0.15    0.000628           0     12887           mmap
[.....]
------ ----------- ----------- --------- --------- --------------
100.00    0.424240               6835792     57199 total
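If you save a summary like this to a file (strace's -o option), you can rank the syscalls by call count yourself. This is only a minimal sketch, assuming the column layout shown above; the function name top_syscalls and the file path in the usage comment are made up for illustration.

```shell
# top_syscalls [N]: read an "strace -c" summary on stdin and print the
# N syscalls with the most calls (default 5) as "<calls> <syscall>".
# Hypothetical helper; assumes the calls column is always the 4th field.
top_syscalls() {
    awk '$1 ~ /^[0-9]+\.[0-9]+$/ && $NF != "total" { print $4, $NF }' \
        | sort -k1,1nr \
        | head -n "${1:-5}"
}

# Usage (the path is an example, not from the original session):
#   strace -c -p "$pid" -o /tmp/strace-summary.txt   # Ctrl-C to stop
#   top_syscalls 3 < /tmp/strace-summary.txt
```

strace can also sort the summary itself with -S calls; the helper is mostly handy when you want to compare summaries saved from several processes.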
That's 6.3 MILLION calls to lstat. But why would this occur? After digging about a bit, I found the culprit to be PHP's open_basedir, which is utterly unnecessary in our case, since we run PHP via suExec in FastCGI mode (so filesystem permissions are more than enough). With open_basedir set, PHP disables its realpath cache, so every path component has to be resolved with lstat again on each file access. After turning open_basedir off, compare the result for a different process:
[root@server:/root]> strace -c -p 98227
Process 98227 attached - interrupt to quit
Process 98227 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- --------------
 89.30    0.019310          32       600           munmap
  4.82    0.001043           1      1343         1 writev
  3.58    0.000774         774         1           brk
  0.84    0.000182           0      5158         5 stat
  0.55    0.000118           0      6082           read
  0.38    0.000083           0      5656        10 lstat
  0.18    0.000038           0      1231           open
  0.07    0.000016           0      1236           close
  0.07    0.000016           0       533           mmap
  0.07    0.000015           0        78           poll
[....]
------ ----------- ----------- --------- --------- --------------
100.00    0.021623                 24734        28 total
Hardly any calls to lstat any more: 5,656 instead of 6.3 million. stat is down as well, from about 101,000 to 5,158. That's quite a huge improvement even in terms of seconds: from 0.42 to 0.02! Of course, these are two different processes that might have been loading anything, so the numbers are not strictly comparable, but the important thing to take away is that open_basedir has QUITE the effect on your PHP processes, and should NOT be used on a dedicated server!
If you want to take it a step further, PHP also has a cache for mapping files to their real paths. For a large Drupal installation, I've found that the default value of 16K is rarely enough. You might squeeze out a bit of extra performance by adjusting the two relevant php.ini parameters:
realpath_cache_size = 1M
realpath_cache_ttl = 3600
According to some articles, this reduces the amount of system work for your server even further.
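To check what PHP actually ended up with, you can filter the output of php -i for the three directives discussed here. This is a rough sketch: the function name php_path_settings is made up, and it assumes the usual "name => local value => master value" layout of php -i. Keep in mind that the command-line php binary may read a different php.ini than the FastCGI one, so treat the result as a hint rather than proof.

```shell
# php_path_settings: read "php -i" output on stdin and keep only the
# open_basedir and realpath cache lines. Hypothetical helper name.
php_path_settings() {
    grep -E '^(open_basedir|realpath_cache_size|realpath_cache_ttl) =>'
}

# Usage:
#   php -i | php_path_settings
```

If open_basedir still shows a path here, it is being set somewhere other than the php.ini you edited, for example in a per-vhost or per-directory override.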