Optimize PrestaShop Performance

This article is specific to PrestaShop, a popular e-commerce platform, but its principles can be applied to any web application built on PHP and MySQL. It draws on about five years of working experience. All the work was done on an outdated version of PrestaShop. Today we are working on an upgrade to the current version. The new version is better in many respects, but it is slower. We have to start the performance optimization cycle of the past few years all over again. This article is part of the preparation for that challenge. It summarizes the process and the tools used.

When running an e-commerce site, user experience is at the heart of success, and page load time is an important factor. While you can control neither the network connection nor what happens on the client computer, you have full control over page generation time on your server. There are many things you can do to optimize the client side, such as compressing images or choosing JavaScript libraries carefully, but this is out of scope here. Let’s focus on the server only; it’s a vast domain already.

Any server environment is complex. To begin with, the hardware has its obvious performance impact. On the software side, we run a web server (Apache) and MySQL for the database. There are many configuration options (especially in the MySQL server) that can have an important impact. Database queries can be cached by mysqld itself or by a dedicated service (we chose memcached for practical reasons). Queries are a crucial factor. Their execution time can vary from microseconds to hours, depending on your data, missing indexes or ill-formed joins. Is the PHP code well written? Does it have good big-O properties, or is there an extraneous nested loop somewhere?

So many factors! Tens, even hundreds of configuration options, giving an almost infinite number of combinations. Thousands of dynamically generated database queries. Hundreds of thousands of lines of code. Where do we start?

In order to improve the performance of your shop, you first need to measure it.

But I found that some important data was missing. Apache gives me the number of hits, but how many of them are static resources, and how many PHP scripts are executed? Page generation time is missing as well. And how is this time spent? How much is consumed by the CPU, and how much does the script spend waiting for I/O operations (most of which are MySQL queries)? What do the other I/O operations consist of? I also found the number of queries per page interesting. I came up with new graphs to answer these questions.

These statistics have been running continuously on our web server for some years now. They give valuable insight over time and allow us to assess the impact of any change in the system, whether it comes from outside or is induced by changes to the configuration or the code.
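The split between CPU time and waiting time can be illustrated with a small sketch. It is written in Python rather than PHP and is not the instrumentation used on our server; `measure` and `fake_page` are hypothetical names, and the sleep merely stands in for a database call. In PHP, `microtime(true)` and `getrusage()` provide the two corresponding clocks.

```python
import time

def measure(work):
    """Run work() and split its wall-clock time into CPU time and waiting time."""
    wall_start = time.perf_counter()   # wall clock
    cpu_start = time.process_time()    # CPU time (user + system) of this process
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    # Whatever is not CPU time was spent waiting (I/O, database, sleep).
    return {"wall": wall, "cpu": cpu, "wait": max(wall - cpu, 0.0)}

def fake_page():
    # Hypothetical page: some computation, then a sleep standing in for a query.
    sum(i * i for i in range(200_000))
    time.sleep(0.05)

stats = measure(fake_page)
print({name: round(seconds, 4) for name, seconds in stats.items()})
```

Logging these three numbers per request, together with a query counter, is enough to draw graphs of page generation time and of how that time is spent.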

Real life baseline for custom performance counters

We went further and collected detailed statistics about individual queries. As this has a negative impact on performance, we did it only for a few minutes at a time.

We removed all variable parts, such as numbers and strings, to group similar queries together. For each group, we collected data such as the number of executions, the time spent, and many MySQL performance counters that point to potential problems (temporary tables, full table scans, ordering without an index, …).

This data shows very precisely which queries are bottlenecks and allows us to pinpoint the areas that need improvement.
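Grouping similar queries can be sketched as follows. This is a simplified illustration in Python, not our production code: the regular expressions only cover plain strings and numbers, and the sample log entries and timings are invented.

```python
import re
from collections import defaultdict

def normalize(sql):
    """Replace literals with '?' so that similar queries share one key."""
    sql = re.sub(r"'(?:[^'\\]|\\.)*'", "?", sql)              # quoted strings
    sql = re.sub(r"\b\d+(?:\.\d+)?\b", "?", sql)               # numbers
    sql = re.sub(r"\(\s*\?(?:\s*,\s*\?)*\s*\)", "(?)", sql)    # IN lists of any length
    return re.sub(r"\s+", " ", sql).strip()                    # collapse whitespace

# Invented sample log: (query text, execution time in seconds).
log = [
    ("SELECT * FROM ps_product WHERE id_product = 12", 0.004),
    ("SELECT * FROM ps_product WHERE id_product = 97", 0.006),
    ("SELECT name FROM ps_category_lang WHERE name LIKE 'shoe%'", 0.120),
    ("SELECT * FROM ps_product  WHERE id_product IN (1, 2, 3)", 0.010),
]

stats = defaultdict(lambda: {"count": 0, "seconds": 0.0})
for sql, seconds in log:
    group = stats[normalize(sql)]
    group["count"] += 1
    group["seconds"] += seconds

# Print the groups, most expensive first.
for key, group in sorted(stats.items(), key=lambda item: -item[1]["seconds"]):
    print(f'{group["count"]:3d}  {group["seconds"]:.3f}s  {key}')
```

Sorting the groups by total time rather than by average time brings out queries that are individually fast but executed very often.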

With these tools at hand, we can now apply focused changes to improve performance. The interventions take place at many different levels.

Hardware is an obvious place to start. The server must be sized for the load it has to handle. Before adding more CPU, more memory or a faster disk, determine where the bottleneck lies.

Shared hosting is suitable only for the smallest shops. A dedicated virtual server is a requirement for a responsive e-commerce site. Choose a hosting provider that lets you adjust resources when needed.

Another choice is whether the database should be on the same machine as the web server. PrestaShop recommends separating the two. I disagree, considering that a single HTTP request generates on average about 180 MySQL queries in PrestaShop 1.6.1, and about 120 in the current version. Network latency therefore has a big impact on performance: if the network has only 200 µs of latency, each MySQL query takes 400 µs longer than on the local machine, since it pays the latency once in each direction. With about 120 queries, this already adds roughly 50 ms to the response time of each HTTP request.
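The arithmetic can be checked with a few lines of Python. The sketch assumes a one-way network latency of 200 microseconds and one full round trip per query; `added_latency_ms` is a name chosen for this illustration.

```python
def added_latency_ms(queries_per_request, one_way_latency_us):
    """Extra response time when every query pays one network round trip."""
    round_trip_us = 2 * one_way_latency_us
    return queries_per_request * round_trip_us / 1000.0

for version, queries in (("PrestaShop 1.6.1", 180), ("current version", 120)):
    extra = added_latency_ms(queries, one_way_latency_us=200)
    print(f"{version}: about {extra:.0f} ms added per HTTP request")
```

Under these assumptions, 120 queries add 48 ms and 180 queries add 72 ms to every page, before the database has done any work.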

Therefore, in my opinion, it makes sense to separate web server and database only in high availability or load balancing scenarios, where more than one web server instance is running.

MySQL statistics are especially interesting, but difficult to interpret.

To improve performance reliably through configuration, you need a baseline as explained above; it gives immediate feedback as you make changes.

Using a baseline allows you to notice changes you’d never expect. One day, the web site was slow, CPU usage had doubled, and Apache hits had also noticeably increased. The reason was a misbehaving bot that had started to crawl our web site at a rate of several hits per second.

In response, we limited access to our web site to 300 hits in 5 minutes from the same IP address. If this limit is reached, access is temporarily blocked. There are tools for this, but I preferred to write a shell script that does just this, which allowed me to exclude static resources and the full text search (which displays search results to users as they type).
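The rule itself is simple enough to sketch. The following Python snippet is not the shell script mentioned above, only an illustration of a sliding-window limit with the same numbers; the list of static suffixes and the `/search` path are assumptions.

```python
from collections import defaultdict, deque

LIMIT = 300        # hits allowed per window (from the text)
WINDOW = 5 * 60    # window length in seconds (from the text)
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")  # assumed list
EXCLUDED_PREFIXES = ("/search",)  # hypothetical path of the search preview

hits = defaultdict(deque)  # IP address -> timestamps of counted hits

def is_counted(path):
    """Static resources and the search preview do not count towards the limit."""
    path = path.split("?", 1)[0]
    return not (path.endswith(STATIC_SUFFIXES) or path.startswith(EXCLUDED_PREFIXES))

def allow(ip, path, now):
    """Return False once an IP has LIMIT counted hits within the last WINDOW seconds."""
    if not is_counted(path):
        return True
    window = hits[ip]
    while window and window[0] <= now - WINDOW:
        window.popleft()          # forget hits that left the window
    if len(window) >= LIMIT:
        return False              # temporarily blocked
    window.append(now)
    return True

# A bot sending two hits per second from one address (documentation IP range).
results = [allow("203.0.113.7", "/product/1", now=i * 0.5) for i in range(301)]
print(results.count(True), "allowed, last hit allowed:", results[-1])
```

Because blocked hits are not recorded, the block lifts by itself as soon as enough old hits have left the window.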

This might not be an issue for everyone, but keep in mind that there can be performance degradation for unexpected reasons.

After tuning the MySQL server, let’s take a granular approach. Which queries take the most time?

We are interested in the queries without their parameters. Each search is different, but if we are able to reduce the execution time of search queries as a group, the search preview while typing becomes much more attractive.

But execution time is not the only measure. MySQL has many status variables at session level that allow us to find the queries responsible for full scan joins, table scans, temporary tables and, even worse, temporary tables written to disk (the Advice section of phpMyAdmin is of great help in understanding the data).

With these requirements in mind, we logged all the queries with their performance data in a production environment. It’s important for us to gather statistics from a real workload and not a simulated one.

A few queries stood out immediately for their bad performance. We knew where to start.

The most obvious and most effective fix was to add some missing indexes. Some queries with complex joins had to be refactored. Running the queries in different variations, with and without EXPLAIN, helps to find better alternatives.
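The effect of a missing index on the query plan can be reproduced in a self-contained way. The sketch below uses SQLite and its `EXPLAIN QUERY PLAN` as a stand-in, so that it runs without a server; MySQL's `EXPLAIN` prints different columns but answers the same question. The table and index names are invented.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE product (id_product INTEGER PRIMARY KEY, id_category INTEGER, name TEXT)"
)
db.executemany(
    "INSERT INTO product (id_category, name) VALUES (?, ?)",
    [(i % 50, f"product {i}") for i in range(5000)],
)

QUERY = "SELECT name FROM product WHERE id_category = 7"

def plan(sql):
    """Return the access strategy SQLite reports for a query."""
    rows = db.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the description

before = plan(QUERY)
db.execute("CREATE INDEX idx_product_category ON product (id_category)")
after = plan(QUERY)

print("without index:", before)  # reports a scan of the whole table
print("with index:   ", after)   # reports a search using idx_product_category
```

Comparing the plan before and after a change is quicker and more reliable than timing the query by hand, especially when the cache hides the real cost.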

Our findings led us to redesign the search. On the one hand, we changed the table structure to reduce redundant information and to allow for more efficient queries. On the other hand, we split the query that retrieves matching products in two. First we apply the search conditions to retrieve only a list of ids, and in a second query we retrieve the data. This proved to be much more efficient than doing everything in one query. We managed to bring the load time of the search preview down from ~500 ms to ~150 ms. The search is now fast enough for a preview as you type.
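The two-step approach can be sketched as follows. The schema is invented for the illustration and is not PrestaShop's search schema, and SQLite stands in for MySQL so that the example is self-contained.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE search_word (id_product INTEGER, word TEXT);
    CREATE INDEX idx_word ON search_word (word);
    CREATE TABLE product (id_product INTEGER PRIMARY KEY, name TEXT, price REAL);
    INSERT INTO product VALUES
        (1, 'Red shoe', 49.0), (2, 'Blue shirt', 19.0), (3, 'Shoe polish', 5.0);
    INSERT INTO search_word VALUES
        (1, 'red'), (1, 'shoe'), (2, 'blue'), (2, 'shirt'), (3, 'shoe'), (3, 'polish');
""")

def search(prefix, limit=10):
    # Step 1: apply the search conditions and fetch nothing but ids.
    ids = [row[0] for row in db.execute(
        "SELECT DISTINCT id_product FROM search_word "
        "WHERE word LIKE ? ORDER BY id_product LIMIT ?",
        (prefix + "%", limit),
    )]
    if not ids:
        return []
    # Step 2: fetch the display data for exactly those ids.
    placeholders = ", ".join("?" for _ in ids)
    return db.execute(
        "SELECT id_product, name, price FROM product "
        f"WHERE id_product IN ({placeholders}) ORDER BY id_product",
        ids,
    ).fetchall()

print(search("sho"))
```

The first query touches only a narrow table, and the second one is a plain primary key lookup, which is why the pair can be cheaper than one query that joins, filters and fetches wide rows at once.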

MySQL implements its own query cache. It does a good job, but it is deprecated as of MySQL 5.7.20 and was removed in MySQL 8.0. The rationale is that the cache adds overhead to all queries, while only queries that take a certain amount of time and are executed frequently are served faster from the cache than by executing them directly.

While this argument is valid, using a software package like PrestaShop does not give us full control over the queries, and we cannot ensure that all queries are optimal. Our performance analysis showed that PrestaShop can benefit from MySQL’s query cache. The size of the query cache is an important parameter when optimizing the server configuration. It must be large enough.

We went a step further and disabled the MySQL query cache in favor of a memcached instance on the same server.

We didn’t use the memcached caching feature as it is built into PrestaShop. We noticed that it delayed page load a lot and introduced a lot of erratic behavior due to a flawed cache invalidation algorithm. With our own implementation, we currently achieve (with PrestaShop 1.6) a 33% reduction in page load time. Before we optimized certain queries, the benefit was larger.

With PrestaShop 1.7, I expect the benefit to be even smaller. PrestaShop now makes fewer database queries per page request, and they seem to have done a good job of optimizing query execution time in general. On the other hand, the user CPU time per request has unfortunately increased noticeably.

If you consider using an external database cache, there are two more things to take into account. First, Redis is an alternative to memcached worth considering. Second, with any cache mechanism, a cache miss results in three requests instead of one: query the cache, query the database, and finally write the result back into the cache.
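The cost of a cache miss is easy to see in a small model. In this sketch, `CountingCache` and `CountingDatabase` are in-memory stand-ins invented for the illustration; they only count requests and talk to neither memcached nor MySQL.

```python
class CountingCache:
    """In-memory stand-in for memcached that counts its round trips."""
    def __init__(self):
        self.data = {}
        self.requests = 0

    def get(self, key):
        self.requests += 1
        return self.data.get(key)

    def set(self, key, value):
        self.requests += 1
        self.data[key] = value

class CountingDatabase:
    """Stand-in for the database; every query counts as one request."""
    def __init__(self):
        self.requests = 0

    def query(self, sql):
        self.requests += 1
        return f"rows for: {sql}"

def cached_query(cache, database, sql):
    result = cache.get(sql)             # request 1: ask the cache
    if result is None:
        result = database.query(sql)    # request 2: ask the database
        cache.set(sql, result)          # request 3: store the result
    return result

cache, database = CountingCache(), CountingDatabase()
cached_query(cache, database, "SELECT 1")               # cache miss
miss_cost = cache.requests + database.requests
cached_query(cache, database, "SELECT 1")               # cache hit
hit_cost = cache.requests + database.requests - miss_cost
print("requests on a miss:", miss_cost, "- requests on a hit:", hit_cost)
```

A cache therefore only pays off when hits are frequent enough, and the cached queries slow enough, to outweigh the two extra requests of every miss.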

Caching database results can be worthwhile, but its benefit decreases as the queries themselves get faster.

We must be careful when optimizing code. There is a danger of falling into micro-optimizations that should help in theory, but whose gain is not noticeable.

The best improvements are achieved when it is possible to reduce the time complexity of a function (see big-O notation). For instance, if a query is made inside a loop and it is possible to move it out of the loop, the number of queries drops from O(n) to O(1). Even if the outer query is more complex and takes more time, there will be an overall improvement.
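A query inside a loop and its replacement by a single query can be compared directly. The sketch uses SQLite and an invented `stock` table so that it is self-contained; it is not taken from the PrestaShop code base.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stock (id_product INTEGER PRIMARY KEY, quantity INTEGER)")
db.executemany("INSERT INTO stock VALUES (?, ?)", [(i, i % 7) for i in range(1, 101)])

counter = {"queries": 0}

def run(sql, params=()):
    """Execute a query and count it."""
    counter["queries"] += 1
    return db.execute(sql, params).fetchall()

def quantities_in_loop(ids):
    # One query per product: the number of queries grows with len(ids).
    return {
        i: run("SELECT quantity FROM stock WHERE id_product = ?", (i,))[0][0]
        for i in ids
    }

def quantities_at_once(ids):
    # One query for all products, whatever len(ids) is.
    marks = ", ".join("?" for _ in ids)
    rows = run(f"SELECT id_product, quantity FROM stock WHERE id_product IN ({marks})", ids)
    return dict(rows)

ids = list(range(1, 101))
slow = quantities_in_loop(ids)
loop_queries = counter["queries"]
counter["queries"] = 0
fast = quantities_at_once(ids)
single_queries = counter["queries"]
print("same result:", slow == fast, "- queries:", loop_queries, "versus", single_queries)
```

Both functions return the same dictionary, but the first one sends one query per product while the second one sends a single query regardless of the number of products.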

There should be very few occasions for real improvement; otherwise this would point to poor code quality. We found a couple of such bugs in PrestaShop 1.6.1.1. We haven’t yet had the time to check whether they have been fixed in the meantime, but I’m quite confident they have.

There are a lot of opportunities to optimize performance. Nevertheless, performance optimization is complex and can be time consuming. It is important to constantly balance the expected improvements against the investment of time. Keep in mind that certain actions take time that cannot be reduced.

To keep control over the improvements and to have an objective way to judge the results, continuous performance measurements lay the foundation. They must be complemented by several specialized tools, some of which are mentioned here.

I hope this article gives you hints on where to start. In the end, not only will your e-commerce site be faster, you will also gain much insight into the inner workings of your system as a whole, from configuration through database to code.

Thank you for reading. If you enjoyed this article, you might be interested in two follow ups: A look at performance while upgrading PrestaShop (PrestaShop 1.7) and Optimize PrestaShop performance again (PrestaShop 8.0, still in draft).
