Spryker Performance and Scalability Concepts
Spryker provides two main architectural qualities that work out of the box: very high performance and scalability. In this post I explain our goals and the main architectural concepts we use to achieve them. A follow-up article in the next few days will give insights into the results of a performance load test.
Why are performance and scalability so important for Spryker?
Everybody knows that poor performance is a true conversion killer, and there is no doubt that a fast website results in a great user experience. But there are other reasons to strive for a low server-side execution time.
If you have ever worked with a slow application in a high-traffic environment, you will probably remember the pain. With Spryker we want to make sure that the shop never becomes the bottleneck for marketing activities such as TV spots, massive newsletters or SEM campaigns.
Good performance also has a positive effect on the team’s productivity. Developers do not need to waste their time on ‘zero business value’ work like extensive server tuning or complicated database replication mechanisms. For good reasons we also want to avoid any kind of full page cache.
Another benefit of a fast application is the ability to run the shop on small (and cheap) cloud instances, whether for testing, staging or even production. In general, the hosting environment can be much simpler if you don’t need to invest heavily in optimizing and tweaking hardware.
Spryker’s architectural goals
A typical Spryker shop should have an average server-side execution time of around 50 ms under normal load conditions. Even expensive calls like add-to-cart should execute in less than 150 ms. If you are not sure what these numbers mean, you can compare them to the (unofficial) Magento performance comparison published by Dima Soroka. Dima was the lead architect of the Magento project, so you can assume that his numbers are not too bad. He compared the old but still famous Magento 1 with the fairly new Magento 2:
“On a product page the Magento 1 server response time was 250 or less ms(milliseconds) in 90% of requests. … On Magento 2 less than 25% of requests were able to achieve the same 250 ms response time. As shown on the graph below, the majority of requests to the server resulted in a response time of 600 ms and more.”
A scalable system is not only fast under low and high load conditions; its execution time also increases only gradually when the server reaches maximum load. A bad system just dies and stops responding. This typically happens when an SQL database starts to queue queries or the I/O wait becomes too high.
We are also looking for low memory consumption to enable a high number of parallel requests without the need to buy high memory servers.
Finally we want to enable very quick horizontal scaling without bottlenecks or cold cache situations.
The real performance in a specific project depends on many things, such as the concrete hosting setup, the website structure, the number of products, calls to other services and, of course, the speed of the project implementation. For high-traffic projects we always recommend a dedicated load test that simulates real user traffic. In any case, with Spryker we want to provide a very fast PHP framework so that our clients do not need to invest heavily in optimizations or exotic server tuning.
Spryker’s optimized software design
To achieve these goals we decided to split the whole shop application into two parts called Yves and Zed. Yves is the customer-facing front end. It is a lightweight PHP application based on SensioLabs’ Silex micro framework. All the business logic is located in a more heavyweight application called Zed. Yves gets most of its speed from a simplified software design without the architectural overhead of layers, and from a reduced bootstrapping process.
Another important concept is the way Yves accesses the data. Instead of running expensive SQL queries with several joins and conditions against a relational database, Yves reads the data from a blazing fast key/value storage, which is Redis by default. It is also connected to Elasticsearch for quick full text search and facet filters. The relational database, which often becomes a scalability bottleneck, is not directly accessible from Yves. Instead, Zed collects all changes from the database and synchronizes them to Redis and Elasticsearch.
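The synchronization from Zed to the storage can be pictured roughly as follows. This is a minimal sketch in Python, not Spryker's actual PHP code. The table layout, the key format `product:<sku>` and the function names are assumptions made for illustration, and a plain dict stands in for Redis.

```python
import json

# Hypothetical normalized tables, as they might exist in the relational database.
PRODUCTS = {1: {"sku": "tshirt-001", "name": "T-Shirt"}}
PRICES = {1: {"gross_cents": 1999, "currency": "EUR"}}
STOCKS = {1: {"quantity": 42}}


def build_product_document(product_id):
    """Join the rows for one product into a single denormalized document."""
    product = PRODUCTS[product_id]
    return {
        "sku": product["sku"],
        "name": product["name"],
        "price": PRICES[product_id],
        "stock": STOCKS[product_id]["quantity"],
    }


def export_changes(changed_product_ids, storage):
    """Write one key per changed product and return the keys that were written."""
    written = []
    for product_id in changed_product_ids:
        document = build_product_document(product_id)
        key = "product:" + document["sku"]  # assumed key format
        storage[key] = json.dumps(document)
        written.append(key)
    return written


storage = {}  # stands in for Redis
keys = export_changes([1], storage)
```

The expensive joins happen once, on the write side, whenever a product changes. The front end never has to repeat them.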
Yves’ performance
Let’s have a deeper look into Yves’ workflows to understand its performance capabilities. From a high level perspective a typical Yves request looks like this:
(1) Resolving the URL
(2) Retrieving all data from Redis and/or Elasticsearch
(3) Adding this data to the pre-compiled Twig templates
As you can see, there is not much overhead here. The raw PHP part is fast because of the reduced software design. From a performance perspective, the most critical part is the second step, when Yves retrieves the data from Redis and/or Elasticsearch. All product data, CMS pages and translations are stored in Redis. In a normalized relational PostgreSQL or MySQL database, the information for each product is spread across several tables for products, stocks, prices, taxes, etc. In Redis, all of this is hydrated into a single set of data. Instead of multiple queries with expensive joins, Yves just performs a Redis::get() to retrieve all needed data at once.
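The three steps of a request can be sketched like this. It is an illustration in Python under stated assumptions, not the real Yves code: a dict stands in for Redis, `string.Template` stands in for a pre-compiled Twig template, and the key names and the idea of resolving URLs through the same storage are hypothetical.

```python
import json
from string import Template

# A dict stands in for Redis; keys and values are illustrative assumptions.
STORAGE = {
    "url:/en/t-shirt": json.dumps({"type": "product", "key": "product:tshirt-001"}),
    "product:tshirt-001": json.dumps({"name": "T-Shirt", "price": "19.99 EUR", "stock": 42}),
}

# Templates are prepared once at startup, mimicking pre-compiled templates.
TEMPLATES = {"product": Template("<h1>$name</h1><p>$price</p>")}


def handle_request(path):
    # (1) Resolve the URL to a page type and a storage key.
    route_raw = STORAGE.get("url:" + path)
    if route_raw is None:
        return 404, "Not found"
    route = json.loads(route_raw)

    # (2) Retrieve the already denormalized data with a single lookup.
    data = json.loads(STORAGE[route["key"]])

    # (3) Add the data to the prepared template.
    body = TEMPLATES[route["type"]].substitute(name=data["name"], price=data["price"])
    return 200, body
```

No SQL query and no join appears anywhere in the request path. The work per request is a couple of key lookups and one template substitution.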
A single page requires several sets of data. Redis is extremely fast even with large amounts of data and a high number of connections. If you install Redis on the same server, a single Redis::get() takes just 0.1 ms. However, if you run Spryker in a cloud environment, as 90% of our clients do, you’ll quickly observe network latency. Instead of a fast in-memory lookup in a local Redis, every call has to cross the network, which dramatically increases the execution time. For this reason, Yves automatically combines data that is used together and performs a single Redis::mget() instead of dozens of Redis::get() calls. Using this technique, we avoid this conceptual restriction of cloud environments and get the most out of Redis.
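The effect of batching is easy to see by counting round trips. The sketch below uses an in-memory stand-in written in Python, not a real Redis client. The class, the helper function and the figures of 30 keys and 1 ms per network round trip are assumptions chosen for illustration.

```python
class CountingStore:
    """In-memory stand-in for a remote key/value store that counts round trips."""

    def __init__(self, data):
        self._data = dict(data)
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1  # one network round trip per key
        return self._data.get(key)

    def mget(self, keys):
        self.round_trips += 1  # one round trip for the whole batch
        return [self._data.get(key) for key in keys]


def estimated_latency_ms(round_trips, latency_per_trip_ms):
    """Time spent waiting on the network, ignoring the lookup itself."""
    return round_trips * latency_per_trip_ms


data = {"key:%d" % i: "value %d" % i for i in range(30)}
keys = list(data)

one_by_one = CountingStore(data)
values_single = [one_by_one.get(key) for key in keys]

batched = CountingStore(data)
values_batched = batched.mget(keys)
```

Both variants return the same values, but the first needs 30 round trips and the second needs one. At an assumed 1 ms per round trip, that is roughly 30 ms of waiting versus 1 ms, and 30 ms alone would use up more than half of a 50 ms budget.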
In contrast to Redis, Elasticsearch is only used for the catalog. We execute only a single query, but this query is much slower than a Redis::get(). To avoid slow queries we fill the index in an optimized way, so that the data is prepared for quick searches. We will explain this in detail in another article.
Finally, I would like to say a few words about scalability. To prepare Spryker to scale in cloud environments, we strictly implemented the twelve-factor methodology. The most important part is that “processes are stateless and share-nothing”. All parts of Yves can be scaled independently of the rest of the application. You can add more or bigger nodes for Yves itself, or add more resources to Redis or Elasticsearch. All of this is possible without downtime; you don’t even need to deploy the application. I will give more insights into these topics in additional blog posts.
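To illustrate what “stateless and share-nothing” means in practice, here is a small Python sketch. It is not taken from Spryker: the `FrontendNode` class, the `session:` key format and the use of a dict as the shared backing service are all assumptions.

```python
class FrontendNode:
    """A front-end process that keeps no request state in its own memory."""

    def __init__(self, name, backing_store):
        self.name = name
        self._store = backing_store  # shared backing service, e.g. Redis

    def remember(self, session_id, field, value):
        key = "session:" + session_id  # assumed key format
        session = dict(self._store.get(key, {}))
        session[field] = value
        self._store[key] = session

    def recall(self, session_id, field):
        return self._store.get("session:" + session_id, {}).get(field)


shared = {}  # stands in for the shared storage
node_a = FrontendNode("a", shared)
node_b = FrontendNode("b", shared)

node_a.remember("session-1", "language", "en")

# A node added later sees the same data without any warm-up or hand-over.
node_c = FrontendNode("c", shared)
```

Because every node reads and writes state only through the backing service, a load balancer can send each request to any node, and nodes can be added or removed at any time.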
In the next article I am going to publish the results of our performance load tests. In the meantime, feel free to have a look at Spryker! The code is open and the installation takes only a few minutes. Please note that our virtual machine ships with OPcache disabled. If you want to profile the application, you should activate it first.