Our goal in devops at HyperTrack is to shorten the request-response lifecycle of API calls and keep them fast. To achieve this, our architecture offloads some tasks out of the synchronous request lifecycle of the application. We accomplish this with Celery, which has become a crucial component of our tech stack for executing asynchronous tasks.
The need for Celery
Most of our foreground API requests are create-read-update-delete (CRUD) operations, and these are I/O bound on our database. Our application server therefore has an event-based architecture, which does not block while waiting for an I/O response. In contrast, some of our background tasks, like the ETA models or location filtering, are CPU intensive. Running them on the application server blocks the event loop and lengthens the request lifecycle for the end user.
To tackle this challenge, we set up a task queue to move some of these jobs out of the request lifecycle and into the background. We chose Celery for this purpose. Celery spawns worker processes to consume and execute background tasks. These workers communicate with the application server via a broker like RabbitMQ or Redis. We found the following best practices useful while setting up Celery at HyperTrack.
Splitting tasks into queues
By default, Celery routes all tasks through a single common queue on the broker. While this is easy to set up, it treats all tasks the same, which can lead to suboptimal performance. At HyperTrack, our background tasks fall into three sets.
- The first set of tasks is used to clean and filter location data. These tasks involve a lot of math and are therefore CPU intensive. A process-based architecture is best suited for these tasks.
- The second set of tasks is used to define models and fetch data for the ETAs. The performance of these tasks is determined principally by the time spent on database queries and external APIs. These I/O bound tasks perform best with an event-based architecture.
- The third set consists of periodic maintenance tasks. These tasks are also I/O bound on our database.
These categories of tasks help us define our Celery queues.
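As a minimal sketch of how the three sets could map onto queues, the settings module below routes tasks by name. Everything in it is an assumption for illustration: the module name `celeryconfig`, the task names under `tasks.*`, and the queue names `cpu`, `io` and `periodic` are hypothetical, not HyperTrack's actual configuration.

```python
# celeryconfig.py: a hypothetical settings module that a Celery app could load
# with app.config_from_object("celeryconfig"). All task and queue names below
# are illustrative assumptions.

# Tasks without an explicit route still go to Celery's default queue.
task_default_queue = "celery"

# Route tasks to queues by name; glob patterns cover every task in a module.
task_routes = {
    "tasks.location.*": {"queue": "cpu"},  # cleaning and filtering location data
    "tasks.eta.*": {"queue": "io"},        # ETA models: database and external API calls
}

# Periodic maintenance tasks are sent to their own queue by the beat scheduler.
beat_schedule = {
    "daily-cleanup": {
        "task": "tasks.maintenance.cleanup",
        "schedule": 24 * 60 * 60.0,        # interval between runs, in seconds
        "options": {"queue": "periodic"},
    },
}
```

Routing by name pattern keeps the decision in one place, so moving a task to another queue does not require touching the code that calls it.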
With separate Celery queues, we can pick the most suitable worker pool for each kind of task. CPU bound tasks are routed to a worker with a preforked process pool. I/O bound tasks are executed by a gevent-based, event-driven worker. We use the following commands to spawn these processes.
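A sketch of what such worker commands might look like is given below, built in Python so each option can be annotated. The app module `proj`, the queue names and the concurrency values are assumptions to be tuned, not the settings used at HyperTrack. The flags themselves (`-A`, `-Q`, `-P`, `-c`, `-n`) are standard `celery worker` options.

```python
import shlex


def worker_command(app, queue, pool, concurrency):
    """Build the argument list for one `celery worker` process."""
    return [
        "celery", "-A", app, "worker",
        "-Q", queue,             # consume only from this queue
        "-P", pool,              # execution pool: prefork or gevent
        "-c", str(concurrency),  # pool size: child processes or greenlets
        "-n", f"{queue}@%h",     # distinct node name for each worker
    ]


# Hypothetical app module and queue names; concurrency values are placeholders.
WORKERS = [
    worker_command("proj", "cpu", "prefork", 4),       # CPU bound tasks
    worker_command("proj", "io", "gevent", 100),       # I/O bound tasks
    worker_command("proj", "periodic", "gevent", 10),  # periodic maintenance tasks
]

# Print each command in a form that can be pasted into a shell.
for argv in WORKERS:
    print(shlex.join(argv))
```

The first entry corresponds to the shell command `celery -A proj worker -Q cpu -P prefork -c 4 -n cpu@%h`. The gevent pool requires the `gevent` package to be installed, and periodic tasks also need a separate `celery -A proj beat` process to schedule them.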
Benchmarking worker performance
With some stress testing, we compared the performance of prefork and gevent workers independently and in combination. Performance is measured as the time taken for a task to complete, which includes both the time spent waiting in the queue and the execution time.
In our tests, the combination of prefork and gevent workers is 4 times faster on average than either worker type on its own. Used alone, the prefork worker cannot consume new tasks while its processes wait for I/O bound tasks to finish, so the time spent in the queue piles up. The gevent worker alone shows higher execution times because CPU bound tasks block the other tasks sharing its execution thread.
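To make the two components of task time concrete, here is a small sketch that splits completion time into queue time and execution time. The `TaskTiming` record and its three timestamps are hypothetical; how those timestamps are collected (for example from task events or application logs) is left open, and this is not the harness used for the tests above.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskTiming:
    """Timestamps for one task, in seconds (hypothetical record layout)."""
    published_at: float  # application server sent the task to the broker
    started_at: float    # a worker began executing the task
    finished_at: float   # the worker finished the task

    @property
    def queue_time(self):
        return self.started_at - self.published_at

    @property
    def execution_time(self):
        return self.finished_at - self.started_at

    @property
    def total_time(self):
        return self.finished_at - self.published_at


def summarize(timings):
    """Average each component of task time over a collection of timings."""
    return {
        "queue": mean(t.queue_time for t in timings),
        "execution": mean(t.execution_time for t in timings),
        "total": mean(t.total_time for t in timings),
    }
```

Comparing the `queue` and `execution` averages across worker setups shows where the time goes: a prefork-only setup would be expected to inflate `queue`, while a gevent-only setup would be expected to inflate `execution`.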
Celery takes some load off the application server, which helps us serve your requests fast. If your application has tasks that can move outside the request lifecycle, we definitely recommend Celery. We have had our share of misadventures setting up Celery, and we will share what we learned in upcoming posts. Stay tuned.
And if your application can use location tracking we definitely recommend checking out our APIs. Request access here.