Tuya Smart had outgrown its existing Kafka-based message system. Rapidly increasing numbers of users, topics, and messages had led to mounting costs in storage, processing, labor, and time.
To improve system performance, Tuya required a more flexible message delivery system that could meet their need for persistence. They also needed a solution that would make it easy to classify messages by user.
After evaluating several different options, Tuya settled on Apache Pulsar because it proved to be the most adept at handling the accumulation of messages and repeated consumption. The addition of Pulsar has made Tuya’s message system much more efficient, resulting in lower operational and maintenance costs.
Introduction to Tuya Smart
Tuya Smart is a global intelligent platform, an “AI+IoT” developer platform, and the world’s leading voice AI interaction platform. By connecting the intelligent needs of consumers, production brands, OEMs, and retail chains, Tuya provides customers with a one-stop artificial intelligence IoT solution.
Tuya offers hardware intervention, cloud services, and application software development to form a closed ecosystem of artificial intelligence plus manufacturing services. This closed ecosystem provides business-side technology and business model upgrade services for consumer IoT smart devices, thereby meeting customers’ higher demands for hardware products.
Figure 1 shows Tuya’s current ecological model, including Tuya Cloud, Tuya OS, and Tuya APP, which together form a closed ecological loop. The IoT ecosystem on the right illustrates some of Tuya’s application scenarios, such as smart hotels, smart security, smart homes, and so on.
Tuya Smart’s Message Architecture Before Pulsar
Figure 2 illustrates Tuya Smart’s message architecture before the addition of Pulsar. The upper layer includes a suite of devices that are independent of IoT, such as power switches, projectors, and so on. Messages from these devices were reported to the message system through the MQTT gateway. In addition, other IoT devices, such as sensors, transmitted their messages through the MQTT gateway before they were reported to the message system.
Figure 3 shows each link in Tuya’s previous message system. Message distribution began at the MQTT gateway. The messages then passed through intermediate services and were handed to Kafka, which distributed them. Users were able to process messages in different ways after receiving them.
Tuya’s previous architectural pattern caused the company the following pain points:
The most glaring issue was that the HTTP delivery method was not flexible. If a user wanted to re-consume messages after the service was restarted, additional processing was required to meet the demand for message persistence. Specifically, any messages that users did not receive needed to be saved in the database.
The persistence problem could have been solved under Tuya’s existing Kafka subscription model; however, the company had additional architectural challenges that called for a different solution.
In Kafka’s delivery mode, every user is associated with a unique topic. Therefore, the number of topics increases as the number of users increases. As a result of a sharp rise in the numbers of users, topics, and messages, operation and maintenance had become costly and stressful over time. The cost of labor and time had gradually gone up as well.
Tenants were affecting one another because messages were classified by category rather than by tenant. Distributing messages through Kafka made this interference between tenants considerably worse.
Why Apache Pulsar
Tuya ultimately chose Pulsar for two main reasons. First, Pulsar has unique features, such as multi-tenancy, which offer the company distinct advantages. And second, Pulsar performed better than its competitors during testing. In this section, we’ll examine which Pulsar features were the most attractive to Tuya and review the results of the performance tests.
The following features played a key role in Tuya’s decision to adopt Pulsar:
Rich Delivery/Subscription Strategy
Pulsar unifies the queue model and the stream model. Only one copy of the data needs to be stored at the topic level because the same piece of data can be consumed multiple times. Choosing among different subscription models (streaming, queuing, and so on) improves flexibility significantly.
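The idea that one stored copy can serve several independent consumers can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions, not Pulsar's implementation: the Topic class, its methods, and the subscription names "billing" and "analytics" are all hypothetical. Each subscription keeps its own read position over a single shared log, so consuming or replaying in one subscription never copies data or disturbs another.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Toy topic: one append-only log shared by every subscription."""
    log: list = field(default_factory=list)
    cursors: dict = field(default_factory=dict)  # subscription name -> next index

    def publish(self, message):
        # The message is stored exactly once, regardless of subscriber count.
        self.log.append(message)

    def subscribe(self, name):
        # In this toy model a new subscription starts at the current end of the log.
        self.cursors.setdefault(name, len(self.log))

    def receive(self, name):
        """Return the next unread message for this subscription, or None."""
        pos = self.cursors[name]
        if pos >= len(self.log):
            return None
        self.cursors[name] = pos + 1
        return self.log[pos]

    def rewind(self, name):
        # Move only this subscription back to the start, e.g. after a restart.
        self.cursors[name] = 0


topic = Topic()
topic.subscribe("billing")      # hypothetical subscription names
topic.subscribe("analytics")
for msg in ["on", "off", "on"]:
    topic.publish(msg)

billing = [topic.receive("billing") for _ in range(3)]
analytics_first = topic.receive("analytics")
topic.rewind("billing")
replayed = topic.receive("billing")
```

Here "billing" reads all three messages and then replays from the start, while "analytics" advances at its own pace over the same stored log.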
Ease of Operation and Maintenance (Compared to Kafka); Inclined to Automation
Apache Pulsar is a flexible publish-subscribe message system with a multi-layered and segmented architecture. Its main advantage is in geo-replication. With Pulsar’s cloud-native architecture that separates computing from storage, data is moved away from the broker and into shared storage. The upper layer is a stateless broker that is responsible for message distribution and serving. The lower layer is a persistent storage layer called the bookie cluster.
With its segmented storage architecture, Pulsar allows storage to expand independently and recover quickly, so scaling does not become a constraint.
Multi-Tenancy
Multi-tenancy is the ability of a single instance of software to serve multiple tenants. A tenant is a group of users that shares common access, with specific privileges, to the software instance. Tenants and namespaces are two core Pulsar resources that support multi-tenancy as follows:
At the tenant level, Pulsar reserves appropriate storage space, application authorization, and authentication mechanisms for specific tenants.
At the namespace level, Pulsar has a series of configuration policies, including storage quotas, flow control, message expiration policies, and isolation policies between namespaces.
Messages can be classified either by category or by tenant (user). When Tuya started to classify messages by tenant instead of by category, the interference between tenants was resolved automatically.
Although Pulsar is not the only platform that can solve this problem, the desired outcome is difficult to achieve in Kafka because Kafka is a single-tenant system. Pulsar’s multi-tenancy feature serves Tuya’s real scenarios much better.
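Classifying messages by tenant shows up directly in how Pulsar topics are named: a topic name has the form persistent://tenant/namespace/topic. The sketch below builds such names for one tenant per user. The helper tenant_topic, the user IDs, and the "devices" and "status" names are hypothetical and chosen only for illustration; they are not taken from Tuya's deployment.

```python
def tenant_topic(tenant, namespace, topic, persistent=True):
    """Build a Pulsar-style topic name scoped to a tenant and namespace."""
    for label, part in (("tenant", tenant), ("namespace", namespace)):
        # Tenant and namespace each occupy exactly one path segment.
        if not part or "/" in part:
            raise ValueError(f"invalid {label}: {part!r}")
    if not topic:
        raise ValueError("topic must not be empty")
    domain = "persistent" if persistent else "non-persistent"
    return f"{domain}://{tenant}/{namespace}/{topic}"


# Hypothetical users, each mapped to its own tenant.
users = ["user-a", "user-b"]
topics = {u: tenant_topic(u, "devices", "status") for u in users}
```

Because the tenant is part of every topic name, two users can use the same namespace and topic names without sharing a topic.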
Excellent Online Community
The Pulsar community is very active and responsive to both technical and documentation issues.
Pulsar Performed Better Than Its Competitors
Tuya did its due diligence by comparing multiple message queues from the perspectives of performance, scaling, operation, and maintenance. A summary of the findings is shown in Table 1.
LeviMQ is an MQTT-protocol-based message queue developed by Tuya. NSQ is a popular open-source message middleware product written in Go.
First, Kafka had shortcomings in scaling, especially in scaling down. Second, from the perspective of operation and maintenance, Kafka, as mentioned above, is more costly in terms of labor and time. Finally, with regard to ecosystem, LeviMQ was developed by Tuya and is not an open-source solution, so its ecosystem is inherently limited to some extent. NSQ is also an excellent open-source message queue whose design builds on the advantages of Kafka. However, the documentation for Pulsar is more complete than NSQ’s.
After comparing various aspects of message queue performance, the advantages and disadvantages of each platform were analyzed. The results are shown in Figure 4.
As Figure 4 illustrates, Pulsar scales better and covers more application scenarios. It is also more flexible than Kafka.
Although the documentation for Pulsar is less complete than Kafka’s, the Pulsar community has been working hard to fill in the gaps and is making good progress.
With the addition of Apache Pulsar, Tuya’s message system architecture changed as shown in Figure 5.
The most notable changes in the architecture were as follows:
A Pulsar layer was added between Kafka and message distribution. The most obvious improvement in this new architecture is that it resolves the tenant isolation problem. With Pulsar deployed, Tuya now creates a new tenant for each user.
The software development kit (SDK) that Tuya’s customers use to subscribe to messages now supports Pulsar.
Current and Future Plans for Pulsar
Tuya successfully applied Apache Pulsar to various application layers and the system performed well overall. They have been very satisfied with the results and are now working on implementing their short- and long-term plans.
Currently, the company is in the process of applying a set of rule engines to Pulsar to meet the growing demand for message subscriptions.
For the future, Tuya anticipates more extended business support functions that provide richer usage scenarios. Specifically:
At the technical level, Tuya awaits more O&M (operation and maintenance) APIs, such as the ability to view the broker and bookie associated with a specific topic.
As for documentation, Tuya would like to see more official Pulsar design documents to aid in their understanding.
With the advent of 5G, the IoT industry is facing myriad challenges and opportunities. As a global intelligent platform, Tuya Smart not only links vendors of various sales platforms, but also connects users in countless ways. Driven by the theme “Intelligence for All Things”, Tuya had a pressing need for a message system with high performance and stability.
After comparing a variety of different message systems such as Kafka and LeviMQ, Tuya finally chose Apache Pulsar. With its excellent performance and features such as geo-replication and multi-tenancy isolation, Pulsar solved many of the pain points in Tuya’s previous message system, such as inflexible delivery, increased operation costs due to a rapidly growing number of topics, interaction between tenants, and so on. This implementation has proven that Apache Pulsar has a promising future as an application for the IoT industry.
Yong Zhang is an Apache Pulsar committer. He works as a software engineer at StreamNative.