Understanding tradeoffs in designing real-time streaming analytical applications
There is no universally good or bad design; instead, there are many tradeoffs to make, and ideally those tradeoffs suit the particular use case being addressed.
Below is a list of decisions that need to be made and what each decision implies. It will help anyone new to designing real-time streaming analytical applications and will avoid a great deal of frustration down the road. The list is agnostic to any particular technology.
Event, data, record, and message are used interchangeably to indicate a piece of data transferred through a messaging system. Similarly, messaging system, event system, messaging middleware, and broker all refer to message-oriented middleware (MoM).
- Stream — A stream of records is an immutable sequence of records. Immutability is key to understand: the only way to change an event is to produce a new event that supersedes it. This is fundamental to the streaming concept.
- Real-time — Define what real-time/near real-time means for your use case.
Put a number on it: an event every second is real-time, an event a minute is near real-time, or whatever is applicable for your use case. This helps manage expectations of the system and also helps when benchmarking its performance. Without a concrete definition, "real-time" becomes subjective and performance tuning turns into shooting in the dark.
- Number of event producers/consumers — How many producers will produce events, and how many consumers will consume them? This defines the scalability requirements of a real-time streaming application. In general, there can be millions of events generated by thousands of producers and consumed by tens or hundreds of consumers.
- Event Resolution — What is the resolution of events in the stream?
If a sensor produces 1 event every second and there are 100,000 sensors, that is 1 event/sec × 86,400 sec/day × 100,000 sensors = 8.64 billion events per day. If each event is 100 bytes, that amounts to 864 GB per day. It is important to have these numbers defined at the start of the project itself, to understand the scalability, computing, and storage needs.
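The back-of-envelope arithmetic above is worth scripting so it can be rerun as assumptions change. A minimal sketch, using the sensor figures from the example (all of them assumptions to be replaced with your own):

```python
# Back-of-envelope capacity estimate for the sensor example.
# Assumed figures: 100,000 sensors, 1 event/sec each, 100 bytes/event.
SENSORS = 100_000
EVENTS_PER_SENSOR_PER_SEC = 1
EVENT_SIZE_BYTES = 100
SECONDS_PER_DAY = 86_400

events_per_day = SENSORS * EVENTS_PER_SENSOR_PER_SEC * SECONDS_PER_DAY
bytes_per_day = events_per_day * EVENT_SIZE_BYTES

print(f"{events_per_day:,} events/day")      # 8,640,000,000 events/day
print(f"{bytes_per_day / 1e9:,.0f} GB/day")  # 864 GB/day
```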
- Fault tolerance — The ability to restart the messaging middleware after it shuts down due to unexpected failures. If the messaging middleware doesn't persist unprocessed events to durable storage, the probability of data loss is high. It is important to understand the implications of data loss in relation to the use case.
- Availability — Is the middleware capable of replicating partitions/shards such that if one messaging broker goes down, another broker instance can come online with an existing replica, guaranteeing zero data loss as well as high availability of the messaging system? This requires capabilities like partition/shard replication, a clustered broker architecture, and in-sync replicas. It also requires that a producer's write is not considered complete until the event is fully replicated and persisted to fault-tolerant storage.
- Event offset mark — Who maintains the event offset: is it the broker's responsibility or the consumer's? A broker that manages every consumer's offset faces different challenges than a broker that lets consumers manage their own offsets.
- Load balancing of partitions/shards — Producers can control which partition/shard an event is sent to using a record key. It is important to understand the record key and the publishing pattern, or you will run into load-balancing challenges and may end up hot-spotting a single partition. Round robin can help, but not if ordering is required: for instance, if events from a device must go to the same partition to guarantee ordering at the consumer end, round robin will not work. When a record key is used to select the partition, the chances of hot-spotting a partition can be high. Take the use case of devices across the world sending events: if the region/continent of the device is the key and devices are active during the daytime, then during the day in Asia the partition for Asia will be hot-spotted while the partition for the US sits idle, and vice versa. A better record key in this case could be the device id, which loads all partitions evenly while still guaranteeing ordering of events per device. So the selection of the record key is important and really depends on the use case.
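The region-vs-device-id effect can be simulated in a few lines. This is a sketch, assuming the common broker behavior of hashing the record key to pick a partition; the 4-partition count and the key names are made up for illustration:

```python
# Sketch: how the choice of record key affects partition load.
# Assumes the broker picks a partition by hashing the key (a common scheme).
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical stream with 4 partitions

def partition_for(key: str) -> int:
    # Stable hash so the same key always lands on the same partition,
    # which is what preserves per-key ordering.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

# Region as key: 1000 daytime events from Asia all hit one partition.
region_load = Counter(partition_for("asia") for _ in range(1000))

# Device id as key: the same 1000 events spread across all partitions,
# while ordering is still guaranteed per device.
device_load = Counter(partition_for(f"device-{i}") for i in range(1000))

print("region key:", dict(region_load))  # a single hot partition
print("device key:", dict(device_load))  # roughly even spread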
- Data retention of the event/message data — Does the message broker support reprocessing/replay of data? Is this required for your use case? In a traditional message queue, events get deleted once they are acknowledged by a consumer. What if there is a need to replay an event again? A pub/sub system supports replay/reprocessing of events; the downside is that downstream systems need to handle duplicate processing of events. What is desirable really depends on the use case.
- Can multiple consumers process the same event — In a traditional queue, a message gets deleted as soon as a single consumer processes it. In a pub/sub system, multiple consumers can process the same event.
- Event/message throughput vs latency — High throughput (a large number of events) with low latency (less time from source to target) is desirable, but the two pull against each other: the techniques that raise throughput, such as batching, add delay, while sending each event immediately minimizes delay but limits throughput (see Little's law in queuing theory for how arrival rate, time in system, and queue length relate). Low latency means less delay from source to target and is achievable if the producer sends one event at a time. High throughput means moving a large number of events from source to target. A balance between throughput and latency has to be struck based on the needs of the use case, and is possible in the following ways.
1) Batching events at the producer — decide how long to wait to fill a batch.
2) Reducing the number of API calls from producer to broker by packing many events into a single call.
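Both techniques amount to the same producer-side pattern: buffer events, then flush when the batch is full or a linger deadline passes. A minimal sketch, where `send_batch` is a hypothetical stand-in for one real network call to the broker:

```python
# Sketch of producer-side batching: flush when the batch fills up or when
# the linger deadline passes. Larger batches raise throughput; the linger
# time is the latency you pay for it.
import time

class BatchingProducer:
    def __init__(self, send_batch, max_batch=100, linger_sec=0.05):
        self.send_batch = send_batch  # one network call per batch
        self.max_batch = max_batch
        self.linger_sec = linger_sec  # extra latency traded for throughput
        self.buffer = []
        self.deadline = None

    def send(self, event):
        if not self.buffer:
            # First event of a new batch starts the linger clock.
            self.deadline = time.monotonic() + self.linger_sec
        self.buffer.append(event)
        if len(self.buffer) >= self.max_batch or time.monotonic() >= self.deadline:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send_batch(self.buffer)
            self.buffer = []

batches = []  # collect "network calls" for the demo
producer = BatchingProducer(batches.append, max_batch=10, linger_sec=0.05)
for i in range(25):
    producer.send(i)
producer.flush()  # drain whatever is left in the buffer
print([len(b) for b in batches])
```

Tuning `max_batch` and `linger_sec` is exactly the throughput/latency balance described above.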
- Streaming event limits — Some cloud systems impose limits on how much data and how many events producers can send to the event system per second, and likewise on how much data can be read. Honor these limits, or producer and consumer calls will be throttled, which in turn requires a retry mechanism on both sides.
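The retry mechanism usually takes the form of exponential backoff with jitter. A sketch, where `ThrottledError` and `publish` are hypothetical stand-ins for whatever error type and call your broker SDK actually exposes:

```python
# Sketch: retry with exponential backoff and jitter for throttled calls.
import random
import time

class ThrottledError(Exception):
    """Stand-in for a broker SDK's rate-limit error."""

def send_with_retry(publish, event, max_attempts=5, base_delay=0.1):
    for attempt in range(max_attempts):
        try:
            return publish(event)
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            # Exponential backoff with jitter spreads retries out so many
            # throttled callers do not hammer the broker in lockstep.
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
            time.sleep(delay)

# Demo: a publish that is throttled twice, then succeeds.
attempts = {"n": 0}
def flaky_publish(event):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ThrottledError()
    return "ok"

result = send_with_retry(flaky_publish, {"id": 1}, base_delay=0.01)
print(result, attempts["n"])  # ok 3
```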
- Event Size — Messaging systems place different constraints on event size. Events can be compressed to reduce their size; the downside is that events must be decompressed on the consumer side, and compression/decompression will have some impact on throughput and latency in a large-scale system. If an event is too large, say 100MB, another trick is to send the data to a storage location and then send an event containing just the location of the actual data file; this way large event payloads can still be streamed.
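This store-and-point trick is often called the claim-check pattern. A sketch, with an in-memory dict standing in for real object storage and `send` standing in for the broker call (both assumptions):

```python
# Sketch of the claim-check pattern: the large payload goes to a blob
# store; only a small pointer event travels through the broker.
import uuid

blob_store = {}  # stand-in for object storage (e.g. a cloud bucket)

def produce_large_event(payload: bytes, send):
    blob_key = f"events/{uuid.uuid4()}"
    blob_store[blob_key] = payload  # upload the big payload first
    # The event itself stays tiny: just a pointer plus metadata.
    send({"blob_key": blob_key, "size": len(payload)})

def consume(event) -> bytes:
    # Consumer dereferences the claim check when it needs the payload.
    return blob_store[event["blob_key"]]

sent = []
payload = b"x" * 10_000  # imagine this were 100 MB
produce_large_event(payload, sent.append)
print(sent[0]["size"], len(consume(sent[0])))  # 10000 10000
```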
- Producer Capabilities — Can producers make synchronous calls, asynchronous calls, or both? Can producers retry in case of failures, or is it acceptable for some events to be lost so that retry is not required?
- Partition/Shard Scalability capabilities — Can partitions/shards be increased once the system is operational? Can they also be reduced, merged, or combined? Is auto-scaling of partitions based on load supported?
- Security — Are SSL and encryption in transit supported by the messaging system, producers, and consumers? How do producers authenticate with the broker? Consider scenarios where producers are edge devices sending events directly to the broker from the field.
- Delivery Semantics aka QoS — It is important to define up to what point in the complete pipeline these guarantees are required, and to understand the use case and what it actually needs. In some use cases data loss is acceptable but duplicates are not, and vice versa.
at most once — delivered a maximum of once; possibility of data loss but no duplicates
at least once — delivered a minimum of once; no data loss but possibility of duplicates — best to make downstream systems idempotent
exactly once — the most desirable, but not always possible in the context of the use case even if the broker supports it
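Under at-least-once delivery, the standard defense is to make the consumer idempotent, typically by deduplicating on an event id. A minimal sketch (the in-memory set would be a durable store in production):

```python
# Sketch of an idempotent consumer for at-least-once delivery: duplicates
# are detected by event id and skipped, so redelivery is harmless.
processed_ids = set()  # in production: a durable store, not process memory
results = []

def handle(event):
    if event["id"] in processed_ids:
        return  # duplicate redelivery: ignore it
    processed_ids.add(event["id"])
    results.append(event["value"])  # the actual processing

# At-least-once delivery may replay event 1.
for event in [{"id": 1, "value": "a"},
              {"id": 2, "value": "b"},
              {"id": 1, "value": "a"}]:
    handle(event)

print(results)  # ['a', 'b']: the duplicate was skipped
```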
- Checkpointing offset mark — Is the store used by consumers to record the checkpoint offset durable and fault tolerant? Is the store fast enough when writing the offset mark? A slow store might limit how quickly consumers can read and process event data. Is the checkpoint store persisted, or is it an in-memory store? Both have implications for durability.
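Consumer-managed checkpointing can be sketched with a small file-backed store; the class and file layout here are illustrative, not any particular broker's API:

```python
# Sketch of consumer-managed checkpointing: process a record, then persist
# the offset so a restart resumes after the last committed position.
import json
import os
import tempfile

class FileCheckpointStore:
    """File-backed checkpoint store: one JSON file per consumer."""
    def __init__(self, path):
        self.path = path

    def load(self) -> int:
        if os.path.exists(self.path):
            with open(self.path) as f:
                return json.load(f)["offset"]
        return -1  # nothing committed yet

    def commit(self, offset: int):
        # Write-then-rename so a crash mid-write never corrupts the file.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"offset": offset}, f)
        os.replace(tmp, self.path)

stream = ["e0", "e1", "e2", "e3"]  # toy partition
store = FileCheckpointStore(os.path.join(tempfile.mkdtemp(), "ckpt.json"))

start = store.load() + 1  # resume after the last committed offset
for offset in range(start, len(stream)):
    # ... process stream[offset] here ...
    store.commit(offset)

print("resume at offset", store.load() + 1)  # 4: a restart skips e0..e3
```

Note that committing after every record is the slow-but-safe end of the spectrum; committing every N records is faster but widens the window of reprocessing after a crash, which circles back to the delivery-semantics tradeoff above.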
Disclaimer: Please note that all the opinions expressed are my personal independent thoughts and not to be attributed to my current or previous employers.