Stream Processing and Event-driven Systems
Go is well suited to many processing environments:
- Batch processing is time-driven. Information systems consume large amounts of input data, execute jobs, and produce output. Job execution is scheduled and controlled by agents or system operators. Throughput, the amount of data processed per unit of time, is the common performance metric.
- Interactive processing is the world of online systems and the web. It is request-driven, with clients making requests and servers responding to those requests. Response time or latency is the common performance metric. System scalability is key, with the goal of responding to all user requests within a short amount of time.
- Stream processing is needed for event-driven systems. Data are unbounded, arriving in streams over time, perhaps continuously, perhaps intermittently. Events correspond to state changes in data streams. Performance evaluation equates to how fast systems respond to events.
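The stream-processing case can be sketched with Go channels. The following is a minimal illustration, not a production design: the `Reading` and `Event` types, the `DetectEvents` function, and the threshold rule are all hypothetical names chosen for this example. A channel stands in for an unbounded data stream, and an event is emitted only when a sensor's state changes, that is, when its value crosses the threshold in either direction.

```go
package main

import "fmt"

// Reading is one item in an unbounded stream (hypothetical sensor data).
type Reading struct {
	Sensor string
	Value  float64
}

// Event records a state change: a sensor moving above or below the threshold.
type Event struct {
	Sensor string
	Above  bool
}

// DetectEvents consumes readings until the input channel is closed and emits
// an Event each time a sensor's above/below-threshold state changes.
// Every sensor is assumed to start in the "below" state.
func DetectEvents(in <-chan Reading, threshold float64) <-chan Event {
	out := make(chan Event)
	go func() {
		defer close(out)
		state := make(map[string]bool) // last known state per sensor
		for r := range in {
			above := r.Value > threshold
			if above != state[r.Sensor] {
				state[r.Sensor] = above
				out <- Event{Sensor: r.Sensor, Above: above}
			}
		}
	}()
	return out
}

func main() {
	in := make(chan Reading)

	// Producer: in a real system this would read from a socket, log, or broker.
	go func() {
		defer close(in)
		for _, v := range []float64{10, 20, 35, 40, 15} {
			in <- Reading{Sensor: "s1", Value: v}
		}
	}()

	// Consumer: reacts to state changes rather than to every reading.
	for e := range DetectEvents(in, 30) {
		fmt.Printf("sensor %s above threshold: %v\n", e.Sensor, e.Above)
	}
}
```

Five readings arrive, but only the two threshold crossings produce events. The consumer's work is driven by state changes, not by the volume of incoming data.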
Streaming data arise in many contexts. Journalists attend to messages arriving from social media and news wires. Automated trading systems respond to market fluctuation. Financial service applications detect fraudulent transactions before they are processed. Cameras and electronic sensors identify unusual or threatening activities, alerting security personnel. Providers of networking and information systems monitor system logs, employing cybersecurity solutions. Responding to events in near real-time is important to operations, logistics, and supply chain management. Self-driving cars must react to road and traffic conditions.
Needs for stream processing and real-time analytics are ubiquitous.
In this Go Time episode, Jon Calhoun and Kris Brandow meet with Daniel Selans and Steve High, who shed light on event-driven systems and where they are most useful. The group explores the complexity of setting up an event-driven system, the need to embrace eventual consistency, useful tools for building event-driven systems, and more. The session was recorded in May 2021.
Rebecca Bilbro, founder and CTO of Rotational Labs, distinguishes between batch and stream processing in an introductory presentation for data scientists.
Ensign, currently in beta and written in Go, offers a managed event-driven platform for data analytics. Bilbro describes Ensign as being part of a PostSQL approach to data management, one that combines relational and NoSQL technologies and provides support for event-driven solutions. Ensign is open-source with a GitHub repository.
Many firms are likely to enter the event-driven systems space in the coming years, and many will use Go as the primary programming language.
As companies migrate from monolithic architectures to microservices, they realize the importance of efficient communication among services. NATS is an open-source messaging system written in Go. NATS, which stands for “Neural Autonomic Transport System,” comprises a protocol for trustworthy message delivery between information producers and consumers. NATS is central to communications between microservices and the development of cloud-native solutions.
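The producer/consumer pattern behind messaging systems can be illustrated with the standard library alone. The sketch below is not the NATS client API. The `Broker` type and its `Subscribe` and `Publish` methods are hypothetical, in-process stand-ins that show how producers publish to a named subject without knowing which consumers are listening. As a simplification, a subscriber whose buffer is full misses the message.

```go
package main

import (
	"fmt"
	"sync"
)

// Broker is a hypothetical in-process stand-in for a message broker.
// It routes string messages to subscribers by subject name.
type Broker struct {
	mu   sync.RWMutex
	subs map[string][]chan string
}

// NewBroker returns an empty broker.
func NewBroker() *Broker {
	return &Broker{subs: make(map[string][]chan string)}
}

// Subscribe returns a buffered channel that receives messages
// published on the given subject.
func (b *Broker) Subscribe(subject string, buffer int) <-chan string {
	ch := make(chan string, buffer)
	b.mu.Lock()
	b.subs[subject] = append(b.subs[subject], ch)
	b.mu.Unlock()
	return ch
}

// Publish sends msg to every subscriber of subject and returns the number
// of successful deliveries. A subscriber with a full buffer is skipped so
// that a slow consumer never blocks the producer.
func (b *Broker) Publish(subject, msg string) int {
	b.mu.RLock()
	defer b.mu.RUnlock()
	delivered := 0
	for _, ch := range b.subs[subject] {
		select {
		case ch <- msg:
			delivered++
		default: // buffer full: drop the message for this subscriber
		}
	}
	return delivered
}

func main() {
	b := NewBroker()

	// Two independent services subscribe to the same subject.
	billing := b.Subscribe("orders.created", 8)
	audit := b.Subscribe("orders.created", 8)

	// The producer knows only the subject name, not the consumers.
	n := b.Publish("orders.created", "order 42")
	fmt.Println("deliveries:", n)
	fmt.Println("billing got:", <-billing)
	fmt.Println("audit got:", <-audit)
}
```

A networked broker adds what this sketch leaves out: delivery across processes and machines, reconnection, and, depending on configuration, persistence and acknowledgements.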
The Go Watermill library provides high-level messaging interfaces for event-driven applications. Watermill’s stated goal is to make communication with messages as easy as using HTTP routers. The Watermill repository includes online training for event-driven applications in Go.
The TinyGo compiler brings a subset of the Go programming language to microcontrollers and embedded systems, as needed for the Internet of Things (IoT). It also supports WebAssembly and the WebAssembly System Interface (WASI).
Words of caution. Go garbage collection imposes a performance penalty, which can be a problem for applications requiring microsecond real-time responsiveness. There may be a need to examine Go garbage collection: setting its parameters, triggering collections manually, or turning the collector off completely. Go software developers examine benchmarks and software profiles to identify performance bottlenecks and issues.
References for Stream Processing:
- Kleppmann, Martin. 2017. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. Sebastopol, CA: O’Reilly. [ISBN-13: 978-1-449-37332]
- Psaltis, Andrew G. 2017. Streaming Data: Understanding the Real-time Pipeline. Shelter Island, NY: Manning. [ISBN-13: 9781617292286]
- Stack, Michael. 2022. Event-Driven Architecture in Golang: Building Complex Systems with Asynchronicity and Eventual Consistency. Birmingham, UK: Packt. [ISBN-13: 978-1-80323-801-2]
- Theel, Tobias. 2021. Creative DIY Microcontroller Projects with TinyGo and WebAssembly: A Practical Guide to Building Embedded Applications for Low-Powered Devices, IoT, and Home Automation. Birmingham, UK: Packt. [ISBN-13: 978-1-8000560208]