
Book Review - Episode 1: Designing Data-Intensive Applications

Saeid Babaei

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

As a software engineer, I try to read the most valuable books from the industry and the community. I regularly read a wide range of books, from designing software to maintaining it. But I confess that sometimes I don’t like reading about data. I know that data is a crucial part of every piece of software, but we sometimes forget that designing an application is not just architecting the code structure and applying clean code and design patterns! As a result, projects sometimes fail even with an elegant architecture and design, simply because nobody paid attention to data management and the structure of the system’s storage.

Thinking about data is not so simple. There are many different requirements and options that need to be carefully understood before deciding how to store and retrieve the data. A relational mindset pushes most developers to think of data management as a set of tables, views, stored procedures, triggers, and so on. As a result, caring about schema correctness, keeping relations and foreign keys consistent, applying up migrations (and, in some bad scenarios, down migrations), and many other concerns can become the largest part of an application’s lifecycle.

I don’t want to exaggerate, but I can say that this book can be a turning point for every developer. You don’t need to be a data engineer or a data expert to read it. Martin Kleppmann covers a massive range of information, from very high-level design decisions to very detailed technical implementations. You will get very useful details that help you understand the reliability, scalability, and maintainability requirements of large-scale applications, and how to respond to those requirements through data management. Data management is not just storing and retrieving records in SQL Server or any other DBMS. A deep understanding of data models, encoding, distribution, transformation, large-scale processing, reporting, and many other principles, practices, and patterns is at the heart of designing well-structured and robust applications.

The book consists of three parts, and each part contains several chapters. The first part, Foundations of Data Systems, introduces how data is stored, retrieved, transformed, and encoded, along with data models and query languages. Part two, Distributed Data, gives a very good explanation of replication, partitioning, and transactions that is very detailed and rich. Sometimes you may lose yourself in the sea of technical details, but I think this is not a bad thing.

Part three is about Derived Data and processing data in many ways. I think this part is the most important, but I don’t know why it covers so few topics. I expected more details about processing and managing data in modern and appropriate ways. I think it would have been better if Martin had skipped some of the very technical details in the previous chapters and paid more attention to this part. Anyway, it doesn’t matter much, and I am thankful to Martin for his very valuable book. I suggest that all developers read this book.