OLTP vs OLAP
February 10, 2024
Overview
OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) are two different architectural designs for
relational databases that serve very different purposes. OLTP is designed for a large number of queries that touch less data.
Each query should take anywhere from single millisecond digits to several seconds (any longer would be pushing the purpose of the architecture).
OLAP is designed for fewer queries that touch more data. Each query could take minutes or even hours to complete. The workload of the former is
meant for an application where low latency is desired, while the latter is meant for analytics that is completed outside immediate
end-user interaction.
Table of Contents
Technical Differences
OLTP | OLAP | |
---|---|---|
Maximum Number of Records per User-Facing Table | Can be optimized for reading millions, or even billions of records. | Can be optimized for reading billions, or even trillions of records. |
Data Model | Entity-Relationship (ER) Model | Star (Denormalized), Snowflake (Combination of Normalized and Denormalized), Galaxy (Combination potentially using either Star or Snowflake) |
User Interaction Model | Synchronous or asynchronous through Web Service, application, or scheduled workflow. | Typically asynchronous through reporting tools. |
Consistency of Data in Replicated Nodes | Typically strong consistency, unless writes are distributed as well. | Strong consistency within a cluster. Weak consistency between the data that is imported into the OLAP database. |
Use Cases | Transactional systems (e.g. e-commerce, banking, or order management). | Data warehousing, business intelligence, and analytical reporting. |
Deeper Dive
OLTP
OLTP databases schemas can be designed in two ways:
- Normalized: These are schemas where each entity belongs to one table and tables are joined by related columns (e.g.
Orders
table andCustomers
table joined by theCustomerId
column). - Denormalized: These are schemas where multiple entities live in the same table (e.g. a single
Orders
table where all data that would be in theCustomers
table is included in theOrders
table with additional columns to support theCustomers
data).
When you create a Denormalized table, you're removing the need to make JOIN
s. The less JOIN
s in a query, the faster it runs because
its Query Plan requires fewer
operations to run.
Scaling
OLTP databases cannot handle as many records as OLAP databases -- it's not what they were designed to do. Features like PARTITION
do help, but
after a certain size, one of two changes must take place:
- Older data must be moved to a less accessible storage medium.
- A complete schema redesign is required.
OLAP
OLAP databases require managing and querying large volumes of data efficiently. These schemas can be custom created, but it's paramount to design them in a way to maximize the time complexity of all operations. A safe way to design OLAP database schemas to assure high performance in a wide array of queries is to adhere to the following schema designs:
Conclusion
The choice between OTLP and OLAP are very distinct and far apart. However, the choices available within each of these two architectures are many and nuanced. This blog post provides high-level guidance to navigate these choices. One option not discussed here is a hybrid: HTAP.
Good luck and happy building!
Other Links
Updated: 2024-02-11