OLTP vs OLAP
OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) are two different architectural designs for relational databases that serve very different purposes. OLTP is designed for a large number of queries that touch less data. Each query should take anywhere from single millisecond digits to several seconds (any longer would be pushing the purpose of the architecture). OLAP is designed for fewer queries that touch more data. Each query could take minutes or even hours to complete. The workload of the former is meant for an application where low latency is desired, while the latter is meant for analytics that is completed outside immediate
Table of Contents
|Maximum Number of Records per User-Facing Table
|Can be optimized for reading millions, or even billions of records.
|Can be optimized for reading billions, or even trillions of records.
|Entity-Relationship (ER) Model
|Star (Denormalized), Snowflake (Combination of Normalized and Denormalized), Galaxy (Combination potentially using either Star or Snowflake)
|User Interaction Model
|Synchronous or asynchronous through Web Service, application, or scheduled workflow.
|Typically asynchronous through reporting tools.
|Consistency of Data in Replicated Nodes
|Typically strong consistency, unless writes are distributed as well.
|Strong consistency within a cluster. Weak consistency between the data that is imported into the OLAP database.
|Transactional systems (e.g. e-commerce, banking, or order management).
|Data warehousing, business intelligence, and analytical reporting.
OLTP databases schemas can be designed in two ways:
- Normalized: These are schemas where each entity belongs to one table and tables are joined by related columns (e.g.
Customerstable joined by the
- Denormalized: These are schemas where multiple entities live in the same table (e.g. a single
Orderstable where all data that would be in the
Customerstable is included in the
Orderstable with additional columns to support the
When you create a Denormalized table, you’re removing the need to make
JOINs. The less
JOINs in a query, the faster it runs because its Query Plan requires fewer operations to run.
OLTP databases cannot handle as many records as OLAP databases — it’s not what they were designed to do. Features like
PARTITION do help, but after a certain size, one of two changes must take place:
- Older data must be moved to a less accessible storage medium.
- A complete schema redesign is required.
OLAP databases require managing and querying large volumes of data efficiently. These schemas can be custom created, but it’s paramount to design them in a way to maximize the time complexity of all operations. A safe way to design OLAP database schemas to assure high performance in a wide array of queries is to adhere to the following schema designs:
The choice between OTLP and OLAP are very distinct and far apart. However, the choices available within each of these two architectures are many and nuanced. This blog post provides high-level guidance to navigate these choices. One option not discussed here is a hybrid: HTAP.
Good luck and happy building!
Sam Malayek works in Vancouver, using this space to fill in a few gaps. Opinions are his own.