Tag: OLAP
Which Is the Best Technique for Separating Cold Data from Hot Data?
As a business grows, the database or data warehouse behind an OLAP application stores more and more data and its workload increases, which slows response times. Scaling up or scaling out the database is not enough to solve the problem: both are expensive, and neither can push performance much further once the database reaches its capacity limit.
Separating cold data from hot data is a better solution. The small amount of frequently accessed hot data and the large amount of seldom-accessed cold data are stored and computed separately, which reduces the database workload and shortens query response times.
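As a rough illustration of the idea, the sketch below splits rows into a hot set and a cold set by age and only touches the cold set when a query reaches back past the cutoff. The 30-day window, the `day` and `amount` field names, and both function names are assumptions made for this example, not part of any particular product.

```python
from datetime import date, timedelta

# Hypothetical cutoff: rows newer than this many days count as "hot".
HOT_WINDOW_DAYS = 30


def split_hot_cold(rows, today, hot_window_days=HOT_WINDOW_DAYS):
    """Partition rows into (hot, cold) lists by their 'day' field."""
    cutoff = today - timedelta(days=hot_window_days)
    hot = [r for r in rows if r["day"] >= cutoff]
    cold = [r for r in rows if r["day"] < cutoff]
    return hot, cold


def total_amount(hot, cold, start, end, today, hot_window_days=HOT_WINDOW_DAYS):
    """Sum 'amount' over [start, end], scanning the cold set only if needed."""
    cutoff = today - timedelta(days=hot_window_days)
    stores = [hot]
    if start < cutoff:  # the query reaches back into cold data
        stores.append(cold)
    return sum(
        r["amount"] for store in stores for r in store if start <= r["day"] <= end
    )
```

In a real deployment the two lists would be separate storage tiers (for example a database for hot data and files for cold data); the routing decision is the part this sketch is meant to show.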
Is OLAP Pre-Aggregation for Speeding up Multidimensional Analysis Reliable?
Multidimensional analysis usually involves large data sets, and summarizing the original detail data ad hoc for each business analysis is inefficient. So pre-aggregation is used to speed up queries: the desired result sets are prepared in advance and retrieved directly, which gives real-time responses for interactive analysis.
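A minimal sketch of the technique, assuming in-memory rows and a single summed measure: every subset of the dimensions is summarized once up front, and a query then becomes a dictionary lookup instead of a scan. The function names `build_cuboids` and `lookup` and the data layout are illustrative assumptions only.

```python
from collections import defaultdict
from itertools import combinations


def build_cuboids(rows, dims, measure):
    """Pre-sum `measure` for every subset of `dims` (2**len(dims) cuboids)."""
    cuboids = {}
    for k in range(len(dims) + 1):
        for group in combinations(dims, k):
            sums = defaultdict(float)
            for r in rows:
                sums[tuple(r[d] for d in group)] += r[measure]
            cuboids[group] = dict(sums)
    return cuboids


def lookup(cuboids, dims, **filters):
    """Answer a sum query from the pre-built cuboid matching the filtered dimensions."""
    group = tuple(d for d in dims if d in filters)
    key = tuple(filters[d] for d in group)
    return cuboids[group].get(key, 0.0)
```

For example, with `dims = ("region", "product")`, calling `lookup(cuboids, dims, region="east")` reads the pre-summed total for that region without touching the detail rows. The number of cuboids doubles with each added dimension, which is the cost side of the trade-off.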
About pre-aggregation strategies
Pre-aggregation achieves very short response times at the cost of heavy disk consumption for pre-summarized data. In theory, we can pre-summarize all combinations of dimensions so that a multidimensional analysis based on any combination can be served directly. In practice, this is hardly feasible. A full pre-aggregation over 50 dimensions takes up about 1 million TB of storage (taking each intermediate cube as 1 KB, although real ones are much bigger), which amounts to one million 1 TB hard disks. Even if we reduce the pre-aggregation from 50 dimensions to 20, the storage needed is still expected to be about 470,000 TB, equivalent to hundreds of thousands of 1 TB hard disks.
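The full 50-dimension figure can be checked with a few lines of arithmetic: n dimensions give 2^n dimension combinations, and at the assumed 1 KB per cube that is 2^50 × 2^10 bytes. The sketch below covers only this full case; the helper name and the binary units (1 KB = 1024 bytes, 1 TB = 2^40 bytes) are assumptions of the example, and it does not attempt to reproduce the 20-dimension figure, whose counting method is not spelled out here.

```python
def full_preagg_terabytes(num_dims, cuboid_bytes=1024):
    """Storage in TB for pre-aggregating every combination of num_dims dimensions."""
    cuboids = 2 ** num_dims               # one cuboid per subset of the dimensions
    return cuboids * cuboid_bytes / 2 ** 40  # bytes -> TB (binary units)


tb_needed = full_preagg_terabytes(50)     # 2**20 TB, roughly one million TB
```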
The Open-source SPL Redefines OLAP Server
OLAP, short for Online Analytical Processing, queries and computes data and returns results in real time. It covers report viewing, data query, multidimensional analysis and all the other data computation tasks in daily business analytics that require results on the spot. An OLAP Server is the product that meets these requirements.
Status of OLAP Server
At present, almost all mainstream OLAP Servers are big data platforms that are based on RDBs or encapsulated as RDBs, much like the early ROLAP (a term seldom mentioned these days). One of their key features is the use of SQL as the query language.