Category: 5 SPL Subject & Routine

Routine for real-time data updating
Background and method: This routine applies to scenarios similar to those of “Routine for real-time data appending”, except that the data needs to be updated. It is applicable to the following scenarios: the real-time requirement for data maintenance is very high, the update cycle is short, and data may be updated at any time; the data need to be stored in layers in multiple zone tables (hereinafter referred to as ‘table’ unless otherwise specified) of a multi-zone composite table; only the update mode is supported. Key differences from append routine: Definitions and concepts

Routine for real-time data appending
Routine for real-time data appending (zone table). Background and method: This routine is applicable to the following scenarios: the real-time requirement for data maintenance is very high, the append cycle is short, and data may be appended at any time; the data need to be stored in layers in multiple zone tables (hereinafter referred to as ‘table’ unless otherwise specified) of a multi-zone composite table; only the append mode is supported, and the amount of data appended each time is relatively small and can be stored in a table sequence. Method: 1. In order to meet the requirement

Routine for regular maintenance of multi-zone composite table
Background and method: This routine is applicable to the following scenarios: data maintenance has no real-time requirement and can be performed regularly at a fixed interval (usually hours or days); the total data volume is very large and needs to be split and stored in multiple zone tables (hereinafter referred to as ‘table’ unless otherwise specified); two modes, append and update, are supported; the amount of data maintained each time may be large and may be passed in as a cursor. Methods: Append mode: the incoming data is required to be ordered by the time field. Maintenance steps: i) split

Routine for regular maintenance of single composite table
Background and method: This routine is applicable to the following scenarios: data maintenance has no real-time requirement and can be performed regularly at a fixed interval (usually hours or days); the total data volume is small enough to be stored in a single composite table; two modes, append and update, are supported; the amount of data maintained each time may be large and may be passed in as a cursor. Method: use a current composite table for query, and merge the received new data with the current composite table to generate a backup composite table. After

Column-wise computing of SPL
In-memory column-wise computing. What is columnar storage? The in-memory table sequence generally uses row-based storage. For example, the employee table contains three fields, ‘id, name and birthday’, which are stored in memory roughly as follows: each row (i.e., each record) is stored as an Object array with three member objects: [Integer,String,Date]. In general, each column (field) contains data of the same type. Under this premise, SPL can store data by column. For example, if the data in the id column are all integers, they can be stored as an int array; if the data in the name column
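The row-based versus column-based layouts described in this excerpt can be sketched in Python. This is only an illustration of the idea, not SPL's internal implementation: the sample employee values and the helper names `max_id_rows` and `max_id_cols` are hypothetical.

```python
from array import array
from datetime import date

# Row-based layout: each record is one object holding [id, name, birthday].
# The sample values below are made up for illustration only.
rows = [
    (1, "Alice", date(1990, 1, 15)),
    (2, "Bob", date(1985, 6, 30)),
    (3, "Carol", date(1992, 11, 2)),
]

# Column-based layout: one container per field.
# Because every id is an integer, the id column fits in a typed int array.
ids = array("i", (r[0] for r in rows))
names = [r[1] for r in rows]
birthdays = [r[2] for r in rows]


def max_id_rows(records):
    """Scan whole records even though only the id field is needed."""
    return max(r[0] for r in records)


def max_id_cols(id_column):
    """Scan only the compact, single-typed id column."""
    return max(id_column)
```

Both functions return the same result; the column version touches only the data of the one field it needs.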

SPL time key
What is a time key? While relatively stable, the data in a dimension table may still change. For example, the city where a certain customer is located changed from New York to Chicago on May 15, 2020. When associating the order table with the customer table, orders before this date should be associated with the old customer record (that is, the city should still be New York), while orders on and after this date should be associated with the new customer record (that is, the city should be Chicago). In other words, we need to find the correct customer record
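The lookup described in this excerpt can be sketched in Python as a search over versioned dimension records. This is a minimal illustration of the concept, not how SPL implements time keys: the customer ID `C001`, the first effective date, and the function name `city_on` are assumptions made for the example; only the New York to Chicago change on May 15, 2020 comes from the text.

```python
from bisect import bisect_right
from datetime import date

# Versioned customer records, sorted by effective date.
# "C001" and the 2019-01-01 start date are hypothetical.
versions = {
    "C001": [
        (date(2019, 1, 1), "New York"),
        (date(2020, 5, 15), "Chicago"),
    ],
}


def city_on(customer_id, order_date):
    """Return the city from the latest version effective on or before order_date."""
    history = versions[customer_id]
    dates = [effective for effective, _ in history]
    # Index of the last version whose effective date is <= order_date.
    i = bisect_right(dates, order_date) - 1
    if i < 0:
        return None  # the order predates every known version
    return history[i][1]
```

An order dated May 14, 2020 resolves to the New York record, while an order dated May 15, 2020 or later resolves to the Chicago record.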

Routine for regular and active update of small amounts of data
The composite table is an important file storage format of SPL, yet a composite table file does not support simultaneous read and write operations, and it often requires data to be stored in order to ensure high performance. In practice, however, data is not static and needs to be continuously appended or modified, and the order of newly generated data often differs from that required by the composite table. In this case, how to maintain the data of the composite table while keeping the data in order and without affecting ongoing queries becomes a problem we have to face. This

New association calculation methods of SPL
“Association calculation in SPL – In-memory join” presents the classification of association calculations in SPL and the programming methods for in-memory join. “Association calculation in SPL – external storage join” presents the programming methods for external storage join. This article will continue to present new association calculation methods of SPL, including the fjoin function and composite table cursor association & filtering mechanism for foreign key join, as well as the pjoin and new/news functions for primary key join. When used in appropriate scenarios, these new methods can achieve better performance than those introduced in the previous two articles. However, the

Association calculation in SPL – external storage join
The previous article “Association calculation in SPL – In-memory join” (In-memory join for short) presents the classification of association calculations in SPL and the programming methods for in-memory join. When one or more association tables have a large amount of data and need to be stored in external storage, the in-memory join algorithms cannot be used. For this reason, SPL specifically provides external storage join algorithms. Solving external storage join problems is similar to solving in-memory join problems: 1. Clearly distinguish the type of join, and find the (logical) primary key participating in association; 2. Choose different SPL functions to