Category: SPL overview and comparison

Python vs. SPL 12 – Big Data Processing

In data analysis, we often encounter data that is too big to fit in memory and has to be processed on disk. In this article, we’ll compare the computing abilities of Python and SPL on data of this scale. Even larger volumes, such as petabytes, require a distributed system, which is beyond the scope of this article.
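
As a rough illustration of processing data that does not fit in memory, the sketch below reads a CSV in fixed-size chunks with pandas and merges the partial sums. The column names (`state`, `amount`), the helper name `total_by_key`, and the tiny in-memory "file" are assumptions made for the example, not taken from the article.

```python
import io

import pandas as pd


def total_by_key(source, key, value, chunksize):
    """Sum `value` per `key` while holding only one chunk in memory at a time."""
    totals = None
    for chunk in pd.read_csv(source, chunksize=chunksize):
        part = chunk.groupby(key)[value].sum()
        # Merge this chunk's partial sums into the running totals;
        # keys missing on either side are treated as 0.
        totals = part if totals is None else totals.add(part, fill_value=0)
    return totals


# Hypothetical data standing in for a file too large to load at once.
csv_text = "state,amount\nNY,10\nCA,5\nNY,7\nTX,3\nCA,1\n"
result = total_by_key(io.StringIO(csv_text), "state", "amount", chunksize=2)
print(result)
```

With a real file, the `io.StringIO` object would be replaced by a file path, and the chunk size would be chosen to suit the available memory.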

continue reading →

Python vs. SPL 11 – Many-to-One Association

In Python vs. SPL 10 – One-to-N Association, we introduced one-to-one and one-to-N association. This article compares the computational abilities of Python and SPL in many-to-one association.

continue reading →

Python vs. SPL 10 – One-to-N Association

In data analysis, we often need to associate two or more tables. Such associations fall into four categories: one-to-one, one-to-many, many-to-one, and many-to-many. A one-to-one association means that one record of a table corresponds to exactly one record of another table; a one-to-many association means that one record of a table corresponds to multiple records of another table. This article compares the computational abilities of Python and SPL in one-to-N association.
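
A one-to-many association can be sketched in pandas with `merge`. The two tables and their column names are made up for illustration; the `validate="one_to_many"` argument asks pandas to raise an error if the key is not unique on the left side.

```python
import pandas as pd

# Hypothetical tables: each department has several employees.
departments = pd.DataFrame({"dept_id": [1, 2], "dept_name": ["Sales", "R&D"]})
employees = pd.DataFrame({"emp_name": ["Ann", "Bob", "Cid"], "dept_id": [1, 1, 2]})

# One department record matches many employee records.
joined = departments.merge(
    employees, on="dept_id", how="left", validate="one_to_many"
)
print(joined)
```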

continue reading →

Python vs. SPL 9 – Inverse Grouping & Transpose

Aggregating the result of a grouping operation usually yields a set smaller than the original one. Inverse grouping is the reverse operation: it uses a relatively small data table to generate a bigger data table according to certain rules.
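
One way to picture inverse grouping is to expand a summary table into detail rows. The sketch below assumes a hypothetical loan table and an equal-installment rule; neither comes from the article.

```python
import pandas as pd

# Hypothetical summary table: one row per loan.
loans = pd.DataFrame(
    {"loan_id": ["A", "B"], "amount": [300.0, 200.0], "terms": [3, 2]}
)

# Repeat each row once per term, producing a bigger detail table.
detail = loans.loc[loans.index.repeat(loans["terms"])].reset_index(drop=True)
detail["term_no"] = detail.groupby("loan_id").cumcount() + 1
detail["payment"] = detail["amount"] / detail["terms"]
print(detail[["loan_id", "term_no", "payment"]])
```

Grouping `detail` by `loan_id` and summing `payment` gives back the original amounts, which is why the expansion can be seen as the reverse of grouping and aggregation.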

continue reading →

Python vs. SPL 8 – Ordered Grouping

We are naturally interested in order-related operations, and grouping operations may also involve order. This article compares the computing abilities of Python and SPL in ordered grouping.

continue reading →

Python vs. SPL 7 – Alignment Grouping & Enumeration Grouping

A grouping operation usually refers to equivalence grouping, which has the following features: 1) Every member of the original set belongs to exactly one group; 2) No group is an empty set;

continue reading →

Python vs. SPL 6 – Equivalence Grouping

When there are too many things to handle at once, we usually classify them into groups and then aggregate each group. For example, finding the highest score of each class or the average age of employees in each department are grouping operations followed by aggregation. The most common grouping puts members with the same attribute value into one group, which is known as equivalence grouping. In this article, we’ll compare the computing abilities of Python and SPL in equivalence grouping and aggregation.
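
The "highest score of each class" example can be sketched in pandas as a `groupby` followed by an aggregation. The table and its values are invented for illustration.

```python
import pandas as pd

# Hypothetical score table.
scores = pd.DataFrame(
    {
        "class": ["A", "A", "B", "B"],
        "student": ["Ann", "Bob", "Cid", "Dee"],
        "score": [88, 92, 75, 81],
    }
)

# Equivalence grouping: rows with the same "class" value form one group,
# and each group is then reduced to its maximum score.
highest = scores.groupby("class")["score"].max()
print(highest)
```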

continue reading →

Python vs. SPL 5 – Order-related Operation

We are naturally interested in order-related operations, such as calculating the growth rate over the previous period or year over year (YOY). This article focuses on comparing the order-related operations of Python and SPL.
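
A period-over-period growth rate depends on the order of the rows. As a minimal pandas sketch with made-up monthly sales figures:

```python
import pandas as pd

# Hypothetical sales for three consecutive periods, already in time order.
sales = pd.Series([100.0, 110.0, 121.0], index=["Jan", "Feb", "Mar"])

# Growth relative to the previous period; the first period has no
# predecessor, so its value is NaN.
growth = sales.pct_change()
print(growth)
```

For monthly data covering several years, `sales.pct_change(periods=12)` would compare each month with the same month of the previous year, assuming no months are missing.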

continue reading →

Python vs. SPL 4 – Selection and Positioning

Selecting a subset from a set is a very common operation, for example, selecting the members who are over 40 years old from all the members of a company. In this article, we’ll compare the selection operations of Python and SPL.
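
The "over 40" selection can be sketched in pandas with a boolean mask. The member table is invented for illustration; the second expression shows positioning, that is, finding where the matching members sit rather than fetching them.

```python
import pandas as pd

# Hypothetical member table.
members = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cid", "Dee"], "age": [35, 42, 51, 40]}
)

# Selection: keep the rows whose age is greater than 40.
over_40 = members[members["age"] > 40]

# Positioning: the row positions of the matching members.
positions = [i for i, flag in enumerate(members["age"] > 40) if flag]

print(over_40)
print(positions)
```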

continue reading →

Python vs. SPL 3 – Loop Function

Functions that traverse a set and perform a calculation on every member to produce a new result are generally called loop functions. Python’s native “list” has too few loop functions, and slightly complex loops require the “for” statement, so instead of covering them, this article focuses on comparing the loop functions of Pandas and SPL.
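
To make the idea of a loop function concrete, the sketch below applies the same calculation to every member, first with a plain Python list and then with a pandas Series. The data and the doubling rule are arbitrary choices for the example.

```python
import pandas as pd

values = [10, 20, 30]

# Native Python: a list comprehension visits every member explicitly.
doubled_list = [v * 2 for v in values]

# pandas: Series.apply calls the function on every member and
# returns a new Series of the results.
doubled_series = pd.Series(values).apply(lambda v: v * 2)

print(doubled_list)
print(doubled_series.tolist())
```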

continue reading →