GoodData XAE: The BI Game-Changer (3rd part)

For the previous part (2/3), continue here.


Discovering the value you can’t see.

Creating a query language is the hardest problem to solve in BI. The hard part is not storing big data, processing it, drawing graphs, or building an API for working with clients. You cannot buy a query language, nor can you program one in a month.

If the query language gets too complicated, the customer won’t manage to work with it. If it gets too simplistic, the customer won’t be able to work with it the way they need to. GoodData has a simple language that can express arbitrarily complicated questions about the data. At the same time, it has a mechanism that applies the language to any BI project (that is, to any logical data model), however complicated. As I have already mentioned, MAQL/AQE is, in my view, the one irreplaceable part of GoodData. Furthermore, the guys from Prague and Brno (Tomáš Janoušek, David Kubečka and Tomáš Jirotka) have extended AQE with a set of mathematical proofs (complicated algebra) that allow quick tests of whether new functions in AQE work on any type of logical model. That’s how GoodData makes sure that the translations from MAQL metrics to SQL in the underlying databases are correct. AQE thus helps an ordinary user to cross the chasm that separates them from low-level scripting.

UPDATE 17. 11. 2013: MAQL is a query language that the MAQL interpreter (formerly known as QT, the “Query Tree” engine) translates into a tree of queries based on the logical data model (LDM). These queries are actually definitions of “star joins”, from which the “Star Join Generator” (SJG) creates SQL queries for the DB backend according to the physical data model (PDM, which lies below the LDM). The whole thing was originally created by Michal Dovrtěl and Hynek Vychodil. The new implementation of AQE then put all of this onto a solid mathematical basis of ROLAP algebra (which is similar to relational algebra).

After weeks of persuasion and, yes, bribes, I managed to obtain lightly censored examples of the queries that AQE generates from metrics I wrote for this purpose. I believe this is the first time anyone has actually published them.

For comparison, I used the data model from the Report Master course at Keboola Academy and built this report from it:

The right Y-axis of the graph shows how many contracts I closed in Afghanistan, Albania, Algeria and American Samoa in the last 13 months. On the left Y-axis, the blue line shows the average revenue my salespeople brought in, and the green line shows the median sales in a given month (the inputs are de facto identical to the table from Part 1 of this series).

The graph then shows me three metrics (as per the legend below the graph):

  • “# IDs” = SELECT COUNT(ID (Orders)) – counts the number of items (order IDs).

  • “Avg Employee” = SELECT AVG(revenue per employee) – computes the mean of the auxiliary metric, which sums revenue per salesperson.

  • “Median Employee” = SELECT MEDIAN(revenue per employee) – computes the median of the auxiliary metric, which sums revenue per salesperson.

and the auxiliary metric:

  • “revenue per employee” = SELECT SUM(totalPrice (Orders)) BY employeeName (Employees) – sums the value of the items (sales) at the salesperson level.

For the most part, everything explains itself, except perhaps “BY”, which states that the money in “totalPrice (Orders)” is summed per salesperson rather than all lumped together. I dare say that anyone who is willing to try MAQL even a little will learn it (or, for that matter, we can teach it to you at Keboola Academy any time ☺).
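To make the role of “BY” concrete, here is a minimal Python sketch of what the three metrics compute. It is an illustration only, not how AQE works internally: the order rows are made-up sample data, the function names are hypothetical, and the monthly slicing used in the report is left out for brevity.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical sample rows: (order ID, employeeName, totalPrice).
ORDERS = [
    (1, "Alice", 100.0),
    (2, "Alice", 300.0),
    (3, "Bob", 50.0),
    (4, "Carol", 250.0),
    (5, "Carol", 150.0),
    (6, "Carol", 100.0),
]


def revenue_per_employee(orders):
    """Mimics: SELECT SUM(totalPrice (Orders)) BY employeeName (Employees)."""
    totals = defaultdict(float)
    for _order_id, employee, total_price in orders:
        totals[employee] += total_price  # "BY": one running sum per salesperson
    return dict(totals)


def report_metrics(orders):
    """Computes the three report metrics on top of the auxiliary metric."""
    per_employee = revenue_per_employee(orders)
    return {
        "# IDs": len({order_id for order_id, _, _ in orders}),
        "Avg Employee": mean(per_employee.values()),
        "Median Employee": median(per_employee.values()),
    }


if __name__ == "__main__":
    for name, value in report_metrics(ORDERS).items():
        print(f"{name}: {value}")
```

The mean and the median are taken over one value per salesperson (the output of the auxiliary metric), not over individual orders. That is exactly the difference “BY” makes.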

And now the most important thing: see below how AQE translates these metrics into SQL:

With a little exaggeration, we can say that building my report is actually quite difficult, but thanks to AQE that does not bother me at all.


If these three hypotheses are valid:

  1. If GoodData won’t earn me a bunch of money, I won’t use it.
  2. I will earn a bunch of money, but only by the use of a BI project created to suit MY exact needs.
  3. BI that is created to suit MY exact needs is a complex matter that we can only manage with AQE.

… then the basis of the success of GoodData is AQE.

A footnote: the MAQL metrics mentioned above are simple examples. Sometimes a metric has to be so complicated that it is almost impossible to imagine what must happen to the underlying data. Here is an example of a metric from a project where the analytics is built on unstructured text. The metric counts conversation topics in the current time period, by moderator:

Lukáš Křečan once blogged (CZ) that people are the greatest competitive advantage of GoodData.

Translation: “Our biggest competitive advantage is not a unique technology that no one else has. The main thing is people.”

People are the foundation. We cannot do this without them; they create the one-of-a-kind atmosphere in which unique things are born. However, each individual is replaceable. The biggest competitive advantage of GoodData (as well as its intellectual property) is AQE. If we didn’t have it, users would have to click their reports together in a closed UI that would take away the essential flexibility. Without AQE, GoodData would be in the same category as Tableau, Bime, Birst and others. It would become basically uninteresting, and it would have to compete hard with firms that build their own UI over “Redshifts”.

AQE is a unique opportunity to get ahead of competitors, who can then only fall further behind. No one else is able to implement a new function in a product that handles arbitrary data in arbitrary dimensions while analytically proving and testing the validity of the implementation.

The line between the false impression that “this cool dashboard is very beneficial for me” and the real potential that you can dig out of the data is very thin, and its name is customization. It means an arbitrary model for arbitrary data, with arbitrary calculations over it. That may sound extreme. However, without the ability to compute the natural logarithm of a ratio of figures from two time periods across many dimensions, you cannot become a star in the world of analytics. AQE is a game changer in the field of BI, and only thanks to it does GoodData redefine the rules of the game. Today a general root, tomorrow K-means… ☺

Howgh!

2 responses
Let me ask you: what's the difference between MDX in Microsoft's Analysis Services and GoodData's MAQL?
@Marc: MDX has the same level of abstraction as SQL. In both cases, you have to express exactly what should be computed and how. It’s mainly about defining the dimensions in which you want to compute. MAQL, on the other hand, is a much more abstract and declarative language. MAQL and SQL/MDX are two different species, much like Prolog and C are different species. In SQL/MDX, you have to declare where to get the data (the source table) and how to join it with other data. In MAQL, you simply define (declare) the requested result and AQE does the rest for you. For example, if I write in MAQL “SELECT SUM(Sales) / (SELECT SUM(Sales) BY Year)”, I get the share of sales in annual sales. And then the most fantastic thing: if I use this metric in a report with months, I get the share of monthly sales in annual sales. If I use this metric in a report with products and quarters, I get the share of each quarter’s sales in the annual sales of the specific product. And so on, with any filters and any data dimensions. This “share metric” is the simplest example that shows MAQL’s power. Metrics can be much, much more complicated. Just for fun, try to replicate this metric in Microsoft MDX, leaving aside the fact that it also has to work in different reports with different dimensions and complexity.
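The behaviour of the “share metric” described in this reply can be imitated in a few lines of Python. This is only a sketch under stated assumptions, not AQE's actual algorithm: the rows are made-up sample data, `share_of_annual` and `TIME_DIMS` are hypothetical names, and the report is assumed to contain a time attribute whose values each belong to a single year.

```python
from collections import defaultdict

# Hypothetical sample fact rows; each quarter value belongs to exactly one year.
SALES = [
    {"year": 2012, "quarter": "2012-Q1", "product": "A", "sales": 10.0},
    {"year": 2012, "quarter": "2012-Q2", "product": "A", "sales": 30.0},
    {"year": 2012, "quarter": "2012-Q1", "product": "B", "sales": 20.0},
    {"year": 2012, "quarter": "2012-Q2", "product": "B", "sales": 40.0},
]

# Assumption: these attributes belong to the date dimension.
TIME_DIMS = {"year", "quarter"}


def share_of_annual(rows, report_dims):
    """Imitates SUM(Sales) / (SELECT SUM(Sales) BY Year) for a given report.

    The numerator is summed per report cell. The denominator is summed per
    year, still sliced by the non-time attributes used in the report.
    """
    other_dims = [d for d in report_dims if d not in TIME_DIMS]
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    year_key_of_cell = {}
    for row in rows:
        cell = tuple(row[d] for d in report_dims)
        year_key = (row["year"],) + tuple(row[d] for d in other_dims)
        numerator[cell] += row["sales"]
        denominator[year_key] += row["sales"]
        year_key_of_cell[cell] = year_key
    return {
        cell: numerator[cell] / denominator[year_key_of_cell[cell]]
        for cell in numerator
    }


if __name__ == "__main__":
    # The same "metric" dropped into two different reports.
    print(share_of_annual(SALES, ["quarter"]))
    print(share_of_annual(SALES, ["quarter", "product"]))
```

The function is written once, and the report only changes the list of dimensions. In plain SQL, each of these reports would typically need its own hand-written query with different grouping and joins.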