For the previous part (2/3), continue here.
Discovering the value you can’t see.
Creating a query language is the hardest problem to solve in BI. The hard part is not storing big data, processing it, drawing graphs, or building an API for smooth cooperation with clients. You cannot buy a query language, nor can you program one in a month.
If the query language gets too complicated, the customer won’t manage to work with it. If it gets too simplistic, the customer won’t manage to work with it the way he needs to. GoodData has a simple language that can express arbitrarily complicated questions about the data. At the same time, it has a mechanism that lets it apply the language to any complicated BI project (or logical data model). As already mentioned, in my view MAQL/AQE is the one irreplaceable part of GoodData. Furthermore, the guys from Prague and Brno (Tomáš Janoušek, David Kubečka and Tomáš Jirotka) have extended AQE with a set of mathematical proofs (complicated algebra) that allow quick tests of whether new functions in AQE apply to any type of logical model. That is how GoodData makes sure that the translations between (MAQL) metrics and SQL in the underlying databases are correct. AQE then helps an ordinary user to cross the chasm that separates him from low-level scripting.
UPDATE 17. 11. 2013: MAQL is a query language that the MAQL interpreter (formerly known as QT, the “Query Tree” engine) translates into a tree of queries based on the logical data model (LDM). These queries are actually definitions of “star joins”, from which the “Star Generator” (SJG) creates its own SQL queries for the DB backend according to the physical data model (PDM, which lies below the LDM). The whole thing was originally created by Michal Dovrtěl and Hynek Vychodil. The new implementation of AQE then helped to put all of this onto a solid mathematical basis of ROLAP algebra (which is similar to relational algebra).
After weeks of persuading and, yes, bribes, I managed to obtain lightly censored examples of the queries that AQE creates out of metrics I wrote for this purpose. I guess this is the first time anyone has actually published this...
The right Y-axis of the graph shows how many contracts I have closed in Afghanistan, Albania, Algeria and American Samoa in the last 13 months. On the left Y-axis, the blue line shows how much income my salespeople brought in on average, and the green line shows the median sales in a given month (the inputs are de facto identical to the table from Part 1 of the previous blog posts).
The graph then shows me three metrics (as per the legend below the graph):
“# IDs” = SELECT COUNT(ID (Orders)) – counts the number of elements (orders).
“Avg Employee” = SELECT AVG(revenue per employee) – computes the mean of the auxiliary metric, which sums turnover per salesperson.
“Median Employee” = SELECT MEDIAN(revenue per employee) – computes the median of the auxiliary metric, which sums turnover per salesperson.
and the auxiliary metric:
“revenue per employee” = SELECT SUM(totalPrice (Orders)) BY employeeName (Employees) – sums the values of the elements (individual sales) at the level of the salesperson.
For the most part, everything is self-explanatory, except perhaps “BY”, which states that the money in “totalPrice (Orders)” is summed per salesperson rather than all together. I dare say that anyone who is willing to try MAQL even a little bit will learn it (or, for that matter, we can teach it to you at Keboola Academy any time ☺).
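To make the role of “BY” concrete, here is a minimal Python sketch of what the three metrics compute. The order rows and employee names are made up for illustration, and the helper `revenue_per_employee` is hypothetical; this mimics the arithmetic only, not how GoodData evaluates MAQL.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical order rows: (order ID, employee name, total price).
orders = [
    (1, "Alice", 100.0),
    (2, "Alice", 50.0),
    (3, "Bob", 30.0),
    (4, "Carol", 200.0),
    (5, "Carol", 10.0),
]


def revenue_per_employee(rows):
    """Mimics SUM(totalPrice) BY employeeName: one total per salesperson."""
    totals = defaultdict(float)
    for _order_id, employee, total_price in rows:
        totals[employee] += total_price
    return dict(totals)


per_employee = revenue_per_employee(orders)

# "# IDs": number of order IDs (every ID in this sample is unique).
id_count = len({order_id for order_id, _, _ in orders})

# "Avg Employee" and "Median Employee" aggregate the per-salesperson
# totals, not the individual orders. That is the effect of "BY".
avg_employee = mean(per_employee.values())
median_employee = median(per_employee.values())

print(id_count, avg_employee, median_employee)
```

Without the “BY” step, the average would be taken over the five individual orders (78.0) instead of over the three per-salesperson totals (130.0), which is a different question.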
And now the most important thing... see how AQE translates this into the following SQL:
With a little exaggeration, we can say that creating my report is actually quite difficult, but thanks to AQE it does not bother me at all.
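To give a feel for why even a simple metric turns into nested SQL, here is a hand-written sketch for “Avg Employee” using SQLite from Python. The schema (tables `employees` and `orders` and their columns) and the function name are assumptions made up for this illustration; this is not the SQL that AQE generates, only the general shape of an aggregate computed over another aggregate.

```python
import sqlite3

# Hand-written query, NOT AQE output: the inner query sums revenue per
# salesperson, the outer query averages those per-salesperson sums.
AVG_EMPLOYEE_SQL = """
SELECT AVG(per_employee.revenue)
FROM (
    SELECT e.employee_name, SUM(o.total_price) AS revenue
    FROM orders AS o
    JOIN employees AS e ON e.id = o.employee_id
    GROUP BY e.employee_name
) AS per_employee
"""


def avg_revenue_per_employee(employees, orders):
    """Load sample rows into an in-memory database and run the query.

    employees: iterable of (id, employee_name)
    orders:    iterable of (id, employee_id, total_price)
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical two-table schema, assumed for this sketch only.
        conn.execute(
            "CREATE TABLE employees (id INTEGER PRIMARY KEY, employee_name TEXT)"
        )
        conn.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
            "employee_id INTEGER, total_price REAL)"
        )
        conn.executemany("INSERT INTO employees VALUES (?, ?)", employees)
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
        (result,) = conn.execute(AVG_EMPLOYEE_SQL).fetchone()
        return result
    finally:
        conn.close()


sample_employees = [(1, "Alice"), (2, "Bob"), (3, "Carol")]
sample_orders = [
    (1, 1, 100.0),
    (2, 1, 50.0),
    (3, 2, 30.0),
    (4, 3, 200.0),
    (5, 3, 10.0),
]
result = avg_revenue_per_employee(sample_employees, sample_orders)
print(result)
```

Every additional dimension on the report (country, month) adds more joins and grouping columns to a query like this, which is the work the metric author never has to do by hand.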
If these three hypotheses are valid:
- If GoodData won’t earn me a bunch of money, I won’t use it.
- I will earn a bunch of money, but only by the use of a BI project created to suit MY exact needs.
- BI that is created to suit MY exact needs is a complex matter that we can only manage with AQE.
… then the basis of the success of GoodData is AQE.
A footnote: the aforementioned MAQL metrics are simple examples. Sometimes a metric has to be so complicated that it is almost impossible to imagine what must happen to the underlying data. Here is an example of a metric from a project where the analytics are built on unstructured text. The metric counts the conversation topics in the current time period, by moderator:
Lukáš Křečan once blogged (in Czech) that people are GoodData’s greatest competitive advantage.
Translation: “Our biggest competitive advantage is not a unique technology that no one else has. The main thing is people.”
People are the foundation. We cannot do this without them; they are the ones who create the unique atmosphere in which unique things are born. However, both are replaceable. The biggest competitive advantage of GoodData (as well as its intellectual property) is AQE. If we didn’t have it, users would have to click their reports together in a closed UI that would take away the essential flexibility. Without AQE, GoodData would rank alongside Tableau, Bime, Birst and others. It would become basically uninteresting, and it would have to compete hard with firms that build their own UI over “Redshifts”.
AQE is a unique opportunity to get ahead of competitors, who can then only fall behind. No one else is able to implement a new function in their product for arbitrary data in arbitrary dimensions while analytically proving and testing the validity of the implementation.
The line between the false image of “this cool dashboard is very beneficial for me” and the “real potential that you can dig out of the data” is very thin… its name is customization. It means an arbitrary model for arbitrary data and arbitrary calculations over it. It can be called an extreme. However, without the ability to compute the natural logarithm of the ratio of figures from two time periods across many dimensions, you cannot become a star in the world of analytics. AQE is a game changer in the field of BI, and only thanks to it does GoodData redefine the rules of the game. Today a general root, tomorrow K-means… ☺