While burning the midnight oil and rummaging through our internal systems, I came across the ZenDesk tickets that our data analysts are closing for one of our clients - H1 agency (part of GroupM).
For previous part (2/3), continue here
Discovering the value you can’t see.
Creating a query language is the most complicated task to be solved in BI. It’s not about storing big data, nor about processing it, nor about drawing graphs or building an API for smooth cooperation with our clients. You cannot buy a query language, nor can you program one in a month.
If the query language gets too complicated, the customer won’t manage to work with it. If the query language gets too simplistic, the customer won’t manage to work with it the way he needs to. GoodData has a simple language to express any complicated question about the data. At the same time, it has a device that helps it apply the language to any complicated BI project (or logical data model). As already mentioned, in GoodData’s case it is MAQL/AQE that, from my point of view, is irreplaceable. Furthermore, the guys from Prague and Brno – Tomáš Janoušek, David Kubečka and Tomáš Jirotka – have extended AQE with a set of mathematical proofs (complicated algebra) that allow quick tests of whether new functions in AQE apply to any type of logical model. That’s how GoodData makes sure that the translations between (MAQL) metrics and the SQL in the underlying databases are correct. AQE then helps an ordinary user to cross the chasm that separates him from low-level scripting.
UPDATE 17. 11. 2013: MAQL is a query language that the MAQL interpreter (formerly known as QT – the “Query Tree” engine) translates into a tree of queries on the basis of the logical data model (LDM). These queries are actually definitions of “star joins”, from which the “Star Generator” (SJG) creates its own SQL queries in the DB backend according to the physical data model (PDM – which lies below the LDM). The whole thing was originally created by Michal Dovrtěl and Hynek Vychodil. The new implementation of AQE then helped put all of this onto a solid mathematical basis of ROLAP algebra (which is similar to relational algebra).
After weeks of persuading and yes, bribes, I managed to beg lightly censored examples of queries that AQE creates out of metrics I wrote for this purpose. I guess this is the first time anyone has actually published this....
The right Y-Axis on the graph shows me how many contracts I have done in Afghanistan, Albania, Algeria and American Samoa in the last 13 months. On the left Y-Axis, the blue line shows how much revenue my salespeople brought me on average, and the green line shows the median sales in a given month (the inputs are de facto identical to the table from Part 1 of the previous blog posts).
The graph then shows me three metrics (as per the legend below the graph):
“# IDs” = SELECT COUNT(ID (Orders)) – counts the number of items.
“Avg Employee” = SELECT AVG(revenue per employee) – computes the mean of the auxiliary metric, which sums the turnover per salesperson.
“Median Employee” = SELECT MEDIAN(revenue per employee) – computes the median of the auxiliary metric, which sums the turnover per salesperson.
and the auxiliary metric:
“revenue per employee” = SELECT SUM(totalPrice (Orders)) BY employeeName (Employees) – sums the value of the items (individual sales) at the level of the salesperson.
For the most part, everything explains itself – maybe except “BY”, which states that the money “totalPrice (Orders)” is summed per salesperson and not lumped together indiscriminately. I dare say that anyone who is willing to try MAQL even a little will learn it (or, for that matter, we can teach it to you at Keboola Academy any time ☺).
And now the most important thing... see below how AQE translates to the following SQL:
With a little bit of exaggeration we can say that creating my report is actually quite difficult but thanks to AQE, it does not bother me at all.
If these three hypotheses are valid:
… then the basis of the success of GoodData is AQE.
A footnote: the aforementioned MAQL metrics are simple examples. Sometimes it is necessary to build metrics so complicated that it’s almost impossible to imagine what must happen to the underlying data. Here is an example of a metric from one project where the analytics is built on unstructured text. The metric counts the conversation topics in the current period by moderator:
Lukáš Křečan once blogged (CZ) that people are the greatest competitive advantage of GoodData.
Translation: “Our biggest competitive advantage is not a unique technology that no one else has. The main thing is people. ”
People are the foundation. We cannot do this without them; it is they who create the unique atmosphere in which unique things are born. However, both are replaceable. The biggest competitive advantage of GoodData (as well as its intellectual property) is AQE. If we didn’t have it, the user would have to click reports together in a closed UI that would take away the essential flexibility. Without AQE, GoodData would rank alongside Tableau, Bime, Birst and others. It would become basically uninteresting and would have to compete hard with firms that build their own UI over “Redshifts”.
AQE is an unrepeatable opportunity to get ahead of the competitors, who then are only going to lose. No one else is able to implement their own new function into the product with arbitrary data in arbitrary dimensions while analytically proving and testing the validity of their implementation.
The line between the false image of “this cool dashboard is very beneficial for me” and the “real potential that you can dig out of the data” is very thin… its name is customization. It means an arbitrary model for arbitrary data and arbitrary calculations over it. It may be called extreme. However, without the ability to compute the natural logarithm of the ratio of figures from two time periods over many dimensions, you cannot become a star in the world of analytics. AQE is a game changer in the field of BI, and only thanks to it does GoodData redefine the rules of the game. Today a general root, tomorrow K-means… ☺
For previous part (1/3), continue here
An honest look at your data
Moving forward with our previous example; uploading all of the data sources we use internally (from one side of the pond to the other) into an LDM makes each piece of information easily accessible in GoodData - that’s 18 datasets and 4 date dimensions.
Over this model, we can now build dashboards in which we watch how effective we are, compare the months with one another, compare people, different kinds of jobs, look at the costs, profits and so on.
Therefore, everything in our dashboard suits our needs exactly. No one dictated to us how the program would work... this freedom is crucial for us. Thanks to it, we can build anything we want in GoodData – only our abilities decide whether we succeed and make the customer satisfied.
What’s a little bit tricky is that a dashboard like this can be built in anything. For now let’s focus on dashboards from KlipFolio. They are good, however they have one substantial “but” – all the visual components are objects that load information out of rigid, predefined datasets. Someone designed these datasets exactly for the needs of the dashboard, so they cannot be tweaked: you cannot take two numbers out of two base tables and watch their quotient over time. A month-to-date view of this quotient can be forgotten immediately, and don’t even think about situations with “many to many” linkages. The great advantage of these BI products (they call themselves BI but we know the truth) is that they are attractive and eager to please. However, one should not assume at the beginning that he has bought a diamond when in actuality it cannot do much more than his Excel. (Just ask any woman her thoughts on cubic zirconia and you’ll see the same result).
Why is the world flooded with products that play on a little playground with walls plastered with cool visuals? I don’t know. What I know is that people are sailing on the “Cool, BigData analytics!” wave and they are hungry for anything that looks just a little like a report. Theme analytics can be done in a few days – transformation of transactions and counting of “Customer lifetime value” is easy until everyone starts telling you their individual demands.
“No one in the world except GoodData has the ability to manage analytics projects that are 100% free in their basis (the data model) and to let people do anything they want in these projects without having to be “low-level” data analysts and/or programmers. Bang!”
So how does GoodData manage to do it?
Everyone is used to adding up an “A” column by inputting the formula “=SUM(A:A)”. In GoodData you add up the “A” column by inputting the formula “SELECT SUM(A)”. The language used to write all these formulas in GoodData is called MAQL – Multi-dimensional Analytical Query Language. It sounds terrifying but everyone was able to manage it – even Pavel Hacker has a Report Master diploma out of the Keboola Academy!
If you look back at the data model of our internal project, you might say that you want the average number of hours from one side of the data model, but you want it filtered by the type of operation, grouped by project description and client name, and you want to see only the operations that took place on this weekend’s afternoons. The whole metric will look like “SELECT AVG(hours_entries) WHERE name(task) = cleaning”. The multi-dimensionality of this language is hidden in the fact that you don’t have to deal with questions such as: what dimension is the name of the task in? What relation does it hold toward the number of worked hours? And furthermore – what relation does it hold towards the name of the client? GoodData (or the relations in the logical model that we design for our client) will solve everything for you.
So getting straight to the point: if I design a (denormalized) Excel table in which you find everything comfortably put together, no one who reads this will have trouble calculating it. If we give you data divided into dimensions (and the dimensions will often come from other data sources – just like the outputs from our Czech and Canadian accounting systems), it will be much more complicated to process (most likely you will start piling on SQL like a mad person). Since the world cannot be described in one table (or maybe it can – key-value pairs... but you cannot work with that very much), the view across many dimensions is essential. Without it, you are just doing some little home arithmetic ☺.
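To make the "you don’t have to deal with the relations" point more tangible, here is a rough Python sketch of what you would have to do by hand for that one metric. Everything in it is an assumption made up for illustration (the dataset names, the columns, the helper avg_hours); it is not how GoodData does it internally, only the kind of linkage-walking that the logical model saves you from.

```python
from statistics import mean

# Hypothetical datasets, invented for this illustration only.
tasks = {
    1: {"name": "cleaning", "project_id": 10},
    2: {"name": "reporting", "project_id": 10},
    3: {"name": "cleaning", "project_id": 20},
}
projects = {10: {"client": "Client A"}, 20: {"client": "Client B"}}
hours_entries = [
    {"task_id": 1, "hours": 2.0},
    {"task_id": 1, "hours": 4.0},
    {"task_id": 2, "hours": 8.0},
    {"task_id": 3, "hours": 6.0},
]


def avg_hours(task_name, by_client=False):
    """Average worked hours for one task name, optionally sliced by client."""
    groups = {}
    for entry in hours_entries:
        task = tasks[entry["task_id"]]  # walk the hours -> task linkage
        if task["name"] != task_name:
            continue
        client = projects[task["project_id"]]["client"]  # task -> project
        key = client if by_client else "all"
        groups.setdefault(key, []).append(entry["hours"])
    return {key: mean(values) for key, values in groups.items()}


overall = avg_hours("cleaning")
per_client = avg_hours("cleaning", by_client=True)
```

Note that every new slice (weekend afternoons, project description...) means another hand-written hop through the linkages, which is exactly the part the one-line metric hides.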
Do I still have your attention? Now is almost the time to say “wow” because if you like to dig around in data, you are probably over the moon about our described situation by now ☺.
And to the Finale...
Creating a query language is the most complicated task to be solved in BI. GoodData on the other hand, uses a simple, yet effective language to mitigate any of these “complications” and express the questions you have about your data. Part 3 of our series will dive deeper into this language, known as MAQL, and its ability to easily derive insights hidden in your data.
At the beginning of last summer, GoodData launched its new analytic engine AQE (Algebraic Query Engine). Its official product name is GoodData XAE. However, since I believe that XAE is Chinese for “underfed chicken”, I will stick with AQE ☺. Since the first moment I saw it, I considered it a concept with the biggest added value. When Michael showed me AQE I immediately fell in love.
However, before we can truly reveal AQE and the benefits that can be derived from it, we need to begin with an understanding of its position in the market - starting from the foundation on which GoodData’s platform rests. In a three-part series we’ll cover AQE’s impact on contextual data, delivering meaningful insights and finally digging for those hidden gems.
First, a bit more comprehensive introduction...
Any system with ambitions to visualize data needs some kind of mathematical device. For instance, if I take sold items with the names of salespeople as my input and my goal is to find the median of the salespersons’ turnover, somewhere in the background the sold items must first be summed per month (and per salesperson). Only after getting that result can we calculate the requested median. Notice the graphic below: the left table is the raw input and the right table is derived in the course of the process. Most of the time we don’t even realize that these intermediate outputs keep arising. From the right table, we can quickly calculate the best salesperson of the month, the average salesperson, the median and so on…
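The two steps described above can be sketched in a few lines of Python. The order rows and the function names are invented for illustration; the point is only that the median cannot be taken from the raw rows directly, and an intermediate table of totals per month and salesperson has to exist first.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical raw input: one row per sold item (salesperson, month, price).
orders = [
    ("Alice", "2013-10", 100.0),
    ("Alice", "2013-10", 50.0),
    ("Bob", "2013-10", 70.0),
    ("Carol", "2013-10", 400.0),
    ("Alice", "2013-11", 30.0),
    ("Bob", "2013-11", 90.0),
]


def revenue_per_salesperson(rows):
    """Step 1: build the intermediate table (month, salesperson) -> total."""
    totals = defaultdict(float)
    for person, month, price in rows:
        totals[(month, person)] += price
    return dict(totals)


def monthly_stats(rows):
    """Step 2: aggregate the intermediate totals per month."""
    per_month = defaultdict(list)
    for (month, _person), total in revenue_per_salesperson(rows).items():
        per_month[month].append(total)
    return {
        month: {"avg": mean(values), "median": median(values)}
        for month, values in per_month.items()
    }


stats = monthly_stats(orders)
```

For October, the intermediate totals are 150, 70 and 400, so the median salesperson made 150 even though no single sold item cost that much.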
And how does this stack up against the competition?
If we don’t have a robust analytic backend, we cannot have the freedom to do whatever we want to. We have to tie our users to some already prepared “vertical analysis“ (churn analysis of the e-shop’s customers, RFM segmentation, cohorts of subscriptions, etc…). Fiddling with the data is possible in many ways. Besides GoodData, you can find tools such as Birst, Domo, Klipfolio, RJMetrics, Jaspersoft, Pentaho and many, many others. They look really cool and I have worked with some of them before! A lonely data analyst can also reach for R, SPSS, RapidMiner, Weka and other tools. However, these are not BI tools.
Most of the aforementioned BI tools do not have a sophisticated mathematical device. Therefore, they simply allow you to count the data, calculate the frequency of items, and find the maximum, minimum and mean. The promo video of RJMetrics is a great example.
Can I just use a calculator instead?
Systems such as Domo.com or KlipFolio.com solve the problem of the missing mathematical device with a bit of a bluff. They offer their users several mathematical functions – just as Excel does. The catch is that these can be applied only to separate tables, not to the whole data model. Someone may think that it does not matter, but quite the contrary – this is the pillar of anything connected to data analytics. I will try to explain why...
The border of our sandbox lies in applying the law of conservation of “business energy”.
“If we don’t manage to earn our customer more money than our services (and GoodData license) cost him, he won’t collaborate with us.“
Say, for example, we take a listing of invoices from SAP and draw a graph of growth: our customers will throw us out of their offices. We need a little bit more. We need to put each data dimension into context (dimension = a thematic data package, usually represented by a data table). The dimensions do not have to have any strictly defined linkages; a table in our analytics project is called a dataset.
But how is it all connected?
The moment we give each dimension its linkages (parents, children … siblings?), we get a logical data model. A logical data model describes the “business” linkages, and most of the time it is not identical to the technical model in which a system stores its data. For example, if Mironet has its own e-shop, the database of the e-shop is optimized for the needs of the e-shop – not for financial, sales and/or subscription analytics. The more complicated the environment whose data we analyze, the fewer similarities the technical and analytical data models have. This low structural similarity between the source data and the data we need for analytics is what separates the other companies from GoodData.
A good example of this is our internal project. I chose the internal project because it contains the logical model we need only for ourselves. Therefore, it is not somehow artificially extended just because we know “the customer will pay for it anyway”.
We upload different kinds of tables into GoodData. These tables are connected through linkages. The linkages define the logical model; the logical model then defines “what can we do with the data”. Our internal project serves to measure our own activity and it connects the data from the Czech accounting system (Pohoda), Canadian accounting system (QuickBooks), the cloud application Paymo.biz and some Google Drive documents. In total, our internal project has 18 datasets and 4 date dimensions.
The first image (below) is a general model, select the arrow in the left corner to see what a more detailed model looks like.
In the detailed view (2 of 2), note that the name of the client is marked with red, the name of our analyst is marked with black and the worked hours are marked with blue. What I want to show here is that each individual piece of information is widely spread throughout the project. Thanks to the linkages, GoodData knows what makes sense altogether.
Using business-driven thinking to force your data to comply with your business model (rather than the other way around) will allow you to report on meaningful and actionable insights. Part 2 of this series on AQE (...or more formally XAE) will uncover the translation of the Logical Data Model into the GoodData environment.
Today's world is oversaturated with data. Telling stories through data is beginning to be so sexy, that many people are building their career on it. A few semi-experts in the Czech Republic have even changed their colours and started talking about BigData (in a worst-case scenario, they also hold conferences on this topic). However, I'll save this topic for a future blog post, in which I'll ground their Hadoop enthusiasm a bit for you.
People want to know more about the environment in which they operate. It helps them make better decisions, which usually leads to a competitive advantage. Generally, for good decision making we need a combination of three things: proper input parameters (information / data), common sense / experience, and a modicum of luck. However, an idiot will still be stupid and although luck can be occasionally bought in the Czech Republic, there's the threat of being arrested for bribery. Hence information remains the component of success that can be influenced the most. On my playground, the correct information serves as the answer to the most penetrating questions you can think of.
I assume that each of you knows how much money you have in your personal bank account. Most of us will also know how much money we spend per month. Fewer of you will know exactly what it was for. An even smaller group of people will know the structure of all the pleasant cups of coffee, ice cream, wine, lunches and so on (we call it the long tail). I would bet that almost no one knows one's personal annual trend in the cost structure of such a long tail. You’ll probably argue that you don't care. But if you're a company that wants to succeed, you can't do without such information. As for one's own personal life, the biggest nutcase, as I see it, is Stephen Wolfram, who has been measuring almost everything since 1990. He wrote about almost everything except lint from his bellybutton (unlike Graham Barker :)
Because there’s nothing about the executive summary of your accounting on CRM, Google Analytics or social networks on TV after the evening news, you're forced to build different variants of reports and dashboards yourselves.
I'll try to summarize the tools which I know are available; but in the end I'll tell you that it's all just a toy gun, and whoever wants a proper data gun must reach for GoodData. To be fair, I'll do my best and argue a bit :)
Today, Excel is on every corner. It's a good helper, but quite a lot of people have a strange tendency to make Excel Engineers of themselves, which is the most dangerous expertise you can come across. The Excel Engineer often ends with a contingency table and SUMIF() formula. At the same time, business data processing is interconnected with him and, perhaps unwittingly, he's becoming a brake on progress. The biggest risks of reporting in Excel, in my opinion, are as follows:
It's probably obvious that Excel reporting should end at the level of a sole trader. Nothing reliable can be created with it efficiently. You can be sure that the Excels on ZEE (network drive, of course!) contain errors, are not up-to-date and were made by people who were assigned to it by someone else, so they knew damn all about the nature of the data they fed into the VLOOKUP! Excel Engineers usually don't have it in their genes to do data discovery, and even if they came across something interesting, they probably wouldn't notice. You will know best what the correct information is at that very moment (and Excel isn't really what you should master in 2013 at the level of VBA macros and dirty hacks)!
Today's market is oversaturated with tools that aim to help you visualize some kind of business information. Imagine business information as the number of orders per today, the net margin for the last hour, the average profit per user, etc. In the majority of cases it works in the following way: you calculate this information on your side and send it automatically via an interface to a service that ensures the given metric is presented. Examples of such services are Mixpanel, KissMetrics, StatHat, GeckoBoard and even KlipFolio. The advantage compared to Excel lies especially in the fact that the reports and dashboards can be easily automated and then shared. Information sharing is quite underrated! An example of such information could be the number of data transformations that are executed in minute granularity at our staging layer:
You can build dashboards from these reports, and for a while you will feel good. The problem occurs when you find out that any extension of such a dashboard requires intervention from your programmers, and the more complex your questions are, the more complicated the intervention. If you operate in B2C and have transactional data, you can be sure that the clinical death of this form of reporting will be, for example, a query on the number of customers over time who spent at least 20% more than the average order for the previous quarter and who, at the same time, bought an ABC product this month for the first time. If your programmers, by luck, manage to implement it, they'll shoot their heads off once you add that you want daily numbers of the TOP 10 customers from each city who meet the previous rule. If you have just a few more transactions, it will mean remaking your existing DB on your side, which will eventually lead to a 100% collapse. Even if you try to make it survive at all costs, you can be sure that you won't slay the competition with that zero flexibility - you won’t even be able to gently take the analytical helm because the market will pivot around you.
It's possible you don't have similar questions about your business, and it doesn't bother you. The cruel truth is that your competition is asking about it right now, and you will have to somehow respond to it...
Neither Excel nor visualization tools usually have any sophisticated back-end, which applies similarly to services like Domo or Jolicharts. They look super sexy at first glance, but inside is a masked set of visualization tools, sometimes coated with a few statistical features that you mostly won’t use. The common denominator is the absence of a language with which you could step out of the predefined dashboards and begin to shape such services to your own benefit.
Their only advantage is that they can be quickly implemented. Unfortunately, that's it, and after a short intoxication period, sobriety sets in. If by chance you are a little bit more demanding, you haven't got a chance for a very happy life.
There are services that allow you to upload data and raise queries. As I see it, nowadays the hottest is Google BigQuery. For us at Keboola, it's a tremendous help with data transformation, denormalization and JOINs of huge tables. It can serve you well if it seems like a good idea to you to write the following:
...to get this...:
It's evident that if you don't make a living as an SQL consultant and don't have any ambition to create your own analytical service, you’d better leave this approach to nerds (like us!) and attend to your own business :)
If you google cloud BI, Google will return names like Birst, GoodData, Indicee, Jaspersoft, Microstrategy, Pentaho, etc. (if you have Zoho Reports among the results, the universe got crazy because that should have remained in Asia :).
From many trends, it's obvious that the Cloud moves the world of today. In the Czech Republic, the most common concern about the concept is a worry about the data and the feeling that my IT can do something better than that of the vendor. If you feel the same concerns, you should know that when any troubles arise in the Cloud, the best people available on this planet are working on it immediately, so that everything will again run like clockwork. Dave Girouard (coincidentally also a board member of GoodData) summed it up nicely in this article.
Except for Microstrategy, which probably discovered the Cloud this morning, the above-mentioned brands are relatively established within the Cloud. However, there are different surprises hiding under the lid. Pentaho requires highly technical knowledge so that you can make the most of it. Jaspersoft is Excel on the web that, in short, failed. Indicee would like to play in the major leagues, but I know at least one large customer from Vancouver who, after trying to implement their solutions for a year, moved to GoodData. When I tried Birst it was all in Flash, and despite my enormous effort I really didn't understand it :(
As I said in the beginning, everything except GoodData sucks. There are several reasons for this:
All in all, the quality of GoodData is shown, among other things, by its many connections, such as Zendesk.com (the biggest customer support service in the world). Such flexibility is, from my point of view, absolutely essential for future success. Any one of you can rent high-performance servers, design super-cool UI or program specific statistical functions (or perhaps borrow them from Google BigQuery), but in the foreseeable future no one will come out with a comprehensive concept that makes sense and is applicable to small dashboards (we have a client who uses GoodData to look at some data from Facebook Insights) as well as gigantic projects with a six-digit $ budget for just the first implementation phase.
We will create a model with a clear structure for your data in the Keboola Connection tool. It is thanks to this model that the whole system will later run quickly, flexibly and accurately. Using the model we will be able to find relationships between the data.
But the model wants to eat – the model wants to be fed data, which will come mostly from these four main sources:
Once we have fed the model with data, we will send the processed data into an application called GoodData. After which, you almost immediately gain access to your reports. Rest assured that the first contact will feel a bit like magic.
Once you’ve had your first dose of satisfaction, we guarantee that you will want more: "I do not want this report and I want that report to take weather into account." Ok. Post your requirements, wait a couple of days (not two months), and then you are looking at your new reports.
Or even better - access our know-how in Keboola Academy to learn how to work the system and then you will be able to modify the reports yourself. After that no one will ever be able to tear you away from your data.
A boss with GoodData, who is lounging on a beach half a world away, knows more than any boss present at work without it.
Now, if you wish you can sit under a beach umbrella in Honolulu with a tablet and every five minutes you can check just how much money you are making.
You will notice that the people who were served by Olivier never came back to your cafe.
You will see that customers in Vancouver are spending roughly twice as much as customers in Quebec, since you just launched an advertising campaign there.
You will observe that when it rains your sales of pour over coffee rise sharply – unless the manager forgets to stock up on the filters.
You will clearly see how the purchasing behaviour of your customers changes over time, so you will spot new trends early and take full advantage of them.
As you sip your Mai Tai slowly, you’ll then start to write your first email: "Mary, please order extra thin filters for our coffee machines and also tall glasses for Vancouver. It seems like there's a new fad..."
People were getting lost in data - so they created tools to help them. From 1958, when Hans Peter Luhn coined the term “Business Intelligence”, until the end of the 80’s, the whole industry lived by terms such as data warehouses, OLAP cubes etc. In 1989 Howard Dresner defined BI as a “set of concepts and methods to improve business decision making by using fact-based support systems”.
Over the last half century, BI has been progressing until today, when it finds itself in a bit of a crisis.
We are overwhelmed by data - no longer the raw data - but rather the categorized, mathematically processed data represented in what we call “reports”.
Imagine that you have a large amount of data. You know that there is a lot of very interesting information in it. So, you take tools that pull all that data into one place, clean it, polish and present back to you - and you start looking at it (that’s what we do with Keboola & GoodData).
Over time, though, one can easily experience the following side effects:
Once you have hundreds of reports, all sorting, tagging or naming conventions stop working. You’ll get to the point when no one will be able to find what they need. Instead of looking for existing reports, people will start building the same ones again and again. Your sales director knows that there was a report “Margin estimate for the next 4 weeks based on sales managers’ estimate” somewhere, but it is harder to find it than to build it again (which speaks, in case of GoodData, volumes about the ease of its use).
“I read it and IMHO it’s a bit of BS, because articles like that have been showing up regularly since the 50’s, saying that the use of natural language is “almost here”. The best generic tool for interactive communication with a computer (asking the computer for something) is so far SQL, which was supposed to be so simple that everyone could write a query as easily as a sentence. Time has shown that reality (and therefore also natural language) is so idiotically complex that any language describing it needs to be complex as well, and you need to study it for 5 years to master it (same as natural language).”
It is important to take a bit of everything. It will remain critical that everyone has access to the information they feel they need (to validate a hypothesis, support their decisions etc.). Apart from that, the machines need to help a bit with sifting through the data - so you don’t have to generate hundreds of reports trying to find the golden nugget.
At Keboola we’ve been working on a system that attempts to solve exactly that since the summer of 2013. Today it is practically a complete set of functions that can recognize the meaning of data (time, ID, number, attribute - we call that piece the “data profiler”) and the relationships between data (for example, it can figure out how to connect Google Analytics with CRM data), and afterwards run tests to identify “interesting moments”. For example, it can discover seasonality in a particular segment of customers and point to it, without the need for an analyst to get the idea to try something like that out. Our system “guesses” where the data relates to a specific customer, and if it finds something interesting, it will point it out. Ideally, it creates a report in GoodData on its own, filtered to the given situation.
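To give a feel for what "recognizing the meaning of data" can mean, here is a deliberately naive Python sketch. It is not our data profiler; the function guess_role, the four role names and the rules (a single date format, uniqueness as a sign of an ID) are assumptions invented for this illustration, and a real profiler needs far more than this.

```python
from datetime import datetime


def guess_role(values):
    """Guess a column's role: 'time', 'id', 'number' or 'attribute'.

    `values` is a list of strings, as they would come from a CSV column.
    """
    def all_parse(parser):
        # True when every value can be parsed without a ValueError.
        try:
            for value in values:
                parser(value)
            return True
        except ValueError:
            return False

    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "time"
    unique = len(set(values)) == len(values)
    if all_parse(float):
        # Unique whole numbers look like keys; anything else is a measure.
        if unique and all_parse(int):
            return "id"
        return "number"
    # Non-numeric text: unique values look like keys, repeats like categories.
    return "id" if unique else "attribute"


roles = {
    "date": guess_role(["2013-10-01", "2013-10-02"]),
    "order_id": guess_role(["1001", "1002", "1003"]),
    "price": guess_role(["19.90", "5.00", "19.90"]),
    "payment": guess_role(["card", "cash", "card"]),
}
```

The heuristic is easy to fool (a column of unique quantities would be mistaken for an ID), which is why guesses like these are only a starting point for the tests that follow.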
As an example, for “on-line transaction” data types we have a set of tests that look for those interesting moments. One of these tests (working title “Wrong Order Test”) creates histograms of all combinations of facts (typically monetary values) and attributes (products / locations / months / user types etc.). Among those it tests whether the counts of IDs (such as orders) correlate with the values - if some attribute seems outside of “normal” in a particular situation, that is reason enough to bring it up with the business user.
This picture shows how, for a specific time period and product (or user group), the system identified an unexpected drop in profit for a particular payment method - “interesting behaviour”. Unless you somehow get the idea to test for precisely this situation with precisely this report setup, you have practically NO CHANCE of discovering it. On top of that, the same anomaly may not present itself a week later, so you need continual detection.
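The “Wrong Order Test” described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the test Keboola actually runs: the function names, the `order_id` field, the made-up sample rows and the rule "a group's share of the value should roughly match its share of the orders, within a tolerance" are all hypothetical stand-ins for whatever correlation check the real system uses.

```python
from collections import defaultdict
from itertools import product


def wrong_order_test(rows, attribute, fact, id_field="order_id", tolerance=0.5):
    """Flag attribute values whose share of the fact does not match their share of IDs.

    rows is a list of dicts. For each value of `attribute` we count distinct IDs
    and sum `fact`, then compare the two shares. A ratio far from 1.0 means the
    values do not follow the order counts for that group.
    """
    ids = defaultdict(set)
    totals = defaultdict(float)
    for row in rows:
        key = row[attribute]
        ids[key].add(row[id_field])
        totals[key] += row[fact]

    total_ids = sum(len(group) for group in ids.values())
    total_value = sum(totals.values())
    if total_ids == 0 or total_value == 0:
        return []

    findings = []
    for key in sorted(ids):
        order_share = len(ids[key]) / total_ids
        value_share = totals[key] / total_value
        if abs(value_share / order_share - 1.0) > tolerance:
            findings.append((key, round(order_share, 3), round(value_share, 3)))
    return findings


def scan(rows, attributes, facts):
    """Run the test over every attribute/fact combination, keeping only the hits."""
    results = {}
    for attribute, fact in product(attributes, facts):
        hits = wrong_order_test(rows, attribute, fact)
        if hits:
            results[(attribute, fact)] = hits
    return results


# Made-up data: voucher orders bring in far less profit than their count suggests.
sample_rows = (
    [{"order_id": f"c{i}", "payment": "card", "profit": 10.0} for i in range(5)]
    + [{"order_id": f"m{i}", "payment": "cash", "profit": 10.0} for i in range(3)]
    + [{"order_id": f"v{i}", "payment": "voucher", "profit": 1.0} for i in range(2)]
)
flagged = scan(sample_rows, attributes=["payment"], facts=["profit"])
```

Here the voucher group holds a fifth of the orders but only a tiny fraction of the profit, so it is the one that gets flagged. Running `scan` on a schedule over fresh data is one simple way to get the continual detection mentioned above.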
Our goal is to periodically test the various data types sitting in Keboola and inform their owners of those interesting facts in the form of an automated dashboard within their GoodData projects. The last thing we need to do is define how to configure the tests, as the true power lies in the interaction of various tests over the same data. Everything else - the data profiler, the tests themselves, supporting R functions, API, infrastructure etc. - is ready to go.
This way Keboola will not only help you use data to find answers to your business questions, but also help you phrase new questions based on the gems hidden in the data.
(originally published as a guest post on the GoodData blog)
In the world of big data and analytics, what is the definition of a platform? What belongs in the category and what doesn't? If Tableau is on the list, why not Excel? If Excel, why not Numbers or Google Sheets? (Hey, that one's even cloud-based!) The whole thing is somewhat silly to me. It is an attempt to compare the incomparable.
Over the years of Keboola's existence and focus on business intelligence, we've been closely monitoring the tools available. We are an independent company and while we partner with GoodData, our ultimate focus is to do what's best for our customers. There are many tools out there. Some mediocre, some brilliant in what they do. It never ceases to amaze me how solutions built on Cognos, MicroStrategy or Business Objects can cost so much while so little value seems to be actually delivered. Similarly, how Domo took a simple dashboarding tool and, with some serious marketing dollars, made it appear almost like a BI product. Conversely, looking at a product like Tableau, its visualizations are unparalleled. And yet, simply put, nothing comes close to fulfilling our vision of a BI platform as well as GoodData does.
If you disagree, start asking questions - Which BI tools have a robust API that can connect to and push data from any data source? Do they allow you to filter data based on the user who is looking at it? Can you automatically build scripts that generate reports relevant to the current situation of your business? Does a BI tool allow you to analyze hundreds of millions of rows of data in seconds? Does it have a front-end interface that anyone who has come near a medium-complex spreadsheet and knows how to drag and drop can use? Does the platform allow you to build a product that you deploy to hundreds of customers at the touch of a button? And which tool allows you to do ALL these things? Right.
GoodData is more than a tool, it is a true open platform. For some this comes as news; for us at Keboola, it has always been that way. We have always treated GoodData as a platform.
A true platform gives you tools and space at the same time. The tools allow you to do things, and the space lets you imagine and create new ways of doing them. Your imagination, not the tool, is the limit. Using GoodData itself, we built Keboola Academy, a training tool that teaches people how to use GoodData. We built AI that modifies not only the data in the reports, but also the layout of the dashboard, to pinpoint what is important. We completely integrated with GoodData, so deployment of dashboards and analytics over our own business data warehouse product is seamless and largely automatic. We built whole data products, deeply embedded into our customers' interfaces, all using an open analytics platform called GoodData.
Keboola is about helping companies make more money using data. Whether it is for internal reporting and analytics, or to create new revenue streams by monetizing data-as-a-product, GoodData gave us the freedom to build amazing things and continuously grow our business (so far 200% or more year over year) and that is why I consider it the only true BI platform on the market today. "BI Platforms" is a category of one.