BI Dashboard Crisis

Written by
Petr Šimeček

May 16, 2014

People were getting lost in data, so they created tools to help them. From 1958, when Hans Peter Luhn coined the term “Business Intelligence”, until the end of the 1980s, the whole industry lived by terms such as data warehouses and OLAP cubes. In 1989 Howard Dresner defined BI as a “set of concepts and methods to improve business decision making by using fact-based support systems”.

Over the last half century BI has kept progressing, and today it finds itself in a bit of a crisis.

The Dashboard Crisis

We are overwhelmed by data: no longer the raw data, but the categorized, mathematically processed data presented in what we call “reports”.

Imagine that you have a large amount of data. You know that there is a lot of very interesting information in it. So you take tools that pull all that data into one place, clean it, polish it and present it back to you, and you start looking at it (that’s what we do with Keboola & GoodData).

Over time, though, one can easily experience the following side effects:

  1. Resignation / the “juicer syndrome”. If you use the system passively, you see the same information in the data day after day. During the first few weeks, you drill into the data and look at it from all angles. As time goes on, your attention fades while you keep ingesting more and more data you don’t need to see again (Avast Antivirus now has more than 200M users, they’ll still have more than 200M tomorrow, and no one needs to be reminded of that daily). If you bought a new juicer, you probably drank nothing but fresh juice for a week or two, and since then the appliance has been collecting dust somewhere. Something very similar can easily happen in BI.
  2. Drowning in data. If you have a good tool that allows you to drill into your data and you use it, you generate one report after another as you find more and more interesting answers. At one point you’ll have so many reports that you get lost.

Once you have hundreds of reports, all sorting, tagging and naming conventions stop working. You’ll get to the point where no one is able to find what they need. Instead of looking for existing reports, people will start building the same ones again and again. Your sales director knows that there was a report called “Margin estimate for the next 4 weeks based on sales managers’ estimate” somewhere, but it is harder to find it than to build it again (which, in the case of GoodData, speaks volumes about its ease of use).

What solutions are being attempted?

  • Use of natural language - Microsoft is trying in its “Power BI” to understand queries asked in a manner similar to how we ask a search engine. In that case, natural language needs to be somehow connected to the semantic model leading to the data. It looks pretty (see the Power BI link), but Odin, my colleague, nailed it when he commented after reading one such article:

“I read it and IMHO it’s a bit of BS, because articles like that have been showing up regularly since the 1950s, saying that the use of natural language is ‘almost here’. The best generic tool for interactive communication with a computer (asking the computer for something) is so far SQL, which was supposed to be so simple that everyone could write a query as easily as a sentence. Time has shown that reality (and therefore also natural language) is so idiotically complex that any language describing it needs to be complex as well, and you need to study it for 5 years to master it (same as natural language).”

  • Use of a visual interface between the system and a human - you can see that nicely in the example of BIRST. It’s a beautifully executed marketing video, but once the data model (a.k.a. the relationships between information) gets sufficiently complex, the interface stops working: either it doesn’t understand what we want from it, or controlling it gets so complicated that its advantages are lost.

What are we doing about it?

It is important to take a bit of everything. It will remain critical that everyone has access to the information they feel they need (to validate a hypothesis, support a decision etc.). Apart from that, the machines need to help a bit with sifting through the data, so you don’t have to generate hundreds of reports trying to find the golden nugget.

At Keboola we have been working since the summer of 2013 on a system that attempts to solve exactly that. Today it is a practically complete set of functions that can recognize the meaning of data (time, ID, number, attribute; we call that piece the “data profiler”) and the relationships between data (for example, it can figure out how to connect Google Analytics with CRM data), and afterwards run tests to identify “interesting moments”. For example, it can discover seasonality in a particular segment of customers and point to it, without an analyst first having to think of trying that. Our system “guesses” where the data relates to a specific customer and, if it finds something interesting, points it out. Ideally, it creates a report in GoodData by itself, filtered to the given situation.
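To make the “data profiler” idea a bit more concrete, here is a minimal sketch in Python of how a column could be assigned one of the four roles named above. It is only an illustration under stated assumptions, not Keboola’s actual implementation: the function names (`profile_column`, `profile_table`), the accepted date formats and the rule “all values unique and made of digits means ID” are all hypothetical choices made for this example.

```python
from datetime import datetime

# Assumed date formats; a real profiler would need many more.
DATE_FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S", "%d.%m.%Y")


def _is_number(value):
    """True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def _is_date(value):
    """True if the string matches one of the assumed date formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def profile_column(values):
    """Guess a column role: 'time', 'id', 'number' or 'attribute'."""
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return "attribute"
    if all(_is_date(v) for v in cleaned):
        return "time"
    all_unique = len(set(cleaned)) == len(cleaned)
    if all(_is_number(v) for v in cleaned):
        # Hypothetical heuristic: unique, digit-only values look like keys.
        looks_like_key = all(v.isdigit() for v in cleaned)
        return "id" if all_unique and looks_like_key else "number"
    return "id" if all_unique else "attribute"


def profile_table(table):
    """Profile every column of a table given as {column name: list of strings}."""
    return {name: profile_column(values) for name, values in table.items()}


sample = {
    "order_id": ["1001", "1002", "1003", "1004"],
    "created": ["2014-05-01", "2014-05-02", "2014-05-02", "2014-05-03"],
    "amount": ["19.90", "5.00", "19.90", "42.50"],
    "country": ["CZ", "CZ", "US", "CA"],
}
print(profile_table(sample))
```

Heuristics like these will misclassify some columns (a short list of distinct names would be read as an ID, for instance), which is why the article talks about the system “guessing”.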

As an example, for “online transaction” data types, we have a set of tests that look for those interesting moments. One of these tests (working title “Wrong Order Test”) creates histograms of all combinations of facts (typically monetary values) and attributes (products / locations / months / user types etc.). Among those, it tests whether the counts of IDs (such as orders) correlate with the values. If some attribute seems outside of “normal” in a particular situation, that is reason enough to bring it up with the business user.
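A rough sketch of a test in this spirit might look like the following. The article does not say which statistic the “Wrong Order Test” uses, so everything specific here is an assumption made for illustration: the function names, the `order_id` field, and the choice to compare the value per order in each attribute bucket against the median across buckets, flagging buckets that lie more than `threshold` median absolute deviations away.

```python
from collections import defaultdict
from statistics import median


def wrong_order_candidates(rows, attribute, fact, id_field="order_id", threshold=3.0):
    """Return attribute values whose fact total is out of line with their ID count."""
    ids = defaultdict(set)        # distinct IDs seen per attribute value
    totals = defaultdict(float)   # summed fact per attribute value
    for row in rows:
        key = row[attribute]
        ids[key].add(row[id_field])
        totals[key] += row[fact]

    # Value per ID in each bucket; if counts and values move together,
    # these ratios should be similar across buckets.
    per_id = {key: totals[key] / len(ids[key]) for key in totals}
    center = median(per_id.values())
    mad = median(abs(value - center) for value in per_id.values())
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    return sorted(k for k, v in per_id.items() if abs(v - center) / mad > threshold)


def scan(rows, attributes, facts, id_field="order_id"):
    """Run the check for every (attribute, fact) combination."""
    findings = {}
    for attribute in attributes:
        for fact in facts:
            flagged = wrong_order_candidates(rows, attribute, fact, id_field)
            if flagged:
                findings[(attribute, fact)] = flagged
    return findings


# Made-up sample data, for illustration only.
orders = [
    {"order_id": 1, "product": "A", "country": "CZ", "amount": 10.0},
    {"order_id": 2, "product": "A", "country": "US", "amount": 10.0},
    {"order_id": 3, "product": "B", "country": "CZ", "amount": 12.0},
    {"order_id": 4, "product": "B", "country": "US", "amount": 10.0},
    {"order_id": 5, "product": "C", "country": "CZ", "amount": 9.0},
    {"order_id": 6, "product": "C", "country": "US", "amount": 9.0},
    {"order_id": 7, "product": "D", "country": "CZ", "amount": 10.0},
    {"order_id": 8, "product": "E", "country": "US", "amount": 60.0},
    {"order_id": 9, "product": "E", "country": "CZ", "amount": 40.0},
]
print(scan(orders, attributes=["product", "country"], facts=["amount"]))
```

In this made-up sample, product E brings in far more money per order than the others, so it is the one combination a business user would be pointed to. A production test would need a more careful notion of “normal” (and enough buckets for the statistic to mean anything).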

If you’re curious to learn more about Keboola and how we can help you perform these tests, contact us here!
