Keboola: Data Monetization Series Pt. 1

When a company thinks about monetizing data, the things that come to mind are increasing revenue, identifying operational inefficiencies or creating a new revenue stream. It’s important to keep in mind that these are the results of an effective strategy but can’t be the only goal of the project. In this blog series, we will examine these avenues with a focus on the added value that ultimately leads to monetization. For this blog, let’s look at it from the perspective of creating executive-level dashboards at a B2B software company.

Who will be consuming the data and what do they care about?

Before we jump into the data itself, take a step back and understand who the analytics will be surfaced to and what their challenges are. Make profiles with their top priorities, pain points and the questions they will be asking. One way to get started is to make a persona priority matrix listing the top three to five challenges for each persona (example below).

[Figure: example persona priority matrix]

Once the matrix is laid out, you can begin mapping specific questions to each priority.  What answers might help a VP of Sales increase the effectiveness of the sales team and ultimately revenue?

  • What do our highest velocity deals look like (vertical, company size, who’s involved)?

  • What do our largest deals look like?

  • Where do our deals typically get stuck in the sales process?

  • What activities and actions are our best reps performing?

Adding Context With Different Types of Data


Data can be vast and overwhelming, so understanding the different types helps to simplify what kind of numbers we are looking for.  Even with the treasure trove of data most organizations have in-house, there are tons of additional data sets that can be included in a project to add valuable context and create even deeper insights.  It’s important to keep in mind what type of data it is, when and where it was created, what else was going on in the world when this data was created, and so forth.  Using the example of a restaurant, let’s look at some different types of data and how they could impact an analytics project.  

Numerical data is something that is measurable and always expressed in numerical form.   For example, the number of diners attending a particular restaurant over the course of a month or the number of appetizers sold during a dinner service.  This can be segmented into two sub-categories.  

Discrete data represent items that can be counted. Each value is an exact number, and the possible values can be listed out. The list of possible values may be fixed (also called finite), or it may go from 0, 1, 2, on to infinity (making it countably infinite). For example:

  • Number of diners that ate at the restaurant on a particular day (you can’t have half a diner.)

  • Number of beverages sold each week.

  • How many employees were staffed at the restaurant on a day.

Continuous data represent measurements; their possible values cannot be counted and can only be described using intervals on the real number line.  For example, the exact amount of vodka left in the bottle would be continuous data from 0 mL to 750 mL, represented by the interval [0, 750], inclusive.   Other examples:

  • Pounds of steak sold during dinner service

  • The high temperature in the city on a particular day

  • How many ounces of wine were poured in a given week

You should be able to perform most mathematical operations on numerical data, sort it in ascending or descending order, and express it in fractions.
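To make the distinction concrete, here is a minimal Python sketch. The restaurant figures and the helper names (`is_discrete`, `summarize`) are invented for illustration and are assumptions, not part of any real data set or library.

```python
from fractions import Fraction

# Hypothetical restaurant figures, invented purely for illustration.
diners_per_day = [112, 98, 134, 120]             # discrete: whole counts
steak_pounds_sold = [41.5, 38.25, 52.0, 47.75]   # continuous: measurements


def is_discrete(values):
    """Treat a series as discrete when every value is a non-negative whole count."""
    return all(isinstance(v, int) and v >= 0 for v in values)


def summarize(values):
    """Basic operations that make sense for any numerical data."""
    return {
        "total": sum(values),
        "mean": sum(values) / len(values),
        "ascending": sorted(values),
        "descending": sorted(values, reverse=True),
    }


diner_summary = summarize(diners_per_day)
steak_summary = summarize(steak_pounds_sold)

# A continuous measurement can also be expressed as a fraction.
first_steak_as_fraction = Fraction(steak_pounds_sold[0])
```

Both series support the same arithmetic and sorting. The only difference the sketch checks for is whether the values are whole counts or measurements on an interval.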

6 Gift Ideas for the Data Geek in Your Life


It’s that time of year again and there are so many gift options to choose from. Be it hoverboards (that may explode), drones or Star Wars’ own BB-8 remote control droid, there’s been quite a boom in tech gadgets this year. At Keboola we love all things data, so to get you in the holiday spirit, we wanted to share some cool gift ideas that use data to make your life easier (or at least a bit more interesting).

Automatic Adapter

Similar to the gadget seen in the Progressive commercials, the Automatic Adapter is basically a fitness app for your vehicle. Through an app or a web interface, it provides a full report on where you’ve been and how you drive, and it even lets you tag routes for business travel expenses.


Top 3 challenges of big data projects

The Economist Intelligence Unit report Big data evolution: forging new corporate capabilities for the long term, published earlier this year, provided insight into big data projects from 550 executives across the globe. When asked about their company’s most significant challenges related to big data initiatives, executives put maintaining data quality, collecting and managing vast amounts of data, and ensuring good data governance in three of the top four spots (data security and privacy was number three). Data availability and extracting value were actually near the bottom. This is a bit surprising, as ensuring good data quality and governance is critical to getting the most value from your data project.

Maintaining data quality

Having the right data and accurate data is instrumental in the success of a big data project. Depending on the focus, data doesn’t always have to be 100% accurate to provide business benefit; numbers you can be 98% confident in are enough to give you insight into your business. That being said, with the sheer volume of data and number of sources available for a big data project, this is a big challenge. The first issue is ensuring that the original system of record is accurate (the sales rep updated Salesforce correctly, the person filled out the webform accurately, and so forth), as the data needs to be cleaned before integration. I’ve personally worked through CRM data projects, and cleanup and de-duping can take a lot of resources. Once this is completed, procedures for regularly auditing the data should be put in place. With the ultimate goal of creating a single source of truth, understanding where the data came from and what happened to it is also a top priority. Tracking and understanding data lineage will help identify issues or anomalies within the project.
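As a rough illustration of the cleanup, de-duping and lineage steps described above, here is a minimal Python sketch. The record layout (`email`, `name`, `source`) and the function names are hypothetical, and matching on a normalized email address is only one simple de-duplication rule among many.

```python
def normalize_email(email):
    """Lowercase and trim an email so trivially different spellings match."""
    return email.strip().lower()


def dedupe_contacts(records):
    """Keep the first record per normalized email and note every source it came from."""
    merged = {}
    for record in records:
        key = normalize_email(record["email"])
        if key not in merged:
            # First sighting: keep the record and start its lineage list.
            merged[key] = {**record, "email": key, "sources": [record["source"]]}
        else:
            # Duplicate: only extend the lineage, keep the original fields.
            merged[key]["sources"].append(record["source"])
    return list(merged.values())


def audit_missing(records, required_fields):
    """Return the records that have an empty or missing required field."""
    return [r for r in records if any(not r.get(f) for f in required_fields)]


# Hypothetical records from two systems of record.
contacts = [
    {"email": "Ann@Example.com ", "name": "Ann", "source": "crm"},
    {"email": "ann@example.com", "name": "Ann B.", "source": "webform"},
    {"email": "bob@example.com", "name": "", "source": "crm"},
]

clean_contacts = dedupe_contacts(contacts)
needs_review = audit_missing(clean_contacts, ["email", "name"])
```

Keeping a `sources` list on each merged record is a lightweight stand-in for lineage: when a value looks wrong later, you can see which systems contributed to it.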

Collecting and managing vast amounts of data

Before the results of a big data project can be realized, processes and systems need to be put into place to bring disparate sources together. With data living in databases, cloud sources, spreadsheets and the like, consolidating everything into one database or trying to fuse incompatible sources can be complex. Typically, this process consists of using a data warehouse plus an ETL tool, or a custom solution, to cobble everything together. Another option is to create a networked database that pulls in all the data directly, but this route also requires a lot of resources. One of the challenges with these methods is the amount of expertise, development and resources required, spanning from database administration to expertise in using an ETL tool. Unfortunately, it doesn’t end there; this is an ongoing process that will require regular attention.
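The extract, transform and load steps described above can be sketched in a few lines of Python using only the standard library. The two sources, their field names and the `revenue` table are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical inputs: a spreadsheet export and rows returned by a cloud source.
SPREADSHEET_EXPORT = "customer,amount\nAcme,120.50\nGlobex,80\n"
CLOUD_API_ROWS = [{"account_name": "Initech", "total": "42.25"}]


def extract_csv(text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, name_field, amount_field, source):
    """Transform: map differently named fields onto one common schema."""
    return [(row[name_field].strip(), float(row[amount_field]), source) for row in rows]


def load(connection, rows):
    """Load: append the unified rows to a single table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS revenue (customer TEXT, amount REAL, source TEXT)"
    )
    connection.executemany("INSERT INTO revenue VALUES (?, ?, ?)", rows)
    connection.commit()


connection = sqlite3.connect(":memory:")
load(connection, transform(extract_csv(SPREADSHEET_EXPORT), "customer", "amount", "spreadsheet"))
load(connection, transform(CLOUD_API_ROWS, "account_name", "total", "cloud"))

row_count = connection.execute("SELECT COUNT(*) FROM revenue").fetchone()[0]
total_amount = connection.execute("SELECT SUM(amount) FROM revenue").fetchone()[0]
```

Even in this toy version, each new source needs its own field mapping, which hints at why the real process demands ongoing attention.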

Ensuring good data governance

In a nutshell, data governance is the set of policies, procedures and standards an organization applies to its data assets. Ensuring good data governance requires an organization to have cross-functional agreement, documentation and execution. This needs to be a collaborative effort between executives, line of business managers and IT. These programs will vary based on their focus but will all involve creating rules, resolving conflicts and providing ongoing services. Verifications should be put into place to confirm the standards are being met across the organization.


Having a successful big data project requires a combination of planning, people, collaboration, technology and focus to realize maximum business value. At Keboola, we focus on optimizing data quality and integration in our goal to provide organizations with a platform to truly collaborate on their data assets. If you’re interested in learning more you can check out a few of our customer stories.

KBC as a Data Science Brain Interface

The Keboola Data App Store has a fresh new addition. That brings us to a total of 16 currently available apps, three of which are provided by development partners.

This new one is called “aLook Analytics”, and technically it is a clone of our development project, a “Custom Science” app (not available yet, but soon!). It facilitates connection to a GitHub/Bitbucket repository of a specific data science shop, which you can “hire” via the app and enable them to safely work on your project.

This first instance is connected to Adam Votava’s company, aLook Analytics.

How does it work?

Let’s imagine you want to build something data-science-complex in your project. You get in touch with aLook and agree on what it is you want them to do for you. You exchange some data, the boys there do some testing on their side and set up the environment, and once they’re done, they give you a short configuration script that you enter into their app in KBC. Any business agreement regarding their work is to be made directly between you and aLook; Keboola stays on the sidelines for this one.
When you run the app, your data gets served to aLook’s prepared model, and the scripts saved in aLook’s repository get executed on Keboola servers. All the complex stuff happens and the resulting data gets returned into your project. The app can be (like any other) included in your Orchestrations, which means it can run automatically as a part of your regular workflow.

The user of KBC does not have direct access to the script, protecting aLook’s IP (of course, if you agree with them otherwise, we do not put up any barriers).

Very soon we will enable the generic “Custom Science” app mentioned above. That means that any data science pro can connect their GitHub/Bitbucket themselves - that gives you, our user, the freedom to find the best brain in the world for your job.

Why people and not just machines?

No “Machine Learning Drag&Drop” app provides the same quality as a bit of thought by a seasoned data scientist. We’re talking business analytics here! People can put things in context and be creative, while all machines can do is adjust (sometimes thousands of) parameters and test the results against a training set. That may be awesome for facial recognition or self-driving car AI, but in any specific business application, a trained brain will beat the machine. Often you don’t even have enough of a test sample, so a bit of abstract thinking is critical and irreplaceable.

How we "hacked" Vizable

Tableau unveiled their new Vizable app on the first full day of the Tableau User Conference 2015 (a.k.a. TC-15) to many oohs and aahs. Vizable is a tablet app that allows users to take data from an .xls or .csv file and easily interact with it right on their tablet. It is unparalleled in its ease of use and intuitiveness, providing an exciting new way to consume data and drive insights.

As soon as we saw it, the Keboola team thought, “What an exciting way to use data from Keboola Connection - if only we could send data to it immediately to test it!” The app is built to accept .xls and .csv files that are physically present on the iPad it runs on, so at a glance, it is completely and utterly offline. We immediately wondered if Keboola Connection - due to its integration with Dropbox and Google Drive - could make Vizable the ultimate, on-the-go data visualization app.

(a little bit of frantic testing later)

Yeah! We can easily schedule pushing data into the iPad using our existing integrations. We didn't have to write a single line of code and already during the conference we were able to play with #data15 mentions we’d pulled in through Keboola Connection, with fresh data being automatically pushed into the iPad every 30 minutes.

We eagerly shared our success with the Vizable team and started showing conference attendees and members of the Tableau team just how we’d made it all happen! It was great to receive a string of visits from the whole Vizable crew all the way up to Dave Story, VP of Mobile and Strategic Growth, and Chris Stolte, the Chief Development Officer. What a thrilling way to educate the Tableau folks on all the cool stuff Keboola does with their tool and for their customers.

Get in touch with us if you want to know more!

Agency - get rid of pivot tables!

While burning the midnight oil and rummaging through our internal systems, I came across the Zendesk tickets that our data analysts are closing for one of our clients - H1 agency (part of GroupM).

The agency has created a report in GoodData called “non active campaigns”. It contains one metric, 5 attributes of type data, client, etc. and 4 filters (time, client’s agency, etc.). It sounds super simple, but let’s take a closer look.

What it does is give you back a table that is a wet dream of any and all agencies out there. You can see “anything” across all of the advertising channels. I mean “anything.” In this particular case, they’ve created a report of non active campaigns. This is a very good example of an output that is very hard or impossible to achieve in things like Tableau, SAP, Chartio, or RJMetrics. Rock&Roll of the multi-dimensional BI system! You need to live it to believe it and to actually understand it.

Below is the data model (unreadable on purpose). The yellow ovals are the things you count on top of, and you can see them in the context of the green ones:

Karel Semerák prepared this report. I bet he has no clue what a mega machine he put to work to actually produce it. Based on the physical data model, the definitions of metrics and the report context, GoodData generated 460 rows of SQL in the data warehouse, which propels the system.

Just imagine a real person trying to do this report by hand (totally ignoring the incomprehensible amount of data). They have to do lots of small tasks (look inside AdWords, find the active clients, count the number of their campaigns, compare to CRM data for paid invoices, create a temporary pivot table, etc.), and every little task could be represented by a rectangle inside this picture:

It all comes to almost 90 totally different tasks, each taking from a minute to 3 days when done by hand. Try to explain this workflow to a Teradata consultant and you will spend a week just explaining what you want. Try it with an IBM Cognos expert and… well, you get the picture.
And one more thing: with GoodData you do it yourself and don’t have to wait another month for the expert or pay a five-digit sum for one report.

Well played, GoodData & multi-dimensional BI! 

But for a moment let’s forget about this one report. The agency has already prepared over 400 such reports. Try to produce that in Excel and you had better have a horde of MS Excel devotees who work as hard as robots and are as precise as robots. The last time I saw something like this was years back at OMD. It was a mega office, and all of the people inside were producing pivot tables.

Talking about robots. If you are interested in counting the probability with which “data AI robots” will replace your job, take a look here.

Karel Semerák can stay cool though. He thinks about the data and the context instead of spending time on tasks where robots will always be better, and that is the one thing where robots will take some time to improve: cognitive skills and context.

So next time your P/L comes knocking on your door, think about giving your people the chance to use their heads creatively and leave all the heavy lifting to robots. People aren’t the best at copy&paste or sorting through the AdWords report, but they are great at creative thinking, and that is what you need in order to win over your competition.

Why aren’t there more nerds in marketing?

My arrival at Business Intelligence (BI) and eventually consulting for Keboola was not through your standard Statistics or Programming route. After completing a Bachelor’s degree in Business & Communications, I had several stints coordinating corporate marketing efforts in various industries from Automotive to Gaming. I found that no matter what capacity I worked in - from tracking calls to action, to analyzing performance reports from suppliers and conducting market research - I could not hide from data.

So in my never-ending quest for efficiency I also started to ask myself: why am I doing this, and how meaningful is it? Or simply put, am I wasting my time coming up with tweets no one reads? #HashtagAllTheThings


Traditionally, data analysis was never a core competency of marketing; someone else from purchasing, finance, IT, etc. would tell you if your campaign was successful. But with the shift towards digital marketing, there’s been an increase in data availability, and marketers now have more control over how they measure KPIs themselves.

It was this trend that made me first curious about making the switch from Marketing to Analytics. The organizational gap was so apparent to me, but I had no idea what that translated to in terms of a job description. I was stuck between working in marketing where (for obvious reasons) the primary focus is on campaign implementation before measurement vs. a highly technical position (which I didn’t even have the qualifications for) that would stifle my creative side.

Caught in the middle, I came across a job posting at Keboola for a “Data Analyst”. At the time I had no idea what I was applying for, but through my experience in the past year I now see that the job description couldn’t have been clearer. Keboola, like me, is somewhere in the middle. With a pragmatic approach, we provide real solutions to our clients’ very real business issues.

What I love about working here is that we help companies integrate both Market Intelligence (MI) and Business Intelligence (BI) for data-driven decision making.

In my role, I provide the (BI) tools to answer the “why” behind my clients’ marketing decisions and then visualize those findings so they can make more informed decisions (MI). What I’ve found through this experience is that there are common problems afflicting marketers, for which I feel there is actually a solution already under your nose. My job at Keboola is to translate these observations into something actionable so marketers can be empowered to work with their data and spend their time creating something meaningful … fewer hashtags in the next tweet, perhaps?

Zig Zagging your way through the E-learning journey

We bank online, buy groceries online, watch movies online, we even date online … so why not become educated online?

The emphasis on eLearning has continued to grow in recent years. Training has become streamlined to the point where anybody and everybody can learn without actually setting foot in a classroom. The experience allows for knowledge to be compressed and consistently delivered to every learner. At the same time, online training gives learners a new-found freedom. They can choose their physical environment (bunny slippers and cup of tea, perhaps?), how they reference materials, and in some cases the pace at which they learn.

Traditional linear learning is based on the idea that you must first build a foundation through a set of carefully predefined segments or lessons delivered in a specific order before tackling more complicated topics. And although some people may respond well to working through an established series of concepts, the linear approach often leaves many of us unenthused and unmotivated.

Non-linear learning, in comparison, offers learners the freedom to actively construct a personalized educational journey. What does this freedom look like? It’s having the ability to make individualized choices based on background, expertise, and one’s own unique learning style.

Interest in and retention of subject matter increase when learners are given the opportunity to select material relevant to their own lives, careers and projects. Choosing the sequence of learning materials allows learners to tailor the content to their individual needs, weaknesses and strengths.

Adopting the non-linear philosophy, eLearning programs like Zendesk Insights Advanced Learning offer users the best route to the information that is meaningful to them. There are several support tools available, including videos, guides, tasks, and challenge problems, but the combination and timing of these tools make each user experience unique. Bringing the advantages of non-linear learning to an online environment also lends flexibility for time management, so users can engage with the tool at their own pace ... (because it doesn’t matter if you’re the turtle or the hare, as long as you pass the finish line ☺).

* We love to hear what our partners think, so for their thoughts on Zendesk Insights Advanced Learning check out

Another year passed by

Yesterday my account in GoodData turned 5 years old!

It is one in the morning and the delivery service calls my phone; surprise! We’ve got champagne for you. It was from my colleagues. I’m alone and sick with the flu in bed, but almost shed a tear.

Every single day for the last 12 months I looked forward to going to work. And the main reason is the people at Keboola. Thank you guys, without you I would probably just sit at the cash register at TESCO and… well, whatever.

This seems like a good moment to look back at the last 12 months. This is not in any particular order - and not a complete list, either:

  • We adjusted our positioning. From “just selling and implementing” GoodData, we moved to data enablement and actually started to talk with everyone who might have a need to analyse data. You have the data, we will help you integrate it and get it “consumption ready”. If anyone wants to use they are more than welcome - we are here for our clients and “the tool” for our clients’ internal analytics guys to use. We also built several non-analytics applications, where the purpose is to deliver quality data into other companies’ products and platforms.
  • Our Keboola Connection ecosystem is growing rapidly. We are adding more and more new ways to push your data for analytics and data discovery. Along with GoodData, today we support Tableau and Chartio, and we are planning support for Birst, RJMetrics and Anaplan. I would love to have support for SAS soon as well. If you have a tool that can read data from a DB, from a CSV on your hard drive or from a URL, you can have data from us today.
  • In total silence we have launched our “Apps Store”. The main parts of it are our still very juvenile app “LuckyGuess” and transformation templates, which automate many daily routine tasks. Our goal is to support any app that really helps our users and analysts by providing some added value, whether it automatically analyzes data or automates processes with the data. If you can deliver such an app in Docker, we offer the best place to monetise it -- we have the computing power and clients have their data with us already… Our LuckyGuess app is primarily written in R and it does very basic, yet fundamental things: it detects relations between tables, lets you know the data types, detects dependencies (regressions) between the columns (“tell me which expenses bring the most customers”), and it can detect purchasing patterns and let you know when to go talk to a particular customer because he is most likely to buy. We are working on other apps, our own as well as partner-driven!

  • Marc Raiser is back. A few years ago he received an offer that he couldn’t refuse - working with Fujitsu Mission Critical Systems Ltd. (managing data from large machines and building AI on top of that). At the time we joked that Japan would be just an apprenticeship program for him and that he would be back soon. And voila, he is back working with us again, for now on development of LuckyGuess!
  • The rebuilding of our platform into async mode is nearly finished. It will give us unlimited horizontal scalability.
  • Martin Karásek developed a new design of our UI. No longer bare Bootstrap! While implementing the new design we also redesigned our whole approach to technical implementation of UI, and today everything will be an SPA application, on top of our APIs. Any partner of ours can skin it as he wishes and run it from his own servers if he has a need for that. Sneak peek of the Transformations UI:

  • We organised our first Enterprise Data Hackathon
  • We reorganized the company into two verticals - product and services. Our Keboola Connection team actually doesn’t have any direct clients any more. Everything is done through partners. Currently we have 7 partners. Our definition of a partner is any other entity which has a data business of their own and uses a tech stack from us to support their business. In just the last month we were approached by 4 more companies.
  • We now have a third partner in Keboola. So it’s me, Milan and Pavel Doležal. Pavel spends most of his time with our partners making sure they get all the right tools and support they need for their work and is leading the development of our partner network.
  • Vojta Roček left us and went his own "BI” way. Today he is in a new ecommerce holding Rockaway and he is leading people down the data-driven business development path. Keboola Connection adoption within Rockaway is growing every day.
  • Our extractor-framework - the environment where third parties can write their own extractors - is done and ready to use. Today it takes us ½ a day to connect to a new API.
  • We are finishing the app that can read Apiary Blueprint, and by doing so we shall be able to read data from any API documented in that format with minimum development.
  • Working on “schemas” - the possibility to use standardized nomenclature for naming and describing the data. Think of it as a "data ontology". It will allow creation of smarter Apps, as they will be able to understand the meaning of the data.
  • Just launched TAGs - it is like a form of dialogue between you and us about the data. It is enough for us if you just tag the column "location” and we will promptly serve you weather data for every address in the column. If you label a column as “currency” then right away you have the up to date currency exchange rates, etc.
  • We are still 25 people and growing without a need to add too many more.
  • Zendesk launched online courses for Zendesk Insights within our own Keboola Academy. We trained hundreds of people how to use GoodData.
  • Our “Team Canada” has moved into new offices.
  • We publish many components as open source. If it makes sense, we want to provide it for you for free. Our JSON2CSV convertor is a first sign of this trend. The dream would be to run the most used extractors for free as well.
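The TAGs idea from the list above (tag a column, get related context served automatically) can be sketched roughly as follows. This is not Keboola’s implementation: the registry, the function names and the lookup tables are all hypothetical, and the weather and exchange-rate values are placeholders rather than real data.

```python
# Placeholder lookup tables standing in for real weather and exchange-rate services.
PLACEHOLDER_RATES_TO_USD = {"EUR": 1.1, "CZK": 0.04}
PLACEHOLDER_WEATHER = {"Prague": "cloudy", "Vancouver": "rain"}


def enrich_currency(value):
    """Hypothetical enricher for columns tagged 'currency'."""
    return {"rate_to_usd": PLACEHOLDER_RATES_TO_USD.get(value)}


def enrich_location(value):
    """Hypothetical enricher for columns tagged 'location'."""
    return {"weather": PLACEHOLDER_WEATHER.get(value)}


# Registry mapping a tag name to the function that adds context for it.
ENRICHERS = {"currency": enrich_currency, "location": enrich_location}


def apply_tags(rows, column_tags):
    """Add enrichment columns to each row based on the tags assigned to its columns."""
    enriched = []
    for row in rows:
        new_row = dict(row)
        for column, tag in column_tags.items():
            enricher = ENRICHERS.get(tag)
            if enricher is not None:
                for key, value in enricher(row[column]).items():
                    new_row[f"{column}_{key}"] = value
        enriched.append(new_row)
    return enriched


rows = [{"city": "Prague", "currency_code": "CZK"}]
result = apply_tags(rows, {"city": "location", "currency_code": "currency"})
```

The point of the registry design is that the user only supplies the tag; which enrichment runs is decided by whoever maintains the mapping.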

So that's where we are, what we've been doing and where we're going. Exciting times!

Now, to take my medicine...