Seznam's Return on BI, Part One

The investment in Business Intelligence paid for itself ten times over in three months, says Michal Buzek, chief analyst at Seznam

The Czech Republic’s biggest web portal, Seznam.cz, has not only built a search engine to rival Google, but has also founded an empire of prospering services. From an email platform to a growing network of contextual advertisements (Sklik), Seznam has excelled at building a portfolio of complementary business ventures.

So how does this giant, with thousands of employees, manage and understand the vast amounts of data at its disposal? One word: GoodData. We sat down with Seznam’s Head Analyst, Michal Buzek, to dig deeper into this trade secret. The following two-part interview will uncover how an investment in their data has paid off with more than 5 million in profit.

How did it all start?

Some time around 2009, a decision was made to implement a Business Intelligence tool and an open competition took place. The former CEO of Seznam, Pavel Zima (now a deputy chairman of the managing board), invited GoodData to bid. At the time, I was part of the team that compared offers and provided recommendations to management. We met with Zdeněk Svoboda (co-founder of GoodData) a few times and he showed us GoodData’s capabilities using a sample of our business data from Sauto.cz. Compared to on-premise licensed BI tools, GoodData was extremely simple and quick, and on top of that, Mr. Svoboda was very smooth and natural in selling it to us.

Why exactly did you search for a Business Intelligence tool?

We needed to escape from Excel, where everybody brought their own report to a meeting and the quality of the data was inconsistent. What’s more, we had about four different business systems at that time. Long story short, we were looking for an integrated reporting tool that would allow us to get all the data we needed under one unified dashboard.

And how did you encounter Keboola?

We’d been using GoodData for about two years, but hadn’t launched any big initiatives in that area. From time to time we asked for a modification of the data model, but it wasn’t until our PR department found out that GoodData had the capabilities to interact with social networks that we were introduced to Keboola. We were told that they had developed the best connectors for integrating Facebook and Twitter data with GoodData.

What was your first impression?

Finally someone who resembles the types of people you can find here at Seznam. No suits.

So it all started with the project of getting the data from social networks?

Yes, but I had actually wanted to try something new in GoodData even before that. I wanted to expand the data models and play with other views on our data to see whether I could get someone else excited as well. I also wanted to speed up the addition of new items without the need to consult with GoodData each time.

Meanwhile, Keboola came and showed me some ways to improve the Dashboard in GoodData and also had their own tool, Keboola Connection. I won’t lie – I also read the post by Tomáš Čupr (a well-known Czech businessman and founder of Slevomat.cz, the most successful variation of Groupon) about the way they changed his life.

So what happened next?

In March 2013 we started building a new project for the sales department. I wanted to give the salesmen a fundamental reason to use GoodData. By that time we had been buying market research data for some years, specifically on expenditures on large-format advertising, but we hadn’t yet had a chance to make the most of it.

Recently, we connected this third-party data with our business system. We gave our salesmen a simple tool to track which types of advertising are purchased, how often and from where, so that they have a better understanding of the buying behaviour of our clients. This took our knowledge of our current and potential client base not one, but about five levels beyond what we had before.

Was it difficult to learn how to work with Keboola Connection?

No, there wasn’t much extra to learn. The data transformations in Keboola Connection are written in SQL, which our team was already using. I personally got the hang of it after a few weeks. My favourite toy is Sandbox, a “training environment” to which I can send input tables and play with queries for as long as it takes to get the right result.

What have you already managed to create?

The sales department of Seznam is quite big and the teams are diverse, so the demands for statistics vary. People from Sklik need one kind of report, the team specializing in serving large clients needs another. This is why we are continuously developing the project; we cannot just set things up once and be done with it. That being said, I have yet to see a request that we couldn’t solve with Keboola Connection’s help.

And what specific projects have you launched?

In GoodData we have taken on several projects, beginning with the social networks and ending with the buying behavior of clients. We divide clients according to their industries, we watch their seasonality according to category traffic on Firmy.cz and we try to approach them proactively based on this insight. The salesman picks a category on his dashboard and is then able to see the listed clients, their solvency and their spending outside Seznam. From this, he knows exactly whom to call and when.

How is the sales team responding to Keboola Connection and GoodData?

The sales department has its people 100% under control thanks to Keboola and GoodData, so their response is of course very positive. When you hear a sales manager with more than eight years of experience saying that he cannot imagine his work without GoodData anymore, it’s certainly something you like to hear.

Does it pay off financially?

After three months of running the project, I could easily see the results (in dollars) in the business managers’ performance, results that we knew for certain were earned thanks to information from GoodData. I can’t talk in exact numbers, but the investment in the database and BI consulting was in the hundreds of thousands range, and it paid off with more than 5 million in profit.

So it does really pay off?

Sure it does. Not only does GoodData help us to generate more money, but it also helps us find the areas where we can keep from losing it. A businessman can spend his time and energy only where it’s worthwhile. Decisions are not driven by gut feeling anymore; they are based on hard data. We see costs drilled down to the tiniest of details. We can find the causes of growth and we are able to see exactly what impacts our profit and how.

Seznam's Return on BI, Part Two

Why I'm not a Data Scientist

During my tenure at Keboola, and for some time before that, I’ve helped to design successful BI implementations for numerous companies, big and small.

In my role I taught others and helped them to achieve the same. Together, we build solutions that amaze me daily with their ability, the value they bring to the users, and their potential for the future. We process billions of rows of data, tens of millions of text entries of all kinds, millions of deals and billions of dollars in business transactions. We perform some serious analytics over all that, helping to draw out business value for our clients every day. We innovate and help to redefine what it means to do BI. Our own company runs on data.

Yet, I would not call myself a Data Scientist.

I rarely code. I suck at stats. I definitely need to freshen up on my math skills. I avoid fancy terms like OLAP cube and Linear Regression. I prefer simple language. With my resume, I wouldn’t fit the bill for 80% of the data analyst job postings out there.

I don’t hold a PhD.

For me, Big Data is not a category of its own. It is something too big to handle using the tools at hand. So you get a bigger hammer and move on.

I’m a user, in all senses of the word. I’m addicted to data. I look for it everywhere, behind every question and problem. I love great business ideas and using data to make them fly. I love to work with people who think the same way.

How do I pull it off? Sometimes I wonder. For the most part, I believe it’s about the right tools. Tools that are conducive to this kind of thinking. I mostly use just two of them: Keboola Connection to bring the data together and put it where and how I need it, and GoodData to extract the meanings and answers to business questions.

Petr Olmer, Director of Expert Services at GoodData, once tweeted that the most underused tool in BI is the human brain, and the most underrated method is asking questions. I believe it, and would add that the term “Data Scientist” ranks up there with the most over- (and mis-) used.

At Keboola we are trying to change that. Consultants at Keboola are people who understand the business and speak its language. They use their brains and ask a lot of questions.

Both Keboola and GoodData have some brilliant people that you could call serious scientists, data or otherwise. But their talents are being applied to making the tools smarter and more useful for us, the common folks. What they do keeps things simple for us. It allows us to focus on the business objective of the task at hand rather than the “how” of it all. Thanks to them, you don’t need to hire a scientist (or be one) to find the wealth in your data.

But you might want to talk to - or become - one of us.

The Beginner’s Guide To Keboola

"The whole thing is a bit complicated…" started Vojta, one of Keboola’s consultants, over an English breakfast in the coffee shop with the best coffee in Prague. He was right. It was complicated. But a few hours (and a pint of coffee) later, I had a pretty good idea of what was going on. Here, I will try to relay it to you.

Intro: Companies today often have enough data to get completely lost in it, and putting it into context to extract any useful meaning can seem impossible. Even when they manage it, the cost in time and money is high.

Finding the gold in the data

Keboola does something called data ETL (Extract, Transform, Load). It sounds (just like many other fancy terms from this field) more complicated than it is.

Keboola helps you:

  1. Identify, locate and pull together all data relevant to your business from both your own and third-party sources. Anything from accounting and ERP systems to relevant government open-data initiatives to comments on your Facebook pages. This is the Extract stage.
  2. Manage the whole load and organize it into a structure in which one can meaningfully work with it. That’s Transform.
  3. Push the data into the system or application selected for the final consumption. That’s Load.
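
To make the three stages a bit more concrete, here is a minimal sketch in Python. It is not how Keboola Connection works internally; the sample data, the table name `revenue` and the functions `extract`, `transform` and `load` are all made up for illustration, and an in-memory SQLite database stands in for the final destination.

```python
import csv
import io
import sqlite3

# Hypothetical sample inputs standing in for two unrelated sources.
ACCOUNTING_CSV = """customer_id,amount
1,120.50
2,80.00
1,40.25
"""

CRM_CSV = """customer_id,name
1,Acme
2,Globex
"""


def extract(raw_csv):
    """Extract: read rows from one source into a list of dicts."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(invoices, customers):
    """Transform: join the two sources and total the revenue per customer."""
    names = {row["customer_id"]: row["name"] for row in customers}
    totals = {}
    for row in invoices:
        name = names[row["customer_id"]]
        totals[name] = totals.get(name, 0.0) + float(row["amount"])
    return sorted(totals.items())


def load(rows, connection):
    """Load: write the organized rows into the destination table."""
    connection.execute("CREATE TABLE revenue (customer TEXT, total REAL)")
    connection.executemany("INSERT INTO revenue VALUES (?, ?)", rows)
    connection.commit()


def run_pipeline():
    """Run Extract, Transform and Load, then read back what was loaded."""
    connection = sqlite3.connect(":memory:")
    rows = transform(extract(ACCOUNTING_CSV), extract(CRM_CSV))
    load(rows, connection)
    return connection.execute(
        "SELECT customer, total FROM revenue ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(run_pipeline())
```

The point of keeping the stages as separate functions is that each one can change on its own: a new source only touches `extract`, a new business rule only touches `transform`, and a new destination only touches `load`.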

The toolset that Keboola uses to perform (amongst other things) the ETL tasks is their own Keboola Connection.

The platform that Keboola uses for the analytics and producing all of those wondrous charts and dashboards is GoodData.

So what is it all good for?

You’ve got data. Lots of it.

To give it meaning, the data needs to be pre-processed, with the pieces put in order and given the right context, so that GoodData will give you the results you need. That is what Keboola is for. Keboola:

  • Helps you to find meaning in your data.
  • Continuously processes your data using Keboola Connection.
  • Sets up GoodData so you can find the answers you need. Answers to questions like “how much revenue did we get from customers brought to us by the expensive marketing campaign from last fall?” or “what impact does weather have on our sales people’s performance?” Or whatever else comes to mind.
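
Once the data is in order, a question like the campaign one above boils down to a short query. The sketch below is only an illustration under assumed names: the tables `customers` and `orders`, the column `acquired_by` and the campaign label `fall_campaign` are hypothetical, and SQLite stands in for whatever system actually holds the data.

```python
import sqlite3


def build_sample_db():
    """Create a tiny in-memory database with made-up customers and orders."""
    connection = sqlite3.connect(":memory:")
    connection.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, acquired_by TEXT);
        CREATE TABLE orders (customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES
            (1, 'fall_campaign'), (2, 'organic'), (3, 'fall_campaign');
        INSERT INTO orders VALUES
            (1, 100.0), (1, 50.0), (2, 75.0), (3, 25.0);
        """
    )
    return connection


def revenue_by_campaign(connection, campaign):
    """Sum order revenue from customers acquired by the given campaign."""
    row = connection.execute(
        """
        SELECT COALESCE(SUM(o.amount), 0)
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        WHERE c.acquired_by = ?
        """,
        (campaign,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    db = build_sample_db()
    print(revenue_by_campaign(db, "fall_campaign"))
```

The query itself is the easy part. It only works because each order can be tied to a customer and each customer to the campaign that brought them in, which is exactly the kind of context the pre-processing has to supply.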

Keboola can do all of that pretty fast and practically without limitations. But that’s my topic for next time.

If anything here doesn’t make sense to you, please ask! I’ll reply and explain better in the article.

Amazon's anticipatory shipping

Amazon is just taking the next logical step against traditional retail

I’ve now read about a dozen various reactions to Amazon’s patent on their “anticipatory shipping” (you probably saw, or even read, USA Today’s article recommended on LinkedIn). While I don’t dispute the brilliance of the idea, I think it’s worth mentioning that we’re looking at a fairly predictable extension of Amazon’s logistical model, with a bit of excellent lateral thinking thrown into the mix.

When you think about it, brick-and-mortar retail chains have been doing the same thing since their inception. “Shipping into a general geographic area” is in their language called “putting goods on shelves” in a particular store. They use exactly the same data analytic techniques to estimate how much to put there to avoid both mark-downs and run-outs. Walmart built much of its success on the ability to know exactly how much to have where and when. Replace the words “store location” (which serves people in a particular area) with “general geographic area” (state, county, zip-code) and you are back in Amazon’s world.

With the data it has, Amazon can of course predict orders in a particular area with at least the same accuracy with which Walmart can determine how many boxes of a particular toothpaste to put on a truck in their DC today so it hits the store just at the right time. It’s not perfect, but it does work very well indeed. Now if you imagine that the “general geographic area” happens to be an area served by a particular UPS depot, then all you need to do is to send the stuff there and then just collect the orders by the time the local delivery vans are being loaded. The better your “prediction”, the fewer items will be left at the depot without an addressee that day (which may even be, up to a point, just fine with UPS, given how much business they see from Amazon), and the fewer people will have to wait an additional day to get their items. Amazon is effectively distributing their DC closer to the user, using the trucks and planes as their warehouse.

Adrian Gonzales in his post looks at the whole thing from an additional, very interesting angle. The shipments to a particular area can become, in a way, self-fulfilling prophecies. The one thing traditional retail still holds over the on-line business is the ability to possess the desired item right there and then. While Amazon won’t any time soon be shipping you shelves of items to pick from and letting you send back what you don’t want, with the package already on the way at the time of your order, they’re coming pretty close. With this (on average) shorter time between order and delivery, and with the cost of shipping staying standard, unlike with same-day deliveries, Amazon is further strengthening its offering and giving people more reasons to buy online rather than go to a store. In addition to that, they’re opening doors to impulse purchasing (“we think you probably want this, we have an extra on a truck near you, click yes/no”). Or imagine a Dutch auction for those not-yet-spoken-for items: the price drops until someone says yes and “outbids” the others.

At Keboola, we are working with clients on both sides of this online vs. brick-and-mortar struggle. Both principles have a place in our future shopping habits, but both of them need to work hard to balance the advantages of the other. Data happens to be the weapon of choice on both sides. While online retailers are trying to eliminate the time-to-value gap of purchases against retail, traditional retailers are learning more and more about us, individual shoppers, and our patterns. So what will be Bentonville’s answer to Amazon’s challenge? Maybe a shopping cart, waiting for you at the entrance of Walmart, already pre-loaded with the items you are almost certainly planning to buy today. You then just pick up the few unusual pieces and off you go.


Milan Veverka

Behold. The Official Keboola Blog has started

Hi! I am Martin, official Keboola Data Rookie, here to share the first post of our new blog!

My goal is to explain what it is we actually do here at Keboola… which can be pretty hard to sum up in one sentence. Every side of Keboola tells a different story, so each month we’ll give you insight into our business from a different perspective. With every post I hope to tell you something new and exciting, but if you’re still left with burning questions please let me know so I can answer them! 

Here’s a taste of what you can expect to see:

  • Keboola Basics. Or kindergarten for data analysts - as simple as what Keboola does, for whom and how. (Successfully tested on my grandmother)
  • Data for Business. We know you want to see numbers, but we also want to give you a comprehensive overview of how and what Keboola does to help companies earn big money through Big Data. We’ll share these stories through interviews, case studies, and our cultivated best practices.
  • Nerd Zone. For the tech savvy and future innovators, this is the space where analysts can embrace their inner geeks. Look forward to detailed articles full of pure know-how.
  • From Keboola With Love. We have offices in two very (culturally) different time zones, so get a behind-the-scenes look at what happens on the other side of the screen.

From one data enthusiast to another, thanks for taking the time to hear what I have to say. Stay tuned for our next post, where we sit down with Michal Buzek, head of the analytical department at Seznam.cz (a respectable Czech rival of Google), and find out the results of his project with Keboola.