Recommendation Engine in 27 lines of (SQL) code

In the data preparation space, the focus very often lies on BI as the ultimate destination of data. But more and more often, we see how data enrichment can loop straight back into primary systems and processes.

Take recommendations. Even the basic type (“customers who bought this also bought…”) is, in its simplest form, an outcome of basket analysis.

We recently had a customer ask for a basic recommendation engine as part of a proof of concept to determine whether Keboola Connection was the right fit for them. The dataset we got to work with came from a CRM system and contained a few thousand anonymized rows (4600-ish, actually) of won opportunities, which effectively represented product-customer relations (which customers had purchased which products). So, pretty much ready-to-go data for the Basket Analysis app, which has been waiting for just this opportunity in the Keboola App Store. It sounded like a good challenge: how could we turn this into a basic recommendation engine?
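To sketch the core idea (this is an illustration, not the customer's actual implementation), the heart of such a basket analysis is a single self-join: for every pair of products, count how many customers bought both. The table and product names below are invented for the example; the same SQL pattern works in any warehouse.

```python
import sqlite3

# Hypothetical won-opportunity rows: (customer, product) pairs.
rows = [
    ("c1", "crm"), ("c1", "email"), ("c1", "chat"),
    ("c2", "crm"), ("c2", "email"),
    ("c3", "crm"), ("c3", "chat"),
    ("c4", "email"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE opportunities (customer TEXT, product TEXT)")
conn.executemany("INSERT INTO opportunities VALUES (?, ?)", rows)

# Self-join on customer: for each pair of products bought by the same
# customer, count the customers who bought both. The top rows per
# product become the "customers who bought this also bought" list.
sql = """
SELECT a.product AS bought,
       b.product AS also_bought,
       COUNT(DISTINCT a.customer) AS customers
FROM opportunities a
JOIN opportunities b
  ON a.customer = b.customer
 AND a.product <> b.product
GROUP BY a.product, b.product
ORDER BY bought, customers DESC
"""
recs = conn.execute(sql).fetchall()
for bought, also, n in recs:
    print(f"{bought} -> {also} ({n} shared customers)")
```

In practice you would add a minimum-support threshold and normalize by product popularity, but even this bare co-occurrence count already yields usable "also bought" suggestions.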

Find the Right Music: Analyzing last.fm data sentiment with Keboola + Tableau


As we covered in our recent NLP blog, there are a lot of cool use cases for text and sentiment analysis. One recent instance we found really interesting came out of our May presentation at SeaTUG (Seattle Tableau User Group). As part of our presentation and demo, we decided to find out what some of the local Tableau users could do with trial access to Keboola; below we’ll highlight what Hong Zhu and a group of students from the University of Washington were able to accomplish with Keboola + Tableau for a class final project!

What class was this for and why did you want to do this for a final project?

We are a group of students in the University of Washington’s department of Human Centered Design and Engineering. For our class project for HCDE 511 – Information Visualization, we made an interactive tool to visualize music data from Last.fm. We chose the topic of music because all four of us are music lovers.

Initially, the project was driven by our interest in having an international perspective on the popularity vs. obscurity of artists and tracks.  However, after interviewing a number of target users, we learned that most of them were not interested in rankings in other countries.  In fact, most of them were not interested in the ranking of artists/tracks at all.  Instead, our target users were interested in having more individualized information and robust search functions, in order to quickly find the right music that is tailored to one’s taste, mood, and occasion.  Therefore, we re-focused our efforts on parsing out the implicit attributes, such as genre and sentiment, from the 50 most-used tags of each track.  That was when Keboola and its NLP plug-in came into play and became instrumental in the success of this project.
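The kind of tag parsing the students describe can be sketched in a few lines. This is a simplified illustration, not their actual pipeline (a real project would lean on an NLP service for sentiment, as they did with Keboola's plug-in), and the tag-to-mood table below is invented for the example.

```python
# Hypothetical mapping from Last.fm-style tags to coarse mood buckets.
# A real pipeline would derive sentiment from an NLP service rather
# than a hand-made lookup table.
MOOD_TAGS = {
    "chill": "calm", "mellow": "calm", "ambient": "calm",
    "upbeat": "energetic", "dance": "energetic", "party": "energetic",
    "sad": "melancholic", "melancholy": "melancholic",
}

def bucket_track(tags):
    """Return the mood bucket with the most matching tags, or None."""
    counts = {}
    for tag in tags:
        mood = MOOD_TAGS.get(tag.lower())
        if mood:
            counts[mood] = counts.get(mood, 0) + 1
    return max(counts, key=counts.get) if counts else None

print(bucket_track(["Chill", "ambient", "party"]))  # calm (2 tags vs 1)
```

Run over each track's 50 most-used tags, a bucketing like this turns free-form community tags into the individualized search facets the target users asked for.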

The value of text (data) and Geneea NLP app

Just last week, a client let out a sigh: “We have all this text data (mostly customer reviews) and we know there is tremendous value in that set, but short of reading it all and manually sorting through it, what can we do with it?”

With text becoming a bigger and bigger chunk of a company’s data intake, we hear questions like that more and more often. A few years ago, the “number of followers” was about the only metric people would get from their Twitter accounts. Today, we want to (and can) know much more: What are people talking about? How do we escalate their complaints? What topics are trending across data sources and platforms? Those are just some of the questions we’re asking of the NLP (Natural Language Processing) applications at our disposal.

Besides the more obvious social media use cases, there are many areas where text analytics can play an extremely valuable role: customer support (think of all the ticket descriptions and comments), surveys (most have open-ended questions, and their answers often contain the most valuable insights), e-mail marketing (whether analyzing outbound campaigns to better understand what works and what doesn’t, or compiling inbound e-mails) and lead generation (what do people mention when reaching out to you?), to name a few. From time to time we even come across more obscure requests, like extracting critical information (for example, contract expiration dates) from text descriptions of past deals, or comparing bodies of text to determine “likeness” (when comparing things like product or job descriptions).
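That “likeness” comparison can be sketched very simply. The snippet below is an illustrative baseline, not a production approach: it scores two descriptions by the share of words they have in common (the Jaccard index on word sets).

```python
import re

def jaccard_similarity(text_a, text_b):
    """Crude "likeness" score: the share of distinct words two texts
    have in common (Jaccard index on lowercased word sets)."""
    a = set(re.findall(r"[a-z']+", text_a.lower()))
    b = set(re.findall(r"[a-z']+", text_b.lower()))
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

score = jaccard_similarity(
    "Senior data analyst with SQL experience",
    "Data analyst experienced in SQL",
)
print(round(score, 2))
```

Real NLP applications go much further (stemming, embeddings, semantic similarity), but even this word-overlap baseline is often enough to surface near-duplicate product or job descriptions.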

Keboola and Slalom Consulting Team up to host Seattle’s Tableau User Group

On Wednesday, May 18th, Keboola’s Portland and BC teams converged in Seattle to host the city’s monthly Tableau User Group with SeaTUG’s regular hosts and organizers, Slalom Consulting. Together we assembled a full evening of discussion around how to solve complex Tableau data problems using KBC. With 70+ people in attendance, Seattle’s Alexis Hotel was buzzing with excitement!

The night began with Slalom’s very own Anthony Gould, consultant, data nerd and SeaTUG host extraordinaire, welcoming the group and getting everyone riled up for the night’s contest: awarding the attendee whose SeaTUG-related tweet got the most retweets! He showed everyone how we used Keboola Connection (KBC) to track that data, and let them know the standings would be updated at the end of the night, with prizes distributed!

Cleaning Dirty Address Data in KBC

There is an increasing number of use cases and data projects for which geolocation data can add a ton of value: e-commerce and retail, supply chain, sales and marketing, etc. Unfortunately, one of the most challenging asks of any data project is relating geographical information to various components of the dataset. On a more positive note, however, KBC’s easy integration with Google apps of all kinds allows users to leverage Google Maps to add geocoding functionality. Since we have so many clients taking advantage of geocoding capabilities, one of our consultants, Pavel Boiko, outlined the process of adding this feature to your KBC environment. Check it out!
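For a flavor of what that geocoding step involves, here is a minimal Python sketch against the Google Maps Geocoding API. The endpoint and response shape are the API's documented ones, while the sample address and the API key are placeholders; parsing is split into its own function so it can be exercised on a canned response without a network call or a key.

```python
import json
import urllib.parse
import urllib.request

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

def geocode(address, api_key):
    """Send a raw, possibly dirty address string to the Google Maps
    Geocoding API and return the parsed result."""
    query = urllib.parse.urlencode({"address": address, "key": api_key})
    with urllib.request.urlopen(f"{GEOCODE_URL}?{query}") as resp:
        return parse_geocode(json.load(resp))

def parse_geocode(payload):
    """Extract (formatted_address, lat, lng) from a geocode response,
    or None if the address could not be resolved."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    top = payload["results"][0]
    loc = top["geometry"]["location"]
    return top["formatted_address"], loc["lat"], loc["lng"]

# Canned response in the API's documented shape, so the parsing logic
# can be demonstrated without calling the service.
sample = {
    "status": "OK",
    "results": [{
        "formatted_address": "1600 Amphitheatre Pkwy, Mountain View, CA",
        "geometry": {"location": {"lat": 37.42, "lng": -122.08}},
    }],
}
print(parse_geocode(sample))
```

The cleaned `formatted_address` the API returns is itself useful for standardizing dirty input, on top of the latitude/longitude you need for mapping.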

Anatomy of an Award Winning Data Project Part 3: Ideal Problems not Ideal Customers

Hopefully you’ve had a chance to read about our excitement and pride upon learning that two of our customers had won big awards for the work we’d done together. To jog your memory, Computer Sciences Corporation (CSC)’s marketing team won the ITSMA Diamond Marketing Excellence Award as a result of the data project we built together: CSC used KBC to bridge together 50+ data sources and push those insights out to thousands of CSC employees. To catch up on what you missed or to read it again, revisit Part 1 of our Anatomy of an Award Winning Data Project.

Additionally, the BI team at Firehouse Subs won Hospitality Technology’s Enterprise Innovator Award for its Station Pulse dashboard, built on a KBC foundation. The dashboard measures each franchise’s performance on 10 distinct metrics, pulling data from at least six sources. To catch up on what you missed or to read it again, revisit Part 2 of our Anatomy of an Award Winning Data Project.

We’re taught that most businesses have a “typical” or “ideal” customer. When crafting a marketing strategy or explaining your business to partners, customers and your community, this concept comes up repeatedly. Yet we don’t really have a ready-made answer: a data-driven business can be in any industry, and the flexibility and agility of the Keboola platform make it, by its very nature, data-source and use-case agnostic.

And so, when these two customers of ours both won prestigious awards highlighting their commitment to data innovation, it got us thinking. These two use cases are pretty different. We worked with completely different departments, different data sources, different end-users, different KPIs, etc. And yet both have been successful, award-winning projects.

We realized that perhaps the question of an ideal customer isn’t really relevant for us. Perhaps we’d been asking the wrong question all along. We can’t define our target customer, but we can define the target problem that our customers need help solving.

Anatomy of an Award Winning Data Project Part 2: Firehouse Subs Station Pulse BI Dashboard


As we reported last week, we are still beaming with pride, like proud parents at a little league game or a dance recital. Not one, but two of our customers won big fancy awards for the work we did together. The concept of a data-driven organization has been discussed and proposed as an ideal for a while now, but how we define and identify those organizations is certainly still up for debate. We’re pretty confident that the two customers in question, Computer Sciences Corporation (CSC) and Firehouse Subs, would be prime contenders. These awards highlight their commitment to going further than their industry counterparts to empower employees and franchisees to leverage data in new and exciting ways.

If you missed last week’s post with CSC’s Chris Marin, check it out here. Today, let’s learn more about Firehouse Subs’ award-winning project. In case you don’t know much about Firehouse Subs, let me bring you up to speed. The sandwich chain started in 1994 and as of March 2016 has more than 960 locations in 44 states, Puerto Rico and Canada. Firehouse Subs is no stranger to winning awards, either: in 2006, KPMG named them “Company of the Year,” and they’ve been recognized for their commitment to community service and public safety as well through the Firehouse Subs Public Safety Foundation®, created in 2005.


Now let’s hear from our project champion and our main ally at Firehouse Subs, Director of Reporting and Analytics, Danny Walsh.

Anatomy of an Award Winning Data Project Part 1: CSC and Marketing Analytics

Here at Keboola, we take pride in working closely with partners and customers ensuring that each project is a success. Typically we’re there from the beginning - to understand the problem the client needs to solve; to help them define the scope and timeline of the implementation; to provide the necessary resources to get buy in from the rest of their team; to offer alternative perspectives and options when mapping out the project; and to be their ally and guide throughout every step of the process. With all that work, all that dedication, it turns out we develop quite a soft spot for both our clients and their projects. 

We’ve got skin in the game, so when one of our clients receives an award because of the project we worked on together, we get pretty excited. And when two clients receive an award because of our work together, well, then we’re downright ecstatic and ready to celebrate!

At the end of 2015, two of our customers were honored for their commitment to data innovation: Firehouse Subs® was awarded the Hospitality Technology Innovation Award, and the digital marketing team at Computer Sciences Corporation (CSC) received the ITSMA Diamond Marketing Excellence Award.

Since new partners and clients often ask us to explain what components and environment cultivate a successful data project, we thought we’d take this exceptional opportunity to ask our customers themselves: Danny Walsh, Director of Reporting and Analytics, Firehouse Subs and Chris Marin, Senior Principal, Digital Marketing Platform & Analytics, CSC.

Over the next couple of weeks, we’ll share each of their stories and explain how we feel these separate use cases in two distinctly different industries are reflective of what we at Keboola view as the ideal conditions for creating a wildly successful - award-winning even - data project.

Empowering the Business User in your BI and Analytics Environment


There’s one trend on Gartner’s radar that hasn’t changed much over the last few years and that’s the increasing move toward a self-service BI model. Gone are the days of your IT or analytics department being report factories. And if those days aren’t gone for you, then it’s time you make some substantive changes to your business intelligence environment. When end-users are forced to rely on another department to deliver the reports they need, the entire concept of being a “data-driven” organization goes right out the window. 

So other than giving your users access to ad hoc reporting capabilities, how do you empower the user?

Bi-Modal BI: Balancing Self-Service and Governance


The age-old conflict: IT needs centralization, governance, standards and control; on the other side of the coin, business units need the ability to move fast and try new things. How can we give lines of business access to the data they need for projects, so they can spend their time focused on discovering new insights? Typically they get stuck in a bottleneck of IT requests, or end up spending 80% of their time doing data integration and preparation. Neither group seems particularly excited about that work, and I don’t blame them. For the analyst, it increases the complexity of their tasks and seriously raises the technical knowledge requirements. For IT, it’s a major distraction from their main purpose in life, an extra thing to do. Self-serve BI is trying to destroy the backlogged “report factories,” only to replace them with “data stores,” which are sadly even less equipped for the job at hand. Either way, the result is a painfully inefficient process, straining both ends of the value chain in any company that embarks on the data-driven journey.

The Bi-Modal BI Answer?

An organization's ability to effectively extract value from data and analytics while maintaining a well-governed source of truth is the difference between competitive advantage on the one hand, and sunk costs and missed opportunities on the other. How can we create an environment that provides the agile data access business users need while still maintaining sound data governance? Gartner has referred to this as a Bi-modal IT strategy. A big challenge with Bi-modal IT is that it pushes IT management to divide their efforts between IT's traditional focus and a more business-focused agile methodology.

The DBA and Analyst Divide

Another major challenge in data access comes from the separation between DBAs and business users. Although the technical side may have the expertise needed to implement ETL projects, they often lack the business domain knowledge needed to make the correct assumptions about context and how the data is used. With so many projects competing for resources, we shouldn’t have to assign a DBA to all of them. On the flip side of the coin, data analysts and scientists want the right data for their tools of choice, and they want it fast. Even though there is a growing set of data integration tools that lets individual business units create and maintain their own data projects, this typically requires a lot of manual data modeling and can lead to siloed data or inconsistent metrics.

Instead of controlling all of BI, IT can enable the business to develop their analytics without sacrificing control and governance standards.  So how can we get the right data in the hands of people who understand and need it in a timely manner?