New dose of steroids in the Keboola backend

More than two years after we announced support for Amazon Redshift in Keboola Connection, it’s about friggin’ time to bring something new to the table. Something that will propel us further along. Voilà: welcome, Snowflake.

About 10 months ago, we presented Snowflake for the first time at a meetup hosted at the GoodData office.

Today, we use Snowflake both behind the Storage API (it is now the standard backend for our data storage) and the Transformations Engine (you can utilize the power of Snowflake for your ETL-type processes). Snowflake’s SQL documentation can be found here.
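To make the transformation side concrete, here is a minimal sketch of what a SQL transformation running on Snowflake could look like; the table and column names (customers, orders, dim_customer_revenue) are hypothetical illustrations, not taken from a real project:

    -- Hypothetical Snowflake transformation: aggregate order revenue
    -- per customer into a new output table.
    CREATE TABLE "dim_customer_revenue" AS
    SELECT
        c."customer_id",
        c."customer_name",
        SUM(o."amount") AS "total_revenue",
        COUNT(*)        AS "order_count"
    FROM "customers" c
    JOIN "orders" o ON o."customer_id" = c."customer_id"
    GROUP BY 1, 2;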

What on Earth is Snowflake?

It’s a new database, built from scratch to run in the cloud. Something different from a legacy vendor taking an old DB and hosting it for you (MSSQL on Azure, Oracle in Rackspace or PostgreSQL in AWS).

Guiding project requirements for analytics

In a recent post, we started scoping our executive level dashboards and reporting project by mapping out who the primary consumers of the data will be, what their top priorities and challenges are, which data we need and what we are trying to measure. It might seem like we are ready to start evaluating vendors and building out the project, but we still have a few more requirements to gather.

What data can we exclude?

With our initial focus on sales analytics, the secondary data we would want to include (NetProspex, Marketo and ToutApp) all integrate fairly seamlessly with Salesforce, so they won't require as much effort on the data prep side. If we pivot over to our marketing function, however, things get a bit murkier. On the low end, this could mean a dozen or so data sources; but what about our social channels, Google Ads and various spreadsheets? In more and more instances, particularly for a team managing multiple brands or channels, the number of potential data sources can easily shoot into the dozens.

Although knowing what data we should include is important, what data can we exclude? Unlike the data lake philosophy (Forbes: Why Data Lakes Are Evil), when we are creating operational level reporting, it's important to focus on creating value, not on overcomplicating our project with additional data sources that don't actually yield additional value.

Who's going to manage it?

Just as critical to the project as the what and the how: who’s going to be managing it? What skills do we have at our disposal, and how many hours can we allocate for the initial setup as well as ongoing maintenance and change requests? Will this project be managed by IT, our marketing analytics team, or both? Perhaps IT will manage data warehousing and data integration while the analyst focuses on capturing end user requirements and creating the dashboards and reports. Depending on who's involved, the functionality of the tools and the languages used will vary. As mentioned in a recent CMS Wire post, Buy and Build Your Way to a Modern Business Analytics Platform, it's important to take an analytical inventory of the skills we have, as well as the tools and resources we already have that we may be able to take advantage of.


When Salesforce Met Keboola: Why Is This So Great?


How can I get more out of my Salesforce data?

Along with being the world’s #1 CRM, Salesforce provides an end-to-end platform to connect with your customers, including Marketing Cloud to personalize experiences across email, mobile, social, and the web; Service Cloud to support customer success; Community Cloud to connect customers, partners and employees; and Wave Analytics, designed to unlock the data within.

After going through many Salesforce implementations, I’ve found that although companies store their primary customer data there, there is a big opportunity to enrich it further by bringing in related data stored in other systems, such as invoices in an ERP or contracts in a dedicated DMS. For example, I’ve seen clients run into the issue of having inconsistent data in multiple source systems when a customer changes their billing address. In a nutshell, Salesforce makes it easy to report on the data stored within it, but it can’t provide a complete picture of the customer unless we broaden our view.
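To illustrate the kind of enrichment meant here, a minimal sketch of the join involved, assuming both the Salesforce accounts and an ERP invoice extract have already been loaded as tables (the names sf_accounts and erp_invoices are hypothetical):

    -- Hypothetical sketch: enrich Salesforce accounts with invoice
    -- totals from an ERP extract loaded alongside them.
    SELECT
        a."account_id",
        a."account_name",
        SUM(i."invoice_amount") AS "billed_to_date"
    FROM "sf_accounts" a
    LEFT JOIN "erp_invoices" i ON i."account_id" = a."account_id"
    GROUP BY 1, 2;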

Recommendation Engine in 27 lines of (SQL) code

In the data preparation space, the focus very frequently lies on BI as the ultimate destination of data. But we see, more and more often, how data enrichment can loop straight back into the primary systems and processes.

Take recommendations. Just the basic type (“customers who bought this also bought…”). That, in its simplest form, is an outcome of basket analysis.

We recently had a customer who asked for a basic recommendation engine as part of a proof of concept to determine whether Keboola Connection is the right fit for them. The dataset we got to work with came from a CRM system and contained a few thousand anonymized rows (4,600-ish, actually) of won opportunities, which effectively represented customer-product relations (which customers had purchased which products). So, pretty much ready-to-go data for the Basket Analysis app, which has been waiting for just this opportunity in the Keboola App Store. Sounded like a good challenge: how can we turn this into a basic recommendation engine?
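To give a sense of how compact this can be, here is a minimal sketch of the co-occurrence query at the heart of “customers who bought this also bought…”; the table and column names (opportunities, customer_id, product_id) are hypothetical stand-ins for the CRM export, not the customer’s actual schema:

    -- Hypothetical basket-analysis sketch: for each product, count how
    -- many customers also bought each other product.
    SELECT
        a."product_id" AS "product",
        b."product_id" AS "also_bought",
        COUNT(DISTINCT a."customer_id") AS "shared_customers"
    FROM "opportunities" a
    JOIN "opportunities" b
        ON  a."customer_id" = b."customer_id"
        AND a."product_id" <> b."product_id"
    GROUP BY 1, 2
    ORDER BY "product", "shared_customers" DESC;

Ranking the output per product (for example, with a window function) then yields the top-N recommendations for each product.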

Find the Right Music: Analyzing last.fm data sentiment with Keboola + Tableau


As we covered in our recent NLP blog, there are a lot of cool use cases for text and sentiment analysis. One recent instance we found really interesting came out of our May presentation at SeaTUG (Seattle Tableau User Group). As part of our presentation and demo, we decided to find out what some of the local Tableau users could do with trial access to Keboola; below we’ll highlight what Hong Zhu and a group of students from the University of Washington were able to accomplish with Keboola + Tableau for a class final project!

What class was this for and why did you want to do this for a final project?

We are a group of students in the University of Washington’s Department of Human Centered Design and Engineering. For our class project for HCDE 511 – Information Visualization, we made an interactive tool to visualize music data from Last.fm. We chose the topic of music because all four of us are music lovers.

Initially, the project was driven by our interest in having an international perspective on the popularity vs. obscurity of artists and tracks.  However, after interviewing a number of target users, we learned that most of them were not interested in rankings in other countries.  In fact, most of them were not interested in the ranking of artists/tracks at all.  Instead, our target users were interested in having more individualized information and robust search functions, in order to quickly find the right music that is tailored to one’s taste, mood, and occasion.  Therefore, we re-focused our efforts on parsing out the implicit attributes, such as genre and sentiment, from the 50 most-used tags of each track.  That was when Keboola and its NLP plug-in came into play and became instrumental in the success of this project.
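As an illustration of the kind of rollup involved, a per-track sentiment score could be derived from tag-level scores with a query like the following; the tables and columns (track_tags, tag_sentiment) are hypothetical, since the actual scoring was done by the NLP plug-in:

    -- Hypothetical sketch: average tag-level sentiment scores into a
    -- single score per track, over each track's 50 most-used tags.
    SELECT
        t."track_id",
        t."track_name",
        AVG(s."sentiment_score") AS "track_sentiment"
    FROM "track_tags" t
    JOIN "tag_sentiment" s ON s."tag" = t."tag"
    GROUP BY 1, 2;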

The value of text (data) and Geneea NLP app

Just last week, a client let out a sigh: “We have all this text data (mostly customer reviews) and we know there is tremendous value in that set, but aside from reading it all and manually sorting through it, what can we do with it?”

With text becoming a bigger and bigger chunk of a company’s data intake, we hear questions like that more and more often. A few years ago, the “number of followers” was about the only metric people would get from their Twitter accounts. Today, we want to (and can) know much more: What are people talking about? How do we escalate their complaints? What about the topics trending across data sources and platforms? Those are just some examples of the questions we’re asking of the NLP (Natural Language Processing) applications at our disposal.

Besides the more obvious social media use cases, there are many areas where text analytics can play an extremely valuable role: customer support (think of all the ticket descriptions and comments), surveys (most have open-ended questions, and their answers often contain the most valuable insights), e-mail marketing (whether analyzing outbound campaigns to better understand what works and what doesn’t, or compiling inbound e-mails) and lead-gen (what do people mention when reaching out to you), to name a few. From time to time we even come across more obscure requests, like extracting critical information (for example, contract expiration dates) from text descriptions of past deals, or comparing bodies of text to determine “likeness” (for things like product or job descriptions).

Keboola and Slalom Consulting Team up to host Seattle’s Tableau User Group

On Wednesday, May 18th, Keboola’s Portland and BC teams converged in Seattle to co-host the city’s monthly Tableau User Group. We worked with SeaTUG’s regular hosts and organizers, Slalom Consulting, to put together a full evening of discussion around how to solve complex Tableau data problems using KBC. With 70+ people in attendance, Seattle’s Alexis Hotel was buzzing with excitement!

The night began with Slalom’s very own Anthony Gould, consultant, data nerd and SeaTUG host extraordinaire, welcoming the group and getting everyone riled up for the night’s contest: awarding the attendee whose SeaTUG-related tweet got the most retweets! He showed everyone how we used Keboola Connection (KBC) to track that data, and let everyone know the tally would be updated at the end of the night, when prizes would be distributed!

Cleaning Dirty Address Data in KBC

There is an increasing number of use cases and data projects for which geolocation data can add a ton of value: e-commerce and retail, supply chain, sales and marketing, etc. Unfortunately, one of the most challenging asks of any data project is relating geographical information to various components of the dataset. On a more positive note, however, KBC’s easy integration with Google apps of all kinds allows users to leverage Google Maps to add geocoding functionality. Since we have so many clients taking advantage of geocoding capabilities, one of our consultants, Pavel Boiko, outlined the process of adding this feature to your KBC environment. Check it out!
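As a minimal sketch of where this lands on the data side, once the geocoder has run, the coordinates can be joined back to the original records; the table names (records, geocoded_addresses) are hypothetical:

    -- Hypothetical sketch: attach geocoded coordinates back to the
    -- source records after the Google Maps geocoding step.
    SELECT
        r."record_id",
        r."raw_address",
        g."latitude",
        g."longitude"
    FROM "records" r
    LEFT JOIN "geocoded_addresses" g
        ON g."raw_address" = r."raw_address";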

Anatomy of an Award Winning Data Project Part 3: Ideal Problems, Not Ideal Customers

Hopefully you’ve had a chance to read about our excitement and pride upon learning that two of our customers had won big awards for the work we’d done together. To jog your memory, Computer Sciences Corporation (CSC)’s marketing team won the ITSMA Diamond Marketing Excellence Award as a result of the data project we built together. CSC used KBC to bridge together 50+ data sources and push those insights out to thousands of CSC employees. To catch up on what you missed or to read it again, revisit Part 1 of our Anatomy of an Award Winning Data Project.

Additionally, the BI team at Firehouse Subs won Hospitality Technology’s Enterprise Innovator Award for its Station Pulse dashboard built on a KBC foundation. The dashboard measures each franchise’s performance based on 10 distinct metrics, pulling data from at least six sources. To catch up on what you missed or to read it again, revisit Part 2 of our Anatomy of an Award Winning Data Project.

We’re taught that most businesses have a “typical” or “ideal” customer. When crafting a marketing strategy or explaining your business to partners, customers and your community, this concept comes up repeatedly. For us, though, there isn’t really a ready-made answer: a data-driven business can be in any industry, and the flexibility and agility of the Keboola platform make it, by its very nature, data source and use case agnostic.

And so, when these two customers of ours both won prestigious awards highlighting their commitment to data innovation, it got us thinking. These two use cases are pretty different. We worked with completely different departments, different data sources, different end-users, different KPIs, etc. And yet both have been successful, award-winning projects.

We realized that perhaps the question of an ideal customer isn’t really relevant for us. Perhaps we’d been asking the wrong question all along. We can’t define our target customer, but we can define the target problem that our customers need help solving.

Anatomy of an Award Winning Data Project Part 2: Firehouse Subs Station Pulse BI Dashboard


As we reported last week, we are still beaming with pride, like proud parents at a little league game or a dance recital. Not one, but two, of our customers won big fancy awards for the work we did together! The concept of a data-driven organization has been discussed and proposed as an ideal for a while now, but how we define and identify those organizations is certainly still up for debate. We’re pretty confident that the two customers in question - Computer Sciences Corporation (CSC) and Firehouse Subs - would be prime contenders. These awards highlight their commitment to go further than their industry counterparts to empower employees and franchisees to leverage data in new and exciting ways.

If you missed last week’s post with CSC’s Chris Marin, check it out here. Today, let’s learn more about Firehouse Subs’ award-winning project. In case you don’t know much about Firehouse Subs, let me bring you up to speed. The sandwich chain started in 1994 and, as of March 2016, has more than 960 locations in 44 states, Puerto Rico and Canada. Firehouse Subs is no stranger to winning awards, either. In 2006, KPMG named them “Company of the Year,” and they’ve been recognized for their commitment to community service and public safety through the Firehouse Subs Public Safety Foundation®, created in 2005.


Now let’s hear from our project champion and our main ally at Firehouse Subs, Director of Reporting and Analytics, Danny Walsh.