Please hold, your call is important to us

We’ve recently experienced two fairly large system problems that have affected approximately 35% of our clients.

The first issue took 50 minutes to resolve and the other approximately 10 hours. The root cause in both cases was the way we handled the provisioning of ad hoc sandboxes on top of our SnowflakeDB (a few words about "how we started w/ them").

We managed to find a workaround for the first problem, but the second one was out of our hands. All we could do was file a support ticket with Snowflake and wait. Our communication channels were flooded with questions from our clients and there was nothing we could do. Pretty close to what you would call a worst-case scenario. Fire! Panic in Keboola!

My first thoughts were like: “Sh..t! What if we ran the whole system on our own infrastructure? Then we could do something now. We could try to solve the issue and not have to just wait…”

But, we were forced to just wait and rely on Snowflake. This is the account of what happened since:

New dose of steroids in the Keboola backend

More than two years after we announced support for Amazon Redshift in Keboola Connection, it’s about friggin’ time to bring something new to the table. Something that will propel us further along. Voila, welcome Snowflake.

About 10 months ago we first presented Snowflake at a meetup hosted at the GoodData office.

Today, we use Snowflake both behind the Storage API (it is now the standard backend for our data storage) and the Transformations Engine (you can utilize the power of Snowflake for your ETL-type processes). Snowflake’s SQL documentation can be found here.

What on Earth is Snowflake?

It’s a new database, built from scratch to run in the cloud. Something different than when a legacy vendor takes an old DB and hosts it for you (MSSQL on Azure, Oracle in Rackspace or PostgreSQL in AWS).

Guiding project requirements for analytics

In a recent post, we started scoping our executive level dashboards and reporting project by mapping out who the primary consumers of the data will be, what their top priorities / challenges are, which data we need and what we are trying to measure. It might seem like we are ready to start evaluating vendors and building out the project, but we still have a few more requirements to gather.

What data can we exclude?

With our initial focus around sales analytics, the secondary data we would want to include (NetProspex, Marketo and ToutApp) all integrate fairly seamlessly with Salesforce, so it won't require as much effort on the data prep side. If we pivot over to our marketing function, however, things get a bit murkier. On the low end this could mean a dozen or so data sources. But what about our social channels, Google Ads, etc., as well as various spreadsheets? In more and more instances, particularly for a team managing multiple brands or channels, the number of potential data sources can easily shoot into the dozens.

Although knowing what data we should include is important, what data can we exclude? Unlike the data lake philosophy (Forbes: Why Data Lakes Are Evil), when we are creating operational level reporting, it's important to focus on creating value, not on overcomplicating our project with additional data sources that don't actually yield additional value.

Who's going to manage it?

Just as critical to the project as what and how: who’s going to be managing it? What skills do we have at our disposal and how many hours can we allocate for the initial setup as well as ongoing maintenance and change requests? Will this project be managed by IT, our marketing analytics team, or both? Perhaps IT will manage data warehousing and data integration and the analyst will focus on capturing end user requirements and creating the dashboards and reports. Depending on who's involved, the functionality of the tools and the languages used will vary. As mentioned in a recent CMS Wire post, Buy and Build Your Way to a Modern Business Analytics Platform, it's important to take an analytical inventory of what skills we have as well as what tools and resources we already have that we may be able to take advantage of.

When Salesforce Met Keboola: Why Is This So Great?

How can I get more out of my Salesforce data?

Along with being the world’s #1 CRM, Salesforce provides an end-to-end platform to connect with your customers, including Marketing Cloud to personalize experiences across email, mobile, social, and the web, Service Cloud to support customer success, Community Cloud to connect customers, partners and employees, and Wave Analytics designed to unlock the data within.

After going through many Salesforce implementations, I’ve found that although companies store their primary customer data there, the opportunity to enrich it further by bringing in related data stored in other systems, such as invoices in ERP or contracts in a dedicated DMS, is a big one. For example, I’ve seen clients run into the issue of having inconsistent data in multiple source systems when a customer changes their billing address. In a nutshell, Salesforce makes it easy to report on the data stored within but can’t provide a complete picture of the customer unless we broaden our view.

Recommendation Engine in 27 lines of (SQL) code

In the data preparation space, very frequently the focus lies in BI as the ultimate destination of data. But we see, more and more often, how data enrichment can loop straight back into the primary systems and processes.

Take recommendation. Just the basic type (“customers who bought this also bought…”). That, in its simplest form, is an outcome of basket analysis.

We recently had a customer who asked for a basic recommendation as part of a proof of concept to test whether Keboola Connection is the right fit for them. The dataset we got to work with came from a CRM system, and contained a few thousand anonymized rows (4600-ish, actually) of won opportunities which effectively represented product-customer relations (which customers had purchased which products). So, pretty much ready-to-go data for the Basket Analysis app, which has been waiting for just this opportunity in the Keboola App Store. Sounded like a good challenge - how can we turn this into a basic recommendation engine?
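To make the “customers who bought this also bought…” idea concrete, here is a minimal sketch in Python of the co-occurrence counting that sits behind a basic basket analysis. It is not the 27 lines of SQL from the post and not the Basket Analysis app itself; the function name `also_bought` and the sample purchases are made up for illustration, and the input is assumed to be simple (customer, product) pairs like the won opportunities described above.

```python
from collections import Counter, defaultdict
from itertools import permutations


def also_bought(purchases, top_n=3):
    """Map each product to the products most often bought by the same customers.

    `purchases` is an iterable of (customer, product) pairs.
    This is a hypothetical helper for illustration only.
    """
    # Build one basket (set of products) per customer.
    baskets = defaultdict(set)
    for customer, product in purchases:
        baskets[customer].add(product)

    # For every ordered pair (a, b), count the customers who bought both.
    co_counts = defaultdict(Counter)
    for products in baskets.values():
        for a, b in permutations(sorted(products), 2):
            co_counts[a][b] += 1

    # Keep the top_n most frequently co-purchased products per product.
    return {
        a: [b for b, _ in counts.most_common(top_n)]
        for a, counts in co_counts.items()
    }


# Made-up sample data standing in for the anonymized CRM rows.
purchases = [
    ("c1", "laptop"), ("c1", "mouse"), ("c1", "dock"),
    ("c2", "laptop"), ("c2", "mouse"),
    ("c3", "laptop"), ("c3", "dock"),
    ("c4", "mouse"), ("c4", "keyboard"),
]

recommendations = also_bought(purchases)
print(recommendations["dock"])
```

A real engine would typically weigh these raw counts by how popular each product is overall (for example with support, confidence or lift), so that best-sellers do not dominate every recommendation.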

Find the Right Music: Analyzing last.fm data sentiment with Keboola + Tableau

As we covered in our recent NLP blog, there are a lot of cool use cases for text / sentiment analysis. One recent instance we found really interesting came out of our May presentation at SeaTUG (Seattle Tableau User Group). As part of our presentation / demo we decided to find out what some of the local Tableau users could do with trial access to Keboola; below we’ll highlight what Hong Zhu and a group of students from the University of Washington were able to accomplish with Keboola + Tableau for a class final project!

What class was this for and why did you want to do this for a final project?

We are a group of students at the University of Washington’s department of Human Centered Design and Engineering. For our class project for HCDE 511 – Information Visualization, we made an interactive tool to visualize music data from Last.fm. We chose the topic of music because all four of us are music lovers.

Initially, the project was driven by our interest in having an international perspective on the popularity vs. obscurity of artists and tracks.  However, after interviewing a number of target users, we learned that most of them were not interested in rankings in other countries.  In fact, most of them were not interested in the ranking of artists/tracks at all.  Instead, our target users were interested in having more individualized information and robust search functions, in order to quickly find the right music that is tailored to one’s taste, mood, and occasion.  Therefore, we re-focused our efforts on parsing out the implicit attributes, such as genre and sentiment, from the 50 most-used tags of each track.  That was when Keboola and its NLP plug-in came into play and became instrumental in the success of this project.

The value of text (data) and Geneea NLP app

Just last week, a client let out a sigh: “We have all this text data (mostly customer reviews) and we know there is tremendous value in that set, but aside from reading it all and manually sorting through it, what can we do with it?”

With text becoming a bigger and bigger chunk of a company’s data intake, we hear those questions more and more often. A few years ago, the “number of followers” was about the only metric people would get from their Twitter accounts. Today, we want to (and can) know much more: What are people talking about? How do we escalate their complaints? What about the topics trending across data sources and platforms? Those are just some examples of questions we’re asking of the NLP (Natural Language Processing) applications at our disposal.

Besides the more obvious social media stuff, there are many areas where text analytics can play an extremely valuable role. Areas like customer support (think of all the ticket descriptions and comments), surveys (most have open-ended questions and their answers often contain the most valuable insights), e-mail marketing (whether it is analyzing outbound campaigns and using text analytics to better understand what works and what doesn’t, or compiling inbound e-mails) and lead-gen (what do people mention when reaching out to you) to name a few. From time to time we even come across more obscure requests like text descriptions of deals made in the past that need critical information extracted (for example contract expiration dates) or comparisons of bodies of text to determine “likeness” (when comparing things like product or job descriptions).

Keboola and Slalom Consulting Team up to host Seattle’s Tableau User Group

On Wednesday, May 18th, Keboola’s Portland and BC team converged in Seattle to host the city’s monthly Tableau User Group with Slalom Consulting. We worked with SeaTUG’s regular hosts and organizers, Slalom Consulting, to put together a full evening of discussion around how to solve complex Tableau data problems using KBC. With 70+ people in attendance, Seattle’s Alexis Hotel was buzzing with excitement! 

The night began with Slalom’s very own Anthony Gould, consultant, data nerd and SeaTUG host extraordinaire, welcoming the group and getting everyone riled up for the night’s contest - awarding the attendee whose SeaTUG-related tweet got the most retweets! He showed everyone how we used Keboola Connection (KBC) to track that data and let them know that the results would be updated at the end of the night and prizes distributed!

Cleaning Dirty Address Data in KBC

There is an increasing number of use cases and data projects for which geolocation data can add a ton of value - e-commerce and retail, supply chain, sales and marketing, etc. Unfortunately, one of the most challenging asks of any data project is relating geographical information to various components of the dataset. On a more positive note, however, KBC’s easy integration with Google apps of all kinds allows users to leverage Google Maps to add geo-coding functionality. Since we have so many clients taking advantage of geo-coding capabilities, one of our consultants, Pavel Boiko, outlined the process of adding this feature to your KBC environment. Check it out!