When Salesforce Met Keboola: Why Is This So Great?


How can I get more out of my Salesforce data?

Along with being the world’s #1 CRM, Salesforce provides an end-to-end platform to connect with your customers: Marketing Cloud to personalize experiences across email, mobile, social, and the web; Service Cloud to support customer success; Community Cloud to connect customers, partners and employees; and Wave Analytics, designed to unlock the data within.

After going through many Salesforce implementations, I’ve found that although companies store their primary customer data there, there is a big opportunity to enrich it further by bringing in related data stored in other systems, such as invoices in an ERP or contracts in a dedicated DMS.  For example, I’ve seen clients run into inconsistent data across multiple source systems when a customer changes their billing address.  In a nutshell, Salesforce makes it easy to report on the data stored within it, but it can’t provide a complete picture of the customer unless we broaden our view.

Another challenge I’ve noticed is that we can only report on the data residing there, which means time-over-time analysis or comparing changes between snapshots is a problem.  Salesforce has a huge API and can connect to any other system’s API; however, that can also mean a lot of development and lengthy, expensive customizations.  At the same time, Salesforce favors declarative development, where we typically don’t need to write any code, and building these connections runs contrary to that idea.

Enter Keboola Connection, which allows us to blend Salesforce data with other sources, clean it, and run apps on top of it.  In minutes we can set up things like churn prediction, logistic flows, segmentation and much more.  Supporting the same close-to-no-development idea, the focus is on connecting the right data sources, creating transformations or using the data science apps, and then automating these data flows.  This enables us to do cross-object reporting and cross-data-source analysis in our favorite data visualization tools, or even blend data together and feed it back into Salesforce to further enrich our customer data.

Here Are a Couple of Examples:

NGO With Hundreds of Thousands of Donations

Consider an NGO that receives hundreds of thousands of donations, which it tracks through campaigns in Salesforce.  The obvious challenge?  How can it get better insight into these campaigns to increase the size and number of donations?  Although within Salesforce we can use different reports and groupings of data to examine different points of view, this requires a lot of manual effort as well as some imagination.

The Keboola difference?  Just use the Salesforce Extractor to get data out of Salesforce (click the button, authorize your SFDC credentials and select the appropriate fields), run segment analysis, which automatically analyzes the data and groups donors together based on similarities, and then upload the results (the segment each contact belongs to) back into the system with the Salesforce Writer.  That equals days of saved time and a much more precise grouping of contacts, which directly addresses the donation insights they’re looking for.

Other benefits include predicting when the next payment will come based on existing data.  Just imagine: an NGO without any guaranteed income would be able to predict its cash flow.  That would be awesome!

New CRM Without Additional Information

The second example is a customer who just implemented the Salesforce CRM basics - companies, contacts, addresses, number of employees and so on.  This isn’t a bad start, but it’s very basic, with no additional information.  It means that salespeople have to use the system to obtain and enter information about customers, but they cannot use it for segmentation based on billings or existing contracts.  Connecting the CRM with their existing ERP is out of the scope of the project; it would be too expensive and take too long to deliver.  And because the existing ERP doesn’t support any web access, opening a page that would show billing data for a given customer ID is out of the question as well.  That said, we need salespeople to be able to take advantage of data from both systems to evaluate whom to call first.

In Keboola this problem can be solved in minutes: use the connector to their existing ERP, use an identification field to match billing and contract records to customer records, and then create or update those records in Salesforce.  Piece of cake!  This saved weeks of development and several hours every day for each salesperson.
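As an illustration of that matching step, the core logic amounts to a simple lookup on the shared identifier. This is only a sketch; the field names (ERP_ID__c, Amount__c) and sample records are hypothetical, not the actual project’s schema:

```python
# Salesforce accounts carrying a custom external-ID field (names are made up)
sf_accounts = [{"Id": "001A", "ERP_ID__c": "C-100"},
               {"Id": "001B", "ERP_ID__c": "C-200"}]

# Billing rows pulled from the ERP, keyed by the same customer identifier
erp_billings = [{"customer_id": "C-100", "amount": 1200},
                {"customer_id": "C-200", "amount": 450}]

# Index accounts by the shared identification field, then shape ERP rows
# into records an upsert (e.g. via the Salesforce Writer) could consume
by_erp_id = {a["ERP_ID__c"]: a["Id"] for a in sf_accounts}
upserts = [{"AccountId": by_erp_id[b["customer_id"]], "Amount__c": b["amount"]}
           for b in erp_billings if b["customer_id"] in by_erp_id]
print(upserts)
```

Rows whose identifier has no match in Salesforce are simply skipped here; a real flow would route them to an error table for review.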



It is easy to think of many other examples where Keboola Connection can play an important role, not just as a middleman between Salesforce and other systems, but also in helping identify important information across all the records we have.  What use case can you see in your environment?

About the Author

Martin Humpolec is a Salesforce Certified Consultant and the author of the Salesforce Writer for Keboola Connection. He blogs about Salesforce and other things on his personal blog, and you can also follow him on Twitter.

Recommendation Engine in 27 lines of (SQL) code

In the data preparation space, the focus very frequently lies on BI as the ultimate destination of data. But more and more often, we see how data enrichment can loop straight back into the primary systems and processes.

Take recommendations. Just the basic type (“customers who bought this also bought…”). That, in its simplest form, is an outcome of basket analysis.

We recently had a customer who asked for a basic recommendation as part of a proof of concept to determine whether Keboola Connection is the right fit for them. The dataset we got to work with came from a CRM system, and contained a few thousand anonymized rows (4,600-ish, actually) of won opportunities which effectively represented product-customer relations (which customers had purchased which products). So, pretty much ready-to-go data for the Basket Analysis app, which has been waiting for just this opportunity in the Keboola App Store. It sounded like a good challenge: how can we turn this into a basic recommendation engine?

Basket analysis, simply put, looks at groups of items (baskets) and assigns a few important values to any combination of items found in the primary dataset. The two most descriptive values are called "support" - the frequency with which a given combination of items appears in the dataset or its segment - and "lift" - how much more likely the items in the combination are to appear together than they would be by chance. If you want to go deeper, here's a good resource (it's a bit heavy reading, you've been warned). Now, this "lift" value is interesting - we can loosely translate the likelihood of the items appearing together as the likelihood that the customer may be interested in such a combination.
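To make those two measures concrete, here is a toy illustration (hypothetical baskets, not the customer’s data, and not the Basket Analysis app itself):

```python
# Five toy baskets (hypothetical sample data)
baskets = [
    {"shampoo", "conditioner"},
    {"shampoo", "conditioner", "soap"},
    {"shampoo", "soap"},
    {"conditioner"},
    {"shampoo", "conditioner"},
]

def support(items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= b for b in baskets) / len(baskets)

def lift(lhs, rhs):
    """Observed co-occurrence of lhs+rhs relative to what independence predicts."""
    return support(set(lhs) | set(rhs)) / (support(lhs) * support(rhs))

print(support({"shampoo", "conditioner"}))           # 0.6 (3 of 5 baskets)
print(round(lift({"shampoo"}, {"conditioner"}), 4))  # 0.9375
```

A lift above 1 means the items show up together more often than chance would predict; here it is just below 1, so this toy pair is actually a weak rule.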

So, for a simple recommendation, we take what is in the “basket” right now, look at the additional items with the highest “lift” value for that combination, and display, or “recommend”, them.  That’s how you get conditioner with a shampoo or fries with a hot dog.  While well understood, the algorithms are neither simple nor “light”; they take a decent amount of computing power, especially when you get beyond a few thousand transactions (or “baskets”).

To make things simpler, I built a quick transformation to keep only the won opportunities (the original set, being a flattened table from SFDC, also contained tons of opportunities in other stages, obviously irrelevant to the task at hand) and just the needed columns - an ID, the customer identifier and the product they bought:

And the resulting table:

That then got fed into KBC’s Basket Analysis app.  The settings are simple: just point to the app and assign the columns that contain what the app needs:

The app outputs several tables (you will learn about them from the app description, of course), but the interesting one was ARL__1 (I need to talk to Marc, who built this app, about maybe refreshing the table names a bit) - this one contains the discovered “rules”. The key columns are “LHS”, “RHS” and “lift”. In layman’s terms (mine), this means: for a customer with a “basket” containing the items listed in “LHS” (Left Hand Side, in case you were wondering), the “lift” value gives the likelihood of the “RHS” (yes, you guessed it) items being present as well.

The “LHS” and “RHS” columns are actually arrays, as logically there may be more products involved. Quite fortunately, the content is in alphabetical order (that will be important later on). The data looks like this:

Now, in my simple example, I really care about recommending only one item. So, I care only about the rows from the ARL__1 table that:

a) have only one item in RHS and

b) among those, have the highest “lift” (this now becomes a temporary “recommendation” table, defining for each basket the next best product to recommend).

A few lines of SQL take care of that; those are the first three queries in this transformation:
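Those first three queries boil down to something like the following (again SQLite standing in for Redshift, with illustrative names and data; note that picking the matching “rhs” alongside MAX() leans on a SQLite convenience - on Redshift you would rank rules per basket with a window function such as row_number()):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE arl (lhs TEXT, rhs TEXT, lift REAL);
INSERT INTO arl VALUES
  ('conditioner', 'shampoo',      3.1),
  ('conditioner', 'shampoo,soap', 2.4),
  ('fries',       'hot dog',      2.0),
  ('fries',       'ketchup',      2.7);
""")

rows = con.execute("""
    SELECT lhs, rhs, MAX(lift) AS lift   -- b) strongest remaining rule per basket
    FROM arl
    WHERE rhs NOT LIKE '%,%'             -- a) rules recommending exactly one item
    GROUP BY lhs
    ORDER BY lhs
""").fetchall()
print(rows)  # [('conditioner', 'shampoo', 3.1), ('fries', 'ketchup', 2.7)]
```

The multi-item rule ('shampoo,soap') is dropped by the LIKE filter, and for each remaining basket only the highest-lift recommendation survives.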

The rest of the code deals with the few very simple tasks left:

Query 4 takes the original input table and “collapses” it into baskets in the same format as the “lhs” field from the basket analysis - think GROUP_CONCAT in MySQL; here it’s Redshift, so the listagg() aggregation comes to our aid. It also has the nifty ability to force ordering within the group, which allows us to match the order of items with the alphabetical output of the Basket Analysis app (told you it would be important - and thanks to our friends at Periscope Data, on whose blog I came across this trick). And finally:

Query 5 uses this new field to join in the temporary table, which gets us the recommended new product for each customer.
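Put together, queries 4 and 5 do roughly this - sketched here in plain Python as a stand-in for the Redshift listagg() and join, with made-up sample data:

```python
from collections import defaultdict

# customer -> purchased products (hypothetical sample data)
purchases = [
    ("acme", "conditioner"), ("acme", "shampoo"),
    ("zenith", "fries"),
]

# rules from the basket analysis: ordered basket string -> recommended product
rules = {"conditioner,shampoo": "soap", "fries": "ketchup"}

# Query 4 equivalent: collapse each customer's rows into one basket string,
# sorted to match the alphabetical "lhs" output - on Redshift this is
# listagg(product, ',') within group (order by product).
baskets = defaultdict(list)
for customer, product in purchases:
    baskets[customer].append(product)
collapsed = {c: ",".join(sorted(p)) for c, p in baskets.items()}

# Query 5 equivalent: join the collapsed basket to the rules; customers whose
# basket matches no rule get no recommendation (NULL in the real output).
recommendations = {c: rules.get(b) for c, b in collapsed.items()}
print(recommendations)  # {'acme': 'soap', 'zenith': 'ketchup'}
```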

And we’re “done”. Here's the output data:

Why the quotation marks? This exercise represents just a very simple approach. Its success depends on nearly ideal initial conditions and an overall lack of edge cases. Depending on the input data, there will be a number of baskets that just don’t produce a rule with enough significance, and therefore no recommendation will be served (note the null values in the table above). While this can be addressed by lowering the support threshold of the Basket Analysis app (in my example I used a 1% cutoff, which in the end yielded a 59% success rate, or the ratio of customers for whom we were able to provide a solid recommendation), whether or not that solution makes sense depends heavily on the circumstances. Judgment needs to be applied: how many transactions do we have? How many products? Etc.

In some situations, we could take the customers without recommendations and run them through a secondary process: take a part of their basket that has a high “lift” to a product that is not yet in the basket - obviously, the SQL gets a bit more complicated. We can deal with most edge cases in a similar manner. At some point, however, moving to a dedicated recommendation app, such as the one from Recombee, would be a much better use of resources. This is no Netflix recommendation system :).

This was tons of fun to build, and totally good enough for the proof of concept project. We’ll write the recommended product back into the CRM system as a custom field, and the sales people will know exactly what to bring up next time! Or, perhaps, we’ll use some of the integrations with mailing systems to send these customers just the right piece of content.

If interested, get in touch to learn more!

Thanks for reading,


Find the Right Music: Analyzing last.fm data sentiment with Keboola + Tableau


As we covered in our recent NLP blog, there are a lot of cool use cases for text / sentiment analysis.  One recent instance we found really interesting came out of our May presentation at SeaTUG (Seattle Tableau User Group).  As part of our presentation / demo, we decided to find out what some of the local Tableau users could do with trial access to Keboola; below we’ll highlight what Hong Zhu and a group of students from the University of Washington were able to accomplish with Keboola + Tableau for a class final project!

What class was this for and why did you want to do this for a final project?

We are a group of students at the University of Washington’s department of Human Centered Design and Engineering.  For our class project in HCDE 511 – Information Visualization, we made an interactive tool to visualize music data from Last FM.  We chose the topic of music because all four of us are music lovers.

Initially, the project was driven by our interest in having an international perspective on the popularity vs. obscurity of artists and tracks.  However, after interviewing a number of target users, we learned that most of them were not interested in rankings in other countries.  In fact, most of them were not interested in the ranking of artists/tracks at all.  Instead, our target users were interested in having more individualized information and robust search functions, in order to quickly find the right music that is tailored to one’s taste, mood, and occasion.  Therefore, we re-focused our efforts on parsing out the implicit attributes, such as genre and sentiment, from the 50 most-used tags of each track.  That was when Keboola and its NLP plug-in came into play and became instrumental in the success of this project.

What specific data set(s) did you analyze and how did you collect it?

We extracted a large amount of data from Last FM’s API.  At a high level, there are two main data sets: the top 100 artists in each country and the top 100 tracks in each country.  For each of the two data sets, we extracted the ranking, country name, total play count of all time, URL linking to the artist/track, and the top 50 tags for each artist/track.  The total number of data points was over 2 million.  We eventually narrowed them down to 6 dimensions in our final visualization: Track Name, Artist, Play Count, URL, Top Tag, and Overall Sentiment Score.  We also decided to only visualize the top tracks data set due to the time constraints of the school quarter.

Latest cleaned data set with sentiment scores:



So you’ve got the data, now what?

Moving over to our trial access to the Keboola platform, it took just a few simple clicks and an authorization to access the data through Dropbox and bring in the spreadsheet.


Once you have the data you want to analyze, it’s a matter of clicking into the Keboola app store, selecting the app you want to run (NLP highlighted below) and choosing the table you want to analyze.  


Now what?

Because the outputs from Geneea were 50 separate tables (one for each tag), we needed to combine them into one table and calculate an overall sentiment score for each track.

Due to our limited experience in Python and lack of SQL knowledge, we were only able to join two tables at a time using a Transformation. In the end, we downloaded all the tables and joined them locally.  (**Typically this would be done with a SQL, Python or R transformation within the platform.)
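For reference, the in-platform Python version of that combine-and-score step might look roughly like this. The per-tag tables and sentiment values below are hypothetical stand-ins for the real Geneea output:

```python
from collections import defaultdict

# One dict per tag table: track -> sentiment score for that tag
# (three toy tables here; the project had 50, one per tag)
tag_tables = [
    {"Track A": 0.8, "Track B": -0.2},
    {"Track A": 0.4},
    {"Track B": 0.6, "Track C": 0.1},
]

# Union the per-tag tables and average into one overall score per track
totals, counts = defaultdict(float), defaultdict(int)
for table in tag_tables:
    for track, score in table.items():
        totals[track] += score
        counts[track] += 1

overall = {t: round(totals[t] / counts[t], 2) for t in totals}
print(overall)  # {'Track A': 0.6, 'Track B': 0.2, 'Track C': 0.1}
```

Averaging is just one possible aggregation; a weighted scheme (e.g. by tag usage count) would be a natural refinement.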

Ready to visualize

Once the data is ready for analysis, it’s simple to download it as a .TDE file from Keboola or send it directly to Dropbox or Google Drive for consumption in Tableau Desktop.  You can also create a live data connection directly to Tableau Server.


In this case, Tableau Public was the right choice for data visualization.  I’ve provided a screenshot below, or you can check out the live viz here.



We’re glad we could lend a hand to the UW students (and thanks for letting us be part of a cool data project!)  As mentioned at the outset of the blog, please check out our previous post, The value of text (data) and Geneea NLP app, if you’d like to learn a bit more about the app, or feel free to reach out.



If you want to learn more about the Tableau + Keboola integration, check out our brief YouTube video!

The value of text (data) and Geneea NLP app

Just last week, a client let out a sigh: “We have all this text data (mostly customer reviews) and we know there is tremendous value in that set but outside from reading it all and manually sorting through it, what can we do with it?”

With text becoming a bigger and bigger chunk of a company’s data intake, we hear those questions more and more often. A few years ago, the “number of followers” was about the only metric people would get from their Twitter accounts. Today, we want to (and can) know much more: What are people talking about? How do we escalate their complaints? What about the topics trending across data sources and platforms? Those are just some examples of the questions we’re asking of the NLP (Natural Language Processing) applications at our disposal.

Besides the more obvious social media stuff, there are many areas where text analytics can play an extremely valuable role. Areas like customer support (think of all the ticket descriptions and comments), surveys (most have open-ended questions and their answers often contain the most valuable insights), e-mail marketing (whether it is analyzing outbound campaigns and using text analytics to better understand what works and what doesn’t, or compiling inbound e-mails) and lead-gen (what do people mention when reaching out to you) to name a few. From time to time we even come across more obscure requests like text descriptions of deals made in the past that need critical information extracted (for example contract expiration dates) or comparisons of bodies of text to determine “likeness” (when comparing things like product or job descriptions).

The “common” way to deploy a text analytics service today is an API integration. There are quite a few services out there (Alchemy API and Rosette spring to mind) that allow anyone with an account to submit data to their API and receive back results. While perfectly doable, it means the customer needs developer/engineering capacity, and what company has these valued employees sitting around with nothing better to do?

Enter Geneea and their app in the Keboola Connection App Store:

While Geneea also offers an API service to exchange data with their Interpretor platform, crowned with the Frida dashboard, they recognized the potential of the Keboola Connection platform early on. It is Keboola’s job to remove the complexity (and the need for the aforementioned developer time) from accomplishing data tasks. This is why having a strong NLP partner has been one of our key priorities. Check out what they can do with your text here.

Today, we have multiple customers utilizing the app to process text data for the use cases described above. Once your data is managed by Keboola Connection, setting up text data enrichment with the Geneea app takes less time (about 20% of it, actually) than you just spent reading this blog!

Ready to attack your own text analytics opportunity?

Check out what Geneea wrote about their app on their blog.

Keboola and Slalom Consulting Team up to host Seattle’s Tableau User Group

On Wednesday, May 18th, Keboola’s Portland and BC teams converged in Seattle to host the city’s monthly Tableau User Group with Slalom Consulting. We worked with SeaTUG’s regular hosts and organizers at Slalom to put together a full evening of discussion around how to solve complex Tableau data problems using KBC. With 70+ people in attendance, Seattle’s Alexis Hotel was buzzing with excitement!

The night began with Slalom’s very own Anthony Gould, consultant, data nerd and SeaTUG host extraordinaire, welcoming the group and getting everyone riled up for the night’s contest: awarding the attendee whose SeaTUG-related tweet got the most retweets! He showed everyone how we used Keboola Connection (KBC) to track that data and let them know the tally would be updated at the end of the night and prizes distributed!

Anthony passed the mic to our very own Milan Veverka, who got to the heart of the evening’s presentation, explaining how users and attendees can use KBC to solve the complex data problems that would be presented throughout the evening. Over the rest of the evening, Milan presented topics such as “When SQL isn’t enough” (and you want R or Python to get the results you need), data cleanliness, and Twitter text enrichment. He shared the stage with Slalom consultant Frank Blau, who presented on a variety of Internet of Things (IoT) topics, including weather data enrichment and working with magnetometer and EKG data.

Throughout the presentation and during the following breakout sessions, the audience was engaged, excited, asking lots of questions and doing a lot of laughing for such a technical presentation! Over the coming weeks, we’ll be releasing some video from the night and sharing more takeaways and results! We loved the experience and look forward to hosting more TUGs around North America!

Cleaning Dirty Address Data in KBC

There is an increasing number of use cases and data projects for which geolocation data can add a ton of value - e-commerce and retail, supply chain, sales and marketing, etc.  Unfortunately, one of the most challenging asks of any data project is relating geographical information to the various components of the dataset. On a more positive note, however, KBC’s easy integration with Google apps of all kinds allows users to leverage Google Maps to add geocoding functionality. Since we have so many clients taking advantage of geocoding capabilities, one of our consultants, Pavel Boiko, outlined the process of adding this feature to your KBC environment. Check it out!

Anatomy of an Award Winning Data Project Part 3: Ideal Problems not Ideal Customers

Hopefully you’ve had a chance to read about our excitement and pride upon learning that two of our customers had won big awards for the work we’d done together. To jog your memory, Computer Sciences Corporation (CSC)’s marketing team won the ITSMA Diamond Marketing Excellence Award as a result of the data project we built together. CSC used KBC to bridge together 50+ data sources and push those insights out to thousands of CSC employees. To catch up on what you missed or to read it again, revisit Part 1 of our Anatomy of an Award Winning Data Project.

Additionally, the BI team at Firehouse Subs won Hospitality Technology’s Enterprise Innovator Award for its Station Pulse dashboard, built on a KBC foundation.  The dashboard measures each franchise’s performance based on 10 distinct metrics, pulling data from at least six sources.  To catch up on what you missed or to read it again, revisit Part 2 of our Anatomy of an Award Winning Data Project.

We’re taught that most businesses have a “typical” or “ideal” customer. When crafting a marketing strategy or explaining your business to partners, customers and your community, this concept comes up repeatedly. And we don’t really have a ready-made answer: a data-driven business can be in any industry, and the flexibility and agility of the Keboola platform make it, by its very nature, data source and use case agnostic.

And so, when these two customers of ours both won prestigious awards highlighting their commitment to data innovation, it got us thinking. These two use cases are pretty different. We worked with completely different departments, different data sources, different end-users, different KPIs, etc. And yet both have been successful, award-winning projects.

We realized that perhaps the question of an ideal customer isn’t really relevant for us. Perhaps we’d been asking the wrong question all along. We can’t define our target customer, but we can define the target problem that our customers need help solving.

Read more »

Anatomy of an Award Winning Data Project Part 2: Firehouse Subs Station Pulse BI Dashboard

As we reported last week, we are still beaming with pride, like proud parents at a little league game or a dance recital. Not one, but two, of our customers won big fancy awards for the work we did together. The concept of a data-driven organization has been discussed and proposed as an ideal for a while now, but how we define and identify those organizations is certainly still up for debate. We’re pretty confident that the two customers in question - Computer Sciences Corporation (CSC) and Firehouse Subs - would be prime contenders. These awards highlight their commitment to go further than their industry counterparts to empower employees and franchisees to leverage data in new and exciting ways.

If you missed last week’s post with CSC’s Chris Marin, check it out here.  Today, let’s learn more about Firehouse Subs’ award-winning project.  In case you don’t know much about Firehouse Subs, let me bring you up to speed.  The sandwich chain started in 1994 and as of March 2016 has more than 960 locations in 44 states, Puerto Rico and Canada.  Firehouse Subs is no stranger to winning awards, either.  In 2006, KPMG named them “Company of the Year,” and they’ve been recognized for their commitment to community service and public safety through the Firehouse Subs Public Safety Foundation®, created in 2005.

Now let’s hear from our project champion and our main ally at Firehouse Subs, Director of Reporting and Analytics, Danny Walsh.

Here are the stats:

Firehouse Subs “Station Pulse” Business Intelligence Dashboard
Purpose: Help Firehouse Subs’ franchisees to improve performance by measuring key metrics of their operations against regional and network-wide benchmarks
Delivery method: Dashboard embedded into the Franconnect platform used by the franchisees
Data sources: Daily sales, voice of customer platforms, social media, weather data
Project Lead (champion): Danny Walsh, Director of Reporting and Analytics, Firehouse Subs
Executive Sponsor: Vince Burchianti, CFO, Firehouse Subs
Solution Architect: Marcus Wong, Keboola
Award Details: http://www.marketwired.com/press-release/hospitality-technology-announces-2015-restaurant-breakthrough-award-winners-2062616.htm

Keboola: What was the overall objective for the project?

Danny: We had access to many different data sources, but a multitude of logins made it challenging to create data correlations and paint the overall picture of a restaurant’s performance.

We needed a consolidated method to review data from multiple sources. Our Reporting & Analytics team created a scorecard based on 10 key metrics that measures sales and operational performance for each restaurant. This gives us an overall view of restaurant performance and clearly highlights areas of improvement.

How did you expect users would benefit and why would they “bother”?
Users benefit by seeing the overall picture of restaurant performance. Operational and sales metrics are pulled into one dashboard for user-friendly access.

Did the process lead you to make any changes to the vision along the way? How do you see those in retrospect?
It’s very easy to get wrapped up into a lot of data that can ultimately lead to clutter and noise on the dashboard.  Stick with the most important data that can lead to actionable insights.

What do you see as next steps, what are the learnings so far?
We always are looking to improve upon our dashboard and display the most important and relevant data to our franchisees in a user-friendly environment.

How did Keboola help you realize your vision and meet the project’s objectives?        
Keboola was involved in the development and implementation of the dashboard from the very early stages.  They were conferenced into our internal brainstorming sessions with key leadership team members and gave feedback and suggestions as we mapped out the design of the dashboard.  Also, they were instrumental with the roll-out process when the project was complete by conducting webinar training sessions for staff.

What did your Keboola Architect provide to the process?
Our architect, Marcus Wong, wasn’t afraid to voice his opinion and challenge us when we were developing a mock-up of the dashboard.  It was great to bounce ideas off of one another to create what became a Break-Through Award winner in the category of Enterprise Innovator for Hospitality Technology (HT) Magazine.

Following implementation, what has been the rate of adoption and support for ongoing usage and development?
The dashboard is used from the general manager level all the way to the C-Suite at Firehouse Subs Corporate Headquarters.

If you read last week's post about CSC and their winning data project, you'll see some similarities here. It's clear that the right people make all the difference in ensuring the success of a business intelligence project. Next week we'll dive deeper into the ideal conditions for creating a powerful data analytics experience. Stay tuned!

Anatomy of an Award Winning Data Project Part 1: CSC and Marketing Analytics

Here at Keboola, we take pride in working closely with partners and customers ensuring that each project is a success. Typically we’re there from the beginning - to understand the problem the client needs to solve; to help them define the scope and timeline of the implementation; to provide the necessary resources to get buy in from the rest of their team; to offer alternative perspectives and options when mapping out the project; and to be their ally and guide throughout every step of the process. With all that work, all that dedication, it turns out we develop quite a soft spot for both our clients and their projects. 

We’ve got skin in the game, so when one of our clients receives an award because of the project we worked on together, we get pretty excited. And when two clients receive an award because of our work together, well, then we’re downright ecstatic and ready to celebrate!

At the end of 2015, two customers were honored for their commitment to data innovation: Firehouse Subs® was awarded the Hospitality Technology Innovation Award, and the digital marketing team at Computer Sciences Corporation (CSC) received the ITSMA Diamond Marketing Excellence Award.

Since new partners and clients often ask us to explain what components and environment cultivate a successful data project, we thought we’d take this exceptional opportunity to ask our customers themselves: Danny Walsh, Director of Reporting and Analytics, Firehouse Subs and Chris Marin, Senior Principal, Digital Marketing Platform & Analytics, CSC.

Over the next couple of weeks, we’ll share each of their stories and explain how we feel these separate use cases in two distinctly different industries are reflective of what we at Keboola view as the ideal conditions for creating a wildly successful - award-winning even - data project.

Read more »

Empowering the Business User in your BI and Analytics Environment

There’s one trend on Gartner’s radar that hasn’t changed much over the last few years and that’s the increasing move toward a self-service BI model. Gone are the days of your IT or analytics department being report factories. And if those days aren’t gone for you, then it’s time you make some substantive changes to your business intelligence environment. When end-users are forced to rely on another department to deliver the reports they need, the entire concept of being a “data-driven” organization goes right out the window. 

So other than giving your users access to ad hoc reporting capabilities, how do you empower the user?

Read more »