"Data Monetization" is a term you might have heard a lot lately. But what does it really mean for you and your business? There is gold in your data, but how can you extract it to gain all its benefits without adding resource burdens on your business? We collected the main approaches successful companies are using to give you inspiration and insight into how you can use data you already have to improve efficiencies, create new revenue streams or increase value and hence your wallet share from your current customer base.
Use data to make better decisions
To kick off our new series about creating data products, we decided to write a white paper. This sounds simple, but this time it was a little more difficult than we expected.
It all started yesterday morning when, on my way to work, I saw multiple tweets mentioning a newly published forecasting library:
Sounds interesting, I thought. I bookmarked the link for "weekend fun with code" and moved on. The minute I stepped into the office, Amazon S3 had an outage (coincidence?) that impacted half of the internet, KBC included. OK, what could I do now?
I opened the link to the Facebook engineering page and started reading about the forecasting module. The instructions they supplied were quite simple and tempted me to test it out. Wouldn't it be great to use it in some KBC projects?
Since the code needed for forecasting is pretty simple, I mocked up a script suitable for KBC use before lunch, and when Amazon (US East) got back up, I could implement the code as a Custom Science app.
The algorithm requires two columns: a date column and a value column. The current script gets the source and result table information from the input and output mapping, plus the parameters specified by the user. Those parameters define:
Date column name
Value column name
Required prediction length (period)
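The parameter handling can be sketched in a few lines of Python. This is a minimal sketch, not the actual app: the parameter names and the `prepare_for_prophet` helper are hypothetical, but Prophet really does expect its input as two columns named `ds` and `y`.

```python
import pandas as pd

# Hypothetical KBC-style parameters (names are illustrative):
params = {"date_column": "date", "value_column": "sales", "periods": 30}

def prepare_for_prophet(df, params):
    """Rename the user-specified columns to the 'ds'/'y' pair Prophet expects."""
    out = df[[params["date_column"], params["value_column"]]].rename(
        columns={params["date_column"]: "ds", params["value_column"]: "y"}
    )
    out["ds"] = pd.to_datetime(out["ds"])
    return out

# The forecast itself needs the fbprophet package, so it is shown commented out:
# from fbprophet import Prophet
# m = Prophet()
# m.fit(prepare_for_prophet(source_table, params))
# future = m.make_future_dataframe(periods=params["periods"])
# forecast = m.predict(future)  # includes 'yhat', 'yhat_lower', 'yhat_upper'
```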
This is how it looks in Keboola:
To see the output in visual form, I used Jupyter, which was recently integrated into KBC. Not bad for a day's work, what do you say?
Just imagine how easy it would be for our users to orchestrate the forecasting process:
Extract sales data
Enrich the data with forecasted values
Publish them to the sales and marketing teams
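In plain Python, that flow is just three chained steps. The sketch below is a toy illustration: all three functions are hypothetical stand-ins for the real extractor, the Prophet-based science app, and a writer component.

```python
# Toy stand-ins for KBC components; every name here is illustrative.
def extract_sales():
    # e.g. a database or CSV extractor
    return [{"date": "2017-03-01", "sales": 120},
            {"date": "2017-03-02", "sales": 135}]

def enrich_with_forecast(rows, periods=7):
    # stand-in for the forecasting app: naively repeat the last
    # observed value as the "forecast" for each future period
    last = rows[-1]["sales"]
    return rows + [{"date": None, "sales": last, "forecast": True}
                   for _ in range(periods)]

def publish(rows):
    # e.g. a writer pushing the enriched table to the BI tool
    return len(rows)

published_rows = publish(enrich_with_forecast(extract_sales()))
```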
- The sample data I used sucks. I bet yours will be better!
- Here is the link to the Jupyter notebook.
- Feel free to check out some other custom science apps I did: https://bitbucket.org/VFisa/
Where Prophet shines (from the Facebook page)
- hourly, daily, or weekly observations with at least a few months (preferably a year) of history
- strong multiple “human-scale” seasonalities: day of week and time of year
- important holidays that occur at irregular intervals that are known in advance (e.g. the Super Bowl)
- a reasonable number of missing observations or large outliers
- historical trend changes, for instance due to product launches or logging changes
- trends that are non-linear growth curves, where a trend hits a natural limit or saturates
Martin Fiser (Fisa)
Keboola, Vancouver, Canada
How to trigger orchestration by form submission
Keboola just implemented a product assessment tool dedicated to OEM partners. The form's results show submitters how they fare across the various dimensions of data product readiness, which areas to focus on, and specific next steps to take.
We wanted to trigger the orchestration that extracts the responses (have you noticed the new Typeform extractor?), processes the data, and updates our GoodData dashboard with the answers. There was no option to use the "Magic Button" to do so, because there is no guarantee the respondent would click it at the end of the form.
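One way around this is a small webhook: the form service posts each submission to an endpoint, which then starts the orchestration through Keboola's API. The sketch below only assembles the request; the URL pattern mirrors the historical Syrup orchestrator endpoint, and both it and the token handling are assumptions to verify against the current API documentation.

```python
def build_trigger_request(orchestration_id, storage_token):
    """Assemble the HTTP call that starts a KBC orchestration job.

    The endpoint pattern is an assumption based on the Syrup
    orchestrator API; check current Keboola docs before relying on it.
    """
    url = ("https://syrup.keboola.com/orchestrator/"
           f"orchestrations/{orchestration_id}/jobs")
    headers = {"X-StorageApi-Token": storage_token}
    return url, headers

# Sending it from a webhook handler would look like:
# import requests
# url, headers = build_trigger_request(12345, token)
# requests.post(url, headers=headers)
```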
We’re always keeping an eye out for BI and analytics experts to add to our fast growing network of partners and we are thrilled to add a long-standing favorite in the Tableau ecosystem! InterWorks, who holds multiple Tableau Partner Awards, is a full spectrum IT and data consulting firm that leverages their experienced talent and powerful partners to deliver maximum value for their clients. (Original announcement from InterWorks here.) This partnership is focused on enabling consolidated end-to-end data analysis in Tableau.
Whether we’re talking Tableau BI services, data management or infrastructure, InterWorks can deliver everything from quick-strikes (to help get a project going or keep it moving) to longer-term engagements with a focus on enablement and adoption. Their team has a ton of expertise and is also just generally great to work with.
InterWorks will provide professional services to Keboola customers, with the focus on projects using Tableau alongside Keboola Connection, both in North America and in Europe, in collaboration with our respective teams. “We actually first got into Keboola by using it ourselves,” said InterWorks Principal and Data Practice Lead Brian Bickell. “After seeing how easy it was to connect to multiple sources and then integrate that data into Tableau, we knew it had immediate value for our clients.”
What does this mean for Keboola customers?
InterWorks brings world-class Tableau expertise into the Keboola ecosystem. Our clients using Tableau can have a one-stop shop for professional services, leveraging both platforms to fully utilize their respective strengths. InterWorks will also utilize Keboola Connection as the backbone of their white-glove offering: a fully managed BI stack crowned by Tableau.
Whether working on projects with customers or partners, we both believe that aligning people and philosophy is even more critical than the technology behind it. To that end, we've found in InterWorks a kindred spirit: we believe in being ourselves and having fun, while ensuring we deliver the best results for our shared clients. The notion of continuous learning and trying new things was one of the driving factors behind the partnership.
Have a project you want to discuss with InterWorks?
It’s been quite an exciting year for us here at Keboola and the biggest reason for that is our fantastic network of partners and customers -- and of course a huge thanks to our team! In the spirit of the season, we wanted to take a quick stroll down memory lane and give thanks for some of the big things we were able to be a part of and the people that helped us make them happen!
Probably the biggest news from a platform perspective this year came about two years after we first announced support for the "next" data warehouse, Amazon Redshift. At the time, it was a huge step in the right direction. We still use Redshift for some of our projects (typically due to data residency or tool choice), but this year we were thrilled to announce a partnership born in the cloud when we officially made the lightning-fast and flexible Snowflake the database of choice behind our Storage API and the primary option for our transformation engine. Not to get too far into the technical weeds (you can read the full post here), but it has helped us deliver a ton of value to our clients: better elasticity and scale, a huge performance improvement for concurrent data flows, better "raw" performance by our platform, more competitive pricing for our customers and, best of all, some great friends! Since our initial announcement, Snowflake has joined us in better supporting our European customers by offering a cloud deployment hosted in the EU (Frankfurt!). We're very excited to see how this relationship will continue to grow over the next year and beyond!
One of our favorite things to do as a team is participate in field events so we can get out in the data world, learn about the types of projects people work on and the challenges they run into, and find out what's new and exciting. It's also a great chance for our team to spend some time together as we span the globe; sometimes Slack and GoToMeeting aren't enough!
SeaTug in May
We had the privilege of teaming up with Slalom Consulting to co-host the Seattle Tableau User Group back in May. Anthony Gould was a gracious host, Frank Blau provided some great perspective on IoT data and, of course, Keboola's own Milan Veverka dazzled the crowd with his demonstration focused on NLP and text analysis. Afterwards, we had the chance to grab a few cocktails, chat with some very interesting people and make a lot of new friends. This event spawned quite a few conversations around analytics projects; one of the coolest came from a group of University of Washington students who analyzed the sentiment of popular music using Keboola + Tableau Public (check it out).
How breaking up with Snowflake.net is like breaking up with the girl you love
“Cheers, how much would it cost to have Peta in Snowflake? Eda”
Pavel’s Feedback on Their Snowflake Testing
This sentence sums it all up. And our last phone call was not a happy one. It did not work out: Avast will not migrate to Snowflake. We would love it, but it can't be done at this very moment. But I'm jumping ahead. Let's go back to the beginning.
The first time we saw Snowflake was probably just before the DataHackathon in Hradec Kralove. It surely looked like a wet dream for anyone managing BI infrastructure: completely unique features like cloning the whole DWH within minutes, linear scalability while running, "undrop table", "select … at point in time", etc.
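To make those features concrete, here are the corresponding statements as they would be issued through any Snowflake client, collected as plain strings. The database and table names are made up; the CLONE, UNDROP and AT (time travel) syntax is Snowflake's.

```python
# Illustrative Snowflake SQL kept as strings; object names are hypothetical.
snowflake_features = {
    # zero-copy clone of a whole database in minutes
    "clone": "CREATE DATABASE analytics_dev CLONE analytics;",
    # bring back a table dropped by mistake
    "undrop": "UNDROP TABLE events;",
    # time travel: query the table as it was an hour ago
    "time_travel": "SELECT * FROM events AT (OFFSET => -3600);",
}
```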
"Long story short: overpromised. 4 billion rows are too much for it. The query has been parsing that json for hours, saving the data as a copy. I've already increased the size twice. I'm wondering if it will ever finish. The goal was to measure how much space it would take when flattened; json is several times larger than avro/parquet…"
Data in Snowflake is roughly the same size as in our Hadoop, though I would suggest expecting a 10-20% difference.
Performance: we didn’t find any blockers.
Security/roles/privileges: SNFLK is much more mature than the Hadoop platform, yet it cannot be integrated with on-premise LDAP.
Stability: SNFLK is far more stable than Hadoop. We haven't encountered a single error/warning/outage so far. Working with Snowflake is nearly the opposite of hive/impala, where cryptic and misleading error messages are part of the ecosystem culture ;).
The concept of caching in SNFLK could not be fully tested, but we have confirmed that it affects performance in a pleasant, yet somewhat unpredictable, way.
Resource governance in SNFLK is a mature feature: beastly queries are queued behind the active ones while small ones sneak through, etc.
The architecture of separated 'computing nodes' can easily stop inter-team collisions. Sounds like marketing bullshit, but yes, not all teams love each other and are willing to share resources.
SNFLK can consume data from most cloud and on-premise services (Kafka, RabbitMQ, flat files, ODBC, JDBC; practically any source can be pushed there). Its DWH-as-a-service architecture is unique and compelling (Redshift/Google BigQuery/Greenplum could possibly reach this state in the near future).
Migrating 500+ TB of data? Another story, and one of the points that undermined our willingness to adopt Snowflake.
SNFLK provides limited partitioning abilities; once enabled at full scale, they could bring even more performance.
SNFLK would allow platform abuse with all of its 'create database as a copy', 'create warehouse as a copy' and 'pay more, perform more' options, and costs can grow through the roof. Hadoop is a bit harder to scale, which somehow guarantees only reasonable upgrades ;).
SNFLK can be easily integrated into any scheduler. Its command-line client is the best one I've seen in the last couple of years.
Notes from Eda
If I had to pay the people in charge of Hadoop US wages instead of Czech wages, I would get Snowflake right away. That's a no-brainer #ROI. Unfortunately, we will not go for it right now. Migrating everything is just too expensive for us at the moment, and using Snowflake only partially just doesn't make sense.
We would like to show you how some of our clients redefined their businesses by routinely using data in their daily activities. Despite the fact that each company’s situation is different, we hope to give you some ideas to explore in your own business.
If you work in a service agency, as a customer care manager or in a similar position, you are all about efficiency. Any idle time spent on non-revenue-generating activities means wasted time and manpower and, more importantly, a net loss for your organization.
To ensure optimal operation, you may be asking yourself questions like these:
Is your team correctly prioritizing clients with a higher profit margin?
How are individual team members performing compared to each other?
Are team members doing the work they are best suited for?
Sometimes the simplest graphs show the most relevant information. The graph below (generally known as a "bullet chart") has been nicknamed the "earthworm" by our clients. Provided by one of our clients, this particular graph eloquently shows agent performance overall, as well as in comparison to the team average.
As a manager, imagine having one of these for each of your agents. In mere seconds you can distinguish your top performers from your poor ones and take the actions needed to reinforce or improve their behavior.
Diving deeper into individual performance, you can then examine why each agent is performing the way they are. In the next client example, you will see a series of earthworms tracking agent performance in different areas.
Even though we understand that every company and each department within it have very different BI needs, we also believe in sharing inspiration from our clients about how they make relevant business decisions using data in their daily routines. You might find this helpful in shaping your own solution.
When planning a new product launch and deciding where to spend your marketing budget, you probably have questions regarding the impact of your campaign:
How long will it take to turn marketing leads into faithful customers?
Did I target the correct customer group?
Do my potential customers respond to the advertisement as expected?
What is the return on investment for my campaign based on different target groups and products?
Check out the similar questions our clients have asked. Combine them with an analytical mindset and create the reports your company needs to make better marketing investment decisions and generate a higher return on investment.
Roman Novacek from Gorilla Mobile says: "When looking at our marketing model, everything seemed to be going according to plan. But when we looked deeper into what we thought were well-performing campaigns, we found out that while some ads and channels were performing extraordinarily well, others were dragging down the overall average, leading to mediocre results."
McPen is a European chain distributor of stationery goods. They are one of the first small to mid-sized retailers to take a data-driven approach to business and give all of their employees equal access to data.
Embarking on their data-driven business journey, McPen realized that to excel in the stationery goods space, they would need to create a competitive advantage with a unique operational management system. In order to identify retail solutions specific to their business, they wanted to combine many previously unconnected data sources, and upgrade and speed up their reporting process.
Where Keboola came in
Assisted by the team at Ascoria, our partner, McPen's CEO Milan Petr configured the new system from scratch, without the help of a single developer. McPen began to pull data from sources like their POS, Frames and other retail systems, allowing everybody in the company to use this compiled and easily accessible data to solve their real retail problems.
Focusing on lean operations and adding new features, Milan created a system that benefitted the entire organization. He knew that to effectively manage shifts in business, he had to involve every part of the organization in making decisions based on data. Leading by example, he developed and studied the system in detail to understand its impact on daily operations. He then provided access and support directly to the people on the floor to empower them to make necessary strategic decisions and improve their daily results.
Surprising benefits and results
The examined data showed that in order to maximize profitability, McPen needed to upsell customers. And while their biggest income comes from customers who spend between 200 and 500 CZK (around 8 to 20 USD), it is the 42% of all McPen customers spending up to 50 CZK (around 2 USD) who have the biggest upsell potential.
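The segmentation behind that finding can be reproduced with a simple bucket count. This is a sketch on made-up receipt totals: only the CZK bucket edges come from the text, everything else is illustrative.

```python
import pandas as pd

# Hypothetical receipt totals in CZK; only the bucket edges reflect the text.
receipts = pd.Series([30, 45, 20, 250, 480, 40, 35, 300, 15, 48])

# Bucket each receipt by spend band, then compute each band's share of customers.
buckets = pd.cut(
    receipts,
    bins=[0, 50, 200, 500, float("inf")],
    labels=["<=50", "50-200", "200-500", ">500"],
)
share = buckets.value_counts(normalize=True).sort_index()
# share["<=50"] is the fraction of small-basket customers with upsell potential
```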