The age old conflict. IT needs centralization, governance, standards and control; on the other side of the coin? Business units need the ability to move fast and try new things. How can we get lines of business access to the data they need to for projects so they can spend their time focused on discovering new insights? Typically they get stuck in a bottleneck of IT requests or spending 80% of their time doing data integration and preparation. Neither group seems particularly excited to do it, and I don’t blame them. For the analyst it increases the complexity of their tasks and seriously raises the technical knowledge requirements. For IT, it’s a major distraction from their main purpose in life, an extra thing to do. Self serve BI is trying to destroy the backlogged “report factories,” only to replace them with “data stores,” which are sadly even less equipped for the job at hand. Either way, the result is a painfully inefficient process, straining both ends of the value chain in any company that embarks on the data driven journey.
The Bi-Modal BI Answer?
An organization's ability to effectively extract value from data and analytics while maintaining a well governed source of truth is the difference between competitive advantage or sunken costs and missed opportunities. How can we create an environment that provides the agile data access needed by the business users while still maintaining sound data governance? Gartner has referred to a Bi-modal IT strategy. A big challenge with Bi-modal IT is that it pushes IT management to divide their efforts between ITs traditional focus and a more business focused agile methodology.
The DBA and Analyst Divide
Another major challenge in data access comes from the separation between DBAs and business users. Although the technical side may have the necessary expertise to implement ETL projects, they often lack the business domain expertise needed to make the correct assumptions around context and how the data is regarded. With so many projects competing for resources, we shouldn’t have to task a DBA on all of them. Back to the flip side of the coin, data analysts and scientists want the right data for their tools of choice and they want it fast. Even though there is growing set of data integration tools that allows individual business units to create and maintain their own data projects, this typically requires a lot of manual data modeling and can lead to siloed data or inconsistent metrics.
Instead of controlling all of BI, IT can enable the business to develop their analytics without sacrificing control and governance standards. So how can we get the right data in the hands of people who understand and need it in a timely manner?
Focus on Data, Not Configuration
Thanks to the cloud, we can free ourselves of desktop ETL tools and data silos by creating an environment where data problems don’t exist. By bringing together all of the underlying architecture needed for data projects there’s no longer the need to cobble together technologies for storage, data warehousing, integration, preparation and transformation; we’ve taken care of it. So what? Nothing to install, free up DBAs and developer resources.
Data For All
Data curation includes all the functions needed to provide accessible, understandable, annotated quality data to the business user regardless of their tools of choice. It’s important to use a platform that can bring in legacy and cloud data and connect it to all the desired points of consumption instead of having each team use a spot tool. This also helps avoid data silos. Analysts might want to do data discovery and create visualizations in Tableau, Looker, GoodData or any other number to tools. At the same time, we need to enable data scientists to perform tasks using SQL, R and Python. Can the product team also bring in additional data sources to enrich their client facing data product? What if we want to connect our customers CRM data to our product but one set of users have Salesforce and the other set uses SugarCRM? Even with the analytics we provide our clients, might they also want a direct feed of their data so they can do further analysis? The whole goal is to get data to the people that understand it so they can extract more value from it.
What About Governance?
Data governance is the set of processes, policies and technology that organizations use to store data in its native form, process it, prepare and transform it, track it and maintain its lifecycle. One of the benefits of a cloud based platform is the ability for each end user to access the data they need while still maintaining a well governed source of truth. Since all the data is on a single platform, it tracks data lineage with a meticulous audit trail of where the data came from, what happened to it and where it went. This also creates an environment where IT can determine which end users get access to specific data sets.
Architecture & Security
In a typical model, storage, warehousing, ETL and data prep are separate components that need to be brought together. These are all connection points that need to be managed, knobs that need to be turned (especially as different components release new version, etc) and security considerations that need to be accounted for. By bringing all the components under one roof, the ability to manage updates and ensure security at all the connection points becomes much simpler. There is no longer a need to maintain infrastructure, install and update software or continuously ensure compatibility. And don’t worry, all of this is accomplished through easy to use interface and some SQL.
Another performance challenge we recently solved is the competition for storage and compute resources with multiple teams running recurring and one-off data loads and executing analysis at the same time. Thanks to our multi-tenant architecture and a recent technology partnership that integrated the Snowflake cloud data warehouse into Keboola Connection (KBC,) we are able to provision storage and compute on the fly. This ensures the necessary resources to support all users and workloads simultaneously while maintaining the flexibility to change as projects do.
The goal is to get business units, power users, the product team, partners, clients and any other relevant users the data they need with the speed and flexibility to shape and compare data the way they need to, while still maintaining governance and control. By combining the necessary elements into a single platform it allows companies to focus on doing something with data, not configuration.
You might also like our blog on 3 Critical Steps to Evangelize the New in Business Intelligence and Analytics. Please contact us if you'd like to learn more about Keboola's data curation platform!