Data Warehouse Costs Soar, ROI Still Not Realized

Enterprises are pouring money into data management software, to the tune of $73 billion in 2020, but are seeing very little return on their data investments. According to a new study from Dremio, the SQL Lakehouse company, produced by Wakefield Research, only 22% of the data leaders surveyed have fully realized ROI in the past two years, and most (56%) have no consistent way of measuring it.

In addition to the survey findings, the report introduces the Data Value Scorecard, a measure of companies’ efforts to unlock the value or ROI of data. Through a series of metrics that assess their data policies, the Scorecard provides a quick, aggregated view of how efficiently companies and their employees use and manage data.

“Data leaders are frequently concerned about the out-of-control costs of data warehouses, particularly as their workloads grow –  a big surprise to them since the data warehouse cost of entry can be low to start. Moreover, the lack of predictability around future costs leads to very poor financial governance for data leaders and CFOs alike,” said Billy Bosworth, CEO, Dremio. “While companies understand they have an expensive sunk cost for data warehouses, they are keen on ways to drive newer analytics architectures directly from open data lakes for better financial governance and highly performant queries against vast amounts of data.” 

The report found that in order to run analytics, enterprises are making multiple copies of their data – 12 copies on average. A staggering 60% report that their company has over 10 copies of their data floating around. 

To add insult to injury, 82% say that their end users have used inconsistent versions of the same dataset at the same time due to their extract, transform, and load (ETL) processes, undermining data integrity and trust and slowing down the decision-making process. Notably:

Among all data leaders who use a data warehouse, 94% report concerns over it.

79% are concerned about scaling their architecture; only 20% report having no concerns about the cost of scaling.

84% of data leaders say it is normal for data analysts at their company to work with a partial data set – this means many data analysts may be working with insufficient data.

Just 16% expect fresh data in minutes or hours; 51% expect it in terms of weeks.

More than 3 in 4 data leaders (76%) say they are locked into certain vendors due to their closed systems. This means just a quarter of companies have true freedom to explore new and innovative data tools and analytics.

Data Value Scorecard

Respondents answered questions pertaining to their data processes and architectures, including common pain points for inefficient set-ups, underestimating timelines, using inconsistent or partial versions of datasets and being locked into certain vendors. Respondents were scored on a pass-fail basis.

Despite spending years investing in ways to collect data, few have figured out how to efficiently and effectively use data for business purposes. On average, companies with data leaders score 26% – meaning they passed 2.6 metrics.

28% passed, answering that it is “very easy” for end users to access data and develop insights.

20% passed, saying data project timelines “rarely or never” underestimate ETL times.

20% passed, reporting their company has “little to no” restrictions on data access for governance.
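The pass-fail scoring described above comes down to simple arithmetic, sketched here in Python. The report excerpt does not state how many metrics the Scorecard contains; the count of ten used below is an assumption, inferred only from the fact that an average score of 26% is said to equal 2.6 metrics passed. The function name and the sample respondent are hypothetical and purely illustrative.

```python
def scorecard_percent(results):
    """Return the share of pass/fail metrics passed, as a percentage."""
    if not results:
        raise ValueError("at least one metric is required")
    # True counts as 1 and False as 0, so sum() gives the number passed.
    return 100.0 * sum(results) / len(results)


# ASSUMPTION: ten metrics, inferred from "26% means 2.6 metrics passed".
ASSUMED_METRIC_COUNT = 10

# Hypothetical respondent who passes 3 of the 10 assumed metrics.
respondent = [True, False, False, True, False,
              False, True, False, False, False]
respondent_score = scorecard_percent(respondent)

# Convert the reported average score (26%) back into metrics passed.
reported_average_percent = 26
average_metrics_passed = reported_average_percent * ASSUMED_METRIC_COUNT / 100

print(respondent_score)        # 30.0
print(average_metrics_passed)  # 2.6
```

Under that assumption, the reported average means a typical company fails more than seven of the ten measures.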

About the Dremio Data Value Scorecard

The Dremio Data Value Scorecard is a survey scorecard developed by Dremio and executed by Wakefield Research. Wakefield Research conducted a quantitative research study in June 2021 among 500 Data & Analytics Leaders at enterprises in the U.S., UK, Germany, Denmark, Sweden, Norway, Australia, Hong Kong, and Singapore. Enterprises were polled regarding their data value — that is, how efficiently they are able to use data within their organization — and their ability to use data for business decisions.
