There is a lot of talk about AI in the enterprise. But you can’t train models without a tremendous amount of data. The good news is that corporations and other organizations generate a ton of it. The bad news is that most of it is so-called “Dark Data.”
Gartner defines “Dark Data” as: “… the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). Similar to dark matter in physics, dark data often comprises most organizations’ universe of information assets. Thus, organizations often retain dark data for compliance purposes only. Storing and securing data typically incurs more expense (and sometimes greater risk) than value.”
You can find it in the so-called “zombie” systems: legacy systems developed and maintained by internal IT resources. You can find it in Excel spreadsheets and other Office files scattered around the organization. Dark data isn’t part of the corporate “data lake”; it exists as loosely connected (if at all) “data cesspools” siloed throughout the organization. Typically, there is no standard structure to the data. In some cases, people don’t even know of its existence, or at least not broadly.
Organizations generate data faster than staff can figure out what to do with it.
A recent Splunk survey of business people and IT managers reported: “The only country in this survey where a high percentage of respondents said they’re working hard to meet the challenges of both dark data and AI is China.”
In addition to the risk of holding data (the risk of being hacked, the risk of violating laws like HIPAA, and the complications associated with storing data with third parties in the cloud), organizations are failing to capture the benefit this data can deliver. And they know it.
According to Splunk’s survey of more than 1,300 senior individuals, more than half of respondents say that “data-driven” is nothing more than a slogan for their organization. Yet, everyone recognizes the importance of AI for future competitiveness. “Today’s dark data could one day be an accelerant for even greater AI performance. Thus, the advent of AI and the value of dark data go hand in hand.”
At EdgeworthBox, we have embraced the Open Contracting Data Standard (OCDS).
“The Open Contracting Data Standard (OCDS) enables disclosure of data and documents at all stages of the contracting process by defining a common data model. It was created to support organizations to increase contracting transparency, and allow deeper analysis of contracting data by a wide range of users.”
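To make the idea of a common data model concrete, here is a minimal sketch in Python of what working with OCDS-style records might look like. The field names (ocid, id, date, tag, tender, awards) follow the published standard, but the records themselves, the identifiers, and the summarize helper are hypothetical and trimmed down for illustration. This is not a description of how EdgeworthBox stores or processes data.

```python
from collections import defaultdict

# Hypothetical, trimmed-down OCDS-style releases. Each release describes one
# stage of a contracting process; releases that share an "ocid" belong to the
# same process. All values below are invented for illustration.
releases = [
    {
        "ocid": "ocds-abc123-0001",
        "id": "0001-award",
        "date": "2023-03-02T00:00:00Z",
        "tag": ["award"],
        "awards": [{"id": "a1", "value": {"amount": 46500, "currency": "USD"}}],
    },
    {
        "ocid": "ocds-abc123-0001",
        "id": "0001-tender",
        "date": "2023-01-10T00:00:00Z",
        "tag": ["tender"],
        "tender": {
            "title": "Office cleaning services",
            "value": {"amount": 50000, "currency": "USD"},
        },
    },
]


def summarize(releases):
    """Group releases by contracting process and list the stages in date order."""
    stages = defaultdict(list)
    # ISO 8601 timestamps in the same time zone sort correctly as plain strings.
    for release in sorted(releases, key=lambda r: r["date"]):
        stages[release["ocid"]].extend(release["tag"])
    return dict(stages)


summary = summarize(releases)
print(summary)
```

Because every record uses the same field names, the same few lines of analysis work whether the releases come from one department's spreadsheet export or from an organization-wide repository.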
In practice, many organizations will want restricted transparency. For example, a corporation may want to share proprietary information only with its internal users for analytic purposes. EdgeworthBox permits private databases that serve as a repository for organization-specific data (RFPs, contracts, etc.). Because our business model permits an unlimited number of seat licenses for any organization using the service, that data can be opened to anyone in the organization, which transforms the RFP process into a truly strategic one. We sit on top of the existing RFx system, providing a lens into its inner workings for people across the company or agency and casting light onto previously dark data. Let’s talk about how to improve the RFP process.