Assessing Your Data Provenance Score

By: Georg Greve, Andrea Worrlein

Provenance describes the history of the state, custody, or location of something. The term was first used mostly in the arts, to answer questions such as, “Is this really a Picasso?” But provenance also applies to the digital world, where it is part of the bedrock on which our entire digital existence rests. Here are three largely overlooked aspects of provenance, along with how best to assess a company’s own digital provenance.

Try proving you're NOT a dog

Just ten years ago, our world was still largely “analog-first.” Our correspondence, contracts, invoices, important documents and certificates all had analog originals. That is rapidly changing now. Most of our certificates, obligations and assets are increasingly “digital-first.” This process started years ago, but accelerated dramatically with the COVID-19 crisis, and most of us have not yet fully considered the implications.

The analog world is much more secure and resistant to fraud. Falsifying physical documents leaves physical evidence, often requires special equipment, and sometimes even requires physical access to a certain location. By comparison, the digital world is ridiculously easy to falsify: any computer that is less than ten years old and has access to the Internet is enough to manipulate information anywhere in the world.

This fundamental difference reaches all the way down to the information that makes up who we are, who we know or what we have done. Without digital provenance, good luck trying to prove you did not sign that contract. In fact, good luck trying to prove you are not a dog.

Not your keys, not your provenance

All your digital provenance is based on a technology that is quite literally built on sand: bits and pieces that are indistinguishable from one another and have no physical properties that could reveal their age. Proving anything online always relies on cryptography, a peer-reviewed, human-curated selection of universal mathematical laws.

Thanks to cryptography, we can identify data sets with near-absolute certainty and detect any changes to these data sets, no matter how minute. Cryptography also allows us to prove ownership of digital assets by virtue of signed data. Anyone holding the private key that matches a signature is considered the owner. Possession may be nine-tenths of the law, but control of the keys is one hundred percent of digital ownership.
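To make the "no matter how minute" part concrete, here is a minimal sketch using SHA-256 from Python's standard library. The contract text and the fingerprint helper are invented for illustration, and hashing is only one of the cryptographic building blocks involved; it shows change detection, not signing or ownership.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical contract text; the two versions differ in one character.
original = b"Party A agrees to pay Party B 1,000 EUR."
tampered = b"Party A agrees to pay Party B 9,000 EUR."

original_digest = fingerprint(original)
tampered_digest = fingerprint(tampered)

# The same bytes always produce the same digest.
same_again = fingerprint(original) == original_digest

# Count how many of the 64 hex characters differ between the two digests.
differing = sum(a != b for a, b in zip(original_digest, tampered_digest))

print("original:", original_digest)
print("tampered:", tampered_digest)
print("unchanged data matches:", same_again)
print("hex characters that differ:", differing, "of", len(original_digest))
```

Anyone who kept the original digest can later tell the two versions apart without ever having seen the original text. What the digest alone cannot tell them is who created the data, or when.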

To complicate matters further, computers are agnostic to time. Data sets and keys may contain time stamps, but those only provide the time that the user set, either directly or by changing the system clock as desired. This way, data can claim to be from the past or the future, zero DeLoreans required.
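How little effort this takes can be shown with a few lines of standard-library Python. This is a sketch under simple assumptions: the file is a throwaway temporary file, the backdate helper is a made-up name, and the claimed date is arbitrary. It only touches the file system's modification time, which is one of several places a time stamp can live.

```python
import os
import tempfile
from datetime import datetime, timezone


def backdate(path: str, when: datetime) -> None:
    """Set the file's access and modification times to an arbitrary moment."""
    ts = when.timestamp()
    os.utime(path, (ts, ts))


# Create a throwaway file right now.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"This document was definitely written decades ago.")
    path = handle.name

# Claim the file was last modified in 1985 (an arbitrary date for illustration).
claimed = datetime(1985, 10, 26, 1, 21, tzinfo=timezone.utc)
backdate(path, claimed)

# Read back what the file system now reports.
recorded = datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)
os.remove(path)

print("file system says last modified:", recorded.isoformat())
```

Nothing in the file itself contradicts the claimed date, which is why a time stamp is only worth as much as the party vouching for it.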

There are companies that will provide the service of holding an organization’s keys, just as there are organizations that will provide time-stamping services. But that also means your entire provenance is beholden to these companies. And very often their terms of service make your provenance their property. You are merely allowed to make use of your own digital provenance within their terms of service. Violate their terms of service, or find the company caught in some regulatory dispute in a country far away, and you may lose it altogether.

Digital provenance lives in gated communities

When you use Gmail, only Google really knows what is real. The same is true for any other cloud service—and if you self-host, only you know the truth behind your digital provenance, but you might find it near impossible to prove that to anyone else.

Any data shared from any such platform always comes with an implicit disclaimer of “I hereby warrant that platform provider X says this is so.” There is no good way for a third party to verify any of it without cooperation from the platform. But what if the company you are using does not consider it worth their time to help you prove your digital provenance? What if they flat out refuse because it would cost them time and money?

