A data collector’s perspective on the pros & cons of open records

Data is growing at an exponential rate. In fact, 90% of the world's existing data was created in the last two years alone.

As recently as 1990, if you needed to know what year Colorado became a state, you'd probably have had to open an encyclopedia to find the answer. Today, a simple Google (or Siri or Alexa) search will produce the same answer in 0.76 seconds. No paper cuts and no lugging around 20-pound book sets.

Every second, over 40,000 searches are sent through Google from all over the world. People are constantly requesting information, interaction, and connection, all of which are provided by searching for and accessing data online. Data is spread far and wide throughout the web, in a multitude of forms. But it isn't always easy to find exactly what you are looking for, and Google isn't always the answer.

The Colorado legislature is debating proposed changes to the Colorado Open Records Act (CORA) and the role government should play in releasing documents and data in usable digital formats.

As a water data collection specialist, I have spent years behind a keyboard, researching public data and how it's accessed throughout the web. In my experience, there are three myths when it comes to finding and using public or government data: that it is 1) easy to find, 2) free, and 3) available in a usable format.

About three years ago, our company, Ponderosa Advisors, began developing a tool called Water Sage, which helps individuals better understand water use and water rights. In Colorado alone, Water Sage integrates water rights, well, and land data for nearly 2.5 million parcels; more than 160,000 water rights; and data for nearly 425,000 water wells and structures. It is the most comprehensive land parcel and water database in the state, and we couldn't have built it without access to quality public data.

In many instances, public data is provided in new and easy-to-use formats, like shapefiles, KML (Google Earth), or CSV (comma-separated values), making it easier to use or incorporate into value-add software programs like Water Sage. However, too often important public data sets with real-world applications are either completely inaccessible or in formats that require manual processing. For example, some counties still maintain land parcel data in paper format, requiring individual trips to county courthouses, as well as the expense of converting the data into a digital format. The amount of work required to use data maintained in this manner puts it out of reach for most people. The point is that while data may be public, it is not always easy to find, free, or easy to process.

Recent witness testimony on Colorado Senate Bill 40, which would amend our current CORA laws, helped put this challenge into perspective and shed some light on the unintended consequences of completely unfettered data, especially when it comes to our water. In particular, when Doug Kemper, the executive director of the Colorado Water Congress, voiced his concerns about protecting our water infrastructure, it naturally got my attention. We're in the water data business, and we agree that costs, privacy, and sensitive information are all legitimate concerns that should be part of the debate. But as with many things, we have to weigh the pros and cons, and it seems like everyone has an opinion to share.

In the end, meaningful analytics from public sources are only as good as the data you put into them. The better (and more complete) the data, the better the results will be. Colorado is already a leader among states in publishing public data, and it has an opportunity to become a national model.

Accessible, easy-to-use data can revolutionize the way Colorado engages with and solves its most critical issues. There will always be risks associated with creating easier access to information. We just have to decide whether the benefits outweigh the risks.