Government records stored on paper

Elon Musk was supposedly surprised to learn that processing government employee retirement records sometimes means going back into the paper archives to look up information on an employee’s earliest years with the government. Some records just aren’t digitized, or at least not fully digitized, especially old ones from decades ago.

In an ideal world, every government record would be digitized and available in a fully indexed, open, future-oriented, vendor-neutral database format, protected by secure encryption and a blockchain to ensure the record is consistent with the day it was entered. Data needs to be not only easily accessed by authorized employees and systems, but also readable generations into the future, and kept consistent and secure. The problem is that many of those technologies did not exist 30 or 40 years ago, and in many cases not even a decade ago.

The truth is we aren’t that far removed from when data storage was costly and computers were difficult to use and required advanced training. Even the era before computers wasn’t that long ago. Most government employees don’t retire until they have at least 30 years in, which means a start date of 1995 or earlier. That was before the Internet was mainstream, when a 2 GB hard drive on a home PC was a big thing. Many of the most dedicated government employees have been around decades longer, some with 40 or even 50 years of experience.

It’s not to say there weren’t servers back in 1985 or even 1975. The thing is the technology was primitive compared to what is available today. You packed bits back in the day; you had to consider every attribute you captured in every big data set, and store it in the most efficient way possible.
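To give a sense of what packing bits looks like, here is a minimal Python sketch. The layout is purely hypothetical (7 bits for years since 1900, 4 bits for the month, 5 bits for the day) and is not taken from any real government system; the function names `pack_date` and `unpack_date` are made up for illustration.

```python
import struct

# Hypothetical 16-bit layout, for illustration only:
#   bits 15-9: years since 1900 (0-127)
#   bits  8-5: month (1-12)
#   bits  4-0: day (1-31)

def pack_date(year, month, day):
    """Squeeze a calendar date into a single 16-bit integer."""
    offset = year - 1900
    if not (0 <= offset < 128 and 1 <= month <= 12 and 1 <= day <= 31):
        raise ValueError("date does not fit the 16-bit layout")
    return (offset << 9) | (month << 5) | day

def unpack_date(packed):
    """Reverse pack_date, returning (year, month, day)."""
    return 1900 + (packed >> 9), (packed >> 5) & 0xF, packed & 0x1F

packed = pack_date(1985, 6, 15)
as_bytes = struct.pack(">H", packed)   # 2 bytes on disk
as_text = "1985-06-15".encode("ascii") # 10 bytes as plain text
```

Two bytes instead of ten does not sound like much, but multiplied across every date field in every row it was the difference between a file that fit on the disk and one that didn't. The cost is that nobody can read the file without knowing the layout.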

Is there a lot of resistance in government to capturing and storing as much data as possible on a variety of topics? Absolutely. For one, people have legitimate privacy concerns. Data storage is cheap but not free, especially for datasets with millions if not billions of rows and hundreds of attributes. As Director of Data Services for the Assembly, one of the things I spend a lot of time thinking about is how much data we really need in our database. It’s not always apparent what data will be used immediately, and what will take up space and go stale before it’s ever used. A byte doesn’t take up much space, but a half billion of those bytes works out to half a gigabyte.
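The arithmetic is easy to check. A rough Python sketch, assuming decimal gigabytes (1 GB = 1,000,000,000 bytes) and ignoring indexes and other database overhead; `storage_gb` is just an illustrative helper name:

```python
BYTES_PER_GB = 1_000_000_000  # decimal gigabyte; an assumption, not a database setting

def storage_gb(rows, bytes_per_row):
    """Raw storage for a table, ignoring indexes and overhead."""
    return rows * bytes_per_row / BYTES_PER_GB

one_byte_column = storage_gb(500_000_000, 1)    # one extra byte per row
ten_byte_column = storage_gb(500_000_000, 10)   # e.g. a date stored as text
```

On a half billion rows, a single extra byte per row costs half a gigabyte, and a ten-byte text field costs 5 GB, before any indexes or backups are counted.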

Paper is often seen as secure. It’s much harder to alter a paper document, at least accidentally. It’s easy to physically control access to paper documents. It’s also seen as reliable, as long as it’s filed in the proper file cabinet and the archives aren’t destroyed by fire, flood or other disaster. Electronic storage with off-site backups might be a better option, but such technology used to be costly and not always available.

As the data director, working with many older clerks who grew up in an era of clunky, slow computers with limited and expensive storage, I do see some of the resistance to getting away from paper documentation of reports run and work completed. In their minds, it seems, paper is more convenient and secure, and less likely to get accidentally deleted. But paper is just used for documentation of completed tasks, as a backup, not as the primary storage of data. I am trying to get away from such things, to reduce the amount of paper in the garbage cans going to the landfill at the end of the day, but it’s tough to transition an institution, and you have to pick your battles where you can.

The government is old and it has a lot of long-time employees, but the truth is things are electronic where it counts. Infrequently accessed legacy records of limited use, like long-ago time sheets from when data storage was costly and paper was the norm, are fine to keep on paper in secure storage and pull manually as needed. That said, people should be doing the things that computers do poorly, like manually fixing records that aren’t systematically broken, while computers make the big changes to datasets with systematic problems that can easily be corrected using SQL or an AWK script. Things should continue to be digitized and automated whenever possible, but there will always be a need for some legacy paper records and manual input and correction.
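As an example of the kind of systematic problem a computer should handle, here is a small sketch using Python's built-in `sqlite3` module. The `records` table, the `county` column and the sample values are all invented for illustration; the point is only that one SQL statement can clean up inconsistent capitalization and stray spaces across every row at once.

```python
import sqlite3

def normalize_counties(conn):
    """Trim whitespace and upper-case county names; return rows changed."""
    cur = conn.execute(
        "UPDATE records SET county = UPPER(TRIM(county)) "
        "WHERE county <> UPPER(TRIM(county))"
    )
    conn.commit()
    return cur.rowcount

# Hypothetical sample data standing in for a much larger table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, county TEXT)")
conn.executemany(
    "INSERT INTO records (county) VALUES (?)",
    [(" albany",), ("ALBANY",), ("Albany ",), ("Erie",)],
)

changed = normalize_counties(conn)
counties = [row[0] for row in conn.execute("SELECT county FROM records ORDER BY id")]
```

A one-off typo in a single record is a different story. That is where a person who knows the data still beats any script.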
