Earlier this year, attackers entered Equifax’s system through a web-application vulnerability. As a result, the identity information of up to 143 million people (more than 40% of the U.S. population) was put at risk. The breach exposed names, addresses, birth dates, and Social Security numbers (SSNs), details that could help identity thieves take out loans, apply for credit cards, and do anything else for which a digital identity might be used.
How could this have happened?
The reason behind the latest data breach is almost too simple to believe. The hackers took advantage of a vulnerability in Apache Struts, the enterprise platform that Equifax uses. According to René Gielen, the Vice President of Apache Struts, the vulnerability was disclosed to Equifax back in March (at least two months before the data breach happened, and four months before Equifax says the breach was first discovered), along with a recommendation to update the software in order to protect against data exposure. Gielen further stated, “[t]he fact that Equifax was subsequently attacked in May means that Equifax did not follow that advice. Had they done so this breach would not have occurred.” After exploiting the vulnerability to gain a foothold, the attackers may have found troves of unprotected data immediately, or may have worked over a period of time, between mid-May and the end of July, to gain more and more access to Equifax’s systems.
At this point, it is important to note that Equifax is by no means the only company to have suffered a large-scale data breach. At the end of 2016, Yahoo announced two data breaches, which collectively exposed the information of upwards of one billion Yahoo accounts, making it the largest data breach in history. There was also the breach of the infamous AshleyMadison.com, a commercial website billed as enabling extramarital affairs, where hackers obtained data on some 32 million users and posted it online. Examples of similar attacks are endless.
What caused this data security problem?
It is no exaggeration to say that the data security problem has become too big to control. In the wake of the Equifax breach, consumers are being warned to guard their SSNs and other personal information, to place fraud alerts on their accounts, and the like. Top politicians are calling for regulatory agencies to investigate Equifax, and there is a scramble to support stronger consumer protection legislation. While the call for political action is certainly commendable, and Equifax’s own internal risk controls are certainly questionable, a larger piece of the puzzle is being overlooked.
Ultimately, the root of the problem, and to a large extent the biggest threat to cybersecurity, lies in the way that sensitive data is stored today.
In the case of Equifax, millions of people’s identities are all stored in one central location. This is why a hacker can obtain hundreds of millions of identities by getting past the Apache Struts platform, Equifax’s one central gatekeeper.
Is there a solution to this problem?
We are all familiar with the concept of “the cloud”, where data is stored on a central network of servers. In using a centralized storage service like the cloud, we place a large amount of trust in third parties such as Dropbox, iCloud, and Google Drive to secure sensitive and private data. Furthermore, the data stored on cloud services is typically unencrypted. On a blockchain, by contrast, files are encrypted, broken up into blocks, and then distributed across dozens of networks in dozens of geographical locations.
Blockchain’s innovation is that it moves data storage away from centralized servers and databases and toward a decentralized, peer-to-peer network. This would eliminate the security problems that come with the way data is currently centralized, because blockchain technology is a form of “distributed ledger technology”. The term “distributed ledger” refers to the concept that each authorized user shares the same “ledger”, or set of accounts, as defined by the blockchain structure. Furthermore, the distributed nature of a blockchain network means that every network participant (also known as a “node”) has a “master ledger copy” of the data, allowing any lost data to be validly rebuilt.
In my previous post, I described how a blockchain is essentially a network of computers that transfer blocks of records, transactions, and data in an unchangeable chronological chain, largely thanks to the hash function that blockchain is built upon. (A hash function is a mathematical formula that changes data or transactions into a long string of characters that are unreadable by humans.) Each additional block within the blockchain refers to the previous block’s hash as part of its own data. Because each block depends on the one before it through the hash function, it is extremely difficult to change the data on any blockchain: altering one block changes its hash and breaks the link to every block that follows.
Thus, even if a hacker expends the computational power and time needed to hack into one identity stored on a blockchain, the hacker would then have to spend the same effort to hack into each additional identity, because there is no longer a single point of entry.
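To make the chaining idea concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular blockchain is implemented: the function names (`block_hash`, `build_chain`, `is_valid`), the block layout, and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block's "previous hash"


def block_hash(index, data, prev_hash):
    """Return the SHA-256 hex digest of a block's contents (hypothetical layout)."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link records into a chain; each block stores the previous block's hash."""
    chain = []
    prev_hash = GENESIS_PREV_HASH
    for index, data in enumerate(records):
        current = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev_hash": prev_hash, "hash": current}
        )
        prev_hash = current
    return chain


def is_valid(chain):
    """Recompute every hash and check each block's link to its predecessor."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = build_chain(["record A", "record B", "record C"])
print(is_valid(chain))  # the untouched chain passes the check

chain[1]["data"] = "record B (tampered)"
print(is_valid(chain))  # the edited block no longer matches its stored hash
```

In this toy version, an attacker who edits one block would also have to recompute that block's hash and the hash of every later block, and then persuade the other nodes to accept the rewritten chain.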
How can blockchain make data security a reality?
The solution to the cybersecurity threat facing companies like Equifax, which handle extremely sensitive data, lies in the structure of a permissioned blockchain. There are two kinds of blockchain setups: i) permissioned and ii) permissionless. As the names suggest, the difference between them lies in the accessibility, or “openness”, of the blockchain ledger. As described by Bitcoin Magazine, permissioned blockchains are believed to offer the advantages of digital currencies powered by public blockchains (fast and cheap transactions permanently recorded in a shared ledger) without the troublesome openness of the Bitcoin network, where anyone can be a node on the network anonymously. On the Bitcoin network, which is permissionless, anyone with an internet connection can participate with no need to register or provide identification. Permissioned blockchains, on the other hand, allow points of control to be inserted into the blockchain structure.
To understand how points of control in a permissioned blockchain can mitigate cybersecurity threats, the consensus protocol behind blockchain technology needs to be fleshed out. In essence, valid data may only be added to a blockchain with the approval of the nodes. Depending on how the consensus protocol of a blockchain is set up, that approval may require a simple majority (51% or above) or some higher supermajority threshold. If the nodes do not reach consensus on the validity of a transaction represented by a specific data entry, that transaction will not be permitted onto the blockchain. This means that the nodes themselves are the gatekeepers of the network, which ensures that invalid data or changes cannot be added to the blockchain.
In a permissioned blockchain, initial access to the blockchain could be controlled by the existing nodes: for any new party to gain access to the network, the existing nodes would first have to approve the change to the blockchain’s state that granting access represents. A blockchain cannot be altered unless the nodes reach a consensus that the proposed change is permitted. Every change to the blockchain’s state is recorded as a new transaction hashed into a new block, and is subject to a consensus among the nodes as to its validity. A hacker seeking access would therefore have to propose a new transaction, in a new block, that changes the blockchain’s state from its old state (no access for the hacker) to a new state in which the hacker has access; until that change is accepted, the old state (no access) remains in place. The proposed block would then go through the blockchain’s verification and consensus protocol among the other nodes. Because the new block (which contains the new transaction) would be idiosyncratic, and because the hacker would be the only node proposing the change (without the support of the other nodes that the consensus protocol requires), the new block, and with it the hacker’s proposed change to the blockchain’s state, would be rejected by the other nodes.
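The approval rule described above can be sketched as a simple vote count. This is a simplified illustration under stated assumptions, not a real consensus protocol: the names `reaches_consensus` and `try_append` are hypothetical, each node is assumed to cast one equal vote, and "majority" is taken to mean strictly more than the threshold share.

```python
from fractions import Fraction


def reaches_consensus(votes, threshold=Fraction(1, 2)):
    """Return True if the share of approving nodes is strictly above threshold.

    votes maps a node name to True (approve) or False (reject).
    """
    if not votes:
        return False
    approvals = sum(1 for approved in votes.values() if approved)
    return Fraction(approvals, len(votes)) > threshold


def try_append(ledger, entry, votes, threshold=Fraction(1, 2)):
    """Append entry to the ledger only if the nodes reach consensus on it."""
    if reaches_consensus(votes, threshold):
        ledger.append(entry)
        return True
    return False


ledger = []
votes = {"node_a": True, "node_b": True, "node_c": False}

# Simple majority: 2 of 3 nodes approve, so the entry is accepted.
print(try_append(ledger, "entry 1", votes))

# Stricter rule (assumed here: more than 2/3 must approve): the same votes fall short.
print(try_append(ledger, "entry 2", votes, Fraction(2, 3)))

print(ledger)
```

Raising the threshold makes it harder for a small group of nodes to push through a change, at the cost of making legitimate changes harder to approve as well.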
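The access-control flow in this paragraph can be illustrated with a toy model. The `PermissionedLedger` class and its `propose_access` method are hypothetical names invented for this sketch, and the simple-majority rule is an assumption; real permissioned blockchains define their own membership and voting rules.

```python
class PermissionedLedger:
    """Toy model: access is a state change that current members must approve."""

    def __init__(self, founding_members):
        self.members = set(founding_members)
        self.blocks = []  # accepted state changes, in order

    def propose_access(self, candidate, approvals):
        """Grant candidate access only if a majority of current members approve.

        approvals is the set of node names endorsing the change. Endorsements
        from non-members, including the candidate itself, are ignored.
        """
        valid_approvals = set(approvals) & self.members
        if 2 * len(valid_approvals) > len(self.members):
            self.members.add(candidate)
            self.blocks.append(
                {
                    "type": "grant_access",
                    "node": candidate,
                    "approved_by": sorted(valid_approvals),
                }
            )
            return True
        return False  # state is unchanged: the candidate still has no access


ledger = PermissionedLedger({"bank", "auditor", "regulator"})

# The hacker is the only party backing the change, and is not a member.
print(ledger.propose_access("hacker", {"hacker"}))

# A legitimate newcomer backed by two of the three members is admitted.
print(ledger.propose_access("new_lender", {"bank", "auditor"}))

print(sorted(ledger.members))
```

A rejected proposal leaves both the member set and the block list untouched, which mirrors the point that the old state (no access) simply remains in place.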
For companies that must maintain extremely sensitive information pertaining to digital identities, the nodes on the blockchain could include government agencies that can validly identify any user and, more importantly, blacklist any user who is on a terrorist watch list or who has a record of committing identity theft. Depending on how the consensus protocol is structured, a hacker would have to control 51% of the nodes (or whatever majority the protocol requires), which would demand a practically insurmountable amount of computational power. Furthermore, if the consensus protocol required government agencies to approve a change, the hacker would also have to secure the government agency’s consent, an obvious impossibility for a known hacker or identity thief.
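One way to picture a protocol in which an agency's consent is mandatory is a rule that combines a majority with a list of required approvers. The function name `is_approved`, the node names, and the rule itself are assumptions made for illustration, not a description of any deployed system.

```python
def is_approved(votes, required_approvers):
    """Approve only with a majority of nodes AND every required approver.

    votes maps a node name to True (approve) or False (reject).
    required_approvers is a collection of node names whose consent is mandatory.
    """
    approving = {node for node, approved in votes.items() if approved}
    has_majority = 2 * len(approving) > len(votes)
    has_required = set(required_approvers) <= approving
    return has_majority and has_required


# Three of four nodes approve, but the (hypothetical) agency node refuses.
votes = {"agency": False, "node_a": True, "node_b": True, "node_c": True}
print(is_approved(votes, {"agency"}))

# The same request succeeds once the agency also consents.
votes["agency"] = True
print(is_approved(votes, {"agency"}))
```

Under this rule, even an attacker who somehow controlled most of the ordinary nodes would still be blocked by a single refusing agency.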
In an interview with the International Business Times, Nick Szabo points out that proper financial controls at large banks are already somewhat decentralized, thanks to a “human blockchain” of accountants, auditors, etc. checking each other’s work. In a way, having additional points of control in a permissioned blockchain act as multiple gatekeepers parallels that comparison. Unlike the “human blockchain”, the cost of verifying transactions on a permissioned blockchain is lower because there are fewer trusted validators. This interconnected, multi-level way of verifying users and their access to information is what companies currently lack, both because of cost and because there is no streamlined and secure process for verification among multiple actors.
At the end of the day, blockchain technology does not render cybersecurity threats obsolete, but it does make attacks immensely more difficult. As an infrastructure, blockchain technology has the potential to enhance the privacy, security, and freedom of data, and it could vastly improve how data is stored in a way that is truly revolutionary.
All opinions published on this blog are my own and do not reflect the opinions of any institutions that I am affiliated with in any capacity.