What is IPFS (InterPlanetary File System)?

One of the key technologies for the next generation of the Internet could be the InterPlanetary File System (IPFS). It is a peer-to-peer (p2p) file sharing system that aims to change the way information is distributed around the world. IPFS combines several innovations in communication protocols and distributed systems into a file system unlike any other. To understand the breadth and depth of what IPFS is trying to accomplish, it helps to understand the technological advances that make it possible.


Communication protocols and distributed systems

For two parties to exchange information, they need a common set of rules, known as a communication protocol. Early computers existed largely as isolated devices; broadly interoperable networking took hold in the early 1980s, when common protocol suites such as TCP/IP were standardized.

Communication protocols usually come in packages (called protocol suites) made up of several layers, each responsible for specific functions. In addition to the protocols, it is important to understand how the connected computers relate to one another. This is known as the system architecture. There are several types, but only two matter here: client-server and peer-to-peer networks.

The Internet is dominated by client-server relationships, which are based on the Internet protocol suite. Within it, the Hypertext Transfer Protocol (HTTP) is the foundation for communication on the web. This model has solved many scalability and security problems, but control of the data still belongs to whoever controls the server, whether that is the legitimate operator or an intruder. The client-server model and HTTP have served the Internet quite reliably for most of its history, but they were not designed to transfer large amounts of data, which is a problem today.


InterPlanetary File System (IPFS)

IPFS attempts to address the shortcomings of the client-server model and HTTP with a new open-source p2p file sharing system. This system is a synthesis of several new and existing innovations. Hundreds of developers around the world have contributed to IPFS; its main components are described below.


Distributed hash tables

A hash table is a data structure that stores information in the form of key/value pairs. In a distributed hash table (DHT), the pairs are spread across a network of computers and coordinated so that any node can locate and retrieve a value efficiently. The main advantages of a DHT are decentralization, fault tolerance, and scalability. Nodes do not require central coordination, the system keeps working when individual nodes fail, and a DHT can scale to accommodate millions of nodes. Together, these features result in a system that is generally more resilient than client-server structures.
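
To make the idea of key placement concrete, here is a minimal Python sketch. It assumes an XOR distance between hashed identifiers, in the style of Kademlia-like DHTs; the node names and the helper functions key_id and closest_nodes are hypothetical and do not come from any IPFS library.

```python
import hashlib

def key_id(data: bytes) -> int:
    """Map arbitrary bytes to a 256-bit integer identifier."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def closest_nodes(key: bytes, node_names: list[str], k: int = 2) -> list[str]:
    """Return the k nodes whose identifiers are nearest to the key by XOR distance."""
    target = key_id(key)
    return sorted(node_names, key=lambda n: key_id(n.encode()) ^ target)[:k]

# Hypothetical network of four nodes.
nodes = ["node-a", "node-b", "node-c", "node-d"]
responsible = closest_nodes(b"my-file-chunk", nodes)
print(responsible)

# If the closest node fails, the next-closest survivor takes over the key
# without any central coordinator being involved.
survivors = [n for n in nodes if n != responsible[0]]
print(closest_nodes(b"my-file-chunk", survivors))
```

Every node can run the same calculation locally, which is why no central directory is needed to decide where a key lives.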


Block exchange

The popular file-sharing system BitTorrent successfully coordinates data transfers between millions of nodes by relying on an innovative data exchange protocol, but that protocol is limited to the torrent ecosystem. IPFS implements a generalized version of it, called BitSwap, which works as a marketplace for any type of data.
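
The marketplace idea can be illustrated with a toy Python model. This is only a sketch under simplifying assumptions: the Peer class, its fields, and the exchange function are hypothetical names, and the real BitSwap protocol involves networking and exchange strategies that are not modeled here.

```python
from dataclasses import dataclass, field

@dataclass
class Peer:
    name: str
    blocks: dict = field(default_factory=dict)    # block key -> bytes held locally
    want_list: set = field(default_factory=set)   # keys this peer is looking for
    bytes_sent: int = 0                           # simple ledger of traffic
    bytes_received: int = 0

def exchange(a: Peer, b: Peer) -> None:
    """Each peer sends the other every wanted block it happens to hold."""
    for sender, receiver in ((a, b), (b, a)):
        for key in sorted(receiver.want_list & set(sender.blocks)):
            block = sender.blocks[key]
            receiver.blocks[key] = block
            receiver.want_list.discard(key)
            sender.bytes_sent += len(block)
            receiver.bytes_received += len(block)

alice = Peer("alice", blocks={"k1": b"hello"}, want_list={"k2"})
bob = Peer("bob", blocks={"k2": b"world!"}, want_list={"k1"})
exchange(alice, bob)
print(sorted(alice.blocks), alice.bytes_sent, alice.bytes_received)
```

Keeping a per-peer record of bytes sent and received is what would let a node favor peers that give as much as they take.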


Merkle DAG

A Merkle DAG is a mixture of a Merkle tree and a directed acyclic graph (DAG). Merkle trees ensure that blocks of data exchanged in p2p networks are correct, undamaged, and unchanged. This verification is done by organizing blocks of data using cryptographic hash functions. A hash function takes input data and computes a fixed-length alphanumeric string (the hash) that is, for practical purposes, unique to that input. It is easy to check that a given input produces a given hash, but computationally infeasible to recover the input from the hash.
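
As a small illustration, the following Python snippet uses SHA-256 from the standard library as an example hash function. The helper name digest and the sample inputs are chosen here purely for demonstration.

```python
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-256 hash of the input as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

original = digest(b"hello ipfs")
tampered = digest(b"hello ipfs!")

# Checking that an input matches a known hash is cheap: just hash it again.
print(original == digest(b"hello ipfs"))   # True
# Any change to the input, however small, yields a different hash.
print(original == tampered)                # False
```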

The individual data blocks are called "leaf nodes". Their hashes are combined and hashed again to form "non-leaf nodes". These non-leaf nodes can in turn be combined and hashed until all data blocks are represented by a single root hash.
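
The construction of a root hash can be sketched in a few lines of Python. This is a minimal sketch, not the layout IPFS itself uses: it assumes SHA-256, pairs nodes two at a time, and duplicates the last node on odd-sized levels, which is just one common convention.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks: list[bytes]) -> bytes:
    """Hash the blocks into leaves, then pair and re-hash until one root remains."""
    if not blocks:
        raise ValueError("need at least one block")
    level = [h(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])   # assumed convention: duplicate the odd node out
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

blocks = [b"block-1", b"block-2", b"block-3"]
root = merkle_root(blocks)
print(root.hex())

# Changing a single block changes the root, so one hash vouches for all the data.
print(merkle_root([b"block-1", b"block-X", b"block-3"]) == root)   # False
```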

Simply put, a DAG is a way of modeling information as nodes connected by one-way links that never loop back on themselves. A simple example of a DAG is a family tree. A Merkle DAG is a data structure in which hashes are used to reference data blocks and objects in a DAG. This creates several useful features: all content in IPFS can be uniquely identified because each data block is addressed by its hash, identical content is stored only once, and the data is resistant to unauthorized modification, since any change to a block changes its hash.
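
A content-addressed store with hash links can be sketched as follows. The put and get helpers and the JSON node layout are assumptions made for illustration; IPFS uses its own object encoding and identifier format.

```python
import hashlib
import json

store: dict[str, bytes] = {}   # hash -> encoded node

def put(data: str, links: tuple[str, ...] = ()) -> str:
    """Store a node and return its hash, which doubles as its address."""
    encoded = json.dumps({"data": data, "links": list(links)}, sort_keys=True).encode()
    key = hashlib.sha256(encoded).hexdigest()
    store[key] = encoded
    return key

def get(key: str) -> dict:
    """Fetch a node and verify that it still matches its address."""
    encoded = store[key]
    if hashlib.sha256(encoded).hexdigest() != key:
        raise ValueError("block does not match its hash")
    return json.loads(encoded)

chunk1 = put("chunk one")
chunk2 = put("chunk two")
root = put("file", (chunk1, chunk2))   # the parent links to its children by hash

print(get(root)["links"] == [chunk1, chunk2])   # True
print(put("chunk one") == chunk1)               # True: identical content, same address
```

Because a parent contains the hashes of its children, changing any child produces a new parent hash as well, all the way up to the root.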


IPFS version control systems

Another powerful feature of the Merkle DAG structure is that it allows you to create a distributed version control system (VCS). The most popular example of this is Git, which underlies hosting services such as GitHub and lets developers work on projects easily, collaboratively, and simultaneously. Git stores and manages files as objects in a Merkle DAG. This allows users to independently duplicate and edit multiple versions of a file, save those versions, and later merge changes with the original file.

IPFS uses a similar model for data objects: as long as the objects corresponding to the original data and any new versions are available, the entire file history can be retrieved. Because blocks of data are stored locally on nodes throughout the network and can be cached indefinitely, IPFS objects can in principle be stored permanently.
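
The way a file history falls out of hash links can be shown with a short sketch. The commit and history functions and the object layout below are hypothetical and greatly simplified compared with Git or IPFS objects.

```python
import hashlib
import json
from typing import Optional

objects: dict[str, dict] = {}   # hash -> version object

def commit(content: str, parent: Optional[str] = None) -> str:
    """Store a version that points at its predecessor by hash."""
    obj = {"content": content, "parent": parent}
    key = hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()
    objects[key] = obj
    return key

def history(head: str) -> list[str]:
    """Follow parent links from the newest version back to the first."""
    versions = []
    key: Optional[str] = head
    while key is not None:
        obj = objects[key]
        versions.append(obj["content"])
        key = obj["parent"]
    return versions

v1 = commit("draft")
v2 = commit("draft, revised", parent=v1)
v3 = commit("final", parent=v2)
print(history(v3))   # ['final', 'draft, revised', 'draft']
```

Knowing only the hash of the latest version is enough to recover every earlier one, provided the older objects are still stored somewhere.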

In addition, IPFS does not depend on the Internet Protocol (IP) specifically. Data can be distributed across overlay networks, that is, networks built on top of another network. These features are notable because they are key elements of a censorship-resistant network. This can be a useful tool for promoting free speech and countering the spread of Internet censorship around the world, but we must also be aware of the potential for abuse by bad actors.


Self-Certifying File System (SFS)

The last important component of IPFS that we will look at is the Self-Certifying File System (SFS). This is a distributed file system that does not require special permissions to exchange data. It is "self-certifying" because the file name itself embeds the server's public key (in practice, a hash of it), so the client can authenticate the server and the data it sends without relying on an external authority. As a result, you can securely access remote content with the transparency of local storage.

IPFS builds on this concept to create the InterPlanetary Name System (IPNS). It is an SFS that uses public key cryptography to self-certify objects published by network users. We mentioned earlier that all objects in IPFS can be uniquely identified, and the same applies to nodes. Each node in the network has a public key, a private key, and a node identifier, which is a hash of its public key. Nodes can therefore use their private keys to "sign" any data objects they publish, and the authenticity of that data can be verified using the sender's public key.
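
The "self-certifying" part of this scheme, where a name is derived from a key, can be sketched with the standard library alone. The snippet assumes SHA-256 and uses random bytes as a stand-in for a real public key; the function names are hypothetical, and signature creation and verification are deliberately left out because they require a public-key cryptography library.

```python
import hashlib
import os

def node_id(public_key: bytes) -> str:
    """Derive a node identifier by hashing the public key."""
    return hashlib.sha256(public_key).hexdigest()

def key_matches_name(name: str, presented_key: bytes) -> bool:
    """Check that a presented key really belongs to the name being resolved."""
    return node_id(presented_key) == name

public_key = os.urandom(32)   # stand-in bytes, not a real key
name = node_id(public_key)

print(key_matches_name(name, public_key))         # True
print(key_matches_name(name, b"some other key"))  # False
```

Because the name is derived from the key, nobody has to vouch for the pairing: anyone who receives the key can recompute the hash and compare it with the name.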


Why it matters

IPFS aims to provide high bandwidth, low latency, distributed data, decentralization, and security. It can be used for content delivery to websites, global file storage with automatic version control and backup, secure file sharing, and encrypted communication.

It is also used as a complementary file system for public blockchains and other p2p applications. Right now, it can cost several dollars to store a kilobyte of data in an Ethereum smart contract. This is a major hurdle at a time when the number of new decentralized applications (DApps) is growing rapidly. IPFS is compatible with smart contracts and blockchain data, so it can add reliable and inexpensive storage capacity to the Ethereum ecosystem. A separate protocol known as IPLD (InterPlanetary Linked Data) aims to make Ethereum blockchain data natively available in IPFS.


IPFS Problems

Despite the impressive performance of IPFS, some problems have not yet been fully solved. First, content addressing in IPNS is currently not very user-friendly. A typical IPNS link is a long, hash-based identifier rather than a human-readable name.

These links can be mapped to simpler names using the Domain Name System (DNS), but this creates an external point of failure for content distribution. Nevertheless, content is still available through the original IPNS address. Some users also report that IPNS can be slow to resolve names, with delays of up to several seconds.

IPFS also gives nodes little incentive to keep long-term copies of data on the network. Nodes can choose to purge cached data to save space, meaning that files can theoretically "disappear" over time if no node remains that stores them. At current usage levels this is not a significant problem, but in the long run, backing up large amounts of data requires strong economic incentives.
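
The purging behavior can be pictured with a toy cache. This sketch assumes a "pin" flag that marks blocks a node has decided to keep; the NodeCache class and its methods are hypothetical and only model the idea, not the actual IPFS storage layer.

```python
class NodeCache:
    """Toy model of one node's local block storage."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}
        self.pinned: set[str] = set()

    def add(self, key: str, data: bytes, pin: bool = False) -> None:
        self.blocks[key] = data
        if pin:
            self.pinned.add(key)

    def collect_garbage(self) -> list[str]:
        """Drop every block that is not pinned and return the removed keys."""
        removed = sorted(k for k in self.blocks if k not in self.pinned)
        for key in removed:
            del self.blocks[key]
        return removed

node = NodeCache()
node.add("site-index", b"<html>...</html>", pin=True)
node.add("viewed-once", b"cached copy")

print(node.collect_garbage())   # ['viewed-once']
print(sorted(node.blocks))      # ['site-index']
```

If every node that holds a block treats it as disposable cache, the block vanishes from the network once the last copy is collected, which is why lasting storage needs someone with a reason to keep it.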



IPFS is a very ambitious endeavor. Using IPFS is very interesting, and understanding the technical magic that makes it possible is even more exciting. If successful, IPFS and its additional protocols could provide a fault-tolerant infrastructure for the next generation of the Internet. A network that, by definition, should be pervasive, secure, and transparent can indeed become so.