Before we move on to the structure that organisations currently follow for transactions, let's take a quick look at the types of database systems that exist today. A database is a structured set of information that can be accessed, updated and modified in an efficient and simple way.
We classify databases into three types — centralised, decentralised and distributed.
In the next video, Jeeven explains them with an industry example. Let’s have a look at it.
In the previous video, you saw three types of database systems: Centralised, Decentralised and Distributed.
Let's try to understand them further with an example. Suppose Phil is an entrepreneur who started a home furniture business. He set up several showrooms in the city to sell the furniture, and created one warehouse to store all of it and supply the showrooms as required. You can map this scenario to the centralised database system: just as the single warehouse stores all the furniture, a central database stores all the data.
As his business expanded, he ventured into multiple cities and set up showrooms in each of them. Phil realised that one warehouse would no longer meet his requirements, so he set up one warehouse in each city he entered. The furniture for a city was stored in that city's warehouse and distributed from there to the showrooms in that city. This can be mapped to the decentralised database system: multiple warehouses hold the merchandise, and the showrooms fetch the furniture from these warehouses. Similarly, in a decentralised database system, the information is not all stored in one place but across multiple places or databases.
Further, as the demand increased, Phil set up one warehouse for each showroom, taking into account the cost of transportation and the nature of the market. Now every showroom has its own warehouse to store and retrieve the furniture. You can map this scenario to the distributed database system. In a distributed database system, the data is shared across the entire network.
Thus, any database can be classified as a centralised, decentralised or distributed database.
A distributed database can be summed up as a stretched version of a decentralised database. The main differences between the two systems lie in:
- where or how the decision is made
- how the information is shared across the participating nodes
In the decentralised database, there is no single point where the overall decision is made. Every node in the system makes its own decision, and the system behaviour is the sum of those responses. Also, a single node may or may not have the complete information about the system as a whole depending on the architecture.
Distributed databases are best described as a system where data processing is shared across all the nodes, but the system decision might still be centralised, based on the complete system knowledge.
Consider an example where you have 100 nodes or participants in a network.
If we consider these participants in a centralised system, there will be 1 central server and 99 nodes, all communicating individually with the central server. The total number of connections in this setup will be 99. Information sent from one node to another will be routed through the central server.
If the same example were used for a decentralised system, let's assume there are 10 decentralised servers and 90 nodes split between them, such that each server communicates with its own subnet of 9 nodes and is also connected to the other 9 peer servers. The total number of connections for each server will be 9 + 9 = 18. Here, the direction of a connection is not important: a connection from server 1 to server 2, once counted, does not have to be counted again as a connection from server 2 to server 1. So, the total number of connections in the entire system will be 90 (9 sub-nodes * 10 servers) + 45 (10C2 connections among the servers) = 135. Information from one node to any other node will be routed through the network of decentralised servers.
When we consider a distributed system for the above example, every node in the system will be connected to every other node. Thus, the total number of connections for the entire network will be 100C2 = 4950, and a node can communicate directly with any other node in the network without relying on any server. One important thing to note here is that, for a distributed database, there is no limit to the incoming connections on a node (meaning a node can receive data/information from as many nodes as possible at a time). However, the outgoing connections are limited (meaning a node can only send information to a certain number of nodes at one time).
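The connection counts for the three setups can be checked with a short calculation. The Python sketch below is only an illustration of the arithmetic in the 100-participant example; the function names are hypothetical, and the decentralised case assumes, as above, that every server is linked to all of its peer servers.

```python
from math import comb


def centralised_links(total: int) -> int:
    """One central server; every other participant links only to it."""
    return total - 1


def decentralised_links(servers: int, nodes_per_server: int) -> int:
    """Each server links to its own sub-nodes and to every peer server.

    Connections are undirected, so server-to-server links are counted once.
    """
    return servers * nodes_per_server + comb(servers, 2)


def distributed_links(total: int) -> int:
    """Every participant links directly to every other participant."""
    return comb(total, 2)


print(centralised_links(100))      # 99
print(decentralised_links(10, 9))  # 90 + 45 = 135
print(distributed_links(100))      # 4950
```

Changing the arguments shows how quickly the fully connected count grows compared with the other two setups.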
Now let’s consider what happens when a server fails in each of the above systems. In a centralised system, if the server fails, the entire network is affected, i.e. 100% of the system fails. In a decentralised system, if 1 server collapses, 10% of the network is affected. The remaining 90% of the network continues to operate although at a reduced level. Considering a distributed database with no server, if 1 node fails only 1% of the network fails. The remaining 99% of the system is unaffected and continues to function efficiently.
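The failure percentages above follow from counting how many of the 100 participants are cut off by a single failure. The sketch below is a simplified illustration under the same assumptions as the example (10 servers with 9 sub-nodes each in the decentralised case); `affected_share` is a hypothetical helper, not part of any library.

```python
def affected_share(failed_participants: int, total: int) -> float:
    """Percentage of the network that stops working after a failure."""
    return 100 * failed_participants / total


TOTAL = 100

# Centralised: the central server fails, so all 100 participants are cut off.
centralised = affected_share(TOTAL, TOTAL)

# Decentralised: one server fails along with its subnet of 9 nodes.
decentralised = affected_share(1 + 9, TOTAL)

# Distributed: a single node fails and no other node depends on it.
distributed = affected_share(1, TOTAL)

print(centralised, decentralised, distributed)  # 100.0 10.0 1.0
```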
Listed below are the critical points of differentiation between the three types of systems we have encountered so far.
Feature | Centralised | Decentralised | Distributed |
---|---|---|---|
Security | Low; Most vulnerable to data security issues | Moderate; Data can be rebuilt from parallel servers if backed up | Highest; Very difficult to lose data completely |
Response Speed (applicable to networks holding large amounts of data) | Bottlenecks can cause response speed to reduce significantly | Quick response speed depending on the distribution of data | Fastest response rates |
Overheads and Costs | Low; Redundancy is minimized | Substantial processing overheads to ensure proper coordination among servers | Massive overheads to ensure appropriate coordination among multiple nodes |
Points of Failure / Maintenance | Single point of failure; Easy to maintain | A limited number of points of failure; Maintenance more complex than centralised systems | Multiple points of failure; Difficult to maintain |
Stability | Highly unstable; if the central server fails, entire network collapses | Stability better than centralised systems; the network can continue to operate at a reduced level if any one server fails | The highest level of stability; single node failure doesn’t affect the network |
Scalability | Low scalability | Moderately scalable | Highly scalable |
Ease of Setup | Easy to set up | Difficult to set up | Difficult to set up |
Kindly go through this link, which explains the ACID properties and introduces the CAP theorem, followed by the BASE properties.
This link gives you a detailed explanation of the various types of NoSQL databases and compares them with a relational database. You can also refer to this link for an overview of the pros and cons of relational and non-relational databases.
You can watch this video to further your understanding of centralised, decentralised and distributed computing.