SUBJECT: DATA PROCESSING
CLASS : SS 3
TERM: 2ND TERM
TOPIC: Parallel database
A parallel database system seeks to improve performance by parallelizing operations such as loading data, building indexes and evaluating queries. Although data may be stored in a distributed fashion, the distribution is governed solely by performance considerations. Parallel databases improve processing and input/output speeds by using multiple CPUs and disks in parallel. They are needed because centralized and client–server database systems are often not powerful enough for applications that must query very large databases or process very large numbers of transactions per second. In parallel processing, many operations are performed simultaneously, as opposed to serial processing, in which the computational steps are performed one after another.
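The difference between serial and parallel processing can be sketched in a few lines of Python. This is only an illustration, not how a real database engine is built: the names `partition`, `serial_total` and `parallel_total` and the `sales` data are made up for this example. Threads stand in for the separate CPUs, so the sketch shows the structure of the work (split, process each part, combine) rather than a real gain in speed.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, parts):
    """Split rows into `parts` slices, dealing values out in turn."""
    return [rows[i::parts] for i in range(parts)]


def serial_total(rows):
    """Serial processing: one worker visits every value in turn."""
    total = 0
    for value in rows:
        total += value
    return total


def parallel_total(rows, workers=4):
    """Parallel processing: each worker totals its own partition,
    then the partial totals are combined into one answer."""
    chunks = partition(rows, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_totals = list(pool.map(serial_total, chunks))
    return sum(partial_totals)


sales = list(range(1, 1001))  # hypothetical column of 1,000 values
print(serial_total(sales) == parallel_total(sales))  # both give the same total
```

Both functions return the same answer. The parallel version simply divides the work so that several workers can be busy at the same time.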
Architectures of Parallel Database
Shared memory architecture
In this architecture, multiple processors share the main memory (RAM) space as well as mass storage such as hard disk drives (HDD). If many processes run simultaneously, performance drops because the processors compete for the same memory, much as a single computer slows down when it runs many tasks at once.
Shared disk architecture
In this architecture, each node has its own main memory, but all nodes share mass storage, usually a storage area network (SAN). In practice, each node usually also has multiple processors.
Shared nothing architecture
In this architecture, each node has its own mass storage as well as its own main memory, so the nodes share neither memory nor disks.
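A shared nothing arrangement can be pictured with a small Python sketch. It assumes a simple rule, invented for this example, in which a row is stored on the node given by its key modulo the number of nodes. The `Node` class, the `owner` function and the student records are all hypothetical.

```python
class Node:
    """One shared-nothing node with its own private storage."""

    def __init__(self, name):
        self.name = name
        self.storage = {}  # stands in for the node's own disk

    def store(self, key, row):
        self.storage[key] = row

    def count(self):
        return len(self.storage)


def owner(key, nodes):
    """Pick the node that owns a key (simple modulo partitioning)."""
    return nodes[key % len(nodes)]


nodes = [Node("node0"), Node("node1"), Node("node2")]
for student_id in range(1, 10):
    owner(student_id, nodes).store(student_id, {"id": student_id})

# Each node counts only its own rows; the partial counts are then combined.
total = sum(node.count() for node in nodes)
print([node.count() for node in nodes], total)  # [3, 3, 3] 9
```

No node ever reads another node's storage. Only the small partial results travel between nodes.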
The Benefits of Parallel Database
Parallel database technology can benefit certain kinds of applications by enabling:
*. Higher Performance
*. Higher Availability
*. Greater Flexibility
*. More Users
Higher Performance: With more CPUs available to an application, higher speed-up and scale-up can be attained.
Higher Availability: Nodes are isolated from each other, so a failure at one node does not bring the whole system down.
Greater Flexibility: An Oracle Parallel Server (OPS) environment is extremely flexible. Database instances can be allocated or deallocated as necessary.
More Users: Parallel database technology can make it possible to overcome memory limits, enabling a single system to serve thousands of users.
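The speed-up and scale-up mentioned under Higher Performance can be worked out with simple division. The sketch below assumes the usual textbook definitions: speed-up compares the time the same task takes on a small and on a larger system, and scale-up compares a small task on a small system with a proportionally larger task on a proportionally larger system. The function names and the timings are invented for illustration.

```python
def speed_up(time_small_system, time_large_system):
    """Same task on a bigger system: how many times faster it finishes."""
    return time_small_system / time_large_system


def scale_up(time_small_task_small_system, time_large_task_large_system):
    """Bigger task on a proportionally bigger system.
    A value of 1.0 means the system scaled perfectly (linear scale-up)."""
    return time_small_task_small_system / time_large_task_large_system


# Hypothetical measurements in seconds.
print(speed_up(120, 40))   # 3.0
print(scale_up(120, 150))  # 0.8
```

A speed-up of 3.0 means the task finished three times faster. A scale-up of 0.8 means the larger system handled the larger task a little more slowly than perfect scaling would allow.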
Distributed Database
In a distributed database, data is stored in different systems across a network, for example on mainframes, personal computers, laptops and cell phones.
Advantages of distributed database:
1) In a distributed database, data can be stored in different systems like personal computers, servers, mainframes, etc.
2) A user doesn't need to know where the data is physically located. The database presents the data to the user as if it were located locally.
3) The database can be accessed over different networks.
4) Data from tables located on different machines can be joined and updated.
5) Even if a system fails, the integrity of the distributed database is maintained.
6) A distributed database is secure.
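Points 2 and 4 above can be imitated on one computer with Python's built-in `sqlite3` module. This is only a local simulation: two separate in-memory databases stand in for two machines, and SQLite's `ATTACH DATABASE` statement makes both reachable through one connection. The table names, student names and scores are invented for this example.

```python
import sqlite3

# The main database plays "site A". A second, separate in-memory database
# is attached as "site_b" and plays a remote machine.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS site_b")

conn.execute("CREATE TABLE main.students (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE site_b.scores (student_id INTEGER, score INTEGER)")
conn.executemany("INSERT INTO main.students VALUES (?, ?)",
                 [(1, "Ada"), (2, "Bola")])
conn.executemany("INSERT INTO site_b.scores VALUES (?, ?)",
                 [(1, 78), (2, 91)])

# Update a row stored at the other site without naming the site.
conn.execute("UPDATE scores SET score = 80 WHERE student_id = 1")

# The join names only the tables, not where they are stored.
rows = conn.execute(
    "SELECT s.name, sc.score FROM students AS s "
    "JOIN scores AS sc ON sc.student_id = s.id ORDER BY s.id"
).fetchall()
print(rows)  # [('Ada', 80), ('Bola', 91)]
conn.close()
```

The query reads the same as it would if both tables were stored in one place, which is the idea behind presenting the data as if it were local.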
Disadvantages of distributed database:
1) Since the data is accessed from a remote system, performance is reduced.
2) Static SQL cannot be used.
3) Network traffic is increased in a distributed database.
4) Database optimization is difficult in a distributed database.
5) Different data formats are used in different systems.
6) Different DBMS products are used in different systems, which increases the complexity of the system.
7) Managing the system catalog is a difficult task.
8) While recovering a failed system, the DBMS has to make sure that the recovered system is consistent with other systems.
9) Managing distributed deadlock is a difficult task.
Types of Distributed Databases
Distributed databases can be broadly classified into homogeneous and heterogeneous distributed databases.
Homogeneous Distributed Databases
In a homogeneous distributed database, all the sites use identical DBMS and operating systems.
Heterogeneous Distributed Databases
In a heterogeneous distributed database, different sites have different operating systems, DBMS products and data models.
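The two definitions can be turned into a simple check. The sketch below assumes each site is described by a small record giving its DBMS and operating system. The `classify` function, the field names and the site details are hypothetical, and the list of sites is assumed not to be empty.

```python
def classify(sites):
    """Return 'homogeneous' if every site uses the same DBMS and the same
    operating system, otherwise 'heterogeneous'."""
    combinations = {(site["dbms"], site["os"]) for site in sites}
    return "homogeneous" if len(combinations) == 1 else "heterogeneous"


school_sites = [
    {"dbms": "DBMS-A", "os": "OS-1"},
    {"dbms": "DBMS-A", "os": "OS-1"},
]
bank_sites = [
    {"dbms": "DBMS-A", "os": "OS-1"},
    {"dbms": "DBMS-B", "os": "OS-2"},
]
print(classify(school_sites))  # homogeneous
print(classify(bank_sites))    # heterogeneous
```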
Some of the common architectural models for a distributed DBMS (DDBMS) are:
Client-server architecture for DDBMS
Peer-to-peer architecture for DDBMS
Middleware, the software layer used by distributed applications, provides services for the various components of a distributed system.
© Lesson Notes All Rights Reserved 2023