
Interprocess Communication in Distributed System

We are familiar with the inter-process communication mechanisms of an OS:

  • Named Pipe
  • Anonymous Pipe
  • Message Queue
  • Socket
  • File
  • Signal
  • …etc

This list could be longer, but all of these can be organized into two categories:

  • File Type:
    • simple file
    • named pipe
    • network
  • Memory Type:
    • shared memory
    • message queue
    • signal

But in a distributed system, memory and disk can't be shared (at least not without software assistance), so the network becomes the single, and therefore most important, way to communicate. Today, we focus in detail on how inter-process communication works in a distributed system.

Definition

Inter-process communication is defined as message passing between a pair of processes, whether they run on the same host or not. In a distributed system, we cannot assume the communication happens on the same host, so we hide the difference by using the network.

The following are some design considerations of inter-process communication:

  • destination: how do we locate and address the other process?
  • reliability: should the approach handle message omission and host crashes?
  • ordering: should the approach guarantee the order of messages?

Types

When it comes to the specific ways to achieve inter-process communications, we can have following options:

  • Socket: an abstraction over both UDP and TCP
  • Higher abstractions:
    • Indirect messaging
    • RMI

The higher abstractions use sockets internally and provide easier interfaces to the user. This time, we focus only on the basic, lower level: the socket.

Socket Details

Sockets actually come in two types:

UDP – datagram based: each message has a boundary, so a send is not buffered but transmitted immediately as a single datagram.

TCP – stream based (producer and consumer): there is no message boundary, so TCP may buffer outgoing data, and we can force transmission with a flush. This behavior stems from TCP's stream nature.
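The datagram behavior above can be sketched with two sockets on the loopback interface. This is a minimal illustration, not production code; the class name and payload are made up, and the OS picks the receiver's port:

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class UdpEcho {
    public static void main(String[] args) throws Exception {
        // Receiver bound to an OS-chosen ephemeral port on loopback
        try (DatagramSocket receiver = new DatagramSocket(0, InetAddress.getLoopbackAddress());
             DatagramSocket sender = new DatagramSocket()) {

            byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
            // Each send() produces exactly one datagram: the message boundary
            // is preserved, and the packet is handed to the network immediately.
            sender.send(new DatagramPacket(payload, payload.length,
                    receiver.getLocalSocketAddress()));

            byte[] buf = new byte[1024];
            DatagramPacket in = new DatagramPacket(buf, buf.length);
            receiver.receive(in);   // blocks until one whole datagram arrives
            System.out.println(new String(in.getData(), 0, in.getLength(),
                    StandardCharsets.UTF_8));
        }
    }
}
```

Note that `receive` returns exactly one datagram per call; with TCP, a single `read` could return any number of bytes, which is precisely the boundary difference described above.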

Java API for Socket

The API for stream communication assumes that when a pair of processes are establishing a connection, one of them plays the client role and the other plays the server role, but thereafter they could be peers.

Java's socket-related classes mirror the differences between TCP and UDP, representing them as streams and datagrams respectively.

A stream in the Java network API is simplex (one direction only), because the input buffer and output buffer are separate; note, however, that the actual underlying TCP connection is duplex.
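A small sketch of the client/server roles and the two simplex streams, assuming both ends run in one JVM on loopback for demonstration (class and thread names are illustrative):

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class TcpStreams {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Client role: connects and writes one message
            Thread client = new Thread(() -> {
                try (Socket s = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                     DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
                    out.writeUTF("ping");
                    out.flush();   // force the buffered bytes onto the wire
                } catch (Exception e) { throw new RuntimeException(e); }
            });
            client.start();

            // Server role: accepts and reads from its input stream.
            // getInputStream() and getOutputStream() are two simplex streams
            // over the same duplex TCP connection.
            try (Socket s = server.accept();
                 DataInputStream in = new DataInputStream(s.getInputStream())) {
                System.out.println(in.readUTF());
            }
            client.join();
        }
    }
}
```

After `accept` returns, both sockets are symmetric peers: either side may read or write, matching the "thereafter they could be peers" point above.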

Data Representation

Irrespective of the form of communication used, the data structures must be flattened (converted to a sequence of bytes) before transmission and rebuilt on arrival.
A byte is the minimal unit of data transmission and never changes in transit; what does change is:

  • bytes order: The individual primitive data items transmitted in messages can be data values of many different types, and not all computers store primitive values such as integers in the same order. The representation of floating-point numbers also differs between architectures. There are two variants for the ordering of integers: the so-called big-endian order, in which the most significant byte comes first; and little-endian order, in which it comes last.
  • how bytes are interpreted: Another issue is the set of codes used to represent characters: for example, the majority of applications on systems such as UNIX use ASCII character coding, taking one byte per character, whereas the Unicode standard allows for the representation of texts in many different languages and takes two bytes per character.

Three external data formats

  • CORBA's CDR – binary data, does not contain type info; it is assumed that client and server have prior knowledge of the order and types
  • Java serialization – binary data, contains the type info because it is also used for disk storage
  • XML – primitives are converted into a textual format, which is generally longer than a binary format (compare Protocol Buffers and JSON); self-describing
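The Java serialization entry above can be demonstrated with a round trip through a byte array, which stands in for the socket; the `Point` type and field values are made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializeDemo {
    // Type information travels with the bytes, so the receiver can
    // reconstruct the object without prior knowledge of its layout.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    public static void main(String[] args) throws Exception {
        // Flatten to bytes (what would be written to the socket)
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new Point(3, 4));
        }

        // Rebuild on "arrival"
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            Point p = (Point) in.readObject();
            System.out.println(p.x + "," + p.y);
        }
    }
}
```

The embedded class descriptor is why the serialized form is larger than a CDR-style encoding, which sends only the raw values and relies on both ends agreeing on the types in advance.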

More: Multicast

Sometimes we need multicast in inter-process communication. It is implemented by IP multicast or by more complex protocols, and it supports:

  • Fault tolerance through replicated services
  • Service discovery
  • Better performance
  • Propagation of event notification
