Java IO (3): Utility & Opt

Java 7 brought more IO improvements. Besides the arrival of AIO (Asynchronous IO), many other new classes were introduced in Java 7 and later versions. Most of them are utility classes and optimizations of the original IO model. In this blog, we will dive into those (AIO is left for the next blog).

Utility

New Abstraction: Path

The class Path is a programmatic representation of a path in the file system. It can be seen as a better-named and more functional version of File. We say so because File is not a good abstraction; it mixes several different responsibilities:

Operations for path manipulation: getName(), getParent() etc;
Operations for file metadata access (actually operations on the inode): length(), canExecute(), setWritable() etc;
Operations to read/write the content of a directory: listFiles(), createNewFile() etc;
No operations to read/write the content of a plain file.

So the File abstraction has a somewhat misleading name and violates the cohesion principle of class design.

Path, on the other hand, has two main kinds of operations. The first is syntactic operations, which manipulate paths without accessing the file system; they are logical manipulations done in memory, much like String operations. The second is registration with a WatchService, which we will cover later.
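The syntactic operations can be illustrated with a minimal sketch. The class name PathSyntaxDemo, its helper methods, and the sample paths are made up for illustration; none of the calls touch the disk, so the paths do not need to exist.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathSyntaxDemo {
    // Removes redundant "." and ".." elements; done purely in memory.
    static Path normalized(String raw) {
        return Paths.get(raw).normalize();
    }

    // Appends a relative child name to a base path.
    static Path child(Path base, String name) {
        return base.resolve(name);
    }

    public static void main(String[] args) {
        Path base = Paths.get("data", "logs");
        Path file = child(base, "app.log");

        System.out.println(file.getFileName());   // last name element
        System.out.println(file.getParent());     // everything before the last element
        System.out.println(file.getNameCount());  // number of name elements
        System.out.println(base.relativize(file));
        System.out.println(normalized("data/./logs/../conf"));
    }
}
```

Because nothing here accesses the file system, the same code works for files that have not been created yet.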

File Utility: Files

Files takes over some of the responsibilities of File: operations on metadata and on file content. It provides a set of isSomething() methods that we can use to perform various metadata checks before we actually manipulate a file or a directory. It also includes many utility functions to read and write the content of files and directories, such as newDirectoryStream() and lines().
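As a rough sketch of how these pieces fit together, the hypothetical FilesDemo class below checks metadata first and then reads content. Note that Files.lines() only arrived in Java 8; the file names and helper methods are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class FilesDemo {
    // Counts non-blank lines. Files.lines() reads lazily, so the stream must be closed.
    static long countNonBlankLines(Path file) throws IOException {
        if (!Files.isRegularFile(file) || !Files.isReadable(file)) {
            throw new IOException("Not a readable regular file: " + file);
        }
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    // Lists the names of the entries directly inside a directory.
    static List<String> listNames(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                names.add(entry.getFileName().toString());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("files-demo");
        Path file = dir.resolve("notes.txt");
        Files.write(file, Arrays.asList("first", "", "second"), StandardCharsets.UTF_8);

        System.out.println(Files.isDirectory(dir));
        System.out.println(countNonBlankLines(file));
        System.out.println(listNames(dir));
    }
}
```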

Exists?

An interesting detail about Files.exists() is that !Files.exists(...) is not equivalent to Files.notExists(...), i.e. the notExists() method is not the complement of the exists() method. This is because a file can be in a third state, unknown: when its existence cannot be determined (for example, because access is denied), both methods return false.
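A small sketch makes the three states explicit. The class ExistenceDemo and its state() helper are hypothetical names for this example only.

```java
import java.nio.file.Files;
import java.nio.file.Path;

class ExistenceDemo {
    // Maps the two boolean checks onto the three possible states.
    static String state(Path path) {
        if (Files.exists(path)) {
            return "exists";
        }
        if (Files.notExists(path)) {
            return "does not exist";
        }
        // Both checks returned false: existence could not be determined,
        // for example because a parent directory is not accessible.
        return "unknown";
    }

    public static void main(String[] args) throws Exception {
        Path temp = Files.createTempFile("exists-demo", ".tmp");
        System.out.println(state(temp));
        Files.delete(temp);
        System.out.println(state(temp));
    }
}
```

Either answer can be stale by the time we act on it, so code that goes on to open the file still has to handle the resulting exceptions.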

Watch Service

The Watch Service API was introduced in Java 7 (as part of NIO.2) as a thread-safe service that is capable of watching objects for changes and events. The most common usage is to monitor a directory for changes to its content through actions such as create, delete, and modify. It can be used in applications such as IDEs and programs with config files, so that they can update their state when a file changes.
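A minimal sketch of watching a directory might look like the following. The class WatchDemo, its two helpers, and the file name config.properties are assumptions for illustration; how quickly events arrive depends on the platform's implementation, so the timeout is deliberately generous.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

class WatchDemo {
    // Registers dir for create, modify, and delete events.
    static WatchService watch(Path dir) throws IOException {
        WatchService watcher = dir.getFileSystem().newWatchService();
        dir.register(watcher,
                StandardWatchEventKinds.ENTRY_CREATE,
                StandardWatchEventKinds.ENTRY_MODIFY,
                StandardWatchEventKinds.ENTRY_DELETE);
        return watcher;
    }

    // Waits up to timeoutMillis for one batch of events; returns them as "KIND name".
    static List<String> nextEvents(WatchService watcher, long timeoutMillis)
            throws InterruptedException {
        List<String> seen = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (key == null) {
            return seen; // nothing happened within the timeout
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                continue; // events were lost; a real program would rescan the directory
            }
            seen.add(event.kind().name() + " " + event.context());
        }
        key.reset(); // re-arm the key so later events are queued again
        return seen;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("watch-demo");
        try (WatchService watcher = watch(dir)) {
            Files.createFile(dir.resolve("config.properties"));
            System.out.println(nextEvents(watcher, 30_000));
        }
    }
}
```

A long-running program would call nextEvents() (or the blocking take()) in a loop on a dedicated thread and reload its state for each reported file.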

Scatter and Gather

As we said in the first blog of this series, we should minimize access to the disk and the underlying operating system, and minimize method calls. To that end, Java provides vectored IO, also known as scatter/gather IO, which can perform multiple IO operations in one method call.

Besides the performance gain, vectored IO can also provide atomicity (multiple reads/writes without interleaving from other threads) if the underlying operating system supports it.
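As an illustration, the sketch below writes a length header and a message body from two separate buffers with a gathering write, then reads them back into two buffers with a scattering read. The class VectoredIoDemo and the 4-byte header layout are assumptions made for the example, not a standard format.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class VectoredIoDemo {
    static final int HEADER_SIZE = 4; // assumed layout: 4-byte body length, then the body

    // Gathering write: header and body are handed to the channel as one buffer array.
    static void writeMessage(Path file, String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE);
        header.putInt(bytes.length);
        header.flip();
        ByteBuffer body = ByteBuffer.wrap(bytes);
        ByteBuffer[] parts = { header, body };
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            // One call usually suffices; loop in case of a partial write.
            while (header.hasRemaining() || body.hasRemaining()) {
                ch.write(parts);
            }
        }
    }

    // Scattering read: the channel fills the header buffer first, then the body buffer.
    static String readMessage(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE);
            ByteBuffer body = ByteBuffer.allocate((int) ch.size() - HEADER_SIZE);
            ByteBuffer[] parts = { header, body };
            while (header.hasRemaining() || body.hasRemaining()) {
                if (ch.read(parts) < 0) {
                    break; // unexpected end of file
                }
            }
            header.flip();
            int length = header.getInt();
            return new String(body.array(), 0, length, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("vectored-demo", ".bin");
        writeMessage(file, "hello scatter/gather");
        System.out.println(readMessage(file));
    }
}
```

Without the buffer array we would either issue two separate write calls or first copy header and body into one larger buffer.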

File Lock

File locks are held on behalf of the entire Java virtual machine. They are less useful than they sound, because on many platforms they are advisory rather than mandatory, and they cannot coordinate threads inside one JVM:

“They are not suitable for controlling access to a file by multiple threads within the same virtual machine.” (Java Platform SE 7 official documentation)
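The JVM-wide ownership can be observed with a short sketch: once the JVM holds a lock on a file, a second attempt on the same region from the same JVM fails with OverlappingFileLockException instead of waiting. The class FileLockDemo and its helper are hypothetical names for this example.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class FileLockDemo {
    // Returns true if a second lock attempt from the same JVM is rejected.
    static boolean secondLockRejected(Path file) throws IOException {
        try (FileChannel first = FileChannel.open(file, StandardOpenOption.WRITE);
             FileChannel second = FileChannel.open(file, StandardOpenOption.WRITE);
             FileLock lock = first.tryLock()) {
            if (lock == null) {
                // tryLock() returns null when another process holds the lock.
                throw new IOException("File is locked by another process: " + file);
            }
            try {
                FileLock again = second.tryLock();
                if (again != null) {
                    again.release();
                }
                return false;
            } catch (OverlappingFileLockException e) {
                // The JVM as a whole already owns a lock on this region.
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("lock-demo", ".lock");
        System.out.println(secondLockRejected(file));
    }
}
```

To coordinate threads within one process, an in-memory lock such as java.util.concurrent.locks.ReentrantLock is the usual tool; file locks are meant for cooperation between processes.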

Common Opt Example

We introduced some basic principles of IO optimization in the first blog of this series. In this blog, we dive into more specific examples.

Random Access Buffer

If we have a large file and our accesses show locality, we can use a buffer to reduce the number of IO operations at the cost of more memory:

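// Assumed fields: RandomAccessFile raf; int bufsize; byte[] inbuf = new byte[bufsize];
// long startpos = -1, endpos = -1; (file offsets currently covered by inbuf)
// Returns the byte at offset pos as a value in 0..255, or -1 at EOF or on an IO error.
public int read(long pos) {
  if (pos < startpos || pos > endpos) {
    // Cache miss: load the aligned block that contains pos.
    long blockstart = (pos / bufsize) * bufsize;
    int n;
    try {
      raf.seek(blockstart);
      n = raf.read(inbuf);
    } catch (IOException e) {
      return -1;
    }
    startpos = blockstart;
    endpos = blockstart + n - 1;
    if (pos < startpos || pos > endpos) {
      return -1; // pos lies beyond the end of the file
    }
  }
  // Mask with 0xff (not 0xffff) so that a negative byte maps to 128..255.
  return inbuf[(int) (pos - startpos)] & 0xff;
}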

Compression

Whether compression helps or hurts I/O performance depends a lot on the local hardware configuration, specifically the relative speeds of the processor and the disk drives. Compression using Zip technology typically yields around a 50% reduction in data size, but at the cost of some time to compress and decompress.

An example of where compression is useful is writing to very slow media such as floppy disks. A test using a fast processor (300 MHz Pentium) and a slow floppy (the conventional floppy drive found on PCs) showed that compressing a large text file and then writing it to the floppy drive results in a speedup of around 50% over simply copying the file directly to the floppy drive.
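To check the trade-off on our own hardware, a sketch like the one below can be used. It uses GZIP from java.util.zip, which is based on the same DEFLATE algorithm as Zip; the class CompressionDemo and the sample text are assumptions for illustration, and the actual ratio depends heavily on the data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CompressionDemo {
    // Compresses raw bytes in memory; closing the stream writes the GZIP trailer.
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
            gzip.write(raw);
        }
        return sink.toByteArray();
    }

    // Restores the original bytes from GZIP data.
    static byte[] decompress(byte[] packed) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                sink.write(chunk, 0, n);
            }
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            text.append("a line of fairly repetitive log text\n");
        }
        byte[] raw = text.toString().getBytes(StandardCharsets.UTF_8);

        long start = System.nanoTime();
        byte[] packed = compress(raw);
        long micros = (System.nanoTime() - start) / 1_000;

        System.out.println("raw bytes:        " + raw.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("compress time us: " + micros);
    }
}
```

Compression pays off only if the time saved by writing fewer bytes exceeds the CPU time measured here, which is why the result differs between slow and fast media.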
