Copying or transferring data is one of the most common operations in computing. To understand and appreciate zero-copy, we first need to understand what happens during an ordinary copy. Let's take the example of reading a file from the hard disk and sending it over a socket. The application calls read(), which switches from user mode to kernel mode (a context switch); the kernel reads the data from the disk into a kernel buffer, copies it into the application's buffer, and returns (a second context switch). The application then calls write() to push the same data back into the kernel (a third context switch), where it is copied into the socket buffer to be sent out, and the call returns (phew... a fourth context switch). In effect, the application serves as an inefficient intermediary that moves the data from the disk file to the socket at the cost of four context switches. Mind you, these context switches between the application and the kernel are a pretty costly affair.
Using an intermediate kernel buffer (rather than transferring the data directly into the user buffer) might seem inefficient at first glance, but this intermediate buffer does improve performance. On the read side, it acts as a "lookahead cache" when the application hasn't asked for as much data as the kernel buffer holds, which significantly improves performance when the requested amount is smaller than the kernel buffer size. On the write side, it allows the write to complete asynchronously.
Unfortunately, this approach itself can become a performance bottleneck if the size of the data requested is considerably larger than the kernel buffer size. The data gets copied multiple times among the disk, kernel buffer, and user buffer before it is finally delivered to the application.
Zero copy improves performance by eliminating these redundant data copies and reducing the number of context switches between user mode and kernel mode, which are very expensive. In the zero-copy approach, the data is transferred directly from the kernel buffer to the socket buffer without passing through the application, so the number of context switches for the whole operation drops to two. Zero copy is an OS-dependent operation, but luckily Java exposes it: the transferTo() method of the FileChannel class (Java NIO) follows the zero-copy approach where the underlying operating system supports it.
Needless to say, the performance is greatly enhanced. Let's look at some code to get a feel for the performance improvement with zero-copy. The first set of classes, SlowClient.java and SlowServer.java, implements the normal IO-based copy across sockets. The second set, ZeroCopyClient.java and ZeroCopyServer.java, uses the zero-copy approach for the same operation.
// SlowClient.java: sends a file over a socket using the traditional read/write loop.
package com.test.myblog;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;

public class SlowClient {
    public static void main(String[] args) {
        final int ERRORCODE = 1;
        String server = "127.0.0.1";
        String fname = "d:/jdk-6u16-windows-i586.exe"; // here goes the file name
        // try-with-resources closes the socket and both streams, even on failure
        try (Socket socket = new Socket(server, SlowServer.PORT);
             FileInputStream inputStream = new FileInputStream(fname);
             OutputStream output = socket.getOutputStream()) {
            System.out.println("Connected with server " + socket.getInetAddress() + ":" + socket.getPort());
            long start = System.currentTimeMillis();
            byte[] b = new byte[SlowServer.BYTES];
            int read;
            long total = 0;
            while ((read = inputStream.read(b)) >= 0) {
                total += read;
                // write only the bytes actually read, not the whole buffer
                output.write(b, 0, read);
            }
            System.out.println("total bytes sent:" + total + " and total time taken:" + (System.currentTimeMillis() - start));
        }
        catch (IOException e) {
            // also covers UnknownHostException, which is a subclass of IOException
            e.printStackTrace();
            System.exit(ERRORCODE);
        }
    }
}
// SlowServer.java: accepts connections and drains whatever the client sends.
package com.test.myblog;

import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class SlowServer {
    public static final int BYTES = 4096;
    public static final int PORT = 2000;

    public static void main(String[] args) {
        try (ServerSocket serverSocket = new ServerSocket(PORT)) {
            System.out.println("Server waiting for client on port " + serverSocket.getLocalPort());
            while (true) {
                // closing the socket also closes its input stream
                try (Socket socket = serverSocket.accept()) {
                    System.out.println("A new connection accepted at:" + socket.getInetAddress() + ":" + socket.getPort());
                    InputStream input = socket.getInputStream();
                    byte[] byteArr = new byte[BYTES];
                    try {
                        // read() returns -1 at end of stream, i.e. once the client closes the connection
                        while (input.read(byteArr, 0, BYTES) >= 0) {
                            // the received data is simply discarded
                        }
                    }
                    catch (IOException e) {
                        System.out.println(e);
                    }
                    System.out.println("Connection closed by client: " + socket.getInetAddress());
                }
            }
        }
        catch (IOException e) {
            System.out.println(e);
        }
    }
}
// ZeroCopyClient.java: sends a file over a socket with FileChannel.transferTo().
package com.test.myblog;

import java.io.FileInputStream;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;

public class ZeroCopyClient {
    public static void main(String[] args) throws IOException {
        ZeroCopyClient client = new ZeroCopyClient();
        client.transferData();
    }

    public void transferData() throws IOException {
        String host = "127.0.0.1";
        String fname = "d:/jdk-6u16-windows-i586.exe";
        SocketAddress sad = new InetSocketAddress(host, ZeroCopyServer.PORT);
        // both channels are closed automatically; closing the FileChannel closes its stream
        try (SocketChannel sc = SocketChannel.open(sad);
             FileChannel fc = new FileInputStream(fname).getChannel()) {
            sc.configureBlocking(true);
            long fileSize = fc.size();
            long start = System.currentTimeMillis();
            long sent = 0;
            // transferTo() may send fewer bytes than requested, so loop until the whole file is sent
            while (sent < fileSize) {
                sent += fc.transferTo(sent, fileSize - sent, sc);
            }
            System.out.println("total bytes transferred: " + sent + " and time taken:" + (System.currentTimeMillis() - start));
        }
    }
}
// ZeroCopyServer.java: accepts connections and drains whatever the client sends.
package com.test.myblog;

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class ZeroCopyServer {
    public static final int PORT = 8009;
    ServerSocketChannel listener = null;

    public ZeroCopyServer() {
        InetSocketAddress inetSocketAddress = new InetSocketAddress(PORT);
        try {
            listener = ServerSocketChannel.open();
            ServerSocket serverSocket = listener.socket();
            serverSocket.setReuseAddress(true);
            serverSocket.bind(inetSocketAddress);
            System.out.println("Listening on port:" + inetSocketAddress.toString());
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        ZeroCopyServer copyServer = new ZeroCopyServer();
        copyServer.readData();
    }

    private void readData() {
        ByteBuffer byteBuffer = ByteBuffer.allocate(4096);
        try {
            while (true) {
                // the accepted channel is closed once the client is done
                try (SocketChannel socketChannel = listener.accept()) {
                    System.out.println("Accepted connection at socket: " + socketChannel);
                    socketChannel.configureBlocking(true);
                    int nread = 0;
                    // read() returns -1 once the client closes the connection
                    while (nread != -1) {
                        // the data is discarded: clear() makes the whole buffer available again
                        byteBuffer.clear();
                        try {
                            nread = socketChannel.read(byteBuffer);
                        }
                        catch (IOException e) {
                            e.printStackTrace();
                            nread = -1;
                        }
                    }
                }
            }
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Running the above programs, one can easily note that the application using the zero-copy approach performs better by approximately 75-80%.
Key points:
1. This technique is suitable for static content (of course, it would not make sense for dynamically generated content).
2. One of the key limitations of transferTo() is that the source must be a file: it can only be used to transfer data from a file to another file or from a file to a socket.
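The file-to-file case from point 2 can be sketched in a few lines. This is a minimal illustration rather than a benchmark: the class name FileToFileTransfer, the copy() helper and the temporary files are assumptions made for the example, and whether the copy actually avoids user-space buffers depends on the operating system.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class FileToFileTransfer {
    // Copies src to dst with transferTo(); a single call may move fewer
    // bytes than requested, so the call is repeated until everything is sent.
    static long copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long sent = 0;
            while (sent < size) {
                sent += in.transferTo(sent, size - sent, out);
            }
            return sent;
        }
    }

    public static void main(String[] args) throws IOException {
        // hypothetical test data: a 64 KB temporary file
        Path src = Files.createTempFile("zero-copy-src", ".bin");
        Path dst = Files.createTempFile("zero-copy-dst", ".bin");
        Files.write(src, new byte[64 * 1024]);
        System.out.println("bytes copied: " + copy(src, dst));
        Files.delete(src);
        Files.delete(dst);
    }
}
```

The same loop works when the target is a SocketChannel, which is the file-to-socket case used by ZeroCopyClient.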
Java NIO has some great features, but it's rather unfortunate that it has not received the kind of attention it deserves.
Tuesday, March 16, 2010
Wednesday, January 27, 2010
Java ClassLoader Architecture - Understanding Java Classloaders
The architecture of the Java ClassLoader is hierarchical in nature. At the root of the hierarchy is the ‘bootstrap’ classloader, followed by the ‘extension’ classloader and finally the ‘system’ (application) classloader. In other words, ClassLoaders form a tree with the “bootstrap” ClassLoader as its root.
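One way to see this hierarchy is to walk up from the system classloader with getParent(). The sketch below is only illustrative: the class name LoaderChain and the chain() helper are made up for the example, and the loader class names it prints differ between JVM versions (on Java 9 and later the extension loader's place is taken by a "platform" classloader). The bootstrap loader is typically reported as null, so it is added to the list by hand.

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChain {
    // Collects the class names of the loaders from the given one up to the root.
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        // getParent() returns null once the bootstrap loader is reached
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        // prints the system loader first and "bootstrap" last
        System.out.println(chain(ClassLoader.getSystemClassLoader()));
        // core classes come from the bootstrap loader, which shows up as null
        System.out.println(String.class.getClassLoader());
    }
}
```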
The Java ClassLoader model is a “delegating parent” model. What this means is that when a ClassLoader is asked to load a class, it first delegates the request to its parent ClassLoader. Only if the parent (and, in turn, its own ancestors) cannot supply the class does the child ClassLoader get the opportunity to load it.
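The delegation can be observed with a small custom loader. In this sketch, RecordingLoader is a hypothetical class written for the example: it overrides findClass(), which the default loadClass() calls only after the parent chain has failed, and simply records the names that reach it. The name "no.such.Type" is likewise made up and assumed not to exist on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegationDemo {
    // Records every class name that the parent chain could not supply.
    static class RecordingLoader extends ClassLoader {
        final List<String> askedHere = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            askedHere.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    public static void main(String[] args) throws Exception {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());

        // The parent chain supplies java.lang.String, so findClass() is never reached.
        Class<?> c = loader.loadClass("java.lang.String");
        System.out.println(c == String.class);   // true
        System.out.println(loader.askedHere);     // []

        // No ancestor knows this name, so the request finally reaches the child.
        try {
            loader.loadClass("no.such.Type");
        } catch (ClassNotFoundException expected) {
            // the recording loader itself cannot load anything
        }
        System.out.println(loader.askedHere);     // [no.such.Type]
    }
}
```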
The Java system provides for loading code from one of three places: first, the core library; second, the Extensions directory (or directories, if you modify the java.ext.dirs property to include multiple subdirectories); and last, the directories and .jar/.zip files found along the java.class.path property, which in turn comes from the CLASSPATH environment variable. Each of these three locations is covered by its own ClassLoader instance: the core classes by the bootstrap ClassLoader, the Extensions directory/directories by the extension ClassLoader, and the CLASSPATH by the system (or application) ClassLoader.
I hope you now have an overall idea of how classes are loaded in Java.
Now for a recap:
The bootstrap ClassLoader is implemented as part of the VM itself. It is this ClassLoader that brings the core Java classes into the VM, thereby allowing the rest of the JVM to load itself. Normally, this means loading code from the “rt.jar” file in the jdk/jre/lib subdirectory, but under the Sun JVM, the sun.boot.class.path system property actually controls it.
The extension ClassLoader, the first child of the bootstrap ClassLoader, is implemented in pure Java code. The extension ClassLoader’s primary responsibility is to load code from the JDK’s extension directories. This in turn gives users of Java the ability to simply drop in new code extensions (hence the name “Extension directory”), such as JNDI, without having to modify the CLASSPATH environment variable.
The system, or application, ClassLoader is the ClassLoader returned from the static method ClassLoader.getSystemClassLoader. This is the ClassLoader responsible for loading code from the CLASSPATH, and by default will be the parent to any user-created or user-defined ClassLoader in the system.
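The last point, that the system ClassLoader is the default parent of user-defined loaders, can be checked directly. This is a small sketch under the assumption that the loader is created through the no-argument ClassLoader constructor; PlainLoader and DefaultParentDemo are names invented for the example.

```java
public class DefaultParentDemo {
    // A user-defined loader that relies on the no-argument ClassLoader constructor.
    static class PlainLoader extends ClassLoader {
    }

    // True when a loader created without an explicit parent hangs off the system loader.
    static boolean defaultParentIsSystemLoader() {
        ClassLoader system = ClassLoader.getSystemClassLoader();
        ClassLoader custom = new PlainLoader();
        return custom.getParent() == system;
    }

    public static void main(String[] args) {
        System.out.println(defaultParentIsSystemLoader());
    }
}
```

A loader that should sit elsewhere in the tree would pass its parent explicitly to the ClassLoader(ClassLoader parent) constructor instead.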