Wednesday, October 7, 2009

Java NIO woes : Poll Set based Selector Provider shipped with IBM AIX JVM J9

     While testing our server on AIX, we ran into multiple NIO-related issues that severely affected the functionality of our product. Since our server worked fine on other platforms (Win32, Win64, Solaris SPARC, HP-UX), we suspected an issue with the default selector implementation shipped with the IBM JDK. One point to note is that the issues got worse when we moved from an earlier JDK version to IBM J9 Version 1.6 SR5.
     To be specific, the issues we encountered were frequent hangs in Selector.select() and SocketChannel.close() operations. The stack traces we saw were as follows:
Thread A
FileDispatcher.preClose0(FileDescriptor) line: not available [native method]
SocketDispatcher.preClose(FileDescriptor) line: 53
SocketChannelImpl.implCloseSelectableChannel() line: 693
SocketChannelImpl(AbstractSelectableChannel).implCloseChannel() line: 212
SocketChannelImpl(AbstractInterruptibleChannel).close() line: 97

Thread B : Selecting Thread (number 1):
PollsetArrayWrapper.pollsetBulkCtl(int, long, int) line: not available [native method]
PollsetArrayWrapper.poll(long) line: 228
PollsetSelectorImpl.doSelect(long) line: 62
PollsetSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69
PollsetSelectorImpl(SelectorImpl).select(long) line: 80

Thread C : Selecting Thread (number 2):
PollsetArrayWrapper.pollsetPoll(int, long, int, long) line: not available [native method]
PollsetArrayWrapper.poll(long) line: 232
PollsetSelectorImpl.doSelect(long) line: 62
PollsetSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69
PollsetSelectorImpl(SelectorImpl).select(long) line: 80

 The other server threads were waiting to serve or finish serving requests.
What I discovered was that the default selector provider in 1.6 SR5 had been changed to PollsetSelectorProvider, whose selector implementation (PollsetSelectorImpl) is based on the AIX pollset interface. This pollset selector provider is an enhancement over the poll() based selector provider that was the default on the AIX J9 JDK before 1.6 SR5. Before 1.6 SR5, the default selector provider was PollSelectorProvider and the selector implementation was PollSelectorImpl, which is based on the POSIX poll(...) call.
 One point to note is that the default SelectorProviders shipped with the JDKs implemented by Sun are poll() based (and epoll() based on Linux from kernel version 2.6 onwards).
 In IBM JDK 1.6 SR5, the default selector provider is the pollset provider (PollsetSelectorProvider). SelectorProvider.openSelector() opens a pollset selector if it detects that the OS supports the pollset interface. This selector implementation is meant to be more efficient than the poll() based selector provider that was the default in pre 1.6 SR5 versions.
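 To see which provider and selector implementation a given JVM actually picks, something along these lines can be run on the target machine. This is only a minimal sketch: the class name SelectorInfo and the method describe() are made up for illustration, and it uses nothing beyond the standard java.nio.channels API. On the AIX setup described here one would expect it to report the pollset classes, but the output depends entirely on the JVM it runs on.

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.nio.channels.spi.SelectorProvider;

// Hypothetical helper, not part of any JDK: reports the default NIO classes in use.
final class SelectorInfo {

    /** Returns "providerClass -> selectorClass" for the JVM-wide default provider. */
    static String describe() throws IOException {
        // provider() returns the system-wide default, honouring the
        // java.nio.channels.spi.SelectorProvider system property if it is set.
        SelectorProvider provider = SelectorProvider.provider();
        Selector selector = provider.openSelector();
        try {
            return provider.getClass().getName() + " -> " + selector.getClass().getName();
        } finally {
            // Always release the underlying OS resources held by the selector.
            selector.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(describe());
    }
}
```

 try/finally is used instead of try-with-resources so that the sketch also compiles on a Java 6 level JDK like the one discussed in this post.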
 The optimizations in the pollset based selector provider are twofold: only the file descriptors whose operations are newly registered are passed to the kernel pollset, and the kernel implements a poll cache mechanism. The performance improvement of the pollset selector over the poll based selector, in terms of number of requests served per second, is around 13.6%, which does not seem significant. The details of the Poll Set based Selector Provider can be found here.
 In conclusion, I will be going with the poll based SelectorProvider "sun.nio.ch.PollSelectorProvider" for our server on the AIX platform until we have a definitive answer for the issues that we are facing with the pollset based SelectorProvider "sun.nio.ch.PollsetSelectorProvider". To do this, we start the server JVM with the property
-Djava.nio.channels.spi.SelectorProvider=sun.nio.ch.PollSelectorProvider
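Since a mistyped -D flag is easy to miss, a small startup check can confirm that the override really took effect. The following is a rough sketch under my own assumptions: ProviderGuard and overrideApplied() are hypothetical names, and the only JDK behaviour it relies on is that SelectorProvider.provider() consults the java.nio.channels.spi.SelectorProvider system property the first time it is called.

```java
import java.nio.channels.spi.SelectorProvider;

// Hypothetical startup check, not part of any JDK.
final class ProviderGuard {

    // Name of the system property read by SelectorProvider.provider().
    static final String PROPERTY = "java.nio.channels.spi.SelectorProvider";

    /** True when no override was requested, or the active provider is the requested class. */
    static boolean overrideApplied() {
        String requested = System.getProperty(PROPERTY);
        String active = SelectorProvider.provider().getClass().getName();
        return requested == null || requested.equals(active);
    }

    public static void main(String[] args) {
        System.out.println("requested: " + System.getProperty(PROPERTY));
        System.out.println("active:    " + SelectorProvider.provider().getClass().getName());
        System.out.println("override applied: " + overrideApplied());
    }
}
```

If the class named in the property cannot be loaded at all, provider() itself fails with a ServiceConfigurationError, so that case shows up at startup without this check.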
Do check this space for updates.

4 comments:

  1. Hi,

    You might like to know that this problem has been tracked down to a bug in AIX and has been fixed by APAR IZ73057 - http://www-01.ibm.com/support/docview.wss?uid=isg1IZ73507

    If you aren't able to request/install this AIX update, then using the system property you described is a good workaround, and is what we have recommended to other customers.

    I hope this information helps.

  2. I see my app crashing every now and then. Stack trace for one of the thread is as follows:

    4XESTACKTRACE at sun/nio/ch/PollArrayWrapper.poll0(Native Method)
    4XESTACKTRACE at sun/nio/ch/PollArrayWrapper.poll(PollArrayWrapper.java:144)
    4XESTACKTRACE at sun/nio/ch/PollSelectorImpl.doSelect(PollSelectorImpl.java:134)
    4XESTACKTRACE at sun/nio/ch/SelectorImpl.lockAndDoSelect(SelectorImpl.java:99)
    4XESTACKTRACE at sun/nio/ch/SelectorImpl.select(SelectorImpl.java:110)
    4XESTACKTRACE at sun/nio/ch/SelectorImpl.select(SelectorImpl.java:114)


    Other details:
    AIX 6.1 level 6
    Java 1.5 SR12 FP1

    Could there be any issue with NIO? Would the workaround mentioned in this post help (I will try it anyway)?

  3. The stack trace indicates that the Poll Selector Provider is already being used as the default in your JDK. Hence the workaround mentioned here won't help. Do you have any useful information from the core dump of the JVM?

  4. The core dump indicates that the crash originated from the libpthread library.
