Thursday, 15 December 2011

NFS and Oracle UTL_FILE

Today we faced a challenge in the shape of slow dump by UTL_FILE of rows to a file located on a NFS mounted storage. The dump runs quite fast on a test environment, but there the storage in use is a FC based storage array.
After initial code review we found that the UTL_FILE.PUT_LINE is followed immediately by a call to UTL_FILE.FFLUSH, so it become apparent, that every single row dump is immediately flushed to the NFS file. An educated guess (or intuition) pointed out that the code sends data in inefficient manner.

The NFS storage was mounted with the following options: rw,bg,vers=2,proto=tcp,sec=sys. Those settings mean the write window size to be 8K for version 2 of NFS. I have found among many articles by Glen Fawcett a tip to the 359515.1 article on the Metalink (anyway this is a blog worth of looking at). In our case the recommended settings were rw,bg,hard,rsize=32768,wsize=32768,vers=3,cio,intr,timeo=600,proto=tcp, so we went with those new settings. The test showed that with those new ones was even worse.

And now all become clear: in case of NFS the write buffer is sent through network as is, so one can write as much as want to, but the data will be divided into buffers of set size and pushed to a network. In our case we flushed 231-bytes worth portion of data through first 8K, then 32K write windows.
After that all went smoothly. As we could not have been able to change the code at once, we managed to decrease the write window size - first 512B, then even 256B. The test showed that 512B window is better - evidently the protocol overhead + our 231 bytes was more then 256B, so finally we stayed at wsize=512 with a goal to do something about this FFLUSH call in the future and return to 32K window.

Long story short do not call FFLUSH after every PUT_LINE (unless You know what You do in order for example to allow for reading by others as quick as You are able to write), or even better do not call it at all (as it is called implicitly after buffer become full).
And second thought: set Your rsize and wsize according to the data amounts You will read or write in a single call to Your NFS storage.