Monday 18 June 2018

High latch free on Exadata

Today I investigated a performance problem with one processing job. On the production environment the job usually completes in about 5 minutes, while on one of the test databases it once took 5 hours.
As this was RAC on an Exadata machine, the AWR report was not much help, though it was immediately clear that the main problem was the latch free event. The question was which latch was responsible.
The performance chart in OEM did not show the SQL text (apparently those were one-off SQL calls, which makes them candidates for setting cursor_sharing to FORCE).
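To check whether the workload really consists of one-off statements that differ only in literals, one option is to group the cursors in the shared pool by force_matching_signature. This is only a sketch: the threshold of 50 is an arbitrary assumption, and the result depends on what is still cached in the shared pool at the time.

```sql
-- Sketch: find groups of statements that would collapse into a single cursor
-- under cursor_sharing = FORCE. The threshold of 50 is an arbitrary assumption.
select force_matching_signature,
       count(*)      as cursors,
       min(sql_text) as sample_text
  from gv$sql
 where force_matching_signature <> 0
 group by force_matching_signature
having count(*) > 50
 order by cursors desc;
```

A signature with many cursors and near-identical text suggests literal-heavy SQL that could share one cursor.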
A good first step was to run the following query:
select name,
       (misses/decode(gets,0,1,gets))*100 ratio,
       (immediate_misses/decode(immediate_gets,0,1,immediate_gets))*100 immediate_ratio,
       spin_gets,
       wait_time
  from v$latch
 where wait_time > 0
 order by 2
;
While one cannot be 100% sure that the query points to the right latch, it at least gives some clues. This time the only ratio that stood out (around 40%) belonged to Result Cache: RC Latch. The next one, at around 10%, was the resource manager latch, but its total number of gets and misses was fairly small, so it was not that interesting.
I searched for this latch name and found a few articles on the subject:
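One way to cross-check such a clue is to ask ASH which latch the sessions were actually waiting on. For the latch free event the P2 parameter carries the latch number, which can be joined to v$latchname. This is a minimal sketch, not what was run in this investigation: the one-hour window is an assumption, and querying ASH requires the Diagnostics Pack license.

```sql
-- Sketch: count 'latch free' ASH samples per latch name across all RAC instances.
-- The one-hour window is an assumption; adjust it to the period of the slow run.
select ash.inst_id,
       l.name,
       count(*) as samples
  from gv$active_session_history ash
  join v$latchname l
    on l.latch# = ash.p2
 where ash.event = 'latch free'
   and ash.sample_time > systimestamp - interval '1' hour
 group by ash.inst_id, l.name
 order by samples desc;
```

If the latch with the most samples here matches the one with the highest miss ratio in v$latch, the suspicion is much better founded.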