In today’s article, we will cover the subject of “Node Eviction”, which is the last thing a database administrator would want to see.
Eviction is a mechanism designed for Oracle clusters.
It removes a node that has a critical problem, or that only appears to have one, from the cluster so that the consistency and overall operation of the cluster are not disrupted.
For example, if a node in the cluster hangs, or the other nodes conclude for any reason that it can no longer be reached, Eviction separates that node from the cluster and the problematic node is quickly restarted.
This process is recorded by LMON with the error “ORA-29740 evicted by instance number”.
The main causes are listed below; the list can be extended as different cases are encountered.
Network problems
Memory (RAM, swap, etc.) problems
Excessive, sustained load on the processor
Bug 16876500 & Bug 14385860
Records to review when diagnosing the problem
Alert logs of all instances (cluster alert.log, ASM alert.log)
ocssd.log history
LMON, LMSn, LMD0 trace history
OSWatcher logs
OS logs (/var/log/messages)
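Reviewing these records by hand is slow, so a first pass can be scripted. The following is a minimal sketch, not an Oracle tool: it scans log lines for the ORA-29740 error mentioned above and reports the matching line numbers. The function names find_eviction_lines and scan_file are hypothetical, and the default pattern list contains only the error code named in this article; any further patterns are yours to add.

```python
import re
from typing import Iterable, List, Tuple

# Only ORA-29740 comes from the article; extend this tuple with your own patterns.
DEFAULT_PATTERNS = (r"ORA-29740",)


def find_eviction_lines(
    lines: Iterable[str], patterns: Tuple[str, ...] = DEFAULT_PATTERNS
) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line matching any pattern."""
    compiled = [re.compile(p) for p in patterns]
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(rx.search(line) for rx in compiled):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path: str) -> List[Tuple[int, str]]:
    """Scan one log file (hypothetical helper); undecodable bytes are replaced."""
    with open(path, errors="replace") as handle:
        return find_eviction_lines(handle)
```

Calling scan_file on each alert log and trace file, then comparing the timestamps of the hits with the OS and OSWatcher logs, is one way to narrow down what happened just before the eviction.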
To explain the subject a little more: when a network-related communication error occurs between the nodes, or when a node cannot write its heartbeat information to the CFVRR, that node becomes subject to the eviction process, and the cluster carries out the eviction described above to prevent possible data corruption.
Instance Membership Recovery (IMR) is responsible for automating all of these processes.
IMR manages the membership of all cluster members and is part of the Cluster Group Services structure.
As we mentioned at the beginning, this process ends with the restart of the node on which the problem was detected.
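The heartbeat idea described above can be illustrated with a toy model. This is only a sketch of the general technique (evict any member whose last heartbeat is older than a timeout); it is not how IMR is implemented, and the names find_stale_nodes and evict, the node names, and the timeout value are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def find_stale_nodes(
    last_heartbeat: Dict[str, float], now: float, timeout: float
) -> List[str]:
    """Return the nodes whose last heartbeat is older than `timeout` seconds."""
    return sorted(
        node for node, stamp in last_heartbeat.items() if now - stamp > timeout
    )


def evict(
    members: Dict[str, float], now: float, timeout: float
) -> Tuple[Dict[str, float], List[str]]:
    """Split the membership into survivors and evicted nodes (toy model only)."""
    stale = find_stale_nodes(members, now, timeout)
    survivors = {n: stamp for n, stamp in members.items() if n not in stale}
    return survivors, stale


# Hypothetical membership: node name -> time of last heartbeat, in seconds.
members = {"node1": 100.0, "node2": 95.0, "node3": 60.0}
survivors, evicted = evict(members, now=100.0, timeout=30.0)
```

In this example node3 last reported 40 seconds ago, which exceeds the assumed 30-second timeout, so it is the only node removed from the membership. A real cluster adds voting among the surviving members before a node is evicted, which this sketch leaves out.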