On Exadata cell nodes, the RS-7445 error can occur when the heartbeat cannot be sent in time because the cell is overloaded. This error appears in versions 11.2.2.4.0 and 11.2.3.2.1 of the Oracle Exadata Storage Server software.
As a result of the error, RS restarts the CELLSRV service. The cell node's alert.log shows errors like the following.
/opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/log/diag/asm/cell/<cellnode_hostname>/trace/alert.log :
    State dump signal delivered to Cellsrv<9971>
    State dump signal delivered to Cellsrv<9971> by RS.
    Mon Jan 2 16:38:24 2017
    State dump interrupted for Cellsrv<9971> by RS. It did not complete in 5 seconds.
    [RS] Stopped Service CELLSRV
    Errors in file /opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/log/diag/asm/cell/sba4cel07/trace/rstrc_9957_4.trc (incident=17):
    RS-7445 [Serv CELLSRV hang detected] [It will be restarted] [] [] [] [] [] [] [] [] [] []
    Incident details in: /opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/log/diag/asm/cell/sba4cel07/incident/incdir_17/rstrc_9957_4_i17.trc
    Sweep [inc][17]: completed
    [RS] Detected service hang. Increasing heartbeat timeout to 8 seconds.
    [RS] Started monitoring process /opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/cellsrv/bin/cellrsomt with pid 3109
    Mon Jan 02 17:38:25 2017
    Successfully setting event parameter -
    Mon Jan 02 17:38:25 2017
    Successfully setting event parameter -
    CELLSRV process id=3110
    CELLSRV cell host name=sba4cel07.saglik.lokal
    CELLSRV version=11.2.3.1.1,label=OSS_11.2.3.1.1_LINUX.X64_120607,Fri_Jun__8_12:49:44_PDT_2012
    OS Hugepage status: Total/free hugepages available=4001/81; hugepage size=2048KB
    OS Stats: Physical memory: 23955 MB. Num cores: 24
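To check how often this has happened on a cell, the alert.log text can be scanned for the two messages seen in the excerpt above. The following is a minimal sketch, not an Oracle tool: the function name `summarize_rs7445` is hypothetical, and the patterns assume the message wording is exactly as it appears in this log.

```python
import re

# Patterns copied from the alert.log excerpt; the exact wording is an
# assumption and may differ between storage server software versions.
INCIDENT_RE = re.compile(r"RS-7445 \[Serv (\w+) hang detected\]")
TIMEOUT_RE = re.compile(r"Increasing heartbeat timeout to (\d+) seconds")


def summarize_rs7445(log_text):
    """Return (services, timeouts) found in the given alert.log text.

    services: service name from each RS-7445 hang message, in file order.
    timeouts: each heartbeat timeout (seconds) that RS reported raising to.
    """
    services = INCIDENT_RE.findall(log_text)
    timeouts = [int(value) for value in TIMEOUT_RE.findall(log_text)]
    return services, timeouts
```

To use it, read the alert.log file into a string and pass it to `summarize_rs7445`; the length of the first list is the number of detected hangs.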
To avoid this problem, increase the heartbeat timeout between RS and CELLSRV. By default, the heartbeat timeout is 6 seconds. You can raise this value by following these steps.
As the root user, add the following line to the $OSSCONF/cellinit.ora file, then restart the services.
    _cellrsdef_heartbeat_timeout=10
    [root@oradbcel01 ~]# cellcli
    CellCLI: Release 11.2.3.1.1 - Production on Mon Jan 02 21:21:35 EET 2017

    Copyright (c) 2007, 2011, Oracle. All rights reserved.
    Cell Efficiency Ratio: 488

    CellCLI> alter cell RESTART SERVICES RS
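If the same change has to be applied on several cells, the cellinit.ora edit can be made repeatable so that running it twice does not add the parameter twice. The sketch below works on the file's text only and assumes cellinit.ora consists of simple `name=value` lines; the helper name `set_heartbeat_timeout` is hypothetical, and reading, backing up, and writing the file are left to the operator.

```python
PARAM = "_cellrsdef_heartbeat_timeout"


def set_heartbeat_timeout(cellinit_text, seconds=10):
    """Return cellinit.ora text with PARAM set to `seconds`.

    An existing PARAM line is replaced in place, duplicate PARAM lines
    are dropped, and the line is appended if it was not present.
    """
    new_line = f"{PARAM}={seconds}"
    out = []
    replaced = False
    for line in cellinit_text.splitlines():
        # Compare only the name part, ignoring surrounding whitespace.
        if line.split("=", 1)[0].strip() == PARAM:
            if not replaced:
                out.append(new_line)
                replaced = True
            continue  # skip any duplicate entries
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

The function is idempotent: passing its own output back in returns the same text. The RS restart shown above is still required after the file is written.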