After installing RAC on Oracle 11gR2, starting CRS failed with the following errors:
[root@rac1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
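This output shows that CRS (crsd) failed to start. CRS is a critical process: if it cannot start, the Clusterware stack cannot come up either. Many different causes can lead to this problem.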
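When the check output is longer or is captured from several nodes, it can help to filter out the healthy lines. The following is a minimal sketch, assuming the 11gR2 message wording shown above (healthy daemons end in "is online"); the function name offline_components is hypothetical.

```shell
# Read `crsctl check crs` output on stdin and print only the lines for
# daemons that are NOT reported as online.
# Assumption: healthy lines end with "is online", as in the output above.
offline_components() {
    grep -E '^CRS-[0-9]+:' | grep -v 'is online[[:space:]]*$' || true
}
```

It would be used as `./crsctl check crs | offline_components`; for the output above it prints only the CRS-4535 line.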
The log is as follows:
[root@rac1 rac1]# tail -50 /u01/app/11.2.0/grid/log/rac1/crsd/crsd.log
ORA-15077: could not locate ASM instance serving a required diskgroup
2010-11-16 17:13:44.286: [OCRASM][3046411024]proprasmo: kgfoCheckMount returned [7]
2010-11-16 17:13:44.286: [OCRASM][3046411024]proprasmo: The ASM instance is down
2010-11-16 17:13:44.287: [OCRRAW][3046411024]proprioo: Failed to open [+CRS]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2010-11-16 17:13:44.287: [OCRRAW][3046411024]proprioo: No OCR/OLR devices are usable
2010-11-16 17:13:44.287: [OCRASM][3046411024]proprasmcl: asmhandle is NULL
2010-11-16 17:13:44.287: [OCRRAW][3046411024]proprinit:Could not open raw device
2010-11-16 17:13:44.287: [OCRASM][3046411024]proprasmcl:asmhandle is NULL
2010-11-16 17:13:44.287: [OCRAPI][3046411024]a_init:16!:Backend init unsuccessful : [26]
2010-11-16 17:13:44.288: [CRSOCR][3046411024] OCR context init failure.Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup
] [7]
2010-11-16 17:13:44.288: [CRSD][3046411024][PANIC] CRSD exiting:Could not init OCR, code: 26
2010-11-16 17:13:44.288: [CRSD][3046411024] Done.
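The log indicates that the failure was caused by the ASM instance not being started: the OCR lives in the +CRS disk group, so crsd cannot open it while ASM is down. The underlying issues are fairly complex.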
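The signature in the log above is the pair ORA-15077 and PROC-26. A minimal sketch for spotting it in a long crsd.log follows; the function name crsd_asm_hint and the hint texts are hypothetical, and the check only matches the two error codes seen in this post.

```shell
# Print a one-line hint depending on whether a crsd.log contains the
# "ASM down, OCR unreachable" signature (ORA-15077 together with PROC-26).
# $1: path to crsd.log
crsd_asm_hint() {
    log="$1"
    if grep -q 'ORA-15077' "$log" && grep -q 'PROC-26' "$log"; then
        echo "OCR in ASM is unreachable: check that the ASM instance is up"
    else
        echo "no ASM/OCR signature found"
    fi
}
```

For the environment in this post the call would be `crsd_asm_hint /u01/app/11.2.0/grid/log/rac1/crsd/crsd.log`.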
This post does not analyze that problem in detail. Oracle's support site has a note that explains it very thoroughly, and I have reposted it on my blog. See:
How to Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]
http://blog.csdn.net/xujinyang/article/details/6834912
In this Document
Goal
Solution
Start up sequence:
Cluster status
Case 1: OHASD.BIN does not start
Case 2: OHASD Agents does not start
Case 3: CSSD.BIN does not start
Case 4: CRSD.BIN does not start
Case 5: GPNPD.BIN does not start
Case 6: Various other daemons does not start
Case 7: CRSD Agents does not start
Network and Naming Resolution Verification
Log File Location, Ownership and Permission
Network Socket File Location, Ownership and Permission
Diagnostic file collection
References
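Here is how I approach the problem:
1. Check the logs to see whether they reveal the cause. If they do not pinpoint the problem, keep analyzing.
2. Work through the CRS startup sequence.
The ASM instance has to start first, which brings storage into the picture. Check the following:
(1) Whether the network is working.
(2) Whether the storage is mapped to the expected location. My test environment uses multipath, which maps the storage under /dev/mapper/. When a problem occurs, I check whether the expected mappings are present.
(3) Storage permissions. After mapping, the devices are owned by root by default. I added a script to /etc/rc.d/rc.local that changes the ownership, so the mapped devices are handed over to the oracle user at boot.
3. If all of these are fine, try restarting CRS or rebooting the operating system.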
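Checks (2) and (3) above can be scripted. The sketch below assumes GNU stat (Linux); the function name check_device_owner and its status words are hypothetical, and the post does not give the actual device names under /dev/mapper/, so they have to be supplied by the caller.

```shell
# Verify that a mapped device exists and is owned by the expected user.
# $1: device path under /dev/mapper/   $2: expected owner (e.g. oracle)
# Returns 0 if OK, 1 if the path is missing, 2 if the owner differs.
check_device_owner() {
    dev="$1"
    want="$2"
    if [ ! -e "$dev" ]; then
        echo "MISSING $dev"
        return 1
    fi
    # -L follows symlinks: /dev/mapper entries are often links to /dev/dm-N.
    owner="$(stat -L -c '%U' "$dev")"
    if [ "$owner" != "$want" ]; then
        echo "WRONG_OWNER $dev owner=$owner expected=$want"
        return 2
    fi
    echo "OK $dev"
}
```

Running it once per mapped device right after boot shows whether the rc.local ownership change has taken effect before ASM tries to open the disks.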
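Addendum:
While searching online I also found a write-up of a CSSD startup failure. What caught my attention is that it explains the purpose of the /tmp/.oracle and /var/tmp/.oracle directories. Each time the server reboots, lock information (socket files) is stored in these two directories. If the old files cannot be removed after a reboot, the locks cannot be refreshed and the stack fails to start.
This also explains why these two directories have to be deleted when removing Clusterware.
My post on removing RAC mentions deleting these two directories during uninstallation. See:
RAC Uninstallation Notes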
http://blog.csdn.net/xujinyang/article/details/6837237
Contents of crs.log:
2007-04-11 14:37:34.020: [ COMMCRS][1693]clsc_connect: (100f78610) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))
2007-04-11 14:37:34.020: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
2007-04-11 14:37:34.021: [ CRSRTI][1] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2007-04-11 14:37:35.740: [ COMMCRS][1695]clsc_connect: (100f78610) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))
2007-04-11 14:37:35.740: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
When we checked ocssd.log, it contained the following:
[ CSSD]2007-04-11 12:53:56.211 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/rdsk/c5t8d0s5)
[ CSSD]2007-04-11 12:53:56.211 [10] >TRACE: clssnmvKillBlockThread: spawned for disk 1 (/dev/rdsk/c5t9d0s5) initial sleep interval (1000)ms
[ CSSD]2007-04-11 12:53:56.211 [11] >TRACE: clssnmvKillBlockThread: spawned for disk 0 (/dev/rdsk/c5t8d0s5) initial sleep interval (1000)ms
[ CSSD]2007-04-11 12:53:56.228 [1] >TRACE: clssnmFatalInit: fatal mode enabled
[ CSSD]2007-04-11 12:53:56.269 [13] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 1
[ CSSD]2007-04-11 12:53:56.274 [13] >TRACE: clssnmClusterListener: Listening on (ADDRESS=(PROTOCOL=tcp)(HOST=drdb1-priv)(PORT=49895))
[ CSSD]2007-04-11 12:53:56.274 [13] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1
[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
[ CSSD]2007-04-11 12:53:56.279 [14] >ERROR: clssgmclientlsnr: listening failed for (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1)) (3)
[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))
[ CSSD]2007-04-11 13:07:36.516 >USER: Oracle Database 10g CSS Release 10.2.0.2.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
[ clsdmt]Fail to listen to (ADDRESS=(PROTOCOL=ipc)(KEY=drdb1DBG_CSSD))
[ CSSD]2007-04-11 13:07:36.516 >USER: CSS daemon log for node drdb1, number 1, in cluster crs
[ clsdmt]Terminating clsdm listening thread
[ CSSD]2007-04-11 13:07:36.536 [1] >TRACE: clssscmain: local-only set to false
[ CSSD]2007-04-11 13:07:36.545 [1] >TRACE: clssnmReadNodeInfo: added node 1 (drdb1) to cluster
[ CSSD]2007-04-11 13:07:36.588 [5] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 1
[ CSSD]2007-04-11 13:07:36.588 [1] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
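The entries that matter in the excerpt above are the clsclisten "Permission denied" line and the clssgmclientlsnr "listening failed" ERROR line. A minimal sketch for pulling them out of a long ocssd.log follows; the function name css_listener_errors is hypothetical, and the patterns are taken only from the lines shown in this post.

```shell
# Print the CSS listener failures from an ocssd.log.
# $1: path to ocssd.log
# Matches only the two message forms seen in the excerpt above.
css_listener_errors() {
    grep -E 'clsclisten: Permission denied|clssgmclientlsnr: listening failed' "$1" || true
}
```

An empty result means neither message is present, so the cause should be looked for elsewhere.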
Solution:
From the logs above, we concluded that the listener of the CSS daemon was unable to start.
The reason is that each time the server reboots, it creates sockets in the /tmp/.oracle or /var/tmp/.oracle directory.
If sockets from a previous run still exist in the .oracle directory, they cannot be reused and are not deleted automatically.
The problem was therefore solved by deleting all the files inside the .oracle directory in /var/tmp or /tmp.
After that, CRS started and the cluster came up.
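The cleanup described above can be sketched as a small helper with a non-destructive "list" mode. The function name clean_oracle_sockets and its modes are hypothetical. The delete mode should only be run as root while the Clusterware stack on that node is completely stopped, because running daemons depend on these socket files.

```shell
# List or delete the entries inside the Clusterware socket directories.
# $1: "list" or "delete"
# Remaining args: directories (default: /var/tmp/.oracle and /tmp/.oracle).
# The directories themselves are kept; only their contents are touched.
clean_oracle_sockets() {
    mode="$1"
    shift
    case "$mode" in
        list|delete) ;;
        *) echo "usage: clean_oracle_sockets list|delete [dir...]" >&2; return 64 ;;
    esac
    [ "$#" -gt 0 ] || set -- /var/tmp/.oracle /tmp/.oracle
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # Second pattern picks up dot-files; unmatched patterns are skipped below.
        for f in "$dir"/* "$dir"/.[!.]*; do
            [ -e "$f" ] || [ -L "$f" ] || continue
            if [ "$mode" = "list" ]; then
                echo "$f"
            else
                rm -f -- "$f"
            fi
        done
    done
}
```

Running `clean_oracle_sockets list` first shows what would be removed; `clean_oracle_sockets delete` then clears both directories before the stack is started again.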