在2012年第一季度的CPU补丁中,包含了一个关于SCN修正的重要变更,这个补丁提示,在异常情况下,Oracle的SCN可能出现异常增长,使得数据库的一切事务停止,由于SCN不能后退,所以数据库必须重建,才能够重用。
我曾经在以下链接中描述过这个问题:
http://www.eygle.com/archives/2012/03/oracle_scn_bug_exhaused.html
Oracle使用6 Bytes记录SCN,也就是48位,其最大值是:
class="dp-cpp">
- SQL> col scn for 999,999,999,999,999,999
- SQL> select power(2,48) scn from dual;
- SCN
- ------------------------
- 281,474,976,710,656
Oracle在内部控制每秒增减的SCN不超过 16K,按照这样计算,这个数值可以使用大约544年:
- SQL> select power(2,48) / 16 / 1024 / 3600 / 24 / 365 from dual;
- POWER(2,48)/16/1024/3600/24/365
- -------------------------------
- 544.770078
然而在出现异常时,尤其是当使用DB Link跨数据库查询时,SCN会被同步,在以下链接中,我曾经描述过此问题:
http://www.eygle.com/archives/2006/11/db_link_checkpoint_scn.html
一个数据库当前最大的可能SCN被称为"最大合理SCN",该值可以通过如下方式计算:
- col scn for 999,999,999,999,999,999
- select
- (
- (
- (
- (
- (
- (
- to_char(sysdate,'YYYY')-1988
- )*12+
- to_char(sysdate,'mm')-1
- )*31+to_char(sysdate,'dd')-1
- )*24+to_char(sysdate,'hh24')
- )*60+to_char(sysdate,'mi')
- )*60+to_char(sysdate,'ss')
- ) * to_number('ffff','XXXXXXXX')/4 scn
- from dual
- /
这个算法即SCN算法,以1988年1月1日 00点00时00分开始,每秒计算1个点数,最大SCN为16K。
这个内容可以参考如下链接:
http://www.eygle.com/archives/2006/01/how_big_scn_can_be.html
在CPU补丁中,Oracle提供了一个脚本 scnhealthcheck.sql 用于检查数据库当前SCN的剩余情况。
该脚本的算法和以上描述相同,最终将最大合理SCN 减去当前数据库SCN,计算得出一个指标:HeadRoom。也就是SCN尚余的顶部空间,这个顶部空间最后折合成天数:
以下是这个脚本的内容:
- Rem
- Rem $Header: rdbms/admin/scnhealthcheck.sql st_server_tbhukya_bug-13498243/8 2012/01/17 03:37:18 tbhukya Exp $
- Rem
- Rem scnhealthcheck.sql
- Rem
- Rem Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.
- Rem
- Rem NAME
- Rem scnhealthcheck.sql - Scn Health check
- Rem
- Rem DESCRIPTION
- Rem Checks scn health of a DB
- Rem
- Rem NOTES
- Rem .
- Rem
- Rem MODIFIED (MM/DD/YY)
- Rem tbhukya 01/11/12 - Created
- Rem
- Rem
- define LOWTHRESHOLD=10
- define MIDTHRESHOLD=62
- define VERBOSE=FALSE
- set veri off;
- set feedback off;
- set serverout on
- DECLARE
- verbose boolean:=&&VERBOSE;
- BEGIN
- For C in (
- select
- version,
- date_time,
- dbms_flashback.get_system_change_number current_scn,
- indicator
- from
- (
- select
- version,
- to_char(SYSDATE,'YYYY/MM/DD HH24:MI:SS') DATE_TIME,
- ((((
- ((to_number(to_char(sysdate,'YYYY'))-1988)*12*31*24*60*60) +
- ((to_number(to_char(sysdate,'MM'))-1)*31*24*60*60) +
- (((to_number(to_char(sysdate,'DD'))-1))*24*60*60) +
- (to_number(to_char(sysdate,'HH24'))*60*60) +
- (to_number(to_char(sysdate,'MI'))*60) +
- (to_number(to_char(sysdate,'SS')))
- ) * (16*1024)) - dbms_flashback.get_system_change_number)
- / (16*1024*60*60*24)
- ) indicator
- from v$instance
- )
- ) LOOP
- dbms_output.put_line( '-----------------------------------------------------'
- || '---------' );
- dbms_output.put_line( 'ScnHealthCheck' );
- dbms_output.put_line( '-----------------------------------------------------'
- || '---------' );
- dbms_output.put_line( 'Current Date: '||C.date_time );
- dbms_output.put_line( 'Current SCN: '||C.current_scn );
- if (verbose) then
- dbms_output.put_line( 'SCN Headroom: '||round(C.indicator,2) );
- end if;
- dbms_output.put_line( 'Version: '||C.version );
- dbms_output.put_line( '-----------------------------------------------------'
- || '---------' );
- IF C.version > '10.2.0.5.0' and
- C.version NOT LIKE '9.2%' THEN
- IF C.indicator>&MIDTHRESHOLD THEN
- dbms_output.put_line('Result: A - SCN Headroom is good');
- dbms_output.put_line('Apply the latest recommended patches');
- dbms_output.put_line('based on your maintenance schedule');
- IF (C.version < '11.2.0.2') THEN
- dbms_output.put_line('AND set _external_scn_rejection_threshold_hours='
- || '24 after apply.');
- END IF;
- ELSIF C.indicator<=&LOWTHRESHOLD THEN
- dbms_output.put_line('Result: C - SCN Headroom is low');
- dbms_output.put_line('If you have not already done so apply' );
- dbms_output.put_line('the latest recommended patches right now' );
- IF (C.version < '11.2.0.2') THEN
- dbms_output.put_line('set _external_scn_rejection_threshold_hours=24 '
- || 'after apply');
- END IF;
- dbms_output.put_line('AND contact Oracle support immediately.' );
- ELSE
- dbms_output.put_line('Result: B - SCN Headroom is low');
- dbms_output.put_line('If you have not already done so apply' );
- dbms_output.put_line('the latest recommended patches right now');
- IF (C.version < '11.2.0.2') THEN
- dbms_output.put_line('AND set _external_scn_rejection_threshold_hours='
- ||'24 after apply.');
- END IF;
- END IF;
- ELSE
- IF C.indicator<=&MIDTHRESHOLD THEN
- dbms_output.put_line('Result: C - SCN Headroom is low');
- dbms_output.put_line('If you have not already done so apply' );
- dbms_output.put_line('the latest recommended patches right now' );
- IF (C.version >= '10.1.0.5.0' and
- C.version <= '10.2.0.5.0' and
- C.version NOT LIKE '9.2%') THEN
- dbms_output.put_line(', set _external_scn_rejection_threshold_hours=24'
- || ' after apply');
- END IF;
- dbms_output.put_line('AND contact Oracle support immediately.' );
- ELSE
- dbms_output.put_line('Result: A - SCN Headroom is good');
- dbms_output.put_line('Apply the latest recommended patches');
- dbms_output.put_line('based on your maintenance schedule ');
- IF (C.version >= '10.1.0.5.0' and
- C.version <= '10.2.0.5.0' and
- C.version NOT LIKE '9.2%') THEN
- dbms_output.put_line('AND set _external_scn_rejection_threshold_hours=24'
- || ' after apply.');
- END IF;
- END IF;
- END IF;
- dbms_output.put_line(
- 'For further information review MOS document id 1393363.1');
- dbms_output.put_line( '-----------------------------------------------------'
- || '---------' );
- END LOOP;
- end;
- /
在应用补丁之后,一个新的隐含参数 _external_scn_rejection_threshold_hours 引入,通常设置该参数为 24 小时:
_external_scn_rejection_threshold_hours=24
这个设置降低了SCN Headroom的顶部空间,以前缺省的设置容量至少为31天,降低为 24 小时,可以增大SCN允许增长的合理空间。
但是如果不加控制,SCN仍然可能会超过最大的合理范围,导致数据库问题。
这个问题的影响会极其严重,我们建议用户检验当前数据库的SCN使用情况,以下是检查脚本的输出范例:
- --------------------------------------
- ScnHealthCheck
- --------------------------------------
- Current Date: 2012/01/15 14:17:49
- Current SCN: 13194140054241
- Version: 11.2.0.2.0
- --------------------------------------
- Result: C - SCN Headroom is low
- If you have not already done so apply
- the latest recommended patches right now
- AND contact Oracle support immediately.
- For further information review MOS document id 1393363.
- --------------------------------------
这个问题已经出现在客户环境中,需要引起大家的足够重视。