Oracle | Lynn の世界

上一篇Oracle 10g DataGuard学习（一）中简单介绍了DG通用的技术架构原理及相关概念，本篇将逐步细化DG架构介绍DG中redo传输服务及redo应用机制。该篇内容解释起来可能比较难懂，但会尽力，同时会结合Oracle官方文档加以描述！

2-1 重做传输服务（Redo Transport Services–RTS）

在上一篇的1-2节中简单解释了那么主库产生的重做数据即redo如何传输到备库（无论是物理备库还是逻辑备库），这就是此处要讨论的问题：DG重做传输服务（Redo Transport Services–RTS）协调从主库到备库的重做数据传输过程，同时，主库的LGWR进程将重做数据写入到自己的online redo log中。这样解释可能完全不知所云，因此需要一步一步分析解释。由于主库redo传输到备库的方式有两种：一种是主库通过ARCn进程将在本地产生的归档日志传给备库online redo log或是备库的归档地；另一种是主库通过LGWR进程将主库redo log buffer cache redo或是online redo log传输给备库。备库在接收到主库传来的重做数据后开始应用，通过这种方式保持主备库数据同步。以下为Oracle官方文档中对TRS的说明：

Redo transport services control the automated transfer of redo data from a database destination to one or more destinations. Redo transport services also manage the process of resolving any gaps in the archived redo log files due to a network failure.
Redo transport services can transmit redo data to local and remote destinations. Remote destinations can include any of the following types: physical and logical standby databases, archived redo log repositories, Oracle Change Data Capture staging databases, and Oracle Streams downstream capture databases
Figure 5-1 shows a simple Data Guard configuration with redo transport services archiving redo data to a local destination on the primary database while also transmitting it to archived redo log files or standby redo log files on a remote standby database destination.

上图简单描述了TRS过程，下面将详细分析其内容原理！

2-2 redo传输原理

按照Oracle DG官方文档内容，这一节本应该是Where to Send Redo Data，其主要讲到三方面：①主库redo应传输到哪里使备库可以应用这些redo，其中主要包括redo传输目的地的类型（Destination Types）；②如何传输redo到指定的目的地；③设置闪回恢复区。其中①主要讨论在DG或其他架构数据库环境中RTS所支持的destinations，这其中肯定是支持standby databases的；③主要讲如何在DG环境中使用闪回恢复区。而②是我们要重点解释的部分，在DG环境中，主库生成的redo如何传输到备库，使其应用。因此将这一节改成了redo传输原理，以下为官方文档对redo transport的概要解释：
On the primary database, Oracle Data Guard uses archiver processes (ARCn) or the log writer process (LGWR) to collect transaction redo data and transmit it to standby destinations. Although you cannot use both the archiver and log writer processes to send redo data to the same destination, you can choose to use the log writer process for some destinations, while archiver processes send redo data to other destinations. Data Guard also uses the fetch archive log (FAL) client and server to send archived redo log files to standby destinations following a network outage, for automatic gap resolution, and resynchronization.
从如上描述中，可以清楚的知道，在DG中使用两种方式将主库redo data传输到备库，第一种方式为主库使用ARCn进程将主库archived log传输到备库standby redo log files中，另一种方式为主库使用LGWR进程将主库log buffer cache中redo data或online redo log files中redo data传输到备库standby redo log files中，并且当由于诸如网络问题导致备库没有接受到主库传来的archived log而产生archive gaps时，DG将能自动的检测出备库上所存在的archive gaps，然后使用the fetch archive log (FAL) client and server将备库上存在的archive gaps logfiles从主库自动的transmit到备库以便恢复，而且这个操作是DG自己完成无需DBA人为操作。以下为官方文档中对archive gap解释：
An archive gap can occur on the standby system when it is has not received one or more archived redo log files generated by the primary database. The missing archived redo log files are the gap. If there is a gap, it is automatically detected and resolved by Data Guard by copying the missing sequence of log files to the standby destination. For example, an archive gap can occur when the network becomes unavailable and automatic archiving from the primary database to the standby database temporarily stops. When the network is available again, automatic transmission of the redo data from the primary database to the failed standby database resumes. Data Guard requires no manual intervention by the DBA to detect and resolve such gaps. The following sections describe gap detection and resolution.

2-2-1 ARCn传输redo data

上面说到redo data的传输方式，这里详细解释一下这两种redo data传输的原理。下图为DG使用ARCn进行传输redo data到备库的原理图：

从上图我们可以清楚的看到，主库LGWR进程将log buffer cache中redo data写到online redo log（简称ORL），然后由ARC0进程将ORL中的redo data按顺序写入到本地磁盘进行归档，再由主库ARC1进行将本地归档日志通过网络（Oracle Net）传给远端备库的RFS（Remote File Server）进程接收，备库RFS进程把接收到的主库归档日志中重做条目按照顺序写入到备库的standby redo log files中，备库就可以通过MRP（Medial Recovery Coordinator，Redo Apply方式应用主库redo data时进程）或LSP（Logical Standby Process，SQL Apply方式应用主库redo data时进程）进程将standby redo log files中的redo data应用于备库中，同时备库ARCn进程将standby redo log按顺序归档在本地磁盘中。而在该原理图上，我们还可以看到一点，备库RFS或LSP恢复进程也可以从备库归档日志中读取redo应用于备库，这是为什么呢？按照刚解释的，备库的RFS进程接收到主库传来的redo data先要写入standby redo log files中，随后备库RFS或LSP恢复进程就直接读取standby redo log开始在备库中应用，一般情况下，备库RFS或LSP恢复进程将standby redo log中redo data全部应用于备库中后，备库LGWR进程才会将standby redo log归档，不会出现备库RFS或LSP恢复进程还未将standby redo log中redo data应用于备库前就被归档的情况（根据Oracle数据库体系结构及原理可知）。原因其实很简单，在2-1中已经说到了：当由于诸如网络问题导致备库没有接受到主库传来的archived log而产生archive gaps时，DG将能自动的检测出备库上所存在的archive gaps，然后将备库上存在的archive gaps logfiles从主库自动传输到备库以便恢复。这个过程细化出来，就是图上红圈表出来的一个过程（结合官方文档及自己的理解添加上去），当DG检测到备库存在archive gaps，主库通过ARC3进程将备库缺失的那部分archive redo从主库归档日志中传给备库的RFS进程，RFS进程再将接收到的归档redo先写入到本地，然后备库的RFS或LSP恢复进程就可以读取这部分archive gaps redo应用于备库，从而保证了主备库数据的一致。
有关ARCn传输redo data的几点说明：
①Initialization Parameters That Control ARCn Archival Behavior（初始化参数控制ARCn行为）
在2-2-1小节中，详细讲了ARCn进程传输redo data原理，那么ARCn进程如何去完成这些行为，这就不得不提到初始化参数的作用。其中参数LOG_ARCHIVE_DEST_n和LOG_ARCHIVE_MAX_PROCESSES决定了DG环境中使用ARCn进行传输redo data的行为。
LOG_ARCHIVE_DEST_n参数：主备库都必须存在，该参数最多可以设置10个，并且每个必须指定LOCATION或SERVICE属性来指定redo data的传输位置及传输方式，对于该参数的用法参考Oracle 官方文档Data Guard Concepts and Administration中第14章LOG_ARCHIVE_DEST_n Parameter Attributes，此处不详细说明。
LOG_ARCHIVE_MAX_PROCESSES参数：该参数指定最大的ARCn进程数，默认为4个，如果数据库的负载压力很大，可以通过修改该参数来提高数据库归档能力。在DG环境主库中，如果使用ARCn进程传输redo data到备库，则ARCn进程数至少需要2个。
②DG环境中ARCn进程传输redo data支持最大性能模式的数据保护
既然提到ARCn进程传输redo data支持最大性能模式的数据保护，这里不得不提一下，在DG环境中数据库运行的三种模式：一种为最大保护模式（Maximum Protection Mode），一种为最大可用性模式（Maximum Availability Mode），第三种为最大性能模式（Maximum Performance Mode）。这里不细说这三种模式的具体含义（在后文相关位置会具体解释），ARCn传输redo data所支持的最大性能模式是指，这种最大性能模式相比于前两种保护模式而言，主数据库性能优于前两种，因为该模式下，不要求主备库数据要保持实时同步，所以采用ARCn进程传输主库archivelog至备库应用，因而不能保证在故障发生时数据不会出现零丢失。而最大保护模式和最高可用性模式对于主备库数据实时同步要求高，因此采用LGWR进程传输主库ORL或是log buffer cache中redo到备库应用保持数据同步。

2-2-2 LGWR传输redo data

上面解释了ARCn传输redo data原理过程，下面解释DG中LGWR传输redo data的方式。LGWR进程传输redo data不同于ARCn，因为LGWR不必等主库将ORL全部归档至本地磁盘后再将每一个archived redo log file一次传给备库去应用，而是当主库log buffer cache中有最新生成的redo data或当主库每产生一组新online redo log，LGWR就将这些redo提交给一个或多个LNSn（Log Network Server）进程通过Oracle Net传输给备库的RFS进程接收，然后RFS再将接收到的redo数据写入到standby redo log file中后，备库再开始应用redo。

LGWR进程传输redo data有两种方式，一种是LGWR SYNC，另一种为LGWR ASYNC。这两种方式虽然都是使用LGWR传输redo data，但是二者在redo data传输机制上还是有差别，同时在DG环境中所对应的数据数据保护模式不同，前者支持最大保护模式和最高可用性模式，而后者不支持。这里粘上官方文档中LGWR传输redo data的说明：
You can optionally enable redo transport services to use the LGWR process to transmit redo data to remote destinations.
Using the LGWR process differs from ARCn processing, because instead of waiting for the online redo log to switch at the primary database and then writing the entire archived redo log at the remote destination all at once, the LGWR process selects a standby redo log file at the standby site that reflects the log sequence number (and size) of the current online redo log file of the primary database. Then, as redo is generated at the primary database, it is also transmitted to the remote destination. The transmission to the remote destination will either be synchronous or asynchronous, based on whether the SYNC or the ASYNC attribute is set on the LOG_ARCHIVE_DEST_n parameter. Synchronous LGWR processing is required for the maximum protection and maximum availability modes of data protection in Data Guard configurations.

为了更清楚的说明这个过程，我们结合官方文档中的原理图来分别说明LGWR SYNC和LGWR ASYNC这两个过程。
①LGWR SYNC
官方文档上对于LGWR传输redo data的说明中，提到了LGWR传输redo data是采用SYNC还是ASYNC方式，是由LOG_ARCHIVE_DEST_n参数设置决定的。对于该参数的用法这里不赘述，可以去官方文档查阅了解。以下为官方文档对于LGWR SYNC的说明：
Example Initialization Parameters for LGWR Synchronous Archival

LOG_ARCHIVE_DEST_1='LOCATION=/arch1/chicago'
LOG_ARCHIVE_DEST_2='SERVICE=boston LGWR SYNC NET_TIMEOUT=30'
LOG_ARCHIVE_DEST_STATE_1=ENABLE
LOG_ARCHIVE_DEST_STATE_2=ENABLE

Specifying the SYNC attribute on the LOG_ARCHIVE_DEST_n parameter is optional, because this is the default for LGWR archival processing. The NET_TIMEOUT attribute is recommended, because it controls the amount of time that the LGWR process waits for status from the network server process before terminating the network connection. If there is no reply within NET_TIMEOUT seconds, then the LGWR process returns an error message.
Figure 5-4 shows a Data Guard configuration that uses the LGWR process to synchronously transmit redo data to the standby system at the same time it is writing redo data to the online redo log file on the primary database:
On the primary database, the LGWR process submits the redo data to one or more network server (LNSn) processes, which then initiate the network I/O in parallel to multiple remote destinations. Transactions are not committed on the primary database until the redo data necessary to recover the transaction is received by all LGWR SYNC destinations.
On the standby system, the remote file server (RFS) receives redo data over the network from the LGWR process and writes the redo data to the standby redo log files.
A log switch on the primary database triggers a log switch on the standby database, causing ARCn processes on the standby database to archive the standby redo log files to archived redo log files on the standby database. Then, Redo Apply (MRP process) or SQL Apply (LSP process) applies the redo data to the standby database. If real-time apply is enabled, Data Guard recovers redo data directly from the current standby redo log file as it is being filled up by the RFS process.

从上图及官方文档中的解释中，我们只需要抓住其中一点就可以明白LGWR SYNC的含义：On the primary database, transactions are not committed on the primary database until the redo data necessary to recover the transaction is received by all LGWR SYNC destinations. the remote file server (RFS) receives redo data over the network from the LGWR process and writes the redo data to the standby redo log files.这里已经说得很清楚了，当主库产生新事务redo data由LGWR交送给LNSn通过Oracle Net传输给备库RFS并将redo data写入到standby redo log files后，主库的该事务才能提交。而后备库就可以通过MRP或LSP进程应用redo data到备库，或由LGWR归档至本地磁盘。

②LGWR ASYNC
上面用了较长的篇幅介绍了LGWR SYNC原理过程，接下来我们解释LGWR ASYNC原理过程。我们先看官方文档的原文描述：
Example 5-6 Initialization Parameters for LGWR Asynchronous Archiving

LOG_ARCHIVE_DEST_1='LOCATION=/arch1/chicago'
LOG_ARCHIVE_DEST_2='SERVICE=boston LGWR ASYNC'
LOG_ARCHIVE_DEST_STATE_1=ENABLE
LOG_ARCHIVE_DEST_STATE_2=ENABLE

Figure 5-5 shows the LNSn process collecting redo data from the online redo log files and transmitting it over Oracle Net to the RFS process on the standby database.
When the LGWR and ASYNC attributes are specified, the log writer process writes to the local online redo log file, while the network server (LNSn) processes (one for each destination) asynchronously transmit the redo to remote destinations. The LGWR process continues processing the next request without waiting for the LNS network I/O to complete.
If redo transport services transmit redo data to multiple remote destinations, the LNSn processes (one for each destination) initiate the network I/O to all of the destinations in parallel.
When an online redo log file fills up, a log switch occurs and an archiver process archives the log file locally, as usual.

从上图及官方文档中的秒速中，我们同样只需要抓住其中一点就可以明白LGWR ASYNC的含义：When the LGWR and ASYNC attributes are specified, the log writer process writes to the local online redo log file, while the network server (LNSn) processes (one for each destination) asynchronously transmit the redo to remote destinations. The LGWR process continues processing the next request without waiting for the LNS network I/O to complete.从这段解释中我们就可以看出，LGWR ASYNC传输redo data是从主库ORL读取redo data交送给LSNn后通过Oracle Net传输给备库的RFS进程后写入到standby redo log files，再由备库应用redo data，从而保证主备库数据同步。这种方式下主库不必等备库RFS进程接收到传来的redo data写入到standby redo log files后事务才能提交，从某种程度上来讲，LGWR ASYNC模式下运行的DG环境数据库性能优于LGWR SYNC，但同样该模式下不能保证在主库发生故障后数据不会出现零丢失。

有关LGWR传输redo data的几点说明：
①The transmission to the remote destination will either be synchronous or asynchronous, based on whether the SYNC or the ASYNC attribute is set on the LOG_ARCHIVE_DEST_n parameter.
②The SYNC attribute performs all network I/O synchronously, in conjunction with each write operation to the online redo log file, and waits for the network I/O to complete.
③The ASYNC attribute performs all network I/O asynchronously and control is returned to the executing application or user immediately, without waiting for the network I/O to complete.
④If you configure a destination to use the LGWR process, but for some reason the LGWR process becomes unable to archive to the destination, then redo transport will revert to using the ARCn process to complete archival operations.
⑤If redo transport services transmit redo data to multiple remote destinations, the LNSn processes (one for each destination) initiate the network I/O to all of the destinations in parallel.
⑥If real-time apply is enabled, Data Guard recovers redo data directly from the current standby redo log file as it is being filled up by the RFS process.

这篇主要解释了在DG环境中，主库redo data传输到备库的方式，同时也结合官方文档解释了一些相关的概念，后续还会更新DG环境中redo data应用的原理及过程，还包括如何搭建DG，这其中包括物理和逻辑备库的创建过程，最后还会更新DG的管理维护及简单调优的文章。
PS: 未完待续…

从业这一行没多久，也没有什么经验，之前所接触到的客户数据库系统架构基本都是RAC或是单实例，又或是基于操作系统级别的双机环境，DataGuard架构的数据库系统遇到的相对较少，作为Oracle DB HA技术中一个重要的分支，在实际生产数据库环境中确实没有RAC用到的多。结果昨天一个使用Logical Standby DG的客户，要求去现场调整Primary DB Instance内存参数，这个操作是很简单，之前考认证时也演练过Physical DG，包括Switchover，但是确实没有认真看过官档，更别提DG的优化，于是决定看一看官档，详细了解一下DG的原理及优化方法，记录该过程，算是留个念想！

1-1 DG架构简介

DataGuard专门用来保护Oracle数据，可在提供最高级别的数据保护和可用性的同时，使Oracle数据库保持卓越的性能（对于这一说法，笔者认为也只是是相对而言，在Oracle的官方技术网站上有一个统计数据：远程镜像与DG相比，其生成的网络流量多7倍，网络I/O操作多27倍，而且远程镜像还有一个无法比拟DG的缺陷，远程镜像对Oracle事物一无所知，无法区分重做、撤消、数据块修改，控制文件写入等操作！），作为Oracle内核自带的一项功能，DataGuard和Oracle的其他高可用性技术集成在一起，比如RAC，RMAN，Oracle闪回技术等。在说DataGuard相关专业术语之前之前，先上个DG的结构图：

Figure 1-1/Figure 2-1说明了在DG架构中，主备数据库在地理位置上没有任何限制（只要确保主备数据库节点间网络通信及时间同步正常，即便主库在地球上，备库在月球上也没有问题！），这当然也包括主备数据库在同一台服务器上运行，但是这样做基本没有什么意义，因为如此一来，DG已无法提供最高级别的数据保护和可用性，而且还额外消耗服务器资源。因此DG最主要是用于容灾（包括同城或异地灾备），尽可能降低7*24小时不间断运行数据库系统计划停机或意外宕机的时间！

1-2 DG简单原理及相关概念

在Oracle 10g DG架构中的数据库共分为两种角色，一种称之为主库即生产数据库（Primary DB），另一种称为备用数据库（Standby DB又分为物理备库[Physical DB]和逻辑备库[Logical DB]），主库在同一时间只能有一个，而备库最多则可以有9个。主库主要支撑生产业务系统的正常运行，备库则是当主库出现计划内停机或意外宕机时迅速切换成主库继续支持业务的正常运行，也包将主库上的某些操作例如生成报表或大查询等操作放在备库上执行以降低主库的压力。

DataGuard运行机制遵循一个简单的原则：传输重做数据（主数据库即Primary DB传输Redo），然后应用重做数据（备用数据库即Standby DB应用Redo）。重做数据中包括Oracle数据库恢复数据库事物所需要的所有信息，主数据库将重做数据传输给一个或多个独立备用数据库，DG备用数据库处于连续恢复状态，验证并应用重做数据，从而与主数据库保持同步。如果因为网络中断或备用数据库断电导致备用数据库与主数据库之间的连接临时中断，DG还会自动重新同步备用数据库，这个简单的架构使得当主数据库按计划停机或意外中断后，一个或多个备用数据库立即可供使用，从而恢复正常处理。其简单过程如下图：

重做是DG操作的核心，因为备用数据库就是应用主库所传输的重做数据，从而保证备库与主库的数据同步，但对于重做的概念及基础知识，这里不在赘述，可查阅官方文档及相关Oracle技术丛书和网站！

由于DG提供两种不同的方法在备库上应用重做数据：Redo Apply（物理备用）和SQL Apply（逻辑备用），因此DG中的备库也就分为两种，一种是物理备库即Physical Standby DB，另一种为逻辑备库即Logical Standby DB，这二者的区别除了应用重做数据的方式不同外，还有两个最大的区别在于：①Physical Standby DB在应用主库传来的重做数据时只能启动到mount状态（不能open），当必要时需要在Physical Standby DB上执行某些查询时，此时只能将Physical Standby DB停止重做数据的应用（stop redo apply），然后以open read only方式打开Physical Standby DB执行某些查询；但对于Logical Standby DB而言，由于是采用SQL Apply的方式应用重做数据，因此可以以read/write模式open，很多应用上的大查询或是生成报表等操作就可以在Logical Standby DB上进行，大大减小了Primary DB的压力，除此以外还可以对已从Primary DB中同步到Logical Standby DB中本地表执行insert、update、delete等测试操作操作。②由于Physical Standby DB采用Redo Apply方式应用重做数据，因此Physical Standby DB则是Primary DB的一致copy，无论是磁盘上DB的数据块，还是DB中的schema和indexes都完全一致；而对于Logical Standby DB而言，与Primary DB拥有相同的逻辑信息，其物理结构及数据结构可以不同。

到这你可能会问，既然如此，为什么还要用Physical Standby，Logical Standb不但可以保持和主库的数据一致，而且还可以正常的open供应用使用，为什么还要用只能mount或是以read only方式open的Physical Standby，那是因为Logical Standby由于是采用SQL Apply方式应用redo（即在备库中执行SQL语句）保持与主库数据的同步，而SQL语句必然会有某些数据类型不支持，比如ROWID、UROWID等数据类型，那在这种情况下，这些SQL所不支持的数据库类型数据就无法与主库同步，而Physical Standby由于是基于Redo Apply应用方式，因此不存在这种情况。所以在生产环境中需要结合实际情况进行分析选择，甚至有时会采用mixed模式投入生产。下图为Redo Apply和SQL Apply两种方式的DG示意图：

以下为Oracle官方文档对于Primary DB、Physical Standby DB、Logical Standby DB的解释：

---Primary Database
A Data Guard configuration contains one production database, also referred to as the primary database, that functions in the primary role. This is the database that is accessed by most of your applications
The primary database can be either a single-instance Oracle database or an Oracle Real Application Clusters database.
---Physical standby database
Provides a physically identical copy of the primary database, with on disk database structures that are identical to the primary database on a block-for-block basis. The database schema, including indexes, are the same. A physical standby database is kept synchronized with the primary database, though Redo Apply, which recovers the redo data received from the primary database and applies the redo to the physical standby database.
A physical standby database can be used for business purposes other than disaster recovery on a limited basis.
---Logical standby database
Contains the same logical information as the production database, although the physical organization and structure of the data can be different. The logical standby database is kept synchronized with the primary database though SQL Apply, which transforms the data in the redo received from the primary database into SQL statements and then executing the SQL statements on the standby database.
A logical standby database can be used for other business purposes in addition to disaster recovery requirements. This allows users to access a logical standby database for queries and reporting purposes at any time. Also, using a logical standby database, you can upgrade Oracle Database software and patch sets with almost no downtime. Thus, a logical standby database can be used concurrently for data protection, reporting, and database upgrades.

PS: 未完待续…

Lynn の世界

远方，除了遥远，还有梦想！

分类目录归档：Oracle

Oracle 10g DataGuard学习（二—①）Redo Transport Services

Oracle 10g DataGuard学习（一）架构及概念介绍