Wednesday, March 29 2023

Fast-Start Failover Configuration

In today’s article, we will examine Fast-Start Failover configuration.

In summary, configuring Fast-Start Failover consists of the following 8 steps.

A. Identifying the Fast-start Failover Standby Database

B. Setting Data Protection Mode

C. Setting the threshold value at which Fast-start Failover will occur

Ç. Setting additional Fast-start Failover features

D. Setting the additional conditions we want to happen

E. Activating Fast-start Failover

F. Starting Observer

G. Verifying the configuration

Now let’s see how these operations are done.

Unless stated otherwise, DGMGRL commands are run from Primary-1.

A. Identifying the Fast-start Failover Standby Database

The first step of Fast-start Failover configuration is to determine the Target Standby Database.

1. We query the current status of the Fast-start Failover configuration.

2. We query the Fast-start Failover configuration in detail.

3. The Fast-start Failover Target Standby Database is set.

4. We query the Fast-start Failover configuration again.

There is no change in the configuration output. The target appears there only after Fast-start Failover is ENABLED, or we can see it by querying the database as follows.
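The four steps above can be sketched as a small Python helper that only builds the DGMGRL command text. The database names `primary_db` and `standby_db` are hypothetical, and the sketch assumes a release in which FastStartFailoverTarget is a database-level property.

```python
def target_standby_script(primary: str, standby: str) -> list[str]:
    """Build the DGMGRL commands for steps 1-4 (database names are placeholders)."""
    return [
        # 1. Current status of the broker configuration.
        "SHOW CONFIGURATION;",
        # 2. Detailed Fast-start Failover view.
        "SHOW FAST_START FAILOVER;",
        # 3. Point the primary at its Fast-start Failover target standby.
        f"EDIT DATABASE '{primary}' SET PROPERTY FastStartFailoverTarget = '{standby}';",
        # 4. Query again, then read the property directly from the database.
        "SHOW FAST_START FAILOVER;",
        f"SHOW DATABASE '{primary}' 'FastStartFailoverTarget';",
    ]


# Hypothetical names, used only for illustration.
print("\n".join(target_standby_script("primary_db", "standby_db")))
```

The printed lines can be pasted into a DGMGRL session one by one.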

B. Setting Data Protection Mode

To enable Fast-start Failover, the Data Protection Mode must be either Maximum Availability or Maximum Performance.

Choosing the right protection mode is important, as it directly affects how much data can be lost in a disaster.

If Maximum Performance mode is chosen, the FastStartFailoverLagLimit parameter must be set.

This parameter sets the maximum lag, in seconds, that the standby may have behind the primary for Fast-start Failover to still be allowed.

1. We check the current Data Protection Mode.

2. We query the redo transport modes.

3. If I want redo to be shipped with SYNC to the Fast-start Failover Standby Database, I follow these steps.

This error occurs because I did not use the correct syntax when setting LogXptMode for the Primary Database.

The syntax is corrected and the value is queried again.

4. To show how the FastStartFailoverLagLimit parameter is set when Maximum Performance mode is used, I change the Protection Mode again and set the parameter.

The default value of this parameter is 30 seconds.

This error occurs because the Protection Mode needs to be changed first.
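The ordering issue behind the second error can be illustrated with a short Python sketch that builds the DGMGRL commands in an order the broker should accept. The ordering rule is my reading of the errors described above, and the database names passed in are hypothetical.

```python
def protection_mode_script(mode: str, primary: str, standby: str,
                           lag_limit: int = 30) -> list[str]:
    """Return DGMGRL commands ordered for raising or lowering the protection mode."""
    def xpt(db: str, value: str) -> str:
        return f"EDIT DATABASE '{db}' SET PROPERTY LogXptMode = '{value}';"

    if mode == "MaxAvailability":
        # SYNC transport has to be in place before the mode is raised.
        return [xpt(primary, "SYNC"), xpt(standby, "SYNC"),
                "EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;"]
    if mode == "MaxPerformance":
        # Lower the mode first, then relax the transport and set the lag limit.
        return ["EDIT CONFIGURATION SET PROTECTION MODE AS MaxPerformance;",
                xpt(primary, "ASYNC"), xpt(standby, "ASYNC"),
                f"EDIT CONFIGURATION SET PROPERTY FastStartFailoverLagLimit = {lag_limit};"]
    raise ValueError(f"unsupported mode for this sketch: {mode}")
```

For example, `protection_mode_script("MaxPerformance", "primary_db", "standby_db")` returns the mode change first and the FastStartFailoverLagLimit setting last.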

C. Setting the threshold value at which Fast-start Failover will occur

How long the Observer and the Fast-start Failover Standby Database go without a response from the Primary Database determines when Fast-start Failover is triggered.

This is done with the FastStartFailoverThreshold parameter.

This parameter must be set carefully to prevent unnecessary failovers.

For example, if the network connection is regularly interrupted for a few seconds, a low value will do more harm than good.

Oracle recommends the following values for this parameter:

For a single-instance Primary with almost no network latency: 10-15 seconds

For a single-instance Primary with network latency: 30-45 seconds

For a RAC Primary: CSS Miss Count + Reconfiguration Time + 24-40 seconds

The CSS Miss Count is found as follows.

1. Based on the information above, I set this value to 100, since I use a RAC structure and the CSS MissCount value is 30 seconds.
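The RAC formula above is simple arithmetic, sketched here in Python. The CSS MissCount of 30 seconds comes from the text; the reconfiguration time of 30 seconds is an assumed value, chosen only so that the result matches the 100 seconds used in this article.

```python
def rac_threshold(css_misscount: int, reconfig_time: int, margin: int) -> int:
    """CSS Miss Count + Reconfiguration Time + a margin of 24-40 seconds."""
    if not 24 <= margin <= 40:
        raise ValueError("margin is expected to be between 24 and 40 seconds")
    return css_misscount + reconfig_time + margin


# css_misscount=30 is from the article; reconfig_time=30 is an assumption.
threshold = rac_threshold(css_misscount=30, reconfig_time=30, margin=40)

# Print the DGMGRL command that would apply the computed value.
print(f"EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold = {threshold};")
```

The reconfiguration time should be measured on the actual cluster rather than taken from this sketch.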

Ç. Setting additional Fast-start Failover features

To use Fast-start Failover correctly, I need to set several additional parameters.

a. FastStartFailoverLagLimit

We covered this parameter under Protection Mode, but there are some restrictions on its use.

It must be used deliberately, as it directly affects data loss.

It can be set while in Maximum Performance mode.

It has no effect in Maximum Availability mode, because redo is already shipped with SYNC.

Real-Time Apply must be enabled on the standby side.

The default value is 30 and the minimum value is 10.
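The restrictions listed above can be written down as a small check. This is a sketch of the rules as stated in this article, not of the broker's own validation, and the function name is hypothetical.

```python
MIN_LAG_LIMIT = 10      # minimum stated in the text, in seconds
DEFAULT_LAG_LIMIT = 30  # default stated in the text, in seconds


def lag_limit_applies(protection_mode: str, real_time_apply: bool,
                      lag_limit: int = DEFAULT_LAG_LIMIT) -> bool:
    """Return True when FastStartFailoverLagLimit is valid and takes effect."""
    if lag_limit < MIN_LAG_LIMIT:
        raise ValueError(f"lag limit must be at least {MIN_LAG_LIMIT} seconds")
    # Only Maximum Performance uses the limit, and the standby
    # must be running real-time apply.
    return protection_mode == "MaxPerformance" and real_time_apply
```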

b. FastStartFailoverPmyShutdown

It determines whether the Primary Database is shut down after a failover that occurs because the Primary Database lost its connection to both the Observer and the Target Standby Database.

The default value is TRUE, that is, the Primary Database is shut down.

TRUE: When a failover occurs after the FastStartFailoverThreshold value is exceeded, the Primary Database is shut down with SHUTDOWN ABORT.

FALSE: The Primary Database is not shut down, so that the cause of the failover can be investigated.

This information is visible in the V$FS_FAILOVER_STATS view.

If I do not want it shut down:

I set it back to TRUE because I want it to be shut down.

When a user-configured condition is detected, or an application with SYSDBA privileges requests a failover with the DBMS_DG.INITIATE_FS_FAILOVER function, the Primary Database is shut down even if this parameter is set to FALSE.
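The behaviour described above fits in a small decision function. The trigger labels `threshold`, `condition` and `application` are hypothetical names introduced for this sketch.

```python
def primary_is_shut_down(pmy_shutdown: bool, trigger: str) -> bool:
    """Decide whether the old primary is shut down after a fast-start failover.

    trigger: 'threshold'   - contact lost for longer than FastStartFailoverThreshold
             'condition'   - a user-configured condition was detected
             'application' - DBMS_DG.INITIATE_FS_FAILOVER was called
    """
    if trigger in ("condition", "application"):
        return True  # shut down regardless of FastStartFailoverPmyShutdown
    if trigger == "threshold":
        return pmy_shutdown
    raise ValueError(f"unknown trigger: {trigger}")
```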

c. FastStartFailoverAutoReinstate

It specifies whether the original Primary Database, which no longer serves as primary after the failover, is automatically converted into a Physical Standby Database.

We call this process REINSTATE. Its default value is TRUE. There are some prerequisites for this process.

The FastStartFailoverAutoReinstate parameter must be TRUE.

Before the failover and after the original Primary Database is restarted, the original and new Primary Databases must have the same Fast-start Failover configuration.

The Observer that performs the REINSTATE and the new Primary Database must both be able to connect to the original Primary Database.

If automatic REINSTATE is not wanted:

Because I want automatic REINSTATE, I set the parameter back to TRUE.

ç. ObserverConnectIdentifier

This parameter determines how the Observer connects to the Primary and Standby Databases.

By default, it takes the value of the “DGConnectIdentifier” parameter. If no different value is needed, there is no need to set it.

We set it as follows.

1. We query the current value of the parameter.

2. We set the parameter to its new value.

3. We check if the new value is set.

4. As mentioned, if the parameter is not set, DGConnectIdentifier is used.

We query DGConnectIdentifier to see its value.
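The four steps and the fallback rule can be sketched as follows. The database name and the connect alias passed in are hypothetical, and the helper only builds DGMGRL command text.

```python
def observer_connect_script(db: str, alias: str) -> list[str]:
    """DGMGRL commands for steps 1-4 (db and alias are placeholders)."""
    return [
        f"SHOW DATABASE '{db}' 'ObserverConnectIdentifier';",                        # 1. current value
        f"EDIT DATABASE '{db}' SET PROPERTY ObserverConnectIdentifier = '{alias}';", # 2. new value
        f"SHOW DATABASE '{db}' 'ObserverConnectIdentifier';",                        # 3. verify
        f"SHOW DATABASE '{db}' 'DGConnectIdentifier';",                              # 4. fallback value
    ]


def effective_observer_connect(observer_ci: str, dg_ci: str) -> str:
    """Return the identifier the Observer ends up using."""
    # An empty ObserverConnectIdentifier falls back to DGConnectIdentifier.
    return observer_ci.strip() or dg_ci
```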

d. ObserverOverride

It specifies whether a failover is performed when the Observer loses its connection to the Primary, even if the Standby Database can still communicate with the Primary. The default is FALSE.
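The effect of ObserverOverride can be pictured as a simplified truth table. This sketch ignores the FastStartFailoverThreshold timing and every other precondition; it only shows how the parameter changes whose view of the Primary counts.

```python
def failover_attempted(observer_sees_primary: bool,
                       standby_sees_primary: bool,
                       observer_override: bool = False) -> bool:
    """Simplified model: is a fast-start failover attempted?"""
    if observer_sees_primary:
        return False
    # The Observer has lost the primary. Normally the standby must have
    # lost it too; with ObserverOverride the Observer's view is enough.
    return observer_override or not standby_sees_primary
```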

e. ObserverReconnect

It determines how often, in seconds, the Observer establishes a new connection to the Primary Database. The default value is 0.

In other words, a connection is established once at the start and is never re-established.

The advantage is that the system is not burdened with new connections; the disadvantage is that, without new connection attempts, the Observer cannot tell whether the Primary is still reachable.

Oracle recommends choosing a value that is neither so low that it creates a constant connection load nor so high that a loss of communication is detected too late.

D. Setting the additional conditions we want to happen

Apart from the parameters, when any of the following conditions occurs, a failover can be triggered without waiting for the FastStartFailoverThreshold value.

When a datafile goes offline,

When dictionary corruption occurs in a critical database object,

When the control file is damaged due to a disk problem,

When the LGWR process cannot write to the Online Redo Logs,

When the archiver process cannot create an archive log due to lack of space in the destination,

When a specific ORA error is received.

1. The available conditions can be listed as follows.

2. A condition that is NO by default can be enabled as follows.

3. Specific ORA errors can be enabled as follows.
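Steps 2 and 3 differ only in how the condition is written: a named health condition is quoted, while an ORA error is given as a bare number. A minimal sketch follows; the condition names are the ones I recall from 11g and 12.1 era releases, so they should be checked against the SHOW FAST_START FAILOVER output on your own release.

```python
# Assumed condition names; verify them with SHOW FAST_START FAILOVER.
HEALTH_CONDITIONS = (
    "Datafile Offline",
    "Corrupted Dictionary",
    "Corrupted Controlfile",
    "Inaccessible Logfile",
    "Stuck Archiver",
)


def enable_condition(condition) -> str:
    """Build the DGMGRL command that enables one failover condition."""
    if isinstance(condition, int):
        # ORA error numbers are passed without quotes.
        return f"ENABLE FAST_START FAILOVER CONDITION {condition};"
    if condition not in HEALTH_CONDITIONS:
        raise ValueError(f"unknown health condition: {condition}")
    return f'ENABLE FAST_START FAILOVER CONDITION "{condition}";'
```

For example, `enable_condition("Stuck Archiver")` covers the archiver case listed above, and `enable_condition(27102)` enables failover on ORA-27102.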

E. Activating Fast-start Failover

Now that I have set all parameters and conditions, I can enable Fast-start Failover.

There is no Observer at this stage yet. Only the Primary Database and the Fast-start Failover Standby Database are communicating.

We will therefore get a warning when we enable Fast-start Failover.

The following information is written to the logs during activation.

The process is checked at the operating system level.

F. Starting Observer

The Observer should preferably run on a separate server; if none is available, it can run on the Fast-start Failover Standby Database server.

When the Observer is started from DGMGRL, it registers with the Broker and takes over the monitoring of the Data Guard environment.

There is only one Observer in the Data Guard environment.

It does all of its work using a small configuration file that contains information and connection definitions for the Observer, Primary and Standby Databases.

The Observer is a foreground process, so it keeps the command prompt busy for as long as it runs.

The command prompt can be used again only after the Observer is stopped.

It should therefore run somewhere it can work without interruption,

for example on a terminal server.

1. The Observer is started.

2. We query the status of the configuration.

3. The terminal window where the Observer was started is closed with the X button, and the status of the configuration is queried.

Although the Observer is no longer running, the configuration does not report an error or warning.

The reason is that the ObserverReconnect parameter is set to 15 seconds.

We wait for a while and query again.
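Because the Observer is a foreground process that dies with its terminal, one option is to launch it detached from the terminal. The following is a minimal sketch for a Unix-like host, assuming `dgmgrl` is on the PATH; the connect string and log path are hypothetical, and in practice a wallet is preferable to a password on the command line.

```python
import subprocess


def observer_argv(connect: str) -> list[str]:
    """Command line that starts the Observer through DGMGRL."""
    return ["dgmgrl", "-silent", connect, "START OBSERVER"]


def start_observer(connect: str, log_path: str) -> subprocess.Popen:
    """Start the Observer in its own session so closing the terminal does not stop it."""
    with open(log_path, "ab") as log:
        return subprocess.Popen(
            observer_argv(connect),
            stdout=log,                # Observer output goes to the log file
            stderr=subprocess.STDOUT,
            start_new_session=True,    # detach from the controlling terminal (POSIX)
        )
```

A call such as `start_observer("sys/password@primary_db", "/tmp/observer.log")` would start it; both arguments are placeholders.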

G. Verifying the configuration

After all operations, we check whether there is any error in the configuration.

There are 3 ways to do this.

1. The broker configuration is queried.

2. The Fast-start Failover configuration is queried.

3. We query the V$DATABASE view.

The values and meanings of the FS_FAILOVER_STATUS column
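As a partial sketch of how these values might be used in a monitoring script, the snippet below holds a query against V$DATABASE and a short lookup table. Only a handful of statuses are listed and the meanings are my own summaries; the complete list is in the Oracle reference documentation.

```python
# Query text only; running it requires a database connection of your choice.
FS_STATUS_QUERY = (
    "SELECT fs_failover_status, fs_failover_current_target, "
    "fs_failover_threshold, fs_failover_observer_present "
    "FROM v$database"
)

# Partial list, summarised from memory; not the full set of values.
STATUS_MEANING = {
    "DISABLED": "Fast-start Failover is not enabled",
    "SYNCHRONIZED": "target standby has all redo; failover is possible",
    "UNSYNCHRONIZED": "target standby is missing redo; failover is not possible",
    "TARGET UNDER LAG LIMIT": "lag is within FastStartFailoverLagLimit; failover is possible",
    "TARGET OVER LAG LIMIT": "lag exceeds FastStartFailoverLagLimit; failover is not possible",
}


def failover_possible(status: str) -> bool:
    """True for the statuses in which a fast-start failover can proceed."""
    return status.upper() in ("SYNCHRONIZED", "TARGET UNDER LAG LIMIT")
```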



Author: Onur ARDAHANLI

