Restoring Redundancy in Partitioned Regions
Restoring redundancy is a member operation. It affects all partitioned regions defined by the member, regardless of whether the member hosts data for the regions.
Restoring redundancy creates new redundant copies of buckets on members hosting the region and by default reassigns which members host the primary buckets to give better load balancing. It does not move buckets from one member to another. The reassignment of primary hosts can be prevented using the appropriate flags, as described below. See Configure High Availability for a Partitioned Region for further detail on redundancy.
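As background, redundancy for a partitioned region is declared when the region is created. A minimal cache.xml sketch (the region name and number of redundant copies are illustrative, not from this document):

```xml
<region name="exampleRegion">
  <region-attributes refid="PARTITION">
    <!-- One redundant copy of each bucket, in addition to the primary -->
    <partition-attributes redundant-copies="1"/>
  </region-attributes>
</region>
```

A restore redundancy operation creates missing copies until each bucket again has the configured number of redundant copies.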
For efficiency, when starting multiple members, trigger the restore redundancy operation only once, after all members have been added.
Initiate a restore redundancy operation using one of the following:
- gfsh command. First, start a gfsh prompt and connect to the cluster. Then type the following command:

  gfsh>restore redundancy

  Optionally, you can specify regions to include in or exclude from the restore redundancy operation, and you can prevent the operation from reassigning which members host primary copies. Type help restore redundancy or see restore redundancy for more information.
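For example, a run that restores redundancy for a single region while leaving primary hosts in place might look like the following (the region name is illustrative; the option names are assumed from the command's help text):

```
gfsh>restore redundancy --include-region=/region1 --dont-reassign-primaries
```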
- Java API. For example:

  ResourceManager manager = cache.getResourceManager();
  CompletableFuture<RestoreRedundancyResults> future = manager.createRestoreRedundancyOperation()
      .includeRegions(regionsToInclude)
      .excludeRegions(regionsToExclude)
      .shouldReassignPrimaries(false)
      .start();
  // Get the results
  RestoreRedundancyResults results = future.get();
  // These are some of the details we can get about the run from the API
  System.out.println("Restore redundancy operation status is " + results.getStatus());
  System.out.println("Results for each included region: " + results.getMessage());
  System.out.println("Number of regions with no redundant copies: " + results.getZeroRedundancyRegionResults().size());
  System.out.println("Results for region " + regionName + ": " + results.getRegionResult(regionName).getMessage());
If you have startup-recovery-delay=-1 configured for your partitioned region, you must perform a restore redundancy operation on the region after you restart any members in your cluster in order to recover any lost redundancy.
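For reference, disabling automatic redundancy recovery on member startup is done with the startup-recovery-delay attribute in cache.xml; a sketch (region name and copy count are illustrative):

```xml
<region name="exampleRegion">
  <region-attributes refid="PARTITION">
    <!-- -1 disables automatic recovery when a member starts,
         so redundancy must be restored manually -->
    <partition-attributes redundant-copies="1" startup-recovery-delay="-1"/>
  </region-attributes>
</region>
```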
If you have startup-recovery-delay set to a low number, you may need to wait additional time until the region has recovered redundancy.