Source code of Windows XP (NT5)
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
|
|
The FM cannot init groups/resources until the API is online.
The DM expects FM Phase1 init to create the quorum resource.
What happens if remote node goes down at various points along this path?
Order of events is...
Phase 1: Petition Other Node to Join
NmJoin happens
DmJoin happens - Registers with the gum and syncs database
ApiInitPhase 1 - Read-only access allowed to DM after this point
FmInitPhase1 - Performs synchronized join with GUM layer - Must build the quorum resource for DM layer below - Builds skeletal groups/resources (especially quorum resource) - Gets current state (owner and state) about all groups/resources
DmUpdateJoin - hooks in callbacks for quorum notification and node events.
NmJoinComplete - The node is now fully online
Phase 2: ApiInitPhase2 - API is now fully online
FmInitPhase2 - Complete building all Groups/Resources - Setup all state info retrieved from FmInitPhase1 - FM is now officially online - Pull all groups that should be run on the local node - Signal events for all groups/resources
FORM Phase 1
DmForm - - Registers with gum.
FmGetQuorumResource - gets the quorum resource and arbitrates for it.
DmUpdateFormNewCluter- hooks in callbacks for quorum notifications,node events. This must be done before FmInitPhase1.
FmInitPhase1 - Performs synchronized join with GUM layer - Must build the quorum resource for DM layer below - Builds skeletal groups/resources (especially quorum resource) - Gets current state (owner and state) about all groups/resources
DmRollChanges - quorum resource is brought online - dm layer waits on notification, and when online, changes in the quorum logs are applied.. for logging purposes. - Must be done after FmInitPhase1.
Problem:
- if a node crashes after FmInitPhase1 but before FmInitPhase2 then the state info retrieved in Phase1 is stale
- Resources must be cleaned appropriately no matter where the failure occurs.
- A join must convert to form appropriately.
|