HBASE-30137 In FSFT, generate a new manifest in case the latest manifest file gets corrupted #8382
Draft
gvprathyusha6 wants to merge 5 commits into
Keep dev-support/design-docs/fsft-manifest-repair.md (the canonical design
referenced by StoreFileListRepair and carrying the two-track procedure+CLI
decision). Delete the two superseded drafts:
- fsft-manifest-repair-lld.md: predates the online-procedure decision
("No new RPC, no master integration, no online HBCK plumbing").
- fsft-repair-manifest-copy.md: early offline-only copy.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the in-progress online FSFT manifest "repair" path with a single offline, operator-driven CLI that rebuilds a corrupted FILE store-file-tracker manifest (.filelist) purely from the on-disk store listing.

Engine + CLI:
- StoreFileListRecover: disk-only reconstruction. The recovered manifest is exactly the set of store files physically present under the family directory (HFiles, references, links), filtered with DefaultStoreFileTracker rules; the Reference body is carried into the manifest entry. Nothing is synthesized from split/merge lineage.
- For user-table regions it consults hbase:meta for split/merge parents and reports data-loss risk (parents with unarchived HFiles) without ever injecting parent-derived entries into the manifest.
- isAlreadyHealthy() mirrors the runtime load selection (numeric seqId ordering, f1/f2 winner by timestamp) so a no-op cannot mask corruption of a higher-seqId tracker file.
- StoreFileListRecoverTool: CLI surface (sftrecover) with safety gates -- requires --region-offline or --dry-run before writing, refuses hbase:meta without --force-meta, refuses non-FILE/MIGRATION trackers.

Removals:
- Drop the online repair surface entirely: RepairFsftRegionProcedure, the Hbck.repairFsftRegion RPC + HBaseHbck impl, the Master.proto / MasterProcedure.proto RPC + messages + state, and the MasterRpcServices handler. Nothing in the master can fence a RegionServer off the store dir while a manifest is rewritten, so offline-only is the correct boundary.
- Restore StoreFileListFilePrettyPrinter to a pure read-only viewer (the repair logic that had been embedded there now lives in the recover tool).

Wire `hbase sftrecover` into bin/hbase and bin/hbase.cmd. Add TestStoreFileListRecover (11 tests) and the fsft-manifest-recover design doc.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
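
The point about isAlreadyHealthy() in the commit message above (numeric seqId ordering, then f1/f2 winner by timestamp) may be easier to follow with a small sketch. This is not the code from the PR. `TrackFileSelectionSketch`, `TrackFile`, `looksHealthy` and the `readable` flag are hypothetical names, and the file naming (`f1.<seqId>` / `f2.<seqId>`, with a missing suffix treated as seqId 0) and the rule that health is judged only on the highest-seqId pair are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; NOT the StoreFileListRecover implementation.
final class TrackFileSelectionSketch {

  // Hypothetical stand-in for one tracker file under .filelist.
  static final class TrackFile {
    final String name;      // assumed naming: "f1.<seqId>" or "f2.<seqId>"
    final long timestamp;   // timestamp assumed to be stored in the file body
    final boolean readable; // false models a corrupted or truncated file

    TrackFile(String name, long timestamp, boolean readable) {
      this.name = name;
      this.timestamp = timestamp;
      this.readable = readable;
    }
  }

  // Parse the suffix as a number so that "f1.10" ranks above "f1.9".
  // Assumption: a name without a suffix counts as seqId 0.
  static long seqId(String name) {
    int dot = name.indexOf('.');
    return dot < 0 ? 0L : Long.parseLong(name.substring(dot + 1));
  }

  // Keep only the files that carry the highest numeric seqId.
  static List<TrackFile> newestGeneration(List<TrackFile> files) {
    long max = Long.MIN_VALUE;
    for (TrackFile f : files) {
      max = Math.max(max, seqId(f.name));
    }
    List<TrackFile> newest = new ArrayList<>();
    for (TrackFile f : files) {
      if (seqId(f.name) == max) {
        newest.add(f);
      }
    }
    return newest;
  }

  // Within the newest generation, the readable file with the larger
  // timestamp wins; returns null when neither file is readable.
  static TrackFile winner(List<TrackFile> files) {
    TrackFile best = null;
    for (TrackFile f : newestGeneration(files)) {
      if (f.readable && (best == null || f.timestamp > best.timestamp)) {
        best = f;
      }
    }
    return best;
  }

  // Assumed health rule: an older readable generation does not count.
  static boolean looksHealthy(List<TrackFile> files) {
    return winner(files) != null;
  }

  public static void main(String[] args) {
    List<TrackFile> files = Arrays.asList(
      new TrackFile("f1.9", 100L, true),
      new TrackFile("f2.9", 200L, true),
      new TrackFile("f1.10", 300L, false)); // newest generation is corrupt
    System.out.println(looksHealthy(files));
  }
}
```

A plain string sort would rank "f2.9" above "f1.10" and report the store as healthy on the strength of the older pair. Comparing the parsed seqId instead makes the corrupt `f1.10` the only candidate, so the sketch reports the manifest as unhealthy rather than treating the run as a no-op.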
testRestoreSnapshotAfterSplitWithCompactionsDisabled (and its helpers) was added by the initial branch commit but is unrelated to the offline FSFT manifest-recover tool that this branch/PR delivers. Restore the file to its upstream/master state so it no longer appears in the PR diff.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This test was an empirical exploration harness added by the initial branch commit: it starts a mini cluster only to LOG whether hbase:meta inherits the FILE tracker, and asserts nothing meaningful about the recover tool (its own comments say "we assert nothing definitive ... the LOG output is the real evidence"). It is not part of the offline FSFT manifest-recover feature, so remove it from the branch/PR.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
No description provided.