Fix coordinate mismatch in test BC loss after resample_train_points(bc_points=True)#2076
Open
echen5503 wants to merge 1 commit intolululxvi:masterfrom
Open
Fix coordinate mismatch in test BC loss after resample_train_points(bc_points=True)#2076echen5503 wants to merge 1 commit intolululxvi:masterfrom
echen5503 wants to merge 1 commit intolululxvi:masterfrom
Conversation
Contributor
|
Hello. Thank you for the PR. Please confirm whether I understand correctly:
|
Contributor
Author
|
Yes, your understanding is correct. The PR addresses the same goal as your commit, but goes through a minimalist approach that maximizes backwards compatibility and minimizes overhead. |
Contributor
|
Thank you for the explanation. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Inspired by @kyouma issue #2073 and his code, but updated to a complete fix with backwards compatibility and tests.
Bug
When
PDEPointResamplerresampled boundary condition points during training, the test BC losses became meaningless — exhibiting a sudden ~1000× spike immediately after the first resample, while train loss and solution metrics stayed correct.Root cause
PDE.losses()computes the BC residual as:Inside
bc.error(),X[beg:end]supplies the coordinates used to evaluate reference values and boundary normals, whileinputs[beg:end]/outputs[beg:end]are the network's predictions at those same points. They must describe the same physical locations.During training this holds:
inputsis a tensor oftrain_x. The problem is thatData.losses_test()(inherited unchanged byPDE) delegates toself.losses(), which always readsself.train_xforX— even wheninputsandoutputscome fromtest_x.After
resample_train_points(bc_points=True):self.train_x[beg:end]self.test_x[beg:end]train_state.X_test[beg:end]_test()evaluates the network on staletrain_state.X_test(old BC coords) to produceinputs/outputs, thenlosses_test()callsbc.error(self.train_x, ...)whereself.train_x[beg:end]now holds new BC coordinates. For aNeumannBCthis means the normal derivative is taken at old positions but subtracted from the reference value at new positions — a residual that does not represent the true BC error at either set of points.Affected classes
PDE/TimePDElosses_test()inherits fromDataand callslosses()PDEOperatorlosses()usesself.train_x[1]PDEOperatorCartesianProdlosses_test()calls_losses()which hardcodesself.train_x[1]IDElosses_test()override returns zero for all BC lossesFPDE/TimeFPDEFix
Three coordinated changes are required. Any two alone leave a residual mismatch.
Fix 1 —
PDE.losses()+ newPDE.losses_test()(deepxde/data/pde.py)losses()gains an optionalX_bcparameter (defaulting toself.train_x). A newlosses_test()override passesX_bc=self.test_x:Fix 2 —
PDE.resample_train_points()(deepxde/data/pde.py)When
bc_points=True,test_xis now reset and immediately regenerated so it stays consistent with the newtrain_x_bc:Without this, Fix 1 alone computes the BC error at stale (but self-consistent) old test BC coordinates instead of the newly resampled ones.
Fix 3 —
PDEResampler.on_epoch_end()(deepxde/callbacks.py)After resampling, the callback pushes the regenerated
test_xintotrain_state.X_test:train_state.X_testis what_test()feeds into the network. Without this, the network is still evaluated at stale coords even after Fixes 1 and 2 updatedata.test_x.Same fixes for
PDEOperatorandPDEOperatorCartesianProd(deepxde/data/pde_operator.py)PDEOperator.losses()gainsX_bc/aux_varparameters; newlosses_test()passesself.test_x[1]andself.test_aux_vars.PDEOperator.resample_train_points()resets and regeneratestest_xwhenbc_points=True.PDEOperatorCartesianProd._losses()gainsX_bc;losses_train()passesself.train_x[1]andlosses_test()passesself.test_x[1].Docstring added to
BC.error()(deepxde/icbc/boundary_conditions.py)Documents the requirement that
X[beg:end]andinputs[beg:end]must refer to the same physical points.Backwards compatibility
All existing behaviour is fully preserved for usage that does not involve
bc_points=Trueresampling:losses()defaultsX_bc=None→self.train_x, identical to the previous hardcoded value. Subclass overrides without the new parameter continue to work.losses_train()is not overridden; it still delegates tolosses()withX_bc=None. Training losses are unaffected.PDEResamplerwithbc_points=False(the default) skips the newset_data_test()call entirely.resample_train_points(bc_points=False)does not reset or regeneratetest_x.IDEandFPDE/TimeFPDEare untouched.The only observable behaviour change is the intended one: after
resample_train_points(bc_points=True), reported test BC losses now correctly reflect the network's BC residual on the new points.Tests
test_bc_resample.pyadds five regression tests:test_bc_points_equal_before_resample— sanity check thattest_xandtrain_xstart with identical BC slices.test_test_x_refreshed_after_resample— Fix 2:data.test_x[beg:end]equals the newdata.train_x[beg:end]after resampling.test_train_state_X_test_refreshed_by_callback— Fix 3:train_state.X_test[beg:end]is updated after the callback fires.test_losses_test_passes_test_x_to_bc_error— Fix 1:losses_test()passesself.test_x(notself.train_x) asXtobc.error(), verified by intercepting the call with a sentinel coordinate value.test_github_issue_timepde_neumann_bc— end-to-end regression for the reported case:TimePDEon a 2-D+time rectangle with twoNeumannBCs, oneDirichletBC, oneNeumannBC, and anIC, withPDEPointResampler(bc_points=True). Asserts that after one resample cycle,train_x[beg:end] == test_x[beg:end]for every BC and thattrain_state.X_testis in sync.