Fix coordinate mismatch in test BC loss after resample_train_points(bc_points=True) #2076

Open
echen5503 wants to merge 1 commit into lululxvi:master from echen5503:resample_fix_clean

Conversation

@echen5503
Contributor

@echen5503 echen5503 commented Mar 30, 2026

Inspired by @kyouma's issue #2073 and his code, but expanded into a complete fix with backwards compatibility and tests.

Bug

When PDEPointResampler resampled boundary condition points during training, the test BC losses became meaningless — exhibiting a sudden ~1000× spike immediately after the first resample, while train loss and solution metrics stayed correct.

Root cause

PDE.losses() computes the BC residual as:

```python
error = bc.error(self.train_x, inputs, outputs, beg, end)
```

Inside bc.error(), X[beg:end] supplies the coordinates used to evaluate reference values and boundary normals, while inputs[beg:end] / outputs[beg:end] are the network's predictions at those same points. They must describe the same physical locations.

During training this holds: inputs is a tensor built from train_x. The problem is that Data.losses_test() (inherited unchanged by PDE) delegates to self.losses(), which always reads self.train_x for X — even when inputs and outputs come from test_x.

After resample_train_points(bc_points=True):

| | Before resample | After resample |
| --- | --- | --- |
| `self.train_x[beg:end]` | old BC coords | new BC coords |
| `self.test_x[beg:end]` | old BC coords | old BC coords (never reset) |
| `train_state.X_test[beg:end]` | old BC coords | old BC coords (set once at training start, never updated) |

_test() evaluates the network on stale train_state.X_test (old BC coords) to produce inputs/outputs, then losses_test() calls bc.error(self.train_x, ...) where self.train_x[beg:end] now holds new BC coordinates. For a NeumannBC this means the normal derivative is taken at old positions but subtracted from the reference value at new positions — a residual that does not represent the true BC error at either set of points.
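A toy numeric illustration of the mismatch (made-up numbers, not DeepXDE code): for a Neumann condition du/dx = g(x) with u(x) = x², evaluating the derivative at the old point but the reference at the new point yields a residual that is zero at neither location:

```python
# Toy illustration (hypothetical numbers, not DeepXDE code) of why mixing
# coordinates inflates the reported test BC residual.

def du_dx(x):              # network's (here: exact) normal derivative of u = x^2
    return 2 * x

def neumann_reference(x):  # prescribed Neumann value g(x) = 2x
    return 2 * x

old_x, new_x = 0.10, 0.90  # BC point before / after resample

# Consistent evaluation: derivative and reference at the same point.
consistent = du_dx(old_x) - neumann_reference(old_x)

# Mismatched evaluation after resample: derivative at the old point,
# reference at the new point -> a large spurious residual.
mismatched = du_dx(old_x) - neumann_reference(new_x)

print(consistent)   # 0.0
print(mismatched)   # ~ -1.6
```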

Affected classes

| Class | Affected? |
| --- | --- |
| `PDE` / `TimePDE` | Yes — `losses_test()` inherits from `Data` and calls `losses()` |
| `PDEOperator` | Yes — same inheritance; `losses()` uses `self.train_x[1]` |
| `PDEOperatorCartesianProd` | Yes — `losses_test()` calls `_losses()`, which hardcodes `self.train_x[1]` |
| `IDE` | No — its `losses_test()` override returns zero for all BC losses |
| `FPDE` / `TimeFPDE` | No — same reason as `IDE` |

Fix

Three coordinated changes are required. Any two alone leave a residual mismatch.

Fix 1 — PDE.losses() + new PDE.losses_test() (deepxde/data/pde.py)

losses() gains an optional X_bc parameter (defaulting to self.train_x). A new losses_test() override passes X_bc=self.test_x:

```python
def losses(self, ..., X_bc=None):
    if X_bc is None:
        X_bc = self.train_x
    ...
    error = bc.error(X_bc, inputs, outputs, beg, end)

def losses_test(self, ...):
    return self.losses(..., X_bc=self.test_x)
```
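The default-argument pattern of Fix 1 can be exercised in isolation. The class below is a deliberately tiny stand-in (not the real PDE class) showing that the training path still resolves X_bc to train_x while the new losses_test() forwards test_x:

```python
# Minimal sketch of the Fix 1 pattern; FakePDE is a hypothetical stand-in,
# not DeepXDE's PDE class.

class FakePDE:
    def __init__(self, train_x, test_x):
        self.train_x = train_x
        self.test_x = test_x
        self.calls = []              # records which coordinates reach bc.error

    def losses(self, X_bc=None):
        if X_bc is None:             # training path: unchanged behaviour
            X_bc = self.train_x
        self.calls.append(X_bc)      # stand-in for bc.error(X_bc, ...)
        return 0.0

    def losses_train(self):
        return self.losses()         # still delegates with X_bc=None

    def losses_test(self):
        return self.losses(X_bc=self.test_x)   # new override (Fix 1)

pde = FakePDE(train_x="new BC coords", test_x="test BC coords")
pde.losses_train()
pde.losses_test()
print(pde.calls)   # ['new BC coords', 'test BC coords']
```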

Fix 2 — PDE.resample_train_points() (deepxde/data/pde.py)

When bc_points=True, test_x is now reset and immediately regenerated so it stays consistent with the new train_x_bc:

```python
if bc_points:
    self.train_x_bc = None
    self.test_x, self.test_y, self.test_aux_vars = None, None, None
...
self.train_next_batch()
if bc_points:
    self.test()
```

Without this, Fix 1 alone computes the BC error at stale (but self-consistent) old test BC coordinates instead of the newly resampled ones.

Fix 3 — PDEResampler.on_epoch_end() (deepxde/callbacks.py)

After resampling, the callback pushes the regenerated test_x into train_state.X_test:

```python
self.model.data.resample_train_points(self.pde_points, self.bc_points)
if self.bc_points:
    self.model.train_state.set_data_test(*self.model.data.test())
```

train_state.X_test is what _test() feeds into the network. Without this, the network is still evaluated at stale coords even after Fixes 1 and 2 update data.test_x.
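Fixes 2 and 3 work together. A miniature sketch (hypothetical MiniData / MiniTrainState classes with heavily simplified signatures, not DeepXDE's real API) shows the reset-regenerate-push sequence:

```python
# Hypothetical miniature of Fixes 2 + 3: resampling resets and regenerates
# test_x, and the callback pushes the fresh test set into train_state.X_test.
import random

class MiniData:
    def __init__(self):
        self.train_x = [0.1, 0.2]        # BC coordinates only, for brevity
        self.test_x = list(self.train_x)

    def test(self):
        if self.test_x is None:          # regenerate from the current train_x
            self.test_x = list(self.train_x)
        return (self.test_x,)            # real API also returns y, aux_vars

    def resample_train_points(self, bc_points=True):
        self.train_x = [random.random() for _ in self.train_x]
        if bc_points:
            self.test_x = None           # Fix 2: reset ...
            self.test()                  # ... and immediately regenerate

class MiniTrainState:
    def set_data_test(self, X_test):
        self.X_test = X_test

data, state = MiniData(), MiniTrainState()
state.set_data_test(*data.test())        # training start

data.resample_train_points(bc_points=True)
state.set_data_test(*data.test())        # Fix 3: refresh after resampling

print(state.X_test == data.train_x)      # True: everything back in sync
```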

Same fixes for PDEOperator and PDEOperatorCartesianProd (deepxde/data/pde_operator.py)

  • PDEOperator.losses() gains X_bc / aux_var parameters; new losses_test() passes self.test_x[1] and self.test_aux_vars.
  • PDEOperator.resample_train_points() resets and regenerates test_x when bc_points=True.
  • PDEOperatorCartesianProd._losses() gains X_bc; losses_train() passes self.train_x[1] and losses_test() passes self.test_x[1].

Docstring added to BC.error() (deepxde/icbc/boundary_conditions.py)

Documents the requirement that X[beg:end] and inputs[beg:end] must refer to the same physical points.
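The contract could read roughly as follows (the wording below is illustrative, not the PR's exact docstring):

```python
# Illustrative docstring for the coordinate contract on BC.error();
# the wording is hypothetical, not the text added by the PR.

class BC:
    def error(self, X, inputs, outputs, beg, end):
        """Return the BC residual on the slice ``[beg, end)``.

        ``X[beg:end]`` and ``inputs[beg:end]`` / ``outputs[beg:end]`` must
        describe the same physical points: ``X`` supplies the coordinates
        used for reference values and boundary normals, while ``inputs``
        and ``outputs`` are the network inputs and predictions there.
        """
        raise NotImplementedError

print("same physical points" in BC.error.__doc__)  # True
```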


Backwards compatibility

All existing behaviour is fully preserved for usage that does not involve bc_points=True resampling:

  • losses() defaults X_bc=None, which resolves to self.train_x — identical to the previous hardcoded value. Subclass overrides without the new parameter continue to work.
  • losses_train() is not overridden; it still delegates to losses() with X_bc=None. Training losses are unaffected.
  • PDEResampler with bc_points=False (the default) skips the new set_data_test() call entirely.
  • resample_train_points(bc_points=False) does not reset or regenerate test_x.
  • IDE and FPDE / TimeFPDE are untouched.

The only observable behaviour change is the intended one: after resample_train_points(bc_points=True), reported test BC losses now correctly reflect the network's BC residual on the new points.


Tests

test_bc_resample.py adds five regression tests:

  1. test_bc_points_equal_before_resample — sanity check that test_x and train_x start with identical BC slices.
  2. test_test_x_refreshed_after_resample — Fix 2: data.test_x[beg:end] equals the new data.train_x[beg:end] after resampling.
  3. test_train_state_X_test_refreshed_by_callback — Fix 3: train_state.X_test[beg:end] is updated after the callback fires.
  4. test_losses_test_passes_test_x_to_bc_error — Fix 1: losses_test() passes self.test_x (not self.train_x) as X to bc.error(), verified by intercepting the call with a sentinel coordinate value.
  5. test_github_issue_timepde_neumann_bc — end-to-end regression for the reported case: TimePDE on a 2-D+time rectangle with two NeumannBCs, one DirichletBC, one NeumannBC, and an IC, with PDEPointResampler(bc_points=True). Asserts that after one resample cycle, train_x[beg:end] == test_x[beg:end] for every BC and that train_state.X_test is in sync.
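Test 4's sentinel-interception idea can be sketched in miniature (hypothetical spy classes, not the actual test code): plant a sentinel as test_x and assert that it, rather than train_x, reaches bc.error().

```python
# Hypothetical miniature of regression test 4: verify that losses_test()
# forwards test_x (not train_x) by intercepting bc.error() with a sentinel.

SENTINEL = object()

class SpyBC:
    def __init__(self):
        self.seen_X = None
    def error(self, X, inputs, outputs, beg, end):
        self.seen_X = X          # record which coordinates were passed
        return 0.0

class FakePDE:
    def __init__(self, bc):
        self.bc = bc
        self.train_x = "train coords"
        self.test_x = SENTINEL   # sentinel marks the test coordinates
    def losses(self, inputs, outputs, X_bc=None):
        if X_bc is None:
            X_bc = self.train_x
        return self.bc.error(X_bc, inputs, outputs, 0, 1)
    def losses_test(self, inputs, outputs):
        return self.losses(inputs, outputs, X_bc=self.test_x)

pde = FakePDE(SpyBC())
pde.losses_test(inputs=None, outputs=None)
print(pde.bc.seen_X is SENTINEL)   # True: test_x reached bc.error
```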

@kyouma
Contributor

kyouma commented Mar 30, 2026

Hello.

Thank you for the PR. Please confirm whether I understand correctly:

  1. these changes achieve the same goal as my commit does, but they do it more carefully;
  2. they also take into consideration passing correct aux_var;
  3. the test BC/IC points sampling process remains the same: copy the current training BC/IC points (the "TODO" in PDE.test_points());
  4. when the training points are resampled, the test BC/IC points are updated according to this procedure.

@echen5503
Contributor Author

Yes, your understanding is correct. The PR addresses the same goal as your commit, but takes a minimalist approach that maximizes backwards compatibility and minimizes overhead.

@kyouma
Contributor

kyouma commented Mar 30, 2026

Thank you for the explanation.
