Commit 7d95dea
Dev/grasp full pipeline (#905)
* feat: migrate GRASP model from PyHealth 1.0 to 2.0 API
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
* feat: add GRASP mortality prediction notebook and fix cluster_num
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
* Restore code_mapping support in SequenceProcessor for PyHealth 2.0
Adds optional code_mapping parameter to SequenceProcessor that maps
granular medical codes to grouped vocabularies (e.g. ICD9CM→CCSCM)
before building the embedding table. Resolves the functional gap
from the 1.x→2.0 rewrite where code_mapping was removed. Ref #535
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
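
The mapping step described above can be sketched in plain Python. This is a minimal illustration, not the SequenceProcessor implementation: `build_vocab`, the toy codes, and the toy mapping are all hypothetical, and the choice to drop unmapped codes (so they fall back to `<unk>` later) is an assumption.

```python
from typing import Dict, Iterable, List, Optional


def build_vocab(
    sequences: Iterable[List[str]],
    code_mapping: Optional[Dict[str, List[str]]] = None,
) -> Dict[str, int]:
    """Assign an index to every token, optionally after mapping raw codes.

    Hypothetical helper: `code_mapping` maps one granular code to a list of
    grouped codes. Codes missing from the mapping contribute no token here.
    """
    vocab = {"<pad>": 0, "<unk>": 1}
    for seq in sequences:
        for code in seq:
            # Without a mapping, the raw code is its own token.
            targets = [code] if code_mapping is None else code_mapping.get(code, [])
            for token in targets:
                # Only new tokens receive the next free index.
                vocab.setdefault(token, len(vocab))
    return vocab


# Toy, made-up codes: two granular codes collapse into one group.
toy_mapping = {"A1": ["G1"], "A2": ["G1"], "B1": ["G2"]}
visits = [["A1", "A2"], ["B1", "Z9"]]

raw_vocab = build_vocab(visits)
grouped_vocab = build_vocab(visits, toy_mapping)
```

In this toy case the grouped vocabulary is smaller than the raw one, which is the effect the embedding table benefits from.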
* Add RNN baseline and code_mapping comparison notebooks for MIMIC-III
Two otherwise identical notebooks for A/B testing the impact of
code_mapping on mortality prediction. The only difference is the schema
override in Step 2. Both use seed=42 for reproducible splits.
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
* fix(tasks): extract NDC codes instead of drug names for prescription mapping
event.drug returns drug names (e.g. "Aspirin") which produce zero matches
in CrossMap NDC→ATC; event.ndc returns actual NDC codes enabling 3/3
feature mapping for mortality and readmission tasks.
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
* test(tasks): add tests verifying NDC extraction in drug tasks
Checks that mortality and readmission task processors build vocabulary
from NDC codes (numeric strings) rather than drug names (e.g. "Aspirin"),
confirming the event.drug -> event.ndc fix works correctly.
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
* fix(tasks): fix missed MortalityPredictionMIMIC4 event.drug and update docs
- Fix event.drug -> event.ndc in MortalityPredictionMIMIC4 (line 282)
- Update readmission task docstrings to reflect NDC extraction
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
* fix(tasks): fix DrugRecommendationMIMIC3 to extract NDC codes
DrugRecommendationMIMIC3 used prescriptions/drug (drug names) via Polars
column select; changed to prescriptions/ndc to match MIMIC-4 variant and
enable NDC->ATC code mapping.
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
* fix(models): guard RNNLayer and ConCare against zero-length sequences
RNNLayer: clamp sequence lengths to min 1 so pack_padded_sequence
does not crash on all-zero masks, matching TCNLayer (tcn.py:186).
ConCare: guard covariance divisor with max(n-1, 1) to prevent
ZeroDivisionError when attention produces single-element features.
Both edge cases are triggered when code_mapping collapses vocabularies
and every code for some patients maps to <unk>, producing all-zero
embeddings and all-zero masks.
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
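
The two guards can be illustrated without PyTorch. The sketch below is a plain-Python analogue under stated assumptions: `sequence_lengths` and `covariance` are hypothetical helpers that mirror the idea (clamp lengths to at least 1, guard the divisor with max(n-1, 1)), not the RNNLayer or ConCare code.

```python
from typing import List


def sequence_lengths(mask: List[List[int]]) -> List[int]:
    """Count valid steps per row, clamped to at least 1.

    An all-zero mask would otherwise yield length 0, which packing
    utilities for padded sequences typically reject.
    """
    return [max(sum(row), 1) for row in mask]


def covariance(xs: List[float], ys: List[float]) -> float:
    """Sample covariance with the divisor guarded against n == 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # max(n - 1, 1) keeps the usual n - 1 divisor for n >= 2 and
    # avoids dividing by zero for a single-element input.
    return total / max(n - 1, 1)


lengths = sequence_lengths([[1, 1, 0], [0, 0, 0]])
single_cov = covariance([2.0], [5.0])
pair_cov = covariance([1.0, 3.0], [2.0, 6.0])
```

The clamp changes nothing for ordinary inputs; it only affects the degenerate rows that previously crashed.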
* docs: add docstrings to SequenceProcessor class and fit method
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
* docs: add docstrings, type hints, and fix test dims for GRASP module
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
* feat: add GRASP mortality prediction notebooks for baseline and code_mapping
Baseline notebook runs GRASP with raw ICD-9/NDC codes. Code_mapping
notebook collapses vocab via ICD9CM→CCSCM, ICD9PROC→CCSPROC, NDC→ATC
for trainable embeddings on full MIMIC-III.
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
* fix(models): guard ConCare and GRASP against batch_size=1 crashes
- ConCare FinalAttentionQKV: bare .squeeze() removed batch dim when
batch_size=1, causing IndexError in softmax. Use .squeeze(-1) and
.squeeze(1) to target only the intended dimensions.
- ConCare cov(): division by zero when x.size(1)==1. Guard with max().
- GRASP grasp_encoder: remove stale torch.squeeze(hidden_t, 0) that
collapsed [1, hidden] to [hidden] with batch_size=1. Both RNNLayer
and ConCareLayer already return [batch, hidden].
- GRASP random_init: clamp num_centers to num_points to prevent
ValueError when cluster_num > batch_size.
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
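
The num_centers clamp can be sketched with the standard library. `pick_center_indices` is a hypothetical helper, and sampling indices with `random.sample` is an assumption about how centers are drawn; the point is only that capping the request at the number of available points avoids the ValueError.

```python
import random
from typing import List


def pick_center_indices(num_points: int, num_centers: int, seed: int = 0) -> List[int]:
    """Pick distinct indices to serve as initial cluster centers.

    Sampling more items than exist raises ValueError, so the request
    is clamped to the number of available points first.
    """
    num_centers = min(num_centers, num_points)
    rng = random.Random(seed)
    return rng.sample(range(num_points), num_centers)


# With a single point (batch_size=1) and a larger cluster count,
# the clamp reduces the request to one center.
indices = pick_center_indices(num_points=1, num_centers=12)
```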
* Add code_mapping as task __init__ argument
Allow tasks to accept a code_mapping dict that upgrades input_schema
entries so SequenceProcessor maps raw codes (e.g. ICD9CM) to grouped
vocabularies (e.g. CCSCM) at fit/process time. This avoids manual
schema manipulation after task construction.
- Add code_mapping parameter to BaseTask.__init__()
- Thread **kwargs + super().__init__() through all task subclasses
with existing __init__ methods (4 readmission tasks, 1 multimodal
mortality task)
- Add 17 tests covering SequenceProcessor mapping and task-level
code_mapping initialization
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
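
A rough sketch of how a task-level code_mapping argument might upgrade schema entries follows. `ToyTask` is a hypothetical stand-in, not BaseTask, and the tuple form `("sequence", {"code_mapping": ...})` for an upgraded entry is an assumption made for illustration.

```python
from typing import Dict, Optional, Tuple, Union

SchemaEntry = Union[str, Tuple[str, dict]]


class ToyTask:
    """Hypothetical stand-in for a task class (not the PyHealth BaseTask)."""

    input_schema: Dict[str, SchemaEntry] = {
        "conditions": "sequence",
        "procedures": "sequence",
    }

    def __init__(self, code_mapping: Optional[Dict[str, Tuple[str, str]]] = None):
        # Copy the class-level schema so other instances are unaffected.
        self.input_schema = dict(type(self).input_schema)
        for key, pair in (code_mapping or {}).items():
            if key in self.input_schema:
                # Assumed upgraded form: processor name plus its kwargs.
                self.input_schema[key] = ("sequence", {"code_mapping": pair})


task = ToyTask(code_mapping={"conditions": ("ICD9CM", "CCSCM")})
```

Passing the mapping at construction keeps the override in one place, instead of editing the schema by hand after the task is created.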
* Update code_mapping notebook to use task init argument
Replace manual task.input_schema override with the new
code_mapping parameter on MortalityPredictionMIMIC3().
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
* feat(examples): add ConCare hyperparameter grid sweep script
Mirrors the GRASP+ConCare mortality notebook pipeline exactly
(same tables, split, seed, metrics) but sweeps 72 configurations
of embedding_dim, hidden_dim, cluster_num, lr, and weight_decay.
Results are logged to sweep_results.csv. Supports --root for
pointing at local MIMIC-III, --code-mapping, --dev, and --monitor.
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
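
A grid sweep of this kind is commonly built as a Cartesian product of per-parameter value lists. The sketch below assumes that approach; `build_grid` is a hypothetical helper, and the values in `toy_space` are illustrative only, chosen so the product has 72 entries. They are not the values used by the script.

```python
import itertools
from typing import Dict, List


def build_grid(space: Dict[str, list]) -> List[dict]:
    """Expand a dict of value lists into one config dict per combination."""
    keys = list(space)
    return [
        dict(zip(keys, values))
        for values in itertools.product(*(space[k] for k in keys))
    ]


# Illustrative values only (3 * 3 * 2 * 2 * 2 = 72 combinations).
toy_space = {
    "embedding_dim": [8, 16, 32],
    "hidden_dim": [16, 32, 64],
    "cluster_num": [4, 8],
    "lr": [1e-3, 1e-4],
    "weight_decay": [0.0, 1e-5],
}
configs = build_grid(toy_space)
```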
* chore(sweep): increase early stopping patience from 10 to 15 epochs
Smaller ConCare configs (embedding_dim=8/16) may learn slower and
need more epochs before plateauing.
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
* Initial plan
* fix: filter falsy NDCs, guard None tokens in process(), fix NDC regex
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-authored-by: ddhangdd <43976109+ddhangdd@users.noreply.github.com>
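
The filtering idea can be sketched as follows. `clean_ndcs` and `NDC_PATTERN` are hypothetical, and the rules are assumptions: None, empty strings, and all-zero placeholders are dropped, and only digit-only strings are kept. The regex in the commit may differ.

```python
import re
from typing import Iterable, List, Optional

# Assumption for illustration: an NDC token is a non-empty run of digits.
NDC_PATTERN = re.compile(r"\d+")


def clean_ndcs(raw: Iterable[Optional[str]]) -> List[str]:
    """Keep only values that look like NDC codes."""
    kept = []
    for value in raw:
        if not value:  # skips None and ""
            continue
        value = str(value).strip()
        # fullmatch rejects drug names; strip("0") rejects all-zero placeholders.
        if NDC_PATTERN.fullmatch(value) and value.strip("0"):
            kept.append(value)
    return kept


# "12345678901" is a made-up digit string, not a real product code.
cleaned = clean_ndcs([None, "", "0", "Aspirin", "12345678901"])
```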
* refactor(sweep): rename and generalize sweep script for all backbones
Rename sweep_concare_grasp.py → sweep_grasp.py. Now supports
--block GRU|ConCare|LSTM with per-backbone default grids, --resume
for crash recovery, --grid JSON override, auto-dated output dirs
(sweep/{BLOCK}_{YYYYMMDD}_{HHMMSS}_{mapping}/), and config.json
saved alongside results for reproducibility.
Co-Authored-By: Colton Loew <colton.loew@gmail.com>
Co-Authored-By: lookman-olowo <lookmanolowo@hotmail.com>
Co-Authored-By: christiana-beard <christyanamarie116@gmail.com>
Co-Authored-By: ddhangdd <dfung2@wisc.edu>
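
The resume and naming behavior can be sketched with the standard library. All three helpers are hypothetical: hashing a key-sorted JSON dump is one common way to identify a config, and the `mapping` label value ("mapped") is made up. Only the directory pattern comes from the description above.

```python
import hashlib
import json
from datetime import datetime
from typing import Iterable, List, Set


def combo_hash(config: dict) -> str:
    """Stable short ID for a config, independent of key order."""
    payload = json.dumps(config, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def output_dir(block: str, mapping: str, now: datetime) -> str:
    """Format sweep/{BLOCK}_{YYYYMMDD}_{HHMMSS}_{mapping}."""
    return f"sweep/{block}_{now:%Y%m%d}_{now:%H%M%S}_{mapping}"


def pending(configs: Iterable[dict], done_hashes: Set[str]) -> List[dict]:
    """Return configs whose hash is not yet recorded, for resuming a run."""
    return [c for c in configs if combo_hash(c) not in done_hashes]


example_dir = output_dir("GRU", "mapped", datetime(2024, 1, 2, 3, 4, 5))
todo = pending([{"lr": 0.001}, {"lr": 0.0001}], {combo_hash({"lr": 0.001})})
```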
* test(sweep): add unit and integration tests for sweep_grasp utilities
Covers grid building, combo hashing, CSV resume parsing, output
directory naming, and end-to-end single-config runs for GRU and ConCare
on synthetic data (13 tests, all passing).
Co-Authored-By: Colton Loew <loewcx@illinois.edu>
Co-Authored-By: lookman-olowo <lookman-olowo@github.com>
Co-Authored-By: christiana-beard <christiana-beard@github.com>
Co-Authored-By: ddhangdd <ddhangdd@github.com>
* docs(sweep): add tmux copy-paste instructions for each paper run
Co-Authored-By: Colton Loew <loewcx@illinois.edu>
Co-Authored-By: lookman-olowo <lookman-olowo@github.com>
Co-Authored-By: christiana-beard <christiana-beard@github.com>
Co-Authored-By: ddhangdd <ddhangdd@github.com>
* chore(examples): clean up examples, remove util script
* Delete tests/core/test_grasp.py
The GRASP script was removed from examples, so its test is dropped.
* Revert "Delete tests/core/test_grasp.py"
This reverts commit 0d95758.
* fix: remove orphaned sweep test, restore grasp tests
* feat(grasp): add static_key support for demographic features with tests
* fix(test): add valid NDC to test prescriptions so readmit test produces both labels
---------
Co-authored-by: lookman-olowo <lookmanolowo@hotmail.com>
Co-authored-by: christiana-beard <christyanamarie116@gmail.com>
Co-authored-by: ddhangdd <dfung2@wisc.edu>
Co-authored-by: Lookman Olowo <42081779+lookman-olowo@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ddhangdd <43976109+ddhangdd@users.noreply.github.com>
Co-authored-by: ddhangdd <desmondfung123@gmail.com>
Co-authored-by: Colton Loew <loewcx@illinois.edu>
Co-authored-by: lookman-olowo <lookman-olowo@github.com>
Co-authored-by: christiana-beard <christiana-beard@github.com>
Co-authored-by: ddhangdd <ddhangdd@github.com>
Co-authored-by: lookman-olowo <lookman-olowo@users.noreply.github.com>
1 parent 8c0f157 commit 7d95dea
File tree
17 files changed, +11161 -234 lines changed
- examples/mortality_prediction
- pyhealth
  - models
  - processors
  - tasks
- test-resources/core/mimic4demo/hosp
- tests/core
Lines changed: 3258 additions & 0 deletions
Large diffs are not rendered by default.