Description of feature
Hi!
**Context:** I trained a model on bulk ATAC-seq data from 16 subclones that my lab previously isolated, representing a continuous trajectory from a low to a high metastatic cell state (originating from two parental KPC cell lines). I trained the base model on all peaks, then fine-tuned it on the subset of peaks that DESeq2 calls differentially accessible between the met-high and met-low subclones. I've had to set up some custom optimizer functions for my use case, but I've been pretty impressed by their in silico performance.
I am currently planning an MPRA on the synthetic enhancers generated from this model, to further improve its performance via fine-tuning. My MPRA output will be in the form of RNA/DNA read ratios (log2FC) per cell type. Because this cell state shift is very subtle, I anticipate needing an iterative process of fine-tuning on these MPRA data. My ultimate goal is to generate cell-state-specific reporter constructs.
My question is: does CREsted natively support fine-tuning on this specific data type (normalized activity scores rather than accessibility peaks)? To help bridge the gap between chromatin accessibility and transcriptional activity, I intend to include a subset of endogenous regulatory sequences that were present in the original ATAC-seq training data to act as anchors. Any guidance on how best to format this for fine-tuning within the CREsted architecture would be awesome!
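For concreteness, here is a minimal sketch of what such an input table might look like, assuming pseudocounted log2(RNA/DNA) per cell state and an `is_anchor` flag marking the endogenous sequences. The construct names, counts, state labels (`met_low`, `met_high`), and the `log2_activity` helper are all hypothetical, library-size normalization is left out for brevity, and none of this uses the CREsted API:

```python
import math


def log2_activity(rna_counts, dna_counts, pseudocount=1.0):
    """Return log2((RNA + pc) / (DNA + pc)) per cell state for one construct."""
    return {
        state: math.log2(
            (rna_counts[state] + pseudocount) / (dna_counts[state] + pseudocount)
        )
        for state in rna_counts
    }


# Hypothetical raw read counts per construct and cell state.
constructs = [
    {
        "name": "synthetic_001",
        "is_anchor": False,  # model-generated enhancer
        "rna": {"met_low": 15, "met_high": 127},
        "dna": {"met_low": 31, "met_high": 31},
    },
    {
        "name": "endogenous_peak_A",
        "is_anchor": True,  # sequence also present in the ATAC-seq training set
        "rna": {"met_low": 63, "met_high": 63},
        "dna": {"met_low": 15, "met_high": 15},
    },
]

# One row per construct: name, anchor flag, and one activity column per cell state.
table = [
    {
        "name": c["name"],
        "is_anchor": c["is_anchor"],
        **log2_activity(c["rna"], c["dna"]),
    }
    for c in constructs
]

for row in table:
    print(row)
```

The idea would be one row per sequence and one target column per cell state, mirroring the regions-by-classes layout of the accessibility targets, with the anchor flag available for monitoring how the shared endogenous sequences behave across fine-tuning rounds.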
Thanks for your time and the great tool!