spcoral.model.integrate_model_block
- class spcoral.model.integrate_model_block(clip_results, is_norm=False, hidden_dim=128, latent_dim=64, device=device(type='cuda', index=0), random_seed=2020, strict_repro=False, learning_rate=0.001, weight_decay=0.0001, loss_weight=None, epochs=300, gradient_clipping=5.0, edge_loss=False)
Bases: object
Block-wise cross-modal spatial omics integration model using graph autoencoders.
This class enables scalable integration of large multi-modal spatial datasets by:
1. Dividing the overlapping tissue region into grid blocks (from clipping_patch).
2. Preprocessing each block in parallel to build multiple graphs and tensors.
3. Training a single shared CrossModalGAE model across all blocks with memory-efficient per-block updates.
4. Supporting optional edge consistency loss between adjacent blocks for smoother global alignment.
- Parameters:
clip_results (dict) – Output dictionary from clipping_patch containing:
- ‘x_clip’, ‘y_clip’, ‘x_num’, ‘y_num’, ‘x_retain’, ‘y_retain’
- ‘adata_omics1_clip_dict’, ‘adata_omics2_clip_dict’ (block AnnData objects)
- ‘feature_omics1’, ‘feature_omics2’ (input feature dimensions)
is_norm (bool, optional (default: False)) – Whether to apply normalization in the model (passed to CrossModalGAE).
hidden_dim (int, optional (default: 128)) – Hidden dimension in the GAE encoder/decoder layers.
latent_dim (int, optional (default: 64)) – Dimension of the final joint latent embedding.
device (torch.device, optional (default: torch.device('cuda:0'))) – Device used for training.
random_seed (int, optional (default: 2020)) – Random seed for reproducibility.
strict_repro (bool, optional (default: False)) – If True, enforces strict deterministic behavior (e.g., CUDA determinism).
learning_rate (float, optional (default: 0.001)) – Optimizer learning rate.
weight_decay (float, optional (default: 0.0001)) – Weight decay for Adam optimizer.
loss_weight (list of float, optional) – Custom weights for loss components. If None, defaults to:
- [1, 1, 0.5, 0.5, 1] when edge_loss=False
- [1, 1, 0.5, 0.5, 1, 1] when edge_loss=True (last weight for overlap loss)
epochs (int, optional (default: 300)) – Number of training epochs.
gradient_clipping (float, optional (default: 5.0)) – Maximum gradient norm for clipping.
edge_loss (bool, optional (default: False)) – If True, adds consistency loss between overlapping edges of adjacent blocks.
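The documented defaults for loss_weight can be summarized in a few lines. The following is a minimal sketch only: the helper name resolve_loss_weight is hypothetical and is not part of spcoral, and the real class may resolve or validate the weights differently.

```python
def resolve_loss_weight(loss_weight=None, edge_loss=False):
    """Return the loss weights described in the parameter list above.

    Hypothetical helper for illustration; not a spcoral function.
    """
    if loss_weight is not None:
        # A user-supplied list is passed through unchanged.
        return list(loss_weight)
    base = [1, 1, 0.5, 0.5, 1]
    # With edge_loss=True a sixth weight (for the overlap loss) is appended.
    return base + [1] if edge_loss else base
```

Under these assumptions, a custom list passed with edge_loss=True would need six entries, the last one weighting the overlap loss.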
Notes
Designed to handle very large datasets that cannot fit into GPU memory as a whole.
Requires external helper functions: process_block, preprogress_adata, adata_to_dgl, build_graph_feature, create_snn_adjacency_matrix.
- __init__(clip_results, is_norm=False, hidden_dim=128, latent_dim=64, device=device(type='cuda', index=0), random_seed=2020, strict_repro=False, learning_rate=0.001, weight_decay=0.0001, loss_weight=None, epochs=300, gradient_clipping=5.0, edge_loss=False)
Methods
__init__(clip_results[, is_norm, ...])
map_results_to_adata([embedding_key, ...]) – Map block-wise model outputs back to full-resolution integrated AnnData objects.
preprocess(graph_method_single[, ...]) – Preprocess data blocks in parallel, with optional user-specified number of processes.
train() – Train the shared CrossModalGAE model across all preprocessed spatial blocks.
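The method descriptions suggest a call order of preprocess, then train, then map_results_to_adata. The sketch below illustrates that assumed order. The wrapper name run_blockwise_integration is hypothetical; the model class is passed in as an argument so the snippet does not depend on how spcoral is installed, and the return value is assumed to be the pair of AnnData objects described for map_results_to_adata.

```python
def run_blockwise_integration(model_cls, clip_results,
                              graph_method_single="knn", **init_kwargs):
    """Construct the model, then preprocess, train, and map results back.

    Hypothetical wrapper; model_cls is expected to be
    spcoral.model.integrate_model_block or a compatible class.
    """
    # clip_results is the dictionary produced by clipping_patch.
    model = model_cls(clip_results, **init_kwargs)
    # Build the per-block graphs and tensors ('knn' is the documented example;
    # other spatial-graph arguments may be needed depending on the method).
    model.preprocess(graph_method_single)
    # Train the single shared model across all preprocessed blocks.
    model.train()
    # Write outputs into .obsm under the documented default keys.
    return model.map_results_to_adata(
        embedding_key="emb_spcoral",
        rec_key="rec_spcoral",
        cross_key="cross_spcoral",
    )
```

With spcoral installed, this could be called as run_blockwise_integration(integrate_model_block, clip_results, epochs=300), where any extra keyword arguments are forwarded to the constructor.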
- preprocess(graph_method_single, k_spatial_omics1=None, radius_spatial_omics1=None, k_spatial_omics2=None, radius_spatial_omics2=None, use_obsm='spatial', g_all_auto=True, k_feature_omics1=10, k_feature_omics2=10, k_cross_omics=20, k_all_omics=25, num_processes=None)
Preprocess data blocks in parallel, with optional user-specified number of processes.
Parameters:
- graph_method_single : str
Method for constructing single-omics graph (e.g., ‘knn’).
- k_spatial_omics1 : int, optional
Number of neighbors for omics1 spatial graph.
- radius_spatial_omics1 : float, optional
Radius for omics1 spatial graph.
- k_spatial_omics2 : int, optional
Number of neighbors for omics2 spatial graph.
- radius_spatial_omics2 : float, optional
Radius for omics2 spatial graph.
- use_obsm : str, optional
Key in adata.obsm for spatial coordinates (default: ‘spatial’).
- k_feature_omics1 : int, optional
Number of neighbors for omics1 feature graph (default: 10).
- k_feature_omics2 : int, optional
Number of neighbors for omics2 feature graph (default: 10).
- num_processes : int, optional
Number of processes for parallel processing. If None, uses min(cpu_count(), task_count).
- Parameters:
graph_method_single (str) –
k_spatial_omics1 (int | None) –
radius_spatial_omics1 (float | None) –
k_spatial_omics2 (int | None) –
radius_spatial_omics2 (float | None) –
use_obsm (str) –
g_all_auto (bool) –
k_feature_omics1 (int) –
k_feature_omics2 (int) –
k_cross_omics (int) –
k_all_omics (int) –
num_processes (int | None) –
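The default for num_processes is described as min(cpu_count(), task_count). A minimal sketch of that rule follows, assuming task_count is the number of blocks to preprocess; the helper name resolve_num_processes is hypothetical and not part of spcoral.

```python
from multiprocessing import cpu_count


def resolve_num_processes(num_processes, task_count):
    """Pick the worker count as described for the num_processes parameter.

    Hypothetical helper for illustration; not a spcoral function.
    """
    if num_processes is None:
        # Never start more workers than there are CPUs or tasks.
        return min(cpu_count(), task_count)
    # An explicit value is used as given.
    return num_processes
```

Passing a small explicit value can be useful when each block is large, since every worker process holds its own block in memory.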
- train()
Train the shared CrossModalGAE model across all preprocessed spatial blocks.
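The general pattern of one shared model updated one block at a time, with gradient-norm clipping, can be illustrated with a toy example. This is not the CrossModalGAE training code: the sketch fits a single shared slope w (y ≈ w * x) in pure Python, and the function names are hypothetical. Only the parameter names epochs, learning_rate, and gradient_clipping mirror the constructor arguments.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their global L2 norm does not exceed max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm or total == 0.0:
        return list(grads)
    scale = max_norm / total
    return [g * scale for g in grads]


def train_shared_across_blocks(blocks, epochs=300, learning_rate=0.001,
                               gradient_clipping=5.0):
    """Toy stand-in: fit one shared slope w by visiting blocks in turn."""
    w = 0.0  # the single parameter shared by every block
    for _ in range(epochs):
        for xs, ys in blocks:
            # Mean-squared-error gradient computed from this block only,
            # so just one block needs to be in memory at a time.
            grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
            (grad,) = clip_by_global_norm([grad], gradient_clipping)
            w -= learning_rate * grad
    return w
```

Because each update touches a single block, peak memory in this pattern scales with the largest block rather than with the full dataset.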
- map_results_to_adata(embedding_key='emb_spcoral', rec_key='rec_spcoral', cross_key='cross_spcoral')
Map block-wise model outputs back to full-resolution integrated AnnData objects.
- Parameters:
embedding_key (str, optional (default: 'emb_spcoral')) – Key under which the joint latent embedding will be stored in .obsm.
rec_key (str, optional (default: 'rec_spcoral')) – Key for reconstructed modality-specific features in .obsm.
cross_key (str, optional (default: 'cross_spcoral')) – Key for cross-modality predicted features in .obsm:
- For omics1: predicted omics2 features from omics1 embedding
- For omics2: predicted omics1 features from omics2 embedding
- Returns:
Full integrated AnnData for omics1 with added .obsm layers.
Full integrated AnnData for omics2 with added .obsm layers.
- Return type: