spcoral.model.integrate_model_block

class spcoral.model.integrate_model_block(clip_results, is_norm=False, hidden_dim=128, latent_dim=64, device=device(type='cuda', index=0), random_seed=2020, strict_repro=False, learning_rate=0.001, weight_decay=0.0001, loss_weight=None, epochs=300, gradient_clipping=5.0, edge_loss=False)

Bases: object

Block-wise cross-modal spatial omics integration model using graph autoencoders.

This class enables scalable integration of large multi-modal spatial datasets by:

  1. Dividing the overlapping tissue region into grid blocks (from clipping_patch).
  2. Preprocessing each block in parallel to build multiple graphs and tensors.
  3. Training a single shared CrossModalGAE model across all blocks with memory-efficient per-block updates.
  4. Optionally applying an edge consistency loss between adjacent blocks for smoother global alignment.
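The grid division in step 1 can be pictured with a small, library-independent sketch. It is not the spCoral implementation: `assign_blocks`, its arguments, and the block indexing are hypothetical, and clipping_patch may define block boundaries and overlaps differently.

```python
from collections import defaultdict


def assign_blocks(coords, x_num, y_num):
    """Group 2D points into an x_num-by-y_num grid over their bounding box.

    Hypothetical helper for illustration only; it ignores any overlap
    between neighboring blocks.
    """
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Fall back to a step of 1.0 when the extent is zero to avoid dividing by 0.
    x_step = (x_max - x_min) / x_num or 1.0
    y_step = (y_max - y_min) / y_num or 1.0

    blocks = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        # Clamp so points on the upper boundary fall into the last block.
        i = min(int((x - x_min) / x_step), x_num - 1)
        j = min(int((y - y_min) / y_step), y_num - 1)
        blocks[(i, j)].append(idx)
    return dict(blocks)


coords = [(0.0, 0.0), (1.0, 9.0), (9.0, 1.0), (10.0, 10.0)]
blocks = assign_blocks(coords, x_num=2, y_num=2)
```

Each key is a block index `(i, j)` and each value lists the indices of the points that fall into that block, so every block can then be preprocessed and trained on independently.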

Parameters:
  • clip_results (dict) – Output dictionary from clipping_patch containing:
      - 'x_clip', 'y_clip', 'x_num', 'y_num', 'x_retain', 'y_retain'
      - 'adata_omics1_clip_dict', 'adata_omics2_clip_dict' (block AnnData objects)
      - 'feature_omics1', 'feature_omics2' (input feature dimensions)

  • is_norm (bool, optional (default: False)) – Whether to apply normalization in the model (passed to CrossModalGAE).

  • hidden_dim (int, optional (default: 128)) – Hidden dimension in the GAE encoder/decoder layers.

  • latent_dim (int, optional (default: 64)) – Dimension of the final joint latent embedding.

  • device (torch.device, optional (default: torch.device('cuda:0'))) – Device used for training.

  • random_seed (int, optional (default: 2020)) – Random seed for reproducibility.

  • strict_repro (bool, optional (default: False)) – If True, enforces strict deterministic behavior (e.g., CUDA determinism).

  • learning_rate (float, optional (default: 0.001)) – Optimizer learning rate.

  • weight_decay (float, optional (default: 0.0001)) – Weight decay for Adam optimizer.

  • loss_weight (list of float, optional) – Custom weights for loss components. If None, defaults to:
      - [1, 1, 0.5, 0.5, 1] when edge_loss=False
      - [1, 1, 0.5, 0.5, 1, 1] when edge_loss=True (the last weight applies to the overlap loss)

  • epochs (int, optional (default: 300)) – Number of training epochs.

  • gradient_clipping (float, optional (default: 5.0)) – Maximum gradient norm for clipping.

  • edge_loss (bool, optional (default: False)) – If True, adds consistency loss between overlapping edges of adjacent blocks.
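The interaction between loss_weight and edge_loss described above can be sketched as follows. `resolve_loss_weight` is a hypothetical helper, not part of spCoral; the default values come from the parameter description, while the length check is an added assumption.

```python
def resolve_loss_weight(loss_weight=None, edge_loss=False):
    """Return the loss weights, filling in the documented defaults.

    Hypothetical helper for illustration; the library may validate
    user-supplied weights differently or not at all.
    """
    if loss_weight is None:
        weights = [1, 1, 0.5, 0.5, 1]
        if edge_loss:
            weights.append(1)  # extra weight for the overlap loss
        return weights
    # Assumption: one weight per loss component, six when edge_loss is on.
    expected = 6 if edge_loss else 5
    if len(loss_weight) != expected:
        raise ValueError(f"expected {expected} weights, got {len(loss_weight)}")
    return list(loss_weight)
```

Under this reading, a custom list passed together with edge_loss=True needs a sixth entry for the overlap term.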

Notes

  • Designed to handle very large datasets that cannot fit into GPU memory as a whole.

  • Requires external helper functions: process_block, preprogress_adata, adata_to_dgl, build_graph_feature, create_snn_adjacency_matrix.

__init__(clip_results, is_norm=False, hidden_dim=128, latent_dim=64, device=device(type='cuda', index=0), random_seed=2020, strict_repro=False, learning_rate=0.001, weight_decay=0.0001, loss_weight=None, epochs=300, gradient_clipping=5.0, edge_loss=False)
Parameters:
  • clip_results (Dict) –

  • is_norm (bool) –

  • hidden_dim (int) –

  • latent_dim (int) –

  • device (device) –

  • random_seed (int) –

  • strict_repro (bool) –

  • learning_rate (float) –

  • weight_decay (float) –

  • loss_weight (List[float] | None) –

  • epochs (int) –

  • gradient_clipping (float) –

  • edge_loss (bool) –

Methods

__init__(clip_results[, is_norm, ...])

map_results_to_adata([embedding_key, ...])

Map block-wise model outputs back to full-resolution integrated AnnData objects.

preprocess(graph_method_single[, ...])

Preprocess data blocks in parallel, with optional user-specified number of processes.

train()

Train the shared CrossModalGAE model across all preprocessed spatial blocks.
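The three public methods are naturally called in sequence. The sketch below assumes `model` is an already constructed integrate_model_block; `run_block_integration` is a hypothetical wrapper, and `k_spatial=6` is an arbitrary illustrative value rather than a recommended setting.

```python
def run_block_integration(model, graph_method_single="knn", k_spatial=6):
    """Run preprocess -> train -> map_results_to_adata on a model object.

    Hypothetical convenience wrapper; only the method names and keyword
    arguments are taken from this page.
    """
    # Build the per-block graphs and tensors.
    model.preprocess(
        graph_method_single,
        k_spatial_omics1=k_spatial,
        k_spatial_omics2=k_spatial,
    )
    # Train the shared model; returns the per-epoch loss history.
    loss_history = model.train()
    # Stitch the block-wise outputs back into two full AnnData objects.
    adata_omics1, adata_omics2 = model.map_results_to_adata(
        embedding_key="emb_spcoral"
    )
    return loss_history, adata_omics1, adata_omics2
```

Because the wrapper only relies on the three method names, it can also be exercised with a lightweight stand-in object when testing a pipeline without a GPU.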

preprocess(graph_method_single, k_spatial_omics1=None, radius_spatial_omics1=None, k_spatial_omics2=None, radius_spatial_omics2=None, use_obsm='spatial', g_all_auto=True, k_feature_omics1=10, k_feature_omics2=10, k_cross_omics=20, k_all_omics=25, num_processes=None)

Preprocess data blocks in parallel, with optional user-specified number of processes.

Parameters:
  • graph_method_single (str) – Method for constructing the single-omics graph (e.g., 'knn').

  • k_spatial_omics1 (int | None, optional) – Number of neighbors for the omics1 spatial graph.

  • radius_spatial_omics1 (float | None, optional) – Radius for the omics1 spatial graph.

  • k_spatial_omics2 (int | None, optional) – Number of neighbors for the omics2 spatial graph.

  • radius_spatial_omics2 (float | None, optional) – Radius for the omics2 spatial graph.

  • use_obsm (str, optional (default: 'spatial')) – Key in adata.obsm for spatial coordinates.

  • g_all_auto (bool, optional (default: True)) –

  • k_feature_omics1 (int, optional (default: 10)) – Number of neighbors for the omics1 feature graph.

  • k_feature_omics2 (int, optional (default: 10)) – Number of neighbors for the omics2 feature graph.

  • k_cross_omics (int, optional (default: 20)) –

  • k_all_omics (int, optional (default: 25)) –

  • num_processes (int | None, optional) – Number of processes for parallel processing. If None, uses min(cpu_count(), task_count).
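The default for num_processes can be read as the following rule. `resolve_num_processes` is a hypothetical helper written for illustration; the `max(1, ...)` guard and the fallback for an undetectable CPU count are added assumptions, not documented behavior.

```python
import os


def resolve_num_processes(num_processes, task_count):
    """Pick a worker count following the documented rule for num_processes=None."""
    if num_processes is None:
        # os.cpu_count() can return None; fall back to a single process.
        cpus = os.cpu_count() or 1
        return max(1, min(cpus, task_count))
    return num_processes
```

With this rule, a dataset split into only a few blocks never spawns more workers than there are blocks to preprocess.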

train()

Train the shared CrossModalGAE model across all preprocessed spatial blocks.

Returns:

Training loss history with one entry per epoch. Each entry contains:

  • Total loss
  • Reconstruction loss (omics1 + omics2)
  • Cross-prediction loss
  • Spatial graph reconstruction loss
  • Overlap consistency loss (only if edge_loss=True)

Return type:

list of list of float
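A returned history is easier to inspect once each value is labeled. The sketch below assumes one value per listed component, in the order given above; that ordering is an assumption, and `LOSS_NAMES` and `label_loss_history` are hypothetical names. The numbers are placeholders, not real training output.

```python
LOSS_NAMES = [
    "total",
    "reconstruction",
    "cross_prediction",
    "spatial_graph",
    "overlap",  # only present when edge_loss=True
]


def label_loss_history(history):
    """Turn each per-epoch list of floats into a dict keyed by loss name.

    zip() stops at the shorter input, so "overlap" is simply omitted
    when an entry has only four values.
    """
    return [dict(zip(LOSS_NAMES, entry)) for entry in history]


# Placeholder values for illustration only.
history = [[2.0, 1.0, 0.6, 0.4], [1.5, 0.8, 0.4, 0.3]]
labeled = label_loss_history(history)
final_total = labeled[-1]["total"]
```

Before relying on the labels, it is worth checking the length of one real entry against the list of names.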

map_results_to_adata(embedding_key='emb_spcoral', rec_key='rec_spcoral', cross_key='cross_spcoral')

Map block-wise model outputs back to full-resolution integrated AnnData objects.

Parameters:
  • embedding_key (str, optional (default: 'emb_spcoral')) – Key under which the joint latent embedding will be stored in .obsm.

  • rec_key (str, optional (default: 'rec_spcoral')) – Key for reconstructed modality-specific features in .obsm.

  • cross_key (str, optional (default: 'cross_spcoral')) – Key for cross-modality predicted features in .obsm:
      - For omics1: predicted omics2 features from the omics1 embedding
      - For omics2: predicted omics1 features from the omics2 embedding

Returns:

  • Full integrated AnnData for omics1 with the results added to .obsm.

  • Full integrated AnnData for omics2 with the results added to .obsm.

Return type:

tuple of (anndata.AnnData, anndata.AnnData)