Time-Aware Contrastive Transformer for Longitudinal Patient Representation Learning

Learning high-quality longitudinal patient representations from irregular electronic health records (EHRs) is essential for understanding heterogeneity in time-evolving diseases such as cancer. Existing longitudinal patient representation learning methods often rely on external labels for downstream tasks or do not explicitly model the temporal dynamics between medical events, which limits the clinical applicability of the learned disease trajectories. In this work, we propose the Time-Aware Contrastive Transformer (TACT), a transformer-based model that integrates explicit temporal modeling with a fully self-supervised contrastive learning framework. We introduce a sampling-based data augmentation workflow that leverages hierarchical taxonomies of diagnoses and medications to enrich representation learning. Evaluated on a large real-world dataset, TACT demonstrates robust performance across patient representation and event embedding metrics and outperforms two time-aware transformer comparison models. Unlike the comparison models, TACT combines contrastive learning with medical hierarchies, allowing it to track disease trajectories and discover clinically actionable patient phenotypes. Consequently, this approach establishes a framework for characterizing patient heterogeneity by identifying potentially clinically meaningful subgroups with distinct progression profiles.
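To make the idea of hierarchy-guided, sampling-based augmentation concrete, the following is a minimal illustrative sketch and not the TACT implementation. It assumes that each medical code has a parent category in a taxonomy and that an augmented view is produced by randomly swapping a code for a sibling under the same parent while leaving timestamps unchanged. The toy taxonomy `PARENT`, the helpers `build_siblings` and `augment`, and the swap probability `p_swap` are all hypothetical names introduced here for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy taxonomy: child code -> parent category.
# The ICD-10-style strings serve only as illustrative labels.
PARENT = {
    "C50.1": "C50",
    "C50.2": "C50",
    "C50.9": "C50",
    "E11.2": "E11",
    "E11.9": "E11",
}


def build_siblings(parent_of):
    """Map each code to the other codes that share its parent."""
    children = defaultdict(list)
    for code, parent in parent_of.items():
        children[parent].append(code)
    return {
        code: [c for c in children[parent] if c != code]
        for code, parent in parent_of.items()
    }


def augment(events, siblings, p_swap, rng):
    """Return one augmented view of a patient sequence.

    events: list of (time_in_days, code) tuples.
    Each code is replaced by a random sibling with probability p_swap
    (an assumed hyperparameter); timestamps are kept as they are.
    """
    view = []
    for t, code in events:
        options = siblings.get(code, [])
        if options and rng.random() < p_swap:
            code = rng.choice(options)
        view.append((t, code))
    return view


SIBLINGS = build_siblings(PARENT)
patient = [(0, "C50.1"), (14, "E11.9"), (90, "C50.9")]

# Two independently sampled views of the same patient would form a
# positive pair in a contrastive objective (assumption of this sketch).
rng = random.Random(0)
view_a = augment(patient, SIBLINGS, 0.5, rng)
view_b = augment(patient, SIBLINGS, 0.5, rng)
```

In this sketch, every augmented view keeps the original event times and stays within the same parent category, so the perturbation changes the specific code but not its coarse clinical meaning or its position on the timeline.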