Aim:
To develop a fraud-detection system that learns patterns directly from transaction data without relying on labels. The system captures differences between legitimate and fraudulent activity using an unsupervised representation-learning approach. It improves fraud-detection accuracy by training simple classifiers on these learned representations instead of raw features.
Abstract:
This project presents a fraud-detection framework that teaches itself to understand transaction characteristics by analysing them in multiple overlapping parts. The system discovers hidden patterns by reconstructing and comparing different views of the same data, creating rich internal representations. These learned representations enable even simple classification models to identify fraud more effectively than traditional handcrafted approaches.
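The idea of "multiple overlapping parts" can be illustrated with a minimal sketch that splits the feature columns of a transaction table into contiguous, overlapping blocks, one per view. The function name `overlapping_subsets` and the parameters `n_subsets` and `overlap` are hypothetical and chosen for illustration; the project's actual slicing scheme may differ.

```python
def overlapping_subsets(n_features, n_subsets=4, overlap=0.75):
    """Return one list of column indices per view (hypothetical helper).

    Columns are cut into `n_subsets` contiguous blocks. Each block then
    borrows `overlap * block_size` columns from a neighbouring block, so
    adjacent views share part of their input.
    """
    if n_subsets < 1 or n_features < n_subsets:
        raise ValueError("need at least one column per subset")
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must lie in [0, 1]")

    base = n_features // n_subsets   # columns per block before overlap
    extra = int(base * overlap)      # columns borrowed from a neighbour

    subsets = []
    for i in range(n_subsets):
        start = i * base
        # The last block absorbs any remainder columns.
        stop = n_features if i == n_subsets - 1 else start + base
        if i == 0:
            # The first block has no left neighbour, so it extends right.
            cols = range(start, min(stop + extra, n_features))
        else:
            # Every other block extends left into the previous block.
            cols = range(start - extra, stop)
        subsets.append(list(cols))
    return subsets


if __name__ == "__main__":
    for view, cols in enumerate(overlapping_subsets(12, n_subsets=3, overlap=0.5)):
        print(view, cols)
```

Each index list would select the columns fed to the shared encoder for that view; a larger `overlap` gives neighbouring views more features in common, which makes agreement between their representations easier to enforce.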
Proposed System:
The proposed system extends SubTab with additional multi-view consistency and richer representation fusion. It introduces cross-view latent alignment and full cross-reconstruction to strengthen agreement between views. It adds k-NN smoothing on the learned embeddings to reduce noise and improve downstream classifier stability. Enforcing synchronized learning across views makes the representations more robust and helps keep latent distributions stable even when the input space contains high variability or complex relationships. The encoder architecture uses larger latent dimensions, giving it greater representational capacity. Downstream evaluation uses Logistic Regression, MLP, and XGBoost on the smoothed latent space. Ranking metrics (Precision@K, Recall@K) are included to reflect realistic fraud-risk prioritization. The workflow is more flexible, modular, and extensible than the original SubTab implementation.
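As an illustration of the k-NN smoothing step, the sketch below blends each embedding with the mean of its k nearest neighbours in latent space. It assumes Euclidean distance and a mixing weight `alpha`; both are assumptions, as are the function name `knn_smooth` and its defaults, since the description does not specify them.

```python
import numpy as np


def knn_smooth(embeddings, k=5, alpha=0.5):
    """Blend each row with the mean of its k nearest neighbours.

    Hypothetical helper: alpha = 1.0 keeps the original embeddings,
    alpha = 0.0 replaces each row by its neighbourhood mean.
    """
    z = np.asarray(embeddings, dtype=float)
    n = z.shape[0]
    k = min(k, n - 1)
    if k < 1:
        return z.copy()

    # Pairwise squared Euclidean distances, shape (n, n).
    sq = np.sum(z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (z @ z.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour

    # Indices of the k smallest distances per row (unordered within the k).
    nn = np.argpartition(d2, k - 1, axis=1)[:, :k]
    neighbour_mean = z[nn].mean(axis=1)  # shape (n, latent_dim)

    return alpha * z + (1.0 - alpha) * neighbour_mean


if __name__ == "__main__":
    latent = np.array([[0, 0], [0, 1], [1, 0],
                       [10, 10], [10, 11], [11, 10]], dtype=float)
    print(knn_smooth(latent, k=2, alpha=0.5))
```

The dense distance matrix needs memory quadratic in the number of transactions, so on a large dataset a nearest-neighbour index (for example scikit-learn's `NearestNeighbors`) would be the more practical choice. The smoothed array can then be passed to the downstream classifiers in place of the raw embeddings.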
Advantages:
- Enhanced multi-view alignment — latent consistency loss encourages unified representations across subsets.
- Stronger cross-reconstruction — leveraging all view pairs improves global feature understanding.
- Noise-robust embeddings — k-NN smoothing stabilizes decision boundaries and ranking quality.
- More expressive representation fusion — aggregated embeddings across views give richer downstream signals.
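Ranking quality, as measured by Precision@K and Recall@K, can be computed as in the following minimal sketch. It assumes binary labels (1 = fraud) and one risk score per transaction; the function name `precision_recall_at_k` is hypothetical.

```python
def precision_recall_at_k(y_true, scores, k):
    """Precision@K and Recall@K for binary labels (1 = fraud).

    Transactions are ranked by descending score and only the top k
    are treated as flagged for review.
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    if k <= 0 or not scores:
        raise ValueError("k must be positive and scores non-empty")

    # Indices sorted from highest to lowest risk score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = order[:k]

    hits = sum(y_true[i] for i in top)   # frauds among the flagged items
    total_fraud = sum(y_true)            # frauds in the whole set

    precision = hits / len(top)
    recall = hits / total_fraud if total_fraud else 0.0
    return precision, recall


if __name__ == "__main__":
    labels = [0, 1, 0, 1, 0, 0, 1, 0]
    risk = [0.10, 0.90, 0.80, 0.70, 0.20, 0.30, 0.40, 0.05]
    print(precision_recall_at_k(labels, risk, k=3))
```

Precision@K answers "of the K transactions an analyst reviews, how many are fraud", while Recall@K answers "what share of all fraud falls inside those K", which is why both suit a setting where review capacity is limited.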





