Project Log 4
Tensors, Transfer Learning, and Apple Silicon
In Log 3, I mapped out the problem: processing 3GB .svs Whole Slide Images to predict melanoma organ tropism. The reality? Trying to build the complex tiling logic and the neural network architecture at the exact same time on a local machine is a recipe for a crashed Mac and zero progress.
It was time for a strategic pivot.
The MVP Pivot: SPIDER-Skin
Instead of wrestling gigapixel monsters, I bypassed the tiling phase entirely to unblock the MVP. I wrote a Python script to authenticate with the Hugging Face API and surgically extract a small sample of 50 pre-cut, 1120x1120 tissue patches from the SPIDER-Skin dataset. This gave me the exact building blocks a Convolutional Neural Network (CNN) needs, without melting my CPU.
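The extraction step can be sketched as a small helper that drains the first N patches from any iterable of PIL images. The Hugging Face repo id and column name shown in the comment are assumptions (check the dataset card); the function itself works with any image source:

```python
import itertools
from pathlib import Path

from PIL import Image


def save_patches(patches, out_dir, limit=50):
    """Save the first `limit` PIL images from an iterable as numbered PNGs."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, img in enumerate(itertools.islice(patches, limit)):
        path = out / f"patch_{i:03d}.png"
        img.save(path)
        saved.append(path)
    return saved


# In the real script the iterable would come from the Hub via streaming,
# so the full dataset never has to be downloaded (repo id is an assumption):
#   from datasets import load_dataset
#   ds = load_dataset("histai/SPIDER-skin", split="train", streaming=True)
#   save_patches((row["image"] for row in ds), "data/patches", limit=50)
```

Streaming matters here: it pulls 50 patches over the wire instead of materialising the whole dataset on disk.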
Building the Brain: PyTorch & ResNet18
With the data secured locally, I built the deep learning pipeline:
- The DataLoader: A custom PyTorch `Dataset` class that ingests the patches, resizes them to the industry-standard 224x224, converts them into mathematical Tensors, and applies ImageNet colour normalisation.
- Transfer Learning: I imported `ResNet18`. Instead of training a model from scratch to recognise basic shapes, I took a model already pre-trained on ImageNet's 1.2 million images, stripped its final classification layer, and wired in a custom, untrained layer designed to output a binary classification (Primary vs. Metastatic).
Hitting the MPS Core
The most satisfying part of this sprint wasn’t just getting the math to compile. It was wiring the train.py script to detect and utilise Apple Silicon’s Metal Performance Shaders (MPS).
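The device-detection logic is a few lines; this is a minimal sketch of how `train.py` might pick the backend, falling back gracefully when MPS isn't present:

```python
import torch


def pick_device():
    """Prefer Apple Silicon's Metal backend, then CUDA, then CPU."""
    if torch.backends.mps.is_available():
        return torch.device("mps")
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")
```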
Watching the training loop offload the matrix multiplication to the Mac’s GPU, running a CrossEntropyLoss function with an Adam optimiser, and seeing the loss metric steadily drop was a massive milestone. I closed the loop by writing an inference script that loads the serialised .pth weights, applies a Softmax function to the logits, and spits out a human-readable diagnostic confidence score.
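A rough sketch of that loop and the inference step, under my own function names (the real `train.py` and inference script may be structured differently). Loading the serialised weights would be the usual `model.load_state_dict(torch.load("model.pth"))` before calling `predict`:

```python
import torch
import torch.nn as nn


def train_one_epoch(model, loader, device, lr=1e-3):
    """One pass over the data with CrossEntropyLoss + Adam; returns mean loss."""
    model.to(device).train()
    criterion = nn.CrossEntropyLoss()
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    total, batches = 0.0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimiser.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()   # matrix maths runs on the MPS device when available
        optimiser.step()
        total += loss.item()
        batches += 1
    return total / max(batches, 1)


@torch.no_grad()
def predict(model, image_tensor, device):
    """Softmax over the logits -> (predicted class, confidence score)."""
    model.to(device).eval()
    logits = model(image_tensor.unsqueeze(0).to(device))
    probs = torch.softmax(logits, dim=1)[0]
    conf, cls = torch.max(probs, dim=0)
    return int(cls), float(conf)
```

The softmax is what turns raw logits into the human-readable confidence score: the two outputs sum to 1, so the winning class's probability reads directly as "the model is X% confident".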
What’s Next?
The MVP architecture is complete. The data flows, the model learns, and the inference engine runs.
But right now, the model is a toy: it was trained on 50 images with dummy labels purely to prove out the architecture. The next phase is the real data science work: parsing the actual clinical metadata to map true Primary/Metastatic labels, scaling the training loop up to thousands of images, and moving the compute off my Mac into a cloud GPU environment.