How to visualize feature maps for an SSD model

I’m developing a Single Shot MultiBox Detector (SSD) model in PyTorch with MobileNet as the backbone. I’d like to visualize the feature maps to help with debugging and to confirm that the network is picking up what I need it to see. I have a loaded model that I know works (I tested it on real predictions), and I am trying to use TorchCAM to visualize the last convolutional layer of the SSD backbone.

This is the relevant part of the forward pass:

        for layer in self.base_net[end_layer_index:]:
            x = layer(x)

        # Keep the backbone output so it can be returned for visualization
        feature_map = x

        for layer in self.extras:
            x = layer(x)
            confidence, location = self.compute_header(header_index, x)
            header_index += 1
            confidences.append(confidence)
            locations.append(location)

        # Concatenate the per-layer predictions along the box dimension
        confidences = torch.cat(confidences, 1)
        locations = torch.cat(locations, 1)

        return confidences, locations, feature_map

I extract the feature map after the input passes through the base network (backbone). In a separate script, I then try to create a class activation heatmap like so:

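    # Imports as in the TorchCAM README usage
    from torchcam.methods import SmoothGradCAMpp
    from torchcam.utils import overlay_mask
    from torchvision.io import read_image
    from torchvision.transforms.functional import to_pil_image
    # create_mobilenetv1_ssd comes from the SSD codebase the model is built on

    model = create_mobilenetv1_ssd(21, is_test=True)
    # Load weights and set to eval mode
    # (raw string so the backslash in the Windows path is not read as an escape)
    model.load(r'models\mobilenet-v1-ssd-mp-0_675.pth')
    model.eval()

    img = read_image(img_path)
    img = to_pil_image(img)

    input_image = transform(img)

    with SmoothGradCAMpp(model, target_layer=model.base_net[-1]) as cam_extractor:
        scores, boxes, feature_map = model(input_image.unsqueeze(0))
        class_idx = scores[0][0].squeeze(0).argmax().item()
        activation_map = cam_extractor(class_idx, feature_map)

    result = overlay_mask(img, to_pil_image(activation_map[0].squeeze(0), mode='F'), alpha=0.5)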

However, the result is empty and there are no highlighted regions. I have tried a few different layers and none of them yielded any visualization.
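As a sanity check that does not depend on TorchCAM, the raw activations of the backbone can be inspected directly. The following is only a minimal sketch of that idea: `backbone` is a tiny stand-in for `model.base_net` (not the real MobileNet), `image` is a random placeholder for the transformed input, and `save_output` is a hypothetical helper name. It captures the output of the last backbone layer with a forward hook, averages over channels, and rescales the result into a heatmap at the input resolution.

```python
import torch
import torch.nn as nn

# Stand-in for model.base_net: NOT the real MobileNet backbone, just a small
# conv stack so the sketch runs without any weights.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
)
backbone.eval()

captured = {}

def save_output(module, inputs, output):
    # Forward hook: store the layer output without changing forward()
    captured["feature_map"] = output.detach()

handle = backbone[-1].register_forward_hook(save_output)

torch.manual_seed(0)
image = torch.rand(1, 3, 64, 64)  # placeholder for transform(img).unsqueeze(0)
with torch.no_grad():
    backbone(image)
handle.remove()

fmap = captured["feature_map"]            # shape (1, C, H, W)
heatmap = fmap.mean(dim=1, keepdim=True)  # average over channels -> (1, 1, H, W)

# Min-max normalize to [0, 1]; the epsilon avoids division by zero
heatmap = (heatmap - heatmap.min()) / (heatmap.max() - heatmap.min() + 1e-8)

# Upsample to the input resolution so it can be overlaid on the image
heatmap = nn.functional.interpolate(
    heatmap, size=image.shape[-2:], mode="bilinear", align_corners=False
)[0, 0]
print(tuple(fmap.shape), tuple(heatmap.shape))
```

With the real model, the hook would be registered on `model.base_net[-1]` instead, and the resulting 2D tensor could be passed to `to_pil_image(..., mode='F')` and `overlay_mask` in the same way as the TorchCAM activation map. If this heatmap is also blank, the problem is more likely in the preprocessing or the chosen layer than in the CAM method.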

One issue I thought of is that the confidences output holds the class confidences for each bounding box, not a single confidence per class for the whole image as in an image classification network. Since there is no fully connected layer producing one list of per-class confidences for the image, is it possible to use TorchCAM here? I am open to suggestions for other methods of achieving the same thing.
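To make the per-box issue concrete, here is a rough sketch of what I imagine a workaround could look like: plain Grad-CAM written by hand with hooks, where the per-box output is reduced to a single scalar (the highest non-background score of one box) before backpropagating. Everything in it is an assumption for illustration: `ToyDetector` is a hypothetical stand-in that only mimics the `(batch, num_priors, num_classes)` shape of the confidences, class index 0 is assumed to be background, and raw scores are used rather than softmax outputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyDetector(nn.Module):
    """Hypothetical stand-in with an SSD-like confidence output; not the real model."""
    def __init__(self, num_classes=21, boxes_per_cell=2):
        super().__init__()
        self.num_classes = num_classes
        self.base_net = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.cls_header = nn.Conv2d(16, boxes_per_cell * num_classes, 3, padding=1)

    def forward(self, x):
        feats = self.base_net(x)
        logits = self.cls_header(feats)                       # (N, B*K, H, W)
        logits = logits.permute(0, 2, 3, 1).contiguous()
        return logits.view(x.size(0), -1, self.num_classes)   # (N, priors, K)

torch.manual_seed(0)
model = ToyDetector().eval()
image = torch.rand(1, 3, 64, 64)  # placeholder input

store = {}

def keep_activation(module, inputs, output):
    # Save the activation and attach a tensor hook to catch its gradient
    store["act"] = output
    output.register_hook(lambda grad: store.update(grad=grad))

handle = model.base_net[-1].register_forward_hook(keep_activation)
confidences = model(image)        # (1, num_priors, num_classes)
handle.remove()

# Reduce the per-box output to one scalar: the highest non-background score.
# Assumes class index 0 is background.
fg = confidences[0, :, 1:]
box_idx, cls_idx = divmod(fg.argmax().item(), fg.size(1))
target = fg[box_idx, cls_idx]
model.zero_grad()
target.backward()

act, grad = store["act"], store["grad"]          # both (1, C, H, W)
weights = grad.mean(dim=(2, 3), keepdim=True)    # Grad-CAM channel weights
cam = F.relu((weights * act).sum(dim=1, keepdim=True)).detach()
cam = cam / (cam.max() + 1e-8)                   # scale to [0, 1]
cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                    align_corners=False)[0, 0]
print(f"box {box_idx}, class {cls_idx + 1}, CAM shape {tuple(cam.shape)}")
```

The point of the sketch is only that the gradient target has to be a scalar tied to one box and one class; which box to pick (highest score, a box surviving NMS, or a sum over boxes of one class) is a design choice I am unsure about.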

How should I go about doing this?

New contributor

Siddarth Calidas is a new contributor to this site. Take care in asking for clarification, commenting, and answering.