Relative Content

Tag Archive for machine-learning, computer-vision

How can one get similar results on CIFAR10 with the fully-MLP architecture as the original paper?

I found at least three MLP-only architectures for computer vision (ones that avoid the attention mechanism) that report strong results on ImageNet (70%+) and on other benchmarks such as CIFAR10 and CIFAR100: “g-mlp”, “MLP-Mixer”, and “do-you-even-need-attention” (paper 1 code, paper 2 code, paper 3 code).

MultiScale Vision Transformer tensor mismatch shape issue

There seems to be a tensor shape mismatch in the MultiScale Vision Transformer. Does anyone know how to resolve it?

Is there a method to enhance the performance of ViT?

I want to train a vision transformer on CIFAR10. I tried tuning the hyperparameters to improve the accuracy, but I still get poor accuracy. Are there any suggestions for improving my model? Thank you in advance. I also tried loading weights from a ViT pretrained on ImageNet, but it doesn’t work.

What is the best way to recover image from its CLIP features?

Suppose we have an image of size torch.Size([1, 3, 336, 336]) and encode it with CLIP into features of size torch.Size([1, 577, 1024]). How can we recover the original image from this latent feature map?
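
The two shapes in this question are consistent with a ViT-style encoder that uses 14-pixel patches and prepends one class token: 336 / 14 = 24 patches per side, 24 × 24 = 576 patches, plus 1 class token gives 577. The patch size and the class token are assumptions here, since the question does not name the CLIP variant. Under that assumption, the 1024 would be the encoder's hidden width, so the tensor holds per-token hidden states rather than a single pooled embedding. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def vit_token_count(image_size: int, patch_size: int, has_cls_token: bool = True) -> int:
    """Number of tokens a ViT-style encoder emits for a square image.

    Hypothetical helper for illustration; it is not part of any CLIP library.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    # One token per patch, plus an optional class token at the front.
    return patches_per_side * patches_per_side + (1 if has_cls_token else 0)


# Assumption: patch size 14, which the question does not state.
print(vit_token_count(336, 14))  # 577
```

If the assumption holds, dropping the first token and reshaping the remaining 576 tokens to a 24 × 24 grid gives a spatial feature map, which is the layout an image decoder would typically expect as input.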

Fine tuning LayoutLMv2 for Document question answering using custom data

I want to fine-tune LayoutLMv2 for document question answering on custom data. Can somebody explain how to prepare the data for this task?

How do I train my image recognition model to work like a reward-punishment system, where I can tell it who a person is when it couldn’t recognize them?

I am researching methods for building an attendance system in which the professor takes a few photos (2 to 3) and uploads them to an app, which then marks attendance automatically for about 80 students. I have limited training data, which is the biggest drawback and the main issue to overcome. I have built a basic model that trains and marks attendance.
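
The question does not describe the model, so the following is only a sketch of one common way to cope with very few photos per student, not the poster's method: compare face embeddings against a small enrolled gallery, report "unknown" when the best similarity falls below a threshold, and let the professor label unknown faces so they are added to the gallery. The embeddings are assumed to come from some pretrained face encoder that is not shown; the class name, the threshold of 0.8, and the toy 3-dimensional vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Gallery:
    """Hypothetical gallery of labeled face embeddings."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed value; tune on real data
        self.entries = []  # list of (name, embedding) pairs

    def enroll(self, name, embedding):
        """Add a labeled embedding, e.g. after the professor names a face."""
        self.entries.append((name, list(embedding)))

    def identify(self, embedding):
        """Return the best-matching name, or None if nothing is close enough."""
        best_name, best_score = None, -1.0
        for name, reference in self.entries:
            score = cosine(embedding, reference)
            if score > best_score:
                best_name, best_score = name, score
        return best_name if best_score >= self.threshold else None


gallery = Gallery()
gallery.enroll("student_a", [1.0, 0.0, 0.0])
gallery.enroll("student_b", [0.0, 1.0, 0.0])
```

A face that `identify` returns `None` for can be shown to the professor, and the name they supply is passed to `enroll`, so no retraining is needed when a correction is made.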
