I have a question about the scale-invariant translation and log-space translation concepts in the R-CNN paper.
As mentioned in the paper:
Our goal is to learn a transformation that maps a proposed box P to a ground-truth box G.
So the regressor predicts the four functions
[dx(P), dy(P), dw(P), dh(P)]
as the transformation. The predicted ground-truth box Ĝ is then computed by applying the transformations below:
Ĝx = Pw * dx(P) + Px (1)
Ĝy = Ph * dy(P) + Py (2)
Ĝw = Pw * exp(dw(P)) (3)
Ĝh = Ph * exp(dh(P)) (4)
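To check that I am reading the equations correctly, here is a minimal Python sketch of how I understand them. The function name apply_deltas and the example numbers are my own assumptions, not from the paper; (Px, Py) is taken to be the center of the proposal box, with Pw and Ph its width and height.

```python
import math

def apply_deltas(P, d):
    """Apply equations (1)-(4) to a proposal box.

    P = (Px, Py, Pw, Ph): center x, center y, width, height of the proposal.
    d = (dx, dy, dw, dh): regressor outputs for that proposal.
    Returns the predicted box (Gx, Gy, Gw, Gh) in the same format.
    """
    Px, Py, Pw, Ph = P
    dx, dy, dw, dh = d
    Gx = Pw * dx + Px        # equation (1)
    Gy = Ph * dy + Py        # equation (2)
    Gw = Pw * math.exp(dw)   # equation (3)
    Gh = Ph * math.exp(dh)   # equation (4)
    return (Gx, Gy, Gw, Gh)

# Same regressor outputs applied to a small and a large proposal
# that share the same center (hypothetical values).
deltas = (0.5, 0.25, 0.0, 0.0)
small = apply_deltas((50.0, 50.0, 20.0, 40.0), deltas)
large = apply_deltas((50.0, 50.0, 200.0, 400.0), deltas)
print(small)
print(large)
```

With the same dx = 0.5, the center of the small box moves by 10 pixels and the center of the large box by 100 pixels, so dx and dy seem to act as offsets measured in units of the box width and height rather than in pixels.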
Question: I don't understand equations (1) and (2). Why is dx(P)/dy(P) multiplied by Pw/Ph? And why are Px/Py not used in equations (3) and (4)?
In summary, I don't understand the roles of Pw and Ph in equations (1) and (2).
Please guide me.