@Parskatt Did you check the recent progress in point tracking?
@Parskatt To me they look like a more elegant version of multi-view feature matching, and a much more memory-efficient one.
@jianyuan_wang As in VGGSfM? I think it's quite similar, yes. Why is it more elegant?
@Parskatt Yes, as used in VGGSfM. It is more elegant (in my humble opinion) because 1. it no longer needs to build a graph between one keypoint and all keypoints in the other images
@jianyuan_wang I agree that it's more efficient, at least for long tracks. But: 1. It does not seem possible to update the feature maps. 2. Init comes from duplicating the query coord, plus local corr. Probably this can be fixed though, and I'm all for "latent" things that scale better.
@Parskatt 1. I may be lost here: does this refer to the feature maps of the images? It should be okay to swap the image feature extractor of a pretrained tracking network, because the model mostly cares about correlations (I have not tested this, though). 2. Yes, I agree this is a problem.
@jianyuan_wang Regarding 1: I mean that if you run a Transformer directly on the images and exchange messages, you update the feature maps of the images themselves. From what I understand of point tracking, the underlying feature map is fixed during the iterations (perhaps that's my misunderstanding).
@Parskatt Ah, I get it now. Yes, in tracking people currently do not update the feature maps during the iterations. Instead, they update the features of the tracks at each iteration; an example is below: github.com/facebookresear…
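The scheme discussed in this thread (image feature maps fixed across iterations; the track initialized by duplicating the query coordinate and refined by correlating a track feature against a local window) can be sketched roughly as below. This is a hypothetical simplification for illustration, not the linked facebookresearch code: real trackers also update the track features themselves with a learned network, whereas here the query feature stays fixed and the update is a greedy integer-offset argmax.

```python
import numpy as np

def bilinear_sample(fmap, xy):
    """Sample an (H, W, C) feature map at a continuous (x, y) location."""
    H, W, _ = fmap.shape
    x = float(np.clip(xy[0], 0, W - 1.001))
    y = float(np.clip(xy[1], 0, H - 1.001))
    x0, y0 = int(x), int(y)
    dx, dy = x - x0, y - y0
    return ((1 - dy) * ((1 - dx) * fmap[y0, x0] + dx * fmap[y0, x0 + 1])
            + dy * ((1 - dx) * fmap[y0 + 1, x0] + dx * fmap[y0 + 1, x0 + 1]))

def track_point(fmaps, query_xy, iters=4, radius=2):
    """Track one query point (given in frame 0) through all frames.

    `fmaps` is a list of per-frame (H, W, C) feature maps that stay FIXED;
    only the per-frame coordinate estimates are updated across iterations.
    """
    T = len(fmaps)
    # Track feature: sampled once from the query location in frame 0.
    qfeat = bilinear_sample(fmaps[0], query_xy)
    # Init: duplicate the query coordinate across all frames.
    coords = np.tile(np.asarray(query_xy, dtype=float), (T, 1))
    for _ in range(iters):
        for t in range(1, T):
            # Correlate the track feature against a local grid of offsets
            # around the current estimate; move to the best-scoring offset.
            best, best_off = -np.inf, (0.0, 0.0)
            for oy in range(-radius, radius + 1):
                for ox in range(-radius, radius + 1):
                    cand = coords[t] + (ox, oy)
                    score = float(qfeat @ bilinear_sample(fmaps[t], cand))
                    if score > best:
                        best, best_off = score, (ox, oy)
            coords[t] = coords[t] + best_off
    return coords
```

Since the feature maps are only ever read (never written), the per-iteration state is just the small set of track coordinates and features, which is where the memory advantage over dense all-pairs matching comes from.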