Visual Multi-Object Tracking~(MOT) is a challenging task that involves identifying objects and associating them across video frames in diverse real-world scenes. Transformer-based MOT methods have recently gained popularity for their
simplicity and effectiveness.