Hi @Lili_Marleen,
If the vertical column is reasonably easy to detect, you may be able to make your controller focus on yaw and in/out motion, with a roughly constant sideways motion propelling the vehicle around the column.
For a cylindrical column, if you keep the edges at the same places in the camera frame then the vehicle will maintain a constant distance from the column. You can even split that up, so the centre position of the column gets corrected with yaw rotation, and the spacing/width between the edges gets corrected with in/out motion.
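As a rough illustration of that split (assuming your edge detector gives you the column's left/right edge x-positions in pixels; the frame width, target width, and gains here are placeholders you'd need to tune):

```python
# Minimal proportional-control sketch (illustrative only).
# Assumes edge detection provides the column's left/right x-positions in pixels.

FRAME_WIDTH = 640     # assumed camera frame width (pixels)
TARGET_WIDTH = 120    # desired apparent column width -> sets the standoff distance
YAW_GAIN = 0.005      # placeholder gains, to be tuned on the vehicle
FORWARD_GAIN = 0.01

def control_from_edges(left_x: float, right_x: float):
    """Return (yaw, forward) commands from the column's detected edge positions."""
    centre = (left_x + right_x) / 2
    width = right_x - left_x

    # Keep the column centred horizontally by yawing towards it.
    yaw_error = centre - FRAME_WIDTH / 2
    yaw_cmd = YAW_GAIN * yaw_error

    # Keep the apparent width constant by moving in/out.
    # Too narrow -> too far away -> move forward (positive command).
    width_error = TARGET_WIDTH - width
    forward_cmd = FORWARD_GAIN * width_error

    return yaw_cmd, forward_cmd

# A roughly constant lateral command then pushes the vehicle around the column, e.g.
# lateral_cmd = 0.3  # fraction of full lateral thrust
```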
I’m unsure if there are other obstacles or things on the pool walls, but if the scene is reasonably clear then you could likely downsample the video quite a lot before doing the edge detection, which could help with robustness (at the cost of a bit of fidelity/precision), and would speed up the processing.
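For example, assuming you're doing the processing with OpenCV, the downsample + edge detection step could look something like this (the scale factor and Canny thresholds are guesses to tune, and picking the two strongest vertical edges this way is just a sketch):

```python
import cv2
import numpy as np

def find_column_edges(frame, scale=0.25):
    """Downsample the frame, then estimate the column's two vertical edges (illustrative)."""
    small = cv2.resize(frame, None, fx=scale, fy=scale,
                       interpolation=cv2.INTER_AREA)
    gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)   # thresholds would need tuning for your footage

    # A vertical column shows up as two strong vertical edge lines, i.e. two
    # peaks in the per-column edge count (a cleaner approach would suppress
    # neighbouring peaks, but this shows the idea).
    strength = edges.sum(axis=0)
    left, right = np.sort(np.argsort(strength)[-2:])

    # Map the downsampled pixel positions back to full-resolution coordinates.
    return left / scale, right / scale
```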
Depending on how much the vehicle is moving, it may also be helpful to separate detection and tracking, whereby initial detection operates on the full video feed, and tracking operates on a cropped feed around where the cylinder was last seen (with the crop width determined by the maximum physical movement expected between processed frames).
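Structurally that detection/tracking split could look roughly like this, where `detect_column` stands in for whatever detector you end up using (hypothetical, returning the column's centre x-position or `None`):

```python
def process_frame(frame, last_centre_x, max_shift_px=80):
    """Detect the column, searching only near its last position when tracking.

    `detect_column` is a hypothetical detector returning the column's centre
    x-position (or None); `max_shift_px` reflects the maximum physical movement
    expected between processed frames.
    """
    if last_centre_x is None:
        # No previous position: run detection over the full frame.
        return detect_column(frame)

    # Tracking: crop around where the column was last seen.
    x0 = max(0, int(last_centre_x - max_shift_px))
    x1 = min(frame.shape[1], int(last_centre_x + max_shift_px))
    centre_in_crop = detect_column(frame[:, x0:x1])

    if centre_in_crop is None:
        return None    # lost it; the next frame falls back to full-frame detection
    return centre_in_crop + x0    # map the crop coordinate back to the full frame
```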
The IMU is quite noisy/error-prone, so you could use it to determine general directionality, and to count rotations around the column by noting when the vehicle ends up facing the same direction again (if that’s relevant to keep track of), but beyond that it probably won’t have a huge impact on the control side of things.
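If you did want to count rotations yourself from the heading, one simple approach is to accumulate the wrapped heading change each update, e.g. (a minimal sketch, assuming headings in degrees):

```python
class RotationCounter:
    """Count full rotations by accumulating wrapped heading changes (illustrative)."""

    def __init__(self):
        self.total = 0.0          # accumulated heading change in degrees
        self.last_heading = None

    def update(self, heading_deg: float) -> int:
        if self.last_heading is not None:
            delta = heading_deg - self.last_heading
            # Unwrap across the 0/360 boundary so a small turn never looks like ~360.
            if delta > 180:
                delta -= 360
            elif delta < -180:
                delta += 360
            self.total += delta
        self.last_heading = heading_deg
        return int(self.total / 360)   # completed rotations (negative if circling the other way)
```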
I believe ArduSub already does full rotation counts, which are communicated via the “tether turns” NAMED_VALUE_FLOAT MAVLink messages.
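If that’s useful, you can listen for those with pymavlink, e.g. (the connection string is an assumption for a typical topside setup, and the exact name string should be checked against the messages your vehicle actually sends):

```python
from pymavlink import mavutil

# Connection string is an assumption; adjust for your setup.
master = mavutil.mavlink_connection('udpin:0.0.0.0:14550')
master.wait_heartbeat()

while True:
    msg = master.recv_match(type='NAMED_VALUE_FLOAT', blocking=True)
    # The 'name' field identifies which named value this is; confirm the exact
    # string against what your ArduSub version sends.
    name = msg.name if isinstance(msg.name, str) else msg.name.decode()
    print(name, msg.value)
```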