Underwater gate detection problems

Hi,

I’m a 4th-semester student working on an underwater gate detection system.

Setup:

  • Camera: USB (underwater)

  • Model: YOLO (trained with OBB annotations)

  • Pipeline: detection → custom Python logic for bounding box tracking

Problem:

  1. The model detects the gate but also locks onto background noise (false positives).

    • Gate color: green with a white top bar

    • Underwater conditions introduce strong color distortion and noise

  2. I am generating an additional bounding box in Python for tracking.

    • This leads to “box inside box” behavior

    • Causes ambiguity in selecting the correct target

Behavior:

  • Detection becomes unstable when the gate is partially visible

  • When close to the gate, the model detects incomplete structures

Constraints:

  • I cannot upload images/videos directly here, but I can share external links if needed

Questions:

  1. What are effective ways to reduce false positives in underwater YOLO detection?

  2. Is OBB appropriate for this task, or would standard bounding boxes be more stable?

  3. How should partial detections (close-range or cropped gates) be handled robustly?

  4. Are there any recommended repositories or pipelines for underwater object detection?

  5. Which image preprocessing filters would you suggest for improving detection in underwater conditions?

Goal:
Achieve fast and stable gate detection for real-time navigation.

Thanks in advance.

Hi @Vansh, welcome to the forum :slight_smile:

These questions are primarily about computer vision, rather than being specifically related to the marine environment. While there are community members here with varying degrees of computer vision experience, I would also recommend looking to more dedicated spaces for asking computer vision questions, as they may well yield more targeted and relevant advice.

> Model: YOLO (trained with OBB annotations)

Trained in which environment? As an example, if you trained in air but are operating in water (or are operating in quite different water from what you trained in), then it is understandable that the change in appearance has an impact on the model’s detection performance.

> Underwater conditions introduce strong color distortion and noise

This is inherent to the underwater environment. Sometimes you can resolve it by reducing distance and/or controlling lighting, but in the absence of that you’ll need detection algorithms that are robust to colour changes, and/or can compensate for them from an estimate of the distance to the objects in frame.
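One cheap colour-compensation technique worth trying is a gray-world white balance, which rescales each channel so the image averages to gray; underwater footage is typically blue/green-heavy, so this partially restores the attenuated red channel. A minimal pure-Python sketch (a real pipeline would do this vectorised in NumPy/OpenCV; the flat pixel-list format here is just for illustration):

```python
def gray_world_balance(pixels):
    """Rescale each channel so its mean matches the overall mean
    (the 'gray world' assumption). pixels: list of (r, g, b) tuples,
    channels 0-255. Returns corrected pixels, clipped to 0-255."""
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    target = sum(means) / 3  # gray level all channels should average to
    gains = [target / m if m else 1.0 for m in means]
    return [
        tuple(min(255, round(p[c] * gains[c])) for c in range(3))
        for p in pixels
    ]

# Blue/green-heavy sample pixels: the weak red channel gets boosted,
# the dominant blue channel gets pulled down.
sample = [(40, 120, 140), (20, 110, 130)]
balanced = gray_world_balance(sample)
```

This only helps when the scene really does average to gray, which is a strong assumption; treat it as a baseline to compare against, not a fix.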

Any signal comes with noise. To improve the signal-to-noise ratio, your options are to: change the environment and/or sensing technology to reduce the noise or increase the strength of your signal; find a relevant signal that is less subject to noise; and/or process the incoming signal in ways that differentiate between signal and noise.

Without a sense of your application requirements and of how exactly the system is failing, it’s difficult to provide much direct advice. I’m unsure whether you can adjust lighting (e.g. by operating at different times, or adding your own lights) or filter dirty water, for example.

> I am generating an additional bounding box in Python for tracking.

What do you mean by this? Are you running a different detection model in Python, or a high-frequency, more generic object tracker that is initialised and corrected by your YOLO detection model?

> This leads to “box inside box” behavior

It’s not clear what you mean by this, or why it’s relevant (is it good? is it causing problems?).

> Causes ambiguity in selecting the correct target

Is ambiguity actually a problem? Have you done testing to determine which of your signals is more accurate? Is that situation-dependent?
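If the hand-made tracking box and the YOLO box do end up nested, one common disambiguation is containment-based suppression: keep the higher-confidence box and drop any box that lies almost entirely inside one you’ve already kept. A sketch, assuming axis-aligned `(x1, y1, x2, y2, conf)` tuples (that format is an assumption — adapt it to whatever your pipeline emits):

```python
def containment(inner, outer):
    """Fraction of `inner`'s area that lies inside `outer`.
    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1, iy1 = max(inner[0], outer[0]), max(inner[1], outer[1])
    ix2, iy2 = min(inner[2], outer[2]), min(inner[3], outer[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = (inner[2] - inner[0]) * (inner[3] - inner[1])
    return inter / area if area else 0.0

def suppress_nested(boxes, threshold=0.9):
    """Drop any box that is >= `threshold` contained in a
    higher-confidence box. boxes: list of (x1, y1, x2, y2, conf)."""
    keep = []
    for b in sorted(boxes, key=lambda b: -b[4]):
        if all(containment(b[:4], k[:4]) < threshold for k in keep):
            keep.append(b)
    return keep

# The small low-confidence box sits fully inside the big one, so
# only the high-confidence detection survives.
detections = [(10, 10, 100, 100, 0.9), (30, 30, 80, 80, 0.5)]
```

Note this is one-directional: a large low-confidence box wrapped around a small kept box is not suppressed, which may or may not be what you want.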

> When close to the gate, the model detects incomplete structures

What is important for you to detect? If partial detections are an important part of your application then you need to account for that. Do your processing capacity and hardware constraints allow you to have multiple detectors running in parallel? Could you use a wider camera angle? Are different parts of the gate differentiable? Is it important to know where you are relative to the gate, or just that you’re near it (or not far enough away from it to detect it properly)?

> I cannot upload images/videos directly here, but I can share external links if needed

Is that because your forum account is new (I can adjust its limitations), or is it a project constraint (i.e. the images are of a space you don’t have authority to share publicly)?

> What are effective ways to reduce false positives in underwater YOLO detection?

You haven’t shared what kind of training or pre-processing you’re doing, or what control you have over the source data collection (e.g. lighting, camera stability, etc.). It’s often the case that RGB is not the best colour space for object detection, but that depends on what you’re trying to detect, and you may have already changed to a different colour space.
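To illustrate the colour-space point: in HSV, hue is largely independent of brightness, so a green-hue mask can survive lighting changes that would break an RGB threshold. A toy per-pixel sketch using only the standard library’s `colorsys`; the hue band and saturation floor are made-up numbers you would tune on real footage:

```python
import colorsys

def is_gate_green(r, g, b, hue_band=(0.25, 0.48), min_sat=0.3):
    """Crude per-pixel test: does this RGB pixel fall in a green hue
    band, regardless of brightness? Channels are 0-255; the hue band
    and saturation floor are guesses to tune on real footage."""
    h, s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return hue_band[0] <= h <= hue_band[1] and s >= min_sat

# The same green under different lighting: both pass, because hue
# barely moves even though brightness roughly halves.
bright, dim = (40, 200, 60), (20, 100, 30)
```

In a real pipeline you would do this with `cv2.cvtColor` plus `cv2.inRange` over the whole frame, and could use the resulting mask to reject YOLO boxes that contain too few gate-coloured pixels.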

> Is OBB appropriate for this task, or would standard bounding boxes be more stable?

That’s highly dependent on the needs of your application. If the orientation of the target is important then an oriented bounding box could be valuable, but if you just need a general sense of where something is then it’s less relevant.

Stability is also dependent on what’s causing detection/tracking failures - in some cases box orientation may make tracking subsequent frames easier, but in others it may be an overly tight constraint to try to match.
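One possible middle ground, if orientation isn’t needed downstream: keep the OBB-trained detector but track on the axis-aligned box that encloses each oriented box, which is simpler to match frame-to-frame. A trivial sketch, assuming the OBB comes to you as four `(x, y)` corner points:

```python
def obb_to_aabb(corners):
    """Axis-aligned bounding box (x1, y1, x2, y2) enclosing an
    oriented box given as four (x, y) corner points."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))

# A square rotated 45 degrees still yields a sensible enclosing box.
aabb = obb_to_aabb([(50, 0), (100, 50), (50, 100), (0, 50)])
```

The enclosing box is looser than the OBB (it includes corner regions that aren’t gate), so any colour- or pixel-count check inside it needs a more forgiving threshold.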

> How should partial detections (close-range or cropped gates) be handled robustly?

Try to detect smaller sections if you can, or use a feature-detection framework that doesn’t expect or require the full object to be in frame (e.g. ORB, SIFT, etc.).
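Independent of the feature-based route, a cheap robustness heuristic for close range is to flag any detection whose box touches the frame border as “partial”, and then trust its centre less (or switch navigation behaviour). A sketch with a hypothetical `(x1, y1, x2, y2)` box format:

```python
def touches_border(box, frame_w, frame_h, margin=5):
    """True if the (x1, y1, x2, y2) box comes within `margin` pixels
    of any frame edge, suggesting the object is cropped and the box
    centre should not be trusted as the object centre."""
    x1, y1, x2, y2 = box
    return (x1 <= margin or y1 <= margin
            or x2 >= frame_w - margin or y2 >= frame_h - margin)
```

For example, in a 640×480 frame, a box starting at x = 0 is flagged as cropped, while one comfortably inside the frame is not.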

> Are there any recommended repositories or pipelines for underwater object detection?

I think this is very difficult to respond to without knowing a lot more about your operating conditions, hardware, and detection requirements.

Which options have you already explored out of ones that come up from an online search for “underwater object detection”, and in what ways have they not been meeting your requirements?

> Which image preprocessing filters would you suggest for improving detection in underwater conditions?

A blur of some kind to reduce sensor noise is almost always a good idea. Beyond that it’s very dependent on your application. As an example, a technique like Sea-Thru could be very helpful in some situations, but highly impractical overkill in others.
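To make the blur suggestion concrete, here is a toy 3×3 median filter in pure Python: medians kill salt-and-pepper speckle (common with cheap sensors) while preserving edges better than a Gaussian. In practice you would just call `cv2.medianBlur`; this only shows the idea.

```python
def median_blur3(img):
    """3x3 median filter on a grayscale image (list of equal-length
    rows of ints). Border pixels are left unchanged for simplicity."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = sorted(
                img[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            )
            out[y][x] = window[4]  # median of the 9 neighbours
    return out

# A single hot pixel in a flat region is removed entirely, while
# every other pixel is untouched.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
]
clean = median_blur3(noisy)
```

Kernel size is a trade-off: larger medians remove more noise but start eating small image features, which matters if you later rely on thin structures like the gate’s top bar.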