ROV-based benthic imaging - thoughts?

Just putting this out there for peer review and input…

A recent outbreak of exotic Caulerpa (C. brachypus and C. parvifolia) in Aotearoa New Zealand has compelled us to think about how we might achieve greater surveillance productivity by using ROVs to run transects, rather than divers, dropcams, or towcams. Note that this is all shallow coastal habitat (0-50 m). A minimum viable solution might look like:

  1. Boat arrives at a surveillance target area (GPS location).
  2. ROV deployed.
  3. ROV manually driven along a linear transect, capturing high-quality, geotagged, high-resolution still imagery of the seafloor (not video).
  4. Images reviewed and submitted as part of a surveillance dataset.
  5. Move to next location.

Future enhancements could then include:

  • Autonomous ROV operation along the transect.
  • Move beyond a simple linear transect to scanning areas defined by a polygon, e.g. a bay or other area of interest.
  • AI-based recognition of exotic Caulerpa.
  • Image stitching for photogrammetry (required by other potential applications).

In terms of technical approach we’re thinking:

  1. ROV and nav sensing: BlueROV2 Heavy with a Water Linked A50 DVL for active stabilization, altitude locking, and relative position tracking. A USBL + DVL combo adds cost, so we were thinking DVL only. If we capture the GPS position of the ROV at the point of entry into the water, we can then apply the DVL relative position at any time to know the (approximate) position of the ROV (see the sketch after this list).

  2. Imaging: Addition of a downward-looking camera (8 MP+) and additional area lighting (4x Blue Robotics Lumen lights). Lowest cost/complexity would be a fixed camera with manual focus, a fixed FOV, and a fixed distance to the seafloor (altitude). Autofocus might be a plus. Worst case, if the platform is too unstable, we could consider a gimballed camera. USB3 or IP cameras.

  3. Image triggering: With a known FOV and simple camera calibration, we would (soft) trigger the camera (and lights) based on the distance travelled since the last picture was taken (spacing calculation sketched after this list). The intent would be to include some overlap to ensure 100% coverage along the transect (photogrammetry would require even more overlap).

  4. Geotagging images: Calculate the GPS position for any given image from the GPS starting position plus the DVL relative position at the time that picture was taken.
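
To make points 1-4 concrete, here's a rough Python sketch of the core calculations: trigger spacing from camera FOV and overlap, converting the surface GPS fix plus a DVL-integrated north/east offset into a lat/long (a flat-earth approximation, which should be fine over short transects), and writing that position into the image's EXIF tags. The `piexif` usage and all numbers are illustrative placeholders, not a tested implementation.

```python
# Sketch of the core survey calculations (points 1-4 above).
# Assumptions: flat-earth approximation over short transects, and the
# piexif library for EXIF writing; all numbers are illustrative only.
import math

import piexif

EARTH_RADIUS_M = 6_371_000  # mean Earth radius

def trigger_spacing_m(altitude_m: float, fov_deg: float, overlap: float) -> float:
    """Along-track distance between shots for a given forward overlap.

    footprint = 2 * altitude * tan(FOV / 2); spacing = footprint * (1 - overlap)
    """
    footprint = 2 * altitude_m * math.tan(math.radians(fov_deg) / 2)
    return footprint * (1 - overlap)

def offset_latlon(lat0: float, lon0: float, north_m: float, east_m: float):
    """Apply a DVL-integrated north/east offset to the surface entry fix."""
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat0))))
    return lat0 + dlat, lon0 + dlon

def to_dms_rationals(deg: float):
    """Convert decimal degrees to the EXIF (degrees, minutes, seconds) rationals."""
    deg = abs(deg)
    d = int(deg)
    m = int((deg - d) * 60)
    s = round((deg - d - m / 60) * 3600 * 100)
    return ((d, 1), (m, 1), (s, 100))

def geotag_jpeg(path: str, lat: float, lon: float) -> None:
    """Write a GPS lat/long into an existing JPEG's EXIF block, in place."""
    exif = piexif.load(path)
    exif["GPS"] = {
        piexif.GPSIFD.GPSLatitudeRef: b"S" if lat < 0 else b"N",
        piexif.GPSIFD.GPSLatitude: to_dms_rationals(lat),
        piexif.GPSIFD.GPSLongitudeRef: b"W" if lon < 0 else b"E",
        piexif.GPSIFD.GPSLongitude: to_dms_rationals(lon),
    }
    piexif.insert(piexif.dump(exif), path)

# Example: 2 m altitude, 80 degree horizontal FOV, 20% forward overlap
# gives roughly 2.7 m between triggers.
spacing = trigger_spacing_m(altitude_m=2.0, fov_deg=80.0, overlap=0.2)
lat, lon = offset_latlon(-36.8, 174.8, north_m=12.5, east_m=-3.1)
```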

For our initial testing we have two core criteria to meet: (i) image quality (resolution, clarity) sufficient to spot exotic Caulerpa at an early stage; and (ii) ability to reliably return to a place where it was found (based on the geolocation in the image).

I’d appreciate any input on the above. Some specific questions around implementation:

  • Any advantage in integrating any of this into the ArduSub/MAVLink stack? Or is it best to just use IP cameras, connect them to the topside computer, and do it all there (no custom code in ArduSub)?

  • Does using the MAVLink Camera Manager do anything for us?

  • The best locational accuracy would mean triggering the cameras from the ROV code based on distance travelled since the last trigger (as detected by the DVL). Can you think of a good way to achieve this? (One possible approach is sketched after this list.)

  • Any advice on the best approach for shipping images topside? FTP or HTTP server, GStreamer? What have people used for this? At the end of the day we want geolocated JPG images. Once again, any reason we’d consider doing this using MAVLink?

  • For the application interface: Is QGroundControl / Qt the right application to be working with here? Or are we better off leveraging some other application technology for the UX? I’ve seen some amazing aerial drone apps built using QGC, and Qt is fundamentally designed for GUI development. Ideally we’d want to be able to create a simple user interface and render thumbnails on a map as images arrive topside and get processed. In time we’ll want to be able to draw polygons and automatically structure waypoint paths…
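
On the distance-based triggering question above, one possible shape for it, as an untested pymavlink sketch running on the onboard computer: it watches LOCAL_POSITION_NED (which ArduSub publishes once the DVL is feeding the EKF) and fires a capture callback whenever the accumulated travel exceeds the spacing. The connection endpoint and the `capture_image` hook are assumptions.

```python
# Sketch: distance-based soft trigger from the DVL-fed position estimate.
# Assumes ArduSub is publishing LOCAL_POSITION_NED, and that capture_image()
# is provided elsewhere (e.g. it fires the downward camera and lights).
import math

from pymavlink import mavutil

SPACING_M = 2.7  # from the FOV/overlap calculation above

def run_trigger_loop(capture_image):
    # Endpoint is an assumption - adjust for your BlueOS/companion setup.
    master = mavutil.mavlink_connection("udpin:0.0.0.0:14550")
    master.wait_heartbeat()

    prev = None        # previous (x, y) position sample
    travelled = 0.0    # distance accumulated since the last trigger

    while True:
        msg = master.recv_match(type="LOCAL_POSITION_NED", blocking=True)
        x, y = msg.x, msg.y
        if prev is not None:
            travelled += math.hypot(x - prev[0], y - prev[1])
        prev = (x, y)
        if travelled >= SPACING_M:
            capture_image(north=x, east=y)  # caller geotags from (x, y)
            travelled = 0.0
```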

The plan is to make this all open source and non-commercial. In the spirit of that, if you think some of this functionality should be in the core product then I’d be keen to discuss how we might help facilitate that.

Hi @PeterM, lots of interesting things here :slight_smile:

Unfortunately I’m quite short on time at the moment, so will have to keep this reasonably brief, but happy to discuss further as relevant and as things progress.

It’s possible to manually set the current global position, in which case (if the DVL is reporting valid velocities) you should be able to run ArduSub autonomously to follow a mission (e.g. scanning along your polygon or whatnot).
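
For example, something like this (an untested sketch; the coordinates and endpoint are placeholders) sets the origin with pymavlink:

```python
# Untested sketch: manually set the EKF global origin, so the DVL-based
# local position maps to real-world coordinates. Values are placeholders.
from pymavlink import mavutil

master = mavutil.mavlink_connection("udpin:0.0.0.0:14550")  # adjust endpoint
master.wait_heartbeat()

lat, lon, alt_m = -36.8000, 174.8000, 0.0  # surface entry fix
master.mav.set_gps_global_origin_send(
    master.target_system,
    int(lat * 1e7),    # latitude, degE7
    int(lon * 1e7),    # longitude, degE7
    int(alt_m * 1e3),  # altitude, mm
)
```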

If you’re triggering image capture using software on the onboard computer (e.g. using a BlueOS extension) then it should also be able to get the current position from the MAVLink stream and add it to the image metadata, which would save having to do data alignment stuff in post.
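
As an illustration, an extension could poll BlueOS's mavlink2rest service for the fused position (the endpoint path and response layout here are from memory, so treat them as assumptions to verify):

```python
# Sketch: fetch the current fused position from BlueOS's mavlink2rest
# service. The endpoint path and JSON layout are assumptions - check
# them against your BlueOS version.
import requests

MAVLINK2REST = "http://host.docker.internal/mavlink2rest"  # from inside an extension

def current_position():
    url = f"{MAVLINK2REST}/mavlink/vehicles/1/components/1/messages/GLOBAL_POSITION_INT"
    msg = requests.get(url, timeout=1).json()["message"]
    return msg["lat"] / 1e7, msg["lon"] / 1e7, msg["relative_alt"] / 1e3
```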

This is partially dependent on the vision system, and partially dependent on the environment and vehicle state. The vision system you’ll need to tune with hardware selection/design, but you can improve your chances of getting a good image by stopping the ROV when taking an image (and potentially monitoring the accelerometer and velocity data to only capture when “still enough”) and/or taking multiple images and (automatically) choosing the sharpest one to keep at each location.
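
For the “sharpest image” part, variance of the Laplacian is the usual focus metric; a minimal OpenCV sketch:

```python
# Sketch: pick the sharpest of a burst of frames, using variance of the
# Laplacian as a focus metric (higher = sharper).
import cv2

def sharpest(image_paths):
    def focus_metric(path):
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        return cv2.Laplacian(gray, cv2.CV_64F).var()
    return max(image_paths, key=focus_metric)
```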

This thread may also have some relevant advice.

I doubt there’ll be much value in modifying ArduSub itself, but it will likely be worth at least monitoring the MAVLink stream for position and motion telemetry data, and potentially for notice of intended image capture points (if they’re specified in the mission).

Not for static image capture at the moment (so probably not), but potentially if you’re wanting a video stream that’s advertised over MAVLink.

I believe ArduPilot missions support events at waypoints, so for software triggering it should be possible to send a MAVLink message that your software (on the onboard computer or the control station computer) can monitor for and run the image capture process as appropriate, or for hardware triggering it could toggle one of the flight controller’s relay outputs.
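
For the relay option, something along these lines should work with pymavlink (untested; the relay number and endpoint are placeholders):

```python
# Untested sketch: toggle a flight controller relay output, e.g. for
# hardware camera triggering. Relay number and endpoint are placeholders.
from pymavlink import mavutil

master = mavutil.mavlink_connection("udpin:0.0.0.0:14550")
master.wait_heartbeat()

def set_relay(relay_num: int, on: bool) -> None:
    master.mav.command_long_send(
        master.target_system, master.target_component,
        mavutil.mavlink.MAV_CMD_DO_SET_RELAY,
        0,               # confirmation
        relay_num,       # param1: relay instance
        1 if on else 0,  # param2: 1 = on, 0 = off
        0, 0, 0, 0, 0,   # params 3-7 unused
    )
```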

Do you need the images as they’re taken, or only after the dive? That may influence what makes the most sense. I don’t see how MAVLink would be relevant for image transport, but (again) data from the MAVLink stream will likely be valuable for the geolocation side of things.
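
If you do want them during the dive, a plain HTTP upload from the onboard computer to a small topside server is probably the simplest option; a rough sketch (the URL is a placeholder, though 192.168.2.1 is the usual topside address):

```python
# Rough sketch: push each geotagged JPEG to a topside HTTP endpoint as
# it's captured. The URL is a placeholder for whatever server you run.
import requests

def ship_image(path: str, url: str = "http://192.168.2.1:8000/upload") -> None:
    with open(path, "rb") as f:
        requests.post(url, files={"image": (path, f, "image/jpeg")}, timeout=10)
```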

QGroundControl has existing Survey functionality, but not all of the functionality is available for underwater vehicles, and I don’t believe it supports sending and displaying the images as they’re taken.

If you actually need to change the functionality and/or interface then I suspect Cockpit will make the most sense for this (since it’s being designed to be much easier to modify than QGC), but that depends a bit on how soon you’re wanting your interface to be “done”. Feature-wise this seems relevant to Blue Boat so I suspect we’ll be happy to help with the development.

We’re planning an initial beta release of Cockpit towards the end of this quarter, at which point there’ll be some documentation available and we’ll more meaningfully be able to discuss available features and the customisation process.

Hi @EliotBR: Thanks so much for your input (and sorry about the slow reply).

That’s fantastic. Exactly what I needed to know.

It would be nice to have images viewable topside while the transect is being performed… but since it’s not a live stream, delays are fine.

Yes, we’d be keen to take a look at the Cockpit alpha/beta when it becomes available. Our MVP is pretty basic, so we can muddle along as long as we can run a linear transect and capture images along the way. If we knew Cockpit was the right UX to build on then we could hold off putting dev effort into QGC.

Thanks again for the feedback - and to the whole BlueRobotics team for the great work you do.

Alpha is already available, and there are now docs for it too that describe the current feature set and some of our future intentions :slight_smile:


First glance at Cockpit looks awesome @EliotBR. How do you want to receive any feedback / questions? I couldn’t find it on GitHub.

I’ve just added a forum category for it: Cockpit (alpha), which is where any discussion can go for now.

Note that Cockpit is not officially released yet, and will not be officially supported until the 1.0 stable release is made, but in the meantime conceptual questions and suggestions should get responded to, as well as reports of any problems that come up while testing the alpha versions :slight_smile:

The repository is set to private for now, at least until we’ve stabilised some more of the base infrastructure and decided on an appropriate license.

If that drags out I’ll try to get a public issues repository set up so it’s at least possible to see which problems are already known about, and more readily track progress on suggested ideas.


There’s some good stuff in this thread. The BR promo video of the WL A50 shows an example of point-to-point flight. Is it also possible to programme a maximum altitude into such a mission? For the type of benthic survey Peter is describing, maintaining a fixed distance from the seabed is important.

Regarding the GPS fix, we find the ROV will shoot off as soon as it’s dropped in the water. I’ve seen some cheap solutions with a small GPS antenna in a bottle on top of the ROV, which is supposed to capture a GPS fix of the ROV when it surfaces. I’m considering doing something similar. Is the idea to plug this into the Pixhawk GPS input?

Rather than towing a GPS sensor around (or adding more cost with a USBL/SBL/LBL solution) I’m still wondering what could be achieved with a starting lat/long and then dead reckoning based on DVL and IMU input. I found this paper which I need to dig into a bit more:

I guess I was hoping there would be some built-in way for the DVL x, y and heading information to update the lat/long automatically. If it did update from the DVL and IMU then we’d get a rough idea of location, which might be a good starting point. Then if you need better accuracy you can always go to USBL/SBL/LBL solutions.

Any thoughts and ideas appreciated!

Hi Peter!
For a consistent survey of a given location, rather than guessing the lat/long of the ROV from its initial entry point, it may be easier to use a known reference point. If you navigate the ROV manually to achieve a consistent location over a target, at a desired heading, and then activate position hold, followed by an Auto mission with the DVL providing position, you should cover nearly the same path every time! You could then give this mission global coordinates if you surveyed in the start point via some other means, and even update the auto mission to use global coordinates and define your start position via map-click with the Water Linked DVL extension. Just an idea!

Thanks @tony-white. The problem we’re looking to solve is more about tracking where we’ve been and aligning that with what we see in the imagery, rather than the targeting and driving. The planning/piloting at this stage is pretty straightforward, defined by a target starting position (lat, long), a heading, and a distance to travel (i.e. a pretty standard transect definition). What we want to do is understand the presence/absence and density of a marine pest weed species along that transect.

Think of it as the ROV equivalent of what you would do with a towcam, where you typically drive a boat in a straight line, capture the boat’s GPS location, offset it by the distance to the towcam to get an approximate towcam position, and record a stream of GPS coordinates along with the video (synchronised to a timebase).

If we can achieve this basic functionality with the ROV (driven manually) then we’d look to build on it with some kind of area scan capability (automated).

Ultimately this is about hardware/software cost saving: can we achieve this with dead reckoning using the DVL, vs. adding more cost with GPS sensing?


Very cool Peter - and definitely something I’m experimenting with as well. I’ll share results if I have any!

@tony-white : If there’s potential to collaborate on this let me know. Our preference would be for it to get rolled into core product.
