Research on ROV visual autonomous obstacle avoidance navigation based on deep learning

Why can’t the Python script capture the video stream even though I used the official OpenCV code?

#!/usr/bin/env python
"""
BlueRov video capture class
"""

import cv2
import gi
import numpy as np

gi.require_version('Gst', '1.0')
from gi.repository import Gst


class Video():
    """BlueRov video capture class constructor

    Attributes:
        port (int): Video UDP port
        video_codec (string): Source h264 parser
        video_decode (string): Transform YUV (12bits) to BGR (24bits)
        video_pipe (object): GStreamer top-level pipeline
        video_sink (object): GStreamer sink element
        video_sink_conf (string): Sink configuration
        video_source (string): UDP source IP and port
        latest_frame (np.ndarray): Latest retrieved video frame
    """

    def __init__(self, port=4777):
        """Summary

        Args:
            port (int, optional): UDP port
        """

        Gst.init(None)

        self.port = port
        self.latest_frame = self._new_frame = None

        # [Software component diagram](https://www.ardusub.com/software/components.html)
        # UDP video stream (ArduSub default is :5600; this class listens on the given port)
        self.video_source = 'udpsrc port={}'.format(self.port)
        # [Rasp raw image](http://picamera.readthedocs.io/en/release-0.7/recipes2.html#raw-image-capture-yuv-format)
        # Cam -> CSI-2 -> H264 raw (YUV I420, 12 bits/pixel)
        self.video_codec = '! application/x-rtp, payload=96 ! rtph264depay ! h264parse ! avdec_h264'
        # Convert the 12-bit YUV (I420) frames to OpenCV's standard 8-bit-per-channel BGR
        self.video_decode = \
            '! decodebin ! videoconvert ! video/x-raw,format=(string)BGR ! videoconvert'
        # Create a sink to get data
        self.video_sink_conf = \
            '! appsink emit-signals=true sync=false max-buffers=2 drop=true'

        self.video_pipe = None
        self.video_sink = None

        self.run()

    def start_gst(self, config=None):
        """ Start gstreamer pipeline and sink
        Pipeline description list e.g:
            [
                'videotestsrc ! decodebin', \
                '! videoconvert ! video/x-raw,format=(string)BGR ! videoconvert',
                '! appsink'
            ]

        Args:
            config (list, optional): GStreamer pipeline description list
        """

        if not config:
            config = \
                [
                    'videotestsrc ! decodebin',
                    '! videoconvert ! video/x-raw,format=(string)BGR ! videoconvert',
                    '! appsink'
                ]

        command = ' '.join(config)
        self.video_pipe = Gst.parse_launch(command)
        self.video_pipe.set_state(Gst.State.PLAYING)
        self.video_sink = self.video_pipe.get_by_name('appsink0')

    @staticmethod
    def gst_to_opencv(sample):
        """Transform byte array into np array

        Args:
            sample (TYPE): Description

        Returns:
            TYPE: Description
        """
        buf = sample.get_buffer()
        caps_structure = sample.get_caps().get_structure(0)
        array = np.ndarray(
            (
                caps_structure.get_value('height'),
                caps_structure.get_value('width'),
                3
            ),
            buffer=buf.extract_dup(0, buf.get_size()), dtype=np.uint8)
        return array

    def frame(self):
        """ Get Frame

        Returns:
            np.ndarray: latest retrieved image frame
        """
        if self.frame_available():
            self.latest_frame = self._new_frame
            # reset to indicate latest frame has been 'consumed'
            self._new_frame = None
        return self.latest_frame

    def frame_available(self):
        """Check if a new frame is available

        Returns:
            bool: true if a new frame is available
        """
        return self._new_frame is not None

    def run(self):
        """ Get frame to update _new_frame
        """

        self.start_gst(
            [
                self.video_source,
                self.video_codec,
                self.video_decode,
                self.video_sink_conf
            ])

        self.video_sink.connect('new-sample', self.callback)

    def callback(self, sink):
        """Appsink 'new-sample' callback: pull the sample and store the decoded frame"""
        sample = sink.emit('pull-sample')
        self._new_frame = self.gst_to_opencv(sample)

        return Gst.FlowReturn.OK


if __name__ == '__main__':
    # Create the video object
    # Pass port= if you need to use a different port
    video = Video(port=4777)

    print('Initialising stream...')
    waited = 0
    while not video.frame_available():
        waited += 1
        print('\r  Frame not available (x{})'.format(waited), end='')
        cv2.waitKey(30)
    print('\nSuccess!\nStarting streaming - press "q" to quit.')

    while True:
        # Wait for the next frame to become available
        if video.frame_available():
            # Only retrieve and display a frame if it's new
            frame = video.frame()
            cv2.imshow('frame', frame)
        # Allow frame to display, and check if user wants to quit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

You need to arm the vehicle before it will allow you to control the motors. There are also various failsafes that you’ll need to either account for or disable for continuous control.
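For reference, arming programmatically could look something like this with pymavlink (a rough sketch, not from the linked example - the connection string is an assumption and needs to match wherever your ArduSub MAVLink stream is available):

from pymavlink import mavutil

# Connect to the ArduSub MAVLink stream (adjust the connection string to your setup)
master = mavutil.mavlink_connection('udpin:0.0.0.0:14550')
master.wait_heartbeat()

# Arm the vehicle and wait until the autopilot reports it as armed
master.arducopter_arm()
master.motors_armed_wait()
print('Vehicle armed - remember to account for or disable the relevant failsafes')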

This example could be a useful reference, and includes some relevant notes at the bottom.

Are you creating an H264-encoded video stream that’s being served at port 4777? If not, you’ll need to either do that, change which port the code is listening on, or change which encoding it’s trying to parse the stream as.
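If you just want to confirm the capture class works independently of the vehicle, one option is to serve an H264/RTP test pattern to that port yourself. This is a sketch only - the host and port are assumptions, and should match wherever the capture script is running:

#!/usr/bin/env python
"""Serve an H264/RTP test pattern over UDP for the capture class to receive."""
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

Gst.init(None)

# videotestsrc -> H264 encode -> RTP payload -> UDP, mirroring what the
# capture pipeline above expects to receive
pipeline = Gst.parse_launch(
    'videotestsrc is-live=true '
    '! videoconvert '
    '! x264enc tune=zerolatency speed-preset=ultrafast '
    '! rtph264pay config-interval=1 pt=96 '
    '! udpsink host=127.0.0.1 port=4777'
)
pipeline.set_state(Gst.State.PLAYING)

try:
    GLib.MainLoop().run()  # stream until interrupted
except KeyboardInterrupt:
    pipeline.set_state(Gst.State.NULL)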

Hi @akang-211 -
You may have an easier time working with the video stream in OpenCV if you configure the stream to be RTSP, and not a simple UDP stream…
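If the stream is served as RTSP, OpenCV can usually open it directly (the URL below is only a placeholder - substitute the RTSP address your stream is actually configured with):

import cv2

# Placeholder RTSP URL - replace with your stream's actual address
cap = cv2.VideoCapture('rtsp://192.168.2.2:8554/video_stream_0', cv2.CAP_FFMPEG)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()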


If I flash this Docker image onto it, will it overwrite the Raspberry Pi system I currently have? (My idea is to use the Raspberry Pi system officially provided by OAK, and then load the BlueOS Docker image onto it, so that I may be able to use my original OAK code on the Raspberry Pi.)

Hi @akang-211 -
Can you provide more context, like a link to the “official” OAK software you’re referring to? If you flash the SD card it will overwrite any existing OS - you do that with the .img file; the .tar is used for an offline update of BlueOS.
Your original code should be executed within a docker container, as part of a BlueOS extension. Have you followed the documentation on this process linked previously?


If you want to integrate with Cockpit then your main options would be to either

  1. serve a web-page that talks back to your BlueOS Extension to receive the annotation information, then draws it
    • this could be displayed in Cockpit using an iframe widget, as I mentioned in my initial comment
  2. provide a WebSocket from your Extension that talks to a custom Cockpit widget
    • this could be tested using a DIY widget, with code to draw annotations on an HTML canvas

The main components of that could look something like this:
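For the WebSocket side of option 2, here is a minimal sketch (assuming the websockets Python package, with a made-up port, message format, and a fake_detections() placeholder standing in for your real inference results):

import asyncio
import json

import websockets  # single-argument handlers need websockets >= 10.1


async def fake_detections():
    """Placeholder for your real OAK-D / inference results."""
    return [{'label': 'obstacle', 'confidence': 0.9,
             'box': [0.1, 0.2, 0.4, 0.5]}]  # normalised x, y, w, h


async def annotations(websocket):
    # Push annotation messages to the connected Cockpit widget; a real
    # Extension would send whenever new inference results arrive
    while True:
        detections = await fake_detections()
        await websocket.send(json.dumps({'detections': detections}))
        await asyncio.sleep(0.1)


async def main():
    # 0.0.0.0 so the topside computer running Cockpit can reach the Extension
    async with websockets.serve(annotations, '0.0.0.0', 8765):
        await asyncio.Future()  # run forever


if __name__ == '__main__':
    asyncio.run(main())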

and you could use an existing OAK-D focused Extension like this one or this other one as a basis for several of the relevant steps.


Note also that processing frames on the Raspberry Pi requires a lot of data bandwidth to the Pi and quite a lot of its processing capacity (which may be better directed to other services and tasks), and it typically adds latency to the stream - especially if all you’re doing on the Pi is overwriting image data with annotations and then re-encoding the stream to send it elsewhere.

If you want to install BlueOS as secondary functionality on an existing operating system image then you likely want to use the install script rather than trying to manually install a BlueOS Docker image. That said, we’d generally recommend installing a BlueOS image and then running other things in parallel via the BlueOS Extension system, so that they don’t interfere with the core BlueOS services, and so the setup can be more easily reproduced and shared to other BlueOS devices.

The Raspberry Pi operating system images that get flashed onto an SD card are not the same as Docker images (which require a base operating system to be already installed, as well as Docker), so this question doesn’t make much sense.

Now I can use the algorithm I mentioned before on the Raspberry Pi, but the inference speed is too low. Is there any way to improve the inference speed? The following is a video of me using the Raspberry Pi for inference (the inference speed is 2 frames/second, and the saved video is 30 frames/second)

The OAK series cameras are made for running ML algorithms, and have low-latency access to the raw frame data (before it loses data in the encoding process, and without needing to decode it). If there’s some way you can run your processing on the camera then that would definitely be preferable, and ideally you can completely avoid decoding or processing the frames on the Raspberry Pi at all, and just pass them through to the visualisation endpoint.
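As a rough sketch of that approach with the DepthAI v2 API (the MobileNet-style model.blob, input size, and confidence threshold are assumptions - substitute your own compiled model), detection runs entirely on the OAK device and only the lightweight results come back to the Raspberry Pi:

import depthai as dai

pipeline = dai.Pipeline()

# Camera preview sized to the network input, kept on the device
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(300, 300)
cam.setInterleaved(False)

# Run the detection network on the OAK's VPU, not on the Raspberry Pi
nn = pipeline.create(dai.node.MobileNetDetectionNetwork)
nn.setBlobPath('model.blob')  # your compiled model (assumption)
nn.setConfidenceThreshold(0.5)
cam.preview.link(nn.input)

# Send only the detection results (not frames) back to the host
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName('detections')
nn.out.link(xout.input)

with dai.Device(pipeline) as device:
    queue = device.getOutputQueue('detections', maxSize=4, blocking=False)
    while True:
        for det in queue.get().detections:
            print(det.label, det.confidence,
                  det.xmin, det.ymin, det.xmax, det.ymax)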