I can’t speak for other systems, but for integration with Blue Robotics vehicles (and others using our software stack), that’s likely one of the most straightforward approaches. A Lua script running as part of the autopilot firmware could be an alternative, but while that gains extra access to autopilot internals, it loses access to the non-motion-critical peripherals (like cameras and sonars), so it may miss important insights from the system as a whole.
I’m not well-versed in how to provide guarantees that data was not falsified in a live stream, but if it’s relevant, BlueOS 1.5 (currently in beta) introduces synchronised logging into MCAP files, with Zenoh-based communication. It may be possible to do something like occasionally hash the current MCAP file and inject that hash into the Zenoh stream, as a regular check-in on the data that was included up to that point.
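As a rough sketch of that check-in idea, something like the following could periodically hash the MCAP file written so far and publish the digest. Note this is just an illustration under stated assumptions: the key name is hypothetical, and `session` is assumed to be an already-opened Zenoh session (e.g. from the eclipse-zenoh Python bindings), not a documented BlueOS interface.

```python
# Periodic integrity check-in sketch: hash the log file's current contents
# and publish the digest, so a listener recording those digests can later
# verify that the retrieved MCAP file matches what existed at each check-in.
import hashlib


def file_digest(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of the file's current contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large log files don't need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def publish_checkin(session, mcap_path: str) -> str:
    """Hash the log so far and publish it as a check-in message.

    `session` is assumed to be an opened Zenoh session, and the key
    expression below is a made-up example, not an established topic.
    """
    digest = file_digest(mcap_path)
    session.put("vehicle/logs/mcap_digest", digest)  # hypothetical key
    return digest
```

This doesn’t by itself prove the data was true at the source (per the concern below), but it does let an independent observer detect after-the-fact modification of anything logged before a check-in.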
That probably depends on the setup and the usage conditions. Software development at Blue Robotics assumes that internet access during operation may be non-existent, intermittent, or low-bandwidth. Some operators intentionally run air-gapped systems for security reasons (in which case manual retrieval would need to be an option), while others are just operating in remote areas, without good reasons to pay for an internet connection while deployed (but may then connect the vehicle to the internet when they get back home, to check for updates and the like).
There are undeniable convenience benefits to systems that can automatically upload critical data to a cloud server somewhere, and live updates of system integrity could be a valuable reference if something goes wrong. I expect at least some prospective users would be interested in such a feature, though probably not all of them (and, in the absence of competition, a lack of it may not be a dealbreaker).
My main concern with the viability of this at present is that the data itself cannot necessarily be trusted at the source, regardless of how verifiable it is that the recorded data has not been modified since.
That’s a great question. As a relatively new industry, with substantial growth as equipment becomes more accessible to people from all walks of life, I expect many users aren’t at the stage where that’s relevant yet, and those that are likely don’t have standardised/common ways of doing so. Accordingly, if and when it’s being done at all, it’s likely somewhat haphazard/inconsistent, and may be quite limited in scope.
As is, we’re yet to standardise or provide great tooling for even tracking and managing core data like maintenance and operating schedules, which I’d consider a more pressing concern (including for risk tracking) than additional validity guarantees on what effectively amounts to log data. That is something we’re working on developing, but its current absence may give some insight into the state of the industry more broadly.
Adding to that, underwater positioning is still not a trivially solved problem, and is not tracked on every vehicle - there are methods for it, but they can be quite costly, are often relative rather than global/absolute, and can have limited value in the absence of detailed underwater maps. Where risk tracking and liability are concerned I’d expect that “knowing where the vehicle was, when, and what was around it” are all pretty important, but I’m also unsure how that could best be guaranteed, since, for example, a relative positioning system could always be set up with an incorrect offset or orientation, and everything else is based on that.
Robotic vehicles are also frequently used for exploration in unmapped areas, or areas where the terrain may have moved or changed since the last map was made (e.g. due to silt buildup/movement, or someone dumping something in a river/lake). A lack of guaranteed environment is not unique to marine robotics, but is probably a relevant consideration when estimating risk, and I’m not sure there are even well-established procedures or protocols for risk estimation outside of specific well-established industries.
Requirements also tend to vary by jurisdiction, so standardisation efforts would likely only prove effective if they’re accompanied by the establishment of best practices that are well-founded, and meet or exceed most of the disparate legal requirements that already exist. One of the largest unknowns in this space at the moment is often what is allowed where, and under what conditions, so a good reference of that could definitely help the industry to progress with more certainty. That said, establishment/hardening of dedicated regulations is not always welcomed, particularly if they prove substantially more restrictive than the common-sense / favourable best guess approaches that are being used in any given place.
A bit of a ramble, but hopefully there are some meaningful insights there.