However, memory mapped files share a lot of the same advantages you list:
1) independence of logical data from physical storage - yes
2) sharing of the dataset between tasks and machines - yes (you do need both endian forms)
3) rapid integrity verification - yes, even faster as once verified, no need to verify again
4) optimization of storage method to match the access patterns - yes that is exactly what has been done
5) maintenance of transactional integrity with related datasets and processes - yes
6) fractional/incremental backup/restore while accessing software is online - yes
7) support for ad-hoc queries without the need to write software - no
ease of integration with new or unrelated software packages - no
9) compliance with accounting and auditing standards - not sure
10) easier gathering of statistics about access patterns - not without custom code
So it depends on if 7 to 10 trump the benefits and if the resources are available to get it working
James
I really don't want to be discouraging to you. You do very creative and innovative work, but you know less than zero about databases, and it is going to hamper you. You write like an autodidact or maybe you went to some really bad school.
1) no mmap()-ed files as used by you don't provide independence of the physical storage layout. It is fixed in your code and in your proposed filesystem usage. Not even simple tasks like adding additional disk volumes with more free space while online.
2) it doesn't seem like your code does any locking, so currently you can't even do sharing on a single host. Shared mmap() over NFS (or MapViewOfFile() over SMB) is only for desperadoes.
3) here you slip from zero knowledge to willful, aggressive ignorance. Marking files read-only and using SquashFS is not storage integrity verification. Even amateur non-programmers, like musicians or video producers are on average more aware of the need to verify integrity.
4) unfortunately you've got stuck in a rut of exclusively testing the initial sync. This is very non-representative access pattern. Even non-programmers in other thread are aware of this issue. This is the reason why professional database systems have separate tools for initial loading (like SQL*Loader for Oracle or BULK INSERT in standard SQL).
5) I don't believe you know what you're talking about. I'm pretty sure that you've never heard of
https://en.wikipedia.org/wiki/Two-phase_commit_protocol and couldn't name any
https://en.wikipedia.org/wiki/Transaction_processing_system to save your life.
6) I couldn't find any trace of support for locking or incremental backup/restore in your code. Personally, you look like somebody who rarely backs up, even less often restores and verifies. Even amateur, but experienced non-programmers seem to be more aware of the live-backup issues.
So not 7/10 or even 6/10. It is 0/10 with big minus for getting item 3) so badly wrong.
Again, I don't want to be discouraging to your programming project, although I don't fully comprehend it. Just please don't write about something you have no understanding.
People like Gregory Maxwell, Mike Hearn or Peter Todd are getting paid to pretend to not understand. Old quote:
It is difficult to get a man to understand something, when his salary depends upon his not understanding it!