So 2 bytes indicate the size of the data?
It is "up to" 2 bytes. It can be 0, 1 or 2.
For example it could be OP_RETURN OP_2 (which passes IsPushOnly()).
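Here is a rough Python sketch of where the 0, 1 or 2 bytes come from, assuming the usual Script push encoding (a single length opcode for 1 to 75 bytes, OP_PUSHDATA1 plus a length byte for 76 to 255 bytes). The function names are made up for illustration and are not from Bitcoin Core.

```python
# Opcode values from Bitcoin Script.
OP_RETURN = 0x6A
OP_PUSHDATA1 = 0x4C
OP_2 = 0x52

def push_prefix(data_len: int) -> bytes:
    """Hypothetical helper: the bytes that announce a push of data_len bytes."""
    if 1 <= data_len <= 75:
        # The opcode itself is the length: 1 size byte.
        return bytes([data_len])
    if 76 <= data_len <= 255:
        # OP_PUSHDATA1 followed by a one-byte length: 2 size bytes.
        return bytes([OP_PUSHDATA1, data_len])
    raise ValueError("length not covered by this sketch")

def op_return_script(data: bytes) -> bytes:
    """Hypothetical helper: OP_RETURN followed by a single data push."""
    return bytes([OP_RETURN]) + push_prefix(len(data)) + data

# No size byte at all: OP_2 is a push opcode with no length prefix.
small_script = bytes([OP_RETURN, OP_2])

print(len(push_prefix(20)))                 # 1 size byte
print(len(push_prefix(80)))                 # 2 size bytes
print(len(op_return_script(b"\x00" * 80)))  # 1 + 2 + 80 bytes in total
```

With 80 bytes of data the whole script comes out at 83 bytes: 1 for OP_RETURN, 2 for the size and 80 for the data.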
Basically as impossible as sending bitcoin through a 0-conf transaction. That means anyone who makes a transaction with multiple OP_RETURN outputs, or an OP_RETURN where the data is more than 80 bytes, must contact a pool/miner and hope they agree to include the transaction.
Yes.
I just remembered there are variations of UTF-8 from 1 to 4 bytes. Do you mean 1-byte UTF-8?
I don't know much about UTF-8 but I don't think there is such a thing as "1 byte UTF-8".
AFAIK the way this encoding works is that you read one byte at a time and decide based on that byte how many more bytes you need to create the first character.
For example if the first byte is 0xxxxxxx (in binary where 0 is zero and x can be 1 or 0) then that byte is the character itself.
If the first byte is 110xxxxx (in binary) then you have to read another byte and the two bytes represent the character (the second byte also has to use 10xxxxxx format).
Similarly 1110xxxx for 3 bytes and 11110xxx for 4.
That means you can end up reading 4 bytes from the stream to be able to represent 1 character (0xF0, 0x90, 0x8D, 0x88 -> 𐍈 Hwair), since the binary is
11110000
10010000
10001101
10001000
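The rule of looking at the first byte to decide how many bytes to read can be sketched in Python roughly like this. The function names are made up for illustration, and the bit checks are deliberately simplified: they do not reject overlong or out-of-range sequences by themselves, so the final step is left to Python's own strict decoder.

```python
def utf8_sequence_length(first_byte: int) -> int:
    """Hypothetical helper: how many bytes the character starting here uses."""
    if (first_byte & 0b1000_0000) == 0b0000_0000:  # 0xxxxxxx
        return 1
    if (first_byte & 0b1110_0000) == 0b1100_0000:  # 110xxxxx
        return 2
    if (first_byte & 0b1111_0000) == 0b1110_0000:  # 1110xxxx
        return 3
    if (first_byte & 0b1111_1000) == 0b1111_0000:  # 11110xxx
        return 4
    raise ValueError("not a valid UTF-8 leading byte")

def decode_first_char(stream: bytes) -> str:
    """Hypothetical helper: read just enough bytes for the first character."""
    n = utf8_sequence_length(stream[0])
    chunk = stream[:n]
    if len(chunk) < n:
        raise ValueError("stream ended in the middle of a character")
    for b in chunk[1:]:
        # Every following byte has to look like 10xxxxxx.
        if (b & 0b1100_0000) != 0b1000_0000:
            raise ValueError("invalid continuation byte")
    # Python's decoder does the remaining strict validation.
    return chunk.decode("utf-8")

print(utf8_sequence_length(0x41))  # an ASCII byte is a character on its own
print(decode_first_char(bytes([0xF0, 0x90, 0x8D, 0x88])))  # the Hwair example
```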
You can check the Wikipedia link
https://en.wikipedia.org/wiki/UTF-8#Encoding
Or here is the .NET source code in C# where it converts the bytes to a string when they can't be mapped one-to-one to ASCII chars:
https://source.dot.net/#System.Private.CoreLib/Utf8Utility.Validation.cs,250