Please check this code:
//================================ Setup Phase =============================//
//Absorbing salt, password and basil: this is the only place in which the block length is hard-coded to 512 bits
ptrWord = &wholeMatrix[0];
for (i = 0; i < nBlocksInput; i++) {
    absorbBlockBlake2Safe(state, ptrWord); //absorbs each block of pad(pwd || salt || basil)
    ptrWord += BLOCK_LEN_BLAKE2_SAFE_BYTES; //goes to next block of pad(pwd || salt || basil)
}
-------------------------------------------------------
BLOCK_LEN_BLAKE2_SAFE_BYTES is 64.
The problem is this:
ptrWord is a pointer to 64-bit words, but that line adds 64 to it, which advances it by 64 * 64 = 4096 bits = 512 bytes instead of 64 bytes.
So in the next round, everything ptrWord points to is zero,
which means the concatenated basil and the padding are never absorbed!
You must change BLOCK_LEN_BLAKE2_SAFE_BYTES to BLOCK_LEN_BLAKE2_SAFE_INT64 on that line.
Surely, though, "absorbBlockBlake2Safe" absorbs an entire block of 64 bytes (512 bits). ptrWord is then pushed forward by another 512 bits since, as I understand it, applying += to a pointer advances it in multiples of bytes. So on the line "ptrWord += BLOCK_LEN_BLAKE2_SAFE_BYTES;", the pointer would be pushed forward by 64 bytes (512 bits), and the process is repeated for the next block.