More nonces don't mean a better chip. More diff-1 nonces, sure, but the actual nonce count must be scaled against the share value (the difficulty each nonce was found at).
More nonce output (shifted out from the register) is a quantity benchmark, which can be controlled by the chip.
Higher difficulty achieved by a nonce is a quality benchmark, which cannot be controlled by the chip.
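
To make the scaling concrete, here is a rough sketch of weighting a nonce count by share difficulty. It assumes the usual approximation that one diff-1 share takes about 2**32 hashes on average. The function name and the chip numbers are made up for illustration, not measurements from any real chip.

```python
# Expected number of hashes per difficulty-1 share (common approximation).
HASHES_PER_DIFF1 = 2**32

def estimated_hashrate(nonce_count, share_difficulty, seconds):
    """Estimate hashes per second from nonces found at a given share difficulty."""
    return nonce_count * share_difficulty * HASHES_PER_DIFF1 / seconds

# Hypothetical numbers, for illustration only.
chip_a = estimated_hashrate(nonce_count=6000, share_difficulty=1, seconds=60)
chip_b = estimated_hashrate(nonce_count=100, share_difficulty=256, seconds=60)

# chip_b reports far fewer nonces but represents more work.
print(chip_a < chip_b)
```

So a raw nonce count only compares chips fairly when both are reporting at the same difficulty.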
Can the BM1397 filter nonces by difficulty inside the chip itself?
E.g. use the last 10 zeros as the criterion for outputting a nonce.
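
For reference, this is roughly what such a filter would compute, written as a host-side sketch. It says nothing about whether the BM1397 can do this in hardware. It assumes "10 zeros" means 10 zero bits, and that they are counted the Bitcoin way: the double SHA-256 digest is read as a little-endian 256-bit number, so the zeros sit at the end of the raw digest bytes. All function names here are hypothetical.

```python
import hashlib

def hash_value(header: bytes) -> int:
    """Double SHA-256 of a block header, read as a little-endian 256-bit integer."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return int.from_bytes(digest, "little")

def zero_bits(value: int) -> int:
    """Number of zero bits at the top of a 256-bit value."""
    return 256 - value.bit_length()

def passes_filter(header: bytes, min_zero_bits: int) -> bool:
    """Hypothetical filter: output the nonce only if the hash has enough zero bits."""
    return zero_bits(hash_value(header)) >= min_zero_bits
```

With a threshold of n zero bits, about 1 in 2**n hashes passes, so raising n trades nonce quantity for nonce quality. Diff-1 corresponds to roughly 32 zero bits.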