Or disable the cards one by one until the temp alarm stops, then flag the faulty one and re-enable the others.
I don't see any way to implement it properly:
1. GPU temp can reach the stop threshold and then drop a bit on its own.
2. Several GPUs can reach threshold at the same time.
3. It will take a lot of time to detect the card on a system with many GPUs.
A better way is to create a special mode that detects the "-di" value that corrects the GPU order. For example, you press the "d" key, the miner stops mining, loads the cards one by one, checks the temps and works out their order. A couple of minutes later, after checking all GPUs, the miner shows you which "-di" value you must use. This would also fix the issue with incorrect GPU indexes in the temperature/fan stats that the miner displays.
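To make the detection mode idea a bit more concrete, here is a rough sketch in Python. It is not the miner's actual code: `set_load` and `read_temps` are hypothetical callbacks standing in for whatever the miner uses to start/stop work on one device and to read all temperature sensors, and the 20-second settle time is just a guess. It loads one device at a time and records which sensor heated up the most.

```python
import time
from typing import Callable, List, Sequence


def detect_sensor_order(
    num_gpus: int,
    set_load: Callable[[int, bool], None],      # hypothetical: start/stop work on one device
    read_temps: Callable[[], Sequence[float]],  # hypothetical: temps of all sensors, in sensor order
    settle_seconds: float = 20.0,               # assumed warm-up / cool-down time
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Return order[i] = index of the sensor that rose most while device i was loaded."""
    order: List[int] = []
    for dev in range(num_gpus):
        baseline = list(read_temps())   # all cards idle at this point
        set_load(dev, True)
        sleep(settle_seconds)           # let the loaded card warm up
        loaded = list(read_temps())
        set_load(dev, False)
        sleep(settle_seconds)           # let it cool before testing the next card
        rises = [after - before for before, after in zip(baseline, loaded)]
        # The sensor with the biggest temperature rise belongs to this device.
        order.append(max(range(len(rises)), key=rises.__getitem__))
    return order
```

The returned list is only the device-to-sensor mapping; turning it into the exact "-di" string is left out here because that depends on the miner's own index format. Note that the run time grows with the number of GPUs, since every card gets its own warm-up and cool-down.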
Another option is to check whether the GPU is still overheated 30 seconds after mining is disabled on it. If it is, we disabled the wrong card, and the entire miner will be closed.
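The 30-second check could look roughly like the sketch below. All three callbacks (`disable_gpu`, `read_temp`, `shutdown_miner`) are hypothetical placeholders, not real miner functions, and `stop_temp` stands for the configured stop threshold.

```python
import time
from typing import Callable


def handle_overheat(
    disable_gpu: Callable[[], None],     # hypothetical: stop mining on the suspected card
    read_temp: Callable[[], float],      # hypothetical: read the sensor that raised the alarm
    shutdown_miner: Callable[[], None],  # hypothetical: close the entire miner
    stop_temp: float,
    wait_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Disable the suspected card; return True if it cooled down, False if the miner was closed."""
    disable_gpu()
    sleep(wait_seconds)
    if read_temp() >= stop_temp:
        # Still overheated after the wait: we disabled the wrong card.
        shutdown_miner()
        return False
    return True
```

Here "still overheated" is taken to mean "still at or above the stop threshold", which is an assumption on my side; a small margin below the threshold may work better in practice.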
Assuming you can see which number went over the limit, you monitor that same number while turning cards off. Disable a card, wait 5 sec, and if the temp doesn't drop 5 degrees, enable it again and take the next one.
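In rough Python the loop would be something like this. The `disable`, `enable` and `alarm_sensor_temp` callbacks are made-up names for illustration only; the 5 sec and 5 degrees are the values from the description above.

```python
import time
from typing import Callable, Iterable, Optional


def find_overheating_card(
    card_ids: Iterable[int],
    alarm_sensor_temp: Callable[[], float],  # hypothetical: temp of the sensor that went over the limit
    disable: Callable[[int], None],          # hypothetical: stop mining on a card
    enable: Callable[[int], None],           # hypothetical: resume mining on a card
    wait_seconds: float = 5.0,
    min_drop: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Return the card whose shutdown cooled the alarming sensor, or None if no card did."""
    for card in card_ids:
        before = alarm_sensor_temp()
        disable(card)
        sleep(wait_seconds)
        if before - alarm_sensor_temp() >= min_drop:
            return card   # this is the hot one, leave it disabled
        enable(card)      # not the culprit, put it back to work
    return None
```

If it returns None, no single card accounted for the drop, for example when several GPUs are hot at once, so the caller still needs a fallback.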