Thank you for giving me the chance to clarify this. Our research, carried out as part of our master's program at Politecnico, has been devoted to the analysis of Bitcoin transaction records. We aimed to discern whether it's possible to identify fraudulent behavior just by scrutinizing these transactions. Our results were encouraging. At least in this phase, we look solely at an address's own transaction record, not at the network of interconnected transactions.
In terms of the algorithm, here's a bit more detail. From a transaction record, we derive roughly 200 attributes or 'features.' Some of these are proprietary, but most can be shared. They include the average of received and sent amounts over different time periods (day, week, month), as well as the minimum and maximum amounts received and sent. We also look at the frequency of transactions. These are mostly simple statistical metrics, though some more complex features are also calculated from the transaction record.
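To make this concrete, here is a minimal Python sketch of how a handful of such simple metrics could be computed for one address. It is an illustration only, not the actual feature code: the `Tx` record, the function names, the feature names, and the 1/7/30-day windows are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Tx:
    """Hypothetical, simplified view of one transaction touching an address."""
    timestamp: datetime
    amount: float    # amount in BTC, always positive
    direction: str   # "in" = received, "out" = sent


def window_features(txs, now, days, label):
    """Count, mean, min and max of received and sent amounts in one time window."""
    start = now - timedelta(days=days)
    feats = {}
    for direction in ("in", "out"):
        amounts = [
            t.amount
            for t in txs
            if t.direction == direction and start <= t.timestamp <= now
        ]
        prefix = f"{direction}_{label}"
        feats[f"{prefix}_count"] = len(amounts)  # transaction frequency
        feats[f"{prefix}_mean"] = mean(amounts) if amounts else 0.0
        feats[f"{prefix}_min"] = min(amounts) if amounts else 0.0
        feats[f"{prefix}_max"] = max(amounts) if amounts else 0.0
    return feats


def address_features(txs, now):
    """Combine the day, week and month windows into one feature dictionary."""
    feats = {}
    for label, days in (("day", 1), ("week", 7), ("month", 30)):
        feats.update(window_features(txs, now, days, label))
    return feats


# Made-up transactions, for illustration only.
now = datetime(2023, 1, 31, 12, 0)
example_txs = [
    Tx(now - timedelta(hours=2), 0.5, "in"),
    Tx(now - timedelta(days=3), 1.5, "in"),
    Tx(now - timedelta(days=20), 0.2, "out"),
]
for name, value in sorted(address_features(example_txs, now).items()):
    print(name, value)
```

This toy version yields 24 values (3 windows, 2 directions, 4 statistics); the real set of roughly 200 features would extend the same idea with more windows and more complex measures.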
At this point, our artificial intelligence (AI) steps in. It's been trained to take all these factors into account and learn which are the most significant in identifying a fraudster's behavior. As for how the AI makes these decisions, it's a bit of a black box, which is why machine learning algorithms are often referred to as 'unexplainable.' The AI surpasses human ability to evaluate large amounts of statistical data, making it an ideal tool for tasks like this one. Similar technologies are used in advanced anti-fraud systems across the payment industry.
To summarize, our process works as follows:
1. For a given address, we examine the transaction history.
2. We calculate roughly 200 statistical metrics related to that address and pass this data to an algorithm, which has been trained to spot the telltale signs of a fraudulent actor.
The exact way the algorithm weighs each of these metrics, and why, is something that remains opaque to us. We know that during training, the algorithm adjusted the weights of the features, looking for the combination that was most successful in identifying fraudulent behavior from the pool of 10,000 addresses (good and bad) that we used for training it.
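For readers who want a feel for what "adjusting the weights" means, here is a small Python sketch. It uses plain logistic regression trained by gradient descent, chosen only because its weights are easy to inspect; it is a stand-in, not the model behind the tool, and the two features and six labeled rows are invented for the example.

```python
import math


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(x, weights, bias):
    """Estimated probability (0..1) that a feature vector belongs to a fraudster."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train(rows, labels, epochs=2000, lr=0.5):
    """Fit one weight per feature by repeatedly nudging weights to reduce error."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = score(x, weights, bias) - y  # prediction minus true label
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        # Move each weight a small step against its average gradient.
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


# Invented training data: two features scaled to 0..1 per address
# (think "transaction frequency" and "largest sent amount"); 1 = bad, 0 = good.
rows = [[0.9, 0.2], [0.8, 0.7], [0.95, 0.5], [0.1, 0.3], [0.2, 0.8], [0.15, 0.6]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train(rows, labels)
print("learned weights:", weights, "bias:", bias)
print("score for a new address:", score([0.85, 0.4], weights, bias))
```

With only two features the learned weights are easy to read. With roughly 200 features and a more complex model, the same fitting process produces a combination that is far harder to interpret, which is the opacity described above.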
At this juncture, I would recommend that you give it a try: https://carscore.io. Through your own experience, you can gauge whether its predictive capabilities are of value to the community. That's the very reason we embarked on this project. See you soon! /M