Please read and share. Amazing article from Marko Grobelnik, credtibti DAO team member.
Artificial intelligence quietly enters our lives
Marko Grobelnik, researcher at the Artificial Intelligence Laboratory at the Jožef Stefan Institute and Slovenia's digital ambassador, on new developments in artificial intelligence.
"The field of artificial intelligence has advanced rapidly over the last five years, but some problems have not yet been solved," says Marko Grobelnik, researcher at the Artificial Intelligence Laboratory at the Jožef Stefan Institute (IJS) and Slovenia's digital ambassador, who regularly works with a number of academic institutions around the world, including Stanford University and University College London (UCL).
Grobelnik is an expert on various aspects of artificial intelligence, including the analysis of large amounts of text data, machine learning, network analysis, data visualization and combinatorial optimization. Among other things, he is the co-founder of Cycorp Europe and the director and founder of Quintelligence (Intelligent Knowledge Management), which works with important European and American companies such as British Telecom, Microsoft Research, IBM Watson, the New York Times and Bloomberg. In September, the Slovenian government appointed him national digital ambassador for a four-year term.
A year ago, the AlphaGo computer program beat the world champion at the game of Go, one of the greatest achievements of artificial intelligence. Why has machine learning been progressing so fast in recent years?
Over the past two decades, artificial intelligence developed mostly in the background, in the shadow of web and mobile technologies, as a means of supporting the management of the massive data generated by rapidly growing Internet-related activities. In the past five years this led to a jump in development. This surprising progress has spurred the resurgence of machine learning, with a family of methods called deep learning. As a matter of fact, these methods are conceptually not that different from what they were in the 1990s, but at that time computers were significantly less powerful and there was not as much data available. Today computers are much more powerful, work in parallel, and the amount of data is huge.
When these conditions were met, machine learning researchers began to revive a technique called neural networks, which is basically not a very complicated mechanism, although it may seem that way. In fact, in a simple, basic form, one could be written by a high school student who knows how to program. But if this mechanism is run on extremely powerful computers with massive data, we can suddenly solve some important problems that could not be solved before.
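To illustrate how simple the basic mechanism is, here is a minimal sketch of a single artificial neuron, the building block of a neural network, learning the logical OR function. It is an illustration only and not code from the interview: the function names, the learning rate, the number of training passes and the choice of OR as a task are all assumptions made for the example. Real deep learning systems stack many such units and train them on far more data.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, bias, inputs):
    """Output of one neuron: weighted sum of inputs passed through sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_neuron(samples, epochs=2000, learning_rate=0.5):
    """Train one sigmoid neuron with plain gradient descent (assumed settings)."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # Difference between what the neuron says and what it should say.
            error = predict(weights, bias, inputs) - target
            # Nudge each weight against the error, in proportion to its input.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias

# Toy training data: the logical OR of two inputs.
OR_SAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, bias = train_neuron(OR_SAMPLES)
predictions = [round(predict(weights, bias, x)) for x, _ in OR_SAMPLES]
print(predictions)
```

The whole "learning" step is the two lines that adjust the weights and the bias; the power of modern systems comes from repeating this on huge networks, huge datasets and fast parallel hardware.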
One significant advance is the development of computer vision.
That's right. Computers now "see", which is very important for many types of applications. Furthermore, deep neural network techniques have greatly influenced the development of speech recognition, machine translation and the processing of various very complex data, such as images, sound and language. These results were quickly adopted by large companies like Google, Microsoft and Facebook, which incorporated them into their products. Examples of products that rely on these technologies and are used on a daily basis include Google Photos, Google Translate and Bing Translator, and smart assistants like Amazon's Echo and Apple's Siri. Google's search engine has also been significantly improved by this technology, and autonomous vehicles are developing rapidly as well.
Here I would like to emphasize that the large companies, that is, Microsoft, Google and Facebook, acted very generously and released virtually all their important deep learning software packages to the public, so that everyone can use them. Today, students working towards a diploma, a master's degree or a doctorate can take these companies' packages, download them to their own computer, and solve problems that could not be solved ten years ago.
What problems remain unresolved and are currently the biggest challenge for artificial intelligence researchers?
The techniques I have mentioned are very powerful, but they are too weak to handle complex structures, such as understanding text. They only observe data on the surface; they do not really go deep. In lectures, I often illustrate this problem with Wittgenstein's famous phrase "The limits of my language mean the limits of my world." This sentence holds true in artificial intelligence. First, we can ask ourselves what kind of language a computer uses, because that determines what it can express and how much it can say. The more powerful the language we use, the more we can say.
The question of the language in which the computer thinks and expresses itself is well researched. The question of "the world", as seen by the computer, is much less explored. In today's artificial intelligence systems, the understanding of the "world", or context, is very limited, and in this sense the solutions that can be expected are also limited. The real world in which we live is much more complicated, and computers can see it only in pieces that are not connected to one another. So, to come back to Wittgenstein's statement, we can say that we are trying to expand the computer's world so that it can search for solutions in this wider world, fit things together better and solve more difficult problems.
Do you use such techniques in working for large media companies such as the New York Times and Bloomberg?
Our cooperation with the New York Times began in 2007, when my colleague Blaž Fortuna and I gave a lecture at a conference in San Jose, California. A group of three staff members from the newspaper, who had listened to our lecture, came to see us and told us that they were using our programs, and then asked if we would like to cooperate with them. They also said that they had to compile a report by the next day, and asked if we could alter some parts of the software that evening. We did so, and within a month we had a contract with them.
Soon we began to visit the New York Times regularly, where we obtained access to data and developed software for various services. Later one of our co-workers from the New York Times moved to Bloomberg, and so the cooperation continued there too. We started with minor problems, which over the years have grown into much bigger, more demanding projects. Some of these activities are related to the media, and some to Bloomberg's financial and business products.
Was your media-related work mostly about advertising?
Most of these projects were related to advertising and understanding users. The companies were interested not only in how to get as many ads as possible, but above all in how to deliver the existing ads to the target audience as precisely as possible. There are ads that are simply scattered across different content; these are the cheapest. The second type is ads for which advertisers are willing to pay a lot, but only if they are seen by a selected target audience. These targeted readers are precisely defined: for example, they must be decision makers, wealthy, employed at a reputable financial institution, living in selected geographic areas, and the like.
Large media houses have millions of visitors per day, about whom they know a great deal, and not only from their own internal data. They also collect a lot of data from various suppliers of personal data and combine it with their own. They see their visitors' online activity: what they read and when, where they are, whom they resemble, whom they spend time with, and what products they are viewing.
Are you also involved in analyzing media content?
For this purpose, the Event Registry system was developed at the Jožef Stefan Institute, primarily for research. It is accessible at http://eventregistry.org/ and aims to monitor global media in real time. It started as a purely academic project, intended to show whether we can observe, understand and predict global social dynamics based on media content. At the Institute we had already developed language technologies that enable simultaneous processing of texts in about one hundred languages. Whether texts are written in Chinese, Slovene, or a language we hardly know exists, the system will detect this, summarize the key content of these texts, link them and create events. Thus, through this system, we see the world not as a multitude of articles but as a multitude of events that connect into longer stories.
Event Registry collects about half a million articles a day, identifies between 5 and 10 thousand events from them, and then connects these events into stories, allowing us to go very deep into the information space of all world events. The system works equally well whether we ask what is happening in Chicago, Murska Sobota or a Chinese village. We can observe various phenomena, such as how the information space is manipulated in a country. We can see how important topics compete with each other, how one topic is pushed out, and how some topics are created and planned for political or commercial reasons.
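The idea of turning a stream of articles into a smaller set of events can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not a description of how Event Registry works internally: the function names, the keyword-overlap (Jaccard) measure, the 0.5 threshold and the three sample articles are all made up for the example.

```python
def jaccard(a, b):
    """Share of keywords two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_into_events(articles, threshold=0.5):
    """Greedily assign each article to the most similar existing event.

    `articles` is a list of (title, keyword_set) pairs. An article starts
    a new event when no existing event is similar enough (assumed rule).
    """
    events = []  # each event: {"keywords": set, "articles": [titles]}
    for title, keywords in articles:
        keywords = set(keywords)
        best = max(events,
                   key=lambda e: jaccard(e["keywords"], keywords),
                   default=None)
        if best is not None and jaccard(best["keywords"], keywords) >= threshold:
            best["articles"].append(title)
            best["keywords"] |= keywords  # the event absorbs the new keywords
        else:
            events.append({"keywords": keywords, "articles": [title]})
    return events

# Invented sample articles, for illustration only.
sample_articles = [
    ("Flood hits Chicago suburbs", {"chicago", "flood", "rain"}),
    ("Chicago flood closes roads", {"chicago", "flood", "roads"}),
    ("Festival opens in Murska Sobota", {"murska sobota", "festival"}),
]

events = group_into_events(sample_articles)
for event in events:
    print(len(event["articles"]), "article(s):", event["articles"])
```

In this toy run the two Chicago articles share enough keywords to be merged into one event, while the festival article starts its own. A production system would additionally need language detection, cross-lingual matching and a way to link events over time into stories.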
As the national digital ambassador, what do you think of the open letter published a month ago by the founder of the World Wide Web, Tim Berners-Lee? His views attracted a lot of attention from the international professional community, as he draws attention to three key problems that must be solved "so that the world wide web develops as a tool that serves all humanity".
In short: all three problems exist, they are very serious, and they affect the lives of individuals, groups, countries and the whole world. The problems can be solved in part country by country, but that is not necessarily an appropriate solution, because the problems are global while state borders and legal systems are local. If these problems were tackled with crude cuts, this would pose a threat to the "neutrality and openness of the Internet" that we all want. It is known that some countries block access to certain resources on the Internet.
Let me address each problem separately. First, Berners-Lee warns: "We've lost control of personal information." The problem of losing privacy is probably the most critical. There are too many stakeholders who are not interested in protecting privacy. In fact, there is a whole industry that trades in personal data, and in principle such data can be obtained for some money. Above all, a loss of privacy can have critical consequences with a major impact on the lives of individuals, which sometimes means the loss of a career or, in extreme cases, even loss of life.
Secondly, Berners-Lee says: "False information is spreading too easily over the Internet." This problem is critical, but somewhat less severe than the first, because it is about establishing a system of trust and credibility of information. Here the main problem perhaps lies with the media, which, as a result of mutual competition, publish information quickly before verifying it. Moreover, the Internet allows data to spread rapidly, and once information is online it is difficult to stop. An appropriate measure here would be for large Internet companies, such as Google and Facebook, to ensure the credibility of the information they distribute. Such companies actually have a good reason to do so, because poor information reduces the quality of their services.
And thirdly: "Political advertising on the web should be fair and clear." This problem is probably the easiest to regulate, but it also carries the greatest weight for the lives of larger communities, such as countries and other environments where politics takes place. The problem was particularly widespread during the recent US elections, when the Internet infrastructure was not ready for modern political advertising. It seems that in similar cases, for example in the upcoming elections in European countries, matters will be much more controlled.
What is the position on this issue in Slovenia? Are there any suggestions for possible action?
Slovenia is considering these matters, as are other European countries. There are currently no specific measures, but it is likely that they will be adopted at the European level, where Slovenia will contribute its opinion.