Physicists Talk about Twitter, or How to Trace Information on the Internet

Once published, information on the Internet never goes away, even though some people would probably like it to. How do we find the original sender of a piece of information? Why does gossip become the most important message of the day, while news that matters to the world disappears in a sea of information? And where did it all begin? Physicists from the Warsaw University of Technology are currently researching these questions.

Internet users ask Google approximately 40 thousand questions every second, which translates to about 1.3 trillion questions a year. In one minute, 2.5 million posts appear on Facebook; in that same minute, 300 thousand tweets are posted and Instagram grows by 220 thousand new photos. Three billion people use the Internet, and the volume of information they produce is simply incredible.
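The yearly figure follows from simple arithmetic. A minimal check, assuming a 365-day year and a constant rate of 40 thousand queries per second (the variable names are illustrative only):

```python
# Convert the per-second rate quoted above into a yearly total.
SECONDS_PER_YEAR = 60 * 60 * 24 * 365  # assumes a non-leap year

queries_per_second = 40_000
queries_per_year = queries_per_second * SECONDS_PER_YEAR

print(f"{queries_per_year:,}")  # prints 1,261,440,000,000
```

The result, roughly 1.26 trillion, is consistent with the rounded figure of about 1.3 trillion.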

The dissemination of this information and the search for the point of origin of a given message are studied by a team of scientists led by Prof. Janusz Hołyst, PhD (Eng.) from the Warsaw University of Technology. The research is conducted as part of the project “RENOIR – Reverse EngiNeering of sOcial Information pRocessing”, financed under the EU’s Horizon 2020 “Marie Skłodowska-Curie Actions” programme.

From Physics to Sociology

Why do physicists deal with an area seemingly so distant from their main interests? “When thinking about Twitter or Facebook, we have to remember that these are very specific media where we are both consumers and producers of information”, Prof. Hołyst explains. “What’s most important in them is that we can repost information from other users, provided that it interests us. Every day on Twitter, tens of millions of tweets are reposted in their original form (as so-called retweets), and the number of messages containing information from other tweets is difficult to calculate”.

This network of connections resembles a very specific nervous system. “And neural networks were studied by physicists as early as the 1980s”, adds the researcher. He goes on to say that, as in a neural network, connections between Twitter users are constantly changing. According to analyses conducted by one of the project’s partners at Stanford University, the creation and dissolution of connections between Twitter users is correlated with the interest profiles of individual users and their activity.

Who Said That First?

Prof. Hołyst is primarily interested in finding the source of information in the huge network of connections that is the Internet. “Pieces of information often find us because many people before us decided that they were, for some reason, interesting”, says the scientist. “However, it’s often very difficult to find their original source (see figure)”.

Schematic representation of one of the problems solved by the RENOIR project. Information in a complex network has been sent from a source that is hidden from us (red circle) and arrived in many nodes at various times t (darker colours correspond to later times). How do we find the source, knowing only the times when pieces of information reach selected recipients (green circles)?

This is the first issue that the project’s participants will tackle. The scientists want to create a method that will enable us to identify the original source of a given piece of information, even in such a dynamic environment.
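To make the problem concrete, here is a minimal sketch of source localization on a small static network. It is not the RENOIR team's method; it is a toy illustration under strong assumptions: the network is connected and known, every hop takes exactly one time unit, and the arrival times at a few observer nodes are recorded without noise. Under those assumptions, the difference between arrival time and distance is identical at every observer only for the true source. The function names, the example graph and the arrival times are all hypothetical.

```python
from collections import deque
from statistics import pvariance


def bfs_distances(graph, start):
    """Return hop counts from `start` to every reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def locate_source(graph, arrival_times):
    """Pick the node whose distances best explain the observed arrival times.

    Assumption: each hop takes one time unit, so for the true source the
    offset (arrival time - distance) is the same at every observer.
    """
    def spread(candidate):
        distances = bfs_distances(graph, candidate)
        offsets = [t - distances[obs] for obs, t in arrival_times.items()]
        return pvariance(offsets)  # zero when all offsets agree

    return min(graph, key=spread)


# Hypothetical undirected network, built from an edge list.
edges = [(0, 1), (1, 2), (2, 3), (1, 4), (4, 5), (5, 6), (2, 7)]
graph = {}
for a, b in edges:
    graph.setdefault(a, []).append(b)
    graph.setdefault(b, []).append(a)

# Arrival times at four observers for a message released at node 2 at t = 5.
arrival_times = {0: 7, 3: 6, 6: 9, 7: 6}
print(locate_source(graph, arrival_times))  # prints 2
```

Real networks change over time and delays are random, so a practical method has to cope with noisy, incomplete observations; this sketch only shows why arrival times at a handful of nodes can be enough to single out the source.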

It’s also important to learn the routes by which a piece of information reached the reader, and how many Twitter or Facebook users passed it on before we learned about a given fact.

The last problem is answering the question: how are these pieces of information disseminated in social media networks? To date, epidemiological models have been used to improve our understanding of social media. They determine whether we are susceptible (in this case, to a given message), whether we are “infected”, or whether we are “resistant” to a given piece of information. Such models, however, do not take the specific nature of social media into account.
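The three states mentioned above correspond to the classical SIR (susceptible, infected, resistant) scheme. The following is a minimal sketch of such a model on a network, not the model used in RENOIR. It assumes discrete time steps, a fixed probability `beta` that an informed user passes the message to each susceptible neighbour, and a fixed probability `gamma` that an informed user loses interest; the function names and the ring-shaped example network are hypothetical.

```python
import random


def count_states(state):
    """Count how many nodes are susceptible, infected and resistant."""
    counts = {"S": 0, "I": 0, "R": 0}
    for value in state.values():
        counts[value] += 1
    return counts


def simulate_sir(graph, seed_node, beta, gamma, rng, max_steps=1000):
    """Run a discrete-time SIR process and return the counts after each step."""
    state = {node: "S" for node in graph}
    state[seed_node] = "I"
    history = [count_states(state)]
    for _ in range(max_steps):
        infected = [node for node, s in state.items() if s == "I"]
        if not infected:
            break  # nobody is spreading the message any more
        newly_infected = set()
        for node in infected:
            for neighbour in graph[node]:
                if state[neighbour] == "S" and rng.random() < beta:
                    newly_infected.add(neighbour)
        for node in infected:
            if rng.random() < gamma:
                state[node] = "R"  # loses interest, becomes resistant
        for node in newly_infected:
            state[node] = "I"
        history.append(count_states(state))
    return history


# Hypothetical network: six users arranged in a ring.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

history = simulate_sir(ring, seed_node=0, beta=0.5, gamma=0.3,
                       rng=random.Random(42))
for step, counts in enumerate(history):
    print(step, counts)
```

In this scheme every user reacts to a message in the same way and the network never changes, which is one reason such models miss features specific to social media.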

He Who Has the Information, Has the Profits

The results of the RENOIR project may find practical use in planning advertising campaigns. Understanding how information about a given product reaches consumers is key to preparing effective promotional campaigns. It’s important to build a network of influential opinion leaders (influencers). “The idea is for the information to reach not the largest number of people, but those recipients who will be interested enough to repost it”, Prof. Janusz Hołyst explains. He who has the information has the power.

With this knowledge, we can manipulate information so that the message’s profile matches the interests and expectations of users, and thus drastically increase the effectiveness of communication.

This raises the question of the moral aspect of this research and of the use of the methods developed in its course. The Internet, unlike any previous medium, has given its users access to almost infinite amounts of information. Wouldn’t open manipulation of such information cause public outrage?

Prof. Janusz Hołyst presenting the project’s objectives at Nanyang Technological University in Singapore.

“Let’s not delude ourselves that the Internet doesn’t currently host organized campaigns that distort information on purpose”, Prof. Hołyst emphasizes. “We experience it, especially during election campaigns. For many years, election committees have been investing enormous amounts of money to secure influential bloggers who flood the Internet with information about candidates. Moreover, the last US presidential campaign took this idea one step further. Methods for the automatic creation of psychological profiles of individual voters (so-called micro-targeting) were used to create hundreds of thousands of different versions of letters tailored to different recipients, inviting them to vote for a given candidate. They employed psychometric and big data approaches developed, among other places, at Stanford University”.

The First Results

“We already have interesting results from our study”, says Professor Hołyst. “Jan Chołoniewski, MSc (Eng.), one of our doctoral students participating in the project, impressed the Slovenian Press Agency with his research methods so much that they asked him to prepare an application for analysing the reception of the Agency’s communications in various media. The beta version of this program has been successfully tested by the Agency’s journalists, and there’s even a chance for this tool to be commercialized. Robert Paluch, MSc (Eng.), another of our doctoral students, developed during his stay in the US a new algorithm for finding the hidden source of information, which is many times faster than the currently known programs”.

There are also unplanned benefits of this project. The Faculty of Physics is preparing a new major to open next year, Exploration of Data and Interdisciplinary Modelling, in which certain classes will be directly related to the research conducted as part of RENOIR. Employees and doctoral students from the Faculty of Mathematics and Information Science and the Faculty of Electronics and Information Technology have also joined the project. As a result, an interdepartmental seminar on exploring data in informational media has been established, and a joint research platform for this area is being set up at the Warsaw University of Technology.

This isn’t the first Internet project coordinated by a scientist from the Faculty of Physics of the Warsaw University of Technology. Before this, Professor Hołyst studied how emotions spread on the Internet.

In addition to the Warsaw University of Technology, the RENOIR project involves Wrocław University of Science and Technology, Stanford University and Rensselaer Polytechnic Institute (USA), the Slovenian Press Agency and Jožef Stefan Institute (Slovenia), and Nanyang Technological University (Singapore). The project’s cost amounts to over 1.3 million euro. Thanks to this, WUT’s employees and doctoral students can spend 98 months in total on research stays at partner institutions abroad.

We will have to wait 3 more years for the project’s final results. We wonder how the Internet will change in that time. More information about the project: www.renoirproject.eu.


