
[BigDataSur-COVID] The Russian “Sovereign Internet” Facing Covid-19

What does the Covid-19 crisis say about the Russian state’s digital power, and the challenges it poses to public freedoms?

By Olga Bronnikova, Françoise Daucé, Ksenia Ermoshina, Francesca Musiani, Bella Ostromooukhova, Anna Zaytseva

Originally published in French on The Conversation France on April 29, 2020, under a Creative Commons BY-NC license. Translated by Francesca Musiani.

 

Despite the evolution of the Covid-19 pandemic in Russia, a state of emergency has not been declared in the country: only a state of “high alert” has been in force in Moscow and in specific regions since early April. “Compulsory holidays” are only partially respected by a population plunged into a growing uncertainty that is at once health-related, legal and economic. In this context, Russia is deploying and updating its digital strategy and infrastructure, which have been carefully scrutinized in recent years because of their (increasingly) strong centralizing and authoritarian dimensions. What does the Covid-19 crisis say about the Russian state’s digital power, and the challenges it poses to public freedoms?

The Russian state facing Covid-19: Digital ambitions put to the test

From very early on, the Russian authorities advocated the massive use of digital tools to control the movements of citizens and limit the circulation of the virus. These uses, aimed at “securitization”, are inspired by foreign examples (China, Korea, Singapore), but they are also part of the “sovereignty” logic of the Russian Internet (Runet), already underway before the start of the epidemic, and they consolidate surveillance systems that have existed for several years (e.g. video surveillance cameras, aggregation of geolocation data supplied to the authorities by mobile operators).

As early as February, Sergei Sobyanin, the mayor of Moscow, proposed the use of facial recognition to monitor people returning from abroad, using the surveillance cameras of the “Safe City” program, in force since 2018. Between February and March, 200 people who broke their quarantine were identified this way, including a man who had merely taken out his trash. But as a study by IT and SORM (a Telegram blog devoted to Runet surveillance and regulation issues, with more than 73,000 subscribers) shows, this system amplifies inequality: the surveillance cameras are mainly installed in the modest districts of Moscow, because those who decide their location, and who themselves reside in the upscale districts, do not wish their own activities to be monitored.

On March 20, 2020, faced with an increase in infections, Prime Minister Mikhail Mishustin recommended monitoring citizens who are or have been in contact with infected people by collecting geolocation data from operators and transmitting it to local administrations. A patient monitoring application, “Social Monitoring”, was made available on Google Play on April 1. It quickly became controversial, as its surveillance went far beyond the movements of patients and offered little protection of personal data; the application was eventually withdrawn.

However, digital tracking of citizens has not been abandoned. Since April 13, all trips within Moscow that involve public transportation must be made, under penalty of fines, with a digital pass generated on an official website. In response to criticism of the “Social Monitoring” application, the Moscow municipality declared that with this new system personal data would be stored on Russian territory (in accordance with the 2014 law targeting in particular “giant” United States-based platforms such as Google) and deleted once the “high alert” state is over. The same system operates in Tatarstan and the Primorye region; QR-code passes are also available and recommended, but not mandatory, in Nizhny Novgorod, while other Russian regions resort to lighter measures.

Resistance and mobilisations of the free Internet

The use of digital data to strengthen surveillance of the population while coping with the disease is causing concern for defenders of online freedoms. Technologists, engineers and developers discuss government projects and conduct independent investigations to uncover security vulnerabilities, technical issues and other controversial aspects of the technologies deployed by the Russian state.

Several associations and independent media outlets alert Internet users to the growing attacks on the protection of personal data and the development of online surveillance. The NGO Roskomsvoboda published, on March 27, a vademecum on digital rights in times of pandemic, stressing that the use of personal data, especially biometric data, legally requires the consent of individuals. But “the use of facial recognition is in a gray area,” argues lawyer Sarkis Darbinyan. The association is also launching, with other associations in the post-Soviet space, an inventory of restrictions on digital freedoms around the world, while the Agora association is opening a legal aid service linked to the pandemic. Its lawyers are also concerned about the use of facial recognition to enforce quarantine. Activists close to opposition figure Alexei Navalny (the Society for the Protection of the Internet) denounce, even more boldly, the establishment of a “digital gulag”, and call on citizens not to transmit their personal data to the applications that control movements and trace contacts.

At the same time, solidarity initiatives are developing on the Internet, aimed at supporting the poorest citizens and caregivers. The Makers vs. Covid collective uses 3D printing to provide doctors with the protective gear they need. An online hackathon, “Covidhack”, is developing a Telegram bot that helps build a citizen database allowing people with coronavirus to speak anonymously and map their symptoms and requests.

Internet infrastructures are also being strained by the pandemic, due to the growth in traffic linked to the new digital habits of confinement. Russian networks frequently go down, but the on-site work of technicians and cable operators employed by the more than three thousand Internet service providers (ISPs) that manage these networks comes with the risk of legal sanctions for breaking confinement. OrderKom, a consulting firm for ISPs, offers these workers legal support, including the preparation of movement authorizations for on-site work and legal defense in the event of a fine.

Faults and paradoxes of digital surveillance

Over the days and weeks, gaps have emerged between the authorities’ security ambitions and the realities of their implementation. Digital surveillance and health-related solutions are delegated to many public and private, federal and regional players, who often make contradictory decisions. The paradoxes and dysfunctions documented by online freedom activists show the limits of the announced “securitization” agenda. Perhaps the most obvious failure is that of digital passes in Moscow. The Nedoma.mos.ru site, developed to generate them, uses foreign hosting servers; the government was therefore accused of putting its own project of a sovereign Runet in jeopardy.

Digital freedom activists, such as Mikhail Klimarev (Society for the Protection of the Internet), point to the ineffectiveness of technological solutions; Covid-19 strategies should focus on civic responsibility, whereas digital surveillance infantilizes citizens and is likely to be circumvented. This crisis highlights the lack of mutual trust between citizens and the state. Indeed, the information on the epidemic disseminated by the state is viewed with suspicion, oscillating between “they are hiding the true extent of the disaster from us” and “it is a plot to muzzle us even more”. While the authorities take the Covid-19 crisis as an opportunity to reopen their hunt for “fake news”, YouTubers and independent journalists, for their part, denounce the incomplete or questionable information disseminated by representatives of power, and their behavior in public (such as that of Vladimir Putin’s spokesperson, who showed up at a press conference wearing a highly contested virus “blocker” badge). Some events border on irony, such as the Ministry of Foreign Affairs opening an information thread for its nationals abroad on the Telegram application… officially banned in Russia.

Thus, part of civil society, without questioning the need for confinement, mobilizes against the threatening initiatives of the Russian Big Brother. It denounces the authorities’ inability to manage the implementation of technical systems, the state’s violation of its own laws (such as the provision requiring Russian data to be stored on Russian territory), and the failure to protect personal data, which exposes citizens to leaks onto the black market for databases.

While the wide-ranging and ambitious Russian Internet surveillance and sovereignty project is gaining strength during the coronavirus crisis, its implementation is uncertain and often contradictory. The pandemic demonstrates the limits of the Internet infrastructure centralization project, and the government has been obliged to relax specific regulatory measures, such as the Yarovaya law (which requires ISPs to keep users’ browsing history and metadata for the purposes of lawful interception and the fight against terrorism). However, this apparent complexity is not necessarily synonymous with ineffectiveness. It is part of the flexible reconfiguration of digital constraints in Russia, adjusting as best it can to newly arising challenges, and – legitimately so – it raises the concerns of digital freedoms defenders.

 

About the authors. Olga Bronnikova, Associate Professor at Université Grenoble Alpes. Françoise Daucé, Professor at EHESS, Director of CERCEC. Ksenia Ermoshina, Assistant Research Professor at CNRS, Centre for Internet and Society. Francesca Musiani, Associate Research Professor at CNRS, Deputy Director, Centre for Internet and Society. Bella Ostromooukhova, Associate Professor at Sorbonne Université. Anna Zaytseva, Associate Professor at Université Toulouse 2 Jean Jaurès.

All authors are members of the ANR-funded ResisTIC research project. Thanks to Grégory Rayko for guiding the original article to publication.

[BigDataSur] The Challenge of Decolonizing Big Data through Citizen Data Audits [2/3]

 

A First Attempt at Citizen Data Audits

Author: Katherine Reilly, Simon Fraser University, School of Communication

In the first post in this series, I explained that audits are used to check whether people are carrying out practices according to established standards or criteria. They are meant to ensure effective use of resources. Corporations audit their internal processes to make sure that they comply with corporate policy, while governments audit corporations to make sure that they comply with the law.

There is no reason why citizens or watchdogs can’t carry out audits as well. In fact, data privacy laws include some interesting frameworks that can facilitate this type of work. In particular, the EU’s General Data Protection Regulation (GDPR) gives you the right to know how corporations are using your personal data, and also the ability to access the personal data that companies hold about you. This right is reproduced in the privacy legislation of many countries around the world, from Canada and Chile to Costa Rica and Peru, to name just a few.

With this in mind, several years ago the Citizen Lab at the University of Toronto set up a website called Access My Info which helps people access the personal data that companies hold about them. Access My Info was set up as an experiment, so the site only includes a fixed roster of Canadian telecommunications companies, fitness trackers, and dating apps. It walks users through the process of submitting a personal data request to one of these companies, and then tracks whether the companies respond. The goal of this project was to crowdsource insights from citizens that would help researchers learn what companies know about their clients, how companies manage personal data, and who companies share data with. The results of this work have been used to advocate for changes to digital privacy laws.

Using this model as a starting point, in 2019 my team at SFU and a team from the Peruvian digital rights advocate HiperDerecho set up a website called SonMisDatos (Son Mis Datos translates as “It’s My Data”). Son Mis Datos riffed on the open source platform developed by Access My Info, but made several important modifications. In particular, HiperDerecho’s Director, Miguel Morachimo, made the site database-driven so that it was easier to update the roster of corporate actors or their contact details. Miguel also decided to focus on companies that have a more direct material impact on the daily lives of Peruvians – such as gas stations, grocery stores and pharmacies. These companies have loyalty programs that collect personal data about users.

Then we took things one step further. We used SonMisDatos to organize citizen data audits of Peruvian companies. HiperDerecho mobilized a team of people who work on digital rights in Peru, and we brought them together at two workshops. At the first workshop, we taught participants about their rights under Peru’s personal data protection laws, introduced SonMisDatos, and asked everyone to use the site to ask companies for access to their personal data. Companies need time to fulfill those requests, so then we waited for two months. At our second workshop, participants reported back on the results of their data requests, and then I shared a series of techniques for auditing companies on the basis of the personal data people had been able to access.

Our audit techniques explored the quality of the data provided, corporate compliance with data laws, how responsive companies were to data requests, the quality of their informed consent process, and several other factors. My favorite audit technique reflected a special feature of the data protection laws of Peru. In that country, companies are required to register databases of personal information with a state entity. The registry, which is published online, includes lists of companies, the titles of their databases, as well as the categories of data collected by each database. (The government does not collect the contents of the databases, it only registers their existence.)

With this information, our auditors were able to verify whether the data they got back from corporate actors was complete and accurate. In one case, the registry told us that a pharmaceutical company was collecting data about whether clients had children. However, in response to an access request, the company only provided lists of purchases organized by date, SKU number, quantity and price. Our auditors were really bothered by this discovery, because it suggested that the company was making inferences about clients without telling them. Participants wondered how the company was using these inferences, and whether it might affect pricing, customer experience, access to coupons, or the like.
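To make this registry-based check concrete, here is a minimal sketch in Python of the kind of comparison our auditors performed; the category names are hypothetical illustrations, not actual entries from the Peruvian registry.

```python
# Hypothetical illustration of the registry-based audit technique:
# compare what a company declared to the national registry with what it
# actually returned in response to an access request.

declared_in_registry = {"name", "national_id", "purchase_history", "has_children"}
returned_to_request = {"name", "national_id", "purchase_history"}

undisclosed = declared_in_registry - returned_to_request   # collected but not returned
unexpected = returned_to_request - declared_in_registry    # returned but never declared

print("Declared but not returned:", undisclosed)   # e.g. {'has_children'}
print("Returned but never declared:", unexpected)
```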

In another case, one of our auditors subscribed to DirecTV. To complete this process, he needed to provide his cell phone number plus his national ID number. He later realized that he had accidentally typed in the wrong ID number, because he began receiving cell phone spam addressed to another person. This was exciting, because it allowed us to learn which companies were buying personal data from DirecTV. It also demonstrated that DirecTV was doing a poor job of managing its customers’ privacy and security! However, during the audit we also looked back at DirecTV’s terms of service. We discovered that they were completely up front about their intention to sell personal information to advertisers. Our auditors were sheepish about not reading the terms of the deal, but they also felt it was wrong that they had no option but to accept these terms if they wanted to access the service.

On the basis of this experience, we wrote a guidebook that explains how to use Son Mis Datos, and how to carry out an audit on the basis of the ‘access’ provisions in personal data laws. The guide helps users think through questions like: Is the data complete, precise, unmodified, timely, accessible, machine-readable, non-discriminatory, and free? Has this company respected your data rights? What does the company’s response to your data request suggest about its data use and data management practices?

We learned a tonne from conducting these audits! We know, for instance, that the more specific the request, the more data a company provides. If you ask a company for “all of the personal data you hold about me” you will get less data than if you ask for “all of my personal information, all of my IP data, all of my mousing behaviour data, all of my transaction data, etc.”

Our experiments with citizen data audits also allow us to make claims about how companies define the term “personal data.” Often companies define personal data very narrowly to mean registration information (name, address, phone number, identification number, etc.). This stands in stark contrast to the academic definition of personal data, which is any information that can lead to the identification of an individual person. In the age of big data, that means pretty much any digital traces you produce while logged in. Observations like these allow us to open up larger discussions about corporate data use practices, which helps to build citizen data literacy.

However, we were disappointed to discover that our citizen data audits worked to validate a data regime that is organized around the expropriation of resources from our communities. In my first blog post I explained that the 5 criteria driving data audits are profitability, risk, consent, security and privacy.

Since our audit originated with the law, with technology, and with corporate practices, we ended up using the audit criteria established by businesses and governments to assess corporate data practices. And this meant that we were checking to see if they were using our personal and community resources according to policies and laws that drive an efficient expropriation of those very same resources!

The concept of privacy was particularly difficult to escape. The idea that personal data must be private has been ingrained into all of us, so much so that the notion of pooled data or community data falls outside the popular imagination.

As a result, we felt that our citizen data audits did other people’s data audit work for them. We became watchdogs in the service of government oversight offices. We became the backers of corporate efficiencies. I’ve got nothing personal against watchdogs — they do important work — but what if the laws and policies aren’t worth protecting?

We have struggled greatly with the question of how to generate a conversation that moves beyond established parameters, and that situates our work in the community. With this in mind, we’ve begun to explore alternative approaches to thinking about and carrying out citizen data audits. That’s the subject of the final post in this series.

 

About the author: Dr. Katherine Reilly is Associate Professor in the School of Communication at Simon Fraser University in Vancouver, Canada. She is the recipient of a SSHRC Partnership Grant and an International Development Research Centre grant to explore citizen data audit methodologies alongside Derechos Digitales in Chile, Fundacion Karisma in Colombia, Sula Batsu in Costa Rica, TEDIC in Paraguay, HiperDerecho in Peru, and ObservaTIC in Uruguay.

[blogpost] Thinking Outside the Black-Box: The Case for ‘Algorithmic Sovereignty’ in Social Media

Urbano Reviglio, Ph.D. candidate at the University of Bologna, in collaboration with Claudio Agosti, the brain behind tracking.exposed, just published a new academic article on algorithmic sovereignty in Social Media + Society (SAGE). Find an extended abstract below, and the full paper here.

Every day, algorithms update a profile of “who you are” based on your past preferences, activities, networks and behaviours in order to make a future-oriented prediction and suggest to you news (e.g. Facebook and Twitter), videos (e.g. YouTube), movies (e.g. Netflix), songs (e.g. Spotify), products (e.g. Amazon) and, of course, ads. These algorithms define the boundaries of your Internet experience, affecting, steering and nudging your information consumption, your preferences, and even your personal relations.

Two paradigmatic (and likely most influential) examples illustrate the importance of this process. On Facebook, you encounter on average about 350 posts, prioritized out of roughly 1,500 candidates. You are thus exposed to only about 25% of the available information, while roughly 75% remains hidden. It is Facebook’s newsfeed algorithm that is choosing for you, and it is rather good at that. Think also of YouTube: its recommendations already drive more than 70% of the time you spend on the platform, meaning you are mostly “choosing” within a pre-determined set of possibilities. In fact, 90% of the ‘related content’ on the right side of the website is already personalized for you. Yet this process occurs largely beyond your control, and it is mostly based on implicit personalization — behavioural data collected from subconscious activity (e.g. clicks, time spent, etc.) — rather than on deliberate and expressed preferences. Worryingly, this might become the default in future personalization, essentially because you may be quite satisfied without ever questioning the process. Do you really think the personalization that recommends what you read and watch is indeed the best you could experience?
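To make the notion of implicit personalization tangible, here is a minimal sketch in Python of a feed ranker that scores items purely from behavioural signals (clicks, time spent) rather than from stated preferences; every field name, weight and topic is a hypothetical illustration, not the logic of any actual platform.

```python
# Toy model of "implicit personalization": items are ranked by signals inferred
# from behaviour, never by preferences the user has explicitly stated.

from collections import defaultdict

def build_profile(events):
    """Turn raw behavioural events into per-topic interest scores."""
    profile = defaultdict(float)
    for e in events:
        if e["type"] == "click":
            profile[e["topic"]] += 1.0                    # a click is a weak signal
        elif e["type"] == "watch":
            profile[e["topic"]] += e["seconds"] / 60.0    # time spent weighs more
    return profile

def rank_feed(candidates, profile, k=3):
    """Select the few items shown out of the many available (e.g. ~350 of ~1,500)."""
    scored = sorted(candidates, key=lambda c: profile[c["topic"]], reverse=True)
    return scored[:k]

events = [
    {"type": "click", "topic": "politics"},
    {"type": "watch", "topic": "sports", "seconds": 300},
    {"type": "watch", "topic": "politics", "seconds": 60},
]
candidates = [
    {"id": 1, "topic": "sports"},
    {"id": 2, "topic": "politics"},
    {"id": 3, "topic": "gardening"},
    {"id": 4, "topic": "sports"},
]
print(rank_feed(candidates, build_profile(events)))
# "gardening" never surfaces: the user never clicked it, so the system never learns about it.
```

Even in this toy version, whatever the user never happened to engage with simply stops appearing, a crude picture of how implicit signals narrow the pre-determined set of possibilities.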

Personalization is not what mainstream social media platforms claim it to be. There are a number of fundamental assumptions that are nowadays shared by most researchers, and these need clarification. Profiling technologies that enable personalization create a kind of knowledge about you that is inherently probabilistic. Personalization, however, is not exactly ‘personal’. Profiling is indeed a matter of pattern recognition, which is comparable to categorization, generalization and stereotyping. Algorithms cannot produce or detect the complexities of yourself. They can, however, influence your sense of self. As such, profiling algorithms can trivialize your preferences and, at the same time, steer you to conform to the status quo of past actions chosen by ‘past selves’, narrowing your “aspirational self.” They can limit the diversity of information you are exposed to, and they can ultimately perpetuate existing inequalities. In other words, they can limit your informational self-determination. So, how can you fully trust proprietary algorithms that are designed for ‘engagement optimization’ — to keep you hooked to the screen as much as possible — and not explicitly designed for your personal growth and society’s cohesion?

One of the most concerning problems is that personalization algorithms are increasingly ‘addictive by design’. Human behavior can indeed be easily manipulated by priming and conditioning, using rewards and punishments. Algorithms can autonomously explore manipulative strategies that can be detrimental to you. For example, they can use techniques such as A/B testing to experiment with various messages until they find the versions that best exploit your vulnerabilities. Compulsion loops are already found in a wide range of social media. Research suggests that such loops work via variable-rate reinforcement, in which rewards are delivered unpredictably — after n actions, a certain reward is given, as in slot machines. This unpredictability affects the brain’s dopamine pathways in ways that magnify rewards. You think you liked that post… but you may have been manipulated into liking it after several boring posts, with impeccable timing. Consider how just dozens of Facebook Likes can reveal highly accurate correlations; hundreds of likes can predict your personality better than your mother could, research suggests. This can easily be exploited, for example if you are vulnerable to moral outrage. Researchers have found that each word of moral outrage added to a tweet raises the retweet rate by 17%. Algorithms know that, and could feed you the “right” content at the right time.
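As an illustration of how such exploration can happen without anyone designing the outcome, here is a minimal epsilon-greedy bandit sketch in Python; the message variants and click probabilities are invented for the example and are not taken from any real platform.

```python
# Toy A/B-style experimentation loop: the system tries message framings and
# gradually settles on whichever one users react to most.

import random

variants = ["neutral headline", "outrage headline", "fear headline"]
true_click_prob = {"neutral headline": 0.05,
                   "outrage headline": 0.12,   # the most "engaging" framing
                   "fear headline": 0.08}

clicks = {v: 0 for v in variants}
shows = {v: 0 for v in variants}

def choose(epsilon=0.1):
    """Mostly exploit the best-performing variant, occasionally explore."""
    if random.random() < epsilon or not any(shows.values()):
        return random.choice(variants)
    return max(variants, key=lambda v: clicks[v] / shows[v] if shows[v] else 0.0)

for _ in range(10_000):
    v = choose()
    shows[v] += 1
    if random.random() < true_click_prob[v]:   # simulated user reaction
        clicks[v] += 1

# After enough trials the system converges on the framing that best exploits
# users' reactions, without anyone having intended that outcome.
for v in variants:
    print(v, shows[v], round(clicks[v] / max(shows[v], 1), 3))
```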

As a matter of fact, personalization systems deeply affect public opinion, and often negatively. For a growing number of academics, activists, policy-makers and citizens, the concern is that social media more generally are downgrading our attention spans, a common base of facts, and the capacity for complexity and nuanced critical thinking, hindering our ability to construct the shared agendas needed to help solve the epochal challenges we all face. This supposedly degraded and degrading capacity for collective action arguably represents “the climate change of culture.” Yet research on the risks posed by social media – and more specifically by their personalization systems – is still very contradictory; these risks are very hard to prove and, ultimately, to mitigate. In light of the fast-changing media landscape, many studies become rapidly outdated, and this contributes to the broader crisis concerning the study of algorithms; these are indeed “black-boxed”, meaning that their functioning is opaque and their interpretability may not be clear even to their engineers. Moreover, there are no easy social media alternatives one can join to meet friends and share information. Such alternatives may one day spread, but until then billions of people worldwide have to rely on opaque personalization systems that may ultimately impoverish them. These systems are an essential and increasingly valuable public instrument for mediating information and relations. And considering that they introduce a new form of power of mass behavioral prediction and modification that is nowadays concentrated in very few tech companies, there is a clear need to radically tackle these risks and concerns now. But how?

By analyzing the challenges, governance and regulation of personalization, we argue in this paper that we as a society need to frame, discuss and ultimately grant to all users sovereignty over personalization algorithms. More generally, by ‘algorithmic sovereignty’ in social media we mean the regulation of information filtering and personalization design choices according to democratic principles, in order to set their scope for private purposes and to harness their power for the public good. In other words, to open the black-boxed personalization algorithms of (mainstream) social media to citizens and to independent and public institutions. We also explore specific experiences, projects and policies that aim to increase users’ agency. Finally, we offer a preliminary outline of the basic legal, theoretical, technical and social preconditions for attaining what we define as algorithmic sovereignty. To regain trust between users and platforms, personalization algorithms need to be seen not as a form of legitimate hedonistic subjugation, but as an opportunity for new forms of individual liberation and social awareness. And this can only occur if citizens as well as democratic institutions have the right and the capacity to make self-determined choices about these legally private (but essentially public) personalization systems. As we argue throughout the paper, we believe that such an endeavor is within reach and that public institutions and civil society could and should sustain its realization.

Protesting online: Stefania interviewed by the Dutch Tegenlicht

Only a few months ago, we could take to the streets for the Women’s March or the climate march. Now the streets are empty and activists, except for a few, stay at home. How can we demonstrate in the so-called one-and-a-half-meter society?

Stefania has been interviewed for an article by the Dutch critical public documentary series Tegenlicht / BackLight about protesting online. In light of COVID-19, what it means to protest is changing – read the full article here (in Dutch).

[BigDataSur-COVID] Surveillance in the Time of Coronavirus: The Case of the Indian contact tracing app Aarogya Setu

by Soumyo Das (Center of Information Technology and Public Policy at IIIT Bangalore)

Covid-19 has sent governments across the world back to the drawing board to design efficient pandemic containment strategies. In India, while reports suggest that the rise in the number of cases has slowed from exponential levels, the spread continues. The government, alongside enforcing a complete lockdown of all human activity in non-essential services and sectors, has considered the use of digital technologies (ICTs) to monitor and control the spread of the virus as an informational and preventive model. In line with other national governments, including those of Singapore and China, on April 2, 2020 the Government of India launched the ‘contact tracing technology’ initiative called ‘Aarogya Setu’. Developed in-house by the National Informatics Center of the Ministry of Electronics and Information Technology, the mobile application is available in eleven national languages. As of April 25, it had reached 7.5 million registered users.

The application, designed to keep track of an individual’s travel and contact history, can be downloaded by users voluntarily. It registers the personal information of users—including name, age, gender, health status, and recent travel history. Asking users to respond to a series of questions designed to assess whether the person is Covid-19 positive, Aarogya Setu generates a Unique Digital Identity for the individual. It also assigns the user a Covid-19 status: low risk, high risk, positive, or negative. It uses Bluetooth and GPS to collect all other data. The connectivity system (Bluetooth) allows the application to record details of other registered users that the registered individual comes in contact with. The location tracking system (GPS) registers the location of a user at 15-minute intervals. In the initial phase, users’ data are stored locally on their mobile device; however, for those who are assessed to be positive, the data is transferred from the mobile device to a national server for assessment and communication purposes—which raises a number of worries.
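For readers who prefer a concrete picture, the following minimal Python sketch renders the data flow just described. It is an interpretation of the public descriptions above, not the actual Aarogya Setu implementation (which had not been published at the time of writing); all field names and types are assumptions.

```python
# Hypothetical sketch of the described data flow; not the actual Aarogya Setu code.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Registration:
    device_id: str                 # the "Unique Digital Identity" assigned at sign-up
    name: str
    age: int
    gender: str
    recent_travel: bool
    risk_status: str = "low risk"  # low risk / high risk / positive / negative

@dataclass
class LocationSample:
    timestamp: float               # GPS point, recorded roughly every 15 minutes
    lat: float
    lon: float

@dataclass
class ProximityEvent:
    timestamp: float
    other_device_id: str           # another registered user seen over Bluetooth

@dataclass
class LocalStore:
    """Data kept on the phone; uploaded only if the user is assessed as positive."""
    registration: Registration
    locations: List[LocationSample] = field(default_factory=list)
    contacts: List[ProximityEvent] = field(default_factory=list)

    def maybe_upload(self, upload) -> None:
        # The worrying step: once a user is marked positive, the full local
        # history is handed over to a central server.
        if self.registration.risk_status == "positive":
            upload(self)
```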

Aarogya Setu raises privacy and security concerns

Since the application makes it possible to monitor the contact history and location of registered individuals who test positive, it is supposed to enable the Government to analyze the spread of the virus in a localized area, and to inform individuals who came into contact with positive individuals about self-isolation and further steps. Two things have to be kept in mind about the application. Firstly, its effectiveness depends on individuals self-reporting symptoms in an honest and timely manner. Secondly, it has been designed only for smartphone users, and is furthermore voluntary; thus it can effectively monitor only a subset of the population, namely India’s smartphone users, reported to have crossed 500 million in 2019 (https://www.news18.com/news/tech/smartphone-users-in-india-crossed-500-million-in-2019-states-report-2479529.html). Therefore, as Jason Bay, Senior Director at Singapore’s Government Technology Agency, argues, ‘automated contact tracing is not a panacea’, and ‘a competent human-in-the-loop system with sufficient capacity’ is a more effective strategy than over-relying on techno-centric solutions.

On top of that, Aarogya Setu comes with its fair share of privacy and security problems. To start with, with no comprehensive legislation on the Indian law books outlining the protection of individuals’ online privacy, application users have little to no choice but to agree to the privacy policy set by the developers as instructed by the Government of India. While said policy provides a sketchy outline of where and for how long individual data would be retained, the majority of the text offers nothing but a string of vague statements that fail to disclose who owns and controls access to the data. Specifically, the policy reads that ‘persons carrying out medical and administrative interventions necessary in relation to Covid-19’ can have access to the data. Given that practically all ministries and departments of the Government of India are playing an active role in devising strategies and implementing processes to contain the spread of the virus, policy statements like this open up ample possibilities of ‘interdepartmental exchanges of people’s personal information’, as the Internet Freedom Foundation has denounced in its analysis of the policy outlined in the application.

Beyond the concerns surrounding vague and undisclosed data use and data protection policies, the fact that individuals are assigned a Unique Digital Identifier number raises privacy concerns as well. Firstly, given that all individuals are provided with a static identity number, there are concrete chances of an identity breach. Moreover, all individuals in India have a national Unique Identity number (the so-called ‘Aadhar Number’) associated with the contact details of the same communication device used for Aarogya Setu, which amplifies the risks of identity and data sharing. For example, these two identities might be leaked, sounding alarm bells about the potential linkage of the biographical information, location and contact history of registered Aarogya Setu users with an individual’s Aadhar number. Meanwhile, numerous cases have been recorded of the National Unique Identity system being used to link identity metrics, including those of an individual’s financial accounts and social welfare program accounts, amongst others (Masiero & Das 2019). The same might happen with the individual data collected via Aarogya Setu. Finally, fears have multiplied with reports that software already available on the market can bypass the system’s security and extract an individual’s sensitive information.

Voluntary adoption?

While downloading the application is voluntary, certain states have made individual registration on the application mandatory. For example, in New Delhi, Mr. Surjit K. Singh, Director of the National Center for Disease Control, has strongly recommended that the Delhi government allow people to enter the capital only after they have installed the application, setting the tone for the use of the application to monitor inter-district and inter-state movement of people. Similarly, the Tamil Nadu state government has urged employees of all higher education institutions in the state to use the application. App-based platforms like Zomato have made Aarogya Setu registration mandatory for food delivery personnel, while IT companies have mandated the same for employees reporting to the office. Furthermore, both government and private organizations are actively pushing individuals to register on the Aarogya Setu application, amid reports of the government working towards procuring thousands of wristbands to be integrated with the application for greater individual monitoring.

A ‘Bridge to Wellness’

With no documentation for Aarogya Setu available at the time of writing, organizations such as the Internet Freedom Foundation and the Software Freedom Law Center have raised concerns that the application is something of a black box. They have called for more transparency on the algorithmic functioning of an application that is developed and promoted by the Government of India and deals with accessing and databasing the personal details of individuals. Ironically, the very name of the application, which can be roughly translated as ‘Bridge to Wellness’, points to empowerment and better futures. But what would it take for the app to yield the “wellness” it evokes?

It is time for individuals and action groups in the country to demand greater transparency regarding the functioning of the application. The government must address the privacy concerns of its citizens. It must provide clarity on who owns the data, where it is stored, who can access and use it, and how—and for how long it will be stored. Unless such concerns are addressed and effective measures taken at the earliest, Aarogya Setu promises only to cross people over into a world of algorithmic surveillance.

Author’s bio: Soumyo Das is a Research Scholar at the Center of Information Technology and Public Policy at IIIT Bangalore. His research primarily focuses on Information Systems in Organisations, and ICT4D. Soumyo holds an undergraduate degree in the applied sciences, and was formerly associated with a technology consulting firm as a Client Associate.

[BigDataSur] The Challenge of Decolonizing Big Data through Citizen Data Audits [1/3]

Author: Katherine Reilly, Simon Fraser University, School of Communication

A curious thing happened in Europe after the creation of the GDPR. A whole new wave of data audit companies came into existence to service companies that use personal data. This is because, under the GDPR, private companies must audit their personal data management practices. An entire industry emerged around this requirement. If you enter “GDPR data audit” into Google, you’ll discover article after article covering topics like “the 7 habits of highly effective data managers” and “a checklist for personal data audits.”

Corporate data audits are central to the personal data protection frameworks that have emerged in the past few years. But among citizen groups, and in the community, data audits are very little discussed. The word “audit” is just not very sexy. It brings to mind green eyeshades, piles of ledgers, and a judge-y disposition. Also, audits seem like they might be a tool of datafication and domination. If data colonization “encloses the very substance of life” (Halkort), then wouldn’t data auditing play into these processes?

In these three blog posts, I suggest that this is not necessarily the case. In fact, we need precisely to develop the field of citizen data audits, because they offer us an indispensable tool for the decolonization of big data. The posts look at how audits contribute to upholding our current data regimes, at an early attempt to realize a citizen data audit in Peru, and at emerging alternative approaches. The posts in this series will be published over the coming weeks:

  1. The Current Reality of Personal Data Audits [find below]

  2. A First Attempt at Citizen Data Audits [link]

  3. Data Stewardship through Citizen Centered Data Audits [link]

 

The Current Reality of Personal Data Audits

Before we can talk about citizen data audits, it is helpful to first introduce the idea of auditing in general, and then unpack the current reality of personal data audits. In this post, I’ll explain what audits are, the dominant approach to data audits in the world right now, and finally, the role that audits play in normalizing the current corporate-focused data regime.

The aim of any audit is to check whether people are carrying out practices according to established standards or criteria that ensure proper, efficient and effective management of resources.

By their nature, audits are twice removed from reality. In one sense, this is because auditors look for evidence of tasks rather than engaging directly in them. An auditor shows up after data has been collected, processed, stored or applied, and they study the processes used, as well as their impacts. They ask questions like “How were these tasks completed, and, were they done properly?”

Auditors are removed from reality in a second sense, because they use standards established by other people. An auditor might ask “Were these tasks done according to corporate policy, professional standards, or the law?” Auditors might gain insights into how policies, standards or laws might be changed, but their main job is to report on compliance with standards set by others.

Because auditors are removed from the reality of data work, and because they focus on compliance, their work can come across as distant, prescribed – and therefore somewhat boring. But when you step back and look at the bigger picture, audits raise many important questions. Who do auditors report to and why? Who sets the standards by which personal data audits are carried out? What processes does a personal data audit enforce? How might audits normalize corporate use of personal data?

We can start to answer these questions by digging into the criteria that currently drive corporate audits of personal data. These can be divided into two main aspects: corporate policy and government regulation.

On the corporate side, audits are driven by two main criteria: risk management and profitability. From a corporate point of view, personal data audits are no exception. Companies want to make sure that personal data doesn’t expose them to liabilities, and that use of this resource is contributing effectively and efficiently to the corporate bottom line.

That means that when they audit their use of personal data, they will check to see whether the costs of warehousing and managing data are worth the reward in terms of efficiencies or returns. They will also check to see whether the use of personal data exposes them to risk, given existing legal requirements, social norms or professional practices. For example, poor data management may expose a company to the risk of being sued, or the risk of alienating its clientele. Companies want to ensure that their internal practices limit exposure to risks that may damage their brand, harm their reputation, incur costs, or undermine productivity.

In sum, corporate data audits are driven by, and respond to, corporate policies, and those policies are organized around ensuring the viability and success of the corporation.

Of course, the success of a corporation does not always align with the well-being of the community. We see this clearly in the world of personal data. Corporate hunger for personal data resources has often come at the expense of personal or community rights.

Because of this, governments insist that companies enforce three additional regulatory data audit criteria: informed consent, personal data security, and personal data privacy.

We can see these criteria reflected clearly in the EU’s General Data Protection Regulation. Under the GDPR, companies must ask customers for permission to access their data, and when they do so, they must provide clear information about how they intend to use that data.

They must also account for the personal data they hold, how it was gathered, from whom, to what end, where it is held, and who accesses it for what business processes. The purpose of these rules is to ensure companies develop clear internal data management policies and practices, and this, in turn, is meant to ensure companies are thinking carefully about how to protect personal privacy and data security. The GDPR requires companies to audit their data management practices on the basis of these criteria.
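As a rough illustration of what this accounting exercise implies in practice, here is a minimal data-inventory sketch in Python; the structure and field names are assumptions drawn from the description above, not the GDPR’s prescribed format.

```python
# Hypothetical sketch of a personal data inventory a company might keep in order
# to account for what it holds, how it was gathered, to what end, where, and by whom.

from dataclasses import dataclass
from typing import List

@dataclass
class DataHolding:
    category: str           # what personal data is held (e.g. "email address")
    source: str             # how it was gathered and from whom
    purpose: str            # to what end it is used
    storage_location: str   # where it is held
    accessed_by: List[str]  # which business processes or teams access it

inventory = [
    DataHolding("email address", "account registration form", "billing and support",
                "EU data centre", ["billing", "customer support"]),
    DataHolding("purchase history", "point-of-sale system", "loyalty programme analytics",
                "cloud warehouse", ["marketing analytics"]),
]

# An audit then checks each holding against policy and law, for example:
for h in inventory:
    assert h.purpose, f"{h.category}: no declared purpose"   # every holding needs a documented purpose
```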

Taking corporate policy and government regulation together, personal data audits are currently informed by 5 criteria – profitability, risk, consent, security and privacy. What does this tell us about the management of data resources in our current data regime?

In a recent Guardian piece Stephanie Hare pointed out that “the GDPR could have … [made] privacy the default and requir[ed] us to opt in if we want to have our data collected. But this would hurt the ability of governments and companies to know about us and predict and manipulate our behaviour.” Instead, in the current regime, governments accept the central audit criteria of businesses, and on top of this, they establish the minimal protections necessary to ensure a steady flow of personal data to those same corporate actors. This means that the current data regime (at least in the West) privileges the idea that data resides with the individual, and also the idea that corporate success requires access to personal data.

Audits work to enforce the collection of personal data by private companies, by ensuring that companies are efficient, effective and risk-averse in the collection of personal data. They also normalize corporate collection of personal data by providing a built-in response to security threats and privacy concerns. When the model fails – when there is a security breach or privacy is disrespected – audits can be used to identify the glitch so that the system can continue its forward march.

And this means that audits can, indeed, serve as tools of datafication and domination. But I don’t think this necessarily needs to be the case. In the next post, I’ll explore what we’ve learned from experimenting with citizen data audits, before turning to the question of how they can contribute to the decolonization of big data in the final post.

 

About the author: Dr. Katherine Reilly is Associate Professor in the School of Communication at Simon Fraser University in Vancouver, Canada. She is the recipient of a SSHRC Partnership Grant and an International Development Research Centre grant to explore citizen data audit methodologies alongside Derechos Digitales in Chile, Fundacion Karisma in Colombia, Sula Batsu in Costa Rica, TEDIC in Paraguay, HiperDerecho in Peru, and ObservaTIC in Uruguay.

[BigDataSur] The dilemma of making migrants visible to COVID-19 counting

The COVID-19 pandemic requires reconsidering the relationship between data and invisible populations as a form of de facto civil inclusion. While most forms of data management of populations are problematic, under which conditions would counting be just?

Annalisa Pelizza and Yoren Lausberg, Processing Citizenship research program, University of Bologna

Stefania Milan, DATACTIVE research program, University of Amsterdam

This post was originally published on Open Democracy on April 28, 2020

On March 13th, announcing that Europe had become the epicenter of the COVID-19 pandemic, World Health Organization Executive Director Dr Michael Ryan made a plea on behalf of invisible populations. “We cannot forget migrants, we cannot forget undocumented workers, we cannot forget prisoners,” he argued. Within just a few days, civil societies around the world would discover that invisibility is indeed a recurrent companion of the virus. Exceptionally hard to contain due to its asymptomatic contagion and long incubation period, COVID-19 has also been hard to classify as a cause of death, complicating the efforts to trace it and count its victims. Despite narratives about its allegedly democratic character, the virus seems to hit weak, invisible populations the hardest. The elderly confined in care homes are being decimated across Europe, largely uncounted. From China to Pennsylvania, the toll of people who passed away in the solitude of their homes—or of their shelters—does not appear in official statistics. Undocumented migrants are dying from the virus because they are too afraid to seek help, and their numbers typically do not reach official statistics. If “being counted” is today more than ever a condition of existence and care, Western countries are failing to account for the health conditions of invisible populations like people on the move. In the days of COVID-19, as never before, what these dramatic (missing) numbers make apparent is that invisibility may mean death.

The COVID-19 pandemic confronts us with a dilemma with regard to invisibilized populations, and migrants in particular—one which has to do simultaneously with societal and technological concerns. On one hand, visibility gaps are a systemic aspect of population management that might be welcomed by policy makers and populations alike. Indeed, the illusion of a “data panopticon” does not take into account the conditions of data collection, data gaps and the limits of system interoperability: not everyone is counted in all systems, and not in the same way. Such invisibility might serve the needs of informal economies and of unscrupulous politicians ready to mobilize security concerns. From a different perspective, from the homeless to prisoners, from migrants to sex workers, invisibility can be seen as protection from care that too often resembles control and surveillance.

On the other hand, a surge in the visibility of migrant populations might help curb the contagion and avoid massive spread within vulnerable populations. Indeed, being invisible translates into the inability to access crucial services in the time of the pandemic, health care above all. Access to testing and care requires insurance, and insurance requires being countable. Even when the costs of insurance can be offset by the collectivity, being countable remains a key condition of access. In the U.S., for example, the second coronavirus relief package, known as the Families First Coronavirus Response Act, has extended testing to the Medicaid-eligible population, even when uninsured, but not to undocumented migrants, nor to other temporary residents.

We suggest that, while in normal conditions populations on the move may prefer to remain invisible rather than face repression, stigma or deportation, the current situation requires reconsidering the relationship between data, populations and (in)visibility. We thus wonder under which conditions including invisibilized populations in the general COVID-19 count could turn out to be a just solution. To be sure, some caution is warranted. In the best-case scenario, instead of exposing vulnerable populations, such a reconsideration might even entail a de facto form of civil inclusion. What follows makes the point by considering migrants and undocumented populations as especially vulnerable to COVID-19 due to their invisibilized status in official registries and administration, and due to the barriers to formal and professional care that this invisibility entails. While most of our examples originate in the European continent, our main terrain of study, we believe there is something universal in this exercise that can also inform the way other countries and communities relate to people on the move in the time of the pandemic.

People on the move do not show up in COVID-19 counts

António Vitorino, Director General of the International Organization for Migration, has recently called for a universal response to COVID-19, regardless of migratory status. Portugal has specifically addressed the migrant condition in its response to the pandemic. It has extended to third-country nationals with pending applications access to the same services as the resident population: from national health care to welfare benefits, from bank accounts to work and rental contracts. The Portuguese response constitutes a temporary de facto inclusion of foreign citizens, in the name of pragmatism as well as of human rights. It is, however, unique in a continent that has instead halted most bureaucratic procedures and data processing involving people on the move. Sweden, the Netherlands and Belgium have suspended administrative services for migrants, refugees and asylum-seekers. After halting asylum procedures, Greece has put migrants living in overcrowded camps under quarantine. In Serbia, along the so-called Balkan Route, armed forces have taken over the security of about 150 social welfare institutions, 120 medical facilities and 20 migrant camps, de facto locking migrants in. Similarly, Bosnia and Herzegovina has introduced tighter controls in the reception centres, which migrants and refugees can no longer leave or enter. Italy has declared its ports “unsafe”; asylum and police offices are closed and data processing is suspended. Meanwhile, an estimated 200,000 undocumented farmworkers in Italy live in cramped informal settlements in precarious hygienic conditions and without running water, which makes it impossible to implement the social distancing and hygiene measures imposed to slow down the contagion. In France, many sleep in makeshift camps or on the streets, with local nongovernmental organizations (NGOs) sounding alarm bells about an upcoming “health scandal” and questioning the government’s lack of an adequate response. In the UK, NGOs point out that the suspension of various support networks increasingly puts already precarious people at risk, noting how the hostile environment deters undocumented people from seeking help. All in all, in many European countries migrants are not included in COVID-19 counts, which equally hinders access to care and relief systems. What are the consequences of this situation, and how can it be overturned?

The consequences of invisibility

The invisibility of moving populations in times of pandemic can have health, economic, and social consequences. First, its effects stack onto existing social and institutional inequalities. Vulnerable populations are left behind in addressing the public health threats of the coronavirus outbreak. As already hostile environments bar mobile populations from seeking professional and official health care, the spread and effects of the coronavirus will be exacerbated among these populations. They are already vulnerable due to a lack of accessible information and access to hygiene facilities, but also because their economic vulnerability may force them to seek employment when others can choose to stay at home. The exclusion of some people from comprehensive efforts to counter the spread of COVID-19 will cause harsher and more prolonged health effects among these groups, with consequences not only for their wellbeing but also for the general wellbeing of society at large, as failure to contain the virus will exacerbate its spread.

Second, invisibility may entail dramatic asymmetries in the economy and in labor relations. Not only does invisibility allow exploitation in agricultural economies, construction work and temporary job markets, among others; it also marks a harsh asymmetry between migrant workers’ contribution to the COVID-19 response and their under-representation in statistics. For instance, European countries like Austria and Germany are importing farmhands from Eastern Europe to harvest seasonal vegetables like asparagus. The Italian Minister of Agriculture, Teresa Bellanova, has recently proposed giving some of the estimated 600,000 undocumented immigrants in the country temporary work permits to plug the labor gap, which is particularly large and urgent in the agri-food sector. Yet asymmetries in both counting and rights continue to permeate job sectors that are key to the coronavirus response. Food delivery workers in European cities are largely migrants who cannot afford to “stay home” and lose income. According to the Migration Policy Institute, in the U.S. the foreign-born represent 38 percent of home care workers and significant shares of workers in food production and distribution, all sectors at the frontline of the coronavirus response.

Third, invisibility has societal consequences too, as it helps fuel racism and xenophobic reactions. In Italy, for example, pseudoscientific myths are spreading on social media, in a country where migration is often associated with different skin traits and hospitalized patients are largely white. These resurgent racialized explanations of alleged immunity to the virus not only fuel racist narratives; they also lack any scientific basis and disregard the empirical evidence of African-American communities tragically and disproportionately hit by the virus on the other side of the Atlantic. They furthermore reignite racial classifications and the genetic pseudoscientific thinking that we hoped had been buried with nineteenth-century colonial anthropology. Finally, they counteract socio-scientific explanations and the policy action that follows from them. If temporary residents are less prone to ask for support in case of COVID-19 symptoms, this might be due to their tendency to associate the health care system with repressive authorities, to scarce linguistic skills or to fragmented social networks: all explanations that should be investigated in order to curb the contagion.

Our proposals for just visibility

All things considered, one might wonder whether the current emergency requires reconsidering the relationship between data, visibility and populations. Institutional solutions appear to be moving timidly in the direction of making migrant populations more visible. In Italy, the introduction of mandatory self-certification to leave home was enough to halt the agricultural production chain, as the workforce is largely made up of irregular migrants. As a result, the Italian Agriculture Ministry is attempting to overcome the impasse by creating a new registry of agricultural labor. Even U.S. scholar and author Shoshana Zuboff, a well-known fierce critic of what she herself terms “surveillance capitalism”, has surprisingly argued, in an interview with the Italian daily La Repubblica, that contact tracing apps should be mandatory and that data should be managed by public bodies. But Zuboff’s argument falls short when it meets vulnerable populations who are not longing to be traced and are inherently suspicious of authorities. Becoming visible through an app of this kind does not sit well with the fears of repression and deportation these vulnerable populations live with.

The question is therefore: how can visibility be just? The various consequences of invisibility we have identified do not exist in isolation. Forms of invisibilization stack upon each other. As mentioned above, mobile populations often work in already precarious or exploitative sectors, which have suddenly been foregrounded as “essential” during the pandemic. This creates a paradox: while the work is made visible as vital, the workers are barred from accessing civil rights, are still kept out of the count and are thus excluded from aid and relief. It is then crucial to consider what inclusion in the COVID-19 response is for: is it a temporary visibilization in disease tracing and tracking, so that those who have been immunized can return to orchards and elderly care homes to become invisibilized workers again? Or will access to civil rights be granted on a permanent basis to all who are still excluded from it?

In facing the visibility/invisibility dilemma for populations on the move, diverse scenarios open up, from repressive authorities taking advantage of temporary disclosure to identify and track undocumented migrants, to a de facto form of civil inclusion. De facto inclusion would entail universal access to civil institutions such as health care, welfare and civil rights. It would be an infrastructural (but nevertheless political) way to perform people on the move as members of civil communities, while at the same time protecting them through civil rights. De facto inclusion would entail protected visibility. In what follows we reflect on the conditions under which the counting of invisibilized populations can lean towards this second scenario.

We argue that a multipronged approach is needed to address the problem of making the invisible population of migrants countable under fair conditions. Firstly, we need to give careful consideration to how we count and what digital infrastructure we use toward this end. For starters, counting should respect the principles enshrined in the EU General Data Protection Regulation, most notably data minimization (i.e., data collection should be limited to what is necessary) and purpose limitation (i.e., data should be collected for specified, explicit and legitimate purposes). But it should also commit to fairness and transparency, whereby personal data are processed in a way that is transparent to data subjects, and, we would add, abide by democratic oversight and accountability. In other words, the counting we propose should be aimed at the protection of vulnerable populations and the societies surrounding them, rather than at exclusion, discrimination or repression. To this end, we need to ensure that data collection and use are discrimination- and future-proof, and that data about, for instance, health conditions collected during the pandemic emergency are not used against these vulnerable populations at a later stage. In this process of envisioning fair rules for counting vulnerable populations, the infrastructural dimension must be given adequate consideration. Although “invisible” in themselves, digital infrastructures, including how they are designed and integrated and who owns them, are an integral part of any decision-making with regard to counting, especially as concerns public versus private ownership and oversight.

Secondly, and strictly related, access to civil rights for people on the move must also include the right to be deleted from any database, and not to be traced beyond the original goals (i.e., the purpose limitation mentioned in the GDPR). Data about people who have been on the move are already stored in the systems of identification and registration used at the border, with the risk of carrying stigmas far and wide. On top of that, entering a health care or welfare database often means being enrolled in a system of cross-checks that can be invasive of personal life and heavily influence intimate choices. As many counts and registries are also modes of control and surveillance, inclusion should also mean inclusion in the right to be forgotten. Furthermore, any restrictive or invasive measure should come with adequate sunset provisions, whereby any data collection that is in some way invasive of people’s privacy ceases to have effect when, for example, a vaccine becomes available and is widely administered.
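To make these principles more tangible, here is a minimal sketch, in Python, of how a registry record could encode data minimization, purpose limitation and a sunset clause. The field names, the single allowed purpose and the deletion trigger are hypothetical assumptions for illustration, not a description of any existing system.

from dataclasses import dataclass
from datetime import date
from typing import Optional

# Purpose limitation: only one declared, legitimate purpose is accepted.
ALLOWED_PURPOSES = {"public_health_response"}


@dataclass
class RegistryEntry:
    pseudonym: str                  # data minimization: no name, address or legal status stored
    health_status: str              # only what the stated purpose requires
    purpose: str = "public_health_response"
    sunset: Optional[date] = None   # set when the emergency measure is lifted


def record(entry: RegistryEntry, registry: list) -> None:
    """Refuse any collection that exceeds the declared purpose."""
    if entry.purpose not in ALLOWED_PURPOSES:
        raise ValueError("collection outside the declared purpose is not allowed")
    registry.append(entry)


def apply_sunset(registry: list, today: date) -> list:
    """Sunset clause / right to be forgotten: purge entries whose retention period has elapsed."""
    return [e for e in registry if e.sunset is None or e.sunset > today]


if __name__ == "__main__":
    reg: list = []
    record(RegistryEntry("p-042", "symptomatic", sunset=date(2021, 6, 1)), reg)
    reg = apply_sunset(reg, today=date(2021, 7, 1))
    print(len(reg))  # 0: the entry is purged once the sunset date has passed

The design choice illustrated here is that deletion is not an afterthought but a property of the record itself, which is the spirit of the sunset provisions discussed above.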

Thirdly, as we know that the practice of counting speaks for the counter more than for the counted, we propose an alliance between different counting entities rallying around the need for public critical care. These entities include, at the bare minimum, migrant-led organizations, shelters, health care institutions, unions, and organizations close to the ground. This comes with its own set of challenges, including questions of database interoperability and principles, as the various organizations will have to gather around a concern for care and public health while bringing their own experiences and values. The alternative, however, would leave us with either a prolonged public health crisis or the centralization of population data collection in the hands of state authorities or private corporations.

Finally, and most importantly, the counting we propose should take stock of the European migration regime and invert the priority given since 2015 to securitization at the expense of health data. Our research at Processing Citizenship has indeed shown that in European frontline countries the assessment of health conditions was originally the primary concern upon disembarkation of people rescued at sea. However, with the so-called “Hotspot approach” introduced in 2015, priority shifted to filling administrative databases for security purposes. If anything, COVID-19 is a powerful reminder of the need to restore the original priority given to health data in population management, rather than to administrative information. In sum, we argue that the identification and tracking of migrants for purely security purposes should be replaced by health care assessment through specialized, non-interoperable information systems that count resident populations and those on the move together.

To conclude, we cannot but note that the bulk of our proposals, especially those concerning data protection, data minimization, purpose limitation and sunset clauses, are also valid for the deployment of contact tracing apps for the general population. This leads us to wonder to what extent any counting measure to contain the virus can be effective while distinguishing among populations. By considering how to fairly include invisibilized populations in what is today’s most pressing count, we might end up realizing that even the classifications applied to visible populations are being redefined. A more comprehensive solution to this conundrum would be to rethink critical services so that they include all residents of a given polity, regardless of their status. If so, the challenge is making sure that this redefinition is as inclusive as possible. This might mean changing the ways Europe sees its people and who these people are, and ultimately the role of data infrastructures in this inclusive recounting.

 

Acknowledgements

The authors wish to thank Chiara Milan (University of Graz) for sharing her knowledge about the current situation in the Balkans concerning populations on the move and the pandemic.

The authors disclose receipt of the following financial supports for the research, authorship, and publication of this article: Processing Citizenship (2017–2022, Grant Agreement No. 714463) and DATACTIVE (2015–2020, Grant Agreement No. 639379), both of which have received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program.

[BigDataSur] Data journalism without data: challenges from a Brazilian perspective

Author: Peter Füssy

For the last decade, data journalism has attracted attention from scholars, some of whom have provided distinct definitions in order to understand the changes in journalistic practices. Each one of them emphasizes a particular aspect of data journalism, from new forms of collaboration to open-source culture (Coddington, 2015). Yet, even among clashing definitions, it is possible to say they all agree that there is no data journalism without data. But which data? Relevant data does not generate itself, and it is usually tied to power, economic and/or political struggles (De Maeyer et al., 2015). While journalists in the Global North mostly benefit from open government mechanisms for public scrutiny, journalists working in countries with less transparency and a weaker democratic tradition still face infrastructural issues when putting together data and journalism (Borges-Rey, 2019; Wright, Zamith & Bebawi, 2019).

In the next paragraphs, I draw on academic research, reports, projects, and my own experience to briefly problematize one of the most recurring challenges to data journalism in Brazil: access to information. Since relevant data is rarely immediately available, a considerable part of data-driven investigative projects in Brazil relies on the Freedom of Information (FOI) law that forces governments to provide data of public interest. Also known as Access to Information or Right to Information, these acts are an essential tool to increase transparency, accountability, citizens’ agency, and trust. Yet implementation of and compliance with the regulation in Brazil are deficient at all levels of government (Michener, 2018; Abraji, 2019; Fonseca, 2020; Venturini, 2017).

More than just a bureaucratic issue inherited from years of dictatorship and a lack of capacity, this inefficiency is also a political act. As Torres argued, taking Mexico as an example, institutional resistance to transparency is carried out through subtle and non-political actions that diminish data activists’ agency and have the effect of producing or reinforcing inequalities (Torres, 2020). In the case of Brazil, however, recent reports imply that institutional resistance to transparency is not necessarily subtle. It may also be a political banner.

Opacity and Freedom of Information

According to Berliner, the first FOI act was passed in Sweden in 1766, but the recent wave follows the example of the United States’ act from 1966. After the US, there is no clear pattern for adoption; for example, Colombia passed a law in 1985, while the United Kingdom did so only in 2000. FOI acts are more likely to pass when there is a highly competitive domestic political environment, rather than pressure from civil society or international institutions (Berliner, 2014).

Sanctioned in 2011, the Brazilian FOI act came into effect only in 2012. In its first six years, 611,300 requests were filed with the federal government alone (excluding state and municipal bodies). The average of 279 requests per day, or 11 per hour, suggests how eager the population was to decentralise information. Although public authorities often give insufficient responses while claiming that the request was granted, it is fair to say the law was starting to “stick”. Of the total, 458,400 requests (75%) resulted in partial or full access to the requested information (Valente, 2018).

At the beginning of 2019, while President Jair Bolsonaro was making his first international appearance as Brazilian head of state in Davos, Vice President General Hamilton Mourão signed a decree limiting access to information by allowing government employees to classify public data as confidential up to the top-secret level, which makes documents unavailable for 25 years (Folha de S.Paulo, 2019). Until then, this could be done only by the president and vice president, ministers of state, commanders of the armed forces and heads of diplomatic missions abroad. Facing a backlash from civil society, Bolsonaro lost support in Congress for the measure and withdrew it a few weeks later. Nonetheless, reports show that problems with FOI requests are growing under his presidency.

Data collected from the Brazilian FOI electronic system by Agência Pública revealed that the federal government’s denials of requests on the grounds of a “fishing expedition” increased from 8 in 2018 to 45 in the first year of Bolsonaro’s presidency (Fonseca, 2020). The term “fishing expedition” is pejorative and usually refers to secret or unstated purposes, such as using an unrelated investigation or questioning to find evidence to be used against an adversary in a different context. According to the Brazilian FOI act, however, the reason behind a request must not be taken into account when deciding whether or not to provide the information.

At the same time, journalists’ perception of the difficulty of retrieving information via FOI requests reached its highest level in 2019, when 89% of interviewed journalists described issues such as answers arriving after the legal deadline, missing information, data in closed formats, and denial of information (Abraji, 2019). In 2013, 60% had reported difficulties, and the number dropped to 57% in 2015.

For example, after more than one year in office, Bolsonaro’s presidency still refuses to make public the guest list of his inauguration reception. In addition to the guest list, the government keeps secret more than R$15 million in expenses made with corporate cards of the Presidency and the Vice President’s Office. The confidentiality remains even after a Supreme Court decision overturned it in November last year.

More from less

Despite the challenges, Brazilian journalists are following the quantitative turn in the field and creating innovative data-driven projects. As reported by the Brazilian Association of Investigative Journalism (Abraji), at least 1,289 news stories built on data from FOI requests were published from 2012 to 2019. In 2017, the “Ctrl+X” project, which scraped thousands of lawsuits to expose politicians trying to silence journalists in the courts, won a prize at the Global Editors Network’s Data Journalism Awards.

In the following year, G1 won the public choice award with a project that tracked every single murder in the country for a week. The results from the “Violence Monitor” showed a total of 1,195 deaths, one every eight minutes. However, this project did not rely on FOI requests but on an unprecedented collaboration of 230 journalists employed by the biggest media group in Brazil, Globo. They gathered the data from scratch at police stations all over the country to tell the stories of the victims. In addition, G1 partnered with the Universidade de São Paulo for the analysis and launched a campaign on TV and social media so that people could identify some of the victims.

Despite the lack of resources, freedom, and safety, these projects show that data journalism can be a tool to rebuild trust with audiences. However, activism to break down resistance to transparency is an even more prominent challenge when opacity seems to be encouraged by institutional actors.

 

About the author

Peter is a journalist trying to explore new media in depth, from everyday digital practices to the undesired consequences of a highly connected environment. After more than 10 years of writing and multimedia reporting for some of the most relevant news outlets in Brazil, he is now a second-year Research Master’s student in Media Studies at the University of Amsterdam.

 

References

Berliner, Daniel. “The political origins of transparency.” The Journal of Politics 76.2 (2014): 479-491.

Borges-Rey, Eddy. “Data Journalism in Latin America: Community, Development and Contestation.” Data Journalism in the Global South. Palgrave Macmillan, Cham, 2019. 257-283.

Coddington, Mark. “Clarifying journalism’s quantitative turn: A typology for evaluating data journalism, computational journalism, and computer-assisted reporting.” Digital Journalism 3.3 (2015): 331-348.

De Maeyer, Juliette, et al. “Waiting for data journalism: A qualitative assessment of the anecdotal take-up of data journalism in French-speaking Belgium.” Digital Journalism 3.3 (2015): 432-446.

Fonseca, Bruno. Governo Bolsonaro acusa cidadãos de “pescarem” dados ao negar pedidos de informação pública. Agência Pública. 6 Feb, 2020. 

Michener, Gregory, Evelyn Contreras, and Irene Niskier. “From opacity to transparency? Evaluating access to information in Brazil five years later.” Revista de Administração Pública 52.4 (2018): 610-629.

Michener, Gregory, et al. “Googling the requester: Identity‐questing and discrimination in public service provision.” Governance (2019).

Valente, Jonas. “LAI: governo federal recebeu mais de 600 mil pedidos de informação”. Agência Brasil. May 16, 2018. 

Venturini, Lilian. “Se transparência é regra, por que é preciso mandar divulgar salários de juízes?”. Nexo Jornal. São Paulo, 3 Sept. 2017.

Wright, Kate, Rodrigo Zamith, and Saba Bebawi. “Data Journalism beyond Majority World Countries: Challenges and Opportunities.” Digital Journalism 7.9 (2019): 1295-1302.

[blog] The four (almost) invisible enemies in the first pandemic of the data society era

by Philip Di Salvo and Stefania Milan

Originally published in Italian on Il Manifesto, 24 April 2020

Big data and Covid. The pandemic is bringing to the surface phenomena and characteristics of the data society that, in emergency circumstances like those of these weeks, risk turning into reality what until recently could be considered mere extreme scenarios, unexpected consequences, or side effects.

The global COVID-19 pandemic is the first to unfold on such a large scale, and in such severe forms, at an advanced stage of the so-called “data society”. We find ourselves, in fact, at a watershed moment for our very definition of what it means to live in an era in which human activities of almost every kind are turned into data.

An extreme situation such as the one we are living through in these weeks of near-total lockdown inevitably reveals all the shades of this phenomenon, from the most virtuous to the most potentially disquieting.

No event in recent history comes close to the current global pandemic in its power to define the present. One has to go back two decades, to September 11, 2001, to find another comparable moment of all-encompassing stress-testing of the cultural assumptions and foundations of our society as a whole. Yet 2001 and 2020 have little in common when it comes to technological ecosystems, digital infrastructures and, consequently, the social and political impacts of these technological arrangements.

The data society puts at its core the production of data and their use to create added value, from traffic management to the improvement of public services, from personalized digital advertising to contact tracing apps against COVID-19.

The paradox is that, even under normal circumstances, it is we ourselves who generate most of these data, for example through smartphones, credit cards, online shopping and social media. The very large-scale monetization of data about our preferences and behaviors has generated the value on which companies such as Google and Amazon have been built, companies whose greatest strengths, or monopolies, lie in analysis and prediction.

As citizens, however, we also produce data when we turn to the public health care system or simply walk around our cities, by now populated by a myriad of surveillance cameras and “smart” facial recognition systems. Many of these data then end up in private hands, even when they appear to be under the control of state entities: the servers are often run by companies such as Accenture, IBM or Microsoft.

This variable geography of data, infrastructures, and public and private entities is a potentially explosive cocktail, above all because of its lack of transparency towards users and the risks it poses to individual and collective privacy. The data society is indeed also the cradle of what the American economist Shoshana Zuboff has called “surveillance capitalism”, whose engine is the commodification of personal information, even at the cost of reducing our capacity to act independently and make free choices. In other words, it is our very being citizens that changes, and not necessarily for the better.

The multi-level drama, from the human to the economic to the social, unleashed by the COVID-19 pandemic helps expose the darkest and most controversial sides of this system of data commodification.

The pandemic is indeed bringing to the surface phenomena and characteristics of the data society that, in emergency circumstances like those of these weeks, risk turning into reality what until recently could be considered mere extreme scenarios, unexpected consequences, or side effects.

As far as the social sciences are concerned, there are at least four distinct areas in which the pandemic is acting as an accelerator of potentially dangerous dynamics that had so far remained largely embryonic.

Setting aside any determinism, whether technological or epidemiological, at least four distinct tendencies have emerged so far: uncritical positivism, information disorder, vigilantism and the normalization of surveillance. These are four enemies made almost invisible by the human drama of the pandemic, yet they wound the collectivity almost as much as the virus does, and they are bound to have long-term consequences that are dangerous to say the least. Let us look at them one by one.

Uncritical positivism

The first invisible enemy is associated with a verb in everyday use, “to count”, an action that in these days is rightly presented to us as an ally. “Let the numbers speak”, we often hear. Who does not hold their breath waiting for the civil protection agency’s tables announcing the number of deaths, recoveries and hospitalizations? Counting, and even more so counting ourselves, is for every society an important moment of self-awareness: just think of censuses, which play a crucial role in the definition of the nation state. What is more, counting has to do with the very essence of pandemics: large numbers.

As a rule, we tend to believe statistical data more than words, because we associate them with a sort of higher-order truth. This is a phenomenon also known as “dataism”, an ideology that places excessive faith in the solutionist and predictive power of data.

Faith in numbers has distant roots, to be traced back to the days of nineteenth-century positivism, which postulated trust in science and in scientific-technological progress. In his “Discourse on the Positive Spirit” (1844), the philosopher Auguste Comte explains how positivism puts back at the center “the real, as opposed to the chimerical”, and how it aims to “oppose the precise to the vague”, presenting itself as “the contrary of negative”, that is, as a proactive attitude of confidence in the future.

To be sure, putting concrete facts back at the center of the narration of the virus and of the search for solutions can only be a good and right thing after a dark season for science, in which even vaccines were called into question. Unfortunately, however, faith in numbers is often misplaced because, as has often been noted in these weeks, official data tend to tell a limited and often misleading portion of the pandemic reality.

Nevertheless, numbers and data are at the heart of the narration of the virus. It is, however, an inaccurate narration, often decontextualized, and no less anxiety-inducing for that. The result is an uncritical positivism that tends to ignore context and does not explain how the counting is done and why. Decisions involving entire nations are taken and justified on the basis of numbers that do not necessarily rest on reliable data.

Walmart’s Intelligent Retail Lab in the US – photo AP

Information disorder in the pandemic

The informational context of a pandemic has been equated to an “infodemic”, an expression used first and foremost by the World Health Organization itself to describe circumstances in which there is an overabundance of information, accurate or not, that makes it very difficult to navigate the news or even just to tell reliable sources from unreliable ones. The pandemic is consequently also a particularly risky situation with regard to the spread of various types of “information disorder”, such as different forms of disinformation and misinformation.

In the COVID-19 infodemic, bad information has manifested itself in various ways. The Reuters Institute for the Study of Journalism (RISJ) at the University of Oxford has published one of the first studies on the characteristics of the phenomenon in this pandemic, focusing on a sample of English-language news items vetted by fact-checking initiatives such as the non-profit network First Draft. The study, a first exploratory attempt to analyze the problem, shows how the sources of disinformation about the pandemic can be either “top-down” (when promoted by politicians or other public figures) or “bottom-up”, that is, originating from ordinary users.

While the first type accounts for 20% of the total sample analyzed by the RISJ, it is also true that top-down disinformation tends to generate much more buzz on social media than content produced from below. The RISJ also writes that the largest share of the misinformation that has emerged in these weeks consists of “reconfigured” content, that is, content modified in some of its parts. Only a minority (around 38%) consists of content invented entirely from scratch.

The scholar Thomas Rid, one of the world’s leading experts on disinformation campaigns in the field of national security (to whose history he has devoted a much-awaited forthcoming book, “Active Measures”), has also pointed out in the New York Times how the pandemic can constitute particularly fertile ground for potential “information warfare” operations aimed at creating confusion and tension in the public opinion of the countries hit, in the wake of what was seen in the US during the 2016 presidential election. Nor should we forget the misinformation that spills over into racism and feeds xenophobic impulses, such as the false claim, circulated in various circles, that Africans are supposedly immune to the virus.

A Bitcoin data center in Virginia – photo AP

Vigilantism (digital and otherwise)

These days, many runners have found themselves rudely called out, and in some cases even physically assaulted, by fellow citizens irritated by the potential danger to public health that an individual out and about may represent.

People on their way to work have reported being subjected to insults of various kinds for not having “stayed home”. Countless videos have been uploaded to social media to denounce those who had allegedly gone out for a stroll in disregard of the lockdown. This phenomenon is known in criminology and sociology as “vigilantism”, as the criminologist Les Johnston explained back in 1996.

Vigilantism concerns private citizens who voluntarily take on roles that are not theirs to take, such as monitoring other people’s behavior and publicly denouncing others’ misdeeds, real or presumed. Through actions in defense of social norms, the vigilante seeks to offer guarantees of security to themselves and to others.

The advent of social media and mobile devices has favored the large-scale spread of a “digital vigilantism” which, as the University of Rotterdam researcher Daniel Trottier explains, aims to attack and shame whoever fails to respect the rules through exposure to public ridicule, an exposure that is often long-lasting and disrespectful of other people’s privacy, and that feeds aggression and feelings of retaliation.

While the phenomenon is typical of historical moments in which the established order is at risk, or is perceived to be, its appearance and spread in the days of the coronavirus emergency seems almost inevitable. COVID-19 digital vigilantism is, however, particularly risky, for at least two reasons. First of all, this need to hate “those who leave home” creates exclusion and social stigma, pointing fingers at and exposing individuals on the basis of purely visual clues that cannot discriminate between those who are actually breaking the rules and those who have a good reason to be out (for example, because they are going to work).

This extremely dangerous creation of “enemies of the people” results in substantial psychological harm, from feelings of loneliness to incomprehension to the desire for retaliation, harm that will most likely outlive the coronavirus emergency.

This phenomenon also ends up justifying similar transgressive behaviors, on the basis of the flawed reasoning that “if others do it, I can do it too”. Secondly, vigilantism, digital or not, divides the collectivity, with serious and lasting effects in terms of social divisions between the presumed good and bad, between the deserving and the undeserving. It ends up eroding the much-needed narrative of a community that is united and strong precisely because of its unity, capable of facing the emergency rationally, at the very moment in which there is an extreme need to know that individual sacrifice feeds the collective effort.

Facial recognition systems in Germany – photo AP

Privacy and the normalization of surveillance

The pandemic has also rekindled the debate on the role of privacy in the data society and, in particular, in a health emergency context like that of these weeks. From many quarters, following the example, or the alleged “models”, offered by some variably democratic or non-democratic Asian countries such as China, Singapore and South Korea, there have been calls to adopt technological solutions of surveillance and digital monitoring in order to slow down the spread of the virus by digitally monitoring citizens, in various forms.

In Europe too, several governments have started working on possible technical solutions and, overall, the debate has turned towards the development of “deconfinement” applications that exploit various smartphone functions to perform “contact tracing”, that is, to monitor the social contacts of people who are infected or potentially exposed to outbreaks of contagion.

What these solutions have in common, in any case, are complex and dangerous repercussions in terms of rights, privacy and security, issues that are all too easy to lose sight of if one looks at technology through excessively determinist or solutionist lenses, or from the standpoint of the “uncritical positivism” discussed above.

It is difficult to summarize the polyphonic Italian debate on the “Immuni” app chosen by the government for this purpose, but numerous elements indicate that from the very beginning there have been attempts to portray privacy as an obstacle to the implementation of fundamental measures.

This was seen in its most extreme form in France, the first country to officially ask Apple and Google to loosen their privacy protection measures in order to facilitate the adoption of contact tracing apps. In Italy, “Immuni” will follow an approach based on Bluetooth and decentralization, certainly less invasive than other options that have been on the table of the various government task forces, but some interesting indications emerge from the debate that has accompanied these decisions.

Although on paper this solution would seem less invasive than others, several questions of appropriateness remain open in this case too. Claudio “Nex” Guarnieri, one of the world’s leading computer security experts, has commented on the various technical solutions put forward, pointing out that even Bluetooth offers no guarantees in terms of effectiveness.
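To clarify what a decentralized, Bluetooth-based design of this kind typically involves, here is a highly simplified sketch in Python, loosely inspired by DP-3T-style proposals. The identifier format, the daily seed, the rotation interval and the published_seeds feed are illustrative assumptions, not Immuni’s actual protocol or code.

import hashlib
import os
from datetime import date, timedelta

EPOCHS_PER_DAY = 96  # e.g. a new ephemeral ID every 15 minutes


def daily_seed() -> bytes:
    """Each phone generates a fresh random seed per day; it never leaves the device
    unless the user tests positive and chooses to upload it."""
    return os.urandom(32)


def ephemeral_ids(seed: bytes, day: date):
    """Derive the pseudonymous identifiers broadcast over Bluetooth during one day."""
    for epoch in range(EPOCHS_PER_DAY):
        msg = seed + day.isoformat().encode() + epoch.to_bytes(2, "big")
        yield hashlib.sha256(msg).digest()[:16]


def exposure_check(observed_ids: set, published_seeds: list) -> bool:
    """Matching happens locally: the app re-derives IDs from the seeds that infected
    users voluntarily published and compares them with the IDs it overheard.
    No contact graph ever reaches a central server."""
    for seed, day in published_seeds:
        if any(eid in observed_ids for eid in ephemeral_ids(seed, day)):
            return True
    return False


if __name__ == "__main__":
    yesterday = date.today() - timedelta(days=1)
    alice_seed = daily_seed()
    # Bob's phone overheard one of Alice's ephemeral IDs during a close contact.
    overheard = {next(iter(ephemeral_ids(alice_seed, yesterday)))}
    # Alice later tests positive and uploads her seed; Bob's phone detects the match.
    print(exposure_check(overheard, [(alice_seed, yesterday)]))  # True

The point of the decentralized design is precisely that the server only ever sees the seeds of those who test positive and volunteer them, while exposure matching stays on the phone; whether such a scheme is effective in practice is, as noted above, a separate question.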

The social sciences and various studies of journalism and surveillance show that the “normalization” of surveillance is a recurrent phenomenon in public debates on the subject. Something similar has been felt in the Italian and European debate in the midst of the pandemic: the concerns of experts (both technical and legal) have often been hastily dismissed as secondary problems, while a false dichotomy between privacy and the defense of public health has been pushed, as if the former invariably obstructed the latter. In reality, as the writer Yuval Noah Harari, author of the acclaimed Homo Deus (2017), has also argued in the Financial Times, framing the two issues as antithetical is wrong, since citizens should not be asked to choose between two fundamental rights that certainly do not exclude each other.

The question to ask is: how many and which rights are we, and will we be, willing to give up, even only in part, and for what objectives? An overly deterministic view of the potential of these technical solutions could also lead to overestimating their actual capacity to help in this scenario.

Too often, moreover, the discussion around privacy has been trivialized by dishonestly putting users’ often frivolous online habits on the same level as a state programme for monitoring public health. Privacy is not dead, as has instead been claimed in many quarters, and however partly eroded it may be by the more than problematic commercial exploitation under way on the web, this debate cannot be reduced to a matter of individual choices, to be wiped out with a click.

The other question that remains open, finally, is that of the return to normality: once the emergency is over, how can we make sure that the tracking technologies and control infrastructures devised for times of crisis are actually deactivated (and their data deleted)? The European Commission has also spoken clearly on this point, issuing a set of recommendations and a toolbox and calling on member states to take a pan-European approach to the defense of privacy and data protection, along with shared and maximally decentralized technical standards.

The great absence in the Italian scenario, however, remains a parliamentary debate on the issue, which is instead taking place, for example, in the Netherlands as we write, and which would be necessary to ensure democratic control, accountability and respect for basic democratic norms and values in such delicate choices.

The antibodies

But how do we fight these four insidious enemies? Unfortunately, the solution is neither simple nor immediate. And there is no vaccine (nor will there ever be one) capable of magically immunizing the collectivity against uncritical positivism, information disorder, digital vigilantism and the normalization of surveillance. We can, however, work on the antibodies and make sure they spread as widely as possible in our communities. The data society needs critical and aware users, who know how to use and contextualize both digital and statistical tools, who understand the risks invariably associated with them but can also ride their potential benefits, and who can help the less digitized segments of the population navigate their own digital presence.

In this process, a central role is played by so-called “data literacy”, that is, digital literacy extended to the data society. Such literacy must take into consideration the question of citizenship in the era of big data and artificial intelligence, and it must enable us to make informed choices about the contours of our action on the web, including the complex considerations regarding the protection of personal data.

It must help us distinguish between sources of information and find our way among the content personalization algorithms that undermine our free action on the web. The challenge is open but also particularly urgent, given that Italy ranks last among the 34 OECD (Organisation for Economic Co-operation and Development) countries when it comes to digital literacy. A recent (2019) OECD study revealed that only 36% of Italians are able to make “a complex and diversified use of the internet”, which creates fertile ground for the four enemies we have identified.

The world of education certainly has a key role to play, pairing a renewed education in web citizenship with the much-neglected teaching of civics. This requires serious training of the teaching staff, but also dedicated funds for tools, infrastructure and preparation. It is, however, a medium- to long-term project, one that can hardly be implemented during the pandemic. The point not to lose sight of is that the “post-coronavirus” world is being built right now, in the vortex of the pandemic.

The choices made today will inevitably shape the future scenarios of the data society. More than ever, these choices must be guided by an inclusive, transparent and honest approach, so that we do not find ourselves in a future dominated by technological “black boxes” that are opaque, discriminatory and potentially anti-democratic.

The authors

Philip Di Salvo is a postdoctoral researcher and lecturer at the Institute of Media and Journalism of the Università della Svizzera italiana (USI) in Lugano. His work focuses on leaks, investigative journalism and Internet surveillance. “Leaks. Whistleblowing e hacking nell’età senza segreti” (LUISS University Press) is his latest book.

Stefania Milan is Associate Professor of New Media and Digital Culture at the University of Amsterdam, where she teaches courses on data journalism and digital activism and leads the DATACTIVE research project, funded by the European Research Council (Horizon 2020, Grant Agreement No. 639379).

[BigDataSur] Beyond Touchscreens: The perils of biometric social welfare in lockdown

In the context of COVID-19, what are the perils of the continued subordination of social welfare access to biometric identification?

Silvia Masiero

Gradually over the last few years, India has introduced biometric identification of users in most of its social welfare schemes. One of the main such schemes is the Public Distribution System (PDS), the nation’s largest food security programme, which provides rationed subsidised commodities to the nation’s poor through a network of ration shops. Biometric access to the PDS is largely operated through Aadhaar, the world’s largest digital identification scheme, which provides enrolees with a 12-digit number and captures biometric credentials (ten fingerprints and iris scans) for recognition. While the modalities of identification and authentication differ across the country, states that adopted an Aadhaar-enabled PDS require recipients to authenticate through a biometric point-of-sale machine to receive their rations.

As a consequence of the COVID-19 crisis, biometric authentication in ration shops has been suspended in several Indian states. A commonly given reason for this is the risk of disease transmission associated with users’ fingerprint contact with the machine, which falls under the broader remit of social distancing measures taken during the ongoing pandemic. The Indian case epitomises, however, a global trend of transition to biometric identification in anti-poverty programmes, programmes that, in light of the very serious effects of the COVID-19 crisis on vulnerable groups, are now more crucial to their recipients than ever. In the context of COVID-19, what are the perils of the continued subordination of social welfare access to biometric identification?

The Trade-Offs of Digital Identity

With reference to the use of biometrics in India’s social welfare system, many researchers have highlighted the dichotomy between an anti-leakage rationale and the exclusionary effects yielded by such technologies. Latest in chronological order, Muralidharan et al. (2020) report on a large-scale experiment conducted in Jharkhand, a state where deaths by starvation due to failed Aadhaar-enabled authentication of PDS beneficiaries had previously been reported. The results of the study reveal a 10 percent reduction in benefits for recipients (23 percent of the total) who had not linked their Aadhaar credentials to the benefit rolls, with 2.8 percent receiving no benefits at all. Such exclusionary effects mirror previous studies of the Aadhaar-based PDS in the same state, with Drèze et al. (2017) reporting, among other findings, the anxiety brought into poor people’s lives by the uncertainties of biometrically-enabled foodgrain distribution.

The policy vision behind biometric anti-poverty schemes can be summarised in terms of two different types of error being tackled. In targeted welfare schemes, an exclusion error means the exclusion of genuinely entitled subjects, while an inclusion error indicates the erroneous inclusion of non-entitled subjects in provision. By matching biometric records (collected through databases such as Aadhaar) with records of recipients’ entitlements, biometric anti-poverty schemes promise to maximise the affordance of proper targeting, offering credentials to the excluded and preventing access by the erroneously included. This rationale lies at the basis not only of the global proliferation of digital identity schemes, but also of their ever-increasing incorporation into anti-poverty programmes, of which the Aadhaar-enabled Indian system constitutes a notable example.
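To make the two error types concrete, the sketch below simply counts exclusion and inclusion errors in a toy entitlement register. It is a minimal illustration in Python, with hypothetical field names and records rather than Aadhaar’s or any PDS data model; as the following paragraphs argue, failed biometric authentication adds exclusions in practice that this idealised accounting does not capture.

from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Person:
    id_number: str          # digital ID credential (e.g. a 12-digit number)
    entitled: bool          # genuinely eligible for the scheme
    receives_benefit: bool  # currently on the benefit rolls


def classify_targeting(people: List[Person]) -> Dict[str, int]:
    """Count the two error types that biometric matching claims to fix:
    exclusion (entitled but not receiving) and inclusion (receiving but not entitled)."""
    exclusion = sum(p.entitled and not p.receives_benefit for p in people)
    inclusion = sum(p.receives_benefit and not p.entitled for p in people)
    return {"exclusion_errors": exclusion, "inclusion_errors": inclusion}


if __name__ == "__main__":
    register = [
        Person("1111-2222-3333", entitled=True, receives_benefit=True),
        Person("4444-5555-6666", entitled=True, receives_benefit=False),   # wrongly excluded
        Person("7777-8888-9999", entitled=False, receives_benefit=True),   # wrongly included
    ]
    print(classify_targeting(register))  # {'exclusion_errors': 1, 'inclusion_errors': 1}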

A ration shop in Tumkur, Karnataka, April 2018

But the reality revealed by extant research, including our previous work on the Karnataka PDS, differs from the orthodoxy of good targeting. First, as illustrated most recently by Hundal et al. (2020), requiring biometric identification at the ration shop does not prevent diversion, because the system allows successful disbursement to be recorded even when rations are not provided as per eligibility. Second, there is a trade-off between anti-leakage affordances, in the form of accurate recognition at the point of sale, and the repeated exclusion of entitled beneficiaries, for reasons that range from machine malfunctions to problems of fingerprint readability reported to affect, in particular, the elderly and those in manual labour. In the Aadhaar-enabled PDS, the need for multiple fragile technologies to work at the same time, as highlighted by Jean Drèze, poses a problem of practical feasibility, which is crucial in those parts of the country most subject to infrastructural issues. While inclusion errors are at least in principle targeted by the rationale of biometrics, exclusions keep happening, and they put in serious jeopardy a social welfare system that should cover precisely the most vulnerable groups.

COVID-19: A Reshuffling of Priorities

In the midst of the ongoing crisis, many studies are being conducted on the effects of COVID-19 on health infrastructures and, crucially here, on economic vulnerabilities in the Global South. Studies of factory workers, gig workers and low-income households all point in the same direction: the economic impact of national lockdowns is disproportionately affecting the poor and vulnerable, large proportions of whom are recipients of social welfare. Where such systems have limited reach or are not available, measures of immediate assistance are invoked, such as the provision of a universal basic income or emergency social safety nets. In the Indian PDS, the promise of doubling foodgrain rations along with providing extra commodities makes the scheme more crucial than ever, in a situation in which new vulnerabilities, such as that of migrant workers exposed to distress and food insecurity, have emerged in the wake of the lockdown.

In these times of heightened crisis, severely affecting the users of anti-poverty schemes, the exclusion errors induced by mandatory biometric access are a risk that social protection schemes cannot afford to take. While the incorporation of biometrics is purposefully designed to improve targeting, the crisis confronts us with the priority of reaching out to the most affected, adapting systems in such a way that biometric recognition, or its digital equivalents, is at the very least suspended. While the problem of touchscreen-induced disease transmission is in itself a valid reason for doing so, the inclusion-exclusion trade-off illustrated here equally poses a problem that needs consideration. As systems are adapted to cope with the COVID-19 crisis, the need to assist the affected must prevail over the strict requirement of biometric credentials.

Photo: A ration shop in Tumkur, Karnataka, April 2018