Even seemingly benign data on communities can lead to collective harm. Demographic census data, for instance, can be used to target communities along social or religious lines.
The Committee of Experts on Non-Personal Data Governance Framework was formed last year to advise the Government on various aspects of non-personal data. It released its draft report in July and is currently accepting public feedback. Among other things, it recommended that new legislation be drafted and a new regulator be set up to operationalise the framework.
The committee defines non-personal data as any data that is not personal data as defined in the Personal Data Protection Bill, 2019, i.e., data that does not contain personally identifiable information. This would cover a wide array of information, including, for example, anonymised datasets that companies collect from their users or customers, traffic pattern and weather data, agricultural data, and datasets containing information that is not directly linked to an individual. Given that datasets are an important aspect of many businesses today, any non-personal data regulation would affect most companies and entrepreneurs. It would also have significant implications for the provision of benefits and for the ability of individuals and communities to exercise their civil liberties, given the inferences that can be drawn even from anonymised data.
This is especially the case because most datasets in use are “mixed,” containing a combination of personal and non-personal data. For example, datasets collected by banks could contain client and transaction details as well as aggregated information on overall demographics and transactions. It is important to note in this context that the Personal Data Protection Bill, 2019, is currently being deliberated by a Joint Parliamentary Committee. The finalised legislation, and the Data Protection Authority proposed under the PDP Bill, will be empowered to regulate personal data. While the committee recognises that many datasets are mixed, it nevertheless includes anonymised personal data within the ambit of the non-personal data framework.
This is problematic because anonymising personal data, as the committee recognises, is not irreversible and the risk of re-identification of individuals is a real concern. Moreover, even seemingly benign data on communities can lead to collective harm, such as if data on average income is used to decide interest rates for those living in a certain area, or demographic census data is used to target communities based on social or religious lines.
It is also clear that algorithms can exacerbate existing biases unless there are sufficient technical and human-based safeguards. Using such analysis to make public service delivery decisions, for example, can also perpetuate discriminatory practices. Importantly, the framework suggested by the committee also allows the Government wide-ranging access to datasets. When combined with data that it already has access to, this carries significant surveillance risks which can endanger civil liberties.
Given the potential for such harms, it is essential that a key focus of any regulatory framework be on instituting safeguards, with clear accountability and redressal mechanisms. The primary goals of non-personal data regulation are to unlock the economic value of data, create a level playing field for digital businesses and protect the interests of those providing the relevant data. Unfortunately, there are a few significant issues with the framework proposed to achieve these aims.
First, crucial parts of the framework are vague. For example, one of the key concepts discussed by the committee is that of a “community.” This is relevant both from the view of protecting collective privacy and for assigning rights and obligations accruing on the basis of data generated by communities. The draft report defines a community as a collection of people bound together by a common purpose, objective, or geography, which means that an individual can potentially be a part of hundreds of communities. While the framework is meant to empower communities as providers of data, the report is silent on what would happen if individuals are part of multiple communities with divergent interests, or if members of a community do not agree on what is in their best interest. With such key aspects left unclarified, it will be extremely difficult for communities to actually benefit from the framework.
Second, the mandate of the new regulator, the Non-Personal Data Authority, is extremely broad and overlaps with those of multiple sectoral regulators as well as the proposed Data Protection Authority. For instance, since it would be practically difficult for companies to separate non-personal data from personal data in the datasets they use or create, mixed datasets would fall under the ambit of both authorities. The Non-Personal Data Authority is also tasked with setting standards of anonymisation, even though penalties for re-identification are prescribed under the PDP Bill.
Similarly, regulating competition concerns to “level the playing field” falls squarely within the purview of the competition law framework and the Competition Commission of India. The mandatory data sharing mechanisms outlined in the draft report do not engage with proprietary rights that datasets are subject to and are likely to conflict with intellectual property laws. Such regulatory overlaps are likely to lead to conflict, litigation on the boundaries of each regulator’s mandate and higher compliance costs for companies.
Third, although the committee recognises the potential for harm and the need for safeguards in a non-personal data framework, it does not account for these in its report. For instance, when exploring the concept of collective privacy, it recognises that non-personal data can provide insights that enable collective harm such as discrimination, and states that safeguards are necessary. Unfortunately, the committee does not explore any specifics. Even though it proposes that a new regulator be set up, there is a marked lack of engagement with best practices in regulatory design that would ensure the accountability, transparency and independence of the regulator, and effective grievance redressal. Given the overlaps in ambit, it is also not clear that a separate regulator is even required to specifically regulate non-personal data. New regulatory frameworks will also have to recognise that, given the range of activities moving online, issues in the “digital” space cannot be regulated in silos; this will require cooperation and collaboration among regulators. Experience in India and in other countries where such collaboration occurs indicates that robust, binding processes for it must be built into regulatory frameworks themselves.
The committee would have done well to explore some of these aspects and other best practices in regulatory design, and to consider more deeply the best ways to achieve its stated aims. For instance, instead of mandating data sharing, it could focus on leveraging the intellectual property law framework to incentivise data sharing without contradicting existing copyright laws. More generally, the committee must specifically address the areas of overlap with other regulatory frameworks and clarify the ambit of the proposed non-personal data framework. It must carefully think through emerging concepts like collective privacy before seeking to regulate on that basis, and fundamentally reassess the need for a new regulator specifically for non-personal data.
(The author is Fellow at Esya Centre)