Monetise govt’s non-personal databases, recommends working group
A working group constituted by India's electronics and IT ministry has recommended the monetisation of non-personal data held by government departments and ministries through a marketplace called the India Dataset Platform (IDP). The IDP would be a unified national data sharing and exchange platform, allowing data providers (government departments or agencies) to upload datasets for purchase by data consumers (research institutions, startups, etc.) for applications, innovation, or research purposes. However, there are concerns about the availability of data to non-Indian entities and the risk to privacy.
Non-personal data held by government departments and ministries should be monetised through a marketplace called the India Dataset Platform (IDP) for better decision-making and AI applications, a working group constituted by the electronics and IT ministry has recommended.

The central repository will be “a unified national data sharing and exchange platform to enable various data sharing and exchange use cases of all stakeholders including but not limited to Central/State/UT Governments, public sector undertaking, private sector companies, industry bodies, MSMEs (micro, small and medium enterprises) and startups, academia and researchers, civil society and media organisations, open technology communities, etc,” the working group said in a report released by the minister of state for IT Rajeev Chandrasekhar on October 13.
This working group, one of the seven constituted to look at AI governance-related issues, was headed by Abhishek Nag, CEO of the Digital India Bhashini division, an independent business division within the ministry. Neel Bhatia, senior director of startup ecosystem and strategic industry collaborations at chip maker Intel, and Arun Gopal, principal cloud architect of data strategy at Oracle, a database management company, were among the group’s 15 members.
A data provider, defined as a government department or agency, would upload data sets to the platform, where they would be bought by a data consumer, a research institution or a start-up, for applications, innovation or research purposes, the expert panel suggested.
Could a non-Indian entity buy data sets through the IDP? “Minister of state Rajeev Chandrasekhar in one of the meetings said that the datasets would be available only to Indian startups and researchers. But if a Microsoft or a Google want to buy such data, you can’t really deny them. We will have to see how that plays out,” Avik Sarkar, visiting faculty at the Indian School of Business and a member of this working group, told HT.
Consumers will be able to access the data either by downloading the files or through APIs, as specified by the data provider. An API, or application programming interface, allows two software systems to exchange a limited amount of queried data with each other.
The working group said it “will also encourage” non-government entities to contribute data sets to the IDP. Asking private entities to share any data, including non-personal and anonymised data, raises concerns around their intellectual property rights. “Many private entities --- such as the Ola Mobility Institute and PhonePe Pulse --- already make certain non-personal data publicly available. The idea is to make such data available on one platform,” Gaurav Godhwani, executive director and co-founder of CivicDataLab and another working group member, told HT.
All government departments would have to identify areas within their departments where data can be used to address societal challenges, improve public services or support policy formulation, the recommendation said. They would also need to conduct an inventory of the data already available with them and clean it up for processing.
The group has recommended that the IDP be headed by a CEO and have a chief data officer to manage its data operations. A chief technology officer will manage the technical end-to-end operations. A chief business officer will be responsible for revenue generation and client management, and a chief partnerships manager would develop and nurture strategic partnerships for the IDP.
The government at present runs the Open Government Data platform, where some public datasets are available for download, but it is not clear what its fate will be when the IDP is launched. Sarkar explained that the OGD platform has high-level statistical data. “For AI, the data needs to be more granular --- at farm level, at student level,” he said. But what about the risk of re-identification and the subsequent threat to privacy? “Pseudonymisation is necessary for policy formulation,” he replied.
Godhwani provided some more context: “The OGD only hosts government data that is openly available. It is not a marketplace nor are there any restrictions on who can access the data. IDP will function as an umbrella platform where we hope that high-level data from OGD will also be made available. On IDP, the data providers will be able to control the licensing conditions, who has access to the dataset, and what is the nature of their access --- open access, restricted access or registered access.”
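The three access modes Godhwani describes could be modelled roughly as follows. This is only an illustrative sketch: the report does not define what each mode means in practice, so the rules below (open data is available to anyone, registered data to any registered consumer, restricted data only to consumers the provider has approved) and all names such as `AccessLevel`, `Dataset` and `can_access` are assumptions, not part of the IDP design.

```python
from dataclasses import dataclass, field
from enum import Enum


class AccessLevel(Enum):
    OPEN = "open"              # assumed: anyone may access
    REGISTERED = "registered"  # assumed: any registered consumer may access
    RESTRICTED = "restricted"  # assumed: only provider-approved consumers


@dataclass
class Dataset:
    name: str
    access_level: AccessLevel
    # Consumer IDs approved by the data provider (used for RESTRICTED only).
    approved_consumers: set = field(default_factory=set)


def can_access(dataset: Dataset, consumer_id: str, is_registered: bool) -> bool:
    """Return True if the consumer may access the dataset under the assumed rules."""
    if dataset.access_level is AccessLevel.OPEN:
        return True
    if dataset.access_level is AccessLevel.REGISTERED:
        return is_registered
    # RESTRICTED: must be registered and explicitly approved by the provider.
    return is_registered and consumer_id in dataset.approved_consumers
```

Under these assumptions, a provider would tighten or loosen access simply by changing a data set's access level or its list of approved consumers.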
Another working group, tasked with looking at the design of the national data management office, first proposed under the draft National Data Governance Framework in May 2022, recommended the management office be responsible for creating and operating the IDP. The working group on NDMO was headed by Abhishek Singh, CEO of the National eGovernance Division and Digital India Corporation, both entities under the ministry.
The IDP, envisioned as a platform-as-a-service (PaaS) or architecture-as-a-service (AaaS), would implement a pricing and payment management module so that data providers can specify pricing details for the data sets they share on the IDP. Price would be determined by the data sets’ uniqueness, relevance and demand. Pricing could involve one-time access, limited access, or a subscription for regular access to updated data sets, the working group said.
The IDP will also be able to empanel agencies that government departments and agencies can hire to curate, label and filter their data for creating useful data sets. This is seen as a value-added service.
The IDP will also register data consumers, such as research institutions and startups, on the basis of their intended use of data and details of relevant credentials and expertise. Their submissions will be reviewed by the central body overseeing the IDP.
The IDP’s central body will track how the data consumers use the APIs to access data. The metrics will include frequency and volume of data accessed. Governance rules around the IDP must be defined to operationalise it, according to the working group. The IDP would also need a scalable and secure platform that uses APIs for data access and availability, it said.
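The usage tracking described above, with frequency and volume of data accessed as metrics, could look something like the sketch below. The report does not specify how the metrics would be collected, so the `UsageTracker` class, its method names and the choice to measure volume in bytes per consumer and data set are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class UsageStats:
    calls: int = 0         # frequency: number of API calls made
    bytes_served: int = 0  # volume: total bytes returned (assumed unit)


class UsageTracker:
    """Accumulates per-consumer, per-dataset API usage (hypothetical design)."""

    def __init__(self):
        self._stats = defaultdict(UsageStats)

    def record(self, consumer_id: str, dataset: str, num_bytes: int) -> None:
        """Log one API call and the amount of data it returned."""
        stats = self._stats[(consumer_id, dataset)]
        stats.calls += 1
        stats.bytes_served += num_bytes

    def stats_for(self, consumer_id: str, dataset: str) -> UsageStats:
        """Return the accumulated stats for one consumer and data set."""
        return self._stats[(consumer_id, dataset)]
```

A gateway in front of the data APIs would call `record` once per request, and the overseeing body could then read `stats_for` to see how often and how heavily each consumer uses a data set.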
Whenever regulation of non-personal data has been discussed earlier --- either through the inclusion of a clause in what was then the Personal Data Protection Bill, 2019, or through separate legislation, as suggested by the expert committee headed by Infosys co-founder Kris Gopalakrishnan that proposed a new regulatory body to govern non-personal data --- concerns have been raised about the privacy of individuals and of groups. “I hope that MeitY works on it. Data would need to be aggregated as well as anonymised to prevent harm. There are UN guidelines that can be followed when the IDP releases its guidelines,” Godhwani said.