India has data, but who is going to figure it out?
Often, getting basic information from the government can feel like drawing water from a stone. You have a right to know how much money is spent on cleaning your street or running your child’s school – it’s public money, after all – but finding out can be nearly impossible. And what if you want to know how much rain your city received every day for the last fifty years? The government would know, but you won’t.
These struggles are symptomatic of a lack of an open data culture, which not only robs citizens of the right to know but also robs the government and independent researchers of the ability to size up problems and come up with critical solutions.
Almost everyone who works closely with Indian data sets, or has tried to at some point, is likely to have a story about the struggle to either locate a dataset or find it in an accessible format, one that is easy to download and analyse. A surprisingly large amount of Indian data is locked in PDFs, which makes it difficult to work with.
The problem can be solved by the ‘Open Government Data’ (OGD) project. According to the Open Knowledge Foundation, OGD refers to “data and information produced or commissioned by government or government controlled entities, which can be freely used, reused and redistributed by anyone”.
Open data can help create $3 trillion a year of value for the global economy, according to a McKinsey Global Institute report from October 2013. From citizens who wish to hold the government accountable, to government officials who want to improve the delivery of public programs, to researchers devising evidence-based policy, to civil society organisations tracking election results – nearly everyone benefits from open data.
In India, the OGD movement gained strength following a global push for open data that began with the launch of data.gov in the United States in 2009, and data.gov.uk in the United Kingdom the following year. In 2012, the government of India adopted the National Data Sharing and Accessibility Policy (NDSAP) and launched its OGD portal, data.gov.in, that October.
In theory, publishing data in a centralised repository should help better allocate scarce research resources, says Nisha Thompson, co-founder of DataMeet, a community of data science and open data enthusiasts. “Multiple individuals and organisations are spending the same amount of time, money and effort to collect and clean the same data sets,” Thompson said.
It is precisely for this reason — making government data easily accessible in a single place — that data.gov.in was launched.
“The objective is to make more data sets open, making it easy for citizen to access, reduce our RTI [Right To Information] flow, and at the same time, increase accessibility, transparency and accountability, while promoting innovation,” Sitansu Sekhar, Technical Project Manager with the Indian government’s data portal, said.
But is data.gov.in working as desired? Not really.
DOING IT WRONG
Sample this: DP Mishra, the data portal’s technical director, enthusiastically notes that the National Crime Records Bureau (NCRB) — responsible for collating official crime statistics — has published a significant amount of data on data.gov.in.
Go and search for “NCRB” on the portal, however, and you will get zero results. That’s the first problem: search. Sure, “crime” will give you listings, which include NCRB data. But why are there no results for the organisation name?
Now, consider rainfall statistics. Try searching for “rainfall 2017” on the portal. This time, results do show up, but there is nothing for the year 2017. The same thing happens when you look up “health budget”: you won’t find anything for 2017. Here, then, is the second problem: many important data sets are outdated, even though the latest figures lie somewhere with the government.
“For my use cases, I often find incomplete data sets on the data portal, making it almost impossible to use it for research whose outcome can then be used for evidence-based policy-making,” said Natasha Agarwal, an independent research economist.
Rakesh Dubbudu, founder of Factly, a public information portal that is helping the Telangana government open its data, said the portal’s design is another issue.
“The moment you go to the data portal, you are overwhelmed. There is just too much information—you don’t know where to go to,” Dubbudu said, underscoring the difficulty in navigating the datasets, a third problem.
The portal also suffers from an issue that has nothing to do with design or technology: it’s simply hard to get government agencies to upload their latest data to the portal.
In 2015, Agarwal wrote a paper on the OGD movement in India, looking critically at data.gov.in. She wrote that the initiative suffers because the suppliers of data — ministries, for instance — do not see the value in making data open. Given their resource and capacity constraints, some ministries’ Chief Data Officers (CDOs) believe updating data.gov.in is not worth their time.
Sekhar, the project manager of the portal, acknowledged the challenge of getting agencies to upload new data. “Uploading data on the portal is an additional burden on the officer. That is not their sole task. The CDO has other work too, so they say they don’t have time for this task,” he said.
Mishra, data.gov.in’s technical director, said he and his colleagues have very little control over which data sets get uploaded to the portal. If the datasets are outdated, he said, “that’s not in our hand. As long as the government is not pressurised, it won’t happen. The community should write to the CDO about their concerns.”
But there is another problem, which Agarwal also highlighted in her paper: data.gov.in lacks an effective feedback mechanism to understand how people are using the platform.
DP Mishra said there is a form on the website where users can send in their queries. But most people send requests for adding specific data sets, he said. “There is hardly any feedback about the platform.”
The data portal team believes its product is on par with international standards – in fact, that it is one of the best. Agarwal disagrees. “Having worked with the US data portal, and trying my luck with India’s, I can say we are not even close.”
All these problems notwithstanding, “we have definitely moved forward,” Thompson, the DataMeet co-founder, said. In 2010, when she moved back to India, “no one was thinking about data,” she said. “From then to now, we have a vibrant community of people working with data. The ecosystem has grown significantly.”