Google launches dataset search engine

Google launched a dataset search engine, Dataset Search, on 5 September. It lets data scientists find datasets that match their requirements, and it is intended as a companion to Google Scholar, the company’s popular search engine for academic studies and reports.

Institutions that publish their data online, like universities and governments, will need to include metadata tags in their webpages that describe their data, including who created it, when it was published, how it was collected, and so on. This information will then be indexed by Dataset Search and combined with input from Google’s Knowledge Graph.
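
To make the idea of metadata tags concrete, here is a minimal sketch of what a publisher might embed in a dataset page. It assumes the schema.org Dataset vocabulary expressed as JSON-LD, which is the open markup Dataset Search reads; the helper names and all field values are hypothetical placeholders, not anything taken from Google's announcement.

```python
import json


def dataset_jsonld(name, description, creator, date_published, technique):
    """Return a schema.org Dataset description as a JSON-LD string.

    The helper and its arguments are illustrative; only the property
    names (name, description, creator, datePublished,
    measurementTechnique) come from the schema.org vocabulary.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Who created the data.
        "creator": {"@type": "Organization", "name": creator},
        # When it was published (ISO 8601 date).
        "datePublished": date_published,
        # How it was collected.
        "measurementTechnique": technique,
    }
    return json.dumps(record, indent=2)


def as_script_tag(jsonld):
    """Wrap a JSON-LD string in the script tag a web page would carry."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


# Hypothetical example values, for illustration only.
markup = as_script_tag(
    dataset_jsonld(
        name="Example river temperature readings",
        description="Daily water temperature readings from example stations.",
        creator="Example University",
        date_published="2018-09-05",
        technique="Automated sensor logging",
    )
)
print(markup)
```

A publisher would place the printed block in the HTML of the page that describes the dataset, so a crawler can read who created the data, when it was published, and how it was collected without the data itself moving anywhere.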

Natasha Noy, a research scientist at Google AI who helped create Dataset Search, says the aim is to unify the tens of thousands of different repositories for datasets online. “We want to make that data discoverable, but keep it where it is,” says Noy.

The initial release of Dataset Search will cover the environmental and social sciences, government data, and datasets from news organizations like ProPublica.

Ideally, Google will publish its own dataset on how Dataset Search gets used. Although the metadata tags the company is using to make datasets visible to its search crawlers are an open standard, search engines improve most quickly when a critical mass of users is there to provide data on what they’re doing.

In other words: Google should publish a dataset about dataset search that would be indexed by Dataset Search. What could be more appropriate?

Google's own blog on this.