Public datasets are going to have a massive role to play on the Safe Network. There’s so much public data: much of it is static (so it suits immutable data upload), none of it is copyrighted, most of it is in a good format, and it’s a perfect candidate for foundations or motivated individuals to voluntarily upload and increase the size of the network.
I feel like this is something that could be turned into a really neat project.
1. Gather a list of public datasets
2. Coordinate between uploaders to avoid duplicate costs (this could be a project in itself; maybe add a leaderboard to make it exciting)
I think we can start step 1 here in this topic.
Step 2 is perhaps something BGF or the MaidSafe foundation might sponsor? Or individuals could pay? Or there could be some kind of bounty system?
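To make the coordination idea a bit more concrete, here is a minimal sketch of how uploaders might claim a dataset before paying to upload it, so the same data isn't paid for twice. Everything in it is an assumption for illustration: `ClaimRegistry`, its methods and the uploader names are hypothetical, nothing here is a Safe Network API, and a real version would need the registry itself to live somewhere shared.

```python
import hashlib
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ClaimRegistry:
    """Hypothetical shared registry: one claim per dataset, keyed by content hash."""
    claims: dict = field(default_factory=dict)  # content hash -> uploader name

    @staticmethod
    def content_id(data: bytes) -> str:
        # Identical bytes always give the same id, whoever holds them.
        return hashlib.sha256(data).hexdigest()

    def claim(self, uploader: str, data: bytes) -> bool:
        """Return True if the claim is new, False if someone already claimed it."""
        key = self.content_id(data)
        if key in self.claims:
            return False
        self.claims[key] = uploader
        return True

    def leaderboard(self) -> list:
        """(uploader, datasets claimed) pairs, highest count first."""
        return Counter(self.claims.values()).most_common()
```

An uploader would call `claim()` first and only pay for the upload if it returns `True`. Keying on a content hash means duplicates are caught even if two people give the same file different names.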
Try to include some sort of ‘time travel’ feature for datasets that keep the same context/format but vary over time
Think about doing this in a way that naturally allows archive.org and other archival datasets to participate
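One way the ‘time travel’ idea could work, as a rough sketch: keep every version of a dataset as its own immutable snapshot, plus a small index that maps dates to snapshot addresses. The `DatasetHistory` class and the `addr-…` strings below are made up for illustration and don't correspond to any real network addressing scheme.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class DatasetHistory:
    """Hypothetical index of immutable snapshots for one dataset."""
    dates: list = field(default_factory=list)      # sorted ISO dates, e.g. "2020-01-01"
    addresses: list = field(default_factory=list)  # snapshot address for each date

    def add_snapshot(self, date: str, address: str) -> None:
        # ISO dates sort correctly as plain strings, so keep both lists ordered.
        i = bisect.bisect_right(self.dates, date)
        self.dates.insert(i, date)
        self.addresses.insert(i, address)

    def as_of(self, date: str):
        """Address of the latest snapshot on or before `date`, or None."""
        i = bisect.bisect_right(self.dates, date)
        return self.addresses[i - 1] if i else None
```

Asking for the dataset `as_of("2020-06-15")` would then return whichever snapshot was current on that day. Since each snapshot is immutable, only the small index needs to change when a new version is added, which might also suit archival sources that already publish dated snapshots.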
I see a lot of talk about Wikipedia, but I feel it’s not a good fit for the early stages of the network: it changes a lot and very rapidly, and there’s a huge administrative aspect to Wikipedia that isn’t as easy to port across as public datasets are.
Some public datasets I know of
It’s not sexy, but it’s better than uploading random data, and if it’s done well it could be incredibly valuable. But I feel coordinating this effort will be a big challenge.
If you know of more public datasets or have ideas about how this could work, please let us know.