Just throwing stuff at the wall here…
Search does two things: it forms a subset of relevant pages, and it orders those pages.
The first part, excluding unrelated pages, and the second part, ordering the pages, are probably not that different in practice since if you don’t look at result 1038 then it may as well have been excluded.
But I think the distinction is useful when considering how to approach search in an incremental way. I think the first part is probably easier than the second part.
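To make the two parts concrete, here's a rough sketch that keeps them as separate steps. Everything in it is made up for illustration: the `safe://` addresses, the "shares a word with the query" filter and the word-count score are just stand-ins, not how any real search works.

```python
from typing import Dict, List

# Hypothetical pages: address -> text. These addresses are invented for the example.
PAGES: Dict[str, str] = {
    "safe://boat-repair": "how to repair boats and fix a boat hull",
    "safe://boats-for-sale": "boats for sale used sailing boats and motor boats",
    "safe://cooking": "recipes for dinner",
}


def filter_pages(query: str, pages: Dict[str, str]) -> List[str]:
    """Part 1: keep only pages that share at least one word with the query."""
    words = set(query.lower().split())
    return [addr for addr, text in pages.items()
            if words & set(text.lower().split())]


def rank_pages(query: str, addrs: List[str], pages: Dict[str, str]) -> List[str]:
    """Part 2: order the surviving pages by a naive score
    (how many times the query words occur in the page text)."""
    words = set(query.lower().split())

    def score(addr: str) -> int:
        return sum(1 for w in pages[addr].lower().split() if w in words)

    return sorted(addrs, key=score, reverse=True)


def search(query: str, pages: Dict[str, str] = PAGES) -> List[str]:
    """Filter first, then rank whatever is left."""
    return rank_pages(query, filter_pages(query, pages), pages)


print(search("boats"))
```

The filter step is a plain yes/no per page, which is why it feels like the easier half. All the difficulty hides inside `score`.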
The ordering consideration probably means the search will benefit from some additional context to help give the most relevant results for your particular situation. Google uses the example of searching for taj mahal to show why context matters: “the perfect search engine should understand exactly what you mean and give you back exactly what you want”. I think both sides of the privacy debate have a strong case here…
Very early Safe Network search could be as simple as “ask your friend who’s really into boats which pages you should visit if you’re interested in buying a boat”. If the friend has appropriate knowledge and understands your situation, and they’ve read enough on Safe Network about boats, they can hopefully point you to some useful pages.
Maybe that’s a basis for the early search: people who know where the good pages are on the network and keep track of content really closely, building a really human-level mapping of stuff. Almost like news aggregators, that kinda thing.
Then it can be extended to PageRank-style automation and ranking, which is the same thing but automated and scaled up. It’s still more like asking a friend, since the results of that ranking are presumably stored on the friend’s own machine and not on Safe Network, so we have to actually ask the friend (i.e. the friend would suddenly know we’re interested in boats). At a basic functionality level there’s not that much difference between asking Tony-the-boat-friend about boats and asking Google about anything. Both involve asking someone more knowledgeable.
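For anyone who hasn't seen it, the basic PageRank idea fits in a few lines. This is a minimal sketch of the textbook power-iteration version, not whatever Google actually runs. The link graph and page names are hypothetical, and the damping factor of 0.85 is just the commonly quoted default.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a score per page; the scores sum to 1.
    """
    # Collect every page, including ones that are only ever linked to.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly between the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical link graph: two pages link to "boat-guide", which links back to one.
links = {
    "tonys-links": ["boat-guide"],
    "random-blog": ["boat-guide"],
    "boat-guide": ["tonys-links"],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Whoever runs this needs the whole link graph in one place, which is part of why the resulting ranking ends up living with the "friend" rather than with the person searching.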
The hard part of scaling up will be improving the ordering using added context. Google can do this automatically because of how much they already know about each person, from their searches but also from their Gmail, Maps, IP address etc. On Safe Network this context will need to be supplied for each search. Is the search query itself context? In some way yes, but it’s such a strong piece of context that it probably deserves to remain conceptually distinct from the other stuff.
My feeling is someone will probably design a standard format for supplying context (including techniques for privacy etc) and this context-stuff will be managed automatically by some personal-assistant type algorithm that runs on your machine and ‘gets to know you’ and can fill the search context automatically each time.
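As a very rough idea of what such a format could look like, here's a sketch where the context travels alongside the query and the local assistant only includes the fields you've agreed to share. No such standard exists as far as I know; the field names (`language`, `region`, `interests`) and the JSON shape are all assumptions made up for the example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Set


@dataclass
class SearchContext:
    """Hypothetical context a local assistant might hold about you."""
    language: Optional[str] = None
    region: Optional[str] = None
    interests: List[str] = field(default_factory=list)


def build_request(query: str, context: SearchContext, share: Set[str]) -> str:
    """Serialise the query plus only the context fields named in `share`.

    The query stays separate from the context, and empty or unshared
    fields are left out entirely so they never leave the machine.
    """
    shared = {name: value
              for name, value in asdict(context).items()
              if name in share and value}
    return json.dumps({"query": query, "context": shared}, sort_keys=True)


ctx = SearchContext(language="en", region="NZ", interests=["sailing"])
# Share language and interests with this search provider, but not region.
print(build_request("taj mahal", ctx, share={"language", "interests"}))
```

The `share` set is where the privacy techniques would plug in: it could differ per search provider, or per query.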
But I’m not sure how to take the final step and put the search index data into Safe Network itself. It seems to me that we will always be finding better ways to ‘ask a friend’ to tell us some of their ‘knowledge’. You could just say Tony-the-boat-friend uploads his best boat links once a week, and Google could do the same for all searches, but I’m not sure how practical or desirable that would be.
It’s also worth asking how we will submit a search query on Safe Network. Would it be like sending an email to Google, with Google responding in another email containing the results? Most likely it will be direct p2p (as in my computer to Google’s computers), so we really do end up just talking to Google almost exactly like happens now.