Brainstorming decentralized search on Safe

Nicely laid out and great use case example :wink:

It makes me wonder about the new models we'll see. Taking your example, maybe search doesn't need a central index. Maybe search becomes the best engine at finding the best "person who maintains links about X" based on the context held in each user's storage.

Ruben Verborgh prototyped something along these lines, which I posted here some time ago, where the search uses the knowledge gathered about the user and held client side. He's not the only one interested in this approach, so I think @mav is right that we'll see innovation in using context provided by the user to a tool, as well as in how this is applied in search.

FWIW, I don't accept that centralised search indexes are the best technical solution to this problem. They are easier to understand, but getting them to scale was and remains a big problem.

We centralised search because the incentives were there, whereas we are creating new incentives to solve these problems in a decentralised manner. I expect interesting and possibly surprising innovations and discoveries - including many "Doh, why didn't I think of that" solutions. I always think of WinZip.

There are lots of opportunities here for those willing to think long and hard about these problems. They are the people who will solve them, so it is brilliant to be here at this stage in the process. Later, many will kick themselves that they didn't have a go, because anyone can innovate and succeed if they're willing to work at it.

8 Likes

This is important for any solution or service... keen to see an answer to this.

The idea of "ask a friend" might work after a fashion, but it needs to address inertia: how does new content become listed? In theory the creator would have lists and a reason to list, so perhaps it sucks people in... if done well. Also, volume: the trouble here, again, is that natural language is diffuse, so a lot of lists might not exist.

2 Likes

Hash the search terms and then look up the data object holding the indices for those terms at the corresponding xor address.
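A rough sketch of that lookup in Python, just to make it concrete. Everything named here is an assumption for illustration: SHA3-256 as the hash, the lowercase/whitespace normalisation, the `index_store` dict standing in for network storage, and the `safe://` links. It is not how the Safe Network actually derives or stores addresses.

```python
import hashlib

# Hypothetical stand-in for network storage: hex address -> list of links.
index_store = {}


def search_address(terms: str) -> str:
    """Map a search phrase to a 256-bit address (as hex) by hashing it."""
    # Assumed normalisation: lowercase and collapse runs of whitespace.
    normalized = " ".join(terms.lower().split())
    return hashlib.sha3_256(normalized.encode("utf-8")).hexdigest()


def publish(terms: str, links: list) -> None:
    """Store an index for these terms at the address derived from them."""
    index_store[search_address(terms)] = list(links)


def lookup(terms: str):
    """Return the stored index for these terms, or None if none exists."""
    return index_store.get(search_address(terms))


publish("cat videos", ["safe://cats-on-film", "safe://kitten-clips"])
print(lookup("Cat   Videos"))  # same address after normalisation
print(lookup("dog videos"))    # nothing published there yet
```

The point is only that anyone who knows the terms can compute the address without asking a central service.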

The ordering aspect within a single search index would seem to be covered by a pagerank sort. Maybe other sort orders could be selected, such as publication date, author, organization, etc.
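To illustrate the selectable sort orders, here is a minimal sketch. The `IndexEntry` fields, the `SORT_KEYS` table and the sample entries are all made up for the example; a real index would carry whatever metadata its publishers agree on.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IndexEntry:
    """One hypothetical row in a search index."""
    url: str
    rank: float       # e.g. a precomputed pagerank score
    published: date
    author: str


# Sort order name -> (key function, descending?)
SORT_KEYS = {
    "pagerank": (lambda e: e.rank, True),
    "date": (lambda e: e.published, True),       # newest first
    "author": (lambda e: e.author.lower(), False),
}


def sort_entries(entries, order="pagerank"):
    """Return the entries sorted by the order the user selected."""
    key, descending = SORT_KEYS[order]
    return sorted(entries, key=key, reverse=descending)


entries = [
    IndexEntry("safe://a", 0.2, date(2020, 1, 1), "zed"),
    IndexEntry("safe://b", 0.7, date(2019, 5, 1), "Amy"),
    IndexEntry("safe://c", 0.1, date(2021, 3, 1), "bob"),
]
for order in SORT_KEYS:
    print(order, [e.url for e in sort_entries(entries, order)])
```

Sorting happens client side here, so the stored index does not need to commit to one ordering.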

1 Like

I wonder what we're asking for where the search is a service... so, for any centralised off-network service, is there, or can there be, an option to receive a user's request and reply to it? Before the network can handle certain services, the option to provide a service to the Safe Network might be very useful... without that we might lack basic functionality that users will want.

I'm certain that there will be people who will provide a centralized service and interface (safe site/app) for search. There could be some decent profit incentive to do so. But that is off-topic.

1 Like

Perhaps, but it might equally be a necessary step towards something that the network can adopt.

1 Like

Do you think this is practical? It would seem like a lot of data to be uploaded to achieve this - who pays for that? Would it be possible to deal with typos in search terms? What about varying ordering of search terms? When there are many results (e.g. "buy boat") does the person have to download the entire result set, then apply their personal ordering?
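To make the typo and ordering questions concrete, here is a small sketch. It assumes SHA3-256 and a simple normalisation step (lowercase, optionally sort the words); neither is specified anywhere in this thread. Sorting the words before hashing handles reordering, but a typo still lands at an unrelated address, because a cryptographic hash has no notion of "nearly the same input".

```python
import hashlib


def address(terms: str, order_insensitive: bool = True) -> str:
    """Hash a search phrase to a hex address (illustrative only)."""
    words = terms.lower().split()
    if order_insensitive:
        # Assumed normalisation: word order should not change the address.
        words = sorted(words)
    return hashlib.sha3_256(" ".join(words).encode("utf-8")).hexdigest()


# Reordered terms share an address once the words are sorted first.
print(address("buy boat") == address("boat buy"))

# Without that step, each ordering needs its own index.
print(address("buy boat", False) == address("boat buy", False))

# A typo gives a completely different address either way.
print(address("buy boat") == address("buy baot"))
```

So typo tolerance would have to live somewhere other than the address itself, for example in the client before it hashes.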

Do you think this is desirable? Should we be permanently storing the search results? Who decides which pages actually belong in the xor address for that search? How often are the search results updated? Can entries into the search results be spammed?

I like the concept but I would love to hear more about the practicality and desirability of this, because there seem to be some very difficult barriers (compared to the current 'ask a friend' style Google option).

If this is a public algorithm then the rank will be gamed. I think this is a big part of why Google introduced additional context into their search results: pagerank alone wasn't giving good enough results. But I like the idea of being able to publicly store and rank pages at a search-term-xor-address.
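Here is a toy illustration of what I mean by gaming. It is a textbook power-iteration pagerank, not Google's actual system, and the page names and the "link farm" are invented for the example. Pages B and T start out with identical rank; adding five throwaway pages that link only to T is enough to push T ahead.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Basic power-iteration pagerank over a dict of page -> outgoing links."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new[target] += damping * rank[page] / n
        rank = new
    return rank


honest = {"A": ["B", "T"], "B": ["A"], "T": ["A"]}
before = pagerank(honest)

# Add a link farm: five pages whose only purpose is to link to T.
farmed = dict(honest)
for i in range(5):
    farmed[f"farm{i}"] = ["T"]
after = pagerank(farmed)

print("before:", round(before["B"], 4), round(before["T"], 4))
print("after: ", round(after["B"], 4), round(after["T"], 4))
```

On a network where creating pages is cheap, this kind of manipulation is the obvious attack on any purely link-based public ranking.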

1 Like

I really don't understand why you think this is necessary.
A separate namespace to hold indices where the hash of the search terms is the xor address is a rather elegant solution. Isn't this what you originally proposed above?

Example:

The user types the following into the Safe Browser address bar:

search: cat videos

Next, the client browser computes the hash of "cat videos" and treats it as a xorurl. It then navigates to the xor address.

Case 1) The xor address exists. Once there, a human-readable list is presented for these terms with sorting options (pagerank, date, publisher, subject, alphabetical, etc.)

Case 2) The xor address of the complex phrase does not exist. The user is presented with a banner page that suggests they reduce the number of search terms, or gives them the option to retrieve indices for the individual terms and do the cross reference manually. They could also then be given an option to publish the fruit of their labor as an initial index for the complex phrase.
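The two cases could be sketched roughly like this. It is only an illustration under assumptions: SHA3-256 as the hash, a plain dict (`store`) in place of the network, set intersection as the "cross reference", and invented `safe://` links. The function names `xor_address`, `publish` and `resolve` are hypothetical.

```python
import hashlib


def xor_address(terms: str) -> str:
    """Derive a hex address from normalised search terms (assumed scheme)."""
    normalized = " ".join(terms.lower().split())
    return hashlib.sha3_256(normalized.encode("utf-8")).hexdigest()


def publish(terms: str, links, store: dict) -> None:
    """Publish an index for the given terms into the stand-in store."""
    store[xor_address(terms)] = set(links)


def resolve(terms: str, store: dict):
    """Return (status, links) following Case 1 and Case 2 above."""
    words = terms.lower().split()
    hit = store.get(xor_address(" ".join(words)))
    if hit is not None:
        return "found", sorted(hit)            # Case 1: phrase index exists
    per_term = [store.get(xor_address(w)) for w in words]
    if not words or any(index is None for index in per_term):
        return "missing", []                   # show the banner page
    # Case 2: cross-reference the individual term indices.
    common = set(per_term[0]).intersection(*per_term[1:])
    return "cross-referenced", sorted(common)


store = {}
publish("cat", {"safe://a", "safe://b"}, store)
publish("videos", {"safe://b", "safe://c"}, store)

status, links = resolve("cat videos", store)
print(status, links)

# The user could then publish the result as the initial phrase index.
publish("cat videos", links, store)
print(resolve("cat videos", store))
```

After the last step the phrase has its own address, so the next person searching for it hits Case 1 directly.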

Other details might include how safesite owners could submit their site descriptions to index xor addresses, or how safe crawlers can update the indices in a secure and truthful manner while being compensated with PtP for every GET request to the index address.

Built-in browser search, no central service provider required other than the Safe Network itself, distributed content crawling and verification, a versioned history of index evolution thanks to permaweb... what's not to love? Wasn't this your idea?

P.S. The fun part begins once these indices are built and knowledge graphs can be constructed.

2 Likes

Because without an option to receive and reply to users, many services to the Safe Network are hobbled.

One case would be OpenTimestamps, which becomes a whole lot more complex otherwise. Services could be marked off-network to be clear about what is in-network, if that's a concern.

I'm just looking to maximize utility so that as many real-world use cases as possible become feasible. Sometimes that is at odds with pure use of one true solution, but most solutions are a mix until they improve.

So, search as a service would be trivial off-network until the network can adopt a decentralised equivalent... or better.