Exponential Tree of Credibility (my crazy idea)


#1

Hello everyone!

Long time lurker of this forum here. I usually don’t post things, but yesterday I had an idea that I think might be relevant. I wanted to see what this community thought about it. It’s about solving the issue of multiple accounts to allow for reputation systems, voting, commerce etc.

I’ll start off with framing the problem: How can you find out if someone whom you’ve only just met is trustworthy? How can you be reasonably sure that they are who they say they are? How can you know if they are actually going to mail you that thing you bought on the SAFE network version of Amazon (coming soon I hope!) after you transfer them your currency?

It occurred to me that the normal answer to this problem is to conduct a background check of some kind–that is, get some kind of feedback from the people with whom your subject has interacted previously. If I wanted to know if Alice was trustworthy, I could go and ask Alice’s previous client (for instance) Bob “Do you vouch for Alice?”. If I wanted to be more sure, I could ask a whole bunch of people who Alice has previously associated with and take an average. In fact this is exactly the way it is already done in the case of “seller’s ratings” or “trusted seller” etc.

However, there are clearly some major weaknesses in this scheme. Here are a few that I can see:

(1) What if people created a bunch of fake identities to vouch for themselves?
(2) What would prevent someone from “vandalizing” the ratings of another user (leaving unwarranted negative ratings)?
(3) What would prevent the administrator of the system itself from censoring/manipulating or otherwise abusing his power? (such as selling “sponsored reviews”)

There is some evidence that this already happens with Ebay/Amazon/Reddit. What I was thinking is that the SAFE network provides the basic technology to begin to solve this problem by taking it out of human control and using a clever algorithm to prevent the problems I mentioned in points (1) and (2). I do have an idea for such an algorithm. I want to reiterate that I have not thought about this for more than a couple of days, so there might be some big problems with it, but that’s why I’m hoping you guys can play devil’s advocate and try to shoot this down. Let me start off by explaining how I got this idea (please bear with me, I promise this is all leading somewhere!)

There is a website that I use called couchsurfing.org. Basically it’s a community where, instead of paying for a hotel while traveling, you can stay for free at the houses of people who have offered to host. I’ve hosted travelers and “surfed” a lot myself. It’s really great. Anyway, the main question that people have about this when I tell them is something like “But how do you know the person who you’re going to stay with is not an axe murderer?” To this I reply, “Well, there is a feedback system, where each time you stay with someone or someone stays with you, you leave a rating on their public profile which either vouches for them (if you had a positive experience) or warns other travelers about this person (if you’ve had a bad time–I never have, but that’s beside the point).”

Ok, so this is a fine system and all, but of course if you are very skeptical (which you should be!) then you might say, “Well how do you know that people are not just creating lots of profiles with fake pictures to leave themselves positive feedback?” Well, sure, they could be. So we have just changed the problem from “vetting that person” to “vetting the people who vouched for that person”. Of course we cannot be 100% sure, but if all the people who vouched for someone check out themselves (they have their own positive references, etc.) then can we reasonably conclude that the person who they all vouched for is trustworthy? I think so. Well, good enough for couchsurfing anyway.

But not good enough for the very resilient system that would be needed for voting/commerce etc. If you’ve followed my train of thought thus far, I’m sure you’re thinking “What would stop someone from creating a whole huge web of profiles, like thousands of them, and then using them all to vouch for each other so that even if you checked all the references several ‘layers’ deep you still would be fooled?” Well, nothing would stop them if they were determined, and that’s the problem. I thought about this, and I realized that it is actually very easy to calculate how many fake profiles you would need to make in order to make a web of references that was, let’s say, “n” layers deep, assuming that you insisted on having at a minimum x references for each layer. It’s roughly x^n. So if you wanted to check 4 layers in (the person => person’s references => each reference’s references => THOSE references’ references) with a minimum of 3 vouches, you would end up looking at about 3^4 == 81 profiles in total. That’s not really too much, considering that checking 4 layers deep is pretty paranoid and we admitted earlier that creating thousands of fake profiles isn’t too hard to do.
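To put numbers on the x^n estimate, here is a rough Python sketch. It assumes the person being checked is layer 0 and that every account has exactly x references; the function names are made up for illustration. Strictly speaking, x^n is the size of the deepest layer only, and a faker would also need the layers above it, but the total is dominated by that last layer either way.

```python
def profiles_at_layer(x: int, n: int) -> int:
    """Number of accounts in layer n when every account has x references."""
    return x ** n


def profiles_in_tree(x: int, n: int) -> int:
    """Total accounts in layers 0..n (the checked person is layer 0)."""
    return sum(x ** layer for layer in range(n + 1))


# 3 references per account, 4 layers deep
print(profiles_at_layer(3, 4))  # 81
print(profiles_in_tree(3, 4))   # 121

# 5 references per account, 5 layers deep: already in the thousands
print(profiles_at_layer(5, 5))  # 3125
print(profiles_in_tree(5, 5))   # 3906
```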

But here is the crux of my idea: what if you automated this process in the SAFE network, via some kind of app? Here are the basic rules that the algorithm would follow when a “reputation check” is conducted on a given user (let’s call him Bob).

(1) The app queries, from Bob, the list of all the users that Bob has interacted with (perhaps names hashed to ensure anonymity).
(2) The app RANDOMLY selects a certain number (let’s say 5) of these names (randomness is key!).
(3) The app then takes an average of the ratings that were left from these 5 users… but we’re not done yet (we still need to weight the average based on the credibility of people who left references)
(4) The app sends out an identical query for the contacts of the people with whom EACH OF THESE 5 REFERENCES have interacted, and rates them according to the average of 5 more randomly selected ratings. This process is repeated “n” number of times, until the app has effectively gathered ratings data for a large number of accounts that are in the “tree” that is constructed by this iterating branching => querying process. Due to the exponential nature of this tree, the number of accounts could easily run into the thousands.
(5) Now the app can reconstruct, from the bottom up, the weighted average rating of each user in the tree. So, for instance, if the user got a 100% positive rating (from some minimum number of other users) then any upvote that he contributed to the next layer in the tree would be weighted with a coefficient of 1 (the maximum level of credibility). If, however, the user had only 50% positive feedback, his vouching for someone else would carry a reduced weight–50%–in proportion to how he himself had been rated.
(6) When this tree is followed all the way back up to Bob (the ‘trunk’), the app calculates that final number (derived from all those ratings farther up in branches) and this is Bob’s credibility rating.

Let’s say Bob receives 5/5 positive reviews. But 4 of those reviews are from accounts that are actually fake, and don’t have any reviews themselves. So then those 4 accounts with no reviews don’t contribute to Bob’s rating and so he ends up with only around 20% credibility rating.
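Steps (1) to (6) and the Bob example can be sketched in a few lines of Python. This is only a toy version under stated assumptions: ratings live in a plain dictionary rather than on the SAFE network, a rating is 1 (vouch) or 0 (warning), an account with no references counts as zero credibility, the deepest layer falls back to a plain average, and the “minimum number of other users” from step (5) is left out. The names `credibility`, `ratings`, `depth` and `k` are hypothetical.

```python
import random


def credibility(user, ratings, depth, k=5, rng=None):
    """Estimate a user's credibility in [0, 1] by walking the reference tree.

    ratings[u] maps each account that rated u to 1 (vouch) or 0 (warning).
    depth is how many more layers of references to follow below this user.
    """
    rng = rng or random.Random()
    received = ratings.get(user, {})
    if not received:
        return 0.0  # assumption: no references means no credibility

    # Step (2): pick up to k raters at random.
    raters = rng.sample(sorted(received), min(k, len(received)))

    if depth == 0:
        # Deepest layer: step (3), a plain unweighted average.
        return sum(received[r] for r in raters) / len(raters)

    # Steps (4)-(6): weight each rating by the rater's own credibility.
    total = 0.0
    for rater in raters:
        total += received[rater] * credibility(rater, ratings, depth - 1, k, rng)
    return total / len(raters)


# Bob has 5 positive reviews, but 4 come from accounts with no reviews.
ratings = {
    "bob": {"carol": 1, "fake1": 1, "fake2": 1, "fake3": 1, "fake4": 1},
    "carol": {"dave": 1, "erin": 1, "frank": 1, "grace": 1, "heidi": 1},
}
print(credibility("bob", ratings, depth=1))  # 0.2
```

Dividing by the number of sampled raters (rather than by the sum of their weights) is what produces the 20% figure here; normalising by total weight would be a different design choice with different behaviour.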

It’s really, really hard to describe this with words, but I hope you’re able to get some idea. In any case, if you’re on the same page with me at this point, you can see how this would require anyone who wants to make a fake account with a good rating to first create an exponentially huge number of other fake accounts and an elaborate tree of fake references in order to establish credibility. In order to prevent circular referencing, the app would compare the ratio of accounts/unique_accounts that it derived from its querying process and, if this number was significantly different from 1, would flag the account in question as a possible fake.

So that, if implemented properly, could solve problem (1). As long as there was some kind of effort (like a simple captcha) to create an account, it would not be worth it to have to create an entire fake tree of self-referencing accounts just to give credibility to one account.

Now for problem (2): how to stop people from leaving unwarranted negative references: If you think about it, the system I’ve described already might solve this problem naturally–since fake accounts (those not backed by a “real” tree of references) would have a weighted voting ability near to zero and thus could not do any damage.

As for issue number (3) (the system itself being abused) I’m counting on the decentralized SAFE network design and programmers a lot smarter than me to use cryptography to make this algorithm anonymous and tamper-proof with open source, signed software. Actually that’s why I want to put this post on this forum and not elsewhere–I’m not sure such a system would ever work if you had to trust a server.

Let me sum up by adding that this could offer an elegant solution to the “proof of unique human” problem that I’ve seen you guys discussing elsewhere on this forum. Basically, it all comes down to the fact that a large, non-self-referencing tree of real accounts would take so much effort to fake that it would not be worth it. I could see this being used not only for commerce/trust-assurance but also for voting/moderation functions, where a person’s vote is weighted according to the credibility they’ve established over time through a tree of positive interactions. It is worth pointing out that the voting credibility for a given person would be naturally limited by the system as “one person one vote”, thus precluding the situation of concentrated moderation power with a few users.

There are probably lots of holes in this argument, and please let me apologize to you noble people who have suffered through my entire wall of text! In any case, this is just something I had to dump out of my brain onto paper. I really look forward to getting some feedback and I’m hoping you guys can point out problems in this idea that I’m not seeing.

Cheers
Stuart


#2

This is good, check out identifi, who are trying something like this. It’s a very big problem that needs a solution. Trust models separate from commerce systems are good for the user as they can port their trust to other systems and keep their details more secure.


#3

Let’s say Bob receives 5/5 positive reviews. But 4 of those reviews are from accounts that are actually fake, and don’t have any reviews themselves.

What if I create 500 fake accounts, each of which is vouched for by 5 other fake accounts from this group?

More importantly, the idea assumes there’s a 1 to 1 mapping between a user account and MaidSafe address.
IIRC someone said each user account can have up to 2 MaidSafe addresses. If that is correct then you’d have to rate addresses and I’m not sure if you could actually tell what addresses belong to what ID. In that case a person could have an honest and a dishonest account.

I’m also wondering how would that disclosure of one’s ID<->address mapping impact user’s privacy.


#4

I cannot think too much right now (in routing algorithms), but the general principle of the randomness is good and valid, this evens out over time to produce stunningly accurate results. Perhaps I can add in a couple of algorithms to throw into this mix, first we have IDA (Information Dispersal) and secret sharing. These help with this kind of thinking (both are in SAFE).

IDA (Rabin - Information Dispersal) -> splits up data into N of P (N == number of shares, P == parts required to make up N). It’s relatively efficient and the data/P is pretty close to efficient (similar to how RAID works). It is not secret though.
Secret Sharing (Shamir) -> Here you have a similar N of P mechanism, but it is secret, that is to say that if you get P - 1 parts then you cannot deduce anything of the original data (unlike IDA where you have a chance). It is not space efficient as each part is approximately equal to the whole data size.
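For anyone who wants to see the secret-sharing idea concretely, here is a minimal Python sketch of Shamir’s scheme using the N and P naming above (N shares handed out, any P of them recover the secret). It is a textbook illustration and not the code SAFE uses; the prime, the function names and the restriction to integer secrets are assumptions made for the example. It needs Python 3.8+ for the modular inverse via `pow`.

```python
import secrets

# A Mersenne prime; the secret must be an integer smaller than this.
PRIME = 2**127 - 1


def split_secret(secret, n_shares, parts_required):
    """Split an integer secret into n_shares points on a random polynomial."""
    # Polynomial of degree parts_required - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(parts_required - 1)]
    shares = []
    for x in range(1, n_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, evaluated mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split_secret(123456789, n_shares=5, parts_required=3)
print(recover_secret(shares[:3]))  # 123456789
print(recover_secret(shares[2:]))  # 123456789 (a different set of 3 shares)
```

Each share is a number about as large as the prime, which matches the point above that secret sharing is not space efficient.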

Perhaps your ideas can be honed a little if somehow groups were to come together, but had to recreate a crypto key for instance that proved they were real, or perhaps each person who does stay gets another P (part). In IDA for instance you can add more shares at any time. Then the owner can give another share to all who stay as proof of staying or voting etc.

Just random thoughts to throw in some more possibilities for your ideas, hope you don’t mind.


#5

Stuart, this sounds like the germ (at least!) of an excellent idea. Definitely worth exploring in depth if you and others can take that on.

I wouldn’t worry about finding holes too much just yet, that’s always easy with new ideas, and diverts attention from the creative leaps and original thinking needed to tackle hard problems.

Later, we’ll all be able to run it through the mill and see how well it stands up.

My intuitive sense is that this works, or can work, using your ideas in a similar way to SAFE itself. David seems to be thinking that too. So go for it! :slight_smile:


#6

What if I create 500 fake accounts, each of which is vouched for by 5 other fake accounts from this group?

That is why the app compares the number of accounts in the tree to the number of unique accounts in the tree. In any group of accounts there is bound to be some duplication, but you would expect that statistically the proportion of duplication would decrease in inverse relation to the total number of accounts. Bottom line is that for a “real” account, the ratio of unique_accounts/total_accounts (let’s call it the U/T ratio) in the tree is going to be slightly less than 1, but not by much. It is even possible that from time to time, the app could automatically conduct “research queries” to determine what the average ratio on the network is. So to answer your question, a group of accounts which self-referenced in a circular fashion would be recognized as fake since the U/T ratio associated with each account is significantly less than 1.
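Here is a rough Python sketch of the U/T check, just to show the effect. It assumes the references are available as a plain dictionary, and it takes the first k references of each account instead of a random sample so the numbers are repeatable. `unique_to_total_ratio` and `honest_tree` are hypothetical names, and the “honest” tree is idealised (no overlap at all), so it comes out at exactly 1 rather than slightly below.

```python
def unique_to_total_ratio(user, references, depth, k=5):
    """Walk the reference tree and return unique_accounts / total_accounts."""
    total = 0
    seen = set()
    layer = [user]
    for _ in range(depth + 1):
        total += len(layer)
        seen.update(layer)
        # Next layer: up to k references of every account in this layer.
        layer = [ref for account in layer for ref in references.get(account, [])[:k]]
    return len(seen) / total


def honest_tree(root, depth, k=5):
    """Build an idealised tree in which every reference is a distinct account."""
    refs = {}
    layer = [root]
    for _ in range(depth):
        next_layer = []
        for account in layer:
            refs[account] = [f"{account}.{i}" for i in range(k)]
            next_layer.extend(refs[account])
        layer = next_layer
    return refs


# Six fake accounts that all vouch for each other.
ring = {f"fake{i}": [f"fake{j}" for j in range(6) if j != i] for i in range(6)}

print(round(unique_to_total_ratio("fake0", ring, depth=3), 3))         # 0.038
print(unique_to_total_ratio("alice", honest_tree("alice", 3), depth=3))  # 1.0
```

The closed ring gets visited 156 times but only ever contains 6 distinct accounts, which is the kind of gap the U/T ratio is meant to expose.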

More importantly, the idea assumes there’s a 1 to 1 mapping between a user account and MaidSafe address.
IIRC someone said each user account can have up to 2 MaidSafe addresses. If that is correct then you’d have to rate addresses and I’m not sure if you could actually tell what addresses belong to what ID. In that case a person could have an honest and a dishonest account.

I’m afraid I don’t know enough about the SAFE network to comment here at this point. I’m going to read up on this though!

I’m also wondering how would that disclosure of one’s ID<->address mapping impact user’s privacy.

Yes I agree that is a big concern! I wonder if there is a technical solution to this problem? This is where I wish I knew more about encryption implementations :smile:


#7

Going to go try to educate myself on IDA and secret sharing now! Thank you!


#8

K, wait. Maybe I’m misunderstanding what @janitor is saying, but if having multiple accounts impacts the disclosure of private information, wouldn’t that be a positive feature of something like this?

So if the system is set up so that with 1 to 3 accounts it’s statistically unlikely (i.e. impossible) for your private data to be compromised, but with 300 to 500 accounts it’s statistically likely, that would seem to be a positive effect for a reputation system because rational actors would have an incentive not to create squirrel accounts.

Am I missing something?


#9

@kirkion, I meant to say that I think it was said on this forum that each user account may have to have up to 2 addresses.
When I mentioned 100’s of accounts, I meant hundreds of fake (farmed) MaidSafe identities, not that a single user account would have 100’s of addresses (I think someone said that won’t be possible).

@sbow22: "That is why the app compares the number of accounts in the tree to the number of unique accounts in the tree. "

I don’t understand this properly so I can’t comment. I hope someone will be able to. (It would seem to me that it’s hard to tell which of millions of accounts are unique, but I’m probably wrong so I’d rather not make nonsensical claims!)

It’d be nice to review older posts on this topic (I can’t remember with certainty whether it was said that each account will have 2 addresses, or two “public ID’s”), but unfortunately I’m too tired to do that now… But if someone has the energy and time to do it, hit the search function, maybe my original claims weren’t accurate.


#10

Thanks for the clarification.

And Here is where @dirvine talked about having two Public IDs

And @sbow22 that thread is one of the earlier discussions of reputation and what will be possible/desirable on the Safe network.


#11

To expand on my earlier idea:

First off, in any kind of system like this you will need to have a probationary period, where a new account can set up connections and not be tagged as fake (otherwise you will be giving early adopters a virtual lock on who can join this validated network).

So whether this is an app on top of the Safe network, or an opt-in core function (I think that you would get quite a bit of push-back to the idea of making this a mandatory core function), you will need some time where people are just engaged in social networking before the transaction ability of this verified account kicks in.

So suppose you set it up so that people have to verify that they possess a minor amount of safecoin, say enough for a week’s worth of average use (assuming that people are doing things like posting on forums and that those posts equate to PUT requests). The exact amount doesn’t matter much, but it would be an amount that an average user would have no trouble meeting, yet one that would make it prohibitive to have too many of these verified accounts.

So then make it so that each account publicly posts a multisig key. The idea being that if you set up the account so that 20 of these multi-sig keys can overwhelm the original private key, then if someone creates 20 spam accounts, another rational actor can ping the 20 accounts and get a multi-sig key which allows them to drain the safecoin account associated with the spam accounts. As soon as that happens, the spam accounts are deactivated.

I’m imagining Spammer Hunter/Killers patrolling the Safe network looking for spam accounts so that they can drain the associated safecoin.

And this takes us back to the probationary period. If you require a reasonable probationary period, then these spam accounts will be detected and drained before ever becoming live.


#12

I have a strong knee jerk reaction to reputation systems outside commerce but Stuart’s idea or the one he is putting forward puts a grin on my face. There is something great about it. Maybe it could help cut down on unnecessary contracts. This sounds like a reputation system that could establish plausible ID or a level of trust that was almost synonymous with useful ID.


#13

I just love how a certain political persuasion is vigorously opposing “unnecessary” contracts.
As if owning a MAID (or SAFE, later) isn’t a contract. (Of course it is, but they can get over it because … well, they want them. Sometimes you just can’t do without reasonable exceptions!)


#14

Don’t you just love how maidsafe just brings everybody together of every different colour and flavor imaginable?

As for the topic at hand I think I get it but honestly when thinking logically I think more in concepts and pictures than in mathematics (probably why I have a problem with code), especially when discussing subjects like this. If it was raw math that would be a different thing but what we’re discussing here is actually describing a process of abstract ideas and well… pictures and diagrams would be useful to help illustrate it.


#15

Maybe this exponential tree can be used to discredit the various organs of the lie machine. BBC 22% reliable. Fox 16% reliable. It seems like some of the ideas could work to gauge how characteristic misrepresentation was on these entities. But a lot of them would probably be happy to be honest 99% of the time so they could lie 1% of the time where it really mattered. The system would probably be able to weed out the strategic liar approach with the individual human raters who helped generate the statement checking characterizations, but the bar would be much higher for conflict media. It could never meet that bar because of its conflicts. But since the current system works on bulk lies it could be a tool to raise public awareness.


#16

Yes, this was my feeling too. I will work on making a diagram and see if I can post it here.

EDIT: @kirkion About your idea on the spam hunter-killer bots… I had not thought about that, but it did occur to me that it would also be necessary to find some way of disincentivizing “real” accounts from being too liberal with the vouches that they give to others–perhaps users who consistently vouch for accounts that are later flagged as spam could be down ranked themselves. I’m going to have to give this some more thought.

Also, I just want to say thank you to everyone for the feedback so far, I’m glad you guys think the idea has at least some merit. Really looking forward to playing around with this idea once we can start building apps.