I may be a bit slow to realise this, and given the depth of thinking perhaps it’s already been talked through…?
Thinking about how users might use the Safe Network, it struck me there may be a simple option to satisfy the interest of users who choose not to see certain content.
For example, if an address is known to be a virus, can I know to avoid it without knowing the address itself?.. Could it be possible to publish something like a public-key-length fingerprint of the real location, and use that as an exclude filter? Is the xor space large enough for this already, or would it need to be a couple of lengths longer?
Initially I was thinking you could just take out the middle of an xor, and that becomes unguessable. The risk is that those who want the content could use the blacklists themselves; but if those lists are just like public keys, short enough that recovering the private real xor from them is onerous, then that might be uncontroversial.
So, for example… I don’t know if xors are the same from one testnet to the next and beyond, but these I know:
container - safe://hyryyryudredbri17c3qiimcpkkka93m8qyp915rmho3us7esudu7kzwf1anra
file - safe://hygoynyxt9qoktgj13tgq9xm17ci1yh54ho6o15aseky3fmd88fztxrcira
but if someone did not want it, would an exclude key be long enough to match well, not risking clashes with other content, AND short enough that the missing part of the address is not practical to guess?
So, to check then a few questions seem needed…
Is the route to data always via an xorurl?.. I expect necessarily yes, and that the safe://domain is not a way to skip around that xor.
Are the urls, for getting at the file directly AND the FilesContainer hosting it, long enough to inhibit scanning of the “missing” piece of the address?.. whether that is a trim or a public key difference.
And for the future, is there a practical limit on the length of xorurls? Does a longer xor cause a problem, or does it just get wrapped up into much the same space and distribution of nodes??
Virus detectors in the past, and maybe still, used a signature mechanism to find viruses in code.
The XOR storage system can allow people to make lists of XOR addresses (hash/signature) for the sections of a malware file. This does not apply to inserted code but to chunks of a whole file. Then a system to never retrieve those XOR addresses. Yes, changing one byte would defeat this, but it’s more for malware already stored in immutable storage.
This list can extend to “worst of the worst” files that have been identified, if one wants. Or extend to a general filtering system for files.
At some point the client is requesting chunks based on the xor/hash address of the chunk and the “filtering” can operate at this point as well.
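The filtering-at-retrieval idea above can be sketched roughly as follows. This is an illustrative model only: `fetch_chunk`, `network_get`, and the blocklist contents are all hypothetical names, not real Safe Network API, and chunk addresses are modelled here as hex digests of content hashes.

```python
import hashlib

# Hypothetical sketch: refuse chunk requests whose address is on a
# local blocklist, before anything is fetched from the network.
BLOCKED_ADDRESSES = {
    hashlib.sha3_256(b"known-bad-chunk").hexdigest(),
}

def fetch_chunk(address: str, network_get) -> bytes:
    """Retrieve a chunk by address, unless it is blocklisted."""
    if address in BLOCKED_ADDRESSES:
        raise PermissionError(f"chunk {address[:8]} is blocklisted")
    return network_get(address)
```

The point is simply that the check sits at the lowest level the client requests data, so it catches the chunk no matter which file or URL led to it.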
Not sure this is along the lines of what you were thinking about.
but how would a blacklist work… and be shared without becoming a go-to study book for those who want to improve on those files?
I’m thinking the lists of xyz would be held by the user, not some third party. So public-key equivalents of xorurls seem a simple option. Otherwise the sharing of lists and opinions, and the summing and diffing of those, becomes less practical.
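The summing and diffing of subscribed lists is cheap if each list is just a set of fingerprints. A minimal sketch, with made-up placeholder entries standing in for hashed addresses:

```python
# Illustrative only: each list the user subscribes to is a set of
# one-way fingerprints (hashes), so merging or comparing lists never
# exposes the underlying addresses.
list_a = {"hash1", "hash2"}            # e.g. a malware list
list_b = {"hash2", "hash3"}            # e.g. a topic filter

combined = list_a | list_b             # union: block anything on either list
only_in_b = list_b - list_a            # diff: what subscribing to B adds
```

Set union and difference give the "summing and diff" directly, and the user can hold the combined result locally.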
Not sure what you mean by study book.
The point is once a malware program (maybe program with malware/virus inserted) is stored on the network it is immutable. Once the malware has been detected then its chunks are added to the list.
Of course they can change it to store a new malware file, but the original is still there ready to infect those who download it to run.
Yes like virus checkers the list increases more and more.
But a study book? Without the datamap these chunks are useless and unable to be decoded. With thousands of files (maybe 10 thousand chunks) it is virtually impossible to reassemble the files by mixing and matching the chunks till they are found. And of course after a short while I’d expect there to be more than thousands of files.
Anyone wanting a study book would be 100 times better to search out the forums on the current web to find out how to do it. If they could decode the files to study then I suggest they don’t need the files since they could probably do it themselves by utilising the current web to find the needed info.
The question is how does the user make the list? Do they find all the malware they can and create the list? Or just add to their list any malware that they caught so it doesn’t happen again.
All I was suggesting is that while the user has the list of chunk addresses that the client refuses to retrieve, it is built from resources that make the lists for people to use. Adblockers are another example that do this sort of thing using web URLs in their list. The user can just use their own entries or avail themselves of many lists to make their own and add their own URLs as they see fit.
Why not just use the xor addresses of the chunks, preventing people from using the lists to find and decrypt the chunks/files?
For example if an address is known to be a virus, can I know to avoid that without knowing the address for it?
Instead of storing the address, you could store the hash of the address.
A hash function, for those who don’t know, always returns the same output for a given input. But the input cannot be mathematically derived from the output. It’s a one-way operation.
This is how competently-designed websites store user passwords. Instead of storing the password itself, they store the hash of the password. Then, when the user enters their password, they run it through the hash function and check if it matches the stored hash.
So you could do something similar with your blacklist URLs. Have the client run every website the user visits through the hash function, and check if it’s on the blacklist of hashed URLs.
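A minimal sketch of that hashed-blacklist check, using SHA3 from Python's standard library. The `safe://` URLs here are made-up examples, and `url_fingerprint`/`is_blocked` are illustrative names, not any real client API:

```python
import hashlib

def url_fingerprint(url: str) -> str:
    """One-way fingerprint of an address; the URL can't be derived back."""
    return hashlib.sha3_256(url.encode()).hexdigest()

# A published blocklist carries only fingerprints, never the URLs themselves.
blocklist = {url_fingerprint("safe://example-bad-site")}

def is_blocked(url: str) -> bool:
    """Hash each visited address and check it against the blocklist."""
    return url_fingerprint(url) in blocklist
```

Anyone holding the list can test a known address against it, but cannot enumerate the addresses it blocks.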
This seems sensible for web addresses.
But at the heart of web addresses is the storage, and chunks have hashes of encrypted data as their addresses. If you used the underlying storage, then the hash is already done for you.
Whatever works… my limited understanding was that the process of making a public key is much the same as a hash.
The talk above of viruses is just one example… any class of content the user chooses not to engage with could be a candidate. If I subscribed to the uk.gov perspective on what is out of bounds, then they provide a list of that, in a form not useful except for blocking content I choose not to want, by trusting that source. Diffs can still be driven out, but the user has control over which lists they subscribe to, by topic or by authority they trust.
Can the user hold a list that is sharable and not controversial, on topics and content that are?.. I wonder if there could be simple, scalable, fast at-the-user options for this; I’m just querying the limit of what is possible for casting those. The suggestion of blocking a piece of a file risks too many clashes… so better if centred on the whole-file signature??
If a part of the file is OK, then there is no need for those chunks to be in the list. For instance, if the malware is a 1K part of a 50MB executable, then only the chunk holding the 1K would be added to the list. Of course that makes the 50MB file useless, but if there were a clash with some of the file (i.e. a new version where the initial parts are exactly the same), then only putting in the one offending chunk is the safe way.
Yes, if that level of targeting is possible then perhaps that becomes useful… given a check that all chunks are received is likely necessary anyway. Though I don’t know enough about the chunking process… I would be surprised if the offending piece lands on the same chunk in every file it is part of… I thought it was a simple fragmenting by 1MB size…
Still, as above, this is not just about viruses but about whatever content by topic too… how does the user have control in a simple way…
I suspect there would be an App to allow the user to add to the list, to subscribe to 3rd party maintained lists (ie d/l their list and add), and so on.
Yes, but the detail… how does that work, so that the real addresses on the list are not accessible to the user?.. Third parties are not going to make available the real endpoints, but they might happily share public keys/hashes that allow those endpoints to be blocked. If the user trusts the third party, and both do not want to enable what they disagree with, that surely should be possible… without resorting to the third party filtering all traffic, which is in no-one’s interest, especially the user’s, since they might subscribe to multiple boundaries.
Whitelists are easier… but blacklist option is useful…
There is no problem for the user to see the list.
The list is a list of xor addresses and, more importantly, no datamaps. Without the datamaps, retrieving the encrypted chunks gives the user a bunch of unusable data. It’s not just one file of chunks where you rearrange them till you get it right and decode; it’s thousands of file segments as chunks, so rearranging them is unfeasible. In any case, these are files the user doesn’t want to see, so they won’t do it either.
Now, if one list is blocking the worst of the worst, then again there is not an issue, as reassembling the files is hard; but it can be made impossible by only including one chunk’s address from each file, and the file is still not accessible if the user tries to access it. Then, more importantly, it becomes impossible to find the whole file’s chunks, since only one chunk’s address is given and that one chunk is as good as random bytes. [Obviously if the user really wanted to force a file d/l then they could disable the filtering.]
The key is that the datamap is required to decode a file. Plus only having the address of one chunk in the file means no one can ever use the list to find the whole file.
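The one-chunk-per-file idea above can be sketched as a simple gate over a file's chunk addresses. Here `datamap` is simplified to a plain list of chunk addresses (the real datamap structure is richer), and the address strings are invented placeholders:

```python
# Sketch: the list holds a single chunk address per banned file; if any
# chunk address of a requested file matches, the whole retrieval is
# refused. The list alone can never reveal a banned file's other chunks.
blocked_chunks = {"addr-of-one-chunk-of-banned-file"}

def allowed(datamap: list[str]) -> bool:
    """A file is fetchable only if none of its chunk addresses is listed."""
    return not any(addr in blocked_chunks for addr in datamap)
```

One matching address is enough to block the file, while the list itself stays useless for reassembly.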
And if the listed chunk is among the first 2 chunks, then when the list causes a file to be blocked there isn’t even partial decoding of the file happening. This covers the case where the user requested a file (via a link or whatever): they don’t get part of the file/image/program decoded and showing if it’s an image or text.
Planning for worst case covers all bases…
No problem with a list, if it is not a useful list to attackers… I’m behind in understanding this but accept what you’re suggesting… though I wonder whether it covers off the many files that might be just one chunk?..
I am referring to the worst-of-the-worst list, which is the international signature list of cp imagery.
No file can be one chunk; three is the minimum, since later chunks use the earlier ones for self-encryption, and 3 is the minimum to do that because the first chunk has to use the other 2 for its encryption. Each chunk uses itself and the 2 around it to create the key to encrypt that chunk.
Below 3 KB file size there is no encryption/chunking at the moment. I may be wrong on the “no encryption” longer term.
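The neighbour-keyed scheme described above can be modelled very roughly as follows. This is not the real self-encryption algorithm, just a toy illustration of why 3 chunks is the floor: with fewer than 3, a chunk’s “two neighbours” can’t both be distinct chunks.

```python
import hashlib

def chunk_keys(chunks: list[bytes]) -> list[bytes]:
    """Toy model: each chunk's key mixes its own hash with the hashes
    of the two chunks around it (wrapping at the ends)."""
    n = len(chunks)
    assert n >= 3, "self-encryption needs at least 3 chunks"
    hashes = [hashlib.sha3_256(c).digest() for c in chunks]
    return [
        hashlib.sha3_256(
            hashes[i] + hashes[(i - 1) % n] + hashes[(i + 1) % n]
        ).digest()
        for i in range(n)
    ]
```

Because each key depends on three chunks' contents, no chunk can be decrypted in isolation, which is what makes a blocklist of raw chunk addresses safe to publish.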