Suggestion: SAFE Network URL support for other languages/scripts through internationalized protocols

I’ve asked the question of IDN support before…

But I had not yet considered the following, so I will ask:
can IDN homograph attacks be solved at the protocol level?

I’m imagining URLs that look like this:

safe://foo.bar
sûr://fou.barre
sicher://fü.bar
sekuro://fŭo.baro
sábháilte://fet.barra
Ασφαλή://φου.παρ
сейф://фу.бар
सुरक्षित://फु.बार
安全://福。把如
安全性://フ、バル
안전://푸.바
ปลอดภัย://ฟู.บ้า
sur://fx.bâr

If so, I suggest that such protocols should be implemented.

8 Likes

Good question to ask! Just checked it quickly: In the RFC for public names it isn’t explicitly mentioned what kind of strings are allowed. Only that the public name is a string:

  • The Public Name Map is an RDF AOD w/specific type tag ( 1500 ) stored at the sha3 hash of the Public Name string shahash3('Public Name') .

Thus in theory it is supported, but I’m not sure about the exact URL and string parsing in the SAFE Browser.
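
To illustrate why the lookup itself should not care about the script: a hash function works on bytes, so any Unicode string can be hashed once an encoding is chosen. This is only a sketch. The RFC text quoted above just says "sha3", so SHA3-256 and UTF-8 are my assumptions, and public_name_address is a hypothetical helper, not a SAFE API.

```python
import hashlib


def public_name_address(public_name: str) -> str:
    """Hypothetical helper: hex digest of a public name.

    Assumptions (not confirmed by the RFC): SHA3-256 and UTF-8 encoding.
    """
    return hashlib.sha3_256(public_name.encode("utf-8")).hexdigest()


# Latin, Cyrillic and Chinese names all hash the same way.
for name in ["foo", "фу", "福"]:
    print(name, public_name_address(name))
```

Whether the browser or the APIs accept such strings before they reach the hashing step is a separate question.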


I don’t think it makes sense to ‘solve’ homograph attacks at the protocol level. Wikipedia lists numerous solutions that can be applied client-side, and the way Chrome and Firefox handle it seems rather elegant: either check the user’s language or display the name in Punycode.
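
A minimal sketch of that client-side idea, assuming a very crude notion of "script": the first word of each character's Unicode name. The helpers scripts_in and display_form are hypothetical names of mine; real browsers use far more careful rules than this.

```python
import unicodedata


def scripts_in(label: str) -> set[str]:
    """Approximate the scripts used in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # The first word of a character's Unicode name is usually its
            # script, e.g. "LATIN SMALL LETTER A" or "CYRILLIC SMALL LETTER A".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def display_form(label: str) -> str:
    """Show the label as typed, unless it mixes scripts; then show Punycode."""
    if len(scripts_in(label)) > 1:
        return "xn--" + label.encode("punycode").decode("ascii")
    return label


print(display_form("bzee"))          # single script, shown as typed
print(display_form("фу"))            # single script, shown as typed
print(display_form("p\u0430ypal"))   # Latin with a Cyrillic "а", shown as Punycode
```

This heuristic would also flag legitimate Japanese names that mix kanji and kana, which is one reason actual implementations keep lists of script combinations that are allowed together.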

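Also, the URLs you propose have translations of the safe:// scheme, but according to the IETF RFCs (RFC 3986 for the scheme grammar, RFC 5234 for ALPHA) a scheme must start with an ASCII letter and may otherwise contain only ASCII letters, digits, “+”, “-” and “.”: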

scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )

ALPHA = %x41-5A / %x61-7A ; A-Z / a-z
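
The grammar above translates directly into a regular expression. Here is a small sketch (is_valid_scheme is just an illustrative name) that checks a few of the proposed schemes against it:

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
# ALPHA is restricted to ASCII A-Z / a-z, as in the grammar quoted above.
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(scheme: str) -> bool:
    """Return True if the string matches the URI scheme grammar."""
    return SCHEME_RE.fullmatch(scheme) is not None


for candidate in ["safe", "sicher", "sûr", "сейф", "安全"]:
    print(candidate, is_valid_scheme(candidate))
```

Only the schemes written entirely in unaccented Latin letters pass; anything with a diacritic or a non-Latin script is rejected by the grammar as written.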

6 Likes

I’m not sure if the API imposes restrictions on characters or string length, but the WHM definitely does when creating a public name. As for the CLI?

I can’t remember if the browser does any checking, but it makes sense that it should do sensible things as you suggest @bzee. It may be worth creating an issue to ensure it gets looked into at some point.

@folaht how about creating an issue as a feature request that links to this topic, and notes that mitigation can be done in the client as @bzee has explained?

I think it does.
Client-side solutions only work when all clients adhere to the same solutions.
That includes e-mail, chatrooms, games and any other application.
On top of that, punycode completely defeats the purpose of having your site accessible in a different language.
Imagine an awesome Chinese network that you want to use, which was previously available only in Chinese.
One day, they make their network available for Latin characters as well.
Would you like to see your URLs be turned into gibberish Chinese characters whenever someone types in bzee.site?
No.

2 Likes

I think you misinterpret the proposed solutions. In the use case you describe it would just show bzee.site, as there are no mixed scripts. Additionally, according to the section I linked to, Chrome used to check the user’s preferred language.

Punycode is also helpful when it seems a user would not recognize these characters.

Perhaps you’re right, describing it that way would require a protocol for all clients to follow. I’m not sure now what exact definition of protocol I had in mind. If someone shares a mixed script URL, then what will protect the user? Will the browser check this? Or would the SAFE APIs just throw an error with mixed scripts? What is the ‘protocol’ here?

What if it’s not a mixed script, but a name with only Cyrillic characters? Then who or what is supposed to protect the user here?

The more I think about this, the more I start to get drawn to the pet name system again.

Less than half of the world’s population is monolingual. I regularly use four different languages and three different character sets, occasionally more. And this goes for all different applications. To me concepts like “user language” or “locale” are usually just confusing as they tend to presuppose just one language. I may well write in Russian, while using Finland time but the Swedish currency Krona, when commenting on something that was written in English in the US. Please don’t forget about us multilinguals. :pray:

3 Likes

I should have said user’s (or browser’s) preferred language(s). You can set up a browser to accept multiple languages in order of preference.

Good to be reminded of use cases!

1 Like

That goes for me too.
And I want to create a “fonêmik” constructed language for the EU, based on the language of love, so that I can put the European flag on it.
Long live “l’yrofoni”!

1 Like

Long live the chopstick!

1 Like

Let’s continue here so as not to spam a serious topic.

Frankly, this would be a great way to cause much confusion. I expect the Safe Browser will just default to safe:// when no scheme is given, just as Chrome and Firefox customarily assume HTTP if nothing is specified. Why complicate things for no benefit then?

Moreover, it’s also a terrible idea from a technological point of view. If we wanted to link to the Safe Network from an HTTP site, an iOS or Android app, or hand off to protocol handlers in other ways, we’d have to register roughly a thousand different ones. Just, no.

2 Likes

Am I missing something in this post?
“When nothing is given” seems to have no context at all, and all
three sentences seem to have no relation to one another.

To summarize what I interpret from it:

  1. SAFE network URL support for other languages is confusing. (headline)
    1.1 When I type in nothing I expect my browser to choose a default protocol. (…okay, I expect mine to do nothing when nothing is typed in, but this does not say anything about it being “a great way to cause much confusion”. Maybe the next sentences will clear that up?)
    1.2 Why complicate things then for no benefit? (A jump from non sequitur to non sequitur)

Anyway, I gave the benefits in the opening post: it allows other languages and scripts without resorting to “duct-tape solutions” such as Punycode.

Okay, so when China becomes the new global power and Chinese is undoubtedly used in all aspects of computer science, like English is today, should the protocol be replaced to support Chinese only, or should MaidSafe be disbanded altogether?

Hold on a second…

what’s http:// in other languages?

1 Like

Doesn’t exist. They use the Punycode method there, and that’s been criticized for homograph attacks. Or do you mean what the abbreviation is in other languages?

The context was the thread itself. Having prefixes in many languages is a terrible idea and it would cause much confusion.

Okay, so that’s an “if” and not a “when”. I hope China will split into many smaller countries and stop causing so many problems. I hope it will be done without loss of life. I have similar feelings about all large countries, by the way.

Anyway, I’ll dust off my brown coat if the time comes but we’re good with just English for the time being. Other than that, I already made my point about the insanity of having to register a godless number of protocols.