Confusables handling

So, I just noticed the CLI doesn’t seem to handle capitals the same way the Browser does.

The Browser is perhaps one step ahead, as safe://Hello becomes safe://hello,
whereas the CLI will not find safe://Hello.

It’s unclear, then, how the fuller set of Unicode confusables is being handled at the moment, and what the intent is for the future.

Is the CLI not transposing confusables a bug or a feature, perhaps to allow browsers to handle them as they please? My default reaction is that it’s a bug, but it’s a muddy area, and perhaps the CLI keeps it simple and deliberate for that reason?

Oddly, the Unicode website is down at the moment for the detailed list at https://www.unicode.org/Public/security/8.0.0/confusables.txt


It would be very weird to be able to register safe://hello, and safe://hellO, and safe://Hello, and safe://hEllo etc.

If you can’t register those as separate addresses, the CLI should probably use that list of confusables to avoid confusion. :slight_smile:
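To make the confusables idea concrete, here is a rough Python sketch of the "skeleton" approach: decompose the name, replace each character with its prototype from the confusables table, then decompose again. The three sample lines are typed by hand in the layout of confusables.txt (source ; target ; type # comment) and are only an illustration. `parse_confusables` and `skeleton` are made-up names, not anything the CLI provides, and case is deliberately left alone here since the list does not fold capitals.

```python
import unicodedata

# Hand-typed sample in the confusables.txt layout (an assumption for
# illustration; the real file has thousands of entries).
SAMPLE = """\
0430 ;	0061 ;	MA	# CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
0435 ;	0065 ;	MA	# CYRILLIC SMALL LETTER IE -> LATIN SMALL LETTER E
043E ;	006F ;	MA	# CYRILLIC SMALL LETTER O -> LATIN SMALL LETTER O
"""

def _decode(field):
    """Turn a field of space-separated hex code points into a string."""
    return "".join(chr(int(cp, 16)) for cp in field.split())

def parse_confusables(text):
    """Build a {source: prototype} table from confusables-style lines."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        table[_decode(fields[0])] = _decode(fields[1])
    return table

def skeleton(name, table):
    """Map every character to its prototype, with NFD before and after."""
    decomposed = unicodedata.normalize("NFD", name)
    mapped = "".join(table.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)

table = parse_confusables(SAMPLE)
# "hell" + CYRILLIC SMALL LETTER O has the same skeleton as plain "hello".
print(skeleton("hell\u043e", table) == skeleton("hello", table))
```

Two names would then be treated as confusable when their skeletons are equal; capitals would still need a separate case-folding step.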


You could create anything, and the CLI reflects this, whereas the browser is forcing everything in the address bar to lower case before trying to resolve it.
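A toy Python model of that difference, in case it helps. The `registered` dict is a hypothetical stand-in for wherever names really live, and both functions only mimic the behaviour described here, not the actual CLI or browser code.

```python
PREFIX = "safe://"

# Hypothetical stand-in for the set of names that have been created.
registered = {"hello": "site-A"}

def _name(url):
    """Strip the safe:// prefix if present."""
    return url[len(PREFIX):] if url.startswith(PREFIX) else url

def cli_resolve(url):
    """Exact-match lookup: the name is used exactly as typed."""
    return registered.get(_name(url))

def browser_resolve(url):
    """Lower-case the name first, then look it up."""
    return registered.get(_name(url).lower())

print(cli_resolve("safe://Hello"))      # no exact match for "Hello"
print(browser_resolve("safe://Hello"))  # finds the entry for "hello"

# If someone creates "Hello" through the exact-match path, the
# lower-casing path can never reach it.
registered["Hello"] = "site-B"
print(cli_resolve("safe://Hello"), browser_resolve("safe://Hello"))
```

The last line is the awkward part: once a capitalised name exists, the two front ends resolve the same URL to different things.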


If all those were domains… which should/would/could the browser display…

To prevent name spoofing, name resolution should be implemented the same way in CLI and in browser.
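One way to picture "implemented the same way" is a single canonicalisation function that both registration and resolution go through. This is only a sketch: `canonical` and `Registry` are invented names, and case folding plus NFC is an assumed rule for the example, not a statement of what the network does.

```python
import unicodedata

def canonical(name):
    """Single rule shared by every front end (assumed: casefold, then NFC)."""
    return unicodedata.normalize("NFC", name.casefold())

class Registry:
    """Toy name store keyed by the canonical form of each name."""

    def __init__(self):
        self._names = {}

    def register(self, name, target):
        key = canonical(name)
        if key in self._names:
            raise ValueError(f"{name!r} collides with an existing name")
        self._names[key] = target

    def resolve(self, name):
        return self._names.get(canonical(name))

reg = Registry()
reg.register("hello", "site-A")
print(reg.resolve("hEllo"))  # same entry, whatever the capitalisation
try:
    reg.register("Hello", "site-B")
except ValueError as err:
    print("rejected:", err)
```

With this shape, safe://hello, safe://hellO and safe://Hello can never be separate addresses, and a CLI and a browser built on the same function cannot disagree.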

But the problem is larger than just character case, and perhaps IDNA, which you mentioned in another topic, could help.
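For a feel of what IDNA does and does not cover, here is a small experiment with Python's built-in `idna` codec, which implements the older IDNA 2003 rules. As far as I can tell it passes pure-ASCII labels through unchanged, so the sketch lower-cases explicitly first; `idna_key` is a made-up helper name.

```python
def idna_key(name):
    """Lower-case, then encode with Python's built-in IDNA 2003 codec."""
    return name.lower().encode("idna")

# Case variants end up with the same key.
print(idna_key("Hello") == idna_key("hello"))

# Non-ASCII labels are turned into an ASCII "xn--" form.
print(idna_key("B\u00fccher"))

# A Cyrillic "o" in place of the Latin one still gives a different key,
# so IDNA alone does not collapse look-alike characters.
print(idna_key("hell\u043e") == idna_key("hello"))
```

So IDNA would settle the encoding and case questions, but look-alike characters from other scripts would still need something like the confusables list on top.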

Yes… and that link above is the definitive list of all Unicode confusables.

Just found a use for GitHub :smiley:
My archive copy is now up at https://github.com/davidpbrown/test/blob/master/UTF8_confusables.tar.gz
