Registry to ban Cyrillic .eu addresses even if you've paid for them

The internet registry operator for Europe's .eu domains will forcibly dump yet more internet addresses, with a decision to kill off any domains in the Cyrillic alphabet. In a note this week – provided only in English – the registry said it will forcibly delete any .eu domains that include Cyrillic letters on May 31, 2019, …

  1. mark l 2 Silver badge

    What a dickish move to pull. If you're a website owner with a Cyrillic .eu domain, it's all well and good saying we are moving you to a new extension, but how does that help your website if all existing links to your current site will 404 in a year because they aren't redirecting the old domains to the new one?

    Plus then there is the financial cost to your business: maybe paying for SEO work to be redone for the new domain, and having to have any stationery, signs and merchandise with the old .eu domain name redone. Also the potential loss of business because of it.

    1. gotes

      Does anyone run any serious sites on these extended character domains, though?

      1. Mark 85

        Define serious?

        1. gotes

          A site with some original content, still maintained, not a scam/malware site, not a mirror. I guess...

          I mean, does anyone open their browser and type in سرچانجن.com (or whatever) instead of typing the western script equivalent?

          1. Tomato42

            I'd say you should have a mirror running on "traditional", Latin-only, name anyway, if only to allow people from outside your little country to actually be able to visit (including expats that don't have access to PC with the correct input method installed)

            1. Anonymous Coward
              Anonymous Coward

              The original Latin alphabet, as devised by the Romans, didn't have a w or u. So strictly speaking a Latin url should not start with www, also url should be spelt vrl.

          2. Mike Shepherd
            Unhappy

            "Does anyone..."

            "...does anyone open their browser and type in سرچانجن.com (or whatever) instead of typing the western script equivalent?"

            Does anyone spend their life writing in those squiggly characters that just look weird to you? Yes they do. There's a whole world of foreign, squiggly-character writers out there, beyond your horizon. It was even there before the Latin alphabet existed. You just don't see it, because you live in your limited version of the internet with your limited character set. Get a life.

            1. big_D Silver badge

              Re: "Does anyone..."

              @Mike Shepherd I think you should read the OP again, or even the bit you quoted. His point was, if you type in the first part of the address in Cyrillic, Arabic, Hebrew, Urdu or whatever, why would you switch to Latin for the suffix (or would that be prefix)?

      2. Doctor Syntax Silver badge

        "Does anyone run any serious sites on these extended character domains, though?"

        Having read the article I'd expect there would be Bulgarian sites or sites aimed at Bulgarian speakers who do. Expansion of the acronym WWW should be a clue as to why there are sites aimed at countries, languages and scripts other.

      3. Anonymous Coward
        Anonymous Coward

        > Does anyone run any serious sites on these extended character domains, though?

        A quick browse of Bulgarian websites suggests that internationalised domains are not very popular, unlike use of Cyrillic in the path.

        Neither the Bulgarian government nor the Bulgarian NIC seem to have an internationalised domain.

        No such luck with some-day-candidates Serbia, Macedonia or Montenegro.

        So I tried the Russians. Turns out their government website sits under https://government.ru/ (nope, no HTTPS). They do have internationalised versions under правительство.рф and правительство-россии.рф ("government" and "Russian government" respectively) but those just redirect to the English-named website.

    2. big_D Silver badge

      I would say that the Cyrillic name belongs on the Cyrillic version of .eu, and the Latin version of the Cyrillic name under the .eu domain.

      As to the European ä, â, å etc. being in the "Latin" .eu domain, that only makes sense, those letters are additions to the English Latin alphabet and they still use e and u, therefore it would be .eu anyway. Greek is harder to defend.

      If there is a Cyrillic version of the .eu domain structure, it makes sense to push the Cyrillic names there.

      For me, it looks like a logical move. At first glance, the only people who should be really upset are scammers using a mix of Cyrillic and Latin to dupe people into visiting "fake" sites.

    3. Anonymous Coward
      Anonymous Coward

      > if all existing links to your current site will 404

      They're hardly likely to 404 if you cannot get past NXDOMAIN.

    4. Gordan

      Not only is it a dick move, the letter ю is from Russian and doesn't exist in some Cyrillic alphabet languages. So it's even more of a dick move because it opens an additional can of worms, e.g. whether there should be a 3rd TLD for eu, this time ".ey" (y is Cyrillic u).

      It sets a terrible precedent for effectively going back to code pages and character sets like they were before UTF-8 was invented by effectively only allowing domains that use the same character set as the extension. What's next, are they going to, say, ban škoda.eu because š isn't ASCII?

  2. Neil Barnes Silver badge

    Here's a thought...

    Let's just go back to ascii numbers instead of all these new fangled utf-8 international letters.

    More seriously - as pointed out this has been a nightmare and a black hat paradise for years. Perhaps a temporary fix might be a browser plugin that does the bells, whistles, and flashing lights thing if it finds something potentially suspicious: a url with two different scripts, or indeed a url not in a language that the browser is currently set to?

    While wholesale nuking of domain names might be a bit draconic, I'm struggling to think of a valid site that does this.

    1. Tomato42

      Re: Here's a thought...

      > a url with two different scripts

      that doesn't work, аррӏе.com is fully in Cyrillic, apple.com is in Latin script

      the whole idea with IDNs is back-asswards, everybody needs to learn Latin script anyway, for languages that use diacritics, losing them is not a huge problem (and I speak two of them)

      for countries that don't use it, it's still actually used in them (like on road signs) because it is so popular and universal

      not to mention languages like Chinese dialects, Japanese and Korean, where you write in Latin script that is then transformed to traditional characters

      it's a cash grab by registrars, plain and simple

      1. onefang

        Re: Here's a thought...

        "аррӏе.com is fully in Cyrillic, apple.com is in Latin script"

        For what it is worth, they got rendered in different fonts in my browser, obviously different. I likely wouldn't be able to spot the difference if they were not side by side though, and I wasn't looking for the difference.

        1. MiguelC Silver badge

          Re: Here's a thought...

          Well, Firefox translates your fully Cyrillic 'аррӏе.com' to the quite different 'https://xn--80ak6aa92e.com/' URL when you click on the link

          That's what's shown on the address bar so you're only fooled into visiting (yes, I know that might be bad enough) but from then on it's on you
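
For anyone who wants to see where that xn-- string comes from, here is a minimal sketch using Python's built-in "punycode" codec. The label is written as escapes so the Cyrillic letters can't be mistaken for Latin ones; the variable names are just for illustration, and real IDNA processing does more (normalisation, validity checks) than this shows.

```python
# The five Cyrillic code points behind the look-alike label:
# a (U+0430), er (U+0440) twice, palochka (U+04CF), ie (U+0435).
label = "\u0430\u0440\u0440\u04cf\u0435"

# The "punycode" codec gives the raw ASCII encoding of the label;
# the ACE form used in DNS is that encoding with "xn--" in front.
ace = "xn--" + label.encode("punycode").decode("ascii")
print(ace)

# Decoding reverses the step and recovers the Cyrillic label.
decoded = ace[len("xn--"):].encode("ascii").decode("punycode")
print(decoded == label)
```

A browser that shows the ACE form in the address bar is displaying exactly this string instead of the look-alike.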

          1. Doctor Syntax Silver badge

            Re: Here's a thought...

            "Well, Firefox translates your fully Cyrillic 'аррӏе.com' to the quite different 'https://xn--80ak6aa92e.com/' URL when you click on the link"

            It doesn't do that here.

          2. Mage Silver badge

            Re: Here's a thought...

            If I paste https://аррӏе.com/ (Cyrillic)

            Hey there!

            This site is obviously not affiliated with Apple, but rather a demonstration of a flaw in the way unicode domains are handled in browsers. It is very possible that your browser isn't affected.

            Check out the complete blog post by Xudong Zheng for more details.

      2. heyrick Silver badge

        Re: Here's a thought...

        that doesn't work, аррӏе.com is fully in Cyrillic, apple.com is in Latin script

        Would it not be simply better to devise a translation table to trap and refuse to register names in Cyrillic that "look like" anything in Latin?

        1. Anonymous Coward
          Anonymous Coward

          Re: Here's a thought...

          For fairness and symmetry, should they also refuse to register names in Latin that "look like" anything in Cyrillic? :-)

        2. stephanh

          Re: Here's a thought...

          "Would it not be simply better to devise a translation table to trap and refuse to register names in Cyrillic that "look like" anything in Latin?"

          The Unicode consortium maintains tables of characters which are "confusable", and an algorithm to check if two strings are confusable.

          https://unicode.org/cldr/utility/confusables.jsp

          Arguably, the way URLs work could be changed to declare that two URLs which are "confusable" are actually the same. This could be done by normalizing to a single representation under the hood (Unicode calls this the "skeleton").

          Current proposed solutions are all about somehow fixing this in the client by some heuristics which should make the browser complain in certain cases. Frankly I don't think that scales.
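
A rough sketch of the "skeleton" idea, assuming a tiny hand-picked look-alike table (the LOOKALIKES dict below is made up for illustration and is nowhere near the real Unicode confusables data). Each label is decomposed, look-alike characters are mapped to one representative, and two labels count as confusable when their skeletons match.

```python
import unicodedata

# Hand-picked illustration only: a few Cyrillic letters that resemble
# Latin ones. The real Unicode data covers far more characters.
LOOKALIKES = {
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0445": "x",  # Cyrillic ha
    "\u04cf": "l",  # Cyrillic palochka
}


def skeleton(label: str) -> str:
    """Decompose, map look-alikes to a representative, decompose again."""
    decomposed = unicodedata.normalize("NFD", label)
    mapped = "".join(LOOKALIKES.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)


def confusable(a: str, b: str) -> bool:
    """Two labels are treated as confusable if their skeletons are equal."""
    return skeleton(a) == skeleton(b)


cyrillic_apple = "\u0430\u0440\u0440\u04cf\u0435"
print(confusable(cyrillic_apple, "apple"))
print(confusable("apple", "apply"))
```

A registry could run the same comparison at registration time against names it already holds, which is the server-side alternative to per-browser heuristics.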

      3. the spectacularly refined chap

        Re: Here's a thought...

        the whole idea with IDNs is back-asswards, everybody needs to learn Latin script anyway, for languages that use diacritics, losing them is not a huge problem (and I speak two of them)

        I take it that when you go abroad you simply talk a little bit slower and a little bit louder? The world does not revolve around you and people don't need to learn a foreign alphabet simply because it fits in with your world view.

    2. Doctor Syntax Silver badge

      Re: Here's a thought...

      "Perhaps a temporary fix might be a browser plugin that does the bells, whistles, and flashing lights thing if it finds something potentially suspicious: a url with two different scripts, or indeed a url not in a language that the browser is currently set to?"

      I was thinking along the same lines except to make it core browser functionality to highlight any mixed scripts in URLs.
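
A mixed-script check is easy enough to sketch. The version below guesses each letter's script from the first word of its Unicode character name, which is a crude stand-in for the real Unicode Script property; the function names are made up for illustration. It also shows the limit raised earlier in the thread: a label written entirely in Cyrillic look-alikes is not mixed, so this check alone does not flag it.

```python
import unicodedata


def scripts(label: str) -> set[str]:
    """Rough script guess: first word of each letter's Unicode name."""
    found = set()
    for ch in label:
        if ch.isalpha():  # skip digits and hyphens
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def is_mixed(label: str) -> bool:
    """True when the letters of a label come from more than one script."""
    return len(scripts(label)) > 1


print(scripts("apple"))                          # all Latin
print(is_mixed("\u0430pple"))                    # Cyrillic a + Latin rest
print(is_mixed("\u0430\u0440\u0440\u04cf\u0435"))  # all Cyrillic look-alikes
```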

    3. Anonymous Coward
      Anonymous Coward

      Re: Here's a thought...

      > Perhaps a temporary fix might be a browser plugin

      Don't have anything else handy at the moment, but Firefox shows internationalised domain names in punycode. Which obviously is hardly going to help their popularity. :-)

    4. LeahroyNake

      Re: Here's a thought...

      We could also go back to typing in IPV4 addresses, we could just remember them like we all used to do with phone numbers .....

      Oh IPV6 that's a very long phone number :o

      1. heyrick Silver badge

        Re: Here's a thought...

        "We could also go back to typing in IPV4 addresses"

        Won't work.

        One IPv4 address resolves to God knows how many web sites. It's the name in the host header that lets the server tell them apart.
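
To make the Host header point concrete, here is a toy sketch of name-based virtual hosting. The site names and the route() function are hypothetical; a real web server does far more, but the dispatch idea is the same: same IP, different answer depending on the Host line.

```python
# Hypothetical sites that all live behind one IP address.
SITES = {
    "shop.example": "shop front page",
    "blog.example": "blog front page",
}


def route(raw_request: bytes) -> str:
    """Pick a site by the Host header of a raw HTTP/1.1 request."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("ascii")
    # Skip the request line, then scan the header lines.
    for line in head.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower().split(":")[0]  # drop any port
            return SITES.get(host, "unknown host")
    return "no Host header"


print(route(b"GET / HTTP/1.1\r\nHost: blog.example\r\n\r\n"))
print(route(b"GET / HTTP/1.1\r\nHost: shop.example:8080\r\n\r\n"))
print(route(b"GET / HTTP/1.0\r\n\r\n"))
```

Type in the bare IP address and the server gets no usable name, so it can only fall back to a default site.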

        1. Not an Anonymous Coward

          Re: Here's a thought...

          What about giving every website a unique IPv4 address?

          1. onefang

            Re: Here's a thought...

            If only there was enough to go around.

  3. Joe Werner Silver badge
    WTF?

    ACII stupid question and you get a stupid ANSI...

    To be honest I never liked the idea of any special characters beyond 7 bit ASCII (the printable chars, excluding whitespaces[1]). Yes, inconvenient in some languages I am familiar with, as are the missing accents. No, not a deal breaker. The inclusion of other alphabets is mostly a way for the registrars to make more money.... and then there are the homograph attacks which are indeed super nasty and really hard - or impossible in some cases - to spot. So (to stay with the example in the article) Apple would have to register the domain with the Cyrillic a, but also the i-without-the-dot-replacing-the-l (a character commonly used in e.g. Turkish).

    (No, I do not feel terribly tolerant this fine morning, sorry)

    (title is, I think, from The Nathack Song?)

    [1] and that applies to file names as well! Especially whitespace... they have no place in a file or a path. I do not suggest going back to 8+3, but some things[2] made life considerably more difficult.

    [2] do not get me started on the decimal character, some idiot companies use your system locale to define that when exporting data, and then they build a f'ing csv, a _comma_ separated table, with the _comma_ as the decimal character.
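
For what it's worth, the decimal-comma problem in [2] is survivable if the exporter quotes properly or picks another delimiter. A small sketch with Python's csv module, using made-up rows and a hypothetical dump() helper:

```python
import csv
import io

# Made-up data with a decimal comma in the price column.
rows = [["item", "price"], ["coffee", "3,50"]]


def dump(data, delimiter):
    """Write rows to a string with the given field delimiter."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter).writerows(data)
    return buf.getvalue()


# With a comma delimiter the writer quotes the field that contains a comma.
comma_text = dump(rows, ",")
# With a semicolon delimiter no quoting is needed.
semicolon_text = dump(rows, ";")

print(comma_text)
print(semicolon_text)

# Either way, a proper CSV reader gets the original fields back.
round_trip = list(csv.reader(io.StringIO(comma_text)))
print(round_trip == rows)
```

The broken exports are usually the ones built by joining strings by hand, with no quoting at all.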

    1. Ken Hagan Gold badge

      Re: ACII stupid question and you get a stupid ANSI...

      "Yes, inconvenient in some languages I am familiar with, as are the missing accents. No, not a deal breaker"

      I think you'll find that for Joe Public, not being able to write your own name "properly" is a deal breaker. This really can't be handled with a "Let's all be reasonable and do it the American way" attitude.

      1. Mage Silver badge

        Re: ACII stupid question and you get a stupid ANSI...

        In many languages, a missing accent on Latin alphabet is a different word.

        It was stupid that Computers only suited the USA for so long. Even today a pure USA keyboard is a key short compared to ANY European/English keyboard. The |\ beside Z is missing. It might be at the ~# location beside enter, then £ is replaced with # and maybe ¬ with ~, I forget.

        I really like the Linux Unix keyboard with AltGr on almost every key instead of just áéíóú (not even on USA Windows!), also mapping Compose to CapsLock means the Character Map tool is hardly needed. That's just Latin/Roman alphabet.

        For so many years Arabic, Hebrew, Greek, Cyrillic and the common Asian ones were a nightmare. Crazy that AFTER it was all sorted, the Kindles for years only did Roman/Latin with accents and a few Greek letters.

        The US software & HW industry adds the rest of the world, even non-US English as an afterthought.

        1. jabuzz

          Re: ACII stupid question and you get a stupid ANSI...

          Screw AltGr (especially as the cleaners seem to have made off with it on my Model M at work yesterday) and give me a compose key any day of the week. Well OK I have Scroll Lock mapped to compose as it's a pointless key in 2018. However AltGr is a horrible abomination that needs to die die die.

        2. HarryBl

          Re: ACII stupid question and you get a stupid ANSI...

          In Thai the word 'mah' can mean one of 5 completely different things depending on the tone it's pronounced with. The Thais have 5 different words but English is stuck with 'mah'.

        3. Stork Silver badge

          Re: ACII stupid question and you get a stupid ANSI...

          Getting the Danish (and Norwegian) character "ø" was a pain for a long time, printers in the early 90es often gave you a "¢".

          But you learn the shortcuts. In UK I knew the DK ones (ØøÆæÅå), and at the moment I know those + the German ones on a PT keyboard.

          I wonder how you do Japanese or Chinese?

          1. onefang

            Re: ACII stupid question and you get a stupid ANSI...

            "I wonder how you do Japanese or Chinese?"

            I believe that's what things called "input methods" were invented for. Software shims between your keyboard and applications that allow typing with those sorts of character sets. Some or all of that may be entirely bogus, I've not dealt with them myself, only noticed as others on the development team coded for them. So I can't even enlighten you on how they actually work.

          2. heyrick Silver badge

            Re: ACII stupid question and you get a stupid ANSI...

            "Getting the Danish (and Norwegian) character "ø" was a pain for a long time, printers in the early 90es often gave you a "¢"."

            Oh, we Brits doing computer stuff in the eighties will be able to tell you plenty of stories about the trouble of getting a printer to give a '£' instead of a '#'. Sort of important to us, it being the British currency and all, but your bog standard fx80 clone would demand a blood sacrifice. Get that wrong, or try to reason with it, it'll print pages of # symbols.

            Of course, it sometimes blew up in return. Who remembers having to mentally fix software listings in magazines where you'd see stuff like "PRINT£X" instead of "PRINT#X"?

            "I wonder how you do Japanese or Chinese?"

            Dunno about Chinese. For Japanese, there are two ways, basically similar. One is you type the sounds in using the Latin alphabet, and it will change on the fly. For instance one would type "ko" and you'd see "こ". Do it again to see "ここ". Then add a "ro" to arrive at "こころ" which would magically change to "心". It's actually insanely more complicated than that, but that's the gist of it. The second method is to write in kana directly, essentially like こ こ ろ without the Latin part...

            (it means ❤, by the way)
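
The gist of that first method can be sketched as two lookups: Latin syllables to kana, then kana to kanji. The tables below hold only the entries from the example above and are purely illustrative; a real input method handles ambiguity, candidate lists and a full dictionary.

```python
# Toy tables containing only the example's entries.
KANA = {"ko": "こ", "ro": "ろ"}
KANJI = {"こころ": "心"}


def to_kana(romaji: str) -> str:
    """Convert two-letter Latin syllables to hiragana, left to right."""
    out = []
    for i in range(0, len(romaji), 2):
        chunk = romaji[i:i + 2]
        if chunk not in KANA:
            raise ValueError(f"no kana for {chunk!r} in this toy table")
        out.append(KANA[chunk])
    return "".join(out)


def convert(romaji: str) -> str:
    """Kana first, then swap in a kanji if the toy dictionary has one."""
    kana = to_kana(romaji)
    return KANJI.get(kana, kana)


print(to_kana("ko"))
print(to_kana("koko"))
print(convert("kokoro"))
```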

    2. Anonymous Coward
      Anonymous Coward

      Re: ACII stupid question and you get a stupid ANSI...

      > To be honest I never liked the idea of any special characters beyond 7 bit ASCII

      Bah! If we're going to go there, my proposal is to scrap DNS altogether and revert to IP addresses. Version 6, obviously.

      Ok, so I am joking but come to think of it, telephone numbers are digits-only to this day.

  4. Anonymous Coward
    Anonymous Coward

    A typically political decision...

    "EURid claims that the forced deletion of existing domains is needed in order to protect users against what are called "homograph attacks" where domain names are registered with non-Latin scripts to make them look like all-Latin domains e.g. a Cyrillic "a" is used in place of a Latin "a" to make people think they are on the website of, say, "apple.com.""

    What is the amount of EU domains which actually try doing this? And also: how does that amount compare to the full extent of EU domains?

    It's a very typical way in which modern politicians react: don't hold the offenders accountable for their own actions. Nah, just try to forbid the whole thing and be sure to demand heavy fines if people don't comply. Whatever happened to the democratic notion of innocence until proven guilty?

    I see a very dangerous trend happening because the only reason a ruling like this gets pushed is because it's easier on the politicians, not because it's the best they can do. And worse: many innocent individuals get to suffer from this because... reasons.

    Another issue is that there's no need anymore. Modern browsers have already adapted so that they'll display those characters for what they are, you won't see "apple" anymore. That was pretty much a unanimous decision, so once again the politicians are acting after the facts.

    And is it just me or is it plain out arrogant (and bizarre) not allowing people outside of the EU to register and host a .eu domain? I mean.. I live in Holland and I can easily get a .de domain, or a .jp domain and even a .us domain if I want to. Where's the problem with that?

    1. Guus Leeuw

      Re: A typically political decision...

      Dear Sir,

      "I live in Holland..." that, according to this Twentenaar, is the problem!

      Best regards,

      Guus

  5. Voland's right hand Silver badge

    They did not even get that one right

    It should not be ЕЮ. It should be ЕС.

    Европейски Съюз - that is how it is written in the only (for now) EU language to use Cyrillic - Bulgarian. It is nearly identical +/- a letter in the language of the other two Cyrillic using candidate countries (Serbia and Montenegro).

    1. gkodinov

      Re: They did not even get that one right

      Except that .ec (in latin) is the TLD for Equador. This was the grounds they've refused registering .бг on !

      1. onefang

        Re: They did not even get that one right

        "Except that .ec (in latin) is the TLD for Equador. This was the grounds they've refused registering .бг on !"

        That reminds me, in the source code for the Sad Life viewer (Second Life for those that have not been reading El Reg for long enough), they use the country code SL to refer to their own timezone (otherwise known as Pacific Time, or PT, coz that's where their headquarters are). While I have yet to hear any residents of Sierra Leone complaining, I don't think many have noticed.

      2. GIRZiM

        Re: the TLD for Equador

        Excellent!

        I was reading this thread, thinking "I'm gonna have to go to some serious effort to shoehorn this in - it's not gonna be easy."

        But, to my utter astonishment, here we are!

        Why was I seeking a way to shoehorn in a link to it?

        Wait and see, but it's gonna be huge I tell you ... huge!

      3. Robert Carnegie Silver badge

        Pot, kettle

        How do you get to "Equador" from .ec ? ;-)

        In English the country is "Republic of Ecuador".

        In Portuguese it is "Equador".

        The country is Spanish speaking, however. So, "Ecuador".

        The Municipality of Equador is in Brazil, which is almost all that Wikipedia (en) has to say about it.

        1. Nick Kew
          Pint

          Re: Pot, kettle

          which is almost all that Wikipedia (en) has to say about it.

          Brilliant. An update of "Mostly Harmless".

    2. frobnicate

      Re: They did not even get that one right

      It all makes perfect sense, once you realise that "ею" is instrumentalis singularis of "she"—that's what they are doing to Europe.

    3. John Brown (no body) Silver badge
      Joke

      Re: They did not even get that one right

      "It should not be ЕЮ."

      I must admit to having initially read that as .EIO and wonder if Old MacDonald, the Agriculture MEP, was behind this.

  6. sitta_europea Silver badge

    I think if you join the EU you should have to make English your official language. :)

    1. gkodinov
      Thumb Down

      That's what the Pope told St. Cyril in 867 AD too.

      But then recoiled and later St. Cyril was declared "a patron of Europe" (in 1980 AD) !

      1. Voland's right hand Silver badge

        That's what the Pope told St. Cyril in 867 AD too.

        Pope or the bishop of Prague? I do not think the two brothers who bestowed onto half of Europe the "slightly vandalized Greek alphabet" got as far as a trial before the pope. They were convicted of heresy by the bishop of Prague, the pope just countersigned the verdict.

        In any case, my history is refusing to load from permanent storage this morning :) Need more coffee.

  7. J.G.Harston Silver badge

    It's as though the EU doesn't believe in property rights. You pay for something, you own it, and at a whim the government can take it off you.

    1. Anonymous Coward
      Anonymous Coward

      > It's as though the EU doesn't believe in property rights. You pay for something, you own it, and at a whim the government can take it off you.

      Have you seen the present france.com domain dickery?

      1. Mark Spooner

        The EU very much believes in its right to all of what you thought was your property.

        1. Nick Ryan Silver badge

          The EU very much believes in its right to all of what you thought was your property.

          Name a national government that doesn't?

    2. Orv Silver badge

      I think in any jurisdiction, you don't own a domain name so much as rent it.

    3. John Brown (no body) Silver badge

      "It's as though the EU doesn't believe in property rights. You pay for something, you own it, and at a whim the government can take it off you."

      Hmm, yes. In the USA it's rather appropriately known as "eminent domain".

      1. Stork Silver badge

        and the current US president is a key supporter of eminent domain

  8. TimMaher Silver badge
    Coat

    Old MacDonald

    Had a farm.

    Eu Eio.

    I’ll get my coat.

  9. trisul

    Not really

    "plus special characters from a range of European languages, such as the German ü, the Romanian ș, and the Swedish å."

    Not really, as German, Romanian and Swedish include all the latin characters. So, there is no inconsistency there, only with Bulgarian and Greek.

  10. Mage Silver badge

    EU Xenophobic?

    Not as much as the one country that's voted to leave. Plank & splinter. Kettle vs entire kitchen.

  11. onefang

    'Would it not be simply better to devise a translation table to trap and refuse to register names in Cyrillic that "look like" anything in Latin?'

    Normally I'd say that's a bad idea, some purely innocent things might trigger that. Flag it for a human to check, sure, not flat out "computer says no". On the other hand, it's a step up from what they did do, so up voted anyway.

  12. Andy Mac
    Unhappy

    EURid

    Ne’er was there a more appropriate name.

  13. jabuzz

    Bulgaria

    One imagines that they are shortly going to find out that what they are proposing is in fact illegal, with a Cyrillic alphabet country being a full EU member state (and a couple more as prospective members), and will be shortly reversing their decision. If they don't they will likely find themselves in front of the ECJ and losing very badly. Frankly those who proposed this and all those who signed off on it need identifying and sacking.

  14. EURid vzw

    EURid response

    This action is in line with what we communicated when the .ею extension launched on 1 June 2016. The .ею clones also had a free 3 year term. This move follows the industry best practice of 'no script mixing' IDNs, which is an important element of online security.

  15. Camilla Smythe

    You can all Fuck Off and...

    Use Emojis.YourFlagEmoji

    Sorted.

  16. N2

    why not ban

    .bid, .date, .shite, .tools, .etc .as .some .fucking .wassock .gave .these .away .to .all .the .fekking .spammers

    .knuts

  17. DCFusor

    Not saying this is good, I'm saying it IS.

    You didn't get what you didn't pay for. Late adoption has consequences too.

    Not saying it's good, but it is what IS.

    Imagine the nightmare of having umpty different languages' words to mean "printf()" - or any other library function name for any computer language - it's not all just about the end user interface, someone has to write the code. Easy to whine...but no one spent the money early on, just cruising on the work done in the US. MBAs everywhere.

    Someone once chanted "developers, developers, developers" and that was in a case where there was no native language barrier - and it was hard anyway.

    1. Strangelove

      Re Printf() - but not all programming languages are English based

      BASICOIS has French keywords for example (and is quite like basic) and there are several cyrillic languages that I know not what they do.

  18. T-man

    Oh yeah! Instead of prohibiting registration of domain names that contain a mix of Cyrillic and Latin chars to address the concern of possible "homograph attack", let’s delete all Cyrillic domains - another triumph of bureaucracy in Brussels...

  19. Anonymous Coward
    Anonymous Coward

    Stick with .uk domains

    Only alphanumerics and hyphens permitted (although hyphens shouldn't be allowed IMHO). Non-uk citizens/organisations permitted, yes even those in the EU after BREXIT!

    Can't even have café.co.uk

    Problem solved (almost: a few possible ambiguities like lower case L and number 1 one).
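
The "alphanumerics and hyphens only" rule is simple to express. This is a sketch of the general letters-digits-hyphen label rule (1 to 63 characters, no leading or trailing hyphen); whether any particular registry applies exactly this pattern is an assumption, and is_ldh() is just an illustrative name.

```python
import re

# Letters, digits and hyphens only; must start and end with a letter
# or digit; at most 63 characters in total.
LDH_LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")


def is_ldh(label: str) -> bool:
    """True if the label uses only ASCII letters, digits and hyphens."""
    return LDH_LABEL.fullmatch(label.lower()) is not None


print(is_ldh("cafe"))
print(is_ldh("caf\u00e9"))   # e with acute accent is rejected
print(is_ldh("my-site1"))
print(is_ldh("-oops"))
```

As noted above, this does nothing about l versus 1: both pass the check.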
