Category: Programming

  • Fast ads matter blogpost on web.dev


    While few people love ads, most people don’t mind them and understand that they’re one way to keep the internet open and accessible to as many people as possible. It’s hard to argue against ads that load fast, respect your privacy and don’t interrupt your browsing experience.

    There are a couple of ways to finance content you publish online. Popular choices include display ads and paywalls. As a user, I usually prefer ads over paywalls, as long as the ads don’t disturb my overall experience.

    This is just a short note linking to a blog article that I wrote together with my colleagues about why, and to some extent how, you can make display ads on your website faster.

    Read it here: Fast ads matter

  • Filenames with non-ascii letters


    Let’s start off with a quick question!
    Can you spot the difference between the two rows below?

    /images/räksmörgås.jpg
    /images/räksmörgås.jpg

    I couldn’t.

    My browser, however, insisted that there was no “räksmörgås.jpg” on the web server, even though the file was clearly there from my point of view.

    Since the error only occurred with filenames containing the letters å, ä and ö, I at first suspected a mix-up between UTF-8 and ISO-8859-1. However, this wasn’t the case.

    My next course of action was to URL-encode both the requested filename and the filename from the server, and this is when I found something interesting!

    ra%CC%88ksmo%CC%88rga%CC%8As
    r%C3%A4ksm%C3%B6rg%C3%A5s

    Now you see the difference, right?
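
    The same comparison can be reproduced in a few lines. This is a minimal sketch in Python (not the tooling used above); the variable names are illustrative, and the word is written with escape sequences so that an editor cannot silently normalize it.

```python
import unicodedata
from urllib.parse import quote

# "räksmörgås" written with precomposed letters (one code point per letter).
composed = "r\u00e4ksm\u00f6rg\u00e5s"
# The same word with each letter split into a base letter plus a combining mark.
decomposed = unicodedata.normalize("NFD", composed)

print(quote(decomposed))       # ra%CC%88ksmo%CC%88rga%CC%8As
print(quote(composed))         # r%C3%A4ksm%C3%B6rg%C3%A5s
print(composed == decomposed)  # False, although both render identically
```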

    The reason behind the difference is that there are multiple ways to represent the common Swedish letters å, ä and ö (and other non-ASCII letters as well, but for readability, let’s keep it short).

    Here are the char codes (decimal Unicode code points) for the three letters that were causing trouble in my case:

    Letter | Mac OSX   | Linux
    -------+-----------+------
    å      | 97 + 778  | 229
    ä      | 97 + 776  | 228
    ö      | 111 + 776 | 246
    

    Notice the pattern?
    The Mac uses “a” (97) or “o” (111) and then adds a combining circle (778) or combining dots (776). Linux, however, uses a single, entirely different character.
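
    The char codes can be checked directly. The sketch below uses Python's standard `unicodedata` module; `code_points` is a hypothetical helper name introduced only for this illustration.

```python
import unicodedata

def code_points(text, form):
    """Return the decimal code points of text after normalizing it to form."""
    return [ord(ch) for ch in unicodedata.normalize(form, text)]

# å, ä and ö, written as escapes so the source file itself stays ASCII.
for letter in ("\u00e5", "\u00e4", "\u00f6"):
    print(letter, code_points(letter, "NFD"), code_points(letter, "NFC"))
```

    For å this prints the pair 97 and 778 for the decomposed form, and the single value 229 for the composed form.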

    Unicode defines several normalization forms for representing characters. The two that clashed here are “Canonical Decomposition” (NFD) and “Canonical Composition” (NFC), and I needed to convert between the two.
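
    Converting between the two forms also makes it possible to compare names regardless of how they were stored. A minimal sketch in Python, where `same_filename` is a hypothetical helper and not part of any library:

```python
import unicodedata

def same_filename(a, b):
    """Compare two names after converting both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

nfc_name = "r\u00e4ksm\u00f6rg\u00e5s.jpg"          # composed form
nfd_name = unicodedata.normalize("NFD", nfc_name)  # decomposed form

print(nfc_name == nfd_name)               # False: different code points
print(same_filename(nfc_name, nfd_name))  # True: equal once normalized
print(len(nfc_name), len(nfd_name))       # 14 17
```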

    My solution

    I had this error on a server where files had been stored on a Mac and then re-uploaded to a Linux server. I didn’t have shell access to the server, so I fixed it with the following PHP code, which loops through all affected files and updates their names:

    <?php
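    // Normalizes all filenames in the current folder to NFC.
    // Requires the intl extension, which provides the Normalizer class.
    // GLOB_MARK (the value 2 on Linux/glibc) appends a slash to directory names.
    foreach (glob("*", GLOB_MARK) as $file) {
      $after = Normalizer::normalize($file, Normalizer::FORM_C);
      // normalize() returns false on failure, so skip those entries.
      if ($after !== false && $file !== $after) {
        rename($file, $after);
      }
    }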
    

    You could probably use iconv or similar tools to achieve the same thing more easily if you’ve got shell access (or if PHP’s exec is enabled).
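
    With shell access, a short script is another option. The following is a sketch in Python rather than PHP, assuming a Linux filesystem that stores names as given. `normalize_filenames` is a hypothetical helper, and it skips a file instead of overwriting one that already has the NFC name.

```python
import os
import unicodedata

def normalize_filenames(directory="."):
    """Rename every entry in directory whose name is not already NFC.

    Returns a list of (old_name, new_name) pairs that were renamed.
    """
    renamed = []
    for name in os.listdir(directory):
        target = unicodedata.normalize("NFC", name)
        if target == name:
            continue  # already normalized
        old_path = os.path.join(directory, name)
        new_path = os.path.join(directory, target)
        if os.path.exists(new_path):
            continue  # never overwrite an existing file
        os.rename(old_path, new_path)
        renamed.append((name, target))
    return renamed
```

    Calling `normalize_filenames()` from the affected folder renames the files in place. Filesystems that normalize names themselves may behave differently, so try it on a copy first.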

    Fun fact: Räksmörgås is a Swedish word commonly used for testing that the non-ASCII letters Å, Ä and Ö work correctly.