tag:blogger.com,1999:blog-74231097719804102732024-03-05T12:12:33.011-08:00xecretsSvantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.comBlogger40125tag:blogger.com,1999:blog-7423109771980410273.post-33402550118834546692021-05-10T10:50:00.005-07:002023-09-21T08:34:41.917-07:00ConcurrentDictionary.GetOrAdd() may not work as you think!<h2 style="text-align: left;">It's concurrent - not lazy</h2><div>We had a problem with random crashes in a customer's web application. It was not catastrophic, since the site would recover on its own, but it was still there in the logs, and we want every page visit to work.</div><div><br /></div><div>A bit of investigation with an IL-code decompiler followed, which by the way is absolutely the best thing since sliced bread! I found code equivalent to the following in a third-party vendor's product:</div><pre><code>private readonly ConcurrentDictionary<string, Type> _dictionary
    = new ConcurrentDictionary<string, Type>();
private readonly object _lock = new object();
public Type GetType(string typename)
{
return _dictionary.GetOrAdd(typename,
(t) =>
{
lock (_lock)
{
return DynamicallyGenerateType(t);
}
});
}
</code></pre><div>The thing here is that <span style="font-family: courier;">DynamicallyGenerateType()</span> can only be called once per typename, since what it does is emit code into an assembly, and if you do that twice you get a <span style="font-family: courier;">System.ArgumentException: Duplicate type name within an assembly</span>.</div><div><br /></div><div>No-one wants that, so the author thought that it would be cool to use a <span style="font-family: courier;">ConcurrentDictionary<TKey, TValue> </span>since the <span style="font-family: courier;">GetOrAdd()</span> method guarantees that it will get an existing value from the dictionary, or add a new one using the provided value factory and then return the value.</div><div><br /></div><div>It looks good, reasonable, and works almost all of the time. The key word here is: almost.</div><div><br /></div><div>What the concurrent dictionary does is use efficient and light-weight locking to ensure that the dictionary can be concurrently accessed and updated in a consistent and thread safe manner.</div><div><br /></div><div>It does not guarantee a single one-time lazy call to the value factory used to add a value if it's not in the dictionary.</div><div><br /></div><div>Sometimes, under heavy initial load, the value factory passed as the second argument to <span style="font-family: courier;">GetOrAdd()</span> will be called twice (or more). What the concurrent dictionary guarantees is that the value for the provided key will only be set once, but the value factory may be called multiple times with the result thrown away for all calls except the race-winning one!</div><div><br /></div><div>This is clearly <a href="https://docs.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentdictionary-2.getoradd" target="_blank">documented</a> but it's easy to miss. This is not a case of the implementation not being thread safe, as is stated in some places. It's very thread safe. But the value factory may indeed be called multiple times!</div><div><br /></div><div>To fix it, add a Lazy-layer on top, because <span style="font-family: courier;">Lazy<T></span> by default does guarantee that its value factory is only called once!</div><pre><code>private readonly ConcurrentDictionary<string, Lazy<Type>> _dictionary
= new ConcurrentDictionary<string, Lazy<Type>>();
private readonly object _lock = new object();
public Type AddType(string typename)
{
return _dictionary.GetOrAdd(typename,
(t) =>
{
lock (_lock)
{
return DynamicallyGenerateTypeLazy(t);
}
}).Value;
}
</code></pre><div><br /></div><div>Now, although we may instantiate more than one <span style="font-family: courier;">Lazy<Type></span> instance, and then throw it away if we lose the <span style="font-family: courier;">GetOrAdd</span>-race, that's a minor problem and it works as it should.</div><div><br /></div><div>Please note that this is only true as long as you use the default <span style="font-family: courier;">LazyThreadSafetyMode.ExecutionAndPublication</span> mode.</div><div><br /></div><div>The additional <span style="font-family: courier;">lock</span> may look confusing, but it was in the original code and makes sense in this context, because while the concurrent dictionary and lazy layer guarantee that only one call per value of '<span style="font-family: courier;">typename</span>' is made to <span style="font-family: courier;">DynamicallyGenerateTypeLazy()</span>, they do not guarantee that multiple threads do not call it concurrently with different type names, and this may wreak havoc with the shared assembly that the code is generated to.</div><div><br /></div>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-46020034173270814692021-05-04T12:33:00.003-07:002021-05-05T01:14:48.231-07:00SSH with TortoiseGit and Bitbucket or GitHub on Windows 10<h2 style="text-align: left;">Memo to self</h2><p>It's always complicated to remember the steps necessary to get SSH working, and there are some idiosyncrasies as well. This guide may help you, I'm sure it'll help me the next time I need to do this myself.</p><p>Password-based login with HTTPS is starting to be obsolete, and it's less secure. Also, with the nice SSH agent in Windows 10, you only need to enter the password once - ever.</p><h3 style="text-align: left;">Generate a key pair</h3><p>Open a command prompt and run the ssh-keygen command to generate a private and a public key file. Accept the defaults.</p><p>Enter a strong password for the private key file when asked, and ensure that you store it securely in your password manager.</p><p>This should create files in <span style="font-family: courier;">%USERPROFILE%\.ssh</span> named <span style="font-family: courier;">id_rsa</span> (the private key file) and <span style="font-family: courier;">id_rsa.pub</span><span style="font-family: inherit;"> (the public key file).</span></p><p><span style="font-family: inherit;"></span></p><div class="separator" style="clear: both; text-align: center;"><span style="font-family: inherit;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjus0ZSbWrgRkwPGv2FaVPyH_Js1sSzHSvxcuUJKj4_TZW1sfByfwy-5xy-zIXOQnn-VTx-U-tJSKYtCaaR_CIl4QnZwrNjs07g4c23kaCavbnlOIg0SEesP0zZGz-eq-7CzwIHu_gd3ZGl/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="406" data-original-width="549" height="237" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjus0ZSbWrgRkwPGv2FaVPyH_Js1sSzHSvxcuUJKj4_TZW1sfByfwy-5xy-zIXOQnn-VTx-U-tJSKYtCaaR_CIl4QnZwrNjs07g4c23kaCavbnlOIg0SEesP0zZGz-eq-7CzwIHu_gd3ZGl/" width="320" /></a></span></div><h3><br /></h3><h3>Enable and start the OpenSSH Authentication Agent Service</h3><h3><div style="font-size: medium; font-weight: 400;">Nowadays it is shipped with Windows 10, but it's not enabled by default. 
So start your Services gadget and ensure the service is set to start up automatically and that it's running.</div><div style="font-size: medium; font-weight: 400;"><br /></div><div style="font-size: medium; font-weight: 400;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAnpcNcl2y4vcqOUYNZ0XhZGzE-x4vOakLMcVCUj7g9OCwQ3zhuOQN979lMEgj3NWpd8pcMoi14grNLFsfOb4zjue3knaNHqq-zft5EhfuhIeHf_ZzysgNhECYi_V8kVMMS_jjI6hFqmDa/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="581" data-original-width="699" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAnpcNcl2y4vcqOUYNZ0XhZGzE-x4vOakLMcVCUj7g9OCwQ3zhuOQN979lMEgj3NWpd8pcMoi14grNLFsfOb4zjue3knaNHqq-zft5EhfuhIeHf_ZzysgNhECYi_V8kVMMS_jjI6hFqmDa/" width="289" /></a></div></div></h3><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;">Add the private key to the SSH Authentication Agent</h3><div>In the command prompt, type <span style="font-family: courier;">ssh-add</span><span style="font-family: inherit;">. It should select the default ssh key </span><span style="font-family: courier;">id_rsa</span><span style="font-family: inherit;">, and ask for the password you entered previously.</span></div><div><br /></div><div>(If you get the error message "<span style="font-family: courier;">Can't add keys to ssh-agent, communication with agent failed</span>", there seems to be an issue with certain Windows distributions. For whatever reason, the following workaround appears to work. Open a new command prompt but elevated with Run As Administrator. Then type:<br /><br /><span style="font-family: courier;"><span> </span>sc.exe create sshd binPath=C:\Windows\System32\OpenSSH\ssh.exe</span> .<br /><br />Then exit the elevated command prompt and try again to do the <span style="font-family: courier;">ssh-add</span> in your normal command prompt.)</div><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;">Save the public key to Bitbucket...</h3><div>Open the file <span style="font-family: courier;">%USERPROFILE%\.ssh\id_rsa.pub</span> in Notepad, Select All (Ctrl-A) and Copy (Ctrl-C). Paste it into this dialog in Bitbucket, Your Profile and Settings -> Personal Settings -> SSH keys -> Add key:</div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVhAdb5avPiT-04dqerZniRWIEQgrR2P4NF2pQI4EvUCGvwleKSYZTr_FOpvHRZzSQUfW0p9ZbuQ9qAzN6TIVkLCvcG2WwMCWNQo2RCJejHWVuOBIzbJUY0Y2IeUrd0j4eVs6QN5lcD1Hv/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="669" data-original-width="1026" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVhAdb5avPiT-04dqerZniRWIEQgrR2P4NF2pQI4EvUCGvwleKSYZTr_FOpvHRZzSQUfW0p9ZbuQ9qAzN6TIVkLCvcG2WwMCWNQo2RCJejHWVuOBIzbJUY0Y2IeUrd0j4eVs6QN5lcD1Hv/" width="320" /></a></div><br /><br /></div><div>The Label is just anything that makes it easy for you to remember what key it is. Perhaps today's date and the name of the computer you have the private key on can be a good start. 
Or just "My public SSH key" works too.</div><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;">...and/or save the public key to GitHub</h3><div>Go to Settings -> SSH keys -> New SSH key<br /><br /></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAzk3r3J26MbVPAKlzgMxEXCFpdj-U7gJEz371Lf0YRzV95BF-m8ZF8PcleXPg6aVlKrqxNrnHIYoglMrGGY2tw0mM7tkk8v-ucd4KQfqVTqk7K1Act9IFthkiouBx6BFHaN0urkC9nxzv/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="382" data-original-width="921" height="133" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAzk3r3J26MbVPAKlzgMxEXCFpdj-U7gJEz371Lf0YRzV95BF-m8ZF8PcleXPg6aVlKrqxNrnHIYoglMrGGY2tw0mM7tkk8v-ucd4KQfqVTqk7K1Act9IFthkiouBx6BFHaN0urkC9nxzv/" width="320" /></a></div></div><div style="text-align: left;"><br />The Title has the same meaning as Label for Bitbucket, see above.</div><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;">Remove any Credential Helpers</h3><div>Git credential helpers may conflict with the use of SSH keys, and there is no need for them anyway, so remove them from TortoiseGit in the Settings -> Git -> Credential menu so it looks like this:</div><div><br /></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6zurGpfLK8ygy6LRMikvO8fDF1cOUABwKTdxNsr6YQ0NBrJMFXopNh1leA27IWNSbMEEls3FBrKYzWpereNnweg7lytQrJr4wXX-XI3BRdeozZqjyLyOdvThpdkaa1FRcq5SB7yuIbvcG/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="593" data-original-width="764" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6zurGpfLK8ygy6LRMikvO8fDF1cOUABwKTdxNsr6YQ0NBrJMFXopNh1leA27IWNSbMEEls3FBrKYzWpereNnweg7lytQrJr4wXX-XI3BRdeozZqjyLyOdvThpdkaa1FRcq5SB7yuIbvcG/" width="309" /></a></div><br /><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;">Tell Git where to find SSH</h3><div>Set the environment variable <span style="font-family: courier;">GIT_SSH</span> to <span style="font-family: courier;">C:\Windows\System32\OpenSSH\ssh.exe</span><span style="font-family: inherit;"> . Right-click "This PC" -> Properties -> Advanced system settings -> Environment Variables... -> New...<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1Lj5vsAB-dXMy-0BAFuWpBoXCxIYBJ26hjHYBjD2Rd-5N5XV5VMvENZqKGK2uD-ka4OZld7knDklm2sicq5ko2WVPBm6T4dJH_g63CyVeUnTmf-yijeW1vdwjxSZS5fRRHmL64rzWpKT-/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="166" data-original-width="650" height="82" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1Lj5vsAB-dXMy-0BAFuWpBoXCxIYBJ26hjHYBjD2Rd-5N5XV5VMvENZqKGK2uD-ka4OZld7knDklm2sicq5ko2WVPBm6T4dJH_g63CyVeUnTmf-yijeW1vdwjxSZS5fRRHmL64rzWpKT-/" width="320" /></a></div><div><span style="font-family: inherit;"><br /></span></div>Restart explorer (Task Manager -> Details -> explorer.exe right-click -> End Task, then File -> Run new Task -> Open: explorer -> OK) , or logout and login, or restart your computer.</span></div><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;">Update origin URL to use SSH</h3></div><div>Finally, update your repos origin to use SSH instead of HTTPS. 
The easiest way is to copy the part after 'git clone' in the Bitbucket "Clone" feature.<br /><br /></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdZwF-ctH3JTY2PwksL3IAWDRRDo1V4GqaxptuDhqRDoloBES07yJvxNaeatID_1JXm6mK5eZdJclVSUBkpw01iSpCNH102Ptasao9BLTMCojw50tR6kgar5XxPZkmyjcKS1N_wcnXJ0bV/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="311" data-original-width="1103" height="90" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdZwF-ctH3JTY2PwksL3IAWDRRDo1V4GqaxptuDhqRDoloBES07yJvxNaeatID_1JXm6mK5eZdJclVSUBkpw01iSpCNH102Ptasao9BLTMCojw50tR6kgar5XxPZkmyjcKS1N_wcnXJ0bV/" width="320" /></a></div><br />Click the "Clone" button, Select SSH and then the URL part of the git clone command suggested, and paste it in TortoiseGit Remote origin for the repo:<br /><br /></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTdXPpiA2Vqcyqt-oU-HWkh212A-7NDeEMB3XcOe-MIn0v72kXE5yN-jofzORxrZZM2RyW2ACv48hB9ZpeDgZ3TN_C_S2TFAz0zM2h5BPcur7OFfVzCQuDr5doy89pbs8qPFnlcC4OYgqQ/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="591" data-original-width="766" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTdXPpiA2Vqcyqt-oU-HWkh212A-7NDeEMB3XcOe-MIn0v72kXE5yN-jofzORxrZZM2RyW2ACv48hB9ZpeDgZ3TN_C_S2TFAz0zM2h5BPcur7OFfVzCQuDr5doy89pbs8qPFnlcC4OYgqQ/" width="311" /></a></div><br /><br /></div><div>Done! Now you can enjoy password-less use of git with Bitbucket and/or GitHub.</div><div><br /></div>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-18571970018584510262021-05-03T23:24:00.000-07:002021-05-03T23:24:40.596-07:00Thinking about cacheability<h2 style="text-align: left;">Performance, Cacheability and Thinking Ahead</h2><p>Although it's very true that "premature optimization is the root of all evil" (Hoare/Knuth), this should never be taken to mean that writing inefficient code is a good thing. Nor does it mean that we should ignore the possible need for future optimizations. As all things, it's a question of balance.</p><p>When designing APIs for example, just how you define them may impact even the possibility for future optimizations, specifically caching. Although caching is no silver bullet, often it is the single most effective measure to increase performance so it's definitively a very important tool in the optimization toolbox.</p><p>Let's imagine a search API, that searches for locations of gas stations within a given distance from a given point, and also filters on the brand of the station.<br /><br />(In a real life application, the actual search terms will of course be more complex, so let's assume that using an underlying search service really is relevant which is perhaps not necessarily the case in this literal example. Also, don't worry about the use of static methods and classes in the sample code, it's just for show.)</p><pre><code>[Route("v1/[controller]")]
public class GasStationV1Controller : ControllerBase
{
[HttpGet]
public IEnumerable<GasStation> Search(string brand, double lat, double lon, double distance)
{
return SearchService.IndexSearch(brand, lat, lon, distance);
}
}</code></pre><p>We're exposing a REST API that delegates the real work to a fictitious search service, accessing an index and providing results based on the search parameters, possibly adding relevance, sponsoring and other soft parameters to the search. That's not important here.</p><p>What is important is that we've decided to let the search index handle the geo location part of the search as well, so we're indexing locations and letting the search index handle distance and nearness calculations, which on the surface of things appears to make sense. The less we do in our own code, and the more we can delegate, the better!</p><p>But unfortunately, it turns out this is a little too slow, and we're also overloading the back end search service, which has a rate limiting function as well as a per-call pricing schedule, so it's expensive too. What to do? The obvious thing is to cache. Said and done.</p><pre><code>[Route("v2/[controller]")]
public class GasStationV2Controller : ControllerBase
{
[HttpGet]
public IEnumerable<GasStation> CachedSearch(string brand, double lat, double lon, double distance)
{
string cacheKey = $"{brand}-{lat}-{lon}-{distance}";
return ObjectCache.GetOrAdd(cacheKey, () => SearchService.IndexSearch(brand, lat, lon, distance));
}
}
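
// ObjectCache above is not a built-in .NET type; it is assumed here to be an application-level
// helper. A minimal sketch of such a helper (an assumption for illustration, not actual
// production code) could combine ConcurrentDictionary and Lazy so that the value factory
// runs at most once per key:
public static class ObjectCache
{
    private static readonly ConcurrentDictionary<string, Lazy<object>> _cache
        = new ConcurrentDictionary<string, Lazy<object>>();

    public static T GetOrAdd<T>(string cacheKey, Func<T> valueFactory)
    {
        // A real cache would also need expiration and size limits; they are left out here.
        return (T)_cache.GetOrAdd(cacheKey, _ => new Lazy<object>(() => valueFactory())).Value;
    }
}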
</code></pre><p>Now we're using our awesome <span style="font-family: courier;">ObjectCache</span> to either get it from the cache, or if need be, call the back end service. All set, right?</p><p>Not quite.</p><p>The location that we're looking to find near matches to is essentially where the user is, which means there'll be quite a bit of variation of the search parameters. In fact, there is very little chance that anything in the cache will ever be re-used. The net effect of our caching layer is just to fill server memory. We're not reducing the back end search service load, and we're not speeding anything up for anyone.</p><p>The thing to consider here is that when we're designing an API that has the potential of being a bottleneck in one way or another, we should consider making it possible to add a caching layer, even if we don't add one to begin with (remember that thing about premature optimizations).</p><p>Avoid designing low-level APIs that take essentially open-ended parameters, i.e. parameters that have effectively infinite variation, and where a given set of parameters is very seldom used twice. It's not always possible, it depends on the situation, but consider it.</p><p>As it turns out, our only option was to redesign what we use the search index for, and move some functionality into our own application. This is often a memory/time tradeoff, but in this case, keeping up to 100 000 gas stations in memory is not a problem, and filtering them in memory in the web server is an excellent option.</p><p>This is what it looks like now, and although we're obliged to do some more work on our own, we'll be fast and we're offloading the limited and expensive search service quite a bit.</p><pre><code>[Route("v3/[controller]")]
public class GasStationV3Controller : ControllerBase
{
[HttpGet]
public IEnumerable<GasStation> Search(string brand, double lat, double lon, double distance)
{
string cacheKey = $"{brand}";
return ObjectCache.GetOrAdd(cacheKey, () => SearchService.IndexSearch(brand))
.Select(g => g.SetDistance(lat, lon))
.Where(g => g.Distance <= distance)
.OrderBy(g => g.Distance);
}
}
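
// GasStation and SetDistance() are not shown in the original; this is just a sketch of what
// such a type might look like, assuming coordinates in degrees and distances in kilometers
// computed with the haversine formula. The names and units are illustrative assumptions.
public class GasStation
{
    public string Brand { get; set; }
    public double Lat { get; set; }
    public double Lon { get; set; }
    public double Distance { get; private set; }

    public GasStation SetDistance(double lat, double lon)
    {
        const double earthRadiusKm = 6371.0;
        double dLat = ToRadians(lat - Lat);
        double dLon = ToRadians(lon - Lon);
        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2) +
                   Math.Cos(ToRadians(Lat)) * Math.Cos(ToRadians(lat)) *
                   Math.Sin(dLon / 2) * Math.Sin(dLon / 2);
        Distance = 2 * earthRadiusKm * Math.Asin(Math.Sqrt(a));
        return this;
    }

    private static double ToRadians(double degrees) => degrees * Math.PI / 180.0;
}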
</code></pre><p>Now we have a more manageable set of search parameters to cache, and we can still serve the user rapidly, without overloading the search service or the budget.</p><p>Taking this a step further, we'd consider moving this logic to the client if that's reasonable, since then we can even let the HTTP response become cacheable, which can further increase the scalability and speed for the users.</p><p>In the end, performance is always about compromises, but the lesson learned here is that even if we don't think we need optimization and caching at the start, we should at least consider leaving the path for it open.</p>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-90763481375542636202021-04-29T08:34:00.002-07:002021-04-29T08:36:23.961-07:00Weird random constants in GetHashCode()<p>Random integers in GetHashCode() for C#/.NET</p><p>When you ask Visual Studio to generate Equals() and GetHashCode() for C#/.NET, as well as when you inspect other code, you often see addition and multiplication of various seemingly random constants as part of the calculation. Here's an example of how it can look:</p><pre><code>public class MyTypeA
{
public MyTypeA(string value) => ValueA = value;
public string ValueA { get; }
public override int GetHashCode() => 250924889 + EqualityComparer<string>.Default.GetHashCode(ValueA);
}
public class MyTypeB
{
public MyTypeB(string value) => ValueB = value;
public string ValueB { get; }
public override int GetHashCode() => -1007312712 + EqualityComparer<string>.Default.GetHashCode(ValueB);
}
public class MyFixedByVsHashCode
{
public MyFixedByVsHashCode(string value)
{
A = new MyTypeA(value);
B = new MyTypeB(value);
}
public MyTypeA A { get; }
public MyTypeB B { get; }
public override int GetHashCode()
{
int hashCode = -1817952719;
hashCode = hashCode * -1521134295 + EqualityComparer<MyTypeA>.Default.GetHashCode(A);
hashCode = hashCode * -1521134295 + EqualityComparer<MyTypeB>.Default.GetHashCode(B);
return hashCode;
}
}
</code></pre>
<p>The above example was generated using Visual Studio 2019 for .NET Framework and contains a number of seemingly random strange integer constants: 250924889, -1007312712, -1817952719 and -1521134295. If you generate code for .NET Core or .NET 5 it may look a little different, but under the hood it's similar.</p><p>Executive summary: The reason for these numbers is to reduce the risk of collisions, i.e. the number of situations where two different instances with different values get the same hash code.</p><p>So what's up with these magic numbers? First of all: No, they're not random. Let's go through them.</p><pre><code>public override int GetHashCode() => 250924889 + EqualityComparer<string>.Default.GetHashCode(ValueA);
public override int GetHashCode() => -1007312712 + EqualityComparer<string>.Default.GetHashCode(ValueB);<br /></code></pre>
<p>These values are derived from the name of the property. Exactly how is not documented and not clear, but it's essentially equivalent to <span style="font-family: courier;">"NameOfProperty".GetHashCode()</span>. The purpose is to add the name of the property to the equation, reducing the risk that two properties with the same value get the same hash code.</p><p>Then we have the integer constants from the multiple property implementation:</p><pre><code>int hashCode = -1817952719;
hashCode = hashCode * -1521134295 + EqualityComparer<MyTypeA>.Default.GetHashCode(A);
hashCode = hashCode * -1521134295 + EqualityComparer<MyTypeB>.Default.GetHashCode(B);
</code></pre>
<p>These are fixed, and do not vary. A little bit of analysis shows they are far from random. The first one, -1817952719, is actually the product of two relatively large primes, 16363 * 151379 = 2477014577, and is thus a nice semiprime; when this is interpreted as a signed 32-bit integer we get -1817952719.</p><p>The second one, -1521134295, when interpreted as an unsigned 32-bit integer is 2773833001 - and that is a nice large prime!</p><p>Using primes and semiprimes as factors and constants in polynomials has been shown to produce numbers with better distribution than other constants.</p><p>So it's all about reducing the risk of collisions.</p><p>But how bad can it get? Actually, very bad... Here follows a seemingly good enough implementation that is similar to many real-world manual implementations. In fact, I've written a good number of similar ones, although hopefully not with as catastrophic a result as this.</p>
<pre><code>public class MyTypeA
{
public MyTypeA(string value) => ValueA = value;
public string ValueA { get; }
public override int GetHashCode() => ValueA.GetHashCode();
}
public class MyTypeB
{
public MyTypeB(string value) => ValueB = value;
public string ValueB { get; }
public override int GetHashCode() => ValueB.GetHashCode();
}
public class MyBrokenHashCode
{
public MyBrokenHashCode(string value)
{
A = new MyTypeA(value);
B = new MyTypeB(value);
}
public MyTypeA A { get; }
public MyTypeB B { get; }
public override int GetHashCode() => A.GetHashCode() ^ B.GetHashCode();
}
internal class Program
{
private static void Main()
{
Console.WriteLine($"Hashcode is: {new MyBrokenHashCode("Something").GetHashCode()}");
Console.WriteLine($"Hashcode is: {new MyBrokenHashCode("Other").GetHashCode()}");
}
}
</code></pre>
<p>The above example produces the following output:</p><pre>Hashcode is: 0
Hashcode is: 0</pre><p>That's not good! Two different instances with two different values not only produce the same hashcode, it's 0! In fact it's worse than not good, it's potentially catastrophic for performance. The scary thing is that everything will still work and look fine during testing, but if these objects are placed in a Hashtable or Dictionary or similar, and in production they grow to a larger number of elements, then indexing these collections degenerates into linear searches in a linked list.</p><p>So what happens?</p><p>Two different types happen to generate the same hashcode for the same underlying value ("Something" or "Other"). That's actually not that unusual. Then we use XOR to combine the hashes, but XOR has the known weakness that XOR-ing identical values will always result in zero, regardless of the values.</p><p>This example is slightly contrived, but it demonstrates that seemingly good-looking code can have subtle pitfalls causing really bad effects.</p><p>Conclusion - Trust the tools and use Visual Studio's generation of GetHashCode code! Even if you don't notice any problems with your own implementations, do regenerate the code when you have the chance.</p>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-18677383295695459842021-04-29T08:33:00.001-07:002021-04-30T01:28:36.330-07:00Iterations and the squaring factor<h2 style="text-align: left;">The power of 2</h2><div>I recently found code that was functionally equivalent to the following:</div>
<pre><code>public class Filter
{
private readonly IEnumerable<string> _old;
public Filter(IEnumerable<string> old) => _old = old;<br />
public IEnumerable<string> WhatsNew(IEnumerable<string> updated) => updated.Where(s => !_old.Contains(s));<br />}
</code></pre><p>Nice, compact and easily understandable. We keep track of an original list of strings, and when we get an updated list we'd like to know what's new in it.</p><p>Or is it, really?</p><p>As I mentioned, I found this type of code, but why did I notice it? Because during debugging, the call to the <span style="font-family: courier;">WhatsNew()</span> method took significant time; it was boring to sit there and wait for it to complete!</p><p>The problem is that if the two collections are of approximately the same size, for example if updated contains a single new string, the typical number of calls to the string comparer is <span style="font-family: courier;">_old.Length * _old.Length / 2</span>.</p><p>In other words, the number of operations grows quadratically with the length of the list; this is typically expressed as O(N**2), read as "order of N-squared". That it's actually on the average divided by 2 doesn't matter for the O() notation, it just means that the number of operations is proportional to N squared.</p><p>In the real-world situation, the number of elements was on the order of 20,000. That's not extraordinarily large in any way, but 20,000 * 20,000 / 2 is 200,000,000!</p><p>That's 200 million operations! That can take real time even on a pretty fast machine.</p><p>The problem is the lookup in the _old list. We need to enumerate the updated one in any case; there's really no way to get around that, given the assumptions here.</p><p>This is where hashtables, and dictionaries (which use hashtables under the hood), and similar collections come into play. A lookup using a hashtable is enormously more efficient, and the total time will approach a linear increase rather than a quadratic one. Here's how it could have been (and subsequently was) coded using a <span style="font-family: courier;">HashSet</span>:</p><pre><code>public class Filter
{
private readonly HashSet<string> _old = new HashSet<string>();
public Filter(IEnumerable<string> old)
{
foreach (string value in old)
{
_ = _old.Add(value);
}
}
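    // The loop above could equally well be written as: _old = new HashSet<string>(old);
    // Either way, Contains() in WhatsNew() below is now a hash lookup rather than a linear scan.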
public IEnumerable<string> WhatsNew(IList<string> updated) => updated.Where(s => !_old.Contains(s));
}
</code></pre><p>Now, our <span style="font-family: courier;">WhatsNew()</span> method will operate in O(N), i.e. the time taken will be proportional to the number of elements, not the square of the number of elements! For larger sizes of the collection, that's a huge gain.</p><p>Obviously there are many variations both to the problem and the solution, but the message here is to be aware of the cost of doing effectively nested iterations of large collections.</p><p>This is also one of those examples of things that might not bite you until it's too late and your application is running in the real world. During testing and unit testing, which is usually done with smaller data sets, all will look well (even though we know we should be testing with expected data sizes, somehow it often doesn't happen). Then, when it scales up in the real world, performance can deteriorate dramatically and quickly!</p><p>This is similar to the old fable of the <a href="https://en.wikipedia.org/wiki/Wheat_and_chessboard_problem">reward in grains of rice</a>. Doubling the list does not decrease performance by half, as many would expect. It decreases performance proportional to the square of the increase! It gets progressively worse, quicker and quicker, and can surprisingly quickly become a critical problem.</p><p>With the updated solution, doubling the list will only decrease performance by roughly half, since the work merely doubles, which is much easier to handle and scale with.</p>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-25723525404405983972021-04-27T22:39:00.002-07:002021-04-27T22:39:53.548-07:00The strange git love of the command line<h2 style="text-align: left;">Git users' love of the command line</h2><div style="text-align: left;"> Why do a vast majority of git users love the command line so fervently? And why do I care?</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Git is a great tool, but as has been said of "C", it's like a sharp knife. Excellent in the hands of an expert, but easy to cut your fingers if you're not.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Using git inexpertly can get you into all kinds of difficult-to-get-out-of trouble. Most developers' main job is to code, not maintain super-complex repositories with 100s or 1000s of contributors. You'd think they would welcome any tool that made the use of git easier and safer, perhaps forgoing some complex scenarios.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">But, sorry to say, nope. Almost all of the developers I know and work with use git from the bash command line. It's also the only thing they use from bash, or any command line. Git has a zillion commands and options, but 9 out of 10 days the only things we do are:</div><div style="text-align: left;"><ul style="text-align: left;"><li>Pull updates from the remote.</li><li>Create feature branches.</li><li>Commit changes locally.</li><li>Push changes to the remote.</li><li>Stash changes temporarily locally.</li><li>Switch branch.</li><li>Merge branches.</li><li>Rebase feature branches.</li><li>Inspect the commit history of a file.</li></ul><div>There are excellent graphical user interface frontends for git, in all environments. 
I'm on Windows, and I happen to be fond of <a href="https://tortoisegit.org/" target="_blank">TortoiseGit</a>, but there are others.</div><div><br /></div><div>The great thing about these tools is that, for one, they are 100% compatible with each other and the command line, since they all use the same underlying implementation of git.</div><div><br /></div><div>Nobody, not even the most hard-core developers, uses a command line editor or a command line debugger for day-to-day development. We're simply more productive with graphical full screen integrated development environments like Visual Studio, and less prone to mistakes. There's a reason we're not using <a href="https://almy.us/teco.html" target="_blank">TECO</a> any more, even if it was awesome in its time and still is in some ways - I loved it!</div><div><br /></div><div>The same applies to source code control. It's simply faster, more convenient and less error-prone to use a tool that integrates with your graphical environment, be it Visual Studio or Explorer or whatever your local equivalent is.</div><div><br /></div><div>But, for some strange reason, with git it suddenly must be done from the bash prompt. I try to explain, I try to show, but the command line fixation remains. Command line git users repeatedly get into trouble and have to reset their local repository clones. Command line users frequently take much longer to inspect history, and also often just skip doing things like rebasing the feature branch before merging to the main development branch, causing the commit history to become much more complex. Just because it's too complicated to do from the command line.</div><div><br /></div><div>Still, they persist.</div><div><br /></div><div>I get that there are some rare cases where the command line is needed. I get that for batch operations, such as merging a bunch of repositories from perhaps the development branch to the main branch, a script using the command line is perfect.</div><div><br /></div><div>But the command line is not gone just because you use a modern tool for day-to-day use. It's still there when you need it.</div><div><br /></div><div>Why do young, gifted, competent developers insist on using a tool model that originates in the 60's when there are so many better alternatives - alternatives they happily use in every other case?</div><div><br /></div><div>I don't get it. But it is a problem, because it affects productivity both directly in daily use and indirectly because it tends to cause more complex commit histories that must be untangled in the future.</div></div>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-62378750492795920242021-04-26T00:11:00.000-07:002021-04-26T00:11:42.561-07:00Code should work every time<h2 style="text-align: left;">The Philosophy of Black and White</h2><div>I have a background in real time operating system kernels, compilers and encryption.</div><div><br /></div><div>In all of these contexts, it is self-evident that a piece of code either works every time, or else it's broken and it must be fixed immediately.</div><div><br /></div><div>Code that works most of the time, or even almost every time, just won't cut it.</div><div><br /></div><div>If there are special circumstances during heavy load, or during startup or shutdown or just with bad enough "luck" that the software does not work as expected, it needs fixing. Perhaps it controls heavy machinery, compiles your code or encrypts your data. 
Right?</div><div><br /></div><div>Would you feel comfortable with a compiler that emitted faulty code every now and then, depending on how heavy the system load was? Of course not.</div><div><br /></div><div>I have taken this view to heart, and in my mind there are really only two types of software:</div><div><ol style="text-align: left;"><li>Software that behaves consistently every time, regardless of load or timing.</li><li>Broken software.</li></ol><div>(Software of type 1 can of course still be broken in other ways - but it should then be consistently broken!)</div></div><div><br /></div><div>Here I'd like to argue that this is just as applicable to a national informational website, a booking system, or even a game.</div><div><br /></div><div>Software should behave consistently every time, else it's broken.</div><div><br /></div><div>Unfortunately it's often hard to achieve, especially as so much code today runs in a web server environment, which is an inherently multi-threaded environment with a high degree of parallelism. Most issues with software that behaves inconsistently come from parallelism, and race conditions.</div><div><br /></div><div>A race condition is when two different actors invoke code at the same time, and the outcome depends on who wins the race.</div><div><br /></div><div>There are numerous mechanisms, and even whole books, dedicated to solving these problems.</div><div><br /></div><div>My purpose here is not to talk about all those ways, but rather to argue that as developers we should feel that it's important that our code works consistently every time.</div><div><br /></div><div>In real life, I often have to argue about these matters, and I'm not infrequently met with the basic opinion "well, it really doesn't happen that often and it looks like hard work to fix so we'll live with it".</div><div><br /></div><div>The problem is that these things tend to grow worse exponentially with increased load. So, if you're working on something that is expecting fewer and fewer users and lower and lower load, well maybe you can get away with "it's not worth fixing".</div><div><br /></div><div>If you're like most of us, working on something that expects increased load and continued development, remember that bugs do not age well. They just get worse and worse, until it's really bad.</div><div><br /></div><div>So, while perfectionism overall is not always a winning strategy, for consistency under load always strive for perfection. Otherwise you're building software broken by design.</div><div><br /></div><div>For these matters, it really is black or white. Either the code works every time, or it's broken.</div>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-39736626105787574122021-04-23T07:36:00.000-07:002021-04-23T07:36:36.640-07:00Using a search index as a database is a bad idea<h2 style="text-align: left;">The case</h2><p>I am currently working on a project where we are using a CMS product, in conjunction with a search service based on Elasticsearch as well as some back-end APIs.</p><p>Elasticsearch is an incredibly powerful search service, and you can do almost anything with it.</p><p>But should you?</p><h2 style="text-align: left;">What we did</h2><p>With a CMS that does not support storing of arbitrary data particularly well, it is tempting to look for creative alternatives. It's always a big step for example to add support for an OR mapper or any other custom table or database. 
It's not necessarily a good idea, if it can be avoided.</p><p>In this situation we hit upon the idea of using the search index to store data fetched from a back-end API. After all, it can serialize and index just about any .NET type - so why not add some data carrying properties to our custom index object?</p><p>I had a bad feeling about this; my spidey-sense started tingling... I was thinking that an index is something fairly approximate, and that it's only intended to make it as easy as possible to find data. Not to store it. Hmm...</p><p>In the team I tried to argue along these lines, but I had no luck. So off we went, starting to store more than searchable text and back-references to the actual data.</p><h2 style="text-align: left;">What happened</h2><p>So now we're in trouble. Not really deep trouble as yet, but it's just not a good idea as it turns out. We're getting inconsistent states and the code can't trust what it sees. It works, sort of, kind of, most of the time but...</p><p>The problem is basically that when you store your data in a database or a file, you expect consistent and reproducible behavior, every time. If you don't get that, the assumption is that something is broken.</p><p>When you store your data in a search index, this just does not apply; here are some of the reasons:</p><p></p><ul style="text-align: left;"><li>The index never promised to give a consistent view! Two reads can give different results; in Elasticsearch there's a concept of shards, for example, that can cause this behavior.</li><li>The index never promised that a write is immediately or deterministically reflected in a subsequent read. This is due both to caching and to queueing behavior in the index, since the assumption is that you're basically requesting an index update that you'd like to be effective asap - but not guaranteed immediately.</li><li>The index has a rate limit; it's perfectly ok for it to say that it's too busy, since the assumption is that at worst you lost an index update. No data is lost. With a database or a file etc., if that happens you simply have to gear up; it's a fatal error situation.</li></ul><div>Specifically, we're now in a situation where our code won't always work, mostly if we're "too fast". If we wait a few minutes between writing and expecting to read it back, it'll often, but not always, work.</div><h2 style="text-align: left;">Conclusion</h2><div>Don't use a search index as a data store. Even if it looks cool and easy, don't. I don't know exactly what problems you'll run into, but I'll bet a beer or soda that it'll cause you some unexpected grief.</div><p></p>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-6807500971208878392014-04-14T01:19:00.000-07:002014-04-14T01:19:07.291-07:00AxCrypt, Xecrets and the OpenSSL Heartbleed security issue<div dir="ltr" style="text-align: left;" trbidi="on">
<h2 style="text-align: left;">
Information about Heartbleed</h2>
On April 7, a <a href="https://www.openssl.org/news/secadv_20140407.txt">security advisory</a> was published concerning OpenSSL. The security vulnerability described has been given the popular name 'Heartbleed'. OpenSSL is a software library component commonly used in web servers supporting encrypted communication with clients using SSL.<br />
<br />
This issue probably affects the majority of web servers in the world, and is about as serious as a security issue can be. It's arguably the most dangerous vulnerability the Internet has seen.<br />
<br />
However, it does not in any way affect the security of <a href="http://www.axantum.com/AxCrypt/">AxCrypt</a> file encryption or <a href="http://www.axantum.com/Xecrets/">Xecrets</a> online password manager.<br />
<br />
In the case of AxCrypt, simply because AxCrypt is not a web server, and does not use SSL in any way.<br />
<br />
Xecrets is an online service, using a web server, and does use SSL but it is still not vulnerable because OpenSSL is not used, i.e. the faulty component is not part of the software used by Xecrets. There is no indication that the Certificate Authority used by Xecrets has been compromised, so connections to https://www.axantum.com/ are still to be trusted fully as before.<br />
<br />
You do not need to change passwords or passphrases for AxCrypt-encrypted files or your Xecrets account <b>unless you use that same or similar password somewhere else</b>.<br />
<br />
<br /></div>
Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com7tag:blogger.com,1999:blog-7423109771980410273.post-83604615651616724302013-10-02T03:30:00.001-07:002013-10-02T03:30:27.455-07:00The Lesser Evil, avoiding Copy and Paste<div dir="ltr" style="text-align: left;" trbidi="on">
I'm a great supporter of <a href="http://www.pearsonhighered.com/educator/product/Clean-Code-A-Handbook-of-Agile-Software-Craftsmanship/9780132350884.page">clean code</a>. <a href="http://blog.axantum.com/2013/04/the-shortest-book-on-good-programming.html">My own take</a> on this can be found in an earlier post. The most common issue that I find is Copy and Paste-programming, and the most common explanation is lack of time. The problem is that it seldom saves time, even in the short run.<br />
<br />
Copy and Paste-programming is a time thief. Every time. Even that deadline you have in 2 weeks, 2 days or 2 hours is endangered.<br />
<br />
I'm hoping this post may inspire some hard-pressed-for-time developers to find the resolve to tackle the demon that is Copy and Paste, and come out feeling a little bit better and actually delivering more in less time.<br />
<h4 style="text-align: left;">
The rationale</h4>
You probably know that Copy and Paste is a bad thing. Unfortunately most focus is on long-term maintenance aspects when explaining why it's bad. This makes for a perfect rationale when you're in a hurry. "<i>I'll Copy and Paste this now, just to get the functionality done in time. I know I'll have to pay for this later during maintenance, but I don't see any other choice.</i>".<br />
<br />
I'd like to point out that the problems with Copy and Paste are much more severe than this, and that there really are other choices! Remember that evils are seldom equal, and if you have two evils to choose from, go for the lesser evil!<br />
<h4 style="text-align: left;">
The maintenance argument</h4>
The main problem with the maintenance argument is that maintenance is not strictly separated from development. If you're lucky enough to work in a really agile environment, there's really no such thing as maintenance anyway, just a continuous number of releases. So which release/sprint/deploy should pay the cost of later, in favor of now?<br />
<br />
It's seldom old code that you're Copying and Pasting. It's typically relatively new code, which means that it's still in rapid movement. Chances are that within those 2 weeks/days/hours to release or end of the sprint, you'll have revisited that Copy and Pasted code more than once. In which case you'll have to propagate the changes to both copies, or forget one, and then have to bugfix it during acceptance testing - or worse, hotfix it after deployment.<br />
<br />
In either case, you're paying the cost for the Copy and Paste, not in that magical later slow phase of maintenance, but right now, when time's most at a premium. You're not saving time. You're losing time, and the reason you have so little time to lose is partially because of many small decisions like this.<br />
<br />
Why is it likely that it's new code you're Copying and Pasting? Because otherwise it's unlikely that you know that the code is there to Copy! The other reason is that the most stressed-for-time releases are the major ones, especially the first. These are times where there is lots and lots of new code, also increasing the chances that it's new code. Finally, if the code is really old and stable, there's at least a chance that the functionality in question really has been factored out to common ground and there's no need any longer to Copy and Paste.<br />
<br />
So, the "<i>we'll gain time now, and pay for it later but that's ok</i>"-argument is simply wrong. You're not gaining time now. You're losing time now.<br />
<h4 style="text-align: left;">
The lesser evil</h4>
<div>
There are however options. Copy and Paste is the greater evil, but what are the lesser evils? For example:</div>
<div>
<ul style="text-align: left;">
<li>Extract the code snippet to a general miscellaneous utility type (see the sketch just after this list). So, now you have a type with various disconnected pieces of code. A bucket full of... That's a lesser evil.</li>
<li>Move it to a common base class. So, now you're putting support code in a base class, and using inheritance to extend rather than specialize. That's a lesser evil.</li>
<li>Make the code snippet in question a public method just where it happens to be. So, now you have an unholy dependency between two types that probably should not have the dependency. That's a lesser evil.</li>
</ul>
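<div>
As a minimal sketch of the first option (the helper and its name are made up purely for illustration): instead of pasting the same snippet into a second place, park it in a miscellaneous utility type and call it from both call sites.</div>
<div>
<br /></div>
<pre><code>// A bucket full of disconnected helpers - not pretty, but there is only one copy of the logic.
public static class MiscUtil
{
    // The snippet that was about to be Copied and Pasted (hypothetical example).
    public static string NormalizeOrderNumber(string orderNumber)
    {
        return orderNumber == null
            ? null
            : orderNumber.Trim().ToUpperInvariant().Replace(" ", string.Empty);
    }
}</code></pre>
<div>
Both call sites now call MiscUtil.NormalizeOrderNumber(), and when a pattern eventually emerges it is a safe, compiler-assisted refactoring to move the helper to a better home.</div>
<div>
<br /></div>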
<div>
All of these lesser evils are explicitly known to the compiler and fairly easy to safely refactor later, and it might even be a good strategy since usually a pattern emerges after a while and it becomes clearer just where the common code should reside. There's only one copy of the code in either of these lesser evil alternatives, and that has to be a good thing!</div>
</div>
<div>
<br /></div>
<div>
Even if you never get to the refactoring stage, these are still lesser evils than the alternative - Copy and Paste.</div>
<div>
<br /></div>
<div>
Hopefully some readers will find some supporting arguments here, and useful techniques for reducing the amount of Copy and Paste.</div>
<div>
<br /></div>
</div>
Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-60030946730766438712013-06-09T10:03:00.000-07:002013-06-09T10:03:29.194-07:00Aggressive Coding<div dir="ltr" style="text-align: left;" trbidi="on">
<h2 style="text-align: left;">
Why you should code aggressively, not defensively</h2>
<div>
Defensive coding is a concept that has its origin in the very first ideas about programming as a craft, at least 40 years ago. <a href="http://en.wikipedia.org/wiki/Defensive_programming">Wikipedia</a> describes it as something "<i>intended to ensure the continuing function of a piece of software in spite of unforeseeable usage of said software</i>". Apparently principles that can be characterized as defensive coding techniques are still being taught, or at least not actively discouraged.</div>
<div>
<br /></div>
<div>
Defensive coding sounds good in theory, but in practice it tends to exacerbate the problems in question, and clutter up the code, making it harder to understand and refactor.</div>
<div>
<br /></div>
<div>
A typical idea in defensive coding is the <a href="http://en.wikipedia.org/wiki/Defensive_programming">Wikipedia </a>example of copying strings from one buffer to another. The idea is that if the caller provides a longer source string than expected, this might, in the case of C/C++, open the door to the classic buffer overrun security vulnerability.</div>
<div>
<br /></div>
<div>
The defensive coder will, as <a href="http://en.wikipedia.org/wiki/Defensive_programming">the example shows</a>, provide a function that checks the maximum buffer length and silently refuses to copy more than that to the destination buffer. <i>This is bad and dangerous!</i></div>
<div>
<i><br /></i></div>
<div>
The defensive coder has now just hidden a serious bug in the calling code. If the contract states that 1000 characters is the maximum length of the input, the caller must ensure this <i>and the callee must refuse to accept anything that violates the contract</i>.</div>
<div>
<br /></div>
<div>
The aggressive coder will instead throw an exception or simply terminate the program if the function is called with a source larger than the allowed 1000 characters. <i>This is safe and secure programming</i>!</div>
<div>
<br /></div>
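<div>
In C# terms, the difference might look something like the following sketch. The names are made up for illustration, and the destination buffer length plays the role of the 1000 character contract in the example above.</div>
<div>
<br /></div>
<pre><code>// Defensive: silently copies at most what fits, hiding the caller's contract violation.
public static void CopyDefensive(char[] destination, string source)
{
    int count = Math.Min(source.Length, destination.Length);
    source.CopyTo(0, destination, 0, count);
}

// Aggressive: the contract says the source must fit; a violation is reported immediately.
public static void CopyAggressive(char[] destination, string source)
{
    if (source.Length > destination.Length)
    {
        throw new ArgumentException("Source does not fit in the destination buffer.", nameof(source));
    }
    source.CopyTo(0, destination, 0, source.Length);
}</code></pre>
<div>
<br /></div>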
<div>
In terms of my current preferred language C#, I see this principle violated in a variety of ways. One frequent pattern is checking return results from other library functions for NULL, or empty strings etc, and then attempting to silently do something even though it was unexpected. This typically indicates that the programmer does not know the contract of the method s(he) is calling. <i>Do know the contract and make sure to follow it when providing input, and assume that it is followed for the produced output!</i></div>
<div>
<br /></div>
<div>
I recently rewrote a major piece of functionality for a client, and this also involved updating and refactoring the dependent code. To my horror, a huge amount of code was devoted to checking the outputs of other methods, even to the extent of catch-all clauses silently ignoring any and all problems.</div>
<div>
<br /></div>
<div>
Instead of checking outputs from other code because '<i>maybe </i>it can return a NULL' - find out if it can, and what the appropriate action is! If you can't find out with reasonable effort and there doesn't seem to be a useful response, just use the return value and let your code blow up with a <span style="font-family: Courier New, Courier, monospace;">NullReferenceException </span>should it in fact happen. This will alert you to the problem, and you can then find out what you really should do about it. Do check the inputs to your own code, and when the caller violates your contract, report this with an exception.</div>
<div>
<br /></div>
<div>
Controlled crashing is good when it's because a caller violates a contract!</div>
<div>
<br /></div>
<div>
Aggressive coding increases the chance of problems being caught and fixed early, and reduces the amount of clutter in the code immensely. This in turn lets you concentrate on what your code should do, instead of what someone else's code should not.</div>
</div>
Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com2tag:blogger.com,1999:blog-7423109771980410273.post-89502993055562487722013-04-25T11:17:00.000-07:002013-04-25T23:19:39.240-07:00The shortest book on good programming, ever!<div dir="ltr" style="text-align: left;" trbidi="on">
<h2 style="text-align: left;">
The Coders Decalogue</h2>
This text is about making software work better and saving huge amounts of time, irritation and frustration for developers, users, customers and other stakeholders in the software business. Which likely means you.<br />
<br />
When not developing my own software <a href="http://www.axantum.com/AxCrypt/">AxCrypt </a>and <a href="http://www.axantum.com/Xecrets/">Xecrets</a>, I work as a contractor and consultant. I deal with both new and old software, and I do quite a bit of advanced troubleshooting and performance optimization in the .NET area.<br />
<br />
<b><i>Over the years, I've come to realize that I spend most of my time as a developer doing things I wouldn't need to do if just a few simple rules were followed</i></b>. I'd still have more than enough to do, no worries, but I'd be delivering much more real value to my customers for each hour spent. And so would millions of other developers. Come on - this is really not that hard!<br />
<br />
I won't explain the rationale here, or give lots of pedagogical examples. That would turn this into a real book, which would be nice. But I don't have time to write a book, and you probably don't have time to read one.<br />
<br />
So just trust me on this ;-) Really.<br />
<br />
<ul style="text-align: left;">
<li><b style="font-style: italic;">Do write code for humans. </b>Smart one-liners, compact code, use of sneaky language constructs etc may not break your program. But it's not enough that the compiler understands the code. <b><i>Don't write for the compiler</i></b>, write your code in a style to make it as easy to read for humans as possible.</li>
</ul>
<ul style="text-align: left;">
<li><i><b>Don't copy and paste code</b></i> with any kind of logic (ifs, loops, selects etc). <b><i>Do always factor out common snippets</i></b>. Even when you're in a hurry. Single-liners without logic are ok. That's called a statement in most languages, and you do need a few of those to make something happen and they can't all be unique.</li>
</ul>
<ul style="text-align: left;">
<li><b><i>Don't check in commented-out code</i></b>. It's ok when you're trying out the new code - but when you're done, you're done. Since the code resides in a version control system anyway (right?), the old code is still available in the history. <b><i>Do check in clean code frequently</i></b>, always improving it slightly at the very least.</li>
</ul>
<ul style="text-align: left;">
<li><b><i>Do use long and descriptive class, method and member names</i></b>. Letters in your source code are cheap. Use them freely. <b><i>Don't abbreviate</i> </b>unless it's an industry or domain standard. </li>
</ul>
<ul style="text-align: left;">
<li><b><i>Don't comment code to explain what it does</i></b>. If you need comments to explain the code, fix the code instead so it's understandable. If you release libraries, use structured comments for public classes and methods to document intended usage patterns, assumptions and contract details. <b><i>Do comment why</i></b> the code does what it does, when it's not obvious.</li>
</ul>
<ul style="text-align: left;">
<li><b><i>Don't nest if-statements or loops</i></b>. In some special cases, one-liners inside a nested if may be ok. <b><i>Do use early-exit and write small methods</i></b> to remove the need for nesting inside a method (a small sketch follows after this list).</li>
</ul>
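<div>A small sketch of the early-exit style (illustration only, with a made-up example):</div><pre><code>// Nested style - the interesting logic hides inside the braces:
public static string ClassifyNested(int? age)
{
    string result = "unknown";
    if (age.HasValue)
    {
        if (age.Value >= 18)
        {
            result = "adult";
        }
        else
        {
            result = "minor";
        }
    }
    return result;
}

// Early-exit style - each case is handled and then we're done with it:
public static string Classify(int? age)
{
    if (!age.HasValue)
    {
        return "unknown";
    }
    if (age.Value < 18)
    {
        return "minor";
    }
    return "adult";
}
</code></pre>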
<ul style="text-align: left;">
<li><b><i>Don't catch exceptions unless you know why you're catching them</i></b> and what to do about it. Never catch all exceptions, except at the top of a given thread's call hierarchy and then only if consequences of not catching it dictate the need. If you do, log it! <b><i>Do program to avoid exceptions</i></b> when you know the conditions to prevent it happening in the first place.</li>
</ul>
<ul style="text-align: left;">
<li><b><i>Do write short methods that do one thing and are named accordingly</i></b>. If a method does not fit on a screen of a reasonable size, then it does too much. If you have trouble naming it properly, it probably does too many things. <b><i>Don't write long methods</i></b> that you need to scroll to see all of.</li>
</ul>
<ul style="text-align: left;">
<li><b><i>Don't try to be smart</i></b>. When there is no known need for advanced or smart solutions, <i style="font-weight: bold;">do keep it simple </i>and use simple standard patterns<i style="font-weight: bold;"> </i>until you know it needs special treatment. </li>
</ul>
<ul style="text-align: left;">
<li><b><i>Don't optimize unless you know you need to</i></b>. You'll know by measurements using performance profilers. This is not the same thing as writing inefficient code. <b><i>Do write efficient code according to best practices</i></b> that avoids known pitfalls and bad design. Performance optimizations come on top of that, for example caching or special-purpose thread-synchronization constructs, and are to be avoided until the need is proven.</li>
</ul>
<ul style="text-align: left;">
<li><i><b>Do always step through your code at least once</b></i> to verify your assumptions about its behavior. <b><i>Don't trust just running the application</i></b> and being satisfied when it appears to work.</li>
</ul>
<div>
This is in no way the complete zen of good programming, nor is it revolutionary or unique. All of this has been said before. I'm sure you'll have your own pet peeves you'd like to add to the list. I have a few of my own, but the idea here is to list important things that are really super-simple to do. Now.</div>
<div>
<br /></div>
<div>
I am absolutely convinced that if these rules are followed, overall productivity in the software industry will rise dramatically.</div>
<div>
<br /></div>
<div>
If you're a developer, are there any of these rules you honestly disagree with? Do you work like this already? If not, try it out! Use peer-reviews to discuss your check-ins with this list as a guideline.</div>
<div>
<br /></div>
<div>
It's really this simple.</div>
<div>
<br /></div>
<div>
PS - There are 11 rules here. I'd like to get it down to 10 as the title indicates. Cast your vote on which one should go! Or perhaps what needs to be added, but then you'll have to drop two... ;-)</div>
</div>
Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com4tag:blogger.com,1999:blog-7423109771980410273.post-60166020482614288682012-08-04T13:50:00.000-07:002012-08-04T13:50:01.628-07:00Security, compatibility and backup<br />
Users of AxCrypt are obviously concerned about the security of their files. However, there is some confusion about just what security means.<br />
<br />
Encryption means security from others reading the data. In the case of AxCrypt, it also means that undetected modification of the data is not possible.<br />
<br />
Encryption does not mean security from data loss for any number of reasons, such as accidental deletion, ransom attacks by hackers where AxCrypt even has been known to be used by the black hats, or hard disk crashes.<br />
<br />
In fact, encryption adds another level of processing to the files, actually increasing (albeit very slightly, but still) the risk of something going wrong. If you think about it - the more you do, the higher the risk of a snafu. That doesn't mean AxCrypt is dangerous, it just means what it means - the more operations you perform the higher the risk is, as counted in number of failures per million for example.<br />
<br />
In this day of rapid development on all fronts, there's always the question of data compatibility across computers and program versions.<br />
<br />
All AxCrypt-versions from 1.0 to the current 1.7 in both x86 and x64 bit versions are compatible with each other, so no worries there. AxCrypt will always be upwards compatible, so version 2.0 may in fact in the future produce encrypted files 1.7 can't read - but version 2.0 will always be able to decrypt anything an older version has produced. But, at this time, all versions are in fact compatible.<br />
<br />
Also, AxCrypt-encrypted files are not tied to any particular installation in any particular computer, and uninstalling AxCrypt won't decrypt any files any more than uninstalling Word converts your documents to Notepad text files. If you have the file, and know the password, you can always decrypt it in any computer where you can get one of the various versions of AxCrypt running.<br />
<br />
Now to the most important message about security, in the meaning of keeping your data safe not only from prying eyes - but from any number of catastrophes.<br />
<br />
<b>Your most important and powerful protection against data loss is spelled 'BACKUP'.</b><br />
<br />
Please ensure that you have backups of all your data, encrypted or otherwise, and that you keep a reasonably recent version of the copy off-site, and that you periodically do check that you in fact can read the backup and that the expected data is really on the backup media.<br />
<br />
Personally I backup to two USB-drives that I swap once every few weeks, always keeping at least one drive off-site. It's cheap, it's effective and it's very safe since all the data on the backup is encrypted.<br />Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-45417183683777718862012-07-23T14:15:00.001-07:002015-06-05T06:25:35.974-07:00Anti-Malware Vendors - here we go again with another round of FUD...Over the years, I've been <a href="http://blog.axantum.com/2011/08/concerning-false-positive-reports-about.html">periodically plagued</a> by false positives reported for <a href="http://www.axantum.com/AxCrypt/">AxCrypt </a>by various anti-malware vendors. These small-time, opportunistic, shady vendors like Microsoft, ESET, McAfee, Avast et al. have a long history of just flagging anything they please as malware, and be damned the consequences.<br />
<br />
I am a small one-person operation providing free strong encryption software for personal privacy and security. I have over a decade and perhaps 20 million downloads of faultless operation on record. Nevertheless, at least once a year, these companies start reporting my software as malicious, causing me and my users no end of grief.<br />
<br />
Why will not a single one of them just for once take responsibility for their actions? I have not received as much as one single communication from them. Not once. Not when they flag my software falsely as malicious. Not when they rescind that flagging, as they inevitably do when enough users get suspicious and start questioning the reports.<br />
<br />
Now, in 2012, it's starting again. This time because I'm trying to make some small revenue using bundled advertisements for other software with the installer, in order to be able to spend some more thousands of hours developing free software. For more specifics about that particular choice read <a href="http://www.axantum.com/axcrypt/freeware.html">here</a>.<br />
<br />
As a current example, take a recent <a href="http://www.microsoft.com/security/portal/Threat/Encyclopedia/Entry.aspx?name=Adware%3aWin32%2fOpenCandy&threatid=159633">report from Microsoft</a> concerning the <a href="http://www.axantum.com/AxCrypt/Freeware.html">adware bundle</a> that <a href="http://www.axantum.com/AxCrypt/">AxCrypt </a>uses - which at the time of writing is actually a <i>disclaimer of a recent false positive</i>. This causes uncertainty and fear for my users, but what does Microsoft care? Did they ask before flagging? Did they report when they removed the flag?<br />
<br />
A different example is some recent reports about <a href="https://www.virustotal.com/url/fd67ea99f3492374ca32a911415fd12eb8e0ec5b3b4ff463d57a5420d649d125/analysis/1343076298/">my site</a> and <a href="https://www.virustotal.com/file/40eb871fa5e9efcec1103fb5105563151316449a9bcf470d85cef6c651786779/analysis/1342958999/">my software</a> from virustotal.com, which is even worse, because these guys hide behind the additional screen of being an aggregator - so they don't even have to take any responsibility at all, they're just forwarding information uncritically. This is a free service, so you can't even complain.<br />
<br />
What can you as a user do? I don't really know - missing out on great, safe and free software because of fear, uncertainty and doubt seems the most likely outcome. Or, you may start to at least make your voice heard when these situations arise.<br />
<br />
When your Anti-Malware software reports a false positive - demand your money back!<br />
<br />
What can I do? I don't know that either. If you have any ideas on how I can protect my reputation and continue to provide free, safe security software - do let me know.<br />
<br />
I'm getting tired of this. How much cr*p must I take to write and publish free software for your security and integrity?Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com3tag:blogger.com,1999:blog-7423109771980410273.post-71157290395231423862012-07-23T05:58:00.000-07:002012-08-06T07:38:26.402-07:00AxCrypt used for ransom attacksIn October 2011 I got an e-mail from a Turkish corporation, claiming that someone had hacked their server to the extent of getting full administrator access. Thereafter the hacker had installed AxCrypt and encrypted all or most of the files on the server, and subsequently demanded a ransom from the company owning it.<br />
<br />
At first I was very sceptical - how could someone get that kind of access to a server, and then hit on the idea to use AxCrypt to encrypt the files (for which it is workable, but not really well suited since it for example requires full administrative permissions to install, not just write permission to the files). On top of that - no backups, the only copy of the files was apparently the files on the server.<br />
<br />
It seemed just too bad to be true. A file server wide open to remote login with administrator permissions and a guessable password, with no backup routines? My first guess was that this was some kind of scheme to see if I would respond, "Sure, there's a backdoor into AxCrypt - just pay me a small amount and promise not to tell anyone and I'll help you out.". Sorry, no such (bad) luck. AxCrypt does not have any backdoors, and I can't be of help.<br />
<br />
Now, in July 2012, I've had an additional few similar e-mails and even a few phone calls, in total about 10. All of them from Turkey. Strangely enough the contacts have escalated; at first it was only e-mails which were not responded to when answered, then the e-mails started getting answered, then English-speaking persons were calling from Turkey - now most recently Swedish-speaking persons are calling from Sweden, still referring to problems originating in Turkey.<br />
<br />
I'm still at a loss to really explain the phenomenon, but I'm now tending towards actually believing that the basic facts are true. Servers and perhaps also personal computers are being hacked (it's not entirely clear just what kind of computers have been hacked). That so far every single incident has been in Turkey is, I believe, due to the simple fact that the hacker is likely to be Turkish. A significant number of these hacks seem to occur during the weekend, so it's also likely that the hacker has a day job too, which is somewhat comforting since it implies that the 'business' is not very profitable.<br />
<br />
<b>If you happen to be the victim of a ransom attack</b>, in Turkey or elsewhere, I am very sorry for your sake but please understand that I cannot be of any help whatsoever. You must contact your local police authorities and get them to investigate. They should be motivated to do so, since apparently this is not that infrequent - once again assuming that the stories I hear are actually true as told.<br />
<br />
I've tried to come up with some way to make AxCrypt even less suitable for the purpose of ransoming, but I really can't think of anything. It's just a tool, and if you let the hacker into your system with full administrator permissions, I don't think there's anything anyone can do - except you, and that is to have backups!<br />
<br />
This is not an AxCrypt issue. This is a security policy issue at the victim's site.<br />
<br />
The hackers are not even being that smart by using AxCrypt. To perform the attack they don't really have to install anything - all they have to do is encrypt the file system with EFS, the Encrypting File System which is an integral part of all modern Windows editions, export a recovery certificate and then reset the administrator password. Done. No need for extra tools such as AxCrypt. On top of that, there are literally hundreds of alternative encryption tools out there, all of them potentially 'useful' in this context. I guess in a twisted kind of way I should regard it as a compliment that AxCrypt is so easy to use and secure that even hackers want to use it!<br />
<br />
Remember that <b>backups are your final protection against data loss</b>, regardless of the cause. Go check your backup routines now - and validate that you actually can read the backups regularly as well!Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com7tag:blogger.com,1999:blog-7423109771980410273.post-37303748698382823572011-10-29T12:00:00.000-07:002011-10-29T12:00:03.633-07:00About Xecrets and the XML Encryption VulnerabilityOn October 19, researchers at the Ruhr-Universität Bochum <a href="http://aktuell.ruhr-uni-bochum.de/pm2011/pm00330.html.en">announced </a>a flaw in W3C XML Encryption.<br />
<br />
The <a href="http://www.axantum.com/Xecrets/">Axantum Password Manager Xecrets</a> uses XML Encryption to store data on our servers.<br />
<br />
This <u>does not mean</u> that Xecrets is vulnerable to attack.<br />
<br />
The flaw only works in an attack against a server that knows the encryption key, and that can be queried about the result of attempted decryption of partially modified encrypted data. It is based on the fact that most implementations will happily decrypt the provided data using the secret key and then give different error messages if the decrypted data cannot be parsed as XML. These varying error messages can then be used to infer the original data, but not the actual encryption key.<br />
<br />
Xecrets on the other hand never accepts encrypted XML in this way, nor does it know any user's encryption key except briefly during the user's visit.<br />
<br />
The XML Encryption flaw does not affect Xecrets.Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com3tag:blogger.com,1999:blog-7423109771980410273.post-35260936682111162982011-08-23T03:04:00.000-07:002011-09-30T03:05:30.611-07:00Why you should install programs to the default locationAxCrypt since version 1.7 does not have an option for the user to select installation directory during the installation process.<br />
<br />
Some like to change the installation directory, typically to D: or E:, instead of the standard location typically on C: and on English versions of Windows in C:\Program Files\ or C:\Program Files (x86) . This is no longer directly possible from the installation graphical user interface of AxCrypt, and sometimes I get asked why.<br />
<br />
The main reason is to avoid trouble, and to minimize user options where I as a developer believe I can make a better informed choice. AxCrypt is built around many such decisions based on that premise; we choose the algorithms to use instead of providing you with a bewildering array of choices, for example. This is simply because I as an encryption expert believe that I can make this choice better in at least 99.9% of the cases and thus spare all those users a strange question they don't really know how to or even want to answer.<br />
<br />
With several millions of installations of AxCrypt, just about anything that can possibly go wrong has gone wrong at least once or twice. More than twice I've had to help users with trouble caused by not understanding the interaction between Windows, the registry, fixed, removable and network drives and the AxCrypt installation. AxCrypt has been installed to network drives, on remote VPN-mounted drives, on USB drives, on CDs and just about anything you can imagine. Often it works, but sometimes it does not.<br />
<br />
With AxCrypt 1.7 and the upgrade to use Windows Installer technology, a major motivation was increased robustness. Part of this is to minimize the risk of a user mistakenly making a bad choice, and the safest and easiest way to do this is to make the choice automatically. Thus the option to select an installation directory was removed from the installer graphical user interface. It's still there - but you need to know a bit about Windows Installer in order to force it to do your bidding. The idea here is that any user skilled and knowledgeable enough to do this is also skilled enough to make that decision with small or no risk of mistake. It's also very clear that if something does go wrong, it's something that needs to be fixed by the user, and it does not wind up as an error report about AxCrypt.<br />
<br />
The thing that cinched the decision to remove the installation directory option was that I could not, try as I might, think of a single valid functional reason for changing it from the system default! Aesthetic, arguably yes. Functional, no. AxCrypt is tiny, has no performance impact on the drive it is installed to, and does not produce any growing data there. When we do change the installation directory, we also break some assumptions that are made by other software. We also become responsible for ensuring the right file system permissions; we might for example open a vector for a malware infection by installing to a directory that allows writes with non-administrative rights. Please note that AxCrypt, due to Windows design limitations, requires administrator elevation to install anyway. Other assumptions we break are the locations of 32-bit vs. 64-bit software in the various virtualized environments offered.<br />
<br />
So, we wind up with a situation where I can find no situation where it's bad to install to the system default location, but several where it's bad to install to a different location. By making the installation easier for the user by removing one decision, we also make it safer and more robust. It's an easy call I think.<br />
<br />
Finally, if you can provide me with a valid functional reason for not installing AxCrypt to the system default location, please do so and I will try to accommodate that reason in the best way I can think of.Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com4tag:blogger.com,1999:blog-7423109771980410273.post-70619544416936149712011-08-19T03:03:00.000-07:002012-08-01T11:31:57.864-07:00Concerning false positive reports about AxCrypt from antivirus softwareFrom time to time I get user reports about warnings from antivirus software concerning either the installer or one or more of the components of AxCrypt.<br />
<br />
This causes great trouble both for me and the user. The user often winds up with inoperable software, and I get a lot of extra work defending myself against unfounded allegations by software companies that take no responsibility whatsoever. They will not guarantee anything about the 'security' they provide, and they will not in any way assume responsibility for harm caused by flagging clean software falsely as malicious. In a normal legal context this would be called slander, and be cause for legal action.<br />
<br />
Some facts about AxCrypt and AxCrypt distributions. AxCrypt is always built completely from source, we do not statically or dynamically link to any third party code except those libraries that are part of the Visual Studio development environment and which come directly from Microsoft.<br />
<br />
Distributions are not built on a developer PC, they are built on a special purpose build server which only does that. No software other than that required to build our various programs is installed there. This server is stationed behind double firewalls, and is never used for any general purpose.<br />
<br />
As part of the automated build process, each executable is digitally signed with an authenticode certificate, issued to 'Axantum Software AB'. The issuer of this certificate does certify that such an entity exists, and that it is in good standing. I have provided them with proof of the company's registration etc. This signing process then ensures that any bits distributed with that signature are traceable back to me and my company, and we would thus potentially be legally accountable for any malware intentionally placed there.<br />
<br />
To sum it up: <b><i>There is no infection in a distribution from me which is digitally signed with my authenticode certificate in the name 'Axantum Software AB'</i></b>.<br />
<br />
It is a continuing effort trying to defend oneself as an independent developer against the so-called anti-virus companies' unfounded allegations.<br />
<br />
It is beyond belief that a serious anti-virus vendor still in 2011 will flag a properly digitally signed executable as malicious.<br />
<br />
If I had the financial resources I would take strong legal action, since this causes harm to my good standing, and to that of my programs, that is sometimes hard or impossible to repair.<br />
<br />
Please check that you have the properly digitally signed versions of both the installer and the executable components if you are in doubt, instructions on how to do this are found <a href="http://www.axantum.com/AxCrypt/Downloads.html">here</a>.<br />
<br />
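If you want to script the check, something along these lines can be used (a minimal sketch; the file path is hypothetical, and note that this only inspects the signer certificate and its chain - it is not the complete Authenticode validation that Windows itself performs):<br />
<pre><code>using System;
using System.Security.Cryptography.X509Certificates;

public static class SignerCheck
{
    public static void PrintSigner(string path)
    {
        // Extract the signer certificate from the digitally signed file.
        X509Certificate2 signer = new X509Certificate2(X509Certificate.CreateFromSignedFile(path));
        Console.WriteLine("Signed by: " + signer.Subject);

        // Check that the certificate chains to a trusted root.
        X509Chain chain = new X509Chain();
        Console.WriteLine("Certificate chain valid: " + chain.Build(signer));
    }
}
</code></pre>
<br />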
Please help the community by reporting your findings as a false positive to your anti-virus vendor. Although the vendors emphatically deny this, they do share signatures (or 'borrow' from each other). This is clearly evidenced by the fact that these false-positive situations usually come in swarms where I get a few reports first from one vendor, and then most of the other vendors follow suit. That can't be a coincidence...<br />
<div>
<br /></div>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com1tag:blogger.com,1999:blog-7423109771980410273.post-27724227246047994362010-09-20T02:58:00.000-07:002011-09-30T03:03:05.919-07:00About the ASP.NET Padding Oracle Attack<h2>About the Padding Oracle Attack</h2><br />
You may have read about the Padding Oracle Attack, risking exposure of sensitive information in millions of ASP.NET sites.<br />
<br />
This site is not one of them in any real sense, and never was.<br />
<br />
The ASP.NET Padding Oracle Attack exploits a vulnerability published as early as 2002 by Serge Vaudenay in a paper entitled "Security Flaws Induced by CBC Padding - Applications to SSL, IPSEC, WTLS...". As usual it's amazing how long it takes for these things to come to the attention of the large vendors, such as Microsoft.<br />
<br />
This attack is in no way specific to ASP.NET - just about every major web platform is likely to be potentially vulnerable. For the technical details, please read the paper by Vaudenay as well as the more recent paper entitled "Practical Padding Oracle Attacks" by Juliano Rizzo and Thai Duong. Here I'll just try to explain the factors that cause the vulnerability, and what the consequences may be, as well as describe why this site never was vulnerable in any real sense.<br />
<br />
Padding is used in a block cipher to make the clear text about to be encrypted an even multiple of the block length. In other words, if the encryption algorithm is designed such that it encrypts 16 bytes at a time, and your clear text is not a multiple of 16 bytes long, we need to add a few dummy bytes at the end to make it an even multiple of 16 in this example. These 'dummy bytes' are called padding.<br />
<br />
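As an illustration, here is a minimal sketch of PKCS#7-style padding (the scheme .NET's symmetric algorithms use by default); it is written for clarity, not as production code:<br />
<pre><code>using System;

public static class Pkcs7Padding
{
    // If, say, 3 bytes are missing to fill the last 16-byte block, append
    // three bytes each with the value 3. A complete final block of padding
    // is added when the clear text is already an even multiple of the block size.
    public static byte[] Pad(byte[] clearText, int blockSize)
    {
        int padLength = blockSize - (clearText.Length % blockSize);
        byte[] padded = new byte[clearText.Length + padLength];
        Buffer.BlockCopy(clearText, 0, padded, 0, clearText.Length);
        for (int i = clearText.Length; i < padded.Length; ++i)
        {
            padded[i] = (byte)padLength;
        }
        return padded;
    }

    // The padding is self-verifying: the last byte says how many padding bytes
    // there are, and they must all have that value. This is exactly the check
    // a padding oracle lets an attacker probe.
    public static bool IsValid(byte[] decrypted)
    {
        int padLength = decrypted[decrypted.Length - 1];
        if (padLength < 1 || padLength > decrypted.Length)
        {
            return false;
        }
        for (int i = decrypted.Length - padLength; i < decrypted.Length; ++i)
        {
            if (decrypted[i] != padLength)
            {
                return false;
            }
        }
        return true;
    }
}
</code></pre>
<br />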
Most encryption schemes use padding that follows a pattern so that the decryption logic can recognize and remove it. Since such a padding scheme is self-verifying, the decryption program can determine if the padding is correct or not - and also give a specific error if the padding is wrong.<br />
<br />
An attack requires access to an application that uses a block encryption cipher and actually knows the decryption key, and which an attacker can 'ask' if a given encrypted text contains a padding error or not.<br />
<br />
The idea is to send in encrypted text to the application, and then determine if it specifically has a padding error after decryption or not. Obviously, if an attacker sends in bad encrypted text, an error is likely to occur, but the attack requires that an attacker can distinguish the very specific error 'padding error' from other errors reported.<br />
<br />
<h2>What's a Padding Oracle?</h2><br />
There are basically two ways an attacker can determine if a padding error has occurred as the result of the manipulated encrypted text: The easy way is if the application actually says exactly this. With ASP.NET you can for example get the quite clear message "CryptographicException: Padding is invalid and cannot be removed". It does not get any clearer. The harder way is if the application shows different timing characteristics between reporting this error and other possible errors. This is a much harder attack, and much more likely to take significantly longer time since the timing is determined by many other factors as well that are likely to be unknown and uncontrollable by the attacker.<br />
<br />
The way to defend against the attack is then to A) ensure that no specific message or error code is returned when a padding error occurs, and B) ensure that timing cannot be used by an attacker as an indirect distinguisher.<br />
<br />
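A minimal sketch of the first defense (illustration only, not the actual code of any site mentioned here): whatever goes wrong while handling manipulated encrypted input, the caller gets one and the same response.<br />
<pre><code>using System;

public static class UniformErrors
{
    // Collapse every failure - padding errors, parse errors, anything - into the
    // same outcome, so an attacker cannot use the error as a padding oracle.
    public static byte[] TryDecryptOrNull(byte[] cipherText, Func<byte[], byte[]> decrypt)
    {
        try
        {
            return decrypt(cipherText);
        }
        catch (Exception)
        {
            // Deliberately indistinguishable: no message, no error code, no details.
            return null;
        }
    }
}
</code></pre>
Timing differences are harder to hide and need to be considered separately, as noted above.<br />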
A Padding Oracle is something we can ask a question about a given encrypted text, and receive an answer stating either 'Yes, the padding is correct' or 'No, the padding is incorrect'. The trick is to ensure our application is not a Padding Oracle!<br />
<br />
<h2>The consequences of an attack and why it's so serious for ASP.NET</h2>
<br />
What's the worst that can happen? Well, anything that is protected by the encryption key used to encrypt the data is potentially vulnerable to both inspection and undetected modification by the attacker.<br />
<br />
In the case of ASP.NET, this usually means that the 'machine key' is vulnerable. This is the ASP.NET machine key used to encrypt ViewState and cookies etc; it's not the Windows machine key. In the case of this site, we generate a new key every time the site is started, so even a successful attack has a very short time of validity.<br />
<br />
Gaining access to the ASP.NET machine key typically means being able to impersonate a logged on user, and possibly gain access to files and other information available to that logged on user. In the case of ASP.NET 3.5 SP 1 and later, it means being able to access all files accessible to the web application via a virtual path. In actual practice, the attack is practical with only a few thousand tries on a typical web site.<br />
<br />
The problem with ASP.NET is that a security researcher found a pretty much universal 'Padding Oracle' that is almost entirely independent of the application in question. It uses the 'WebResource.axd' handler as an attack vector. This handler seems to have the bad taste to respond 404 Not Found when the coded resource has correct padding, but is wrong - and 500 Server Error when the coded resource has incorrect padding. There's your padding oracle.<br />
<br />
This is pretty bad, so we certainly should take this seriously.<br />
<br />
<h2>The status for www.axantum.com</h2><br />
The Xecrets on-line password storage has never been vulnerable to this attack for the simple reason that we don't know the encryption key users use, so there's no possibility that our application can be used as a padding oracle for the purpose of breaching the Xecrets password encryption.<br />
<br />
However, the Xecrets site as such does use ASP.NET and can theoretically be used as a padding oracle, in the sense that if it should fall to such an attack it would be possible to act as an administrator of the application (not the system). This will still not enable anyone to access stored Xecrets, because the system does not know the encryption key for those files. There is no sensitive information available that is protected by the ASP.NET machine key. It could in theory enable someone to get free access to the Xecrets service though!<br />
<br />
Also, because we create a new machine key every time we restart or recycle the application, even a successful attack would only be valid for a rather short time. Then again, there are rumours that a followup to the attack could lead to code injection. <br />
<br />
The Xecrets site uses custom handling of both server errors and not found errors, but it's still probable that it was vulnerable to the WebResource.axd attack. The Xecrets site has from the start employed a number of strategies to give away as little information as possible and reasonable in the face of errors, and has thus always conformed to the first criterion for avoiding vulnerability - it returns the same message and page regardless of what kind of error manipulated encrypted text sent to the site causes.<br />
<br />
The problem here is that Microsoft has once again failed to follow that maxim, and also failed to follow general good cryptology practices and confused encryption with authentication. Encrypted data should always be verified for authenticity before use, for example by employing a Message Authentication Code, or a digital signature. All encryption from Axantum uses the well-known 'Encrypt-then-HMAC' construction or other mechanisms to ensure the authenticity of encrypted data. If ASP.NET had done the same, this would never have happened.<br />
<br />
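A minimal sketch of the verify-before-decrypt idea (illustration only, not AxCrypt's actual file format): the HMAC over the cipher text is checked first, and manipulated data is rejected before decryption is even attempted, so no padding error ever becomes observable.<br />
<pre><code>using System;
using System.Security.Cryptography;

public static class EncryptThenHmac
{
    // The input is assumed to be the HMAC followed by the cipher text.
    public static byte[] VerifyAndGetCipherText(byte[] macAndCipherText, byte[] macKey)
    {
        using (HMACSHA256 hmac = new HMACSHA256(macKey))
        {
            int macLength = hmac.HashSize / 8;
            byte[] expectedMac = new byte[macLength];
            Buffer.BlockCopy(macAndCipherText, 0, expectedMac, 0, macLength);

            byte[] cipherText = new byte[macAndCipherText.Length - macLength];
            Buffer.BlockCopy(macAndCipherText, macLength, cipherText, 0, cipherText.Length);

            if (!AreEqualConstantTime(expectedMac, hmac.ComputeHash(cipherText)))
            {
                // Authenticity fails - refuse before any decryption takes place.
                throw new CryptographicException("Data has been tampered with.");
            }
            return cipherText; // Only now is it safe to decrypt this.
        }
    }

    private static bool AreEqualConstantTime(byte[] a, byte[] b)
    {
        if (a.Length != b.Length)
        {
            return false;
        }
        int difference = 0;
        for (int i = 0; i < a.Length; ++i)
        {
            difference |= a[i] ^ b[i];
        }
        return difference == 0;
    }
}
</code></pre>
<br />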
Once again it is shown that following established security and encryption practices will mitigate the situation even in the face of future attacks, impossible to know at the original time of construction. It is also shown that even today, it can take up to 8 years(!) for billion dollar companies to react to a published threat affecting some of the world's most widely deployed platforms.<br />
<br />
As of today, the Xecrets site is also updated to avoid even the ASP.NET Padding Oracle attack via WebResource.axd - or any other similar vector for that matter.Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-45310416694805773162009-12-05T02:57:00.000-08:002011-09-30T02:58:45.710-07:00Password Expiration is a Meaningless RitualThere are many examples throughout history where a once meaningful rule over time outlives its original usefulness and becomes meaningless ritual. Password changing policies in a modern network of independent computers like a typical corporate network are such an example today.<br />
<br />
A password changing policy is that annoyance that you are faced with every 3, 6 or 12 months typically when you get a notification stating that your password is about to expire and you have to change it.<br />
<br />
Now, why is that a meaningless ritual? Because the original justification no longer applies. This practice originated in a system of time-shared central computers, your IBM mainframe or VAX/VMS mini(!) computer. You connected to this central beast using a fairly dumb synchronous or asynchronous terminal. The distinctive feature of these terminals was that they did not load and execute arbitrary code. They just displayed information as it was sent to them. To gain access to a system or an application, you started this application on the central computer, and it then asked you for your credentials, i.e. user name and password. If it was ok, it let you access the system. It was all fairly similar to the DOS-box we have in today's Windows computers.<br />
<br />
In these days of central IT departments and a limited number of terminals, it was common practice that if you went for a vacation, or had to take sick leave, you'd let your colleague use your login information to help complete the tasks that needed completing. This of course led to a situation over time where you essentially lost control over who actually had access to your logon credentials and could use your account. So, to minimize the effect of this, IT departments invented password aging and expiration, forcing you to change it every now and then. This actually had an effect, because if someone with bad intentions actually had gained access to your password, it now became worthless (unless of course, you do as most people did then and still do - use a consistent theme for your passwords, since you can't be bothered to invent and remember a really new one every time).<br />
<br />
So, back to why this practice now is a meaningless ritual. Because the password is no longer limited to giving access to a central system via a non-programmable terminal. Today, the password typically gives you the right to install and execute arbitrary code on the actual computer used to access the systems in question! Anyone with any kind of security training knows that if someone once has had access to a computer with enough privileges to run and install software, that computer is forever potentially compromised until it is reinstalled from original operating system media.<br />
<br />
Does changing the password actually enable you to regain control over your system? Is that the recommended practice if you've had a virus or other malware in your computer? Change the password? Of course not! It's a meaningless gesture changing nothing. Your system will remain potentially compromised until you reinstall the original software from scratch.<br />
<br />
So, if you're an IT department manager, why would you want to implement a password expiration policy? The only reason I can think of is because it feels good, and because it's the way we've always done it. It doesn't actually improve the security of your network one single bit. Not at all. It does annoy the users, and gives you a certain sense of power of course! That's always something.<br />
<br />
What should you do instead, provided you're constrained to passwords?<br />
<br />
<br />
<ul><li>Set up a password complexity policy that is tough enough that a dictionary attack is unlikely to succeed. Go for length rather than require special characters etc. 15 characters or more is probably a good idea.</li>
<li>Set up a password change policy to the effect that the password never expires and cannot be changed by the user - yes, the opposite of what is probably the most common policy today!</li>
<li>The best is to generate passwords for your users - yes, you select them! Use a password generator that produces passwords that are not just random collections of characters, but rather combinations of characters that are possible to remember (a small sketch follows after this list). Give the new user the password on a piece of paper, and keep no copy for yourself.</li>
<li>Explain to the new user that this is the password, it's ok to keep the paper in the wallet for a few weeks until it sticks to memory. In return for this rather tough password complexity, the user will never need to remember another password while employed by this company. That's a fair tradeoff!</li>
<li>Also explain that this password may not be re-used at any other location, that it's a breach of company security IT policy to do so. The password is in effect company confidential and privileged information that may not be disclosed to any third party.</li>
</ul><br />
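A minimal sketch of such a generator (illustration only; a real one should also remove the small modulo bias and preferably mix in digits or separators):<br />
<pre><code>using System;
using System.Security.Cryptography;
using System.Text;

public static class MemorablePassword
{
    private const string Consonants = "bdfghjklmnprstv";
    private const string Vowels = "aeiou";

    // Alternating consonants and vowels gives pronounceable, rememberable passwords.
    // Entropy per character is lower than with fully random strings, so go for length.
    public static string Generate(int length)
    {
        byte[] buffer = new byte[length];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(buffer);
        }
        StringBuilder password = new StringBuilder(length);
        for (int i = 0; i < length; ++i)
        {
            string alphabet = (i % 2 == 0) ? Consonants : Vowels;
            password.Append(alphabet[buffer[i] % alphabet.Length]);
        }
        return password.ToString();
    }
}
</code></pre>
<br />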
Now, if you get into the situation that the password is considered compromised, which will most likely be because of a malware infestation in your corporate network, it's fairly obvious that you both have to clean all the systems and change all the possibly compromised passwords. But only then! And the reverse applies too: if you have a suspicion that the password is compromised, you should consider all systems where this user has logged on as compromised and candidates for reinstallation.<br />
<br />
So, let's start to modernize our policies and actually make them mean something instead of going through old and meaningless rituals!<br />
<br />
Update your password policies today!<br />
<div><br />
</div>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-46805923263220396342009-03-01T02:55:00.000-08:002011-09-30T02:57:15.539-07:00How not to shuffle a deck of cards with LINQI’m an avid reader of MSDN Magazine, and seldom find any errors. However, in Ken Getz's article “<a href="http://msdn.microsoft.com/en-us/magazine/cc700332.aspx">The LINQ Enumerable Class, Part 1</a>” in the July 2008 issue, I found a rather glaring error that needs correction. I sent the following text to Ken, but unfortunately never got a response. Hopefully some will see this blog post, and we'll not be seeing the error illustrated here in production code.<br />
<br />
The following piece of code intended to solve the classic shuffle problem is very wrong:<br />
<br />
<span style="font-family: 'courier new', courier;">Dim rnd As new System.Random()<br />
Dim numbers = Enumerable.Range(1, 100).OrderBy(Function() rnd.Next)</span><br />
<br />
The error will manifest by making some shuffles more or less likely than others. It is not an unbiased shuffle.<br />
<br />
The problem lies in the fact that a list of 100 random numbers, independently chosen, are used to produce a random order of the numbers 1 to 100.<br />
<br />
If this code is used as a template for a simulation, the results will be skewed, because not all outcomes of the shuffle are equally likely. If the code is used (with appropriate substitution to a strong pseudo random number generator) for gaming software, either the players or the casino will get better odds than expected.<br />
<br />
This is rather serious, as code snippets from MSDN Magazine are likely to be used in many applications.<br />
<br />
Why is the code wrong?<br />
<br />
Because, when shuffling N numbers in random order, there are N! possible shuffles. But, when picking N random numbers independently, from a set of M numbers, there are M**N possible outcomes due to the possibility of the same number being drawn more than one time. As a tiny illustration, shuffling just 3 items by drawing 3 independent values from a set of 3 gives 27 equally likely outcomes, which cannot be spread evenly over the 3! = 6 possible orderings.<br />
<br />
For there to be a possibility of this resulting in all shuffles being equally likely, M**N must be evenly divisible by N!. But this is not possible because in this particular case M, 2**31-1 or 2,147,483,647, is prime! System.Random.Next() will return a value >= 0 and < Int32.MaxValue, so there are Int32.MaxValue possible outcomes, which is our M in this case.<br />
<br />
This is a variation of a classic implementation error of the shuffle algorithm, and I’m afraid that we’ll have to stick with <a href="http://en.wikipedia.org/wiki/Fisher-Yates_shuffle">Fisher-Yates shuffle</a> a while longer. Changing the code to use for example Random.NextDouble() does not remove the problem, it just makes it a bit harder to see. As long as the number of possible outcomes of the random number sequence is larger than the number of possible shuffles, the problem is very likely to be there although the proof will differ from case to case.<br />
<br />
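For reference, a minimal Fisher-Yates sketch in C# (System.Random is used here only for brevity; a gaming application needs a cryptographically strong generator and attention to the bias issues discussed below):<br />
<pre><code>using System;

public static class Shuffling
{
    // Fisher-Yates: each of the n! orderings is equally likely,
    // provided random.Next(i + 1) itself is unbiased.
    public static void Shuffle<T>(T[] items, Random random)
    {
        for (int i = items.Length - 1; i > 0; --i)
        {
            int j = random.Next(i + 1); // 0 <= j <= i
            T temp = items[i];
            items[i] = items[j];
            items[j] = temp;
        }
    }
}
</code></pre>
<br />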
There are many more subtle pitfalls in doing a proper shuffle, using the modulo function to reduce integer valued random number generator outputs or using multiplication and rounding to scale a floating point valued RNG just being two of the more well-known.<br />
<br />
By the way, the actual implementation of System.Random in the .NET Framework is quite questionable in this regard as well. It will not return an unbiased set of random numbers in some of the overloads, and the Random.NextDouble() implementation will in fact only return the same number of possible outcomes as the System.Next(), because it just scales System.Next() with 1.0/Int32.MaxValue.Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-49023569259296137292008-09-12T02:54:00.000-07:002011-09-30T02:55:22.323-07:00How to make a file read in Windows not become a writeA little known, and even less used, feature of all Windows versions from XP and forward is that they support a property called 'Last Access' on all files. On the surface, this seems neat, if not so useful. You can see whenever a file was last accessed using this property.<br />
<br />
But think about it. What does this mean? It means that every time you open a file for reading, Windows needs to write something somewhere on the disk! If you're in the process of enumerating, let's say 500,000 files, this equals slow! Does anyone ever use that property? Not that I know of.<br />
<br />
I'm working with file based persistent storage in my solutions, not with a database, so file access is pretty important to me. By disabling this 'feature', I sped up enumerating the file system by about a factor of 10! Generally speaking, you'll speed up any system with many file accesses by turning this feature off.<br />
<br />
It's really simple too. At a DOS-prompt write:<br />
<br />
<span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">fsutil behavior set disablelastaccess 1 </span><br />
<br />
When you're at it, you might also want to do:<br />
<br />
<span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">fsutil behavior set disable8dot3 1 </span><br />
<br />
This last command disables generation of 8-dot-3 legacy file names, effectively halving the size of directories in NTFS, which must be a good thing. Beware that there might be 16-bit software out there which actually needs those 8-dot-3 names to find your files...<br />
<div><br />
</div>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-69800770342642692032008-02-07T02:53:00.000-08:002011-09-30T02:54:10.464-07:00Book Review: Microsoft Windows Internals, Fourth EditionMicrosoft Windows Internals, Fourth Edition, by Mark E. Russinovich and David A. Salomon, Microsoft Press, LOCCN 2004115221<br />
<br />
Many years ago, before the release of NT 3.1, I read a book entitled "Inside Windows NT" by Helen Custer. It was a great book, basically a text-book on operating system theory - as exemplified by Windows NT. It covered the theory of how to implement an operating system kernel, showing how it was done in Windows NT. It did not talk about API's so much as about the data structures and logic behind the scenes and the theory of the basic functions of an operating system such as memory management and the IO system.<br />
<br />
As I'm now getting back into some heavy-duty C++ coding for the Windows environment, I thought this might be a good refresher for me to (re-)learn about internal structures and enable me to find the right places to implement the functionality I need.<br />
<br />
With these expectations I was a bit disappointed by "Windows Internals, Fourth Edition". It's a very different kind of book compared to the original first edition - in fact it's not the fourth edition of "Inside Windows NT" - it's really the second or third edition of "Windows Internals". So, what kind of book is it then?<br />
<br />
"Windows Internals" is a cross between a troubleshooting manual for very advanced system managers, a hackers memoirs, an applied users guide to sysinternals utilities and the documentation Microsoft didn't produce for Windows.<br />
<br />
It's almost like an independent black-box investigators' report of findings after many years of peering into the internals of Windows - from the outside. Instead of describing how Windows is designed from the designers point of view, it describes a process of external discovery based on reverse-engineering and observation. Instead of just describing how it works, the book focuses on "experiments" whereby with the help of a bunch of very nifty utilities from sysinternals you can "see" how it works.<br />
<br />
I find the approach a little strange, I was expecting a more authoritative text, not an experimental guide to 'discovery'. I don't think one should use experimental approaches to learning about a piece of commercial software. Software is an engineering practice - and it should be described, not discovered. It should not be a research project to find out how Windows works - it should be a matter of reading documentation and backgrounders, which was what I was hoping for when purchasing the book.<br />
<br />
Having read all 870 pages, what did I learn? I learnt that sysinternals (http://technet.microsoft.com/en-us/sysinternals/default.aspx) has some very cool utilities (which I already knew), and I learnt a bit about how they do what they do, and how to use them to inspect the state of a Windows system for troubleshooting purposes. As such, it should really be labelled "The essential sysinternals companion", because that's what it really is. It shows you a zillion ways to use the utilities for troubleshooting. Which is all well and good as it goes and very useful in itself.<br />
<br />
To summarize, this is not really the book to read if you want to get an authoritative reference about the Windows operating system, although you will learn quite a bit along the way - after all, there is quite a bit of information here. If you're a system manager and/or facing extremely complicated troubleshooting scenarios, then this book is indeed for you. Also, if you're a more practical-minded person, and just want to discover the 'secrets' of Windows, you'll find all the tools here. I would have preferred that Microsoft documented things, instead of leaving it for 'discovery' (and then hiring the people doing the discovering if they're too good at it, and then making them write a book about it - which is what happened here).Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-59809261053021890792008-01-08T02:47:00.000-08:002011-09-30T02:53:18.766-07:00Lock object sharing with hashesIn web applications we frequently need to serialize access to resources in concurrent applications, since ASP.NET is inherently concurrent. A typical scenario is that we have several loosely connected objects that all apply to the same user, and we need to ensure single-threaded access during a read-read or read-modify-write cycle to get a consistent view or update.<br />
A user is usually represented via some sort of string identifier, perhaps an e-mail address. In C# what we want to do is something like:<br />
<pre><div style="border: #000080 1px solid; color: black; font-family: 'Courier New', Courier, Monospace; font-size: 10pt;"><div style="background-color: white; max-height: 300px; overflow: auto; padding: 2px 5px;"><span style="color: blue;">lock</span> (<span style="color: #010001;">user</span>)
{
<span style="color: green;">// Do something that needs single-thread access</span>
}</div></div></pre>The problem is what are we to use as a lock object? C# can use any object as a lock, but which one to pick? We must ensure that multiple threads will always get the right object instance, regardless of when in the application life-time the need arises, so in effect these objects must live for the life of the application. This can lead to massive memory consumption: assume a system with one million users - after a while we'll have to keep one million objects around, probably in a hash table indexed by the e-mail string. That can mean some serious memory problems.<br />
<br />
One approach would be to clean up the table when no-one is using a specific lock object, but this is complicated and fraught with its own threading problems.<br />
<br />
After a few false starts, I came up with the following scheme which has now been tested in the wild and been found quite effective as a trade-off between memory and lock contention.<br />
<br />
In actual fact, there are usually a rather limited number of possible concurrent actions, limited basically by the number of threads that are active. This number is typically 100 per processor in ASP.NET, and in most applications even with many users the number of actual concurrent requests at any given time is even fewer. So, assuming a 100 concurrent threads, and assuming that they will only acquire one user lock (our example here) at a time, we really only need at most 100 lock objects - not a million. But how to do this?<br />
<br />
The algorithm I've come up with is probably not new, but I've not seen it before in the literature nor when actively searching on the web, so it's at least a bit unusual. Here's how it works:<br />
<br />
1. Allocate an array with a fixed number of elements, perhaps twice the number of estimated concurrent accesses.<br />
2. Fill the array with objects to be used as locks.<br />
3. To acquire a lock for a given user, generate an index into the lock object array by taking a hash of the user identifier, typically with the GetHashCode() method, and then take that modulo the number of lock objects. This is the index into the lock table; use the indexed object and lock.<br />
<br />
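In code, the usage ends up looking something like this (a minimal sketch, using the LockObjects helper class shown further down):<br />
<pre><code>// All threads asking for the same user identifier get the same lock object,
// so the read-read or read-modify-write cycle is serialized per user.
private static readonly LockObjects _userLocks = new LockObjects(200);

public void UpdateUserProfile(string email)
{
    lock (_userLocks.GetLockObject(email))
    {
        // Read, modify and write the user's loosely connected objects here.
    }
}
</code></pre>
<br />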
At best, you'll get a free lock and acquire the lock. <br />
<br />
At second best, another thread is actually holding the lock for the same user, and your thread is put on hold as it should be.<br />
<br />
At worst, another thread is actually holding the lock but for a different user that happens to use the same lock when calculating the index via the hash modulo the lock array size. By having good hash algorithms and an appropriate number of locks in relation to the number of concurrent accesses, this should be a very infrequent occurrence. But even if it happens, nothing bad happens except that your thread will have to wait a little longer than was absolutely necessary.<br />
<br />
This simple algorithm will require a fixed number of locks in relation to the level of concurrency instead of the number of potential objects that require locking, and at a very low cost. Sample code follows:<br />
<pre><div style="border: #000080 1px solid; color: black; font-family: 'Courier New', Courier, Monospace; font-size: 10pt;"><div style="background-color: white; max-height: 300px; overflow: auto; padding: 2px 5px;"><span style="color: blue;">public</span> <span style="color: blue;">class</span> <span style="color: #2b91af;">LockObjects</span>
{
<span style="color: blue;">private</span> <span style="color: blue;">object</span>[] <span style="color: #010001;">_lockObjects</span>;
<span style="color: blue;">public</span> <span style="color: #010001;">LockObjects</span>(<span style="color: blue;">int</span> <span style="color: #010001;">numberOfLockObjects</span>)
{
<span style="color: #010001;">_lockObjects</span> = <span style="color: #010001;">GetInitialLockObjects</span>(<span style="color: #010001;">numberOfLockObjects</span>);
}
<span style="color: blue;">public</span> <span style="color: #010001;">LockObjects</span>()
: <span style="color: blue;">this</span>(20)
{
}
<span style="color: blue;">private</span> <span style="color: blue;">object</span>[] <span style="color: #010001;">GetInitialLockObjects</span>(<span style="color: blue;">int</span> <span style="color: #010001;">numberOfLockObjects</span>)
{
<span style="color: blue;">object</span>[] <span style="color: #010001;">lockObjects</span> = <span style="color: blue;">new</span> <span style="color: blue;">object</span>[<span style="color: #010001;">numberOfLockObjects</span>];
<span style="color: blue;">for</span> (<span style="color: blue;">int</span> <span style="color: #010001;">i</span> = 0; <span style="color: #010001;">i</span> < <span style="color: #010001;">lockObjects</span>.<span style="color: #010001;">Length</span>; ++<span style="color: #010001;">i</span>)
{
<span style="color: #010001;">lockObjects</span>[<span style="color: #010001;">i</span>] = <span style="color: blue;">new</span> <span style="color: blue;">object</span>();
}
<span style="color: blue;">return</span> <span style="color: #010001;">lockObjects</span>;
}
<span style="color: blue;">public</span> <span style="color: blue;">virtual</span> <span style="color: blue;">object</span> <span style="color: #010001;">GetLockObject</span>(<span style="color: blue;">params</span> <span style="color: blue;">string</span>[] <span style="color: #010001;">keys</span>)
{
<span style="color: blue;">int</span> <span style="color: #010001;">lockHash</span> = 0;
<span style="color: blue;">foreach</span> (<span style="color: blue;">string</span> <span style="color: #010001;">key</span> <span style="color: blue;">in</span> <span style="color: #010001;">keys</span>)
{
<span style="color: #010001;">lockHash</span> += <span style="color: #010001;">key</span>.<span style="color: #010001;">ToLowerInvariant</span>().<span style="color: #010001;">GetHashCode</span>();
}
<span style="color: #010001;">lockHash</span> = <span style="color: #010001;">Math</span>.<span style="color: #010001;">Abs</span>(<span style="color: #010001;">lockHash</span>) % <span style="color: #010001;">_lockObjects</span>.<span style="color: #010001;">Length</span>;
<span style="color: blue;">return</span> <span style="color: #010001;">_lockObjects</span>[<span style="color: #010001;">lockHash</span>];
}
}</div></div></pre>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0tag:blogger.com,1999:blog-7423109771980410273.post-48729438448426759252007-11-26T02:46:00.000-08:002011-09-30T02:47:45.478-07:00Book Review: XSLT 2.0 Programmer's Reference Third EditionXSLT 2.0 Programmer's Reference Third Edition, by Michael Kay, Wiley Publishing, Inc., ISBN 0-7645-6909-0<br />
<br />
XSLT, or XSL, is a subject that I'm no expert in, but I've come across it from time to time and generally have had a hard time really grasping the how and the why of it. In most cases I can program, or at least tweak, just about anything with very little introduction. Fixing and tweaking the XSLT stylesheets that I've come upon has been a tougher experience where I've felt myself reduced to guesswork and magic. That's not a feeling I like, so I decided to do some background studying.<br />
<br />
A Programmer's Reference is perhaps not the first choice as an introduction to a subject, but in this case it was hard to find just where to start, and I felt that I was experienced enough to go for some core literature from the start, which also would have the benefit of being useful in a real situation as reference literature.<br />
<br />
Since I'm a newcomer to XSLT this review will have to be both about the book as such, and also about the subject matter. Let's start with the book.<br />
<br />
Michael Kay is certainly an authority, being the editor of the XSLT 2.0 Working Group. The book is also authoritative and extremely carefully written with an extraordinary focus on details. I did find a few typos, errors and editorial mistakes but taking the amount of text into account it's still a very, very good piece of work.<br />
<br />
This book is not written to be read cover to cover, which is nevertheless what I did, and it's not a bad way to get a thorough introduction to XSLT. Be prepared for quite a few hours though; I spent about 20 hours reading it. It's entitled XSLT 2.0, and was written before XSLT 2.0 was actually approved as an official recommendation, which happened on 23 January 2007. I've not checked, but there are sure to be some minor differences between the final recommendation and the drafts upon which the book was based. Being such a recent standard, there are as a consequence very few XSLT 2.0 compliant implementations in existence, so XSLT 1.0 is still very much in use. The book is careful to keep track of differences and changes, and should work well for XSLT 1.0 use as well.<br />
<br />
It's very heavy reading indeed, but if you only want to get one book about XSLT 2.0 this is very probably the one to get.<br />
<br />
The real question, though, that I must raise after reading this and getting a good feel for XSLT is: do you want to get any book about XSLT at all?<br />
<br />
XSLT is about XML transformation, or actually transformation in general. It doesn't really have to be from or to XML; it can be from plain text to HTML or any number of other combinations, depending on the requirements and capabilities of the parsers and processors available. This is obviously extremely useful - being able to massage data between different forms - and it's frequently used in one-off applications and in various integration projects. XSLT is also intended to fit in alongside CSS 2.0, as a way to perform formatting for presentation that is not possible with CSS alone - that's why it's called XSL Transformations, XSL being the Extensible Stylesheet Language.<br />
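<br />
To make the transformation idea concrete, here is a minimal sketch of my own - not an example from the book - of a stylesheet that turns a hypothetical <span style="font-family: courier;"><people></span> document into an HTML list; the element names are invented for the illustration:<br />
<pre><code><?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative only: transform <people><person><name>...</name></person></people>
     into an HTML bullet list. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="yes"/>

<!-- Match the document element and emit the surrounding list. -->
<xsl:template match="/people">
  <ul>
    <xsl:apply-templates select="person"/>
  </ul>
</xsl:template>

<!-- One list item per person element. -->
<xsl:template match="person">
  <li><xsl:value-of select="name"/></li>
</xsl:template>
</xsl:stylesheet>
</code></pre>
Any XSLT 1.0 processor should accept something along these lines; the interesting part is that the whole program is a set of template rules, with no statements executed in sequence.<br />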
<br />
So XSLT certainly addresses an important area. However, sadly, I must conclude that it's not a very good tool, in my opinion. Even when supplemented with good development environments, with color-coded and syntax-checking editors, it's still simply not very friendly to the human eye. Too many angle brackets and colons, one might say. Syntax does matter! The real problem, though, is that it's a functional programming language, not a procedural one, and this simply does not lend itself to performing complex tasks in the real world.<br />
<br />
Functional languages focus on defining the program in terms of functions that are stateless and free of mutable variables. Everything is defined as functions without side effects, that is to say, each call to a function with the same parameters will always return the same result. Iteration is replaced by recursion - even when iteration is the natural way to address a problem - because a loop counter would have to be updated on each pass, and you can't do that without mutable state. This means that while anything can be programmed in a functional language, it must frequently be done in ways that are not well known to the majority of developers.<br />
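<br />
To show what that means in XSLT terms, here is a small sketch of my own (not taken from the book) of the classic workaround: a named template that "loops" by calling itself with an updated parameter, since a variable cannot be reassigned once it is bound:<br />
<pre><code><!-- Illustrative only: output the numbers $n down to 1.
     There is no mutable loop counter, so the template recurses
     with n - 1 instead of updating a variable. -->
<xsl:template name="count-down">
  <xsl:param name="n"/>
  <xsl:if test="$n > 0">
    <xsl:value-of select="$n"/>
    <xsl:text> </xsl:text>
    <xsl:call-template name="count-down">
      <xsl:with-param name="n" select="$n - 1"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>
</code></pre>
XSLT does have xsl:for-each for walking a node-set, but anything that needs a running, accumulated value still tends to end up as recursion along these lines.<br />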
<br />
There's a reason why functional languages like Lisp, ML and Scheme have not become commercially successful, although loved by the academic community for decades. Basically I think it's a question of maintenance and complexity. In the real world of commercial programming, systems must be maintained for decades by perhaps thousands of different developers over the years. This has always been an uphill task, and no functional language, with the possible exception of Erlang, has succeeded in combining expressive power with robustness, documentability and maintainability.<br />
<br />
XSLT is certainly expressive, but I place it in the class of write-once, write-only languages. Integrated with XPath 2.0, it's possible to write programs so clever that even the author will have trouble understanding them the next day.<br />
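<br />
As an illustration of the kind of one-liner I mean - an invented example, not one from the book, over some hypothetical person data - consider an XPath 2.0 expression along these lines:<br />
<pre><code>(: compact, yes - self-explanatory, hardly :)
string-join(
  for $p in //person[number(@age) ge 30]
  return concat(upper-case($p/surname), ', ', $p/given-name),
  '; ')
</code></pre>
It is short and elegant, and a week later even its author may need a minute to work out what it does.<br />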
<br />
There's nothing wrong with the basic concept of defining a standard way of transforming documents between different representations, and making it possible to choose between doing the processing in the browser or on the server. It's neat and it's cool. However, doing anything but trivial transformations is a maintenance nightmare. The fact that the functional programming model is little known among mainstream developers does not make it any better.<br />
<br />
Somehow, it feels like XSLT is 90% geared towards the internal needs of the W3C - it's used extensively to format and publish the various specifications the W3C produces. But as far as I can judge, this means that those specifications are written in raw XML in plain text editors, with the markup done by hand - something that won't exactly work for most other organizations.<br />
<br />
So, unfortunately, in the end I feel that XSLT 2.0 is a technology that's elegant but will never be used on a wider scale. However, if you do have many, many documents in some kind of structured format (not necessarily XML, surprisingly enough) and want to transform them to XML or an XML-like format such as HTML, then XSLT may well be just what you need. Be prepared for a very high entry cost though, and rest assured that as the author of the stylesheets you'll enjoy a very high level of job security.<br />
<br />
There are also serious performance issues with XSLT: because of the functional style of programming, compilers and optimizers have a hard time generating efficient code for the underlying procedural architecture of our computers. In theory, functional programming could come into its own performance-wise as multi-core architectures become more common, since it makes parallel computation easier to realize, but today other problems overshadow that, and I'm fairly sure that in many cases XSLT performance will be unacceptable.<br />
<br />
So, to summarize: if you want to learn and use XSLT 1.0 or 2.0, this book is probably the one to get, but you should not assume that XSLT is a silver bullet for XML transformation; there are many caveats.<br />
<div><br />
</div>Svantehttp://www.blogger.com/profile/13946027974134920903noreply@blogger.com0