Wednesday, July 24, 2024

How to avoid being asked for a passphrase for local SSH signing of git commits on Windows 11?

This is not about how to authenticate to github or any other git repository. That works fine, see my other blog post about that. It's a little dated but mainly it's still relevant.

I want to sign my commits to, for example, https://github.com/xecrets/xecrets-cli, and I'd like to do this using an SSH key, not a GPG key. I already have an SSH key, signing works fine, and commits show up as 'Verified' on GitHub.

Using commonly found instructions on the Internet, after creating an SSH key and uploading it to GitHub (you need to do it once again specifically to use it as a signing key, even if it's the same one you use for authentication), do the following:

git config --global user.signingkey /PATH/TO/.SSH/KEY.PUB
git config --global commit.gpgsign true
git config --global gpg.format ssh

However, I don't want to have to type the passphrase for the private key file every time, and how to avoid that was much harder to find.

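As it turns out, git will use its own ssh tool for signatures, unless told otherwise. Pointing it at the ssh-keygen that ships with Windows instead avoids the prompt, presumably because that one can use a key already loaded into the Windows OpenSSH authentication agent. On Windows 11, this does the trick: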

git config --global gpg.ssh.program "C:\Windows\System32\OpenSSH\ssh-keygen.exe"

To be able to verify commits locally, you also need to create a file with allowed signers. It can be named anything and placed anywhere, but the following seems like a good place.

git config --global gpg.ssh.allowedSignersFile c:/Users/[YourUserName]/.ssh/allowed_signers

The allowed_signers file itself consists of one or more lines, each with a comma-separated list of principals (typically e-mail addresses) followed by a public key, for example:

you@somewhere.com ssh-ed25519 AAAA...
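If you'd rather generate that line than type it, here is a minimal Python sketch. It assumes the usual layout of a .pub file (key type, base64 key, optional comment). The function name and the sample values are made up for illustration, and it ignores the optional per-line options that the format also allows.

```python
def allowed_signers_line(principals, public_key_text):
    """Build one allowed_signers line: principals, key type, base64 key."""
    fields = public_key_text.split()
    if len(fields) < 2:
        raise ValueError("expected '<key-type> <base64-key> [comment]'")
    key_type, key_data = fields[0], fields[1]  # the trailing comment is dropped
    return f"{','.join(principals)} {key_type} {key_data}"


# Placeholder values, for illustration only (not a real key).
print(allowed_signers_line(["you@somewhere.com"], "ssh-ed25519 AAAA... you@laptop"))
```

Several principals simply end up comma-separated in the first field, so the same key can be listed for more than one e-mail address.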

Thanks to the StackOverflow community and specifically this answer found there.

Tuesday, July 23, 2024

What everyone is missing from the CrowdStrike Falcon incident

Introduction

25 years ago, I said (really, I did!) that automatic software updates pose a greater risk than malware (ok, at that time we really only had viruses).

Many incidents since then have proven this right, but none more so than the CrowdStrike Falcon Blue Screen of Death (BSOD) incident on July 19, 2024.

Since as usual the company won't release any detailed information on what really happened, we'll have to rely on other sources. I found that Dave Plummer's account on YouTube was very good, and trustworthy.

What really happened?

In summary, after looking at crash dumps and based on his own knowledge of how the Windows kernel works, Dave Plummer explains that what happened was probably the following.

CrowdStrike has a need to check not only file signatures, but the behavior in general of software on the system. To do this they have created a device driver that doesn't actually interact with any hardware, but has obtained a WHQL release signature. This means that it's very likely reliable, and certifiably from a known source. They also flagged it as a boot-start driver, meaning it's really needed to boot the system. This is of course to make sure it really does get loaded, which is great - as long as it doesn't crash the system. Which, unfortunately, it did.

So how did this signed, certified driver crash the system? In short, by CrowdStrike hacking the protocol and Microsoft allowing it to happen. We don't know exactly what they do, but the challenge they have is that they feel the need to frequently update what behavior to watch for, and in order to do so, they essentially need to be able to update program logic frequently. They could do so by building a new driver with the new logic, and getting it WHQL signed. Two problems for them here. First, it takes time to get a driver certified. Second, updating a driver is not done on-the-fly, so a reboot is required.

The "solution"? To provide the driver with instructions that define the logic to execute, in other words, one form or another of P-code - or even machine code! Thus, they can keep the same driver, but update the logic by conceptually having the driver "call out" to external logic. This is what CrowdStrike misleadingly calls a "content update". I would call it a code update.

Essentially, by allowing the driver to read and perform instructions based on external "content", regardless of what you call it, they effectively bypass the whole point of WHQL certification of kernel mode drivers.

In the end though, it appears that it's just a trivial, embarrassing bug in the driver that caused the crash itself, in turn triggered by some equally trivial, embarrassing process error during CrowdStrike's deployment of "content updates".

The "content update" that CrowdStrike sent out was full of zeroes. Nothing else. Obviously not the intended content. And this simple data caused the driver to crash, in turn causing the system to crash since it doesn't have much choice in this situation.

As the driver is also marked as a boot-start driver, it'll always get loaded on reboot, even if it crashed last time. This is what makes it so time-consuming to fix: the driver can't just be flagged as faulty.

This is just the mark of plain really really bad software. This software runs in kernel mode with full privileges and can do anything. And giving it a bunch of zeroes as input crashes it. Just. Not. Simply. Good. Enough.

Imagine what a creative hacker could do with that, inserting something more malicious than zeroes? How does full system control at kernel mode sound? This is of course speculation, but it's highly likely that it's possible with the current version of the driver.

What everyone seems to be missing...

In all the aftermath and all the comments, there are, in my view, two glaring omissions in the analyses.

The big failure 

How is it possible that someone sends out an update affecting the behavior of kernel mode code simultaneously to millions and millions of systems around the whole globe!?

I've participated in many rollouts, and never would I allow a big-bang rollout like this. CrowdStrike should be charged with negligence for having this type of process. It's just plain irresponsible.

The only reasonable way to do global rollouts, especially for kernel mode code, is to stagger them. Start with 10 systems. What happens? Do they respond properly? Then 100, then 1,000, etc. And since it's a global rollout, take time zones into consideration! Instead, we saw how the problems rolled around the globe, starting in Australia, with reports of downed systems as the working day started there.
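To make the idea concrete, here is a minimal Python sketch of such a staggered rollout. It is in no way a description of how CrowdStrike or anyone else actually deploys. The function names, the tenfold growth factor and the deploy_wave callback (assumed to report how many systems in a wave failed to come back healthy) are all made up for illustration.

```python
def rollout_waves(total_systems, first_wave=10, factor=10):
    """Yield wave sizes 10, 100, 1000, ... until every system is covered."""
    remaining, size = total_systems, first_wave
    while remaining > 0:
        wave = min(size, remaining)
        yield wave
        remaining -= wave
        size *= factor


def staged_rollout(total_systems, deploy_wave, max_failure_rate=0.0):
    """Deploy wave by wave and stop at the first wave that fails too often.

    deploy_wave(size) is a hypothetical callback returning the number of
    systems in that wave that did not report back healthy.
    Returns (systems_updated, completed).
    """
    updated = 0
    for wave in rollout_waves(total_systems):
        failures = deploy_wave(wave)
        updated += wave
        if failures / wave > max_failure_rate:
            return updated, False  # halt: the damage is limited to this wave
    return updated, True


# A broken update that takes down every system it reaches stops after 10.
print(staged_rollout(8_500_000, lambda size: size))
```

With an update that crashes everything it touches, the sketch halts after the first ten systems instead of reaching all of them. A real process would also wait between waves and, as noted above, take time zones into account.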

I understand they want to get updates out at speed, but this is ridiculous. In the end, this caused way more damage than any possible threat they could have stopped by this procedure.

The smaller failure


The smaller failure, still not mentioned by analysts, is that this kernel mode driver, which accepts external input, apparently does no input validation worth the name. Perhaps the content is digitally signed (I don't know, but no-one has mentioned it either), but even if it is, this type of software must assume that it can't trust external content. According to Dave Plummer, the "content update" in question was all zeroes, so at least there was no embedded digital signature, apparently.

If it's not signed, all an attacker would have to do is deposit a file in %WINDIR%\System32\drivers\CrowdStrike with a name such as C-00000291*.sys containing zeroes, and the system becomes unbootable without manual intervention!
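Just to show how little it takes, here is a minimal Python sketch of the kind of sanity check I mean. The real file format is not public, so the 4-byte marker and the function name are invented for this sketch. A real driver would do this in kernel code and verify a digital signature on top of it.

```python
MAGIC = b"CFG1"  # hypothetical 4-byte file marker, invented for this sketch


def validate_content(blob):
    """Return None if the blob passes basic sanity checks, else a reason."""
    if len(blob) <= len(MAGIC):
        return "too short"
    if not any(blob):  # every byte is zero
        return "all zeroes"
    if not blob.startswith(MAGIC):
        return "bad magic"
    return None


# A file full of zeroes is rejected before any of it gets interpreted.
print(validate_content(bytes(4096)))
```

The point is that a rejected file should lead to the driver keeping its previous content and carrying on, not to a crash.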

Once again, this is just not good enough, and should be cause for some lawsuits in the coming months. A manufacturer of critical security software executing in kernel mode should not be allowed to sell code this bad without financial consequences.

Wednesday, May 15, 2024

Localizing a .NET console or desktop application

Having gone through countless different solutions to the ever-present problem of localizing .NET applications, I think I've finally found a good one.

The main driver this time was the need to provide a convenient way to translate texts used in the Xecrets Ez desktop app for easier file encryption. This is developed with .NET 8 and Avalonia UI, which apart from C# uses its own XAML format, AXAML, where translatable texts need to be referenced too.

.NET has of course since forever supported localization, sort of, through the resource compiler, .resx files and satellite assemblies.

The problem has always been, and remains to this day, how do I get the actual translations done? Translation is often not done by the original developer, and often by people who definitely don't want to fire up Visual Studio and the resource editor. Never mind that it doesn't even have a proper translation view, with a source language and target language in the same view.

There have been various attempts to make resx-editors for translators, but they were typically Windows-only desktop applications. Translators often like apples.

For maybe 10 years now, the number of web-based translation services has been increasing year by year, but support for the resx file format is often spotty, limited, or non-existent.

I've long been looking at somehow leveraging the huge amount of services and software that use the .po format for texts, all based on the gettext utilities and specifications. However, the old C-style extraction of text from the source has never appealed to me, and also requires a lot of parsing of C#/.NET code that is just bound to go wrong. Not to mention various XAML dialects that might also need parsing for the gettext extraction process to work.

Then when looking for a library to work with .po files, thinking to roll my own somehow, deep inside a related sample project I found POTools!

This little hidden gem can do a gettext-style extraction, but from a .resx file! Now things started to fall into place.

Doing the extraction, or discovery if you like, of translatable strings from a structured source like a .resx file is easy, solid and dependable - and fits the general development model of .NET better than wrapping all texts in T() method calls, similar to the original C-style gettext mode of operations. It also enables easy re-use of strings in multiple places without any problems.
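To illustrate why extraction from a structured source is so easy, here is a minimal Python sketch that turns the string entries of a .resx document into .pot entries. This is not how POTools is implemented. In particular, using the resource name as msgctxt and the original text as msgid is my assumption for this sketch, and the function names are made up.

```python
import xml.etree.ElementTree as ET


def po_quote(text):
    """Quote a string for a .po/.pot file, escaping special characters."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def resx_to_pot(resx_xml):
    """Turn the string <data> entries of a .resx document into .pot entries."""
    root = ET.fromstring(resx_xml)
    entries = []
    for data in root.iter("data"):
        value = data.findtext("value")
        if data.get("type") or data.get("mimetype") or not value:
            continue  # skip file references, binary data and empty entries
        entries.append(
            f"msgctxt {po_quote(data.get('name'))}\n"
            f"msgid {po_quote(value)}\n"
            'msgstr ""\n'
        )
    return "\n".join(entries)


sample = """<root>
  <data name="Greeting" xml:space="preserve"><value>Hello, world!</value></data>
</root>"""
print(resx_to_pot(sample))
```

No parsing of C# or XAML is involved, just a walk over well-defined XML elements, which is what makes this approach dependable.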

With POTools I could now easily extract a template .pot file from the already existing Resources.resx file, and start looking for a suitable translation tool. There are many such out there, but for my purposes I went with Loco, which offers just the right set of capabilities at the right price. The real point is that once you go with the .po/.pot format for your translations, there's a huge ecosystem out there to help you out!

As a minor issue, I decided to create a separate resource file, InvariantResources.resx, to hold the texts that by definition should never be translated, but still are suitable to keep in a resource file.

So, now I have a bunch of translated .po files, as well as the original .resx file with the original texts. How to tie all this together?

Another .NET building block for localization, IStringLocalizer, offers a nice interface that fits well. Using sample code from the Karambolo.PO library, which can read .po files, it was trivial to create a POStringLocalizer implementation.

This is now available as a NuGet package for your convenience! There's sample code there on how to wire it all up, as well as in the GitHub repository.

So, we now can use the Portable Object file format and the gettext universe of tools and users, while still maintaining a good fit with .NET code and libraries.

Thanks go to the author of Karambolo.PO library and POTools!