Monday, November 26, 2007

Book Review: XSLT 2.0 Programmer's Reference Third Edition

XSLT 2.0 Programmer's Reference Third Edition, by Michael Kay, Wiley Publishing, Inc., ISBN 0-7645-6909-0

XSLT, or XSL, is a subject that I'm no expert in, but I've come across it from time to time and have generally had a hard time really grasping the how and the why of it. In most cases I can program, or at least tweak, just about anything with very little introduction. Fixing and tweaking the XSLT stylesheets that I've come upon has been a tougher experience, where I've felt myself reduced to guesswork and magic. That's not a feeling I like, so I decided to do some background studying.

A Programmer's Reference is perhaps not the first choice as an introduction to a subject, but in this case it was hard to find just where to start, and I felt that I was experienced enough to go for some core literature from the start, which also would have the benefit of being useful in a real situation as reference literature.

Since I'm a newcomer to XSLT this review will have to be both about the book as such, and also about the subject matter. Let's start with the book.

Michael Kay is certainly an authority, being the editor of the XSLT 2.0 specification. The book is also authoritative and extremely carefully written, with an extraordinary focus on details. I did find a few typos, errors and editorial mistakes, but taking the amount of text into account it's still a very, very good piece of work.

This book is not written to be read cover to cover, which I did, but that's still not a bad way to get a thorough introduction to XSLT. Be prepared for quite a few hours though; I spent about 20 hours reading this book. It's entitled XSLT 2.0, and was written before XSLT 2.0 was actually approved as an official recommendation, which it was on 23 January 2007. I've not checked, but there are sure to be some minor differences between the final recommendation and the drafts upon which the book was based. As a consequence of it being such a recent standard, there are very few XSLT 2.0 compliant implementations in existence, so XSLT 1.0 is still very much in use. The book is careful to keep track of differences and changes, and should work well for XSLT 1.0 use as well.

It's very heavy reading indeed, but if you only want to get one book about XSLT 2.0 this is very probably the one to get.

The real question, though, that I must raise after reading this and getting a good feel for XSLT is: Do you want to get any book about XSLT at all?

XSLT is about XML transformation, or actually transformation in general. It doesn't really have to be from or to XML, it can be from plain text to HTML or any number of other combinations depending on the requirements and capabilities of parsers and processors available. This is obviously extremely useful - to be able to massage data from and to different forms - and it is frequently used in one-off applications and in various integration projects. The target use of XSLT is also to fit in alongside CSS 2.0 as a way to perform formatting for presentation that is not possible with CSS - that's why it's called XSL Transformations, where XSL stands for Extensible Stylesheet Language.

So XSLT certainly addresses an important area. However, sadly, I must conclude that it's not a very good tool in my opinion. Even if supplemented with good development environments with color-coded and syntax-checking editors, it's still simply not very human eye-friendly. Too many angle brackets and colons, one might say. Syntax does matter! The real problem though is that it's a functional programming language, not a procedural language, and this simply does not lend itself to performing complex tasks in the real world.

Functional languages focus on defining the program in terms of functions that are state-less and without variables. Everything is defined as functions without side-effects, that is to say, each call to a function with the same parameters will always return the same result. Iteration is replaced with recursion - even when iteration is the natural way to address a problem, because an iterator must be updated for each round in a loop, and you can't do that. This means that while anything can be programmed in a functional language, it must frequently be done in ways that are not well known to the majority of developers.
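To make the iteration-versus-recursion point concrete, here is a minimal sketch in C# rather than XSLT. The class and method names are made up for illustration. The first method uses a loop with a counter and an accumulator that are updated each round; the second computes the same sum with no variable ever being reassigned, which is roughly the shape a functional language pushes you toward.

```csharp
using System;

public static class SumExample
{
    // Procedural style: a loop counter and an accumulator are mutated each round.
    public static int SumIterative(int[] values)
    {
        int total = 0;
        for (int i = 0; i < values.Length; i++)
        {
            total += values[i];
        }
        return total;
    }

    // Functional style: no variable is ever updated. The "loop" is a
    // recursive call that carries the position along as a parameter.
    public static int SumRecursive(int[] values, int index)
    {
        if (index >= values.Length)
        {
            return 0;
        }
        return values[index] + SumRecursive(values, index + 1);
    }
}
```

Both return the same result for the same input; the recursive form is started with SumExample.SumRecursive(values, 0).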

There's a reason why functional languages like Lisp, ML and Scheme have not become commercially successful, although loved by the academic community for decades. Basically I think it's a question of maintenance and complexity. In the real world of commercial programming, systems must be maintained for decades by perhaps thousands of different developers over the years. This has always been an uphill task, but no functional language, with the possible exception of Erlang, has succeeded in combining expressive power with robustness, documentability and maintainability.

XSLT is certainly expressive, but I categorize it as belonging to the class of write-once and write-only languages. Integrated with XPath 2.0, it's possible to write programs that are so smart that even the author will have trouble understanding them the next day.

There's nothing wrong with the basic concept of defining a standard way of transforming documents between different representations, and making it possible to choose between doing the processing in the browser or on the server. It's neat and it's cool. However, doing anything but trivial transformations is a maintenance nightmare. That the functional programming model is very little known among mainstream developers does not make it any better.

Somehow, it feels like XSLT is 90% geared towards the internal needs of the W3C - it's used extensively to format and publish the various specifications issued by the W3C. But this actually means, as far as I can judge, that the specifications are written in raw XML using plain text editors, with the markup done manually - something that won't exactly work for any other organization.

So, unfortunately, in the end I feel that XSLT 2.0 is a technology that's elegant, but will never be used on a wider scale. However, if you do have the situation of having many, many documents in some kind of structured format (not necessarily XML surprisingly enough) and want to transform them to XML or XML-like format like HTML, then XSLT may well be just what you need. Be prepared for a very high entrance cost though, and rest assured that as author of the stylesheets you'll have a very high level of job-security.

There are also serious performance issues with XSLT. Due to the functional style of programming, compilers and optimizers have a hard time generating decent code for the underlying procedural architecture of our computers. In theory, functional programming could come into its own performance-wise as multi-core architectures become more common, because it does make it easier to realize parallel computation. Today, however, other problems overshadow this, and I'm fairly sure that in many cases performance will be unacceptable with XSLT.

So, to summarize: If you want to learn and use XSLT 1.0 or 2.0, this book is probably the one to get, but you should not assume that XSLT is a silver bullet for XML transformation, there are many caveats.

Wednesday, October 24, 2007

Troubles with Health Monitoring, System.Net.Mail.SmtpClient and SSL

The web is full of desperate pleas for help by prematurely bald developers who have discovered the fatal flaw in the shiny new ASP.NET 2.0 System.Net.Mail.SmtpClient class. This is touted as overcoming all the problems of the old variant, that could not even be configured for credentials without resorting to some heavy-duty tricks.

However, as has been discovered by countless people, while the new SmtpClient() class is neat, easy to use, and configurable via Web.Config - they forgot one thing to make configurable. The EnableSsl property is not settable via Web.Config. So big deal you say - just write a line of code and set it manually... Problem is - you frequently did not write the code that instantiates the SmtpClient. The most well known problem is with the suite of new login controls, which have the capability of sending mail in some circumstances. Works fine - unless you need to enable SSL for the SMTP connection. Fortunately, there's a well-known workaround since these controls expose an event called SendingMail, where you can do magic things including affecting how the mail is sent - most simply by taking over responsibility of sending it.

Then I hit the wall, really hard, trying to use the System.Web.Management.TemplatedMailWebEventProvider class. This is a provider that can subscribe to health monitoring events, and send them via e-mail. Using SmtpClient() of course, with the instantiation hidden deep in its innards of sealed and internal classes in System.Web.dll. No events to the rescue this time either.

After hours of fruitless searching, I finally came to the conclusion that I needed a work-around, ugly as it may be. So, here's where the decorator pattern meets reflection. Sigh. It ain't pretty, but it does work, and I do get to use the otherwise rather nice and advanced TemplatedMailWebEventProvider (the same technique can be used for the SimpleMailWebEventProvider, or any provider derived from MailWebEventProvider).

In the end, it's just a few lines of code (comments and vertical white space removed for brevity):

using System;
using System.Collections.Specialized;
using System.Reflection;
using System.Web.Management;

public class TemplatedMailWithSslWebEventProvider : WebEventProvider
{
    private TemplatedMailWebEventProvider _templatedProvider;

    public TemplatedMailWithSslWebEventProvider()
    {
        ConstructorInfo constructor = typeof(TemplatedMailWebEventProvider)
            .GetConstructor(BindingFlags.Instance | BindingFlags.NonPublic,
                            null, new Type[0], null);
        _templatedProvider = (TemplatedMailWebEventProvider)constructor
            .Invoke(null);
    }

    public override void Initialize(string name, NameValueCollection config)
    {
        if (config == null)
        {
            throw new ArgumentNullException("config");
        }
        _templatedProvider.Initialize(name, config);
        FieldInfo field = typeof(MailWebEventProvider)
            .GetField("_smtpClient",
                      BindingFlags.Instance | BindingFlags.NonPublic);
        field.SetValue(_templatedProvider, new SmtpClientWithSsl());
    }

    public static MailEventNotificationInfo CurrentNotification
    {
        get
        {
            return TemplatedMailWebEventProvider.CurrentNotification;
        }
    }

    public override void Flush()
    {
        _templatedProvider.Flush();
    }

    public override void ProcessEvent(WebBaseEvent raisedEvent)
    {
        _templatedProvider.ProcessEvent(raisedEvent);
    }

    public override void Shutdown()
    {
        _templatedProvider.Shutdown();
    }
}

All that's left for you to do is to define the SmtpClientWithSsl class, deriving from System.Net.Mail.SmtpClient, whose developer, probably by the same oversight that forgot about SSL, also forgot to make it sealed. Fortunately. Here two wrongs almost make a right!
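For completeness, a minimal sketch of what such a SmtpClientWithSsl class could look like. It assumes that the parameterless SmtpClient constructor picks up host, port and credentials from the mailSettings section of Web.Config as usual, and that all you need on top of that is the EnableSsl property switched on. Adjust to taste.

```csharp
using System.Net.Mail;

// Behaves exactly like SmtpClient, configured from Web.Config as usual,
// but with the one setting that cannot be configured there forced on.
public class SmtpClientWithSsl : SmtpClient
{
    public SmtpClientWithSsl()
    {
        EnableSsl = true;
    }
}
```

Since the class adds nothing but the constructor, everything else about how mail is sent stays under the control of the configuration file.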

One of the morals of this story is to really think about the use of sealed and internal. My first try was to implement a custom templated e-mail provider, but it turns out that was quite a job, and I could not override or use anything from System.Web.dll because it was all sealed and used lots of internal helpers. If you really need to hide the implementation that badly, it might be better to introduce a public base class, where the essential interfaces are exposed as protected methods and properties. When you limit a class to sealed, and it depends on lots of additional logic, do consider making that logic available at least to alternative implementations and give it a base to inherit from.

Tuesday, October 16, 2007

Book Review: HTML, XHTML, and CSS Bible, 3rd Edition

HTML, XHTML, and CSS Bible, 3rd Edition, by Bryan Pfaffenberger, Steven M. Schafer, Chuck White and Bill Karow, Wiley Publishing, Inc. ISBN 0-7645-5739-4.

Not really being a GUI person (I tend to thrive more on the backend of things, with threads, algorithms and such), I still need to do a fair bit of client-side stuff. With this in mind, I purchased this book thinking that an authoritative work like a bible was just what I needed.

However, this is not a bible. Not at all. It's a very broad introduction on a broad range of subjects with very little depth. Why it has XHTML in the title is totally beyond me, there's not even a chapter about it. This book is really about HTML, CSS and the web in general - forget XHTML (although admittedly there is little difference from HTML).

In the preface, it's mentioned that the main author, Bryan Pfaffenberger is the author of more than 75 books. Impressive! Until you actually read this one. I have never seen so many mistakes in one book. Nor have I seen such blatant waste of space and duplication of code. Seriously, you're paying for about 50 pages of Lorem Ipsum here!

So, it's sloppily written, wasteful of dead trees and is mistitled. Is it thus useless? No, actually not really.

Even if you can't really use this bible as a reference (because of all the mistakes etc.), you can probably get some pretty good use out of it if you're relatively new to the web, but can code a little, and you're thinking about starting your own web site for whatever reason. It does cover a wealth of topics which are relevant as an introduction. There are chapters about how to upload your site, how to optimize for search engines etc. - none of which is implied by the title, but still useful for a different reader.

I would like to rename it to "Getting Started with the Web" or something like that. As such, it's not really that bad. But don't buy this if you're looking for detailed in-depth information about HTML, XHTML and CSS.

Friday, September 21, 2007

Book Review: Code Complete 2nd ed by Steve McConnell

Code Complete Second Edition, A practical handbook of software construction, Steve McConnell, Microsoft Press, ISBN 0-7356-1967-0.

Code too Complete? The emphasis in this book is said to be practicality, as opposed to academic and theoretical articles. However, it's apparent that Steve McConnell lost track of that goal and perhaps really wants to be a real academic software hotshot, like Edsger Dijkstra. The book is 862 pages, not including 21 (twenty-one) pages of bibliography citing approximately 450 references.

Just about every statement in the book has a reference (McConnell 2004). There are literally hundreds of citations of various research results, all of which go to prove the various points the author wants to make. Did you for example know that researchers have found that the optimal amount of indentation lies between 2 and 4 spaces? (Miara et al. 1983). Do you really care that much that you want a reference to a detailed study?

Code Complete suffers greatly from an attempt to be too precise - it's not a bad thing per se to be careful in keeping opinions and researched facts apart. But too much of a good thing can actually hide the important message that's there. And there is one! Code Complete should perhaps be labelled 'Catalog Complete' - because it is a pretty much complete catalog of proven software construction practices, and that's a very good thing, as it goes.

Looking behind the veils of the semi-academic format, Code Complete does indeed contain an enormous amount of information and good advice on how to construct great software. There is absolutely nothing new in this book, this is not a landmark that will be referenced like the beginning of something new in the future, it's a statement of the current state of the art. However, since it's also a fact (reference ignored) that the majority of current software projects and current software developers do not work at the level of the current state of the art, it's very useful to get a complete exposé like this for some, but not all.

I read this book cover-to-cover, every page, every word (except some cross-references in the margins). Don't do that. Skim it, browse it, and stop and read when you find something of personal interest.

Code Complete should probably be read by first-line software development managers and developers with 10 years plus of experience. I really don't think that it's useful for less senior people. There are sure to be more accessible and inspiring texts on the subject matter, all of which are likely referenced in Code Complete. So if you're a manager or a senior designer or architect, read it, pick your 10 favorite things from it, make a note of references that sound relevant for your co-workers and then go out and buy them those.

Tuesday, September 11, 2007

Book Review: ASP.NET 2.0 Security by Stefan Schackow

The full title of this nice, not-so-little tome reads 'Professional ASP.NET 2.0 Security, Membership, and Role Management', by Stefan Schackow, published by Wrox, Wiley Publishing, Inc. ISBN 978-0-7645-9698-8.

Let's get one thing clear first - Stefan will never get a Nobel prize. This may well be the most boring book I've ever read, it's also full of small typos and minor editorial mistakes. At the same time - it's one of the more readily useful ones too. I read this book cover-to-cover, like I do practically all such books. This might not be the recommended way to get the most fun out of it, however, it's still something that must be done.

Let's get the other thing clear - if you ever think about implementing, extending or otherwise doing any kind of real-world application using the ASP.NET 2.0 membership, profile and role providers, you need to read this book. The documentation and SDK will not suffice. You may get something that appears to work, but you're missing out on all the little details that will make for really robust and secure code.

Stefan covers with absolutely mind-numbing detail just what actually happens in ASP.NET when a request is authenticated and how authorization uses the identities in all the various scenarios, depending on which IIS version, what kind of impersonation is in effect, what kind of authentication is used etc. This is absolutely essential information, that I've never seen collected like this. I used to work for a leading supplier of content management systems, and the support was constantly plagued by hard-to-debug cases with security related problems. How I wished that I'd read this book then...

After this almost bottomless dive into the details of the foundation, Stefan continues to cover just how the membership, profile and role providers are architected and how they are intended to be used, interspersed with anecdotes from the development with rationales for various oddities that are left in the final product etc. Intermixed are various code samples with how-to recipes to achieve various neat functions by wrapping or extending the providers supplied with ASP.NET.

It's not an inspiring book in the most common sense, but as one manager once told me 20 years ago when he handed me my very own copy of "System 370 Job Control Language" - it's required reading. (It should be noted that I was employed as a Unix/C developer at that time - I still don't know what he was thinking when giving me that book. I did read parts of it though, to my horror... I still have it around to remind me.).

So, if you're working with the security-related providers in ASP.NET 2.0 and don't have one - go get one now! (Stefan Schackow's book, not the JCL one).

Monday, September 10, 2007

Get a handle on controls

A common inconvenience is that in ASP.NET, controls that are part of templates are not directly accessible from code, frequently resulting in code like this:

TextBox myTextBox = wizardStep.FindControl("MyTextBox") as TextBox;
This has the added problem associated with the late binding of the control to an embedded text string ("MyTextBox"): a misspelling won't be discovered until the run-time hits this code. There's also no real assurance that it'll work as expected, since the control found may in fact not even be a TextBox. Finally, FindControl() is not recursive, so you have to keep track of the exact correct container. All this is error prone and inconvenient.

It's been a source of irritation for me for some time, and I've handled it in various ways in my code. Today I spent an hour doing this in a better way. Enter the ControlReference class, which looks like this:

using System;
using System.Web.UI;

public class ControlReference<T> where T : Control
{
    private EventHandler _eventHandler;
    private T _control;

    public ControlReference() { }

    public ControlReference(EventHandler eventHandler)
    {
        _eventHandler = eventHandler;
    }

    public T Control
    {
        get { return _control; }
        set { _control = value; }
    }

    // Hooked to the control's Init event in the markup: captures the typed
    // control and then forwards the event to the optional extra handler.
    public void OnInit(object sender, EventArgs e)
    {
        Control = (T)sender;
        if (_eventHandler != null)
        {
            _eventHandler(sender, e);
        }
    }
}

When using this inside the template, you hook the OnInit event in the markup, like this:

<asp:TextBox runat="server" ID="MyTextBox" OnInit="myTextBox.OnInit" />
In the code, all you do is:

protected ControlReference<TextBox> myTextBox = new ControlReference<TextBox>();
...
myTextBox.Control.ForeColor = Color.Red;
This makes it easy, convenient and safe to refer to controls in templates.

Wednesday, March 28, 2007

What's in a name?

Recently I've found myself spending a lot of time discussing naming. I've come to realize that this is an area that has not received enough focus in software design. I'm not primarily talking about naming conventions, i.e. should things use camelCase, PascalCase, begin with underscore etc. It's about names of classes, namespaces, methods, properties, variables, modules, programs etc.

As a horrible example of how bad it can get, I recently read an article in MSDN magazine about Windows Installer XML, WiX. Can you believe this leading software company names the components of WiX as follows:

Candle
Dark
Light
Lit
Tallow
WixCop

Just how does this naming help you work with WiX? I'm sure there's a funny story behind it, or someone just ran out of imagination. All these things add up, and every time you can't recall what a thing does or what its name is, it causes frustration and delays. Here's a suggested alternative naming of the same components:

Wix2XMLInt
Msi2Wix
XMLInt2Msi
WixLibGen
FileTreeWixGen
WixCop

Given the following descriptions, which do you think are easiest to work with?

Candle - Transform WiX source to intermediate XML
Dark - Generate WiX source from an MSI file
Light - Transform intermediate XML to an MSI file
Lit - Generate WiX libraries
Tallow - Generate WiX source to replicate a directory/file tree
WixCop - Check a WiX source for potential problems

Most people will have trouble remembering something if its name offers no mnemonic or association to what the thing is or does. If you call something 'red' that is in fact 'blue' - this will cause a problem for people!

Do spend some more time thinking about how you name things!

It's a one-time issue, but the names live on forever. How many names of things do you work with daily? I probably keep at least 1,000 in my head at any one time. Those that cause me difficulty are those that I can't find anything to associate with - because they are meaningless in the usage context, or associate the wrong way.

Tuesday, March 6, 2007

Planned exceptions considered harmful, or try the Try-pattern

A really common misuse of exceptions in code is to validate input. A typical code pattern may look something like this:

private int GetIntValue(string s)
{
    int i;
    try
    {
        i = int.Parse(s);
    }
    catch (Exception)
    {
        i = -1;
    }
    return i;
}

This is bad. Very bad. The exception is not an exception, it's an expected outcome of validating the input. That is not the way of doing this. Enter the TryXXX pattern, implemented in various parts of the .NET Framework 2.0 and later, and hopefully soon in your code! It should look like this:

private int GetIntValue2(string s)
{
    int i;
    if (int.TryParse(s, out i))
    {
        return i;
    }
    return -1;
}

The TryXXX pattern is a very nice way to handle the situation that you're expecting, and want to handle, bad input. In your own implementation, you should not just wrap a try-catch block with the TryXXX pattern, you should ensure that your data is valid without generating an exception in the first place.

If other parts of the code are not expecting any invalid input, and the right thing to do is indeed to throw an exception, then it's the TryXXX that should be wrapped, throwing the exception if it fails.
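Both directions of that advice can be sketched in a few lines. The Percentage class below is a made-up example, not something from the framework: its TryParse validates the input without ever relying on an exception, and its Parse is the thin wrapper for callers who consider bad input to be a genuine error.

```csharp
using System;
using System.Globalization;

// Hypothetical example type: a whole-number percentage between 0 and 100.
public static class Percentage
{
    // The Try variant: bad input is an expected outcome, reported through
    // the return value. No exception is thrown or caught along the way.
    public static bool TryParse(string s, out int percent)
    {
        percent = 0;
        int value;
        if (!int.TryParse(s, NumberStyles.Integer, CultureInfo.InvariantCulture, out value))
        {
            return false;
        }
        if (value < 0 || value > 100)
        {
            return false;
        }
        percent = value;
        return true;
    }

    // The throwing variant wraps the Try variant, not the other way around.
    public static int Parse(string s)
    {
        int percent;
        if (!TryParse(s, out percent))
        {
            throw new FormatException("Not a percentage between 0 and 100: " + s);
        }
        return percent;
    }
}
```

Code that validates user input calls Percentage.TryParse and handles the false case; code that reads a value which is supposed to be correct already calls Percentage.Parse and lets the exception signal a real problem.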

Thursday, February 22, 2007

Don't do objects - do types!

I've been faithless the last couple of months, doing lots of code in C# for ASP.NET instead of using a real programming language, i.e. C++. Actually, C# is pretty nice and ASP.NET is a very productive environment for Web Application programming.

During this time I've come across a usage pattern that I really want to discourage. C# is a beautifully typed language, especially with the introduction of Generics (not to be confused with Templates in C++, they are two very different things) - so why do so many people insist on stuffing things into strings and objects?

A typical example is the Guid structure in C#. It's essentially a byte array holding the 16 bytes of a GUID, a Globally Unique Identifier. Very useful, and very common. But very frequently stored and passed as a string, or even worse, as an object.

I use a code policy that states that an object that has a specific type should always be stored, referenced and passed as that type. Even if it does come into the code in the form of a string, you should still convert it at the first opportunity and then keep it in its natural typed form.

The rationale for this is that you can get some very nasty and hard-to-find runtime bugs with late effects, causing errors, instability and exceptions long after the initial problem occurred. Consider for example a Guid entering the system as a string, being stored as a string and then used. It might not be in the correct format, but that will not be discovered until way too late. If you follow a code policy to convert it into the appropriate type from the start, you'll automatically catch such things very early.
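A minimal sketch of that policy, with made-up names (DocumentRequest, ParseDocumentId, DescribeDocument). It assumes Guid.TryParse is available, which it is from .NET Framework 4; on earlier versions the Guid(string) constructor does the same job by throwing on malformed input. The string is converted once, at the edge of the system, and everything behind that point only ever sees a Guid.

```csharp
using System;

// Hypothetical boundary code: the only place where the raw string is handled.
public static class DocumentRequest
{
    // Convert at the first opportunity, so a malformed value fails here
    // and not somewhere deep inside the system much later.
    public static Guid ParseDocumentId(string raw)
    {
        Guid id;
        if (!Guid.TryParse(raw, out id))
        {
            throw new ArgumentException("Not a valid GUID: " + raw, "raw");
        }
        return id;
    }

    // Everything past the boundary takes the natural type, so the compiler
    // rules out passing an arbitrary string or object by mistake.
    public static string DescribeDocument(Guid documentId)
    {
        return "document " + documentId.ToString("D");
    }
}
```

With this in place, a bad identifier is rejected on the way in, and no later code needs to wonder whether the value it holds is well-formed.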

Work is also progressing on Xecrets, the safety-deposit box on the net that will launch in the summer time frame. I've given notice to my "day job", and will start working full-time on AxCrypt, AxCrypt2Go and Xecrets in June.

Wednesday, February 7, 2007

Quitting my day job

Finally! It's decided. I'm going to start spending serious time with AxCrypt and related services. AxCrypt and AxCrypt2Go will remain free and open source, and to finance it I'll also be developing and providing Internet-based services. These will be available for trial to all registered AxCrypt users, and later as subscription services. I've spent some time preparing for this, including developing the new site, which looks fairly similar to the old one, but in fact is a full-fledged application whilst the old one was basically static HTML.

So, the good news is that in the summer time-frame, I'll be working more or less full-time with AxCrypt and AxCrypt2Go. The even better news is that I'll be launching Xecrets, the new Internet-based subscription service that complements the free software available. There'll never be a requirement to register or pay anything to use AxCrypt or AxCrypt2Go.

I am hoping to provide such great value for money that many of you actually will find the subscription service so useful that you'll be glad to pay a small amount for that, while at the same time providing financing for AxCrypt-development.

The philosophical view here is that encryption software is good to have open source, and software licensing is a slight evil, that unfortunately often is necessary to finance the product. I'll be trying for the best of both worlds here, providing free software (after all, there's absolutely no additional cost for me to provide a second copy of AxCrypt once the first is produced) but also providing services on the Internet, where a reasonable fee for the server and bandwidth cost feels fair and hopefully generating enough additional revenue to also support AxCrypt development.

The Xecrets subscription service will start as a safety-deposit box for your secrets, such as logons, PIN-codes etc. Always available, and always protected. When you need them, you can log on to the Xecrets site and search your secrets in a standard free-text search way. Your data is never stored decrypted on the server disk, nor is your decryption key.

Most, if not all, of us have a multitude of logon-codes and PIN-codes to keep track of. With Xecrets you have the possibility to have them available, protected and backed-up for a very low cost - $1/€1 per month.
There are endless possibilities and variations, but all things have to start somewhere, and this is what I'm planning right now.

These are the plans - please let me know your thoughts! I'm open for all suggestions that will help me keep providing the community with free software, while also being able to afford to spend sufficient time doing so. Mail me now!