The time to break backwards compatibility is NOW. In Longhorn.

Since my original entry pointing to Michael’s post about “The IE Patch (MS04-004) demystified” I have seen a lot of ridiculous and ludicrous comments in the midst of some great insight. I am only thankful that none of those idiots seem to visit my blog, as I am not sure I would appreciate such dim-witted statements here.

Yes, I’m venting. Mostly because in the midst of Microsoft doing something right as it relates to security, people complain. It wasn’t even a month ago that these same people complained about the IE vulnerabilities… only to find something else to complain about after the recent IE patches. Yesterday, on a private mailing list I’m on, I actually saw people discussing “class action” lawsuits against Microsoft for “loss of profits”. Idiots. The moderator of that list sure got a piece of my mind on that one.

But that’s not what this post is about. There are plenty of blog entries and news stories around the world that already point out that RFC 1738 STATES under section 3.3 that the HTTP URL format should NOT include username and password information. Don’t believe me?

3.3 HTTP

The HTTP URL scheme is used to designate Internet resources accessible using HTTP (HyperText Transfer Protocol). The HTTP protocol is specified elsewhere. This specification only describes the syntax of HTTP URLs.

An HTTP URL takes the form: http://<host>:<port>/<path>?<searchpart>

where <host> and <port> are as described in Section 3.1. If :<port> is omitted, the port defaults to 80. No user name or password is allowed. <path> is an HTTP selector, and <searchpart> is a query string. The <path> is optional, as is the <searchpart> and its preceding “?”. If neither <path> nor <searchpart> is present, the “/” may also be omitted.

Within the <path> and <searchpart> components, “/”, “;”, “?” are reserved. The “/” character may be used within HTTP to designate a hierarchical structure.

Quite frankly… it appears that Microsoft was wrong in breaking the original standards in the RFC by adding the support. And they were right when they removed it. Enough said.
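
For what it’s worth, checking for the disallowed form is trivial. Below is a minimal sketch in C that flags an http:// URL carrying userinfo before the host; the helper name and sample URLs are mine and purely illustrative, and this is obviously not how IE implements its check.

```c
#include <stdio.h>
#include <string.h>

/* Returns nonzero if an http:// URL carries userinfo ("user:pass@host"),
   which RFC 1738 section 3.3 does not allow. Hypothetical helper for
   illustration only. */
static int http_url_has_userinfo(const char *url)
{
    const char *prefix = "http://";
    const char *authority, *end, *at;

    if (strncmp(url, prefix, strlen(prefix)) != 0)
        return 0;                      /* not an http URL */

    authority = url + strlen(prefix);
    end = strpbrk(authority, "/?#");   /* end of the host[:port] part */
    if (end == NULL)
        end = authority + strlen(authority);

    at = memchr(authority, '@', (size_t)(end - authority));
    return at != NULL;                 /* '@' before the path => userinfo present */
}

int main(void)
{
    const char *samples[] = {
        "http://www.example.com/index.html",
        "http://www.microsoft.com:80@evil.example/login",
    };
    for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        printf("%-48s userinfo: %s\n", samples[i],
               http_url_has_userinfo(samples[i]) ? "yes (reject)" : "no");
    return 0;
}
```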

Which gets me to the point of this entry.

When Robert Scoble came to see me recently we got into a discussion about Microsoft’s cardinal rule: “Don’t break the build. Don’t break backwards compatibility.” He gave an example: if a patch or change simply broke existing software, it could devastate client retention and generate bad press for Microsoft. (Sound familiar, given the last few days?) Steve Ballmer would not appreciate a call from the CTO of a major corporate client screaming that their entire system is broken because of such a change, and as such the cardinal rule is considered their “Prime Directive”… so to speak. That’s interesting.

What I found interesting is that this discussion wasn’t only about IE, but about Longhorn. Its launch is still a long way off, but this kind of rule SHOULDN'T be part of Longhorn. The time to break backwards compatibility is NOW. In Longhorn.

While that statement sinks in and you prepare to send me a nastygram… let me preface it by saying I know this comment makes me look like an ignorant outside observer. I am. I acknowledge that. I live in a small box, and I look at this not from an end user’s perspective… but as a computer security software architect.

Microsoft has made great strides as it relates to designing better security into their operating systems. I have been saying over and over on this weblog that we won’t see any of these significant changes until Longhorn. And I still believe that. Mostly because it takes a few years from the time code is written until it is available to the mass market. We won’t see Longhorn server until at least 2006, and I would bet it’s not really ready until 2007. I base that on the fact that in each of the last three release cycles, the desktop version has shipped about a year before the server one.

Let’s get back to the topic of this post, as it relates to software development and secure coding. If we look at what Microsoft has been doing as of late, we can see that they have made significant changes to build a foundation for a more secure computing experience:

  1. They have created better error-reporting software. They have found that the top 20% of their errors make up 80% of the problems. Capitalizing on this knowledge lets Microsoft prioritize and reduce the bugs that matter the most (see the sketch after this list).
  2. They have created better developer tools to help write more secure software, with the release of tools like PREfix, PREfast, AppVerifier and FxCop. Their only problem right now is that they AREN’T letting developers know about them!
  3. They halted product development for a period of time and retrained their developers to code more securely.
  4. They audited as much product source code as humanly possible and now have a dedicated lead security person for each component of the Windows source code to watch over code quality as it relates to security. Previously they had a cleanup crew come in after the fact and try to sanitize the master sources.
  5. Microsoft has begun to provide more secure defaults when shipping new products. As a clear example, Windows Server 2003 launched with a smaller attack surface than previous versions of their server product.
  6. Microsoft now provides better tools, such as the Microsoft Baseline Security Analyzer, to analyze and audit patch management as it relates to security bugs in a proactive manner.
  7. After major security incidents (like MSBlaster and MyDoom) Microsoft has released tools to help respond to and fix possibly vulnerable and compromised machines. Although these are not timely enough (IMHO), it’s still good to see.
  8. Microsoft has provided a more definitive patch management cycle to address “patch hell” until newer products are released with a significantly smaller attack surface and better code quality.
  9. Microsoft will be providing better integrated firewalling with their Internet Connection Firewall (ICF), to be released with the next service pack of XP. OK, this item isn’t about secure coding… it’s more about a “secure by default” mentality.
  10. Microsoft is being more open about the entire security process. And not just for PR purposes. More articles, documentation and transparent communication are now available through MSDN, Microsoft employee blogs, and Microsoft’s Security webcasts.
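
To make item 1 concrete, here is a toy sketch of that 80/20 prioritization: sort crash buckets by report count and see how few buckets account for 80% of all reports. The bucket names and counts are invented for illustration and are not Watson data.

```c
#include <stdio.h>
#include <stdlib.h>

/* Toy 80/20 prioritization: sort crash buckets by report count and find
   how few buckets cover 80% of all reports. All data below is made up. */

typedef struct { const char *bucket; long reports; } CrashBucket;

static int by_reports_desc(const void *a, const void *b)
{
    long ra = ((const CrashBucket *)a)->reports;
    long rb = ((const CrashBucket *)b)->reports;
    return (rb > ra) - (rb < ra);
}

int main(void)
{
    CrashBucket buckets[] = {
        { "driver_A!Ioctl",  52000 }, { "shell!DragDrop", 31000 },
        { "inetcomm!Parse",  12000 }, { "gdi!BltPath",     2500 },
        { "spooler!AddJob",   1500 }, { "msi!RunScript",     900 },
    };
    size_t n = sizeof(buckets) / sizeof(buckets[0]);
    long total = 0, running = 0;

    for (size_t i = 0; i < n; i++) total += buckets[i].reports;
    qsort(buckets, n, sizeof(buckets[0]), by_reports_desc);

    for (size_t i = 0; i < n; i++) {
        running += buckets[i].reports;
        printf("%-18s %6ld  cumulative %5.1f%%\n",
               buckets[i].bucket, buckets[i].reports, 100.0 * running / total);
        if (running * 5 >= total * 4) {   /* crossed the 80% mark */
            printf("-> top %zu of %zu buckets cover 80%% of reports\n", i + 1, n);
            break;
        }
    }
    return 0;
}
```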

With all these positive moves there is one thing that is missing. I have arrogantly stated in the past that the NT kernel continues to be brittle, riddled with insecurities, and needs to be replaced. I would like to revise that thinking and say that it is now time for the kernel to be refactored.

This argument comes into play because way too much code was written and added in an insecure state before Microsoft retrained its teams to think more securely. The line of reasoning that code bloat means less secure software has been around forever and is based on simple mathematics: as more lines of code are written, complexity rises exponentially and exposes the system to more vulnerability and risk. But this is true of all operating systems… and any code. On the secure coding mailing list (SC-L) we have been spending time recently discussing how to maintain better code quality and design more secure software. It’s not easy.

But I look back on a great article Joel Spolsky wrote in which he stated that Netscape made the single worst strategic mistake any software company can make: they decided to rewrite their code from scratch. He was right. It is much more cost effective to refactor working code that just needs to be cleaned up. From a secure coding perspective, that is a hard trade-off to accept. It is a WAY better idea to design securely from the start, threat model it properly and code it effectively; bolting security on after the fact is much harder. Grafting secure coding practices onto insecure code isn’t always a sane approach, since rewriting that code entirely would often be more effective. This is where refactoring comes in: you can rewrite sections of code and remove “dead weight” as necessary.

This SHOULD be done in Longhorn. Although I am confident that most of the kernel has been rewritten over the years, I think there are entire areas of code that have to be removed or, at the very least, refactored. There are entire subsystems within Windows that should simply be torn out, because they have been superseded by better systems that should themselves be threat modeled, analyzed and refactored. This might (and will) break backwards compatibility with some software. Some people might not like that. Well… Microsoft could follow what Apple did with OSX: include VirtualPC for free and let users run their legacy software in XP or Windows Server 2003 inside a sandboxed virtual machine, bridging the gap until the software vendor has time to update their products, or the client finds an alternative.

Let me give you an example. Why was there a Network DDE Escalated Privilege Vulnerability in Windows 2000 a couple of years ago? Why the hell were people still using DDE in software for Windows 2000, when OLE replaced DDE, COM replaced OLE, and DCOM replaced COM? And guess what… in Longhorn DCOM will be replaced with Indigo! It seems like the PERFECT time to focus on the intricacies of Indigo, design and code it properly (which I would gather they are doing, now that they have been properly trained), and provide a clear and clean upgrade path to the new system. Yet I know Microsoft’s Indigo FAQ states that Longhorn will still include COM+… upgraded to include Indigo. *sigh*

There are lots of examples of this within the system. If you think about it for a moment, there are examples ranging from the driver framework to the graphics layer that could be ripped out, refactored and replaced. Longhorn is the PERFECT time to do it, and the most logical step forward in the evolution of the server operating system from Microsoft. With Microsoft already giving access to the Longhorn APIs, there is no excuse for the learning curve of the new Longhorn API systems to be too difficult for any developer to tackle. Further to this, Microsoft has made great strides to simplify many of the APIs and reduce the total amount of code that needs to be written. If we can agree that more lines of code means more potential vulnerability, we can use simple mathematics to show the risk/return ROI of products being updated to the new system (as it relates to security).
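
For what that “simple mathematics” might look like, here is a toy sketch that assumes a constant residual defect density and compares a hypothetical legacy component against a smaller port to the new APIs. The numbers are invented; the only point is that fewer lines at the same defect rate means fewer expected flaws.

```c
#include <stdio.h>

/* Back-of-the-envelope version of the "fewer lines, fewer bugs" argument.
   The line counts and defect rate are illustrative guesses, not measured
   Microsoft numbers. */
int main(void)
{
    double defects_per_kloc = 5.0;    /* assumed residual defect density    */
    double legacy_kloc      = 25.0;   /* hypothetical legacy implementation */
    double ported_kloc      = 6.0;    /* hypothetical port to the new API   */

    double legacy_defects   = defects_per_kloc * legacy_kloc;
    double ported_defects   = defects_per_kloc * ported_kloc;

    printf("estimated defects: legacy %.0f vs. ported %.0f (%.0f%% fewer)\n",
           legacy_defects, ported_defects,
           100.0 * (legacy_defects - ported_defects) / legacy_defects);
    return 0;
}
```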

A perfect example is the new Filter Manager that is in Longhorn and is now backported to XP SP2 (and hopefully W2K SP5… done yet Darren? 🙂 ), which is used for file system filter drivers (FSFDs). Filter drivers have been a significant problem in the past for Microsoft. Too many third-party drivers (anti-virus, encryption drivers, etc.) didn’t play nice together and would choke a system. They didn’t scale well, had stability issues and were all-around ugly when interoperating with other drivers. I know in one case I used to be able to install two separate antivirus drivers and freeze my system! Microsoft hosts “Plugfests” for interoperability testing to help mitigate these risks… but made a smarter decision and simplified the framework to reduce the actual amount of code you need to write for an FSFD. This forward-thinking maneuver will benefit Microsoft significantly… third-party code will gain in security, stability and performance, and will interoperate better. Complex, buggy legacy drivers will be a thing of the past… which only helps the Longhorn platform.
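
To give a sense of how little code the new model demands, here is a minimal minifilter skeleton against the Filter Manager API (fltKernel.h) with its callbacks stubbed out. It is an illustrative sketch rather than a shipping driver: a real minifilter also needs an INF with an altitude and instance setup, and the legacy model required far more attach/detach plumbing than this.

```c
#include <fltKernel.h>

/* Minimal minifilter skeleton against the Filter Manager API.
   Illustrative sketch only: callbacks are stubs, and a real driver
   also needs registry/INF setup (altitude, instance definitions). */

static PFLT_FILTER gFilterHandle = NULL;

static FLT_PREOP_CALLBACK_STATUS
PreCreate(PFLT_CALLBACK_DATA Data, PCFLT_RELATED_OBJECTS FltObjects,
          PVOID *CompletionContext)
{
    UNREFERENCED_PARAMETER(Data);
    UNREFERENCED_PARAMETER(FltObjects);
    UNREFERENCED_PARAMETER(CompletionContext);
    /* Inspect or veto the create here; no post-callback requested. */
    return FLT_PREOP_SUCCESS_NO_CALLBACK;
}

static NTSTATUS
FilterUnload(FLT_FILTER_UNLOAD_FLAGS Flags)
{
    UNREFERENCED_PARAMETER(Flags);
    FltUnregisterFilter(gFilterHandle);
    return STATUS_SUCCESS;
}

static const FLT_OPERATION_REGISTRATION Callbacks[] = {
    { IRP_MJ_CREATE, 0, PreCreate, NULL },
    { IRP_MJ_OPERATION_END }
};

static const FLT_REGISTRATION FilterRegistration = {
    sizeof(FLT_REGISTRATION),   /* Size */
    FLT_REGISTRATION_VERSION,   /* Version */
    0,                          /* Flags */
    NULL,                       /* ContextRegistration */
    Callbacks,                  /* OperationRegistration */
    FilterUnload                /* FilterUnloadCallback; rest left NULL */
};

NTSTATUS
DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    NTSTATUS status;

    UNREFERENCED_PARAMETER(RegistryPath);

    status = FltRegisterFilter(DriverObject, &FilterRegistration, &gFilterHandle);
    if (!NT_SUCCESS(status))
        return status;

    status = FltStartFiltering(gFilterHandle);
    if (!NT_SUCCESS(status))
        FltUnregisterFilter(gFilterHandle);

    return status;
}
```

The volume attachment, IRP plumbing and unload synchronization that a legacy filter had to hand-roll is handled by the Filter Manager itself, which is exactly why the per-driver code shrinks so much.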

Anyways, enough ranting. You get my point. I think a quote I like from Gene Spafford sums this up best:

“When the code is incorrect, you can’t really talk about security. When the code is faulty, it cannot be safe.”

You may now send me your nastygrams. If they are constructive, please post them here. If not… send them to /dev/null.

[Dana Epp's ramblings at the Sanctuary]
