really long weblog post on .NET startup/shutdown and other matters

Chris Brumme is one of the key architects on .NET (and hence, on the next version of Windows). Today he released a really long weblog post on .NET startup/shutdown and other matters, which is really more like a book chapter than a weblog post. His stuff is super technical.

But if you get near the end of his 9000-word piece, you'll find what he wrote about Windows security. I thought it was so important that I'm reprinting it here in full.

Chris Brumme says: “I haven’t blogged in about a month. That’s because I spent over 2 weeks (including weekends) on loan from the CLR team to the DCOM team. If you’ve watched the tech news at all during the last month, you can guess why. It’s security.

From outside the company, it’s easy to see all these public mistakes and take a very frustrated attitude. “When will Microsoft take security seriously and clean up their act?” I certainly understand that frustration. And none of you want to hear me whine about how it’s unfair.

The company performed a much publicized and hugely expensive security push. Tons of bugs were filed and fixed. More importantly, the attitude of developers, PMs, testers and management was fundamentally changed. Nobody on our team discusses new features without considering security issues, like building threat models. Security penetration testing is a fundamental part of a test plan.

Microsoft has made some pretty strong claims about the improved security of our products as a result of these changes. And then the DCOM issues come to light.

Unfortunately, it’s still going to be a long time before all our code is as clean as it needs to be.

Some of the code we reviewed in the DCOM stack had comments about DGROUP consolidation (remember that precious 64KB segment prior to 32-bit flat mode?) and OS/2 2.0 changes. Some of these source files contain comments from the ‘80s. I thought that Win95 was ancient!

I’ve only been at Microsoft for 6 years. But I’ve been watching this company closely for a lot longer, first as a customer at Xerox and then for over a decade as a competitor at Borland and Oracle. For the greatest part of Microsoft’s history, the development teams have been focused on enabling as many scenarios as possible for their customers. It’s only been for the last few years that we’ve all realized that many scenarios should never be enabled. And many of the remainder should be disabled by default and require an explicit action to opt in.

One way you can see this change in the company’s attitude is how we ship products. The default installation is increasingly impoverished. It takes an explicit act to enable fundamental goodies, like IIS.

Another hard piece of evidence that shows the company’s change is the level of resource that it is throwing at the problem. Microsoft has been aggressively hiring security experts. Many are in a new Security Business Unit, and the rest are sprinkled through the product groups. Not surprisingly, the CLR has its own security development, PM, test and penetration teams.

I certainly wasn’t the only senior resource sucked away from his normal duties because of the DCOM alerts. Various folks from the Developer Division and Windows were handed over for an extended period. One of the other CLR architects was called back from vacation for this purpose.

We all know that Microsoft will remain a prime target for hacking. There’s a reason that everyone attacks Microsoft rather than Apple or Novell. This just means that we have to do a lot better.

Unfortunately, this stuff is still way too difficult. It’s a simple fact that only a small percentage of developers can write thread-safe free-threaded code. And they can only do it part of the time. The state of the art for writing 100% secure code requires that same sort of super-human attention to detail. And a hacker only needs to find a single exploitable vulnerability.

I do think that managed code can avoid many of the security pitfalls waiting in unmanaged code. Buffer overruns are far less likely. Our strong-name binding can guarantee that you call who you think you are calling. Verifiable type safety and automatic lifetime management eliminate a large number of vulnerabilities that can often be used to mount security attacks. Consideration of the entire managed stack makes simple luring attacks less likely. Automatic flow of stack evidence prevents simple asynchronous luring attacks from succeeding. And so on.

But it’s still way too hard. Looking forwards, a couple of points are clear:

1) We need to focus harder on the goal that managed applications are secure, right out of the box. This means aggressively chasing the weaknesses of our present system, like the fact that locally installed assemblies by default run with FullTrust throughout their execution. It also means static and dynamic tools to check for security holes.

2) No matter what we do, hackers will find weak spots and attack them. The very best we can hope for is that we can make those attacks rarer and less effective.

I’ll add managed security to my list for future articles.”

Thanks Chris, hope you don't mind me reprinting this. [The Scobleizer Weblog]
