Open Source Software and Source Code Analysis: A natural match

Author: Ben Chelf

Sleepless nights

Six years ago this week, I first came to understand the words “graduate student.” Over the span of five days, I slept a total of two hours; the rest of my time went to hacking, eating, injecting caffeine into my bloodstream, and trying to fight off hallucinations of penguins dancing around my feet. Why was I depriving myself? I was searching for bugs in Linux.

Introducing the concepts of “meta-compilation” to the world through that first publication in OSDI (Operating Systems Design and Implementation) proved to be a stepping stone not only for Dawson Engler’s research group at Stanford University but also for source code analysis as a whole. By demonstrating the power of this new technique on a code base as large and well tested as Linux, we got people’s attention about a new way to find bugs in code.

Now, six years later, we’ve been able to take this technology to the broader open source community and make it part of the open source development model. Ahead, I’ll tell you what keeps me up at night these days and why it’s paving the way for a more reliable, secure Linux.

Measuring software quality – a new approach

In the past decade, the open source model of software development has gained tremendous visibility and validation with the commercial success of projects like Linux, Apache, and MySQL. This new model, based on the “many eyes” approach, has led to fast-evolving software that is being used in production environments by countless commercial enterprises. While the perceived quality of these packages is very high (“all bugs are shallow”), there has been no clear metric for measuring their quality.

In the past few years, Coverity (the entity established to commercialize the Stanford meta-compilation technology) has introduced new source code analysis technology that analyzes millions of lines of code automatically, discovering defects that cause run-time crashes, performance degradation, incorrect program behavior, and even exploitable security vulnerabilities. Of course, using this technology to improve software is its primary benefit and the reason why thousands of developers are using it every day. However, since source code analysis is unbiased and analyzes 100% of paths through any given code base, it can also be used as a metric to measure the quality and security of a given body of code.

No metric is perfect. To propose the results of source code analysis as an absolute measure of quality would certainly be misguided, since no automated analysis can detect all of the bugs in software. However, many types of common program-level defects are detectable by Coverity, so the results of our scans are a good measure of overall software quality. They also represent a consistent, repeatable metric through which to compare two code bases. Unlike cruder, more indirect measuring sticks like cyclomatic complexity, source code analysis provides actionable, easy-to-verify defect cases that pinpoint the root of a real software problem. Consider the two approaches:

Cyclomatic complexity framework (1) “Function ‘foo’ has too many paths through it.”

Source code analysis framework (2) “Function ‘foo’ has a memory leak on line 73 that is the result of an allocation on line 34 and the following path decisions on lines 38, 54, and 65.”

The first result does nothing to indicate poor code quality by itself – it is purely circumstantial. Given limited information, we can infer a lot more about the quality of a code base from the second result, which represents a real bug that we know degrades the software’s performance.
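The contrast between the two reports can be made concrete with a toy model. The sketch below is an illustration only: the control-flow graph, the line numbers other than those quoted above, and the helper names (`CFG`, `EFFECTS`, `find_leaks`, `describe`) are all hypothetical, and the brute-force path walk is not how Coverity's analysis works. It handles only a small acyclic graph, whereas a real analyzer must cope with loops, aliasing, and calls across functions.

```python
# Toy control-flow graph (CFG) for a hypothetical function 'foo'.
# Keys are source line numbers; values are the lines control can reach next.
# The layout is invented to match the example report in the article.
CFG = {
    34: [38],      # p = malloc(...)
    38: [40, 54],  # first decision
    40: [80],      # free(p)
    54: [56, 65],  # second decision
    56: [80],      # free(p)
    65: [67, 73],  # third decision
    67: [80],      # free(p)
    73: [81],      # early return, p never freed
    80: [81],      # normal return
    81: [],        # function exit
}
EFFECTS = {34: "alloc", 40: "free", 56: "free", 67: "free"}


def cyclomatic_complexity(cfg):
    """McCabe's E - N + 2 for a single connected graph."""
    edges = sum(len(successors) for successors in cfg.values())
    return edges - len(cfg) + 2


def find_leaks(cfg, effects, entry, exit_node):
    """Walk every path of an acyclic CFG; report allocations still held at exit."""
    leaks = []

    def walk(node, path, held):
        path = path + [node]
        effect = effects.get(node)
        if effect == "alloc":
            held = node
        elif effect == "free":
            held = None
        if node == exit_node:
            if held is not None:
                leaks.append((held, path))
            return
        for successor in cfg[node]:
            walk(successor, path, held)

    walk(entry, [], None)
    return leaks


def describe(cfg, leak):
    """Format one leak in the style of the article's second report."""
    alloc_line, path = leak
    decisions = [n for n in path if len(cfg[n]) > 1]
    leak_line = path[-2]  # last statement before the exit node
    return (f"memory leak on line {leak_line}: allocation on line {alloc_line}, "
            f"path decisions on lines {', '.join(map(str, decisions))}")


print("cyclomatic complexity:", cyclomatic_complexity(CFG))
for leak in find_leaks(CFG, EFFECTS, entry=34, exit_node=81):
    print(describe(CFG, leak))
```

The complexity figure for this graph is a single number (4) that says nothing about whether any of the four paths is wrong. The path walk reports the one path that reaches the exit with the allocation still held, along with the decisions that lead there, which is what a developer needs in order to verify and fix the defect.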

Enter scan.coverity.com

In March, Coverity introduced http://scan.coverity.com, a website dedicated to the continual scanning of many popular open source projects, and offered access to the defect databases for the many developers of those projects. The initial results are summarized in the sidebar. As you can see, Linux stacked up quite well against the other open source packages analyzed, especially considering its complexity:



SIDEBAR: The average defect density for the 32 open source packages that Coverity initially analyzed was 0.434 defects per thousand lines of code. The standard deviation for this set of results was 0.243. Table 1 shows the raw data including lines of code analyzed, number of defects found, analysis time, and defect density. Graph 1 shows the distribution of defect density based on ranges that represent 1/2 of a standard deviation. Graph 2 shows a comparison of the LAMP stack with the baseline derived from the analysis of the 32 open source packages. The average defect density for LAMP was 0.290 and all but one of the LAMP packages had a better than average defect density.


Graph 1


Graph 2 (*core kernel code only)
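The arithmetic behind the sidebar can be sketched in a few lines. The function names and the sample package (30 defects in 100,000 lines) are hypothetical and are not taken from the scan results; only the baseline mean, the standard deviation, and the LAMP average come from the sidebar. The sidebar does not say how the half-standard-deviation ranges in Graph 1 are aligned, so the sketch assumes that range edges sit at the mean and at every half standard deviation from it.

```python
import math

# Baseline figures quoted in the sidebar (defects per thousand lines of code).
BASELINE_MEAN = 0.434
BASELINE_STD = 0.243


def defect_density(defects, lines_of_code):
    """Defects per thousand lines of code (KLOC)."""
    return defects / (lines_of_code / 1000)


def half_sigma_bin(density, mean=BASELINE_MEAN, std=BASELINE_STD):
    """Return the half-standard-deviation range containing `density`,
    as (low, high) offsets from the mean in units of std.
    Assumes range edges sit at the mean and every 0.5 std from it."""
    z = (density - mean) / std
    k = math.floor(z / 0.5)
    return (k * 0.5, (k + 1) * 0.5)


# Hypothetical package, not taken from the scan results.
print(defect_density(defects=30, lines_of_code=100_000))
# LAMP average quoted in the sidebar.
print(half_sigma_bin(0.290))
```

Under these assumptions, the LAMP average of 0.290 lies between 1.0 and 0.5 standard deviations below the baseline mean.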

The response

As of the time of writing, http://scan.coverity.com has been live for just under six weeks. In that time, Coverity has granted access to over 500 developers so that they can promptly address the defects discovered in their codebases. These developers have already submitted over 2000 patches to fix those defects, an impressive total which helps validate the scan results as a useful resource in the development process and an indicator of code quality.

The Linux development community has been very responsive in tackling the defects discovered in the kernel. Over 100 defects have already been patched with another 80 tagged as bugs to be addressed. In addition, the developers of the High Availability Linux project have fixed all 30 defects found by Coverity in Linux-HA. To date, 11 packages are at or near 0 defects on http://scan.coverity.com (Amanda, ethereal, glibc, icecast, Linux-HA, OpenLDAP, OpenPAM, Python, Samba, SQLite, and XMMS); Coverity has also expanded the list of projects scanned from the initial 32 to 50 and growing.

Thank you, open source software

It should not be surprising that the increase in the popularity of the open source development model led directly to technological breakthroughs in source code analysis. Traditionally, source code analysis has been subject to complaints regarding either scalability (i.e., the algorithms show promise but can’t handle millions of lines of code), false positive rate (e.g., try running lint on millions of lines of code!), or both. Though there are billions upon billions of lines of code in existence, most of it resides behind the corporate veil, and so people aiming to improve software quality did not have a large sandbox in which to test their methods. The popularity of Linux and the open source model is changing that, giving researchers and commercial entities like Coverity a tremendous library of real world code on which to refine practical source code analysis. Now, this technology is being used by both commercial and open source developers, and all the world’s software stands to benefit.

10/2007, Ben Chelf



Ben Chelf is Coverity's Chief Technology Officer. Mr. Chelf was a founding member of the Stanford Computer Science Laboratory team that architected and developed Coverity's technology.