Supercomputers: interconnected servers running Linux

The list of the top-500 Supercomputers was released today at top500.org.

The list shows a number of interesting statistics. I am including some comments on the changes observed from 2000 to 2007.

  • Operating Systems. The stats show a significant change in seven years: Unix fell from 90 to 12 percent while Linux rose from 5.60 to 77.80 percent.
  • Processor Family. An equally significant change is apparent here: RISC-based technologies, namely IBM's Power, Sun's SPARC, MIPS, Alpha and PA-RISC, lost ground to Intel and AMD, whose combined share grew from 4.00 to 78.80 percent.
These changes indicate that, in contrast to earlier vector-based technologies, 80% of today's top-500 supercomputers are configured as thousands of garden-variety, conventional Intel and AMD microprocessors, i.e. scalar computers, running Linux and interconnected, clustered, by several network technologies.
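The 78.80 percent figure for Intel and AMD combined can be reproduced directly from the Processor Family table later in this post (counts are systems out of the 500 listed):

```python
# Combined Intel + AMD share of the June 2007 Top-500 list,
# using the system counts from the Processor Family table below.
counts_2007 = {
    "Intel EM64T": 231,
    "AMD x86_64": 107,
    "Intel IA-32": 28,
    "Intel IA-64": 28,
}
total_systems = 500

x86_count = sum(counts_2007.values())          # 394 systems
x86_share = 100.0 * x86_count / total_systems  # 78.8 percent
print(x86_count, x86_share)
```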

The use of thousands of clustered Intel and AMD based servers for supercomputers may explain why tier-one vendors continue to devote much R&D and marketing to this segment. It may be regarded as a niche, but today it apparently represents 19% of the server market and is growing at 9% per year.

The industry seems to have found that there is much in common between the technology components needed for supercomputers and those needed by Google, Amazon, Yahoo, YouTube, Microsoft and others to power the growth in network-based services.

A key component that often draws less attention than real estate, power, storage, servers, OS and application software is the technology used to interconnect, to cluster, servers.

Core and leaf switches and their accompanying cables represent a huge expense and a limiting factor in floor space, weight and computing capacity.

Sun announced Constellation, a system offering the building blocks around a connectivity technology that promises to simplify configuring supercomputers, should I say web services, scaling from tera- to peta-flops.

The heart of Sun's Constellation is Magnum, a high-density 3456-port InfiniBand switch that simplifies the configuration and logistics of interconnecting large numbers of servers.
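A rough, back-of-the-envelope illustration of why port density matters (the 24-port figure and the half-down/half-up port split are my assumptions for a generic two-tier design, not Sun's numbers):

```python
import math

hosts = 3456        # one Magnum switch can connect this many servers
leaf_ports = 24     # a typical small InfiniBand leaf switch (assumed)

# In a two-tier design, roughly half the leaf ports face the hosts and
# half face the core, so a 24-port leaf serves about 12 hosts.
leaves = math.ceil(hosts / (leaf_ports // 2))
print(leaves)       # leaf switches needed, before counting the core tier
```

A single 3456-port switch collapses that whole tier, and its cabling, into one box, which is the simplification the post refers to.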
+------------- Top-500 Supercomputers ----------+
+------------- Number of Processors ------------+
Date       Processors   Count   Share %

2007.June  1                1    0.20 %
           33-64            3    0.60 %
           65-128           5    1.00 %
           129-256          2    0.40 %
           257-512         81   16.20 %
           513-1024       126   25.20 %
           1025-2048      176   35.20 %
           2049-4096       53   10.60 %
           4k-8k           33    6.60 %
           8k-16k          14    2.80 %
           16k-32k          3    0.60 %
           32k-64k          2    0.40 %
           64k-128k         1    0.20 %

+------------- Interconnection Technology ------+
Date       Technology          Count   Share %

2007.June  Gigabit Ethernet      206   41.20 %
           InfiniBand            128   25.60 %
           Myrinet                46    9.20 %
           SP Switch              36    7.20 %
           Proprietary            35    7.00 %
           NUMAlink               15    3.00 %
           Quadrics               11    2.20 %
           Crossbar               10    2.00 %
           Cray Interconnect       9    1.80 %
           Mixed                   4    0.80 %

+------------- Operating Systems ---------------+
Date       OS          Count   Share %

2007.June  Linux         389   77.80 %
           Unix           60   12.00 %
           Mixed          42    8.40 %
           BSD Based       4    0.80 %
           Mac OS          3    0.60 %
           Windows         2    0.40 %

2000.June  Unix          453   90.60 %
           Linux          28    5.60 %
           BSD Based      17    3.40 %
           N/A             2    0.40 %

+------------- Processor Family ----------------+
Date       Processor         Count   Share %

2007.June  Intel EM64T         231   46.20 %
           AMD x86_64          107   21.40 %
           Power                85   17.00 %
           Intel IA-32          28    5.60 %
           Intel IA-64          28    5.60 %
           PA-RISC              10    2.00 %
           NEC                   4    0.80 %
           SPARC                 3    0.60 %
           Alpha                 2    0.40 %
           Cray                  2    0.40 %
           Intel + AMD         394   78.80 %  (combined)

2000.June  Power               143   28.60 %
           SPARC               122   24.40 %
           MIPS                 62   12.40 %
           Alpha                56   11.20 %
           PA-RISC              53   10.60 %
           NEC                  25    5.00 %
           Fujitsu              19    3.80 %
           Hitachi SR8000       10    2.00 %
           Cray                  6    1.20 %
           Intel IA-32           3    0.60 %
           Intel i860            1    0.20 %

An open source content management solution

Alfresco is an open source Enterprise Content Management (ECM) alternative to closed products such as Documentum, IBM DB2 Content Manager, FileNet, OpenText, Interwoven, Vignette and Microsoft's SharePoint, among others.

Alfresco was developed using exclusively open source components such as Spring, Hibernate and Lucene. It represents a standards-based alternative to expensive, closed, commercial ECM products. Referenced standards include JSR-168, JSR-170 and JSR-283.

Alfresco represents a good example of a business based on open source components and an open development culture, the Bazaar model, resulting in a content management solution that can be tailored to small and large organizations. Alfresco is licensed under the GPL.

The list of customers using it for ECM, collaboration, workflow, document, web, records and image management is impressive for a relatively new product.

It is worth mentioning that Alfresco produced a fully functioning, scalable, open ECM alternative to conventional products in less than a year of development effort using the open source components listed below. It represents a very good example of a collaborative, open and successful business developed by reusing existing components.

Alfresco's selection of open source components is a valuable reference, a list of chosen components among a large selection, as well as an excellent example of application development using existing well tested components free of proprietary licenses, royalties, patents and other demons.

Most components, with the exception of Spring, OpenOffice, Hibernate and Lucene, are small, cleverly crafted pieces of software, prepackaged unique functions, chosen by Alfresco to deliver the resulting functional integration in place of custom code.

  • Spring. Spring is an application framework for Java.
  • Open Office. OpenOffice.org is a multiplatform and multilingual office suite.
  • Hibernate. Hibernate is a high performance object/relational persistence and query Java library.
  • Lucene. Apache Lucene is a text search engine library written in Java.
  • MyFaces. Apache MyFaces is an implementation of JavaServer Faces, a web application framework.
  • FreeMarker. FreeMarker is a template engine to generate text output based on templates.
  • Rhino. Rhino is an implementation of JavaScript written in Java typically embedded into Java applications to provide scripting to end users.
  • EHCache. Ehcache is a java-based distributed cache for general purpose caching.
  • ACEGI. Acegi Security provides applications with authentication, authorization and access control.
  • Log4j. Log4j provides logging for Java applications.
  • jBPM. jBPM is a workflow and business process library for Java applications.
  • Axis. Apache Axis is an XML based Web service framework.
  • POI. Apache POI project consists of APIs for manipulating various Microsoft file formats using Java.
  • Xfire. XFire facilitates use of Web Services, via SOAP, for a Java application.
  • Quartz. Quartz is a job scheduling system for Java applications.
  • PDFBox. PDFBox is an open source Java PDF library for working with PDF documents.
  • TinyMCE. TinyMCE is a web based Javascript HTML WYSIWYG editor.
  • Jaxen. Jaxen is a Java XPath engine to search and extract information from XML documents.
  • JCR RMI. Apache Jackrabbit JCR-RMI is a Remote Method Invocation (RMI) layer for the Content Repository for Java; Apache Jackrabbit implements JSR-170.
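To give a feel for what a component like Lucene does at its core, here is a toy inverted index in Python. This is illustrative only; Lucene itself is a Java library with vastly more sophisticated analysis, scoring and storage.

```python
from collections import defaultdict

# Inverted index: term -> set of document ids containing that term.
index = defaultdict(set)

docs = {
    1: "open source content management",
    2: "content repository for java",
    3: "open standards for content",
}
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(*terms):
    """Return ids of documents containing every query term (AND query)."""
    results = [index[t] for t in terms]
    return sorted(set.intersection(*results)) if results else []

print(search("open", "content"))   # -> [1, 3]
```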
Other open source content management systems include:


Service Oriented Architecture - SOA

Service-Oriented Architecture - SOA - is a key technology for developing network-based services. I was pleased to find a reference to a talk by Patrick Steger at the International Conference on Java Technology.

Just reading the abstract tells me that the subject is detailed and well structured, accompanied by a working code sample. I'll post the video reference if available; please post it as a comment should you find it; thanks.

On the subject of network-based services, here is a reference to the Economist's article: A battle at the checkout.

Abstract - Standards for an interoperable, secure and flexible SOA

"SOA (Service-Oriented Architecture) is becoming the central strategy for more and more companies and therefore getting business critical. An enduring SOA has to provide very high grades of security and availability combined with good interoperability and usability to protect both, the valuable assets and the often tremendous investments of the company.

Based on WSIT (Web Services Interoperability Technologies, SUN Microsystems) and WCF (Windows communication foundation, Microsoft) an interoperable, secure and flexible SOA is feasible today. This talk will provide you with the theoretical background of the standards you need to know when aiming for that target.

During the talk we will create a simple yet secure and interoperable SOA system centred on the well known Calculator service.

The SOA is built on a step by step basis and introduces the following major security relevant WS-Standards:

  • XML Encryption
  • XML Signature
  • WS-Security
  • WS-MetadataExchange
  • WS Secure Exchange
  • WS-Trust
  • WS-SecureConversation
  • WS-SecurityPolicy
  • Security Assertion Markup Language (SAML)
  • eXtensible Access Control Markup Language (XACML)
For each Standard you will learn its purpose, status and relationship to the other standards.

The final SOA system supports a scenario where a client application requests metadata from a Calculator Service and uses that metadata to obtain the SecurityPolicy of that service. In addition the location of the Authentication Service issuing the required SAML Token to access the Calculator Service is retrieved from the metadata.

The client then authenticates with the central Authentication Service and receives a SAML Token in return. Using the SAML Token the client calls the Calculator Service's add operation.

The Calculator Service validates the SAML Token and asks the central Authorization Service to check the authorization of the client to use the add operation with the given parameters."
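The issue-then-validate pattern in the quoted scenario can be sketched conceptually in Python. To be clear, this is not WS-Trust or SAML (those are XML standards with far richer semantics and PKI-based trust); it only illustrates the shape of the flow with an HMAC-signed token, and the service names mirror the Calculator example above.

```python
import base64, hashlib, hmac, json

# Shared secret between the Authentication Service and the Calculator
# Service; in the real SOA this trust would come from WS-Trust and PKI,
# not a hard-coded key.
SECRET = b"demo-shared-secret"

def issue_token(subject):
    """Authentication Service: sign claims for an authenticated client."""
    claims = json.dumps({"sub": subject, "svc": "Calculator"}).encode()
    sig = hmac.new(SECRET, claims, hashlib.sha256).digest()
    return base64.b64encode(claims) + b"." + base64.b64encode(sig)

def validate_token(token):
    """Calculator Service: verify the signature before serving a call."""
    claims_b64, sig_b64 = token.split(b".")
    claims = base64.b64decode(claims_b64)
    expected = hmac.new(SECRET, claims, hashlib.sha256).digest()
    return hmac.compare_digest(base64.b64decode(sig_b64), expected)

def add(token, a, b):
    """The Calculator Service's add operation, gated by the token."""
    if not validate_token(token):
        raise PermissionError("invalid token")
    return a + b

token = issue_token("alice")
print(add(token, 2, 3))   # 5
```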


Of Open-source, the Web and Hybrid cars

What do Open-source, the Web and Hybrid cars have in common?

They are cool.

Young and not so young, students, professors, geeks, retirees, movie-stars, are using, talking, reading and writing about Open-source software, the Web and Hybrid cars, among other topics. They are participating in and relating to day-to-day aspects, events of impact to their lives, their families, their neighbors, their surroundings, the environment.

Open-source stood against the country-club and gated community approach to software development; against an exclusionary way to develop function and value earlier thought available to selected few.

Open-source started with the premise that software is priceless and that all should benefit from it; like water, forests, rivers, oceans, fisheries, fauna, flora, atmosphere, etc. These are primordial resources, they are priceless, they must be managed and preserved in such a way that we can truthfully say we left them better than we found them. Same with software source code; you can use it and you can enhance it as long as such enhancements are known and available to all to see, to adopt, to improve, to distribute, to preserve for use by future developers.

The Web also fostered the idea of equal participation, not only to consume but to produce information, a privilege earlier restricted to a select few. The Web was going to be MSN and AOL, available only via paid membership. The Internet and later the Web, thanks to Sir Tim Berners-Lee, changed that, and as with Open-source it created a culture, a much more universal and open model, a way to look at primary resources, in this case applied to data, information and knowledge.

The use and evolution of Hybrid automobiles, and of alternate sources of energy in general, also seem largely driven by concerned individuals, groups and communities. The technology has existed for decades with little if any leadership from governments or industry. The success of Hybrid cars owes much to the ongoing effort to go beyond manufacturers' designs, beyond the 'I have a Hybrid car' statement, enhancing the efficiency of these vehicles by adapting better batteries and plug-in connections to the grid, using solar energy, modifying driving habits, etc., resulting in twice the efficiency of the manufacturer's design.

It was not Toyota, Honda, Ford, GM et al that made plug-ins for hybrids available. It was consumers that demanded them, and relatively small companies offered better batteries and adapters for connection to the grid and to solar panels, resulting in 17 to 29 percent more fuel efficiency. These are not industry or government initiatives; these are concerned individuals, asking, trying, experimenting, using, measuring, enhancing. Does it sound like Open-source? It does. Also, it helps when movie-stars and politicians get involved. It is cool.

Have a look at the links below for selected references on the subject.

Open-source generally implies the adoption of an 'open culture' model for software development only, but it could very well apply to other domains. Have a look at this article in the Economist postulating the use of the Open-source culture/model in Health Care.
Can goodwill, aggregated over the internet, produce good medicine?
The current approach to drug discovery works up to a point, but it is far from perfect. It is costly to develop medicines and get regulatory approval. The patent system can foreclose new uses or enhancements by outside researchers. And there has to be a consumer willing (or able) to pay for the resulting drugs, in order to justify the cost of drug development.

Pharmaceutical companies have little incentive to develop treatments for diseases that particularly afflict the poor, for example, since the people who need such treatments most may not be able to afford them.


Server form factors

Pedestal, Rack-Mount and Blades are the common configuration form-factors for server units.

In the quest to maximize computing capacity and minimize floor-space, energy consumption and cooling requirements, Rack-Mount units are generally preferred to Blades. Blades may offer better packaging but they exhibit one distinct shortcoming:

  • Shared computing resources. A switch is included to share network and I/O among the packaged Blade computing units. When hosting large applications and/or offering virtual hosting, this resource sharing is a problem, which is why sites generally prefer Rack-Mount units.
The recently announced Sun Blade 6000 offers the packaging advantages of Blade format while configuring ten completely independent servers - no sharing of computing resources among participating server units.

The salient points of this design include the following:
  • More memory and more I/O. Sun claims up to double the memory and I/O capacity of competing Blade and Rack-Mount configurations
  • Reduced energy requirements. Shared power and cooling saves energy when compared to equivalent Rack-Mount servers
  • Processor choice. AMD Opteron, Intel Xeon and SPARC processor-based server modules, as well as support for Linux, Solaris, and MS Windows operating systems
  • Standard I/O. PCI Express, including support for 1 and 10 Gigabit Ethernet technology.
A good test is to see whether Google, known to use close to half a million Rack-Mount units, switches instead to Blades for use in its many new Datacenters.

The Web: is it the future Datacenter?

Yahoo, Amazon and Google, among others, are effectively forging an emerging computing model as an alternative to setting up and operating conventional computing facilities for a company.

This emerging model offers vast amounts of computing resources to small and not so small companies to host their applications and services using a common 'computing cloud', as described in this Economist article in reference to the partnership between Google and Salesforce.com.

Examples of this 'computing in the cloud' include services such as:

  • network.com. Sun's $1/CPU-hr, pay-per-use computing service, offers a catalogue of registered applications as well as the ability to develop, test and operate custom applications across the Internet.
  • Amazon. Amazon's Simple Storage Service (S3) offers unlimited storage via a programmable interface priced at $0.15 per GB-month of storage used. Elastic Compute Cloud (EC2) offers virtual on-demand Linux images that live in S3 for booting and stopping; root access is provided.
  • salesforce.com. "Planning and implementing customer relationship management (CRM) solutions can be a significant undertaking. Salesforce.com's Successforce helps you succeed by unlocking the power of our business solutions and providing you with the greatest value from your investment."
  • Google. Several Google services and APIs as referenced here.
This trend does not mean that Datacenters will be replaced by Web-based services. No; but what will likely happen is that 'computing in the cloud' will be increasingly attractive to at least two groups of applications/services:
  1. New companies/services. Avoiding a large capital expenditure and the logistics and cost of operating computing facilities will be very attractive for new businesses and services.
  2. New applications within large corporations. Often in-house IT organizations are unable to respond rapidly to new applications/services and corporate end-users may look to external providers for solutions.
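Part of the attraction is how simple the pricing arithmetic becomes. At the S3 rate quoted above, $0.15 per GB-month, the storage bill is a one-line calculation (transfer and request fees, which also apply, are ignored here):

```python
# Monthly S3 storage cost at the article's quoted rate of $0.15/GB-month.
RATE_PER_GB_MONTH = 0.15

def monthly_storage_cost(gigabytes):
    """Storage-only cost in dollars; excludes transfer and request fees."""
    return gigabytes * RATE_PER_GB_MONTH

print(monthly_storage_cost(100))    # 100 GB costs $15/month
print(monthly_storage_cost(1024))   # ~1 TB costs about $153.60/month
```

No capital expenditure, no capacity planning: the cost scales linearly with what is actually stored, which is exactly the appeal for the two groups above.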


Sun's Zettabyte File System - ZFS

A weak point of Unix and Linux operating systems is the File System.

Defining and managing disks is difficult and requires expertise to configure and manage external storage; it is a complex and error-prone task. But most serious are File System corruptions, the inability to rebuild, and/or excessive recovery times, problems present even when using newer journal/log-based File Systems.

Sun's Zettabyte File System, ZFS, is a fresh approach to how data is structured and stored on a disk, or set of disks, while addressing ease of use and File System integrity. The salient features of ZFS include:

  • Integrity. The system does not overwrite data; it saves new data first and then deletes the data it replaces. It also includes several built-in checks to prevent data corruption.
  • Capacity. Wikipedia offers a good reference on the capacity metrics of ZFS: "ZFS is a 128-bit file system, so it can store 18 billion billion (18.4 x 10^18) times more data than current 64-bit systems." The limits of ZFS are unlikely ever to be encountered in practice. The capacity limit for one ZFS storage pool, a zpool, is 2^128 = 3.4 x 10^38 bytes, and the number of zpools is limited to 2^64 = 1.8 x 10^19.
  • Snapshot. In-place copies of the File System can be taken at any time, with minimal overhead, and made available concurrently as required, facilitating access to previous on-line file copies and to backups. This is a very useful feature, and it appears to use the same or a similar approach as NetApp's highly successful storage units with their WAFL File System.
  • Quotas. The system supports quotas at a File System level.
  • Management. ZFS uses the concept of pooled storage; simply plug in additional drives without worrying about storage parameters such as volumes or partitions. This approach significantly reduces the labour required to define, expand and manage storage.
  • Architecture. ZFS implementations exist for SPARC and for Intel/AMD x86.
  • Implementations. Initially ZFS was available as part of Solaris, but a Linux implementation now exists, and most recently it is rumored to be included in the upcoming Mac OS X 10.5, Leopard.
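The never-overwrite idea behind the Integrity and Snapshot points can be illustrated with a toy copy-on-write store in Python. This is a conceptual sketch only; ZFS works at the block level with checksummed trees, not per-file version maps.

```python
class CowStore:
    """Toy copy-on-write store: writes never overwrite old versions,
    so a snapshot is just a cheap copy of the current version map."""

    def __init__(self):
        self.blocks = {}      # (name, version) -> data; never overwritten
        self.current = {}     # name -> latest version number
        self.snapshots = {}   # label -> frozen version map

    def write(self, name, data):
        version = self.current.get(name, 0) + 1
        self.blocks[(name, version)] = data    # old versions stay intact
        self.current[name] = version

    def snapshot(self, label):
        # Cost is proportional to the number of files, not the data size.
        self.snapshots[label] = dict(self.current)

    def read(self, name, snapshot=None):
        versions = self.snapshots[snapshot] if snapshot else self.current
        return self.blocks[(name, versions[name])]

store = CowStore()
store.write("f", "v1")
store.snapshot("monday")
store.write("f", "v2")
print(store.read("f"))             # "v2"
print(store.read("f", "monday"))   # "v1": the snapshot still sees old data
```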
Here is a link to Jeff Bonwick's blog. Jeff is the project lead for ZFS at Sun.

Should you be interested in more detail, read it here and see it here.
+---------- quantities of bytes -----------------+
name       prefix   +---- standard ----+   historical

kilobyte   kB       1000^1 = 10^3          1024^1
megabyte   MB       1000^2 = 10^6          1024^2
gigabyte   GB       1000^3 = 10^9          1024^3
terabyte   TB       1000^4 = 10^12         1024^4
petabyte   PB       1000^5 = 10^15         1024^5
exabyte    EB       1000^6 = 10^18         1024^6
zettabyte  ZB       1000^7 = 10^21         1024^7
yottabyte  YB       1000^8 = 10^24         1024^8
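The table's decimal/binary split and the ZFS figures quoted earlier can be checked with a little arithmetic:

```python
# Decimal vs. binary interpretations of "zettabyte".
zb_decimal = 1000 ** 7           # 10^21 bytes
zb_binary = 1024 ** 7            # the historical, power-of-two reading
ratio = zb_binary / zb_decimal
print(ratio)                     # ~1.18: the two differ by ~18% at this scale

# ZFS pool size limit quoted in the post: 2^128 bytes (~3.4 x 10^38),
# which is 2^64 (~1.8 x 10^19, "18 billion billion") times a 64-bit limit.
zpool_limit = 2 ** 128
print(zpool_limit // 2 ** 64 == 2 ** 64)
```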


Google's infrastructure: scalability++

Google has introduced a vast array of products where search remains the main focus.

It is interesting to learn how Google does it and what internal IT services are behind it.

Here is a list of Google products and of programming APIs.

A good place to start is Google's mission statement:

"To organize the world's information and make it universally accessible and useful."

Wow; this is a formidable challenge, and it offers good insight into the underlying data volumes, storage, computational complexity, and the worldwide network and computing topology needed to design, build and deliver Google's work.

I found this video of much interest on the subject. The video describes essential building blocks that offer a glimpse of the technology behind Google's services; they include:

  • Computers. Google uses hundreds of thousands of conventional computers, plain-vanilla Intel/AMD x86 units, powered by a custom Linux OS.
  • GFS - Google File System. GFS is a distributed file system, a basic unit of storage to save abstractions such as BigTable.
  • BigTable. BigTable is a storage abstraction for managing structured data designed to scale to petabytes, 10^15 bytes, of data.
  • MapReduce. MapReduce, simplified data processing on large clusters, is a model for defining a given programming task across a large data set using Map, a key and value pair programming abstraction, and associated computing function(s).
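The MapReduce model is easiest to see in a toy word count, the canonical example from Google's paper. This single-process Python sketch mimics the map, shuffle and reduce phases; the real system distributes them across thousands of machines with fault tolerance.

```python
from collections import defaultdict

def map_phase(doc_id, text):
    """Map: emit a (word, 1) pair for every word in a document."""
    for word in text.split():
        yield word, 1

def reduce_phase(word, counts):
    """Reduce: sum all counts emitted for one word."""
    return word, sum(counts)

def mapreduce(docs):
    # Shuffle: group the intermediate (key, value) pairs by key.
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        for key, value in map_phase(doc_id, text):
            groups[key].append(value)
    return dict(reduce_phase(k, v) for k, v in groups.items())

docs = {"d1": "the web is the platform", "d2": "the cloud"}
print(mapreduce(docs))   # {'the': 3, 'web': 1, 'is': 1, ...}
```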
This reference is about An Economic Case for Chip Multiprocessing.

I found this blog offering more detail about Google's infrastructure.
And this one, which I found recently, has much more information on the subject.

Also this one has a summary report from Google's Scalability Conference:
At Google they do a lot of processing of very large amounts of data. In the old days, developers would have to write their own code to partition the large data sets, checkpoint code and save intermediate results, handle failover in case of server crashes, and so on, as well as actually writing the business logic for the actual data processing they wanted to do, which could have been something straightforward like counting the occurrence of words in various Web pages or grouping documents by content checksums. The decision was made to reduce the duplication of effort and complexity of performing data processing tasks by building a platform technology that everyone at Google could use which handled all the generic tasks of working on very large data sets. So MapReduce was born.


Google Gears

Google released Gears, a browser extension that enables people to access Web applications when working off-line.

This is a significant development in defining software and associated APIs and charting the evolution of the Web-platform. You can read about Gears at Google's code.google.com/apis/gears and a good summary is available also at CNet.

Salient points include:

  • API. A defined API for the Web-platform; it provides access to 1) a local Web server, yes, an embedded Web server to cache and serve HTML, JavaScript, images, etc.; 2) a local database, SQLite, to store and search data locally; 3) WorkerPool, which allows asynchronous operation resulting in responsive behaviour, e.g. parallel work.
  • Open-source. BSD open-source license.
  • The Google culture. A geek-friendly culture fostering creativity by Google employees and by outside developers. The first Google application using Gears, Google Reader, was done as part of the company's program in which employees can work on their own projects for 20 percent of their work week.
  • Participation of other companies. A number of companies working on Ajax-based applications are working with Google to define and use Gears; these include Adobe, Opera, Mozilla and the Dojo Foundation, creators of the Dojo Toolkit.
The amount of work, products, services and APIs that Google is releasing is shaping the direction of the Web-platform, which in this case defines the way a Web browser and Web applications can work in a disconnected, standalone manner. Products such as Palm's Foleo may indeed use this technology, as Adobe and others likely will, to offer their applications in connected and disconnected modes.

Google's Gears blog is located here.

I found this blog with some very interesting thoughts on the impact of a universally accepted method, a standard, for content synchronization across the Web.
The Web platform's promise is access to content anytime, anywhere and on anything—as long as the user has Internet access. Google Gears could bring some of that information off-line, further extending that promise. Universal synchronization would be game-changing, however; it would be a paradigm shift for digital devices, desktop software and the Web.
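A minimal sketch of the synchronization problem the quoted post raises: reconciling edits made off-line with the server's copy. Last-write-wins, shown here, is the simplest possible policy; real synchronizers must handle conflicts, deletions and clock skew far more carefully.

```python
def sync(local, remote):
    """Merge two replicas of {key: (timestamp, value)} records,
    keeping the newer value for each key (last-write-wins)."""
    merged = dict(remote)
    for key, (ts, value) in local.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged

# Edits made off-line vs. the server's copy (timestamps are logical).
local = {"note": (3, "edited off-line"), "draft": (1, "new entry")}
remote = {"note": (2, "older server copy"), "todo": (1, "server only")}

merged = sync(local, remote)
print(merged["note"][1])   # "edited off-line": the newer edit wins
```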