notes on computing

2011-10-11

the battle to control JavaScript

CoffeeScript is a relatively new programming language.

I was skeptical about exploring yet one more programming language and in this case one that compiles to another language, JavaScript.

JavaScript is probably the most used programming language; it is the software technology used by every web browser to render interactive information and applications.

The devices depending on JavaScript include desktops, laptops and a large percentage of mobile devices and smartphones. The total easily adds up to more than a billion devices, and likely up to 2 billion have a JavaScript engine installed; so, why CoffeeScript?

Each browser has its own JavaScript interpreter, and code is not guaranteed to execute properly on all of the tens, if not hundreds, of JavaScript implementations available today. This is a big problem given the vast dependency on JavaScript as the language of the web, at least for the presentation and interactivity of web-based applications running on billions of devices.

CoffeeScript makes it easier to write and produce correct JavaScript; by correct I mean code that follows identified best practices and that has a greater probability of executing correctly across different JavaScript implementations.

The subject goes beyond producing good JavaScript; it involves the need to simplify the programming labour, increase productivity and the quality of the resulting code, and streamline the process by using the same language on the client device and on the server.

JavaScript is now the de-facto standard to present information and applications on billions of devices.

JavaScript is also being used outside client devices; it is now used on the server. See node.js as an example of a JavaScript server-based technology. This move to use JavaScript on the server adds to the need for increased productivity and improved quality of generated JavaScript: it is no longer used only as a client component but also to write much larger and more complex computing tasks on the server.

Enter Google. Google recently introduced a new programming language, Dart. Dart also compiles to JavaScript, in addition to offering its own processing environment, available now on the Dart site and soon to be available in Google's browser, Chrome.

Google's blog introducing Dart outlines three main goals for the language:

Create a structured yet flexible language for web programming.
Make Dart feel familiar and natural to programmers and thus easy to learn.
Ensure that Dart delivers high performance on all modern web browsers and environments ranging from small handheld devices to server-side execution.

Note the reference to 'server-side execution', meaning a language targeted at the server side as well as the client side.

Reading about Google's Dart, one may also conclude that Google may be seeking distance from Java and from Oracle's control over Java and accompanying lawsuits; Google has Go and now Dart as recently introduced programming languages.

Back to CoffeeScript. It is easy to like CoffeeScript; it is simple, easy to learn and it has a functional flavour to its syntax. It is an alternative and better way to produce consistent and correct JavaScript, at least better than the JavaScript I can write by hand, for use in web-based applications, CouchDB, node.js and the ever growing list of JavaScript-based tools.

There are several articles and blogs detailing the strengths and weaknesses of CoffeeScript; this blog offers an outline of the good parts, namely:

it is easy to learn
it is easier to type than JavaScript
it helps you better understand JavaScript
it produces better JavaScript than you could by coding JavaScript directly

I found this screen-cast and accompanying text samples very useful for understanding and learning CoffeeScript.

Update

Dart. I had a quick look at Google's Dart. There are many things to like about Dart, including the Erlang-like message-driven concurrency implementation. However, the syntax is, should I say it, conventional; it looks like an updated Java or C-like syntax.
CoffeeScript. CoffeeScript has a functional syntax, simple and arguably easier to learn for someone unfamiliar with C-like languages.
Uninspiring isn’t always a bad thing. This blog offers a balanced assessment of Dart and its potential acceptance as a programming language.
Book. The Little Book on CoffeeScript offers a good introduction to the language.

2011-05-14

The challenge of scaling up and scaling out

Additional computing capacity has traditionally been achieved by increasing the speed at which processors operate.

However, the speed of light became a limiting factor for a single processing unit, a single CPU, to handle its internal electrical connections. Marginal improvements in speed certainly will be available but order of magnitude improvements from speed increases are unlikely to materialize.

To increase computing capacity, hardware vendors began using additional processing units, CPUs, per chip by adding the equivalent of additional traffic lanes in a highway; same speed limit but offering more lanes and hence able to handle more traffic, more cars moving per unit of time.

We now have multi-CPU processing units of 2, 4, 8, 16, 32, 64 and more CPUs; soon we will likely reach hundreds, and in a few years probably thousands, of CPUs in one computer unit.

The software to tap such capacity has not evolved as fast as the hardware. It is a serious challenge to develop applications that can operate correctly while running concurrently on more than one processing unit, and a challenge also to use the same or similar methods to distribute work among a network of computing nodes.

While there are proprietary and open source products to address software operation on clusters of computers, such as the widely used Message Passing Interface, MPI, in supercomputing applications, conventional programming languages lack tools and features to facilitate development of applications designed to operate correctly and efficiently by scaling within a computer unit, scaling up, and scaling out by distributing work among participating computing nodes.

The programming languages listed below receive attention these days partly because of corresponding claims of design, features and functions related to operating on multiprocessor systems and related constructs for concurrency and scalability.

Scala. Excerpt from Scala's home page: "Scala is a general purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way. It smoothly integrates features of object-oriented and functional languages, enabling Java and other programmers to be more productive."
Clojure. Excerpt from Clojure's home page: "Clojure is a dynamic programming language that targets the Java Virtual Machine (and the CLR ). It is designed to be a general-purpose language, combining the approachability and interactive development of a scripting language with an efficient and robust infrastructure for multi-threaded programming."
Erlang. Excerpt from Erlang's home page: "Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability. Some of its uses are in telecoms, banking, e-commerce, computer telephony and instant messaging. Erlang's run-time system has built-in support for concurrency, distribution and fault tolerance."

A recent development in the Scala community is the introduction of the Akka library, which offers concurrency models and APIs for applications to operate correctly in a multi-CPU environment, and also offers APIs to go beyond one computing node and distribute work among participating nodes in a network. Akka's concurrency and distributed models and software constructs look and feel as if they were a port of Erlang to Scala.

We believe that writing correct concurrent, fault-tolerant and scalable applications is too hard. Most of the time it's because we are using the wrong tools and the wrong level of abstraction.

Akka is there to change that. Using the Actor Model together with Software Transactional Memory we raise the abstraction level and provide a better platform to build correct concurrent and scalable applications.

Actors also provides the abstraction for transparent distribution and the basis for truly scalable and fault-tolerant applications.
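The Actor Model mentioned in the excerpt can be illustrated with a minimal sketch. It is written in Python rather than Scala and does not use Akka; the class and function names (CounterActor, send_many, count_with_actor) are hypothetical. It shows only the core idea: state is owned by a single thread and changed solely by messages arriving in a mailbox. It does not attempt Software Transactional Memory, supervision or distribution across nodes.

```python
import queue
import threading


class CounterActor:
    """A toy actor: private state, a mailbox, and one thread that drains it."""

    _STOP = object()  # sentinel message that ends the actor's loop

    def __init__(self):
        self._mailbox = queue.Queue()
        self._total = 0  # touched only by the actor's own thread
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, amount):
        """Deliver a message asynchronously; callers never touch the state."""
        self._mailbox.put(amount)

    def stop(self):
        """Let the actor finish the queued messages, then return its total."""
        self._mailbox.put(self._STOP)
        self._thread.join()
        return self._total

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self._total += message


def send_many(actor, times):
    """Send the message 1 to the actor the given number of times."""
    for _ in range(times):
        actor.send(1)


def count_with_actor(senders=4, messages_each=1000):
    """Have several threads message one actor and return the final total."""
    actor = CounterActor()
    threads = [
        threading.Thread(target=send_many, args=(actor, messages_each))
        for _ in range(senders)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return actor.stop()
```

Because only the actor's thread updates the total, the sending threads need no locks of their own; the mailbox is the single point of coordination.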

A further development just announced this week is the partnership of the creator of Scala, Martin Odersky, and the creator of Akka, Jonas Bonér, to form the company named Typesafe offering an open source integration of Scala and Akka.

The software integration represented by Typesafe is good news for Scala and Java developers: it offers an integrated set of tools, documentation, training and consulting for applying concurrency constructs to build correct and efficient applications in multi-CPU and distributed computing environments.

The Akka library offers both Scala and Java APIs. Using the library, one is certainly pleased to see the strong influence of Erlang.

Another Scala and Java library addressing the ability to scale up and out is GridGain. Here are excerpts about the library from GridGain's home page.

GridGain 3.0 is the industry JVM-based post-functional distributed middleware that combined traditional object-oriented programming approach with comprehensive functional programming support.

GridGain is a High Performance Cloud Computing software that enables development of high performance Java and Scala applications that scale up on any grid or cloud infrastructure.

GridGain provides state of the art implementations for compute grids, data grids, and zero deployment. You can develop with GridGain using Java or Scala programming languages - both of which are supported natively.

The result - is a powerful, flexible and highly expressive APIs that are simple and productive to use.

Also this week we saw the introduction of Google's Go programming language at Google's I/O conference in San Francisco.

Google presents Go as designed specifically for concurrency and distribution, and includes it as a third language option on Google's App Engine service: Python, Java and now Go.

It appears that the software world is gradually catching up to benefit from the availability of tens, hundreds and even thousands of parallel computing lanes in one node and scale out by distributing work among participating computing nodes in a network.

2010-12-25

Towards a paperless world: eBooks are here to stay

Over the last few years I have purchased several electronic books in the form of PDF files.

I find the PDF format quite useful and indeed better than printed books particularly when reading science, math, software and technology related documents since it allows me to Copy & Paste items for testing and prototyping while reading.

I find this Copy and Paste feature very useful; it offers an active and experimental learning experience unavailable with the conventional paper format.

eBooks change not only the reading experience but probably more importantly the authoring and review process of creating, editing, publishing and distributing books.

For schools and universities, both for faculty and students, this interactivity, validation and experimentation represent a unique way to author, learn, document and review, thereby enhancing, communicating and validating in real time; these activities are possible today by using mobile phones, tablets, laptops and desktops in the classroom, lab, office, bus and at home.

In fact the evolution of eBooks may merge with other electronic developments and media and possibly transform into a much different experience than today's eBooks: much more interactive, and likely resembling self-contained applications, optionally connected to external resources, offering more than passive text rendering and including features and functions that add value, effectiveness and productivity in a number of disciplines such as health, education, science, math, language, chemistry and transportation, to mention a few areas of applicability.

Licensing is difficult. The PDF books I have indicate clearly they are licensed to me; a watermark and text along the bottom of each page say so; however, they can be copied and distributed; enforcing the license is a problem for authors and publishers.

Browsing through amazon.com I found Stephen Hawking's latest book, The Grand Design, for $9.74 unavailable in PDF format but available in Amazon's proprietary format: Kindle.

I was not about to buy yet another electronic proprietary gadget to read a book; however, I found Amazon offers free Kindle readers for a variety of platforms including the Mac.

A few minutes and $9.74 later, I commenced reading Hawking's latest book using my MacBook Pro, a bright and very legible platform for reading electronic content.

Some lessons I learnt from this experience include the following points:

I prefer PDF format for eBooks. Using a PDF I have a file that I can read using any of the technologies I use. It allows me to Copy and Paste content such as sample source code which is now the majority of the eBook content I read.
Amazon's Kindle is subscription based. You purchase books housed in Amazon's cloud; you neither store nor own a file, as you do when obtaining a PDF or ePub book. The Kindle unit and the readers for various platforms connect to Amazon's cloud, download the book to be stored internally within the reader, and render it. The Kindle readers for both the Mac and MS Windows are very good but read only Kindle books. The system is good but closed; I prefer an open format accepted by the industry and used universally, as we do for text files, Web pages and PDF documents, among other formats. However, by using a subscription model and a proprietary format, Amazon is able to better control licensing and prevent the copying of books. Kindle does not support Copy & Paste.
ePub. ePub is an open file standard for documents; files have the extension of .epub and there are various utilities to convert from text, MS Word, PDF, etc. to ePub format. See the Wikipedia entries for ePub and for International Digital Publishing Forum.

Given my limited experience reading books using electronic media, I see a trend evolving towards eBooks that may result in more new books appearing and distributed in electronic media than paper and some books may be produced only in electronic format for subjects such as textbooks, software, R&D and industry specific.

Mobile phones, laptops, tablets and other electronic appliances may influence the direction towards certain new books, magazines, etc. appearing only in electronic format as is the case for Internet based content. In that case, new information is already available only in electronic media thus contributing towards the desired but elusive paperless world for information exchange.

I found a good summary and chronology for eBooks and related developments at Wikipedia.

Update. See reference to news that Amazon's eBook sales surpass paperbacks for the first time. For every 100 paperbacks, Amazon sells 115 eBooks or, should I say, Kindle-formatted books.
Update. Article re subject: As E-book Sales Explode, Consumption Patterns Change.

2010-09-20

Clojure: a functional language

There is much talk about emerging computer languages, and lately I have spent time trying two of these tools, Scala and Clojure.

I find these two languages very useful for several applications and I like each for different reasons and attempt to keep current with related blogs and information items.

Unfortunately, there are insufficient spare milliseconds in a day available for reading, exploring and applying these emerging computing utensils.

Among the promises of these new programming tools is a better way to express an algorithm and capabilities to tap computing concurrency on multi-CPU computing platforms.

A key property of Scala and Clojure is that they are called functional languages; a definition of functional I prefer for a programming language is this:

a functional language makes programming centered around functions easy and natural
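To make the definition concrete, here is a small sketch of programming centered around functions. It is written in Python rather than Scala or Clojure, and all names in it (compose, square, is_even, sum_of_even_squares) are hypothetical, chosen only for illustration. Functions are passed to other functions, combined into new functions, and no variable is mutated along the way.

```python
from functools import reduce


def compose(*functions):
    """Return a function that applies the given functions right to left."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions, lambda x: x)


def square(n):
    return n * n


def is_even(n):
    return n % 2 == 0


def sum_of_even_squares(numbers):
    """Keep the even numbers, square them, and add up the results."""
    return sum(map(square, filter(is_even, numbers)))


# A new function built from existing ones: add one first, then square.
increment_then_square = compose(square, lambda n: n + 1)

print(sum_of_even_squares([1, 2, 3, 4]))  # 2*2 + 4*4 = 20
print(increment_then_square(3))           # (3 + 1) squared = 16
```

A language counts as functional, in the sense above, to the degree that this style is the easy and natural default rather than something bolted on.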

Should you be interested in Clojure, IBM's DeveloperWorks has a good and informative article on the concurrency features of Clojure; DeveloperWorks also includes this introduction to the language.

As mentioned earlier, the most popular programming languages offer some very basic concurrency features: threads and locks. For example, Java 5 and 6 introduced numerous new utility APIs for concurrency, but most of these were either utilities built on threads and locks, such as thread pools and various types of locks; or they were data structures with better concurrency/performance characteristics. There was no change to the fundamentals of how to design concurrent programs. You still have to solve the same puzzles, and the solutions are just as brittle. You just have less boilerplate code to write.

Clojure is fundamentally different in all respects. It does not give you the usual primitives, threads and locks. Instead you get entirely different concurrent programming models that include no mention of threads or locks. Notice the use of the word models—plural. Clojure has four different concurrency models. Each one of these models can be considered an abstraction on top of threads and locks. Let's take a look at each of these concurrency models, starting with the simplest one: vars.
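The remark that each model can be considered an abstraction on top of threads and locks can be sketched as follows. This is Python, not Clojure, and the Atom class is a hypothetical illustration loosely modelled on Clojure's atom and its swap! operation; Clojure's actual implementation differs (it retries a compare-and-set instead of holding a lock). The point is only that callers describe an update as a function and never see a lock.

```python
import threading


class Atom:
    """Holds one value; all updates go through swap() with a function."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # hidden from callers

    def deref(self):
        """Return the current value."""
        return self._value

    def swap(self, update, *args):
        """Replace the value with update(current, *args) as one atomic step."""
        with self._lock:
            self._value = update(self._value, *args)
            return self._value


def increment_many(atom, times):
    """Apply an increment function to the atom the given number of times."""
    for _ in range(times):
        atom.swap(lambda n: n + 1)


def count_with_atom(workers=8, increments_each=1000):
    """Have several threads update one atom and return the final value."""
    counter = Atom(0)
    threads = [
        threading.Thread(target=increment_many, args=(counter, increments_each))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.deref()
```

The calling code contains no locking at all, which is the shift in programming model the excerpt describes.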

2010-07-16

Jaguar tops list of 500 supercomputers

The list of the top 500 supercomputers is published every six months by top500.org.

The update released in June 2010 is available here.

The highlights for this update include the entry of China into the top 10 systems; summary points are:

Jaguar ranks first. Jaguar climbs to first place, displacing Roadrunner, which now occupies third place behind China's Nebulae.
China's Nebulae is second. China joins the top 10 systems in the list; Nebulae ranks second in the benchmark tests and is first in peak performance, clocked at 2.98 petaflops, i.e. 2.98 x 10^15 floating-point operations per second.
Linux on commodity hardware. Linux maintains first place as the dominant operating system, a reversal of roles with Unix, which dominated the field back in 2000; today, 91% of the top-500 supercomputers run on Linux as shown in the graphic below. The hardware is based on commodity Intel and AMD processors; nearly all use x86-64 64-bit technology not much different from that found in conventional new desktops and laptops. Today, 91% of the top-500 supercomputers run on thousands of interconnected conventional servers, pizza-box-like computing units, each containing four or more Intel or AMD cores/processors; Jaguar uses 6-core AMD Opterons.

A distribution of the top-500 supercomputers across continents is shown below. For the Americas, the count is 290 = 282 + 7 + 1 for USA + Canada + Brazil respectively, representing 290 / 500 or 58%.

A distribution among system integrators, listed as vendors in the top500.org list, is included here. The site lists 32 vendors; the graphic below shows vendors with the most entries in the published list.

The data values released in June 2010 for hardware, Operating System (OS) and distribution of the top-500 supercomputers across countries are shown below. These stats are derived from the detailed information found at top500.org. Note that the category shown as 'Intel and AMD' is the sum of the systems listed under Intel IA-64, Intel EM64T and AMD x86-64; 455 = 5 + 401 + 49 respectively.


+------------------------------------------------+
              Top 500 Supercomputers  
--------------------------------------------------
no  name                          count        %
--------------------------------------------------
    Processor Architecture 

1   Intel and AMD                   455       91
2   IBM Power                        42        8
3   Other                             3        1
               
    Operating Systems      

1   Linux                           455       91
2   Unix                             23        5
3   Other - mixed                    17        3
4   MS Windows                        5        1

    Countries              

1   USA                             282       56
2   United Kingdom                   38        8
3   France                           27        5
4   Germany                          24        5
5   China                            24        5
6   Japan                            18        4
7   Russia                           11        2
8   Sweden                            8        2
9   Canada                            7        1
10  Italy                             7        1
11  New Zealand                       7        1
    Other Countries                  55       11
+------------------------------------------------+
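The percentages in the processor and operating system sections of the table can be reproduced from the counts with a few lines of arithmetic. The sketch below is in Python; the dictionary and function names are hypothetical, while the counts are the ones quoted in this post. Percentages are rounded to whole numbers, as in the table.

```python
TOTAL_SYSTEMS = 500

# Counts as quoted in the table above.
processor_counts = {"Intel and AMD": 455, "IBM Power": 42, "Other": 3}
os_counts = {"Linux": 455, "Unix": 23, "Other - mixed": 17, "MS Windows": 5}

# 'Intel and AMD' is the sum of Intel IA-64, Intel EM64T and AMD x86-64.
intel_and_amd = 5 + 401 + 49

# The Americas: USA + Canada + Brazil.
americas = 282 + 7 + 1


def share(count, total=TOTAL_SYSTEMS):
    """Return count as a whole-number percentage of the list."""
    return round(100 * count / total)


def shares(counts, total=TOTAL_SYSTEMS):
    """Map each category name to its whole-number percentage."""
    return {name: share(count, total) for name, count in counts.items()}


print(shares(processor_counts))
print(shares(os_counts))
print(f"Americas: {americas} systems, {share(americas)}%")
```

Both sections add up to the full 500 systems, which is a quick sanity check when transcribing figures from the site.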

2010-04-09

of iPad and other promises of a paperless world

Now that the iPad has arrived and is the subject of various reviews and opinions, I stopped by the local Apple store to have a quick look and experience Apple's latest product.

After a few minutes and an effective introduction to the unit by one of Apple's gurus, I came to the conclusion that it is an attractive and well-designed computing appliance; an excellent industrial design example as expected from Apple: simple, intuitive, elegant; however, it is not for me.

The design, ergonomics, look, touch & feel exhibit the Apple-esque experience, the expected result of a contemporary product of Steve Jobs and his design and engineering teams. Working with the unit, I asked myself whether it could replace my mobile computing and communications tool, itself an Apple product, a MacBook Pro. The answer is no; certainly I do not need to carry yet another unit in addition to a phone and a laptop.

However, the designed target for the iPad may include casual Internet browsing, e-mail, and media player for e-books, e-newspapers, e-magazines, video, pictures, games, etc.

Other more serious use cases may find the iPad as the tool for applications where conventional laptops, smart phones and similar units are too big or too small and generally inadequate and the tablet format may be a good fit and indeed useful such as in manufacturing, health, labs, transportation, construction, education, science, police, customs, field work, mobile point-of-sale unit in retail and other industries, GPS-based and compass guiding tour unit at museums, theme parks, golf courses, factories, universities, cities, etc.

With the introduction of the iPad, Steve Jobs' list of industrial designs and computing contributions may include the reintroduction of the tablet as a computer form factor, with design characteristics that make it applicable in several areas, including applications where paper is predominantly used. This may commence the long-expected en masse transition to electronic media, and the volume use of tablets and iPad-like devices, such as Google's expected Chrome-based tablet, may finally make them an effective, efficient and conventional computing appliance used where newspapers, books, magazines, maps, specifications, reference and training documents, and paper forms are used today.

For this transition to materialize, the price must come down to less than half, towards $100, and the unit, applications, and application development and licensing must improve, without requiring application registration and vendor approval, fostering an open and federated rather than a closed and centralized application ecosystem. Only then will tablets, laptops, smart phones and other communications and computing devices find en masse use as media and application players, significantly reducing the need for paper.

This article How the Tablet will Change the World is an interesting read.

Here is a quote from Google's CEO, Schmidt, that illustrates the need for very affordable and even disposable devices:

Hopefully someone else will be making the devices, we do the software. What's important now is to get the mobile architecture right. Because mobility will be the way you will provision in the future. Fast forward 5-10 years. The answer should always be mobile first. You want to have the best app on mobile....The promise of Chrome and Chrome OS is that the devices you give to employees will have 2-second boot time, will be disposable, low price.

2010-03-18

Concurrency and Functional Programming Languages

There has been much talk recently about the applicability and use of functional programming languages particularly in reference to concurrency of operations benefiting from multi-CPU computing technologies.

I found that to better understand programming in a functional manner I should start with Haskell. Haskell is the most often mentioned reference, the de facto standard, the purist among functional languages that implement, fully or partially, Lambda Calculus, a kind of computational algebra introduced by Alonzo Church, Haskell Curry and others back in the 1930s.

Included below are references to the material I found most helpful in my quest to understand and use functional programming languages.

Functional Programming. A good place to start is the definition of functional programming found in Wikipedia. Although this video is about Clojure, I find it a very good reference explaining what makes a language functional.
Haskell. I used this tutorial to learn Haskell. I found it a valuable reference to learn the concepts behind functional programming.
Clojure. Clojure is a Lisp-like language that uses the Java JVM for run-time. The more I study it, the more I find it useful, simple, concise. The Clojure Home site and these videos, ten in total, were a great help. Should you want more detail, this video and this article are very useful in learning the language. This blog includes several useful Clojure links. Also, Sean Devlin has several informative videos should you be captivated by the power and elegance of Clojure.
Scala. Scala is a mix of functional and object-oriented styles, the latter as found in C++, Java, Python and Microsoft's C#. Like Clojure, Scala uses the Java JVM for run-time. Although I find Clojure's syntax cleaner, more concise and, should I say, elegant, I feel much more at home when using Scala, at least for now; this is likely the byproduct of working with non-functional languages, since Scala is both functional and object-oriented. The place to start is the Scala Home site and this set of Scala reference links. This presentation by the creator of Scala, Martin Odersky, offers a good introduction to the language. Here is part-2 of Martin's presentation.
Functional Languages and Concurrency. This short video offers an informal insight as to why functional programming implementations are needed to achieve scalability and concurrency.

There is a pressing need to simplify the labour involved in developing applications in general, even more so for applications that are scalable and support high volume concurrent operations in a defined, effective, efficient and mathematically proven way.

The next few years will show whether functional programming languages address these needs, namely productivity, scalability and concurrency, and hence are adopted in volume by business applications.

Note that Microsoft has a contribution to functional languages in the form of F#. I had a look at F# and it indeed looks interesting; I will look at it in more detail; here are some references I found useful:

why F# is a contender in the functional programming space; blog
an F# introduction; blog

For a lighter look at functional programming, To Mock a Mockingbird: And Other Logic Puzzles offers an excellent introduction to the subject.

2009-06-25

Roadrunner supercomputer remains on top

The list of the top 500 supercomputers was updated and released this week by www.top500.org.

Roadrunner remains on top and Cray's Jaguar retained the number two spot.

Both of these systems are powered by AMD Quad-Core Opteron processors.

Roadrunner is clocked at 1.105 petaflops and Jaguar at 1.059 petaflops; one petaflop is 10^15 floating-point operations per second.
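For a sense of scale, the two figures can be converted to teraflops with a short calculation. This is a small Python sketch; the constant and function names are hypothetical, and the inputs are the two petaflop figures quoted above.

```python
FLOPS_PER_TERAFLOP = 10**12
FLOPS_PER_PETAFLOP = 10**15


def petaflops_to_teraflops(petaflops):
    """Convert petaflops to teraflops (1 petaflop = 1,000 teraflops)."""
    return petaflops * FLOPS_PER_PETAFLOP / FLOPS_PER_TERAFLOP


roadrunner = petaflops_to_teraflops(1.105)
jaguar = petaflops_to_teraflops(1.059)

print(f"Roadrunner: {roadrunner:,.0f} teraflops")
print(f"Jaguar:     {jaguar:,.0f} teraflops")
print(f"Gap:        {roadrunner - jaguar:,.0f} teraflops")
```

The gap between the top two systems comes to roughly 46 teraflops.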

Some key statistics include the following:

Processor Architecture. Intel and AMD account for 88% of the listed systems: 442 = 399 + 43 or 88%.
Operating System. Linux is at 443 systems or 89%, Unix at 4% and MS Windows at 1%.
Vendors. For the first time Hewlett Packard, HP, surpassed IBM in the number of installed systems. HP has 212 or 42% and IBM 188 or 38%.

2008-09-02

Does the Internet need another browser?

Do we need another browser?

Yes; the Internet needs a well defined standard and reference implementation adopted and contributed by all.

The browser, possibly the most used software of all time, needs a standard to address the many and varied functions it has evolved to perform since the early HTML-rendering days.

Google is introducing today more than a software product; it is a proposed common platform, reference design and implementation with which to collectively work on and evolve a modern browser.

Key properties of Chrome include the following:

WebKit. It uses a well known and well tested web rendering engine, WebKit, used by Safari, Nokia, Google's Android and several others.
New JavaScript engine, V8. This is a development on its own, a new JavaScript engine developed by Google's Denmark group. Reportedly, benchmarks show it as the fastest JavaScript engine so far, along with redesigned features to enable multi-process and multi-thread processing for browser processes: tabs.
Task isolation. Tabs and plug-ins run in assigned processes. One tends to have tens of tabs open; in a single-process browser, if one tab crashes it generally crashes the entire application. Security and process isolation are also welcome byproducts of this design.
Offline mode built-in. Gears is built-in resulting in offline operation for applications that use this capability; Gmail, Docs, etc, are likely to be the first applications to use it.
Open sourced. This means that Firefox, Safari, Opera, and yes, IE, can incorporate some or all of the technology proposed by Google; everyone wins; a standard is needed and one is available for test and development.

Whether Chrome's technology is used by other browsers or not, Google has the cloud and now the cloud's client resulting in a formidable software combination through which to deliver services independent of platform and form factors as Chrome shows up on MS Windows, Macs, Linux, Android, and other wired and increasingly wireless units.

One can easily conceive an upcoming Chrome net-book, named cbook, gbook, Android-book or any other name likely starting with g, running, you guessed it, Chrome, as the graphical user interface and Android as the OS.

References:

Google's blog entry re subject is found here.
Chrome's comic doc-book.
Nicholas Carr has written a blog-entry defining it as the Cloud's Chrome lining.

2008-07-19

parse-cmd: A simple command-line Java parser

Wikipedia defines parsing as follows:

"In computer science and linguistics, parsing, or, more formally, syntactic analysis, is the process of analyzing a sequence of tokens to determine grammatical structure with respect to a given (more or less) formal grammar.

A parser is thus one of the components in an interpreter or compiler, where it captures the implied hierarchy of the input text and transforms it into a form suitable for further processing (often some kind of parse tree, abstract syntax tree or other hierarchical structure) and normally checks for syntax errors at the same time."

When writing a Java console application, or an application responding to an input stream, without using a formal parser such as ANTLR - ANother Tool for Language Recognition, one ends up writing custom code each and every time; a generic approach to define and parse a few application-specific commands is needed.

A quick search on Google shows several tools to handle command-line options but generally I found these solutions complex. Access to a generic and simple parser was the beginning of parse-cmd.

Should you need a simple, less than formal, one-class Java command-line parser, take a look at parse-cmd.

An equivalent command-line parser implemented in Scala is available here.
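
To give a feel for what a simple, less than formal command-line parser does, here is a minimal sketch in Scala. It is not the parse-cmd API: the object name ArgsSketch, the method parseArgs and the '-key value' convention are assumptions made for illustration only.

```scala
object ArgsSketch {
  // Hypothetical helper, not part of parse-cmd.
  // Pairs each "-key" with the token that follows it; a "-key" followed by
  // another "-key", or by nothing, is treated as a flag and mapped to "true".
  // Limitation of this sketch: a value starting with "-" (e.g. -5) is read as a key.
  def parseArgs(args: List[String]): Map[String, String] = args match {
    case key :: value :: rest if key.startsWith("-") && !value.startsWith("-") =>
      parseArgs(rest) + (key.drop(1) -> value)
    case key :: rest if key.startsWith("-") =>
      parseArgs(rest) + (key.drop(1) -> "true")
    case _ :: rest => parseArgs(rest) // skip tokens that belong to no key
    case Nil       => Map.empty
  }

  def main(args: Array[String]): Unit =
    println(parseArgs(args.toList))
}
```

A real tool would also validate required options and value types; the sketch only shows the core step of turning a flat list of tokens into named values.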

2008-07-03

Roadrunner: a Linux-based petaflop computer

On June 18 the updated list of the top-500 supercomputers was published.

This time the list includes the first petaflop computer, built by IBM for the US government: 10 to the 15th floating-point operations per second, or one thousand teraflops.

The introduction of Roadrunner is one more example of the drive away from proprietary technologies towards use of thousands of commodity components clustered and managed by Linux.

This architecture for supercomputer solutions is strikingly similar to that used by Google, Amazon, Yahoo et al to power their services.

Note that IBM did not use its processor technology, Power. Instead, it used the combination of AMD's Opteron and the Cell technology, developed by IBM and Toshiba for Sony's PlayStation 3.

Comparing the updated list to that of June 2000 offers these points.

Use of commodity components. Roadrunner uses AMD Opteron chips and the Cell processor found in Sony's PlayStation 3. Today 86% of the top-500 supercomputers use conventional Intel/AMD technologies, with close to the remaining 14% using IBM's Power. This contrasts with close to 100% RISC-based hardware back in 2000, when there were only 4 Intel-based computers (3 IA-32 and 1 i860). IBM's Power went from being the top technology in 2000, with 143 computers, down to 68 in the current top-500 list.
Linux. Roadrunner and 92% of the top-500 supercomputers use Linux as the OS, compared to Unix at 90% back in 2000. Now Unix is represented exclusively by AIX, with 23 computers or 4.60%.

   +------------- Top-500 Supercomputers ----------+
   +------------- Operating Systems ---------------+
   Date       OS               Count     Share %

   2008.June  Linux              460     92.00 %
              AIX                 23      4.60 %
              Mac OS               2      0.40 %
              Windows              2      0.40 %
              Other               13      2.60 %

   2000.June  Unix               453     90.60 %
              Linux               28      5.60 %
              BSD Based           17      3.40 %
              N/A                  2      0.40 %

   +------------- Processor Family ----------------+
   Date       Processor        Count     Share %

   2008.June  Intel EM64T        356     71.20 %
              Power               68     13.60 %
              AMD x86_64          55     11.00 %
              Intel IA-64         16      3.20 %
              Intel IA-32          3      0.60 %
              Cray                 1      0.20 %
              NEC                  1      0.20 %
              Intel+AMD          430     86.00 %

   2000.June  Power              143     28.60 %
              Sparc              122     24.40 %
              MIPS                62     12.40 %
              Alpha               56     11.20 %
              PA-RISC             53     10.60 %
              NEC                 25      5.00 %
              Fujitsu             19      3.80 %
              Hitachi SR8000      10      2.00 %
              Cray                 6      1.20 %
              Intel IA-32          3      0.60 %
              Intel i860           1      0.20 %
   +-----------------------------------------------+
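
The Share % column is simply the count divided by the 500 systems in the list. A minimal sketch of that arithmetic follows, using counts copied from the June 2008 rows above; the object and method names are mine, chosen for illustration.

```scala
object Top500Shares {
  val total = 500 // number of systems in the list

  // Share of the list, in percent, for a given system count.
  def share(count: Int): Double = count * 100.0 / total

  def main(args: Array[String]): Unit = {
    // Intel and AMD rows of the June 2008 processor table.
    val intelAmdRows = List("Intel EM64T" -> 356, "AMD x86_64" -> 55,
                            "Intel IA-64" -> 16, "Intel IA-32" -> 3)
    val intelAmd = intelAmdRows.map(_._2).sum // the Intel+AMD subtotal
    println(f"Intel+AMD: $intelAmd%d systems, ${share(intelAmd)}%.2f %%")
    println(f"Linux:     ${share(460)}%.2f %%")
  }
}
```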

2008-04-25

Scala: functional programming for Java

Programming is a long and complex effort for projects of any size and consequently we are in a constant quest to find a language to improve productivity: deliver quality results using less time and effort.

I have no statistics about the distribution of programming language usage. My thesis is that Java enjoys a lead among languages in use today across fields such as the Internet, Web Services, finance, telecoms, health, communications, aviation, military, automobile, oil, factory automation, appliances and consumer products, etc.

The topic of alternate languages to Java often brings lively discussions of pros and cons among languages such as Ruby, Groovy, Python, Erlang, Scala, F#, C#, Lisp, PHP, C++, C and JavaScript among several others.

After some reading and limited experimentation with these alternate languages, I come to the conclusion, a thesis without proof, that Java is here to stay in the lead. However, Scala profiles as a language that leverages and expands the Java experience and the JVM, and may contribute to an even greater use of Java through Scala.

Scala is a relatively new language; some basic points about Scala include the following:

Created. Martin Odersky created Scala at the École Polytechnique Fédérale de Lausanne in 2001.
Platform. Runs on the Java Virtual Machine, JVM, and directly uses and produces Java classes, Java byte-code. An implementation for MS .NET is available but appears less developed.
Features. The main distinguishing factor of Scala is its support for both object and functional programming models.
Object-oriented language. All entities are typed objects. Types are defined by class and trait constructs that offer flexibility in defining class hierarchies.
Functional-oriented language. Scala's object orientation mixes with a well defined pattern-based approach to declare what is to be done rather than how: a functional versus an imperative approach to defining algorithms. Erlang is the functional language often mentioned when describing Scala's functional features.

The following points summarize a few days experimenting with Scala.

Pros

Java. Scala is a natural language progression for Java programmers. Existing libraries can be reused and classes developed with Scala can be incorporated and used in Java.
JVM. Scala uses the well tested and cross platform Java Virtual Machine, JVM.
Object-oriented. Classes and Traits are designed in a manner that allows flexibility in defining class structures.
Typed language. Scala infers types, permitting concise expressions typical of dynamically-typed languages, while type enforcement is done at compile time. Dynamically-typed languages do not require variables to be declared as a specific type; any variable can contain any value or object. They offer great flexibility and concise statements at the expense of potential errors going undetected until run time. Scala offers the expression conciseness of dynamically-typed languages while the compiler infers and strictly enforces type conformity.
Functional-oriented. Scala has an algebraic foundation that, as one becomes familiar with it, helps to understand the less intuitive aspects of the language. Pattern matching and support for higher-order functions, methods accepting functions as parameters, are among the powerful functional features. Thread management under the Actors library appears as a native language feature to the end-user, the programmer.
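
A few of the points above can be tried in a handful of lines. This is a minimal sketch with names of my own choosing (ProsSketch, twice, describe), not code from the Scala documentation; it touches type inference, a higher-order method and pattern matching only.

```scala
object ProsSketch {
  // No type annotation: the compiler infers List[Int] and enforces it.
  val numbers = List(3, 4, 0, 7)

  // Higher-order method: takes a function f and applies it twice to x.
  def twice(f: Int => Int, x: Int): Int = f(f(x))

  // Pattern matching on the shape of a list.
  def describe(xs: List[Int]): String = xs match {
    case Nil         => "empty"
    case head :: Nil => "one element: " + head
    case head :: _   => "starts with " + head
  }

  def main(args: Array[String]): Unit = {
    println(twice(x => x + 10, 1))
    println(describe(numbers))
    // val s: String = numbers.head  // rejected at compile time: Int is not a String
  }
}
```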

Cons

Scala is new. Although introduced in 2001, Scala has just recently received wider attention. Programmers, students and the rest of us will need much more time to learn, understand and use Scala.
Acceptance. Time will tell how Sun, Google, IBM, industry, the Internet and the IT communities accept Scala. A good sign of acceptance would be Google deploying a Scala API for the recently introduced Google App Engine, which offers a Python-only implementation now.

Included below is a quick-sort method as shown in Scala by Example.

Application. Defines SortTest as a console application.
Method sort. Defines method sort, which takes a list of integers, xs, and returns (Scala says results in) a list of integers: def sort(xs: List[Int]): List[Int] =
Check input list for length 1 or empty. The method results in the input list, xs, when its length is 1 or less. A return statement can be used, but it is often unneeded in Scala: the value of the last expression evaluated is the result.
Define pivot. Defines a val object assigned the value of the element at position xs.length / 2 of the input list, xs. Scala differentiates between val and var objects: var objects can be modified; val objects cannot. Note also that the compiler infers the type of pivot as integer and enforces it. The statement as written, val pivot = xs(xs.length / 2), looks as if it came from a dynamically-typed language since no type is stated; the inferred statement is val pivot: Int = xs(xs.length / 2), meaning a val object of type integer, ': Int', assigned the integer value at xs(xs.length / 2).
Sort recursively. Defines three sub-lists, less than, equal to and greater than pivot, and concatenates them using Scala's less than intuitive list concatenation operator, ':::'. The List object has a method, filter(), that takes a function expression, as in xs.filter(x => x < pivot), applies that function, x < pivot, to every element of the list, and results in (returns) the elements that match the expression.

object SortTest extends App {

    def sort(xs: List[Int]): List[Int] = {
        if  (xs.length <= 1) xs
        else {
            val pivot = xs(xs.length / 2)
            sort(xs.filter(x => x <  pivot))  :::
                 xs.filter(x => x == pivot)   :::
            sort(xs.filter(x => x >  pivot))
        }
    }
   
    // now, define a List, Y, sort it and print
    val Y = List(3, 4, 0, 7, 9, -8, 8, -3, 1)
    Console.println( sort(Y) )
}

Compile and run as follows:
scalac SortTest.scala
scala  SortTest
List(-8, -3, 0, 1, 3, 4, 7, 8, 9)
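
The two List operations that sort relies on, filter and ':::', can also be tried in isolation. The following is a minimal sketch; the object name ListOps and the sample values are mine.

```scala
object ListOps {
  val xs = List(3, 4, 0, 7, 9, -8)

  // filter keeps the elements for which the function literal results in true.
  val small: List[Int] = xs.filter(x => x < 4)
  val large: List[Int] = xs.filter(x => x >= 4)

  // ::: concatenates two lists into a new list; xs itself is unchanged.
  val regrouped: List[Int] = small ::: large

  def main(args: Array[String]): Unit = {
    println(small)
    println(large)
    println(regrouped)
  }
}
```

Both operations result in new lists, which is what lets sort combine its three filtered sub-lists without modifying the input.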

A second example serves also to show the power and conciseness of Scala. This example is found at the Scala site under the subject: A Tour of Scala: Automatic Type-Dependent Closure Construction.

Here we define a method, myWhileLoop, that takes two parameters: cond, an expression that results in Boolean, and body, an expression that results in Unit; the method itself results in Unit. Both are by-name parameters, declared with '=>' before the type, so the expression passed in is not evaluated at the call but each time the parameter is used inside the method. Unit carries no value, like void in Java; here body stands for zero or more lines of code, a unit of code. Once defined, the method is used as if myWhileLoop were native to the language, or at least it looks so.

The combination => is not to be read as the equal and greater-than characters; in Scala it designates a very useful object type, a function. A function type is represented by what Scala calls a right-arrow; think of it as one symbol instead of two, possibly borrowed from math to denote a function: leads to, becomes, transforms from one equation to another.

object TargetTest1 extends App {
    def myWhileLoop(cond: => Boolean)
                   (body: => Unit   ): Unit = {
        if (cond) {
            body
            myWhileLoop(cond)(body)
        }
    }
    
    // use 'myWhileLoop' as if defined in the language
    var i = 0
    myWhileLoop (i < 5) {
        println(i); i += 1
    }
}

Compile and run as follows:
scalac  TargetTest1.scala
scala   TargetTest1
0
1
2
3
4
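
What makes myWhileLoop work is that cond and body are evaluated again on every pass rather than once at the call. A minimal sketch of that difference follows; the names ByNameSketch, tick, byValue and byName are mine and appear nowhere in the Tour of Scala.

```scala
object ByNameSketch {
  var evaluations = 0

  // A side-effecting expression, so each evaluation can be counted.
  def tick(): Int = { evaluations += 1; evaluations }

  // By-value parameter: the argument is evaluated once, before the call.
  def byValue(x: Int): Int = x + x

  // By-name parameter: the argument expression is re-evaluated at every use of x.
  def byName(x: => Int): Int = x + x

  def main(args: Array[String]): Unit = {
    println(byValue(tick())) // tick() runs once
    println(byName(tick()))  // tick() runs twice
    println(evaluations)
  }
}
```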

Here is a third example extracted from the Scala site, A Tour of Scala: Mixin Class Composition. It shows the power of Scala to define class hierarchies via class, trait, abstract and with expressions.

abstract class AbsIterator {
  type T
  def hasNext: Boolean
  def next: T
}

trait RichIterator extends AbsIterator {
  def foreach(f: T => Unit): Unit = { while (hasNext) f(next) }
}

class StringIterator(s: String) extends AbsIterator {
  type T = Char
  private var i = 0
  def hasNext = i < s.length()
  def next = { val ch = s charAt i; i += 1; ch }
}

object StringIteratorTest {
  def main(args: Array[String]): Unit = {
    if ( args.length > 0 ) { 
      class Iter extends StringIterator(args(0)) 
                 with RichIterator
      val iter = new Iter
      iter.foreach(println)
    }
  }
}

Compile and run as follows:
fsc    StringIteratorTest.scala
scala  StringIteratorTest abc123
a
b
c
1
2
3
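
Traits can also be mixed in at the point where an object is created, without declaring a named class such as Iter. Here is a minimal sketch of that idea; all the names (MixinSketch, Source, Counting, Shouting, Words) are hypothetical and unrelated to the Tour of Scala example.

```scala
object MixinSketch {
  abstract class Source { def items: List[String] }

  // Each trait adds behaviour on top of whatever Source it is mixed into.
  trait Counting extends Source { def count: Int = items.length }
  trait Shouting extends Source { def shout: List[String] = items.map(_.toUpperCase) }

  class Words(text: String) extends Source {
    def items: List[String] = text.split(" ").toList
  }

  def main(args: Array[String]): Unit = {
    // Mix both traits in at instantiation time.
    val w = new Words("open source scala") with Counting with Shouting
    println(w.count)
    println(w.shout)
  }
}
```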

2007-12-31

iMac + Fusion: a great developer's workstation

I've avoided Apple. Apple products are the ultimate proprietary systems; hardware, OS and applications are exclusive to the Apple club, in direct contrast to the open source direction I prefer.

It was a Java-based project requiring an interface to a C module via Java Native Interface, JNI, that forced me to buy an Intel-based iMac; I had no choice; the project demanded application development and testing not only on Intel-based but also on PowerPC-based Macs.

It was a pleasant experience.

The minimalist design approach of Mac products grows on you. Even if the iMac does not work, it is a pleasure to look at; from ergonomic and mechanical design perspectives, it is an elegant, simple, functional unit. The aluminum-based keyboard is a good example of excellent ergonomics, minimalist inspiration, Apple-like design.

But it was VMware's Fusion, the virtualization application for the iMac, that made the iMac my Java and Internet development station of choice, replacing a Vista-based laptop.

The combination of Fusion and Leopard's Spaces results in full-screen operating system instances available intuitively via control + arrow-keys. It all works well, including instant sleep and restart, effective time synchronization and reliable network persistence: instant on and off when needed. This is quite a contrast to the experience with Vista + VMware's Workstation, where 'sleep' and restart did not work or did not work well.

I now have an iMac configured with 4 gigabytes of RAM supporting:

Apple's Leopard operating as the host OS. I am unfamiliar with this OS, but the console and Unix commands work well and I was at home with its operation. Java 6 is not officially supported on Leopard yet; it is in all the guests I have installed. I do not use any of the Leopard tools such as iPhoto and iMovie. Should I need to edit pictures or movies, I know I have the right environment and I'll try these tools then.
MS Windows 2000 Pro as main desktop development environment. Java 6, GCC, NetBeans 6, JEdit and Ant are the tools that work well in possibly the best OS from Microsoft. I did not have licenses for Vista or XP but I did have an unused Windows 2000 license; it operates as a simple, functional and most responsive MS OS, uncluttered by the obstructive security alerts and other confetti added to XP, and later to Vista, that get in the way of productivity.
Ubuntu 64-bit, 7.10, as main server and development environment. This is the main server environment hosting Java 6, Apache, PHP, Tomcat, MySQL, hsqldb, PostgreSQL, etc. I could do this work on Leopard but I am familiar with Ubuntu, and it is just as responsive and works as well, possibly better, under Fusion than in a dedicated environment.
Solaris Express Developer Edition 64-bit for general development of the C-module and Java application on a Solaris platform. OpenSolaris is work in progress with the objective to make Solaris easier to use by adopting a Linux-like operation and presentation; it profiles as a good development and server platform.

Each OS has 1 gigabyte of RAM. They are all instantly available via control + arrow-keys as mapped by four Leopard Spaces running each OS in full-screen mode.

Each OS has a Subversion client installed to access a Subversion server hosted on a dedicated Ubuntu 64-bit environment on the Internet. In this manner, the Subversion server fills the data and version control functions and allows check-out and check-in for projects under work using any of the OSes listed above or a workstation at a client or project office. It nicely separates the tools, i.e. operating systems, from the data, i.e. projects and personal files.

This approach allows me to use one or more environments as development workstations at home or at project offices while relying on Subversion as the data and project repository. Effectively, each OS holds only the data for the project under work, which is deleted once the project is checked in.

All is not perfect however. It is mostly a matter of getting used to a new environment rather than functional problems.

Plus

Ergonomics. The iMac is an all-in-one desktop computer, simple, elegant, functional.
VMware's Fusion. VMware possibly learned from Workstation for MS Windows and started from a fresh, clean design when bringing it to Leopard, resulting in a great implementation for hosting different OSes. 'Sleep' and restart work flawlessly, including time synchronization with the host at restart time. Matching screen resolution with the host works well and contributes to the simple and effective transition among OSes via control + arrow-keys.
Connectivity. Network settings and overall connectivity, wired and wireless, work well, including network printing from guests to a printer defined on the host OS, Leopard.

Minus

Keyboard. PC users will be searching for several keys unavailable on the minimalist wireless keyboard that came with the iMac. The delete key is effectively a backspace key; the end, home, page-up and page-down keys are unavailable. There are key combinations to simulate these; however, it takes time to find them and more time to get used to them. Also, some keys have one behaviour under the host OS, Leopard, and a different one under Fusion guests. I am now used to these differences; it is the cleanest looking keyboard, it is functional and, after some time with it, I am able to use it productively in all four operating systems.
Mouse. The wireless mouse is the classical one-button Mac mouse; I do miss the right-mouse click, which is simulated by control + click.

In summary, I would not have selected this environment had it not been for the project requirement to develop and test on an Apple platform. The discovery of Leopard + Fusion made the case for me to switch to this as my primary workstation, where I can use any of the four top desktop operating systems as needed, all interacting with the same data and project repository under Subversion.

The result is a proprietary system, iMac + Leopard + Fusion, used as the base platform to host Microsoft and open source OSes, and associated development tools, resulting in a productive, multi-OS self-contained workstation.

I suspect I'll be looking at Apple's MacBook to replicate a mobile equivalent of the configuration described above. The rumored announcement by Apple, on January 15th, of an ultra portable MacBook may be the fully functional multi-OS compact laptop I'd like to use for presentations and for work while away from the office.

N.B.1 MacBook, MacBook Pro and respective successors, hosting Ubuntu via Fusion, profile as compelling alternatives for anyone considering a MS Windows laptop, Win2k, XP or Vista, and/or a Linux mobile unit such as a Dell Ubuntu laptop.

N.B.2 There are at least 3 products for running other operating systems on the Mac: Boot Camp, which boots them natively, and Parallels and Fusion, which virtualize them. This reference offers a benchmark and comparison among these products.

2007-11-13

Android: a story about the phone, Java, Linux and other wars

Google released the API for Android and several first impressions of its impact and contribution to the mobile phone platform are now available.

There is good and not-so-good news.

Various blogs describe well the positive points of Android such as defining a common and relatively 'open' platform for volume development of related applications and services, bringing down the barriers of entry, offering competition on a common infrastructure. A good article describing Android is found here.

These are all good points, timely introduction and needed development to standardize what otherwise is a set of exclusionary and stove-piped technologies vertically aligned across the usual suspects, aka communications cartels.

I installed the SDK first on my Ubuntu.x86.64 environment. It did not work. I found that the initial distribution is for 32-bit only; the site should point this out on the download entry.

Under Vista, x86.32, it worked well using Eclipse and standalone via the Ant script generated by the included application generation utility, activityCreator.py.

As I work with the SDK, I like what I see, namely:

Java. The SDK is Java-based including an Eclipse plug-in.

What concerns me includes:

Java. Same language, but supporting only a subset of the JDK. The run-time VM is a custom one, Dalvik, perhaps needed to gain performance on target platforms. Java faces fragmentation and Android is a good example of it. Google's GWT uses a similar architecture and also supports a subset of Java's JDK, a different subset than Android's. There is fragmentation for Java even within Google. Perhaps this is the cost of rapid development, but certainly they can do better than this. A good article describing Android's Java gambit is found here.

Sun's Jonathan Schwartz noted in a blog entry his dinner invitation, at his place, to Linus Torvalds. It appears now that Jonathan may need to include Sergey and Larry in such an invitation, have a good supply of java to digest the vast amount of material accumulated since the initial invite, leave egos at the door and collectively endorse, support and join the OpenJDK community.

A fragmented Java is not in the interest of developers or of the software industry. Unix offers a good example of the perils of fragmentation; it is a good idea to avoid the same fate for Java.

2007-11-06

Android: an open platform for mobile communications

The announcement by Google of an open platform for mobile communications marks a milestone in the evolution of the Internet.

The conventional cell-phone is a wireless unit with some data processing capabilities, a phone plus a PDA, as exemplified by Palm's Treo, Apple's iPhone and RIM's Blackberry.

In contrast, Google defines a Linux-based computing platform that can make phone calls.

The difference is significant and offers a wide range of options and possible technology combinations by defining a device stack for mobile communications that includes a Linux-based OS, a defined API and associated development tools.

Not much is known, since the API will not be available until November 12th. What is known can be summarized as follows:

Platform. Open source software communications platform named Android. The definition is a software-based API that allows mobile applications and services to be developed and deployed on any hardware technology.
Alliance. Open Handset Alliance, OHA, which includes 34 registered participants. Participants include T-Mobile, Sprint, China Mobile, Telefonica of Spain, Samsung, Motorola, LG, Intel and Texas Instruments.

Not surprising is the absence from the alliance of names such as Microsoft, Verizon, AT&T and Apple.

This move by Google offers also an alternative to proprietary development by proposing instead a common infrastructure for all to participate, use, enhance and compete by delivering value above a common foundation.

The enthusiasm of seeing an alliance for an open source communications platform is tempered by the fact that the API and associated development tools are unavailable now. Also unclear is the language, or languages, supported for development.

However, first impressions are that Google has once more outfoxed the usual suspects and proposed an approach for the evolution of mobile communications in a manner and culture earlier responsible for the volume adoption of the Web, Apache, Linux and Firefox.

References

Google. Here is an entry in Google's blog re subject announcement.
BBC. Q&A article re proposed mobile platform at BBC.
API. Android's software development kit, SDK, is available here.
Nokia. Nokia does not rule out participating in the OHA alliance.

2007-09-29

Opportunity for Apple to sell Leopard as a standalone OS

Leopard, the next version of Apple's operating system, is to be released in October.

The timing of this introduction highlights the opportunity for Apple to offer the option of a standalone Leopard, i.e. the OS unbundled from Apple's hardware.

Two reasons make this excellent timing for Apple to offer an unbundled operating system.

Vista's troubles. Vista's failure to deliver on the expectations created by Microsoft is such that end-users prefer Vista's predecessor, XP.
Increased use of virtualization. Virtualization has reached volume adoption and is now a needed function for software development and for other applications on the desktop and on the server. It is conceivable that a year from now consumers may buy computers able to run more than one operating system. In fact such computers may have no operating system other than a virtualization function as part of the hardware, bios or flash configuration.

These developments, problems with Vista and availability, acceptance and use of virtualization, result in a great opportunity for Apple.

Should Apple sell Leopard as a standalone operating system, for installation on a computer or under a virtual environment such as VMware, Virtuozzo, Parallels, Linux-KVM, XenSource and Sun's xVM, both consumers and Apple will benefit.

Consumers. Consumers have an alternative to Microsoft even if a standalone Leopard license is supported by Apple for operation on a virtual environment only.
Apple. Apple benefits by expanding adoption of its operating system beyond the present niche and enabling agile 64-bit Leopard to run on dual and quad Intel and AMD platforms directly or through virtualization.

Supporting vast combinations of hardware configurations is indeed a problem for any OS. However, offering unbundled Leopard for operation under a virtual environment is an attractive option.

Licensing Leopard for operation under a virtual environment simplifies support significantly, given that video, disk, network and other devices are virtualized, defined by each virtualization tool, thus avoiding support for a vast number of devices and associated drivers, and contributing to stability.

We now need to develop in more than one OS. It is not a nice-to-have function; it is needed for several areas including cross-platform development.

I trust Apple will see the benefit of licensing Leopard standalone so I can install it on my mobile development environment: MS Vista, VMware and Ubuntu-64 as guest. I was skeptical, but VMware Workstation, on Vista-32-bit, supports 64-bit guests; it performs well and it is stable.

2007-09-01

Of open source, open minds and open culture

I came across the blog referenced below, which identifies 10 principles proposed by the author from experience in software work.

It took more than software to evolve towards the Internet, Web, GPS, associated applications and services, the 'open' communications world of today and the results and culture of 'open source'.

Open source is about people, ideas, creativity, tools and technology, working with others through ongoing peer review to achieve results thought impossible in context of time, resources and conventional methods, processes, organizations and management.

The result is communities of dedicated people, members of projects, tasks, applications, and recently corporations, formal and mostly informal, working together productively to define and resolve problems, deliver results, effectively and in a responsive manner.

Often it is not about cost. It is rather about the cost of not doing it, of not delivering results in a timely manner, of missing a window of opportunity, of failing to offer solutions, of failing to offer value, of becoming irrelevant.

Open source brings to mind the potential application of the same culture, communication and tools to define and address challenges in fields other than computing where intellectual property, patents and copyrights represent serious obstacles to development.

Here is a definition of Open Source, from opensource.org, that may be applied to fields other than information technology:

Open source is a development method that harnesses the power of distributed peer review and transparency of process.

Areas that come to mind to apply open source methodology and culture include:

Education
Health
The environment
Energy
Government

I list below the 10 principles referenced in the subject blog.

Adoption precedes monetization.
Lots of customers is a greater barrier to entry than lots of intellectual property (IP).
A business' brand is its greatest asset in driving sales. Not its IP.
Lower barriers to evaluating and using your product.
Sell customer value, not vendor value.
Product use should breed re-use and further sharing.
A collective product best serves a collective market.
Invest in service and your product, not sales.
Transparency breeds trust, and trust breeds revenue.
People make a business.
Updates
Here is a good summary of lessons learned from open source - ten things I ...