2008-07-19

parse-cmd: A simple command-line Java parser

Wikipedia defines parsing as follows:

"In computer science and linguistics, parsing, or, more formally, syntactic analysis, is the process of analyzing a sequence of tokens to determine grammatical structure with respect to a given (more or less) formal grammar.

A parser is thus one of the components in an interpreter or compiler, where it captures the implied hierarchy of the input text and transforms it into a form suitable for further processing (often some kind of parse tree, abstract syntax tree or other hierarchical structure) and normally checks for syntax errors at the same time."

When writing a Java console application, or an application that responds to an input stream, without a formal parser generator such as ANTLR (ANother Tool for Language Recognition), one ends up writing custom parsing code every time. A generic way to define and parse a few application-specific commands is needed.

A quick Google search turns up several tools for handling command-line options, but I generally found them complex. The need for a generic, simple parser was the starting point for parse-cmd.

Should you need a simple, less than formal, one-class Java command-line parser, take a look at parse-cmd.
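
To make the idea of a one-class parser concrete, here is a minimal sketch. It is not the parse-cmd API: the class name SimpleArgs, the parse method, and the "-name value" convention are all assumptions chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is NOT the parse-cmd API.
final class SimpleArgs {

    // Parses "-name value" pairs. A "-name" that is not followed by a value
    // is treated as a flag and stored with the value "true".
    // Limitation of this sketch: a value starting with "-" (such as a
    // negative number) is read as the next option name.
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            String token = args[i];
            if (!token.startsWith("-") || token.length() < 2) {
                throw new IllegalArgumentException("Expected an option name, got: " + token);
            }
            String name = token.substring(1);
            boolean hasValue = i + 1 < args.length && !args[i + 1].startsWith("-");
            options.put(name, hasValue ? args[++i] : "true");
        }
        return options;
    }

    public static void main(String[] args) {
        String[] sample = {"-host", "localhost", "-port", "8080", "-verbose"};
        // Prints {host=localhost, port=8080, verbose=true}
        System.out.println(parse(sample));
    }
}
```

Even a sketch this small shows why such code gets rewritten for every application: the option names, the defaults, and the validation rules all end up hard-wired unless they are defined in one generic place.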

An equivalent command-line parser implemented in Scala is available here.

2008-07-03

Roadrunner: a Linux-based petaflop computer

On June 18, the updated list of the top 500 supercomputers was published.

This time the list includes the first petaflop computer, built by IBM for the US government. A petaflop is 10 to the 15th floating-point operations per second, or one thousand teraflops.

The introduction of Roadrunner is one more example of the drive away from proprietary technologies towards use of thousands of commodity components clustered and managed by Linux.

This architecture for supercomputer solutions is strikingly similar to that used by Google, Amazon, Yahoo et al to power their services.

Note that IBM did not use its own processor technology, Power. Instead, it combined AMD's Opteron with the Cell technology, developed by IBM and Toshiba for Sony's PlayStation 3.

Comparing the updated list with that of June 2000 highlights the following points.

  • Use of commodity components. Roadrunner uses the processor in Sony's PlayStation 3 and AMD Opteron chips. Today 86% of the top 500 supercomputers use conventional Intel/AMD technologies, and nearly all of the remaining 14% use IBM's Power (13.6%). This contrasts with close to 100% RISC-based hardware in 2000, when only 4 computers were Intel-based (3 IA-32 and 1 i860). IBM's Power went from the top technology in 2000, with 143 computers, down to 68 in the current top-500 list.
  • Linux. Roadrunner and 92% of the top 500 supercomputers run Linux, compared with Unix at about 90% in 2000. Unix is now represented exclusively by AIX, with 23 computers or 4.60%.

+------------- Top-500 Supercomputers ----------+
+------------- Operating Systems ---------------+
Date        OS                  Count   Share %

2008.June   Linux                 460   92.00 %
            AIX                    23    4.60 %
            Mac OS                  2    0.40 %
            Windows                 2    0.40 %
            Other                  13    2.60 %

2000.June   Unix                  453   90.60 %
            Linux                  28    5.60 %
            BSD Based              17    3.40 %
            N/A                     2    0.40 %

+------------- Processor Family ----------------+
Date        Processor           Count   Share %

2008.June   Intel EM64T           356   71.20 %
            Power                  68   13.60 %
            AMD x86_64             55   11.00 %
            Intel IA-64            16    3.20 %
            Intel IA-32             3    0.60 %
            Cray                    1    0.20 %
            NEC                     1    0.20 %
            Intel+AMD subtotal    430   86.00 %

2000.June   Power                 143   28.60 %
            Sparc                 122   24.40 %
            MIPS                   62   12.40 %
            Alpha                  56   11.20 %
            PA-RISC                53   10.60 %
            NEC                    25    5.00 %
            Fujitsu                19    3.80 %
            Hitachi SR8000         10    2.00 %
            Cray                    6    1.20 %
            Intel IA-32             3    0.60 %
            Intel i860              1    0.20 %
+-----------------------------------------------+
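
The share column is simply the count divided by the 500 systems on the list. As a small sketch of that arithmetic, the following recomputes the Intel+AMD subtotal for June 2008 from the per-family counts in the table. The class and method names (Top500Shares, sharePercent, total) are hypothetical and exist only for this example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for checking the table arithmetic.
final class Top500Shares {
    static final int LIST_SIZE = 500; // the list always holds 500 systems

    // Share of the list, in percent, for a given number of systems.
    static double sharePercent(int count) {
        return 100.0 * count / LIST_SIZE;
    }

    // Sums the system counts of several processor families.
    static int total(Map<String, Integer> counts) {
        int sum = 0;
        for (int count : counts.values()) {
            sum += count;
        }
        return sum;
    }

    // Intel and AMD families, with counts copied from the June 2008 table.
    static Map<String, Integer> intelAmdJune2008() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("Intel EM64T", 356);
        counts.put("AMD x86_64", 55);
        counts.put("Intel IA-64", 16);
        counts.put("Intel IA-32", 3);
        return counts;
    }

    public static void main(String[] args) {
        int systems = total(intelAmdJune2008());
        System.out.printf(Locale.ROOT, "Intel+AMD: %d systems, %.2f %%%n",
                systems, sharePercent(systems));
    }
}
```

Running it prints "Intel+AMD: 430 systems, 86.00 %", which matches the subtotal row in the table.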