Ruby: a new language
Introducing the latest open source gem from Japan
Maya Stodte (mstodte@pop.rcn.com)
Author Maya Stodte looks at Ruby, a pure object-oriented scripting language that has successfully seduced Python and Perl users in Japan. Ruby is now beginning to make its international debut, touting elegant syntax, single inheritance, straightforward OO features, true closures, and an iterator more extensive than most call-back routines. Below, Maya looks at the language's profile in depth, providing code comparisons to Perl and trying to illuminate some of the features that have so entranced Japanese programmers for the past few years.
Yukihiro Matsumoto (see our interview and Resources later in this article) developed Ruby in answer to the Perl community's motto that "there's more than one way to do it." Ruby is a pure object-oriented scripting language written in C and designed with Perl and Python capabilities in mind. Ruby has been gaining popularity over the past few years, especially in Japan, where it was conceived and born. Its features, like Perl's, are designed to process text files and complete systems management tasks. Ruby is highly portable and easily customized, but it primarily draws users because of its purity and readability. In particular, CGI scripters are increasingly frustrated with Perl's occasionally enigmatic code and Python's inelegant and difficult syntax that requires "too much typing."
Neither Python nor Perl was designed as an object-oriented language. Consequently, their OO features often feel "added on" and are not fully integrated into the language core, making for cryptic code. Ruby, on the other hand, is by definition an object-oriented language. It treats every data structure as an object, and offers only single inheritance in order to reduce confusion and encourage simple, straightforward code. Ruby's interface to objects is therefore completely defined, and allows for extensive code alteration in implementing methods.
Based on the syntax of Eiffel and Ada, the power of C, the functions of Python, and the diversity of Perl, Ruby really is an attempt to combine the best of everything. And rumor has it that Yukihiro Matsumoto, affectionately known as Matz, has interwoven Ruby with a mysterious spell to keep us coming back for more. Ruby has, at the very least, been showing up Python in Japan for quite some time.
Language profile
Ruby is primarily an interpreted language. This means, of course, that Ruby avoids the problems and annoyances usually associated with compilers and compiled programs. The advantage is that Ruby programs are immediately executable. The disadvantage is that Ruby's execution may be much slower than that of a compiled C program. But since its source code is freely available, and many extensions and modules are already offered on the Ruby home page (http://www.ruby-lang.org/en/), the typical disadvantages of an interpreted language in terms of tedious work are much less severe in Ruby's case.
The fact that Ruby is for the most part an interpreted language is particularly advantageous in the edit-interpret-debug cycle, since it allows you to write and test programs simultaneously. But because it is not strictly an interpreted language, Ruby's interpretive overhead is not as high as that of, say, Tcl. It is always possible to call the interpreted code with a compiler later to increase execution speed.
Because it is object-oriented, Ruby is highly reliable and lets you write small programs that are reusable and easily modified. Ruby includes all the basic OO features (such as classes and methods) that other object-oriented languages offer; inheritance, polymorphism, singleton methods, and mix-ins are all implemented in Ruby. But, unlike other languages offering these OO features, Ruby is a pure object-oriented language. This means that absolutely all things in Ruby are treated as objects. Classes, integers, strings, arrays, and code blocks are all treated as objects. This is what lends Ruby both its power and its simplicity.
Ruby is also fundamentally an extensible language. You can either compile source code for extensions and integrate them into your local system, or download precompiled extensions from the home page and various other sites; the latter option allows you to avoid building your own extensions.
You can also easily extend Ruby using C. The online extension repository is available through the Ruby home page, and the extension library directions are in the file README.EXT in the Ruby package.
Language properties
Running speed
Portability
Learnability
Ruby's mark and sweep garbage collector, which works with all Ruby objects, eliminates the need to maintain reference counts in extension libraries. Many garbage collection techniques require that each memory cell contain a count of the number of other cells that point to it; if the count reaches zero, the cell is freed and its pointers to other cells are followed to decrement their counts recursively. Such methods therefore cannot handle circular data structures, because cells in such structures never reach a zero reference count and so are never reclaimed. Ruby's mark and sweep collector instead reclaims dynamically allocated storage periodically during execution. Each cell reserves a bit for marking, which is initially clear; during garbage collection, every cell that can be traced from the root is marked, and the cells left unmarked are freed. The mark and sweep garbage collector can also reduce the memory requirements of programs written in Ruby.
In addition to the garbage collector, Ruby's Application Programming Interface (API) makes writing C extensions easier. The API is written in C, ensuring portability. It provides an interface between Ruby and C and an interpretation of call-by-value and call-by-reference arguments in both directions.
An Application Archive is available through the Ruby home page (http://www.ruby-lang.org/en/raa.html). The archive includes a "What's New" section; a list of applications, including interpreter, text, and mail applications; and a parser generator, among other things. Since not all applications are currently stable, the archive lists the condition and update schedule for each application. The archive also offers a library of code with calendars, databases, and GUIs.
An English-language mailing list is available. To join it, send an e-mail to ruby-talk-ctl@netlab.co.jp with "subscribe First-Name Last-Name" in the mail body.
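The mark and sweep scheme described earlier in this section can be illustrated with a toy model. This is only a sketch of the general technique, not Ruby's actual collector: the `Cell` struct and the `mark`, `sweep`, and `collect` methods are hypothetical names invented for the illustration. It shows why a cycle that is unreachable from the root is still reclaimed, which a pure reference-counting scheme could not do.

```ruby
# Toy heap cell (hypothetical): a name, references to other cells, a mark bit.
Cell = Struct.new(:name, :refs, :marked)

# Mark phase: set the mark bit on every cell reachable from `cell`.
def mark(cell)
  return if cell.marked            # already visited; also stops on cycles
  cell.marked = true
  cell.refs.each { |other| mark(other) }
end

# Sweep phase: unmarked cells are garbage; marks are cleared for the next run.
def sweep(heap)
  live, garbage = heap.partition(&:marked)
  live.each { |cell| cell.marked = false }
  [live, garbage]
end

# One full collection: trace from the roots, then sweep the whole heap.
def collect(heap, roots)
  roots.each { |root| mark(root) }
  sweep(heap)
end

a = Cell.new("a", [], false)
b = Cell.new("b", [], false)
c = Cell.new("c", [], false)
d = Cell.new("d", [], false)
a.refs << b      # a -> b, reachable from the root
c.refs << d      # c and d point at each other, so a reference count
d.refs << c      # for either one would never drop to zero

live, garbage = collect([a, b, c, d], [a])
puts "live:  #{live.map(&:name).join(', ')}"     # live:  a, b
puts "freed: #{garbage.map(&:name).join(', ')}"  # freed: c, d
```

The cells `c` and `d` keep each other's reference count at one, yet neither is reachable from the root, so the sweep frees both.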
Readability
translates into Ruby as:
(Code example from Masaki Suketa, CQN02273@nifty.ne.jp). Ruby's syntax is also widely known for its simplicity, which makes for code that is user-friendly and readable on the one hand, and which features a highly powerful grammar and semantics on the other. For example, Perl requires the use of the semicolon at the end of virtually all statements; by contrast, Ruby anticipates the end of all statements without such a prompt. In Perl:
translates in Ruby to:
and in C to:
An example of a Ruby script that exemplifies its power and elegance is:
(Code examples by Minero Aoki, aamine@dp.u-netsurf.ne.jp) A final example of readability and power in Ruby's syntax is its CLU-inspired block passing feature. Blocks are chunks of code delimited by { ... } or do ... end that can be passed to methods or converted to closures. With this feature, an object of a certain class knows how to perform an operation itself, rather than the other way around, where a function has to know how to handle different types of objects. For example, a sort criterion can be specified as follows under block passing:
A grammatical element "{|..| ....}" is called a block. "{|..| ....}" can be written as "do |..| .... end". In this example we are assuming that each member of "array" has the methods "date" and "name". "x <=> y" returns -1, 0 or 1 for "(x < y)", "(x == y)" or "(x > y)" respectively, and "sort" does a quick sort by the block value. Furthermore, in Ruby it is possible to define a Schwartzian transformation on Array:
where "yield" returns the result of block evaluation, and Ruby can sort more efficiently with the criterion above:
(Code examples by Goto Kentaro)
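As a rough illustration of the two ideas above (and not Goto Kentaro's listings), the following sketch sorts with a block and then defines a Schwartzian transformation on Array. The `Entry` struct, the `sort_by_date_and_name` helper, and the method name `schwartzian_sort` are hypothetical; the text only assumes that each member responds to "date" and "name". Dates are plain ISO-formatted strings here to keep the sketch self-contained.

```ruby
# Hypothetical record type with the two methods the text assumes.
Entry = Struct.new(:date, :name)

# Block-passing sort: the block is the sort criterion and must return
# -1, 0 or 1, which is what <=> on two arrays does element by element.
def sort_by_date_and_name(entries)
  entries.sort { |x, y| [x.date, x.name] <=> [y.date, y.name] }
end

class Array
  # Schwartzian transformation (hypothetical method name): compute each
  # key once via yield, sort on the keys, then strip the keys again.
  def schwartzian_sort
    map { |item| [yield(item), item] }
      .sort { |a, b| a[0] <=> b[0] }
      .map { |_key, item| item }
  end
end

entries = [
  Entry.new("2000-02-01", "perl"),
  Entry.new("2000-01-15", "ruby"),
  Entry.new("2000-02-01", "awk")
]

puts sort_by_date_and_name(entries).map(&:name).join(" ")
# ruby awk perl
puts entries.schwartzian_sort { |e| [e.date, e.name] }.map(&:name).join(" ")
# ruby awk perl
```

The second form evaluates the key block once per element instead of twice per comparison, which is where the efficiency gain comes from. Current versions of Ruby ship the same idea as the built-in `sort_by` method.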
How does Ruby differ from other OO languages?
Pure object orientation
Any program written in Ruby can add methods to both classes and instances of classes at runtime. Consequently, two instances of the same class can behave differently at the same time.
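A minimal sketch of this behavior follows. The `Greeter` class and its methods are hypothetical names chosen for the illustration. The first loop checks that values of several kinds, including a class and a code block, are all objects; the rest adds a method to a single instance and then to the whole class at runtime.

```ruby
class Greeter
  def greet
    "hello"
  end
end

# Integers, strings, arrays, classes and code blocks are all objects.
[42, "text", [1, 2], Greeter, proc { }].each do |thing|
  puts thing.is_a?(Object)   # prints true each time
end

first = Greeter.new
second = Greeter.new

# Add a method to one instance only (a singleton method).
def first.greet
  "konnichiwa"
end

# Reopen the class at runtime: every instance gains the new method,
# including instances created before this point.
class Greeter
  def farewell
    "goodbye"
  end
end

puts first.greet      # konnichiwa
puts second.greet     # hello
puts second.farewell  # goodbye
```

After the singleton definition, `first` and `second` are instances of the same class yet answer `greet` differently.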
Inheritance
Closures and the iterator
The following tree structures, written in Ruby and Perl, are instructive. In Ruby, the tree class can be written very naturally as:
Perl does not offer this kind of class definition; rather it performs the tree in many different ways. For example,
introduces a node using a reference to an array. However, this notation makes it necessary to remember that $$node[0] is the parent and $$node[1] is the reference to the child nodes. This method can be improved upon in Perl, using anonymous hash:
However, this is less efficient than Ruby's version. Perl's equivalent OO notation using the package, on the other hand, looks strange in comparison to the simple Ruby example above:
Ruby's iterator resembles what many languages call a "callback routine," although it does much more than a typical callback routine. Consider, for example:
This code is equivalent to:
The "each" iterator allows you to shorten code in this manner. This method allows access to Array without requiring knowledge of its access method. Ruby also allows you to define your own iterator. In comparison, Perl has the "each" iterator, but you cannot define it. Below is an example of a user-defined "each" method in Ruby:
The iterator was originally intended to abstract loops, but it has since been generalized to the more useful case where the block runs zero times or once. A good example of this usage is File.open in the following code:
The File.open method opens the file, and when the block (do ... end) finishes, or if an error occurs, the file is automatically closed. Using an iterator, it is also possible to access data inside an object without exposing instance variables. For example,
allows you to access each child node without accessing the instance variable directly:
(Code examples by Akinori Ito, aito@ei5sun.yz.yamagata-u.ac.jp)
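The ideas in this section can be sketched together as follows. This is an illustrative reconstruction under stated assumptions, not Akinori Ito's code: the `Node` class, its `each_child` iterator, and the `write_then_read` helper are hypothetical names. The first part shows a tree class whose children are reachable only through a user-defined iterator; the second shows the block form of File.open closing the file once the block finishes.

```ruby
require "tmpdir"

class Node
  attr_reader :name, :parent

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @children = []                 # never exposed directly
    parent.add_child(self) if parent
  end

  # User-defined iterator: hands each child to the caller's block.
  def each_child
    @children.each { |child| yield child }
  end

  protected

  def add_child(node)
    @children << node
  end
end

# File.open with a block: the file is closed when the block ends.
def write_then_read(path, text)
  handle = nil
  File.open(path, "w") do |file|
    handle = file
    file.puts(text)
  end
  [handle.closed?, File.read(path)]
end

root = Node.new("root")
Node.new("left", root)
Node.new("right", root)
root.each_child { |child| puts "#{child.parent.name} -> #{child.name}" }

Dir.mktmpdir do |dir|
  closed, contents = write_then_read(File.join(dir, "demo.txt"), "ruby")
  puts closed      # true
  puts contents    # ruby
end
```

Callers of `each_child` never learn that the children are stored in an array, so the storage could later change without affecting them.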
How does Ruby compare to Perl and Python?
Although Python is similar to Ruby in design and purpose, there are significant differences. Ruby's statement structure is more conservative than Python's. It is not necessary to write "self" to access the attributes of an object in Ruby. Ruby does not expose object attributes by default, as Python does. Ruby's functions and methods, unlike Python's, are not first-class objects. Ruby converts between small integers and long integers automatically; Ruby does not have tuples; and all data in Ruby are class instances. Though it is rarely claimed that Ruby is more powerful than Python, Ruby is faster, more natural, more elegant, and increasingly more popular.
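A few of these differences can be seen in a short sketch. The `Account` class is a hypothetical example: its instance variables are invisible from outside unless a method exposes them, its methods call one another without writing "self", and integer arithmetic moves past the machine word size without any explicit conversion.

```ruby
class Account
  def initialize(owner, balance)
    @owner = owner       # no reader defined, so not reachable from outside
    @balance = balance
  end

  # An attribute is visible only through a method the class defines.
  def balance
    @balance
  end

  def doubled
    balance * 2          # implicit receiver: no "self." needed
  end
end

account = Account.new("matz", 2**62)
puts account.doubled                 # 9223372036854775808
puts account.respond_to?(:owner)     # false
puts account.doubled.is_a?(Integer)  # true
```

Here `2**62` fits in a 64-bit machine word, while the doubled value does not fit in a signed one; the result is still an ordinary integer as far as the program is concerned.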
A brief history of Ruby
Since February of 1993, mailing lists have been established, Web pages have formed, and a community has begun to grow around Ruby. The mailing lists, in particular, have been instrumental in creating and stabilizing the language. The oldest Ruby-list has 14,789 messages to date. Most of Ruby's scripters and developers have come from Python and Perl, though a few are fresh young upstarts. Ruby was written in C and based on Perl; in fact Ruby is like a streamlined version of Perl -- not "too cryptic and weird," in Matz' words -- with an emphasis on correct object orientation. The examples available through the Ruby home page illustrate Ruby's strong ties to Perl. Here, for instance, is an implementation of the finger command:
In Matz' own words, "I [simply] decided to make it. It took several months to make the interpreter run. I put in the features I love to have in my language, such as iterators, exception handling, garbage collection. Then, I reorganized the features in Perl into a class library, and implemented them. I posted Ruby 0.95 to the Japanese domestic newsgroups in Dec. 1995." Ruby 1.0 was released in Dec. 1996. 1.1 was released in Aug. 1997. 1.2 (stable version) and 1.3 (development version) were released in Dec. 1998. The new stable version of Ruby 1.4.3 was released Dec. 1999. It is available from the site along with a reference manual and compiled binary for Win32/DOS as well as four ftp mirrors.
The Ruby community
Users come to Ruby from newsgroups (fj.sources being one of the biggest), online communities (freshmeat.net, NetNews, the Java-House MailingList, Nifty-Serve), magazines (Linux Japan), and of course, by word of mouth. As Nishikawa (nyasu@osk.3web.ne.jp), a Ruby user, relates, "I was searching for a good scripting language which can deal with databases when I started enjoying the horse races. One of my friends encouraged me to bet, and suggested an algorithm competition of the winning collection amount. I got the result database of the past horse race and put them into MySQL database in my Linux box. C/C++ programming is not suitable for trial-and-error. I started it with Excel95 VBA and ODBC driver. Soon I found this was also a crazy way. I was going to look for a good scripting language by the web search engines, and I found Ruby. Please don't ask me about the result of the competition!"
Ruby attracts users for many reasons. Some users are won over by the ease with which it is possible to code Gtk+ applications. Others praise Ruby for its simplicity, clean syntax, powerful string manipulation, and purity of design as an OO language. Some users are drawn to Ruby from Perl, complaining that Perl doesn't handle complicated data structures well, that it doesn't handle multi-dimensional arrays effectively, and that it's difficult to read. Some are drawn to Ruby's Japanese supporting libraries. Still others are drawn because Ruby is smarter than other languages, scales to larger programs, has Tk and Gtk interfaces, and most importantly, is well balanced and natural. And, of course, Ruby's iterator is a big attraction. Most of Ruby's users formerly used C, Perl, C++, Java, Python, sh, csh, and Awk.
Common Ruby programs and applications range from mail readers and schedulers to small and medium-sized applications: text format conversion, data processing and statistical analysis, prototyping of numerical systems, networking, manipulating databases, analyzing and predicting horse races, synchronizing a PalmPilot with a MySQL database, implementing CGI, SMTP client, POP client, and FTP client tools, converting files to other formats, and extracting information from large files.
Features Ruby is still missing
Who created Ruby? developerWorks interviews Matz
developerWorks: What's your background?
Matsumoto: I'm a professional programmer. I've been working for netlab.co.jp -- a Japanese open source company -- for the past two years. Currently I'm working on Ruby's Web support, such as CGI libraries, Apache module to embed Ruby, etc. The project is funded by a Japanese public grant. I've released several free pieces of software in the past. Among them is cmail, the emacs-based mail user agent, which is written entirely in emacs lisp. Ruby is my first piece of software to become known outside of Japan.
I really love to create things. Man is created in his creator's image. So man is naturally a creator. I'm no good at painting, drawing, or music, but I can write software. That's why I write software. I really love freedom. To gain my freedom, I believe I must allow freedom for others. That's why I write free software.
dW: Why did you decide to create Ruby?
Matz: I've been a big fan of programming languages, object-oriented programming, and human computer interface -- in this order -- for 15 years. Since my high school days, it has been my dream to design and implement my own ideal object-oriented language. Ruby is my dream come true. I majored in computer science and specialized in programming languages at the university. I learned a lot about computer languages, and designed a few myself.
In 1993 after my graduation, I was talking with my colleague about scripting languages, about their power, possibility, and the ad-hoc-ness of the typical ones. As an OO fan, it seemed to me that object-oriented programming was very suitable for scripting too. I researched on the Net for a while and found Perl5, which was not released yet but planned to implement OO features. I felt it was inconsistent and not handy enough as an OOPL. I couldn't stand it. Thereafter, I abandoned Perl as an OOSL. Then, I came across Python. It was an interpretive, object-oriented language. But I felt that:
I wanted a language more powerful than Perl, and more object-oriented than Python. I decided to design my own. It took a year and a half to implement the first public release version; that was in 1995. Now, believe it or not, Ruby is more popular than Python in Japan. I've written a book titled Ruby the Object-Oriented Scripting Language, which was released late October, 1999, and was one of the best-selling computer science books in Japan in November. I admit Ruby is still rather a domestic language. But I believe it's soon to be a globally known language.
dW: How do you think it compares to other OOSLs?
Matz: There are some scripting languages with features to support object-oriented programming. On the other hand, there are few object-oriented languages with features to support scripting programming. Perl and Python are the former; Ruby is the latter. Maybe this distinction doesn't matter for many, but it does matter for me. And I believe it makes programming easy and fun.
dW: What do you think Ruby's most compelling feature is?
Matz: Let me point to three features:
dW: What is your favorite thing about Ruby?
Matz: Since I created it, I love it as my masterpiece. I designed Ruby with the human computer interface in mind, so that it follows the principles of interface:
Programming in Ruby is fun, because it follows the principles above, and it makes me concentrate on the thing I want to do, freeing me from the usual programming bothers.
dW: What were you using before you started using Ruby?
Matz: sh(bash) and C. I knew Perl, but I didn't use it.
dW: What do you mostly use Ruby for?
Matz: I spend most of my time developing Ruby in C. But I use Ruby daily -- from writing one liners to accomplish my small tasks, to writing medium-sized programs for my company. I rarely write big programs in Ruby, not because Ruby is not suited for big projects, but simply because I'm too busy to be in another big project.
dW: What would you like to see added to Ruby? Do you anticipate expanding the language?
Matz: I feel the syntax and basic features of Ruby are pretty stable now. I'd like to work on two things in the near future: promotion (especially outside of Japan), and performance improvement of the interpreter. I hacked the interpreter for a long time, so simple tuning would not work for the current interpreter. I'd like to rewrite the whole interpreter for better performance someday.