
Appendix A: what is so special about Mathematica (a personal evaluation)

Mathematica differs from many other programming languages in many ways. The one that seems most important to me is that it is not a minimal language, in a sense I will explain. As a result, any given problem can usually be solved in many different ways. The Mathematica language supports all major programming styles: procedural, rule-based, functional and object-oriented (OO is not directly supported but can be implemented within Mathematica [3,4,1]). This richness of the language is a great strength, since it allows one to choose the programming style best suited to a particular problem; some people say that the Mathematica language is "problem-oriented".

It allows one to program and do research at the same time, bringing, in principle, all the power of modern mathematics to bear on the research. The great advantage here is that it is very easy to switch one's mode of thinking from programming to research and back, or to perform necessary (non-trivial) mathematical (or statistical, etc.) checks quickly, without resorting to special libraries and interrupting the main programming workflow.

The problem (pun unintended) is, however, that the different solutions possible for a given problem are not equivalent, primarily in terms of efficiency, but also in terms of code readability, ease of maintenance and debugging. These issues will probably be of no concern to a pure scientist who just needs to plot a graph or two, simplify some formula, etc. But they will certainly be a concern for people from the software development community who may consider using Mathematica as a tool for rapid prototyping, for which, in my opinion, it has major advantages for complex software.

When I program in C and, say, solve some problem in two different ways, it is not very likely that the performance of the two implementations will differ by more than a factor of 2 (unless I do something stupid, or the two solutions actually use algorithms of different complexity). In Mathematica, however, it is quite easy to get, say, 5 or 10 different solutions where the performance of the most and least efficient may differ by several orders of magnitude, or which have different computational complexity altogether. Of course, the reason is that, when programming in Mathematica, we "sit" on top of a lot of internal algorithms used to implement the built-in functions we are using. But from the user's viewpoint, these performance differences are often far from obvious until one gets a better understanding of how the system works.
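A minimal sketch of such a difference, for two ways of building the list of the first n squares. The function names squaresAppend and squaresTable are hypothetical and introduced only for illustration. The comments on cost reflect the commonly described behavior of AppendTo (it copies the list on each call), not a measurement.

```mathematica
(* Procedural version: grows the list one element at a time.
   AppendTo copies the whole list on each call, so the total
   cost grows roughly quadratically with n. *)
squaresAppend[n_Integer] :=
  Module[{result = {}},
    Do[AppendTo[result, i^2], {i, n}];
    result
  ];

(* Version based on a single built-in: Table builds the result
   in one pass, so the cost grows roughly linearly with n. *)
squaresTable[n_Integer] := Table[i^2, {i, n}];

(* Both give the same answer... *)
squaresAppend[10] === squaresTable[10]   (* True *)

(* ...but the timings diverge as n grows. No numbers are quoted
   here, since they depend on the machine and the version. *)
First[AbsoluteTiming[squaresAppend[20000];]]
First[AbsoluteTiming[squaresTable[20000];]]
```

Nothing in the first version looks obviously wrong to someone coming from a procedural language, which is why such differences are easy to miss.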

Imagine now that we are building a system which has 4 stages of information processing, each one relying on the results of the previous one. Suppose that at each stage we can produce 2 different solutions which differ in performance by a factor of 10. If the slowdowns compound (as they do when each stage repeatedly invokes the previous one, rather than running once after it), we end up with one system working up to 10^4 = 10000 times faster than the other. In practice, this means that the "slow" system will in most cases be completely useless, given that there is in any case an overhead due to the symbolic nature of Mathematica and the very high level of its language. If one wants to build something serious and interesting in Mathematica, one has to learn techniques for programming in it efficiently or, at the very least, be aware of the performance pitfalls associated with each programming style.

At the same time, the great advantages that Mathematica brings are the ease and speed of writing and debugging code, the extremely small code size, and the ability to stay at quite a high level of abstraction throughout the process of solving a problem, without going into unnecessary low-level details which hide the essence of the problem. This allows a single person to manage substantial projects. Mathematica is a great "thinking laboratory". Due to its highly interactive nature, it is also a great tool for designing and analyzing algorithms. For me personally, this outweighs the possible complications arising from the performance issues above, especially because once you understand the system, you rarely get unexpected behavior or nasty performance surprises.

Let us forget for a while about the best-known purpose of Mathematica, which is to carry out mathematical transformations and solve various mathematical problems (symbolically or numerically or both), and think of it as a programming environment. The main questions then are: what are the main ingredients of this environment, how do they differ from their analogs in other languages, what sort of problems can be solved better or more easily, and which programming paradigms and ways of thinking are encouraged in Mathematica.

If I were asked to describe Mathematica in one sentence, I would say that it is a functional programming language built on top of a rule-based engine, which operates on general symbolic trees (or directed acyclic graphs, if you wish).

Mathematica mainly consists of the following blocks:

1. A powerful rule-based engine, with a pattern-matcher and an evaluator, built around general Mathematica expressions. We can think of this as a programming environment defined on, and optimized to work with, general symbolic trees.

2. A global rule base, which allows the user both to define functions as global rules and to make them interact with the pre-built system rules in a non-trivial way. The former is a necessary ingredient for programming. The latter can be used effectively in, for instance, carrying out mathematical transformations and simplifications, since the system already knows many identities and properties of functions and other mathematical objects. However, it can be used in many more situations, basically whenever we want to define new objects by new rules. Systems of rules are far more flexible than, say, classes in the OO paradigm, since they basically define grammars of small languages and are not rigidly tied to specific data structures.

Function calls are then internally just a special instance of the application of global rules. They are made efficient by built-in hash tables used for global rules (among other things). Type checking (when needed) is made almost trivial by the pattern-matcher. Functions can be "overloaded" in a much more general way than in more traditional OO languages.

3. Highly optimized and efficient structural operations on lists and arrays (Flatten, Transpose, Partition, all numerical built-in functions, comparison operations, etc.), which are similar to those in the APL language.

4. Support for the functional programming paradigm, both through the possibility of defining pure (anonymous) functions and through efficient built-in higher-order functions such as Apply, Map, Fold, etc. Due to the uniform representation of everything as a Mathematica expression, these higher-order functions (and thus the FP style) apply to general Mathematica expressions rather than just lists. This is a very powerful capability.
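Blocks 2 to 4 can be illustrated with a short sketch. The function describe and the symbols f, g, a, b, c are hypothetical names chosen for this illustration; f, g, a, b and c are assumed to have no definitions, so they stay symbolic.

```mathematica
(* Block 2: functions are global rules, and patterns do the type checking.
   describe is a hypothetical function with three "overloaded" rules. *)
ClearAll[describe];
describe[_Integer] := "integer";
describe[_String] := "string";
describe[{_, _}] := "pair";

describe[5]          (* "integer" *)
describe["abc"]      (* "string" *)
describe[{1, 2}]     (* "pair" *)
describe[2.5]        (* no rule matches: stays as describe[2.5] *)

(* The stored rules can be inspected directly *)
Length[DownValues[describe]]   (* 3 *)

(* Block 3: structural operations on lists and arrays *)
Flatten[{{1, 2}, {3, {4}}}]        (* {1, 2, 3, 4} *)
Transpose[{{1, 2, 3}, {4, 5, 6}}]  (* {{1, 4}, {2, 5}, {3, 6}} *)
Partition[{1, 2, 3, 4, 5, 6}, 2]   (* {{1, 2}, {3, 4}, {5, 6}} *)

(* Block 4: higher-order functions work on any expression, not only lists *)
Map[f, g[a, b, c]]           (* g[f[a], f[b], f[c]] *)
Apply[Plus, g[1, 2, 3]]      (* 6 *)
Fold[f, a, {b, c}]           (* f[f[a, b], c] *)
Map[#^2 &, {1, 2, 3}]        (* {1, 4, 9} *)
```

Note that describe[2.5] is not an error: when no rule applies, the expression is simply returned unchanged, which is what makes the rule base so easy to extend.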

The availability of the rule-based approach basically means that one can easily create a language describing any new object one wants, be it a more formal language with a grammar or just a collection of objects and relations between them. What is important is that this can be completely syntax-based. By adding new rules to some built-in functions ("overloading"), one can make this new language interact immediately with the rules that exist in the Mathematica kernel, and thus take advantage of those.
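One way to sketch this kind of "overloading" is shown below. The wrapper vec is a hypothetical name for a two-component vector defined purely by rules. As an assumption of this sketch, the new rules are attached to vec itself (as up-values, via the /: syntax) rather than to Plus and Times, which avoids having to unprotect the built-ins.

```mathematica
(* vec is a hypothetical wrapper for 2D vectors, defined purely by rules.
   The rules below tell the built-in Plus and Times how to treat it. *)
ClearAll[vec];
vec /: Plus[vec[a_, b_], vec[c_, d_]] := vec[a + c, b + d];
vec /: Times[k_?NumericQ, vec[a_, b_]] := vec[k a, k b];

vec[1, 2] + vec[3, 4]        (* vec[4, 6] *)
2 vec[1, 2] + vec[x, y]      (* vec[2 + x, 4 + y] *)
```

The second example mixes the new rules with the kernel's own arithmetic: the components 2 + x and 4 + y are ordinary symbolic sums handled by the built-in Plus.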

The rule-based approach also means that the language is very unrestrictive (or should I say powerful): it puts virtually no bounds on what types of manipulations can be done in principle. As some extreme examples, one can define functions that produce other functions, functions that change their own definitions at run-time (for example, we may program a function that destroys itself after it is done with its work, and even have other functions produce such "disposable" functions at run-time), functions that manipulate the definitions of other functions at run-time, and many more seemingly weird possibilities. Techniques like dynamic programming, caching and memoization, lexical closures, etc. are common practice in Mathematica programming and require little effort from the programmer. Also, if one feels that a more "rigid" or restrictive framework (such as object orientation) is needed for a particular problem, it can be implemented within Mathematica.
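A few of these possibilities can be sketched in a handful of lines. The names fib, makeCounter, c1, c2 and once are hypothetical and used only for illustration; the idioms shown are common ones, not the only way to achieve each effect.

```mathematica
(* Memoization: each computed value is stored as a new, more specific rule *)
ClearAll[fib];
fib[0] = 0; fib[1] = 1;
fib[n_Integer /; n > 1] := fib[n] = fib[n - 1] + fib[n - 2];

fib[30]                      (* 832040 *)
Length[DownValues[fib]]      (* 32: rules for 0..30 plus the general one *)

(* A function that produces other functions: each call to makeCounter
   returns a pure function with its own private variable (a closure) *)
ClearAll[makeCounter];
makeCounter[] := Module[{count = 0}, Function[count += 1]];
c1 = makeCounter[];
c2 = makeCounter[];
{c1[], c1[], c2[]}           (* {1, 2, 1} *)

(* A "disposable" function that removes its own definition after one use *)
ClearAll[once];
once[x_] := (Clear[once]; x^2);
{once[3], once[3]}           (* {9, once[3]} *)
```

In the memoization example the function literally rewrites the global rule base as it runs, which is the same mechanism the "disposable" function uses to remove itself.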

The availability and effectiveness of the functional programming style allows one both to make the code more concise and to create data structures on the fly (since in this approach any complex data structure is represented by a possibly nested list). If, however, one wants to shift the emphasis more towards data structures, this is also possible and easy thanks to syntax-based pattern-matching and rule substitution. And because in Mathematica functional programming can be performed on general Mathematica expressions (more general than lists - this is made non-trivial by pattern-matching), one can also combine the two programming styles and shift the relative roles of functions and data structures in whatever way feels most comfortable. It is typical in Mathematica programming to use functional programming in the more exploratory stage and then create more rigid data types and structures once the design has taken shape.

The large number of built-in functions has both advantages and disadvantages. To list just a few advantages: you get a huge collection of (often very sophisticated) algorithms already implemented, tested, etc., and packaged as built-ins. The extensive Help system and the error messages allow one to learn new functionality, and to write and debug programs, very quickly. However, while capabilities of Mathematica such as pattern-matching and rule substitution are great, they are also expensive in terms of performance. As a result, many operations would be too slow if implemented directly in the Mathematica language. Therefore, they are implemented in a lower-level language such as C and packed into the Mathematica kernel. This solves the problem, but often makes performance hard to understand (especially for inexperienced users), since the performance of user-defined and built-in functions can be dramatically different.

All is not lost, however. The general principles on which Mathematica is built give the language overall consistency. This, together with a large number of quite generic and efficient built-in higher-order functions (that is, functions that manipulate other functions), makes efficient general-purpose Mathematica programming techniques possible. These techniques are not too difficult to learn, and in some sense they split the entire Mathematica language into three layers: "scripting" (quick to write, but often slow to execute), "intermediate" (a bit more thinking, but faster code), and "system" (less intuitive thinking, but much faster code still). Please bear in mind that this classification is my own and based on my personal experience, rather than a widely accepted one.

Part of the difficulty of learning Mathematica programming is that there is no good formal distinction between these layers. Typically, the first is characterized by heavy use of procedural (or otherwise straightforward) code, the second by the use of functional programming, and the third by heavy use of optimized structural operations, but these are not absolute criteria. One and the same operation can play a "scripting" role in one context and a "system" role in another.
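As a rough illustration of the three layers, here is the same small task (the sum of squares of a list) written in each style. The function names are hypothetical, and assigning each version to a layer is an interpretation of the description above rather than a strict classification; as noted, the boundaries are not sharp.

```mathematica
data = Range[10];

(* "Scripting" layer: explicit loop with a running total *)
sumSquaresLoop[list_List] :=
  Module[{total = 0},
    Do[total += list[[i]]^2, {i, Length[list]}];
    total
  ];

(* "Intermediate" layer: functional style with a higher-order function *)
sumSquaresFold[list_List] := Fold[#1 + #2^2 &, 0, list];

(* "System" layer: whole-list arithmetic followed by a structural operation *)
sumSquaresTotal[list_List] := Total[list^2];

{sumSquaresLoop[data], sumSquaresFold[data], sumSquaresTotal[data]}
(* {385, 385, 385} *)
```

All three return the same result; how much they differ in speed depends on the size and type of the data, so it is worth timing them on the inputs one actually cares about.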

For many problems (especially purely scientific ones), the "scripting" layer is sufficient. This layer consists mainly of using built-in commands, or gluing them together with typically procedural code. A big part of the bad reputation that Mathematica used to have for "slow performance" is related to the fact that most people are only aware of this language layer, because it corresponds most directly to their programming experience in other (procedural) languages.

The other two layers serve several purposes, such as improving the speed and quality of code design, improving performance in general, and removing certain performance bottlenecks within Mathematica without resorting to external code (although this is also possible through connecting technologies such as MathLink or J/Link). Also, and perhaps even more importantly, they provide the programmer with new ways of thinking about problems. While less important for some scientific applications, these layers matter much more for software development and prototype design.

From the pragmatic point of view, the proper use of each of the above capabilities individually, and the ability to choose the programming paradigm that best fits a given problem, can greatly improve one's use of Mathematica (both in terms of the speed of writing and debugging a program, and the speed of code execution).

It probably does not make sense to master Mathematica at this level for someone who needs it just occasionally, to compute an integral or two or plot a graph or two. However, for a person who routinely needs to perform lots of non-trivial checks and experiments (typical for computer modelling/simulations or rapid prototyping), this level of use of Mathematica will be very valuable. The end result of learning these techniques will be twofold: a great reduction in time (both human and computer) and code size for most problems, and the ability to push Mathematica a lot further in solving hard or computationally intensive problems before switching to more efficient specialized software or another programming language.

From the programmer's point of view, the speed of writing and debugging code, combined with its typically small size, allows a single person to manage quite large projects. The package mechanism provides support for larger-scale programming. In addition, by combining the above functionality in non-trivial ways, one can develop different and possibly novel ways of both programming and thinking about problems. The underlying rule-based nature of Mathematica makes it possible to remove many of the restrictions, typical of more traditional languages, on what can be done in principle. The price to pay is often reduced efficiency. Getting familiar with Mathematica on a deeper level can help deal with this in many cases.

For non-trivial and/or computationally demanding problems containing many steps, it is rather dangerous in my view to take the "recipe" approach and search the Help etc. for similar problems already solved (this certainly helps a great deal, but you have to understand the code). Even if the solution you find is optimal for some other problem, there are many subtleties which may make even slightly modified code wrong or inefficient. Learning these subtleties by trial and error may be faster in any given case, but does not pay off in the end if one has to use Mathematica frequently. On the other hand, learning a coherent picture of Mathematica programming will help ensure that you pick the right idiom for the problem. Also, all the subtleties mentioned are then naturally understood within this framework, since on a deeper level they simply reflect the way the system works.

Created by Wolfram Mathematica 6.0 (05 February 2009)