As I was writing this, in a sleepy west of England seaside town, the world was scrambling to deal with humanitarian catastrophe following the Indian Ocean tsunami of December 2004. Though I have no working knowledge of the affected region, datasets dropped into my inbox because chance electronic acquaintances, some of whom I have never physically met, needed ad hoc analysis and comment and a larger, specialist agency wasn't immediately available.
In the course of all this, my unknown correspondents may have shared with me, and with other parts of the network, no more than a few words of vocabulary - even their mathematical notation may sometimes be unreadable - but when it came to graphical presentation, we were all on common ground. With the interconnection of people, irrespective of geography, collaborations which would once have been either impossible or heroic have become casual actions in the odd moments of a normal working day.
Most of the work, given the context, was not about sophisticated analysis; I was simply asked to summarise volumes of information from one kind of specialist in a way that made it useful to another. Detailed graphic communication across the internet needs task-specific tools. Just a moment ago, I was dealing with one such request through SigmaPlot; at eight o'clock this morning, it had been Origin. My primary use of those two packages (versions 9.0 and 7.5 respectively), however, has been on a smaller and less dramatic stage. Chronic rather than acute problems have been the subject of previous internet conversations - between inhabitants, environmental chemists, crop specialists and accountants, as agriculture adapts to localised groundwater changes driven by climate change and industrialisation.
- A basic black and white histogram (background window Graph1) is modified by application of an Origin 7.5 theme from the gallery (foreground). At screen right, a matrix has been generated from the worksheet (left) preparatory to generation of a three dimensional plot.
I'm regularly asked: why use more than one package in cases like this? Why Origin and SigmaPlot, rather than just one of them? Part of the answer is, of course, 'because I can'. More important is that different packages offer different blends of strengths. Informative visual communication is like good cooking: sometimes speed is of the essence; other times depth is more important; but either way, you want the best possible result. Over the past few months, I've come to appreciate, amongst other things, the quick, uncomplicated accessibility of SigmaPlot; the control and extensibility of Origin; and the large area of competence overlap between them.
How far any given user wants development of in-package analytic convenience to weigh against that of core visualisation varies tremendously, and this is reflected in the software. SigmaPlot offers t-tests, for instance, and very convenient outlier identification, as well as the obvious natural companions to scatterplotting, such as curve fitting and a flexible, sophisticated, well-developed regression menu. Analytical breadth beyond that is left to its sibling, SigmaStat, all of whose repertoire is callable from within SigmaPlot, but depth develops from one release to the next. This seems a wise choice, usability by a wide audience being a prime asset which would blur under the weight of too many extras. Origin, on the other hand, pitches itself at a different market, and has no equivalent stablemate, so it has evolved a more muscular set of tools. An analysis menu includes, alongside curve fitting, correlation and data management, extended options such as FFT, convolution and deconvolution; the statistics menu next to it adds one- and two-way ANOVA, plus Kaplan-Meier and Cox Proportional Hazards for survival analysis work.
- Comparison of the different workspace layouts in SigmaPlot 9.0 (top left background) and Origin 7.5 (bottom right foreground). Import of data from SigmaPlot to Origin (bottom right) and from an Excel sheet (lower left).
It seems likely, though, that either package will be bought for its core graphical strengths. Interoperability with other software, and recognition of evolving conventions, are a different matter: such issues are now central to any software's survival in its evolutionary niche as the larger environment changes. In the increasingly borderless information world, it's not enough for a product to do its job well; it must function well as an extension of other tools that may be in use. While a fully seamless flow is still an unachieved ideal, reality is gradually approaching it. Both these packages will import data from a range of sources, and, in the overwhelming majority of cases, do so without the user having to think very much about it.
Each will directly open a native Excel sheet - a trick whose necessity these days says something about the compromises which we, as users, make in the interests of one-world standards. There are far better data-containers than Excel, but its ubiquity can't be ignored. SigmaPlot is well placed to act as an unthreatening but powerful extension for Excel users, while Origin is probably the choice for similar expansion of MATLAB (though it also responds to direct Excel control). Origin has taken great pains to enhance the general level of data transparency; initial import setup for your particular data may still require some time (though a wizard takes much of the strain), but the results can be saved to a filter file and data files thereafter dragged and dropped into place. SigmaPlot can now take data directly from ODBC sources.
It's not just file formats that loom up to spoil the one-world view: progress constantly throws up new requirements, including low level standards, with data itself being encoded in new ways. Not so long ago, I'd never heard of IRIG (see below); now it crops up with noticeable frequency. Origin recognises IRIG directly, and also talks to LabVIEW for a significant reduction in transfer overhead.
User interfaces are also aspects of transparency. Both SigmaPlot and Origin continue developments that take them a long way from the faces they presented even a couple of releases ago. The hold of Excel rears its head again in modifications to worksheet behaviour; SigmaPlot, for instance, offers find and replace, pane freezing, a print preview and improved formatting (also column headings which need no longer be unique - I'm not sure this is an advantage, but the option is there). Origin introduces optional automatic update of repeated worksheet calculations, including those based on columns from other worksheets.
- From a SigmaPlot 9.0 worksheet, comparative histograms (pictured in the background) and bar charts (shown overlaid, lower left) are generated.
SigmaPlot has the same notebook manager structure as SigmaStat, its general features and look now familiar across the analysis and visualisation software sector: a tree structure in the (dockable) left browser pane, showing all open notebooks and their components, with gallery palette beyond and content on the right. This approach works well here; a useful touch is the ability to consult pages from several documents simultaneously - in my specific case, that means easy comparison of particular groundwater profiles across multiple sites.
A parallel multiple-worksheet comparison is found in Origin's ability to produce wizard-guided plots from the same variables in different sheets. Origin has also adopted the idea of 'themes' from other application areas - a theme being a collection of formatting details which can be applied to output. Themes are provided, but can also be defined by the user and a theme can be set as the system default for future sessions. My groundwater data visualisations go out to a dozen or more participants who usually have to put up with a single 'look', but when the work is being done in Origin it is feasible to provide different versions in their various house styles. Alongside this is copy and paste of format settings from one graph to another, enabling rapid harmonisation of detail across the work at hand - particularly useful when evolving a new theme from an existing one.
The great strengths of SigmaPlot are speed and ease of use. Many of the incoming water project datasets have been turned around and sent on their way speedily, effectively and painlessly thanks to exactly these SigmaPlot qualities. To this extent, transparency, intuitiveness and interface design are part of core function. Other developments, however powerful and clever, must never interfere with usability. Systat Software has shown a shrewd awareness of this. Since their acquisition of the product line, many improvements have taken place so that running the software is a completely different experience; other changes have sensibly been slow and careful out of consideration for the user's convenience. Improved support for category data is an example: not just good in itself, it also reduces the need to manipulate the data sheet for a given output. Automatic placement of curve-fit equations in graphic titles, power display, an improved wizard for histogram generation, more useful default labelling, and a clutch of other improvements, continue the same trend. In a different sense, so does the arrival of separate preference files (and password file protection) for all users on the installed machine.
In Origin, improved usability is a means to an end rather than an end in itself. Here, the central concern is fine-grained control across large projects and over time: enhanced usability follows from developments in function. Where SigmaPlot has a transform language, Origin has Origin C and scripts. Other files related to a project (reference material for instance, such as textual commentary in PDF or MS Word) can be attached to the Origin project file to form a unified package; and if any of these are script or Origin C files, they compile automatically. Beyond that, automation server support allows direct access from other applications including response to Visual Basic - I've had some of the more voluminous and standardised water datasets coming directly into a dedicated analytic package, on to Origin and out again as standard format graphics, then emailed onward to the next person in the chain, before I even see them. It's equally possible to return results to the originating application or, conversely, pass data out of Origin for processing and bring the results back in again. Origin also handles its graphics as discrete components, giving greater control over their interrelation as well; the list goes on, but you get the idea.
Why use two different products? Better to ask 'why not?', when they are available and serve complementary needs.
IRIG-ularity
In the last issue, the editor commented wryly that the advances in computing speed which make modern science possible are often driven by military requirements. I share his sad acceptance of this fact of life, which goes beyond speed to most other aspects of applied technology. The IRIG (inter-range instrumentation group) standard is one example.
Time data is crucial for many purposes, and is subject to the same uncertainties of accuracy and precision as other measures. As data is increasingly correlated across the globe, accuracy becomes more and more of an issue. If I use the same measuring instrument in London as a colleague in Ulan Bator, it's a fairly trivial matter to harmonise our methods and results; but to ensure that clocks agree to the same level is more of a challenge. If our contact is only intermittent, the uncertainty escalates. We can measure to ridiculously precise subintervals of a second, but is a time difference down to the phenomenon being measured or the vagaries of our respective computer clocks?
IRIG is a 74-bit code for time data acquisition, recording and representation which embraces both precision and accuracy. It is increasingly found in a great variety of instrumentation and data capture contexts. Binary coded decimal time of year information accounts for 30 of those bits, and binary seconds of day for another 17. A further 23 bits carry other functions: daylight saving and leap second information, local time offset, parity, the last two digits of the year, and time quality (incremented in levels of uncertainty against UTC, from 0000 at full reliability to 1111 indicating none). The remaining four bits are unused.
Time signals based on it are generated for a variety of uses from raw experimental data stream to audio and video. For more detail, see IRIG Standard 200-89 (Range Commanders Council of the U.S. Army White Sands Missile Range) and IEEE 1344-1995.
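For readers who like to check the arithmetic, the field breakdown quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a decoder for a real IRIG frame: the field names and all four functions are hypothetical, actual bit positions are not modelled, and the 23 'other function' bits are assumed to include the two year digits (the reading under which the quoted figures sum to 74).

```python
# Sketch only: field widths are the figures quoted in the text; real bit
# positions within an IRIG frame are NOT modelled here.
FIELD_BITS = {
    "bcd_time_of_year": 30,
    "binary_seconds_of_day": 17,
    "other_functions": 23,  # assumption: includes the two year digits
    "unused": 4,
}

SECONDS_PER_DAY = 24 * 60 * 60


def total_bits(fields=FIELD_BITS):
    """Sum the field widths; the text gives 74 for the whole code."""
    return sum(fields.values())


def bcd_to_int(digits):
    """Combine decimal digits (most significant first), one BCD group each."""
    value = 0
    for digit in digits:
        if not 0 <= digit <= 9:
            raise ValueError(f"not a BCD digit: {digit}")
        value = value * 10 + digit
    return value


def seconds_of_day_to_hms(seconds):
    """Split a binary seconds-of-day count into (hours, minutes, seconds)."""
    if not 0 <= seconds < SECONDS_PER_DAY:
        raise ValueError(f"seconds of day out of range: {seconds}")
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return hours, minutes, secs


def time_quality_level(bits):
    """Read a 4-bit time quality string such as '0000' or '1111' as 0..15."""
    if len(bits) != 4 or any(b not in "01" for b in bits):
        raise ValueError(f"expected four binary digits, got {bits!r}")
    return int(bits, 2)


if __name__ == "__main__":
    print(total_bits())                      # 74
    print(2 ** FIELD_BITS["binary_seconds_of_day"] >= SECONDS_PER_DAY)  # True
    print(bcd_to_int([3, 6, 5]))             # 365
    print(seconds_of_day_to_hms(86399))      # (23, 59, 59)
    print(time_quality_level("1111"))        # 15
```

The second check shows why the seconds-of-day count needs 17 bits: a day holds 86,400 seconds, which is more than 16 bits can count (65,536) but fits comfortably within 17 (131,072).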