Visualisation and Analysis Software - off the Beaten Track
I'm occasionally asked 'why haven't you reviewed such and such?' The answer is sometimes (but rarely): 'I don't think it's a good product, so I passed it by'. On other occasions: 'I've tried, but the publisher hasn't made a copy available for review'. Then there is 'I didn't know about it', or 'I hadn't thought of it'; in which case, I go and take a look. Neither size nor clout impinges. We've reviewed some very small and very large products, from very small and very large companies, and have actively looked for those with a particular outlook or distinct contribution.
Products which evolve rapidly tend to get noticed more frequently. Reviews often originate in a suggestion or an enquiry - from a user, from a potential purchaser, from a colleague, from any number of other sources. Reviewing software is also, like many areas of human endeavour, inclined to be chreodic: channels that yield results tend to be revisited; return to those that do not is less frequent; and those that are never encountered in the first place don't get revisited at all!
The deliberate intention behind this article was to break out of the channels and look specifically for examples of analytic or visualisation software that have (for one reason or another) not been reviewed. In contrast to my usual practice, I didn't work intensively with any package over a period of time; I just dropped in on them in their day-to-day employment by tolerant users. In the event, there were far more of them than could be mentioned even briefly; this is just a whistle-stop tour.
Of the big-name number-crunchers, the omission that stands out is the SAS Institute. SAS, the product, is a giant that tends to swim in business more than scientific waters; that's not a hard and fast rule, but it's a definite tendency. Not that I see any particular barrier between the two areas (some of the products below are primarily aimed at business use), but in a quick overview like this, I prefer to politely bypass it for smaller siblings. Scientific Computing World has twice reviewed one of them: StatView, now discontinued for two years and sadly missed. Another, with a high profile and widely used in engineering, is JMP - towards which SAS is directing the StatView upgrade path. JMP has developed a great deal over time, and is now a very easy-to-access source of mainstream analytical methods. The interface is unusual but effective: a tabbed set of onscreen 'cards', from each of which is accessed a subset of related icons, each with a short piece of identifying text. For most people, most of what is needed will be here (SAS says that JMP provides 90 per cent of its revenue) though, as with any product, if you are a knowledgeable user requiring something particular, you will look before you leap.
- Multiple views of JMP's 'tabbed cards' interface, with (at centre right) a worksheet overlaid by a spinning plot generated from it via the 'Graph' menu (top right).
Also very welcoming, encouraging even the newest and most nervous user to get stuck in and explore data, is KYplot - one of those products that's actually fun to use as well as productive. It emphasises the interplay of analysis and visualisation in data exploration. One graduate student called it his 'aerial reconnaissance platform' for new incoming experimental data. Version 4.0 (current at the time of writing) has gone well beyond the original sound but unassuming capabilities: new statistical tools, image processing, ternary plots, and a range of other facilities have arrived. Area and amplitude of signal peaks are a click away; graphical output of three-dimensional FFTs likewise. Size of data set used to be a problem; it may still be for the very largest sets, but if you can store it in Excel you can now import it to KYplot.
For a purely visual approach, a friendly alternative to KYplot is Vista (Visual STAtistics). This is a freely downloadable (but not public domain, and with moderated source code access) node-mapped system, designed by Forrest Young at the University of North Carolina. It follows the model used in many CASE environments: select your components, join them up into a sequential process, and run your analysis. Like many such packages, Vista was written for academic use, but offers a distinctive approach for work well beyond that.
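The 'select, join, run' model described above can be sketched in a few lines of ordinary Python. This is an illustration of the general idea only, not Vista's own interface or language: the component names (`drop_missing`, `standardise`, `summarise`) and the data are invented for the example.

```python
from statistics import mean, stdev


def drop_missing(values):
    """Component 1: discard missing observations (represented here as None)."""
    return [v for v in values if v is not None]


def standardise(values):
    """Component 2: rescale to zero mean and unit sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def summarise(values):
    """Component 3: reduce the data to a few summary figures."""
    return {"n": len(values), "min": min(values), "max": max(values)}


def run_pipeline(data, components):
    """Pass the data through each component in the order they were joined."""
    for component in components:
        data = component(data)
    return data


# Invented data: four readings with two missing values.
raw = [4.0, None, 6.0, 8.0, None, 10.0]
result = run_pipeline(raw, [drop_missing, standardise, summarise])
print(result)
```

Changing the analysis means changing the list of components handed to `run_pipeline`, which is roughly what dragging and joining nodes does in a visual environment.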
- Using other people's kit, in their time, it wasn't feasible to generate ideal screen shots for each package; so, to provide a level playing field, this collage is compiled entirely from images supplied by the publishers of the products illustrated. Top left, Bayesware; top right, Vista; frame centre, RCAxpress Fishbone software; bottom left and bottom centre, The Unscrambler; bottom right background, WinBUGS; bottom right foreground, KYplot.
Still on the theme of easy-to-use interfaces, Camo Technologies offers two intertwined products of which The Unscrambler is the analytic line, representative of a growing market in tools for rapid access to multivariate data. It could be an invaluable time saver, stripping away surface layers of confusion before applying more traditional tools to what it lays bare. The core unscrambler itself is supported in the full package by a multivariate experimental designer, as well as by a predictor and a classifier responding in real time to instrumentation data. There is also a spectroscopy pack for pre-treatment and model transfer. All of my trials were carried out in version 9.0, but a new release (9.1) appeared in August and a quick play around shows a number of useful new features; pulling in 3D tables from MATLAB, for example, speeds up relevant investigations significantly.
For something very different, and very specific, but along the same 'initial earth moving' line of approach, there is RCAxpress's Fishbone software. This is not a statistical-analysis product, but a problem-solving one worth investigation by anyone who has to isolate woods from trees - particularly in industrial processes. It offers automated generation of Ishikawa diagrams (also known descriptively as fishbone diagrams, hence the product name), showing relations between input factors and a specified outcome. Simple to use, once you've got the hang of it, it could save countless hours of tedious and expensive manual background work before or between analyses.
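For readers who have not met one, the information held in an Ishikawa diagram is simple: one outcome, a handful of cause categories, and the candidate causes under each. The sketch below holds that structure in a Python dictionary and prints it as a plain-text outline. It does not reflect how the RCAxpress product stores or draws its diagrams; the function name `fishbone_outline`, the categories, and the causes are all invented for illustration.

```python
def fishbone_outline(outcome, causes_by_category):
    """Return a text outline: the outcome is the 'head', each category a 'bone'."""
    lines = [f"OUTCOME: {outcome}"]
    for category, causes in causes_by_category.items():
        lines.append(f"  +-- {category}")
        for cause in causes:
            lines.append(f"  |     - {cause}")
    return "\n".join(lines)


# Invented example: candidate causes grouped under three categories.
diagram = {
    "Machine": ["worn tooling", "calibration drift"],
    "Method": ["inconsistent cure time"],
    "Material": ["supplier batch variation"],
}

outline = fishbone_outline("High reject rate", diagram)
print(outline)
```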
Slipping sideways from the focused networks of an Ishikawa diagram into fuzzier realms, there is an increasing range of products for applied analytical work with neural networks and Bayesian statistics. Both are available within some of the big beasts of the statistical software jungle, but much of the running is often made by smaller products from individuals, study groups, faculty departments or niche companies. Bayesware Discoverer, though not new, is still well worth a visit. Aimed at data mining approaches, it is yet another way to shift a lot of topsoil quickly and discover possible lines of enquiry (or, of course, check out intuitive ideas already in place, but not yet firm enough to justify investment). It offers a largely automated way to generate models quickly, by sucking in the contents of a database (it will talk through DDE to MS Access and S-Plus), searching for the best probability fit, and spitting out a Bayesian network. Wizards are the fast way in when you first use the software, but options can be fine-tuned as you gain confidence and want to focus more closely.
Staying with things Bayesian, I had a lot of fun with WinBUGS. As a free-to-download environment, this could perhaps have been included in the Free Software article on page 22, but it is only really free for educational use, so it finally found its roost here. It is one flavour, designed to run under Windows, of BUGS (Bayesian inference Using Gibbs Sampling; jointly developed by the MRC Biostatistics Unit at Cambridge and Imperial College School of Medicine at St Mary's, London). WinBUGS makes playing easy; although it is restricted to smaller problems, it gives a good feel for what is possible in the fuller versions (my initial flying hours being on other people's DOS and Linux copies). PKBugs is an extension into pharmacokinetic and pharmacodynamic models. Of particular personal interest to me, spatial model fitting with map output (aimed at epidemiology, but useful for any space-related data) comes via GeoBUGS. There's even an Excel add-in - enter data, the sheet calls WinBUGS for you, results are returned; the user need have no knowledge of BUGS at all, although obviously it helps.
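The 'Gibbs Sampling' in the BUGS name refers to a general technique: draw each unknown in turn from its distribution conditional on the current values of the others, and repeat. The sketch below is a minimal textbook illustration in Python, not BUGS model code and not how WinBUGS is implemented. It samples from a standard bivariate normal with a chosen correlation, a case where the right answer is known in advance so the sampler can be checked; the function name, seed, and settings are assumptions made for the example.

```python
import random
from statistics import mean


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=1):
    """Sample (x, y) from a standard bivariate normal with correlation rho
    by alternately drawing each variable from its conditional distribution."""
    rng = random.Random(seed)
    sd = (1.0 - rho * rho) ** 0.5  # standard deviation of each conditional
    x, y = 0.0, 0.0
    samples = []
    for i in range(burn_in + n_samples):
        x = rng.gauss(rho * y, sd)  # draw x given the current y
        y = rng.gauss(rho * x, sd)  # draw y given the new x
        if i >= burn_in:            # discard the early, unsettled draws
            samples.append((x, y))
    return samples


samples = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
xs = [s[0] for s in samples]
ys = [s[1] for s in samples]
mx, my = mean(xs), mean(ys)

# Sample correlation, which should land close to the rho requested above.
cov = mean((a - mx) * (b - my) for a, b in zip(xs, ys))
var_x = mean((a - mx) ** 2 for a in xs)
var_y = mean((b - my) ** 2 for b in ys)
corr = cov / (var_x * var_y) ** 0.5
print(f"sample correlation: {corr:.2f}")
```

In real Bayesian work the conditionals come from a model and its data rather than from a known distribution, and deriving them is the job that the BUGS family takes off the user's hands.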
With regretful thanks to all those users who gave their time, experience, and access to other good packages not mentioned here: 'That's all, folks!' Time and space have run out on me. The others will get their day in the future. Remember that reader suggestion is one engine behind review choices; if you think that an important or deserving case is languishing unnoticed, please let me know.
SOURCES
Bayesware Discoverer
http://www.bayesware.com/products/discoverer/
JMP: 50-megabyte demo copy from
http://www.jmp.com/jmpdemos.html
KYplot
http://www.kyenslab.com/en/
RCAxpress fishbone software
http://www.rcaxpress.com/
The Unscrambler
http://www.camo.com
Vista
http://forrest.psych.unc.edu/research/index.html
WinBUGS
http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml