This semester, I am putting together a 3D data visualization tool in Processing, to demonstrate how useful the language is for rapidly prototyping ideas for 3D graphics applications, including virtual reality. The code, documentation, and issue tracker are now hosted on GitHub here.
If you’ve ever taken a programming course or worked through an intermediate online tutorial in a language that encourages object-oriented programming (OOP), you have probably studied a motivating example (or two) meant to introduce the paradigm. You might have designed Dogs that subclass Animals, which emit ‘woof!’s; Cars with max_speeds; Persons with boolean genders, and so on…
But these are toy examples: underwhelming, and too straightforward to help in practice. Furthermore, they’re not all that interesting. Once you’ve completed the typical OOP introduction, you often end up with code that performs a function implemented elsewhere a hundred times over, with greater efficiency. That in itself is not that bad: we all need pedagogical examples simple enough to introduce in a 75-minute lecture. What’s worse is that you’ve written code that you simply don’t care about. And that’s a recipe for demoralization. For the beginning programmer (yours truly), OOP ends up as a convenient abstraction that bored you to death once or twice (and went ‘woof!’).
Nothing motivates design like the task of modeling a system with which you are otherwise quite familiar. Daniel Shiffman’s free online book, The Nature of Code, focuses on the simulation of natural (physical) systems and gently introduces OOP as a means of modeling “the real world” (rather than a toy example of a Car with an Engine). Of course, Shiffman’s models are simplistic too, relying on basic mechanics and vector math to animate their construction. Nevertheless, his examples and exercises leverage your best guesses about how the world works and challenge you to implement them in code, which is the nature of (many kinds of) programming: to take the big world and, in code, make a small world that reflects the large one, invariably imperfectly.
My advice, then, is to dive right in with a project that you care about, preferably one that requires many “moving parts” such as interdependent entities (nodes that talk to and consume others), user extensibility, and large amounts of object reuse: a model of a mini-universe of sorts. The model doesn’t have to be physical: it can be of social relationships, knowledge, data. All it has to do is matter.
In the remainder of this post I will sketch the design of the project I am currently working on.
Project design
The goal of the software is to generate 3D data visualizations from quantitative and qualitative data ingested from a CSV file. The display of the visualization should be separate from its construction, so that ultimately different display ‘engines’ can be swapped in and out to allow for the presentation of the visualization on, for example, the computer screen, a VR headset, a smartphone, or even in the form of a 3D-printed model.
As it stands, the engine is structured as depicted below. Incidentally, there is a well-documented and ‘popular’ modeling language for describing the relationships between objects, the Unified Modeling Language (UML), and there are tools that generate similar-looking diagrams from code, but taking it on requires its own post. So, the picture below is a rough approximation of the design of the engine rather than a reproducible blueprint (such as that provided by UML and the like).
There is exactly one Scene in the application, which contains a list of PrimitiveGroups, which themselves contain Primitives. A Primitive corresponds to a single data point: one row of the CSV input. A Primitive has a location in 3D space, as well as a velocity, which allows for the animated restructuring of the Primitives on the fly. Some simple primitives are included: a sphere (PrimitiveSphere) and a cube (PrimitiveCube). New Primitives must subclass the Primitive class (which should never be instantiated: it is an abstract class). Primitives must have display() and update() methods. The display() method contains the calls to Processing’s draw functions (e.g. box()). At this point, you realize that the display()/update() contract of a Primitive could (and perhaps should) be expressed as a Java interface, although shared state such as location and velocity would still need to live in a class. After all, Processing is Java at base. The Scene also contains an Axis object which can be switched on or off.
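To make the containment structure concrete, here is a minimal sketch in plain Java. It is not the project's actual code: the Renderer interface is a hypothetical stand-in for Processing's drawing functions (so the sketch runs without Processing), the field and method names beyond display() and update() are assumptions, and the Axis object is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the drawing calls; in a real Processing sketch
// display() would call box(), sphere(), translate(), etc. directly.
interface Renderer {
    void drawBox(float x, float y, float z, float size);
    void drawSphere(float x, float y, float z, float radius);
}

// Abstract base class: holds the shared state (location and velocity)
// and leaves the actual drawing to subclasses.
abstract class Primitive {
    protected float x, y, z;     // location in 3D space
    protected float vx, vy, vz;  // velocity, applied once per update()
    protected float size = 1f;   // assumed visual property

    Primitive(float x, float y, float z) {
        this.x = x; this.y = y; this.z = z;
    }

    void setVelocity(float vx, float vy, float vz) {
        this.vx = vx; this.vy = vy; this.vz = vz;
    }

    // Advance the location by one animation step.
    void update() {
        x += vx; y += vy; z += vz;
    }

    abstract void display(Renderer r);
}

class PrimitiveCube extends Primitive {
    PrimitiveCube(float x, float y, float z) { super(x, y, z); }
    @Override void display(Renderer r) { r.drawBox(x, y, z, size); }
}

class PrimitiveSphere extends Primitive {
    PrimitiveSphere(float x, float y, float z) { super(x, y, z); }
    @Override void display(Renderer r) { r.drawSphere(x, y, z, size / 2f); }
}

// A group simply forwards update() and display() to its members.
class PrimitiveGroup {
    private final List<Primitive> primitives = new ArrayList<>();
    void add(Primitive p) { primitives.add(p); }
    int size() { return primitives.size(); }
    void update() { for (Primitive p : primitives) p.update(); }
    void display(Renderer r) { for (Primitive p : primitives) p.display(r); }
}

// The scene forwards to its groups in turn.
class Scene {
    private final List<PrimitiveGroup> groups = new ArrayList<>();
    void add(PrimitiveGroup g) { groups.add(g); }
    void update() { for (PrimitiveGroup g : groups) g.update(); }
    void display(Renderer r) { for (PrimitiveGroup g : groups) g.display(r); }
}
```

Passing a Renderer into display() is one possible way to keep construction and display apart: a different implementation of the same interface could target the screen, a headset, or an exporter without touching the Primitives.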
How does the engine generate the Primitives from the contents of the data file? And according to what rules? In many ways, this is the heart of any visualization engine. This is where the concept of a DataBinding comes in.
A DataBinding realizes a one-way mapping from the columns of a data source (i.e. kinds of data) to the properties of a Primitive, by returning a PrimitiveGroup which contains one Primitive for every row in the data source (read by the DataHandler, which is a very thin wrapper around Processing’s Table object).
The mapping is specified by the contents of a DataBindingSchema, which is a hashmap (read in from a YAML file, see examples 1 and 2) in which the keys are the properties of Primitives and the values are column names in the data source. As a consequence, the DataBindingSchema specifies how the visual properties of Primitives respond to the data stored in the CSV file that is being read in. The DataBinding also has a validation method which throws a custom exception when the DataBindingSchema refers to column names and/or primitive properties which do not exist. Eventually it will also do type checking.
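As a rough illustration of the binding and its validation, here is a sketch in plain Java. Several things in it are assumptions rather than the project's real code: the schema is built directly as a Map instead of being parsed from YAML, rows are plain maps from column name to number instead of rows of a Processing Table, BoundPrimitive is a stripped-down stand-in for a Primitive, and the property names ("x", "y", "z", "size") and the exception name DataBindingException are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical name; the design only calls for "a custom exception".
class DataBindingException extends RuntimeException {
    DataBindingException(String message) { super(message); }
}

// Stripped-down stand-in for a Primitive: only the bindable properties.
class BoundPrimitive {
    float x, y, z, size = 1f;

    void set(String property, float value) {
        switch (property) {
            case "x": x = value; break;
            case "y": y = value; break;
            case "z": z = value; break;
            case "size": size = value; break;
            default:
                throw new DataBindingException("Unknown primitive property: " + property);
        }
    }
}

class DataBinding {
    // Assumed set of properties that a schema may bind to.
    static final Set<String> PROPERTIES =
        new HashSet<>(Arrays.asList("x", "y", "z", "size"));

    // Schema: primitive property (key) -> column name (value).
    private final Map<String, String> schema;

    DataBinding(Map<String, String> schema) { this.schema = schema; }

    // Throws if the schema names a property or a column that does not exist.
    void validate(Set<String> columns) {
        for (Map.Entry<String, String> entry : schema.entrySet()) {
            if (!PROPERTIES.contains(entry.getKey())) {
                throw new DataBindingException(
                    "Unknown primitive property: " + entry.getKey());
            }
            if (!columns.contains(entry.getValue())) {
                throw new DataBindingException(
                    "Unknown column: " + entry.getValue());
            }
        }
    }

    // Builds one primitive per row. Call validate() first: a column that is
    // missing from a row would otherwise surface as a NullPointerException.
    List<BoundPrimitive> bind(List<Map<String, Float>> rows) {
        List<BoundPrimitive> group = new ArrayList<>();
        for (Map<String, Float> row : rows) {
            BoundPrimitive p = new BoundPrimitive();
            for (Map.Entry<String, String> entry : schema.entrySet()) {
                p.set(entry.getKey(), row.get(entry.getValue()));
            }
            group.add(p);
        }
        return group;
    }
}
```

With a schema such as {x: height, size: weight}, each row's height column would drive the x position and its weight column the size. Validating before binding means a typo in the schema file is reported as a schema problem rather than as a failure somewhere in the middle of drawing.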