So the last research meeting with CK, JK and BF was a very compelling discussion - about what hierarchical structures are, and whether students are really exhibiting a sense of hierarchical data or are merely coding efficiently. Well, for me the one-line summary of this discussion is "It doesn't matter!"
I have felt this since the very first few representations began trickling in from our pilot subjects - it doesn't matter whether they are row by column or not. It is the computer that makes the fully normalized, flat row by column table the "best" data organization for capturing all the information in the most flexible way. NH, in one of the earlier SERG meetings, mentioned that he is really tired of the tyranny of rectilinear data structures!
In a recent administration to a large group of students, of all the representations we received, only about 40% were flat tables. Yet with the exception of 1 student, everyone captured all the information they needed to construct a fully normalized structure. The sheer variety of structures on display was mind-boggling. Even after having thought about all the ways in which we can structure the data and what makes for an efficient organization, we continue to be surprised at how many different "exhaustive" organizations the students can create.
We went into this work with a hypothesis that students would have trouble creating structures for hierarchical data, and that the only really "correct" structures - as in exhaustive in terms of capturing the data and case-centric in terms of maintaining the correlation amongst the various attributes of a single object - would be fully normalized, flat row by column tables. We have instead seen so many different structures that are both exhaustive and case-centric that, as a software designer, the path is very clear to me.
Our software needs to provide a way for users to enter and organize data in many different ways, and to make transparent to the user the isomorphism between, for example, the three-way nested table, the partitioned plot space and the fully flat row by column table.
The three representations above all record all the information in the situation, and they maintain the correlation among the values of the different attributes for a single object (the case). Any of these representations allows us to answer multi-variable correlation questions like - Are westbound trucks faster than eastbound cars? In that sense, they are equivalent representations, and the user should be able to record data in one of these representations but be able to use the data in any of these forms, or in other forms that they may create.
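To make the equivalence concrete for myself, here is a minimal Python sketch of going from a flat table to a nested structure and back. It is not how our software does it, just an illustration. The attribute names (direction, vehicle, speed), the data values and the helper functions nest, flatten and canonical are all made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cases: attribute names and values are invented for illustration.
flat = [
    {"direction": "west", "vehicle": "truck", "speed": 62},
    {"direction": "west", "vehicle": "car", "speed": 58},
    {"direction": "east", "vehicle": "car", "speed": 55},
    {"direction": "east", "vehicle": "truck", "speed": 50},
    {"direction": "west", "vehicle": "truck", "speed": 66},
    {"direction": "east", "vehicle": "car", "speed": 61},
]


def nest(cases, keys):
    """Group flat cases into nested dicts, one nesting level per key."""
    if not keys:
        return [dict(c) for c in cases]
    first, rest = keys[0], keys[1:]
    groups = defaultdict(list)
    for case in cases:
        # Drop the grouping attribute; it is now carried by the nesting level.
        remaining = {k: v for k, v in case.items() if k != first}
        groups[case[first]].append(remaining)
    return {value: nest(members, rest) for value, members in groups.items()}


def flatten(nested, keys):
    """Invert nest: rebuild one flat record per case."""
    if not keys:
        return [dict(c) for c in nested]
    first, rest = keys[0], keys[1:]
    rows = []
    for value, sub in nested.items():
        for case in flatten(sub, rest):
            rows.append({first: value, **case})
    return rows


def canonical(cases):
    """Order-independent form, so two tables can be compared as sets of cases."""
    return sorted(tuple(sorted(c.items())) for c in cases)


keys = ["direction", "vehicle"]
nested = nest(flat, keys)
round_trip = flatten(nested, keys)

# Same cases survive the round trip (row order may differ).
same_cases = canonical(round_trip) == canonical(flat)

# The same question answered from either representation.
west_trucks_flat = mean(
    c["speed"] for c in flat
    if c["direction"] == "west" and c["vehicle"] == "truck"
)
west_trucks_nested = mean(c["speed"] for c in nested["west"]["truck"])
east_cars_nested = mean(c["speed"] for c in nested["east"]["car"])
```

The nested form keeps each case intact at the leaves, which is what keeps it case-centric. The round trip loses only the row order, and the westbound-truck versus eastbound-car question gets the same answer from either form.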
Now to make that happen in software...