Moving Code Tests Forward, And Looking Back To The Study

Nov 08, 2012

Code Tests has made huge strides since the last update. While the primary goal is to beef up technically, when my brain has been getting too caught up with the techy stuff, I've been taking breaks by implementing some early ideas for UI improvements. I've broken down the updates below:

Technical Progress

If I had to summarize, I could say that the technical journey of Code Tests since the last update has been my realization that instead of 7 crawlers, only 5 are really needed, and then, upon further investigation, discovering the need for two completely new crawlers, going back from needing 5 to needing 7! But a little more detail:

I have implemented the SpreadExpressionStatementCrawler. This crawler can be used to compare different ExpressionStatements anywhere in the code. For example, a rule that I pulled from Michelle's mentor study to test out this crawler looks for code where an object calls "setOpacity" with parameter 0, and then that same object calls other methods later in the code (I made a super-fun world for this with magical disappearing birds). The SpreadExpressionStatementCrawler will then return a NodeHolder with the setOpacity call and all the subsequent calls.
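
The logic of the "setOpacity" rule can be sketched roughly as follows. This is only an illustration under simplifying assumptions: `Call` and `SpreadRuleSketch` are hypothetical stand-ins, not the actual Code Tests classes, a program is treated as a flat list of method calls rather than an AST, and the returned list plays the role of the NodeHolder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an ExpressionStatement: one method call on one object.
final class Call {
    final String target;   // object the method is invoked on
    final String method;   // name of the invoked method
    final double argument; // single numeric argument (simplification)

    Call(String target, String method, double argument) {
        this.target = target;
        this.method = method;
        this.argument = argument;
    }
}

final class SpreadRuleSketch {
    // Finds the first setOpacity(0) call that is followed by more calls on the
    // same object, and returns that call plus all of the later calls.
    // Returns an empty list when no object is used after disappearing.
    static List<Call> callsAfterDisappearing(List<Call> program) {
        for (int i = 0; i < program.size(); i++) {
            Call first = program.get(i);
            if (!first.method.equals("setOpacity") || first.argument != 0.0) {
                continue;
            }
            List<Call> group = new ArrayList<>();
            group.add(first);
            for (Call later : program.subList(i + 1, program.size())) {
                if (later.target.equals(first.target)) {
                    group.add(later);
                }
            }
            if (group.size() > 1) {
                return group;
            }
        }
        return new ArrayList<>();
    }
}
```

The point of the sketch is that the statements being compared do not need to be adjacent: unrelated calls on other objects can sit between the setOpacity call and the later calls on the same object.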

Initially, I had planned for tests involving comparing multiple chunks of identical code to be just one case for the SpreadExpressionStatementCrawler, but after working with it, I feel like it should actually be its own separate crawler. Perhaps if only to respect the return types of isValidConstruct: for SpreadExpressionStatementCrawler, isValidConstruct should return a set containing one group of statements passing the rule, which gets represented by a NodeHolder. However, these tests involving multiple chunks really should return multiple groups of identical (or similar) statements, which really should get represented by multiple NodeHolders.

Early after the last update, I began work on the two other crawler types that I had initially identified from my notes on Michelle's mentor study: the UserMethodCrawler and ConstructBodyCrawler. These crawlers were envisioned to actually invoke another one of the crawler types, but only crawl within and return a UserMethod or specified construct, respectively. However, in the process of implementing these, this pattern didn't seem right. There were several technical obstacles to getting this to work, as well as interface issues and complications.

My solution was to eliminate these two crawlers entirely. Instead, the crawlers that originally had been envisioned as the "secondary crawlers" for the UserMethodCrawler and ConstructBodyCrawler (the three ExpressionStatement crawler types) now have additional customization options to determine whether the return type should be "Default," "Method," or "Construct." Although technically a little tricky to implement, this ended up preventing the barriers I kept running into with the mechanics and interface of having "secondary crawlers."

So, for example, let's say a user has a rule looking for any time the setVehicle method is called by an object. With a "Default" return type, the test would return every instance of the setVehicle method that occurs in the program. With the "Method" return type, the test will return every method that has a setVehicle method within it. With a "Construct" return type and "ForEachInArray" selected, the test will return every ForEachInArray loop that contains a setVehicle method.
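
A minimal sketch of how the three return types differ, assuming each match already knows its enclosing method and its enclosing construct. `ReturnType`, `Match`, and `ReturnTypeSketch` are hypothetical names used for illustration only and do not come from the Code Tests source.

```java
import java.util.ArrayList;
import java.util.List;

// The three "Extended Return Types" described above.
enum ReturnType { DEFAULT, METHOD, CONSTRUCT }

// Hypothetical record of where one matching call was found.
final class Match {
    final String statement;          // the matching call itself
    final String enclosingMethod;    // user method that contains the call
    final String enclosingConstruct; // enclosing construct of the selected kind, or null if none

    Match(String statement, String enclosingMethod, String enclosingConstruct) {
        this.statement = statement;
        this.enclosingMethod = enclosingMethod;
        this.enclosingConstruct = enclosingConstruct;
    }
}

final class ReturnTypeSketch {
    // Maps raw matches to what the test reports for the chosen return type.
    static List<String> report(List<Match> matches, ReturnType type) {
        List<String> results = new ArrayList<>();
        for (Match m : matches) {
            String item;
            switch (type) {
                case METHOD:
                    item = m.enclosingMethod;
                    break;
                case CONSTRUCT:
                    item = m.enclosingConstruct;
                    break;
                default:
                    item = m.statement;
                    break;
            }
            // DEFAULT reports every instance; the other types report each
            // enclosing method or construct only once.
            boolean alreadyReported = type != ReturnType.DEFAULT && results.contains(item);
            if (item != null && !alreadyReported) {
                results.add(item);
            }
        }
        return results;
    }
}
```

With two setVehicle calls inside the same method, this sketch reports two results for DEFAULT but only one for METHOD, which mirrors the distinction in the example above.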

Currently, these "Extended Return Types" are working successfully except for Construct returns on SpreadExpressionStatementCrawler, which is still being tweaked.

After getting to a point where the majority of the crawlers have been implemented and are in a functional state, I decided to return to the mentor study notes I made last month. In these notes, I had initially taken the rules that mentors wrote for Michelle's study and determined how I would crawl them in a general sense. Then I used the patterns that emerged to design the different crawler classes I have been implementing.

This past week I sorted all of this data by the crawler I eventually decided on for each rule, and made sure that, in the manner that they have been implemented, the crawlers are still capable of coding for these rules. The good news is that, with the exception of a few question marks that I'm just going to need to actually write the code for to see if they work, my quick analysis looked positive.

The one thing I did notice was that out of the 50 rules that I extracted for this data, there was one rule that I now realize is going to need its own crawler type: a SpreadExpressionCrawler (just like a SpreadExpressionStatementCrawler but for Expressions instead of ExpressionStatements). Although there was only 1 rule written for the study that made use of this type of crawler, I can picture potential rules that might also make use of this crawler, so I have decided it should be implemented as well.

What's an example of a rule that might make use of a SpreadExpressionCrawler? How about a world where a Number variable is only ever assigned two distinct values, but those values are assigned to it at multiple points throughout the program--maybe this student needs to be introduced to Booleans instead!
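
The check behind such a rule might look something like the sketch below. It is only an assumption about how the rule could be expressed: `Assignment` and `BooleanCandidateSketch` are hypothetical names, assignments are reduced to a variable name and a constant value, and "at multiple points" is interpreted here as more than two assignments.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an assignment expression on a Number variable.
final class Assignment {
    final String variable;
    final double value;

    Assignment(String variable, double value) {
        this.variable = variable;
        this.value = value;
    }
}

final class BooleanCandidateSketch {
    // Returns the variables that only ever receive two distinct values
    // but are assigned more than twice across the whole program.
    static List<String> variablesThatLookBoolean(List<Assignment> assignments) {
        Map<String, Set<Double>> distinctValues = new LinkedHashMap<>();
        Map<String, Integer> assignmentCounts = new HashMap<>();
        for (Assignment a : assignments) {
            distinctValues.computeIfAbsent(a.variable, k -> new HashSet<>()).add(a.value);
            assignmentCounts.merge(a.variable, 1, Integer::sum);
        }
        List<String> flagged = new ArrayList<>();
        for (Map.Entry<String, Set<Double>> entry : distinctValues.entrySet()) {
            boolean twoValues = entry.getValue().size() == 2;
            boolean assignedOften = assignmentCounts.get(entry.getKey()) > 2;
            if (twoValues && assignedOften) {
                flagged.add(entry.getKey());
            }
        }
        return flagged;
    }
}
```

Like the statement-level version, this has to gather expressions from anywhere in the program before it can decide whether the rule applies, which is what makes it a "spread" crawl.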

(Now you can see how I'm back to 7 crawler types!)

During my last demo (of the ConsecutiveExpressionStatementCrawler), my prepared test was pushed a little further beyond its limits and it appeared that the crawler was not returning multiple instances of repeated statements. Turns out there were a couple of small bugs in that crawler, but those are all smoothed out now, and the ConsecutiveExpressionStatementCrawler appears to be running like clockwork (although I will not be surprised if more bugs appear with continued testing).

I noticed this week that the ConsecutiveExpressionStatementCrawler does not "respect" statements with bodies. That is to say, imagine a test looking for three identical method calls in a row. If calls A, B, and C are next to each other and identical, but A is the last statement of a DoTogether, B isn't encased by anything, and C is the first statement of an If-Else block, the crawler used to return ABC as passing the test. I was unsure whether the crawler "should" or "shouldn't" consider this as passing, so I turned the option over to users with a checkbox. It is initially set to "respect constructs" (so that ABC would NOT be considered passing), but this can be turned off if the user so desires.
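
The effect of the checkbox can be illustrated with a small sketch. It assumes the program has been flattened into statements in source order, each tagged with the body that directly contains it; `Stmt`, `ConsecutiveSketch`, and the `respectConstructs` flag are hypothetical names, not the real crawler's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical flattened statement: its text plus the body that directly contains it.
final class Stmt {
    final String text;
    final String parentBody;

    Stmt(String text, String parentBody) {
        this.text = text;
        this.parentBody = parentBody;
    }
}

final class ConsecutiveSketch {
    // Returns every run of at least runLength identical statements in a row.
    // When respectConstructs is true, a run may not cross from one body into another.
    static List<List<Stmt>> identicalRuns(List<Stmt> flattened, int runLength,
                                          boolean respectConstructs) {
        List<List<Stmt>> runs = new ArrayList<>();
        List<Stmt> current = new ArrayList<>();
        for (Stmt s : flattened) {
            if (!current.isEmpty()) {
                Stmt previous = current.get(current.size() - 1);
                boolean sameText = s.text.equals(previous.text);
                boolean sameBody = s.parentBody.equals(previous.parentBody);
                if (!sameText || (respectConstructs && !sameBody)) {
                    // The run ends here; keep it only if it is long enough.
                    if (current.size() >= runLength) {
                        runs.add(current);
                    }
                    current = new ArrayList<>();
                }
            }
            current.add(s);
        }
        if (current.size() >= runLength) {
            runs.add(current);
        }
        return runs;
    }
}
```

For the A, B, C case above (three identical calls sitting in a DoTogether, in no construct, and in an If-Else), this sketch finds no run with `respectConstructs` set to true and one run of three with it set to false.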

User Experience

Although it's far from being ready for prime-time, I took some breaks from the technical aspects of Code Tests to implement some user experience improvements I had been thinking of, using some of my awesome knowledge that I've gotten from HCI class.

Previously in code tests, after clicking the "Run Test" button, feedback was minimal. A test that returned no passing or failing constructs appeared identical to a test that exploded during execution. Now, next to the "Run Test" button is a string displaying the state of the test execution: "Ready," "Running test," "Done," or "Interrupted." This feedback has already led to much less frustration for myself as I'm writing tests, and I'm sure will also help clarify test execution for the user.

In addition, currently exceptions that occur during the execution of code tests output to the Java console. However, this is only available when running out of Eclipse, not in a stand-alone application. So I've modified the "output" box in the code tests interface to also display the thrown exceptions when they occur.
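
One way to wire the status string and the exception output together is sketched below. `TestRunSketch` is a hypothetical class, the `StringBuilder` stands in for the "output" box, and mapping a thrown exception to the "Interrupted" state is an assumption. The stack trace is captured with the standard `StringWriter` and `PrintWriter` classes instead of being left to go to the console.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

final class TestRunSketch {
    // The four status strings shown next to the "Run Test" button.
    enum Status {
        READY("Ready"), RUNNING("Running test"), DONE("Done"), INTERRUPTED("Interrupted");

        final String label;

        Status(String label) {
            this.label = label;
        }
    }

    Status status = Status.READY;
    // Stand-in for the "output" box in the interface.
    final StringBuilder outputBox = new StringBuilder();

    void run(Runnable test) {
        status = Status.RUNNING;
        try {
            test.run();
            status = Status.DONE;
        } catch (RuntimeException e) {
            // Capture the stack trace as text so it can be shown in the
            // interface rather than only in the Java console.
            StringWriter trace = new StringWriter();
            e.printStackTrace(new PrintWriter(trace, true));
            outputBox.append(trace);
            status = Status.INTERRUPTED;
        }
    }
}
```

With this arrangement, a test that returns nothing ends in "Done" with an empty output box, while a test that blows up ends in "Interrupted" with the trace visible, so the two cases no longer look the same.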

As I often think while writing (and reading) my personal blog entries, the names that I have chosen for the crawlers I've made are quite a mouthful. I chose them to be informative as to their relation to the AST, but chances are mentors aren't going to be thinking in terms of an AST (at least not initially anyway). What's more, they're probably not going to be thinking in terms of crawlers either.

To address this, I've changed the language used for crawler selection. Before, a label for "Crawler Type:" stood above a dropdown containing all the crawler names that I've been talking about for the past month: ConstructCrawler, ExpressionStatementCrawler, ConsecutiveExpressionStatementCrawler, etc. To the user, this doesn't mean very much.

Instead, I've translated the crawler type into the type of rule it is used for. The label now reads "This is a rule about..." and the dropdown completes the sentence: "...a control construct"; "...a single line of code."; "...consecutive lines of code." I think this is much nicer.

The Code Test interface now has four steps explicitly labelled and numbered to help guide users through the construction of a test:

1. This is a rule about... (As discussed above)
2. Customize (select the customization options for the selected crawler)
3. Write isValidMethod (see next bullet point below)
4. Run Test (indicating that the user is ready to click the "Run Test" button)

This seems like a good starting point for helping users make sense of the Code Tests interface, although I feel like it still leaves some questions in the user's mind.

As part of this walkthrough, the user is explicitly told what the input is and what the return type should be for isValidConstruct. Because different crawlers now have different signatures and return things differently, this is absolutely necessary for the user to understand how to write their tests.

Future Work

Finish the construct return type for SpreadESCrawler.
Implement the two new crawlers I have identified. Considering how closely related these are to SpreadExpressionStatementCrawler, they should be fairly straightforward to write.
Fix stencils. I've figured out why this currently isn't working--stencils are tied to a window, while code tests are now a perspective on the same window as the actual code. Not sure the best way to address this--perhaps make a modification of the stencil class that will respect perspectives? At any rate, I don't want to sink too much time into this until I have the other parts of code tests working, as I'm worried about it distracting from the other efforts.
Get local Save/Load working again. This was broken over the summer, but even still was only written to work with the old school ConstructCrawler anyway. An update is in order.
At some point, I feel like the sheer difficulty of writing code tests is going to need to be addressed. API functions that have been written are helpful, but they can only go so far. The "setOpacity" test that I discussed above probably took 25 minutes for me to write, and I'm the person who's most familiar with Code Tests. Is a mentor really going to have the patience to write that test once they think up the rule? Or am I just currently on the pioneering edge, and once mentors have previous examples to pattern their tests off of, things can be quicker for them? I'm not sure how soon this should be addressed, but it's been on my mind recently.

Comments

kyle said:

I'm really interested in trying to figure out solutions to help our mentors pick the right crawler. I'm also a little concerned how all these changes are going to affect running the code tests on the web server since all of these changes have completely broken that setup. But I think this is getting ahead of the problem. We first need to test with mentors to see what works.

Posted on Nov 09, 2012

caitlin said:

Please share the magical disappearing birds world! This all sounds fantastic and I'd like to chat about starting to put in place a plan to get some mentor-ish types to come back in and try some of these out.

On the stencils front, I suspect these are using an old way of doing this. Perhaps we can look into that as the tutorial and remix perspective have driven some updates. To fix it might just be a matter of moving over to a newer set of calls. And perhaps we can remove the old world order at the same time to prevent confusion for others.

The difficulty issue is definitely there. My hope was that the crawlers would make them more reasonable to write. And it may be the case that the pattern and adapt is a good way to go. Or we may start to see other possibilities. Perhaps we should get a few of us to spend some time writing some tests based on the first study answers and see whether that sparks ideas.

Posted on Nov 09, 2012

Aaron Zemach