John Ratke's Software Development Blog

Sunday, March 27, 2011

A Few Google Tech Talks

I was browsing recently and I noticed a couple of new Google Tech Talks.

First, Miško Hevery's talk "How to Write Clean, Testable Code", given at Google NYC. Another presentation of it (with slides) appears to be up at his blog.

Here are a few things from the talk and Q&A session afterward that I found interesting:

0) Miško says something like "people think it's cool when I'm giving a demo and I press the save shortcut and the tests run; but getting the setup right can take hours". You want to make it look like magic. You want to make it look easy, even though getting to that point often isn't. This is where the discipline comes in. You'll also notice that binding a key to save and run the tests is the same thing that Gary Bernhardt does in a recent Destroy All Software screencast. That's a great technique that I'd like to get working soon in my own development setup.

1) Miško suggests getting started in open source. No schedule pressures and the worst they can do is reject your patch.

2) During the Q&A part, a nice guy with a British accent mentioned how he got started. Specifically he talked about how almost every developer has the experience of testing a program out by writing a "print" statement in the code and then executing it and checking the output value. This guy's mindset was to write automated tests instead, and I thought that was an interesting insight. Granted, print statements are still a fast way to get an idea of when and how often a piece of code is executing while running the complete program, which can be useful when learning a new code base.

3) Another guy said he was starting a brief project in Python using Django and asked if anyone had any advice about how to get started with unit testing. Unfortunately, no one offered him any good suggestions.
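To make point 2) a bit more concrete, here is a minimal sketch of the "write a test instead of a print statement" mindset, using Python's standard unittest module. The function apply_discount is a made-up example of mine, not something from the talk.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical function under test: reduce price by percent."""
    return price * (100 - percent) / 100


# The "print statement" approach: run the program and eyeball the output.
# print(apply_discount(200, 25))

# The automated approach: write the expectation down once, so the
# computer can check it on every run.
class ApplyDiscountTest(unittest.TestCase):
    def test_quarter_off(self):
        self.assertEqual(apply_discount(200, 25), 150)

    def test_zero_percent_leaves_price_unchanged(self):
        self.assertEqual(apply_discount(80, 0), 80)
```

Saved to a file, this can be run with "python -m unittest" followed by the file name. The check you would have done by eye after a print now gets repeated for free every time the tests run.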

Here are some older talks by Miško that are also great, in case you missed them:

Another interesting talk I watched was Tools for Continuous Integration at Google, which was presented prior to the Google Test Automation Conference 2010. This talk was about Google's distributed build system and the lengths they've gone to to make builds fast. This ties into my previous post about the importance of tools.

Saturday, February 26, 2011

Icy Projectile Challenge Results

The Icy Projectile Challenge programming competition is finished!

Here you can see the complete tournament results:
http://queue.acm.org/challenge/2011/tournament/tournament.html

You can click on each dot on the tournament tree to launch a Java applet to watch that particular match play out.

I made it to the "Elite Eight" ;-)

One advanced technique that I wish I had come up with was throwing a snowball at a competitor from a crouched position. Repeatedly doing that allowed people to keep their opponents immobilized (potentially forever, if the opponent was pinned down and couldn't be pushed away by the snowball's impact force, and the thrower wasn't immobilized himself).

For posterity's sake, I've decided to share the development of my player here: https://github.com/jratke/ICPC
The majority of my player development took place in the file python_example/player.py

I started out basing the code on the Python hunter.py and gradually added capabilities. My final basic strategy, as you can probably tell from the matches, was to have three players go out and throw snowballs at the other team, while one player would plant snowmen at four points chosen so that my snowman domain covered much of the field. The "planter" player also tries to convert any of the other team's snowmen to my team if he sees them on his way to the next point. Once he successfully completes the fourth snowman, he can keep going around the points to try to convert more snowmen, or (if I haven't seen any snowmen of the opposite team at all) he can join in and throw snowballs like the other three.
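The planter's decision logic described above can be sketched roughly like this. To be clear, this is a simplified illustration and not the code from player.py: the Planter class, its method names, the action tuples, and the PLANT_POINTS coordinates are all made up for the example.

```python
# Hypothetical planting spots; the real ones depend on the field layout.
PLANT_POINTS = [(8, 8), (8, 22), (22, 22), (22, 8)]


class Planter(object):
    """Simplified sketch of the planter's priorities (names are invented)."""

    def __init__(self):
        self.planted = 0                 # snowmen completed so far
        self.patrol_index = 0            # next point to visit when patrolling
        self.seen_enemy_snowman = False  # ever spotted one of theirs?

    def next_action(self, enemy_snowman_in_view=None):
        """Return a (verb, target) pair describing what to do this turn."""
        # Converting an opposing snowman takes priority over everything.
        if enemy_snowman_in_view is not None:
            self.seen_enemy_snowman = True
            return ("convert", enemy_snowman_in_view)
        # Otherwise keep working through the four planting points.
        if self.planted < len(PLANT_POINTS):
            return ("plant", PLANT_POINTS[self.planted])
        # All four are done: patrol only if the other team builds snowmen.
        if self.seen_enemy_snowman:
            return ("patrol", PLANT_POINTS[self.patrol_index])
        # No opposing snowmen seen at all: join the snowball throwers.
        return ("throw", None)

    def snowman_completed(self):
        self.planted += 1

    def patrol_point_reached(self):
        self.patrol_index = (self.patrol_index + 1) % len(PLANT_POINTS)
```

The game loop would call next_action once per turn and report back through snowman_completed and patrol_point_reached. Keeping the priorities in one method makes it easy to see, and to reorder, what the planter cares about most.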

Congratulations to Kamran for winning and congratulations to the other finalists. And thanks again to the organizers.

Tuesday, February 8, 2011

Icy Projectile Challenge

Queue magazine, the magazine of the Association for Computing Machinery (ACM), is holding a programming competition called The Icy Projectile Challenge. The competition is based on the Challenge problem from ACM's 2010 International Collegiate Programming Contest. The contest is to write a program to control a team of virtual children in a snowball fight. One nice thing about it is that you can write your program in C++, C#, Java, or Python.

I've entered the contest, and I'm writing my program in Python. So far it is over 1000 lines.

You can see the preliminary standings here.

I wish I had entered sooner! I hope there is enough time for me to improve my player even more. I'm looking forward to the final tournament! Their web site says that the challenge is run by David Sturgill at the University of North Carolina at Greensboro with help from Matt Slaybaugh at ACM. So thanks to them for putting this on. It's a great idea, great fun, and I hope they do something like it next year as well!

Saturday, January 22, 2011

The Importance of Code Browsing and Other Tools

Recently, I watched this video of Steve Yegge giving a talk at the Northwest C++ Users' Group.

The title of his talk was "Open Scalable Language Toolchains", and in it he describes the project that he works on at Google.

Here is a copy of the abstract, and I'd like to highlight a few key points:

Modern IDEs and compilers generate a wealth of information, and you can't have any of it. Tools in the compiler family -- even the best IDEs -- tend to be monolithic, language-specific, generally non-scalable special-purpose applications. Even when they do support headless analysis, none of them do it the same way, and very few of them can do cross-language analysis. At Google I've put together a team with the long-term goal of addressing these problems in a general way. We've built infrastructure to run IDE-quality code analyzers such as Eclipse and clang over Google's entire corpus and all open-source code. We translate the intermediate representations into a language-neutral index, then serve the index data back through language-neutral APIs and query interfaces.

Steve says in the beginning of the talk that his project is for dealing with gigantic code bases, say 50 million lines or more.

A lot of gigantic systems involve multiple languages. Consider Android, which is a mix of C++ and Java. Note that in one slide he says:
-Consistency also enables cross-language analysis
-Analyze across RPC calls, embedded languages.

"Sort of the holy grail, but we'll get there", he says.

Yes, I believe they will, if they are not there already, and it will allow their software engineers to easily navigate across the boundaries between different languages.

The code indexing happens nightly on their distributed servers. Once generated, the index can be accessed from various different types of IDE clients. And the project is about more than just code browsing. It also enables static analysis. It sounds like they've already built up a pretty neat static analysis query tool. So it's a pretty amazing and ambitious project, all in all.

I think it shows the emphasis that Google places on having quality tools. Facebook engineering manager Yishan Wong also placed a high priority on tools; the top priority, in fact. Granted, his post is written in the context of growing a small engineering team into a medium-sized one, but I believe that tools should still be a very important priority, even in a large organization. Yishan's other posts on engineering management are interesting as well.

The Importance of Code Browsing Tools for Software Development

A good code browsing tool will help you whether your code is well written or not.

If you've got a huge mass of poorly written code with a lot of duplication and a lot of extra coupling, at least a good code browsing tool will help you navigate around more quickly, which should assist you (a little bit) in understanding the complexities of the code.

If your code is well written, then that means it is well factored. Part of having well-factored code is reducing duplication. It's the old DRY (Don't Repeat Yourself) principle from the Pragmatic Programmers.

Recently, Jake Scruggs tweeted this:

Reducing duplication increases coupling which isn't always a good trade. #rubyconf

And that is true. Suppose that I have some duplicated code in a few classes. If I factor it out, I now have references (function calls) to the newly-unique code in each of those classes. Suppose that a parameter to that function now has to change. The coupling shows up because the unique function needs to change and all of the calling functions need to change. But consider what would have happened if I hadn't factored out that code. The code would have needed to change in each of the places where it was duplicated, which, on a larger scale, can be both tedious and error prone.
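Here is a small sketch of that trade-off in Python. The class and function names (Invoice, Receipt, format_money) are made up purely for illustration.

```python
# Before: the same formatting logic is copied into two classes.
class InvoiceBefore(object):
    def __init__(self, total):
        self.total = total

    def label(self):
        return "$%.2f" % self.total


class ReceiptBefore(object):
    def __init__(self, paid):
        self.paid = paid

    def label(self):
        return "$%.2f" % self.paid


# After: the logic lives in one place, and both classes are now
# coupled to format_money through a function call.
def format_money(amount):
    return "$%.2f" % amount


class Invoice(object):
    def __init__(self, total):
        self.total = total

    def label(self):
        return format_money(self.total)


class Receipt(object):
    def __init__(self, paid):
        self.paid = paid

    def label(self):
        return format_money(self.paid)


print(Invoice(12.5).label())  # $12.50
print(Receipt(3).label())     # $3.00
```

If format_money later needs, say, a required currency argument, both call sites have to change along with it. That is the coupling. In the "before" version there is no shared function to break, but the formatting string itself has to be found and edited in every copy.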

So, if your code follows the DRY principle, you have more coupling. You don't have duplication, which is good for maintainability. But you've broken the code down into lots of small pieces; pieces that are coupled together. Learning and understanding a code base like that can be initially challenging as well, since you need to figure out how all the little pieces work together. It's not as challenging as figuring out a poorly written code base, but it is still challenging. In order to figure it out, you need to navigate more and understand the connections and dependencies, and that's when having a good code browser really helps. So I can understand why this tool is so important to Google.

Thursday, December 16, 2010

Learning Python

One thing I'm working on right now is learning Python.

Why Python? Well, a few reasons, really. It is a "clear and powerful object-oriented programming language". Also from the Python page, "Data types are strongly and dynamically typed. Mixing incompatible types (e.g. attempting to add a string and a number) causes an exception to be raised, so errors are caught sooner."
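A quick way to see the "strongly typed" part for yourself is the short sketch below. The helper add_or_explain is just something I made up so the example keeps running after the error.

```python
def add_or_explain(a, b):
    """Try a + b and report a TypeError instead of crashing."""
    try:
        return a + b
    except TypeError as error:
        return "TypeError: %s" % error


print(add_or_explain(1, 2))          # two numbers: 3
print(add_or_explain("1", "2"))      # two strings are concatenated: 12
print(add_or_explain("1", 2))        # a TypeError message (wording varies by version)
print(add_or_explain(int("1"), 2))   # explicit conversion first: 3
```

Python will not quietly guess whether "1" + 2 should be 3 or "12". You have to say which one you mean by converting explicitly.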

It can be used for all sorts of things like web development, database access, graphics, game development, or even to control two huge robot grappling arms. I plan to use it to interface with Twitter, for one thing.

I had done a bit of Ruby programming a while back, and there's nothing wrong with it, but I think I prefer the overall Python philosophy. For a good comparison of Python and Ruby, see Gary Bernhardt's presentation: Python vs. Ruby: A Battle to The Death. One thing Gary laments, though, is the difference in testing community and philosophy: he notes that the Ruby testing community is more developed, or advanced, shall we say. Ruby's extreme flexibility makes it easy to use for some pretty innovative testing approaches. But I'm okay with these differences, for now at least.

There are many ways to learn Python, but I'm doing it mostly by using Greg Malcolm's Python Koans. The idea is to learn the language by getting a series of failing test cases to pass by filling in the missing details in the test assertions. It is essentially using Test Driven Development to learn a programming language, which is a pretty great idea. Instead of just randomly playing around, the Koans provide you with a guided series of steps needed to learn. You know exactly how many tests you have fixed, and how many you have remaining to be fixed. It's great to see the progress you make as you fix the test cases and learn more things as a side effect. And the tests don't limit you because you can easily add more tests to play around and explore any area you like.
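To give a flavor of the idea, here is a koan-style exercise written with the standard unittest module. This is my own sketch in the spirit of the Koans, not a file copied from the project, and the class and test names are invented.

```python
import unittest


class AboutLists(unittest.TestCase):
    def test_negative_index_counts_from_the_end(self):
        letters = ["a", "b", "c"]
        # A koan starts out with a blank where the answer goes, e.g.
        #     self.assertEqual(__, letters[-1])
        # and fails until you work out the answer and fill it in:
        self.assertEqual("c", letters[-1])

    def test_slicing_returns_the_rest_of_the_list(self):
        letters = ["a", "b", "c"]
        # Originally: self.assertEqual(__, letters[1:])
        self.assertEqual(["b", "c"], letters[1:])
```

Each blank you fill in is a small prediction about how the language behaves, and the test runner tells you right away whether the prediction was right.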

By the way, this is a pretty cool visualization of the commit history of the Python project. It kind of shows how software is like a living, growing thing.

So, anyway, I'll let you know once I'm finished with the Koans, and if/when I create anything cool using Python.