Much of the essence of building a program is in fact the debugging of the specification.

— Fred Brooks, The Mythical Man-Month

Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.

— Martin Golding

A Testing Joke

A program manager, a software engineer, and a software tester were on their way to a meeting. They were driving down a steep mountain road when suddenly the brakes on their car failed. The car careened almost out of control down the road, bouncing off the crash barriers, until it miraculously ground to a halt scraping along the mountainside. The car's occupants, shaken but unhurt, now had a problem: they were stuck halfway down a mountain in a car with no brakes. What were they to do?

"I know," said the program manager, "Let's have a meeting, propose a Vision, formulate a Mission Statement, define some Goals, and by a process of Continuous Improvement find a solution to the Critical Problems, and we can be on our way."

"No, no," said the software engineer, "That will take far too long, and besides, that method has never worked before. I've got my Swiss Army knife with me, and in no time at all I can strip down the car's braking system, isolate the fault, fix it, and we can be on our way."

"Well," said the software tester, "Before we do anything, I think we should push the car back up the road and see if it happens again."

Levenshtein Distance

I spent the weekend working on implementing a few string comparison algorithms for mailcheck and decided to share one of the easiest ones to implement, the Levenshtein distance algorithm.

The Levenshtein distance is calculated as the fewest number of deletions, insertions, or substitutions required to transform one string into another (If you add the transposition operation, you get the Damerau–Levenshtein distance, and if you allow only substitution you get the Hammng distance.) Turns out implementation is a simple dynamic programming exercise because the Levenshtein distance can easily be calculated for various length substrings of each of the two strings being compared.

Take a look at the data structure containing the substring edit distance in action with strings of your choosing here.

The javascript implementation I came up with is as follows for your your perusing pleasure:

My first GitHub pull request went through this week.

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

— Brian W. Kernighan

There are only three hard things in Computer Science: cache invalidation, naming things, and handling of the 29th of February.

DATE_NOT_FOUND - The Daily WTF

Web programming is the science of coming up with increasingly complicated ways of concatenating strings.

Greg Brockman

ENDING SCRIPT - I BLEW UP AND FAILED

— Another great Microsoft easter egg. I mean, if you're going to fail, do so with style...

Buying RAM is for people who don't know how to write algorithms.
Algorithms are for people who don't know how to buy RAM.
System.FormatException: The string 'True' is not a valid Boolean value.

— C#

I say it is, C#. I call your bluff.

There are two types of people who want to make games. People who don't know what they're getting into, and people who are jumping on the hot thing the other people are jumping onto. The people who actually do make great games are rounding error.

Mike Lee

Totally not true. Java's far superior to Perl. And I think everyone who knows Perl views believes themselves as superior to everyone.

(via fuckyeahcomputerscience)

E-Level Commit Messages

Ever wanted to see what it's like to be a programmer? Our final group project for EN.600.321, Object Oriented Software Engineering, last fall had over 355 commit messages, a collection of which are given here:

On Nov 10, 2009, at 6:10 PM, Pablo wrote:

See if you can break it :)
Which time?

Begin forwarded message:
PL: Fixed Parker crashing
I crash all the time.

Begin forwarded message:
PL: EditCard saves preferences, fixed lil bug with title. The default title is title, and that was being saved as the title, even though it's not the title.
Nope, the title is not title.

Begin forwarded message:
HPS: Committing before Pablo messes any of my other functionality up.
Sounds about right.

Begin forwarded message:
Sizing bug? U mean how it's smaller when the app first launches select wut shows in cardareagui?
Pablo's typing with things stuck in his teeth again.

Begin forwarded message:
HPS: Added an "unselectAll" feature to remove selection artifacting. Doesn't work with TextBoxes 'cause class hierarchy is ridiculous? Text boxes aren't selectable? Yet no exception is thrown?
Yes?

Begin forwarded message:
JY: changes to pixmap/rectangle to more gracefully handle resizing. Only issue is JVM crashes when we push it on the stack.
Small issue.

Begin forwarded message:
HPS: Fixed NullPointerException with Signals in DragDropTextItem. Keep it clean, dudes.
Keep it clean, keep it classy.

Begin forwarded message:
Edit: This is only half a commit. WTF, svn?
Yes, indeed.

Begin forwarded message:
Filed in Pivotal:
...
Bug: saving doesn't actually work
No, that's a "feature".

Begin forwarded message:
JY: shift resizing of pixmaps. Also, any thoughts to having a right click menu for decks? ie, right click to close, test, or edit cards? I know u mac users don't right click often, but it is what all the cool kids are doing on windows.
Not true. We play Portal now, even.

Begin forwarded message:
Doesn't crash for me :S

Pablo
On Dec 13, 2009, at 3:31 PM, Parker wrote:
Known bug #1: Apparently, using the built-in Apple swatches in the 3rd pane of the QColorPicker on a Mac causes the JVM to crash. All I can say is, "Please don't do that".
You would, Pablo. Please don't do that.

Begin forwarded message:
HPS: Added exactly one lie to fix the bug with saving.
I think I meant "line", but there are often lies in my code...

Begin forwarded message:
PL: Fixed a bug... don't remember what it was
Neither do I. (It was 3 AM, though.)

Begin forwarded message:
HPS: Actually writing out z-values. Teeheehee.
JY: Fixed zheight issue of text edits - good catch parker
You preemptively fixed a bug. Nice job.

Begin forwarded message:
GA - fixed some behavioural problems during testing.
Your's or the program's?

Begin forwarded message:
GA - more behaviour modification
Definitely your's.

Google RE2

Regular expressions (regex) provide a means for matching text, whether characters, words, or patterns. They can be infinitely expressive and yet typically compact. An example is "[hc]at", which would match "hat" and "cat".

The implementation of regular expressions in almost all programming languages today allow for backreferences. Backreferences allow you to reuse part of the regex match later in the regular expression. For example "<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>" would match any opening and closing html tag and the text in-between. The "\1" is used to reference the part of the regex in parenthesis, in this case, the type of tag we're trying to match. Thus, we can ensure that the we get the same type of beginning and ending tags.

Backtracking sounds really cool, but we run into a problem. Regular expressions containing backreferences have a potential for exponential run time and unbounded stack (memory) usage. Unbounded stack usage in typical regular expression implementations leads to stack overflows and server crashes and all sorts of terrible things. That is to say, under certain circumstances, evaluating a regular expression over a large enough input could hang forever, use up all memory allocated to it, or both. This is a very important problem in Google's case.

Enter bright computer scientists. Basing their approach on automata and computational theory (I took that class!), Google researchers have created RE2, a "principled approach to regular expressions". It claims the implementation guarantees that searches complete in linear time with respect to the size of the input and won't overflow the stack space. They have limited the worst case runtime by removing support for backexpressions, however.

In his paper "Regular Expression Matching Can Be Simple And Fast", Russ Cox, the Google engineer who also authored the press release, writes, "Given a choice between an implementation with a predictable, consistent, fast running time on all inputs or one that usually runs quickly but can take years of CPU time (or more) on some inputs, the decision should be easy." For Google, yes. For those who like fancy syntactic sugar, it's still great to have another option.

Sources and Further Reading:
Google RE2 Press Release
Wikipedia
The Command Line
regular-expressions.info
Code Repository: RE2 on Google Code

I think I might be more excited about the fact that he's drawing on a wall than the fact that he's doing physics.

For Those Who Know C++

string code = "";
code.append(0);
very much does not equal
string code = "";
code.append("0");

That was 20 minutes of my life.

Telling an inspiring story about a beautiful design feels disingenuous. Yes, we all strive for beautiful code. But that is not what a talented young programmer needs to hear. I wish someone had instead warned me that programming is a desperate losing battle against the unconquerable complexity of code, and the treachery of requirements.

— Jonathan Edwards, Beautiful Code