In a company of 10,000, stuff like that happens with clockwork regularity; your security team is pitted against the sum of human ingenuity. You work to lower the base rate of security lapses, but even with the best tooling and education efforts, there’s that 1% or 5% you’re bound to miss. A breach is only a matter of time; your average CISO is losing sleep over this, not over buffer overflows.
“When we had electromechanical systems, we used to be able to test them exhaustively,” says Nancy Leveson, a professor of aeronautics and astronautics at the Massachusetts Institute of Technology who has been studying software safety for 35 years. [...] “We used to be able to think through all the things it could do, all the states it could get into.” [...]
Software is different. Just by editing the text in a file somewhere, the same hunk of silicon can become an autopilot or an inventory-control system. This flexibility is software’s miracle, and its curse. Because it can be changed cheaply, software is constantly changed; and because it’s unmoored from anything physical—a program that is a thousand times more complex than another takes up the same actual space—it tends to grow without bound. “The problem,” Leveson wrote in a book, “is that we are attempting to build systems that are beyond our ability to intellectually manage.”
As part of DeepMind's mission to solve intelligence, we created a system called AlphaCode that writes computer programs at a competitive level. AlphaCode achieved an estimated rank within the top 54% of participants in programming competitions by solving new problems that require a combination of critical thinking, logic, algorithms, coding, and natural language understanding.
Between AlphaCode and GitHub's Copilot, it's sure an interesting time to be a programmer.
Nasdaq's computers can only count so high because of the compact digital format they use for communicating prices. The biggest number they can handle is $429,496.7295. Nasdaq is rushing to finish an upgrade later this month that would fix the problem.
32-bit numbers, man.
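The arithmetic behind that ceiling is easy to check. This is a minimal sketch, assuming (as the quoted figure suggests, though the article does not spell it out) that prices are stored as unsigned 32-bit integers counting ten-thousandths of a dollar. The constant names are illustrative only.

```python
# Largest value an unsigned 32-bit integer can hold.
MAX_UINT32 = 2**32 - 1  # 4294967295

# Assumption: a price is stored as a count of 1/10,000ths of a dollar.
TICKS_PER_DOLLAR = 10_000

# Split the largest representable count into whole dollars and leftover ticks.
dollars, ticks = divmod(MAX_UINT32, TICKS_PER_DOLLAR)
max_price = f"${dollars:,}.{ticks:04d}"
print(max_price)  # $429,496.7295
```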
A funny yet serious look at the encryption debate. I won't spoil it for you.
As of version 3.8.10, the SQLite library consists of approximately 94 thousand lines of C source code. By comparison, the project has 91,515 thousand lines of test code - 971 times as much test code!
So you fixed a conflict somewhere in your repo, then later stumbled on exactly the same one (perhaps you did another merge, or ended up rebasing instead, or cherry-picked the faulty commit elsewhere…). And bang, you had to fix that same conflict again.
That sucks.
Especially when Git is so nice that it offers a mechanism to spare you that chore, at least most of the time: rerere. OK, so the name is lousy, but it actually stands for Reuse Recorded Resolution, you know.
In this article, we'll try and dive into how it works, what its limits are, and how to best benefit from it.
I think I just reached a new level of git-fu. This makes me feel like I just learned how to use rebase correctly again.
This is still one of the most useful career posts I ever read, but as far as I can tell I never linked it here.
Engineers are hired to create business value, not to program things.
You can't wish away Design Process. It has been in existence since the dawn of civilization. And the latest clever development tools, no matter how clever, cannot replace the best practices and real-life collaboration that built cathedrals, railroads, and feature-length films.
I've noticed this more as I've transitioned to a more agile development world at Skype. Constantly shipping and constantly shipping MVPs means we're less frequently shipping code we're proud of.
I keep coming back to this because all of our assumptions are wrong. See also: More falsehoods programmers believe about time.
So much gold in this rant:
Not a single living person knows how everything in your five-year-old MacBook actually works. Why do we tell you to turn it off and on again? Because we don't have the slightest clue what's wrong with it, and it's really easy to induce coma in computers and have their built-in team of automatic doctors try to figure it out for us.
The human brain isn't particularly good at basic logic and now there's a whole career in doing nothing but really, really complex logic. Vast chains of abstract conditions and requirements have to be picked through to discover things like missing commas. Doing this all day leaves you in a state of mild aphasia as you look at people's faces while they're speaking and you don't know they've finished because there's no semicolon.
You immerse yourself in a world of total meaninglessness where all that matters is a little series of numbers went into a giant labyrinth of symbols and a different series of numbers or a picture of a kitten came out the other end.
In a paper to be presented at the Association for Computing Machinery's Annual Symposium on the Theory of Computing in May, [the researchers] demonstrate a new analytic technique suggesting that, in a wide range of real-world cases, lock-free algorithms actually give wait-free performance.
The easy, simple parallel programming algorithms don't appear to behave much worse than the complicated ones in practice.
Your algorithm does it wrong. Here's proof.
Programming has to work like this. Programmers must be able to read the vocabulary, follow the flow, and see the state. Programmers have to create by reacting and create by abstracting. Assume that these are requirements. Given these requirements, how do we redesign programming?
An interesting essay on the fundamental attributes of programming, programming languages, and programming environments that should make you think about how we teach and learn programming in the future.
We propose an automatic technique for repairing program defects. Our approach does not require difficult formal specifications, program annotations or special coding practices. Instead, it works on off-the-shelf legacy applications and readily-available testcases. We use genetic programming to evolve program variants until one is found that both retains required functionality and also avoids the defect in question. Our technique takes as input a program, a set of successful positive testcases that encode required program behavior, and a failing negative testcase that demonstrates a defect.
A best paper award winner at ICSE in 2009, this is a very interesting read on the possibility that test suites can sufficiently declare the specification of the program, and defects discovered can be removed automatically.
I have a theory. That theory is that software engineers see themselves very differently than those with whom they work. I've come to this conclusion after over a decade in the software industry working at companies large and small. Companies (product managers, designers, other managers) tend to look at software engineers as builders. It's the job of the product manager to dream up what to build, the job of the designer to make it aesthetically pleasing, and the job of the engineer to build what they came up with. Basically, engineers are looked at as the short-order cooks of the industry.
And here's the real crux of the problem: software engineers aren't builders. Software engineers are creators.
Much of the essence of building a program is in fact the debugging of the specification.
— Fred Brooks, The Mythical Man-Month
Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
— Martin Golding
A program manager, a software engineer, and a software tester were on their way to a meeting. They were driving down a steep mountain road when suddenly the brakes on their car failed. The car careened almost out of control down the road, bouncing off the crash barriers, until it miraculously ground to a halt scraping along the mountainside. The car's occupants, shaken but unhurt, now had a problem: they were stuck halfway down a mountain in a car with no brakes. What were they to do?
"I know," said the program manager, "Let's have a meeting, propose a Vision, formulate a Mission Statement, define some Goals, and by a process of Continuous Improvement find a solution to the Critical Problems, and we can be on our way."
"No, no," said the software engineer, "That will take far too long, and besides, that method has never worked before. I've got my Swiss Army knife with me, and in no time at all I can strip down the car's braking system, isolate the fault, fix it, and we can be on our way."
"Well," said the software tester, "Before we do anything, I think we should push the car back up the road and see if it happens again."
I spent the weekend working on implementing a few string comparison algorithms for mailcheck and decided to share one of the easiest ones to implement, the Levenshtein distance algorithm.
The Levenshtein distance is the minimum number of deletions, insertions, or substitutions required to transform one string into another. (If you add the transposition operation, you get the Damerau–Levenshtein distance, and if you allow only substitution, you get the Hamming distance.) It turns out the implementation is a simple dynamic programming exercise, because the Levenshtein distance can easily be calculated for substrings of various lengths of each of the two strings being compared.
Take a look at the data structure containing the substring edit distance in action with strings of your choosing here.
The JavaScript implementation I came up with is as follows, for your perusing pleasure:
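The original JavaScript listing is not part of this text. As a stand-in, here is a minimal Python sketch of the dynamic-programming idea described above. It is not the author's code, and the function name `levenshtein` is just an illustrative choice. It keeps only two rows of the substring-distance table at a time.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # prev[j] is the distance between the previous prefix of a and b[:j].
    # Row 0: turning the empty string into b[:j] takes j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (free if equal)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Because transposition is not one of the allowed operations, swapping two adjacent letters (say, "gmail.com" versus "gmial.com") costs 2 rather than 1.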
My first GitHub pull request went through this week.
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
— Brian W. Kernighan
There are only three hard things in Computer Science: cache invalidation, naming things, and handling of the 29th of February.
It's not just temporary. Now that it works you're not going to touch it because something else depends on that behavior and touching it will be far too risky. You're probably not going to clean it up, except perhaps to put it under the bed and hope no one trips over it later.
Even on a tight schedule, last-minute block-ship bugs appear and what should have been a simple, straightforward bug fix will turn into some Giger-esque state-driven nightmare causing everyone associated with the project to invent new profanities because the ones they have don't seem emphatic enough.
All because you weren't ruthless enough on your code. Amen!
The two biggest reasons they found? Communication and defect rates.
This is an interesting debate for me because some of the feature teams we have at Hotmail would consume all the developers of some of the small companies mentioned, but we also have PMs and testers who are on point to help handle some of the inefficiencies in communication and catch defects as they are coded. I imagine some of these smaller companies do not.
All in all, I've seen benefits firsthand to both approaches.
A ground-floor introduction to Algebraic Data Types, which form the basis for languages such as ML, Haskell, and OCaml.
The PayPal competitor backed by PayPal cofounders. And yes, being a techie, I did look at the API and the service they provide. Definitely a good design.
It is irresponsible to not use it.
A survey of several algorithms, including the very clever and efficient alias method, used to model a loaded die, which has many handy algorithmic uses.
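For a feel of how the alias method works, here is a minimal Python sketch of one common formulation (usually credited to Vose). It is not taken from the linked survey, and the names `build_alias_table` and `roll` are illustrative. Setup is linear in the number of sides, and each roll then needs only one random index and one biased coin flip.

```python
import random


def build_alias_table(weights):
    """Preprocess weights into (prob, alias) tables in O(n) time."""
    n = len(weights)
    total = sum(weights)
    # Scale so the average weight is exactly 1.0.
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]  # column s keeps its own side with this probability
        alias[s] = l         # and otherwise defers to side l
        # Side l donated (1 - scaled[s]) of its mass to fill column s.
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    # Whatever is left is (up to rounding) exactly full.
    for i in small + large:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias


def roll(prob, alias, rng=random):
    """Sample one side in O(1): pick a column, then flip its biased coin."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


prob, alias = build_alias_table([1, 2, 3, 4])
print(roll(prob, alias))  # an index from 0 to 3; 3 comes up most often
```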
Web programming is the science of coming up with increasingly complicated ways of concatenating strings.
Some really clever stuff in here if you're the kind of person who has to write terrible code to keep their job:
Let's start off with probably the most fiendish technique ever devised: Compile the code to an executable. If it works, then just make one or two small little changes in the source code...in each module. But don't bother recompiling these.
Never use i for the innermost loop variable. Use anything but. Use i liberally for any other purpose especially for non-int variables. Similarly, use n as a loop index.
Document only the details of what a program does, not what it is attempting to accomplish. That way, if there is a bug, the fixer will have no clue what the code should be doing.
If a module in a library needs an array to hold an image, just define a static array. Nobody will ever have an image bigger than 512 x 512, so a fixed-size array is OK.
Use three dimensional arrays.
Smuggle octal literals into a list of decimal numbers.
Ensure it only works in debug mode with "#if TESTING==1".
Reverse the usual definitions of true and false. Then force the program to do comparisons like "if ( var == TRUE )" and "if ( var != FALSE )" Or even consider using values 1 and 2 or -1 and 0.
So funny, but it makes me really thankful that most people in the world like maintainable code.
How much money do engineers make?
Wrong question. The right question is "What kind of offers do engineers routinely work for?", because salary is one of many levers that people can use to motivate you.
Your most important professional skill is communication.
People routinely assume that I am among the best programmers they know entirely because a) there exists observable evidence that I can program and b) I write and speak really, really well.
Communication is a skill. Practice it: you will get better. One key sub-skill is being able to quickly, concisely, and confidently explain how you create value to someone who is not an expert in your field and who does not have a priori reasons to love you. If when you attempt to do this technical buzzwords keep coming up ("Reduced 99th percentile query times by 200 ms by optimizing indexes on…"), take them out and try again. You should be able to explain what you do to a bright 8 year old, the CFO of your company, or a programmer in a different specialty, at whatever the appropriate level of abstraction is.
Co-workers and bosses are not usually your friends.
You will spend a lot of time with co-workers. You may eventually become close friends with some of them, but in general, you will move on in three years and aside from maintaining cordial relations you will not go out of your way to invite them over to dinner. They will treat you in exactly the same way. You should be a good person to everyone you meet — it is the moral thing to do, and as a sidenote will really help your networking — but do not be under the delusion that everyone is your friend.
At the end of the day, your life happiness will not be dominated by your career.
Either talk to older people or trust the social scientists who have: family, faith, hobbies, etc etc generally swamp career achievements and money in terms of things which actually produce happiness. Optimize appropriately. Your career is important, and right now it might seem like the most important thing in your life, but odds are that is not what you'll believe forever. Work to live, don't live to work.
Good advice.
The BlueGene/Q processors that will power the 20 petaflops Sequoia supercomputer being built by IBM for Lawrence Livermore National Labs will be the first commercial processors to include hardware support for transactional memory. Transactional memory could prove to be a versatile solution to many of the issues that currently make highly scalable parallel programming a difficult task.
Apparently the memory supports transactional code blocks in atomic fashion using processes similar to "load-link/store-conditional" (PowerPC) and "compare and swap" (x86), and it's all done using FPGAs. Pretty nifty.
Which IT or CS decision has resulted in the most expensive mistake?
The best candidate I have been able to come up with is the C/Unix/Posix use of NUL-terminated text strings. The choice was really simple: Should the C language represent strings as an address length tuple or just as the address with a magic character (NUL) marking the end?
Yeah, they chose wrong.
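The difference between the two representations shows up as soon as the data itself contains a NUL byte. This is a small Python sketch using the standard `ctypes` module to get the C-style view of the same bytes. The sample data and variable names are made up for illustration.

```python
import ctypes

data = b"ab\x00cd"

# Address + length: a Python bytes object carries its own length,
# so the embedded NUL is just another byte.
length_view = len(data)

# Address + terminator: ctypes copies the bytes into a C char array
# and appends a trailing NUL, as a C string literal would have.
buf = ctypes.create_string_buffer(data)
c_view = buf.value  # reads up to the first NUL, like C string functions do
full = buf.raw      # the entire underlying buffer, terminator included

print(length_view)  # 5
print(c_view)       # b'ab'
print(full)         # b'ab\x00cd\x00'
```

The terminator convention silently loses everything after the first NUL, and finding the length means scanning the buffer rather than reading a stored number.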
I actually don't mind working on legacy code. Problem solving has always come naturally to me, I was trained to interpret illegible contest code in high school, and two of my previous jobs involved taking ownership of old software projects. It is definitely a fun and unique challenge every time. Trying to get my head around a decent portion of the Hotmail code base is proving more than a little daunting, however.
Ars Technica rounds up what we think we know about developing for Windows 8, including possible new runtimes, HTML5, and a unified presentation layer, and discusses the possible future of Win32, WPF, .NET, and Silverlight.
Most programmers have only a vague notion of how competent they are at what they do for a living.
When you don't compete, per se, what drives you to be better?
My immediate response was to try to log into the server and see if there was anything I could do to keep it from falling over.
Sounds kinda like what we do at work, too. Great article on the Delicious -> Pinboard exodus and a lesson on why you should always stress test.
ENDING SCRIPT - I BLEW UP AND FAILED
— Another great Microsoft easter egg. I mean, if you're going to fail, do so with style...
Buying RAM is for people who don't know how to write algorithms.
Algorithms are for people who don't know how to buy RAM.
System.FormatException: The string 'True' is not a valid Boolean value.
— C#
I say it is, C#. I call your bluff.
There are two types of people who want to make games. People who don't know what they're getting into, and people who are jumping on the hot thing the other people are jumping onto. The people who actually do make great games are rounding error.
— Mike Lee
Totally not true. Java's far superior to Perl. And I think everyone who knows Perl believes themselves superior to everyone.
(via fuckyeahcomputerscience)
Achievements Unlocked →
Great article from one of the primary developers of the Xbox achievement system on its inner workings and technical implementation.
E-Level Commit Messages
Ever wanted to see what it's like to be a programmer? Our final group project for EN.600.321, Object Oriented Software Engineering, last fall had over 355 commit messages, a collection of which are given here:
On Nov 10, 2009, at 6:10 PM, Pablo wrote:
See if you can break it :)
Which time?
Begin forwarded message:
PL: Fixed Parker crashing
I crash all the time.
Begin forwarded message:
PL: EditCard saves preferences, fixed lil bug with title. The default title is title, and that was being saved as the title, even though it's not the title.
Nope, the title is not title.
Begin forwarded message:
HPS: Committing before Pablo messes any of my other functionality up.
Sounds about right.
Begin forwarded message:
Sizing bug? U mean how it's smaller when the app first launches select wut shows in cardareagui?
Pablo's typing with things stuck in his teeth again.
Begin forwarded message:
HPS: Added an "unselectAll" feature to remove selection artifacting. Doesn't work with TextBoxes 'cause class hierarchy is ridiculous? Text boxes aren't selectable? Yet no exception is thrown?
Yes?
Begin forwarded message:
JY: changes to pixmap/rectangle to more gracefully handle resizing. Only issue is JVM crashes when we push it on the stack.
Small issue.
Begin forwarded message:
HPS: Fixed NullPointerException with Signals in DragDropTextItem. Keep it clean, dudes.
Keep it clean, keep it classy.
Begin forwarded message:
Edit: This is only half a commit. WTF, svn?
Yes, indeed.
Begin forwarded message:
Filed in Pivotal:
No, that's a "feature".
...
Bug: saving doesn't actually work
Begin forwarded message:
JY: shift resizing of pixmaps. Also, any thoughts to having a right click menu for decks? ie, right click to close, test, or edit cards? I know u mac users don't right click often, but it is what all the cool kids are doing on windows.
Not true. We play Portal now, even.
Begin forwarded message:
Doesn't crash for me :S
You would, Pablo. Please don't do that.
Pablo
On Dec 13, 2009, at 3:31 PM, Parker wrote:
Known bug #1: Apparently, using the built-in Apple swatches in the 3rd pane of the QColorPicker on a Mac causes the JVM to crash. All I can say is, "Please don't do that".
Begin forwarded message:
HPS: Added exactly one lie to fix the bug with saving.
I think I meant "line", but there are often lies in my code...
Begin forwarded message:
PL: Fixed a bug... don't remember what it was
Neither do I. (It was 3 AM, though.)
Begin forwarded message:
HPS: Actually writing out z-values. Teeheehee.
You preemptively fixed a bug. Nice job.
JY: Fixed zheight issue of text edits - good catch parker
Begin forwarded message:
GA - fixed some behavioural problems during testing.
Yours or the program's?
Begin forwarded message:
GA - more behaviour modification
Definitely yours.
Advice for Computer Science College Students →
Joel Spolsky is rapidly becoming one of my favorite people to read:
1. Learn how to write before graduating.
2. Learn C before graduating.
3. Learn microeconomics before graduating.
4. Don't blow off non-CS classes just because they're boring.
5. Take programming-intensive courses.
6. Stop worrying about all the jobs going to India.
7. No matter what you do, get a good summer internship.
The Kids Are All Right →
Something important and valuable is indeed being lost as Apple shifts to [closed computing]. But it's a trade-off, because something new that is important and valuable has been gained.
Great commentary, Mr. Gruber.
The Hacker, the Architect, and the Superhero →
Good article describing three styles of programming that Mike has seen over the course of his career: the hacker, who intuitively understands code and its purpose; the architect, who designs and organizes for the layman; and the superhero, who intuitively handles complexity.
I'm honestly not sure which one describes me best, because I see aspects of all three in my coding style. More than likely I'd tend toward the hacker style, but mostly 'cause I've never been in a job that allowed me to design code from scratch.
Google RE2
Regular expressions (regex) provide a means for matching text, whether characters, words, or patterns. They can be infinitely expressive and yet typically compact. An example is "[hc]at", which would match "hat" and "cat".
The regular expression implementations in almost all programming languages today allow for backreferences. Backreferences allow you to reuse part of the regex match later in the regular expression. For example, "<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>" would match any opening and closing HTML tag and the text in between. The "\1" references the part of the regex in parentheses, in this case the type of tag we're trying to match. Thus, we can ensure that we get the same type of beginning and ending tags.
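Both patterns can be tried with Python's standard `re` module, which supports backreferences. This is a small sketch. The sample strings are made up, and because the tag pattern is written with `[A-Z]`, it only matches uppercase tag names unless a case-insensitive flag is added.

```python
import re

# Character class: one of 'h' or 'c', followed by "at".
simple = re.compile(r"[hc]at")
words = [w for w in ["hat", "cat", "bat"] if simple.fullmatch(w)]
print(words)  # ['hat', 'cat']

# Backreference: \1 must repeat whatever group 1 captured (the tag name).
tag = re.compile(r"<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>")

matched = tag.search("<B>bold</B> and <I>italic</I>")
print(matched.group(0))  # <B>bold</B>
print(matched.group(1))  # B

# The closing tag differs from the opening tag, so nothing matches.
mismatched = tag.search("<B>bold</I>")
print(mismatched)  # None
```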
Backreferences sound really cool, but we run into a problem. Regular expressions containing backreferences have a potential for exponential run time and unbounded stack (memory) usage. Unbounded stack usage in typical regular expression implementations leads to stack overflows and server crashes and all sorts of terrible things. That is to say, under certain circumstances, evaluating a regular expression over a large enough input could hang forever, use up all memory allocated to it, or both. This is a very important problem in Google's case.
Enter bright computer scientists. Basing their approach on automata and computational theory (I took that class!), Google researchers have created RE2, a "principled approach to regular expressions". They claim the implementation guarantees that searches complete in linear time with respect to the size of the input and won't overflow the stack. They have bounded the worst-case runtime by removing support for backreferences, however.
In his paper "Regular Expression Matching Can Be Simple And Fast", Russ Cox, the Google engineer who also authored the press release, writes, "Given a choice between an implementation with a predictable, consistent, fast running time on all inputs or one that usually runs quickly but can take years of CPU time (or more) on some inputs, the decision should be easy." For Google, yes. For those who like fancy syntactic sugar, it's still great to have another option.
Sources and Further Reading:
Google RE2 Press Release
Wikipedia
The Command Line
regular-expressions.info
Code Repository: RE2 on Google Code
What Happened to Programming? →
Great article on the cut-and-paste mentality of modern programming.
I think I might be more excited about the fact that he's drawing on a wall than the fact that he's doing physics.
For Those Who Know C++
string code = "";
code.append(0);
very much does not equal
string code = "";
code.append("0");
That was 20 minutes of my life.
Git Branching Model →
Excellent article that proposes an effective way of managing version control branching in git.
Telling an inspiring story about a beautiful design feels disingenuous. Yes, we all strive for beautiful code. But that is not what a talented young programmer needs to hear. I wish someone had instead warned me that programming is a desperate losing battle against the unconquerable complexity of code, and the treachery of requirements.
— Jonathan Edwards, Beautiful Code