The term "Programmer-Archeologist" is from Vernor Vinge[1]. He describes how, thousands of years hence, pretty much everything that can be programmed already has been, so the bulk of the skill of a programmer is in knowing where to find it and how to make it work, rather than programming per se.
Even today, that's a good lesson to learn. We aren't there yet — his kind of "Mature Programming Environment" is still being built.
Already, though, much of a programmer's skill lies in knowing the available libraries and frameworks and the idioms and patterns in common use. The programming environment is still maturing, but it is already deep enough that this knowledge makes up a fair chunk of the skill of a programmer.
We teach a bunch of sorting algorithms in first year Comp Sci; what sometimes gets lost is that these are almost purely teaching examples. Sorting is a good example problem: it's simple enough to describe in one sentence, yet interesting enough to show some real issues in coding and algorithm design. The practical application is essentially zero: to sort data, call the library function. If you have too much data for the library function to cope with, put it in a database. Writing a sorting function is a specialist skill, rarely invoked.
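As a minimal sketch of "call the library function", assuming Python and some made-up records (the names `people`, `by_age` and `by_name_desc` are purely illustrative):

```python
from operator import itemgetter

# Hypothetical records, invented for illustration: (name, age).
people = [("Carol", 41), ("alice", 29), ("Bob", 35)]

# Sort by age with the built-in sort; no hand-written algorithm needed.
by_age = sorted(people, key=itemgetter(1))

# Sort by name, ignoring case, in descending order.
by_name_desc = sorted(people, key=lambda p: p[0].lower(), reverse=True)

print(by_age)
print(by_name_desc)
```

Nothing here depends on which algorithm the library uses underneath; the only things to know are that the function exists and what its `key` and `reverse` parameters do.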
Perhaps the programmer-archeologist aspect should be stressed more; though like all "familiarity with the field" aspects, it's a difficult one to teach and even harder to test on exams.
[1] A Deepness in the Sky, chapter 17; pp.223-227 in my copy.
9 March 2009, 5:36 UTC, comment by Paul Harrison
I think JoelOnSoftware recommends going one level deeper than your problem at hand if you want high quality software.
If you have a large amount of static data to sort, it can be worthwhile to perform a one-off on-disk sort. Doing a large number of inserts into a B+-tree is generally less efficient than this. Or if you must use a database, and have multiple processors, it's important that your database has fine-grained locking (SQLite just bit me with this).
Selecting the right tool requires knowing quite a bit about their respective implementations. Often the process is: work out what is theoretically possible and roughly how it would work -> hypothesize the existence of software that already implements this -> google. The packages themselves change fairly often (immature software environment...), the underlying theory less often.
11 March 2009, 1:21 UTC, comment by Thorne
In "Space Family Stone" Heinlein has a passage where Mr Stone persuades his twin prodigy sons, who are convinced of their (practical, engineering-oriented) mathematical prowess, that they should study more mathematics. He does this by pulling out (drawing? I forget) a chart which maps out all the territory of human mathematical knowledge. He draws a circle around one tiny corner, and tells the boys that everything they know is in that corner.
When I talk with people like you and Ms George about programming, I tend to get this same feeling quite a lot. My own knowledge is practical, functional, but truncated, profoundly lacking in higher theory.
I would find a chart such as the one described, a map of programming theory just showing what is out there in the broadest terms, *insanely* useful.
11 March 2009, 4:53 UTC, comment by sabik
@Paul: having a fair idea of the next level down does sound like a good idea, yes. You should probably have a fair idea of the next level up, too, so that you'll end up working on three levels (your own in detail, and the two immediately above and below in less detail but still fairly well).
I suppose for knowing the adjacent levels, it is good to know the names of the various implementations and other concepts — perhaps more so than the implementations themselves. You need to know what "quicksort" and "stable sort" mean for you, but not necessarily how to write them.
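To make "what stable sort means for you" concrete, here is a small sketch, assuming Python (whose built-in sort is documented as stable) and invented records:

```python
# Hypothetical records, invented for illustration: (name, department).
staff = [("Dana", "dev"), ("Ari", "ops"), ("Cleo", "dev"), ("Bo", "ops")]

# Sort by name first, then by department. Because the sort is stable,
# records that compare equal on department keep their earlier name order.
by_name = sorted(staff, key=lambda r: r[0])
by_dept_then_name = sorted(by_name, key=lambda r: r[1])

print(by_dept_then_name)
```

With an unstable sort, the second pass would be free to scramble the names within each department; knowing the term is enough to know which guarantee to look for in the documentation.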
Even with a large amount of static data to sort on disk, you should consider using the standard sort(1) program, possibly pre-filtering the data to calculate sort keys. It doesn't do all the fancy tricks, but quicksort+merge is reasonable. If you're doing it often enough that the performance improvement of fancy heapsort+merge would be important, someone is doing something wrong somewhere (though sometimes that would be the person sending you the data and you can't always help that).
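For a rough picture of the "sort runs, then merge" shape, here is a sketch in Python. It is not how sort(1) is actually implemented; the function name `external_sort` and its parameters are made up for illustration, and the in-memory step simply calls the library sort.

```python
import heapq
import tempfile


def external_sort(lines, chunk_size=100_000, key=None):
    """Yield lines in sorted order, sorting roughly chunk_size lines at a time."""
    runs = []   # temporary files, each holding one sorted run
    chunk = []  # lines collected for the current run

    def flush():
        # Sort the current chunk in memory and spill it to disk.
        chunk.sort(key=key)
        run = tempfile.TemporaryFile(mode="w+", encoding="utf-8")
        run.writelines(chunk)
        run.seek(0)
        runs.append(run)
        chunk.clear()

    for line in lines:
        # Make sure every line ends in a newline so runs read back cleanly.
        chunk.append(line if line.endswith("\n") else line + "\n")
        if len(chunk) >= chunk_size:
            flush()
    if chunk:
        flush()

    try:
        # heapq.merge reads the sorted runs lazily, one line at a time.
        yield from heapq.merge(*runs, key=key)
    finally:
        for run in runs:
            run.close()
```

The optional `key` plays the part of the pre-calculated sort key, for example `key=lambda l: int(l.split()[1])` to order lines by a numeric second field.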
11 March 2009, 5:31 UTC, comment by sabik
@Thorne: Hmm, a sort of "Computer Science for Programmers" poster / book / course?
I guess the complexity section (chapter, lecture) could start with sorting and briefly work through all the way to a currently-open Millennium Prize Problem...
Hmm, that does sound like it would be a good thing to have. An overview of the field, arranged and colour-coded, with each theoretical box tied (at least indirectly) to a practical technique.
Speaking of Ms George, her Developing Programmers is similar in some ways, but different; still, such an overview would probably tie in with DP.



