Category Archives: Computing

Thinking About a Data Language

I’ve been thinking a lot lately about how a query and data definition language would look if I were able to write one myself. Well, it turns out I can write one myself, it just takes a lot of time. I don’t have a lot of time, but I’ve written down some of my ideas to clarify them and placed them here. This is mostly just a text-friendly way of writing relational algebra, but it has a few extras that would make it much nicer to use than SQL.

The mid-term goal is to implement at least some of the query language, either as part of a runtime that sits outside of Postgres and feels like psql with something better than SQL, or as an alternate parser hacked into Postgres that would provide parsed Query trees to the optimizer. I haven’t decided which will be more time consuming in the long run just yet.

I’ll keep that area updated as I have time to work on the language spec more. There are quite a few ideas I’ve got left to commit to writing but just can’t yet due to time.

Version String Comparison in Python

Comparing and sorting version strings in Python scripts has come up a few times lately. Here are some simple approaches. These examples assume that the version strings will be numeric representations and not include elements like “rc-1” or “alpha” or whatever. If your problem includes these kinds of elements, don’t worry: those cases are solvable by applying a touch of regex-fu to the processes below.

First, sorting a bunch of version number strings. The problem is that many version strings reported by various packages are just that: strings. Usually the only way to get version information is to ask for it (“[prog] --version” in a shell script, or “SELECT version();” from a database, or whatever) and interpret whatever gets sent to stdout. Those strings don’t mean the same thing to comparison operators that they mean to us so long as they remain strings. For example, the string ‘3.5.10’ is greater alphabetically than ‘3.15.1’ but is a lower version. So we need to convert them to tuples of integers to make comparing them natural. (Again, all these examples assume integer-only version strings, but minor changes to the process can allow you to compare anything, and assigning a custom collation order can allow you to sort against any arbitrary order of arbitrary symbols. That is beyond the scope of the basic problem I’m addressing here.) The conversion looks like this:

>>> vers
['3.5.10', '3.15.1', '2.5.7', '0.20.0', '2.12.5', '10.4.3']
>>> s_vers = [tuple([int(x) for x in n.split('.')]) for n in vers]
>>> s_vers
[(3, 5, 10), (3, 15, 1), (2, 5, 7), (0, 20, 0), (2, 12, 5), (10, 4, 3)]
>>> vers[0] > vers[1]
True
>>> s_vers[0] > s_vers[1]
False
>>> cmp(vers[0], vers[1])
1
>>> cmp(s_vers[0], s_vers[1])
-1
>>> s_vers.sort()
>>> s_vers
[(0, 20, 0), (2, 5, 7), (2, 12, 5), (3, 5, 10), (3, 15, 1), (10, 4, 3)]

The list comprehension (actually, two nested list comprehensions) assignment to s_vers is the important part of this. Once that is done you can compare whatever you want. If the version number is buried as an element in a dict or larger list (likely) you can do this conversion in place by adding a new element to each contained structure and then sorting the outer list on that element:

>>> packages
[{'version': '3.5.10', 'name': 'foo'}, {'version': '3.15.1', 'name': 'foo'}, {'version': '2.5.7', 'name': 'foo'}, {'version': '0.20.0', 'name': 'foo'}, {'version': '2.12.5', 'name': 'foo'}, {'version': '10.4.3', 'name': 'foo'}]

OK, that’s pretty ugly (uglier depending on how your browser renders <pre> type text), so I’ll print them in order so we can watch the list change more easily.

>>> for p in packages:
...   print p
...
{'version': '3.5.10', 'name': 'foo'}
{'version': '3.15.1', 'name': 'foo'}
{'version': '2.5.7', 'name': 'foo'}
{'version': '0.20.0', 'name': 'foo'}
{'version': '2.12.5', 'name': 'foo'}
{'version': '10.4.3', 'name': 'foo'}
>>> for p in packages:
...   p.update({'version_tuple': tuple([int(x) for x in p['version'].split('.')])})
...
>>> for p in packages:
...   print p
...
{'version_tuple': (3, 5, 10), 'version': '3.5.10', 'name': 'foo'}
{'version_tuple': (3, 15, 1), 'version': '3.15.1', 'name': 'foo'}
{'version_tuple': (2, 5, 7), 'version': '2.5.7', 'name': 'foo'}
{'version_tuple': (0, 20, 0), 'version': '0.20.0', 'name': 'foo'}
{'version_tuple': (2, 12, 5), 'version': '2.12.5', 'name': 'foo'}
{'version_tuple': (10, 4, 3), 'version': '10.4.3', 'name': 'foo'}
>>> packages.sort(key = lambda x:x['version_tuple'])
>>> for p in packages:
...   print p
...
{'version_tuple': (0, 20, 0), 'version': '0.20.0', 'name': 'foo'}
{'version_tuple': (2, 5, 7), 'version': '2.5.7', 'name': 'foo'}
{'version_tuple': (2, 12, 5), 'version': '2.12.5', 'name': 'foo'}
{'version_tuple': (3, 5, 10), 'version': '3.5.10', 'name': 'foo'}
{'version_tuple': (3, 15, 1), 'version': '3.15.1', 'name': 'foo'}
{'version_tuple': (10, 4, 3), 'version': '10.4.3', 'name': 'foo'}

We started out with a list of dictionaries, each containing a package name and a version string. The first loop updates each dictionary to include a version tuple, and the next orders the dictionaries within the list by the tuple values. Voilà! We have a list of dictionaries sorted by version number. Of course, if there is more than one package name involved you will want to sort on the package name first, then the version tuple as a secondary criterion (so you don’t compare versions of package ‘foo’ against versions of package ‘bar’, or sort glibc against firefox, for example).
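The name-then-version ordering can be sketched with a single tuple key. This is only an illustration written for Python 3 (the sessions in this post are Python 2); the ‘bar’ entries and the package_key helper are made up for the example.

```python
# Hypothetical sample data: two package names, two versions each.
packages = [
    {'name': 'foo', 'version': '3.5.10'},
    {'name': 'bar', 'version': '10.4.3'},
    {'name': 'foo', 'version': '3.15.1'},
    {'name': 'bar', 'version': '2.12.5'},
]

def package_key(p):
    """Sort key: package name first, then the version as a tuple of ints."""
    return (p['name'], tuple(int(x) for x in p['version'].split('.')))

# sorted() compares the key tuples element by element, so versions are
# only ever compared between packages that share a name.
ordered = sorted(packages, key=package_key)
for p in ordered:
    print(p['name'], p['version'])
```

Because the key is built on the fly, this variant does not need the extra version_tuple element stored in each dict, though storing it is still handy if you sort repeatedly.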

If lambdas are unfamiliar to you, don’t be scared off by the packages.sort() line up there. Lambdas are perfectly safe, reliable and quite concise once you understand the way they are used.

From here writing a sort function for lists of version strings should be pretty obvious. And… that means that writing a comparison function for two individual elements that works the same way the built-in cmp() function works is trivial:

>>> def ver_tuple(z):
...   return tuple([int(x) for x in z.split('.') if x.isdigit()])
...
>>> def ver_cmp(a, b):
...   return cmp(ver_tuple(a), ver_tuple(b))
...
>>> vers
['3.5.10', '3.15.1', '2.5.7', '0.20.0', '2.12.5', '10.4.3']
>>> ver_cmp(vers[0], vers[1])
-1
>>> ver_cmp(vers[0], vers[0])
0
>>> ver_cmp(vers[3], vers[4])
-1

Nice and easy.
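One caveat: the built-in cmp() used above exists only in Python 2; it was removed in Python 3. A rough Python 3 equivalent, offered as a sketch rather than a drop-in replacement, rebuilds the three-way result from ordinary comparisons and uses functools.cmp_to_key wherever a sort needs a comparison function:

```python
from functools import cmp_to_key

def ver_tuple(z):
    """Convert '3.15.1' to (3, 15, 1), skipping non-numeric parts."""
    return tuple(int(x) for x in z.split('.') if x.isdigit())

def ver_cmp(a, b):
    """Return -1, 0 or 1, the way Python 2's cmp() did."""
    ta, tb = ver_tuple(a), ver_tuple(b)
    return (ta > tb) - (ta < tb)

vers = ['3.5.10', '3.15.1', '2.5.7', '0.20.0', '2.12.5', '10.4.3']

# Two ways to sort the raw strings; both give the same order.
by_cmp = sorted(vers, key=cmp_to_key(ver_cmp))
by_key = sorted(vers, key=ver_tuple)
print(by_key)
```

When all you need is a sort, passing ver_tuple as the key is the simpler option; ver_cmp is only worth keeping when some other API expects a cmp-style function.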

Now I can’t figure out why comparison functions I’ve seen floating around occupy so much space and are hard to follow — full of class declarations and exec loops within exec loops (!!!) and other nonsense. At the most you will need to add some regular expression matching to extract/split on the correct substrings from the version string. That means you would have to import the re module and the list comprehension will grow by a few (maybe 10) characters.
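As a sketch of that regex step (Python 3, and only one of several reasonable policies): pull out the first dotted run of digits and ignore everything around it. The pattern and the sample strings below are my own illustration. Note that under this policy a pre-release such as ‘3.5.10-rc-1’ compares equal to ‘3.5.10’; ordering pre-releases before releases would need an extra rule.

```python
import re

# First run of digits separated by dots, e.g. '3.15.1' inside 'v3.15.1'.
_NUMERIC_PART = re.compile(r'\d+(?:\.\d+)*')

def ver_tuple(s):
    """Extract the first dotted-integer run from s as a tuple of ints."""
    m = _NUMERIC_PART.search(s)
    if m is None:
        raise ValueError('no numeric version found in %r' % s)
    return tuple(int(x) for x in m.group(0).split('.'))

# Hypothetical inputs with the kinds of decoration mentioned earlier.
vers = ['3.5.10-rc-1', 'v3.15.1', '2.5.7alpha', '10.4.3']
print(sorted(vers, key=ver_tuple))
```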

Most common Bash date commands for timestamping

From time to time I get asked how to use the date command to generate a timestamp. Here is an idiot-friendly script you can post for reference in your team’s bin/ if you get interrupted about timestamp questions or have an aversion to typing phrases like “man date” (with or without a space).

All but the first and last two produce filename-friendly strings. (Thanks to Rich for the reminder to include UTC and timezoned stamps here):

#! /bin/bash

# An overly obvious reference for most commonly requested bash timestamps
# Now all you Mac users can stop pestering me.

cat << EOD
        Format/result           |       Command              |          Output
--------------------------------+----------------------------+------------------------------
YYYY-MM-DD_hh:mm:ss             | date +%F_%T                | $(date +%F_%T)
YYYYMMDD_hhmmss                 | date +%Y%m%d_%H%M%S        | $(date +%Y%m%d_%H%M%S)
YYYYMMDD_hhmmss (UTC version)   | date --utc +%Y%m%d_%H%M%SZ | $(date --utc +%Y%m%d_%H%M%SZ)
YYYYMMDD_hhmmss (with local TZ) | date +%Y%m%d_%H%M%S%Z      | $(date +%Y%m%d_%H%M%S%Z)
YYYYMMDDhhmmss                  | date +%Y%m%d%H%M%S         | $(date +%Y%m%d%H%M%S)
YYYYMMDDhhmmssnnnnnnnnn         | date +%Y%m%d%H%M%S%N       | $(date +%Y%m%d%H%M%S%N)
Seconds since UNIX epoch:       | date +%s                   | $(date +%s)
Nanoseconds only:               | date +%N                   | $(date +%N)
Nanoseconds since UNIX epoch:   | date +%s%N                 | $(date +%s%N)
ISO8601 UTC timestamp           | date --utc +%FT%TZ         | $(date --utc +%FT%TZ)
ISO8601 Local TZ timestamp      | date +%FT%T%Z              | $(date +%FT%T%Z)
EOD

If executed, it will produce the (obvious) output:

        Format/result           |       Command              |          Output
--------------------------------+----------------------------+------------------------------
YYYY-MM-DD_hh:mm:ss             | date +%F_%T                | 2013-05-17_10:16:09
YYYYMMDD_hhmmss                 | date +%Y%m%d_%H%M%S        | 20130517_101609
YYYYMMDD_hhmmss (UTC version)   | date --utc +%Y%m%d_%H%M%SZ | 20130517_011609Z
YYYYMMDD_hhmmss (with local TZ) | date +%Y%m%d_%H%M%S%Z      | 20130517_101609JST
YYYYMMDDhhmmss                  | date +%Y%m%d%H%M%S         | 20130517101609
YYYYMMDDhhmmssnnnnnnnnn         | date +%Y%m%d%H%M%S%N       | 20130517101609418928482
Seconds since UNIX epoch:       | date +%s                   | 1368753369
Nanoseconds only:               | date +%N                   | 427187053
Nanoseconds since UNIX epoch:   | date +%s%N                 | 1368753369431083605
ISO8601 UTC timestamp           | date --utc +%FT%TZ         | 2013-05-17T01:16:09Z
ISO8601 Local TZ timestamp      | date +%FT%T%Z              | 2013-05-17T10:16:09JST
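For anyone who needs the same stamps from inside a Python script, here is a rough Python 3 sketch of several of the rows above. The timestamps() helper is a name I made up for illustration. The %F and %T shorthands are spelled out rather than relying on platform support, and since strftime has no %N equivalent the nanosecond row uses time.time_ns(), which needs Python 3.7 or later.

```python
import time
from datetime import datetime, timezone

def timestamps():
    """Map some of the date commands above to Python-built equivalents."""
    local = datetime.now()              # naive local time, like plain `date`
    utc = datetime.now(timezone.utc)    # like `date --utc`
    return {
        'date +%F_%T': local.strftime('%Y-%m-%d_%H:%M:%S'),
        'date +%Y%m%d_%H%M%S': local.strftime('%Y%m%d_%H%M%S'),
        'date --utc +%Y%m%d_%H%M%SZ': utc.strftime('%Y%m%d_%H%M%SZ'),
        'date +%Y%m%d%H%M%S': local.strftime('%Y%m%d%H%M%S'),
        'date +%s': str(int(time.time())),
        'date +%s%N': str(time.time_ns()),
        'date --utc +%FT%TZ': utc.strftime('%Y-%m-%dT%H:%M:%SZ'),
    }

for command, stamp in timestamps().items():
    print('%-28s | %s' % (command, stamp))
```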

Interview from Another Dimension

I was asked if I was interested in covering a temporary administration position a few days ago because finding bilingual Unix people is pretty hard here in Japan. It sounded marginally interesting and stood a chance of getting me in touch with the local Unix community, so I said sure, have the interviewer give me a call.

One day the positioning agency asked for a resume. I sent one in. The next day at 3pm I got a call saying that I would get a call an hour later to conduct a phone interview.

At 4pm I didn’t get a call.

At 5:30 I called their office back to say that I didn’t get a call. They called me back asking if I was still available that day. I told them that if it’s OK that I’ll be playing with my kids, then I’m game. They called back again to tell me that the company was really going to call this time, but from the office in Yokohama, not Okinawa. I was fine with that. They also told me that the guy calling would “be a foreigner, like you”. I was fine with that, too.

Not a minute later I did get a call but not from Yokohama, and from a foreigner but not “like me”. The call was from India over the world’s worst connection.

This amazed me. For one thing it was 2013. I expected bad connections when calling across multiple satellite hops from contested jungle territory in Southeast Asia in 2004. But this was a lot worse than that, and this guy was supposed to be calling from an office. And he supposedly works for a high-tech company looking to contract me. It bears mentioning that you could get crystal-clear cell connections from most of Afghanistan in 2010.

So that was the first weird smell. The second hint of rotten tuna was the voice. I couldn’t, for the life of me, understand most of what he was trying to say. I’ve never been one of those “You gotta speak ‘merican!” types (hard to justify it being an expatriate myself), but if you’re going to speak English it should be English and should be intelligible, if not at least generally correct. Otherwise speak Japanese, or German, or get an interpreter, or have someone else do the interview; I’m open to any of the above. If you do know English but have a heavy accent, just slow down. But such ideas are lost on some people.

His speech had a magical pattern to it. Merely missing syllables or mushing sounds together like most non-native speakers was beneath this guy. He set a new standard for unintelligible second-hand language by injecting new syllables and sounds into each word.

The deft ease and fleet pace at which he mangled the language makes me think in retrospect that he probably considered English to be his first language. Maybe it was just taught to him wrong as some sort of cosmic joke. It was what speech would sound like if you could somehow hear a hash salt being added to it. This blew Pig Latin out of the water.

An abridged transcript of the conversation follows:

Him: “Dis is Gumbntator Hlalrishvkttsh koling flum Ueeplo en Eendeya an ayam surchelin Mestarh Kleg Ewurlet?”
I could sort of make out what he was trying to say.
Me: “This is he.”
Him: “Ah see. But dis is Governator Ralrishevdish koling flum Weepro en Indeya an ayam surchen Mestarh Kleeg Iwuuret?”
Perhaps he couldn’t make out what I was trying to say?
Me: “Yes, I am the person you are looking for.”
Him: “OK.”
Me: “…”
Him: “…”
Me: “You are calling about the interview?”
Him: “So ifna kolik abbaud arun foha.”
Me: “I’m sorry, the line is echoing very badly, can you please say that again?”
Him: “So if colling aboud arun four?”
So here I think he’s calling to schedule a call at four because they screwed up today’s schedule already.
Me: “Tomorrow? Yes, you can call me at four.”
Him: “OK. So hou abbaud you al habing eksperens an de Sulrais Ziss?”
Now I don’t know what he’s saying, but I know it’s not a scheduling question.
Me: “Can you please say that again? This connection must be very bad.”
Him: “You al hawing eksperens wit Lenaks an de Sularis swistems?”
Me: “Yes, I have experience on Linux and Solaris systems. Mostly Linux, though, because that is the platform I develop on.”
And here it began to dawn on me that this was the actual interview. In Indo-Pig Latin.
Him: “Okai. Bud wud abaudd yor kulanted lol on de dekuhnikal missm?”
Me: “I must be having a bad phone day. Please give me a moment to get to a quieter room so I can hear you.”
Him: “So komaing fru dat ayem ah phookink zandngar an…” [and so on...]
He kept babbling on and on about something that I couldn’t hear as I moved to an environment better suited to auditory-verbal cryptanalysis. Hope I didn’t miss anything paradigm shifting.
Him: “…[continued spacetalk]…”
Me: “What would you like to know about my experience?”
Him: “Inna suba sisesutm hau ew mak da pashink?”
Me: “The reception is poor again, can you please say that again?”
Him: “Inna subaa susutem hao eww poot a pach?”
Me: “Patching? Are you asking me how to patch a server? It depends on what you mean by ‘patching’. Are we patching sources to rebuild a program, or installing upgraded binaries through a package manager or performing an automated patch and rebuild the way ebuilds and ports work?”
Him: “Yesss. Inna sabaa, hou eww poot a pach?”
Me: “What system are we talking about?”
Him: “Inna sauce.”
Me: “Sauce? In source? Oh, Solaris? If we are receiving updated binaries I would use the package manager. I haven’t seen people bypass IPS and use the patch manager directly for a while.”
Him: “Zo uatt ai am gunda be dou nuh is abbauda passhin inna sabaa. Hau yu du?”
Me: “I’m sorry, I think you are asking me how I would patch a Solaris server, and without knowing anything else about the question I think you mean we are receiving updates from a repository. My answer is that I would use the package manager, probably IPS, or if just patches then the old patch manager. But I don’t really understand your question. It is really broad.”
Him: “SO hao eww do?”
Me: “You mean the command sequence?”
Him: “Yeis.”
Me: “You want me to spell it out over the phone?”
Him: “Yeis.”
I couldn’t help but snicker a little… is this really the way system administration interviews go?
Me: “OK, which version of Solaris?”
Him: “Inna sabaa.”
Me: “I understand in a server, but that doesn’t really change the question much, unless I’m missing something. Which version of Solaris? We are talking about Solaris, right?”
Him: “Zo vot ah em denkning niss uii nut dokkin abbaud da deweropent zicheeshn. Dust a passh a sabaa.”
Me: “Right, not a development situation, just patching a server. But this is a difficult question to answer unless I know what system we are talking about. They don’t all work the same way.”
Him: “Du eww habba poosiija fou da makkink na fou da af emma lepozitorian?”
Me: “I’m sorry, the phone is being worse than usual again, can you please ask the question again?”
Him: “Enna proosiija fou passhing. Eww habba lepozitori an poosiija. Du garanti ob da safti?”
Me: “My procedure to guarantee the safety? You mean during patching? If I make a repository? Was that part of the question?”
Him: “Yeis.”
Me: “OK, yes, in a production environment I would expect that we have separate testing and production repositories at least. I would patch or update the test servers, run applicable tests for whatever application or server software we have installed, and then deploy the update to the production servers. But this is a really basic thing to say, and I can’t give you any details without knowing what system we are talking about. Is this even a Solaris question?”
Him: “So abbaudda Lennuks.”
Me: “Linux? The question is about Linux?”
Him: “Onna Lenuks hau eww makka lepozitori?”
Me: “Repositories on Linux? Which distro?”
Him: “Onna Lenuks.”
Me: “OK… What package manager are we talking about? RPM, yum, smitty, portage, aptitude, they all do things very differently. Even RPM is different on different distros that use it.”
Him: “Yeis. Onna Lenuks. Hau eww mak da lepozitori?”
Me: “Just assuming you mean Red Hat or CentOS or something else derived from Fedora, I would collect the RPMs we want to distribute, sign them, write a meta RPM for yum installation that has the public key and config file in it and build the repository metadata with createrepo. But if this is not a development environment we’re probably just mirroring an existing repository, so most of the time syncing with the master is sufficient. If not we could sync, re-sign, and recreate the repodata with createrepo.”
Him: “So hau eww mak da lepositori?”
Me: “I think I just told you. I have maintained several software repositories in the past and using createrepo is by far the easiest and most reliable way to do it, if we are talking about a yum repository full of RPMs for a distro like Red Hat Enterprise Linux.”
Him: “Yeis. So da Redhat.”
Me: “Maybe I don’t understand the question. You want me to tell you how to create a repository?”
Him: “Inna Lenuks hau eww mobbing fom weri zmar drraib enna rojikalworuum?”
Me: “Sorry, I can’t hear the question very well, the phone is full of echoes. You are asking me in Linux how to do something?”
Him: “Mobbing werri zmorr drraib anna rojikalworuum.”
Me: “Moving a small drive in Logical Volume Manager?”
Him: “Yeis.”
And here is where it dawned on me that I should have hung up at the first sign of weirdness. Instead I had hung on and now I was really along for the ride. Until the bittersweet end…
Me: “Do you mean changing a physical block device from one volume to another, or moving the volume itself?”
Him: “Retzsai eyabba  werri zmorr drraib anna wanna denk u poot enna rojikalworuum. Hau kann godu boot?”
Me: “You are asking me how to move a Linux installation from a small drive onto a logical volume, and then boot it later?”
Him: “Yeis.”
Me: “Assuming this is a simple case I would copy the filesystem to a new partition within the logical volume and add an entry to the bootloader so that we could boot it from the new location. But what bootloader are we using in this case? Grub or LILO or Grub2?”
Him: “Inna Lenuks.”
Me: “Right, in Linux, but which bootloader are we using?”
Him: “In da Lenuks.”
Me: “Right, but are we using Grub or LILO?”
Him: “LILO. Inna Lenuks.”
At this point I was relieved just to get something other than “Inna Lenuks” by itself out of him.
Me: “OK, assuming that the version of LILO we are using is logical volume aware, I would add the entry to the LILO configuration file that points to the location of the kernel on the relocated installation.”
Him: “Wat fail?”
Me: “What fail? You mean what file? The LILO configuration file.”
Him: “So wat fail?”
Me: “You mean where is it? It’s usually in ‘slash E T C slash L I L O dot C O N F’.”
Him: “Inna Redhadd.”
Me: “In Red Hat? LILO isn’t a part of that distro any more. They use Grub2 now.”
Him: “Uadda za komunt fur addikt inna neu intree?”
Me: “The command for adding the new entry? There is not a command to add a new LILO entry, you have to edit the configuration file directly. Grub2 has some commands like grub-install and grub-update. But you still have to check the configuration file to make sure things are in the right place. Is that what you mean?”
Him: “Inna Lenuks?”
Crap! We’re back to this again. I really don’t know how to debug this guy. He’s worse than the Emacs Psychoanalyst.
Me: “Yes, in Linux. But this is not exactly a Linux question. The bootloader can load anything, so I don’t know what you mean.”
Him: “Adnanujinnadundaweenananndana…[A good five-minute bunch of spacetalk that I completely cannot understand. It was riveting, though. Like a symphony it had its own movements. Initially with the monotone of a public announcement, then to the lively staccato of a friend relating a happy story, capping with a crescendo of alternate gravelly and soft sounds unique to Indian speakers, and ending with a friendly chuckle -- as if he had enjoyed himself and was ready to say goodbye.]…”
Me: “OK, thank you for the call.”

I have no idea what most of that was about. I got the feeling he asked me some Solaris questions and some Linux questions and some general installation-wide question at the end that I never quite got a fix on. Actually, I never quite got a fix on anything at all, and I don’t think he did either.

This was the weirdest interview experience in my life. It is like a trick they would pull on you at Robin Sage but this guy was for real; no OC is going to come evaluate me on how I did and counsel me on how to better deal with the crazy and ambiguous.

Now for the scary part. This is the new face of IT outsourcing. Think long and hard whether you want to trust your data integrity and the construction of business systems you expect to get reliable answers out of to companies that have trouble communicating with their own (prospective, in this case) subcontractors and employees.

Since this is Japan, I wonder how on earth they manage to conduct interviews of Japanese people?

Am I alone here? Has anyone else ever experienced this sort of thing? (Other than when calling Dell or Microsoft tech support and being redirected to India, that is.)

Freenode Year-End Weather Review and 2013 Forecast

##c, ##c++, ##java, ##javascript and almost all other channels named after an Algol-descended language remained strong in n00b angst and help-vampire congestion (strong counter example is #bash, see below), rendering them useless for anything other than observing flame wars amongst programming newbies arguing over terms they’ve only just discovered on Wikipedia. Expect no change in temperature or inclination for 2013, and as always prepare for flurries of students hoping to get their homework done for them throughout August, October, December, March and May.

#bash took first place for overall 24-hour activity within its stated zone in 2012, which is quite an achievement. This was enabled by nothing more than the militant purism of its main participants who happen to actually know (most of) what they are talking about. The intensity of discussion in #bash is likely to increase over 2013 as realization dawns on more new *nix admins and even OS X users that their systems represent a complete programming environment. A corresponding increase in the volume of beginner reference links in-channel is likely, with an associated increase in RTFM calls directed at those who don’t read links or delivered by the less patient/coddling of the regulars.

#fedora, #ubuntu, #centos, and other distribution-named channels fell into two categories in 2012:

  1. overrun with help-vampires asking the same 3 new-release migration questions
  2. overwhelmed with utter silence

The channels #ubuntu and #centos took the first-place poo cake for overall deafening off-topic, RTFM-worthy and amateur architectural astronautic clamor while #archlinux, #gentoo and #fedora managed to achieve a much better signal-to-noise ratio, mostly due to a greater percentage of knowledgeable participants. Expect very little change in 2013 with the exception of #ubuntu and #fedora. The former may grow even worse as the population of those who don’t know any better flocks to Ubuntu as Steam picks up, er, steam and the latter may grow gradually quieter as new changes implemented in Fedora 18 cause a probable nose-dive in that distribution’s popularity across the year.

#django was one of the strongest on-topic, 24-hour activity channels focused around getting actual work done, with the vast majority of interaction involving at least marginally researched questions and a great deal more courtesy than usual this millennium. This indicates that the Django project has likely reached its Goldilocks point as a project where it is just enough below the radar that the “new thing” from 1~2 years past is still soaking up the n00bs, b00bs and help-vampires (in this case, #RubyOnRails) and enough srsly gentlemen have noticed it to make it a usefully mainstream place to work. If no unexpected storms of blockbuster “Lern da Web wit Djago in 10 Dais!!1!” tutorials or books occur across YouTube and bookstores expect #django to experience only a slight increase in temperature and no bumpkin brain blizzards or humility hurricanes. The status of Django on Python3 is the most likely leading indicator of trouble here (see below).

#django-dev was boring and dead for the most part, aside from the occasional thin mist of packager discussion and “why doesn’t the TLS setting for mail mean real TLS on the correct port?” talk (nonsense!). Some rumblings of the impending Python3 reckoning could be heard, but were still far enough in the distance as to avoid a full-blown #fedora style storm in 2012. Expect this to change in 2013, as Python3 will finally give Django devs enough to talk about to kick them off of the ML and into IRC activity. The action is likely to be a bit below storm-strength due to the project’s (general) adherence to its own release guidelines, but may from time to time bear watching.

#RubyOnRails and related channels were clogged with help-vampires and n00bs in similar fashion to the Algol-language and distro channels. This has remained fairly steady since 2009 or so, with the effect being bolstered by the presence of all those people who gave up on mobile programming just before they might have actually figured out how native applications work. Save a major drive to some other fascinating technical mistake (“Web 3.0?” “cloud vX”?) that goes viral, the Rails community will likely continue to experience idiot floods and hails of stupidity through 2013. For the serious who are in need of actual relevant discussion, forums, IRL meetings with Real People You Know and project-specific channels for projects that happen to be built around Rails will be the only places to find it.

#guile managed a slight edge over both #lisp and #scheme last year in Occasional Wizardry, but the overall volume of discussion was far lower than either #lisp or #scheme — giving #guile the best signal-to-noise ratio anywhere but also rendering it an incredibly boring place to hang out on an average day (as in, #guile remains a statistical outlier, though an interesting one). It is uncertain whether the effects of a new project, a new major version or a new implementation of Guile, Scheme or Common Lisp will have any effect or even be noticed by anyone, anywhere, so a prediction for 2013 is beyond me. I have a sneaking suspicion that someone might eventually catch on that guile2 includes a webserver ready-made for scripting in a functional language (among other features), but the population of paren-loving teens is so low at the moment and the current infatuation with the Web and the Java religion of Absolutely Everything Must Be An Object (Amen) still so strong with the sort of computer science faculty that thinks that every student should get a gold sticker for showing up that it is hard to see if anything short of a viral breakout video complete with tits, violence and gore would be noticed.

#haskell took first prize in 2012 for overall, unadulterated, near-constant uber geekness and Deep Black Magic. Three factors influenced this strongly: the near exclusive population of serious math nerds who like to flaunt their grokness, the tendency of such people to never admit they don’t grok a mind-melting snippet in channel and instead boil in silence until something makes sense to them, and the tendency for newcomers to either struggle unflaggingly until they earn their place among the immortals or simply give up and never, ever venture into #haskell again. In this, the uniqueness of Haskell as a language serves a positive filtration role in the community much the way that the old “be smart or go home” sort of freshman math classes did back when it was OK to admit that computer science wasn’t for everybody. Expect very little change to this trend in 2013, though by the end of the year commercial projects using Haskell may be revealed as actually using Haskell, and this may drive a slight, temporary increase in interest.

#erlang was a bit like #haskell, but more average in every aspect: less magic, more noise, fewer quitters, more eternal (but not really annoying) n00bs. This is mostly due to the revelation among high schoolers and college language hipsters that Facebook uses Erlang for a smattering of projects that can’t afford downtime and how Erlang can cope with such requirements in a novel way. Other functional language channels generally fell into the pattern of the lisps and Haskell and Erlang, but these last two deserved particular mention. In 2013 Erlang stands a very small chance of sucking brains away from other interesting languages such as Lua and anything matching .*ML.*. In that case expect Erlang to eventually grow more like #bash in nature over 2013, with a particular threshold being crossed if #erlang itself becomes a bothersome place to hang out due to an excess of help-vampires and alternative Erlang-based project channels becoming the alternative arteries of community brilliance. Saving such a spontaneous increase in notoriety, however, #erlang is likely to follow or return to the majority patterns of 2012.

This has been the Freenode Year-End Weather Review and 2013 Forecast. All other networks either suck or were set up with specific crowds in mind (such as botnets).

ODFapper 0.04 for Django on Unix

I’ve pulled the most common bits out of the Django views where I had rendered ODFs before and have the itty-bitty beginnings of an ODF handling/rendering library started now. “Library” is a bit of a stretch, since it’s only a few functions and a Bash script, but it abstracts the most mysterious/tedious parts of ODF handling already.

This is just a simple tarball right now. It unpacks like a Django app would, but only contains a copy of odfapper, funcs.py and a templatetags/odf_tags.py. But since we are doing template tag registration you need to include it in your INSTALLED_APPS in settings.py and add a new variable ODFAPPER_PATH which needs to be a string with the absolute path to /your/project/location/odfapper/odfapper. Importing into a views.py (or wherever) that you want to render ODFs in is done with from odfapper.funcs import render_odf, and I go into a bit more detail in the README included in the tarball. At the moment this post and the stuff in README (which is just an expansion of internal notes) is all the documentation.
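The settings changes described above might look roughly like the fragment below. Treat it as a sketch: the ‘odfapper’ app label is my inference from the import line, the other INSTALLED_APPS entry is just a placeholder, and the path is the placeholder from this post, not a real location. The README in the tarball is the authority here.

```python
# settings.py (fragment): an illustrative sketch, not copied from the README.

INSTALLED_APPS = (
    'django.contrib.contenttypes',   # placeholder for your existing apps
    'odfapper',                      # assumed app label; needed so the
                                     # template tags get registered
)

# Absolute path, as a string, to the odfapper script inside the unpacked app.
# Replace this placeholder with the real location in your project.
ODFAPPER_PATH = '/your/project/location/odfapper/odfapper'
```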

I’ve got a ton more work I’d like to do on this. If there is any interest I could put a GitHub repo up — but other (paying) work calls to which this isn’t central, so… let me know if this is a direction anyone wants to head in and I’ll keep it going. There are a bajillion tiny things that are not so hard to do that would make ODF handling enormously more intuitive.

Link to 0.04 archive: odfapper_django-0.04.bz2
Link to just the shell update: odfapper-0.04.bz2

It bears mentioning that this is tested against Django 1.4, but not 1.5 yet. Unless the template loader classes have changed a lot, though, it should still work fine.

Packing/Unpacking ODFs: Now Automated

I do lots of automated ODF rendering using framework data from AOW, Snap and Django. It’s a pain to keep it straight, so I’m slowly extracting the common bits. The most annoying and easiest to distill is the basic process of guarding against clobbering something else in the filesystem, getting all the files out, reformatting the XML to be easier to edit, and later repacking it in a way that doesn’t make an office program tell you “This file is screwed up — I can fix it, but I hate you”.

Here is a script that handles the packing and unpacking of ODFs in a more friendly way than having to remember how all the time: ODFapper-0.01.bz2.

It takes very few options because it is written against my most common use case: unpacking to touch up a template, rendering the marked-up content.xml (and adding images, or whatever), and then repacking it to a new location from within a framework. In other words, ODFs can be rendered the same way HTML pages are.

This is just a tiny part of what I do with ODFs, but it covers the most common bits (in particular, all that checking that a framework isn’t clobbering something in the filesystem), and answers the most frequent question I get from people who are curious about how I render ODFs: how to make them palatable to LibreOffice later.
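If you would rather see the idea than read Bash, here is a minimal Python sketch of the same unpack/repack cycle. It is not the ODFapper script, and the function names are made up for illustration. It leans on one fact about the ODF package format: the mimetype file has to be the first entry in the zip and has to be stored uncompressed. Getting that wrong is a common way to end up with a file that office programs want to "repair". The XML reformatting step is left out.

```python
import os
import zipfile


def unpack_odf(odf_path, dest_dir):
    """Extract an ODF package into dest_dir, refusing to clobber anything."""
    if os.path.exists(dest_dir):
        raise FileExistsError(dest_dir)
    with zipfile.ZipFile(odf_path) as zf:
        zf.extractall(dest_dir)


def pack_odf(src_dir, odf_path):
    """Zip src_dir back into an ODF package at odf_path, refusing to clobber."""
    if os.path.exists(odf_path):
        raise FileExistsError(odf_path)
    with zipfile.ZipFile(odf_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry goes in first and is stored, not compressed.
        zf.write(os.path.join(src_dir, "mimetype"), "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()  # deterministic traversal order
            for name in sorted(files):
                full = os.path.join(root, name)
                arcname = os.path.relpath(full, src_dir)
                if arcname == "mimetype":
                    continue  # already written above
                zf.write(full, arcname)
```

Typical use would be unpack_odf("template.odt", "work/") to get at content.xml, rendering into it, then pack_odf("work/", "out/report.odt"). Both calls raise rather than overwrite, which is the whole point of the clobber guard.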

I’ll eventually pull the rest of the render/repack ODF process out of my various programs that already do that and maybe make a for-real project of ODFapper. But that takes time. For now, if you want an easy way to pack/unpack ODFs without screwing them up, feel free to incorporate this script in whatever you’re doing. If you are a Bash wizard with suggestions, of course, I’m all ears…

Language Preachers: A Language’s Worst Enemy

Nothing crushes a language’s chance at becoming popular like the activities of its political and religious agitators. It doesn’t matter if the situation of the language is that it’s on the rise, on the decline or on the rebound. Language preachers suck, always. The only chance a language has for a successful PR campaign is when that campaign is based completely on lies, is extremely well funded, and appears to provide a remedy-by-aversion to the sort of problems incompetent instructors and consultants can’t think through and yet still encourages the sale of completely abstract content about abstractions without actually having to write useful code (Java). It doesn’t matter if this campaign is a complete farce, it’ll work if it permits enough of the stupid to perceive themselves as now smart enough because it can generate an entire sub-industry based on trying to prove that something can be done in this newly minted language for the incompetent — a heavy side of brand new buzz words helps, too (if only the “cloud” had been a language…).

But that’s just Java. Other languages don’t have the benefit of an academic revolution to assist in shoving a particular interpretation of a single programming paradigm down our throats (complete with a language that enforces just that one paradigm). Most languages just have to make it on their own either on merit (Haskell, a few lisps, Python), be the language of the year’s hottest killer app (PHP and the original web, Ruby and Rails), be the language of a killer app (Javascript and Mozilla/Firefox), or be both strong on merits and be the mother of a whole family of killer apps (C being the obvious example of the closest thing to modern immortality).

Recently I’ve been involved in a few discussions about the relative merits of a few different languages, database systems, distro flavors, kernel details, etc. Of all these topics the top two likely to start a religious war are choice of language and choice of database. My made-up estimate is that language discussions are twice as likely to incite digital jihad as database discussions.

Maybe it is because we spend most of our time dealing directly in a language and actually thinking in that language, so it is internalized to a level nothing else is. I don’t “think” in LibreOffice (though I’ve seen a few people who can think in Calc functions and others in Excel functions — and I hope the two types never meet to talk about it), I don’t carry on an internal monologue of sorts in Firefox or Nautilus or Gnome — but I do talk to myself in a sort of visual way in Python, Guile, Haskell, C, Bash, Perl, plpgsql, opcodes and a few other languages when the need arises. Programmers are prone to do this — and it turns out that they are most likely to do it only in a single language, ever (pick one of Java, Perl, Javascript, PHP, or Ruby for the best religious wars).

To someone who doesn’t speak enough languages to have evolved a sort of quiet disdain for them all a weird form of moral relativism seems to pervade. All languages other than the one they already know well are at once “the best for some job, just depends on what you’re doing” and completely inferior at everything compared to $the_only_lang_i_know_well. And of course the complete n00b who doesn’t even know one language well yet will swear that LangX — which he will never take the time to study but will take the time to flood IRC with airy questions about — is the cure for cancer, war and was handed down by Mayan priests without actually knowing anything about it. Obviously there is a conflict here.

Most languages suck, and some suck far less than others. A very few suck so much less that they are almost good. That list is very short. The list of languages I think to myself in regularly is a mix of languages that I don’t completely despise and ones that I do despise but must use because useful projects are already written in them. It used to grind on me that I had to use inferior tools so often, but I’ve gotten past it by accepting that most programmers are either really bad at what they do or are compelled by economic circumstances to abandon both their code and the quest for decent tools prematurely. Meh.

The last few years I’ve noticed a lot of Perl people getting zealous about their language. I’m not writing to bash Perl or its community — different communities go through their own inner social cycles as well. But the last six years or so have seen Perl people transition from a group that still self-assuredly represented the original web/CGI hackers and were held in guruly esteem to a group that is panicked about the low percentage of new projects written in Perl compared to the late 90′s and early 00′s. They’ve gone from a group that is solving problems to a group that is actively trying to tell everyone that Perl is still the thing to use, that it’s still hip and cool and better than everything else out there.

This is hurting Perl more than it is helping. Perl is indeed a good language, but the days when it was the de facto standard for a large subset of new project types are over. Panicking about it won’t help — but writing useful programs sure might. Today there are many alternatives that are at least as good as Perl ever was and many that are just better. Algol was and still is a great language, but I’m not about to start a new project in it. C is just better, and so far hasn’t been surpassed in that particular space (though its time may come, too). I make the decision to not use Algol out of an informed regard for the difficulty of maintaining Algol code in 2012 versus maintaining other code in 2012. If someone were constantly bombarding me with pro-Algol propaganda and biased trash benchmark comparisons then I would avoid Algol on principle instead of even learning enough about it to know whether or not it could have possibly been a good choice to begin with.

If Perl is the awzumest evar, then there is no need to proselytize. If it is on the decline, then there is a huge need to defend the temple — and this is the perception I have of this sort of behavior. It makes Perl people sound like Java people when it was a crap language trying to gain some mindshare — though in defense of that tactic Java did take over the shallow end of the pool, and that turned out to be the largest part of the pool by a vast margin.

The Lisp community screwed itself by engaging in Java-style cheerleading for a time. This drove a wedge between the people who really could have benefitted from learning it and the community of language snobs who were already privy to The Great Mystery. The Postgres community was that way as well until very recently. I’m a card carrying lisper myself, but I admit the community just got retarded when it came to interacting with others. The only thing that has redeemed Lisp in recent years is that some of the best commercial Lisp developers also happen to be excellent writers. If it weren’t for that there would have been no revival because the inner community mixed with the outer community like water and oil over this language preaching business.

Perl is a genuinely good language and it has a genuinely good runtime (and the two are totally different issues). If it is fading, let it fade with majesty as one of the traditional uber-hacker languages of yore, bringing out your Perl magic as a way to intrigue and enlighten n00bs instead of making the word “Perl” synonymous with a noisy cult that is more interested in evangelizing than writing solutions to problems.

Don’t make the same mistake the Lispers did when they felt their language was threatened — it’s the difference between newcomers deliberately avoiding Perl and newcomers simply not having tried it yet.

Fedora: A Study in Featuritis

It’s a creeping featurism! No, it’s a feeping creaturism! No, it’s an infestation of Feature Faeries! No, it’s Fedora!

I’ve been passively watching this thread (link to thread list) on the Fedora development list and I just can’t take anymore. I can’t bring myself to add to the symphony, either, because it won’t do any good — people with big money have already funded people with big egos to push forward with the castration of Fedora, come what may. So I’m writing a blog post way out here in the wilds of the unread part of the internet instead, mostly to satisfy my own urge to scream. Even if alone in the woods. Into a pillow. Inside a soundproof vault.

I already wrote an article about the current efforts to neuter Unix, so I won’t completely rehash all of that here. But it’s worth noting that the post about de-Nixing *nix generated a lot more support than hatred. When I write about political topics I usually get more hate mail than support, so this was unique. “But Unix isn’t politics” you might naively say — but let’s face it, the effort to completely re-shape Unix is nothing but politics; there is very little genuinely new or novel tech going on there (assloads of agitation, no change in temperature). In fact, that has ever been the Unix Paradox — that most major developments are political, not technical in nature.

As an example, in a response to the thread linked above, Konstantin Ryabitsev said:

So, in other words, all our existing log analysis tools have to be modified if they are to be of any use in Fedora 18?

In a word, yes. But what is really happening is that we will have to replace all existing *nix admins or at a minimum replace all of their training and habits. Most of the major movement within Fedora from about a year ago is an attempt to un-nix everything about Linux as we know it, and especially as we knew it as a descendant in the Unix tradition. If things keep going the way they are OS X might wind up being more “traditional” than Fedora in short order (never thought I’d write that sentence — so that’s why they say “never say ‘never’”).

Log files won’t even be really plain text anymore? And not “just” HTML, either, but almost definitely some new illegible form of XML by the time this is over — after all, the tendency toward laughably obfuscated XML is almost impossible to resist once angle brackets have made their way into any format for any reason. Apparently having log files sorted in Postgres wasn’t good enough.

How well will this sit with embedded systems, existing utilities, or better, embedded admins? It won’t, and they aren’t all going to get remade. Can you imagine hearing phrases like this and not being disgusted/amused/amazed: “Wait, let me fire up a browser to check what happened in the router board that only has a serial terminal connection and can’t find its network devices”; or even better, “Let me fire up a browser to check what happened in this engine’s piston timing module”?

Unless Fedora derived systems completely take over all server and mobile spaces (and hence gain the “foist on the public by fiat” advantage Windows has enjoyed in spite of itself) this evolutionary branch is going to become marginalized and dumped by the community because the whole advantage of being a *nix admin was that you didn’t have to retrain everything every release like with Windows — now that’s out the window (oops, bad pun).

There was a time when you could pretty well know what knowledge was going to be eternal (and probably be universal across systems, or nearly so) and what knowledge was going to change a bit per release. That was always one of the biggest cultural differences between Unix and everything else. But those days are gone, at least within Fedoraland.

The original goals for systemd (at least the ones that allegedly sold FESCO on it) were to permit parallel service boot (biggest point of noise by the lead developer initially, with a special subset of this noise focused around the idea of Fedora “going mobile” (advanced sleep-states VS insta-boot, etc.)) and sane descendant process tracking (second most noise and a solid idea), with a little “easy to multi-seat” on the side to pacify everyone else (though I’ve seen about zero evidence of this actually getting anywhere yet). Now systemd goals and features have grown to cover everything, including logging. The response from the systemd team would likely be “but how can it not include logging?!?” Of course, that sort of reasoning is how you get monolithic chunk projects that spread like cancer. It’s ironic to me that when systemd was introduced HAL was held up as such a perfect example of what not to do when writing a sub-system specifically because it became such an octopus — but at least HAL stayed within its govern-device-thingies bounds. I have no idea where the zone of responsibility for systemd ends and the kernel or userland begins anymore. That’s quite an achievement.

And there has been no end to resistance to systemd, and not just because of init script changeover and breakages. There have been endless disputes about the philosophy underlying its basic design. But don’t let that stop anybody and make them think. Not so dissimilar to the Gnome3/Unity flop.

I no longer see a future where this distro and its commercially important derivative is the juggernaut in Linux IT — particularly since it really won’t be Linux as we understand it, it will be some other operating system running atop the same kernel.

Come to think of it, changing the kernel would go over better than making all these service and subsystem changes — because administrators and users would at least still know what was going on for the most part and with a change in kernel the type of things that likely would be different (services) would be expected and even well-received if they represented clear improvements over whatever had preceded them.

Consider how similar administering Debian/Hurd is to administering Debian/Linux, or Arch/Hurd is to Arch/Linux, and how similar administering AIX and HP-UX is to administering, say, RHEL 6. We’re making such invasive changes through systemd that a change of kernel from a monolithic to a microkernel is actually more sensible — after all, most of the “wrangle services atop a kernel a new way” ideas are already managed in a more robust way as part of the kernel design, not as an intermediate wonder-how-it’ll-work-this-week subsystem.

Maybe that is simpler. But it doesn’t matter, because this is about deliberately divisive techno politicking on one side (in the vain hope that “if our wacko system dominates the market, we’ll own the training market by default even if Scientific Linux and CentOS still dominate in raw numbers!”), and ego masturbation on the other (“I’ll be such a rebel if I shake up the Unix community by repeatedly deriding so-called ‘Unix traditions‘ as outdated superstitions and generally giving the Unix community the bird!”).

Here’s a guide to predicting the most likely outcomes:

  • To read the future history* of how these efforts work out as a business tactic, check the history of Unix from the mid-1980′s to early 2000′s and see how well “diversification” in the interest of carving out corporate empires works. I find it strikingly suitable that political abuse of language has found its way into this effort — conscious efforts at diversification (defined as branching away from every other existing system, even your own previous releases) are always performed under the label of “standardization” or “conformance to existing influences and trends”. Har har. Joke’s on you, though, Fedora. (*Yeah, it’s already written, so you can just read this one. Easy.)
  • To predict the future history of a snubbed Unix community, consider that the Unix community is so used to getting flipped the bird by commercial interests that lose their way that it banded together to write Linux and the entire GNU constellation from scratch. Consider also that the original UNIX was started by developers who were snubbed and felt ill at ease with another, related system whose principal flaw was (ironically) none other than the same featuritis the Linux community is now enduring.

I don’t see any future where Fedora succeeds in any of its logarithmically expanding goals as driven by Red Hat. And with that, I don’t see a bright future for Red Hat beyond v7 if they don’t get this and other priorities sorted**. As a developer who wishes for the love of everything holy that I could just focus on developing consumer business applications, I’m honestly sad to say that I’m having to look for a new “main platform” to develop for, because this goose looks about cooked.

** (sound still doesn’t work reliably — Ekiga is broken out of the box, Skype is owned by Microsoft now — Fedora/Red Hat don’t have a prayer at getting on mobile (miracles aside) — nobody is working on anything solid to stand a business on once the “cloud” dream bubble pops — virtualization is already way overinvested in and done better elsewhere already anyway — easy-to-fix media issues aren’t being looked at — a new init system makes everything above worse, not better, and is distracting and requires admins to completely retrain besides…)