Use a Bigger Hammer

I spent much of today in my advisor’s lab, trying to incorporate some of his suggestions into the next draft of my thesis proposal. So far, so good. However, as a result of this afternoon’s labours, I have a bone to pick with Adobe Systems, Inc., the creators of the Portable Document Format (PDF) language. Specifically, I would like to register my extreme displeasure with their Adobe Reader product, which, for lack of a better description, sucks rocks. Not only is it bloatware, but it failed to do the one very simple thing I needed from it.

Now, you could argue that I shouldn’t expect much from a free download, but given how much Adobe gets for the complete set of Acrobat tools, you’d think they would want to impress people with the quality of their stuff. If you thought that in this case, you’d be dead wrong. And besides, there is lots of really good software available for free from the open source community, so my standards have gotten pretty high.

The basic issue was this: I wrote my proposal in LaTeX, which I converted into PDF format to send to my advisor for comments. This part is nice and easy, thanks to TeXShop, a really beautiful open-source TeX front end for MacOS X. The trouble began when he supplied his comments by inserting them directly into the PDF file, using one of the Acrobat tools, and e-mailed the resulting file back to me.

Ordinarily, I use Apple’s “Preview” application to view and print PDF files. It is nice and small and fast, and it came free with my operating system—what more could you ask for in a program? This time, unfortunately, it didn’t work. I could see the original contents of the file just fine, but Preview doesn’t know how to interpret the special markup Adobe’s tools inserted, so Sean’s comments were invisible to me. Well, bugger. So, I went off and downloaded the latest version of Adobe Reader.

Once I’d installed all 83 megabytes of that, I was able to see Sean’s comments in the file. Much to the good, apart from the overuse of disk space. Since it’s tedious to have to flip back and forth between two applications to read comments and edit, I figured I’d print out the comments, and then edit from a hard copy. I prefer that mode of operation anyway, since it lets me scribble in the margins, which is something the computer doesn’t do well.*

* There are products that let you write on the screen with a stylus, but they are not as naturalistic as a real pencil on paper.

So, I tried the obvious thing: I opened the PDF file, and invoked the “Print…” command. I chose the option to print markup as well as the main document, and fired it off to the printer.

The result was a hard copy in which each of the places that was highlighted in yellow on screen, indicating the presence of a comment, was highlighted in grey (this being a black-and-white printer). No comment text was printed.

So, I went online and looked for documentation. I figured this had to be an FAQ, but I couldn’t find it anywhere. Even Lord Google couldn’t find the answer, although it did an excellent job of finding the parts of Adobe’s web site I needed to look at. Someone had asked the same question on their online forum, but nobody had answered it. I carefully read through all of the help files included with the program including the “ReadMe.html” that nobody ever reads unless they’re a CPA. No dice: I could view those comments about ten ways from Sunday, but if there is any way of printing them short of taking nine pages of screenshots and printing those, I’ll be buggered if I could find it.

To make a long story short, I wasted more than an hour trolling through documentation online, a spectacular confirmation of the principle that computers save time like kudzu prevents soil erosion. My eventual solution to the problem? I borrowed an unused monitor from somewhere else in the lab, and hooked it up to my PowerBook’s external video port. I probably could have done this in the first place, and saved both time and dead trees, but sometimes you just get a bee in your bonnet about making it work the way you want.

The moral of the story? If it doesn’t work, hit it with a hammer. If it still doesn’t work, use a bigger hammer.

Oh, and also, if Adobe were to spend a quarter of the effort they put into supporting DRM on improving the UI and printing support, Reader would probably be a damned fine product. I regret the necessity of the subjunctive.

Eschew a Supervacaneous Copia Verborum

The abstract for today’s Jones Seminar talk began with the following line:

“Stratospheric ozone destruction due to the release of anthropogenic halogenated hydrocarbons continues.”

Now, as far as I can tell, Dr. Linteris is trying to tell us that humans continue to screw up the ozone layer with nasty chemicals. Fair enough. But this talk was supposed to be for a general audience, so I have to wonder what the hell made him think he had to write “anthropogenic halogenated hydrocarbons” instead of “man-made chemicals.” The latter may be less accurate, but it’s a damned sight easier to read.

I’ve got nothing against Dr. Linteris; I’m just pointing this out as an example of a more general problem. Academic writing often suffers from excessive wordiness, to its own detriment. Every human activity has its own language of terms and analogies, designed to simplify communication, and science and engineering are no exception. But I think it’s important to keep your audience in mind, too: If Dr. Linteris were giving this talk to the Fire Research Division at NIST, and he opened up with a line like “Humans continue to screw up the ozone layer with nasty chemicals,” they’d probably think he’d been drinking. On the other hand, “stratospheric ozone destruction due to the release of anthropogenic halogenated hydrocarbons,” even to a well-educated layman’s ear, bears the sharp overtone of wanking.

I have to admit that I’m sometimes guilty of larding my own prose, a fact that comes as no surprise to anyone reading this. I love the elaborate variety of English vocabulary, and I enjoy finding clever ways to condense a complex idea into a sentence that is pithy, yet precise. Some writers, however, seem to be deliberately hiding the apparent simplicity of their work, to make it seem more tricky and grandiose than perhaps it is. This behaviour is not limited to the humanities, either, despite stereotypes to the contrary: Scientists, social scientists, and engineers are just as guilty as literary critics, when it comes to obfuscatory prolixity.* I find such writing to be insincere, even fraudulent, especially from an academic. To paraphrase Bill McKeeman, there is a short, simple story lurking inside every complex idea, and the true task of an honest academic should be to tell that story, as clearly and as accurately as possible.

* In fact, sometimes scientific writing is worse, because of all the carefully-defined mathematical vocabulary used in the sciences. Then again, academics in the humanities often go to great lengths to avoid giving any clear definitions for their terms, so maybe it all works out about the same.

Now, I’m not saying we should get rid of all technical vocabulary, and force everybody to write in “plain language.” Can you imagine reading Dick and Jane Explain Deconstructionist Criticism, or Uncle Bubba’s Down-Home Guide to the Higgs Boson? Technical jargon has a legitimate use, helping people communicate about complicated ideas. Like a telescope, jargon can be a powerful tool, if you use it correctly. If you look at it from the wrong end, it can be pretty confusing, but that’s no reason to throw out your telescope. Still, I think the world would be a better place if we made some kind of concerted effort to stomp out deliberate attempts to confuse the issue.

If nothing else, it would make it easier to figure out what the hell these papers I’m reading are trying to say. Wouldn’t that be nice?

Playing the Odds

Some people say that lotteries are a tax on stupid people, although you don’t hear that sentiment very often from the people who have won. Nevertheless, it got me thinking about the problem of how to tell when it’s worthwhile to lay out a few bucks for a lottery ticket. The analysis that follows is quite simple, but I found it entertaining to think about, so maybe you will too.

Let’s start from the assumption that it’s worthwhile to play the lottery if you can reasonably expect to come out ahead. For my purposes, that means you wind up with more money than you started with. Unfortunately, you can’t know in advance what will happen—that’s the whole point of the game. All you really know is what it costs to play, how much you can expect to win, and the conditions for winning. So, let’s say that C represents the cost of a lottery ticket (in dollars), V stands for the value of the jackpot (in dollars), and P stands for the probability that you win the jackpot (a value between 0 and 1).

Most lotteries work this way: When you buy a ticket, you pick some numbers that get printed on the ticket. Later on, the people who run the lottery pick some numbers at random,* and if their numbers match yours exactly, you win V dollars; otherwise, you lose. Actually, most lotteries have various smaller prizes if you match most of the numbers, even if you didn’t win the jackpot, but I’ll come back to that later—for now, we’ll concentrate on the all-or-nothing version of the lottery.

* The meaning of the word “random” is an interesting question by itself. Broadly speaking, it means “unpredictable,” and for the sake of simplicity, I’m going to assume that when lottery numbers are picked, any possible combination of numbers is equally likely.

So, supposing you play the lottery, how much money would you expect to win? If all your numbers match, you get V − C dollars, which is the total jackpot less the cost of the ticket—but the probability of that occurring is very small. If you miss, however, you are out C dollars: this is much more likely. If you want to figure out how much you should expect to win, you will have to take these probabilities into account.

To get some intuition for how this works, consider a very simple lottery in which you pay $1 for a ticket, choose a single number between 1 and 6, and you win a $13 jackpot if a single roll of a fair six-sided die matches the number you picked. For this simple lottery, the chance of winning is P = 1/6, or, to take the pessimist’s view, the chance of losing is 5/6. Suppose you play this lottery over and over again, many times. About 1/6 of the time, you will win, coming out $12 ahead; 5/6 of the time, you will lose, coming out $1 behind. By this reasoning, if you played this game n times in a row, you could reasonably expect to have made around \(n \cdot \frac{1}{6} \cdot 12 - n \cdot \frac{5}{6} \cdot 1 = \frac{7n}{6}\) dollars. Factoring out n gives us the amount we might expect to win on a single play as \(\frac{1}{6} \cdot 12 - \frac{5}{6} \cdot 1 = \frac{7}{6}\), which is about $1.17, just slightly more than the cost of a ticket. That doesn’t mean you will actually win $1.17, but that if you played many times, your average winnings would be $1.17 per ticket, so you would definitely expect to come out ahead if you kept playing long enough.
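The die lottery is small enough to check by machine. The following is a minimal Python sketch, not anything from the original analysis: the function names, the fixed seed, and the choice to always bet on face 1 are all assumptions made for illustration. It computes the exact expected net winnings per play and compares that with the average over many simulated plays.

```python
from fractions import Fraction
import random


def die_lottery_expected_value(jackpot, ticket_cost=1, sides=6):
    """Exact expected net winnings per play, in dollars."""
    p_win = Fraction(1, sides)
    # Win: jackpot minus ticket cost. Lose: out the ticket cost.
    return p_win * (jackpot - ticket_cost) - (1 - p_win) * ticket_cost


def simulate_die_lottery(jackpot, plays, ticket_cost=1, sides=6, seed=0):
    """Average net winnings per play over many simulated plays."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    pick = 1  # with a fair die, every face is an equally good bet
    total = 0
    for _ in range(plays):
        roll = rng.randint(1, sides)
        total += (jackpot - ticket_cost) if roll == pick else -ticket_cost
    return total / plays


print(die_lottery_expected_value(13))     # 7/6
print(simulate_die_lottery(13, 200_000))  # should land close to 1.17
```

The simulated average wanders around the exact value and settles closer to it as the number of plays grows, which is all that "expected" really promises.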

By contrast, suppose the jackpot were only $5 instead. Now when you win, you are $4 ahead, so that the expected value is \(\frac{1}{6} \cdot 4 - \frac{5}{6} \cdot 1 = -\frac{1}{6}\), or about -17¢, which means you would expect to lose an average of seventeen cents each time you play. That is not a winning proposition.

Thus, the expected value of a lottery, which I will represent as E, can be computed in this way:

\[ E = P(V - C) - (1 - P)C = PV - C \]

What does this mean? Well, if E comes out positive, you’d expect to come out ahead on average. On the other hand, if E comes out negative, you’d expect to fall behind on average, and if E is exactly zero, you’d expect to break even on average. Of course, this says nothing about the exact outcome of a single play, but it does at least give you a good baseline to work from. Another way to look at this is that you break even when the probability of winning (P) times the value of the jackpot (V) is exactly equal to the cost of a ticket (C). Increasing the jackpot improves the value of E in your favour; any decrease in the probability of winning reduces E to your detriment.

A more realistic example is the “Tri-State Megabucks,” run by the New Hampshire State Lottery Commission. The published odds of winning the jackpot are P = 1/5245786. A single ticket costs C = $1. In order for the expected value of this lottery to break even, the value of the jackpot must satisfy the following equation:

\[ E = \frac{V}{5245786} - 1 = 0 \]

This occurs when V = $5,245,786. So, that means the jackpot has to be at least 5.25 million dollars before it is statistically worthwhile for you to pay $1 for a ticket. And that’s even before you take into account all the other rules and regulations governing multiple winners, amount of prize, and so forth. By the time you figure all the rest of it in, the real jackpot will need to be pretty high to make it worth your dollar. Also, even if the jackpot is $10M, your expected profit on a $1 ticket is only 90¢. That plus another five bucks might have bought you a demitasse of plain espresso at your local Starbucks, if you hadn’t spent it on a lottery ticket.
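For readers who want to try other jackpots or ticket prices, here is a small Python sketch of the all-or-nothing calculation. The function names are mine, and it assumes a single winner who collects the full jackpot, ignoring taxes and shared prizes.

```python
from fractions import Fraction


def expected_value(p_win, jackpot, ticket_cost):
    """Expected net winnings per ticket: probability times jackpot, less cost."""
    return p_win * jackpot - ticket_cost


def break_even_jackpot(p_win, ticket_cost):
    """Jackpot at which the expected value is exactly zero."""
    return ticket_cost / p_win


P = Fraction(1, 5_245_786)  # published jackpot odds
C = 1                       # ticket price in dollars

print(break_even_jackpot(P, C))                  # 5245786
print(float(expected_value(P, 10_000_000, C)))   # roughly 0.91
```

Using exact fractions avoids any rounding in the break-even figure; the conversion to float happens only for display.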

That brings up something else I haven’t mentioned before, something economists call “opportunity cost.” Suppose you spend $1 on a lottery ticket, and then a few minutes later, you stop by the coffee shop. You wanted a blackberry scone, but it costs $6, and you only have $5 left in your wallet. If you hadn’t spent that dollar on a lottery ticket, you could have had your scone, which is obviously worth $6 to you (or you wouldn’t have stopped in for one). If you win the lottery, then you probably don’t care; but if you lose, then you have not only lost the $1 you spent on the ticket, but you have also forgone an opportunity to buy a $6 blackberry scone. That opportunity has some value to it (even if it’s not really the whole $6), and the loss of that value should be counted against the lottery ticket. Putting a number on opportunity cost is not easy, but if you really wanted that scone, you just bought a $7 lottery ticket, and raised your break-even jackpot to something like 36.7 million dollars.
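As a rough worked version of that last figure, suppose the forgone scone adds its full $6 to the effective cost of the ticket. That is an assumption, and as noted, the true opportunity cost is probably less. The symbol \(C_{\text{eff}}\) is just a label for that effective cost, introduced here for illustration. Setting the expected value to zero then gives:

\[ V = \frac{C_{\text{eff}}}{P} = 7 \times 5{,}245{,}786 = 36{,}720{,}502 \]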

In real lotteries, the situation is not quite as bleak as this, since you don’t actually have to match all the numbers in order to win something. In the Tri-State Megabucks, for example, you win the jackpot with probability 1/5245786, a $10,000 prize with probability 1/874298, a $1,000 prize with probability 1/24980, a $50 prize with probability 1/9992, a $40 prize with probability 1/588, a $5 prize with probability 1/441, a $2 prize with probability 1/53, and a $1 prize (free ticket) with probability 1/40. In general, if you have k prizes, where the i-th prize has value V_i and probability P_i, the expected value is

\[ E = \sum_{i=1}^{k} P_i V_i - C \]

For the Tri-State Megabucks, using the odds listed above and counting the free ticket as a $1 prize, this breaks even when the jackpot is approximately $4.2M, taking into account all the lesser prizes and ignoring opportunity cost.
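The multi-prize calculation can be sketched the same way. This Python snippet plugs in the prize table exactly as listed above. It assumes the free ticket is worth its $1 face value and that each prize simply contributes its value times its probability. The names are illustrative, and the result depends on those assumptions.

```python
from fractions import Fraction

# Lesser prizes as (prize in dollars, odds denominator), as listed in the text.
LESSER_PRIZES = [
    (10_000, 874_298),
    (1_000, 24_980),
    (50, 9_992),
    (40, 588),
    (5, 441),
    (2, 53),
    (1, 40),  # free ticket, valued here at the $1 ticket price (assumption)
]
JACKPOT_ODDS = 5_245_786
TICKET_COST = 1


def lesser_prize_value():
    """Expected winnings per ticket from every prize except the jackpot."""
    return sum(Fraction(prize, odds) for prize, odds in LESSER_PRIZES)


def expected_value(jackpot):
    """Expected net winnings per ticket, counting all prizes."""
    return Fraction(jackpot) / JACKPOT_ODDS + lesser_prize_value() - TICKET_COST


def break_even_jackpot():
    """Jackpot at which the expected value is exactly zero."""
    return (TICKET_COST - lesser_prize_value()) * JACKPOT_ODDS


print(f"{float(lesser_prize_value()):.4f}")   # 0.1986
print(f"{float(break_even_jackpot()):,.0f}")  # about 4.2 million under these assumptions
```

Note that the lesser prizes alone return only about twenty cents on the dollar, so nearly all of the break-even burden still falls on the jackpot.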

So, should you play, or shouldn’t you? Clearly, if the expected winnings of the lottery (each prize times its probability, added up) aren’t at least as much as the ticket price, you should probably keep your money, unless you’re pretty sure the gods are smiling upon you in all your endeavors. What’s more, if you play hoping to get one of the lesser prizes, keep in mind that the expected value omitting the jackpot, leaving all other chances the same, is a net loss of roughly 80¢ per ticket. The average only comes out ahead if you are gambling on your chances of winning the whole thing.

It’s your dollar. Spend it how you will.