Friday, February 27, 2009

Wholly Whexagons!

Dear Dr. Math,
I've noticed that hexagons show up in a lot of different places. Now that I've started looking for them, I see them everywhere! What's the deal with hexagons?
Jules, Canton OH

Dear Jules,

I know I'm not supposed to play favorites with mathematical objects, but I have to confess, the hexagon is probably my favorite shape. (Sorry, rhombicuboctahedron.)

While I suspect that you may be experiencing a fair amount of confirmation bias, it's true that hexagons do make an appearance in a large variety of different contexts. (It's also possible that you have the "hexagon madness" and are seeing them when they're not actually there. You might want to get that checked out.)

Part of the reason hexagons are so ubiquitous is that they have so many useful properties, probably even more than "familiar" shapes like squares and trapezoids. Primarily, I'm referring to regular hexagons--hexagons with 6 equal sides and 6 equal angles--like this guy:

Probably these are the ones you're seeing, Jules. Next time, I'll talk a little about the properties of irregular hexagons and why you might expect to see those, too.

First of all, a regular hexagon has the property that its opposite sides are parallel to each other, making it an ideal shape for a nut or bolt, because it fits nicely into a wrench:

Squares, octagons, and some other n-gons [those with even n] have the same property, but with a hexagonal nut or bolt, you can grab it at a variety of different angles, which is useful if you're putting together Ikea furniture in a tiny Manhattan apartment, for example. Also, since the exterior angles of an n-gon add up to 360° and there are n of them, each one measures 360/n. Therefore, more sides aren't really so good, because as the number of sides gets larger, the sharpness of the corners decreases, allowing for a greater possibility of slippage. A hexagon seems to be a nice compromise between a 2-gon and an $\infty$-gon for these purposes.

Another important property of regular hexagons (that I'm sure you're aware of if you've ever looked at the floor of a public bathroom) is that they tile the plane. In other words, by putting a bunch of identical hexagons together as tiles, we can cover an entire plane surface:

There are other tilings, of course, with squares or triangles, but this one has some very appealing aspects. (For one, it's made of hexagons!) It turns out that among all possible tilings of the plane of shapes with a fixed area, the hexagonal one has the smallest possible perimeter.

One way to think about this is that the perimeter:area ratio goes down as a shape gets closer to being a circle. So, if we're using n-gons to tile the plane, we want n to be as big as possible. On the other hand, we have to be able to glue them together so that at each point of intersection, the angles add up to 360°. By the same reasoning as before, we can show that the interior angles on an n-gon are each 180 - 360/n, and since there have to be at least 3 of these angles meeting at each corner, the greatest this angle can be is 360°/3 = 120°. In this case, $180 - 360/n = 120$, so 360/n = 60; therefore, n=6. Hexagon!
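
The angle-counting argument is easy to check by brute force. Here's a minimal sketch, assuming we only care about tilings made from copies of a single regular n-gon meeting corner to corner; the helper names `interior_angle` and `tiles_plane` are made up for illustration:

```python
from fractions import Fraction

def interior_angle(n):
    """Interior angle of a regular n-gon in degrees: 180 - 360/n."""
    return 180 - Fraction(360, n)

def tiles_plane(n):
    """True if a whole number of interior angles fits exactly around a point."""
    return (360 / interior_angle(n)).denominator == 1

# Try every regular n-gon from the triangle up to a 49-gon.
tilers = [n for n in range(3, 50) if tiles_plane(n)]
print(tilers)  # [3, 4, 6]
```

Triangles, squares, and hexagons are the only ones that pass, and 6 is the largest n on the list.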

Now, why does all of that matter? Well, say you weren't cutting these tiles out of a piece of ceramic but instead were building up walls to section an area into a number of chambers. If the material in those walls was really expensive for you to produce, it would bee in your best interests to make the chambers in the shape of a regular hexagon:

I don't know how bees managed to figure this out and yet here we are still living in rectangular grids like chumps.

A slightly different, but related, property of the regular hexagonal tiling is that it shows up if you're trying to pack together some circles:

See the hexagons?

Once again, this way of packing circles has the property of being optimal, in the sense that it leaves the least amount of empty space between circles. In fact, using a little trigonometry, we can even work out the efficiency of this packing:

The triangle in the picture is equilateral with all sides equal to 2*r, where r is the radius of the circles we're packing. If we split one in half (where the two black circles intersect), we'll get a right triangle with hypotenuse 2*r and one leg r. Therefore, by the Pythagorean Theorem, if h is the height, then $r^2 + h^2 = (2r)^2$. So $h^2 = 3r^2$, and therefore, $h = \sqrt{3}\,r$. That means the area of the triangle is $\frac{1}{2}(2r)(\sqrt{3}\,r) = \sqrt{3}\,r^2$.

Inside each triangle are three pieces of a circle, which together make up half of a circle of radius r. Thus, the area of the circular pieces is $\frac{1}{2}\pi r^2$. This means the ratio of circular area to total area of the triangle is $\frac{\pi r^2/2}{\sqrt{3}\,r^2} = \frac{\pi}{2\sqrt{3}}$, approximately 0.91. Since the whole plane is made up of these triangles, the proportion of circle-area to total-area is the same, meaning the circles take up about 91% of the space. Pretty efficient, and fun at parties, too!
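
For anyone who'd rather let a computer do the trigonometry, here is a small sketch of the same calculation. The function name `hex_packing_density` is just an illustrative choice, and the radius is arbitrary because it cancels out of the ratio:

```python
import math

def hex_packing_density(r=1.0):
    """Fraction of the plane covered by circles of radius r in the hexagonal packing."""
    # Height of the equilateral triangle with side 2r, from the Pythagorean Theorem.
    h = math.sqrt((2 * r) ** 2 - r ** 2)
    triangle_area = 0.5 * (2 * r) * h
    # Three 60-degree sectors sit inside the triangle: together, half a circle.
    circle_area_inside = 0.5 * math.pi * r ** 2
    return circle_area_inside / triangle_area

print(round(hex_packing_density(), 4))  # 0.9069
```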

To Be Continued...


Tuesday, February 24, 2009


With Valentine's Day just passed and Ash Wednesday lurking around the corner, I know the topics of sex and pregnancy are on a lot of people's filthy guilt-ridden minds these days. To help people understand their risks, and to show I'm not a prude, I'm hosting a little get-together (an orgy, if you will) of questions all about sex. So turn the lights down low, put on some soft music, and enjoy this special "adults only" post about what we in the math business call "multiplication."*

Dear Dr. Math,
I read in an article that "Normally fertile couples have a 25 percent chance of getting pregnant each cycle, and a cumulative pregnancy rate of 75 to 85 percent over the course of one year." How do you go from 25% to 85? I don't see the connection between those two numbers.
Name Withheld

As is often the case, Name, the way to understand the probability of getting pregnant over some number of time intervals (I almost wrote "periods" there but then reconsidered) is instead to think about the probability of not getting pregnant during any of those intervals. We can use the fact that the chance of something happening is always 1 minus the chance of it not happening. This turns out to be a generally useful technique whenever you're interested in the occurrence of an event over multiple trials. To take my favorite over-simplified example of flipping a coin, if we wanted to find the chance of flipping an H (almost wrote "getting heads"--geez, this is har.., er, difficult) in the first 3 flips, we could go through all of the possible 3-flip sequences and count how many of them had at least one H, or we could just observe that only one sequence doesn't contain an H (namely, TTT). Since the probability of flipping T ("getting tails") is $\frac{1}{2}$ on each flip, the chance of "doing it three times" is $\left(\frac{1}{2}\right)^3 = \frac{1}{8}$. Thus, the probability of at least one H is $1 - \frac{1}{8} = \frac{7}{8}$. Phew.

Similarly here, there are lots of different ways to get pregnant over the course of a year (believe me), but only one way to not get pregnant. If we take the first statistic as correct, that the chance of a normally fertile couple getting pregnant in each cycle is 25%, then we could assume that the chance of not getting pregnant in each cycle was 75%, or 0.75. Assuming a "cycle" is 28 days long, there would be 13 cycles per year, so by the same reasoning as above, we could say that the chance of not getting pregnant in a year is $0.75^{13}$, about 2.4%. So, the chance of "being in the family way" at some point during the year would be $1 - 0.024 = 0.976$, or 97.6%.
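
The "1 minus the chance of it never happening" trick fits in one line of code. A minimal sketch, assuming independent trials with the same chance each time (the function name `chance_at_least_once` is invented for this example):

```python
def chance_at_least_once(p_per_trial, trials):
    """Probability an event happens at least once in `trials` independent tries."""
    return 1 - (1 - p_per_trial) ** trials

# Coin example: at least one H in 3 flips.
print(chance_at_least_once(0.5, 3))  # 0.875

# Pregnancy example: 25% per cycle, 13 cycles of 28 days in a year.
print(round(chance_at_least_once(0.25, 13), 3))  # 0.976
```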

Now, that doesn't match up with the observed number you quoted, 85%. In the study, of course, all they do is assemble some group of "normally fertile" couples and count the number of times they get pregnant in a year. We were trying to solve the problem "top down" whereas the data is observed from the "bottom up." What's going on? Well, the problem was our assumption that the different cycles were independent from each other, in the sense that knowing what happened in one cycle doesn't affect our estimation of what will happen in the next. For coin-flipping, this is a reasonable assumption, but for copulation, not so much. It makes sense that there should be some correlation between the different cycles, because the possible causes for infertility one month might continue to be true the next. For example, it could be that either or both partners have some kind of medical condition that makes conception less likely. Or maybe the guy's underwear is too tight, I don't know. But it seems that the assumption of independence probably doesn't hold. Also, it's not entirely clear what's meant by "normally fertile" here, since (as far as I know) it's only really possible to know if a couple is "fertile" if they've succeeded in having a baby. So, it's possible that the data includes some number of couples who were just less fertile and perhaps didn't know it.

The correct way to understand these compound probabilities is to consider the probability of not conceiving in one cycle conditional on the event that you had not conceived the cycles previously. Unfortunately, I don't have access to that information from personal experience, nor a good mental model for what numbers would be reasonable. However, it seems like the probability of not conceiving should be higher than ordinary if you know already that you've gone some number of months without conceiving. As a result, the odds of getting pregnant in a year should be lower than our estimate assuming independence, which does in fact agree with the data.
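
To see the direction of the effect, here is a toy sketch. The 50/50 split and the 40% and 10% per-cycle chances below are made-up numbers, not data; they were picked only so that the average per-cycle chance across couples is still 25%:

```python
def cumulative_rate(groups, cycles=13):
    """groups: list of (share_of_couples, per_cycle_chance) pairs."""
    return sum(share * (1 - (1 - p) ** cycles) for share, p in groups)

# Everyone identical at 25% per cycle (the independence calculation).
uniform = cumulative_rate([(1.0, 0.25)])

# HYPOTHETICAL mix of more and less fertile couples, averaging 25% per cycle.
mixed = cumulative_rate([(0.5, 0.40), (0.5, 0.10)])

print(round(uniform, 3))  # 0.976
print(round(mixed, 3))    # 0.872
```

The mixed population ends the year with a lower pregnancy rate, because the couples still trying after several months are disproportionately the less fertile ones. That doesn't prove this is what's happening in the real data, only that the explanation points the right way.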

Dear Dr. Math,
Planned Parenthood's web site says, "Each year, 2 out of 100 women whose partners use condoms will become pregnant if they always use condoms correctly." Is that the same as saying that condoms are 98% effective? If so, does that mean that if you have sex 100 times, you'll likely get somebody pregnant twice? (I mean, if you're a man. If you're a woman I imagine the rate of impregnating your partner will probably slip in the direction of zero.) Yours always,
Name Withheld

Oh, you freaky Name Withheld, you've asked the question backwards! In fact, the statistic you give of 2 women out of 100 becoming pregnant in a year is how the effectiveness of condoms is defined. That is, in the birth control industry, specifically, when someone claims that a particular method is "x% effective," it means that if a group of women use that method, over the course of the year about (100-x)% of them will get pregnant. Now, there are a number of assumptions being made here, not the least of which is that those women (and their partners) used the method correctly. Without actually going into people's bedrooms (or living rooms, or kitchens?) and tallying up on a clipboard whether their condom use was "incorrect", it's impossible to know for sure. Instead, people who do surveys of this kind have to rely almost exclusively on what people say they did. And let me ask you something: If you accidentally impregnated someone/got impregnated by someone while nominally using some birth control method, would you say, when asked, that you had been using it "incorrectly"? Or would you, as all good carpenters do, blame your tools?

Another implicit assumption is that the respondents reflect a typical number of sexual encounters in a year. Again, I don't know how they decide what participants to include in this kind of study or how they verify the claims they get, but according to some studies I was able to find, the average "coital frequency", as it's romantically known, for both married and single people in the U.S. is somewhere around 7 encounters per month. Therefore, if we treated the experiences as being independent (with the same caveat as in the previous question), we could estimate the probability of unintended pregnancy in a single sexual encounter:

Let's call the probability p. So the chance of not getting pregnant during a given sex act is (1-p). We'll accept the 7 times/month figure and assume a total of $7 \times 12 = 84$ sexual encounters per year, all including correct condom usage. As in the coin example, we've assumed independence, so the probability of not getting pregnant over the course of 84 trials is $(1-p)^{84}$, which we're assuming is equal to the stated number of 98%. Therefore, we have:

$$(1-p)^{84} = 0.98$$

And so $1 - p = 0.98^{1/84} \approx 0.99976$, meaning that p is very small, about 0.02%. Therefore, if you had sex 100 times, as you say (and congrats, btw), you could expect to make an average of 0.02 babies.
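
The per-encounter estimate can be reproduced in a few lines. This is only a sketch under the same assumptions as the text (independent encounters, 7 per month, a 2% annual failure rate); the variable names are illustrative:

```python
annual_failure = 0.02      # 2 out of 100 women per year, per the quoted statistic
encounters = 7 * 12        # assumed 7 encounters per month

# Solve (1 - p) ** encounters = 1 - annual_failure for p.
p = 1 - (1 - annual_failure) ** (1 / encounters)

print(f"{p:.4%}")            # 0.0240%
print(round(100 * p, 3))     # expected pregnancies in 100 encounters: 0.024
```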

Some important notes:
1) Our assumption of independence here may be more reasonable than in the previous example, because it's possible that whatever factors contribute to a birth control method failing despite proper use may be due more to chance than any kind of recurring trends.
2) Also, these numbers don't account for the fact that (as we saw above) the chance of getting pregnant in a year even without any protection is something like 85%. So, in a sense, condoms "only" reduce the risk of pregnancy from 85% to 2%.
3) We've only been talking about pregnancy here, not the risks of other things like STDs or panic attacks.
4) Wear a condom, people!

Dear Dr. Math,
Mathematically speaking, what number makes for the best sexual position?
Name Withheld

You seem to be asking a lot of questions, NW.

Personally, I've always enjoyed the ln(2π).


*Also acceptable: or "integration by parts".

Monday, February 23, 2009

The Infinite Monkey Strikes Back

I've gotten some very interesting responses to my post about the Infinite Monkey Theorem, concerning the likelihood of a monkey accidentally reproducing The Hobbit by randomly generating letters. So I thought I'd write a follow-up to address some of them; also it gives me another chance to imagine a monkey typing on a typewriter. (I tried letting a monkey type this up for me, but he wrote something much more interesting than I had planned.)

Dear Dr. Math,
I believe the monkey problem makes the simplifying assumption that all the letters in the text are independent, which they're not in real English (or any other language). How does this affect the results? Real texts are sampled very narrowly from the space of possible letter sequences.

Excellent question, CN; I'm glad you brought it up. In fact, the distribution of the letters in the text is irrelevant to the problem. The important assumption is that the letters being output by the monkey are equally likely and probabilistically independent of each other. Under this assumption, it doesn't matter if the text the monkey's trying to match is The Hobbit or the phone book or a sequence of all "7"s--the probability of matching any sequence of 360,000 characters will be $(1/N)^{360{,}000}$, where $N$ is the number of equally likely keys on the typewriter. If you need convincing, consider the simpler example of flipping a fair coin 5 times. Any particular sequence of 5 flips, for example HHTHT, has the same probability, $\left(\frac{1}{2}\right)^5 = \frac{1}{32}$, of coming up. So, if we were trying to match any flip sequence, we'd have the same chance. The same is true here, just on a much larger scale (and with monkeys).

Now, many people object to this idea, because they think that a letter sequence like "a;4atg 9hviidp" is somehow more "random" than a sequence like "i heart hanson". Therefore, they reason, the first sequence would be more likely to occur by chance. But actually the two sequences have exactly the same probability of occurrence, under our assumptions. Really, the only difference between the two is what we could infer about the message source (beyond musical tastes) based on receiving such an output. I hope to discuss this in detail someday in the context of elementary hypothesis testing, but if, say, there were some doubt in our minds as to whether the source of these characters was, in fact, uniformly random, the latter message would give us considerable evidence to help support that doubt. The reason is that we could provide an alternative hypothesis that would make the observed data much more likely. For the monkey problem, however, we were assuming that we knew how the characters were being generated, so there's no doubt.

Dear Dr. Math,
Along the lines of your previous questions on large numbers and randomly generating the book The Hobbit, I'd like to ask about randomly generating images. A low res PC display is normally 640 x 480 pixels. If you randomly generated every combination of color pixels, wouldn't you have created every image imaginable at that resolution? That is, one of the screens would be the Mona Lisa, one would be your Ask Doctor Math page, one would be a picture of the Andromeda galaxy from close up, one would be a picture of you!, etc. If you only wanted to look at black & white images, you'd have a much smaller collection, but once again wouldn't you generate every B&W screen image possible?
With feature recognition software getting better all the time, one could "mine" these images for recognizable features. Similar to the way the pharmaceutical companies sequence through millions of plants to find new substances, one could sequence through these images to extract unknown info.

Dear Mike,

Absolutely, we could apply the same techniques to any form of information that can be reduced to a sequence of numbers or letters, like images, CDs, chess games, DNA sequences, etc. In fact, we needn't generate them randomly, either. As in The Library of Babel example, one could imagine a vast collection of all possible sequences, generated systematically one at a time with no repeats. Unfortunately, for any interesting form of information, the number of possibilities is simply too great to make it practical.

In your example of 640x480 pixel images, even assuming the images were 1-bit monochrome, there would still be 2 possibilities ("on" or "off") for each of the 640*480 = 307,200 pixels. Therefore, the number of possible images would be $2^{307{,}200}$, which is about $10^{92{,}476}$. Remember how big a googol is? Well, this number is about a googol raised to the 925th power. So, not even all the crappy low-res monitors in the universe could possibly display them, even at their lousy maximum refresh rate of 60 images/second. And even worse, we'd have no reason to believe any of the images we did see, because they'd be indistinguishable from all the many other conflicting images.
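
The size of that number is easy to estimate with logarithms, without ever writing the number out. A quick sketch, taking a googol to be 10^100:

```python
import math

pixels = 640 * 480                      # one bit per pixel
digits = pixels * math.log10(2)         # log10 of 2 ** pixels

print(pixels)                 # 307200
print(math.floor(digits))     # 92476, so 2**307200 is roughly 10**92476
print(round(digits / 100))    # 925, so roughly a googol to the 925th power
```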

Your comparison to pharmaceutical companies is interesting, but remember those companies are starting with a large (but manageable) collection of plants that actually exist, not searching through the space of all possible arrangements of plant cells or something. It's OK to search for a needle in a haystack sometimes, but not when the haystack is larger than the known universe.

Dear Dr. Math,
Unless I misunderstand (and that's quite possible), I think you've introduced a major flaw here... The "second chunk" begins at character number 2, not character number 360,001. There is no reason why these should be considered discrete chunks and so just because the first character isn't "I" doesn't affect the fact that the second and subsequent characters may spell out the work. Thusly, your monkeys are producing over 17 million "blocks" a day, not just 48...
A. Nonymous

Well, A, that all depends on how we set up our assumptions. The way I had pictured things, the monkey was typing out a whole manuscript of 360,000 characters at a time and then having someone (perhaps J.R.R. Tolkien himself!) check it over and see if it was exactly the same as The Hobbit. If not, the monkey tries again and, with very high probability, fails.

However, your idea is more interesting and perhaps more "realistic". That is, we could have Prof. Tolkien just watch over the monkey's shoulder as it typed and see if any string of 360,000 consecutive characters were The Hobbit. So, if the monkey started by typing a preamble of gibberish and then typed the correct text in its entirety, we'd still count it as correct. As you say, this means that the possible "chunks" we'd need to consider have a lot of overlap to them--we might find the text in characters 1 through 360,000 or 2 through 360,001, etc. But unfortunately, it's not just the number of chunks being produced we need to reconsider; because of the way they overlap, we've now introduced relationships between the chunks that mean our assumption of independence no longer holds. For example, if we knew the first block of characters was incorrect, we could determine whether it was even possible for the second block to be correct based on the particular way the first block was wrong. In fact, we'd know it was impossible unless the first block was something like "xin a hole in the ground there lived a hobbit...".

Actually, if we thought about things in this way, then CN's question above would be relevant, because the codependency of the overlapping chunks would depend heavily on the particular text we were trying to match. Consider the example of coin-flipping again: assume we were flipping the coin until we got the string TH. There are 3 possible ways we could fail on the first pair of flips, all equally likely: TT, HT, and HH. If we got TT or HT, then we could succeed on the third try by flipping an H. If we started with HH, there's no way we could get TH on the third flip. The number of ways of succeeding would be 2, out of a possible 6. So the probability of succeeding in the second block given that we failed in the first would be $\frac{2}{6} = \frac{1}{3}$.

Now, if we were trying to match HH and we knew we failed on the first 2 flips, there would still be 3 equally likely possibilities. Either we flipped TT, TH, or HT. If we started off with TT or HT, we can't possibly win on the third flip. But if we got TH first, we'd have a chance of flipping H on the third flip and matching. Thus, our probability of matching in the second block given that we failed in the first would only be $\frac{1}{6}$. Here's a chart showing all of the possibilities:
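
Both conditional probabilities can be confirmed by listing every 3-flip sequence. A small sketch, assuming a fair coin and using a hypothetical helper named `second_block_given_first_failed`:

```python
from fractions import Fraction
from itertools import product

def second_block_given_first_failed(target):
    """P(flips 2-3 spell target | flips 1-2 do not), for a fair coin."""
    sequences = map("".join, product("HT", repeat=3))
    # Keep only the equally likely sequences whose first block misses the target.
    failed_first = [s for s in sequences if s[:2] != target]
    hits = [s for s in failed_first if s[1:] == target]
    return Fraction(len(hits), len(failed_first))

print(second_block_given_first_failed("TH"))  # 1/3
print(second_block_given_first_failed("HH"))  # 1/6
```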

The two probabilities are different because TH can overlap in more ways with the wrong texts, whereas HH can only overlap with TH.

Therefore, our previous strategy of multiplying probabilities, which rested on the assumption of independence, won't work here. In order to explain how long it would take the monkey to produce The Hobbit with high probability under your scheme, I'd have to go into some fairly heavy-duty math involving Markov chains and their transition probabilities. The relevant probabilities can be found by raising a 360,000 x 360,000 matrix to the nth power--not generally an easy thing to do. But it turns out that the expected (i.e., average) number of characters the monkey would have to type before finishing would still be on the order of $N^{360{,}000}$ for a typewriter with $N$ equally likely keys, similar to the previous setup.
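
For short patterns there's a shortcut that avoids the matrix entirely. The sketch below uses a standard result that isn't derived in this post, the overlap formula for waiting times: for uniformly random, independent symbols, the expected wait is the sum of alphabet_size^k over every length k at which the pattern's first k symbols equal its last k symbols. The function name `expected_wait` is made up for illustration:

```python
def expected_wait(pattern, alphabet_size):
    """Expected number of symbols typed until `pattern` first appears,
    assuming each symbol is uniform and independent (overlap formula)."""
    length = len(pattern)
    return sum(
        alphabet_size ** k
        for k in range(1, length + 1)
        if pattern[:k] == pattern[length - k:]   # prefix of length k equals suffix
    )

print(expected_wait("TH", 2))  # 4
print(expected_wait("HH", 2))  # 6
```

The k = length term is always present, so the answer is never smaller than alphabet_size ** len(pattern), and the extra overlap terms can add at most a modest constant factor on top of that.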

Either way, you and J.R.R. would have probably given up by that point.


Saturday, February 21, 2009

What "mean" means

Dear Dr. Math,
My parents live about 200 miles away from me, so I make the drive back and forth a lot, with no stops. Almost exactly halfway in between the speed limit changes, so instead of driving 55 mph I drive 80 mph. Since my average speed is 67.5 mph, shouldn't it take me 200/67.5 = 2.96 hours to get there? I've noticed it always takes a little longer, but I don't get it. I've even set the cruise control and kept the speeds exactly constant.

Dear Chuck,

I'm going to go ahead and assume that you live in one of those places in Utah or west Texas where the speed limit actually is 80 mph. Otherwise, you've been speeding, and I can't endorse that kind of behavior. OK? OK. Don't make me write a post about the correlation between speeding and traffic fatalities. I swear I will turn this blog around.

Here's why your numbers didn't add up: while it's true that the average, in the sense of arithmetic mean, of 55 and 80 is $(55 + 80)/2 = 67.5$ mph, that's actually the wrong kind of average to be using in this circumstance. "Kinds of averages?" Oh yes. Allow me to explain:

In the course of your trip, you drive half the distance, 100 miles, at 55 mph. So that leg takes you $100/55 \approx 1.82$ hours. On the second half, you're going the legal speed limit of 80 mph, so that half should take you $100/80 = 1.25$ hours. Altogether, then, your driving time is 1.82 + 1.25 = 3.07 hours, a little more than you expected.

Rather than the arithmetic mean here, you should have been calculating your harmonic mean, which for two numbers A and B is defined as $\frac{2}{\frac{1}{A} + \frac{1}{B}}$. To see why that's the right quantity, let's denote by S your real average speed for the trip, that is, the total distance you traveled divided by your total time. If T is the total time you spent driving, then $S = 200/T$; equivalently, $T = 200/S$. If A is the speed you went for the first half and B is the speed for the second half, then another way you could calculate the total time is as $100/A + 100/B$, just like we did previously. As usual, in math when we compute the same thing two different ways we end up with an interesting equation. In this case, since the times are equal, we get:

$$\frac{200}{S} = \frac{100}{A} + \frac{100}{B}$$

Dividing through by 100 on both sides gives us

$$\frac{2}{S} = \frac{1}{A} + \frac{1}{B}$$

which, if you take reciprocals of both sides and multiply by 2, yields the formula for the harmonic mean. In this particular example, $S = \frac{2}{\frac{1}{55} + \frac{1}{80}} \approx 65.2$ mph, so your guess of 67.5 mph was only off by a little bit.
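
Here is a short sketch that computes the trip time leg by leg and then again from the harmonic mean, to show the two routes agree. It assumes equal-length legs, as in Chuck's drive; the function names are illustrative:

```python
def trip_time(distance, speeds):
    """Total hours for a trip split into equal-length legs, one per speed."""
    leg = distance / len(speeds)
    return sum(leg / v for v in speeds)

def harmonic_mean(a, b):
    """Harmonic mean of two positive numbers: 2 / (1/a + 1/b)."""
    return 2 / (1 / a + 1 / b)

total = trip_time(200, [55, 80])
print(round(total, 2))                          # 3.07
print(round(harmonic_mean(55, 80), 1))          # 65.2
print(round(200 / harmonic_mean(55, 80), 2))    # 3.07
```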

So, when is the arithmetic mean the right one? If you had gone on a trip and spent an equal amount of time driving 55 mph and 80 mph, then your average speed would be the arithmetic mean of the two. To see that, let's just assume you drove 1 hour at each speed. Thus, your total distance traveled would be $55 + 80 = 135$ miles, and your total time is 2 hours, so the average speed is $135/2 = 67.5$ mph. Voilà! If you look at that calculation closely, you can pretty clearly see why it should always give you the arithmetic mean--you're just adding the two speeds together and dividing by 2. Similarly, another way to see that the arithmetic mean is inappropriate for the equal distance problem is to notice that by driving the same distance at each speed, you spend more time at the slower speed and less time at the faster one.

There's actually yet another kind of mean, called the geometric mean, which shows up when you're computing ratios, percents, interest rates, and other things that are typically multiplied together. For two numbers A and B, it's defined as $\sqrt{AB}$. For example, let's say you were a rabbit farmer and your population of rabbits grew by 50% one year and only 10% the next. The combined effect at the end of two years would be that the population had increased by a factor of $1.5 \times 1.1 = 1.65$, for an increase of 65%. To achieve that same growth at a constant rate, say a factor of R for each year, you'd need $R^2 = 1.65$, so $R = \sqrt{1.65} \approx 1.28$. So in a sense the "average" growth rate was 28% per year. Many people in this kind of situation would be tempted to guess that the average was 30%, splitting the difference between 50% and 10%. You can see that it's not far off from the truth, but it's not quite right. And why be almost right when you can be exactly right?
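
The rabbit calculation generalizes to any number of years. A minimal sketch, with the growth factors from the example hard-coded and variable names chosen for illustration:

```python
import math

factors = [1.50, 1.10]                  # +50% one year, +10% the next

combined = math.prod(factors)           # net growth factor over both years
constant_rate = combined ** (1 / len(factors))   # geometric mean of the factors

print(round(combined, 2))        # 1.65
print(round(constant_rate, 3))   # 1.285, i.e. about 28% per year
```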

The point of all these means is to replace the net effect of two different values with the effect of just a single value repeated. But you have to be careful to consider exactly how those quantities are interacting to produce that combined effect. When they simply add together, the relevant type of mean is the arithmetic one; when they multiply, the correct mean is geometric; and when they do that weird thing of combining via their reciprocals, you use the harmonic mean. Interestingly enough, for any two positive numbers, if M is their arithmetic mean, G is the geometric mean, and H is the harmonic mean, it's always the case that $H \le G \le M$. In fact, there are other means, too, but these three are the major players.
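
Python's standard `statistics` module happens to include all three means (`geometric_mean` needs Python 3.8 or later), so the ordering is easy to spot-check. This is only a sanity check on a few arbitrarily chosen pairs, not a proof:

```python
from statistics import mean, geometric_mean, harmonic_mean

# A few arbitrary pairs of distinct positive numbers.
pairs = [(55, 80), (1, 100), (2, 3)]

results = []
for a, b in pairs:
    m = mean([a, b])
    g = geometric_mean([a, b])
    h = harmonic_mean([a, b])
    results.append(h <= g <= m)

print(results)  # [True, True, True]
```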

Other situations where the harmonic mean might come up include: calculating average fuel economy of a car given an equal amount of city and highway driving, computing the total length of time it takes two people working together to complete a task, figuring out the net resistance of two electrical resistors in parallel, finding a pleasant harmonic note (hence the name) between two other musical notes, calculating the height of the intersection between two crossed wires, and answering questions about the uses of the harmonic mean!