### Project Focus

R • Logic • Communication Design

### Project Overview

As often as I have time, I’ll submit an answer to FiveThirtyEight‘s weekly puzzle series, The Riddler. The puzzles employ a combination of math, logic, and probability. This particular puzzle (& my explanation) was cited on FiveThirtyEight, and consequently is one I’m particularly proud of. Overall, it’s a good representation of both my internal problem solving process and my ability to communicate esoteric information.

### Process

The nitty gritty details of how to solve the puzzle are best explained below. This project entailed lots of scratch paper network diagrams, R to simulate & check my work, and finally creating visualizations to explain the problem solving process. My Github code can be found here.

### Project

The Riddler is a weekly puzzle series from Oliver Roeder at FiveThirtyEight that usually involves some combination of math, logic and probability.

This weeks FiveThirtyEight Riddler, The Perplexing Puzzle of the Proud Partygoers was:

It’s Friday and that means it’s party time! A group of

Npeople are in attendance at your shindig, some of whom are friends with each other. (Let’s assume friendship is symmetric — if person A is friends with person B, then B is friends with A.) Suppose that everyone has at least one friend at the party, and that a person is “proud” if her number of friends is strictly larger than the average number of friends that her own friends have. (A competitive lot, your guests.)Importantly, more than one person can be proud.

How large can the share of proud people at the party be?

The question lends itself to Graph Theory, so I started with a random 20 node (which present partygoers) network.

The above shows a simple graph that could represent a party. The orange nodes represent the partygoers, and the gray lines (edges) represent the other partygoers each person is friends with.

Since the condition of the problem is that every partygoer has at least one friend there, we’re dealing with a connected graph – every node is connected to at least one other. The minimum and maximum number of edges for a connected graph with N edges is:

Min Edges = N-1

Max Edges = (N*(N-1))/2

The first thing I did was generate a random graph for every possible edge condition for a 25 node graph (from 24 edges to 300 edges), and calculated the proportion of proud partygoers at each graph. This generated the following (presented in FiveThirtyEight style!):

I only plotted values in which the proportion of proud partygoers increased, and saw it peaked right before the maximum amount of edges (300). The peak value occurred at 299 edges. At the maximum edge amount, the proportion of proud partygoers fell to 0.

The next step was to see whether this trend held through for graphs with node amounts different than 25. I ran a similar script to the one above for varying amounts of partygoers, and recorded the edges at which the Proportion of Proud Partygoers theme was at a maximum.

What immediately stands out is that the data mirrors a quadratic, and looks very similar to the maximum edges formula earlier. Some selected data from the above chart is below:

Partygoers (N) | Friendships (edgeS) at Max Proud Partygoer Proportion |
---|---|

5 | 9.00 |

10 | 44.00 |

20 | 189.00 |

40 | 779.00 |

50 | 1224.00 |

100 | 4949.00 |

Taking a closer look at that data, and recalling that proportion of proud partygoers fell to 0 at the maximum edge value at N=25, you can conclude the number of edges at which proud partygoer proportion is maximized is:

Edges @ Max Partygoer Proportion = Max Edges -1 = (N*(N*1))/2

One less than the maximum number of edges.

So now that we know the makeup of a graph (in terms of # of edges/nodes) that maximizes proportion of proud partygoers, I graphed what that proportion is for a given number of partygoers (N).

The data looks similar to a f(x) = (x-1)/x function where the function will approach, but never be greater than 1 – which is intuitively in line with the problem; the proportion of proud partygoers won’t even be greater than the amount of partygoers.

For N=5 the maximum proportion of partygoers is 0.6. For which the network graph looks like:

3/5 of the nodes are directly connected to every other node in the graph, while 2/5 are only connected to 3 of the other nodes. Given that our graph has 1 less edge than the maximum number of edges, it makes sense that 2/N nodes would no longer be fully connected. Therefore in a graph of size N with [Max Edges – 1] Edges, the amount of proud partygoers would be N-2.

Partygoers | Max Proportion Proud Partygoers |
---|---|

5.00 | 0.60 |

10.00 | 0.80 |

20.00 | 0.90 |

40.00 | 0.95 |

50.00 | 0.96 |

100.00 | 0.98 |

Testing that theory against some of the data holds true. Therefore we conclude that the proportion of proud neighbors for a given party of N people is:

Prop. of Proud Partygoers = (N-2)/N

I don’t think that makes since. How could 98% of the people at the party have more friends than their own friends have?

The important thing to note here is the definition of proud per the problem. .

So 98% of people don’t have more friends than their own friends have, but rather have more than the average. Say we’re dealing with 10 proud partygoers. In the maximum proud partgoer scenario, 8 of those are connected to all 9 other partygoers. The other 2 are connected to all other party goers, except the other partygoer that isn’t’ connected everyone, putting them at 8 friends. Consequently that puts the average of the 8 partygoers at (7*9 + 2*8)/9 = 8.77. So those 8 are proud (they have 9 friends each). The other two partygoers have an average of (8*9)/8 = 9, so they are not proud (only 8 friends each).

Essentially that missing connection between two partygoers pushes down the average for everyone else at the party, thus making them proud.

Hey,

Nice post. I was wondering if you post your code anywhere? I would like to see the simulation of the graphs with each edge condition.

Thanks

Sure! I’ll clean it up sometime this week and upload to Github. I didn’t expect 538 to mention my blog, so I wasn’t expecting any interest haha.

That sounds great. Thanks.

Whats your github account?

Hey Ryan,

Sorry, it took me a while to get things up on GitHub. Here’s a link to my GitHub account

Hey,

Nice read! Really comprehensive, if say run time was long!

Just one thing – How do you know for sure there is no graphs with N or N-1 proud party goers? I completely agree with it being N-2 (and I’ve seen a proof,conditioned on paths where the number of friends strictly decrease, that’s a bit involved).

Thanks! Run time was certainly long haha, but without an extensive math/CS background I figured this is the best way to do it.

By defintion of the problem proud partygoers are defined as:

Since an average implies that some members are below it, or that everyone is at the average, you can’t have N proud partygoers.

N-1 partygoers isn’t possible since we’re dealing with undirected graphs (if you’re friends with someone, they’re also friends with you). The simulations showed that we approach maximum proud prortion near the maximum amount of edges. If you take one edge away from the maximum amount of edges, you end up with 2 partygoers with less friends since we’re dealing with undirected graphs.

However, if we were dealing with directed graphs, it’d certainly be interesting to see what the maximum amount looks like!

*i’d say runtime was long

Run time was certainly long haha, but without an extensive math/CS background I figured this is the best way to do it.

Not that it affects maximizing the number of proud partygoers, but I think a matching (N/2 edges) would satisfy the “everyone has a friend” requirement.

So a disconneted graph but still have every node connected to at least on other node? Good point! I hadn’t considered that

Here’s a proof that the N-2 solution is optimal.

http://web.pdx.edu/~caughman/Proud%20vertex%20problem.pdf