SETTINGS
Appearance
Language
About

Settings

Select a category to the left.

Appearance

Theme

Light or dark? Choose how the site looks to you by clicking an image below.

Light Dark

Language

Preferred Language

All content on utk.claranguyen.me is originally in UK English. However, if content exists in your preferred language, it will display as that instead. Feel free to choose that below. This will require a page refresh to take effect.

About

"utk.claranguyen.me" details

Domain Name: claranguyen.me
Site Version: 3.0.1
Last Updated: 2019/08/18
Metadata
Talk #1 - Topcoder Analysis - TheTreeAndMan
Description: Various algorithms for solving "TheTreeAndMan". Speeding up O(n7 lg n) to O(n lg n).
Presenters: Clara Nguyễn
Type: Class Presentation
Date Originally Presented: 2018/09/20
Downloads: [ Slides ]
 
Talk Summary
In this talk, I discuss about a Topcoder problem called TheTreeAndMan, as well as go over various algorithms to solve it.

TheTreeAndMan is not a difficult problem, despite its 1000 point status. You are given a directed graph in the form of a tree and you have to find the number of "men" in it.

Because of it being a relatively simple problem, I focus more on an alternate solution to solve the problem in an unusually fast time.

To solve this, I discuss three algorithms (assume lg N = log2 N):
  • Brute Force - O(N7 lg N)
  • A recursive DFS-like method - O(N3)
  • A mathy solution, with a probability "magic number" - O(N lg N)
What are "men"?

In the slides (Slide 3)

Summary

"Men" are defined as having a single head node, an "upper body" node, arms, "lower body", and legs. How they connect is obvious:
  • Arms attach to the "upper body".
  • "Upper body" attaches to the head and the "lower body".
  • "Lower body" attaches to the legs.
Method 1: Brute Force

The Rundown

Brute Force can be applied to this problem to find all possible solutions. But there's a twist to it, because we will run into duplicate men in the graph, which is not allowed. So we can implement this as follows:
  • Have a set of existing "men" that have been found
  • Nest 7 for loops to check for the following:
    • Head
    • Upper Body
    • Left Arm
    • Right Arm
    • Lower Body
    • Left Leg
    • Right Leg
  • In each iteration, check if combination is valid and see if it isn't in the set already.
Checking like this is obviously inefficient, since we would have to nest loops. And each time, we have to check to see if it's valid and if it already exists in the set or not.

Time complexity

The 7 nested loops makes for an impressive O(N7). And, in each call, we have to check the set, which is an O(lg N) operation.

Therefore, the final complexity is O(N7 lg N)... I think we can do better than that.
Method 2: Dynamic Programming

The Rundown

We can implement this in O(N3) time with Dynamic Programming and Memoisation. I will not go over the method here, as it's an instructor-given method. The slides briefly discuss the method for the truly curious.

Comparing with/without Memosation

It makes a huge difference, as usual...

Method 3: The Magic Number
"But I have one more thing to talk about...".

The Rundown

This problem can be solved in O(N lg N) time, but requires us to think about the graph completely differently. Google could not help me here. This method took several rewrites, along with drawings, and spreadsheets to figure out a mathematical approach to skip all steps and generate the correct answer... instantly.

The result? There is a way to find out the number of combinations for everything in constant time with the use of a magic number that we compute before traversing the tree. In fact, computing this number takes up the most time in this method.

I make some observations in my slides:
  • "There are N children in each node, but only 1 parent."

    Whenever considering how to traverse the graph, the most common way would be using a top-down approach. However, in this case, that'd mean we would have to go through N children for each node. Why do that when we could just go up? Since there's only one parent, it reduces N to 1.


  • "Don't traverse nodes. Multiply combinations instead."

    When checking for arms, legs, and lower body nodes, we don't have to resort to traversal. With a magic number, we can simply compute the number of combinations instantly.


  • "Number of head combinations is equal to depth."

    Store the depth of each node in the graph/tree. If we do a bottom-up approach pivoting on upper body nodes, we can compute the number of possible head configurations by simply looking at its depth (since all nodes above a man can be considered an extra head configuration).


  • "Store a magic number (Np) that we can subtract probabilities from."

    I'm getting there... If we know a child's number of combinations (excluding the parent), it is possible to figure out the number of combinations for the current node. It is also possible to compute the number of combinations of a node subtracted by another node's magic number. Since we are going with a bottom-up method, every node is guaranteed to have a valid magic number.


Generating Np, the Magic Number

In the slides, I define Np as follows:


Of course, no one understood what this actually meant. So I'll go over it. Let's use slide 30 as a reference point:



Each node will have its very own Np value, as well as its very own C value. In this context:
  • Np - Magic Number
  • C - Number of possible nodes it can go to (including itself, but not parents)
Let's look at Node 1. That Np is 23! Where did that come from? Let's look at what Node 1 can go to...
  • Node 2 - with a C of 1 (as it can go to only itself)
  • Node 3 - with a C of 3 (as it can go to 3, 4, 5)
  • Node 6 - with a C of 5 (as it can go to 6, 7, 8, 9, 10)
We have all we need to compute Np now. Rather than give a mathy definition, I'll give pseudocode.
Pseudocode
Set Np to 0.
Set INC to the C of the first child.

For every child (starting from the second one) {
	Np += (INC * current child's C)
	INC += current child's C
}
Working it out by hand,
  • Set INC to the first child's C. This is 1.
  • Loop from the second child on...
    1. Second Child (Node 3)...
      • Np is incremented by second child's C multiplied by INC. Np += 1 × 3. This gives us an Np of 3.
      • INC is incremented by the child's C. INC += 3. This gives us an INC of 4.
    2. Third Child (Node 6)...
      • Np is incremented by third child's C multiplied by INC. Np += 5 × 4. This gives us an Np of 23.
      • INC is incremented by the child's C. INC += 5. This gives us an INC of 9.
  • Return Np, which is 23.
Do this for every node in the graph. We can avoid making it O(N2) by being clever. Have each node build this value up incrementally as others are added into the graph (loop upwards to the root). This gives it an O(N lg N) time.

"Okay Clara that's nice and all, but what can I do with Np?"

Np allows us to compute the number of combinations for all arms instantly if we have an upper and lower body node defined. This is defined by the following equation:



From slide 32...



If we assume that the upper body node is Node 1 and assume that the lower body node is Node 6, we can compute all possible combinations of arms with the given equation.

We know that Node 1's Np is 23. We also know that Node 1's C is 10, and Node 6's C is 5. Therefore:



We can even check our answer. Possible arm combinations include 2-3, 2-4, 2-5. We didn't have to traverse at all. That entire step was skipped.

So what's next? We can multiply out the combinations of legs, arms, and the depth of the head node to generate the number of possible men from an upper/lower body pair in constant time.

Comparing to Dynamic Programming w/ Memoisation

Dynamic Programming never had a chance...



The problem limit was 2,000 nodes. This implementation solves a graph of 50,000 nodes in under 10 milliseconds. I think that's overkill enough for this talk.
How did the Topcoders do?
Not bad. Like I said, this isn't as hard as the 1000-pointer score might imply.
  • 444 Topcoders opened the problem.
  • 114 (25.68%) have submitted a solution.
  • 90 (78.95%) of the submissions were correct.
  • Overall Percentage was 20.27%
  • Best time was 12:00
  • Average Correct Time was 31:28