Searching descendants in Hierarchy family trees
Hi, I have a specific problem that I do not know how to go about. I am making a routine for a biological web application, where, to keep it simple, it is alike a dichotomous (always 2 branching) family tree with descendents and many branches. The family tree/branching can be long up to 50 or more generations. Branching Always in twos (= 2 children if you like)
I want a routine that from a particular branch, it gives all the descendents resulting ONLY from that branch.
The data of the branching is in mysql as follows (from csv file)
k/s: k = branch; s=descendent (stop)
par: Parent branch
this: This branch
next: Next branch (if 0 = descendent)
CODE k/s par this next descendant name
G-TST_01a k 0 1 2
G-TST_01b k 0 1 10
G-TST_02a k 1 2 3
G-TST_02b k 1 2 4
G-TST_03a s 2 3 0 A
G-TST_03b s 2 3 0 B
G-TST_04a k 2 4 5
G-TST_04b k 2 4 9
G-TST_05a k 4 5 6
G-TST_05b k 4 5 7
G-TST_06a s 5 6 0 C
G-TST_06b s 5 6 0 D
G-TST_07a k 5 7 8
G-TST_07b s 5 7 0 G
G-TST_08a s 7 8 0 E
G-TST_08b s 7 8 0 F
G-TST_09a s 4 9 0 H
G-TST_09b s 4 9 0 I
G-TST_10a s 1 10 0 J
G-TST_10b k 1 10 11
G-TST_11a k 10 11 12
G-TST_11b k 10 11 15
G-TST_12a k 11 12 13
G-TST_12b k 11 12 14
G-TST_13a s 12 13 0 K
G-TST_13b s 12 13 0 L
G-TST_14a s 12 14 0 M
G-TST_14b s 12 14 0 N
G-TST_15a k 11 15 16
G-TST_15b k 11 15 17
G-TST_16a s 15 16 0 O
G-TST_16b s 15 16 0 P
G-TST_17a s 15 17 0 Q
G-TST_17b s 15 17 0 R
....|.2..| ......____ C
__|........4... |___ D
1-6 branches / A-G Descendents
I want for exaple, all descendets from any branch, si as for example branch 2 give me A-D; Branch 6 = F,G, Branch 1=A-G.
There are 4 sitiations of which I resolves 3 but not the 4th
SS (Branch ending with 2 descendents => stop routine)
KS/SK (Branch with one descendent and one further branch => keep looping)
KK (Two branches => keep looping but it get complex looping to remember both branches)
Really you only have a few options here. Since rdbms' aren't designed to use inheritance directly in the model, you'll need to either create a tree datastructure (memory heavy), create looping or recursion with many selections (resource heavy), or figure out an ideal balance. If you know exactly what the maximum depth is, you could query it directly.
What I'd suggest is coming up with a scheme that lets you put more burden on the insertion of data into the db (given your second post you are using a db). The idea behind this would be to mock a binary search tree out of it (or even an mtree as you don't really want to enforce that at the db level [although you can depending on the structure you choose]), where you effectively rebalance the tree after every insertion. So for example, you insert a value and it has an id of 1 (don't use auto increment though!). Then you insert its first child, and you give it a value of 2. This is the "right" side of 1. Now you insert another child, but you don't want to call it three, as logically (from the numerical side), that would be the right side of 2. So you alter the values of the current keys, set the 2 to 3, set the 1 to 2, and set the new value to 1. It takes a bit of work to set this up properly, but the plus side is you can issue entire updates with a single query since it will always be relative to an exiting number. The trick is coming up with the numbers to start with.
Now when you select, you'll be able to pull based completely on the numerical values as if it were a numerical search tree. So when I go to select the children from say, Zeus, I can simply select from the table where the id's are either > or < Zeus' parents (so that would include zeus), depending on if Zeus is considerd > or < his parent's. The trick is how you set up Cronus and Rhea to represent a single entity since you cannot base it solely on either Cronus or Rhea.
Determining the leaf's are as simple as determining if they have children.
Ultimately, I think looping is your best option here. At 50x+ generations, you are starting to get a staggering amount of data. Using recursion or an entire tree are not wise. I'd personally take a mix of the above using a binary search approach and mix it in with OO tree structure so that I end up with a tree datastructure that I can search, is limited to just the branch requested, and can display as I want (even fetching just the leaf nodes on a breadth traversal). No matter how you slice it, hierarchy display in PHP and rdbms' is not a simple task.
Thank you for your time Fou Lu.
Your explanation makes sense and it is not easy. After 2 days arriving at nothing I started to get desperate and as a reaction I posted here, and even asked the service of professional codes from the website: www.teaminindia.co.uk (the first it came on google) as I thought this would be a 4 hour job for a pro and I could afford it.
However that day I worked hard, and now I managed to do the routine. It is based in looping with memorizing the number of loops where the is a multiple branching. Now I adapted that to a function and I am very happy.
IF it was a real family tree with multiple children, I think it would got above my limits, as you clearly explained, but my model involved of always having 2 branches (Descendants) from a parent, and these in turn can have 2, so on so forth, where the descendent is either another parent (keep branching = keep looping) or without children (=routine stops).
So the situation was one of the following 4
Branch to 2 Parents : K K
Branch to a Parent and a non-Parent: K S
Branch to a non-parent and a parent: S K (same as above)
Branch to 2 non parents: S S
SS, KS, SK were no problem: SS stops the lineage, SK / KS the lineage continues through K. However KK involved memorizing one of the K, which can keep going through a series of generations. I solved this by an array_push of the branch (K) to be followed once an SS is reached, and proper out during the loop when executed for the next in line. I place an 'end' in the stack loop array which will mean there is no more memory, and the looping stops. :thumbsup:
PS: have ever someone tried these services where you hire a coder to make for you a piece of code? What are the recommendations? E.g.
|All times are GMT +1. The time now is 07:03 AM.|
Powered by vBulletin®
Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.