cnextend : Functionality for the hierarchical tree
-
class idpflex.cnextend.ClusterNodeX(*args, **kwargs)
Bases: scipy.cluster.hierarchy.ClusterNode
Extension of ClusterNode to accommodate a parent reference and a protected dictionary of properties.
add_property(a_property)
Insert or update a property in the set of properties.
Parameters: a_property (ProfileProperty) – a property instance
-
distance_submatrix(dist_mat)
Extract the matrix of distances between the leaves under the node.
Parameters: dist_mat (numpy.ndarray) – distance matrix (square or in condensed form) among all N leaves of the tree to which the node belongs. The row indexes of dist_mat must correspond to the node IDs of the leaves.
Returns: square MxM distance matrix between the M leaves under the node
Return type: ndarray
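The extraction described above can be illustrated with a minimal stand-alone sketch. It uses plain Python lists instead of ndarray and does not call idpflex; the helper names `condensed_index` and `submatrix` are hypothetical, and the condensed layout is assumed to be the usual upper triangle stored row by row.

```python
import math


def condensed_index(i, j, n):
    """Position of the pair (i, j), with i != j, in a condensed vector for n points."""
    if i > j:
        i, j = j, i
    return n * i - i * (i + 1) // 2 + (j - i - 1)


def submatrix(dist_mat, leaf_ids):
    """Square distance matrix among the given leaf IDs.

    dist_mat is either a square list of rows or a flat condensed list.
    """
    ids = sorted(leaf_ids)
    if isinstance(dist_mat[0], (list, tuple)):  # square form
        return [[dist_mat[i][j] for j in ids] for i in ids]
    # Condensed form holds n * (n - 1) / 2 entries; recover n from the length.
    n = (1 + math.isqrt(1 + 8 * len(dist_mat))) // 2
    return [
        [0.0 if i == j else dist_mat[condensed_index(i, j, n)] for j in ids]
        for i in ids
    ]


square = [
    [0, 1, 2, 3],
    [1, 0, 4, 5],
    [2, 4, 0, 6],
    [3, 5, 6, 0],
]
condensed = [1, 2, 3, 4, 5, 6]  # same distances, upper triangle row by row
sub_from_square = submatrix(square, [1, 3])
sub_from_condensed = submatrix(condensed, [3, 1])
```

Both calls select the distances between leaves 1 and 3, so the two inputs give the same 2x2 result regardless of the order in which the IDs are passed.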
-
leaf_ids
IDs of the leaves under the node, ordered by increasing ID.
Return type: list
-
leafs
Find the leaf nodes under this cluster node.
Returns: leaf nodes ordered by increasing ID
Return type: list
-
representative(dist_mat, similarity=<function mean>)
Find the leaf under the node that is most similar to all other leaves under the node.
The representative is the leaf that minimizes a similarity scalar computed from its distances to all the other leaves under the node. For instance, the average of all distances between one leaf and all the other leaves yields a similarity scalar for that leaf.
Parameters:
- dist_mat (ndarray) – condensed or square distance matrix, either NxN among all N leaves in the tree or MxM among all M leaves under the node. If the matrix covers all leaves in the tree, self.distance_submatrix is applied first.
- similarity (function object) – reduction operation on the list of distances between one leaf and the other (M-1) leaves.
Returns: representative leaf node
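The selection rule can be sketched in a few lines of plain Python. This is an illustration of the idea under stated assumptions, not idpflex's implementation: the function name `representative_index`, the sample leaf IDs, and the sample distances are all hypothetical, and the input is assumed to be the square MxM matrix among the leaves under the node.

```python
from statistics import mean


def representative_index(sub_mat, similarity=mean):
    """Row index of the leaf with the smallest reduced distance to the others."""
    if len(sub_mat) == 1:
        return 0  # a single leaf represents itself
    scores = []
    for i, row in enumerate(sub_mat):
        others = [d for j, d in enumerate(row) if j != i]  # skip the self-distance
        scores.append(similarity(others))
    return min(range(len(scores)), key=scores.__getitem__)


leaf_ids = [2, 5, 7]  # hypothetical IDs of the M = 3 leaves under a node
sub_mat = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 2.0],
    [4.0, 2.0, 0.0],
]
best = leaf_ids[representative_index(sub_mat)]
best_by_max = leaf_ids[representative_index(sub_mat, similarity=max)]
```

Passing a different reduction, such as `max`, changes the scalar assigned to each leaf and therefore possibly the chosen leaf; in this small example both reductions pick the middle leaf.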
-
-
class idpflex.cnextend.Tree(z=None)
Bases: object
Hierarchical binary tree.
Parameters: z (ndarray) – linkage matrix from which to create the tree. See linkage()
from_linkage_matrix(z, node_class=<class idpflex.cnextend.ClusterNodeX>)
Refactored version of to_tree() that converts a hierarchical clustering encoded in matrix z (by linkage) into a convenient tree object.
Each node_class instance has a left, right, dist, id, and count attribute. The left and right attributes point to the node_class instances that were combined to generate the cluster. If both are None, the node_class instance is a leaf node, its count must be 1, and its distance is meaningless but set to 0.
Parameters:
- z (ndarray) – linkage matrix. See linkage()
- node_class (ClusterNodeX) – the type of nodes composing the tree. Now supports ClusterNodeX and parent class ClusterNode
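How a linkage matrix encodes a tree can be shown with a small stand-alone sketch. It assumes the SciPy row layout `[id_a, id_b, dist, count]`, where row i creates the cluster with ID `n_leaves + i`. The `Node` class, its `parent` attribute name, and the helpers `tree_from_linkage` and `leaves_under` are hypothetical stand-ins, not the idpflex classes.

```python
class Node:
    """Hypothetical stand-in for a tree node; not idpflex's ClusterNodeX."""

    def __init__(self, id, left=None, right=None, dist=0.0, count=1):
        self.id, self.left, self.right = id, left, right
        self.dist, self.count = dist, count
        self.parent = None  # parent reference (attribute name is an assumption)
        for child in (left, right):
            if child is not None:
                child.parent = self


def tree_from_linkage(z):
    """Build the tree encoded by linkage rows [id_a, id_b, dist, count]; return the root."""
    n_leaves = len(z) + 1
    nodes = [Node(i) for i in range(n_leaves)]  # leaves: dist 0, count 1
    for i, (a, b, dist, count) in enumerate(z):
        # Row i merges two existing clusters into a new one with ID n_leaves + i.
        nodes.append(Node(n_leaves + i, nodes[int(a)], nodes[int(b)], dist, int(count)))
    return nodes[-1]


def leaves_under(node):
    """Leaf nodes below node, ordered by increasing ID."""
    if node.left is None and node.right is None:
        return [node]
    return sorted(leaves_under(node.left) + leaves_under(node.right), key=lambda x: x.id)


z = [
    [0, 1, 0.5, 2],  # leaves 0 and 1 merge into node 4
    [2, 3, 1.0, 2],  # leaves 2 and 3 merge into node 5
    [4, 5, 2.0, 4],  # nodes 4 and 5 merge into the root, node 6
]
root = tree_from_linkage(z)
```

Because every merge receives an ID larger than those of its children, the root always carries the highest ID in the tree.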
-
leafs
Returns: leaf nodes ordered by increasing ID
Return type: list
-
nodes_above_depth(depth=0)
Nodes at or above depth from the root node.
Parameters: depth (int) – depth level starting from the root level (depth=0)
Returns: list of nodes ordered by increasing ID. The last one is the root node.
Return type: list
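A level-by-level traversal is one way to collect these nodes. The sketch below is an assumption about the behavior described above (all nodes whose depth from the root is at most `depth`), written against a hypothetical minimal `Node` class rather than the idpflex tree.

```python
class Node:
    """Hypothetical minimal node with an ID and two optional children."""

    def __init__(self, id, left=None, right=None):
        self.id, self.left, self.right = id, left, right


def nodes_above_depth(root, depth=0):
    """Nodes whose depth from the root is at most depth, ordered by increasing ID."""
    found, level = [], [root]
    for _ in range(depth + 1):
        found.extend(level)
        # Step one level down; leaves contribute no children.
        level = [c for node in level for c in (node.left, node.right) if c is not None]
    return sorted(found, key=lambda node: node.id)


# Four leaves (IDs 0-3), two intermediate nodes (4, 5) and the root (6).
leaves = [Node(i) for i in range(4)]
root = Node(6, Node(4, leaves[0], leaves[1]), Node(5, leaves[2], leaves[3]))
ids_depth_0 = [node.id for node in nodes_above_depth(root)]
ids_depth_1 = [node.id for node in nodes_above_depth(root, depth=1)]
```

With IDs assigned in merge order, sorting by ID places the root last, which matches the ordering stated in the Returns entry.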
-
-
idpflex.cnextend.load_tree(filename)
Load a previously saved tree.
Parameters: filename (str) – name of the file containing the serialized tree
Returns: Tree instance stored in the file
Return type: Tree
-
idpflex.cnextend.random_distance_tree(*args, **kwargs)
Instantiate a tree where leaves and nodes have random distances to each other.
Distances are drawn at random from a flat distribution of numbers between 0 and 1.
Parameters: n_leafs (int) – number of tree leaves
Returns: elements of the named tuple:
- tree: Tree – Tree instance
- distance_matrix: ndarray – square matrix of distances between pairs of tree leaves
Return type: namedtuple
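The distance-matrix half of this return value can be sketched without any dependency. The code below only generates a symmetric matrix of uniform random distances and wraps it in a named tuple with the documented field names; building the tree from it is left out, so `tree` is set to None. The names `RandomTree` and `random_distance_matrix` and the `seed` argument are hypothetical.

```python
import random
from collections import namedtuple

# Field names follow the documented return value; the tree is not built here.
RandomTree = namedtuple("RandomTree", "tree distance_matrix")


def random_distance_matrix(n_leafs, seed=None):
    """Symmetric n_leafs x n_leafs matrix, zero diagonal, entries uniform in [0, 1)."""
    rng = random.Random(seed)
    mat = [[0.0] * n_leafs for _ in range(n_leafs)]
    for i in range(n_leafs):
        for j in range(i + 1, n_leafs):
            mat[i][j] = mat[j][i] = rng.random()
    return mat


result = RandomTree(tree=None, distance_matrix=random_distance_matrix(5, seed=0))
```

Fixing the seed makes the random distances reproducible, which is convenient when such a tree is used as a test fixture.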