Skip to content Skip to sidebar Skip to footer

Parallelize Tree Creation With Dask

I need help about a problem that I'm pretty sure dask can solve. But I don't know how to tackle it. I need to construct a tree recursively. For each node if a criterion is met a co

Solution 1:

As I understand, the way you tackle recursion (or any dynamic computation) is to create tasks within a task.

I was experimenting with something similar, so below is my 5 minute illustrative solution. You'd have to optimise it according to characteristics of the algorithm.

Keep in mind that tasks add overhead, so you'd want to chunk the computations for optimal results.

Relevant doc:

Api reference:

import numpy as np
import time
from dask.distributed import Client, worker_client

# Create a dask client# For convenience, I'm creating a localcluster.
client = Client(threads_per_worker=1, n_workers=8)
client

classNode:
    def__init__(self, level):
        self.level = level
        self.val = None
        self.childs = None# This was missingdefmerge(node, childs):
    values = [child.val for child in childs]
    ifall(values) andsum(values)<0.1:
        node.val = np.mean(values)
    else:
        node.childs = childs
    return node        

defcompute_val():
    time.sleep(0.1)            # Is this required.return np.random.rand(1)

defbuild(node):
    print(node.level)
    if (np.random.rand(1) < 0.1and node.level>1) or node.level>5:
        node.val = compute_val()
    else:
        with worker_client() as client:
            child_futures = [client.submit(build, Node(level=node.level+1)) for _ inrange(2)]
            childs = client.gather(child_futures)
        node = merge(node, childs)
    return node

tree_future = client.submit(build, Node(level=0))
tree = tree_future.result()

Post a Comment for "Parallelize Tree Creation With Dask"