Parallelize Tree Creation With Dask
I need help about a problem that I'm pretty sure dask can solve. But I don't know how to tackle it. I need to construct a tree recursively. For each node if a criterion is met a co
Solution 1:
As I understand, the way you tackle recursion (or any dynamic computation) is to create tasks within a task.
I was experimenting with something similar, so below is my 5 minute illustrative solution. You'd have to optimise it according to characteristics of the algorithm.
Keep in mind that tasks add overhead, so you'd want to chunk the computations for optimal results.
Relevant doc:
Api reference:
- https://distributed.dask.org/en/latest/api.html#distributed.worker_client
- https://distributed.dask.org/en/latest/api.html#distributed.Client.gather
- https://distributed.dask.org/en/latest/api.html#distributed.Client.submit
import numpy as np
import time
from dask.distributed import Client, worker_client
# Create a dask client# For convenience, I'm creating a localcluster.
client = Client(threads_per_worker=1, n_workers=8)
client
classNode:
def__init__(self, level):
self.level = level
self.val = None
self.childs = None# This was missingdefmerge(node, childs):
values = [child.val for child in childs]
ifall(values) andsum(values)<0.1:
node.val = np.mean(values)
else:
node.childs = childs
return node
defcompute_val():
time.sleep(0.1) # Is this required.return np.random.rand(1)
defbuild(node):
print(node.level)
if (np.random.rand(1) < 0.1and node.level>1) or node.level>5:
node.val = compute_val()
else:
with worker_client() as client:
child_futures = [client.submit(build, Node(level=node.level+1)) for _ inrange(2)]
childs = client.gather(child_futures)
node = merge(node, childs)
return node
tree_future = client.submit(build, Node(level=0))
tree = tree_future.result()
Post a Comment for "Parallelize Tree Creation With Dask"