How can I retrieve a task’s progress as a percentage in Dask?
I’d like to query a task’s status in a Dask cluster as a percentage completed, rather than relying on the visual progress bar or dashboard. For example, I’m submitting this task below:
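One approach (a sketch, not the question’s elided code) is to poll the `status` attribute of the submitted futures and compute the finished fraction yourself; `slow_square` and the task count here are placeholders:

```python
from dask.distributed import Client

def slow_square(x):  # placeholder workload
    return x * x

def percent_done(futures):
    """Share of futures whose status is 'finished', as a percentage."""
    finished = sum(f.status == "finished" for f in futures)
    return 100.0 * finished / len(futures)

if __name__ == "__main__":
    client = Client(processes=False)          # in-process LocalCluster
    futures = client.map(slow_square, range(100))
    client.gather(futures)                    # block until everything is done
    print(percent_done(futures))              # 100.0 once all tasks finish
    client.close()
```

Calling `percent_done(futures)` periodically (e.g. from a monitoring loop) gives the same number the dashboard’s progress bar displays, but as a plain float you can log or expose.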
What is the cleanest way to detect whether I’m running in a Dask Worker?
Given a dask.distributed cluster, for example a LocalCluster, what is the most robust way to detect whether my Python code is running inside a Worker instance?
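For reference, one common pattern is `distributed.get_worker()`, which raises `ValueError` when called outside a worker task; a sketch:

```python
from dask.distributed import Client, get_worker

def in_dask_worker() -> bool:
    """True when executing inside a Dask worker task, False otherwise."""
    try:
        get_worker()          # raises ValueError outside a worker task
        return True
    except ValueError:
        return False

if __name__ == "__main__":
    print(in_dask_worker())                        # False: driver-side code
    client = Client(processes=False)
    print(client.submit(in_dask_worker).result())  # True: runs on a worker
    client.close()
```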
UserWarning while writing a large HDF file in Dask
I’m trying to write a large dataset (a tuple of dicts, where each dict maps a key to a Dask Array) to disk as HDF5 using the function below.
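For context, a dict of Dask arrays can be written to HDF5 with `dask.array.to_hdf5` (requires `h5py`); this is a generic sketch, not the question’s function, and the file name and dataset paths are made up:

```python
import dask.array as da

# Made-up example data: each dict key becomes an HDF5 dataset path.
x = da.random.random((1000, 1000), chunks=(250, 250))
y = da.random.random((1000,), chunks=(250,))

# Writes both arrays into one HDF5 file in a single pass.
da.to_hdf5("example.h5", {"/group/x": x, "/group/y": y})
```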
Is there a way to restrict stack traces from emitting dataframe contents?
By default, our system logs stack traces in its log output. We’re generally careful not to log the contents of the dataframes we work with, since they may contain sensitive user data. However, when Dask crashes, it seems to emit a sample of the dataframe it was working with:
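Not a Dask-specific switch, but one generic mitigation (a sketch; the filter name and length limit are invented) is a logging filter that truncates long messages, so any dataframe sample embedded in an error message is cut off before reaching the logs:

```python
import logging

class TruncatingFilter(logging.Filter):
    """Truncate long log messages, e.g. errors that embed data samples."""

    def __init__(self, max_len: int = 200):
        super().__init__()
        self.max_len = max_len

    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()
        if len(msg) > self.max_len:
            record.msg = msg[: self.max_len] + " ...[truncated]"
            record.args = ()
        return True  # keep the (now shortened) record

logger = logging.getLogger("app")
logger.addFilter(TruncatingFilter(max_len=200))
```

This caps what any single record can leak; a stricter variant could also clear `record.exc_info` to suppress traceback bodies entirely.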
Dask can’t synchronously read data shared over NFS
I’m running the Dask scheduler on system A and workers on both systems A and B. A volume on system A, containing the data files, is shared with system B over NFS. Due to path issues, this folder is reached through a symbolic link in my home directory.
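A usual workaround for symlinked data paths (a sketch; the path is a placeholder) is to resolve the link on the client before handing paths to workers, so every machine receives the same canonical NFS path:

```python
import os

# Placeholder: a data folder reached via a symlink in the home directory.
linked = os.path.expanduser("~/data")

# realpath() follows symlinks to the canonical location, which must be
# mounted at the same absolute path on every worker machine.
real = os.path.realpath(linked)
print(real)
```

If the canonical path differs between machines A and B, resolving the link won’t help; the NFS mount point itself must match on every host that runs a worker.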