Overriding * imports globally for jupyter

I'm running JupyterLab on Windows, and fastai.vision.utils.verify_images(fns) fails because it calls fastcore.parallel.parallel with the default n_workers=8. There are several ways around this, but I want a code block that I can drop into any notebook so that every underlying call to parallel runs with n_workers=1.
I tried the following cell:
import fastcore
import sys
_fastcore = fastcore
_parallel = lambda *args, **kwargs: fastcore.parallel.parallel(*args, **kwargs, n_workers=1)
_fastcore.parallel.parallel = _parallel
sys.modules['fastcore'] = _fastcore
fastcore.parallel.parallel
which prints
<function __main__.<lambda>(*args, **kwargs)>
but when I run verify_images it still fails, as if the patch never happened:
---------------------------------------------------------------------------
BrokenProcessPool Traceback (most recent call last)
<ipython-input-37-f1773f2c9e62> in <module>
3 # from mock import patch
4 # with patch('fastcore.parallel.parallel') as _parallel:
----> 5 failed = verify_images(fns)
6 # failed = L(fns[i] for i,o in enumerate(_parallel(verify_image, fns)) if not o)
7 failed
~\anaconda3\lib\site-packages\fastai\vision\utils.py in verify_images(fns)
59 def verify_images(fns):
60 "Find images in `fns` that can't be opened"
---> 61 return L(fns[i] for i,o in enumerate(parallel(verify_image, fns)) if not o)
62
63 # Cell
~\anaconda3\lib\site-packages\fastcore\parallel.py in parallel(f, items, n_workers, total, progress, pause, threadpool, timeout, chunksize, *args, **kwargs)
121 if total is None: total = len(items)
122 r = progress_bar(r, total=total, leave=False)
--> 123 return L(r)
124
125 # Cell
~\anaconda3\lib\site-packages\fastcore\foundation.py in __call__(cls, x, *args, **kwargs)
95 def __call__(cls, x=None, *args, **kwargs):
96 if not args and not kwargs and x is not None and isinstance(x,cls): return x
---> 97 return super().__call__(x, *args, **kwargs)
98
99 # Cell
~\anaconda3\lib\site-packages\fastcore\foundation.py in __init__(self, items, use_list, match, *rest)
103 def __init__(self, items=None, *rest, use_list=False, match=None):
104 if (use_list is not None) or not is_array(items):
--> 105 items = listify(items, *rest, use_list=use_list, match=match)
106 super().__init__(items)
107
~\anaconda3\lib\site-packages\fastcore\basics.py in listify(o, use_list, match, *rest)
54 elif isinstance(o, list): res = o
55 elif isinstance(o, str) or is_array(o): res = [o]
---> 56 elif is_iter(o): res = list(o)
57 else: res = [o]
58 if match is not None:
~\anaconda3\lib\concurrent\futures\process.py in _chain_from_iterable_of_lists(iterable)
482 careful not to keep references to yielded objects.
483 """
--> 484 for element in iterable:
485 element.reverse()
486 while element:
~\anaconda3\lib\concurrent\futures\_base.py in result_iterator()
609 # Careful not to keep a reference to the popped future
610 if timeout is None:
--> 611 yield fs.pop().result()
612 else:
613 yield fs.pop().result(end_time - time.monotonic())
~\anaconda3\lib\concurrent\futures\_base.py in result(self, timeout)
437 raise CancelledError()
438 elif self._state == FINISHED:
--> 439 return self.__get_result()
440 else:
441 raise TimeoutError()
~\anaconda3\lib\concurrent\futures\_base.py in __get_result(self)
386 def __get_result(self):
387 if self._exception:
--> 388 raise self._exception
389 else:
390 return self._result
BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
I suspect it has to do with fastai.vision.utils using * imports for fastcore. Is there a way to achieve what I want?

Because fastai.vision.utils has already imported parallel into its own namespace (via the * import), verify_images looks the name up there and never sees a change made to fastcore.parallel. Monkeypatch fastai.vision.utils instead:
... # your code for custom `parallel` function goes here
import fastai.vision.utils
fastai.vision.utils.parallel = _parallel # assign your custom function here
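The effect can be reproduced without fastai. The sketch below is only an illustration: `lib` and `consumer` are hypothetical stand-ins for fastcore.parallel and fastai.vision.utils, and the toy `parallel` simply reports the worker count it received. It also keeps a reference to the original function, so the wrapper cannot end up calling itself.

```python
import functools
import types

# Hypothetical stand-in for fastcore.parallel: the module that defines `parallel`.
lib = types.ModuleType("lib")

def parallel(f, items, n_workers=8):
    # Toy version: returns the results plus the worker count it was given.
    return [f(x) for x in items], n_workers

lib.parallel = parallel

# Hypothetical stand-in for fastai.vision.utils after `from lib import *`:
# the star import copies the name into this module's own namespace.
consumer = types.ModuleType("consumer")
consumer.parallel = lib.parallel
exec("def verify(items):\n    return parallel(abs, items)", consumer.__dict__)

# Keep a reference to the original so the wrapper never calls itself.
original = lib.parallel
single_worker = functools.partial(original, n_workers=1)

lib.parallel = single_worker                  # patching the defining module...
after_lib_patch = consumer.verify([-1, 2])    # ...does not reach `verify`

consumer.parallel = single_worker             # patching the importing module does
after_consumer_patch = consumer.verify([-1, 2])
```

With the real libraries the equivalent assignment would presumably be fastai.vision.utils.parallel = functools.partial(fastcore.parallel.parallel, n_workers=1), evaluated before fastcore.parallel.parallel itself has been replaced.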

Related

FSSpec Error Handling in Python - Timeout Error

I am trying to get TerraClimate data from the Microsoft Planetary Computer and am hitting a timeout error. Is it possible to increase the timeout? The code and the error are below. I am using fsspec and xarray to download the spatial data.
import fsspec
import xarray as xr
store = fsspec.get_mapper(asset.href)
data = xr.open_zarr(store, **asset.extra_fields["xarray:open_kwargs"])
clipped_data = data.sel(time=slice('2015-01-01','2019-12-31'),lon=slice(min_lon,max_lon),lat=slice(max_lat,min_lat))
parsed_data = clipped_data[['tmax', 'tmin', 'ppt', 'soil']]
lat_list = parsed_data['lat'].values.tolist()
lon_list = parsed_data['lon'].values.tolist()
filename = "Soil_Moisture_sample.csv"
for (i, j) in zip(lat_list, lon_list):
    parsed_data[["soil","tmax","tmin","ppt"]].sel(lon=i, lat=j, method="nearest").to_dataframe().to_csv(filename,mode='a',index=False, header=False)
I am getting the following error
TimeoutError Traceback (most recent call last)
File ~\Anaconda3\envs\satellite\lib\site-packages\fsspec\asyn.py:53, in _runner(event, coro, result, timeout)
52 try:
---> 53 result[0] = await coro
54 except Exception as ex:
File ~\Anaconda3\envs\satellite\lib\site-packages\fsspec\asyn.py:423, in AsyncFileSystem._cat(self, path, recursive, on_error, batch_size, **kwargs)
422 if ex:
--> 423 raise ex
424 if (
425 len(paths) > 1
426 or isinstance(path, list)
427 or paths[0] != self._strip_protocol(path)
428 ):
File ~\Anaconda3\envs\satellite\lib\asyncio\tasks.py:455, in wait_for(fut, timeout, loop)
454 if timeout is None:
--> 455 return await fut
457 if timeout <= 0:
File ~\Anaconda3\envs\satellite\lib\site-packages\fsspec\implementations\http.py:221, in HTTPFileSystem._cat_file(self, url, start, end, **kwargs)
220 async with session.get(url, **kw) as r:
--> 221 out = await r.read()
222 self._raise_not_found_for_status(r, url)
File ~\Anaconda3\envs\satellite\lib\site-packages\aiohttp\client_reqrep.py:1036, in ClientResponse.read(self)
1035 try:
-> 1036 self._body = await self.content.read()
1037 for trace in self._traces:
File ~\Anaconda3\envs\satellite\lib\site-packages\aiohttp\streams.py:375, in StreamReader.read(self, n)
374 while True:
--> 375 block = await self.readany()
376 if not block:
File ~\Anaconda3\envs\satellite\lib\site-packages\aiohttp\streams.py:397, in StreamReader.readany(self)
396 while not self._buffer and not self._eof:
--> 397 await self._wait("readany")
399 return self._read_nowait(-1)
File ~\Anaconda3\envs\satellite\lib\site-packages\aiohttp\streams.py:304, in StreamReader._wait(self, func_name)
303 with self._timer:
--> 304 await waiter
305 else:
File ~\Anaconda3\envs\satellite\lib\site-packages\aiohttp\helpers.py:721, in TimerContext.__exit__(self, exc_type, exc_val, exc_tb)
720 if exc_type is asyncio.CancelledError and self._cancelled:
--> 721 raise asyncio.TimeoutError from None
722 return None
TimeoutError:
The above exception was the direct cause of the following exception:
FSTimeoutError Traceback (most recent call last)
Input In [62], in <cell line: 3>()
1 # Flood Region Point - Thiruvanthpuram
2 filename = "Soil_Moisture_sample.csv"
----> 3 parsed_data[["soil","tmax","tmin","ppt"]].sel(lon=8.520833, lat=76.4375, method="nearest").to_dataframe().to_csv(filename,mode='a',index=False, header=False)
File ~\Anaconda3\envs\satellite\lib\site-packages\xarray\core\dataset.py:5898, in Dataset.to_dataframe(self, dim_order)
5870 """Convert this dataset into a pandas.DataFrame.
5871
5872 Non-index variables in this dataset form the columns of the
(...)
5893
5894 """
5896 ordered_dims = self._normalize_dim_order(dim_order=dim_order)
-> 5898 return self._to_dataframe(ordered_dims=ordered_dims)
File ~\Anaconda3\envs\satellite\lib\site-packages\xarray\core\dataset.py:5862, in Dataset._to_dataframe(self, ordered_dims)
5860 def _to_dataframe(self, ordered_dims: Mapping[Any, int]):
5861 columns = [k for k in self.variables if k not in self.dims]
-> 5862 data = [
5863 self._variables[k].set_dims(ordered_dims).values.reshape(-1)
5864 for k in columns
5865 ]
5866 index = self.coords.to_index([*ordered_dims])
5867 return pd.DataFrame(dict(zip(columns, data)), index=index)
File ~\Anaconda3\envs\satellite\lib\site-packages\xarray\core\dataset.py:5863, in <listcomp>(.0)
5860 def _to_dataframe(self, ordered_dims: Mapping[Any, int]):
5861 columns = [k for k in self.variables if k not in self.dims]
5862 data = [
-> 5863 self._variables[k].set_dims(ordered_dims).values.reshape(-1)
5864 for k in columns
5865 ]
5866 index = self.coords.to_index([*ordered_dims])
5867 return pd.DataFrame(dict(zip(columns, data)), index=index)
File ~\Anaconda3\envs\satellite\lib\site-packages\xarray\core\variable.py:527, in Variable.values(self)
524 @property
525 def values(self):
526 """The variable's data as a numpy.ndarray"""
--> 527 return _as_array_or_item(self._data)
File ~\Anaconda3\envs\satellite\lib\site-packages\xarray\core\variable.py:267, in _as_array_or_item(data)
253 def _as_array_or_item(data):
254 """Return the given values as a numpy array, or as an individual item if
255 it's a 0d datetime64 or timedelta64 array.
256
(...)
265 TODO: remove this (replace with np.asarray) once these issues are fixed
266 """
--> 267 data = np.asarray(data)
268 if data.ndim == 0:
269 if data.dtype.kind == "M":
File ~\AppData\Roaming\Python\Python38\site-packages\dask\array\core.py:1696, in Array.__array__(self, dtype, **kwargs)
1695 def __array__(self, dtype=None, **kwargs):
-> 1696 x = self.compute()
1697 if dtype and x.dtype != dtype:
1698 x = x.astype(dtype)
File ~\AppData\Roaming\Python\Python38\site-packages\dask\base.py:315, in DaskMethodsMixin.compute(self, **kwargs)
291 def compute(self, **kwargs):
292 """Compute this dask collection
293
294 This turns a lazy Dask collection into its in-memory equivalent.
(...)
313 dask.base.compute
314 """
--> 315 (result,) = compute(self, traverse=False, **kwargs)
316 return result
File ~\AppData\Roaming\Python\Python38\site-packages\dask\base.py:600, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
597 keys.append(x.__dask_keys__())
598 postcomputes.append(x.__dask_postcompute__())
--> 600 results = schedule(dsk, keys, **kwargs)
601 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
File ~\AppData\Roaming\Python\Python38\site-packages\dask\threaded.py:89, in get(dsk, keys, cache, num_workers, pool, **kwargs)
86 elif isinstance(pool, multiprocessing.pool.Pool):
87 pool = MultiprocessingPoolExecutor(pool)
---> 89 results = get_async(
90 pool.submit,
91 pool._max_workers,
92 dsk,
93 keys,
94 cache=cache,
95 get_id=_thread_get_id,
96 pack_exception=pack_exception,
97 **kwargs,
98 )
100 # Cleanup pools associated to dead threads
101 with pools_lock:
File ~\AppData\Roaming\Python\Python38\site-packages\dask\local.py:511, in get_async(submit, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, chunksize, **kwargs)
509 _execute_task(task, data) # Re-execute locally
510 else:
--> 511 raise_exception(exc, tb)
512 res, worker_id = loads(res_info)
513 state["cache"][key] = res
File ~\AppData\Roaming\Python\Python38\site-packages\dask\local.py:319, in reraise(exc, tb)
317 if exc.__traceback__ is not tb:
318 raise exc.with_traceback(tb)
--> 319 raise exc
File ~\AppData\Roaming\Python\Python38\site-packages\dask\local.py:224, in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
222 try:
223 task, data = loads(task_info)
--> 224 result = _execute_task(task, data)
225 id = get_id()
226 result = dumps((result, id))
File ~\AppData\Roaming\Python\Python38\site-packages\dask\core.py:119, in _execute_task(arg, cache, dsk)
115 func, args = arg[0], arg[1:]
116 # Note: Don't assign the subtask results to a variable. numpy detects
117 # temporaries by their reference count and can execute certain
118 # operations in-place.
--> 119 return func(*(_execute_task(a, cache) for a in args))
120 elif not ishashable(arg):
121 return arg
File ~\AppData\Roaming\Python\Python38\site-packages\dask\array\core.py:128, in getter(a, b, asarray, lock)
123 # Below we special-case `np.matrix` to force a conversion to
124 # `np.ndarray` and preserve original Dask behavior for `getter`,
125 # as for all purposes `np.matrix` is array-like and thus
126 # `is_arraylike` evaluates to `True` in that case.
127 if asarray and (not is_arraylike(c) or isinstance(c, np.matrix)):
--> 128 c = np.asarray(c)
129 finally:
130 if lock:
File ~\Anaconda3\envs\satellite\lib\site-packages\xarray\core\indexing.py:459, in ImplicitToExplicitIndexingAdapter.__array__(self, dtype)
458 def __array__(self, dtype=None):
--> 459 return np.asarray(self.array, dtype=dtype)
File ~\Anaconda3\envs\satellite\lib\site-packages\xarray\core\indexing.py:623, in CopyOnWriteArray.__array__(self, dtype)
622 def __array__(self, dtype=None):
--> 623 return np.asarray(self.array, dtype=dtype)
File ~\Anaconda3\envs\satellite\lib\site-packages\xarray\core\indexing.py:524, in LazilyIndexedArray.__array__(self, dtype)
522 def __array__(self, dtype=None):
523 array = as_indexable(self.array)
--> 524 return np.asarray(array[self.key], dtype=None)
File ~\Anaconda3\envs\satellite\lib\site-packages\xarray\backends\zarr.py:76, in ZarrArrayWrapper.__getitem__(self, key)
74 array = self.get_array()
75 if isinstance(key, indexing.BasicIndexer):
---> 76 return array[key.tuple]
77 elif isinstance(key, indexing.VectorizedIndexer):
78 return array.vindex[
79 indexing._arrayize_vectorized_indexer(key, self.shape).tuple
80 ]
File ~\Anaconda3\envs\satellite\lib\site-packages\zarr\core.py:788, in Array.__getitem__(self, selection)
786 result = self.vindex[selection]
787 else:
--> 788 result = self.get_basic_selection(pure_selection, fields=fields)
789 return result
File ~\Anaconda3\envs\satellite\lib\site-packages\zarr\core.py:914, in Array.get_basic_selection(self, selection, out, fields)
911 return self._get_basic_selection_zd(selection=selection, out=out,
912 fields=fields)
913 else:
--> 914 return self._get_basic_selection_nd(selection=selection, out=out,
915 fields=fields)
File ~\Anaconda3\envs\satellite\lib\site-packages\zarr\core.py:957, in Array._get_basic_selection_nd(self, selection, out, fields)
951 def _get_basic_selection_nd(self, selection, out=None, fields=None):
952 # implementation of basic selection for array with at least one dimension
953
954 # setup indexer
955 indexer = BasicIndexer(selection, self)
--> 957 return self._get_selection(indexer=indexer, out=out, fields=fields)
File ~\Anaconda3\envs\satellite\lib\site-packages\zarr\core.py:1247, in Array._get_selection(self, indexer, out, fields)
1241 if not hasattr(self.chunk_store, "getitems") or \
1242 any(map(lambda x: x == 0, self.shape)):
1243 # sequentially get one key at a time from storage
1244 for chunk_coords, chunk_selection, out_selection in indexer:
1245
1246 # load chunk selection into output array
-> 1247 self._chunk_getitem(chunk_coords, chunk_selection, out, out_selection,
1248 drop_axes=indexer.drop_axes, fields=fields)
1249 else:
1250 # allow storage to get multiple items at once
1251 lchunk_coords, lchunk_selection, lout_selection = zip(*indexer)
File ~\Anaconda3\envs\satellite\lib\site-packages\zarr\core.py:1939, in Array._chunk_getitem(self, chunk_coords, chunk_selection, out, out_selection, drop_axes, fields)
1935 ckey = self._chunk_key(chunk_coords)
1937 try:
1938 # obtain compressed data for chunk
-> 1939 cdata = self.chunk_store[ckey]
1941 except KeyError:
1942 # chunk not initialized
1943 if self._fill_value is not None:
File ~\Anaconda3\envs\satellite\lib\site-packages\zarr\storage.py:717, in KVStore.__getitem__(self, key)
716 def __getitem__(self, key):
--> 717 return self._mutable_mapping[key]
File ~\Anaconda3\envs\satellite\lib\site-packages\fsspec\mapping.py:137, in FSMap.__getitem__(self, key, default)
135 k = self._key_to_str(key)
136 try:
--> 137 result = self.fs.cat(k)
138 except self.missing_exceptions:
139 if default is not None:
File ~\Anaconda3\envs\satellite\lib\site-packages\fsspec\asyn.py:111, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
108 @functools.wraps(func)
109 def wrapper(*args, **kwargs):
110 self = obj or args[0]
--> 111 return sync(self.loop, func, *args, **kwargs)
File ~\Anaconda3\envs\satellite\lib\site-packages\fsspec\asyn.py:94, in sync(loop, func, timeout, *args, **kwargs)
91 return_result = result[0]
92 if isinstance(return_result, asyncio.TimeoutError):
93 # suppress asyncio.TimeoutError, raise FSTimeoutError
---> 94 raise FSTimeoutError from return_result
95 elif isinstance(return_result, BaseException):
96 raise return_result
FSTimeoutError:

In the line:
store = fsspec.get_mapper(asset.href)
you can pass extra arguments to the fsspec backend, which here is HTTP (see fsspec.implementations.http.HTTPFileSystem). For that backend, client_kwargs is passed on to aiohttp.ClientSession, which accepts an optional timeout argument (values are in seconds). Your call may look something like:
import fsspec
from aiohttp import ClientTimeout
store = fsspec.get_mapper(asset.href, client_kwargs={"timeout": ClientTimeout(total=5000, connect=1000)})
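If timeouts still occur occasionally with a larger limit, another option is to retry the failing read. The helper below is a minimal sketch, not part of fsspec or xarray: `retry_on_timeout` and its parameters are hypothetical names, and the exception types to catch are passed in because the right class depends on the library raising it.

```python
import time

def retry_on_timeout(fn, exceptions=(TimeoutError,), attempts=3,
                     base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a listed exception, wait and try again, up to `attempts` calls."""
    for attempt in range(attempts):
        try:
            return fn()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

For the code in the question, the call might be wrapped as retry_on_timeout(lambda: point.to_dataframe(), exceptions=(FSTimeoutError,)), where point is the selected dataset and FSTimeoutError is imported from fsspec.exceptions.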

How to parallelize a function that takes two arguments and returns a dictionary in Dask

I have a function batch_opt that takes two arguments (an integer i and a pandas DataFrame train) and returns a Python dictionary. When I tried to parallelize the computation with Dask in Python, I got the error "TypeError: Delayed objects are immutable". I am new to Dask. Can anyone help me out here? Thanks.
import time
from dask import compute, delayed
results = []
for i in range(0, 2):
    validation_res = delayed(batch_opt)(i, train)
    results.append(validation_res)
start = time.time()
res = compute(*results)
print(time.time() - start)
Trace:
TypeError Traceback (most recent call last)
<ipython-input-19-8463f64dec56> in <module>
5
6 start = time.time()
----> 7 res = compute(*results)
8 print(time.time() - start)
~/.conda/envs/odop/lib/python3.8/site-packages/dask/base.py in compute(*args, **kwargs)
568 postcomputes.append(x.__dask_postcompute__())
569
--> 570 results = schedule(dsk, keys, **kwargs)
571 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
572
~/.conda/envs/odop/lib/python3.8/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, pool, **kwargs)
77 pool = MultiprocessingPoolExecutor(pool)
78
---> 79 results = get_async(
80 pool.submit,
81 pool._max_workers,
~/.conda/envs/odop/lib/python3.8/site-packages/dask/local.py in get_async(submit, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, chunksize, **kwargs)
505 _execute_task(task, data) # Re-execute locally
506 else:
--> 507 raise_exception(exc, tb)
508 res, worker_id = loads(res_info)
509 state["cache"][key] = res
~/.conda/envs/odop/lib/python3.8/site-packages/dask/local.py in reraise(exc, tb)
313 if exc.__traceback__ is not tb:
314 raise exc.with_traceback(tb)
--> 315 raise exc
316
317
~/.conda/envs/odop/lib/python3.8/site-packages/dask/local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
218 try:
219 task, data = loads(task_info)
--> 220 result = _execute_task(task, data)
221 id = get_id()
222 result = dumps((result, id))
~/.conda/envs/odop/lib/python3.8/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
117 # temporaries by their reference count and can execute certain
118 # operations in-place.
--> 119 return func(*(_execute_task(a, cache) for a in args))
120 elif not ishashable(arg):
121 return arg
<ipython-input-7-e3af5748e1cf> in batch_opt(i, train)
22 test.loc[:, 'seg'] = test.apply(lambda x: proc.assign_trxn(x), axis = 1)
23 test_policy_res, test_metrics_res = opt.analyze_result(fa_m, x, test, cum_to_day, cur_policy, policy)
---> 24 validation_res[(train_mon_yr_batch, test_mon_yr)] = {'train_policy': train_policy_res, 'train_result': train_metrics_res, 'test_policy': test_policy_res, 'test_result': test_metrics_res}
25 return validation_res
~/.conda/envs/odop/lib/python3.8/site-packages/dask/delayed.py in __setitem__(self, index, val)
564
565 def __setitem__(self, index, val):
--> 566 raise TypeError("Delayed objects are immutable")
567
568 def __iter__(self):
TypeError: Delayed objects are immutable
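The traceback ends at an item assignment on validation_res inside batch_opt, and the driver loop also binds the name validation_res to a Delayed object. One possible explanation, assuming batch_opt writes to a notebook-level dictionary of that name (the post does not show where it is defined), is that the loop has rebound the global to a Delayed. The sketch below reproduces that name collision with a stand-in class rather than Dask itself; `FakeDelayed`, `batch_opt_global`, and `batch_opt_local` are hypothetical names used only for illustration.

```python
class FakeDelayed:
    """Stand-in for a Dask Delayed object: item assignment is refused."""
    def __setitem__(self, index, val):
        raise TypeError("Delayed objects are immutable")

validation_res = {}  # notebook-level dict (assumed, not shown in the post)

def batch_opt_global(i):
    # Looks up the global name at call time, whatever it is bound to by then.
    validation_res[i] = {"result": i * 2}
    return validation_res

def batch_opt_local(i):
    # Builds and returns its own dict, so the caller's names cannot interfere.
    res = {}
    res[i] = {"result": i * 2}
    return res

# The driver loop reuses the name, replacing the dict with a delayed object.
validation_res = FakeDelayed()
try:
    batch_opt_global(0)
    error = None
except TypeError as exc:
    error = str(exc)

# Local-dict version: each call returns a dict and the caller merges them.
merged = {}
for i in range(2):
    merged.update(batch_opt_local(i))
```

If this is what is happening, creating the dictionary inside batch_opt, or giving the loop variable a different name, would avoid the collision.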

Terminating a Celery chain conditionally does not return the data

I have a chain of tasks and want to terminate it conditionally. I am following the steps in https://stackoverflow.com/a/21106596/243031, but after that I don't get the output.
My task is defined as:
from __future__ import absolute_import, unicode_literals
from .celery import app
@app.task(bind=True)
def add(self, x, y):
    if (x + y) % 2 == 0:
        self.request.callbacks[:] = []
    return x + y
This means that when the sum is even and the task is part of a chain, the chain should stop.
But it gives an error:
In [13]: ~(add.s(2, 2) | add.s(3))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-13-0fe85c5d0e22> in <module>
----> 1 ~(add.s(2, 2) | add.s(3))
~/virtualenv/lib/python3.7/site-packages/celery/canvas.py in __invert__(self)
479
480 def __invert__(self):
--> 481 return self.apply_async().get()
482
483 def __reduce__(self):
~/virtualenv/lib/python3.7/site-packages/celery/result.py in get(self, timeout, propagate, interval, no_ack, follow_parents, callback, on_message, on_interval, disable_sync_subtasks, EXCEPTION_STATES, PROPAGATE_STATES)
228 propagate=propagate,
229 callback=callback,
--> 230 on_message=on_message,
231 )
232 wait = get # deprecated alias to :meth:`get`.
~/virtualenv/lib/python3.7/site-packages/celery/backends/asynchronous.py in wait_for_pending(self, result, callback, propagate, **kwargs)
197 callback=None, propagate=True, **kwargs):
198 self._ensure_not_eager()
--> 199 for _ in self._wait_for_pending(result, **kwargs):
200 pass
201 return result.maybe_throw(callback=callback, propagate=propagate)
~/virtualenv/lib/python3.7/site-packages/celery/backends/asynchronous.py in _wait_for_pending(self, result, timeout, on_interval, on_message, **kwargs)
265 for _ in self.drain_events_until(
266 result.on_ready, timeout=timeout,
--> 267 on_interval=on_interval):
268 yield
269 sleep(0)
~/virtualenv/lib/python3.7/site-packages/celery/backends/asynchronous.py in drain_events_until(self, p, timeout, interval, on_interval, wait)
56 pass
57 if on_interval:
---> 58 on_interval()
59 if p.ready: # got event on the wanted channel.
60 break
~/virtualenv/lib/python3.7/site-packages/vine/promises.py in __call__(self, *args, **kwargs)
158 self.value = (ca, ck) = (retval,), {}
159 except Exception:
--> 160 return self.throw()
161 else:
162 self.value = (ca, ck) = final_args, final_kwargs
~/virtualenv/lib/python3.7/site-packages/vine/promises.py in __call__(self, *args, **kwargs)
155 ck = {}
156 else:
--> 157 retval = fun(*final_args, **final_kwargs)
158 self.value = (ca, ck) = (retval,), {}
159 except Exception:
~/virtualenv/lib/python3.7/site-packages/celery/result.py in _maybe_reraise_parent_error(self)
234 def _maybe_reraise_parent_error(self):
235 for node in reversed(list(self._parents())):
--> 236 node.maybe_throw()
237
238 def _parents(self):
~/virtualenv/lib/python3.7/site-packages/celery/result.py in maybe_throw(self, propagate, callback)
333 cache['status'], cache['result'], cache.get('traceback'))
334 if state in states.PROPAGATE_STATES and propagate:
--> 335 self.throw(value, self._to_remote_traceback(tb))
336 if callback is not None:
337 callback(self.id, value)
~/virtualenv/lib/python3.7/site-packages/celery/result.py in throw(self, *args, **kwargs)
326
327 def throw(self, *args, **kwargs):
--> 328 self.on_ready.throw(*args, **kwargs)
329
330 def maybe_throw(self, propagate=True, callback=None):
~/virtualenv/lib/python3.7/site-packages/vine/promises.py in throw(self, exc, tb, propagate)
232 if tb is None and (exc is None or exc is current_exc):
233 raise
--> 234 reraise(type(exc), exc, tb)
235
236 @property
~/virtualenv/lib/python3.7/site-packages/vine/utils.py in reraise(tp, value, tb)
28 if value.__traceback__ is not tb:
29 raise value.with_traceback(tb)
---> 30 raise value
TypeError: 'NoneType' object does not support item assignment
I also tried self.request.chain = None and self.request.chain[:] = []:
@app.task(bind=True)
def add(self, x, y):
    if self.request.chain and (x + y) % 2 == 0:
        self.request.chain = None
    return x + y
The logs show that the task returns the data:
[2021-03-24 22:11:48,795: WARNING/MainProcess] Substantial drift from scrpc_worker#458cd596aed3 may mean clocks are out of sync. Current drift is
14400 seconds. [orig: 2021-03-24 22:11:48.795402 recv: 2021-03-25 02:11:48.796579]
[2021-03-24 22:11:52,227: INFO/MainProcess] Events of group {task} enabled by remote.
[2021-03-24 22:11:57,853: INFO/MainProcess] Received task: myprj.tasks.add[8aaae68f-d5ca-4c0a-8f2e-f1c7b5916e29]
[2021-03-24 22:11:57,867: INFO/ForkPoolWorker-8] Task myprj.tasks.add[8aaae68f-d5ca-4c0a-8f2e-f1c7b5916e29] succeeded in 0.01066690299999884s: 4
but the call then waits on a socket, and when I press Ctrl+C it gives the traceback below.
In [1]: from myprj.tasks import add
In [2]: ~(add.s(2, 2) | add.s(3))
^C---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
<ipython-input-2-0fe85c5d0e22> in <module>
----> 1 ~(add.s(2, 2) | add.s(3))
~/virtualenv/lib/python3.7/site-packages/celery/canvas.py in __invert__(self)
479
480 def __invert__(self):
--> 481 return self.apply_async().get()
482
483 def __reduce__(self):
~/virtualenv/lib/python3.7/site-packages/celery/result.py in get(self, timeout, propagate, interval, no_ack, follow_parents, callback, on_message, on_interval, disable_sync_subtasks, EXCEPTION_STATES, PROPAGATE_STATES)
228 propagate=propagate,
229 callback=callback,
--> 230 on_message=on_message,
231 )
232 wait = get # deprecated alias to :meth:`get`.
~/virtualenv/lib/python3.7/site-packages/celery/backends/asynchronous.py in wait_for_pending(self, result, callback, propagate, **kwargs)
197 callback=None, propagate=True, **kwargs):
198 self._ensure_not_eager()
--> 199 for _ in self._wait_for_pending(result, **kwargs):
200 pass
201 return result.maybe_throw(callback=callback, propagate=propagate)
~/virtualenv/lib/python3.7/site-packages/celery/backends/asynchronous.py in _wait_for_pending(self, result, timeout, on_interval, on_message, **kwargs)
265 for _ in self.drain_events_until(
266 result.on_ready, timeout=timeout,
--> 267 on_interval=on_interval):
268 yield
269 sleep(0)
~/virtualenv/lib/python3.7/site-packages/celery/backends/asynchronous.py in drain_events_until(self, p, timeout, interval, on_interval, wait)
52 raise socket.timeout()
53 try:
---> 54 yield self.wait_for(p, wait, timeout=interval)
55 except socket.timeout:
56 pass
~/virtualenv/lib/python3.7/site-packages/celery/backends/asynchronous.py in wait_for(self, p, wait, timeout)
61
62 def wait_for(self, p, wait, timeout=None):
---> 63 wait(timeout=timeout)
64
65
~/virtualenv/lib/python3.7/site-packages/celery/backends/redis.py in drain_events(self, timeout)
149 if self._pubsub:
150 with self.reconnect_on_error():
--> 151 message = self._pubsub.get_message(timeout=timeout)
152 if message and message['type'] == 'message':
153 self.on_state_change(self._decode_result(message['data']), message)
~/virtualenv/lib/python3.7/site-packages/redis/client.py in get_message(self, ignore_subscribe_messages, timeout)
3615 number.
3616 """
-> 3617 response = self.parse_response(block=False, timeout=timeout)
3618 if response:
3619 return self.handle_message(response, ignore_subscribe_messages)
~/virtualenv/lib/python3.7/site-packages/redis/client.py in parse_response(self, block, timeout)
3501 self.check_health()
3502
-> 3503 if not block and not conn.can_read(timeout=timeout):
3504 return None
3505 response = self._execute(conn, conn.read_response)
~/virtualenv/lib/python3.7/site-packages/redis/connection.py in can_read(self, timeout)
732 self.connect()
733 sock = self._sock
--> 734 return self._parser.can_read(timeout)
735
736 def read_response(self):
~/virtualenv/lib/python3.7/site-packages/redis/connection.py in can_read(self, timeout)
319
320 def can_read(self, timeout):
--> 321 return self._buffer and self._buffer.can_read(timeout)
322
323 def read_response(self):
~/virtualenv/lib/python3.7/site-packages/redis/connection.py in can_read(self, timeout)
229 return bool(self.length) or \
230 self._read_from_socket(timeout=timeout,
--> 231 raise_on_timeout=False)
232
233 def read(self, length):
~/virtualenv/lib/python3.7/site-packages/redis/connection.py in _read_from_socket(self, length, timeout, raise_on_timeout)
196 sock.settimeout(timeout)
197 while True:
--> 198 data = recv(self._sock, socket_read_size)
199 # an empty string indicates the server shutdown the socket
200 if isinstance(data, bytes) and len(data) == 0:
~/virtualenv/lib/python3.7/site-packages/redis/_compat.py in recv(sock, *args, **kwargs)
70 else: # Python 3.5 and above automatically retry EINTR
71 def recv(sock, *args, **kwargs):
---> 72 return sock.recv(*args, **kwargs)
73
74 def recv_into(sock, *args, **kwargs):
KeyboardInterrupt:
First of all, is terminating the chain a good option, or should we add a condition to the next task in the chain so that, when the condition holds, it skips processing and just returns what it received?
If we do have to terminate the chain, what is the best way to do it, and how do we get the output of the partial chain?
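The second option, where each later task passes the earlier result through untouched, can be sketched in plain Python. This is not Celery code: `STOP_KEY` and `run_chain` are hypothetical names, functools.reduce stands in for the chain, and a dict is used as the marker on the assumption that a real chain would need something the configured serializer can handle.

```python
from functools import reduce

STOP_KEY = "chain_stopped"  # hypothetical marker key

def add(previous, y):
    # A step that receives the marker forwards it without doing any work.
    if isinstance(previous, dict) and previous.get(STOP_KEY):
        return previous
    total = previous + y
    # Even sum: wrap the value so the remaining steps skip themselves.
    if total % 2 == 0:
        return {STOP_KEY: True, "value": total}
    return total

def run_chain(first, *rest):
    # Feed each step's output into the next, as a chain would.
    result = reduce(add, rest, first)
    return result["value"] if isinstance(result, dict) else result
```

Here run_chain(2, 2, 3) mirrors add.s(2, 2) | add.s(3): the first step produces 4, which is even, so the second step forwards it and the caller still receives a value.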

H2O python rbind error

I have a 2000-row data frame, and I'm trying to slice it into two pieces and combine them back together.
t1 = test[:10, :]
t2 = test[20:, :]
temp = t1.rbind(t2)
temp.show()
Then I got this error:
---------------------------------------------------------------------------
EnvironmentError Traceback (most recent call last)
<ipython-input-37-8daeb3375743> in <module>()
2 t2 = test[20:, :]
3 temp = t1.rbind(t2)
----> 4 temp.show()
5 print len(temp)
6 print len(test)
/usr/local/lib/python2.7/dist-packages/h2o/frame.pyc in show(self, use_pandas)
383 print("This H2OFrame has been removed.")
384 return
--> 385 if not self._ex._cache.is_valid(): self._frame()._ex._cache.fill()
386 if H2ODisplay._in_ipy():
387 import IPython.display
/usr/local/lib/python2.7/dist-packages/h2o/frame.pyc in _frame(self, fill_cache)
423
424 def _frame(self, fill_cache=False):
--> 425 self._ex._eager_frame()
426 if fill_cache:
427 self._ex._cache.fill()
/usr/local/lib/python2.7/dist-packages/h2o/expr.pyc in _eager_frame(self)
67 if not self._cache.is_empty(): return self
68 if self._cache._id is not None: return self # Data already computed under ID, but not cached locally
---> 69 return self._eval_driver(True)
70
71 def _eager_scalar(self): # returns a scalar (or a list of scalars)
/usr/local/lib/python2.7/dist-packages/h2o/expr.pyc in _eval_driver(self, top)
81 def _eval_driver(self, top):
82 exec_str = self._do_it(top)
---> 83 res = ExprNode.rapids(exec_str)
84 if 'scalar' in res:
85 if isinstance(res['scalar'], list): self._cache._data = [float(x) for x in res['scalar']]
/usr/local/lib/python2.7/dist-packages/h2o/expr.pyc in rapids(expr)
163 The JSON response (as a python dictionary) of the Rapids execution
164 """
--> 165 return H2OConnection.post_json("Rapids", ast=expr,session_id=H2OConnection.session_id(), _rest_version=99)
166
167 class ASTId:
/usr/local/lib/python2.7/dist-packages/h2o/connection.pyc in post_json(url_suffix, file_upload_info, **kwargs)
515 if __H2OCONN__ is None:
516 raise ValueError("No h2o connection. Did you run `h2o.init()` ?")
--> 517 return __H2OCONN__._rest_json(url_suffix, "POST", file_upload_info, **kwargs)
518
519 def _rest_json(self, url_suffix, method, file_upload_info, **kwargs):
/usr/local/lib/python2.7/dist-packages/h2o/connection.pyc in _rest_json(self, url_suffix, method, file_upload_info, **kwargs)
518
519 def _rest_json(self, url_suffix, method, file_upload_info, **kwargs):
--> 520 raw_txt = self._do_raw_rest(url_suffix, method, file_upload_info, **kwargs)
521 return self._process_tables(raw_txt.json())
522
/usr/local/lib/python2.7/dist-packages/h2o/connection.pyc in _do_raw_rest(self, url_suffix, method, file_upload_info, **kwargs)
592 raise EnvironmentError(("h2o-py got an unexpected HTTP status code:\n {} {} (method = {}; url = {}). \n"+ \
593 "detailed error messages: {}")
--> 594 .format(http_result.status_code,http_result.reason,method,url,detailed_error_msgs))
595
596
EnvironmentError: h2o-py got an unexpected HTTP status code:
500 Server Error (method = POST; url = http://localhost:54321/99/Rapids).
detailed error messages: []
If I count rows (len(temp)), it works fine. If I change the slicing index a little, it also works fine. For example, with the following change, the data frame is displayed.
t1 = test[:10, :]
t2 = test[:5, :]
Am I missing something here? Thanks.

It's unclear what happened without more information (the H2O logs would probably say why the rbind did not take).
What version are you using? I tried your code with iris on the bleeding edge and it all worked as expected.
By the way, rbind is typically going to be expensive, especially since what you're semantically after is a subset:
test[range(10) + range(20,test.nrow),:]
should also give you the desired subset (with the caveat that you build the full list of row indices in Python and pass it over REST to H2O).
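The expression above relies on Python 2, where range returns a list; on Python 3 the two ranges cannot be added directly. A minimal sketch of building the same index list in a way that runs on both versions follows. The helper name rows_except is hypothetical and only for illustration, and the H2O indexing line is left as a comment because it assumes the frame named test from the question.

```python
def rows_except(skip_start, skip_stop, nrow):
    """Return row indices 0..nrow-1 with [skip_start, skip_stop) left out."""
    # list(...) makes the concatenation work on Python 3 as well as Python 2.
    return list(range(skip_start)) + list(range(skip_stop, nrow))

# Small stand-in example: 25 rows, dropping rows 10 through 19.
row_idx = rows_except(10, 20, 25)
print(row_idx)

# With the H2OFrame `test` from the question (assumed), the subset would be:
# subset = test[rows_except(10, 20, test.nrow), :]
```

The full index list is still built on the Python side, so the caveat about sending it over REST applies unchanged.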

Error invalid value when using CUDA [duplicate]

I get this error when I run the code below in Python using CUDA. I'm following this tutorial, but on a Windows 7 x64 machine.
https://www.youtube.com/watch?v=jKV1m8APttU
I ran check_cuda() and all tests passed. Can anyone help me work out what the exact issue is?
My Code:
import numpy as np
from timeit import default_timer as timer
from numbapro import vectorize, cuda
# Compile VectorAdd as a GPU ufunc for float64 inputs
@vectorize(['float64(float64, float64)'], target='gpu')
def VectorAdd(a, b):
    return a + b
def main():
    N = 32000000
    A = np.ones(N, dtype=np.float64)
    B = np.ones(N, dtype=np.float64)
    C = np.zeros(N, dtype=np.float64)
    start = timer()
    C = VectorAdd(A, B)
    vectoradd_time = timer() - start
    print("C[:5] = " + str(C[:5]))
    print("C[-5:] = " + str(C[-5:]))
    print("VectorAdd took %f seconds" % vectoradd_time)
if __name__ == '__main__':
    main()
Error Message:
---------------------------------------------------------------------------
CudaAPIError Traceback (most recent call last)
<ipython-input-18-2436fc2ab63a> in <module>()
1 if __name__ == '__main__':
----> 2 main()
<ipython-input-17-64de53fdbe77> in main()
7
8 start = timer()
----> 9 C = VectorAdd(A, B)
10 vectoradd_time = timer() - start
11
C:\Anaconda2\lib\site-packages\numba\cuda\dispatcher.pyc in __call__(self, *args, **kws)
93 the input arguments.
94 """
---> 95 return CUDAUFuncMechanism.call(self.functions, args, kws)
96
97 def reduce(self, arg, stream=0):
C:\Anaconda2\lib\site-packages\numba\npyufunc\deviceufunc.pyc in call(cls, typemap, args, kws)
297
298 devarys.extend([devout])
--> 299 cr.launch(func, shape[0], stream, devarys)
300
301 if any_device:
C:\Anaconda2\lib\site-packages\numba\cuda\dispatcher.pyc in launch(self, func, count, stream, args)
202
203 def launch(self, func, count, stream, args):
--> 204 func.forall(count, stream=stream)(*args)
205
206 def is_device_array(self, obj):
C:\Anaconda2\lib\site-packages\numba\cuda\compiler.pyc in __call__(self, *args)
193
194 return kernel.configure(blkct, tpb, stream=self.stream,
--> 195 sharedmem=self.sharedmem)(*args)
196
197 class CUDAKernelBase(object):
C:\Anaconda2\lib\site-packages\numba\cuda\compiler.pyc in __call__(self, *args, **kwargs)
357 blockdim=self.blockdim,
358 stream=self.stream,
--> 359 sharedmem=self.sharedmem)
360
361 def bind(self):
C:\Anaconda2\lib\site-packages\numba\cuda\compiler.pyc in _kernel_call(self, args, griddim, blockdim, stream, sharedmem)
431 sharedmem=sharedmem)
432 # Invoke kernel
--> 433 cu_func(*kernelargs)
434
435 if self.debug:
C:\Anaconda2\lib\site-packages\numba\cuda\cudadrv\driver.pyc in __call__(self, *args)
1114
1115 launch_kernel(self.handle, self.griddim, self.blockdim,
-> 1116 self.sharedmem, streamhandle, args)
1117
1118 @property
C:\Anaconda2\lib\site-packages\numba\cuda\cudadrv\driver.pyc in launch_kernel(cufunc_handle, griddim, blockdim, sharedmem, hstream, args)
1158 hstream,
1159 params,
-> 1160 None)
1161
1162
C:\Anaconda2\lib\site-packages\numba\cuda\cudadrv\driver.pyc in safe_cuda_api_call(*args)
220 def safe_cuda_api_call(*args):
221 retcode = libfn(*args)
--> 222 self._check_error(fname, retcode)
223
224 setattr(self, fname, safe_cuda_api_call)
C:\Anaconda2\lib\site-packages\numba\cuda\cudadrv\driver.pyc in _check_error(self, fname, retcode)
250 errname = ERROR_MAP.get(retcode, "UNKNOWN_CUDA_ERROR")
251 msg = "Call to %s results in %s" % (fname, errname)
--> 252 raise CudaAPIError(retcode, msg)
253
254 def get_device(self, devnum=0):
CudaAPIError: [1] Call to cuLaunchKernel results in CUDA_ERROR_INVALID_VALUE

I found a solution to my problem through the NVIDIA Developer Forum. For more detail on the solution, see this link:
https://devtalk.nvidia.com/default/topic/962843/cuda-programming-and-performance/cudaapierror-1-call-to-culaunchkernel-results-in-cuda_error_invalid_value-in-python/?offset=3#4968130
In short:
When I changed N to 32000, or any other smaller value, it worked fine.
This means I was not compiling for the correct GPU type (check_cuda is the function call to verify it).
I hope my answer helps someone.
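Since a smaller N works, one possible workaround is to feed the ufunc smaller slices and stitch the results together. This is only a sketch under that assumption, not something taken from the linked thread. The names chunk_bounds, add_in_chunks and cpu_vector_add are hypothetical, and cpu_vector_add is a plain-Python stand-in for the GPU VectorAdd so the example runs without a GPU.

```python
def chunk_bounds(n, chunk_size):
    """Yield (start, stop) index pairs that cover range(n) in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n, chunk_size):
        yield start, min(start + chunk_size, n)

def add_in_chunks(vector_add, a, b, chunk_size):
    """Call vector_add on matching slices of a and b and join the results."""
    out = []
    for start, stop in chunk_bounds(len(a), chunk_size):
        out.extend(vector_add(a[start:stop], b[start:stop]))
    return out

# Stand-in for the GPU ufunc (assumption: same call shape as VectorAdd).
def cpu_vector_add(a, b):
    return [x + y for x, y in zip(a, b)]

result = add_in_chunks(cpu_vector_add, [1.0] * 10, [1.0] * 10, chunk_size=4)
print(result[:5])
```

With NumPy arrays the same slicing works, though you would likely preallocate the output array and assign each slice instead of extending a list.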

This may mean that you are trying to run more threads in one block than is actually allowed. That was the case for me. So try splitting your execution into blocks.
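Splitting the work into blocks comes down to choosing a threads-per-block value within the device limit and launching enough blocks to cover all N elements. A rough sketch of that arithmetic is below. The function name launch_config is hypothetical, and the default limit of 1024 threads per block is an assumption that holds for many recent NVIDIA GPUs; check the actual limit for your device.

```python
def launch_config(n, threads_per_block=256, max_threads_per_block=1024):
    """Return (blocks, threads_per_block) so that blocks * threads covers n items."""
    # max_threads_per_block is an assumed device limit, not a queried value.
    if not 0 < threads_per_block <= max_threads_per_block:
        raise ValueError("threads_per_block must be between 1 and %d"
                         % max_threads_per_block)
    # Ceiling division: the last block may be only partly used.
    blocks = (n + threads_per_block - 1) // threads_per_block
    return blocks, threads_per_block

blocks, threads = launch_config(32000000)
print(blocks, threads)  # prints: 125000 256
```

Inside the kernel, each thread would then compute its global index and skip any index at or beyond N, because the final block can overshoot.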
