How to do inference with fine-tuned Huggingface models? - python

I have fine-tuned a Huggingface model using the IMDB dataset, and I was able to use the trainer to make predictions on the test set by calling trainer.predict(test_ds_encoded). However, when doing the same thing with the inference set, which has a dummy label feature (all -1s instead of 0s and 1s), the trainer threw an error:
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.
(the same assertion is repeated for threads [1,0,0] through [31,0,0])
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
/tmp/ipykernel_23/4156768683.py in <module>
----> 1 trainer.predict(inference_ds_encoded)
/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in predict(self, test_dataset, ignore_keys, metric_key_prefix)
2694 eval_loop = self.prediction_loop if self.args.use_legacy_prediction_loop else self.evaluation_loop
2695 output = eval_loop(
-> 2696 test_dataloader, description="Prediction", ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix
2697 )
2698 total_batch_size = self.args.eval_batch_size * self.args.world_size
/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in evaluation_loop(self, dataloader, description, prediction_loss_only, ignore_keys, metric_key_prefix)
2819 )
2820 if logits is not None:
-> 2821 logits = self._pad_across_processes(logits)
2822 logits = self._nested_gather(logits)
2823 if self.preprocess_logits_for_metrics is not None:
/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in _pad_across_processes(self, tensor, pad_index)
2953 return tensor
2954 # Gather all sizes
-> 2955 size = torch.tensor(tensor.shape, device=tensor.device)[None]
2956 sizes = self._nested_gather(size).cpu()
2957
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
I then removed the label feature: trainer.predict(inference_ds_encoded.remove_columns('label')), but still got an error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
/tmp/ipykernel_23/899960315.py in <module>
----> 1 trainer.predict(inference_ds_encoded.remove_columns('label'))
/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in predict(self, test_dataset, ignore_keys, metric_key_prefix)
2694 eval_loop = self.prediction_loop if self.args.use_legacy_prediction_loop else self.evaluation_loop
2695 output = eval_loop(
-> 2696 test_dataloader, description="Prediction", ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix
2697 )
2698 total_batch_size = self.args.eval_batch_size * self.args.world_size
/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in evaluation_loop(self, dataloader, description, prediction_loss_only, ignore_keys, metric_key_prefix)
2796
2797 # Prediction step
-> 2798 loss, logits, labels = self.prediction_step(model, inputs, prediction_loss_only, ignore_keys=ignore_keys)
2799 inputs_decode = inputs["input_ids"] if args.include_inputs_for_metrics else None
2800
/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in prediction_step(self, model, inputs, prediction_loss_only, ignore_keys)
2999 """
3000 has_labels = all(inputs.get(k) is not None for k in self.label_names)
-> 3001 inputs = self._prepare_inputs(inputs)
3002 if ignore_keys is None:
3003 if hasattr(self.model, "config"):
/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in _prepare_inputs(self, inputs)
2261 handling potential state.
2262 """
-> 2263 inputs = self._prepare_input(inputs)
2264 if len(inputs) == 0:
2265 raise ValueError(
/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in _prepare_input(self, data)
2243 """
2244 if isinstance(data, Mapping):
-> 2245 return type(data)({k: self._prepare_input(v) for k, v in data.items()})
2246 elif isinstance(data, (tuple, list)):
2247 return type(data)(self._prepare_input(v) for v in data)
/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in <dictcomp>(.0)
2243 """
2244 if isinstance(data, Mapping):
-> 2245 return type(data)({k: self._prepare_input(v) for k, v in data.items()})
2246 elif isinstance(data, (tuple, list)):
2247 return type(data)(self._prepare_input(v) for v in data)
/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in _prepare_input(self, data)
2253 # may need special handling to match the dtypes of the model
2254 kwargs.update(dict(dtype=self.args.hf_deepspeed_config.dtype()))
-> 2255 return data.to(**kwargs)
2256 return data
2257
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
I also tried using the trained model object to make predictions, and I got a different error:
text = ["I like the film it's really exciting!", "I hate the movie, it's so boring!!"]
encoding = tokenizer(text)
outputs = model(**encoding)
predictions = outputs.logits.argmax(-1)
Error:
AttributeError Traceback (most recent call last)
/tmp/ipykernel_23/94414684.py in <module>
1 text = ["I like the film it's really exciting!", "I hate the movie, it's so boring!!"]
2 encoding = tokenizer(text)
----> 3 outputs = model(**encoding)
4 predictions = outputs.logits.argmax(-1)
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1108 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1109 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110 return forward_call(*input, **kwargs)
1111 # Do not call functions when jit is used
1112 full_backward_hooks, non_full_backward_hooks = [], []
/opt/conda/lib/python3.7/site-packages/transformers/models/distilbert/modeling_distilbert.py in forward(self, input_ids, attention_mask, head_mask, inputs_embeds, labels, output_attentions, output_hidden_states, return_dict)
752 output_attentions=output_attentions,
753 output_hidden_states=output_hidden_states,
--> 754 return_dict=return_dict,
755 )
756 hidden_state = distilbert_output[0] # (bs, seq_len, dim)
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1108 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1109 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110 return forward_call(*input, **kwargs)
1111 # Do not call functions when jit is used
1112 full_backward_hooks, non_full_backward_hooks = [], []
/opt/conda/lib/python3.7/site-packages/transformers/models/distilbert/modeling_distilbert.py in forward(self, input_ids, attention_mask, head_mask, inputs_embeds, output_attentions, output_hidden_states, return_dict)
549 raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time")
550 elif input_ids is not None:
--> 551 input_shape = input_ids.size()
552 elif inputs_embeds is not None:
553 input_shape = inputs_embeds.size()[:-1]
AttributeError: 'list' object has no attribute 'size'
My code can be found on Kaggle here: https://www.kaggle.com/code/georgeliu/imdb-text-classification-with-transformers.
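The first two failures are consistent with the dummy -1 labels reaching the cross-entropy loss (the assertion requires 0 <= t < n_classes), and once a device-side assert fires the CUDA context is left unusable, which would explain why the retry without the label column also failed until the kernel is restarted. The last snippet fails for a different reason: by default the tokenizer returns plain Python lists, while the model expects tensors. A minimal sketch of the direct-model route, assuming model and tokenizer are the fine-tuned classifier and its tokenizer from the notebook:

import torch

text = ["I like the film it's really exciting!", "I hate the movie, it's so boring!!"]

# return_tensors="pt" yields PyTorch tensors instead of Python lists;
# padding is needed because the two sentences have different token lengths.
encoding = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
encoding = {k: v.to(model.device) for k, v in encoding.items()}

with torch.no_grad():
    outputs = model(**encoding)

predictions = outputs.logits.argmax(-1)  # tensor of predicted class ids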

Related

Floating point exception (core dumped) for UNet implementation

I am trying to implement KiuNet (https://github.com/jeya-maria-jose/KiU-Net-pytorch). But when I execute the train command like so:
python train.py --train_dataset "KiuNet/Train Folder/" --val_dataset "KiuNet/Validation Folder/" --direc 'KiuNet/Results/' --batch_size 1 --epoch 200 --save_freq 10 --modelname "kiunet" --learning_rate 0.0001
I am getting the following error:
Traceback (most recent call last):
File "KiuNet/KiU-Net-pytorch/train.py", line 235, in <module>
loss.backward()
File "/miniconda3/lib/python3.9/site-packages/torch/_tensor.py", line 487, in backward
torch.autograd.backward(
File "/miniconda3/lib/python3.9/site-packages/torch/autograd/__init__.py", line 197, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
../aten/src/ATen/native/cuda/NLLLoss2d.cu:104: nll_loss2d_forward_kernel: block: [0,0,0], thread: [847,0,0] Assertion `t >= 0 && t < n_classes` failed.
(the same assertion is repeated for many other threads)
When I run the train command with CUDA_LAUNCH_BLOCKING=1, I get the following error:
../aten/src/ATen/native/cuda/NLLLoss2d.cu:104: nll_loss2d_forward_kernel: block: [0,0,0], thread: [840,0,0] Assertion `t >= 0 && t < n_classes` failed.
(the same assertion is repeated for many other threads)
Floating point exception (core dumped)
My torch and CUDA versions are: '1.13.0+cu117'
My Python version: Python 3.9.12
Any help is much appreciated!
The repository author mentions the following.
"This bug occurs when the ground truth masks have more classes than the number of classes in prediction. Please make sure you ground truth images have only 0 or 1 labels of pixels if you are training for binary segmentation. The datasets usually have the ground truth as 0 or 255 labels of pixels. So, please convert them to 0's and 1's."

Random Walks in Python

There are classes for Location, Drunk, and Field. I was trying to write a subclass, dirtyField, that generates dirty tiles in the field. If a random walker moves onto a dirty tile, they keep moving.
I am getting the following error, and I am hoping that y'all see something that I don't see.
Edit: I have fixed the grammar errors, and now when running I am getting a randint() error. When looking into randint, should I change it to uniform?
KeyboardInterrupt Traceback (most recent call last)
Input In [20], in <cell line: 27>()
22 self.party[motive] =\
23 self.party[motive].move(x, y)
26 start = Location(0, 0)
---> 27 f = dirtyField()
29 homer = SouthDrunk('Homer')
30 f.addDrunk(homer, start)
Input In [20], in dirtyField.__init__(self, dirtyTiles, xRange, yRange)
6 w = 0
7 while (w < dirtyTiles):
----> 8 x = random.randint(-xRange, xRange)
9 y = random.randint(-yRange, yRange)
10 aDirtyTile = Location(x, y)
File ~\anaconda3\lib\random.py:338, in Random.randint(self, a, b)
334 def randint(self, a, b):
335 """Return random integer in range [a, b], including both end points.
336 """
--> 338 return self.randrange(a, b+1)
File ~\anaconda3\lib\random.py:314, in Random.randrange(self, start, stop, step)
312 width = istop - istart
313 if step == 1 and width > 0:
--> 314 return istart + self._randbelow(width)
315 if step == 1:
316 raise ValueError("empty range for randrange() (%d, %d, %d)" % (istart, istop, width))
File ~\anaconda3\lib\random.py:243, in Random._randbelow_with_getrandbits(self, n)
241 return 0
242 getrandbits = self.getrandbits
--> 243 k = n.bit_length() # don't use (n-1) here because n can be 1
244 r = getrandbits(k) # 0 <= r < 2**k
245 while r >= n:
KeyboardInterrupt:
class dirtyField(Field):
    def __init__(self, dirtyTiles = 1000,
                 xRange = 100, yRange = 100):
        Field.__init__(self)
        self.dirtTile = []
        w = 0
        while (w < dirtyTiles):  # w is never incremented, so this loop never finishes (hence the KeyboardInterrupt)
            x = random.randint(-xRange, xRange)
            y = random.randint(-yRange, yRange)
            aDirtyTile = Location(x, y)
            self.dirtTile.append(aDirtyTile)

    def moveDrunk(self, motive):
        # per instructions if the axis is a dirty tile then the drunk moves until a clean tile.
        # one tile at a time motive is another
        Field.moveDrunk(self, motive)
        while (self.party[motive] in self.dirtTiles):
            self.party[motive] =\
                self.party[motive].move(x, y)
            x, y = motive.takeStep()
            self.party[motive] =\
                self.party[motive].move(x, y)
start = Location(0, 0)
f = dirtyField()
homer = SouthDrunk('Homer')
f.addDrunk(homer, start)
f.moveDrunk(homer)
print(f.getLoc(homer))
I had grammar errors, and the moveDrunk override was too much. I also needed w += 1 in __init__.
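For reference, a sketch of the corrected __init__ based on that fix (Field, Location, and random come from the original course code; the only real change is incrementing w so the loop terminates):

class dirtyField(Field):
    def __init__(self, dirtyTiles=1000, xRange=100, yRange=100):
        Field.__init__(self)
        self.dirtTile = []
        w = 0
        while w < dirtyTiles:
            x = random.randint(-xRange, xRange)
            y = random.randint(-yRange, yRange)
            self.dirtTile.append(Location(x, y))
            w += 1  # without this the while loop never ends and the cell hangs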

AttributeError: 'float' object has no attribute 'find'

I want to find degree holders.
The following code is causing AttributeError: 'float' object has no attribute 'find' and I do not know how to fix it:
edu = Edu_data

# Function to identify degree
def degree(x):
    #if x.find('Bachelor') != -1 or x.find("Bachelor's") != -1 or x.find('BS') != -1 or x.find('bs') != -1 or x.find('Bs') != -1 or x.find('Bachelors') != -1 or x.find('Undergraduate') != -1 or x.find('graduated')!= -1 or x.find('BSE')!= -1 or x.find('Enginee') != -1 or x.find('BCS') != -1:
    if x.find('Bachelor') != -1 or x.find("Bachelor's") != -1 or x.find('BS') != -1 or x.find('bs') != -1:
        return(1)
    if x.find('Master') != -1 or x.find("Master's") != -1 or x.find('M.S') != -1 or x.find('MS') != -1 or x.find('MPhil') != -1 or x.find('MBA') != -1 or x.find('MicroMasters') != -1 or x.find('MSc') != -1 or x.find('MSCS') !=-1 or x.find('MSDS')!=-1:
        return(2)
    if x.find('PhD') != -1 or x.find('P.hd') != -1 or x.find('Ph.D') != -1 or x.find('ph.d') != -1:
        return(3)
    else:
        return(0)

# Create degree column
edu['deg'] = list(map(degree, edu['Last_degree']))
edu
The full traceback for the error:
AttributeError Traceback (most recent call last)
<ipython-input-39-9a79283a9b17> in <module>
16
17 # Create degree column
---> 18 edu['deg'] = list(map(degree, edu['Last_degree']))
19
20 edu
<ipython-input-39-9a79283a9b17> in degree(x)
5 # Function to identify degree
6 def degree(x):
----> 7 if x.find('Bachelor') != -1 or x.find("Bachelor's") != -1 or x.find('BS') != -1 or x.find('bs') != -1 or x.find('Bs') != -1 or x.find('Bachelors') != -1 or x.find('Undergraduate') != -1 or x.find('graduated')!= -1 or x.find('BSE')!= -1 or x.find('Enginee') != -1 or x.find('BCS') != -1:
8 return(1)
9 if x.find('Master') != -1 or x.find("Master's") != -1 or x.find('M.S') != -1 or x.find('MS') != -1 or x.find('MPhil') != -1 or x.find('MBA') != -1 or x.find('MicroMasters') != -1 or x.find('MSc') != -1 or x.find('MSCS') !=-1 or x.find('MSDS')!=-1:
AttributeError: 'float' object has no attribute 'find'
image of dataset
Does the image show all your data? Is it possible that somewhere in the data frame the Last_degree column contains a float (for example a NaN)?
If yes, change this line:
edu['deg'] = list(map(degree, edu['Last_degree']))
to
edu['deg'] = list(map(degree, edu['Last_degree'].astype(str)))
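If the floats are just missing values (NaN is a float in pandas), another option is to skip non-strings inside the function instead of casting the whole column. A sketch with an abbreviated keyword list, assuming the same DataFrame and column names:

def degree(x):
    # Missing values arrive as float NaN, which has no .find method.
    if not isinstance(x, str):
        return 0
    if 'Bachelor' in x or 'BS' in x or 'bs' in x:
        return 1
    if 'Master' in x or 'MS' in x or 'MPhil' in x or 'MBA' in x or 'MSc' in x:
        return 2
    if 'PhD' in x or 'Ph.D' in x or 'ph.d' in x:
        return 3
    return 0

edu['deg'] = edu['Last_degree'].map(degree)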

PyTorch DataLoader error "Too many open files" when yielding an int

I'm trying to implement a custom IterableDataset in which I read words from a file, get their unique ids, gather them, and return them batched.
import os
import torch
import tqdm
from torch.utils.data import IterableDataset, DataLoader
import vocab  # C++ class bound to python with pybind11


class MyIterableDataset(IterableDataset):
    def __init__(self, file_path, v, num_workers=4):
        super(MyIterableDataset).__init__()
        self.file_path = file_path
        self.file_size = os.stat(file_path).st_size
        self.v = v  # vocab object, bound from a C++ class with pybind11
        chunk_size = self.file_size // num_workers
        start = 0
        end = chunk_size
        bonus = self.file_size - chunk_size * num_workers
        if (bonus > 0):
            end = chunk_size + 1
            bonus -= 1
        self.endpoints = [(start, end)]
        for i in range(1, num_workers):
            start = end
            if (bonus > 0):
                end += chunk_size + 1
                bonus -= 1
            else:
                end += chunk_size
            self.endpoints.append((start, end))

    def read_word(self, f):
        ch = ''
        word = ""
        while True:
            ch = f.read(1)
            if not ch:
                return ''
            if (str.isspace(ch)):
                if len(word) > 0:
                    break
                if (ch == '\n'):
                    return "\n"
                else:
                    continue
            word += ch
        return word

    def parse_file(self, start, words_to_read, id):
        words_read = 0
        f = open(self.file_path, "r")
        f.seek(start, 0)
        if id > 0:
            while True:
                ch = f.read(1)
                if not ch or str.isspace(ch):
                    break
                start += 1
                f.seek(start, 0)
        while True:
            word = self.read_word(f)
            if word and word != "\n":
                wid = self.v.word2id(word)
                if wid != -1:
                    words_read += 1
                    yield wid  # if I yield 'word' instead, everything works. You can also yield 1 and you get the error
            if words_read >= words_to_read or not word:
                break
        f.close()

    def __iter__(self):
        worker_info = torch.utils.data.get_worker_info()
        words_to_read = self.v.get_train_words() // worker_info.num_workers
        start, end = self.endpoints[worker_info.id]
        return self.parse_file(start, words_to_read, worker_info.id)
Upon running a DataLoader over my dataset with
num_workers = 7
v = vocab.Vocab("./text8") # Vocab is a C++ class bound to python with pybind11
ds = MyIterableDataset(file_path=file_path, v=v, num_workers=num_workers)
wids = [j for _, j in tqdm.tqdm(enumerate(DataLoader(ds, num_workers=num_workers, batch_size=10)))]
whenever I yield the word id I get the following error:
RuntimeError Traceback (most recent call last)
<ipython-input-18-04575fb9c982> in <module>
2
3 t0 = time.time()
----> 4 tokens = [j for _, j in tqdm.tqdm(enumerate(DataLoader(ds, num_workers=num_workers, batch_size=10)))]
5 print()
6 print(time.time() - t0)
<ipython-input-18-04575fb9c982> in <listcomp>(.0)
2
3 t0 = time.time()
----> 4 tokens = [j for _, j in tqdm.tqdm(enumerate(DataLoader(ds, num_workers=num_workers, batch_size=10)))]
5 print()
6 print(time.time() - t0)
~/miniconda3/envs/word2gm/lib/python3.8/site-packages/tqdm/std.py in __iter__(self)
1165
1166 try:
-> 1167 for obj in iterable:
1168 yield obj
1169 # Update and possibly print the progressbar.
~/miniconda3/envs/word2gm/lib/python3.8/site-packages/torch/utils/data/dataloader.py in __next__(self)
433 if self._sampler_iter is None:
434 self._reset()
--> 435 data = self._next_data()
436 self._num_yielded += 1
437 if self._dataset_kind == _DatasetKind.Iterable and \
~/miniconda3/envs/word2gm/lib/python3.8/site-packages/torch/utils/data/dataloader.py in _next_data(self)
1066
1067 assert not self._shutdown and self._tasks_outstanding > 0
-> 1068 idx, data = self._get_data()
1069 self._tasks_outstanding -= 1
1070 if self._dataset_kind == _DatasetKind.Iterable:
~/miniconda3/envs/word2gm/lib/python3.8/site-packages/torch/utils/data/dataloader.py in _get_data(self)
1032 else:
1033 while True:
-> 1034 success, data = self._try_get_data()
1035 if success:
1036 return data
~/miniconda3/envs/word2gm/lib/python3.8/site-packages/torch/utils/data/dataloader.py in _try_get_data(self, timeout)
897 except OSError as e:
898 if e.errno == errno.EMFILE:
--> 899 raise RuntimeError(
900 "Too many open files. Communication with the"
901 " workers is no longer possible. Please increase the"
RuntimeError: Too many open files. Communication with the workers is no longer possible. Please increase the limit using `ulimit -n` in the shell or change the sharing strategy by calling `torch.multiprocessing.set_sharing_strategy('file_system')` at the beginning of your code
whereas if I yield the word, everything works!
Can someone help me understand why this is happening in the first place?
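For what it's worth, the error message itself points at two workarounds: raising the file-descriptor limit with ulimit -n, or switching PyTorch's sharing strategy. A minimal sketch of the latter (placed before the DataLoader is created):

import torch.multiprocessing

# The default 'file_descriptor' strategy keeps one descriptor per tensor shared
# between workers; 'file_system' avoids exhausting the per-process limit.
torch.multiprocessing.set_sharing_strategy('file_system')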

Python unit test failure: should raise ValueError but does not

Here is my code, written while trying to learn unit testing.
I created a Student class for the purpose of testing. The test_invalid test case constantly fails:
FAIL: test_invalid (__main__.TestStudent)
----------------------------------------------------------------------
Traceback (most recent call last):
File "mystudent.py", line 46, in test_invalid
s1.get_grade()
AssertionError: ValueError not raised
The above is the output from running the tests.
Could anyone help me figure out why I get this failure when I think I have put the right 'raise ValueError' code there?
import unittest


class Student(object):
    def __init__(self, name, score):
        self.name = name
        self.score = score

    def get_grade(self):
        try:
            if self.score >= 60 and self.score < 80:
                return 'B'
            if self.score >= 80 and self.score <= 100:
                return 'A'
            if self.score >= 0 and self.score < 60:
                return 'C'
            if self.score < 0 or self.score > 100:
                raise ValueError('Invalid score value')
        except Exception as e:
            print('Value error!')


class TestStudent(unittest.TestCase):
    def test_invalid(self):
        s1 = Student('Bob', -1)
        s2 = Student('Bat', 101)
        with self.assertRaises(ValueError):
            s1.get_grade()
        with self.assertRaises(ValueError):
            s2.get_grade()


if __name__ == '__main__':
    unittest.main()
Thanks
You're catching the ValueError inside the function. You need to either remove the try/except block in the function or re-raise it after doing whatever you want inside:
def get_grade(self):
    try:
        if self.score >= 60 and self.score < 80:
            return 'B'
        if self.score >= 80 and self.score <= 100:
            return 'A'
        if self.score >= 0 and self.score < 60:
            return 'C'
        if self.score < 0 or self.score > 100:
            raise ValueError('Invalid score value')
    except Exception as e:
        print('Value error!')
        raise  # Passes the exception up
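Alternatively, a sketch of the first option (dropping the try/except entirely so the ValueError propagates on its own):

def get_grade(self):
    if self.score < 0 or self.score > 100:
        raise ValueError('Invalid score value')
    if self.score >= 80:
        return 'A'  # 80-100
    if self.score >= 60:
        return 'B'  # 60-79
    return 'C'      # 0-59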
