The following code runs the EleutherAI/gpt-neo-1.3B model. The model runs on the CPU, but I don't understand why it does not use my GPU. Did I miss something?
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
prompt = ("What is the capital of France?")
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
gen_tokens = model.generate(input_ids, do_sample=True, temperature=0.9, max_length=50 )
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print (gen_text)
By the way, here is the output of the nvidia-smi command:
Thu Feb 16 14:58:28 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.108.03 Driver Version: 510.108.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:73:00.0 On | N/A |
| 30% 31C P8 34W / 350W | 814MiB / 24576MiB | 22% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA RTX A5000 Off | 00000000:A6:00.0 Off | Off |
| 30% 31C P8 16W / 230W | 8MiB / 24564MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 3484 G /usr/lib/xorg/Xorg 378MiB |
| 0 N/A N/A 3660 G /usr/bin/gnome-shell 62MiB |
| 0 N/A N/A 4364 G ...662097787256072160,131072 225MiB |
| 0 N/A N/A 37532 G ...6/usr/lib/firefox/firefox 142MiB |
| 1 N/A N/A 3484 G /usr/lib/xorg/Xorg 4MiB |
+-----------------------------------------------------------------------------+
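One likely explanation, assuming PyTorch was installed with CUDA support: from_pretrained loads the weights on the CPU, and nothing in the snippet moves the model or input_ids to a GPU. The sketch below does that explicitly. The helper names pick_device and generate_on_device are hypothetical and not part of transformers.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch there is nothing to place on a GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def generate_on_device(prompt, model_name="EleutherAI/gpt-neo-1.3B", max_length=50):
    """Run the generation from the question with model and inputs on one device."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = pick_device()
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # from_pretrained returns a CPU model; .to(device) moves the weights.
    model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
    # The input tensor has to live on the same device as the model.
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    gen_tokens = model.generate(
        input_ids, do_sample=True, temperature=0.9, max_length=max_length
    )
    return tokenizer.batch_decode(gen_tokens)[0]


if __name__ == "__main__":
    # Only reports the device; calling generate_on_device downloads the model.
    print("device:", pick_device())
```

If pick_device() returns "cpu" even though nvidia-smi lists both cards, the installed PyTorch build is probably CPU-only or does not match the driver's CUDA version.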
Python version: 3.7.6
TensorFlow version: 2.3.0
CUDA: 10.2.89
CUDNN: 10.2
nvcc --version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:32:27_Pacific_Daylight_Time_2019
Cuda compilation tools, release 10.2, V10.2.89
nvidia-smi output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 451.48 Driver Version: 451.48 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1080 WDDM | 00000000:04:00.0 On | N/A |
| 0% 47C P8 8W / 200W | 463MiB / 8192MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1268 C+G Insufficient Permissions N/A |
| 0 N/A N/A 1308 C+G Insufficient Permissions N/A |
| 0 N/A N/A 4936 C+G ...\Direct4\jabra-direct.exe N/A |
| 0 N/A N/A 7500 C+G Insufficient Permissions N/A |
| 0 N/A N/A 7516 C+G ...w5n1h2txyewy\SearchUI.exe N/A |
| 0 N/A N/A 9668 C+G Insufficient Permissions N/A |
| 0 N/A N/A 10676 C+G C:\Windows\explorer.exe N/A |
| 0 N/A N/A 10828 C+G ...st\Desktop\Mattermost.exe N/A |
| 0 N/A N/A 11536 C+G ...8bbwe\Microsoft.Notes.exe N/A |
| 0 N/A N/A 14604 C+G ...es.TextInput.InputApp.exe N/A |
+-----------------------------------------------------------------------------+
I tried:
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
Num GPUs Available: 0
Why is TensorFlow not able to detect the GPU?
It seems you are trying to use the GPU build of TensorFlow, but the CUDA and cuDNN versions you have installed are not the ones it supports.
Note: GPU support is available on Ubuntu and Windows with CUDA-enabled cards only.
If you have a CUDA-enabled card, follow the instructions below.
As stated in the TensorFlow documentation, the software requirements are as follows:
NVIDIA GPU drivers - 418.x or higher
CUDA - 10.1 (TensorFlow >= 2.1.0)
cuDNN - 7.6
Make sure you have exactly these versions of the software installed. See this.
Also, check the system requirements here.
Make sure you have installed all the C++ redistributables - here.
For downloading the software mentioned above, see here.
For downloading TensorFlow, follow the instructions provided here to install the necessary packages correctly.
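The version check described above can be sketched in a few lines. This is only an illustration: REQUIRED copies the versions listed in this answer, the installed values are the ones reported in the question, and the helper names are hypothetical.

```python
# Versions the answer lists for TensorFlow >= 2.1.0 (major.minor only).
REQUIRED = {"cuda": "10.1", "cudnn": "7.6"}


def major_minor(version):
    """Reduce a version string such as '10.2.89' to 'major.minor'."""
    parts = version.split(".")
    return ".".join(parts[:2])


def find_mismatches(installed, required=REQUIRED):
    """Return {component: (installed, required)} for every mismatch."""
    mismatches = {}
    for name, wanted in required.items():
        have = major_minor(installed.get(name, ""))
        if have != wanted:
            mismatches[name] = (have, wanted)
    return mismatches


def visible_gpus():
    """List the GPUs TensorFlow can see (needs TensorFlow 2.1 or later)."""
    import tensorflow as tf

    return tf.config.list_physical_devices("GPU")


if __name__ == "__main__":
    # Values reported in the question: CUDA 10.2.89, "CUDNN: 10.2".
    print(find_mismatches({"cuda": "10.2.89", "cudnn": "10.2"}))
```

With the values from the question, both components are reported as mismatched, which is consistent with TensorFlow finding zero GPUs.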
I have been using this:
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
in order to run on the GPU. It had been working properly until today.
The problem now is that, in the middle of a run, my program stops using the GPU and switches to the CPU, so it becomes too slow.
Any idea why that is happening?
Output of nvidia-smi at the beginning of the execution:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67 Driver Version: 418.67 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 970 On | 00000000:01:00.0 On | N/A |
| 0% 42C P8 14W / 200W | 363MiB / 4039MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla K40c On | 00000000:05:00.0 Off | 0 |
| 35% 74C P0 136W / 235W | 11011MiB / 11441MiB | 94% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1037 G /usr/lib/xorg/Xorg 20MiB |
| 0 1150 G /usr/bin/gnome-shell 12MiB |
| 0 7430 G /usr/lib/xorg/Xorg 166MiB |
| 0 7560 G /usr/bin/gnome-shell 158MiB |
| 1 13772 C python3 10998MiB |
+-----------------------------------------------------------------------------+
And then, when it begins to run too slowly:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67 Driver Version: 418.67 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 970 On | 00000000:01:00.0 On | N/A |
| 0% 42C P8 14W / 200W | 363MiB / 4039MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla K40c On | 00000000:05:00.0 Off | 0 |
| 35% 69C P0 63W / 235W | 11011MiB / 11441MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1037 G /usr/lib/xorg/Xorg 20MiB |
| 0 1150 G /usr/bin/gnome-shell 12MiB |
| 0 7430 G /usr/lib/xorg/Xorg 166MiB |
| 0 7560 G /usr/bin/gnome-shell 158MiB |
| 1 13772 C python3 10998MiB |
+-----------------------------------------------------------------------------+
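In both snapshots python3 still holds 10998MiB on GPU 1; only the utilization drops from 94% to 0%. That suggests the process still owns the GPU but is no longer feeding it work, and it does not by itself show that computation moved to the CPU. To find out when the drop happens, one option is to log utilization during the run. The sketch below assumes nvidia-smi is on the PATH and that the queried fields are numeric; the function names are hypothetical.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Parse 'index, util, mem' CSV lines into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        index, util, mem = [field.strip() for field in line.split(",")]
        rows.append({"index": int(index), "util_pct": int(util), "mem_mib": int(mem)})
    return rows


def sample_gpus():
    """Call nvidia-smi once and return the parsed rows (needs the NVIDIA driver)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


def stalled(rows, gpu_index, min_util=5):
    """True when the given GPU still holds memory but is nearly idle."""
    for row in rows:
        if row["index"] == gpu_index:
            return row["mem_mib"] > 0 and row["util_pct"] < min_util
    raise KeyError(gpu_index)


if __name__ == "__main__":
    # The two states from the question, written in the CSV layout queried above.
    before = parse_gpu_csv("0, 0, 363\n1, 94, 11011")
    after = parse_gpu_csv("0, 0, 363\n1, 0, 11011")
    print(stalled(before, 1), stalled(after, 1))
```

Calling sample_gpus() every few seconds next to the training loop, and logging the step number alongside it, would show which stage of the program coincides with the drop.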
I am trying to train a CNN model on an AWS EC2 p3.16xlarge instance, which has 8 GPUs. When I use a batch size of 500, only one GPU is utilized all the time, even though the system has 8. When I increase the batch size to 1000, it still uses only one GPU and runs much slower than with a batch size of 500. If I increase the batch size to 2000, a memory overflow occurs. How can I fix this issue?
I am using the TensorFlow backend. GPU utilization is shown below:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104 Driver Version: 410.104 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... On | 00000000:00:17.0 Off | 0 |
| N/A 47C P0 69W / 300W | 15646MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-SXM2... On | 00000000:00:18.0 Off | 0 |
| N/A 44C P0 59W / 300W | 502MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla V100-SXM2... On | 00000000:00:19.0 Off | 0 |
| N/A 45C P0 61W / 300W | 502MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla V100-SXM2... On | 00000000:00:1A.0 Off | 0 |
| N/A 47C P0 64W / 300W | 502MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 4 Tesla V100-SXM2... On | 00000000:00:1B.0 Off | 0 |
| N/A 48C P0 62W / 300W | 502MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 5 Tesla V100-SXM2... On | 00000000:00:1C.0 Off | 0 |
| N/A 46C P0 61W / 300W | 502MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 6 Tesla V100-SXM2... On | 00000000:00:1D.0 Off | 0 |
| N/A 46C P0 65W / 300W | 502MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 7 Tesla V100-SXM2... On | 00000000:00:1E.0 Off | 0 |
| N/A 46C P0 63W / 300W | 502MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 15745 C python3 15635MiB |
| 1 15745 C python3 491MiB |
| 2 15745 C python3 491MiB |
| 3 15745 C python3 491MiB |
| 4 15745 C python3 491MiB |
| 5 15745 C python3 491MiB |
| 6 15745 C python3 491MiB |
| 7 15745 C python3 491MiB |
+-----------------------------------------------------------------------------+
You are probably looking for multi_gpu_model. You can find it in the Keras documentation.
You can just take your model and do parallel_model = multi_gpu_model(model, gpus=n_gpus).
Next time, don't forget to include a minimal working example.
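A minimal sketch of that suggestion follows. It assumes an older Keras release that still ships multi_gpu_model (it was removed from tf.keras in TensorFlow 2.4, where tf.distribute.MirroredStrategy takes its place). The helper per_gpu_batch_sizes is hypothetical and only illustrates roughly how a global batch is divided; it is not Keras's exact slicing.

```python
def per_gpu_batch_sizes(batch_size, n_gpus):
    """Split a global batch into near-equal per-GPU sub-batches."""
    base, extra = divmod(batch_size, n_gpus)
    return [base + 1 if i < extra else base for i in range(n_gpus)]


def make_parallel(model, n_gpus):
    """Wrap a Keras model so each batch is divided across n_gpus GPUs."""
    # Imported lazily: only available in Keras versions that still include it.
    from keras.utils import multi_gpu_model

    return multi_gpu_model(model, gpus=n_gpus)


if __name__ == "__main__":
    # Batch sizes mentioned in the question, on the 8 GPUs of a p3.16xlarge.
    for batch_size in (500, 1000, 2000):
        print(batch_size, per_gpu_batch_sizes(batch_size, 8))
```

After wrapping, compile and fit the returned parallel model rather than the original one. With 8 GPUs, a global batch of 2000 becomes roughly 250 samples per GPU, which may also avoid the memory overflow seen on a single card.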
Currently I have this data retrieved from the database, as follows:
+------------+--------------+-------+-----+-------------+-----------+------------------+-----------------+
| Monitor ID | Casting Date | Label | AGE | Client Name | Project | Average Strength | Average Density |
+------------+--------------+-------+-----+-------------+-----------+------------------+-----------------+
| 1082 | 2018-07-05 | b52 | 1 | Trial Mix | Trial Mix | 21.78 | 2.436 |
| 1082 | 2018-07-05 | b52 | 2 | Trial Mix | Trial Mix | 33.11 | 2.406 |
| 1082 | 2018-07-05 | b52 | 4 | Trial Mix | Trial Mix | 43.11 | 2.447 |
| 1082 | 2018-07-05 | b52 | 8 | Trial Mix | Trial Mix | 48.22 | 2.444 |
| 1083 | 2018-07-05 | B53 | 1 | Trial Mix | Trial Mix | 10.44 | 2.421 |
| 1083 | 2018-07-05 | B53 | 2 | Trial Mix | Trial Mix | 20.0 | 2.400 |
| 1083 | 2018-07-05 | B53 | 4 | Trial Mix | Trial Mix | 27.78 | 2.397 |
| 1083 | 2018-07-05 | B53 | 8 | Trial Mix | Trial Mix | 33.33 | 2.409 |
| 1084 | 2018-07-05 | B54 | 1 | Trial Mix | Trial Mix | 12.89 | 2.430 |
| 1084 | 2018-07-05 | B54 | 2 | Trial Mix | Trial Mix | 24.44 | 2.427 |
| 1084 | 2018-07-05 | B54 | 4 | Trial Mix | Trial Mix | 34.22 | 2.412 |
| 1084 | 2018-07-05 | B54 | 8 | Trial Mix | Trial Mix | 41.56 | 2.501 |
+------------+--------------+-------+-----+-------------+-----------+------------------+-----------------+
How can I change the table to something like this?
+------------+--------------+-------+-----------+-----------+---------+-------------+---------+-------------+---------+-------------+---------+-------------+
| Monitor Id | Casting Date | Label | Client | Project | 1 Day | | 2 Days | | 4 Days | | 8 Days | |
+------------+--------------+-------+-----------+-----------+---------+-------------+---------+-------------+---------+-------------+---------+-------------+
| | | | | | avg str | avg density | avg str | avg density | avg str | avg density | avg str | avg density |
| | | | | | | | | | | | | |
| 1082 | 05/07/2018 | B52 | Trial Mix | Trial Mix | 21.78 | 2.436 | 33.11 | 2.406 | 43.11 | 2.44 | 48.22 | 2.444 |
| 1083 | 05/07/2018 | B53 | Trial Mix | Trial Mix | 10.44 | 2.421 | 20 | 2.4 | 27.78 | 2.397 | 33.33 | 2.409 |
| 1084 | 05/07/2018 | B54 | Trial Mix | Trial Mix | 12.89 | 2.43 | 24.44 | 2.427 | 34.22 | 2.412 | 41.56 | 2.501 |
+------------+--------------+-------+-----------+-----------+---------+-------------+---------+-------------+---------+-------------+---------+-------------+
I get the data by joining multiple tables from the database using peewee.
Below is my full code to retrieve and format the data:
from lib.database import *
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
from prettytable import PrettyTable
import numpy as np
# table to hold data
table = PrettyTable()
table.field_names = ['Monitor ID', 'Casting Date', 'Label', 'AGE', 'Client Name', 'Project', 'Average Strength', 'Average Density']
# only keep monitors cast within the last 2 weeks
# (named `cutoff` rather than `int`, which would shadow the built-in)
cutoff = datetime.today() - timedelta(days=14)
# join each monitor with its results, ordered by label and then by age
result = (MonitorCombine
          .select(ResultCombine.strength.alias('str'), ResultCombine.density.alias('density'), ResultCombine.age,
                  MonitorCombine.clientname, MonitorCombine.p_alias, MonitorCombine.monitorid,
                  MonitorCombine.monitor_label, MonitorCombine.casting_date)
          .join(ResultCombine, on=(ResultCombine.monitorid == MonitorCombine.monitorid))
          .dicts()
          .where(MonitorCombine.casting_date > cutoff)
          .order_by(MonitorCombine.monitor_label, ResultCombine.age.asc()))
for r in result:
    table.add_row([r['monitorid'], r['casting_date'], r['monitor_label'], r['age'], r['clientname'], r['p_alias'], r['str'], r['density']])
print(table)
You have to pivot the data. Since MariaDB has no PIVOT operator, you can do it in SQL with conditional aggregation:
SELECT
MonitorID,
CastingDate,
Label,
ClientName,
Project,
SUM(IF(Age=1, AverageStrength, 0)) AS AvgStr1,
SUM(IF(Age=2, AverageStrength, 0)) AS AvgStr2,
SUM(IF(Age=4, AverageStrength, 0)) AS AvgStr4,
SUM(IF(Age=8, AverageStrength, 0)) AS AvgStr8,
SUM(IF(Age=1, AverageDensity, 0)) AS AvgDensity1,
SUM(IF(Age=2, AverageDensity, 0)) AS AvgDensity2,
SUM(IF(Age=4, AverageDensity, 0)) AS AvgDensity4,
SUM(IF(Age=8, AverageDensity, 0)) AS AvgDensity8
FROM
YourTable
GROUP BY MonitorID, CastingDate, Label, ClientName, Project -- Age is left out so each monitor collapses to one row
ORDER BY MonitorID, CastingDate;
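The same pivot can also be done on the Python side, after the query has run. The sketch below assumes rows shaped like the dicts used in the question's loop (keys monitorid, casting_date, monitor_label, age, clientname, p_alias, str, density). The function name pivot_rows and the output keys str_N and density_N are hypothetical.

```python
from collections import OrderedDict

AGES = (1, 2, 4, 8)


def pivot_rows(rows, ages=AGES):
    """Collapse one-row-per-age records into one wide record per monitor."""
    wide = OrderedDict()
    for r in rows:
        key = r["monitorid"]
        if key not in wide:
            wide[key] = {
                "monitorid": r["monitorid"],
                "casting_date": r["casting_date"],
                "monitor_label": r["monitor_label"],
                "clientname": r["clientname"],
                "p_alias": r["p_alias"],
            }
            # Pre-fill every age so missing measurements show up as None.
            for age in ages:
                wide[key]["str_%d" % age] = None
                wide[key]["density_%d" % age] = None
        if r["age"] in ages:
            wide[key]["str_%d" % r["age"]] = r["str"]
            wide[key]["density_%d" % r["age"]] = r["density"]
    return list(wide.values())


if __name__ == "__main__":
    # Two of the rows shown in the question.
    sample = [
        {"monitorid": 1082, "casting_date": "2018-07-05", "monitor_label": "b52", "age": 1,
         "clientname": "Trial Mix", "p_alias": "Trial Mix", "str": 21.78, "density": 2.436},
        {"monitorid": 1082, "casting_date": "2018-07-05", "monitor_label": "b52", "age": 2,
         "clientname": "Trial Mix", "p_alias": "Trial Mix", "str": 33.11, "density": 2.406},
    ]
    for row in pivot_rows(sample):
        print(row)
```

Each wide record can then be passed to table.add_row in the column order of the desired layout. Unlike the SUM(IF(...)) form, a missing age stays None here instead of becoming 0.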
When I train a VGG16 network on a GPU using TensorFlow, it always shows CUDA_ERROR_OUT_OF_MEMORY and eventually stops with the error tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
I searched the internet for those messages and found some tips:
Set config.gpu_options.allow_growth to True.
Set config.gpu_options.per_process_gpu_memory_fraction to a smaller fraction, like 0.6.
Use a smaller batch size.
But these tips don't work; the process behaves as if nothing had changed.
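For reference, the first two tips only take effect when the config object is actually passed to the session that runs the training. A minimal sketch, assuming the TensorFlow 1.x API (the log paths further down mention tensorflow-1.1.0), is shown here. The function name and the range check on the fraction are my own additions.

```python
def build_session_config(memory_fraction=0.6, allow_growth=True):
    """Return a tf.ConfigProto (TF 1.x API) with the two GPU memory options set."""
    # Assumption: the fraction is a share of one GPU's memory, so it is in (0, 1].
    if not 0.0 < memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be in (0, 1]")
    import tensorflow as tf  # imported lazily; needs a TensorFlow 1.x install

    config = tf.ConfigProto()
    # Allocate GPU memory on demand instead of reserving it all up front.
    config.gpu_options.allow_growth = allow_growth
    # Upper bound on the share of GPU memory this process may use.
    config.gpu_options.per_process_gpu_memory_fraction = memory_fraction
    return config


if __name__ == "__main__":
    try:
        config = build_session_config()
        print(config.gpu_options)
    except (ImportError, AttributeError) as exc:
        # No TensorFlow, or a TensorFlow version without tf.ConfigProto.
        print("TensorFlow 1.x not available:", exc)
```

The config would then be used as tf.Session(config=build_session_config()). If the training script creates its session elsewhere without this argument, the settings are silently ignored, which would match "nothing changed".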
Here is my hardware:
GPU: NVIDIA GTX 1060
Memory: 3G + 4G(shared memory)
I monitored the usage of GPU using nvidia-smi, and below is the detail.
Before Running:
Thu Apr 19 14:21:59 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 388.31 Driver Version: 388.31 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1060 WDDM | 00000000:01:00.0 On | N/A |
| N/A 50C P8 7W / N/A | 587MiB / 3072MiB | 2% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 7300 C+G ...osoft Office\root\Office16\POWERPNT.EXE N/A |
| 0 8244 C+G ...6)\Youdao\YoudaoNote\YNoteCefRender.exe N/A |
| 0 9988 C+G C:\Windows\explorer.exe N/A |
| 0 10696 C+G ...t_cw5n1h2txyewy\ShellExperienceHost.exe N/A |
| 0 10808 C+G ...dows.Cortana_cw5n1h2txyewy\SearchUI.exe N/A |
| 0 11024 C+G Insufficient Permissions N/A |
| 0 11092 C+G C:\Windows\System32\mstsc.exe N/A |
| 0 13076 C+G ...ogram Files (x86)\Skype\Phone\Skype.exe N/A |
| 0 14664 C+G ...osoft Office\root\Office16\POWERPNT.EXE N/A |
+-----------------------------------------------------------------------------+
When the process begins:
Thu Apr 19 14:24:23 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 388.31 Driver Version: 388.31 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1060 WDDM | 00000000:01:00.0 On | N/A |
| N/A 48C P2 28W / N/A | 1133MiB / 3072MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 7300 C+G ...osoft Office\root\Office16\POWERPNT.EXE N/A |
| 0 9988 C+G C:\Windows\explorer.exe N/A |
| 0 10696 C+G ...t_cw5n1h2txyewy\ShellExperienceHost.exe N/A |
| 0 10808 C+G ...dows.Cortana_cw5n1h2txyewy\SearchUI.exe N/A |
| 0 11024 C+G Insufficient Permissions N/A |
| 0 11092 C+G C:\Windows\System32\mstsc.exe N/A |
| 0 13076 C+G ...ogram Files (x86)\Skype\Phone\Skype.exe N/A |
| 0 14404 C ...ools\Anaconda3\envs\py36_tfg\python.exe N/A |
| 0 14664 C+G ...osoft Office\root\Office16\POWERPNT.EXE N/A |
+-----------------------------------------------------------------------------+
After 10 steps:
Thu Apr 19 14:30:40 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 388.31 Driver Version: 388.31 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1060 WDDM | 00000000:01:00.0 On | N/A |
| N/A 64C P2 31W / N/A | 2595MiB / 3072MiB | 1% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 7300 C+G ...osoft Office\root\Office16\POWERPNT.EXE N/A |
| 0 9988 C+G C:\Windows\explorer.exe N/A |
| 0 10696 C+G ...t_cw5n1h2txyewy\ShellExperienceHost.exe N/A |
| 0 10808 C+G ...dows.Cortana_cw5n1h2txyewy\SearchUI.exe N/A |
| 0 11024 C+G Insufficient Permissions N/A |
| 0 11092 C+G C:\Windows\System32\mstsc.exe N/A |
| 0 13076 C+G ...ogram Files (x86)\Skype\Phone\Skype.exe N/A |
| 0 14404 C ...ools\Anaconda3\envs\py36_tfg\python.exe N/A |
| 0 14664 C+G ...osoft Office\root\Office16\POWERPNT.EXE N/A |
+-----------------------------------------------------------------------------+
After 60 steps:
Some messages were shown, but the process kept running:
2018-04-19 14:33:56.384528: E c:\l\work\tensorflow-1.1.0\tensorflow\stream_executor\cuda\cuda_driver.cc:924] failed to alloc 2147483648 bytes on host: CUDA_ERROR_OUT_OF_MEMORY
2018-04-19 14:33:56.423080: E c:\l\work\tensorflow-1.1.0\tensorflow\stream_executor\cuda\cuda_driver.cc:924] failed to alloc 1932735232 bytes on host: CUDA_ERROR_OUT_OF_MEMORY
2018-04-19 14:33:56.474281: E c:\l\work\tensorflow-1.1.0\tensorflow\stream_executor\cuda\cuda_driver.cc:924] failed to alloc 1739461632 bytes on host: CUDA_ERROR_OUT_OF_MEMORY
Thu Apr 19 14:36:13 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 388.31 Driver Version: 388.31 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1060 WDDM | 00000000:01:00.0 On | N/A |
| N/A 63C P2 33W / N/A | 2602MiB / 3072MiB | 43% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 7300 C+G ...osoft Office\root\Office16\POWERPNT.EXE N/A |
| 0 9988 C+G C:\Windows\explorer.exe N/A |
| 0 10696 C+G ...t_cw5n1h2txyewy\ShellExperienceHost.exe N/A |
| 0 10808 C+G ...dows.Cortana_cw5n1h2txyewy\SearchUI.exe N/A |
| 0 11024 C+G Insufficient Permissions N/A |
| 0 11092 C+G C:\Windows\System32\mstsc.exe N/A |
| 0 13076 C+G ...ogram Files (x86)\Skype\Phone\Skype.exe N/A |
| 0 14404 C ...ools\Anaconda3\envs\py36_tfg\python.exe N/A |
| 0 14664 C+G ...osoft Office\root\Office16\POWERPNT.EXE N/A |
+-----------------------------------------------------------------------------+
After 170 steps:
About eight hundred lines of messages were shown, then the process stopped with errors.
About eight hundred lines like this:
2018-04-19 14:49:35.688274: E c:\l\work\tensorflow-1.1.0\tensorflow\stream_executor\cuda\cuda_driver.cc:924] failed to alloc 4294967296 bytes on host: CUDA_ERROR_OUT_OF_MEMORY
Stopped with some errors:
Traceback (most recent call last):
  File "C:\DevTools\Anaconda3\envs\py36_tfg\lib\site-packages\tensorflow\python\client\session.py", line 1039, in _do_call
    return fn(*args)
  File "C:\DevTools\Anaconda3\envs\py36_tfg\lib\site-packages\tensorflow\python\client\session.py", line 1021, in _run_fn
    status, run_metadata)
  File "C:\DevTools\Anaconda3\envs\py36_tfg\lib\contextlib.py", line 88, in __exit__
    next(self.gen)
  File "C:\DevTools\Anaconda3\envs\py36_tfg\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: input/input/div/_79 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_111_input/input/div", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "vgg16_train_and_test.py", line 212, in <module>
    train()
  File "vgg16_train_and_test.py", line 124, in train
    coord.join(threads)
  File "C:\DevTools\Anaconda3\envs\py36_tfg\lib\site-packages\tensorflow\python\training\coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "C:\DevTools\Anaconda3\envs\py36_tfg\lib\site-packages\six.py", line 693, in reraise
    raise value
  File "C:\DevTools\Anaconda3\envs\py36_tfg\lib\site-packages\tensorflow\python\training\queue_runner_impl.py", line 234, in _run
    sess.run(enqueue_op)
  File "C:\DevTools\Anaconda3\envs\py36_tfg\lib\site-packages\tensorflow\python\client\session.py", line 778, in run
    run_metadata_ptr)
  File "C:\DevTools\Anaconda3\envs\py36_tfg\lib\site-packages\tensorflow\python\client\session.py", line 982, in _run
    feed_dict_string, options, run_metadata)
  File "C:\DevTools\Anaconda3\envs\py36_tfg\lib\site-packages\tensorflow\python\client\session.py", line 1032, in _do_run
    target_list, options, run_metadata)
  File "C:\DevTools\Anaconda3\envs\py36_tfg\lib\site-packages\tensorflow\python\client\session.py", line 1052, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: input/input/div/_79 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_111_input/input/div", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
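One detail worth reading out of the log: every failed allocation says "on host", and the requested sizes are between about 1.6 and 4 GiB. That suggests the pressure is on host memory, not only on the 3 GiB of device memory. The cause is not confirmed by the post; an input queue holding too many decoded images is one possibility. The sketch below just extracts and converts the sizes so they are easier to compare. The function names are hypothetical and the regular expression assumes the message layout shown above.

```python
import re

# Matches the message layout shown in the log above.
ALLOC_RE = re.compile(r"failed to alloc (\d+) bytes on (\w+)")


def parse_failed_allocs(log_text):
    """Return (bytes, location) for every 'failed to alloc' line."""
    return [(int(size), where) for size, where in ALLOC_RE.findall(log_text)]


def to_gib(n_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    # Three of the lines from the question, shortened to the relevant part.
    log = (
        "cuda_driver.cc:924] failed to alloc 2147483648 bytes on host: CUDA_ERROR_OUT_OF_MEMORY\n"
        "cuda_driver.cc:924] failed to alloc 1932735232 bytes on host: CUDA_ERROR_OUT_OF_MEMORY\n"
        "cuda_driver.cc:924] failed to alloc 4294967296 bytes on host: CUDA_ERROR_OUT_OF_MEMORY\n"
    )
    for size, where in parse_failed_allocs(log):
        print("%.2f GiB on %s" % (to_gib(size), where))
```

Running this over the full log would show whether the requested sizes keep growing over the run, which would point to something accumulating between steps and not to a single oversized batch.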