Import trax takes too long to load - python

I was stumped the first time I loaded this library. On my local computer it took at least 40 seconds to load trax in a local Jupyter Notebook, and more than a minute to load it in a shared Colab environment.
import trax
I'm not sure if it's an issue with my installation or a bug in the version of trax I'm using.
I'm new to trax; in fact my experience is with Keras and TensorFlow, so I'd like to hear from someone in the trax community whether this is normal or not.
Thanks a lot in advance!
BTW: I'm using trax 1.4.1 with Python 3.9.6, and my local computer has the following specs:
Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz, 4 cores, and 16GB RAM.

After some searching, it seems that this is normal behavior. In fact, there is an issue about it in the official Trax repository: import trax takes 17 seconds #1368.
Apparently the fastmath module, which contains trax's re-implementations of math operations, has plenty of dependencies.
From the issue thread:
This is a well known problem occurring on basically all setups (local, colab, gpu cluster) and it is not a big issue for running long experiments, however it does make local debugging hard. I have tried debugging the import graph with profiler, but without success yet. It looks like even from trax import fastmath has plenty of dependencies - here is the tree generated by importlab library for trax.fastmath.init module:
importlab --tree init.py
out:
https://gist.github.com/syzymon/3bb6f59063f918b4b62b77cdb223da72
So in conclusion: whether it takes you 17 seconds or 40 seconds like me, this is known behavior in Trax.
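If you want to measure this on your own machine, you can time the import directly (Python 3.7+ also offers python -X importtime your_script.py for a per-module breakdown of where the time goes). A minimal sketch; it uses a stdlib module here so it runs anywhere, but you would substitute "trax":

```python
import importlib
import time

def time_import(module_name):
    """Import a module and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    importlib.import_module(module_name)  # e.g. "trax" on your machine
    return time.perf_counter() - start

elapsed = time_import("json")  # stdlib stand-in for "trax"
print(f"import took {elapsed:.2f}s")
```

Note that a second import of the same module hits Python's module cache and returns almost instantly, so only the first call reflects the real cost.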

Related

Created a mess I need to clean up, while trying to solve conflicts between MacBook Pro M1 and the TensorFlow library

I have a 2021 MacBook Pro M1.
It is a well-recognized problem that it conflicts with the TensorFlow library, see here or here; in particular, this last one was exactly the issue I experienced: when I tried to import TensorFlow in a Jupyter notebook via the command
import tensorflow as tf
then the message
The kernel appears to have died. It will restart automatically.
appeared. Then, searching the discussions linked above, I have the feeling that the suggestion given at some point, which points to this link, SEEMS to be useful.
FIRST QUESTION: is this a/the solution for the M1-Tensorflow conflict?
I say "it seems" because, before trying that, I got caught in the kind of tornado of desperate attempts that leads a beginner like me to search for hints all around the web and then copy-paste commands taken here and there into the Terminal without properly understanding them.
On the one hand it sounds dumb, I admit; on the other, the cost of understanding everything goes well beyond my humble intention of learning some ML.
So, the final result is that my computer is a complete mess: old libraries like numpy don't work anymore (when I import them inside a Python 3 page opened with jupyter notebook via the command import numpy as np, the message
ModuleNotFoundError: No module named 'numpy'
appears), and the pip command doesn't work; if I use pip3 to install, nothing changes. I read somewhere to use a virtual environment, and I followed the instructions even though I wasn't really aware of what I was doing; I downloaded Xcode, miniforge3...
Well, I guess that there is somebody out there who can relate with this.
SECOND PROBLEM: I would like to clean up everything related to Python/pip/anaconda and so on and install everything from scratch, possibly following the above link to solve the M1-TensorFlow conflict... if it is correct. How can I do that?
Can somebody help me, please? Thanks
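On the virtual environment point mentioned in the question: one low-risk way to experiment without touching the system Python at all is the standard library's venv module, since everything installed into such an environment lives in one directory that can simply be deleted later. A minimal sketch, assuming nothing beyond the standard library (the directory name is arbitrary):

```python
import tempfile
import venv
from pathlib import Path

# Create a throwaway, isolated environment in a temporary directory.
# Pass with_pip=True to also bootstrap pip inside it.
env_dir = Path(tempfile.mkdtemp()) / "ml-env"
venv.create(env_dir, with_pip=False)

# The environment is self-contained: pyvenv.cfg marks it, and its own
# interpreter lives under bin/ (Scripts/ on Windows).
print((env_dir / "pyvenv.cfg").is_file())
```

Afterwards you would activate it (source ml-env/bin/activate) and install packages only inside it; removing the directory removes everything cleanly, with no effect on the rest of the machine.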

Keras deep learning slowness - IMDB dataset example from Deep Learning with Python (Chollet)

I am having an issue with Keras where my processor seems to get bogged down while working through examples.
With the IMDB dataset, for instance (exercise 3.4.1 in Deep Learning with Python by Chollet, if anyone knows the book), running the script:
import keras
from keras.datasets import imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
Produces an output looking something like:
[=====>..............] - ETA: 59s - 16105472/17464789
That updates increasingly slowly as the numbers get larger and move toward completion.
I'm assuming my installation of keras/Tensorflow/CUDA/cuDNN is to blame, but curious if you know of anything obvious that would solve the issue.
Running Ubuntu Linux, NVIDIA GTX 1080, Keras/TensorFlow (GPU)/CUDA/cuDNN (maybe; that assumes I installed everything correctly, which is probably not accurate).
Thanks!
This progress bar is shown during the first initial download and should not be present on subsequent imports of the data.
There might be several issues that cause this to slow down and/or fail:
Your internet connection is unstable.
There is an issue with the serving of the file. Maybe the repository server serves a corrupt file? (You could try to force download from another repository, see How to setup pip to download from mirror repository by default? )
Your local disk or a previous partial download is corrupt: you can try deleting the partial download in ~/.keras/datasets/ (for this example, the imdb.npz file).
Check if your hard disk is full.

Unity3D Machine Learning Setup for ML-Agents on Windows 10 with Tensorflow

I have been trying to get the Machine Learning Setup for ML-Agents for Unity 3D up and running for the past several hours, with no luck.
First I followed this video, which goes over the initial installations which are also outlined in this GitHub repository.
Next, I moved on to part 2 of the video series (here), however problems started at minute 4:48, where I realized that the tutorial was using v 0.2, while I had v 0.3.
V 0.3 has done away with the PPO.ipynb file shown in the video; everything is now done through the learn.py file.
I then decided to try and follow the official Unity installation guide:
https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Getting-Started-with-Balance-Ball.md
and got to the Training with PPO section, which I have not managed to get past.
The problem arises here. The documentation states:
To summarize, go to your command line, enter the ml-agents directory and type:
python3 python/learn.py <env_file_path> --run-id=<run-identifier> --train
Note: If you're using Anaconda, don't forget to activate the ml-agents
environment first.
I tried to run:
python learn.py ball --run-id=ballBalance --train
but I am greeted with a number of warnings as follows:
File "learn.py", line 9, in <module>
    from unitytrainers.trainer_controller import TrainerController
File "C:\Users\****\Downloads\ml-agents-master\python\unitytrainers\__init__.py", line 1, in <module>
    from .buffer import *
I have been trying to solve this error message for quite some time now. It seems that the file learn.py is actually being found, but somehow not being interpreted correctly?
First 9 lines of learn.py:
# # Unity ML Agents
# ## ML-Agent Learning
import logging
import os
from docopt import docopt
from unitytrainers.trainer_controller import TrainerController
Any guidance on how I can solve this problem would be appreciated. Would gladly give more information where needed. The steps mentioned above should replicate the problem I am experiencing.
I am not completely sure whether I solved the same problem, but somewhere below my errors it also mentioned line 9 in learn.py.
Nevertheless, I found this https://github.com/tensorflow/tensorflow/issues/18503
So all I did was install tensorflow version 1.5 by executing:
pip install --upgrade --ignore-installed tensorflow-gpu==1.5
Afterwards it ran through without errors and the training worked fine.
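Since the fix was pinning a specific TensorFlow release, it can help to confirm which version is actually installed before running learn.py. A minimal sketch using only the standard library (no TensorFlow needed to run it; the package name queried is just an example):

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# e.g. installed_version("tensorflow-gpu") should report "1.5.0" after the pin
print(installed_version("tensorflow-gpu"))
```

This avoids guessing from pip's output, and a None result immediately tells you the package is missing from the environment you are actually running in.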

Communications between Blender and Tensorflow

I have a simulation inside Blender that I would like to control externally using the TensorFlow library. The complete process would go something like this:
while True:
    state = observation_from_blender()
    action = find_action_using_tensorflow_neural_net(state)
    take_action_inside_blender(action)
I don't have much experience with the threading or subprocess modules and so am unsure as to how I should go about actually building something like this.
Rather than mess around with TensorFlow connections and APIs, I'd suggest you take a look at the OpenAI Universe Starter Agent[1]. The advantage here is that as long as you have a VNC session open, you can connect a TF-based system to do reinforcement learning on your actions.
Once you have a model constructed via this, you can focus on actually building a lower level API system for the two things to talk to each other.
[1] https://github.com/openai/universe-starter-agent
Thanks for your response. Unfortunately, trying to get Universe working with my current setup was a bit of a pain. I'm also on a fairly tight deadline so I just needed something that worked straight away.
I did find a somewhat DIY solution that worked well for my situation using the pickle module. I don't really know how to convert this approach into proper pseudocode, so here is the general outline:
Process 1 - TensorFlow:
    load up TF graph
    wait until pickled state arrives
    while not terminated:

Process 2 - Blender:
    run simulation until next action required
    pickle state
    wait until pickled action arrives

Process 1 - TensorFlow:
    unpickle state
    feedforward state through neural net, find action
    pickle action
    wait until next pickled state arrives

Process 2 - Blender:
    unpickle action
    take action
This approach worked well for me, but I imagine there are more elegant low level solutions. I'd be curious to hear of any other possible approaches that achieve the same goal.
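The outline above boils down to serializing a state on one side and deserializing it on the other; pickle handles arbitrary Python objects, so the two processes only need to agree on a channel (a file, pipe, or socket). A minimal sketch of one round trip, with the channel reduced to an in-memory byte string and the state/action names purely illustrative:

```python
import pickle

# Blender side: capture the simulation state and serialize it
state = {"position": [0.0, 1.5, -2.0], "velocity": [0.1, 0.0, 0.0]}
payload = pickle.dumps(state)  # bytes, ready to write to a file or socket

# TensorFlow side: deserialize, compute an action, serialize the reply
received = pickle.loads(payload)
action = {"thrust": received["velocity"][0] * 2}  # stand-in for the neural net
reply = pickle.dumps(action)

print(pickle.loads(reply))
```

In the real setup each dumps/loads pair would be a write/read on the shared channel, with each process blocking until the other's payload arrives, exactly as in the outline.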
I did this to install tensorflow with the python version that comes bundled with blender.
Blender version: 2.78
Python version: 3.5.2
First of all you need to install pip for blender's python. For that I followed instructions mentioned in this link: https://blender.stackexchange.com/questions/56011/how-to-use-pip-with-blenders-bundled-python. (What I did was drag the python3.5m icon into the terminal and then further type the command '~/get-pip.py').
Once you have pip installed you are all set up to install 3rd party modules and use them with blender.
Navigate to the bin folder inside the '/home/path to blender/2.78/' directory. To install tensorflow, drag the python3.5m icon into the terminal, then drag the pip3 icon into the terminal and give the command install tensorflow.
I got an error mentioning that module lib2to3 was not found, and even installing the module didn't help, as I got the message that no such module exists. Fortunately, I have Python 3.5.2 installed on my machine, so navigate to /usr/lib/python3.5, copy the lib2to3 folder from there to /home/user/blender/2.78/python/lib/python3.5, and then run the tensorflow installation command again. The installation should complete without any errors. Make sure that you test the installation by importing tensorflow.
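Rather than dragging icons into a terminal, the same steps can be scripted: invoke Blender's bundled interpreter directly and let it run pip as a module. A sketch under the assumption that the bundled Python ships with ensurepip (if it doesn't, the get-pip.py route above still applies); the Blender path shown is hypothetical and must be adjusted for your install:

```python
import subprocess

def run_with_interpreter(python_path, args):
    """Run the given interpreter with the given arguments; return its exit code."""
    return subprocess.call([python_path] + list(args))

# Hypothetical path to Blender's bundled Python; adjust for your install:
# blender_py = "/home/user/blender/2.78/python/bin/python3.5m"
# run_with_interpreter(blender_py, ["-m", "ensurepip"])                 # bootstrap pip
# run_with_interpreter(blender_py, ["-m", "pip", "install", "tensorflow"])
```

Running pip via "-m pip" under the bundled interpreter guarantees the package lands in Blender's site-packages rather than the system Python's.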

Change in running behavior of sklearn code between laptop and desktop

I am trying to debug a Fortran warning in some sklearn code that runs perfectly on my laptop... but after transferring to my desktop (a fresh Ubuntu 15.10, fresh PyCharm, and fresh Anaconda3), I get the following warning when running sklearn.cross_validation.cross_val_score:
/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/hashing.py:197: DeprecationWarning: Changing the shape of non-C contiguous array by descriptor assignment is deprecated. To maintain the Fortran contiguity of a multidimensional Fortran array, use 'a.T.view(...).T' instead
    obj_bytes_view = obj.view(self.np.uint8)
The command I am submitting to cross_val_score is:
test_results = cross_val_score(learner(**learner_args), data, y=classes, n_jobs=n_jobs, scoring='accuracy', cv=LeaveOneOut(data.shape[0]))
Where the iterator is the sklearn cross validation object...and nothing else special is going on. What could be happening here? Am I missing some installation step?
Just for the record, for people like me who found this SO post via Google: this has been recorded as issue #6370 for scikit-learn.
As mentioned there:
This problem has been fixed in joblib master. It won't be fixed in scikit-learn until:
1) we do a new joblib release
2) we update scikit-learn master to have the new joblib release
3) if you are using released versions of scikit-learn, which I am guessing you are, you will have to wait until there is a new scikit-learn release
I was able to use the workaround above from @bordeo:
import warnings
warnings.filterwarnings("ignore")
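Note that filterwarnings("ignore") with no arguments silences every warning, including ones you may want to see. The filter can instead be scoped to just the category involved here. A minimal sketch, using the record=True context manager only to demonstrate the effect:

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # baseline: record every warning
    # Scope the ignore to DeprecationWarning, which is what joblib emits here
    warnings.filterwarnings("ignore", category=DeprecationWarning)
    warnings.warn("old API", DeprecationWarning)  # suppressed
    warnings.warn("heads up", UserWarning)        # still recorded
print([str(w.message) for w in caught])
```

In a notebook you would simply call warnings.filterwarnings("ignore", category=DeprecationWarning) once at the top, and other warning categories remain visible.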
