I am very new to TensorFlow. I have installed the TensorFlow CPU version along with Python 3.5.2, and I have written a simple program with it (adding two numbers, etc.). Now I want to create a graphical program using Python and TensorFlow. Is it possible? Any simple example would be helpful (drawing a line, plotting a graph, etc.). Kindly mention the steps; I have tried lots of examples, but none of them work.
Thanks in advance!!
TensorFlow is not here to draw anything. Like, really not. It is built to compute things, over CPU and GPU, and to provide a machine learning library.
Now, you may have been confused because TensorFlow often mentions "graphs". Those graphs are computational graphs, which describe how each operation is linked to the others. In fact, a "graph" here is a set of vertices and edges, which can be drawn, but it is not a "graphical" object in itself. Still, you can visualize this graph, generated by TensorFlow, thanks to TensorBoard.
What is totally possible, though, is to run intensive computations with TensorFlow on the one hand and, on the other hand, to use another library to draw, plot, etc.
The most famous, best-documented, and most widely used by far is matplotlib (usually through its pyplot interface). This is the library to use, the library to learn, as you will encounter it everywhere. In fact, I don't know of any other serious alternative.
Tell me if my answer is somehow off topic, or if you would like me to clarify something.
Have fun!
I have a dataset composed of several large CSV files. Their total size is larger than the RAM of the machine on which the training is executed.
I need to train an ML model from scikit-learn, TensorFlow, or PyTorch (think SVR, not deep learning), and I need to use the whole dataset, which is impossible to load at once. Any recommendation on how to overcome this?
I have been in this situation before, and my suggestion would be to take a step back and look at the problem again.
Does your model absolutely need all of the data at once, or can it be trained in batches? It's also possible that the model you are using can be trained in batches, but the library you are using does not support such a case. In that situation, either find a library that does support batches or, if such a library does not exist (unlikely), "reinvent the wheel" yourself, i.e., create the model from scratch and allow batches. However, as your question mentions, you need to use a model from scikit-learn, TensorFlow, or PyTorch. If you truly want to stick with those libraries, there are techniques such as the ones Alexey Larionov and I'mahdi mentioned in the comments to your question in relation to PyTorch and TensorFlow.
Is all of your data actually relevant? Once I found that a whole subset of my data was useless to the problem I was trying to solve; another time I found that it was only marginally helpful. Dimensionality reduction, numerosity reduction, and statistical modeling may be your friends here. Here is a link to the Wikipedia page about data reduction:
https://en.wikipedia.org/wiki/Data_reduction
Not only will data reduction reduce the amount of memory you need, it can also improve your model: bad data in means bad data out.
I'm trying to clean the line noise from this CAPTCHA so I can implement an algorithm to read it. However, I'm having difficulty making it readable to an AI using techniques such as OpenCV thresholding combined with some resources from PIL.Image. I also tried an algorithm to "chop" the image, which gave me better results, but still far from what I expected. I want to know if there is an alternative that removes noise from CAPTCHAs like this one effectively.
(I'm using Python.)
Initially, the CAPTCHA looks like this: [original CAPTCHA image]
Once processed using OpenCV + Pillow, I've got this: [thresholded image]
Later, using the "chop" method, this is what we have: [chopped image]
However, I need a better final image, and I think this combination of methods is not appropriate. Is there a better alternative?
I think you could try minisom: https://github.com/JustGlowing/minisom
SOMs (self-organizing maps) are a type of neural network that groups clusters of points in data. With an appropriate threshold, a SOM could help you remove the lines that are not surrounding the numbers/letters; combining that with the chop method could do the job.
Recently I completed the ML course on Coursera by Andrew Ng. It's an awesome course. I worked with Octave throughout the course, but Python is much more popular than Octave, so I have started to learn Python now. I was implementing linear regression in Python, and in it I am doing nothing myself: I simply call a predefined function for linear regression. In Octave I used to write the code from scratch and had to find the parameters using the gradient descent algorithm, but there is no such thing in Python. I have referred to the following link:
https://towardsdatascience.com/linear-regression-python-implementation-ae0d95348ac4
My question is: won't we use any algorithm like gradient descent to learn the parameter theta? Is everything predefined in Python?
Thanks.
Python is a programming language, just like Octave, so everything that can be done in Octave can be done in Python too. If you want to implement the linear regression algorithm from scratch in Python to validate your understanding, of course you can do it (I have done it too). Why stop at linear regression? You can implement SVMs, decision trees, or even deep neural networks from scratch in Python, and it is a good way to gain a concrete understanding of these algorithms.
However, over the years all of these have been implemented in Python in libraries like scikit-learn. So as the complexity and volume of your data increase, you will want to use one of these libraries or frameworks. Why? Because they are highly optimized implementations. To get a feel for this, implement linear regression using plain lists and for loops, and then vectorize it with NumPy; you will see the difference in performance.
So, to summarize: if you are curious, go ahead and implement the algorithms from scratch to gain a solid understanding; as complexity and data volume increase, start using the libraries and frameworks. Hope this helps.
Context:
I'm trying to learn machine learning using Python 3. My intended goal is to create a CNN program that can guess simple 4-letter, 72×24-pixel CAPTCHA images like the one below:
[CAPTCHA image displaying "VDF5"] This challenge was inspired by https://medium.com/@ageitgey/how-to-break-a-captcha-system-in-15-minutes-with-machine-learning-dbebb035a710, which I thought would be a great way for me to learn k-means clustering and CNNs.
Edit: I see I was being too "build this for me". Now that I have found scikit-learn, I'll try to learn it and apply that instead. Sorry for annoying you all.
It seems as if you are looking to build a machine learning algorithm for educational purposes. If so, import TensorFlow and get to it! However, since your question reads as "create this for me", you might be better off simply using an existing implementation from the scikit-learn package: import scikit-learn, make an instance of KNeighborsClassifier, train it, and boom, you've cracked this problem.
I'm fairly new to data science and only started using Python roughly two months ago. I've been trying a Kaggle competition for fun (cats vs. dogs) to learn things along the way, but I'm stuck at the very first step. The training set contains about 25,000 .jpg images of cats and dogs, and the whole directory is approximately 800 MB in size. Whenever I try to load the directory into Python and save all the images in a matrix (say we have 100 images of size (300, 200); I would like to save them in a matrix of size 100 × 60000), I get either a memory error or the system just stops processing. I'm using Canopy on a Mac.
I've been trying to read a lot on the internet to find out how people deal with these big images, but it has been a week and I still haven't found any good source. I would highly appreciate it if somebody helped me out or just sent me a link that describes the situation.
Here's the link to the Kaggle competition (you can see there are no prizes involved; it's just for the sake of learning):
https://www.kaggle.com/c/dogs-vs-cats/data
The question is: how do I manage to load this big dataset into Python using Canopy and start training a neural network? Or, more generally, how do I deal with big datasets on a single computer without memory errors?
I would recommend making an index of the items you wish to read (a directory listing). Then read just the first item, train on just that item, remove it from memory, move on to the next item, and repeat. You shouldn't need to have more than a few in memory at any given time.