How can I give maximum CPU Utilization to this Python Process [closed] - python

I'm using the Spyder IDE for data analysis in Python. My dataset is pretty large, so I want to give the process maximum priority. I've set its priority to realtime, but it only uses 13-15% of the CPU. How can I get it to use 100% of the CPU? I'm using a Dell Inspiron 15z ultrabook with two 4 GB RAM modules.
Edit: I'm now running two scripts in two different consoles, and CPU usage has increased to 75%. I know this isn't technically the correct way to implement parallelism, but as a beginner in Python I had no other option. Thanks for the help :)

You're probably not using any multi-threading or other parallelization. Because of this, your code runs in a single thread, which can only execute on one CPU core at a time. Since you have eight logical cores, that works out to roughly 1/8 of total CPU capacity, which matches the 13-15% you're seeing.
Parallelization of code is not a trivial task, and is highly dependent on the type of work your program is doing.
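For example, a minimal sketch of splitting work across processes with the standard-library multiprocessing module might look like this. The data and the process_chunk function are hypothetical stand-ins for whatever your analysis actually does:

```python
# A minimal sketch, not the original poster's code: process_chunk is a
# placeholder for the real per-chunk analysis.
from multiprocessing import Pool, cpu_count

def process_chunk(chunk):
    # Placeholder for the real per-chunk work.
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1000000))
    n_workers = cpu_count()
    # Split the data into one chunk per CPU core.
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with Pool(processes=n_workers) as pool:
        # Each chunk is processed in its own process, so all cores can be busy.
        results = pool.map(process_chunk, chunks)
    print(sum(results))
```

Whether this helps depends on whether your workload can actually be split into independent chunks; for NumPy/pandas-heavy work, vectorizing the code is often a better first step than multiprocessing.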

Related

When is a dataset too large for my machine's memory? [closed]

So the Dask tutorials state that:
"For too large datasets (larger than a single machine's memory), the scikit-learn estimators may not be able to cope"
Am I correct in saying: if I have 9.42 GB of "available physical memory" showing up on the System page, then a "too large dataset" for scikit would be anything in excess of 9.42 GB of data?
Roughly, yes, though the exact upper limit on dataset size depends on various factors, such as which other objects are in memory and how different algorithms handle copies or subsets of the data. A rough rule of thumb for pandas is to keep dataframes to about 1/5 of the available RAM, and a similar ballpark estimate is reasonable for scikit-learn (with the caveat that much depends on the exact code and algorithm used).
Note that some scikit-learn estimators can be fit incrementally on partial data; see this page.
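For instance, a minimal sketch of out-of-core fitting with SGDClassifier's partial_fit might look like this. Here load_batches is a hypothetical generator standing in for however you read manageable chunks of your data from disk:

```python
# A minimal sketch of incremental (out-of-core) fitting, assuming an
# estimator that supports partial_fit. load_batches is hypothetical.
import numpy as np
from sklearn.linear_model import SGDClassifier

def load_batches(n_batches=10, batch_size=1000, n_features=20):
    # Stand-in: yield random batches; in practice, read chunks from disk.
    rng = np.random.RandomState(0)
    for _ in range(n_batches):
        X = rng.rand(batch_size, n_features)
        y = rng.randint(0, 2, size=batch_size)
        yield X, y

clf = SGDClassifier()
classes = np.array([0, 1])                      # all classes must be declared up front
for X, y in load_batches():
    clf.partial_fit(X, y, classes=classes)      # only one batch is in memory at a time
```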

Can dividing code too much make it inefficient? [closed]

If code is divided into too many segments, can this make the program slow?
For example, creating a separate file for just a single function.
In my case I'm using Python. Suppose there are two functions I need in the main.py file, and I place each of them in a separate file containing just that function.
Suppose, also, that both functions use the same library even though they now live in separate files.
How does this affect efficiency, both in terms of machine performance and in terms of the team?
It depends on the language, the framework you use, and so on. In most cases, though, dividing the code too much hurts readability, which is the bigger problem: since you will usually (or should) be working in a team, consider how readable your code will be for them.
Answering this definitively is difficult, however; ask a senior developer on your team for guidelines.
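On the machine-performance side, here is a minimal sketch of the layout described in the question, with hypothetical file names. Python caches imported modules, so having both files import the same library adds essentially no runtime cost:

```python
# helpers_a.py  (hypothetical file containing just one function)
import math

def area(radius):
    return math.pi * radius ** 2

# helpers_b.py  (another hypothetical file; importing math again is cheap,
# because Python caches modules after the first import)
import math

def circumference(radius):
    return 2 * math.pi * radius

# main.py
from helpers_a import area
from helpers_b import circumference

print(area(1.0), circumference(1.0))
```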

Draw a graph using Tensorflow [closed]

I am very new to TensorFlow. I have installed the TensorFlow CPU version along with Python 3.5.2, and I have written a simple program with TensorFlow, such as adding two numbers. Now I want to create a graphical program using Python and TensorFlow. Is it possible? Any simple program would be helpful (like drawing a line, drawing a graph, etc.). Kindly mention the steps; I have tried lots of examples but they don't work.
Thanks in advance!!
TensorFlow is not there to draw anything. Like, really not. It is built to compute things, on CPU and GPU, and to provide a machine learning library.
Now, you may have been confused because TensorFlow often mentions "graphs". Those are computational graphs, which describe how the operations are linked to one another. Here, "graph" refers to a set of vertices and edges, which can be drawn, but it is not a "graphical" object in itself. Still, you can view the graph TensorFlow generates, thanks to TensorBoard.
What is totally possible, though, is to run intensive computations with TensorFlow on the one hand AND, on the other hand, use another library to draw, plot, etc.
The most famous, best-documented, and most widely used by far is matplotlib, i.e. pyplot. That is the library to use and the library to learn, as you will encounter it everywhere. In fact, I hardly ever see anything else used.
Tell me if my answer is somehow off topic, or if you would like me to clarify something.
Have fun!
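For example, here is a minimal sketch that computes values with TensorFlow (assuming the 1.x session API that was current when this question was asked) and lets matplotlib do the drawing:

```python
# A minimal sketch: TensorFlow computes the values, matplotlib draws them.
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

x = tf.placeholder(tf.float32, shape=[None])   # input values
y = tf.square(x)                               # the computation TensorFlow performs

xs = np.linspace(-3.0, 3.0, 100).astype(np.float32)
with tf.Session() as sess:
    ys = sess.run(y, feed_dict={x: xs})        # run the graph on the CPU

plt.plot(xs, ys)                               # matplotlib does the drawing
plt.xlabel("x")
plt.ylabel("square of x")
plt.show()
```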

Loading Big Data on Python [closed]

I'm fairly new to data science and only started using Python roughly two months ago. I've been trying to do a Kaggle competition for fun (Cats vs. Dogs) to learn things along the way. However, I'm stuck at the very first step. There is a training set containing about 25,000 .jpg images of cats and dogs, and the directory is approximately 800 MB in total. Whenever I try to load the directory into Python and store all the images in one matrix (say we have 100 images of size (300, 200); I would like to store them in a matrix of size 100 x 60000), I either get a memory error or the system just stops responding. I'm using Canopy on a Mac.
I've been reading a lot on the internet to find out how people deal with images at this scale, but after a week I still haven't found a good source. I would highly appreciate it if somebody helped me out or just sent me a link that describes this situation.
Here's the link to the Kaggle competition (you can see there are no prizes involved and it's just for the sake of learning):
https://www.kaggle.com/c/dogs-vs-cats/data
The question is: how do I load this big dataset into Python using Canopy and start training a neural network? Or, more generally, how do I deal with big datasets on a single computer without running out of memory?
I would recommend making an index of the items you wish to read (a directory listing). Then read just the first item, train on just that item, remove it from memory, move on to the next item, and repeat. You shouldn't need to have more than a few in memory at any given time.
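A minimal sketch of that approach might look like the following. The directory path and target image size are assumptions, not taken from the original post:

```python
# Index the files, then read and process one image at a time instead of
# building one giant matrix. Path and size below are hypothetical.
import os
import numpy as np
from PIL import Image

TRAIN_DIR = "train"                              # hypothetical path to the unpacked Kaggle images

def iter_images(directory, size=(200, 300)):
    """Yield one flattened image at a time."""
    for name in sorted(os.listdir(directory)):   # the "index" is just the directory listing
        if not name.lower().endswith(".jpg"):
            continue
        img = Image.open(os.path.join(directory, name)).convert("L").resize(size)
        yield np.asarray(img, dtype=np.float32).ravel()   # shape (60000,) for a 300x200 image

for features in iter_images(TRAIN_DIR):
    # Train on this single example here, then let it be garbage-collected.
    pass
```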

Computation time for RSA? [closed]

I'm quite new to programming and did a cryptography course on Coursera before I started learning Python. Recently, as a project, I wanted to write my own code for the RSA algorithm, and I have just finished writing the encryption step.
The program runs, but it takes a long time. I noticed that computing the keys and the modulus took a long time because of the sheer size of the numbers. Because I'm new to all of this, I was wondering whether there is any way to speed up the process.
If my code needs to be posted I can do that, but I would prefer a more general answer on how to speed up the code.
Thanks
I took the Coursera course too. You should check out the following libraries; they can speed up your calculations tremendously:
1.) http://userpages.umbc.edu/~rcampbel/Computers/Python/lib/numbthy.py (check the powmod function)
2.) gmpy2 (gmpy2.readthedocs.org/en/latest/mpz.html)
3.) mpmath (code.google.com/p/mpmath/)
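In particular, the single biggest speed-up for textbook RSA code is usually fast modular exponentiation. Python's built-in three-argument pow (and gmpy2.powmod) does this for you; here is a small sketch with toy key values (the standard textbook example, not a real key):

```python
# Toy RSA values for illustration only; real keys are far larger.
n = 3233          # modulus (61 * 53)
e = 17            # public exponent
d = 2753          # private exponent
m = 65            # message block, already encoded as an integer

# Three-argument pow performs square-and-multiply modular exponentiation,
# which is dramatically faster than computing (m ** e) % n directly.
c = pow(m, e, n)              # encrypt
assert pow(c, d, n) == m      # decrypt recovers the original message
print(c)

# gmpy2 offers the same operation, often faster still for very large numbers:
# import gmpy2
# c = gmpy2.powmod(m, e, n)
```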
