I ran k_clique_communities on my network data and received a memory error after a long response time. The code works perfectly fine for my other data, but not this one.
from networkx.algorithms.community import k_clique_communities

c = list(k_clique_communities(G_fb, 3))  # G_fb is the graph that triggers the error
list(c[0])
Here is a snapshot of the traceback:
I tried running your code on my system with 16 GB RAM and an i7-7700HQ; the kernel died after returning a memory error. I think the computation of k-cliques of size 3 is taking a lot of computational power/memory, since you have quite a large number of nodes and edges. You may need to look into other ways to find k-cliques, such as GraphX in Apache Spark.
You probably have a 32-bit installation of Python. 32-bit installations are limited to 2 gigabytes of memory regardless of how much system memory you have; that's inconvenient, but it's how they work. Uninstall Python and install a 64-bit build. Again, it's inconvenient, but it's the only solution I've ever seen.
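If you are not sure which build you have, a quick check using only the standard library (a minimal sketch, nothing here is specific to networkx):

import platform
import struct
import sys

# 8 bytes per pointer means a 64-bit interpreter, 4 bytes means 32-bit
print(struct.calcsize("P") * 8, "bit")
print(platform.architecture()[0])  # e.g. '64bit' or '32bit'
print(sys.maxsize > 2**32)         # True on a 64-bit build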
As stated in the networkx documentation:
To obtain a list of all maximal cliques, use list(find_cliques(G)). However, be aware that in the worst-case, the length of this list can be exponential in the number of nodes in the graph.
(This refers to find_cliques, but it applies to your problem as well.)
That means that if there is a huge number of cliques, the objects you materialize with list will be so many that you'll probably run out of memory: each community is stored as a full collection of nodes, which takes much more space than the underlying int values.
I don't know if there is a work-around, though.
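One thing that might be worth trying (a minimal sketch, not a guaranteed fix): k_clique_communities returns a generator, so instead of materializing every community with list() you can pull communities one at a time and stop early. The underlying clique enumeration can still be expensive, so this only removes the cost of holding all communities at once. The karate-club graph below is just a stand-in for the real G_fb:

import networkx as nx
from networkx.algorithms.community import k_clique_communities

G_fb = nx.karate_club_graph()  # stand-in; replace with the real graph

communities = k_clique_communities(G_fb, 3)  # generator: nothing is computed yet
first = next(communities)                    # only the first community is materialized
print(sorted(first))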
This is a rather general question:
I am seeing that the same operation, measured by time.clock(), takes longer now than it used to.
While I initially had some very similar measurements:
1954 s
1948 s
1948 s
one somewhat different measurement:
1999 s
and another even more different one:
2207 s
it still seemed more or less OK. But for another one I get:
2782 s
And now that I am repeating the measurements, it seems to get slower and slower.
I am not summing over the measurements after rounding or doing other weird manipulations.
Do you have some ideas whether this could be affected by how busy the server is, the clock speed or any other variable parameters? I was hoping that using time.clock() instead of time.time() would mostly sort these out...
The OS is Ubuntu 18.04.1 LTS.
The operations are run in separate screen sessions.
The operations do not involve hard-disk access.
The operations are mostly numpy operations that are not distributed. So this is actually mainly C code being executed.
EDIT: This might be relevant: the measurements from time.time() and time.clock() are very similar in all of the cases. That is, time.time() measurements are always just slightly longer than time.clock(). So unless I have missed something, the cause has almost exactly the same effect on time.clock() as on time.time().
EDIT: I do not think that my question has been answered yet. Another reason I could think of is that garbage collection contributes to CPU usage and is done more frequently when RAM is full or about to become full.
Mainly, I am looking for an alternative measure that gives the same number for the same operations done. Operations meaning my algorithm executed with the same start state. Is there a simple way to count FLOPS or similar?
The issue seems to be related to Python and Ubuntu.
Try the following:
Check that you have the latest stable build of the Python version you're using (Link 1)
Check the process list, and see which CPU core your Python executable is running on (see the sketch after this answer)
Check the thread priority state on the CPU (Link 2, Link 3)
Note:
Timing may vary due to process switching, threading, and other OS resource management, as well as application code execution (this cannot be controlled)
Suggestions:
It could be because of your system's build; try running your code on another machine or in a virtual machine.
Read Up:
Link 4
Link 5
Good Luck.
~ Dean Van Geunen
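For suggestions 2 and 3 above, here is a hedged sketch using the third-party psutil package (pip install psutil); note that cpu_num() is only available on some platforms, such as Linux:

import os
import psutil

p = psutil.Process(os.getpid())
print("running on CPU core:", p.cpu_num())  # core the process last ran on
print("nice value:", p.nice())              # scheduling priority
print("allowed cores:", p.cpu_affinity())   # CPU affinity mask
print("CPU times:", p.cpu_times())          # user/system time consumed so far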
After repeatedly running the same algorithm at different 'system states', I would summarize the answer to the question as:
Yes, time.clock() can be heavily affected by the state of the system.
Of course, this holds all the more for time.time().
The general reasons could be that
The same Python code does not always result in the same instructions being sent to the CPU; that is, the instructions depend not only on the code and the start state, but also on the system state (e.g. garbage collection)
The system might interfere with the instructions sent from Python, resulting in additional CPU usage (e.g. from core switching) that is still counted by time.clock()
The divergence can be very large, in my case around 50%.
It is not clear which specific reasons apply, nor how much each of them contributes to the problem.
It remains to be tested whether timeit helps with some or all of the above points. However, timeit is meant for benchmarking and might not be recommended for use during normal processing: it turns off garbage collection by default and does not give access to the return value of the timed function.
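For reference, a minimal sketch of the clocks discussed above (Python 3.3+): time.process_time() counts only the CPU time of the current process, so it is less sensitive to a busy server, while time.perf_counter() measures wall-clock time; timeit can also be told to keep garbage collection enabled via its setup argument. The work() function is just a placeholder for the real algorithm:

import time
import timeit

def work():
    return sum(i * i for i in range(1_000_000))  # placeholder workload

t_wall = time.perf_counter()
t_cpu = time.process_time()
work()
print("wall time:", time.perf_counter() - t_wall)
print("CPU time: ", time.process_time() - t_cpu)

# timeit disables garbage collection by default; re-enable it so the
# measurement includes GC, as discussed above.
print("timeit:", timeit.timeit(work, number=5, setup="import gc; gc.enable()"))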
I'm running a simple Python program that takes many hours to create a large (102,000 x 102,000) float32 array (A). The total memory occupied by the array is ~40 GiB, and I'm running this on an Amazon EC2 virtual machine (running Ubuntu) with 130 GiB of RAM. There are no other user processes running, and I can use 'top' to see that the program is only using about 56 GiB during the array-creation phase (don't ask me why it's using even that much), consistently leaving about 68 GiB free even after allowing for system processes. So far, so good.
But the near-final line in the program is
A = np.sqrt(A)
and at this point, the program crashes with a memory error. I don't understand why this is happening, even allowing for the need for a temporary array of the same size (this should still leave about 28 GiB free). The only thing I can think of is that np.sqrt() might be internally using float64 for the scratch space. Is there any other possible explanation?
My only workaround right now is to compute the square root element-by-element in for loops.
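One way to sidestep the temporary that the question suspects (a minimal sketch, using NumPy's standard out= argument so the result is written back into A's own buffer instead of a second ~40 GiB array; a small array stands in for the real one):

import numpy as np

A = np.random.rand(1000, 1000).astype(np.float32)  # stand-in for the 102,000 x 102,000 array

np.sqrt(A, out=A)  # in-place: no second full-size result array is allocated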
I'm trying to read in a somewhat large dataset using pandas read_csv or read_stata functions, but I keep running into Memory Errors. What is the maximum size of a dataframe? My understanding is that dataframes should be okay as long as the data fits into memory, which shouldn't be a problem for me. What else could cause the memory error?
For context, I'm trying to read in the Survey of Consumer Finances 2007, both in ASCII format (using read_csv) and in Stata format (using read_stata). The file is around 200MB as dta and around 1.2GB as ASCII, and opening it in Stata tells me that there are 5,800 variables/columns for 22,000 observations/rows.
I'm going to post this answer as it was discussed in the comments. I've seen this come up numerous times without an accepted answer.
The MemoryError is intuitive: you are out of memory. But sometimes debugging this error is frustrating, because you seem to have enough memory and yet the error remains.
1) Check for code errors
This may be a "dumb step", but that's why it's first. Make sure there are no infinite loops or things that will knowingly take a long time (like using something in the os module that will search your entire computer and put the output in an Excel file).
2) Make your code more efficient
This goes along the lines of step 1. But if something simple is taking a long time, there's usually a module or a better way of doing it that is faster and more memory efficient. That's the beauty of Python and/or open-source languages!
3) Check the total memory of the object
Check the memory of the object. There are a ton of threads on Stack Overflow about this, so you can search for them. Popular answers are here and here.
To find the size of an object in bytes, you can always use sys.getsizeof():
import sys
print(sys.getsizeof(OBJECT_NAME_HERE))  # replace OBJECT_NAME_HERE with your object
Now the error might happen before anything is created, but if you read the CSV in chunks you can see how much memory is being used per chunk, as in the sketch below.
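A minimal sketch of that idea, assuming a hypothetical file name scf2007.csv in place of the real data:

import pandas as pd

chunks = pd.read_csv("scf2007.csv", chunksize=100_000)  # hypothetical file name

total_rows = 0
total_bytes = 0
for chunk in chunks:
    chunk_bytes = chunk.memory_usage(deep=True).sum()
    total_rows += len(chunk)
    total_bytes += chunk_bytes
    print(f"rows so far: {total_rows}, this chunk: {chunk_bytes / 1e6:.1f} MB")
print(f"approximate total: {total_bytes / 1e6:.1f} MB")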
4) Check the memory while running
Sometimes you have enough memory, but the function you are running consumes a lot of memory at runtime. This causes memory usage to spike beyond the actual size of the finished object, causing the code/process to error out. Checking memory in real time is lengthy, but it can be done. IPython is good for that; check its documentation.
Use the code below to see the documentation directly in a Jupyter notebook:
%mprun?
%memit?
Sample use:
%load_ext memory_profiler

def lol(x):
    return x

%memit lol(500)
# output --- peak memory: 48.31 MiB, increment: 0.00 MiB
If you need help with magic functions, this is a great post.
5) This one may belong first... but check for simple things like the bit version
As in your case, simply switching the version of Python you were running solved the issue.
Usually the above steps solve my issues.
I am developing a simple recommendation system and trying to do some computation like SVD, RBM, etc.
To be more convincing, I am going to use the MovieLens or Netflix dataset to evaluate the performance of the system. However, both datasets have more than 1 million users and more than 10 thousand items, so it's impossible to put all the data into memory. I have to use some specific modules to handle such a large matrix.
I know there are some tools in SciPy that can handle this, and divisi2, used by python-recsys, also seems like a good choice. Or maybe there are better tools I don't know about?
Which module should I use? Any suggestion?
I would suggest SciPy, specifically scipy.sparse. As Dougal pointed out, NumPy is not suited for this situation.
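A minimal sketch of what that looks like with scipy.sparse and a truncated SVD from scipy.sparse.linalg (the toy triplets below are placeholders for the real MovieLens/Netflix ratings):

import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.linalg import svds

# Toy (user, item, rating) triplets; load these from the ratings file instead.
users = np.array([0, 0, 1, 2, 2])
items = np.array([0, 2, 1, 0, 3])
ratings = np.array([5.0, 3.0, 4.0, 2.0, 1.0])

# Sparse user-item matrix: only the observed ratings are stored.
R = coo_matrix((ratings, (users, items)), shape=(3, 4)).tocsr()

# Truncated SVD with k latent factors, computed without densifying R.
U, s, Vt = svds(R, k=2)
print(U.shape, s.shape, Vt.shape)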
I found another solution named crab; I am trying out and comparing some of these options.
If your concern is just fitting the data in memory, use 64-bit Python with 64-bit NumPy. If you don't have enough physical memory, you can just increase the virtual memory at the OS level; the size of virtual memory is limited only by your HDD size. Speed of computation, however, is a different beast!
Anyone following CUDA will probably have seen a few of my queries regarding a project I'm involved in, but for those who haven't, I'll summarize. (Sorry in advance for the long question.)
There are three kernels: one generates a data set based on some input variables (it deals with bit combinations, so it can grow exponentially), another solves the generated linear systems, and a third is a reduction kernel that gets the final result out. These three kernels are run over and over again as part of an optimisation algorithm for a particular system.
On my dev machine (GeForce 9800 GT, running under CUDA 4.0) this works perfectly, all the time, no matter what I throw at it (up to a computational limit based on the stated exponential nature). But on a test machine (4x Tesla S1070, only one used, under CUDA 3.1), the exact same code (Python base, PyCUDA interface to the CUDA kernels) produces the correct results for 'small' cases, while in mid-range cases the solving stage fails on random iterations.
Previous problems I've had with this code have been to do with the numerical instability of the problem, and have been deterministic in nature (i.e. it fails at exactly the same stage every time), but this one is frankly pissing me off, as it will fail whenever it wants to.
As such, I don't have a reliable way of breaking the CUDA code out from the Python framework and doing proper debugging, and PyCUDA's debugger support is questionable to say the least.
I've checked the usual things, like checking free memory on the device before kernel invocation, and occupancy calculations say that the grid and block allocations are fine. I'm not doing any crazy 4.0-specific stuff, I'm freeing everything I allocate on the device at each iteration, and I've fixed all the data types as floats.
TL;DR: has anyone come across any gotchas regarding CUDA 3.1 that I haven't seen in the release notes, or any issues with PyCUDA's autoinit memory management environment that would cause intermittent launch failures on repeated invocations?
Have you tried:
cuda-memcheck python yourapp.py
You likely have an out of bounds memory access.
You can use the NVIDIA CUDA Profiler and see what gets executed before the failure.
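A hedged, minimal PyCUDA sketch (not the questioner's kernels) of how to make intermittent launch failures surface at the offending iteration: synchronize right after each launch and log free device memory each time. mem_get_info() and Context.synchronize() are standard PyCUDA calls:

import numpy as np
import pycuda.autoinit
import pycuda.driver as cuda
from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void scale(float *x)
{
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    x[i] *= 2.0f;
}
""")
scale = mod.get_function("scale")

x = np.ones(256, dtype=np.float32)
x_gpu = cuda.mem_alloc(x.nbytes)
cuda.memcpy_htod(x_gpu, x)

for i in range(3):
    free_b, total_b = cuda.mem_get_info()
    print(f"iteration {i}: {free_b / 1024**2:.0f} MiB free of {total_b / 1024**2:.0f} MiB")
    scale(x_gpu, block=(256, 1, 1), grid=(1, 1))
    cuda.Context.synchronize()  # any launch or memory error is raised here, not later

cuda.memcpy_dtoh(x, x_gpu)
print(x[:4])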