I have this rather simple script, which generates 1000000000 (nine zeroes) random numbers and then stores in a file each generated number and how many times it was generated.
import random
import csv
dic = {}
for i in range(0, 1000000000):
    n = random.randint(0, 99999)
    if n in dic:
        dic[n] += 1
    else:
        dic[n] = 1
writer = csv.writer(open('output', 'w'))
for key, value in dic.iteritems():
    writer.writerow([key, value])
writer.close()
The script exits with a "Killed" message. According to the question "What does 'killed' mean?", using dic.iteritems() should be enough to prevent this, but that is not the case.
So how can I accomplish this task?
It doesn't look like your problem is the dict. Your problem is here:
for i in range(0, 1000000000):
^^^^^^^^^^^^^^^^^^^^
On Python 2, that's a list 1000000000 items long, more than your system can handle. You want xrange, not range. xrange generates numbers on demand. (On Python 3, range does what xrange used to and xrange is gone.)
Oh, and if you think 11 GB should be enough for that list: not in Python. Try sys.getsizeof(0) to see how many bytes an int takes.
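As a rough sketch of the counting loop without the giant list, assuming Python 3 (where range is lazy): the helper name count_random_draws and its seed parameter are made up for illustration, and the draw count is kept small so the example finishes quickly.

```python
import random
from collections import Counter

def count_random_draws(draws, upper=99999, seed=None):
    """Count how often each random integer in [0, upper] is drawn."""
    rng = random.Random(seed)
    counts = Counter()
    # On Python 3, range produces its values on demand, so no list of
    # `draws` integers is ever held in memory (use xrange on Python 2).
    for _ in range(draws):
        counts[rng.randint(0, upper)] += 1
    return counts

counts = count_random_draws(100000, seed=0)
print(sum(counts.values()))  # 100000
```

Only the dictionary of at most 100000 keys stays in memory, regardless of how many numbers are drawn.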
This is a really basic question about dictionaries from one of the exercises. The question is:
Write a Python script to generate and print a dictionary that contains a number (between 1 and n) in the form (x, x*x).
This here is my code:
import random
def dict_generate():
    n = int(input("Please enter an integer:"))
    dic = {}
    for x in range(n):
        num = random.randrange(1,n)
        key = num
        value = num*num
        dic[key] = value
    print(dic)
dict_generate()
However, when I run it, it seems to skip over a few iterations, so I end up with a dictionary that has fewer entries than the number I entered. For example, I entered 5, but instead of 5 key/value pairs, my output was this:
{3: 9, 1: 1, 2: 4}
Running this code through python tutor, the problem seems to be that the values are not being inserted into the dictionary. I can't seem to understand why. If someone could explain it would be really helpful thanks.
(I am aware that there are many other ways to complete this question but I'm trying to understand where I went wrong.)
num = random.randrange(1,n)
key = num
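Since the key is a random number, the same number can come up more than once. Assigning to a key that already exists overwrites its value instead of adding a new entry, so your dictionary ends up with fewer than 5 keys. Note also that random.randrange(1, n) never returns n itself, so at most n - 1 distinct keys are possible.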
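For comparison, here is a minimal sketch that uses the loop variable itself as the key, so every number from 1 to n appears exactly once. The function name squares_dict is only illustrative.

```python
def squares_dict(n):
    """Map each x in 1..n to x*x."""
    # range(1, n + 1) yields 1, 2, ..., n exactly once each,
    # so no key is ever overwritten.
    return {x: x * x for x in range(1, n + 1)}

print(squares_dict(5))  # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
```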
file = open('funny_file.txt', 'r')
list = []
for i in file:
    list.append(int(i[:-1]))
print(len([num for num in list for i in range(100) if num == 3**i]))
The result is correct, but can I do this in a simpler way?
Assuming the largest exponent is 100 (from your range(100)), you can replace the inner loop with a single modulo operation per number. First, calculate 3^100 = 515377520732011331036461129765621272702107522001. Because 3 is prime, the only positive divisors of this number are powers of 3, and every power of 3 up to 3^100 divides it with a remainder of 0. So you could write:
def check_power_of_three(n):
    # n > 0 guards against division by zero and rejects negative numbers
    return n > 0 and 515377520732011331036461129765621272702107522001 % n == 0
Now, about your code: I do not see why you use two loops when you can achieve the same thing with just one.
file = open('funny_file.txt', 'r')
numList = [] # I changed the name to avoid conflicts with `list` type
powerOfThree = []
for i in file:
    n = int(i[:-1])
    numList.append(n)
    if check_power_of_three(n):
        powerOfThree.append(n)
print(len(powerOfThree))
# Also remember to close() the file
file.close()
Outputs:
1
^ Which is 27
I'm assuming by 'square of the 3', you mean powers of three.
There are some things going on in your print statement that I would recommend you revise. Generally, when you are interested in achieving a particular result, it makes sense to create a separate function that encapsulates that functionality. In this case, it could be a function that tells us whether its input is a power of three. Using the logarithm with base three, you should be able to come up with an implementation of that function.
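One possible shape for such a function is sketched below, assuming Python 3. The name is_power_of_three is illustrative, and the final integer check is an addition of this sketch: floating-point logarithms are approximate, so the logarithm is only used to guess the exponent, which is then verified exactly.

```python
import math

def is_power_of_three(n):
    """Return True if n equals 3**k for some integer k >= 0."""
    if n < 1:
        return False
    # math.log returns a float, so round to the nearest candidate
    # exponent and confirm it with exact integer arithmetic.
    exponent = round(math.log(n, 3))
    return 3 ** exponent == n

print(is_power_of_three(27))  # True
print(is_power_of_three(28))  # False
```

With this helper, the count becomes sum(1 for num in numbers if is_power_of_three(num)), where numbers stands for the integers read from the file.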
One way to optimize the code is to skip the computation for numbers that have already been checked, like below:
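file = open('funny_file.txt', 'r')
list = []
for i in file:
    list.append(int(i[:-1]))
powerof_3_nums = []
not_power_of_3 = []
for num in list:
    # skip values already classified, so each distinct power of 3 is counted once
    if num in powerof_3_nums:
        continue
    if num in not_power_of_3:
        continue
    for i in range(100):
        if num == 3**i:
            powerof_3_nums.append(num)
            break
    else:
        # no exponent matched: remember the value so it is not re-checked
        not_power_of_3.append(num)
print(len(powerof_3_nums))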
I wrote a Python function largestProduct(n) which returns the largest number that is the product of two n-digit numbers. The code runs fine for n up to 3, but raises a memory error for n > 3. Is there a way I can improve my code to avoid this error?
def largestProduct(n):
    number_lst = []
    prod_lst = []
    count = 0
    for i in range(10**(n-1),(10**n)):
        number_lst.append(i)
    while count!= len(number_lst):
        for i in range(len(number_lst)):
            prod_lst.append(number_lst[count]*number_lst[i])
        count +=1
    prod_set = list(set(prod_lst))
    return max(prod_lst)
Well, you don't need any storage to loop through what you need:
def largestProduct(n):
    range_low = 10**(n-1)
    range_high = 10**n
    largest = 0
    # replace xrange with range for Python 3.x
    for i in xrange(range_low, range_high):
        for j in xrange(range_low, range_high):
            largest = max(largest, i*j)
    return largest
But why would you want to do this? The largest product of two n-digit numbers is always the square of the largest n-digit number, i.e.:
def largestProduct(n):
    return (10**n-1)**2
You should consider creating a generator function. You can then iterate over its output, which produces the elements one at a time instead of holding the whole list in memory.
see https://wiki.python.org/moin/Generators
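A minimal sketch of the generator idea, assuming Python 3; the names two_factor_products and largest_product are made up for this example. Each product is yielded and then discarded, so max() only ever holds one candidate at a time.

```python
def two_factor_products(n):
    """Yield the product of every pair of n-digit numbers, one at a time."""
    low, high = 10 ** (n - 1), 10 ** n
    for i in range(low, high):
        # start at i: j < i would only repeat a product already produced
        for j in range(i, high):
            yield i * j

def largest_product(n):
    # max() consumes the generator lazily; no list of products is built
    return max(two_factor_products(n))

print(largest_product(2))  # 9801
```

This removes the memory problem but not the quadratic running time, so it is still only practical for small n.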
I am having some trouble checking if a sequence exists in a sorted list or not. For example, if my list is
n=[1,2,3,4,5,6,7]
I can write my code as:
for i in range(1,len(n)):
    if n[i]==n[0]+1:
        print('yes')
But, this logic fails if I am checking 5 numbers at a time. For example, if my list is
n=[1,3,4,5,6,7,8]
my sequence does not contain the first 5 numbers but it definitely exists in the list.
I was wondering if someone could give me some hints about how to check a sequence number by number for 5 numbers at a time. Thanks
You can use this recipe from the Python 2.6 itertools documentation page:
from itertools import groupby
from operator import itemgetter
data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28,29]
# Find runs of consecutive numbers using groupby. The key to the solution
# is differencing with a range so that consecutive numbers all appear in
# same group.
for k, g in groupby(enumerate(data), lambda x: x[0]-x[1]):
    seq = list(map(itemgetter(1), g))
    # if you don't want the actual items then:
    # if sum(1 for _ in g) >= 5: print('yes')
    if len(seq) >= 5:
        print (seq)
Output:
[25, 26, 27, 28, 29]
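To see why the grouping key works, the following sketch (Python 3, with a shortened copy of the data) prints index minus value for each element. Within a run of consecutive numbers the index and the value grow in step, so their difference stays constant.

```python
from itertools import groupby

data = [1, 4, 5, 6, 10, 15, 16, 17, 18]

# index - value is the same for every member of a consecutive run
keys = [index - value for index, value in enumerate(data)]
print(keys)  # [-1, -3, -3, -3, -6, -10, -10, -10, -10]

# groupby collects neighbouring elements that share a key into one run
runs = [[value for _, value in group]
        for _, group in groupby(enumerate(data), lambda pair: pair[0] - pair[1])]
print(runs)  # [[1], [4, 5, 6], [10], [15, 16, 17, 18]]
```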
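search_target = [1,2,3,4]
search_space = list(range(1000))  # a real list, so its slices compare equal to lists
sub_size = len(search_target)
# + 1 so that the last window of the search space is included
lists = [search_space[i:i+sub_size] for i in range(len(search_space)-sub_size+1)]
if search_target in lists:
    print("YUP")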
I'm not sure if this is an approach you would like, but using a buffer works.
The idea is to iterate through your array/list and collect the numbers (in order) until you run into one that is NOT in sequence. At that point, if the buffer is still too short, you dump its contents and keep going; if it is already long enough, you stop. If the buffer contains enough numbers at the end, you have your sequence.
This way you only need to iterate over the list once, so the check is O(n).
def check_for_sequence(array, length):
    buf = []
    for n in array:
        if buf == [] or (buf[-1] + 1 == n):
            buf.append(n)
        elif len(buf) >= length:
            # a long enough run just ended; keep it instead of resetting
            break
        else:
            buf = [n]
    if len(buf) >= length:
        print(buf[:length])
    else:
        print("No sequence of length %s" % length)
You can also make this (possibly) faster by terminating the for loop as soon as your buffer hits the required length. But then you'd have to check for the length every iteration.
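The early-termination variant mentioned above could look roughly like the sketch below. It returns the run instead of printing it; the name find_run is illustrative, and length is assumed to be at least 1.

```python
def find_run(values, length):
    """Return the first run of `length` consecutive integers, or None."""
    buf = []
    for n in values:
        if buf and buf[-1] + 1 == n:
            buf.append(n)
        else:
            buf = [n]  # run broken: start a new one at n
        # checked on every iteration so the loop can stop early
        if len(buf) == length:
            return buf
    return None

print(find_run([1, 3, 4, 5, 6, 7, 8], 5))  # [3, 4, 5, 6, 7]
```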
I want to know whether there is an equivalent statement in lists to do the following. In MATLAB I would do the following
fid = fopen('inc.txt','w')
init =1;inc = 5; final=51;
a = init:inc:final
l = length(a)
for i = 1:l
    fprintf(fid,'%d\n',a(i));
end
fclose(fid);
In short I have an initial value, a final value and an increment. I need to create an array (I read it is equivalent to lists in python) and print to a file.
In Python, range(start, stop + 1, step) can be used like Matlab's start:step:stop command. Unlike Matlab's functionality, however, range only works when start, step, and stop are all integers. If you want a parallel function that works with floating-point values, try the arange command from numpy:
import numpy as np
with open('numbers.txt', 'w') as handle:
    for n in np.arange(1, 5, 0.1):
        handle.write('{}\n'.format(n))
Keep in mind that, unlike Matlab, range and np.arange both expect their arguments in the order start, stop, then step. Also keep in mind that, unlike the Matlab syntax, range and np.arange both stop as soon as the current value is greater than or equal to the stop value.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html
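As a quick illustration of the exclusive stop value, here is a small sketch (Python 3) using the values from the question. The variable names mirror the Matlab snippet.

```python
init, inc, final = 1, 5, 51

# Matlab's init:inc:final includes final, so pass final + 1 as the stop
inclusive = list(range(init, final + 1, inc))
print(inclusive)  # [1, 6, 11, 16, 21, 26, 31, 36, 41, 46, 51]

# Passing final itself drops the last value, because stop is excluded
exclusive = list(range(init, final, inc))
print(exclusive)  # [1, 6, 11, 16, 21, 26, 31, 36, 41, 46]
```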
You can easily create a function for this. The first three arguments of the function will be the range parameters as integers and the last, fourth argument will be the filename, as a string:
def range_to_file(init, final, inc, fname):
    with open(fname, 'w') as f:
        f.write('\n'.join(str(i) for i in range(init, final, inc)))
Now you have to call it, with your custom values:
range_to_file(1, 51, 5, 'inc.txt')
So your output will be (in the fname file):
1
6
11
16
21
26
31
36
41
46
NOTE: in Python 2.x range() returns a list; in Python 3.x range() returns a lazy, immutable sequence object (not a list), so if you need an actual list you have to write list(range(...)).
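A short sketch of what that means in practice on Python 3: the range object supports len() and indexing like a sequence, but only becomes a list when you ask for one.

```python
r = range(1, 51, 5)

print(type(r).__name__)     # range
print(len(r), r[0], r[-1])  # 10 1 46

# Convert explicitly when a real list is needed
as_list = list(r)
print(as_list)  # [1, 6, 11, 16, 21, 26, 31, 36, 41, 46]
```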
test.py contains :
#!/bin/env python
f = open("test.txt","w")  # text mode, so writing str also works on Python 3
for i in range(1,50,5):
    f.write("%d\n"%i)
f.close()
You can execute
python test.py
file test.txt would look like this :
1
6
11
16
21
26
31
36
41
46
I think you're looking for something like this:
nums = range(10) #or any list, i.e. [0, 1, 2, 3...]
number_string = '\n'.join([str(x) for x in nums])
The [str(x) for x in nums] syntax is called a list comprehension. It allows you to build a list on the fly. '\n'.join(list) takes a list of strings and concatenates them, with a newline between each pair. str(x) is a type conversion: it converts an integer to a string.
Alternatively, with a simple for loop:
number_string = ''
for num in nums:
    number_string += str(num) + '\n'
The key is that you convert the value to a string before concatenation.
I think that the original poster wanted 51 to show up in the list, as well.
The Python syntax for this is a little awkward, because range (or xrange or arange) excludes its upper-limit argument, so you need to pass a value beyond your actual desired upper limit. When final lies exactly on the step grid, as it does here, going one increment past it is an easy solution:
init = 1
final = 51
inc = 5
with open('inc.txt','w') as myfile:
    # replace xrange with range for Python 3.x
    for nn in xrange(init, final+inc, inc):
        myfile.write('%d\n'%nn)
open('inc.txt','w').write("\n".join(str(i) for i in range(init,final,inc)))