ValueError: invalid literal for int() with base 10: 'pixel' [closed] - python

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 2 years ago.
The image variable comes from a text file containing numbers, which I assume Python reads as strings. I'm trying to cast each "pixel" (a string element from the text file) to an integer so that I can divide it, but I get the error shown in the title. Is there a way to cast it while keeping it at the same index? Thanks!
def getHist (image, bins):
    hist = [0 for e in range(0,bins)]
    interval = 256//bins
    for row in image:
        for pixel in row:
            hist[int(pixel)//interval] += 1
    return hist
def printHist (hist):
    for i in range(0, len(hist)):
        print(i+1,"-", hist[i], "\t- ", end="")
        for j in range(0, hist[i]):
            print("|", end="")
        print()
def readImage (filename):
    image = []
    fileHandle = open(filename, "r")
    for line in fileHandle:
        image_row = line[0:-1].split(",")
        image_row[-1] = image_row[-1][0:len(line)].split("\n")
        image.append(image_row)
    fileHandle.close()
    return image
image = readImage("image.txt")
hist = getHist("image.txt",8)
printHist(hist)
Here is an example for the image parameter (from a text file):
212,16,142,183,92,211,0,221,54,226
227,56,137,252,140,241,1,3,153,51
157,144,121,99,17,185,125,27,76,129
49,55,81,220,194,8,62,179,96,142
74,178,80,24,2,34,247,177,244,82
93,117,154,152,35,224,38,70,193,52
181,61,45,141,163,222,160,168,203,104
234,114,244,53,252,48,66,7,218,95
49,189,18,31,184,207,53,141,148,188
238,6,104,189,244,132,28,92,147,123
where each row is a line of the file and each element is a pixel.
The file is loaded by a different function in my program. I'm trying to convert each pixel to an int so that I can divide it by another int.

Replace
hist = getHist("image.txt",8)
with
hist = getHist(image, 8)
Otherwise you're iterating over the characters of the filename instead of the data read from the file!
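Even with that change, readImage as posted leaves a list in the last position of each row (because of the extra .split("\n")), and int() cannot convert a list. Below is a minimal sketch of a reader that converts while parsing, assuming the comma-separated format shown in the question. The names read_image and get_hist are illustrative, not the asker's code, and io.StringIO stands in for the real file:

```python
import io

def read_image(file_handle):
    """Parse comma-separated pixel rows into lists of ints."""
    image = []
    for line in file_handle:
        line = line.strip()              # drop the trailing newline
        if not line:                     # skip blank lines
            continue
        image.append([int(value) for value in line.split(",")])
    return image

def get_hist(image, bins):
    """Count pixels (assumed to be in 0-255) per bin."""
    hist = [0] * bins
    interval = 256 // bins
    for row in image:
        for pixel in row:
            # min() guards the last bin when bins does not divide 256 evenly
            hist[min(pixel // interval, bins - 1)] += 1
    return hist

# Two short made-up rows in the same format as the question's file
sample = io.StringIO("212,16,142\n227,56,137\n")
image = read_image(sample)
hist = get_hist(image, 8)
print(image)
print(hist)
```

Converting once at read time means the histogram code no longer needs int(pixel) at all.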

Related

Checking most used colors in image [duplicate]

This question already has an answer here:
fastest way to find the rgb pixel color count of image
(1 answer)
Closed last month.
I want to know the list of most used colors in this picture:
I tried the following code, but it takes too long:
from PIL import Image
colors = []
class Color:
    def __init__(self, m, c):
        self.col = c
        self.many = m
im = Image.open("~/.../strowberry.jpeg")
def cool():
    for i in im.getdata():
        i = str(i)
        i = i.replace(", ", "")
        i = i.replace("(", "")
        i = i.replace(")", "")
        i = int(i)
        colors.append(Color(1, i))
    for x in colors:
        num = 0
        for j in range(len(colors)):
            if x.col == colors[num].col:
                del colors[num]
                num -= 1
                x.many += 1
            num += 1
    for obj in colors:
        print(obj.many, obj.col)
cool()
Why is the code so slow and how can I improve the performance?
Do not reinvent the wheel. The Python Standard Library contains collections.Counter, which can do this for you much more efficiently. Using it, you don't need to iterate over the data yourself. You also do not need to define a class or perform the string operations. The code is very short and simple:
import collections
from PIL import Image
im = Image.open('strawberry.jpg')
counter = collections.Counter(im.getdata())
for color in counter:
    print(f'{counter[color]} times color {color}')
If you really need the Color objects (for whatever you want to do with it later in your program), you can easily create this from the counter object using this one-liner:
colors = [Color(counter[color], color) for color in counter]
...and if you really need it in the same string format as in your original code, use this instead:
colors = [Color(counter[color], int(''.join(map(str, color)))) for color in counter]
Note that the two one-liners use list comprehensions, which are very Pythonic and in many cases very fast as well.
The code int(''.join(map(str, color))) does the same as your 5 lines of code in the inner loop. This uses the fact that the original data is a tuple of integers, which can be converted to strings using map(str, ...) and then concatenated together using ''.join(...).
All this together took about 0.5 second on my machine, without the printing (which is slow anyway).
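As for why the original was slow: its nested loop compares every color against every other one and deletes from the list while scanning it, so the work grows roughly quadratically with the number of pixels, whereas Counter makes a single pass. To get the most used colors in order, Counter also provides most_common(n). Here is a small sketch on made-up pixel tuples (the pixels list stands in for im.getdata(), so no image file is needed):

```python
from collections import Counter

# Made-up RGB tuples standing in for im.getdata()
pixels = [
    (255, 0, 0), (255, 0, 0), (0, 128, 0),
    (255, 0, 0), (0, 128, 0), (0, 0, 255),
]

counter = Counter(pixels)

# most_common(n) returns the n most frequent items as (item, count) pairs
top_two = counter.most_common(2)
for color, count in top_two:
    print(f'{count} times color {color}')
```

Keeping the tuple as the key also avoids an ambiguity in the joined-digits format: (1, 23, 4) and (12, 3, 4) would both become 1234.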

Wrong sequence for list using sorted() [duplicate]

This question already has answers here:
Sort a python list of strings with a numeric number
(3 answers)
Closed 3 years ago.
I have some images that I download from a URL serving random pictures. I then try to sort them so I can work with them properly, but the sorting is messed up. I'd appreciate any advice or a pointer to what I'm missing.
Code ( image list generating ):
def image_downloader():
    image_url = 'url'
    for count in tqdm(range(20)):
        image_data = requests.get(image_url).content
        with open(f'image_{count}.jpg', 'wb') as handler:
            handler.write(image_data)
        sleep(0.5)
And my sorting ( trying to get it by generated picture "id" ):
local_folder_content = os.listdir('.')
images_list = list((image for image in local_folder_content if image.endswith('.jpg')))
pprint((sorted(images_list, key=lambda x: x[:-4].split('_')[1])))
Result( sorting is messed up) :
['image_0.jpg',
'image_1.jpg',
'image_10.jpg',
'image_11.jpg',
'image_12.jpg',
'image_13.jpg',
'image_14.jpg',
'image_15.jpg',
'image_16.jpg',
'image_17.jpg',
'image_18.jpg',
'image_19.jpg',
'image_2.jpg',
'image_3.jpg',
'image_4.jpg',
'image_5.jpg',
'image_6.jpg',
'image_7.jpg',
'image_8.jpg',
'image_9.jpg']
You can try something like this :
images_list.sort(key= lambda i: int(i.lstrip('image_').rstrip('.jpg')))
Alternatively, generate all filenames with two (or more) digits so that plain string sorting already gives the right order:
with open(f'image_{str(count).zfill(2)}.jpg', 'wb') as handler:
Output:
image_00.jpg
image_01.jpg
image_02.jpg
With zero-padded names your images will sort correctly.
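The key in the question sorts wrongly because split('_')[1] yields a string, and strings compare character by character, so '10' sorts before '2'. Below is a minimal sketch of a numeric key, assuming every name follows the image_<number>.jpg pattern from the question. The helper name numeric_id is hypothetical. A regular expression is used rather than lstrip/rstrip because those methods strip sets of characters, not prefixes or suffixes. That happens to work for these names but can give surprising results on others:

```python
import re

def numeric_id(filename):
    """Return the integer in names like 'image_12.jpg' (hypothetical helper)."""
    match = re.search(r'_(\d+)\.jpg$', filename)
    # Names that do not fit the pattern get -1 and therefore sort first
    return int(match.group(1)) if match else -1

images_list = ['image_0.jpg', 'image_1.jpg', 'image_10.jpg',
               'image_2.jpg', 'image_19.jpg', 'image_9.jpg']

sorted_images = sorted(images_list, key=numeric_id)
print(sorted_images)
```

This keeps the existing filenames untouched, which may matter if the files have already been downloaded.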

Having an issue with using median function in numpy

I am having an issue with the median function in numpy. The code used to work on a previous computer, but when I tried to run it on my new machine I got the error "cannot perform reduce with flexible type". To try to fix this, I used the map() function to make sure my list contained floating-point values, and got this error message: could not convert string to float: .
After some more debugging, it seems that my issue is with how I split the lines of my input file. The lines are of the form 2456893.248202,4.490 and I want to split on the ",". However, when I print out the list for the second column of that line, I get
4
.
4
9
0
so it seems to be splitting each character somehow, though I'm not sure how. The relevant section of code is below. I appreciate any thoughts or ideas, and thanks in advance.
def curve_split(fn):
    with open(fn) as f:
        for line in f:
            line = line.strip()
            time,lc = line.split(",")
    #debugging stuff
    g=open('test.txt','w')
    l1=map(lambda x:x+'\n',lc)
    g.writelines(l1)
    g.close()
    #end debugging stuff
    return time,lc
if __name__ == '__main__':
    # place where I keep the lightcurve files from the image subtraction
    dirname = '/home/kuehn/m4/kepler/subtraction/detrending'
    files = glob.glob(dirname + '/*lc')
    print(len(files))
    # in order to create our lightcurve array, we need to know
    # the length of one of our lightcurve files
    lc0 = curve_split(files[0])
    lcarr = np.zeros([len(files),len(lc0)])
    # loop through every file
    for i,fn in enumerate(files):
        time,lc = curve_split(fn)
        lc = map(float, lc)
        # debugging
        print(fn[5:58])
        print(lc)
        print(time)
        # end debugging
        lcm = lc/np.median(float(lc))
        #lcm = ((lc[qual0]-np.median(lc[qual0]))/
        #       np.median(lc[qual0]))
        lcarr[i] = lcm
        print(fn,i,len(files))
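One likely reading of the symptom, offered as an assumption since no answer is recorded here: curve_split returns time and lc as plain strings taken from a single line, and iterating over a string (as map does) yields one character at a time. That would explain why '4.490' prints as 4, ., 4, 9, 0 and why float('.') fails. Below is a minimal sketch that collects one float per line instead. The name read_curve is hypothetical, io.StringIO stands in for a light-curve file, the second and third sample rows are made up, and statistics.median is used only to keep the sketch dependency-free (np.median on the same list should give the same value):

```python
import io
from statistics import median

def read_curve(file_handle):
    """Return two lists of floats: times and light-curve values."""
    times, fluxes = [], []
    for line in file_handle:
        line = line.strip()
        if not line:                      # skip blank lines
            continue
        time_str, flux_str = line.split(",")
        times.append(float(time_str))     # convert each field once
        fluxes.append(float(flux_str))
    return times, fluxes

# First row is the example from the question; the others are invented
sample = io.StringIO(
    "2456893.248202,4.490\n"
    "2456893.268635,4.510\n"
    "2456893.289068,4.470\n"
)
times, fluxes = read_curve(sample)

med = median(fluxes)
normalized = [flux / med for flux in fluxes]
print(med)
print(normalized)
```

With lists of floats in hand, converting to a numpy array and dividing by its median becomes a one-step operation rather than a per-character one.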

Writing a random amount of random numbers to a file and returning their squares

So, I'm trying to write a random amount of random whole numbers (in the range of 0 to 1000), square these numbers, and return these squares as a list. Initially, I started off writing to a specific txt file that I had already created, but it didn't work properly. I looked for some methods I could use that might make things a little easier, and I found the tempfile.NamedTemporaryFile method that I thought might be useful. Here's my current code, with comments provided:
# This program calculates the squares of numbers read from a file, using several functions
# reads file- or writes a random number of whole numbers to a file -looping through numbers
# and returns a calculation from (x * x) or (x**2);
# the results are stored in a list and returned.
# Update 1: after errors and logic problems, found Python method tempfile.NamedTemporaryFile:
# This function operates exactly as TemporaryFile() does, except that the file is guaranteed to have a visible name in the file system, and creates a temporary file that can be written on and accessed
# (say, for generating a file with a list of integers that is random every time).
import random, tempfile
# Writes to a temporary file for a length of random (file_len is >= 1 but <= 100), with random numbers in the range of 0 - 1000.
def modfile(file_len):
    with tempfile.NamedTemporaryFile(delete = False) as newFile:
        for x in range(file_len):
            newFile.write(str(random.randint(0, 1000)))
        print(newFile)
    return newFile
# Squares random numbers in the file and returns them as a list.
def squared_num(newFile):
    output_box = list()
    for l in newFile:
        exp = newFile(l) ** 2
        output_box[l] = exp
    print(output_box)
    return output_box
print("This program reads a file with numbers in it - i.e. prints numbers into a blank file - and returns their conservative squares.")
file_len = random.randint(1, 100)
newFile = modfile(file_len)
output = squared_num(file_name)
print("The squared numbers are:")
print(output)
Unfortunately, now I'm getting this error in line 15, in my modfile function: TypeError: 'str' does not support the buffer interface. As someone who's relatively new to Python, can someone explain why I'm getting this, and how I can fix it to achieve the desired result? Thanks!
EDIT: now fixed code (many thanks to unutbu and Pedro)! Now: how would I be able to print the original file numbers alongside their squares? Additionally, is there any minimal way I could remove decimals from the outputted float?
By default tempfile.NamedTemporaryFile creates a binary file (mode='w+b'). To open the file in text mode and be able to write text strings (instead of byte strings), you need to change the temporary file creation call to not use the b in the mode parameter (mode='w+'):
tempfile.NamedTemporaryFile(mode='w+', delete=False)
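The difference between the two modes can be checked directly. This is a small sketch, assuming Python 3, in which the default binary-mode file rejects a str while the text-mode file accepts it (the value "42" is an arbitrary example):

```python
import os
import tempfile

# Default mode is 'w+b': writing a str raises TypeError in Python 3
with tempfile.NamedTemporaryFile() as binary_file:
    try:
        binary_file.write("42\n")
        binary_write_failed = False
    except TypeError:
        binary_write_failed = True

# Text mode ('w+') accepts str directly
with tempfile.NamedTemporaryFile(mode='w+', delete=False) as text_file:
    text_file.write("42\n")
    path = text_file.name

# The file was created with delete=False, so read it back and clean up by hand
with open(path) as f:
    content = f.read()
os.remove(path)

print(binary_write_failed, repr(content))
```

Another option, not used in the answers here, would be to keep binary mode and write encoded bytes, for example str(n).encode().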
You need to put newlines after each int, lest they all run together creating a huge integer:
newFile.write(str(random.randint(0, 1000))+'\n')
(Also set the mode, as explained in PedroRomano's answer):
with tempfile.NamedTemporaryFile(mode = 'w+', delete = False) as newFile:
modfile returns a closed filehandle. You can still get a filename out of it, but you can't read from it. So in modfile, just return the filename:
return newFile.name
And in the main part of your program, pass the filename on to the squared_num function:
filename = modfile(file_len)
output = squared_num(filename)
Now inside squared_num you need to open the file for reading.
with open(filename, 'r') as f:
    for l in f:
        exp = float(l)**2 # `l` is a string. Convert to float before squaring
        output_box.append(exp) # build output_box with append
Putting it all together:
import random, tempfile
def modfile(file_len):
    with tempfile.NamedTemporaryFile(mode = 'w+', delete = False) as newFile:
        for x in range(file_len):
            newFile.write(str(random.randint(0, 1000))+'\n')
        print(newFile)
    return newFile.name
# Squares random numbers in the file and returns them as a list.
def squared_num(filename):
    output_box = list()
    with open(filename, 'r') as f:
        for l in f:
            exp = float(l)**2
            output_box.append(exp)
    print(output_box)
    return output_box
print("This program reads a file with numbers in it - i.e. prints numbers into a blank file - and returns their conservative squares.")
file_len = random.randint(1, 100)
filename = modfile(file_len)
output = squared_num(filename)
print("The squared numbers are:")
print(output)
PS. Don't write lots of code without running it. Write little functions, and test that each works as expected. For example, testing modfile would have revealed that all your random numbers were being concatenated. And printing the argument sent to squared_num would have shown it was a closed filehandle.
Testing the pieces gives you firm ground to stand on and lets you develop in an organized way.
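For the follow-up in the question's EDIT (printing each original number next to its square, without decimals), one possible sketch is to parse with int() instead of float(), since the file only holds whole numbers, and to return pairs. The name squared_pairs is hypothetical and the three numbers written here are arbitrary sample values rather than random ones:

```python
import os
import tempfile

def squared_pairs(filename):
    """Return (number, square) tuples, one per non-empty line of the file."""
    pairs = []
    with open(filename, 'r') as f:
        for line in f:
            line = line.strip()
            if line:
                number = int(line)        # int keeps the output free of decimals
                pairs.append((number, number ** 2))
    return pairs

# Write a few fixed sample numbers, one per line, in text mode
with tempfile.NamedTemporaryFile(mode='w+', delete=False) as new_file:
    for number in (3, 10, 25):
        new_file.write(f'{number}\n')
    filename = new_file.name

pairs = squared_pairs(filename)
os.remove(filename)                        # clean up the delete=False file

for number, square in pairs:
    print(f'{number} squared is {square}')
```

If the values must stay floats for some reason, formatting with f'{value:.0f}' at print time would be another way to hide the decimals.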

Writing to a multi dimensional array with split

I am trying to use Python to parse a text file (stored in the variable trackList) containing times and titles. It looks like this:
00:04:45 example text
00:08:53 more example text
12:59:59 the last bit of example text
My regular expression (rem) works. I am also able to split the string (i) into two parts correctly (separating times and text), but I am unable to then add the arrays that the split returns (using .extend) to a large array I created earlier (sLines).
f=open(trackList)
count=0
sLines=[[0 for x in range(0)] for y in range(34)]
line=[]
for i in f:
    count+=1
    line.append(i)
    rem=re.match("\A\d\d\:\d\d\:\d\d\W",line[count-1])
    if rem:
        sLines[count-1].extend(line[count-1].split(' ',1))
    else:
        print("error on line: "+count)
That code should go through each line in the file trackList and test whether the line is as expected. If so, it should separate the time from the text and save the result as an array inside an array, at the index one less than the current line number; if not, it should print an error pointing me to the line.
I use array[count-1] because Python arrays are zero-indexed and file lines are not.
I use .extend() because I want both elements of the smaller array added to the larger array in the same iteration of the parent for loop.
So, you have some pretty confusing code there.
For instance doing:
[0 for x in range(0)]
Is a really fancy way of initializing an empty list:
>>> [] == [0 for x in range(0)]
True
Also, how do you know in advance that the matrix should be 34 lines long? You're also confusing yourself by calling your line 'i' in your for loop; that name is usually reserved as shorthand for an index, which you'd expect to be a numerical value. Appending i to line and then re-referencing it as line[count-1] is redundant when you already have your line variable (i).
Your overall code can be simplified to something like this:
import re
# load the file and extract the lines
with open(trackList) as f:
    lines = f.readlines()
# compile the expression once so it can be reused in the loop
# (raw string, so the backslashes reach the regex engine unchanged)
expr = re.compile(r'^(\d\d:\d\d:\d\d)\s*(.*)$')
sLines = []
# loop the lines collecting both the index (i) and the line (line)
for i, line in enumerate(lines):
    result = expr.match(line)
    # validate the line
    if not result:
        print("error on line: " + str(i+1))
        # add an invalid list to the matrix
        sLines.append([]) # or whatever you want as your invalid line
        continue
    # add the [time, text] pair to the matrix
    sLines.append(list(result.groups()))
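To try the approach without a file on disk, here is a self-contained sketch using sample lines from the question plus one deliberately malformed line. The names parse_tracks and TRACK_PATTERN are hypothetical, and unlike the code above it reports bad line numbers separately instead of appending an empty placeholder:

```python
import re

# Time at the start of the line, optional whitespace, then the title
TRACK_PATTERN = re.compile(r'^(\d\d:\d\d:\d\d)\s*(.*)$')

def parse_tracks(lines):
    """Split each line into [time, text]; collect 1-based numbers of bad lines."""
    parsed, bad_lines = [], []
    for number, line in enumerate(lines, start=1):
        match = TRACK_PATTERN.match(line.rstrip('\n'))
        if match:
            parsed.append(list(match.groups()))
        else:
            bad_lines.append(number)
    return parsed, bad_lines

sample = [
    "00:04:45 example text\n",
    "not a track line\n",                       # malformed on purpose
    "12:59:59 the last bit of example text\n",
]
parsed, bad_lines = parse_tracks(sample)
print(parsed)
print(bad_lines)
```

Because the list grows with append, there is no need to know the number of lines (34 in the question) ahead of time.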
