Using Python to loop contents of a CSV - python

In Python I want to list out each number inside a simple CSV...
CSV:
07555555555, 07555555551
This is what I have tried:
for number in csv.reader(instance.data_file.read().splitlines()):
    print(number)
However, this outputs the whole thing as one string like this...
['07446164630', '07755555555']
Why?
I have also tried to loop like this
for i, item in enumerate(csv.reader(instance.data_file.read().splitlines())):
    print(i)
    print(item)
I'm not sure I fully understand what I'm doing wrong so any help in explaining how to print each number in the file would be amazing.

csv.reader parses each line of the CSV into a list of values, so your loop iterates over the lines of the file, not over the individual numbers. Since both numbers are on one line, you get them together in one list. To iterate over the values of each line, add a nested for loop:
for line in csv.reader(instance.data_file.read().splitlines()):
    for item in line:
        number = int(item)  # note: int() drops the leading zero; keep item as a string if that matters
        print(number)  # or whatever you want
Or use enumerate to get the index of each number within its line:
for line in csv.reader(instance.data_file.read().splitlines()):
    for index, item in enumerate(line):
        number = int(item)
        print(index, number)  # or whatever you want
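
As a rough, self-contained sketch of the nested loop: here io.StringIO stands in for instance.data_file (an assumption, since the real object is not shown), and skipinitialspace=True is passed so the space after the comma is dropped. The values are kept as strings so the leading zero of each number survives.

```python
import csv
import io

# Stand-in for the uploaded file; the real code would read instance.data_file.
sample = io.StringIO("07555555555, 07555555551\n")

numbers = []
# csv.reader yields one list per line; the inner loop walks the cells of that line.
for row in csv.reader(sample, skipinitialspace=True):
    for cell in row:
        numbers.append(cell)  # kept as text so the leading 0 is preserved

for number in numbers:
    print(number)
```

If the numbers are needed as integers rather than text, int(cell) can replace cell in the append call, at the cost of losing the leading zero.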

Use NumPy's flatten method to convert a 2D array into a 1D array:
import numpy as np
data = np.loadtxt(file_name, delimiter=',').flatten()  # values are parsed as floats by default
for item in data:
    print(item)

Related

Function for split digits in list and unique number/code for every digit

I have a data input which I want to bring into a specific shape.
Data looks like this:
01211231202143244400222255340042523252102440536423024350201113345340514003134
20023230143300003201455331343005145134541545264403161213336031512541125234215
01203204313112402314341530533423155434004002652564464622316236363153203455225
My code:
with open('Dataday8.in') as file:
    data = [i for i in file.read().split()]
After the code it looks like:
data = ['01211231202143244400222255340042523252102440536423024350201113345340514003', '2002323014330000320145533134300514513454154526440316121333603151254112523421', '0120320431311240231434153053342315543400400265256446462231623636315320345522']
But I want it to be separated after every digit (with each line kept in its own bracket). No matter what I try, my code never gets me there.
After this I would like to give every digit in the list a unique number that I can work with it in a loop.
Thanks for any help
This is the code I tried to split after every digit in the list:
[x for x in data.split('\n')]
I am simply restating Maurice Meyer's answer, which solves the problem, for better future readability. Furthermore, if you want to get integers, you simply need to do:
data = [list(i) for i in file.read().split()]
data = [[int(elem) for elem in row] for row in data]
Although already solved by Sala, I think your "int object is not iterable" problem could be solved with this:
data = list([int(elem) for elem in row] for row in (i for i in file.read().split()))
You can also write it like this:
data = list(list(int(elem) for elem in row) for row in (i for i in file.read().split()))
This is a one-liner but of course you can break it down.
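
For the second part of the question (a unique number for every digit), one possible approach is to flatten the rows and use enumerate. This is only a sketch: the sample text below is shortened to the first four digits of each line from the question, and the names flat and position are made up for illustration.

```python
# Sample rows shortened from the question's input; the real data comes from the file.
text = "0121\n2002\n0120\n"

# One inner list per line, one integer per digit.
data = [[int(ch) for ch in line] for line in text.split()]

# Flatten the rows, then give every digit a running index across all rows.
flat = [digit for row in data for digit in row]
for position, digit in enumerate(flat):
    print(position, digit)
```

If the row and column of each digit matter more than a single running index, nesting two enumerate loops over data would give (row, column) pairs instead.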

Python converting strings in a list to numbers

I have encountered the below error message:
invalid literal for int() with base 10: '"2"'
The 2 is enclosed by single quotes on outside, and double quotes on inside. This data is in the primes list from using print primes[0].
Sample data in primes list:
["2","3","5","7"]
The primes list is created from a CSV file via:
primes=csvfile.read().replace('\n',' ').split(',')
I am trying to convert the strings in the primes list into integers.
Via Google I have come across similar questions to mine on SE, and I have tried the two common answers that are relevant to my problem IMO.
Using map():
primes=map(int,primes)
Using list comprehension:
primes=[int(i) for i in primes]
Unfortunately, both of them give the same error message as listed above. I get a similar error message when I use long() instead of int().
Please advise.
You want:
to read each line of the CSV
to create a single flattened list of integers from all the lines.
So you have to deal with the quotes (which may not even be present, depending on how the file was created). Also, replacing linefeeds with spaces does not separate the last number of one line from the first number of the next, so those two end up in the same item. That is a lot of issues to handle by hand.
Use the csv module instead. Say f is the handle of the opened file, then:
import csv
nums = [int(x) for row in csv.reader(f) for x in row]
That parses the cells, strips off the quotes if present, flattens the rows and converts the values to integers, all in one line.
To limit the number of numbers read, you could create a generator comprehension instead of a list comprehension and consume only the n first items:
n = 20000 # number of elements to extract
z = (int(x) for row in csv.reader(f) for x in row)
nums = [next(z) for _ in xrange(n)] # xrange => range for python 3
Even better, to avoid a StopIteration exception you could use itertools.islice instead, so if the CSV data ends before n items have been read, you simply get everything that was there:
import itertools
nums = list(itertools.islice(z,n))
(Note that you have to rewind the file to call this code more than once or you'll get no elements)
Performing this task without the csv module is of course possible ([int(x.strip('"')) for x in csvfile.read().replace('\n',',').split(',')]) but more complex and error-prone.
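
A small end-to-end sketch of the csv approach, under a few assumptions: io.StringIO stands in for the opened file, and the second row of sample data (11 and 13) is invented to show that several lines are flattened into one list.

```python
import csv
import io
import itertools

# Stand-in for the opened CSV file; the quoted cells mirror the question's sample.
f = io.StringIO('"2","3","5"\n"7","11","13"\n')

# Lazily convert every cell of every row to int.
cells = (int(x) for row in csv.reader(f) for x in row)

# Take at most n values; islice simply stops early if the data runs out.
n = 4
nums = list(itertools.islice(cells, n))
print(nums)

f.seek(0)  # rewind before reading the same file object again
all_nums = [int(x) for row in csv.reader(f) for x in row]
print(all_nums)
```

The csv module removes the double quotes while parsing, which is why int() no longer sees values such as '"2"'.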
You can try this:
primes=csvfile.read().replace('\n',' ').split(',')
final_primes = [int(i[1:-1]) for i in primes]
Try this:
import csv
with open('csv.csv') as csvfile:
    data = csv.reader(csvfile, delimiter=',', skipinitialspace=True)
    primes = [int(j) for i in data for j in i]
print(primes)
or, to avoid duplicates:
print(set(primes))

Python: How to take multiple lines of inputed data and put it into a list

I know this is probably something incredibly simple, but I seem to be stumped.
Anyways for an assignment I have to have the user enter the number of data points(N) followed by the data points themselves. They must then be printed in the same manner in which they were entered (one data point/line) and then put into a single list for later use. Here's what I have so far
N = int(input("Enter number of data points: "))
lines = ''
for i in range(N):
    lines += input()+"\n"
print(lines)
Output for N = 4 (the user enters 1 (enter), 2 (enter), ... 4, and the following is printed):
1
2
3
4
So this works and looks perfect; however, I now need to convert these values into a list to do some statistics work later in the program. I have tried making an empty list and bringing lines into it, but the \n formatting seems to mess things up, or I get a list index out of range error.
Any and all help is greatly appreciated!
How about adding every new input directly to a list and then just printing it.
Like this:
N = int(input("Enter number of data points: "))
lines = []
for i in range(N):
    new_data = input("Next?")
    lines.append(new_data)
for i in lines:
    print(i)
Now every item was printed in a new line and you have a list to manipulate.
You could just add all the inputs to the list and use the join function like this:
'\n'.join(inputs)
when you want to print it. This gives you a string with each member of the list separated by the delimiter of your choice, a newline in this case.
This way you don't need to add anything to the values you get from the user.
You could append all the data to a list first and then print every item line by line with a for loop, so there is no need to concatenate it with "\n":
N = int(input("Enter number of data points: "))
data = []
for i in range(N):
    data.append(input())  # append() returns None, so there is nothing to assign
for i in data:
    print(i)
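
Since the list is meant for statistics later, the collected strings still need converting to numbers. A minimal sketch, assuming the entries are plain numeric text: to_numbers is a hypothetical helper name, and the hard-coded list stands in for what the input() loop would collect.

```python
import statistics

def to_numbers(lines):
    """Convert a list of input strings to floats, ignoring surrounding whitespace."""
    return [float(line.strip()) for line in lines]

# Stand-in for what the input() loop would collect when the user types 1, 2, 3, 4.
lines = ["1", "2", "3", "4"]

print("\n".join(lines))  # one data point per line, as entered
values = to_numbers(lines)
print(statistics.mean(values))
```

Keeping the raw strings for printing and a separate numeric list for calculations avoids any handling of "\n" characters.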

Lists - changing strings to integers in a list imported from CSV, not all data

I have the following code in python 3
import csv
import operator
with open('herofull.csv','r') as file:
    reader=csv.reader(file,delimiter=',')
    templist=list(reader)
print(templist)
and the data on the csv looks like this
CSVflie
The program imports the data into a list. I then want to change the last 3 items on each row that are now in the list to integers so I can do calculations with them, is this possible? I have tried all sorts with no luck. I can do it with a simple list but this is imported like a list within a list which is making my brain hurt. Please help
Ross
Probably simplest to do a loop, especially if you know it's the last three elements.
for row in templist:
    for i in range(-3, 0):
        row[i] = int(row[i])
This will not create a new list in memory, instead simply changing the existing templist.
Alternatively, if you know that the last three numbers will always have two digits or fewer, you can do the following:
for line in templist:
    for index, element in enumerate(line):
        if element.isdigit() and len(element) <= 2:
            line[index] = int(element)
This will create a new list with your data and convert any strings that are digits into integers.
new_list = []
for row in templist:
    new_list.append(
        [int(i) if i.isdigit() else i for i in row]
    )
print(new_list)
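
If only the last three cells should be converted and the original list should stay untouched, slicing each row is another option. This is a sketch only: convert_last_three is a hypothetical helper, and the rows below are invented because the real layout of herofull.csv is not shown in the question.

```python
def convert_last_three(rows):
    """Return a new list of rows with the last three cells converted to int."""
    return [row[:-3] + [int(cell) for cell in row[-3:]] for row in rows]

# Hypothetical rows; the real herofull.csv contents are not shown in the question.
templist = [
    ["hero_a", "x", "y", "z", "10", "20", "30"],
    ["hero_b", "x", "y", "z", "5", "15", "25"],
]
converted = convert_last_three(templist)
print(converted)
```

Unlike the in-place loops, this leaves templist as strings, which can help if the original text is needed again later.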
cool...cheers
Had a brain wave and got this to work
import csv
with open('herofull.csv','r') as file:
    reader=csv.reader(file,delimiter=',')
    templist=list(reader)
for row in templist:
    row[4]=int(row[4])
    row[5]=int(row[5])
    row[6]=int(row[6])
print(templist)
I can now do calculations and append to the list, thanks for your help. It appears I just needed to stop thinking about it for a while (wood, trees and all that).

Writing list to file only 10 times, python

I have a list of lists called sorted_lists
I'm using the line below to write them into a txt file. The problem is that it writes ALL the lists. I'm trying to figure out how to write only the first n (n = any number), for example the first 10 lists.
f.write ("\n".join("\t".join (row) for row in sorted_lists)+"\n")
Try the following:
f.write ("\n".join("\t".join (row) for row in sorted_lists[0:N])+"\n")
where N is the number of the first N lists you want to print.
sorted_lists[0:N] will catch the first N lists (counting from 0 to N-1, there are N lists; list[N] is excluded). You could also write sorted_lists[:N] which implicitly means that it will start from the first item of the list (item 0). They are the same, the latter may be considered more elegant.
f.write ('\n'.join('\t'.join(row) for row in sorted_lists[:n])+'\n')
where n is the number of lists to write.
Why not simplify this code and use the right tools:
from itertools import islice
import csv
first10 = islice(sorted_lists, 10)
with open('output.tab', 'w', newline='') as fout:  # Python 3; on Python 2 open with 'wb' instead
    tabout = csv.writer(fout, delimiter='\t')
    tabout.writerows(first10)
You should read up on Python's slicing features.
If you want to look at only the first 10 entries of sorted_lists, you could do sorted_lists[0:10].
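
To check that slicing really limits the output, here is a small sketch that writes to an in-memory buffer instead of a real file. The contents of sorted_lists are made up for the example, and io.StringIO stands in for a file opened with open(path, 'w', newline='').

```python
import csv
import io

# Hypothetical data: 12 rows of string cells.
sorted_lists = [[str(i), str(i * i)] for i in range(12)]

n = 10
out = io.StringIO()  # stand-in for a real output file
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerows(sorted_lists[:n])  # slicing past the end is safe for short lists

written = out.getvalue().splitlines()
print(len(written))
```

A slice such as sorted_lists[:n] never raises an error when the list has fewer than n items; it just returns what is available.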
