converting strings into integers in list comprehension - python

I am trying to write a function that reads a file, splits it on newlines, and then splits each line on the comma delimiter (,). After that, I want to convert each string inside that list to an integer using only a list comprehension.
# My code but it's not converting the splitted list into integers.
def read_csv(filename):
    string_list = open(filename, "r").read().split('\n')
    string_list = string_list[1:len(string_list)]
    splitted = [i.split(",") for i in string_list]
    final_list = [int(i) for i in splitted]
    return final_list
read_csv("US_births_1994-2003_CDC_NCHS.csv")
Output:
TypeError: int() argument must be a string, a bytes-like object or a number, not 'list'
How the data looks after splitting with comma delimiter(,)
us = open("US_births_1994-2003_CDC_NCHS.csv", "r").read().split('\n')
splitted = [i.split(",") for i in us]
print(splitted)
Output:
[['year', 'month', 'date_of_month', 'day_of_week', 'births'],
['1994', '1', '1', '6', '8096'],
['1994', '1', '2', '7', '7772'],
['1994', '1', '3', '1', '10142'],
['1994', '1', '4', '2', '11248'],
['1994', '1', '5', '3', '11053'],
['1994', '1', '6', '4', '11406'],
['1994', '1', '7', '5', '11251'],
['1994', '1', '8', '6', '8653'],
['1994', '1', '9', '7', '7910'],
['1994', '1', '10', '1', '10498']]
How do I convert each string inside this output to an integer and assign the result to a single list using a list comprehension?

str.split() produces a new list; so splitted is a list of lists. You'd want to convert the contents of each contained list:
[[int(v) for v in row] for row in splitted]
Demo:
>>> csvdata = '''\
... year,month,date_of_month,day_of_week,births
... 1994,1,1,6,8096
... 1994,1,2,7,7772
... '''
>>> string_list = csvdata.splitlines() # better way to split lines
>>> string_list = string_list[1:] # you don't have to specify the second value
>>> splitted = [i.split(",") for i in string_list]
>>> splitted
[['1994', '1', '1', '6', '8096'], ['1994', '1', '2', '7', '7772']]
>>> splitted[0]
['1994', '1', '1', '6', '8096']
>>> final_list = [[int(v) for v in row] for row in splitted]
>>> final_list
[[1994, 1, 1, 6, 8096], [1994, 1, 2, 7, 7772]]
>>> final_list[0]
[1994, 1, 1, 6, 8096]
Note that you could just loop directly over the file to get separate lines too:
string_list = [line.strip().split(',') for line in openfileobject]
and skipping an entry in such an object could be done with next(iterableobject, None).
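As a rough sketch of that idea, the snippet below loops over an open file object and uses next() to skip the header. The function name read_rows and the sample data are made up for illustration, and io.StringIO stands in for a real file.

```python
import io

def read_rows(fileobj):
    """Return the data rows of an open CSV-like file as lists of ints."""
    next(fileobj, None)  # skip the header line; None avoids StopIteration on an empty file
    # Skip blank lines so int('') is never attempted.
    return [[int(v) for v in line.strip().split(',')]
            for line in fileobj if line.strip()]

# Hypothetical sample data standing in for the real file.
sample = io.StringIO("year,month,births\n1994,1,8096\n1994,2,7772\n")
rows = read_rows(sample)
print(rows)
```

This still reads one line at a time, so only the converted rows are held in memory.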
Rather than read the whole file into memory and manually split the data, you could just use the csv module:
import csv
def read_csv(filename):
    with open(filename, 'r', newline='') as csvfile:
        reader = csv.reader(csvfile)
        next(reader, None)  # skip first row
        for row in reader:
            yield [int(c) for c in row]
The above is a generator function, producing one row at a time as you loop over it:
for row in read_csv("US_births_1994-2003_CDC_NCHS.csv"):
    print(row)
You can still get a list with all rows with list(read_csv("US_births_1994-2003_CDC_NCHS.csv")).
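If you do want the whole conversion in a single list comprehension, as the question asks, something like the following should work. It is only a sketch: parse_csv_ints is an illustrative name, and it assumes every non-blank line after the header contains nothing but comma-separated integers.

```python
def parse_csv_ints(text):
    """Parse CSV text (with a header line) into a list of lists of ints."""
    lines = text.split('\n')[1:]  # drop the header line
    # The 'if line' filter drops the empty string left by a trailing newline.
    return [[int(v) for v in line.split(',')] for line in lines if line]

# Sample text in the same shape as the data shown in the question.
csvdata = "year,month,date_of_month,day_of_week,births\n1994,1,1,6,8096\n1994,1,2,7,7772\n"
final_list = parse_csv_ints(csvdata)
print(final_list)
```

The filter matters because split('\n') leaves an empty string after a trailing newline, and int('') raises ValueError.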

Related

Taking a column from a csv file and putting it into a list in python

I need to write some code in Python that takes a column from a CSV file and makes it a list. Here is my code so far.
import csv
from collections import defaultdict
columns = defaultdict(list)
with open('Team1_BoMInput.csv') as f:
    reader = csv.DictReader(f)
    for row in reader:
        for (k,v) in row.items():
            columns[k].append(v)
y = (columns['Quantity'])
x = (columns[('Actual Price')])
b = ['2', '2', '1', '1', '1', '1', '1', '1', '1', '2', '1', '1', '3', '4', '1', '1', '1', '8', '2', '2', '1', '1', '1', '1', '4', '1', '2', '2', '2', '1', '2', '2', '2', '1', '2', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '3', '2', '1', '1', '1']
a = ['$6.41', '$14.97', '$6.78', '$11.44', '$22.61', '$1.58', '$11.68', '$19.99', '$12.99', '$3.66', '$24.99', '$1.04', '$0.09', '$1.92', '$4.80', '$1.50', '$17.92', '$1.36', '$65.52', '$24.38', '$1.91', '$3.40', '$13.79', '$39.55', '$1.94', '$3.38', '$11.34', '$18.33', '$21.13', '$8.24', '$30.14', '$125.97', '$26.54', '$8.58', '$12.77', '$11.42', '$1.32', '$2.63', '$8.58', '$0.40', '$0.57', '$2.54', '$2.83', '$1.41', '$9.03', '$3.38', '$5.98', '$4.51', '$2.54', '$6.76', '$4.51', '$1.13', '$14.24']
for i in range(0, len(b)):
    b[i] = float(b[i])
print(b)
x = ([s.strip('$') for s in a])
for i in range(0, len(x)):
    x[i] = float(x[i])
print(x)
Instead of having the values of a and b listed in the program, I want it to take the columns from the CSV file and use those values.
Thanks in advance
Try this:
import pandas as pd
df=pd.read_csv("Team1_BoMInput.csv")
y=list(df['Quantity'])
x=list(df['Actual Price'])
Refer to the code below, which uses the pandas library for less code and faster computation:
import pandas as pd
df=pd.read_csv("Team1_BoMInput.csv")
quantity_list_value=list(df.loc[:,"Quantity"].astype(float).values)
price_list_value=list(df.loc[:,"Actual Price"].apply(lambda x: x.replace("$","")).astype(float).values)
I think your code will not run unless you change it to this:
import csv
from collections import defaultdict
columns = defaultdict(list)
with open('Team1_BoMInput.csv') as f:
    reader = csv.DictReader(f)
    for row in reader:
        for (k,v) in row.items():
            if k not in columns:
                columns[k] = list()
            columns[k].append(v)
# Rest of your code
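For what it's worth, defaultdict(list) already creates an empty list the first time a missing key is accessed, so the explicit membership check is optional. Below is a minimal sketch that reads the two columns straight from CSV data and converts them to floats using only the standard library. The sample rows are invented for illustration (the real file's contents aren't shown), and io.StringIO stands in for open('Team1_BoMInput.csv').

```python
import csv
import io
from collections import defaultdict

# Invented sample standing in for Team1_BoMInput.csv.
sample = io.StringIO("Quantity,Actual Price\n2,$6.41\n1,$14.97\n")

columns = defaultdict(list)
for row in csv.DictReader(sample):
    for key, value in row.items():
        columns[key].append(value)  # missing keys get a fresh list automatically

# Convert the collected strings, stripping the dollar sign from prices.
quantities = [float(q) for q in columns['Quantity']]
prices = [float(p.strip('$')) for p in columns['Actual Price']]
print(quantities)
print(prices)
```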

Append to list from another list

I have a list like this:
list = ['1,2,3,4,5', '6,7,8,9,10']
My problem is the "," inside the list items, because '1,2,3,4,5' is a single string.
I want to have list2 = ['1','2','3','4'...]
How can I do this?
It should be something like this:
nums = []
for str in list:
    nums = nums + [int(n) for n in str.split(',')]
You can loop through and split the strings up.
list = ['1,2,3,4,5', '6,7,8,9,10']
result = []
for s in list:
    result += s.split(',')
print(result)
Split each value in the original by , and then keep appending them to a new list.
l = []
for x in ['1,2,3,4,5', '6,7,8,9,10']:
    l.extend(y for y in x.split(','))
print(l)
Use itertools.chain.from_iterable with map:
from itertools import chain
lst = ['1,2,3,4,5', '6,7,8,9,10']
print(list(chain.from_iterable(map(lambda x: x.split(','), lst))))
# ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
Note that you shouldn't use list as a variable name, since it shadows the built-in list type.
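To see why shadowing the name is a problem, here is a small sketch of what happens once list is rebound. The function and variable names are illustrative only.

```python
def shadowing_demo():
    list = ['1,2,3,4,5', '6,7,8,9,10']  # this local name hides the built-in list
    try:
        list('abc')  # attempts to call the list object itself
    except TypeError as exc:
        return str(exc)
    return None

message = shadowing_demo()
print(message)

# With a different variable name, the built-in stays usable.
lst = ['1,2,3,4,5', '6,7,8,9,10']
first_items = list(s.split(',')[0] for s in lst)
print(first_items)
```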
You can also use list comprehension
li = ['1,2,3,4,5', '6,7,8,9,10']
res = [c for s in li for c in s.split(',') ]
print(res)
#['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
list2 = []
list2+=(','.join(list).split(','))
','.join(list) produces the string '1,2,3,4,5,6,7,8,9,10'.
','.join(list).split(',') produces ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'].
The join method joins the elements of a sequence with a delimiter. Here it returns a string in which the elements have been joined by ','.
The split method splits a string on a delimiter and returns a list of substrings.
# Without using loops
li = ['1,2,3,4,5', '6,7,8,9,10']
p = ",".join(li).split(",")
#['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']

How to create a dictionary with lists as values out of a csv file?

I have a CSV file with 19 columns and want to turn it into a dictionary in which the first 2 columns form the key (maybe as a tuple, or merged into one string) and the other 17 columns form a list as the value. The file looks like this in Excel: [image of the CSV file]
I want to have a dictionary like this :
d1 = { "A , 222" : [1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1]}
d2={"B, 223" : [1,1,1,1,0,0,0,1,1,0,0,1,1,1,1,1]}
d3 = {....}
....
Here's a solution using csv.reader
from csv import reader
d = {}
with open('infile.csv', newline='') as f:
    r = reader(f)
    for row in r:
        if not row:
            continue  # Handles blank rows
        key1, key2, *value = row
        d[(key1, key2)] = value
Edit:
The line key1, key2, *value = row will only work in Python 3. If that feature is not available to you, you can use
key1, key2 = row[:2]
value = row[2:]
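Since the desired dictionary in the question holds integers, the values can also be converted while the dictionary is built. This is a sketch, assuming every column after the first two contains an integer and the file has no header row. The sample rows are shortened and made up, and io.StringIO stands in for the real file.

```python
import csv
import io

# Shortened, made-up rows standing in for infile.csv (note the trailing blank line).
sample = io.StringIO("A,222,1,1,0\nB,223,1,0,0\n\n")

d = {}
for row in csv.reader(sample):
    if not row:
        continue  # skip blank rows
    key1, key2, *values = row
    d[(key1, key2)] = [int(v) for v in values]  # convert each remaining column

print(d)
```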
Do you mean this?
res = {}
with open("fileName.csv", "r") as f:
    text = f.readlines()
for line in text[1:]:
    part = line.strip().split(",")
    key = ",".join(part[:2])
    value = [int(i) for i in part[2:]]
    res[key] = value
Using csv.reader you can do that like:
Code:
as_dict = {'{}, {}'.format(*row[:2]): row[2:] for row in reader if row}
Test Code:
import csv
from io import StringIO
data = StringIO(''.join('\n'.join(x.strip() for x in u"""
A,222,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1
B,223,1,1,1,1,0,0,0,1,1,0,0,1,1,1,1,1
""".split('\n')[1:-1])))
reader = csv.reader(data)
as_dict = {'{}, {}'.format(*row[:2]): row[2:] for row in reader if row}
print(as_dict)
Results:
{
'A, 222': ['1', '1', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1'],
'B, 223': ['1', '1', '1', '1', '0', '0', '0', '1', '1', '0', '0', '1', '1', '1', '1', '1']
}

Variable Returns Are Blank Using Defined Function in Python [duplicate]

When I'm moving through a file with a csv.reader, how do I return to the top of the file? If I were doing it with a normal file I could just do something like "file.seek(0)". Is there anything like that for the csv module?
Thanks ahead of time ;)
You can seek the file directly. For example:
>>> import csv
>>> f = open("csv.txt")
>>> c = csv.reader(f)
>>> for row in c: print(row)
...
['1', '2', '3']
['4', '5', '6']
>>> f.seek(0)
0
>>> for row in c: print(row) # again
...
['1', '2', '3']
['4', '5', '6']
You can still use file.seek(0). For instance, look at the following:
import csv
file_handle = open("somefile.csv", "r")
reader = csv.reader(file_handle)
# Do stuff with reader
file_handle.seek(0)
# Do more stuff with reader as it is back at the beginning now
This should work because csv.reader pulls its lines from that same file object.
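Here is a small self-contained sketch of that rewind, using io.StringIO in place of a real file (the sample content is made up). One thing to be aware of is that the reader's line_num counter is not reset by the seek.

```python
import csv
import io

f = io.StringIO("1,2,3\n4,5,6\n")  # stand-in for open("somefile.csv")
reader = csv.reader(f)

first_pass = list(reader)   # consumes the whole file
f.seek(0)                   # rewind the underlying file object
second_pass = list(reader)  # the same reader yields the rows again

print(first_pass == second_pass)
print(reader.line_num)  # keeps counting across the rewind
```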
I've found csv.reader and csv.DictReader a little difficult to work with because of the current line_num. Making a list from the first read works well:
>>> import csv
>>> f = open('csv.txt')
>>> lines = list( csv.reader(f) ) # <-- list from csvReader
>>>
>>> for line in lines:
... print(line)
['1', '2', '3']
['4', '5', '6']
>>>
>>> for line in lines:
... print(line)
['1', '2', '3']
['4', '5', '6']
>>>
>>> lines[1]
['4', '5', '6']
This captures the optional first row used by DictReader, but lets you work with the list again and again and even inspect individual rows.

Sorting list with strings & ints in numerical order

I'm currently dealing with the below list:
[['John', '1', '2', '3'], ['Doe', '1', '2', '3']]
I'm very new to Python. I want to sort the numbers in each sublist in descending order (high to low) while keeping the string at the beginning of the sublist, like this:
[['John', '3', '2', '1'], ['Doe', '3', '2', '1']]
There will always be one name, with integers thereafter.
I collect this list from a csv file like so:-
import csv
with open('myCSV.csv', 'r') as f:
    reader = csv.reader(f)
    your_list = list(reader)
print(sorted(your_list))
Any help is much appreciated. Thanks in advance..
Iterate over the list and sort only the slice of each sublist without the first item. To sort strings as numbers pass key=int to sorted. Use reverse=True since you need a reversed order:
>>> l = [['John', '1', '2', '3'], ['Doe', '1', '2', '3']]
>>>
>>> [[sublist[0]] + sorted(sublist[1:], key=int, reverse=True) for sublist in l]
[['John', '3', '2', '1'], ['Doe', '3', '2', '1']]
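Tied back to the CSV reading code in the question, this could look roughly like the sketch below. sort_scores is a hypothetical helper name, the sample data is invented, and it assumes every value after the name parses as an integer.

```python
import csv
import io

def sort_scores(rows):
    """Keep each row's first item and sort the rest numerically, high to low."""
    return [[row[0]] + sorted(row[1:], key=int, reverse=True)
            for row in rows if row]  # 'if row' skips blank lines

sample = io.StringIO("John,1,10,3\nDoe,2,1,30\n")  # stand-in for open('myCSV.csv')
your_list = sort_scores(csv.reader(sample))
print(your_list)
```

The key=int argument matters once a value has more than one digit: compared as plain strings, '10' would sort below '3'.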
