python parse csv to lists - python

I have a CSV file whose data I want to parse into lists.
I am using Python's csv module to read it, basically as follows:
import csv
# csv.reader returns an iterator, so wrap it in list() before indexing
fin = list(csv.reader(open(path,'rb'),delimiter=' ',quotechar='|'))
print fin[0]
#gives the following
['"1239","2249.00","1","3","2011-02-20"']
#lets say i do the following
ele = str(fin[0])
ele = ele.strip().split(',')
print ele
#gives me following
['[\'"1239"', '"2249.00"', '"1"', '"3"', '"2011-02-20"\']']
Now ele[0] gives me ['"1239"
How do I get rid of that leading ['?
In the end, I want to get 1239 and convert it to an integer.
Any clues as to why this is happening?
Thanks
Edit: Never mind, resolved thanks to the first comment.

Change your delimiter to ',' and you will get a list of those values from the csv reader.
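A minimal sketch of that suggestion, assuming Python 3 and a file whose rows look like the one in the question. The io.StringIO object and the names sample, rows and first_value are illustrative stand-ins, not part of the original code:

```python
import csv
import io

# Stand-in for the real file; in practice this would be open(path, newline='')
sample = io.StringIO('"1239","2249.00","1","3","2011-02-20"\n')

# With ',' as the delimiter and '"' as the quote character, the reader
# splits each row into separate fields and removes the surrounding quotes.
reader = csv.reader(sample, delimiter=',', quotechar='"')
rows = list(reader)

# Each row is already a list of strings, so no str()/split() step is needed.
first_value = int(rows[0][0])
print(rows[0])
print(first_value)
```

Because the reader already returns one list per row, the integer conversion can be applied directly to rows[0][0].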

It's because you are converting a list to a string, which there is no need to do. Grab the first element of the list (in this case it is a string) and parse that:
>>> a = ['"1239","2249.00","1","3","2011-02-20"']
>>> a
['"1239","2249.00","1","3","2011-02-20"']
>>> a[0]
'"1239","2249.00","1","3","2011-02-20"'
>>> b = a[0].replace('"', '').split(',')
>>> b[-1]
'2011-02-20'
Of course, before you call the replace and split string methods, you should check that the value is a string, or handle the exception if it isn't.
Also, Blahdiblah is correct: your delimiter is probably wrong.

Related

Decoding String list in python from a binary file

I need to read a list of strings from a binary file and create a Python list.
I'm using the commands below to extract the data from the binary file:
tmp = f.read(100)
abc, = struct.unpack('100s',tmp)
The data in the variable abc is exactly as shown below, but I need it as a Python list of strings.
Data that I need as a list: 'UsrVal' 'VdetHC' 'VcupHC' ..... 'Gravity_Axis'
b'UsrVal\x00VdetHC\x00VcupHC\x00VdirHC\x00HdirHC\x00UpFlwHC\x00UxHC\x00UyHC\x00UzHC\x00VresHC\x00UxRP\x00UyRP\x00UzRP\x00VresRP\x00Gravity_Axis'
Here is how I would suggest doing it in one line.
Decode the binary string, then split on "\x00", which returns the list you are looking for.
For example:
my_binary_out = b'UsrVal\x00VdetHC\x00VcupHC\x00VdirHC\x00HdirHC\x00UpFlwHC\x00UxHC\x00UyHC\x00UzHC\x00VresHC\x00UxRP\x00UyRP\x00UzRP\x00VresRP\x00Gravity_Axis'
decoded_list = my_binary_out.decode("latin1", 'ignore').split('\x00')
#or
decoded_list = my_binary_out.decode("cp1252", 'ignore').split('\x00')
The output will look like this:
['UsrVal', 'VdetHC', 'VcupHC', 'VdirHC', 'HdirHC', 'UpFlwHC', 'UxHC', 'UyHC', 'UzHC', 'VresHC', 'UxRP', 'UyRP', 'UzRP', 'VresRP', 'Gravity_Axis']
Hope this helps
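As a variation on the same idea, here is a rough sketch that also covers the struct step, assuming the names sit in a fixed-size field padded with NUL bytes. The 40-byte field size and the shortened sample data are made up for illustration:

```python
import struct

# Shortened sample data; the real field would come from f.read(...)
raw = b'UsrVal\x00VdetHC\x00VcupHC\x00Gravity_Axis'

# Simulate a fixed-size, NUL-padded field (40 bytes is an arbitrary choice)
buf = raw.ljust(40, b'\x00')

# '40s' unpacks the whole buffer as a single bytes object
(field,) = struct.unpack('40s', buf)

# Drop the trailing padding, split on NUL, then decode each piece
names = [part.decode('ascii') for part in field.rstrip(b'\x00').split(b'\x00')]
print(names)
```

Splitting the bytes before decoding means the padding is removed first, so no empty strings end up at the end of the list.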
If you want a quick and messy way, and assuming your value
b'UsrVal\x00VdetHC\x00VcupHC\x00VdirHC\x00HdirHC\x00UpFlwHC\x00UxHC\x00UyHC\x00UzHC\x00VresHC\x00UxRP\x00UyRP\x00UzRP\x00VresRP\x00Gravity_Axis'
is in fact a str holding that text, i.e.
" b'UsrVal\x00VdetHC\x00VcupHC\x00VdirHC\x00HdirHC\x00UpFlwHC\x00UxHC\x00UyHC\x00UzHC\x00VresHC\x00UxRP\x00UyRP\x00UzRP\x00VresRP\x00Gravity_Axis' "
then the following two lines leave the list you want in b:
a = {YourStringHere}
b = a[2:-1].split("\x00")

How to remove the stuff lists add when writing to textfiles

I need to write a list to a text file named accounts.txt in the following format:
kieranc,conyers,asdsd,pop
ethand,day,sadads,dubstep
However, it ends up like the following with brackets:
['kieranc', 'conyers', 'asdsd', 'pop\n']['ethand', 'day', 'sadads', 'dubstep']
Here is my code (accreplace is a list):
accreplace = [['kieranc', 'conyers', 'asdsd', 'pop\n'],['ethand', 'day', 'sadads', 'dubstep']]
acc = open("accounts.txt", "w")
for x in accreplace:
    acc.write(str(x))
Since each element in accreplace is a list, str(x) returns the list's printed representation, brackets and quotes included. To write each list as a comma-separated line, join its items instead:
for x in accreplace:
    acc.write(",".join([str(l) for l in x]))
This joins the items of each list into a single comma-separated string.
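The join approach can be sketched end to end as follows. This version assumes you also want every row to end with exactly one newline, so it strips any newline already stored in the items. The io.StringIO buffer is a stand-in for the accounts.txt file, and the names buf, fields and result are illustrative:

```python
import io

accreplace = [['kieranc', 'conyers', 'asdsd', 'pop\n'],
              ['ethand', 'day', 'sadads', 'dubstep']]

# Stand-in for open("accounts.txt", "w") so the sketch has no side effects
buf = io.StringIO()

for row in accreplace:
    # Remove stray whitespace/newlines from each item, then join with commas
    fields = [str(item).strip() for item in row]
    buf.write(",".join(fields) + "\n")

result = buf.getvalue()
print(result)
```

With a real file, opening it in a with block (with open("accounts.txt", "w") as acc:) also makes sure it is closed once writing is done.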

Python put string into dictionary

I want to convert a string into a dictionary. I saved this dictionary previously in a text file.
The problem is that I am not sure how the keys are structured. The values were generated with Counter(dictionaryName). The dictionary is really large, so I cannot check every key by hand.
The keys can contain single quotes ', double quotes ", commas and maybe other characters. Is there any way to convert it back into a dictionary?
For example this is stored in the file:
Counter({'element0':512, "'4,5'element1":50, '4:55foobar':23,...})
I found previous solutions using json, for example, but I have problems with the double quotes, and I cannot simply split on the commas.
If you trust the source, run from collections import Counter and then eval() the string.
Something like:
>>> from collections import Counter
>>> line = '''Counter({'element0':512, "'4,5'element1":50, '4:55foobar':23})'''
>>> D = eval(line)
>>> D
Counter({"'4,5'element1": 50, '4:55foobar': 23, 'element0': 512})
You could remove the Counter( and ) parts, then parse the rest with ast.literal_eval, as long as it only involves basic Python data types:
import ast
from collections import Counter

def parse_Counter_string(s):
    s = s.strip()
    if not (s.startswith('Counter(') and s.endswith(')')):
        raise ValueError('String does not match expected format')
    # Strip the leading 'Counter(' (8 characters) and the trailing ')'
    s = s[8:-1]
    return Counter(ast.literal_eval(s))
In the future, I recommend picking a different way to serialize your data.
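One possible alternative, sketched here with the standard json module, assuming the keys are strings and the counts are plain integers. The variable names counts, text and restored are illustrative:

```python
import json
from collections import Counter

counts = Counter({'element0': 512, "'4,5'element1": 50, '4:55foobar': 23})

# Counter is a dict subclass, so json can serialize it as a JSON object.
# Quotes and commas inside keys are escaped by json itself.
text = json.dumps(counts)

# Loading gives a plain dict; wrap it in Counter to get the original type back
restored = Counter(json.loads(text))
print(restored == counts)
```

In practice text would be written to and read from the file, for example with json.dump and json.load on an open file object.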
You can use the demjson library for this. If you have the text directly in your program:
import demjson
counter = demjson.decode("enter your text here")
If it is in a file, you can do the following:
from os.path import dirname, join, realpath
WD = dirname(realpath(__file__))
file = open(join(WD, "filename"), "r")
counter = demjson.decode(file.read())
file.close()

Python len not working

In the code below, I am trying to use len(list) to count the number of strings in the array held in each of the tags variables from the while loop. When I tried a sample list, list2, at the bottom, it printed 5, which is correct, but with my real data it counted the characters in the array, not the number of strings. I need help figuring out why that is. I am new to Python, so the simplest way possible please!
#!/usr/bin/python
import json
import csv
from pprint import pprint
with open('data.json') as data_file:
    data = json.load(data_file)
#pprint(data)
# calc number of alert records in json file
x = len(data['alerts'])
count = 0
while (count < x):
    tags = str(data['alerts'][count]['tags']).replace("u\"","\"").replace("u\'","\'")
    list = "[" + tags.strip('[]') + "]"
    print list
    print len(list)
    count=count+1
list2 = ['redi', 'asd', 'rrr', 'www', 'qqq']
print len(list2)
Your list construction list = "[" + tags.strip('[]') + "]" creates a string, not a list. So yes, len works: it counts the characters in your string.
Your tags construction looks a bit off. You take a value from your data (data['alerts'][count]['tags']), convert it to a string, and strip off the '[]'. Why not just use the value itself?
Also, list is a horrible name for a variable, because it shadows the built-in list type.
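The suggestion to use the value itself might look like the sketch below. It assumes each alert's tags entry is a JSON array, so json already loads it as a Python list. The sample_json string is invented for illustration and stands in for data.json:

```python
import json

# Made-up sample standing in for the contents of data.json
sample_json = '{"alerts": [{"tags": ["redi", "asd", "rrr"]}, {"tags": ["www", "qqq"]}]}'
data = json.loads(sample_json)

# Each alert["tags"] is already a list, so len() counts its items directly;
# no str() conversion or bracket stripping is involved.
tag_counts = [len(alert["tags"]) for alert in data["alerts"]]
print(tag_counts)
```

Iterating over data["alerts"] directly also removes the need for the manual counter and the while loop.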
list = "[" + tags.strip('[]') + "]"
print list
print len(list)
Ironically, list is a string, not a list. That's why calling len on it "was counting the characters in the array"
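You need to make sure that your variable is a list rather than a str.
Try:
print(type(yourList))
If it shows that it is a str, parse it into a real list before calling len (calling list() on a string would only split it into characters):
import ast
len(ast.literal_eval(yourList))
Hope this answers your question.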
And when you want to build a list variable, try this:
myList = []
for blah in blahblah:
    myList.append(blah)
I think this should solve your problem.

Remove string quotes from array in Python

I'm trying to get rid of some characters in my array so I'm just left with the x and y coordinates, separated by a comma as follows:
[[316705.77017187304,790526.7469308273]
[321731.20991025254,790958.3493565321]]
I have used zip() to create a tuple of the x and y values (as pairs from a list of strings), which I've then converted to an array using numpy. The array currently looks like this:
[['316705.77017187304,' '790526.7469308273,']
['321731.20991025254,' '790958.3493565321,']]
I need the output to be an array.
I'm pretty stumped about how to get rid of the single quotes and the second comma. I have read that map() can change string to numeric but I can't get it to work.
Thanks in advance
Using ast (Abstract Syntax Trees) from the standard library:
>>> import ast
>>> xll = [['321731.20991025254,' '790958.3493565321,'], ['321731.20991025254,' '790958.3493565321,']]
>>> [ast.literal_eval(xl[0]) for xl in xll]
[(321731.20991025254, 790958.3493565321), (321731.20991025254, 790958.3493565321)]
The above gives a list of tuples. For a list of lists, use the following:
>>> [list(ast.literal_eval(xl[0])) for xl in xll]
[[321731.20991025254, 790958.3493565321], [321731.20991025254, 790958.3493565321]]
Old answer: I think this works:
>>> sll
[['316705.770172', '790526.746931'], ['321731.20991', '790958.349357']]
>>> fll = [[float(i) for i in l] for l in sll]
>>> fll
[[316705.770172, 790526.746931], [321731.20991, 790958.349357]]
Old edit:
>>> xll = [['321731.20991025254,' '790958.3493565321,'], ['321731.20991025254,' '790958.3493565321,']]
>>> [[float(s) for s in xl[0].split(',') if s.strip() != ''] for xl in xll]
[[321731.20991025254, 790958.3493565321], [321731.20991025254, 790958.3493565321]]
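If each row actually holds two separate strings, each with a trailing comma (which is how NumPy would print such an array), a plain-Python sketch like the one below may be enough. The names rows and coords are illustrative, and the final conversion to an array is left as a comment so the sketch runs without NumPy:

```python
# Assumed input: two separate strings per row, each ending in a comma
rows = [['316705.77017187304,', '790526.7469308273,'],
        ['321731.20991025254,', '790958.3493565321,']]

# Strip the trailing comma from every string, then convert it to float
coords = [[float(s.rstrip(',')) for s in row] for row in rows]
print(coords)

# With NumPy installed, numpy.array(coords) would turn this into a float array.
```

Iterating over a NumPy array of strings works the same way, so rows could also be the existing array rather than a nested list.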
