How To Change Lists of Unicode to ASCII - python

I'm trying to change these lists I'm getting from my school grade site from Unicode to ASCII.
The lists look like this:
x = grades[1]
print x #Input
[u'B', u'84'] #Output
But I don't want the u in there. I've tried to use
a.encode('ascii','ignore')
But I get
Traceback (most recent call last):
File "C:\Python27\Project.py", line 33, in <module>
L1.encode('ascii','ignore')
AttributeError: 'list' object has no attribute 'encode'
Is there any way to do this?

The problem is that you're trying to encode a list full of strings, not a string. You can't call string methods on a list of strings; you have to call them on each string.
The pythonic way to do this is with a list comprehension:
>>> x = [u'B', u'84']
>>> y = [s.encode('ascii', 'ignore') for s in x]
>>> y
['B', '84']
Under the covers, this is basically the same as:
>>> x = [u'B', u'84']
>>> y = []
>>> for s in x:
...     y.append(s.encode('ascii', 'ignore'))
...
>>> y
['B', '84']
But it's more concise, harder to get wrong, and (once you get the basic idea of list comprehensions) easier to read.
It's also the same as either of the following:
from functools import partial  # needed for the second form
y = map(lambda s: s.encode('ascii', 'ignore'), x)
y = map(partial(unicode.encode, encoding='ascii', errors='ignore'), x)
Generally, if you need to write a lambda or non-trivial partial, a list comprehension will be more readable than a map call. But in cases where you have a function ready to use, map is often nicer.
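One thing the examples above don't show is what the 'ignore' error handler does when a string actually contains non-ASCII characters: they are silently dropped. Here is a minimal sketch; the third sample value is made up for illustration. Under Python 2 the results are str objects, while under Python 3 the same code yields bytes objects.

```python
# Sample data; the third entry is a hypothetical value with a non-ASCII character.
grades = [u'B', u'84', u'caf\xe9']

# errors='ignore' drops any character that cannot be encoded as ASCII.
encoded = [s.encode('ascii', 'ignore') for s in grades]

# Python 2 prints ['B', '84', 'caf']; Python 3 prints [b'B', b'84', b'caf'].
print(encoded)
```

If losing characters silently is not acceptable, 'replace' or the default 'strict' handler may be a better fit.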

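You could simply apply the str() function to each element of your list. In Python 2, str() converts a unicode object using the default encoding, which is ASCII, so this works as long as every character is ASCII; a non-ASCII character raises UnicodeEncodeError. For example (in a Python shell):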
>>> x = [u'B', u'84']
>>> x
[u'B', u'84']
>>> x[0]
u'B'
>>> x[1]
u'84'
>>> str(x[0])
'B'
>>> str(x[1])
'84'
And if a list is needed, you can use a list comprehension:
>>> y = [str(i) for i in x]
>>> y
['B', '84']
or the map function:
>>> z = map(str, x)
>>> z
['B', '84']
>>>
Hope this is what you're looking for. Regards!

Related

Converting list to string then integer

I have a list with 1550500 numbers, and all of them with quotes.
Example: list('100', '150', '200', '250') etc...
I need to sum all the numbers, but before that I need to convert it to INT.
List Name: trip_list
My code:
mean_tripstr = str(trip_list)
mean_trip = [int(x) for x in mean_tripstr]
print(type(mean_trip))
Error message:
Traceback (most recent call last):
File "projeto1.py", line 235, in <module>
mean_trip = [int(x) for x in mean_tripstr]
File "projeto1.py", line 235, in <listcomp>
mean_trip = [int(x) for x in mean_tripstr]
ValueError: invalid literal for int() with base 10: '['
What am I doing wrong? I am new to coding...
Python has a map function, which takes a function and an iterable and applies the function to each item. There is also the sum function, which returns the sum of an iterable.
You can use this:
sum(map(int, trip_list))
Note that in Python 3 the map function does not return a list; it returns a lazy iterator. If you want the converted numbers as a list, use
list(map(int, trip_list))
(this may take a while as it requires iterating over the entire list, and yours is quite long).
The error with your code is converting your list to a string, that is,
>>> my_list = ["5", "6"]
>>> my_list_str = str(my_list)
>>> my_list_str
"['5', '6']"
>>> type(my_list_str)
<class 'str'>
>>> type(my_list)
<class 'list'>
So when you try to iterate over the string, the first x is [ which is not a number (thus the exception).
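Putting that together, here is a minimal sketch of converting the strings directly, without the str() call. The four-element list is only a stand-in for the real trip_list, and computing the mean is an assumption based on the variable name in the question.

```python
# Stand-in for the real trip_list, which holds about 1.5 million numeric strings.
trip_list = ['100', '150', '200', '250']

# Convert each element, not the string representation of the whole list.
trips = [int(x) for x in trip_list]

total = sum(trips)
# float() keeps the division exact under Python 2 as well as Python 3.
mean_trip = float(total) / len(trips)

print(total, mean_trip)
```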
As a side note, list(map(int, a_list)) was faster than [int(i) for i in a_list] in this measurement:
>>> c1 = "list(map(int, a_list))"
>>> c2 = "[int(i) for i in a_list]"
>>> s = "a_list = [str(i) for i in range(1000)]"
>>> import timeit
>>> timeit.timeit(c1, setup=s, number=10000)
1.9165708439999918
>>> timeit.timeit(c2, setup=s, number=10000)
2.470973639999997
You have to convert each element to int:
mean_tripstr = map(str,trip_list)
mean_trip = list(map(int,mean_tripstr))
The first line builds a lazy map object, which is more efficient when you only need to iterate over the items once. The last line converts the result back to a list.
But, as you said, if you already have a list of strings, you can just do:
mean_trip = list(map(int,trip_list))
If you know NumPy, you can also do:
import numpy as np
trip_list = np.array(trip_list)
mean_trip = trip_list.astype(int)  # the np.int alias was removed in NumPy 1.24; the builtin int works

Convert an Array, converted to a String, back to an Array

I recently found an interesting behaviour in python due to a bug in my code.
Here's a simplified version of what happened:
a=[[1,2],[2,3],[3,4]]
print(str(a))
console:
"[[1,2],[2,3],[3,4]]"
Now I wondered if I could convert the string back to an array. Is there a good way of converting a string representing an array with mixed data types ("[1,'Hello',['test','3'],True,2.532]"), including integers, strings, booleans, floats and arrays, back to an array?
There's always everybody's old favourite ast.literal_eval
>>> import ast
>>> x = "[1,'Hello',['test','3'],True,2.532]"
>>> y = ast.literal_eval(x)
>>> y
[1, 'Hello', ['test', '3'], True, 2.532]
>>> z = str(y)
>>> z
"[1, 'Hello', ['test', '3'], True, 2.532]"
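ast.literal_eval is the better choice because it only accepts Python literals, whereas eval executes any expression it is given and so should never be used on untrusted input. Just to mention it, eval also works: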
a=[[1,2],[2,3],[3,4]]
string_list = str(a)
original_list = eval(string_list)
print(original_list == a)
# True
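To make the difference between the two concrete, here is a small sketch. The helper name parse_list_literal is made up for illustration. It round-trips a list through str() and shows that ast.literal_eval refuses input that is not a plain literal, where eval would have executed it.

```python
import ast

def parse_list_literal(text):
    """Parse text as a Python literal and insist that the result is a list."""
    value = ast.literal_eval(text)
    if not isinstance(value, list):
        raise ValueError("expected a list literal, got %s" % type(value).__name__)
    return value

original = [1, 'Hello', ['test', '3'], True, 2.532]
restored = parse_list_literal(str(original))

# A function call is not a literal, so literal_eval raises ValueError
# instead of running it.
try:
    parse_list_literal("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True

print(restored == original, rejected)
```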

Split string to various data types

I would like to convert the following string:
s = '1|2|a|b'
to
[1, 2, 'a', 'b']
Is it possible to do the conversion in one line?
Yes, it is possible. But how?
Algorithm for the approach
Split the string into its constituent parts using str.split. The output of this is
>>> s = '1|2|a|b'
>>> s.split('|')
['1', '2', 'a', 'b']
That is half the problem solved. Next we need to loop through the split parts and check whether each one represents an int or should stay a str. For this we use:
A list comprehension, which is for the looping part
str.isdigit for finding if the element is an int or a str.
The list comprehension can be easily written as [i for i in s.split('|')]. But how do we add an if clause there? This is covered in One-line list comprehension: if-else variants. Once we know which elements are ints and which are not, we can call the builtin int on the former.
Hence the final code will look like
[int(i) if i.isdigit() else i for i in s.split('|')]
Now for a small demo,
>>> s = '1|2|a|b'
>>> [int(i) if i.isdigit() else i for i in s.split('|')]
[1, 2, 'a', 'b']
As we can see, the output is as expected.
Note that this approach is not suitable if there are many types to be converted.
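One more caveat with str.isdigit: it returns False for a negative number such as '-2', and in Python 3 it returns True for some characters that int() cannot parse, such as the superscript two. A sketch of a slightly more forgiving variant is to try int() and fall back to the original string; the helper name int_or_str is made up for illustration.

```python
def int_or_str(token):
    """Return int(token) when it parses as an integer, else the token unchanged."""
    try:
        return int(token)
    except ValueError:
        return token

s = '1|-2|a|b'

# The isdigit version leaves '-2' as a string.
with_isdigit = [int(i) if i.isdigit() else i for i in s.split('|')]

# The try/except version converts it.
with_helper = [int_or_str(i) for i in s.split('|')]

print(with_isdigit)
print(with_helper)
```

The conversion is still a single list comprehension; the type check has just moved into the helper.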
You cannot do it for negative numbers or lots of mixed types in one line but you could use a function that would work for multiple types using ast.literal_eval:
from ast import literal_eval
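def f(s, delim):
    for ele in s.split(delim):
        try:
            # literal_eval parses numbers, including negatives and floats
            yield literal_eval(ele)
        except (ValueError, SyntaxError):
            # not a Python literal (e.g. 'a'), so keep the raw string
            yield ele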
s = '1|-2|a|b|3.4'
print(list(f(s,"|")))
[1, -2, 'a', 'b', 3.4]
Another way is to use the built-in map function:
>>> s='1|2|a|b'
>>> l = map(lambda x: int(x) if x.isdigit() else x, s.split('|'))
>>> l
[1, 2, 'a', 'b']
In Python 3:
>>> s='1|2|a|b'
>>> l = list(map(lambda x: int(x) if x.isdigit() else x, s.split('|')))
>>> l
[1, 2, 'a', 'b']
map in Python 3 returns a lazy iterator, so you must convert it to a list.
It is possible to do arbitrarily many or complex conversions "in a single line" if you're allowed a helper function. Python does not natively have a "convert this string to the type that it should represent" function, because what it "should" represent is vague and may change from application to application.
import json

def convert(input):
    converters = [int, float, json.loads]
    for converter in converters:
        try:
            return converter(input)
        except (TypeError, ValueError):
            pass
    # here we assume if all converters failed, it's just a string
    return input
s = "1|2.3|a|[4,5]"
result = [convert(x) for x in s.split("|")]
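To see what this produces, here is a self-contained restatement of the helper with its result printed. The parameter is renamed to text (my choice, to avoid shadowing the builtin input), and the extra inputs at the end are illustrative. They show that the converter order decides the outcome: float accepts 'inf', and json.loads turns 'true' and 'null' into True and None, which may or may not be what your application wants.

```python
import json

def convert(text):
    """Try each converter in order; fall back to the raw string."""
    for converter in (int, float, json.loads):
        try:
            return converter(text)
        except (TypeError, ValueError):
            # json.JSONDecodeError is a subclass of ValueError, so it lands here too.
            pass
    return text

s = "1|2.3|a|[4,5]"
result = [convert(x) for x in s.split("|")]
print(result)

# Inputs that convert to something other than a string (illustrative):
surprises = [convert(x) for x in ("inf", "true", "null")]
print(surprises)
```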
If you have many kinds of data types (more than str and int), I believe this does the job.
s = '1|2|a|b|[1, 2, 3]|(1, 2, 3)'
print([eval(x) if not x.isalpha() else x for x in s.split("|")])
# [1, 2, 'a', 'b', [1, 2, 3], (1, 2, 3)]
This fails if there are elements such as "b1", which eval treats as an undefined name. Also, eval executes whatever it is given, so do not use it on untrusted input.

List elements disappeared when I used a lambda expression or assignment

I'm just trying to figure out what is happening in this python code. I was trying to use this answer here, and so I was tinkering with the console and my list elements just vanished.
What I was doing was loading the lines of a file into a list, then trying to remove the newline chars from each element by using the lambda thing from the other answer.
Can anyone help explain why the list became empty?
>>> ================================ RESTART ================================
>>> x = ['a\n','b\n','c\n']
>>> x
['a\n', 'b\n', 'c\n']
>>> x = map(lambda s: s.strip(), x)
>>> x
<map object at 0x00000000035901D0>
>>> y = x
>>> y
<map object at 0x00000000035901D0>
>>> x
<map object at 0x00000000035901D0>
>>> list(x)
['a', 'b', 'c']
>>> x = list(x)
>>> x
[]
>>> y
<map object at 0x00000000035901D0>
>>> list(y)
[]
>>>
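You are using Python 3, so map returns a map object, which is an iterator. You can only iterate over a map object once: the first list(x) consumed it, so the second list(x) returned an empty list, and because y refers to the same map object, list(y) is empty too. So convert it to a list if you need to iterate over the items more than once (or if you need to look items up by index, etc.).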
Use
x = list(map(lambda s: s.strip(), x))
or better - a list comprehension
x = [s.strip() for s in x]
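The one-shot behaviour is easy to reproduce in a few lines. This sketch assumes Python 3 (in Python 2, map already returns a list) and uses str.strip directly in place of the lambda.

```python
lines = ['a\n', 'b\n', 'c\n']

stripped = map(str.strip, lines)   # lazy iterator, nothing computed yet
alias = stripped                   # a second name for the same iterator

first_pass = list(stripped)        # consumes the iterator
second_pass = list(alias)          # nothing left to consume

# Materialising once up front avoids the problem.
safe = [s.strip() for s in lines]

print(first_pass, second_pass, safe)
```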

Converting each element of a list to tuple

I want to convert each element of a list to a tuple, like the following:
l = ['abc','xyz','test']
converted to a list of tuples:
newl = [('abc',),('xyz',),('test',)]
I actually have a dict with keys like these, so I need them in this form for lookups.
You can use a list comprehension:
>>> l = ['abc','xyz','test']
>>> [(x,) for x in l]
[('abc',), ('xyz',), ('test',)]
>>>
Or, if you are on Python 2.x, you could just use zip:
>>> # Python 2.x interpreter
>>> l = ['abc','xyz','test']
>>> zip(l)
[('abc',), ('xyz',), ('test',)]
>>>
However, the previous solution will not work in Python 3.x because zip now returns a zip object. Instead, you would need to explicitly make the results a list by placing them in list:
>>> # Python 3.x interpreter
>>> l = ['abc','xyz','test']
>>> zip(l)
<zip object at 0x020A3170>
>>> list(zip(l))
[('abc',), ('xyz',), ('test',)]
>>>
I personally prefer the list comprehension over this last solution though.
Just do this:
newl = [(i, ) for i in l]
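Since the goal is to look these up in a dict, here is a short sketch of how that might go. The dict contents are hypothetical. Note that the trailing comma is what makes the tuple; ('abc') without it is just a parenthesised string.

```python
l = ['abc', 'xyz', 'test']

# Hypothetical dict keyed by one-element tuples.
lookup = {('abc',): 1, ('xyz',): 2}

keys = [(x,) for x in l]

# Keep only the keys that are actually present in the dict.
found = {k: lookup[k] for k in keys if k in lookup}

not_a_tuple = ('abc')   # parentheses alone do not create a tuple

print(found)
print(type(not_a_tuple).__name__)
```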
