Splitting lists at the commas - python

Currently I have a long list that has elements like this:
['01/01/2013 06:31, long string of characters,Unknown'].
How would I split each element into:
['01/01/2013 06:31], [long string of characters],[Unknown]? Can I even do that?
I tried variable.split(","), but I get "AttributeError: 'list' object has no attribute 'split'".
Here's my code:
def sentiment_analysis():
    f = open('C:\path', 'r')
    write_to_list = f.readlines()
    write_to_list = map(lambda write_to_list: write_to_list.strip(), write_to_list)
    [e.split(',') for e in write_to_list]
    print write_to_list[0:2]
    f.close()
    return
I'm still not getting it, I'd appreciate any help!

Solution
You are given this:
['01/01/2013 06:31, long string of characters,Unknown']
Alright. If you know that there is only this one long string in this list, just extract the only element:
>>> x = ['01/01/2013 06:31, long string of characters,Unknown']
>>>
>>> y = x[0].split(",") # extract only element and split by comma
>>> print(y) # list of strings, with one depth
['01/01/2013 06:31', ' long string of characters', 'Unknown']
Now, for whatever reason, you actually want each element of the outer list to be a list with one string in it. That is easy enough to do: simply use map and an anonymous function:
... # continuation from snippet above
...
>>> z = list(map(lambda s: [s], y)) # wraps each elem of y in a list; list() is needed on Python 3, where map is lazy
>>> print(z)
[['01/01/2013 06:31'], [' long string of characters'], ['Unknown']]
There you have it.
One-Liner Conclusion
No list comprehensions, no for loops, no generators. Just really simple functional programming and anonymous functions.
Given original list l,
res = map(lambda s: [s],
l[0].split(","))

List comprehension!
>>> variable = ['01/01/2013 06:31, long string of characters,Unknown']
>>> [x.split(',') for x in variable]
[['01/01/2013 06:31', ' long string of characters', 'Unknown']]
But wait, that's nested more than you wanted...
>>> import itertools
>>> itertools.chain.from_iterable(x.split(',') for x in variable)
<itertools.chain object at 0x109180fd0>
>>> list(itertools.chain.from_iterable(x.split(',') for x in variable))
['01/01/2013 06:31', ' long string of characters', 'Unknown']
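For the file-reading case in the question, a minimal Python 3 sketch might look like the following. It assumes each line holds comma-separated fields with no quoted commas inside them. The function name split_lines and the sample data are hypothetical, and io.StringIO stands in for the real file so the sketch runs on its own.

```python
import io

def split_lines(lines):
    """Split each non-blank line on commas and strip whitespace from every field."""
    return [[field.strip() for field in line.split(',')]
            for line in lines if line.strip()]

# io.StringIO stands in for open(path) so the sketch runs without a real file.
sample = io.StringIO(
    "01/01/2013 06:31, long string of characters,Unknown\n"
    "01/01/2013 06:32, another string,Known\n"
)
rows = split_lines(sample)
print(rows[0])  # ['01/01/2013 06:31', 'long string of characters', 'Unknown']
```

The point is that split is called on each string rather than on the list, and that the result of the comprehension is kept instead of being discarded. If fields can contain quoted commas, the standard csv module is probably the safer choice.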

Related

Python list to string loop

If I want to disassemble a string into a list, do some manipulation with the original decimal values, and then assemble the string from the list, what is the best way?
str = 'abc'
lst = list(str.encode('utf-8'))
for i in lst:
    print (i, chr(int(i+2)))
gives me a table.
But I would like to create instead a presentation like 'abc', 'cde', etc.
Hope this helps
str_ini = 'abc'
lst = list(str_ini.encode('utf-8'))
str_fin = [chr(v+2) for v in lst]
print(''.join(str_fin))
To convert a string into a list of character values (numbers), you can use:
s = 'abc'
vals = [ord(c) for c in s]
This results in vals being the list [97, 98, 99].
To convert it back into a string, you can do:
s2 = ''.join(chr(val) for val in vals)
This will give s2 the value 'abc'.
If you prefer to use map rather than comprehensions, you can equivalently do:
vals = list(map(ord, s))
and:
s2 = ''.join(map(chr, vals))
Also, avoid using the name str for a variable, since it will mask the builtin definition of str.
Use ord on the letters to retrieve their decimal ASCII representation, and then chr to convert them back to characters after manipulating the decimal value. Finally use the str.join method with an empty string to piece the list back together into a str:
s = 'abc'
s_list = [ord(let) for let in s]
s_list = [chr(dec + 2) for dec in s_list]
new_s = ''.join(s_list)
print(new_s) # every character is shifted by 2
Calling .encode on the string converts it to a bytes object instead, which is likely not what you want. Additionally, you don't want to use built-in names such as str for variables, because the variable shadows the built-in and you can no longer reach it in the same scope.
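The ord/chr round trip described in the answers above can be wrapped in a small helper. This is only a sketch: the name shift_text and its offset parameter are made up for illustration, and it assumes every shifted value is still a valid code point.

```python
def shift_text(text, offset):
    """Return text with every character's code point moved by offset."""
    return ''.join(chr(ord(ch) + offset) for ch in text)

shifted = shift_text('abc', 2)
print(shifted)                  # cde
print(shift_text(shifted, -2))  # abc
```

Applying the opposite offset restores the original string, which is a quick way to check the manipulation.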

Can I include multiple statements when creating a one-line for loop?

I have an array I want to iterate through. The array consists of strings consisting of numbers and signs.
like this: €110.5M
I want to loop over it, remove every Euro sign and the M, and return that array with the strings as ints.
How would I do this knowing that the array is a column in a table?
You could just strip the characters,
>>> x = '€110.5M'
>>> x.strip('€M')
'110.5'
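def sanitize_string(ss):
    # Drop currency symbols and lower-case so 'M' and 'm' are treated alike.
    ss = ss.replace('$', '').replace('€', '').lower()
    if 'm' in ss:
        res = float(ss.replace('m', '')) * 1000000
    elif 'k' in ss:
        res = float(ss.replace('k', '')) * 1000
    else:
        res = float(ss)  # no suffix: take the number as is
    return int(res)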
This can be applied to a list as follows:
>>> ls = [sanitize_string(x) for x in ["€3.5M", "€15.7M" , "€167M"]]
>>> ls
[3500000, 15700000, 167000000]
If you want to apply it to the column of a table instead:
dataFrame['price'] = dataFrame['price'].apply(sanitize_string) # Assuming you're using pandas DataFrames and the column is called 'price'
You can use a list comprehension:
numbers = [float(p.replace('€','').replace('M','')) for p in a]
which gives:
[110.5, 210.5, 310.5]
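You can use a list comprehension to construct one list from another:
foo = ["€13.5M", "€15M" , "€167M"]
foo_cleaned = [value.translate(str.maketrans('', '', '€M')) for value in foo]
On Python 3, str.maketrans('', '', '€M') builds a table that maps each of those characters to None, and str.translate deletes every character mapped to None. The Python 2 spelling value.translate(None, "€M") does not work on Python 3 strings.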
Try this
arr = ["€110.5M","€110.5M","€110.5M","€110.5M","€110.5M","€110.5M","€110.5M"]
f = [x.replace("€","").replace("M","") for x in arr]
You can call .replace() on a string as often as you like. An initial solution could be something like this:
my_array = ['€110.5M', '€111.5M', '€112.5M']
my_cleaned_array = []
for elem in my_array:
    my_cleaned_array.append(elem.replace('€', '').replace('M', ''))
At this point, you still have strings in your array. If you want to return them as ints, you can write int(float(elem.replace('€', '').replace('M', ''))) instead; calling int() directly on a string such as '110.5' raises a ValueError. Be aware that you will then lose everything after the decimal point, i.e. you will end up with [110, 111, 112].
You can use Regex to do that.
import re
text = "€110.5M"  # renamed from str so the built-in is not shadowed
x = re.findall(r"-?\d+\.\d+", text)  # raw string keeps the backslashes literal
print(x)
I didn't quite understand the second part of the question.
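Putting the pieces from the answers above together, one possible sketch parses the number and applies the unit suffix in a single step. The names parse_amount and MULTIPLIERS are hypothetical, and the suffix table is an assumption about the data (only K and M are handled).

```python
import re

# Hypothetical suffix table: extend it if the data uses other units.
MULTIPLIERS = {'': 1, 'K': 1_000, 'M': 1_000_000}

def parse_amount(text):
    """Parse a string such as '€110.5M' into an int number of euros."""
    match = re.fullmatch(r'€?(-?\d+(?:\.\d+)?)([KM]?)', text.strip())
    if match is None:
        raise ValueError(f'unrecognised amount: {text!r}')
    number, suffix = match.groups()
    # round() avoids losing a unit to floating-point error before converting.
    return round(float(number) * MULTIPLIERS[suffix])

amounts = [parse_amount(s) for s in ['€110.5M', '€15K', '€167']]
print(amounts)  # [110500000, 15000, 167]
```

Raising on unrecognised input, rather than silently returning a partial result, makes bad rows in the column easier to spot.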

How to parse a string given the following?

I'm new with python, given this list:
a_list=['''('string','string'),...,('string','string'), 'STRING' ''']
How can I drop the quotes, parenthesis, and leaving out 'STRING' in order to get a string like this:
string string ... string
This is what I have already tried:
new_list = ''.join( c for c in ''.join(str(v) for v
in a_list)
if c not in ",'()")
print new_list
I know that the other answer is the better one, but since you wanted to learn more about string techniques rather than rely on existing libraries, this will also work.
Do note that this is a VERY BAD way to solve your problem.
a_list=['''('string','string'),...,('string','string'), 'STRING' ''']
new_list = []
for i in a_list:
    j = i.replace("'",'')
    j = j.replace('(','')
    j = j.replace(')','')
    j = j.replace(',',' ')
    j = j.replace('STRING','')
    j = j.strip()
    new_list.append(j)
print new_list
It will output a list containing the single cleaned string:
['string string ... string string']
If your string doesn't have a literal ..., you can use ast.literal_eval here. That will convert your string into a tuple (whose elements are 2-tuples and strings), since it basically is the string representation of a tuple. After that, it's a simple matter of iterating over the tuple and converting it to the form you want.
>>> import ast
>>> x = '''('string','string'), ('string2','string3'), ('string','string'), 'STRING' '''
>>> y = ast.literal_eval(x); print(y)
(('string', 'string'), ('string2', 'string3'), ('string', 'string'), 'STRING')
>>> ' '.join( ' '.join(elem) if type(elem) is tuple else '' for elem in y )
'string string string2 string3 string string '
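The ast.literal_eval approach can be wrapped in a small function. This is a sketch under a few assumptions: the text contains no literal ..., it has at least two top-level elements (so it evaluates to a tuple of elements), and the name flatten_pairs is made up for illustration.

```python
import ast

def flatten_pairs(text):
    """Join the strings inside every tuple of the literal, skipping bare strings."""
    parsed = ast.literal_eval(text)
    words = []
    for elem in parsed:
        if isinstance(elem, tuple):  # keep the ('a', 'b') pairs
            words.extend(elem)
        # bare strings such as 'STRING' are skipped
    return ' '.join(words)

source = "('string','string'), ('string2','string3'), 'STRING' "
print(flatten_pairs(source))  # string string string2 string3
```

Collecting the words first and joining once avoids the trailing space that the nested join leaves behind.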

Adding same string repeatedly to a list to make it a tuple

I'm trying to combine a string with a series of numbers as tuples to a list.
For example, starting with:
a = [12,23,45,67,89]
string = "John"
I want to turn that into:
tuples = [(12,'John'),(23,'John'),(45,'John'),(67,'John'),(89,'John')]
I tried:
string2 = string * len(a)
tuples = zip(a, string2)
but this returned:
tuples = [(12,'J'), (23,'o'), ...]
If you want to use zip(), then create a list for your string variable before multiplying:
string2 = [string] * len(a)
tuples = zip(a,string2)
string * len(a) creates one long string, and zip() then iterates over that to pull out individual characters. By multiplying a list instead, you get a list with len(a) separate references to the string value; iteration then gives you string each time.
You could also use itertools.repeat() to give you string repeatedly:
from itertools import repeat
tuples = zip(a, repeat(string))
This avoids creating a new list object, potentially quite large.
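On Python 3, zip() returns a lazy iterator rather than a list, so a sketch of the repeat() approach there would wrap the result in list(). The variable names name and pairs are illustrative only.

```python
from itertools import repeat

a = [12, 23, 45, 67, 89]
name = "John"

# zip() stops at the shortest input, so the endless repeat() is safe here.
pairs = list(zip(a, repeat(name)))
print(pairs)
```

Each tuple holds a reference to the same string object, so no copies of the string are made.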
>>> a = [12,23,45,67,89]
>>> string = "John"
>>> my_tuple = [(i,string) for i in a]
>>> print my_tuple
A string is itself iterable, one character at a time, which is why zip paired each number with a single letter in your earlier attempt.

Python List Replace [duplicate]

In Python I know that when you're handling strings you can use "replace" to replace a certain character in a string. For example:
h = "2,3,45,6"
h = h.replace(',','\n')
print h
Returns:
2
3
45
6
Is there any way to do this with a list? For example, replace all the "," in a list with "\n"?
A list like:
h = ["hello","goodbye","how are you"]
And the script should output something like this:
"hello"
"goodbye"
"how are you"
Any suggestions would be helpful!
Judging by your example and what you want, str.join is probably what you are looking for:
>>> h
['2', '3', '45', '6']
>>> print '\n'.join(str(i) for i in h)
2
3
45
6
similarly for your second example
>>> h = ["hello","goodbye","how are you"]
>>> print '\n'.join(str(i) for i in h)
hello
goodbye
how are you
If you really want the quotation marks around strings, you can use the following:
>>> h = ["hello","goodbye","how are you"]
>>> print '\n'.join('"{0}"'.format(i) if isinstance(i,str) else str(i) for i in h)
"hello"
"goodbye"
"how are you"
>>>
You could use list comprehension for that:
>>> search = 'foo'
>>> replace = 'bar'
>>> lst = ['my foo', 'foo', 'bip']
>>> print [x.replace(search, replace) for x in lst]
['my bar', 'bar', 'bip']
In a list like your h = [2,5,6,8,9], there really are no commas to replace in the list itself. The list contains the items 2, 5 and so on, the commas are merely part of the external representation to make it easier to separate the items visually.
So, to generate some output form from the list but without the commas, you can use any number of techniques. For instance, to join them all up into a single string without commas, use:
"".join([str(x) for x in h])
This will evaluate to 25689.
for each in h: print each
In 3.x:
for each in h: print(each)
A list is simply a representation of data. You can only affect the way it looks in the output.
You can replace the ',' in a string because the ',' is part of the string itself. You cannot replace the ',' in a list, because it is not an item in the list; together with the opening and closing square brackets, it is only what Python uses to delimit the items when displaying the list. It is like asking whether you could replace the '"' used to write a string literal. On the other hand, if ',' is itself an item in the list and you want to replace it with a newline item, you can use a list comprehension like:
['\n' if x=="," else x for x in yourlist]
or if you want to print each item on a single line you could use:
for item in yourlist:
    print item
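For Python 3, the join-based answers above could be combined into one helper along these lines. The name one_per_line and its quote_strings flag are hypothetical, chosen only for this sketch.

```python
def one_per_line(items, quote_strings=False):
    """Return the items joined by newlines, optionally double-quoting str items."""
    def render(item):
        if quote_strings and isinstance(item, str):
            return '"{0}"'.format(item)
        return str(item)
    return '\n'.join(render(item) for item in items)

print(one_per_line([2, 3, 45, 6]))
print(one_per_line(["hello", "goodbye", "how are you"], quote_strings=True))
```

Building the string first and printing it once keeps the formatting logic separate from the output, so the same function can also be used to write to a file.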
