How to parse a string given the following? - python

I'm new to Python. Given this list:
a_list=['''('string','string'),...,('string','string'), 'STRING' ''']
How can I drop the quotes and parentheses and leave out 'STRING', in order to get a string like this:
string string ... string
This is what I already tried:
new_list = ''.join(c for c in ''.join(str(v) for v in a_list)
                   if c not in ",'()")
print(new_list)

I know that the other answer is the perfect one, but since you wanted to learn more about string techniques rather than use the existing libraries, this will also work.
Do note that this is a VERY BAD way to solve your problem.
a_list=['''('string','string'),...,('string','string'), 'STRING' ''']
new_list = []
for i in a_list:
    j = i.replace("'",'')
    j = j.replace('(','')
    j = j.replace(')','')
    j = j.replace(',',' ')
    j = j.replace('STRING','')
    j = j.strip()
    new_list.append(j)
print(new_list)
It will output
['string string ... string string']
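A middle ground between chained replace calls and a full parser is to pull out only the quoted tokens with a regular expression. This is a minimal sketch, not the asker's or any answerer's method; the function name extract_words and its skip parameter are made up for illustration, and it assumes the tokens never contain a single quote themselves.

```python
import re

def extract_words(raw, skip=("STRING",)):
    """Join the single-quoted tokens found in raw, leaving out any listed in skip."""
    # Capture whatever sits between each pair of single quotes.
    tokens = re.findall(r"'([^']*)'", raw)
    return " ".join(t for t in tokens if t not in skip)

a_list = ['''('string','string'),...,('string','string'), 'STRING' ''']
print([extract_words(item) for item in a_list])
```

Because only quoted text is captured, an unquoted literal ... in the input is silently dropped rather than copied into the result.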

If your string doesn't have a literal ..., you can use ast.literal_eval here. That will convert your string into a tuple (whose elements are 2-tuples and strings), since it basically is the string representation of a tuple. After that, it's a simple matter of iterating over the tuple and converting it to the form you want.
>>> import ast
>>> x = '''('string','string'), ('string2','string3'), ('string','string'), 'STRING' '''
>>> y = ast.literal_eval(x); print(y)
(('string', 'string'), ('string2', 'string3'), ('string', 'string'), 'STRING')
>>> ' '.join( ' '.join(elem) if type(elem) is tuple else '' for elem in y )
'string string string2 string3 string string '
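If the trailing space in that result is a problem, one option is to collect only the tuple elements before joining. The following is a rough sketch under the same assumption as above (the input is a valid tuple literal with no literal ...); the name flatten_pairs is hypothetical.

```python
import ast

def flatten_pairs(raw):
    """Parse a tuple literal and join the strings found inside its nested tuples."""
    parsed = ast.literal_eval(raw)
    words = []
    for elem in parsed:
        # Bare strings such as 'STRING' are skipped; only tuples contribute words.
        if isinstance(elem, tuple):
            words.extend(elem)
    return " ".join(words)

x = "('string','string'), ('string2','string3'), ('string','string'), 'STRING' "
print(flatten_pairs(x))
```

Skipping the non-tuple elements, instead of mapping them to empty strings, is what avoids the extra separator at the end.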

Related

How to replace a character within a string in a list?

I have a list that has some elements of type string. Each item in the list contains unwanted characters that I want to remove. For example, I have the list = ["string1.", "string2."]. The unwanted character is ".", so I don't want it in any element of the list. My desired list should look like list = ["string1", "string2"]. Any help? I have to remove several special characters, so the code must be reusable.
hola = ["holamundoh","holah","holish"]
print(hola[0])
print(hola[0][0])
for i in range(0,len(hola),1):
    for j in range(0,len(hola[i]),1):
        if (hola[i][j] == "h"):
            hola[i] = hola[i].translate({ord('h'): None})
print(hola)
However, I have an error in the conditional if: "string index out of range". Any help? thanks
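Strings in Python are immutable, so every "modification" builds a new string. In your code the inner range is computed from the original length of hola[i], but translate removes every "h" at once and shortens the string, so later values of j point past its end; that is where "string index out of range" comes from. Replace the characters in each element directly instead: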
list_ = ["string1.", "string2."]
for i, s in enumerate(list_):
    list_[i] = s.replace('.', '')
Or, without a loop:
list_ = ["string1.", "string2."]
list_ = list(map(lambda s: s.replace('.', ''), list_))
You can define the function for removing an unwanted character.
def remove_unwanted(original, unwanted):
    return [x.replace(unwanted, "") for x in original]
Then you can call this function like the following to get the result.
print(remove_unwanted(hola, "."))
Use str.replace for simple replacements:
lst = [s.replace('.', '') for s in lst]
Or use re.sub for more powerful and more complex regular expression-based replacements:
import re
lst = [re.sub(r'[.]', '', s) for s in lst]
Here are a few examples of more complex replacements that you may find useful, e.g., replace everything that is not a word character:
import re
lst = [re.sub(r'[\W]+', '', s) for s in lst]
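Since the question mentions removing several special characters, a translation table may also be worth a look: str.maketrans with a third argument builds a table that deletes every listed character in one pass. This is only a sketch; the helper name remove_chars is invented for the example.

```python
def remove_chars(items, unwanted):
    """Return a new list with every character in unwanted deleted from each item."""
    # The third argument to str.maketrans lists characters to delete.
    table = str.maketrans("", "", unwanted)
    return [s.translate(table) for s in items]

print(remove_chars(["string1.", "string2."], "."))
print(remove_chars(["holamundoh", "holah", "holish"], "h"))
```

The table is built once and reused for every element, so adding more unwanted characters only means extending the unwanted string.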

Python list to string loop

If I want to disassemble a string into a list, do some manipulation with the original decimal values, and then assemble the string from the list, what is the best way?
str = 'abc'
lst = list(str.encode('utf-8'))
for i in lst:
    print (i, chr(int(i+2)))
gives me a table.
But I would like to create instead a presentation like 'abc', 'cde', etc.
Hope this helps
str_ini = 'abc'
lst = list(str_ini.encode('utf-8'))
str_fin = [chr(v+2) for v in lst]
print(''.join(str_fin))
To convert a string into a list of character values (numbers), you can use:
s = 'abc'
vals = [ord(c) for c in s]
This results in vals being the list [97, 98, 99].
To convert it back into a string, you can do:
s2 = ''.join(chr(val) for val in vals)
This will give s2 the value 'abc'.
If you prefer to use map rather than comprehensions, you can equivalently do:
vals = list(map(ord, s))
and:
s2 = ''.join(map(chr, vals))
Also, avoid using the name str for a variable, since it will mask the builtin definition of str.
Use ord on the letters to retrieve their integer code points (which match the decimal ASCII values for plain letters like these), and then chr to convert them back to characters after manipulating the values. Finally, use the str.join method with an empty string to piece the list back together into a str:
s = 'abc'
s_list = [ord(let) for let in s]
s_list = [chr(dec + 2) for dec in s_list]
new_s = ''.join(s_list)
print(new_s) # every character is shifted by 2
Calling .encode on the string converts it to a bytes object instead, which is likely not what you want. Additionally, avoid using built-in names such as str for variables: the name then shadows the built-in, which you can no longer use in the same scope.
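To get the 'abc', 'cde', ... presentation the question asks for, the ord/chr round trip can be wrapped in a small function and called with different offsets. This is a sketch only; shift_string is a hypothetical name, and it does no wrap-around, so large offsets will leave the alphabet.

```python
def shift_string(text, offset):
    """Shift every character's code point by offset and return the new string."""
    return "".join(chr(ord(ch) + offset) for ch in text)

# Print the original and two shifted versions.
for offset in range(0, 5, 2):
    print(repr(shift_string("abc", offset)))
```

A negative offset reverses the shift, so shift_string(shift_string(s, 2), -2) gives back s.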

How do I change all strings in the format ' "string" ' or " 'string' " to simply be 'string'?

I think that I am getting a key error when accessing a dictionary because some of the keys in the dictionary seem to be strings of strings. Is there a way to strip these to simply be strings?
I was thinking of using a list comprehension but am wondering if there is a better way:
x = ' "string" '
x = [i for i in x if i not in ["'", '"']]
x = ''.join(x)
And now x = 'string'
x = ' "string" '
x = x.replace('"', '').replace("'", '').strip()
print(x)
Output
string
strip may be useful:
x = x.strip('"\' ')
s.strip([chars]) removes any of the given characters from the start and end of the string and returns the result; characters in the middle are left alone.
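Since the underlying problem was a KeyError on dictionary keys, the same strip call could be applied to every key while rebuilding the dictionary. This is a sketch under the assumption that the stray quotes and spaces only appear at the ends of the keys; clean_keys and its junk parameter are made-up names.

```python
def clean_keys(mapping, junk="\"' "):
    """Return a copy of mapping with quotes and spaces stripped from the ends of string keys."""
    cleaned = {}
    for key, value in mapping.items():
        # Non-string keys are passed through unchanged.
        new_key = key.strip(junk) if isinstance(key, str) else key
        cleaned[new_key] = value
    return cleaned

data = {' "alpha" ': 1, "'beta'": 2, "gamma": 3}
print(clean_keys(data))
```

If two keys collapse to the same cleaned name, the later one wins, so it may be worth checking for collisions on real data.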
You can filter the result, keeping only the alphabetic characters (note that this also drops digits and spaces):
print("".join(list(filter(lambda _:_.isalpha(),x))))
output:
string
or list comprehension:
print("".join([i for i in x if i.isalpha()]))
output:
string

Splitting lists at the commas

Currently I have a long list that has elements like this:
['01/01/2013 06:31, long string of characters,Unknown'].
How would I split each element into:
['01/01/2013 06:31], [long string of characters],[Unknown]? Can I even do that?
I tried variable.split(","), but I get "AttributeError: 'list' object has no attribute 'split'".
Here's my code:
def sentiment_analysis():
    f = open('C:\path', 'r')
    write_to_list = f.readlines()
    write_to_list = map(lambda write_to_list: write_to_list.strip(), write_to_list)
    [e.split(',') for e in write_to_list]
    print write_to_list[0:2]
    f.close()
    return
I'm still not getting it, I'd appreciate any help!
Solution
You are given this:
['01/01/2013 06:31, long string of characters,Unknown']
Alright. If you know that there is only this one long string in this list, just extract the only element:
>>> x = ['01/01/2013 06:31, long string of characters,Unknown']
>>>
>>> y = x[0].split(",") # extract only element and split by comma
>>> print(y) # list of strings, with one depth
['01/01/2013 06:31', ' long string of characters', 'Unknown']
Now, for whatever reason, you actually want each element of the outer list to be a list with one string in it. That is easy enough to do: simply use map and an anonymous function:
... # continuation from snippet above
...
>>> z = list(map(lambda s: [s], y)) # wraps each elem of y in a list (list() is needed on Python 3)
>>> print(z)
[['01/01/2013 06:31'], [' long string of characters'], ['Unknown']]
There you have it.
One-Liner Conclusion
No list comprehensions, no for loops, no generators. Just really simple functional programming and anonymous functions.
Given original list l,
res = list(map(lambda s: [s],
               l[0].split(",")))
List comprehension!
>>> variable = ['01/01/2013 06:31, long string of characters,Unknown']
>>> [x.split(',') for x in variable]
[['01/01/2013 06:31', ' long string of characters', 'Unknown']]
But wait, that's nested more than you wanted...
>>> import itertools
>>> itertools.chain.from_iterable(x.split(',') for x in variable)
<itertools.chain object at 0x109180fd0>
>>> list(itertools.chain.from_iterable(x.split(',') for x in variable))
['01/01/2013 06:31', ' long string of characters', 'Unknown']
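In the question's own function the list comprehension result is thrown away, which is why the printed list never changes. A minimal sketch of keeping the result, and trimming the stray spaces and newlines around each field at the same time, might look like this; split_records is a hypothetical helper, and it assumes no field contains a comma of its own.

```python
def split_records(lines):
    """Split each comma-separated line into a list of whitespace-stripped fields."""
    return [[field.strip() for field in line.split(",")] for line in lines]

# A line as readlines() would return it, trailing newline included.
rows = split_records(["01/01/2013 06:31, long string of characters,Unknown\n"])
print(rows)
```

If the long text field can itself contain commas, the standard csv module is probably a safer choice than a plain split.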

Python: How can i avoid specific word in string from lowercasing while using str.lower

I am new to Python.
I have a list of strings as follows:
mylist=["$(ProjectDir)Dir1\Dest1","$(OutDir)Dir2\Dest2","$(IntDir)Dir2\Dest2"]
I want to lowercase each list item as follows:
mylist=["$(ProjectDir)dir1\dest1","$(OutDir)dir2\dest2","$(IntDir)dir2\dest2"]
i.e. I want to prevent $(ProjectDir), $(OutDir), and $(IntDir) from being lowercased.
The idea is very simple. You split the string with a regular expression describing parts that are not to be converted, then convert only its even parts, then join them back.
>>> import re
>>> mylist=[r"$(ProjectDir)Dir1\Dest1",r"$(OutDir)Dir2\Dest2",r"$(IntDir)Dir2\Dest2"]
>>> print(["".join([s if i%2 else s.lower() for (i,s) in enumerate(re.split(r'(\$\([^)]*\))', x))]) for x in mylist])
['$(ProjectDir)dir1\\dest1', '$(OutDir)dir2\\dest2', '$(IntDir)dir2\\dest2']
The main thing here is:
[ "".join([
      s if i%2 else s.lower()
      for (i,s) in enumerate(re.split(r'(\$\([^)]*\))', x))])
  for x in mylist ]
You go through the list mylist and for every x produce its modified version:
[ ... for x in mylist ]
You convert every x using this operation:
"".join([
    s if i%2 else s.lower()
    for (i,s) in enumerate(re.split(r'(\$\([^)]*\))', x))])
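That means: split the string into parts that must be converted (even indices) and parts that must not be converted (odd indices). Because the pattern is wrapped in a capturing group, re.split keeps the matched $(...) pieces in the result, and they always land at odd indices.
For example:
>>> x = mylist[0]
>>> re.split(r'(\$\([^)]*\))', x)
['', '$(ProjectDir)', 'Dir1\\Dest1']
and then enumerate them and convert all even parts:
>>> print(list(enumerate(re.split(r'(\$\([^)]*\))', x))))
[(0, ''), (1, '$(ProjectDir)'), (2, 'Dir1\\Dest1')]
Whether a part is even or odd is checked with this conditional expression:
s if i%2 else s.lower()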
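The same split-and-rejoin idea can be packaged as a reusable function, which may be easier to read than the nested comprehension. This is only a sketch of the technique described above; the names MACRO and lower_except_macros are invented for the example, and raw strings are used so the backslashes need no escaping.

```python
import re

# The capturing group makes re.split keep the $(...) pieces in its result.
MACRO = re.compile(r"(\$\([^)]*\))")

def lower_except_macros(text):
    """Lowercase text but leave every $(...) macro untouched."""
    parts = MACRO.split(text)
    # Odd indices hold the captured macros; even indices hold the rest.
    return "".join(p if i % 2 else p.lower() for i, p in enumerate(parts))

mylist = [r"$(ProjectDir)Dir1\Dest1", r"$(OutDir)Dir2\Dest2", r"$(IntDir)Dir2\Dest2"]
print([lower_except_macros(s) for s in mylist])
```

Unlike a fixed exclusion list, this version protects any $(...) macro, including ones that were not known in advance.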
If you are allergic to regular expressions...
exclusions = ['$(ProjectDir)', '$(OutDir)', '$(IntDir)']
mylist = [r"$(ProjectDir)Dir1\Dest1", r"$(OutDir)Dir2\Dest2", r"$(IntDir)Dir2\Dest2"]
## Lower case everything
mylist = [s.lower() for s in mylist]
## Revert the exclusions
for patt in exclusions:
    mylist = [s.replace(patt.lower(), patt) for s in mylist]
print(mylist)
