If I want to disassemble a string into a list, do some manipulation with the original decimal values, and then assemble the string from the list, what is the best way?
str = 'abc'
lst = list(str.encode('utf-8'))
for i in lst:
    print(i, chr(i + 2))
gives me a table.
But I would like to create instead a presentation like 'abc', 'cde', etc.
Hope this helps
str_ini = 'abc'
lst = list(str_ini.encode('utf-8'))
str_fin = [chr(v+2) for v in lst]
print(''.join(str_fin))
To convert a string into a list of character values (numbers), you can use:
s = 'abc'
vals = [ord(c) for c in s]
This results in vals being the list [97, 98, 99].
To convert it back into a string, you can do:
s2 = ''.join(chr(val) for val in vals)
This will give s2 the value 'abc'.
If you prefer to use map rather than comprehensions, you can equivalently do:
vals = list(map(ord, s))
and:
s2 = ''.join(map(chr, vals))
Also, avoid using the name str for a variable, since it will mask the builtin definition of str.
Use ord on the letters to get their integer code points (these match the decimal ASCII values for ASCII characters), then use chr to convert them back to characters after manipulating the values. Finally, call the str.join method on an empty string to piece the list back together into a str:
s = 'abc'
s_list = [ord(let) for let in s]
s_list = [chr(dec + 2) for dec in s_list]
new_s = ''.join(s_list)
print(new_s) # every character is shifted by 2
Calling .encode on the string converts it to a bytes object instead, which is likely not what you want. Additionally, avoid using built-in names as variable names, because the variable shadows the built-in and you can no longer reach the built-in under that name in the same scope.
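As a minimal sketch of the difference (the helper name shift_text is made up for illustration and does not come from the answers above): ord and chr work on code points, while .encode('utf-8') yields bytes, and the two only coincide for ASCII text.

```python
def shift_text(text, offset):
    """Shift every character's code point by `offset` (hypothetical helper)."""
    return ''.join(chr(ord(ch) + offset) for ch in text)

shifted = shift_text('abc', 2)
print(shifted)  # cde

# For a non-ASCII character, the code point and the UTF-8 bytes differ.
accented = '\u00e9'  # e with acute accent
code_points = [ord(ch) for ch in accented]
utf8_bytes = list(accented.encode('utf-8'))
print(code_points, utf8_bytes)  # [233] [195, 169]
```

If the manipulation is meant to operate on characters, the ord/chr route is usually the safer choice; the bytes route only makes sense when the byte values themselves are what you need.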
Related
I have a list whose elements are strings. Each item contains unwanted characters that I want to remove. For example, I have list = ["string1.", "string2."] and the unwanted character is ".". I don't want that character in any element, so the desired list is list = ["string1", "string2"]. Any help? I have to remove several special characters, so the code must be reusable.
hola = ["holamundoh","holah","holish"]
print(hola[0])
print(hola[0][0])
for i in range(0,len(hola),1):
    for j in range(0,len(hola[i]),1):
        if (hola[i][j] == "h"):
            hola[i] = hola[i].translate({ord('h'): None})
print(hola)
However, I have an error in the conditional if: "string index out of range". Any help? thanks
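Strings in Python are immutable, so every "modification" builds a new string. In your loop, hola[i] is replaced by a shorter string while the inner range was computed from the original length, so later values of j point past the end of the string and raise "string index out of range". Replace the characters with one call per string instead: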
list_ = ["string1.", "string2."]
for i, s in enumerate(list_):
    list_[i] = s.replace('.', '')
Or, without a loop:
list_ = ["string1.", "string2."]
list_ = list(map(lambda s: s.replace('.', ''), list_))
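Applied to the list from the question, the same idea can be sketched as follows. It keeps the asker's translate call but runs it once per string, so no character index is ever needed (the names table and cleaned are illustrative).

```python
hola = ["holamundoh", "holah", "holish"]

# One translation table, reused for every string: map 'h' to None (delete it).
table = {ord("h"): None}

# translate() processes the whole string in a single call, so the string
# never shrinks underneath a running index.
cleaned = [word.translate(table) for word in hola]
print(cleaned)  # ['olamundo', 'ola', 'olis']
```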
You can define the function for removing an unwanted character.
def remove_unwanted(original, unwanted):
    return [x.replace(unwanted, "") for x in original]
Then you can call this function like the following to get the result.
print(remove_unwanted(hola, "h"))
Use str.replace for simple replacements:
lst = [s.replace('.', '') for s in lst]
Or use re.sub for more powerful and more complex regular expression-based replacements:
import re
lst = [re.sub(r'[.]', '', s) for s in lst]
Here are a few examples of more complex replacements that you may find useful, e.g., replace everything that is not a word character:
import re
lst = [re.sub(r'[\W]+', '', s) for s in lst]
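Since the question mentions several special characters, here is a small sketch that removes a whole set of characters in one pass using str.maketrans and str.translate. The function name strip_chars and the sample data are assumptions made for illustration.

```python
def strip_chars(items, unwanted):
    """Return a new list with every character in `unwanted` removed (hypothetical helper)."""
    # The third argument of str.maketrans lists characters to delete.
    table = str.maketrans('', '', unwanted)
    return [s.translate(table) for s in items]

result = strip_chars(["string1.", "string2!?"], ".!?")
print(result)  # ['string1', 'string2']
```

This avoids chaining one replace call per character and, unlike a regular expression, needs no escaping of characters such as "." or "?".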
I started with Python not long ago in order to develop my OCR project. I want the software to detect the character "A" and convert it to a set of integers like 101.
list=['haha', 'haaa']
I am thinking of using a dictionary of keys and values to do the replacement. I defined a function for the process, using a method I found in another post, but it doesn't work.
Dictionary={'a':101,'h':111}
for a,b in Dictionary.items():
    list = list.replace(a.lower(),b)
print (list)
First, make sure your list variable is not named list, as this is the name of a built-in type in Python. Then loop through the items and replace each key with its value, like so:
l = ['haha', 'haaa']
refDict = {'a':101,'h':111}
for i, item in enumerate(l):
    for key in refDict:
        item = item.replace(key, str(refDict[key]))
    l[i] = item
Output after this code:
['111101111101', '111101101101']
Never use list as a variable name, since it is already a Python built-in.
One can use this:
l = ['haha', 'haaa']
conv_dict = {'a':101, 'h':111}
for j, ele in enumerate(l):
    ele = list(ele)
    for i, char in enumerate(ele):
        ele[i] = conv_dict[char.lower()]
    l[j] = int(''.join(map(str, ele)))
print(l)
>> [111101111101, 111101101101]
This is not a robust solution, since every character must be present in conv_dict to be converted to an int; any other character raises a KeyError.
How it works:
Go over each word in the list
Convert string to list, with each char as element
Go over each character
Replace character with integer
Join the integers to one string and then convert it back to integer
Repeat for every string in list
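One possible way to make this more tolerant, sketched under the assumption that unmapped characters should simply be kept as they are (the question does not say what should happen to them), is to use dict.get with a fallback. The name convert_word is hypothetical.

```python
def convert_word(word, mapping):
    """Replace each mapped character with its number; keep unmapped ones (assumed behavior)."""
    # mapping.get(key, default) returns the default instead of raising KeyError.
    return ''.join(str(mapping.get(ch.lower(), ch)) for ch in word)

conv_dict = {'a': 101, 'h': 111}
converted = [convert_word(w, conv_dict) for w in ['haha', 'Haha!']]
print(converted)  # ['111101111101', '111101111101!']
```

Note that the result stays a string here, because a word containing an unmapped character can no longer be turned into a single int.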
I'm not sure what output you're expecting, but it sounds like you want each character in the strings to be replaced by its corresponding value in the dictionary.
For each element of lst in the outer loop, an empty string ans is initialized. The nested loop then iterates through every character and concatenates the dictionary value for that character onto ans. The finished string is appended to output.
Dictionary={'a':101,'h':111}
lst=['haha', 'haaa']
output = []
for i in lst:
    ans = ""
    for j in i:
        ans+=str(Dictionary[j])
    output.append(ans)
print(output)
Output
['111101111101', '111101101101']
It sounds to me like you do not need to map the characters to specific integers, just to any unique integers. I would recommend not creating your own dictionary and instead using the standardized ASCII mappings for characters (https://www.asciitable.com/). Python has a built-in function, ord, for converting a character to that value.
Here is what that might look like (as others have pointed out, you also shouldn't use list as a variable name):
words = ['haha', 'haaa']
conversions = []
for word in words:
    converted_word = []
    for letter in word:
        converted_word.append(ord(letter))
    conversions.append(converted_word)
print(conversions)
This prints:
[[104, 97, 104, 97], [104, 97, 97, 97]]
How about str.translate?
lst = ['haha', 'haaa']
table = {ord('a'): '101', ord('h'): '111'}
lst = [s.translate(table) for s in lst]
print(lst)
Output:
['111101111101', '111101101101']
I have an array I want to iterate through. The array consists of strings consisting of numbers and signs.
like this: €110.5M
I want to loop over it and remove all Euro sign and also the M and return that array with the strings as ints.
How would I do this knowing that the array is a column in a table?
You could just strip the characters,
>>> x = '€110.5M'
>>> x.strip('€M')
'110.5'
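One caveat worth illustrating: str.strip only removes the listed characters from the two ends of the string, not from the middle. A quick sketch (the second input is an invented example):

```python
price = '€110.5M'
trimmed = price.strip('€M')
print(trimmed)  # 110.5

# Characters in the middle are left alone; only the ends are trimmed.
inner = '€1M10.5M'.strip('€M')
print(inner)  # 1M10.5
```

For values shaped like the ones in the question this is enough; use replace or translate if the characters can also occur inside the string.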
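def sanitize_string(ss):
    # Drop currency symbols and normalize the suffix to lowercase
    ss = ss.replace('$', '').replace('€', '').lower()
    if 'm' in ss:
        res = float(ss.replace('m', '')) * 1000000  # millions
    elif 'k' in ss:
        res = float(ss.replace('k', '')) * 1000  # thousands
    else:
        res = float(ss)  # no suffix: plain number
    return int(round(res))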
This can be applied to a list as follows:
>>> ls = [sanitize_string(x) for x in ["€3.5M", "€15.7M" , "€167M"]]
>>> ls
[3500000, 15700000, 167000000]
If you want to apply it to the column of a table instead:
dataFrame['price'] = dataFrame['price'].apply(sanitize_string) # Assuming you're using a pandas DataFrame and the column is called 'price'
You can use a list comprehension (assuming a is your list of strings):
numbers = [float(p.replace('€','').replace('M','')) for p in a]
which gives:
[110.5, 210.5, 310.5]
You can use a list comprehension to construct one list from another:
foo = ["€13.5M", "€15M" , "€167M"]
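table = str.maketrans('', '', "€M")
foo_cleaned = [value.translate(table) for value in foo]
In Python 3, str.maketrans('', '', "€M") builds a table that maps each character in its third argument to None, and str.translate then deletes every occurrence of those characters.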
Try this
arr = ["€110.5M","€110.5M","€110.5M","€110.5M","€110.5M","€110.5M","€110.5M"]
f = [x.replace("€","").replace("M","") for x in arr]
You can call .replace() on a string as often as you like. An initial solution could be something like this:
my_array = ['€110.5M', '€111.5M', '€112.5M']
my_cleaned_array = []
for elem in my_array:
my_cleaned_array.append(elem.replace('€', '').replace('M', ''))
At this point, you still have strings in your array. If you want ints, note that int() cannot parse a string such as '110.5' directly and raises a ValueError; convert through float first, i.e. int(float(elem.replace('€', '').replace('M', ''))). But be aware that you will then lose everything after the decimal point, i.e. you will end up with [110, 111, 112].
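A short sketch of the integer conversion, going through float because int() does not accept a string like '110.5'. The helper name to_int is made up for illustration.

```python
def to_int(text):
    """Strip the euro sign and the M suffix, then truncate to an int (hypothetical helper)."""
    cleaned = text.replace('€', '').replace('M', '')
    return int(float(cleaned))  # int() on a float truncates toward zero

my_array = ['€110.5M', '€111.5M', '€112.5M']
as_ints = [to_int(elem) for elem in my_array]
print(as_ints)  # [110, 111, 112]

# Calling int() directly on the decimal string fails:
try:
    int('110.5')
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

If the fractional part matters, keeping the values as float (or scaling by the suffix before converting) is probably the better choice.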
You can use a regular expression to do that.
import re
text = "€110.5M"
x = re.findall(r"-?\d+(?:\.\d+)?", text)  # optional sign, digits, optional decimal part
print(x)
I didn't quite understand the second part of the question.
I have a line of code which is:
D = {'h' : 'hh' , 'e' : 'ee'}
str = 'hello'
data = ''.join(map(lambda x:D.get(x,x),str))
print(data)
This gives the output hheello.
I am trying to understand how the map function works here. Does map take each character of the string, compare it with the dictionary keys, and give back the corresponding value?
How does it handle each character? There is no explicit iteration. Is there a good example to understand this better?
There is no loop because map requires an "iterable" (i.e. an object on which you can do an iteration) and does the loop itself.
map, if not present natively, could be implemented as:
def map(f, it):
    return [f(x) for x in it]
or, even more explicitly, as:
def map(f, it):
    result = []
    for x in it:
        result.append(f(x))
    return result
In Python a string is an iterable, and on iteration loops over the characters in the string. For example
map(ord, "hello")
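returns (as a list in Python 2; in Python 3, wrap the call in list(...), because map returns a lazy iterator)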
[104, 101, 108, 108, 111]
because those are the character codes for the chars in the string.
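In Python 3 the built-in map is lazy: it returns an iterator and only calls the function when a value is requested. A minimal sketch using the dictionary from the question (the variable names are illustrative):

```python
D = {'h': 'hh', 'e': 'ee'}

# Nothing is computed yet; `mapped` is an iterator over the characters of 'hello'.
mapped = map(lambda ch: D.get(ch, ch), 'hello')

first = next(mapped)   # the lambda runs for 'h' only now
rest = list(mapped)    # consumes the remaining characters
print(first, rest)     # hh ['ee', 'l', 'l', 'o']

# str.join consumes the whole iterator in one go.
joined = ''.join(map(lambda ch: D.get(ch, ch), 'hello'))
print(joined)  # hheello
```

So the loop still happens; it is driven by whoever consumes the iterator, here next, list, or join.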
Map just applies the function to each item of a list. E.g.,
map(lambda x: 10*x, [1,2,3,4])
gives (as a list in Python 2, or after wrapping the call in list(...) in Python 3)
[10, 20, 30, 40]
It takes individual elements of str. Following is the readable code for the same implementation:
D = { 'h' : 'hh' , 'e' : 'ee'}
str = 'hello'
returns = []  # list for storing the return values from the function
def myLambda(x):  # does what the lambda does
    return D.get(x, x)
for x in str:  # map passes each item of the iterable to the function
    returns.append(myLambda(x))  # look up each character in the dictionary and append the result
print(''.join(returns))  # join the pieces to show the result
Generally speaking, a map operation works like this:
MAP (f,L) returns L'
Input:
L is a list of n elements [ e1 , e2 , ... , en ]
f is a function
Output
L' is the list L after the application of f to each element individually: [ f(e1) , f(e2) , ... , f(en) ]
So, in your case, the join operation, which accepts any iterable of strings, starts with the empty string and concatenates each element e obtained in the following way:
Take a character x from str; return D.get(x,x)
Note that the above (which describes the map operation) gives you 'hh' and 'ee' for the inputs 'h' and 'e' respectively, while leaving all other characters as they are.
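The description above can be sketched in code by writing the same operation twice: once with map and once with an equivalent generator expression. The variable names are chosen for illustration, and s is used instead of str to avoid shadowing the built-in.

```python
D = {'h': 'hh', 'e': 'ee'}
s = 'hello'

# MAP(f, L): apply f to each character, then join the results.
via_map = ''.join(map(lambda x: D.get(x, x), s))

# The same thing spelled out as a generator expression.
via_genexp = ''.join(D.get(x, x) for x in s)

print(via_map, via_genexp)  # hheello hheello
```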
Since str is a string, map() applies the function (a lambda in this case) to every character of the string. map() works on any iterable, so it works on a string because a string is a sequence.
Try this so you get the idea:
str2 = "123"
print(list(map(int, str2)))  # list() is needed in Python 3, where map returns an iterator
[1, 2, 3]
In this case you are converting each character in str2 to an int:
int("1") -> 1
int("2") -> 2
int("3") -> 3
and return them in a list:
[1, 2, 3]
Note: Don't use Python built-in names as names of variables. Don't use str as the name of a variable, because you are hiding the built-in. Use str1, my_str, or s instead.
I'm trying to combine a string with a series of numbers as tuples to a list.
For example, starting with:
a = [12,23,45,67,89]
string = "John"
I want to turn that into:
tuples = [(12,'John'),(23,'John'),(45,'John'),(67,'John'),(89,'John')]
I tried:
string2 = string * len(a)
tuples = zip(a, string2)
but this returned:
tuples = [(12,'J'), (23,'o'), ...]
If you want to use zip(), then create a list for your string variable before multiplying:
string2 = [string] * len(a)
tuples = list(zip(a, string2))  # list() is needed in Python 3, where zip returns an iterator
string * len(a) creates one long string, and zip() then iterates over that to pull out individual characters. By multiplying a list instead, you get a list with len(a) separate references to the string value; iteration then gives you string each time.
You could also use itertools.repeat() to give you string repeatedly:
from itertools import repeat
tuples = list(zip(a, repeat(string)))  # zip stops when a is exhausted
This avoids creating a new list object, potentially quite large.
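For reference, a self-contained Python 3 sketch of the repeat approach next to the original attempt. The variable name is changed from string to name, and zip is wrapped in list() because it returns an iterator in Python 3.

```python
from itertools import repeat

a = [12, 23, 45, 67, 89]
name = "John"

# repeat(name) yields "John" forever; zip stops once `a` runs out.
pairs = list(zip(a, repeat(name)))
print(pairs)
# [(12, 'John'), (23, 'John'), (45, 'John'), (67, 'John'), (89, 'John')]

# The original attempt zips against one long string, so it pairs
# each number with a single character.
wrong = list(zip(a, name * len(a)))
print(wrong[:2])  # [(12, 'J'), (23, 'o')]
```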
>>> a = [12,23,45,67,89]
>>> string = "John"
>>> my_tuple = [(i,string) for i in a]
>>> print(my_tuple)
A string is iterable character by character, which is why zip produced the behavior you were seeing previously.