Suppose I have a list ['a', '1','student'] in Python
I am iterating through this list and want to check which item in the list is numeric.
I have tried all(item.isdigit()) and type(item) == str, but I get an error.
Note: the numeric values in the list are enclosed in quotes, so they are identified as strings.
How do I get past that?
I expect to identify which items in the list are numeric and which are alphabetical. The challenge is that the numeric values are enclosed in quotes, which makes them strings.
If you are after an array of bools, you can use:
lst = ['a', '1', 'student']
y = [x.isdigit() for x in lst]
>>> y
[False, True, False]
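Here x.isdigit() returns True if the string x is non-empty and consists only of digit characters. You can also use [x.isnumeric() for x in lst], which additionally accepts other Unicode numeric characters such as '½'. Neither method accepts a sign or a decimal point, so '-1' and '1.5' give False with both.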
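As a quick check of how these string methods differ, the sketch below compares isdigit(), isnumeric() and isdecimal() on a few strings. The sample values and the names samples, digit_flags and numeric_flags are illustrative assumptions, not part of the question.

```python
# Compare the three "is it a number?" string methods on illustrative inputs.
samples = ['1', '42', '1.5', '-1', '\u00bd', '\u00b2', 'a', '']  # \u00bd is ½, \u00b2 is ²

for s in samples:
    print(repr(s), s.isdigit(), s.isnumeric(), s.isdecimal())

# Only unsigned, all-digit strings pass isdigit(); a sign or a decimal
# point makes all three methods return False.
digit_flags = [s.isdigit() for s in samples]
numeric_flags = [s.isnumeric() for s in samples]
```

If signed or fractional values such as '-1' or '1.5' must count as numeric, these methods are not enough and a conversion attempt with int() or float() is the usual alternative.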
Try isdigit() instead:
l = ['a', '1', 'student']
for item in l:
    if item.isdigit():
        print(f'{item} is a digit')
    elif item.isalpha():
        print(f'{item} is a letter')
In the example you gave, extracting a list containing all the numbers is done by:
myList = ['a', '1','student']
onlyNumbers = [x for x in myList if x.isdigit()]
print(onlyNumbers)
An important distinction to make is the difference between a "number" type (int, float or Decimal) and a string that represents the number. If you had a list containing items of both string and integer types, then you could extract the items by altering the previous example like so:
myList = ['a', 1,'student']
onlyNumbers = [x for x in myList if isinstance(x,int)]
print(onlyNumbers)
Note how the 1 is no longer quoted. This means that it is an integer type.
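One caveat with isinstance(x, int) is that bool is a subclass of int, so True and False also pass the check. A minimal sketch of excluding them follows; the list mixed and the result names are made up for illustration.

```python
# A mixed list; the values are illustrative assumptions.
mixed = ['a', 1, 'student', 2.5, True, '3']

# bool is a subclass of int, so isinstance(True, int) is True.
# Exclude bools explicitly if they should not count as numbers.
only_ints = [x for x in mixed
             if isinstance(x, int) and not isinstance(x, bool)]
only_numbers = [x for x in mixed
                if isinstance(x, (int, float)) and not isinstance(x, bool)]

print(only_ints)
print(only_numbers)
```

The quoted '3' is still a string here, so neither list includes it.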
Here are some recommendations:
all(iterable)
returns a single value indicating whether ALL values in an iterable are true. It takes only the iterable (there is no condition argument) and does not extract the values which are true, which is a misconception you appear to have.
Instead of type(x)==str, you can do isinstance(x,str) which is slightly cleaner.
To extract numbers from strings, you can use regular expressions. Alternatively, it is easier to attempt to convert the string to that number type:
anInteger = int("13")
aFloat = float("1")+float("1.2")
aComplex = complex("1+1j")
This is an example to show it working in full. It relies on Python's numeric types raising an exception when a bad representation of that type is passed to the type's constructor.
def isNumberType(repStr: str, numType):
    # True if numType can be constructed from repStr, False otherwise
    try:
        numType(repStr)
        return True
    except (ValueError, TypeError):
        return False
x = ["12.32","12","ewa"]
onlyIntegers = [int(val) for val in x if isNumberType(val,int)]
print(onlyIntegers)
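To illustrate the point about all() made above: it reduces an iterable to a single bool, whereas a comprehension with a condition does the extraction. This is a small sketch on the question's list; the variable names are hypothetical.

```python
items = ['a', '1', 'student']

# all() answers one yes/no question about the whole list.
every_item_is_digit = all(item.isdigit() for item in items)

# any() is the counterpart: is at least one item numeric?
some_item_is_digit = any(item.isdigit() for item in items)

# A comprehension with a condition extracts the matching items.
digits = [item for item in items if item.isdigit()]
letters = [item for item in items if item.isalpha()]

print(every_item_is_digit, some_item_is_digit, digits, letters)
```

Calling all(item.isdigit()) on a single item fails because item.isdigit() is a bool, and a bool is not iterable.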
Why does list(str) behave as a string here when [str] doesn't?
Is there a difference between these methods?
Before someone marks this as a duplicate, do link the answer, because I've spent a fair bit of time trawling through Stack Overflow!
code
x = 'ar'
'a' in list(x)
#True
'a' in [x]
#False
l = list(x)
'a' in l
#True
type(list(x))
#list
type([x])
#list
This is because list() converts the string to a list where each letter is one element, whereas [] creates a list whose elements are the things inside the brackets. In other words, list() converts the string into a list, while [] just puts the string in a list.
You can use debug output for clarifying such things. Like this:
x = 'ar'
print(list(x))
print([x])
Prints this:
['a', 'r']
['ar']
Then let's think logically. list(x) is a constructor of a list from the string, it creates a list of all characters of a given string. And [x] just creates a list with one item: x.
Because you are asking whether the element 'a' is in the list, and it is not: your only element is 'ar'. If you print([x]), the result is ['ar'].
[x] creates a single-element list, where the element is x. So if x = 'ar', then the resulting list is ['ar'].
list(x) builds a new list from the items of x. This works on any iterable object, and strings are iterable (they yield one character at a time). The resulting list is ['a', 'r'].
The element 'a' is in the second list but not the first.
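The difference described above can be checked directly. The sketch below also shows that list() accepts iterables other than strings; the names as_chars, wrapped, from_tuple and from_range are made up for illustration.

```python
x = 'ar'

as_chars = list(x)   # iterates over x: one element per character
wrapped = [x]        # list display: x itself is the only element

print(as_chars)
print(wrapped)

# list() accepts any iterable, not only strings.
from_tuple = list(('ar', 'b'))
from_range = list(range(3))

# Membership in a list compares whole elements; membership in a
# string is a substring test.
print('a' in wrapped)      # element comparison against 'ar'
print('a' in wrapped[0])   # substring test inside 'ar'
```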
If I want to disassemble a string into a list, do some manipulation with the original decimal values, and then assemble the string from the list, what is the best way?
str = 'abc'
lst = list(str.encode('utf-8'))
for i in lst:
    print(i, chr(int(i+2)))
gives me a table.
But I would like to create instead a presentation like 'abc', 'cde', etc.
Hope this helps
str_ini = 'abc'
lst = list(str_ini.encode('utf-8'))
str_fin = [chr(v+2) for v in lst]
print(''.join(str_fin))
To convert a string into a list of character values (numbers), you can use:
s = 'abc'
vals = [ord(c) for c in s]
This results in vals being the list [97, 98, 99].
To convert it back into a string, you can do:
s2 = ''.join(chr(val) for val in vals)
This will give s2 the value 'abc'.
If you prefer to use map rather than comprehensions, you can equivalently do:
vals = list(map(ord, s))
and:
s2 = ''.join(map(chr, vals))
Also, avoid using the name str for a variable, since it will mask the builtin definition of str.
Use ord on the letters to retrieve their integer code points (which match the ASCII values for ASCII characters), and then chr to convert them back to characters after manipulating the values. Finally, use the str.join method with an empty string to piece the list back together into a str:
s = 'abc'
s_list = [ord(let) for let in s]
s_list = [chr(dec + 2) for dec in s_list]
new_s = ''.join(s_list)
print(new_s) # every character is shifted by 2
Calling .encode on the string converts it to a bytes object instead, which is likely not what you want. Additionally, avoid using built-in names such as str as variable names, because the built-in is then no longer accessible under that name in the same scope.
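The difference between ord() and .encode('utf-8') only shows up once the text contains non-ASCII characters. The sketch below wraps the shift in a hypothetical helper, shift_string (a name invented here, not from the question), and then compares code points with UTF-8 bytes.

```python
def shift_string(s, offset):
    """Return s with every character's code point shifted by offset."""
    return ''.join(chr(ord(c) + offset) for c in s)

print(shift_string('abc', 2))

# For ASCII text, code points and UTF-8 bytes coincide...
print([ord(c) for c in 'abc'], list('abc'.encode('utf-8')))

# ...but a non-ASCII character (here U+00E9, e with acute accent)
# is one code point and several bytes.
print([ord(c) for c in '\u00e9'], list('\u00e9'.encode('utf-8')))
```

Shifting by a negative offset reverses the operation, so shift_string(shift_string('abc', 2), -2) gives back 'abc'.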
I have an array I want to iterate through. The array consists of strings made up of numbers and symbols,
like this: €110.5M
I want to loop over it, remove every Euro sign and M, and return the array with the strings converted to ints.
How would I do this knowing that the array is a column in a table?
You could just strip the characters,
>>> x = '€110.5M'
>>> x.strip('€M')
'110.5'
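def sanitize_string(ss):
    # Drop currency symbols and normalise case so 'M'/'K' suffixes match
    ss = ss.replace('$', '').replace('€', '').lower()
    if 'm' in ss:
        res = float(ss.replace('m', '')) * 1000000
    elif 'k' in ss:
        res = float(ss.replace('k', '')) * 1000
    else:
        res = float(ss)  # no suffix: take the number as-is
    return int(res)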
This can be applied to a list as follows:
>>> ls = [sanitize_string(x) for x in ["€3.5M", "€15.7M" , "€167M"]]
>>> ls
[3500000, 15700000, 167000000]
If you want to apply it to the column of a table instead:
dataFrame['price'] = dataFrame['price'].apply(sanitize_string) # Assuming you're using DataFrames and the column is called 'price'
You can use a list comprehension, where a is your list of strings:
numbers = [float(p.replace('€','').replace('M','')) for p in a]
which gives:
[110.5, 210.5, 310.5]
You can use a list comprehension to construct one list from another:
foo = ["€13.5M", "€15M" , "€167M"]
foo_cleaned = [value.translate(str.maketrans("", "", "€M")) for value in foo]
In Python 3, the third argument of str.maketrans lists the characters to delete, so str.translate removes every occurrence of € and M from each string.
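In Python 3, str.translate takes a single table argument, which is usually built with str.maketrans. A minimal sketch of that approach follows; the table is built once and reused, and the names DROP_EURO_AND_M and foo_numbers are made up for illustration.

```python
# Build the deletion table once; the third argument lists characters to drop.
DROP_EURO_AND_M = str.maketrans('', '', '€M')

foo = ["€13.5M", "€15M", "€167M"]
foo_cleaned = [value.translate(DROP_EURO_AND_M) for value in foo]
foo_numbers = [float(value) for value in foo_cleaned]

print(foo_cleaned)
print(foo_numbers)
```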
Try this
arr = ["€110.5M","€110.5M","€110.5M","€110.5M","€110.5M","€110.5M","€110.5M"]
f = [x.replace("€","").replace("M","") for x in arr]
You can call .replace() on a string as often as you like. An initial solution could be something like this:
my_array = ['€110.5M', '€111.5M', '€112.5M']
my_cleaned_array = []
for elem in my_array:
    my_cleaned_array.append(elem.replace('€', '').replace('M', ''))
At this point, you still have strings in your array. If you want ints, note that int('110.5') raises a ValueError, so convert via float first: int(float(elem.replace('€', '').replace('M', ''))). But be aware that you will then lose everything after the decimal point, i.e. you will end up with [110, 111, 112].
You can use a regular expression to do that.
import re
s = "€110.5M"
x = re.findall(r"-?\d+\.\d+", s)
print(x)
I didn't quite understand the second part of the question.
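The pattern above only matches numbers that contain a decimal point, so a value like "€15M" would be missed. The sketch below is one possible extension, on the assumption (taken from the other answer, not confirmed by the question) that M means million and K means thousand. The names AMOUNT_RE, MULTIPLIERS and parse_amount are hypothetical.

```python
import re

# Optional sign, digits, optional fractional part, optional K/M suffix.
AMOUNT_RE = re.compile(r"(-?\d+(?:\.\d+)?)([KM]?)")
MULTIPLIERS = {"": 1, "K": 1_000, "M": 1_000_000}  # assumed meanings

def parse_amount(text):
    """Return the amount in text as an int, or None if no number is found."""
    match = AMOUNT_RE.search(text)
    if match is None:
        return None
    number, suffix = match.groups()
    return round(float(number) * MULTIPLIERS[suffix])

print([parse_amount(s) for s in ["€110.5M", "€15M", "€700K", "n/a"]])
```

Returning None for unparseable input is a design choice for this sketch; raising an exception would be equally reasonable.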
I would like to remove the first two digits from each element (which are currently ints) in a list. This is what I have:
lst = [2011,2012,3013]
I would like to get this
lst= [11,12,13]
However, I do not want a solution that replaces 20 or 30 with ''.
Given source= [2011,-2012,-3013] :
Result as ints, unsigned:
dest = [abs(x)%100 for x in source]
Result as ints, signed:
dest = [(abs(x)%100)*(1 if x > 0 else -1) for x in source]
Result as strings, unsigned (preserves leading zeroes):
dest = list(map(lambda x : str(x)[-2:],source))
Result as strings, signed (preserves leading zeroes):
dest = list(map(lambda x : ("-" if str(x)[0]=="-" else "")+str(x)[-2:],source))
You can use:
list = [abs(number)%100 for number in list]
And it's a bad practice to name lists list. Use another name.
You can use modulo 100, like:
my_list= [2011,2012,3013]
expected_list = [i%100 for i in my_list]
If you have negative numbers in my_list:
expected_list=[abs(i)%100 for i in my_list]
Or use string slicing:
expected_list = [int(str(i)[2:]) for i in my_list] #[2:],because you want to remove first two numbers
Please try to avoid using built-in names as your variable names, as you have done with list.
Modulo:
Just take each element modulo 100, in place:
lst = [2011,2012,3013]
for i in range(len(lst)):
    lst[i] %= 100
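Two details of the modulo approach are easy to miss: why abs() is needed for negative inputs, and what happens to leading zeroes. The sketch below shows both on an illustrative list; the values in source and the result names are assumptions made for the example.

```python
source = [2011, -2012, 3005]

# Python's % takes the sign of the divisor, so a negative input
# does not give its last two digits directly.
plain_mod = [x % 100 for x in source]

# Taking abs() first gives the last two digits as non-negative ints.
abs_mod = [abs(x) % 100 for x in source]

# Ints cannot carry a leading zero; strings can.
as_strings = [str(abs(x))[-2:] for x in source]

print(plain_mod, abs_mod, as_strings)
```

Here -2012 % 100 evaluates to 88 rather than 12, and 3005 becomes the int 5 but the string '05'.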
I came across the following lines of code in Python and I keep wondering what they do exactly:
while '' in myList:
    myList.remove('')
Thanks in advance.
It removes all empty strings from a list, inefficiently.
'' in myList tests if '' is a member of myList; it'll loop over myList to scan for the value. myList.remove('') scans through myList to find the first element in the list that is equal to '' and remove it from the list:
>>> myList = ['', 'not empty']
>>> '' in myList
True
>>> myList.remove('')
>>> myList
['not empty']
>>> '' in myList
False
So, the code repeatedly scans myList for empty strings, and each time one is found, another scan is performed to remove that one empty string.
myList = [v for v in myList if v != '']
would be a different, more efficient way of accomplishing the same task. This uses a list comprehension; loop over all values in myList and build a new list object from those values, provided they are not equal to the empty string.
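One behavioural difference is worth noting: the while/remove loop mutates the existing list object, whereas the comprehension binds myList to a new list. If other references to the list must see the change, slice assignment keeps the update in place. A small sketch, with alias as a made-up second reference:

```python
myList = ['', 'a', '', 'b']
alias = myList              # a second reference to the same list object

# Rebinding: myList now names a new list; alias still sees the old one.
myList = [v for v in myList if v != '']
print(myList, alias)

# Slice assignment replaces the contents of the existing list object,
# so every reference to it sees the filtered result.
alias[:] = [v for v in alias if v != '']
print(alias)
```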
Put simply, it removes all empty strings from myList.
Below is a breakdown:
# While there are empty strings in `myList`...
while '' in myList:
    # ...call `myList.remove` with an empty string as its argument.
    # This will remove the one that is currently the closest to the start of the list.
    myList.remove('')
Note however that you can do this a lot better (more efficiently) with a list comprehension:
myList = [x for x in myList if x != '']
or, if myList is purely a list of strings:
# Empty strings evaluate to `False` in Python
myList = [x for x in myList if x]
If myList is a list of strings, you can also use filter, which is even shorter. On Python 2.x it returns a list directly; on Python 3 it returns an iterator, so wrap it in list():
myList = list(filter(None, myList))
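On Python 3, filter returns a lazy iterator rather than a list. The sketch below shows that, and also that filter(None, ...) drops every falsy value, not only empty strings. The names filtered, as_list and mixed are illustrative.

```python
myList = ['', 'a', '', 'b']

filtered = filter(None, myList)   # on Python 3 this is a lazy iterator
print(type(filtered).__name__)

as_list = list(filtered)          # materialise it to get a list back
print(as_list)

# filter(None, ...) drops every falsy value, not only empty strings.
mixed = ['', 'a', 0, None, 'b']
print(list(filter(None, mixed)))
```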
In Python, two single quotes '' or double quotes "" represent the empty string.
The condition to keep looping is while the empty string exists in the list, and will only terminate when there are no more empty strings.
Therefore, it removes all empty strings from a list.