Can I shorten the process of adding multiple variables in Python? - python

I have a list and from that list every variable uses one index for a value. Example:
val = [2, 4, 8, 6]
var1 = val[0]
var2 = val[1]
var3 = val[2]
var4 = val[3]
Can I put this into a loop somehow? Because I have 20 values so it is long to write 20 variables.
P.S of course, the values from added variables must be usable. And the format I'm using those variables looks like this:
D = {u'label1': var1, u'label2': var2, ...}

For your specific case you can build the dict directly from the list,
D = {u'label0': val[0], u'label1': val[1], ...}
or create it programmatically:
D = dict(("var{}".format(i), v) for i, v in enumerate(val))
You then refer to a value as D["var1"], for example. You can use whatever key prefix you like, label_ for instance.

Try this:
label_dict = {}
for i in range(len(val)):
    label_dict['label' + str(i+1)] = val[i]
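Both answers come down to building the dictionary straight from the list. Here is a minimal sketch of that idea, assuming the keys should be label1 through labelN. The helper name make_labels and its prefix and start parameters are illustrative, not from the original posts.

```python
val = [2, 4, 8, 6]

def make_labels(values, prefix="label", start=1):
    """Map prefix+index to each value (prefix and start are illustrative choices)."""
    return {f"{prefix}{i}": v for i, v in enumerate(values, start=start)}

D = make_labels(val)
print(D)  # {'label1': 2, 'label2': 4, 'label3': 8, 'label4': 6}
```

With 20 values the call stays the same; only the list grows.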

Related

How can I make a conditional expression?

I want to compare the model output for two data frames.
One data frame has target values 1 to 8; the other has only 1, 2, 3, 5, 6 and 7.
I made a dictionary to map the values to names, and wrote the code below to collect the probabilities.
my_dict ={1:'a', 2:'b', 3:'c', 4:'d', 5:'e', 6:'f', 7:'g', 8:'f'}
def func(val):
    for key, value in my_dict.items():
        if val == key:
            return value
    return "There is no such Key"
inputData = [1, 2, 3, 4, 5]
inputData2 = np.array([inputData])
index = 1
result_data = OrderedDict()
for x in xgb_model.predict_proba(inputData2, ntree_limit=None, validate_features=False, base_margin=None)[0]:
    result_data[func(index)] = round(x, 2)
    index += 1
print("result_name : ", max(result_data.items(), key=operator.itemgetter(1))[0])
print("result_value : ", max(xgb_model.predict_proba(inputData2, ntree_limit=None, validate_features=False, base_margin=None)[0]))
print(result_data)
But with the second data frame, the keys are shifted.
For example, a: 0.2, b: 0.2, c: 0.1, e: 0.1, f: 0.1, g: 0.3 should appear, but the actual output is:
a: 0.2, b: 0.2, c: 0.1, d: 0.1, e: 0.1, f: 0.3
I don't know what I should do.
So I tried the code below, but it stops after producing only a: 0.2, b: 0.2, c: 0.1.
for x in xgb_model.predict_proba(inputData2, ntree_limit=None, validate_features=False, base_margin=None)[0]:
    if index not in y.target.unique().tolist():
        continue
    result_data[func(index)] = round(x, 2)
    index += 1
Please let me know if the code is unclear.
I hope you can help. Thank you.
In the model that has 8 classes, you overwrite the value for f, since f is the name for both the 6th and the 8th element. Your dict should be defined as:
my_dict ={1:'a', 2:'b', 3:'c', 4:'d', 5:'e', 6:'f', 7:'g', 8:'h'}
But you could make the code much simpler by using a string ("_abcdefgh") to get the correct letter for each index. You could then assign to result_data[mystring[i]] directly and drop the function.
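The string lookup suggested above can be sketched as follows. The probabilities are made-up stand-ins for the output of predict_proba, and the classes list for the six-class case is an assumption: it presumes you have the model's class labels in the same order as the probability columns.

```python
letters = "_abcdefgh"  # index 0 is a placeholder so that class 1 maps to 'a'

# Hypothetical probabilities standing in for predict_proba(...)[0]
probs8 = [0.21, 0.19, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
result8 = {letters[i]: round(p, 2) for i, p in enumerate(probs8, start=1)}

# Six-class case: pair each probability with its class label instead of
# counting 1, 2, 3, ... so that the missing class 4 does not shift the keys.
classes = [1, 2, 3, 5, 6, 7]  # assumed label order
probs6 = [0.2, 0.2, 0.1, 0.1, 0.1, 0.3]
result6 = {letters[c]: round(p, 2) for c, p in zip(classes, probs6)}

best = max(result6, key=result6.get)
print(result6)
print("result_name :", best)
```

Pairing labels with probabilities through zip avoids keeping a separate counter in step with the loop.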

How to find the name of a min (or max) attribute?

Given three or more variables, I want to find the name of the variable with the min value.
I can get the min value from the list, and I can get the index within the list of the min value. But I want the variable name.
I feel like there's another way to go about this that I'm just not thinking of.
a = 12
b = 9
c = 42
cab = [c,a,b]
# yields 9 (the min value)
min(cab)
# yields 2 (the index of the min value)
cab.index(min(cab))
What code would yield 'b'?
The magic of vars prevents you from having to make a dictionary up front if you want to have things in instance variables:
class Foo():
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
    def min_name(self, names = None):
        d = vars(self)
        if not names:
            names = d.keys()
        key_min = min(names, key = (lambda k: d[k]))
        return key_min
In action
>>> x = Foo(1,2,3)
>>> x.min_name()
'a'
>>> x.min_name(['b','c'])
'b'
>>> x = Foo(5,1,10)
>>> x.min_name()
'b'
Right now it'll crash if you pass an invalid variable name in the parameter list for min_name, but that's resolvable.
You can also update the dictionary returned by vars, and the change is reflected in the instance:
    def increment_min(self):
        key = self.min_name()
        vars(self)[key] += 1
Example:
>>> x = Foo(2,3,4)
>>> x.increment_min()
>>> x.a
3
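The crash on invalid names mentioned above can be avoided in several ways. This sketch shows one possible policy, which is an assumption rather than the answerer's choice: silently ignore unknown names and raise ValueError only when nothing valid remains.

```python
class Foo:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def min_name(self, names=None):
        d = vars(self)
        if not names:
            names = d.keys()
        # Keep only names that really are instance attributes.
        valid = [n for n in names if n in d]
        if not valid:
            raise ValueError("no valid attribute names given")
        return min(valid, key=d.__getitem__)

x = Foo(5, 1, 10)
print(x.min_name(["b", "nope"]))  # b
```

Raising on any unknown name instead of skipping it would be just as reasonable if typos should be caught early.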
You cannot get the name of the variable with the minimum/maximum value like this*, since, as @jasonharper commented, cab is nothing more than a list containing three integers; there is absolutely no connection to the variables that those integers originally came from.
A simple workaround is to use pairs and compare them by value, like this:
>>> pairs = [("a", 12), ("b", 9), ("c", 42)]
>>> min(pairs, key=lambda pair: pair[1])
('b', 9)
>>> min(pairs, key=lambda pair: pair[1])[0]
'b'
See Green Cloak Guy's answer, but if you want to go for readability, I suggest following a similar approach to mine.
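A dictionary expresses the same name-to-value pairing a little more directly. This is a small sketch, assuming you are free to store the values in a dict named values (a name chosen here for illustration) instead of separate variables.

```python
values = {"a": 12, "b": 9, "c": 42}

# Iterating a dict yields its keys; key=values.get compares them by value.
name_of_min = min(values, key=values.get)
name_of_max = max(values, key=values.get)

print(name_of_min, values[name_of_min])  # b 9
print(name_of_max, values[name_of_max])  # c 42
```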
You'd have to get very creative for this to work, and the only solution I can think of is rather inefficient.
You can get the memory address of the data b refers to fairly easily:
>>> hex(id(b))
'0xaadd60'
>>> hex(id(cab[2]))
'0xaadd60'
To actually correspond that with a variable name, though, the only way to do that would be to look through the variables and find the one that points to the right place.
You can do this by using the globals() function:
# get a list of all the variable names in the current namespace that reference your desired value
referent_vars = [k for k,v in globals().items() if id(v) == id(cab[2])]
var_name = referent_vars[0]
There are two big problems with this solution:
Namespaces - you can't put this code in a function, because if you do that and then call it from another function, then it won't work.
Time - this requires searching through the entire global namespace.
The first problem could be alleviated by additionally passing the current namespace in as a variable:
def get_referent_vars(val, namespace):
    return [k for k, v in namespace.items() if id(v) == id(val)]
def main():
    a = 12
    b = 9
    c = 42
    cab = [a, b, c]
    var_name = get_referent_vars(
        cab[cab.index(min(cab))],
        locals()  # a, b and c are local to main(), so pass locals(), not globals()
    )[0]
    print(var_name)
    # should print 'b'

I want to convert the categorical variable to numerical in Python

I have a dataframe having categorical variables. I want to convert them to the numerical using the following logic:
I have two lists: one contains the distinct categorical values in the column, and the second contains the numeric value for each category. Now I need to map these values in place of the categorical values.
For example:
List_A = ['A','B','C','D','E']
List_B = [3,2,1,1,2]
I need to replace A with 3, B with 2, C and D with 1 and E with 2.
Is there any way to do this in Python?
I can do this by applying multiple for loops but I am looking for some easier way or some direct function if there is any.
Any help is very much appreciated, Thanks in Advance.
Create a mapping dict
List_A = ['A','B','C','D','E',]
List_B = [3,2,1,1,2]
d=dict(zip(List_A, List_B))
new_list=['A','B','C','D','E','A','B']
new_mapped_list=[d[v] for v in new_list if v in d]
new_mapped_list
Or define a function and use map:
List_A = ['A','B','C','D','E',]
List_B = [3,2,1,1,2]
d=dict(zip(List_A, List_B))
def mapper(value):
    if value in d:
        return d[value]
    return None
new_list=['A','B','C','D','E','A','B']
list(map(mapper, new_list))  # in Python 3, map returns a lazy iterator
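The two variants above treat unknown categories differently: the list comprehension drops them, while the mapper function keeps a None placeholder. A short sketch of the difference, using a made-up unknown category 'X':

```python
List_A = ['A', 'B', 'C', 'D', 'E']
List_B = [3, 2, 1, 1, 2]
d = dict(zip(List_A, List_B))

new_list = ['A', 'X', 'C']  # 'X' is a hypothetical category missing from List_A

dropped = [d[v] for v in new_list if v in d]  # unknown values are skipped
kept = [d.get(v) for v in new_list]           # unknown values become None

print(dropped)  # [3, 1]
print(kept)     # [3, None, 1]
```

Keeping the placeholder preserves the list length, which matters if the result has to line up with the rows of a data frame.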
Suppose df is your data frame and "Category" is the name of the column holding your categories:
df.loc[df.Category == "A", "Category"] = 3
df.loc[(df.Category == "B") | (df.Category == "E"), "Category"] = 2
df.loc[(df.Category == "C") | (df.Category == "D"), "Category"] = 1
If you only need to replace the values in one list with the values of the other, and the structure is as you describe (two lists of the same length, matched by position), then you only need this:
list_a = []
list_a = list_b
A more involved solution uses a function that creates a dictionary you can reuse on other lists:
# we make a function
def convert_list(ls_a, ls_b):
    dic_new = {}
    for letter, number in zip(ls_a, ls_b):
        dic_new[letter] = number
    return dic_new
This builds a dictionary with the combinations you need. You pass in the two lists, then you can use that dictionary on another list:
List_A = ['A','B','C','D','E']
List_B = [3,2,1,1,2]
dic_new = convert_list(List_A, List_B)
other_list = ['a','b','c','d']
for item in other_list:
    print(dic_new[item.upper()])
# prints
3
2
1
1
cheers
You could use a solution from machine learning scikit-learn module.
OneHotEncoder
LabelEncoder
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html
The pandas "hard" way:
https://stackoverflow.com/a/29330853/9799449

Python : pass variable name as argument

I have a function f(x) in which many local variables are created. x is a string with the same name as one of these local variables and I would like to change this local variable by changing x. What is the clean way to do this? Currently I am using a lot of if/elif statements.
Some dummy code to represent my problem:
def f(x):
    a = [1,2,3]
    b = [2,3,4]
    c = [3,4,5]
    if x == "a":
        a[0] = 10
    elif x == "b":
        b[0] = 10
    elif x == "c":
        c[0] = 10
    return a,b,c
I would like for the right variable to change value but using all these if/elif statements feels a bit redundant.
Use a dict
Simply, use a dict:
def f(x):
    values = {
        "a": [1,2,3],
        "b": [2,3,4],
        "c": [3,4,5]
    }
    values[x][0] = 10
    return values["a"], values["b"], values["c"]
If you really really want, use your original code and do locals()[x][0] = 10 but that's not really recommended because you could cause unwanted issues if the argument is the name of some other variable you don't want changed.
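The locals() trick works here only because the lists are mutated in place. Rebinding a name through locals() has no effect on the real local variable inside a function in CPython. A small sketch of the difference (both function names are made up for this illustration):

```python
def mutate_via_locals(x):
    a = [1, 2, 3]
    b = [2, 3, 4]
    locals()[x][0] = 10        # mutates the list object the name refers to
    return a, b

def rebind_via_locals(x):
    a = [1, 2, 3]
    b = [2, 3, 4]
    locals()[x] = [10, 2, 3]   # only changes the dict returned by locals()
    return a, b

print(mutate_via_locals("a"))  # ([10, 2, 3], [2, 3, 4])
print(rebind_via_locals("a"))  # ([1, 2, 3], [2, 3, 4])
```

This is one more reason to prefer an explicit dict, where both mutation and reassignment behave as expected.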
Use a dictionary like this:
def f(x):
    d = {"a" :[1,2,3],"b" : [2,3,4],"c" : [3,4,5]}
    d[x][0] = 10
    return d
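With the dict approach, an unexpected name surfaces as a bare KeyError. If a clearer message is wanted, a check can be added first. This is a sketch of one option; the ValueError and its wording are assumptions, not part of the answers above.

```python
def f(x):
    d = {"a": [1, 2, 3], "b": [2, 3, 4], "c": [3, 4, 5]}
    if x not in d:
        # Fail early with a message that lists the accepted names.
        raise ValueError(f"unknown name {x!r}; expected one of {sorted(d)}")
    d[x][0] = 10
    return d

print(f("b"))  # {'a': [1, 2, 3], 'b': [10, 3, 4], 'c': [3, 4, 5]}
```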
If your input is a string and you want to refer to a module-level variable whose name matches that string, you can use globals() like this:
globals()[x]
This way you can get the value and/or edit its contents. Note that this only finds global variables, not the locals of f.
You can use exec() and pass your code to it as a string:
def f(x):
    a = [1,2,3]
    b = [2,3,4]
    c = [3,4,5]
    exec(x + "[0] = 10")
    return a,b,c
print(f("a"))
# ([10, 2, 3], [2, 3, 4], [3, 4, 5])

Append several variables to a list in Python

I want to append several variables to a list. The number of variables varies. All variables start with "volume". I was thinking maybe a wildcard or something would do it. But I couldn't find anything like this. Any ideas how to solve this? Note in this example it is three variables, but it could also be five or six or anything.
volumeA = 100
volumeB = 20
volumeC = 10
vol = []
vol.append(volume*)
You can use extend to append any iterable to a list:
vol.extend((volumeA, volumeB, volumeC))
Depending on the prefix of your variable names has a bad code smell to me, but you can do it. (The order in which values are appended is undefined.)
vol.extend(value for name, value in locals().items() if name.startswith('volume'))
If order is important (IMHO, still smells wrong):
vol.extend(value for name, value in sorted(locals().items(), key=lambda item: item[0]) if name.startswith('volume'))
Although you can do
vol = []
vol += [val for name, val in globals().items() if name.startswith('volume')]
# replace globals() with locals() if this is in a function
a much better approach would be to use a dictionary instead of similarly-named variables:
volume = {
    'A': 100,
    'B': 20,
    'C': 10
}
vol = []
vol += volume.values()
Note that in the latter case, before Python 3.7 the order of items is unspecified, that is, you can get [100,10,20] or [10,20,100]; from Python 3.7 on, dicts keep insertion order. To add items in key order, use:
vol += [volume[key] for key in sorted(volume)]
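Put together, the dictionary approach might look like the sketch below. The by_size line is an extra illustration, not something the question asked for, showing that the same dict can also be ordered by value.

```python
volume = {'A': 100, 'B': 20, 'C': 10}

vol = []
# Append the values in key order, however many entries the dict holds.
vol.extend(volume[key] for key in sorted(volume))

# Keys ordered from smallest to largest volume.
by_size = sorted(volume, key=volume.get)

print(vol)      # [100, 20, 10]
print(by_size)  # ['C', 'B', 'A']
```

Adding a fourth or fifth volume only means adding a key to the dict; the collecting code does not change.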
EDIT: removed the filter from the list comprehension, as it was pointed out that it was an appalling idea.
I've changed it so it's not too similar to all the other answers.
volumeA = 100
volumeB = 20
volumeC = 10
lst = list(map(lambda x: x[1], filter(lambda x: x[0].startswith('volume'), globals().items())))
print(lst)
Output (on Python 3.7+, where dicts keep insertion order)
[100, 20, 10]
Do you want to add the variables' names as well as their values?
output = []
output.extend((k, v) for k, v in globals().items() if k.startswith('volume'))
Or just the values:
output.extend(v for k, v in globals().items() if k.startswith('volume'))
If I understand the question correctly, you are trying to append the values of different variables to a list. See the example below.
Assuming:
email = 'example@gmail.com'
pwd = 'Mypwd'
items = []
items.append(email)
items.append(pwd)
for row in items:
    print(row)
# the output is:
# example@gmail.com
# Mypwd
Hope this helps, thank you.
