Python: How can I use a string for an if statement? - python

In Python I have to build a (long) if statement dynamically.
How can I do this?
I tried the following test code, which stores the necessary if-statement in a string built by the function "buildFilterCondition".
But this doesn't work...
Any ideas? What is going wrong?
Thank you very much.
Input = [1,2,3,4,5,6,7]
Filter = [4,7]
FilterCondition = ""
def buildFilterCondition():
    global FilterCondition
    for f in Filter:
        FilterCondition = FilterCondition + "(x==" + str(f) +") | "
    #remove the last "| " sign
    FilterCondition = FilterCondition[:-2]
    print("Current Filter: " + FilterCondition)
buildFilterCondition()
for x in Input:
    if( FilterCondition ):
        print(x)
With my function buildFilterCondition() I want to reach the following situation, because the function generates the string "(x==4) | (x==7)", but this doesn't work:
for x in Input:
    if( (x==4) | (x==7) ):
        print(x)
The output should be 4 and 7 (the filtered values).
The background of my question actually had a different intention than to replace an if-statement.
I need a longer multiple condition to select specific columns of a pandas dataframe.
For example:
df2=df.loc[(df['Discount1'] == 1000) & (df['Discount2'] == 2000)]
I wanted to keep the column names and the values (1000, 2000) in 2 separate lists (or dictionary) to make my code a little more "generic".
colmnHeader = ["Discount1", "Discount2"]
filterValue = [1000, 2000]
To "filter" the data frame, I then only need to adjust the lists.
How do I now rewrite the call to the .loc method so that it works for iterating over the lists?
df2=df.loc[(df[colmnHeader[0]] == filterValue[0]) & (df[colmnHeader[1]] == filterValue[1])]
Unfortunately, my current attempt with the following code does not work, because the conditions must not be passed to the pandas .loc call one after another; they all have to be applied at once.
So I need ALL the conditions from the lists directly in the .loc call.
#FILTER
colmn = ["colmn1", "colmn2", "colmn3"]
cellContent = ["1000", "2000", "3000"]
# first make sure, the lists have the same size
if( len(colmn) == len(cellContent)):
    curIdx = 0
    for curColmnName in colmn:
        df_columns= df_columns.loc[df_columns [curColmnName]==cellContent[curIdx]]
        curIdx += 1
Thank you again!

Use in operator
Because simple is better than complex.
inputs = [1, 2, 3, 4, 5, 6, 7]
value_filter = [4, 7]
for x in inputs:
    if x in value_filter:
        print(x, end=' ')
# 4 7

Use operator module
With the operator module, you can build a condition at runtime from a list of (operator, value) pairs that are tested against the current value.
import operator
inputs = [1, 2, 3, 4, 5, 6, 7]
# This list can be dynamically changed if you need to
conditions = [
    (operator.ge, 4), # value needs to be greater than or equal to 4
    (operator.lt, 7), # value needs to be lower than 7
]
for x in inputs:
    # all() combines the conditions with "and"; use any() for "or"
    if all(condition(x, value) for condition, value in conditions):
        print(x, end=' ')
# 4 5 6
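A non-empty string is always truthy in Python, which is why `if( FilterCondition ):` in the question prints every value instead of evaluating the text inside the string. As a minimal sketch of the "or" variant, the question's condition `(x==4) | (x==7)` can be built from the filter list with `operator.eq` and `any()`. The helper name `matches` is only an illustration and not part of the answers above.

```python
import operator

inputs = [1, 2, 3, 4, 5, 6, 7]
value_filter = [4, 7]

# One (operator, value) pair per allowed value, i.e. x == 4 or x == 7.
conditions = [(operator.eq, value) for value in value_filter]

def matches(x, conditions):
    """Return True if x satisfies at least one (operator, value) pair."""
    return any(condition(x, value) for condition, value in conditions)

filtered = [x for x in inputs if matches(x, conditions)]
print(filtered)  # [4, 7]
```

Changing `value_filter` is then enough to change the filter, with no string building or evaluation involved.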

Related

value range in Pandas

I have a simple code for titanic data:
import pandas as pd
def pClassSurvivorDetails(df,pClass):
    print('\nResults for Pclass =', pClass, '\n -------------------- ')
    print("The following did not survive")
    notSurvive = df['Sex'][df['Survived']==0][df['Pclass']==pClass]
    print(notSurvive.value_counts())
    print("The following did survive")
    survive = df['Sex'][df['Survived']==1][df['Pclass']==pClass]
    print(survive.value_counts())
def main():
    df = pd.read_csv("titanic.csv")
    for value in [1, 2, 3]:
        pClassSurvivorDetails(df,value, )
main()
Now I need the same result, but instead of for value in [1, 2, 3] I need a first number x and a last number y, with all the values in between included... something like [1:3] (but it doesn't work this way). Any ideas, please?
To cycle through all values between two variables in Python, you can use:
for i in range(x, y):
Or, since it is up to and not including y, you could include y with:
for i in range(x, y + 1):
To get all values in this range, and then access only one, the simplest way is to store it as a list.
my_values = list(range(x, y))
And then you can access with indexing, e.g.:
my_values[2]
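A small sketch of the inclusive variant, applied to the values 1 to 3 from the question. The helper name `inclusive_range` is hypothetical; it only wraps `range(x, y + 1)`.

```python
def inclusive_range(first, last):
    """Return every integer from first to last, both ends included."""
    return range(first, last + 1)

for value in inclusive_range(1, 3):
    print(value)  # prints 1, then 2, then 3
```

In the question's main() function, the loop header would then read `for value in inclusive_range(x, y):`.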

For cycle gets stuck in Python

My code below is getting stuck on a random point:
import functions
from itertools import product
from random import randrange
values = {}
tables = {}
letters = "abcdefghi"
nums = "123456789"
for x in product(letters, nums): #unnecessary
    values[x[0] + x[1]] = 0
for x in product(nums, letters): #unnecessary
    tables[x[0] + x[1]] = 0
for line_cnt in range(1,10):
    for column_cnt in range(1,10):
        num = randrange(1,10)
        table_cnt = functions.which_table(line_cnt, column_cnt) #Returns a number identifying the table considered
        #gets the values already in the line and column and table considered
        line = [y for x,y in values.items() if x.startswith(letters[line_cnt-1])]
        column = [y for x,y in values.items() if x.endswith(nums[column_cnt-1])]
        table = [x for x,y in tables.items() if x.startswith(str(table_cnt))]
        #if num is not contained in any of these then it's acceptable, otherwise find another number
        while num in line or num in column or num in table:
            num = randrange(1,10)
        values[letters[line_cnt-1] + nums[column_cnt-1]] = num #Assign the number to the values dictionary
    print(line_cnt) #debug
print(sorted(values)) #debug
As you can see it's a program that generates random sudoku schemes using 2 dictionaries : values that contains the complete scheme and tables that contains the values for each table.
Example :
5th square on the first line = 3
|
v
values["a5"] = 3
tables["2b"] = 3
So what is the problem? Am I missing something?
import functions
...
table_cnt = functions.which_table(line_cnt, column_cnt) #Returns a number identifying the table considered
It's nice when we can execute the code right away on our own computer to test it. In other words, it would have been nice to replace "table_cnt" with a fixed value for the example (here, a simple string would have sufficed).
for x in product(letters, nums):
    values[x[0] + x[1]] = 0
Not that important, but this is more elegant:
values = {x+y: 0 for x, y in product(letters, nums)}
And now, the core of the problem:
while num in line or num in column or num in table:
    num = randrange(1,10)
This is where you loop forever. So, you are trying to generate a random sudoku. From your code, this is how you would generate a random list:
nums = []
for _ in range(9):
    num = randrange(1, 10)
    while num in nums:
        num = randrange(1, 10)
    nums.append(num)
The problem with this approach is that you have no idea how long the program will take to finish. It could take one second, or one year (although that is unlikely). This is because there is no guarantee the program will not keep picking a number already taken, over and over.
In practice it should still take a relatively short time to finish (this approach is not efficient, but the list is very short). However, in the case of the sudoku, you can end up in an impossible setting. For example:
line = [6, 9, 1, 2, 3, 4, 5, 8, 0]
column = [0, 0, 0, 0, 7, 0, 0, 0, 0]
Here, those are the first line (or any line, actually) and the last column. When the algorithm tries to find a value for line[8], it will always fail, since 7 is the only value still missing from the line and it is blocked by the column.
If you want to keep it this way (i.e. brute force), you should detect such a situation and start over. Again, this is very inefficient, and you should look at how to generate sudokus properly (my naive approach would be to start with a solved one and swap lines and columns randomly, but I know this is not a good way).
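The naive "start with a solved one and swap lines and columns" idea can be sketched as follows. This is only an illustration under stated assumptions: it uses a plain list of lists rather than the question's dictionaries, and `base_grid`, `shuffled_sudoku` and `is_valid` are hypothetical helper names. Lines are swapped only within a band of three (and whole bands among themselves), and the same goes for columns, because those swaps keep every 3x3 table valid.

```python
import random

def base_grid():
    """Build one known valid solved grid from a shifting pattern."""
    return [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)]
            for r in range(9)]

def shuffled_sudoku(rng=random):
    """Return a solved grid obtained by validity-preserving swaps."""
    grid = base_grid()
    # Shuffle the three bands, then the three rows inside each band.
    rows = [3 * band + r
            for band in rng.sample(range(3), 3)
            for r in rng.sample(range(3), 3)]
    # Same for the column stacks and the columns inside each stack.
    cols = [3 * stack + c
            for stack in rng.sample(range(3), 3)
            for c in rng.sample(range(3), 3)]
    # Relabel the digits 1..9 with a random permutation.
    digits = rng.sample(range(1, 10), 9)
    return [[digits[grid[r][c] - 1] for c in cols] for r in rows]

def is_valid(grid):
    """Check that every row, column and 3x3 table holds 1..9 exactly once."""
    full = set(range(1, 10))
    rows_ok = all(set(row) == full for row in grid)
    cols_ok = all(set(col) == full for col in zip(*grid))
    boxes_ok = all(
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)} == full
        for br in (0, 3, 6) for bc in (0, 3, 6)
    )
    return rows_ok and cols_ok and boxes_ok

for row in shuffled_sudoku():
    print(row)
```

This never loops forever, because no cell is ever guessed. As the answer notes, it is still a naive method: it only reaches grids that are rearrangements of the starting pattern.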

Quicksort in python3. Last Pivot

Thanks for taking the time to read this :) I'm implementing my own version of quicksort in Python and I'm trying to get it to work within some restrictions from a previous school assignment. Note that the reason I've avoided using IN is that it wasn't allowed in the project I worked on (not sure why :3).
It was working fine for integers and strings, but I cannot manage to adapt it for my CounterList(), which is a list of nodes each containing an arbitrary integer and string, even though I'm only sorting by the integers contained in those nodes.
Pastebins:
My QuickSort: http://pastebin.com/mhAm3YYp.
The CounterList and CounterNode, code. http://pastebin.com/myn5xuv6.
from classes_1 import CounterNode, CounterList
def bulk_append(array1, array2):
    # takes all the items in array2 and appends them to array1
    itr = 0
    array = array1
    while itr < len(array2):
        array.append(array2[itr])
        itr += 1
    return array
def quickSort(array):
    lss = CounterList()
    eql = CounterList()
    mre = CounterList()
    if len(array) <= 1:
        return array # Base case.
    else:
        pivot = array[len(array)-1].count # Pivoting on the last item.
        itr = 0
        while itr < len(array)-1:
            # Essentially editing "for i in array:" to handle CounterLists
            if array[itr].count < pivot:
                lss.append(array[itr])
            elif array[itr].count > pivot:
                mre.append(array[itr])
            else:
                eql.append(array[itr])
            itr += 1
        # Recursive step and combining separate lists.
        lss = quickSort(lss)
        eql = quickSort(eql)
        mre = quickSort(mre)
        fnl = bulk_append(lss, eql)
        fnl = bulk_append(fnl, mre)
        return fnl
I know it is probably quite straightforward, but I just can't seem to see the issue.
(Pivoting on last item)
Here is the test I'm using:
a = CounterList()
a.append(CounterNode("ack", 11))
a.append(CounterNode("Boo", 12))
a.append(CounterNode("Cah", 9))
a.append(CounterNode("Doh", 7))
a.append(CounterNode("Eek", 5))
a.append(CounterNode("Fuu", 3))
a.append(CounterNode("qck", 1))
a.append(CounterNode("roo", 2))
a.append(CounterNode("sah", 4))
a.append(CounterNode("toh", 6))
a.append(CounterNode("yek", 8))
a.append(CounterNode("vuu", 10))
x = quickSort(a)
print("\nFinal List: \n", x)
And the resulting CounterList:
['qck': 1, 'Fuu': 3, 'Eek': 5, 'Doh': 7, 'Cah': 9, 'ack': 11]
Which, as you can tell, is missing multiple values.
Either way thanks for any advice, and your time.
There are two mistakes in the code:
You don't need the "eql = quickSort(eql)" line, because eql contains only equal values, so there is no need to sort it.
In every recursive call you lose the pivot (the reason for the missing entries), as you don't append it to any list. You need to append it to eql. So after the code line shown below:
pivot = array[len(array)-1].count
insert this line:
eql.append(array[len(array)-1])
Also remove the line below from your code, as it can exceed the maximum recursion depth (only for arrays with repeated values, when a repeated value is selected as the pivot):
eql = quickSort(eql)
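Put together, the two fixes might look like the sketch below. CounterNode and CounterList are only available via the pastebin links, so this uses a hypothetical stand-in class `Node` and plain Python lists. The names `Node` and `quick_sort` are assumptions for illustration, not the asker's code. The `while` loop is kept to respect the "no `in`" restriction from the question.

```python
class Node:
    """Hypothetical stand-in for CounterNode: a word plus an integer count."""
    def __init__(self, word, count):
        self.word = word
        self.count = count

def quick_sort(items):
    if len(items) <= 1:
        return items  # base case
    pivot_node = items[len(items) - 1]  # pivot on the last item
    pivot = pivot_node.count
    # Fix 1: the pivot itself goes into eql, so it is no longer lost.
    lss, eql, mre = [], [pivot_node], []
    itr = 0
    while itr < len(items) - 1:
        node = items[itr]
        if node.count < pivot:
            lss.append(node)
        elif node.count > pivot:
            mre.append(node)
        else:
            eql.append(node)
        itr += 1
    # Fix 2: eql holds only equal counts, so it is not sorted again.
    return quick_sort(lss) + eql + quick_sort(mre)

counts = [11, 12, 9, 7, 5, 3, 1, 2, 4, 6, 8, 10]
nodes = [Node("n%d" % c, c) for c in counts]
print([node.count for node in quick_sort(nodes)])
```

With the same twelve counts as the test in the question, no entries go missing.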

How to toggle between two values?

I want to toggle between two values in Python, that is, between 0 and 1.
For example, when I run a function the first time, it yields the number 0. Next time, it yields 1. Third time it's back to zero, and so on.
Sorry if this doesn't make sense, but does anyone know a way to do this?
Use itertools.cycle():
from itertools import cycle
myIterator = cycle(range(2))
next(myIterator) # yields 0 (in Python 2, myIterator.next() also works)
next(myIterator) # yields 1
# etc.
Note that if you need a more complicated cycle than [0, 1], this solution becomes much more attractive than the other ones posted here...
from itertools import cycle
mySmallSquareIterator = cycle(i*i for i in range(10))
# Will yield 0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 0, 1, 4, ...
You can accomplish that with a generator like this:
>>> def alternate():
...     while True:
...         yield 0
...         yield 1
...
>>>
>>> alternator = alternate()
>>>
>>> alternator.next()
0
>>> alternator.next()
1
>>> alternator.next()
0
You can use the mod (%) operator.
count = 0 # initialize count once
then
count = (count + 1) % 2
will toggle the value of count between 0 and 1 each time this statement is executed. The advantage of this approach is that you can cycle through a sequence of values (if needed) from 0 - (n-1) where n is the value you use with your % operator. And this technique does not depend on any Python specific features/libraries.
e.g.
count = 0
for i in range(5):
    count = (count + 1) % 2
    print(count)
gives:
1
0
1
0
1
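The same statement cycles through more than two values when a larger modulus is used. A minimal sketch with n = 3 follows; the helper name `next_index` is hypothetical.

```python
def next_index(current, n):
    """Advance current to the next value in 0..n-1, wrapping around at n."""
    return (current + 1) % n

state = 0
seen = []
for _ in range(7):
    seen.append(state)
    state = next_index(state, 3)
print(seen)  # [0, 1, 2, 0, 1, 2, 0]
```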
You may find it useful to create a function alias like so:
import itertools
myfunc = itertools.cycle([0,1]).next # Python 2; in Python 3 use itertools.cycle([0,1]).__next__
then
myfunc() # -> returns 0
myfunc() # -> returns 1
myfunc() # -> returns 0
myfunc() # -> returns 1
In python, True and False are integers (1 and 0 respectively). You could use a boolean (True or False) and the not operator:
var = not var
Of course, if you want to iterate between other numbers than 0 and 1, this trick becomes a little more difficult.
To pack this into an admittedly ugly function:
def alternate():
    alternate.x=not alternate.x
    return alternate.x
alternate.x=True #The first call to alternate will return False (0)
mylist=[5,3]
print(mylist[alternate()]) #5
print(mylist[alternate()]) #3
print(mylist[alternate()]) #5
from itertools import cycle
alternator = cycle((0,1))
next(alternator) # yields 0
next(alternator) # yields 1
next(alternator) # yields 0
next(alternator) # yields 1
#... forever
var = 1
var = 1 - var
That's the official tricky way of doing it ;)
Using xor works, and is a good visual way to toggle between two values.
count = 1
count = count ^ 1 # count is now 0
count = count ^ 1 # count is now 1
To toggle variable x between two arbitrary (integer) values,
e.g. a and b, use:
# start with either x == a or x == b
x = (a + b) - x
# case x == a:
# x = (a + b) - a ==> x becomes b
# case x == b:
# x = (a + b) - b ==> x becomes a
Example:
Toggle between 3 and 5
x = 3
x = 8 - x # now x == 5
x = 8 - x # now x == 3
x = 8 - x # now x == 5
This works even with strings (sort of).
YesNo = 'YesNo'
answer = 'Yes'
answer = YesNo.replace(answer,'') # now answer == 'No'
answer = YesNo.replace(answer,'') # now answer == 'Yes'
answer = YesNo.replace(answer,'') # now answer == 'No'
Using the tuple subscript trick:
value = (1, 0)[value]
Using tuple subscripts is one good way to toggle between two values:
toggle_val = 1
toggle_val = (1,0)[toggle_val]
If you wrapped a function around this, you would have a nice alternating switch.
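One way such a wrapper could look is sketched below, using a closure so the state stays private. The name `make_switch` is hypothetical, and `nonlocal` requires Python 3.

```python
def make_switch(start=1):
    """Return a function that flips between 1 and 0 on every call."""
    toggle_val = start

    def switch():
        nonlocal toggle_val
        toggle_val = (1, 0)[toggle_val]  # index 0 -> 1, index 1 -> 0
        return toggle_val

    return switch

switch = make_switch()
print(switch(), switch(), switch())  # 0 1 0
```

Each call to `make_switch()` creates an independent switch, so several toggles can be used side by side.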
If a variable is previously defined and you want it to toggle between two values, you may use the a if b else c form:
variable = 'value1'
variable = 'value2' if variable=='value1' else 'value1'
In addition, it works on Python 2.5+ and 3.x
See Expressions in the Python 3 documentation.
Simple and general solution without using any built-in. Just keep track of the current element, print/return it, and then switch the current element to the other one.
a, b = map(int, raw_input("Enter both number: ").split())
flag = input("Enter the first value: ")
length = input("Enter Number of iterations: ")
for i in range(length):
    print flag
    if flag == a:
        flag = b
    else:
        flag = a
Input:
3 8
3
5
Output:
3
8
3
8
3
This means the numbers to be toggled are 3 and 8.
The second input is the first value with which you want to start the sequence.
And the last input indicates the number of values you want to generate.
One cool way that works in any language:
variable = 0
variable = abs(variable - 1) # 1
variable = abs(variable - 1) # 0

How to make a random but partial shuffle in Python?

Instead of a complete shuffle, I am looking for a partial shuffle function in python.
Example : "string" must give rise to "stnrig", but not "nrsgit"
It would be better if I can define a specific "percentage" of characters that have to be rearranged.
Purpose is to test string comparison algorithms. I want to determine the "percentage of shuffle" beyond which an(my) algorithm will mark two (shuffled) strings as completely different.
Update :
Here is my code. Improvements are welcome !
import random
percent_to_shuffle = int(raw_input("Give the percent value to shuffle : "))
to_shuffle = list(raw_input("Give the string to be shuffled : "))
num_of_chars_to_shuffle = int((len(to_shuffle)*percent_to_shuffle)/100)
for i in range(0,num_of_chars_to_shuffle):
    x=random.randint(0,(len(to_shuffle)-1))
    y=random.randint(0,(len(to_shuffle)-1))
    z=to_shuffle[x]
    to_shuffle[x]=to_shuffle[y]
    to_shuffle[y]=z
print ''.join(to_shuffle)
This problem is simpler than it looks. And, as usual, the language has the right tools so that nothing stands between you and the idea:
import random
def pashuffle(string, perc=10):
    data = list(string)
    for index, letter in enumerate(data):
        if random.randrange(0, 100) < perc/2:
            new_index = random.randrange(0, len(data))
            data[index], data[new_index] = data[new_index], data[index]
    return "".join(data)
Your problem is tricky, because there are some edge cases to think about:
Strings with repeated characters (i.e. how would you shuffle "aaaab"?)
How do you measure chained character swaps or rearranged blocks?
In any case, the metric defined to shuffle strings up to a certain percentage is likely to be the same you are using in your algorithm to see how close they are.
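As an illustration of why the metric matters, here is one possible measure: the fraction of positions whose character changed. This is an assumption for the sake of example, not the metric used by the asker's comparison algorithm, and the name `changed_fraction` is hypothetical. It also shows the repeated-character edge case: swapping two equal characters does not count as a change.

```python
def changed_fraction(original, shuffled):
    """Fraction of positions whose character differs between the two strings."""
    if len(original) != len(shuffled):
        raise ValueError("strings must have the same length")
    if not original:
        return 0.0
    changed = sum(1 for a, b in zip(original, shuffled) if a != b)
    return changed / len(original)

print(changed_fraction("string", "stnrig"))  # 0.5
print(changed_fraction("aaaab", "aaaba"))    # 0.4
print(changed_fraction("aaaab", "aaaab"))    # 0.0
```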
My code to shuffle n characters:
import random
def shuffle_n(s, n):
    idx = list(range(len(s))) # a list, so random.shuffle also works in Python 3
    random.shuffle(idx)
    idx = idx[:n]
    mapping = dict((idx[i], idx[i-1]) for i in range(n))
    return ''.join(s[mapping.get(x,x)] for x in range(len(s)))
Basically chooses n positions to swap at random, and then exchanges each of them with the next in the list... This way it ensures that no inverse swaps are generated and exactly n characters are swapped (if there are characters repeated, bad luck).
Explained run with 'string', 3 as input:
idx is [0, 1, 2, 3, 4, 5]
we shuffle it, now it is [5, 3, 1, 4, 0, 2]
we take just the first 3 elements, now it is [5, 3, 1]
those are the characters that we are going to swap
s t r i n g
  ^   ^   ^
t (1) will be i (3)
i (3) will be g (5)
g (5) will be t (1)
the rest will remain unchanged
so we get 'sirgnt'
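The explained run can be reproduced deterministically by passing the chosen positions in directly instead of drawing them at random. The helper name `rotate_positions` is hypothetical; the mapping line is the same as in shuffle_n.

```python
def rotate_positions(s, idx):
    """Move characters along the cycle idx: position idx[i] receives s[idx[i-1]]."""
    mapping = {idx[i]: idx[i - 1] for i in range(len(idx))}
    return "".join(s[mapping.get(pos, pos)] for pos in range(len(s)))

print(rotate_positions("string", [5, 3, 1]))  # sirgnt
```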
The bad thing about this method is that it does not generate all the possible variations, for example, it could not make 'gnrits' from 'string'. This could be fixed by making partitions of the indices to be shuffled, like this:
import random
def randparts(l):
    n = len(l)
    s = random.randint(0, n-1) + 1
    if s >= 2 and n - s >= 2: # the split makes two valid parts
        yield l[:s]
        for p in randparts(l[s:]):
            yield p
    else: # the split would make a single cycle
        yield l
def shuffle_n(s, n):
    idx = list(range(len(s))) # a list, so random.shuffle also works in Python 3
    random.shuffle(idx)
    # the loop over the parts must come first so that x is defined
    mapping = dict((x[i], x[i-1])
                   for x in randparts(idx[:n])
                   for i in range(len(x)))
    return ''.join(s[mapping.get(x,x)] for x in range(len(s)))
import random
def partial_shuffle(a, part=0.5):
    # which characters are to be shuffled:
    idx_todo = random.sample(xrange(len(a)), int(len(a) * part))
    # what are the new positions of these to-be-shuffled characters:
    idx_target = idx_todo[:]
    random.shuffle(idx_target)
    # map all "normal" character positions {0:0, 1:1, 2:2, ...}
    mapper = dict((i, i) for i in xrange(len(a)))
    # update with all shuffles in the string: {old_pos:new_pos, old_pos:new_pos, ...}
    mapper.update(zip(idx_todo, idx_target))
    # use mapper to modify the string:
    return ''.join(a[mapper[i]] for i in xrange(len(a)))
for i in xrange(5):
    print partial_shuffle('abcdefghijklmnopqrstuvwxyz', 0.2)
prints
abcdefghljkvmnopqrstuxwiyz
ajcdefghitklmnopqrsbuvwxyz
abcdefhwijklmnopqrsguvtxyz
aecdubghijklmnopqrstwvfxyz
abjdefgcitklmnopqrshuvwxyz
Evil and using a deprecated API:
import random
# adjust constant to taste
# 0 -> no effect, 0.5 -> completely shuffled, 1.0 -> reversed
# Of course this assumes your input is already sorted ;)
''.join(sorted(
    'abcdefghijklmnopqrstuvwxyz',
    cmp = lambda a, b: cmp(a, b) * (-1 if random.random() < 0.2 else 1)
))
maybe like so:
>>> s = 'string'
>>> shufflethis = list(s[2:])
>>> random.shuffle(shufflethis)
>>> s[:2]+''.join(shufflethis)
'stingr'
Taking from fortran's idea, I'm adding this to the collection. It's pretty fast:
import random
def partial_shuffle(st, p=20):
    p = int(round(p/100.0*len(st)))
    idx = range(len(st)) # was len(s), but s is not defined in this function
    sample = random.sample(idx, p)
    res=str()
    samptrav = 1
    for i in range(len(st)):
        if i in sample:
            res += st[sample[-samptrav]]
            samptrav += 1
            continue
        res += st[i]
    return res
