I need to change the value of two random variables out of four to '—'. How do I do it with maximum effectiveness and readability?
The code below doesn't work; it is only for reference.
from random import choice
a = 10
b = 18
c = 15
d = 92
choice(a, b, c, d) = '—'
choice(a, b, c, d) = '—'
print(a, b, c, d)
>>> 10 — — 92
>>> — 18 — 92
>>> 10 18 — —
I've tried choice(a, b, c, d) = '—', but of course it didn't work. There's probably a solution using list functions and methods, but those tend to be complicated and hard to read, so I'm looking for something simpler.
If you really need four separate variables, here's what you can do:
from random import sample
a = 10
b = 18
c = 15
d = 92
for i in sample(['a','b','c','d'], k=2):
    exec(f"{i} = '-'")
print(a,b,c,d)
Using sample ensures that the same variable is not picked twice.
However, this approach is not recommended; it is provided only to help you understand the problem better. I recommend using a list or a dictionary, as the other answers suggest.
Variable names are not meant to be looked up while your code runs, so there is no clean way to change a "random variable". Instead, I recommend that you use a list or a dictionary. Then you can choose a random element from the list or a random key from the dictionary.
Given the constraint of four named variables, I might do:
from random import sample
a = 10
b = 18
c = 15
d = 92
v = sample("abcd", 2)
if "a" in v:
    a = "_"
if "b" in v:
    b = "_"
if "c" in v:
    c = "_"
if "d" in v:
    d = "_"
print(a, b, c, d)
This is readable, but it's also extremely repetitive. It's much easier to do it if there aren't four individual variables:
from random import sample
nums = {
"a": 10,
"b": 18,
"c": 15,
"d": 92,
}
for v in sample(list(nums), 2):  # sample() needs a sequence, so pass the keys as a list
    nums[v] = "_"
print(*nums.values())
import random
data = [10, 18, 15, 92]
already_changed = []
for i in range(2):  # change 2 to however many numbers you want to change
    index = random.randint(0, len(data) - 1)  # randomly select an index to change
    while index in already_changed:  # make sure not to change the same index twice
        index = random.randint(0, len(data) - 1)
    already_changed.append(index)  # remember this index so it is not picked again
    data[index] = "_"
print(data)
You cannot write choice(a, b, c, d) = '-' because the result of a function call is a value, not a name, so it cannot be assigned to. (choice also expects a single sequence, not four separate arguments.)
You should store the values in a list and then replace random elements with your string. That could be done like so:
from random import randint
a = 10
b = 18
c = 15
d = 92
replacement = "--"
nums = [a, b, c, d]
while nums.count(replacement) < 2:
    # create random index
    index = randint(0, len(nums)-1)
    # replace value at index with the replacement
    nums[index] = replacement
print(*nums) # print each element in the list
However, be aware that this code doesn't change the values of a, b, c, d. If you want that, you need to reassign the variables from the list afterwards, e.g. a, b, c, d = nums.
You can achieve this by picking any two indices from a list of items at random, and replacing the element at that index position with _:
import random
x = [10,18,15,92]
for i in random.sample(range(len(x)), 2):
    x[i] = '_'
print(x)
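If the four names have to stay, one way to combine the answers above is to do the random replacement on a list and then unpack the result back into the variables. This is only a sketch: the helper name mask_random and its parameters k and placeholder are made up for illustration and do not come from any of the answers.

```python
from random import sample


def mask_random(values, k=2, placeholder="_"):
    """Return a copy of `values` with k randomly chosen positions replaced.

    `mask_random`, `k` and `placeholder` are hypothetical names for this sketch.
    """
    chosen = set(sample(range(len(values)), k))  # k distinct indices
    return [placeholder if i in chosen else v for i, v in enumerate(values)]


a, b, c, d = 10, 18, 15, 92
# Unpacking the returned list rebinds all four names in one statement.
a, b, c, d = mask_random([a, b, c, d], k=2)
print(a, b, c, d)
```

Because the helper returns a new list instead of mutating its argument, the original values stay available if you keep a reference to them.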
I'm simplifying an engineering problem as much as possible for this question:
I have a working code like this:
import numpy as np
# FUNCTION DEFINITION
def Calculations(a, b): # a function defined to work based on 2 arguments, a and b
    A = a * b - a
    B = a * b - b
    d = A - B
    return(A, B, d, a, b)
# STORE LIST CREATION
A_list = []
B_list = []
d_list = []
a_list = []
b_list = [] # I will need this list later
# 1st sequential iteration in a for loop
length = np.arange(60, 62.5, 0.5)
for l in length:
    lower = 50 # this is what I want the program to update based on d
    upper = 70.5 # this is what I want the program to update based on d
    step = 0.5
    width = np.arange(lower, upper, step)
    # 2nd for loop, but here I wouldn't like a sequential iteration
    for w in width:
        A_list.append(Calculations(l, w)[0])
        B_list.append(Calculations(l, w)[1])
        d_list.append(Calculations(l, w)[2])
        a_list.append(Calculations(l, w)[3])
        b_list.append(Calculations(l, w)[4])
print(A_list, " \n")
print(B_list, " \n")
print(d_list, " \n")
print(a_list, " \n")
print(b_list, " \n")
This is the way I have it now, but not how I want it to work.
Here, the program iterates each time through the values of length(l) in a sequential manner, meaning it evaluates everything for l=60, then for l=60.5 and so on... this is ok, but then, for l=60 it evaluates first for w=50, then for w=50.5 and so on...
What I want is that, for l=60, the program evaluates a random value (let's call it n) between 50 (lower) and 70.5 (upper), on a grid with a spacing of 0.5 (step). It then gets a particular d as one of the returned results: if d is negative, the n it used becomes the new upper; if d is positive, that n becomes the new lower; and it continues like this until d is zero.
I will keep trying to figure it out by myself, but any help would be appreciated.
PS:
As I said this example is a simplification of my real problem, as side questions I would like to ask:
The real condition of while loop to break is not when d is zero, but the closest possible to zero, or phrased in other way, the min() of the abs() values composing the d_list. I tried something like:
for value in d_list:
    if value = min(abs(d_list)):
        print(A_list, " \n")
        print(B_list, " \n")
        print(d_list, " \n")
        print(a_list, " \n")
        print(b_list, " \n")
but that's not correct.
I don't want to use a condition such as if d < 0.2, because sometimes the best d I get is something like 0.6 and that may be fine. Neither do I want a condition like if d < 1, because then, if for example the best value is d = 0.005, many other d's would satisfy the condition before it, and I only want one for each l.
I also need to find the associated values in the returned lists for that specific d.
EDIT
I made a mistake earlier in the conditions for new upper and lower based on the obtained value of d, I fixed that.
Also, I tried solving the problem like this:
length = np.arange(60, 62.5, 0.5)
for l in length:
    lower_w = 59.5 # this is what I want the program to update based on d
    upper_w = 63 # this is what I want the program to update based on d
    step = 0.5
    width = np.arange(lower_w, upper_w, step)
    np.random.shuffle(width)
    for w in width:
        while lower_w < w < upper_w:
            A_list.append(Calculations(l,w)[0])
            B_list.append(Calculations(l,w)[1])
            d_list.append(Calculations(l,w)[2])
            a_list.append(Calculations(l,w)[3])
            b_list.append(Calculations(l,w)[4])
            for element in d_list:
                if element < 0:
                    upper = w
                else:
                    lower = w
                if abs(element) < 1 :
                    break
But the while loop does not get to break...
Use np.random.shuffle to pick the elements of width in a random order:
width = np.arange(lower, upper, step)
np.random.shuffle(width)
But here you don't really want the second loop; you just want to pick one element from width at random, so use np.random.choice(width):
length = np.arange(60, 62.5, 0.5)
lower = 50 # this is what I want the program to update based on d
upper = 70.5 # this is what I want the program to update based on d
step = 0.5
for l in length:
    width = np.arange(lower, upper, step)
    if len(width) == 0:
        width = [lower]
    w = np.random.choice(width)
    (A, B, d, a, b) = Calculations(l, w)
    A_list.append(A)
    B_list.append(B)
    d_list.append(d)
    a_list.append(l)
    b_list.append(w)
    if d < 0:
        lower = w
    elif d:
        upper = w
There is no need to return a and b from the Calculations function; you can just append the original arguments to a_list and b_list.
Note that np.arange returns an empty array when your lower and upper bounds are identical, so in that case you need to fall back to a one-element list containing the bound (that is what the len(width) == 0 check does).
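For the side question about stopping at the d closest to zero rather than at a fixed threshold: since d changes sign along the width grid, one option is a plain bisection over the grid indices instead of random picks, keeping the candidate with the smallest abs(d) seen so far. The following is a sketch under assumptions, not the asker's method: it assumes d increases with w (true for the simplified Calculations, where d = w - l), and closest_width and the lowercase calculations are made-up names.

```python
import numpy as np


def calculations(a, b):
    """Simplified stand-in for the question's Calculations function."""
    A = a * b - a
    B = a * b - b
    return A, B, A - B


def closest_width(l, lower=50.0, upper=70.5, step=0.5):
    """Bisect the width grid and return (w, A, B, d) with the smallest |d|.

    Hypothetical helper; assumes d increases monotonically with w.
    """
    width = np.arange(lower, upper, step)
    lo, hi = 0, len(width) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        w = float(width[mid])
        A, B, d = calculations(l, w)
        if best is None or abs(d) < abs(best[3]):
            best = (w, A, B, d)
        if d < 0:
            lo = mid + 1   # w is too small, search the upper half
        elif d > 0:
            hi = mid - 1   # w is too large, search the lower half
        else:
            break          # exact zero, cannot do better
    return best


for l in np.arange(60, 62.5, 0.5):
    w, A, B, d = closest_width(float(l))
    print(l, w, A, B, d)
```

Each l yields exactly one result tuple, so the associated A, B and w for the best d come back together instead of having to be looked up in parallel lists.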
In Python you can do this:
def boring_function(x):
    a = x + 1
    b = x - 1
    c = x / 1
    return a, b, c
a, b, c = boring_function(1)
I am looking for an equivalent method to use in R. So far, I have been trying to make an R list to retrieve the variables as follows:
boring_function = function(x) {
  a = x+1
  b = x-1
  c = x/1
  l = list(a = a, b = b, c = c)
  return(l)
}
result = boring_function(1)
a = result[["a"]]
b = result[["b"]]
c = result[["c"]]
However, note that this is a greatly simplified example; in reality I am writing very complex functions that need to return values, lists, etc. separately. In those cases, I have to return a single list that contains values, vectors, matrices, lists within lists, and so on. As you can imagine, the end result looks really messy and is extremely hard to work with. Very often I have to do something like this in order to get a specific result from an R function:
specific_result = complex_function(x, y, z)[["some_list"]][["another_list"]][["yet_another_list"]][finally_the_result]
I find this method very confusing and I would prefer the Python approach. But I have to do my work in R. Seems like it is not possible to have the exact Python approach, but, is there anything similar in R? If not, what is the best way to do what I want to do?
1) gsubfn: The gsubfn package supports this. Try help("list", "gsubfn") for more examples.
library(gsubfn)
boring_function <- function(x) {
  a <- x + 1
  b <- x - 1
  c <- x / 1
  list(a, b, c)
}
list[a, b, c] <- boring_function(0)
a
## [1] 1
b
## [1] -1
c
## [1] 0
# target components can be left unspecified if we don't need them
list[aa,,cc] <- boring_function(0)
aa
## [1] 1
cc
## [1] 0
2) with: Using with we can get a similar effect, with the limitation that the variables will not survive the with; however, we can pass a single return value out of the with, and the body of the with can be as long as you like, with as many statements as needed. The function must return a named list in this case. (Also look at ?within for another variation.)
boring_function2 <- function(x) {
  a <- x + 1
  b <- x - 1
  c <- x / 1
  list(a = a, b = b, c = c)
}
with(boring_function2(0), {
  cat(a, b, c, "\n")
})
# at this point a, b and c are not defined
# we can work with a, b, and c in the with and in the end we only need Sum
Sum <- with(boring_function2(0), a + b + c)
Sum
## [1] 0
3) attach: We can attach the list to the search path. This can be a bit confusing because variables of the same name in the workspace will mask those in the attached environment and, in general, this is not widely used or recommended, but we show what is possible. As in (2), the function must return a named list.
attach(boring_function2(0), name = "boring")
a
## [1] 1
# found on search path
grep("boring", search(), value = TRUE)
## [1] "boring"
# remove from search path
detach("boring")
4) shortcuts: If we have a named list but keep reusing the upper level, we can refer to leaves in a variety of ways, which may be more convenient in certain circumstances.
L <- list(a = list(a1 = 1, a2 = 2), b = list(b1 = 3, b2 = 4))
L[["a"]][["a1"]]
## [1] 1
La <- L$a
La$a1
## [1] 1
L[[c("a", "a1")]]
## [1] 1
ix <- c("a", "a1")
L[[ix]]
## [1] 1
L$a$a1
## [1] 1
L[[1:2]]
## [1] 2
5) walk tree: If the elements of the nested list have unique names, we can write a small function that will locate an element given its name.
findName <- function(L, name) {
  result <- NULL
  if (is.list(L)) {
    if (name %in% names(L)) result <- L[[name]]
    else for(el in L) {
      result <- Recall(el, name)
      if (!is.null(result)) break
    }
  }
  result
}
# test
L <- list(a = list(a1 = 1, a2 = 2, a3 = list(s = 11, t = 12),
                   b = list(b1 = 3, b2 = 4)))
# only need to specify a2 or a3 rather than entire path to it
a2 <- findName(L, "a2")
identical(a2, L$a$a2)
## [1] TRUE
a3 <- findName(L, "a3")
identical(a3, L$a$a3)
## [1] TRUE
We could also define our own class whose $ searches recursively through the list for a given name. Here LL$a2 uses findName to locate the name a2.
"$.llist" <- findName
LL <- structure(L, class = "llist")
identical(LL$a2, LL$a$a2)
## [1] TRUE
As already stated in the comments, this is not supported in R. It is an implementation choice in Python, and I would argue that storing your results the way you describe is bad design when programming in R.
In practice, if you "pass" your list down, it makes better sense to append each result to your list:
res <- list(res1 = a)
res$res2 <- b
# Continuing
Or evaluate which parts of the result you need, and only return these parts of the result. Note that at this point you only need to subset the upper level res$final_result because it was appended at the upper level instead of appending to the level of res1 for example.
There is, however, a package called zeallot that provides an interface similar to Python's a, b, c = 1, 2, 3:
library(zeallot)
c(a, b, c) %<-% c(1, 2, 3)
(Note that the left-hand side needs to be a vector, not a list.) However, this sacrifices some performance in your script. In general, a better approach is to think about how R works with data and to store your results accordingly.
I have two dictionaries of data from 2016 and 2017 respectively which have the same 5 keys. I want to calculate the percentage of each key's value to the sum of the values in its dictionary and then join the two percentages of each individual key to a label. I have managed to do so below but my method requires a lot of for looping and seems somewhat clunky. I am looking for ways of condensing or rewriting my code so as to make it more efficient.
UsersPerCountry, UsersPerPlatform, UsersPerPlatform2016, UsersPerPlatform2017 = Analytics.UsersPerCountryOrPlatform()
labels = []
sizes16 = []
sizes17 = []
sumc1 = 0
sumc2 = 0
percentages = []
for k, v in dict1.iteritems():
    sumv1 += v
for k, v in dict1.iteritems():
    v1 = round(((float(v) / sumc1) * 100), 1)
    percentages.append(v1)
    labels.append(k)
    sizes16.append(c)
for k, v in dict2.iteritems():
    sumv1 += v
for k, v in dict2.iteritems():
    v2 = round(((float(v) / sumc1) * 100), 1)
    percentages.append(v2)
    sizes17.append(c)
for i in range(5):
    labels[i] += (', ' + str(percentages[i]) + '%' + ', ' + str(percentages[i + 5]) + '%')
This is what the label looks like:
EDIT: I have now added the variable declaration. I thought the hashed line about setting all variables to empty lists or 0 would suffice.
You could use the pandas DataFrame class to simplify things. I am a bit unsure of how your percentages are being calculated, so that may need to be worked out a bit, but otherwise, try this:
import pandas as pd
#convert data to DataFrame class
df1 = pd.DataFrame(dict1)
df2 = pd.DataFrame(dict2)
#compute the percentages
percnt1 = df1.sum(axis=0).div(df1.sum().sum())
percnt2 = df2.sum(axis=0).div(df2.sum().sum())
#to get the sum:
percnt1 + percnt2
Here's an example:
## create a data frame:
import numpy as np
df1 = pd.DataFrame({'Android':np.random.poisson(10,100), 'iPhone':np.random.poisson(10,100),
'OSX':np.random.poisson(10,100), 'WEBGL':np.random.poisson(10,100), 'Windows':np.random.poisson(10,100)})
In [11]: df1.head()
Out[11]:
   Android  OSX  WEBGL  Windows  iPhone
0       12   12      9        9       5
1        9    8     14        7      11
2       12   10      7       10      11
3       11   12      7       17       5
4       15   16     15       11      13
In [10]: df1.sum(axis=0).div(df1.sum(axis=0).sum())
Out[10]:
Android    0.205279
OSX        0.198782
WEBGL      0.200609
Windows    0.198376
iPhone     0.196954
dtype: float64
Without Pandas:
You should take advantage of some of Python's built-in features, as well as functions. Here I'm trying to replicate what you're doing to be a little more Pythonic.
Note this is untested because you didn't give a full code snippet (sumc1 and c were undeclared). I wrote this based on what I think you're trying to do.
# Your sizes16/sizes17 lists appear to be full of the constant c,
# so you can use Python's list replication operation
sizes16 = [c]*len(dict1)
sizes17 = [c]*len(dict2)
# define function for clarity / reduce redundancy
def get_percentages(l):
    s = sum(l)
    percentages = [ round(((float(n) / s)*100),1) for n in l ] # percentages calculation is a great place for list comprehension
    return percentages
# can grab the labels directly, rather than in a loop
labels = list(dict1.keys())
# look the values up by label so both lists follow the same key order
percentages1 = get_percentages([dict1[k] for k in labels])
percentages2 = get_percentages([dict2[k] for k in labels])
# no magic number 5
for i in range(len(labels)):
    labels[i] += (', ' + str(percentages1[i]) + '%' + ', ' + str(percentages2[i]) + '%')
That last line could be cleaned up if I had a better idea of what you were doing.
I haven't looked closely, but this code may run over the data an extra once or twice, so it may be a little less efficient. However, it's much more readable IMO.
Here's a way to go without an external library. You don't mention any problems with the way the code runs, just its aesthetics (which one could argue have an effect on the way it runs). Anyway, this looks clean:
# Sample data
d1 = {'a':1.,'b':6.,'c':10.,'d':5.}
d2 = {'q':10.,'r':60.,'s':100.,'t':50.}
# List comprehension for each dictionary sum
sum1 = sum([v for k,v in d1.items()])
sum2 = sum([v for k,v in d2.items()])
# Using maps and lambda functions to get the distributions of each dictionary
d1_dist = list(map(lambda x: round(x/sum1*100, 1), d1.values()))  # list() so the result can be printed and reused
d2_dist = list(map(lambda y: round(y/sum2*100, 1), d2.values()))
# Insert your part with the labels here (I really didn't get that part)
>>> print(d1_dist)
[4.5, 45.5, 27.3, 22.7]
And if you want to join the original keys from a dictionary to these new distribution values, just use:
d1_formatted = dict(zip(list(d1.keys()), d1_dist))
>>> print(d1_formatted)
{'a': 4.5, 'c': 45.5, 'b': 27.3, 'd': 22.7}
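To tie the answers above back to the labels the question asks for, here is a small Python 3 sketch that computes both percentage tables and builds one label per key. The sample counts and the names users_2016, users_2017 and percentages are invented for illustration; the question's real dictionaries would take their place.

```python
def percentages(counts):
    """Map each key to its share of the dictionary total, in percent."""
    total = float(sum(counts.values()))
    return {key: round(value / total * 100, 1) for key, value in counts.items()}


# Made-up sample data standing in for the 2016 and 2017 dictionaries.
users_2016 = {"Android": 50, "iPhone": 30, "OSX": 10, "WEBGL": 5, "Windows": 5}
users_2017 = {"Android": 40, "iPhone": 40, "OSX": 10, "WEBGL": 6, "Windows": 4}

pct_2016 = percentages(users_2016)
pct_2017 = percentages(users_2017)

# Build each label from the key itself, so dictionary order does not matter.
labels = [
    "{}, {}%, {}%".format(key, pct_2016[key], pct_2017[key])
    for key in users_2016
]
print(labels)
```

Looking both percentages up by key avoids relying on the two dictionaries iterating in the same order, which the index-based versions implicitly assume.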
I need to find the set of all subsets of size m of a given set of size n. In the simplest version of my algorithm I use itertools.combinations(), but with n = 566 and m = 11 the computation never finishes: even after waiting 10 hours, the process just ends with killed: 9. Moreover, I need the result as a set of sets (where the elements of the set are themselves sets) rather than a list of tuples, which makes the problem even more expensive computationally.
from itertools import combinations, product
S = set()
Sf = frozenset(combinations(xrange(1, 566), 11))
for i in Sf:
    t = frozenset(i)
    S.add(t)
The more sophisticated version instead uses itertools.combinations() on two sets A and B, both with n = 566, and with m = 2 and 9 respectively (I will call them d and k). The final set S is then made up of the unions of the elements of these two sets (a and b for elements of A and B), such that S = {t = a ∪ b : a ∈ A, b ∈ B, a ∩ b = ∅}.
A = set()
As = frozenset(combinations(xrange(1, 566), 2))
for i in As:
    a = frozenset(i)
    A.add(a)
B = set()
Bs = frozenset(combinations(xrange(1, 566), 9))
for j in Bs:
    b = frozenset(j)
    B.add(b)
ab = set()
ba = set()
for i in A:
    for j in B:
        if i.issubset(j):
            ab.add(i)
            ba.add(j)
A_ab = A.difference(ab)
B_ba = B.difference(ba)
S = frozenset(product(B_ba, A_ab))
In the end, I think my problems are computation time and memory usage (I'm using Python 2.7.9 on a Mac with a 1.7 GHz i7 and 8 GB of RAM). Obviously, a more efficient and optimized algorithm would be welcome!
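As a rough sanity check on why this never finishes: the number of 11-element subsets of a 565-element range is on the order of 10^22, far more than can be stored in memory or even enumerated in a reasonable time. The sketch below counts the subsets without building them and consumes combinations lazily instead of materialising a set. It assumes Python 3.8 or later (for math.comb), whereas the question uses Python 2.7, so treat it as an illustration rather than a drop-in replacement.

```python
from itertools import combinations, islice
from math import comb

n, m = 565, 11  # range(1, 566) has 565 elements

# Number of subsets, computed without generating any of them.
total = comb(n, m)
print("subsets to enumerate:", total)

# combinations() is lazy: take only what is needed, one frozenset at a time.
first_three = [frozenset(t) for t in islice(combinations(range(1, n + 1), m), 3)]
for subset in first_three:
    print(sorted(subset))
```

Wrapping the generator in frozenset(...) or set(...), as in the question, forces every combination into memory at once, which is the step that gets the process killed.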
So, I have a huge input file that looks like this: (you can download here)
1. FLO8;PRI2
2. FLO8;EHD3
3. GRI2;BET2
4. HAL4;AAD3
5. PRI2;EHD3
6. QLN3;FZF1
7. QLN3;ABR5
8. FZF1;ABR5
...
Think of it as a two-column table in which the element before the ";" points to the element after it.
I want to print simple strings iteratively that show the three elements that constitute a feedforward loop.
The example numbered list from above would output:
"FLO8 PRI2 EHD3"
"QLN3 FZF1 ABR5"
...
Explaining the first output line as a feedforward loop:
A -> B (FLO8;PRI2)
B -> C (PRI2;EHD3)
A -> C (FLO8;EHD3)
Only the circled one from this link
So, I have this, but it is terribly slow...Any suggestions to make a faster implementation?
import csv
TF = []
TAR = []
# READING THE FILE
with open("MYFILE.tsv") as tsv:
    for line in csv.reader(tsv, delimiter=";"):
        TF.append(line[0])
        TAR.append(line[1])
# I WANT A BETTER WAY TO RUN THIS.. All these for loops are killing me
for i in range(len(TAR)):
    for j in range(len(TAR)):
        if ( TAR[j] != TF[j] and TAR[i] != TF[i] and TAR[i] != TAR[j] and TF[j] == TF[i] ):
            for k in range(len(TAR )):
                if ( not(k == i or k == j) and TF[k] == TAR[j] and TAR[k] == TAR[i]):
                    print "FFL: "+TF[i]+ " "+TAR[j]+" "+TAR[i]
NOTE: I don't want self-loops...from A -> A, B -> B or C -> C
I use a dict of sets to allow very fast lookups, like so:
Edit: prevented self-loops:
from collections import defaultdict
INPUT = "RegulationTwoColumnTable_Documented_2013927.tsv"
# load the data as { "ABF1": set(["ABF1", "ACS1", "ADE5,7", ... ]) }
data = defaultdict(set)
with open(INPUT) as inf:
    for line in inf:
        a,b = line.rstrip().split(";")
        if a != b: # no self-loops
            data[a].add(b)
# find all triplets such that A -> B -> C and A -> C
found = []
for a,bs in data.items():
    bint = bs.intersection
    for b in bs:
        # .get avoids adding new keys to the defaultdict while iterating over it
        for c in bint(data.get(b, ())):
            found.append("{} {} {}".format(a, b, c))
On my machine, this loads the data in 0.36s and finds 1,933,493 solutions in 2.90s; results look like
['ABF1 ADR1 AAC1',
'ABF1 ADR1 ACC1',
'ABF1 ADR1 ACH1',
'ABF1 ADR1 ACO1',
'ABF1 ADR1 ACS1',
Edit2: not sure this is what you want, but if you need A -> B and A -> C and B -> C but not B -> A or C -> A or C -> B, you could try
found = []
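for a,bs in data.items():
    bint = bs.intersection
    for b in bs:
        # .get avoids adding new keys to the defaultdict while iterating over it
        if a not in data.get(b, ()):
            for c in bint(data.get(b, ())):
                if a not in data.get(c, ()) and b not in data.get(c, ()):
                    found.append("{} {} {}".format(a, b, c))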
but this still returns 1,380,846 solutions.
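For a quick check of the dict-of-sets idea on a small input, here is a self-contained Python 3 sketch that runs it on the eight sample lines from the question instead of the real file. The edges list and the names targets and empty are introduced only for this illustration.

```python
from collections import defaultdict

# The eight sample edges from the question, already split into (source, target).
edges = [
    ("FLO8", "PRI2"), ("FLO8", "EHD3"), ("GRI2", "BET2"), ("HAL4", "AAD3"),
    ("PRI2", "EHD3"), ("QLN3", "FZF1"), ("QLN3", "ABR5"), ("FZF1", "ABR5"),
]

targets = defaultdict(set)
for source, target in edges:
    if source != target:  # skip self-loops
        targets[source].add(target)

empty = frozenset()
found = []
for a, bs in targets.items():
    for b in bs:
        # c must be a target of both a and b; .get never inserts new keys
        for c in bs & targets.get(b, empty):
            found.append("{} {} {}".format(a, b, c))

print(sorted(found))
```

On this sample it reports the two loops the question expects, FLO8 PRI2 EHD3 and QLN3 FZF1 ABR5. The set intersection replaces the innermost loop over all rows, which is where most of the speed-up comes from.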
Test set
targets = {'A':['B','C','D'],'B':['C','D'],'C':['A','D']}
And the function
for i in targets.keys():
    try:
        for y in targets.get(i):
            #compares the dict values of two keys and saves the overlapping ones to diff
            diff = list(set(targets.get(i)) & set(targets.get(y)))
            #if there is at least one element overlapping from key.values i and y
            #take up those elements and style them with some arrows
            if (len(diff) > 0 and not i == y):
                feed = i +'->'+ y + '-->'
                forward = '+'.join(diff)
                feedForward = feed + forward
                print (feedForward)
    except:
        pass
The output is
A->B-->C+D
A->C-->D
C->A-->D
B->C-->D
Greetings to the Radboud Computational Biology course, Robin (q1/2016).