My code is as follows:
import numpy as np
environment_rows = 3 #1
look_up_tables = [np.zeros((environment_rows, 20)) for _ in range(35)]
def get_starting_look_up_table(): #2
    current_look_up_table = np.random.randint(35)
    return current_look_up_table
def get_next_action(number_of_look_up_table,current_row_index,epsilon):
    if np.random.random() < epsilon: #choose the action with the largest value
        result=np.argmax(look_up_tables[number_of_look_up_table][current_row_index])
    else: #choose a random action
        result=np.random.randint(20)
    return result
number_of_look_up_table = get_starting_look_up_table() #3
current_row_index = 0
epsilon = 0.9
action_index = get_next_action(current_row_index,number_of_look_up_table,epsilon)
In the first part, I create look_up_tables, a list of 35 arrays, each with three rows and twenty columns.
In the second part, the 'get_starting_look_up_table' function randomly selects one of the 35 arrays. The 'get_next_action' function should then select one of the 20 columns of that array, either the column with the largest value or a random one.
Finally, the functions are called in the third part, but I get the following error. When I run the failing line on its own, I don't have any problems. Thank you for any guidance.
IndexError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_28528/1385508929.py in <module>
2 current_row_index = 0
3
----> 4 action_index = get_next_action(current_row_index,number_of_look_up_table,epsilon)
5
6 reward = determine_rewards(number_of_look_up_table,current_row_index,action_index)
~\AppData\Local\Temp/ipykernel_28528/255015173.py in get_next_action(number_of_look_up_table, current_row_index, epsilon)
9 def get_next_action(number_of_look_up_table,current_row_index,epsilon):
10 if np.random.random() < epsilon:
---> 11 return np.argmax(look_up_tables[number_of_look_up_table][current_row_index])
12 else: #choose a random action
13 return np.random.randint(20)
IndexError: index 27 is out of bounds for axis 0 with size 3
Your function call get_next_action(current_row_index, number_of_look_up_table, epsilon) passes the arguments in the wrong order: current_row_index and number_of_look_up_table are swapped.
As a result, the table number (27 in your traceback) is used as the row index of an array that has only 3 rows, which raises the IndexError.
I think you want this:
action_index = get_next_action(number_of_look_up_table, current_row_index, epsilon)
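One way to guard against this kind of mix-up is to pass the arguments by keyword. The snippet below is only a minimal sketch: it rebuilds the setup from the question with a fixed table number (27, taken from the traceback), and it uses epsilon = 1.0 in the swapped call purely so that the greedy branch is always taken and the error reproduces deterministically.

```python
import numpy as np

# Reconstruction of the question's setup: 35 tables, each 3 rows x 20 columns.
look_up_tables = [np.zeros((3, 20)) for _ in range(35)]

def get_next_action(number_of_look_up_table, current_row_index, epsilon):
    """Return a column index (0..19) for the given table and row."""
    if np.random.random() < epsilon:
        return int(np.argmax(look_up_tables[number_of_look_up_table][current_row_index]))
    return int(np.random.randint(20))

# Keyword arguments make the call independent of the positional order.
action_index = get_next_action(
    current_row_index=0,
    number_of_look_up_table=27,
    epsilon=0.9,
)
print("action:", action_index)

# The swapped positional call from the question: row index first, table number second.
try:
    get_next_action(0, 27, 1.0)
    swapped_call_failed = False
except IndexError as exc:
    swapped_call_failed = True
    print("IndexError:", exc)
```

With keywords, the order in which the values are written no longer matters, so a swap like the one above cannot silently index the wrong axis.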
I am new to Python.
The recursive code below works as expected, so far so good.
What I am wondering about is why it stops calling itself recursively after 5 "loops".
x=0
def factfunc(n):
    global x
    x+=1
    print("loop #:", x, "-> ", end="")
    if n < 0:
        print("returning 'None'")
        return None
    if n < 2:
        print("returning '1'")
        return 1
    print ("n=",n)
    return n * factfunc(n - 1)
print ("Finally returned:", factfunc(5))
Output:
loop #: 1 -> n= 5
loop #: 2 -> n= 4
loop #: 3 -> n= 3
loop #: 4 -> n= 2
loop #: 5 -> returning '1'
Finally returned: 120
Hints would be appreciated. Thanks.
Not sure if I am supposed to answer my own question, but I am doing it anyway:
Thanks to trincot's comment I believe I understand it now. I think this is what happens:
return 5 x          # nothing is returned yet; the function calls itself again
  (return 4 x       # nothing is returned yet; the function calls itself again
    (return 3 x     # nothing is returned yet; the function calls itself again
      (return 2 x   # nothing is returned yet; the function calls itself again
        (return 1)  # returns 1 (base case, no more recursion from here on)
      )             # returns 2 x 1 = 2
    )               # returns 3 x 2 = 6
  )                 # returns 4 x 6 = 24
                    # finally returns 5 x 24 = 120
=> 120
I hope the above is understandable. Thanks again.
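To make the nesting above visible at run time, here is a small sketch of my own (the name factorial_trace and the log list are illustrative, not part of the original code). It records every call and every return, indented by recursion depth.

```python
def factorial_trace(n, depth=0, log=None):
    """Return (n!, log), where log records each call and return in order."""
    if log is None:
        log = []
    indent = "  " * depth
    log.append(f"{indent}call factorial({n})")
    if n < 2:
        result = 1  # base case: no further recursive call
    else:
        sub_result, _ = factorial_trace(n - 1, depth + 1, log)
        result = n * sub_result  # runs only after the inner call has returned
    log.append(f"{indent}return {result}")
    return result, log

value, trace = factorial_trace(5)
print("\n".join(trace))
print("Finally returned:", value)
```

The five "call" lines come first and the five "return" lines follow in reverse order, which is the same unwinding as in the hand-written diagram.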
I was trying to write a function to generate Hamming numbers and encountered this code on www.w3resource.com.
The code is very easy but I can't seem to understand the order of output values.
def is_hamming_numbers(x):
    if x==1:
        return True
    if x%2==0:
        return is_hamming_numbers(x/2)
    if x%3==0:
        return is_hamming_numbers(x/3)
    if x%5==0:
        return is_hamming_numbers(x/5)
    return False
def hamming_numbers_sequence(x):
    if x==1:
        return 1
    hamming_numbers_sequence(x - 1)
    if is_hamming_numbers(x)==True:
        print('%s'%x,end=' ')
hamming_numbers_sequence(10)
I expected the output to be:
10 9 8 6 5 4 3 2
The actual output is:
2 3 4 5 6 8 9 10
Could anyone please explain why the order of the numbers is reversed? I tried changing the order of the statements like this:
    if is_hamming_numbers(x)==True:
        print('%s'%x,end=' ') #this first
    hamming_numbers_sequence(x - 1) #then this line
And that gives the output in the order I expected.
def hamming_numbers_sequence(x):
    if x==1:
        return 1
    hamming_numbers_sequence(x - 1)  # the recursion happens here, before anything is printed
    if is_hamming_numbers(x)==True:  # first reached with x=2
        print('%s'%x,end=' ')
The call hamming_numbers_sequence(x - 1) keeps repeating until it reaches the stopping condition if x==1. Only then do the pending calls resume, innermost first, so the first check that actually runs is is_hamming_numbers(2)==True, followed by x=3, and so on up to x=10.
That is why you get the output in ascending order. If you want to change it, print before the recursive call, for example:
def hamming_numbers_sequence(x):
    print('%s'%x,end=' ')  # runs before the recursive call, so x is printed in descending order (every x, not only Hamming numbers)
    if x==1:
        return 1
    hamming_numbers_sequence(x - 1)
    if is_hamming_numbers(x)==True:
        pass  # do whatever you want here
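To see the effect of the statement order side by side, here is a minimal sketch. The function names ascending and descending and the iterative is_hamming helper are my own; they collect values in a list instead of printing, but follow the same recursion pattern as the code in the question.

```python
def is_hamming(x):
    """True if x (a positive integer) has no prime factors other than 2, 3 and 5."""
    for p in (2, 3, 5):
        while x % p == 0:
            x //= p
    return x == 1

def ascending(x, out):
    """Recurse first, record afterwards: the smallest value is recorded first."""
    if x == 1:
        return out
    ascending(x - 1, out)
    if is_hamming(x):
        out.append(x)
    return out

def descending(x, out):
    """Record first, recurse afterwards: the largest value is recorded first."""
    if x == 1:
        return out
    if is_hamming(x):
        out.append(x)
    descending(x - 1, out)
    return out

print(ascending(10, []))   # [2, 3, 4, 5, 6, 8, 9, 10]
print(descending(10, []))  # [10, 9, 8, 6, 5, 4, 3, 2]
```

Work placed after the recursive call runs while the call stack unwinds, so it sees the values in the opposite order from work placed before the call.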
def one_good_turn(n):
    return n + 1
def deserves_another(n):
    return one_good_turn(n) + 2
print(one_good_turn(1))
print(deserves_another(2))
I have two functions, one_good_turn(n) and deserves_another(n), and I call them with the arguments 1 and 2 respectively.
I expected the output to be:
2
4
but it shows:
2
5
Why is the output not what I had expected?
I believe you assume that the one_good_turn(n) call inside deserves_another(n) reuses the value that was computed previously. It does not. It receives the current input n, which is 2, calls the function again, and computes 2 + 1 = 3. Then 3 + 2 = 5.
To get your desired output, you could pass 1 to deserves_another:
def one_good_turn(n):
    return n + 1
def deserves_another(n):
    return one_good_turn(n) + 2
print(one_good_turn(1)) # 2
print(deserves_another(1)) # 4
Alternatively, return the value from one_good_turn and pass it to deserves_another, so you don't need to call one_good_turn again inside deserves_another:
def one_good_turn(n):
    n = n + 1
    print(n) # 2
    return n
def deserves_another(n):
    return n + 2
n = one_good_turn(1)
print(deserves_another(n)) # 4
deserves_another(2) calls one_good_turn(2), which returns 2+1=3.
deserves_another then adds 2 to that result and returns 3+2=5.
I've got a lousy HTTPD access_log and just want to skip the "lousy" lines.
In Scala this is straightforward:
import scala.util.Try
val log = sc.textFile("access_log")
log.map(_.split(' ')).map(a => Try(a(8))).filter(_.isSuccess).map(_.get).map(code => (code,1)).reduceByKey(_ + _).collect()
For Python I've got the following solution, which explicitly defines a function instead of using the "lambda" notation:
log = sc.textFile("access_log")
def wrapException(a):
    try:
        return a[8]
    except IndexError:  # the line has fewer than 9 fields
        return 'error'
log.map(lambda s : s.split(' ')).map(wrapException).filter(lambda s : s!='error').map(lambda code : (code,1)).reduceByKey(lambda acu,value : acu + value).collect()
Is there a better way of doing this in PySpark (e.g. like in Scala)?
Thanks a lot!
Better is a subjective term but there are a few approaches you can try.
The simplest thing you can do in this particular case is to avoid exceptions altogether. All you need is a flatMap and some slicing:
log.flatMap(lambda s : s.split(' ')[8:9])
As you can see, this needs neither exception handling nor a subsequent filter.
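The trick works because slicing never raises: an out-of-range slice simply yields an empty list, and flatMap drops empty results. Here is a local sketch without Spark; the sample lines are made up, and itertools.chain.from_iterable stands in for flatMap.

```python
from itertools import chain

# Made-up sample lines; the second one has fewer than 9 space-separated fields.
lines = [
    "a b c d e f g h 200 j",
    "broken line",
    "a b c d e f g h 404",
]

def ninth_field(line):
    """Return [field] if the line has a 9th field, otherwise an empty list."""
    return line.split(' ')[8:9]

# Flattening the per-line lists mimics what flatMap does on an RDD.
codes = list(chain.from_iterable(ninth_field(line) for line in lines))
print(codes)  # ['200', '404']
```

Indexing with [8] would raise IndexError on the broken line, whereas the slice [8:9] quietly returns [].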
The previous idea can be extended with a simple wrapper:
def seq_try(f, *args, **kwargs):
    try:
        return [f(*args, **kwargs)]
    except Exception:
        return []
An example usage:
from operator import truediv as div # operator.div exists only in Python 2. FYI operator provides getitem as well.
rdd = sc.parallelize([1, 2, 0, 3, 0, 5, "foo"])
rdd.flatMap(lambda x: seq_try(div, 1., x)).collect()
## [1.0, 0.5, 0.3333333333333333, 0.2]
Finally, a more object-oriented approach:
import inspect as _inspect
class _Try(object): pass
class Failure(_Try):
    def __init__(self, e):
        if Exception not in _inspect.getmro(e.__class__):
            msg = "Invalid type for Failure: {0}"
            raise TypeError(msg.format(e.__class__))
        self._e = e
        self.isSuccess = False
        self.isFailure = True
    def get(self): raise self._e
    def __repr__(self):
        return "Failure({0})".format(repr(self._e))
class Success(_Try):
    def __init__(self, v):
        self._v = v
        self.isSuccess = True
        self.isFailure = False
    def get(self): return self._v
    def __repr__(self):
        return "Success({0})".format(repr(self._v))
def Try(f, *args, **kwargs):
    try:
        return Success(f(*args, **kwargs))
    except Exception as e:
        return Failure(e)
and example usage:
tries = rdd.map(lambda x: Try(div, 1.0, x))
tries.collect()
## [Success(1.0),
## Success(0.5),
## Failure(ZeroDivisionError('float division by zero',)),
## Success(0.3333333333333333),
## Failure(ZeroDivisionError('float division by zero',)),
## Success(0.2),
## Failure(TypeError("unsupported operand type(s) for /: 'float' and 'str'",))]
tries.filter(lambda x: x.isSuccess).map(lambda x: x.get()).collect()
## [1.0, 0.5, 0.3333333333333333, 0.2]
You can even use pattern matching with multipledispatch:
from multipledispatch import dispatch
from operator import getitem
@dispatch(Success)
def check(x): return "Another great success"
@dispatch(Failure)
def check(x): return "What a failure"
a_list = [1, 2, 3]
check(Try(getitem, a_list, 1))
## 'Another great success'
check(Try(getitem, a_list, 10))
## 'What a failure'
If you like this approach, I've pushed a slightly more complete implementation to GitHub and PyPI.
First, let me generate some random data to start working with.
import random
number_of_rows = int(1e6)
line_error = "error line"
text = []
for i in range(number_of_rows):
    choice = random.choice([1,2,3,4])
    if choice == 1:
        line = line_error
    elif choice == 2:
        line = "1 2 3 4 5 6 7 8 9_1"
    elif choice == 3:
        line = "1 2 3 4 5 6 7 8 9_2"
    elif choice == 4:
        line = "1 2 3 4 5 6 7 8 9_3"
    text.append(line)
Now I have a list, text, whose entries look like this:
1 2 3 4 5 6 7 8 9_2
error line
1 2 3 4 5 6 7 8 9_3
1 2 3 4 5 6 7 8 9_2
1 2 3 4 5 6 7 8 9_3
1 2 3 4 5 6 7 8 9_1
error line
1 2 3 4 5 6 7 8 9_2
....
Your solution:
def wrapException(a):
    try:
        return a[8]
    except IndexError:  # the line has fewer than 9 fields
        return 'error'
log.map(lambda s : s.split(' ')).map(wrapException).filter(lambda s : s!='error').map(lambda code : (code,1)).reduceByKey(lambda acu,value : acu + value).collect()
#[('9_3', 250885), ('9_1', 249307), ('9_2', 249772)]
Here is my solution:
from operator import add
def myfunction(l):
    try:
        return (l.split(' ')[8],1)
    except IndexError:  # the line has fewer than 9 fields
        return ('MYERROR', 1)
log.map(myfunction).reduceByKey(add).collect()
#[('9_3', 250885), ('9_1', 249307), ('MYERROR', 250036), ('9_2', 249772)]
Comments:
(1) I highly recommend also counting the lines with "error". It adds little overhead and doubles as a sanity check: all the counts should add up to the total number of rows in the log. If you filter those lines out, you cannot tell whether they are truly bad lines or whether something went wrong in your code.
(2) I try to package all the line-level operations in one function to avoid chaining map and filter calls, which makes the code more readable.
(3) From a performance perspective, I generated a sample of 1M records; my code finished in 3 seconds and yours in 2 seconds. This is not a fair comparison, since the data is so small and my cluster is pretty beefy. I would recommend generating a bigger file (1e12?) and running the benchmark yourself.
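The sanity check from comment (1) can be sketched locally without Spark. This is only an illustration: the sample lines and the helper key_of are made up, and collections.Counter plays the role of map plus reduceByKey.

```python
from collections import Counter

# Made-up sample mirroring the shape of the generated data above.
lines = [
    "1 2 3 4 5 6 7 8 9_1",
    "error line",
    "1 2 3 4 5 6 7 8 9_2",
    "error line",
    "1 2 3 4 5 6 7 8 9_1",
]

def key_of(line):
    """Return the 9th field, or 'MYERROR' when the line has fewer than 9 fields."""
    fields = line.split(' ')
    return fields[8] if len(fields) > 8 else 'MYERROR'

counts = Counter(key_of(line) for line in lines)
print(counts)

# Sanity check: every input line is accounted for exactly once.
total_matches = sum(counts.values()) == len(lines)
print("counts add up to the number of lines:", total_matches)
```

Because the bad lines get their own key instead of being filtered out, a mismatch between the summed counts and the row count would point to a bug in the parsing logic rather than to bad input.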