Weird results with unity ml agents python api - python

I am using the 3DBall example environment, but I am getting some strange results that I don't understand. My code so far is just a for loop that prints the reward and fills in the required inputs with random values. However, when I run it, a negative reward is never shown, and at random times there are no decision steps. That would make sense, but shouldn't the environment just keep simulating until there is a decision step? Any help would be greatly appreciated, as other than the documentation there are few to no resources out there for this.
    env = UnityEnvironment()
    env.reset()
    behavior_names = env.behavior_specs
    for i in range(50):
        arr = []
        behavior_names = env.behavior_specs
        for i in behavior_names:
            print(i)
        DecisionSteps = env.get_steps("3DBall?team=0")
        print(DecisionSteps[0].reward,len(DecisionSteps[0].reward))
        print(DecisionSteps[0].action_mask) #for some reason it returns action mask as false when Decisionsteps[0].reward is empty and is None when not
        for i in range(len(DecisionSteps[0])):
            arr.append([])
            for b in range(2):
                arr[-1].append(random.uniform(-10,10))
        if(len(DecisionSteps[0])!= 0):
            env.set_actions("3DBall?team=0",numpy.array(arr))
            env.step()
        else:
            env.step()
    env.close()

I think your problem is that when the simulation terminates and needs to be reset, the agent does not return a decision_step but rather a terminal_step. This happens because the agent has dropped the ball, and the reward returned in the terminal_step will be -1.0. I have taken your code and made some changes, and now it runs fine (except that you probably want to change it so that you don't reset every time one of the agents drops its ball).
    import numpy as np
    import mlagents
    from mlagents_envs.environment import UnityEnvironment
    # -----------------
    # This code is used to close an env that might not have been closed before
    # (for example when re-running a notebook cell). On a fresh run `env` does
    # not exist yet, so the resulting error is simply ignored.
    try:
        env.close()
    except Exception:
        pass
    # -----------------
    env = UnityEnvironment(file_name = None)
    env.reset()
    for i in range(1000):
        arr = []
        behavior_names = env.behavior_specs
        # Go through all existing behaviors
        for behavior_name in behavior_names:
            decision_steps, terminal_steps = env.get_steps(behavior_name)
            for agent_id_terminated in terminal_steps:
                print("Agent " + behavior_name + " has terminated, resetting environment.")
                # This is probably not the desired behaviour, as the other agents are still active.
                env.reset()
            actions = []
            for agent_id_decisions in decision_steps:
                actions.append(np.random.uniform(-1,1,2))
            # print(decision_steps[0].reward)
            # print(decision_steps[0].action_mask)
            if len(actions) > 0:
                env.set_actions(behavior_name, np.array(actions))
        try:
            env.step()
        except Exception:
            print("Something happened when taking a step in the environment.")
            print("The communicator has probably terminated, stopping simulation early.")
            break
    env.close()
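To see the negative reward without resetting the whole environment, one option is to read rewards from both objects returned by env.get_steps(). The following is only a minimal sketch: the helper accumulate_rewards is hypothetical, it assumes the step objects expose parallel agent_id and reward sequences, and the SimpleNamespace objects are stand-ins so the example runs without Unity.

```python
from types import SimpleNamespace


def accumulate_rewards(totals, finished, decision_steps, terminal_steps):
    """Update per-agent returns from one (decision_steps, terminal_steps) pair.

    totals:   dict mapping agent id to the return of its running episode
    finished: list receiving (agent_id, episode_return) for ended episodes
    """
    # Terminal steps are handled first, in case an agent shows up in both
    # objects on the same step. The final reward only appears here.
    for agent_id, reward in zip(terminal_steps.agent_id, terminal_steps.reward):
        finished.append((agent_id, totals.pop(agent_id, 0.0) + float(reward)))
    # Agents asking for an action: their episode is still running.
    for agent_id, reward in zip(decision_steps.agent_id, decision_steps.reward):
        totals[agent_id] = totals.get(agent_id, 0.0) + float(reward)


# Hypothetical stand-ins for the two objects returned by env.get_steps().
totals, finished = {}, []
decisions = SimpleNamespace(agent_id=[0, 1], reward=[0.1, 0.1])
terminals = SimpleNamespace(agent_id=[], reward=[])
accumulate_rewards(totals, finished, decisions, terminals)

# Next step: agent 1 drops the ball and appears only in the terminal steps.
decisions = SimpleNamespace(agent_id=[0], reward=[0.1])
terminals = SimpleNamespace(agent_id=[1], reward=[-1.0])
accumulate_rewards(totals, finished, decisions, terminals)
print(totals)
print(finished)
```

In the real loop, the two SimpleNamespace objects would be replaced by the pair returned from env.get_steps(behavior_name), and the env.reset() call inside the terminal-step loop could then be dropped.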

Related

Trying to simulate a jump process in discrete time

I was trying to simulate a jump process in Python, defined as follows:
To simulate the process, I evaluate its value at each time t, building a "history" of the values the process has taken. This is done according to the following function, contained in a class:
    def evaluate(self, t):
        try:
            return self.history[t], {} # History of the process
        except IndexError:
            new_steps = t - len(self.history) # Calculate how many steps must be evaluated in order to reach t
            values_to_sample = 1 + new_steps
            new_values = np.zeros(values_to_sample)
            temp = self.history[-1] # Take last seen value
            for i in range(values_to_sample):
                temp = temp + self.A*( UNKNOWN TERM - temp ) + np.random.normal(scale=self.sigma)
                # UNKNOWN TERM is the element that I want to "simulate"
                new_values[i] = temp
            self.history = np.concatenate((self.history, new_values))
            return self.history[t], {}
The problem is that I do not know how to simulate the Yt term. To my understanding, what I should do is evaluate at each new sample whether a jump has occurred, but I do not know how to do that. Can someone help me understand what I should do?
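The definition of Yt is not reproduced here, so the following is only a sketch of the general idea described above: at each step, a random draw decides whether a jump occurs. The function name simulate_jump_term, the per-step jump probability, and the normally distributed jump size are all assumptions made for illustration, not part of the process definition.

```python
import random


def simulate_jump_term(n_steps, jump_prob, jump_scale, y0=0.0, rng=None):
    """Return n_steps values of a piecewise-constant term that jumps at random.

    At every step a uniform draw is compared with jump_prob. If a jump
    occurs, a Gaussian increment with standard deviation jump_scale is added.
    Both distributions are illustrative assumptions.
    """
    rng = rng or random.Random()
    values = []
    current = y0
    for _ in range(n_steps):
        if rng.random() < jump_prob:      # did a jump occur at this step?
            current += rng.gauss(0.0, jump_scale)
        values.append(current)            # the value stays flat between jumps
    return values


path = simulate_jump_term(10, jump_prob=0.2, jump_scale=1.0,
                          rng=random.Random(0))
print(path)
```

Inside evaluate, the value for step i would then take the place of UNKNOWN TERM, provided the real Yt behaves like this assumed process.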

Stuck in loop help - Python

The second 'if' statement midway through this code uses an 'or' between two conditions. This is causing the issue, and I don't know how to get around it. The code goes through a data file and turns on the given relay number at a specific time, and I need it to do this only once per given relay. If I use an 'and' between the conditions, it will only turn on the first relay that matches the current time, then wait for the next hour before turning on the next given relay.
Could someone suggest a fix for this issue? Thank you!
    def schedule():
        metadata, sched = dbx.files_download(path=RELAYSCHEDULE)
        if not sched.content:
            pass # If file is empty then exit routine
        else:
            relaySchedule = str(sched.content)
            commaNum = relaySchedule.count(',')
            data1 = relaySchedule.split(',')
            for i in range(commaNum):
                data2 = data1[i].split('-')
                Time1 = data2[1]
                currentRN = data2[0]
                currentDT = datetime.datetime.now()
                currentHR = currentDT.hour
                global RN
                global T
                if str(currentHR) == str(Time1):
                    if T != currentHR or RN != currentRN:
                        relaynum = int(data2[0])
                        relaytime = int(data2[2])
                        T = currentHR
                        RN = currentRN
                        k = threading.Thread(target=SendToRelay(relaynum, relaytime)).start()
                    else:
                        print("Pass")
Desired Inputs:
sched.content = '1-19-10,3-9-20,4-9-10,'
T = ' '
RN = ' '
T and RN are global variables because the loop runs indefinitely. They are there to let the loop know whether the specific Time (T) and Relay Number (RN) have already been used.
Desired Outputs:
If the time is 9 AM then,
T = 9
RN should be whatever the given relay number is, so RN = 3, but I'm not sure this is the right thing to use.
Sorry if this is confusing. I basically need the program to read a set of scheduled times at which specific relays should turn on. It should read the current time and, if it matches a time in the schedule, check which relay is scheduled for that time and turn it on for the given duration. Once it has done that, it should go over the same data again in case another relay also needs to turn on at the same time. The issue is that if I don't use the T and RN variables to check whether a relay has already been set, it will read the file and turn on the same relay over and over.
Try printing all the variables you use and check that everything is what you think it is. On top of that, whitespace characters sometimes cause problems with comparisons.
I fixed it. For anyone wondering this is the new working code:
    def schedule():
        metadata, sched = dbx.files_download(path=RELAYSCHEDULE)
        if not sched.content:
            pass # If file is empty then exit routine
        else:
            relaySchedule = str(sched.content)
            commaNum = relaySchedule.count(',')
            data1 = relaySchedule.split(',')
            for i in range(commaNum):
                data2 = data1[i].split('-')
                TimeSched = data2[1]
                relaySched = data2[0]
                currentDT = datetime.datetime.now()
                currentHR = currentDT.hour
                global RN
                global T
                if str(currentHR) == str(TimeSched):
                    if str(T) != str(currentHR):
                        # New hour: forget which relays were already switched on
                        RN = ''
                        T = currentHR
                    if str(relaySched) not in str(RN):
                        relaynum = int(data2[0])
                        relaytime = int(data2[2])
                        # Pass the function and its arguments separately; writing
                        # target=SendToRelay(relaynum, relaytime) would call it
                        # right here instead of in the new thread.
                        threading.Thread(target=SendToRelay, args=(relaynum, relaytime)).start()
                        RN = str(RN) + str(relaySched)
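One caveat with tracking fired relays in a string: the check `str(relaySched) not in str(RN)` is a substring test, so relay 1 would be treated as already fired once relay 11 has fired. A possible alternative, sketched here under the assumption that each entry has the form relay-hour-duration as in the sample input, keeps a set of (hour, relay) pairs instead. The function name relays_to_fire is hypothetical.

```python
def relays_to_fire(schedule_text, current_hour, fired):
    """Return [(relay, duration)] due at current_hour and not yet fired.

    schedule_text: entries like '3-9-20' (relay-hour-duration), comma separated
    fired:         set of (hour, relay) pairs already handled, updated in place
    """
    # Forget relays fired in other hours so they can fire again the next day.
    fired.difference_update({pair for pair in fired if pair[0] != current_hour})
    due = []
    for entry in schedule_text.split(','):
        if not entry.strip():
            continue  # skip the empty piece after a trailing comma
        relay, hour, duration = (int(part) for part in entry.split('-'))
        if hour == current_hour and (hour, relay) not in fired:
            fired.add((hour, relay))
            due.append((relay, duration))
    return due


fired = set()
schedule_text = '1-19-10,3-9-20,4-9-10,'
print(relays_to_fire(schedule_text, 9, fired))  # first pass at 9: both relays
print(relays_to_fire(schedule_text, 9, fired))  # second pass at 9: nothing left
```

Each returned pair could then be handed to SendToRelay in its own thread, and the set replaces the T and RN globals.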

Looping through a list of proxies

I am currently working on a function that loops through a list of proxies and then restarts at the top once it reaches the bottom. So far this is the code that I have:
    import time
    createLimit = 100
    proxyFile = 'proxies.txt'
    def getProxies():
        proxyList = []
        with open(proxyFile, 'r') as f:
            for line in f:
                proxyList.append(line)
        return proxyList
    proxyList = getProxies()
    def loopProxySwitch():
        print("running")
        current_run = 0
        while current_run <= createLimit:
            if current_run >= len(proxyList):
                lengthOfList = len(proxyList)
                useProxy = proxyList[current_run%lengthOfList]
                print("Current Ip: "+useProxy)
                print("Current Run: "+current_run)
                print("Using modulus")
                return useProxy
            else:
                useProxy = proxyList[current_run]
                print("Current Ip: "+useProxy)
                print("Current Run: "+current_run)
                return useProxy
            time.sleep(2)
        print("Script ran")
    loopProxySwitch()
The problem that I am having is that the loopProxySwitch function does not return or print anything within the while loop, however I don't see how it would be false. Here is the format of the text file with fake proxies:
111.111.111.111:2222
333.333.333.333:4444
444.444.444.444:5555
777.777.777.777:8888
919.919.919.919:0000
Any advice on this situation? I intend to incorporate this into a program that I am working on. However, instead of cycling through the file on a timed interval, it would only move on when a certain condition is returned (such as another function letting the loop function know that some function has run and that it is time to switch to the next proxy). If this is a bit confusing, I will be happy to elaborate. Any suggestions, ideas, or fixes are appreciated. Thanks!
EDIT: Thanks to the comments below, I fixed the printing issue. However, the function does not loop through all the proxies... Any suggestions?
Nothing is printed because you return something before printing.
The loop ends the first time either condition is met, because return hands back a value and exits the function immediately, without reaching the print calls or the next iteration.
BTW, if you actually want to print the returned value, you can print the function call itself:
print(loopProxySwitch())
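For the stated goal of switching to the next proxy only when some other part of the program asks for it, and wrapping around at the end of the list, one possible approach is an endless iterator built with itertools.cycle. This is a sketch rather than a fix of the function above. The helper load_proxies is a hypothetical name, and the sample addresses are the fake ones from the question.

```python
from itertools import cycle


def load_proxies(lines):
    """Strip whitespace and drop blank lines from an iterable of proxy strings."""
    return [line.strip() for line in lines if line.strip()]


# Sample data from the question; in practice an open file object can be passed.
proxies = load_proxies(["111.111.111.111:2222\n", "333.333.333.333:4444\n",
                        "444.444.444.444:5555\n"])
proxy_pool = cycle(proxies)  # endless iterator that wraps around the list

# Call next() whenever the rest of the program decides to switch proxy.
used = [next(proxy_pool) for _ in range(5)]
print(used)
```

Because next(proxy_pool) returns one proxy per call and remembers its position, no counter or modulus arithmetic is needed.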

Stuff isn't appending to my list

I'm trying to create a simulation where there are two printers and I find the average wait time for each. I'm using a class for the printer and a class for the task in my program. Basically, I'm adding the wait time from each simulation to a list and calculating the average time. My issue is that I'm getting a division by 0 error, so nothing is being appended. When I try it with 1 printer (which is essentially the same thing) I have no issues. Here is the code I have for the second printer. I'm using a queue for this.
    if printers == 2:
        for currentSecond in range(numSeconds):
            if newPrintTask():
                task = Task(currentSecond,minSize,maxSize)
                printQueue.enqueue(task)
            if (not labPrinter1.busy()) and (not labPrinter2.busy()) and \
                    (not printQueue.is_empty()):
                nexttask = printQueue.dequeue()
                waitingtimes.append(nexttask.waitTime(currentSecond))
                labPrinter1.startNext(nexttask)
            elif (not labPrinter1.busy()) and (labPrinter2.busy()) and \
                    (not printQueue.is_empty()):
                nexttask = printQueue.dequeue()
                waitingtimes.append(nexttask.waitTime(currentSecond))
                labPrinter1.startNext(nexttask)
            elif (not labPrinter2.busy()) and (labPrinter1.busy()) and \
                    (not printQueue.is_empty()):
                nexttask = printQueue.dequeue()
                waitingtimes.append(nexttask.waitTime(currentSecond))
                labPrinter2.startNext(nexttask)
            labPrinter1.tick()
            labPrinter2.tick()
            averageWait = sum(waitingtimes)/len(waitingtimes)
            outfile.write("Average Wait %6.2f secs %3d tasks remaining." \
                %(averageWait,printQueue.size()))
Any assistance would be great!
Edit: I should mention that this happens no matter the values. I could have a page range of 99-100 and a PPM of 1 yet I still get divided by 0.
I think your problem stems from an empty waitingtimes on the first iteration or so. If there is no print job in the queue, and there has never been a waiting time inserted, you are going to reach the bottom of the loop with waitingtimes==[] (empty), and then do:
sum(waitingtimes) / len(waitingtimes)
Which will be
sum([]) / len([])
Which is
0 / 0
The easiest way to deal with this would just be to check for it, or catch it:
    if not waitingtimes:
        averageWait = 0
    else:
        averageWait = sum(waitingtimes)/len(waitingtimes)
Or:
    try:
        averageWait = sum(waitingtimes)/len(waitingtimes)
    except ZeroDivisionError:
        averageWait = 0

Why does this code make my Python interpreter hang?

I had an identical problem with my other laptop, which led me to get this new one (new in the NBC after Friends sense) -- the interpreter would hang on some kind of nested iteration, and even freeze up and/or go berserk if left to its own devices. In this case, I CTRL+C'd after about five seconds. The interpreter said it stopped at some line in the while loop, different each time, indicating that it was working but at a slooooooooooow pace. Some test print statements seemed to show some problem with the iteration controls (the counter and such).
Is it a CPU problem, or what?
    from __future__ import print_function
    import sys
    # script is a useless dummy variable, setting captures the setting (enc or dec), steps captures the number of lines.
    script, setting, steps = sys.argv
    # Input handling bloc.
    if setting not in ('enc', 'dec'):
        sys.exit("First command line thing must either be 'enc' or 'dec'.")
    try:
        steps = int(steps)
    except:
        sys.exit("Second command line thing must be convertable to an integer.")
    # Input string here.
    string_to_convert = raw_input().replace(' ', '').upper()
    if setting == 'enc':
        conversion_string = ''
        counter = 0
        while len(conversion_string) < len(string_to_convert):
            for char in string_to_convert:
                if counter == steps:
                    conversion_string += char
                    counter = 0
                counter += 1
            steps -= 1
        print(conversion_string)
Depending on the starting value of steps, it's possible for counter and steps to never be equal, which means conversion_string is never altered, so it is always shorter than string_to_convert and the loop never ends.
A naive example: let steps = -1. Since counter starts at 0 and only increments, and steps only decrements, they will never be equal.
Actually, on further inspection, if steps is less than len(string_to_convert) this will always end in an infinite loop.
Consider:
steps=2
string_to_convert="Python"
1. The first iteration of the for loop will iterate counter to 2 and fetch the "t"; now steps = 1, conversion_string="t"
2. Next for loop will iterate counter to 1, fetch the "y"; now steps = 0, conversion_string="ty"
3. for loop iterates counter to 0, fetch the "P"; now steps = -1, conversion_string="tyP"
4. Now, steps = -1, counter can never equal it, for loop ends without changing conversion_string.
5. Step 4 repeats while decreasing steps without any ability to quit the while loop.
That is why it sometimes works and sometimes doesn't.
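A way to confirm this kind of hang without freezing the interpreter is to run the same loop with a cap on the number of passes. The sketch below assumes that steps is decremented once per pass of the while loop (the indentation of the posted code is a guess), and the function name run_encoder_loop and the max_passes parameter are hypothetical.

```python
def run_encoder_loop(text, steps, max_passes=1000):
    """Run the question's while/for loop with a cap on the number of passes.

    Returns (conversion_string, finished). finished is False when the cap
    was hit before the while condition became false.
    """
    conversion_string = ''
    counter = 0
    passes = 0
    while len(conversion_string) < len(text):
        if passes == max_passes:
            return conversion_string, False  # gave up: the loop is not converging
        for char in text:
            if counter == steps:
                conversion_string += char
                counter = 0
            counter += 1
        steps -= 1
        passes += 1
    return conversion_string, True


print(run_encoder_loop("PYTHON", 2)[1])  # False: the cap is reached
print(run_encoder_loop("A", 0)[1])       # True: the loop ends on its own
```

Under that assumption, counter only grows and steps only shrinks after the first pass, so the two can no longer meet and the output string stops growing.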
