I am currently attempting to make a Bee Simulation for college and I have started working out the basics of how to do it.
The initial idea is to use PyGame and present the user with bees on the screen, but for now I am just working on the basic functions.
The function I am having issues with is the one where a bee looks for cells that are not being used and then goes and uses them. It runs on every new frame for every bee object, so each bee checks each cell.
I'm using this code for this:
for i in range(len(hiveCells)):
    if hiveCells[i] == "":       # cell not yet claimed
        print("Not taken")
        hiveCells[i] = "B"       # claim it for this bee
    else:
        print("Taken")
The issue with this is that it finishes within seconds and the bees claim the whole hive instantly. I need a way to do this gradually, including the time it takes a bee to travel to a cell and then the time it takes to actually use it.
What is the best way to do this? I was thinking of using coordinates: each bee would move closer to its target coordinates every loop and check whether it has reached them.
In order to include travel time for each bee you would first need to define some kind of distance measure. A trivial choice would be the Euclidean distance.
To incorporate this into your model you would need the following additions:
- Add a location (x, y), and possibly (z), to each bee and each hive cell.
- Define how much time (in seconds) elapses per frame update.
- Define the speed of the bee (in m/s).
Now per frame update you know how much time has elapsed since the last update, and you can (using the bee speed and location) compute the new location of the bee.
The update frequency of the frame is now directly related to the time that is elapsed in your model.
Note that in order for this to work you would need some type of ID which relates the bee to the hive cell it claimed. I would recommend giving each bee a unique ID.
Then as soon as the bee claims a hive cell you store the unique bee ID in the hive cell, such that at each frame update you can compute the new location for each bee with respect to the hive cell it is flying to.
Additionally, note that for this scheme to work each hive cell also needs a location (which you could store in a similarly sized array). But it might be cleanest to create an object for each hive cell which stores its coordinates and the ID of the bee that claimed it. This would also let you improve your model later by attaching additional information (e.g. honey present) to the hive cells/bees.
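A minimal sketch of this per-frame movement, assuming a fixed time step and bee speed (the class and attribute names are illustrative, not from the question):

import math

FRAME_DT = 1 / 30        # assumed: seconds of model time per frame update
BEE_SPEED = 0.5          # assumed: bee speed in m/s

class HiveCell:
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.claimed_by = None        # unique bee ID once claimed

class Bee:
    def __init__(self, bee_id, x, y):
        self.id = bee_id
        self.x, self.y = x, y
        self.target = None            # HiveCell the bee is flying to

    def update(self):
        if self.target is None:
            return
        dx, dy = self.target.x - self.x, self.target.y - self.y
        dist = math.hypot(dx, dy)     # Euclidean distance to the cell
        step = BEE_SPEED * FRAME_DT   # distance covered this frame
        if dist <= step:
            self.x, self.y = self.target.x, self.target.y   # arrived
        else:
            self.x += dx / dist * step   # move along the unit direction vector
            self.y += dy / dist * step

On arrival you could then start a separate "using the cell" timer before the bee claims its next empty cell.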
I'm asking for help/tips with system design.
I have an IoT system with sensors: PIR (motion), contactrons (magnetic reed switches), temperature & humidity, ...
Nothing fancy.
I'm collecting and filtering the raw data to build some observations on top.
So far I have some event_rules classes that are bound to sensors and return True/False depending on the data that's constantly coming from the queue (from sensors).
I know I need to run some periodic analyses on existing data (e.g. when motion sensors are not reporting anymore) or on both incoming and existing data, which includes loading the data and analyzing it in some time window (counting, averaging, etc.).
That time window approach could help answer the questions like:
temperature increased by over 10 degrees in the last 1 h, or no motion detected for the past 10 min, or high/low/no movement detected over the last 30 min.
My silly approach was to run some semi-cron Python thread that executes the rules one by one and checks each rule's output every N seconds, e.g. every 30 s. Some rules include a state machine and handle transitions from one state to another.
But this is so bad IMHO: imagine the system scales up, and all of a sudden it has to check hundreds of rules every N seconds.
I know some generic approach is needed.
How shall I tackle this case? What is the correct approach? In the microcontroller world I'd phrase it as: how to properly generate a system clock that checks the rules, but not all at once and in a configurable manner.
I'd be thankful for tips; maybe there are already some Python libraries to address this. I'm using pandas for the analyses and a state machine for the state transitions; the event rules are defined in an SQL database and cast to polymorphic Python classes based on the rule type.
Using pandas' rolling window could be a solution (sources: pandas.pydata.org: Window, How to use rolling in pandas?).
In general this means:
Step 1:
Define a window based either on a number of rows (increasing index id) or on time (increasing timestamp).
Step 2:
Apply this window onto the dataset
The code snippet below applies basic calculations (mean, min, max) to a dataframe and adds the results as new columns in the dataframe.
To keep the original dataframe clean I suggest working on a copy:
import pandas as pd

df = pd.read_csv('[PathToDataSource]')
df_copy = df.copy()   # keep the original dataframe clean
df_copy['moving-average'] = df_copy['SourceColumn'].rolling(window=10).mean()
df_copy['moving-min'] = df_copy['SourceColumn'].rolling(window=10).min()
df_copy['moving-max'] = df_copy['SourceColumn'].rolling(window=10).max()
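Since the rules in the question are time-based ("no motion detected for the past 10 mins"), note that rolling also accepts a time offset when the dataframe has a datetime index. A sketch, assuming hypothetical 'timestamp' and 'motion' columns in the sensor data:

import pandas as pd

df = pd.read_csv('[PathToDataSource]', parse_dates=['timestamp'])
df = df.set_index('timestamp').sort_index()

# count motion events in the trailing 10-minute window
df['motion_10min'] = df['motion'].rolling('10min').sum()

# a rule can then simply test the most recent window value
no_motion = df['motion_10min'].iloc[-1] == 0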
I built a device based on a microcontroller with some sensors attached to it, one of them is an orientation sensor that currently delivers information about pitch, yaw, roll and acceleration for x,y,z. I would like to be able to detect movement "events" when the device is well... moved around.
For example I would like to detect a "repositioned" event which basically would consist of series of other events - "up" (picked up), "move" (moved in air to some other point), "down" (device put back down).
Since I am just starting to figure out how to make it possible I would like to ask if I am getting the right ideas or wasting my time.
My current idea is to use the data I probed to create a dataset and try machine learning to detect whether each element belongs to one of the events I am trying to detect. So I took the device and first rotated it on the table a few times, then picked it up several times, then moved it in the air, and finally put it down several times. This generated a dataset with a structure like this:
yaw,pitch,roll,accelx,accely,accelz,state
-140,178,178,17,-163,-495,stand
110,-176,-166,-212,-97,-389,down
118,-177,178,123,16,-146,up
166,-174,-171,-375,-145,-929,up
157,-178,178,4,-61,-259,down
108,177,-177,-55,76,-516,move
152,178,-179,35,98,-479,stand
175,177,-178,-30,-168,-668,move
100,177,178,-42,26,-447,stand
-14,177,179,42,-57,-491,stand
-155,177,179,28,-57,-469,stand
92,-173,-169,347,-373,-305,down
[...]
the last "state" column is added by me - I added this after each test movement type and then shuffled the rows.
I got about 450 records this way and the idea is to use the machine learning to predict the "state" column for each record coming from the running device, then I could queue up the outcomes and if in some short period the "up" events are majority I can take it the device is being picked up.
Maybe instead of using each reading as a data row I should rather take the last 10 readings (lets say) and try to predict what happens per column - i.e. if I know last 10 yaw readings were the changes during I was moving the device up I should rather use this data - so 10 readings from each of the 6 columns is processed as row and then I have 6 results - again the ratio of result types may make it possible to detect the "movement" event that happened during these 10 readings.
I am currently about 30% into an online ML course and enjoying it but I'd really like to hear some comments from more experienced people.
Are my ideas a reasonable solution or am I totally failing to understand how I can use ML? If so, what resources shall I use to get myself started?
Your idea to regroup the readings seems interesting, but it all depends on how often you get a record and how you plan on grouping them.
If you get a record every 10-100 ms, it could be a good idea to group them, since it will give you more accurate data by reducing noise. You could take the mean of each column to get rid of that noise and help your classifier better distinguish your different states.
Otherwise, if you have a record every second, I think it's a bad idea to regroup the records, since you will most certainly mix several actions together.
The best way would be to try out both ways if you have the time ^^
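A minimal sketch of the grouping idea with pandas, assuming the CSV layout shown in the question (the file name is hypothetical):

import pandas as pd

df = pd.read_csv('readings.csv')   # hypothetical file with the columns shown above
cols = ['yaw', 'pitch', 'roll', 'accelx', 'accely', 'accelz']

# average every 10 consecutive readings per column to reduce noise;
# label each group with its most frequent state
groups = df.groupby(df.index // 10)
features = groups[cols].mean()
labels = groups['state'].agg(lambda s: s.mode().iloc[0])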
I have been turning a problem around for several days and cannot reach a working solution in Python. I have a dataframe with sets of coordinates, let's say all the information of a Formula 1 Grand Prix. I defined a function that calculates some attributes, for example the distance, the speed, or the pit-stop times for all the cars. My functions work, but I cannot find the way to save the information for all the cars; I can only get my first or my last result in a dataframe.
I think the question is very basic: I only need the correct way to save computed results in a dataframe when the function is run several times. Thanks for your kind responses!
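(Since no code is shown, here is only a sketch of the common pattern for this, with hypothetical names laps, compute_distance, and compute_speed: append one result per car to a list and build the dataframe once at the end, instead of overwriting a single result each iteration.)

import pandas as pd

results = []                                   # one entry per car
for car_id, car_df in laps.groupby('car'):     # 'laps'/'car' are hypothetical names
    results.append({
        'car': car_id,
        'distance': compute_distance(car_df),  # stand-ins for the question's functions
        'speed': compute_speed(car_df),
    })

summary = pd.DataFrame(results)                # all cars, not just the first or last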
I want to build a backend for a mobile game that includes a "real-time" global leaderboard for all players, for events that last a certain number of days, using Google App Engine (Python).
A typical usage would be as follows:
- User starts and finishes a combat, acquiring points (2-5 mins for a combat)
- Points are accumulated in the player's account for the duration of the event.
- Player can check the leaderboard anytime.
- Leaderboard will return top 10 players, along with 5 players just above and below the player's score.
Now, there is no real constraint on the real-time aspect; the board could be updated anywhere from every 30 seconds to every hour. I would like it to be as "fast" as possible without costing too much.
Since I'm not very familiar with GAE, this is the solution I've thought of:
- Each Player entity has an event_points attribute.
- Using a cron job, at a regular interval, a sorted query is made to the datastore for all players whose score is not zero.
- The cron job then iterates through the query results, writing back the rank in each Player entity.
When I think of this solution, it feels very "brute force".
The problem with this solution lies with the cost of reads and writes for all entities.
If we end up with 50K active users, this would mean a sorted query of 50K+1 reads and 50K+1 writes at regular intervals, which could be very expensive (depending on the interval).
I know that memcache can be a way to prevent some reads and some writes, but if some entities are not in memcache, does it make sense to query it at all?
Also, I've read that memcache can be flushed at any time anyway, so unless there is a way to "back it up" cheaply, it seems like a dangerous use, since the data is relatively important.
Is there a simpler way to solve this problem?
You don't need 50,000 reads or 50,000 writes. The solution is to set a sorting order on your points property. Every time you update it, the datastore will update its order automatically, which means that you don't need a rank property in addition to the points property. And you don't need a cron job, accordingly.
Then, when you need to retrieve a leaderboard, you run two queries: one for 6 entities with more or equal points to your user, and a second for 6 entities with less or equal points. Merge the results, and this is what you want to show to your user.
As for your top 10 query, you may want to put its results in Memcache with an expiration time of, say, 5 minutes. When you need it, you first check Memcache. If not found, run a query and update the Memcache.
EDIT:
To clarify the query part: you need to set the right combination of a sort order and an inequality filter to get the results that you want. According to the App Engine documentation, the query is performed in the following order:
1. Identifies the index corresponding to the query's kind, filter properties, filter operators, and sort orders.
2. Scans from the beginning of the index to the first entity that meets all of the query's filter conditions.
3. Continues scanning the index, returning each entity in turn, until it encounters an entity that does not meet the filter conditions, or reaches the end of the index, or has collected the maximum number of results requested by the query.
Therefore, you need to combine ASCENDING order with GREATER_THAN_OR_EQUAL filter for one query, and DESCENDING order with LESS_THAN_OR_EQUAL filter for the other query. In both cases you set the limit on the results to retrieve at 6.
One more note: you set a limit at 6 entities, because both queries will return the user itself. You can add another filter (userId NOT_EQUAL to your user's id), but I would not recommend it - the cost is not worth the savings. Obviously, you cannot use GREATER_THAN/LESS_THAN filters for points, because many users may have the same number of points.
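A sketch of the two queries with the ndb client library, assuming a Player model with a points property (names are illustrative):

from google.appengine.ext import ndb

class Player(ndb.Model):
    name = ndb.StringProperty()
    points = ndb.IntegerProperty()

def neighbours(user):
    # 6 entities at or above the user's score: ASCENDING + >= filter
    above = (Player.query(Player.points >= user.points)
                   .order(Player.points)
                   .fetch(6))
    # 6 entities at or below the user's score: DESCENDING + <= filter
    below = (Player.query(Player.points <= user.points)
                   .order(-Player.points)
                   .fetch(6))
    return above, below   # merge these and drop the user's own entity for display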
Here is a Google Developer article explaining a similar problem and a solution using the Google Code Jam ranking library. Further help on and extensions to this library can be discussed in the Google Groups forum.
The library basically creates an N-ary tree, with each node containing the count of the scores in a particular range. The score ranges are subdivided further all the way down to the leaf nodes, where each leaf covers a single score. A tree traversal (O(log n)) can be used to find the number of players with a score higher than a given score: that is the rank of the player. It also suggests aggregating the score-submission requests in a pull task queue and then processing them in a batch in a background thread on a backend.
Whether this is simpler or not is debatable.
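To illustrate the counting idea (not the library's actual implementation), here is a sketch using a Fenwick (binary indexed) tree, assuming integer scores in a bounded range:

class ScoreCounts:
    """Count players per score; rank = number of players with a strictly higher score."""

    def __init__(self, max_score):
        self.size = max_score + 1
        self.tree = [0] * (self.size + 1)   # 1-based Fenwick tree

    def add(self, score, delta=1):
        # register (delta=1) or remove (delta=-1) one player at this score
        i = score + 1
        while i <= self.size:
            self.tree[i] += delta
            i += i & -i

    def count_at_most(self, score):
        # prefix sum: players with score <= the given score, in O(log n)
        i, total = score + 1, 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return total

    def rank(self, score, total_players):
        return total_players - self.count_at_most(score)

Updating a player's score is then two O(log n) calls: add(old_score, -1) followed by add(new_score, +1).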
I have assumed that ranking is not just a matter of ordering an accumulation of points (in which case it's just a simple query), i.e. that ranking involves other factors rather than just the current score.
I would consider writing out an Event record for each update of points for a user (effectively a queue). Tasks run, collecting all the current Event records. In addition, you maintain a set of records representing the top of the leaderboard; adjust this set based on the incoming Event records, and discard the Event records once processed. This will limit your reads and writes to only the active events in a small time window. The leaderboard could probably be a single entity, fetched by key and cached.
I assume you may have different ranking schemes, like current active rank (for the current 7 days) vs. all-time rank (i.e. players not playing for a while won't have a good current rank).
As players view their rank, you can serve it with two simple queries, Players.query(Players.score > somescore).fetch(5) and Players.query(Players.score < somescore).fetch(5); this shouldn't cost too much, and you could cache the results.
I've been working on a feature of my application to implement a leaderboard - basically stack-rank users according to their score. I'm currently tracking the score on an individual basis. My thought is that this leaderboard should be relative instead of absolute, i.e. instead of having the top 10 highest-scoring users across the site, it's a top 10 among a user's friend network. This seems better because everyone has a chance to be #1 in their network and there is a form of friendly competition for those who are interested in this sort of thing. I'm already storing the score for each user, so the challenge is how to compute the rank of that score in real time in an efficient way. I'm using Google App Engine, so there are some benefits and limitations (e.g., IN [array] queries perform a sub-query for every element of the array and are also limited to 30 elements per statement).
For example
1st Jack 100
2nd John 50
Here are the approaches I came up with, but they all seem inefficient, and I thought this community could come up with something more elegant. My sense is that any solution will likely be done with a cron job, and that I will store a daily rank and list order to optimize read operations, but it would be cool if there were something more lightweight and real-time.
1. Pull the list of all users of the site ordered by score.
For each user, pick their friends out of that list and create new rankings.
Store the rank and list order.
Update daily.
Cons - if I get a lot of users this will take forever.
2a. For each user pick their friends and for each friend pick score.
Sort that list.
Store the rank and list order.
Update daily.
Record the last position of each user so that the pre-existing list can be used for re-ordering at the next update, to make it more efficient (may save sorting time).
2b. Same as above, except only compute the rank and list order for people whose profiles have been viewed in the last day.
Cons - rank is only up to date for the 2nd person that views the profile
If writes are very rare compared to reads (a key assumption in most key-value stores, and not just in those;-), then you might prefer to take a time hit when you need to update scores (a write) rather than when you get the relative leaderboards (a read). Specifically, when a user's score changes, queue up tasks for each of their friends to update their "relative leaderboards", and keep those leaderboards as list attributes (which do keep order!-) suitably sorted (yep, the latter is denormalization -- it's often necessary to denormalize, i.e., duplicate information appropriately, to exploit key-value stores at their best!-).
Of course you'll also update the relative leaderboards when a friendship (user to user connection) disappears or appears, but those should (I imagine) be even rarer than score updates;-).
If writes are pretty frequent, since you don't need perfectly precise up-to-the-second info (i.e., it's not financials/accounting stuff;-), you still have many viable approaches to try.
E.g., big score changes (rarer) might trigger the relative-leaderboards recomputes, while smaller ones (more frequent) get stashed away and only applied once in a while "when you get around to it". It's hard to be more specific without ballpark numbers about frequency of updates of various magnitude, typical network-friendship cluster sizes, etc, etc. I know, like everybody else, you want a perfect approach that applies no matter how different the sizes and frequencies in question... but, you just won't find one!-)
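A sketch of this write-time fan-out using App Engine's deferred library (the model and function names are illustrative, not a definitive implementation):

from google.appengine.ext import deferred, ndb

class UserBoard(ndb.Model):
    # denormalized, pre-sorted leaderboard of one user's friend network
    friend_ids = ndb.StringProperty(repeated=True)   # list properties keep order
    scores = ndb.IntegerProperty(repeated=True)

def on_score_change(user_id, new_score, friend_ids):
    # fan out one small task per friend instead of recomputing at read time
    for friend_id in friend_ids:
        deferred.defer(update_board, friend_id, user_id, new_score)

def update_board(board_owner_id, user_id, new_score):
    board = UserBoard.get_by_id(board_owner_id)
    if user_id in board.friend_ids:                  # drop the stale entry
        i = board.friend_ids.index(user_id)
        board.friend_ids.pop(i)
        board.scores.pop(i)
    # re-insert at the position that keeps scores sorted descending
    pos = next((i for i, s in enumerate(board.scores) if s < new_score),
               len(board.scores))
    board.friend_ids.insert(pos, user_id)
    board.scores.insert(pos, new_score)
    board.put()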
There is a Python library available for storing rankings:
http://code.google.com/p/google-app-engine-ranklist/