I have a FlexGridSizer called self.grid with five columns, each row holding two TextCtrls, a pair of RadioButtons, and a CheckBox. What is the best way to retrieve the data associated with these objects? Currently I am successfully using
# get flat list of all items in flexgridsizer, self.grid
children = list(self.grid.GetChildren())
# change list into list of rows (5 items each)
table = [children[i:i+5] for i in range(0, len(children), 5)]
# parse list of 'sizeritems' to extract content
for x in range(len(table)):
    for y in range(len(table[x])):
        widget = table[x][y].GetWindow()
        if isinstance(widget, wx.TextCtrl):
            text = ""
            for num in range(widget.GetNumberOfLines()):
                text += widget.GetLineText(num)
            table[x][y] = text
        if isinstance(widget, wx.RadioButton):
            table[x][y] = widget.GetValue()
        if isinstance(widget, wx.CheckBox):
            table[x][y] = (widget.GetLabel(), widget.GetValue())
This leaves me with table, a list of rows with five elements each, each item being relevant data: text for TextCtrl, bool for RadioButton, and (label, bool) for CheckBox.
This seems to get the job done, but it doesn't feel right.
Is there a better way to recover data from a FlexGridSizer? Alternatively, should I be using a different sizer/control for this layout? (I tried UltimateListCtrl, but it was buggy/didn't actually do what I needed).
You shouldn't really be doing that; instead, create references to the widgets when you create them:
self.widgetTable = []
for row in dataSet:
    self.widgetTable.append([wx.TextCtrl(self, -1, value) for value in row])
Then access them through that table:
self.widgetTable[0][0].GetValue()
Since you have working code, and seem to be asking about coding style, you may have some luck asking on Code Review.
That being said, what you have here isn't too terrible. I think isinstance() is pretty ugly, so when I did something like this, I went by order of the widgets since I knew every 5th widget was what I wanted. Maybe you could use a similar approach? Or use a try...except structure to avoid isinstance.
So there are two approaches here: the first relies on the order of your widgets, and the second guesses at how to retrieve each widget's value.
Method 1: So if your widgets have regular order, you can do something like this: (horrendous variable names for demonstration only)
list_of_one_type_of_widget = map(lambda x: x.GetWindow(), self.grid.GetChildren())[::k]
list_of_values_for_first_type = map(lambda x: x.GetValue(), list_of_one_type_of_widget)
list_of_another_type_of_widget = map(lambda x: x.GetWindow(), self.grid.GetChildren())[1::k]
list_of_values_for_second_type = map(lambda x: (x.GetLabel(), x.GetValue()), list_of_another_type_of_widget)
Where k is the number of widget types you have. This is how I tackled the problem when I came up against it, and I think it's pretty nifty and very concise. You're left with a list for each type of widget you have, which makes processing easy if it depends on the widget type. You could also pretty easily build this back into a table. Be sure to note how the second one is sliced with [1::k] rather than [::k]; each subsequent widget type's starting offset needs to be one greater than the previous.
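To see the slicing trick on plain data, here is a minimal stand-in sketch (the flat list and names are hypothetical; in the real code each item would come from GetWindow()):

```python
# Stand-ins for widget values, laid out in a regular order:
# alternating "text" and "check" items, so k = 2 widget types.
flat = ["t1", True, "t2", False, "t3", True]
k = 2

texts = flat[::k]    # every k-th item starting at offset 0
checks = flat[1::k]  # every k-th item starting at offset 1
```

The same extended slices work on the list of `GetWindow()` results, one slice per widget type.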
Method 2: If you don't have a regular order, you can do something like this:
list_of_values = []
for widget in map(lambda x: x.GetWindow(), self.grid.GetChildren()):
    try:
        list_of_values.append(widget.GetValue())
    except AttributeError:
        # Same as above, but with the appropriate Get function. If there are
        # multiple alternatives, just keep nesting try...except blocks in
        # decreasing order of commonness.
        pass
You could make the case that the second method is more "pythonic", but that's up for debate. Additionally, without some clever tricks, you're left with one list in the end, which may not be ideal for you.
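A minimal stand-in sketch of the try...except idea, using hypothetical fake widget classes rather than real wx ones (the class names and return values are made up for illustration):

```python
# Fake widgets: one exposes only GetValue, the other also GetLabel,
# mimicking the difference between a text control and a checkbox.
class FakeText:
    def GetValue(self):
        return "hello"

class FakeCheck:
    def GetLabel(self):
        return "opt"
    def GetValue(self):
        return True

values = []
for widget in [FakeText(), FakeCheck()]:
    try:
        # Try the most specific accessor pair first...
        values.append((widget.GetLabel(), widget.GetValue()))
    except AttributeError:
        # ...and fall back to the plain value when GetLabel is missing.
        values.append(widget.GetValue())
```

Nesting the try...except blocks in decreasing order of commonness follows the same pattern, one level per widget type.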
Some notes on your solution:
self.grid.GetChildren() is iterable, so you don't need to convert it to a list before using it as an iterable
You could change your sequential if statements to an if...elif...(else) construct, but it's really not a big deal in this case since no widget is expected to pass more than one test.
Hope this helps
So suppose I have an array of some elements. Each element have some number of properties.
I need to filter this list from some subsets of values determined by predicates. This subsets of course can have intersections.
I also need to determine amount of values in each such subset.
So using the imperative approach I could write code like this, and it would have a running time of 2n: one iteration to copy the array and another to filter it and count the subset sizes.
from itertools import groupby
a = [{'some_number': i, 'some_time': str(i) + '0:00:00'} for i in range(10)]

# imperative style
wrong_number_count = 0
wrong_time_count = 0
for item in a[:]:
    if predicate1(item):
        delete_original(item, a)
        wrong_number_count += 1
    if predicate2(item):
        delete_original(item, a)
        wrong_time_count += 1
    update_some_data(item)
do_something_with_filtered(a, wrong_number_count, wrong_time_count)

def do_something_with_filtered(a, c1, c2):
    print('filtered a {}'.format(a))
    print('{} items had wrong number'.format(c1))
    print('{} items had wrong time'.format(c2))

def predicate1(x):
    return x['some_number'] < 3

def predicate2(x):
    return x['some_time'] < '50:00:00'
Somehow I can't think of a way to do this in Python in a functional style with the same running time.
So in a functional style I could probably use groupby multiple times, or write a comprehension for each predicate, but that would obviously be slower than the imperative approach.
I think such a thing is possible in Haskell using stream fusion (am I right?)
But how do that in Python?
Python has strong support for "stream processing" in the form of its iterators, and what you ask seems trivial to do. You just need a way to group your predicates with their counters; it could be a dictionary where the predicate itself is the key.
That said, a simple iterator function that takes in your predicate data structure, along with the data to be processed, can do what you want. The iterator would have the side effect of updating your data structure with the predicate counts. If you want "pure functions" you'd have to duplicate the predicate information beforehand, and maybe pass and retrieve all predicate and counter values to the iterator (through the send method) for each element; I don't think it would be worth that level of purism.
That said, your code could look something like this:
from collections import OrderedDict

def predicate1(...):
    ...

...

def predicateN(...):
    ...

def do_something_with_filtered(item):
    ...

def multifilter(data, predicates):
    for item in data:
        for predicate in predicates:
            if predicate(item):
                predicates[predicate] += 1
                break
        else:
            yield item

def do_it(data):
    predicates = OrderedDict([(predicate1, 0), ..., (predicateN, 0)])
    for item in multifilter(data, predicates):
        do_something_with_filtered(item)
    for predicate, value in predicates.items():
        print("{} filtered out {} items".format(predicate.__name__, value))

a = ...
do_it(a)
(If you have to count an item for all predicates that it fails, then an obvious change from the "break" statement to a state flag variable is enough)
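Filling in the ellipses with the question's own predicates and sample data, a runnable version of this pattern might look like the following (the `kept` list is just for demonstration):

```python
from collections import OrderedDict

def predicate1(x):
    return x['some_number'] < 3

def predicate2(x):
    return x['some_time'] < '50:00:00'

def multifilter(data, predicates):
    # Single pass: count each filtered-out item against the first
    # predicate that matched, yield everything that survives.
    for item in data:
        for predicate in predicates:
            if predicate(item):
                predicates[predicate] += 1
                break
        else:
            yield item

a = [{'some_number': i, 'some_time': str(i) + '0:00:00'} for i in range(10)]
predicates = OrderedDict([(predicate1, 0), (predicate2, 0)])
kept = list(multifilter(a, predicates))
# Items 0-2 are caught by predicate1, items 3-4 by predicate2,
# and items 5-9 pass through.
```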
Yes, fusion in Haskell will often turn something written as two passes into a single pass. Though in the case of lists, it's actually foldr/build fusion rather than stream fusion.
That's not generally possible in languages that don't enforce purity, though. When side effects are involved, it's no longer correct to fuse multiple passes into one. What if each pass performed output? Unfused, you get all the output from each pass separately. Fused, you get the output from both passes interleaved.
It's possible to write a fusion-style framework in Python that will work correctly if you promise to only ever use it with pure functions. But I'm doubtful such a thing exists at the moment. (I'd love to be proven wrong, though.)
I was looking for historical data from our Brazilian stock market and found it at Bovespa's
website.
The problem is that the format the data is in is terrible; it is mingled with all sorts of other information about any particular stock!
So far so good! A great opportunity to test my fresh python skills (or so I thought)!
I managed to "organize/parse" pretty much all of the data with a few lines of code,
and then stumbled on a very annoying fact about the data. The very information I needed, stock prices(open, high, low, close), had no commas and was formatted like this: 0000000011200, which would be equivalent to 11 digits before the decimal comma.
So basically 0000000011200 = 112,00... You get the gist..
I wrote a few lines of code to edit that and then the nightmare kicked in.
The whole data set is around 358K rows long, and with my current script, the deeper it gets into the list, the longer each edit takes.
Here is the code snippet I used for that:
#profile
def dataFix(datas):
    x = 0
    for entry in datas:
        for i in range(9, 16):
            data_org[datas.index(entry)][i] = entry[i][:11]+'.'+entry[i][11:]
        x += 1
        print x
Would anyone mind shining some light into this matter?
datas.index(entry)
There's your problem. datas.index(entry) requires Python to go through the datas list one element at a time, searching for entry. It's an incredibly slow way to do things, slower the bigger the list is, and it doesn't even work, because duplicate elements are always found at their first occurrence instead of the occurrence you're processing.
If you want to use the indices of the elements in a loop, use enumerate:
for index, entry in enumerate(datas):
...
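Putting the enumerate fix together, a corrected sketch of dataFix might look like this (the sample row and the copy into data_org are hypothetical stand-ins for the real parsed data):

```python
def data_fix(datas):
    # Work on a copy so the original rows stay untouched.
    data_org = [row[:] for row in datas]
    # enumerate gives each row's index directly: O(1) per row
    # instead of the O(n) scan that datas.index(entry) performs.
    for index, entry in enumerate(datas):
        for i in range(9, 16):
            data_org[index][i] = entry[i][:11] + '.' + entry[i][11:]
    return data_org

# Hypothetical sample row: 9 filler fields, then 7 price fields.
row = ['x'] * 9 + ['0000000011200'] * 7
fixed = data_fix([row])
```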
First, it's probably easier to convert the price directly to a more usable format.
For example, the Decimal format lets you do calculations easily without losing precision.
Secondly, I think you don't even need the index and can just use append.
Thirdly, say welcome to list comprehensions and slices :P
from decimal import Decimal
data_org = []
for entries in datas:
data_org.append([Decimal(entry).scaleb(-2) for entry in entries[9:16]])
or even:
data_org = [[Decimal(entry).scaleb(-2) for entry in entries[9:16]] for entries in datas]
or in a generator form:
data_org = ([Decimal(entry).scaleb(-2) for entry in entries[9:16]] for entries in datas)
or if you want to keeping the text form:
data_org = [['.'.join((entry[:-2], entry[-2:])) for entry in entries[9:16]] for entries in datas]
(replacing [:11] with [:-2] makes this independent of the input size: the last 2 digits are taken as the decimals)
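A quick sanity check of the scaleb(-2) idea on the question's own sample value:

```python
from decimal import Decimal

# '0000000011200' means 112.00; scaleb(-2) shifts the exponent
# by -2, i.e. divides by 100 without any rounding.
price = Decimal('0000000011200').scaleb(-2)
```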
Good morning everybody,
my simple question is the following: I have 2 lists (let's call them a and b) of length T and I want to eliminate K random elements (with the same index) from each of them.
Let's suppose for the moment K << T, so we can neglect the probability of extracting the same index twice or more. Can I simply generate a list aleaindex of K random numbers and pass it to del, like
for i in range(K):
    aleaindex.append(random.randint(0, T-1))
del a[aleaindex]
del b[aleaindex]
And is there some Python trick to do this more efficiently?
Thank you very much in advance!
No, there is no way to do this.
The reason for this is that del deletes a name - if there is still another name attached to the object, it will continue to exist. The object itself is untouched.
When you store objects in a list, they do not have names attached, just indices.
This means that when you have a list of objects, Python doesn't know the names that refer to those objects (if there are any), so it can't delete them. It can, at most, remove them from that particular list.
The best solution is to make a new list that doesn't contain the values you don't want. This can be achieved with a list comprehension:
new_a = [v for i, v in enumerate(a) if i not in aleaindex]
You can always assign this back to a if you need to modify the list (a[:] = ...).
Note that it would also make more sense to make aleaindex a set, as it would make this operation faster, and the order doesn't matter:
aleaindex = {random.randint(0, T-1) for _ in range(K)}
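If duplicates are a concern even without the set, random.sample draws K distinct indices in one call, so the same index can never be picked twice; a small sketch (T, K, and the list contents are hypothetical):

```python
import random

T, K = 10, 4
# K distinct indices in [0, T), no repeats by construction.
aleaindex = set(random.sample(range(T), K))

a = list(range(T))
# Rebuild the list without the chosen indices, as in the answer.
new_a = [v for i, v in enumerate(a) if i not in aleaindex]
```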
Trying to count the matches across all columns.
I currently use this code to copy across certain fields from a Scrapy item.
def getDbModel(self, item):
    deal = {"name": item['name']}
    if 'imageURL' in item:
        deal["imageURL"] = item['imageURL']
    if 'highlights' in item:
        deal['highlights'] = replace_tags(item['highlights'], ' ')
    if 'fine_print' in item:
        deal['fine_print'] = replace_tags(item['fine_print'], ' ')
    if 'description' in item:
        deal['description'] = replace_tags(item['description'], ' ')
    if 'search_slug' in item:
        deal['search_slug'] = item['search_slug']
    if 'dealURL' in item:
        deal['dealurl'] = item['dealURL']
    return deal
Wondering how I would turn this into an OR search in mongodb.
I was looking at something like the below:
def checkDB(self, item):
    # Check if the record exists in the DB
    deal = self.getDbModel(item)
    return self.db.units.find_one({"$or": [deal]})
Firstly, is this the best method to be doing this?
Secondly, how would I find the count of the number of columns matched, i.e. to limit results to records that match at least two columns?
There is no easy way of counting the number of column matches on MongoDB's end; it just kinda matches and then returns.
You would probably be better doing this client side. I am unsure exactly how you intend to use this count figure, but there is no easy way, whether through MR or the aggregation framework, of doing this.
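One possible client-side sketch of that counting (the deal fields and candidate documents here are hypothetical placeholders): count how many of the deal's fields match each candidate document, then keep those with at least two matches:

```python
def match_count(deal, doc):
    # Number of deal fields whose value appears unchanged in doc.
    return sum(1 for k, v in deal.items() if doc.get(k) == v)

deal = {"name": "spa day", "search_slug": "spa-day"}
docs = [
    {"name": "spa day", "search_slug": "spa-day"},  # 2 fields match
    {"name": "spa day", "search_slug": "other"},    # 1 field matches
]

# Keep only candidates matching at least two columns.
matched = [d for d in docs if match_count(deal, d) >= 2]
```

In practice `docs` would be the result of the broad `$or` query, narrowed down in Python.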
You could, in the aggregation framework, change your schema a little to put these columns within a properties field and then $sum the matches within the subdocument. This is a good approach since you can also sort on it to create a kind of relevance search (if that is what you're intending).
As to whether this is a good approach depends. When using an $or MongoDB will use an index for each condition, this is a special case within MongoDB indexing, however it does mean you should take this into consideration when making an $or and ensure you have indexes to cover each condition.
You have also got to consider that MongoDB will effectively eval each clause and then merge the results to remove duplicates, which can be heavy for bigger $ors or a large working set.
Of course, the format of your $or is wrong: you need an array of single-field documents. At the minute you have an array containing one document with all your attributes; used like that, the attributes effectively have an $and condition between them, so it won't work.
You could probably change your code to:
def getDbModel(self, item):
    deal = []
    deal.append({"name": item['name']})
    if 'imageURL' in item:
        deal.append({"imageURL": item['imageURL']})
    if 'highlights' in item:
        deal.append({'highlights': replace_tags(item['highlights'], ' ')})
    # etc.
    # Some way down
    return self.db.units.find_one({"$or": deal})
NB: I am not a Python programmer
Hope it helps,
I'm trying to create a list of tasks that I've read from some text files and put them into lists. I want to create a master list of what I'm going to do through the day however I've got a few rules for this.
One list has separate daily tasks that don't depend on the order they are completed. I call this list 'daily'. I've got another list of tasks for my projects, but these do depend on the order completed. This list is called 'projects'. I have a third list of things that must be done at the end of the day. I call it 'endofday'.
So here are the basic rules.
A list of randomized tasks where daily tasks can be performed in any order, where project tasks may be randomly inserted into the main list at any position but must stay in their original order relative to each other, and end of day tasks appended to the main list.
I understand how to get a random number from random.randint(), appending to lists, reading files and all that... but the logic is giving me a case of 'hurty brain'. Anyone want to take a crack at this?
EDIT:
Ok I solved it on my own, but at least asking the question got me to picture it in my head. Here's what I did.
random.shuffle(daily)
while projects:
    daily.insert(random.randint(0, len(daily)), projects.pop(0))
random.shuffle(endofday)
daily.extend(endofday)
for x in daily: print x
Thanks for the answers, I'll give ya guys some kudos anyways!
EDIT AGAIN:
Crap I just realized that's not the right answer lol
LAST EDIT I SWEAR:
position = []
random.shuffle(daily)
for x in range(len(projects)):
    position.append(random.randint(0, len(daily)+x))
position.sort()
while projects:
    daily.insert(position.pop(0), projects.pop(0))
random.shuffle(endofday)
daily.extend(endofday)
for x in daily: print x
I LIED:
I just thought about what happens when position has duplicate values and lo and behold my first test returned 1,3,2,4 for my projects. I'm going to suck it up and use the answerer's solution lol
OR NOT:
position = []
random.shuffle(daily)
for x in range(len(projects)):
    while 1:
        pos = random.randint(0, len(daily)+x)
        if pos not in position: break
    position.append(pos)
position.sort()
while projects:
    daily.insert(position.pop(0), projects.pop(0))
random.shuffle(endofday)
daily.extend(endofday)
for x in daily: print x
First, copy and shuffle daily to initialize master:
master = list(daily)
random.shuffle(master)
then (the interesting part!-) the alteration of master (to insert projects randomly but without order changes), and finally random.shuffle(endofday); master.extend(endofday).
As I said the alteration part is the interesting one -- what about:
def random_mix(seq_a, seq_b):
    iters = [iter(seq_a), iter(seq_b)]
    while True:
        it = random.choice(iters)
        try: yield it.next()
        except StopIteration:
            iters.remove(it)
            it = iters[0]
            for x in it: yield x
            return
Now, the mixing step becomes just master = list(random_mix(master, projects))
Performance is not ideal (lots of random numbers generated here, we could do with fewer, for example), but fine if we're talking about a few dozens or hundreds of items for example.
This insertion randomness is not ideal -- for that, the choice between the two sequences should not be equiprobable, but rather with probability proportional to their lengths. If that's important to you, let me know with a comment and I'll edit to fix the issue, but I wanted first to offer a simpler and more understandable version!-)
Edit: thanks for the accept, let me complete the answer anyway with a different way of "random mixing preserving order" which does use the right probabilities -- it's only slightly more complicated because it cannot just call random.choice;-).
def random_mix_rp(seq_a, seq_b):
    iters = [iter(seq_a), iter(seq_b)]
    lens = [len(seq_a), len(seq_b)]
    while sum(lens):
        r = random.randrange(sum(lens))
        itindex = r >= lens[0]
        it = iters[itindex]
        lens[itindex] -= 1
        try: yield it.next()
        except StopIteration:
            iters.remove(it)
            it = iters[0]
            for x in it: yield x
            return
Of course other optimization opportunities arise here -- since we're tracking the lengths anyway, we could rely on a length having gone down to zero rather than on try/except to detect that one sequence is finished and we should just exhaust the other one, etc etc. But, I wanted to show the version closest to my original one. Here's one exploiting this idea to optimize and simplify:
def random_mix_rp1(seq_a, seq_b):
    iters = [iter(seq_a), iter(seq_b)]
    lens = [len(seq_a), len(seq_b)]
    while all(lens):
        r = random.randrange(sum(lens))
        itindex = r >= lens[0]
        it = iters[itindex]
        lens[itindex] -= 1
        yield it.next()
    for it in iters:
        for x in it: yield x
Use random.shuffle to shuffle a list. Note that it shuffles in place and returns None, so keep a reference to the list:
items = ["x", "y", "z"]
random.shuffle(items)
How to fetch a random element in a list using python:
>>> import random
>>> li = ["a", "b", "c"]
>>> last = len(li) - 1
>>> ran = random.randint(0, last)
>>> ran = li[ran]
>>> ran
'b'
But it seems you're more curious about how to design this. If so, the python tag should probably not be there. If not, the question is probably too broad to get you any good answers code-wise.
Combine all 3 lists into a DAG
Perform all possible topological sorts, store each sort in a list.
Choose one from the list at random
In order for the elements of the "project" list to stay in order, you could do the following:
Say you have four project tasks: "a, b, c, d". Their relative order must stay fixed, while any number of other tasks can land before, between, or after them.
Next, add one special marker element (e.g. "-:-") to the daily list for each project task, four in this case. When you now shuffle the daily list, these markers end up in random positions. Then simply replace the markers, from left to right, with the elements of the "projects" list in order. You keep the project ordering, yet get a completely random placement of the daily tasks.
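A small runnable sketch of this placeholder approach (the task names are made up for illustration):

```python
import random

MARK = "-:-"
daily = ["d1", "d2", "d3"]
projects = ["a", "b", "c", "d"]

# One marker per project task, shuffled in among the daily tasks.
mixed = daily + [MARK] * len(projects)
random.shuffle(mixed)

# Replace markers left to right with the project tasks, in order.
queue = list(projects)
result = [queue.pop(0) if x == MARK else x for x in mixed]
```

However the shuffle lands, the project tasks always appear in their original relative order within `result`.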