remove multiple "classes" from a list in python - python

I have
class rel:
    child=''
    parent=''
listPar=[]
and in listPar I have a list of these objects (sorry about the terminology, I'm not sure whether they are called classes). So listPar contains, for example: room book ; book title ; room book ; book title
Now I'm trying to remove all non-unique occurrences, so that listPar contains only
room book ; book title
Let's assume that I have the following code:
variable="Book"
variable2="Author"
toIns=rel()
toIns.parent=variable
toIns.child=variable2
listPar.append(toIns)
toIns2=rel()
toIns2.parent=variable
toIns2.child=variable2
listPar.append(toIns2)
Now, how do I remove all duplicates? The result should be:
for elem in listPar:
    print("child:", elem.child, "parent:", elem.parent)
# child: Author parent: Book
I have tried several things, but none of them seemed to fully work. Could you please help me?

I'm presuming that the class you have given there isn't the actual class (as it would be worthless), but the easiest thing for you to do here - presuming the order of your elements doesn't matter to you, is to make your list into a set, which will remove all duplicates.
>>> a = ["test", "test", "something", "else"]
>>> a
['test', 'test', 'something', 'else']
>>> set(a)
{'test', 'something', 'else'}
Here I have used strings, but you could use any class that provides the equality operator and a hash function. The equality operator is used to check whether two objects are the same (for a custom class, you need to define that), and the hash is what makes sets efficient. Two objects that compare equal must have the same hash. Two objects that are not equal may still share a hash (the set will fall back to the equality operator), but the more often this happens, the slower lookups become. In general, combining the hashes of the components you use to check for equality is a good way to generate a decent hash, for example by summing them or by hashing a tuple of them.
So, for example:
class Book:
    def __init__(self, title, author):
        self.title = title
        self.author = author
    def __eq__(self, other):
        return self.title == other.title and self.author == other.author
    def __hash__(self):
        return hash(self.title)+hash(self.author)
    def __repr__(self):
        return "Book("+repr(self.title)+", "+repr(self.author)+")"
We can use this class like before.
>>> a = [Book("Some Book", "Some Guy"), Book("Some Book", "Some Guy"), Book("Some Other Book", "Some Other Guy")]
>>> a
[Book('Some Book', 'Some Guy'), Book('Some Book', 'Some Guy'), Book('Some Other Book', 'Some Other Guy')]
>>> set(a)
{Book('Some Other Book', 'Some Other Guy'), Book('Some Book', 'Some Guy')}
If you do care about the order of the elements, even after removing duplicates, then you could do this:
def remove_duplicates_preserving_order(seq):
    seen = set()
    return [x for x in seq if x not in seen and not seen.add(x)]
This works by hacking the list comprehension a little: set.add() always returns None, so "not seen.add(x)" is always true, and it is evaluated only for its side effect of adding the element to the set.
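If the side-effect trick in the comprehension feels too clever, the same idea can be spelled out as a plain loop. This is a minimal sketch; the function name dedupe_keep_order and the sample list are made up for illustration.

```python
def dedupe_keep_order(items):
    """Return a new list with duplicates dropped, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:   # requires the items to be hashable
            seen.add(item)
            result.append(item)
    return result


words = ["test", "test", "something", "else", "something"]
deduped = dedupe_keep_order(words)
print(deduped)                      # ['test', 'something', 'else']
# dict keys are unique and keep insertion order (Python 3.7+), so this is equivalent:
print(list(dict.fromkeys(words)))   # ['test', 'something', 'else']
```

Both versions rely on the elements defining __eq__ and __hash__, just like the set-based approach.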
Edit for update:
Please note that PEP 8 recommends
using CapWords for classes, and lowercase_with_underscores for local
variables.
You seem to have a misunderstanding about how Python classes work. This class
doesn't make much sense, as these are all class attributes, not instance
attributes. A class attribute is shared by every instance of the class until an
instance assigns its own value over it, and that's not what you want. In
particular, a mutable class attribute such as a list is a single object, so
changing it through one instance changes it for all of them.
To make instance variables (the type you want) you want to create them inside
the constructor (__init__()) - check my example class to see how this works.
Once you have done this, you then need to implement __eq__() and __hash__()
functions so that Python knows what it means for two items of your class to be
equal. You can then use the methods I described above (either a set or the function
I gave) to remove duplicates.
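Applied to the class from the question, that could look roughly like the sketch below. The name Rel and the sample values are assumptions for illustration; only the parent and child fields come from the question.

```python
class Rel:
    """A parent/child pair; the fields mirror the question's class."""

    def __init__(self, parent, child):
        # Instance attributes: each Rel object gets its own values.
        self.parent = parent
        self.child = child

    def __eq__(self, other):
        if not isinstance(other, Rel):
            return NotImplemented
        return (self.parent, self.child) == (other.parent, other.child)

    def __hash__(self):
        # Hash the same fields that __eq__ compares.
        return hash((self.parent, self.child))

    def __repr__(self):
        return f"Rel({self.parent!r}, {self.child!r})"


list_par = [Rel("Book", "Author"), Rel("Book", "Author"), Rel("Room", "Book")]
# dict keys are unique and keep insertion order (Python 3.7+),
# so this drops duplicates while keeping the first-seen order.
unique = list(dict.fromkeys(list_par))
for elem in unique:
    print("child:", elem.child, "parent:", elem.parent)
```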
Note that if this is all you wish to do with your data, a class might be overkill.
If you are always going to have two items, you could just use a tuple:
>>> a = [("Book", "Author"), ("Book", "Author"), ("OtherBook", "OtherAuthor")]
>>> set(a)
{('Book', 'Author'), ('OtherBook', 'OtherAuthor')}
This works because tuples already define equality and hashing in terms of their elements.
Overall, you seem to lack an understanding of how classes are constructed and used in Python - I would suggest you go read up and learn how to use them before anything else, as not doing so will cause you a lot of problems.

Related

Python class is defined, everything is defined but I can't figure out a print method to print it all out

It's set up in a class function with variables like this.
Season_1 = AHS('yada', 'yada', 'yada')
Season_2 = AHS('yada', 'yada', 'yada')
Etc... Through 9 seasons.
What I can't figure out is how to set up a print method to print all of them out instead of
print(Season_1.yada)
print(Season_2.yada)
What can I do to make it simpler?
I'm a Python and Stack Overflow noob, sorry for that :/
You need to define a function that takes a list of seasons:
def print_yada(seasons):
    for s in seasons:
        print(s.yada)
print_yada([Season_1, Season_2])
As mentioned in the comments, start with a list rather than a bunch of numbered variable names.
seasons = [
AHS('yada', 'yada', 'yada'),
AHS('yada', 'yada', 'yada'),
AHS('yada', 'yada', 'yada')
]
print_yada(seasons)
There won't be a single AHS method that can do this, because you aren't dealing with a single instance of AHS, but rather a collection of them.
I would suggest putting your seasons in a list, instead of individual variables
seasons = [AHS('foo'), AHS('bar'), ...]
Then you can loop through them
for season in seasons:
    print(season.yada)
As others have said, you probably need an actual list, presumably within your class, if the class provides any functionality on the members beyond what built-in Python collections already do. If not, you don't need a class.
If, on the other hand, "yada", "yada", "yada" are actually fixed members that define what your class is, and you are basically looking for a way to print an object of such a class, you can define the __str__ method of your class to create a readable output format for your objects, presumably referring to their members.
__str__() should return a string that is used when an object of your class is converted to one, such as when printing.
It has no parameters aside from the mandatory instance parameter.
E.g.:
class MyClass:
    def __init__(self, a, b):
        self.a, self.b = (a, b)
    def __str__(self):
        return f"My members are: {self.a}, {self.b}"
print(MyClass(3, "abc"))
My members are: 3, abc
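Combining the two suggestions (a list of seasons plus a __str__ method) might look like the following sketch. The field names title, year, and setting and all the values are placeholders, since the question only shows three 'yada' arguments.

```python
class AHS:
    # Hypothetical fields; the real class in the question is not shown.
    def __init__(self, title, year, setting):
        self.title = title
        self.year = year
        self.setting = setting

    def __str__(self):
        return f"{self.title} ({self.year}), set in {self.setting}"


seasons = [
    AHS("Title A", 2001, "Place A"),
    AHS("Title B", 2002, "Place B"),
]

# str() calls __str__ on each season; join puts one season per line.
report = "\n".join(str(season) for season in seasons)
print(report)
```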

Sort strings accompanied by integers in list

I am trying to make a leaderboard.
Here is a list i have :
list=['rami4\n', 'kev13\n', 'demian6\n']
I would like to be able to sort this list from highest number to smallest, or even smallest to highest, giving something like :
list=['kev13\n', 'demian6\n', 'rami4\n']
I tried to use stuff like re.findall('\d+', list[loop])[0] but i only managed to get, out of the list, the best player. Not wanting to repeat the code for as many players as there are, does anyone have an idea ?
You can indeed use the re module, together with the key parameter of the sort() method.
import re
reg = re.compile(r'\w*?(\d+)\n')
lst.sort(key=lambda s: int(reg.match(s).group(1)))
It works fine using findall() as you did too:
reg = re.compile(r'\d+')
lst.sort(key=lambda s: int(reg.findall(s)[0]))
Note that I compile() the regular expression once up front rather than for each element in the list. Both calls sort from smallest to highest; pass reverse=True to sort() to get the highest score first.
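Put together for the leaderboard from the question, a small self-contained sketch could look like this. The helper name score_of is made up; it assumes every entry contains at least one run of digits.

```python
import re

SCORE = re.compile(r"\d+")  # first run of digits in an entry


def score_of(entry):
    """Return the integer score embedded in an entry such as 'kev13'."""
    return int(SCORE.search(entry).group())


entries = ['rami4\n', 'kev13\n', 'demian6\n']
# sorted() returns a new list; reverse=True puts the highest score first.
leaderboard = sorted(entries, key=score_of, reverse=True)
print(leaderboard)  # ['kev13\n', 'demian6\n', 'rami4\n']
```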
I have another solution based on object-oriented programming and overriding the __lt__ special method of str.
import re
class SpecialString(str):
    def __lt__(self, other):
        pattern=re.compile(r"\d+")
        return int(pattern.search(str(self)).group(0)) < int(pattern.search(str(other)).group(0))
if __name__ == "__main__":
    listing = ['rami4\n', 'kev13\n', 'demian6\n']
    spe_list = [SpecialString(x) for x in listing]
    spe_list.sort()
    print(spe_list)
Which prints to the standard output:
['rami4\n', 'demian6\n', 'kev13\n']
This method lets you use the built-in sort (which is probably optimized) instead of rewriting the sort function. Moreover, since your strings can be thought of as a specialization of the str class, the inheritance mechanism is very suitable: you keep all of str's properties but rewrite its comparison mechanism.

is using array with meaningful field name values not pythonic for a Python struct?

I have seen several questions about how to implement something like a C struct in Python. Usually people recommend a namedtuple, but the problem is that its fields are not mutable, and I don't see much point in having a structure that is not mutable. If mutability is desired, a dictionary is recommended instead, which in my opinion is too verbose (you have to surround your field names with quotes) and must be very slow (you have to search for the field value).
I never see what seems like a natural solution to me:
i = -1
i += 1
FIELD_A = i
i += 1
FIELD_B = i
i += 1
FIELD_C = i
structure = [ 0, "foo", "bar" ]
structure[FIELD_A] = 1
The reason for the i manipulation, is it allows copy-and-paste without a possibility of assigning the same value twice or wasting space. The reason for capital letters is to make these values stand out as "constant".
Am I being naive, and is there something wrong, or not Pythonic, with the above?
The alternative to your code using a dict:
structure = {}
structure["FIELD1"] = 0
structure["FIELD2"] = "foo"
structure["FIELD3"] = "bar"
Fewer lines of code, and more readable in my opinion because you need not wonder what is going on with i. I have actually used your approach above when working in MATLAB since there is no convenient dict alternative.
Additionally, there's nothing preventing you from using capital letters if you find that more readable.
" it must be very slow - you have to search for the field value." if you think dictionary lookups are slow then python is going to give you a rolling set of heart attacks. Consider
foo = 'somestring'
bar = foo
baz = someobj.variable
CPython computes the hash of 'somestring' once and caches it on the string object, so repeated lookups do not rehash it. It looks foo and bar up in the module's namespace dict every time we mention them. And accessing object variables involves looking them up in the object's dict.
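To see that attribute access really is a dictionary lookup, you can inspect an instance's dict directly. A minimal sketch with a made-up Point class:

```python
class Point:
    # Hypothetical class, used only to show where attributes are stored.
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(vars(p))       # {'x': 1, 'y': 2}  (vars(p) is p.__dict__)
vars(p)['x'] = 10    # writing to the dict has the same effect as p.x = 10
print(p.x)           # 10
```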
A couple of ways to get struct-like behavior are to define class-level variables or to use __slots__ to define a canned set of variables for an object.
class MyStruct(object):
    FIELD1 = 0
    FIELD2 = 'foo'
    FIELD3 = 'bar'
print(MyStruct.FIELD2)
s = MyStruct()
s.FIELD2 = 'baz'
print(s.FIELD2)
class MySlots(object):
    __slots__ = ['FIELD1', 'FIELD2', 'FIELD3']
    def __init__(self, FIELD1=0, FIELD2='foo', FIELD3='bar'):
        self.FIELD1 = FIELD1
        self.FIELD2 = FIELD2
        self.FIELD3 = FIELD3
s = MySlots()
print(s.FIELD2)
These may be pleasing but they are no faster than using a dict.
I think the closest analogy, in a Pythonic way, to a C-struct would be a Python object with named attributes.
The cumbersome way:
class mystruct:
    def __init__(self):
        self.FIELD1 = None
        self.FIELD2 = None
        self.FIELD3 = None
x = mystruct()
x.FIELD1 = 42
x.FIELD3 = "aap"
It is possible to make anonymous objects with a number of attributes using the type function:
y = type('', (), {'FIELD1': None, 'FIELD2':None})()
y.FIELD1 = 42
But this is still cumbersome. It can easily be generalized by writing a function that returns a new class, which you then call to create an instance.
# 'objmaker' takes a list of field names and returns a class; calling
# that class creates an object with those fields, their
# values initially set to None
objmaker = lambda *fields: type('', (), {field: None for field in fields})
# Now it's trivial to define new 'structs' - here we 'define' two
# types of 'structs'
mystruct_maker = objmaker('FIELD1', 'FIELD2', 'FIELD3')
yourstruct_maker = objmaker('x', 'y')
# And creating instances goes like this:
my_str1 = mystruct_maker()
my_str2 = mystruct_maker()
yr_str = yourstruct_maker()
yr_str.x = 42
my_str1.FIELD1 = yr_str.x
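For comparison, on Python 3.7 or later the standard library's dataclasses module gives much the same result with less machinery. This is only a sketch of that alternative, not part of the approach above; the field names and defaults are borrowed from the earlier examples.

```python
from dataclasses import dataclass


@dataclass
class MyStruct:
    # Illustrative fields and defaults.
    field1: int = 0
    field2: str = "foo"
    field3: str = "bar"


s = MyStruct()
s.field1 = 42              # fields are mutable by default
print(s)                   # MyStruct(field1=42, field2='foo', field3='bar')
print(s == MyStruct(42))   # True: equality compares field values
```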
I appreciate all the other answers, but they don't really answer the question. The question is posed "what's wrong with my code", not "what is the best way to code xyz".
What's wrong, as I found out, is that, I am introducing a set of "global" constants. If I have another structure with the field FIELD_A, I am in trouble. FIELD_A should have structure scope, not global scope. That's one big reason what I did is sub-standard.
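One way to keep the list-plus-index idea while giving the indices structure scope is an IntEnum, whose members can be used directly as list indices. A minimal sketch, with the class name Field chosen for illustration:

```python
from enum import IntEnum, unique


@unique  # raises ValueError if two fields are given the same index
class Field(IntEnum):
    A = 0
    B = 1
    C = 2


structure = [0, "foo", "bar"]
structure[Field.A] = 1       # IntEnum members behave like ints when indexing
print(structure)             # [1, 'foo', 'bar']
print(structure[Field.B])    # foo
```

Another structure can define its own enum with an A member without clashing with this one.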

Python. How to efficiently remove custom object from array

Good afternoon dear colleagues! I have maybe quite an obvious question, but it can be considered as quite a mystery to me.
So, I have two custom classes, let it be Class A and Class B. Class B has a property, which contains multiple instances of Class A. It also contains methods to work with this property: add, remove, get single and all instances in this property.
However, apart from standard and quite over questioned deal with MVC pattern, I wonder: what is the most efficient and fast method to remove an object from this property (array) in Python, which implements some customization (e.g. compare objects by id, title and other properties).
I had implemented my own, but it seems way too complicated for such an essential operation.
class Skill:
    def __init__(self, id_):
        self.id = id_
class Unit:
    def __init__(self):
        self.skills = []
    def get_skills(self):
        return self.skills
    def get_skill(self, index):
        return self.skills[index]
    def add_skill(self, skill):
        self.skills.append(skill)
    def remove_skill(self, skill_to_remove):
        self.skills = filter(lambda skill: skill.id != skill_to_remove.id, self.skills)
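If you need arbitrary criteria, then filtering is OK, but it is slightly shorter to use a list comprehension. Note also that in Python 3 filter() returns a lazy iterator rather than a list, so a later self.skills.append() would fail. For example, instead of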
self.skills = filter(lambda skill: skill.id != skill_to_remove.id, self.skills)
use:
self.skills = [s for s in self.skills if s.id != skill_to_remove.id]
It's also possible to modify the list in-place using slice assignment:
self.skills[:] = (s for s in self.skills if s.id != skill_to_remove.id)
If you are filtering skills based on an exact match with a "template" skill, i.e. matching all the properties of skill_to_remove, then it might be better to implement an equality method for your Skill class. Then you could just use the remove method on self.skills. However, this will only remove the first matching instance.
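A sketch of the equality-based approach follows. The title field is a hypothetical second property added for illustration; the question's Skill only has an id.

```python
class Skill:
    def __init__(self, id_, title):
        self.id = id_
        self.title = title   # hypothetical extra field

    def __eq__(self, other):
        if not isinstance(other, Skill):
            return NotImplemented
        return (self.id, self.title) == (other.id, other.title)

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore a matching one.
        return hash((self.id, self.title))


skills = [Skill(1, "fire"), Skill(2, "ice"), Skill(1, "fire")]
skills.remove(Skill(1, "fire"))   # removes only the first equal element
print([s.id for s in skills])     # [2, 1]
```

If every match should go, the list comprehension above with "s != skill_to_remove" as the condition does that in one pass.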

How do I pass lots of variables to and from a function in Python?

I do scientific programming, and often want to show users prompts and variable pairs, let them edit the variables, and then do the calculations with the new variables. I do this so often that I wrote a wxPython class to move this code out of the main program. You set up a list for each variable with the type of the variable (string, float, int), the prompt, and the variable's current value. You then place all of these lists in one big list, and my utility creates a neatly formatted wxPython panel with prompts and the current values, which can be edited.
When I started, I only had a few variables, so I would write out each variable.
s='this is a string'; i=1; f=3.14
my_list=[ ['s','your string here',s], ['i','your int here',i], ['f','your float here',f],]
input_panel = Input(my_list)
# the rest of the window is created, the input_panel is added to the window, the user is
# allowed to make choices, and control returns when the user hits the calculate button
s,i,f = input_panel.results() # the .results() function returns the values in a list
Now I want to use this routine for a lot of variables (10-30), and this approach is breaking down. I can create the input list to the function over multiple lines using the list.append() statements. When the code returns from the function, though, I get this huge list that needs to be unpacked into the right variables. This is difficult to manage, and it looks like it will be easy to get the input list and output list out of sync. And worse than that, it looks kludgy.
What is the best way to pass lots of variables to a function in Python with extra information so that they can be edited, and then get the variables back so that I can use them in the rest of the program?
If I could pass the variables by reference into the function, then users could change them or not, and I would use the values once the program returned from the function. I would only need to build the input list over multiple lines, and there wouldn't be any possibility of the input list getting out of sync with the output list. But Python doesn't allow this.
Should I break the big lists into smaller lists that then get combined into big lists for passing into and out of the functions? Or does this just add more places to make errors?
The simplest thing to do would be to create a class. Instead of dealing with a list of variables, the class will have attributes. Then you just use a single instance of the class.
There are two decent options that come to mind.
The first is to use a dictionary to gather all the variables in one place:
d = {}
d['var1'] = [1,2,3]
d['var2'] = 'asdf'
foo(d)
The second is to use a class to bundle all the arguments. This could be something as simple as:
class Foo(object):
    pass
f = Foo()
f.var1 = [1,2,3]
f.var2 = 'asdf'
foo(f)
In this case I would prefer the class over the dictionary, simply because you could eventually provide a definition for the class to make its use clearer or to provide methods that handle some of the packing and unpacking work.
To me, the ideal solution is to use a class like this:
>>> class Vars(object):
...     def __init__(self, **argd):
...         self.__dict__.update(argd)
...
>>> x = Vars(x=1, y=2)
>>> x.x
1
>>> x.y
2
You can also build a dictionary and pass it like this:
>>> some_dict = {'x' : 1, 'y' : 2}
>>> #the two stars below mean to pass the dict as keyword arguments
>>> x = Vars(**some_dict)
>>> x.x
1
>>> x.y
2
You may then get data or alter it as need be when passing it to a function:
>>> def foo(some_vars):
...     some_vars.z = 3 #note that we're creating the member z
...
>>> foo(x)
>>> x.z
3
"If I could pass the variables by reference into the function, then users could change them or not, and I would use the values once the program returned from the function."
You can obtain much the same effect as "pass by reference" by passing a dict (or for syntactic convenience a Bunch, see http://code.activestate.com/recipes/52308/).
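As a rough illustration of that effect, the sketch below uses types.SimpleNamespace from the standard library as a stand-in for the Bunch recipe. The function edit_values is hypothetical and merely plays the role of the input panel.

```python
from types import SimpleNamespace


def edit_values(params):
    """Stand-in for the input panel: change some values in place."""
    params.i = 2
    params.s = params.s.upper()


params = SimpleNamespace(s="this is a string", i=1, f=3.14)
edit_values(params)   # no return value needed; the caller sees the changes
print(params.s, params.i, params.f)   # THIS IS A STRING 2 3.14
```

The function receives a reference to the same object, so nothing has to be unpacked afterwards.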
If you have a finite set of these cases, you could write specific wrapper functions for each one. Each wrapper would do the work of building and unpacking the lists that are passed to the internal function.
I would recommend using:
- a dictionary or a class to accumulate all details about your variables (value, prompt text)
- a list to store the order in which you want them to be displayed
Then use good old iteration to prepare input and collect output.
This way you will only be modifying a small, manageable section of the code time and again.
Of course, you should encapsulate all this into a class if you're comfortable working with classes.
# Store all variables
vars = {}
# Store the order of display
order = []
# Define a function that will store details and order of the variable definitions
def makeVar(parent, order, name, value, prompt):
    parent[name] = dict(zip(('value', 'prompt'), (value, prompt)))
    order.append(name)
# Create your variable definitions in order
makeVar(vars, order, 's', 'this is a string', 'your string here')
makeVar(vars, order, 'i', 1, 'your int here')
makeVar(vars, order, 'f', 3.14, 'your float here')
# Use a list comprehension to prepare your input
my_list = [[name, vars[name]['prompt'], vars[name]['value']] for name in order]
input_panel = Input(my_list)
out_list = input_panel.results()
# Collect your output
for i in range(0, len(order)):
    vars[order[i]]['value'] = out_list[i]
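Encapsulated in a class, the same bookkeeping might look like the sketch below. The name VarTable and its methods are made up for illustration, and the call to the wxPython panel is replaced by a hard-coded result list so the example runs on its own.

```python
class VarTable:
    """Keeps variables, their prompts, and their display order together."""

    def __init__(self):
        # name -> {'value': ..., 'prompt': ...}; dicts keep insertion order (3.7+)
        self._entries = {}

    def add(self, name, value, prompt):
        self._entries[name] = {'value': value, 'prompt': prompt}

    def to_input_list(self):
        # Same [name, prompt, value] layout as my_list above.
        return [[name, e['prompt'], e['value']] for name, e in self._entries.items()]

    def update_from(self, results):
        # results is assumed to be in the same order as to_input_list().
        for name, value in zip(self._entries, results):
            self._entries[name]['value'] = value

    def __getitem__(self, name):
        return self._entries[name]['value']


table = VarTable()
table.add('s', 'this is a string', 'your string here')
table.add('i', 1, 'your int here')
table.add('f', 3.14, 'your float here')

my_list = table.to_input_list()
# Stand-in for input_panel.results(): pretend the user changed i to 5.
table.update_from(['this is a string', 5, 3.14])
print(table['i'])   # 5
```

Because the input list and the update both walk the same dict, the two cannot drift out of sync.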
