add a list to a list value python - python

I have a class called Data with two attributes, var_names and units.
Is it possible to associate each sublist in units with the corresponding element of var_names, so that the sublist [1,2,3,4] belongs to id, and so on? When I create a new object called myData, I would like the output below:
>>> myData=Data([[1,2,3,4],["a","b","c","d"],
[2.3,2.1,2.5,3.1]],var_names=["id","name","length"])
>>> print myData
id name length
1 a 2.3
2 b 2.1
3 c 2.5
4 d 3.1
Or is that not possible, so that it can only be achieved visually, by formatting the output?

You can use a namedtuple for this, as others have suggested. Here is a possible implementation:
>>> from collections import namedtuple
>>> class Data(object):
...     def __init__(self, *args, **kwargs):
...         args = args[0]
...         var_names = kwargs['var_names']
...         self.var_names = namedtuple('Data', var_names)
...         # zip(*args) turns the column lists into rows
...         self.units = [self.var_names(*e) for e in zip(*args)]
...     def __repr__(self):
...         # _fields works on Python 2 and 3; vars() on a namedtuple raises TypeError on current Python 3
...         fields = self.var_names._fields
...         fmt_string = '{:^10}'*len(fields)
...         return '\n'.join(fmt_string.format(*units) for units in self.units)
...
>>> myData=Data([[1,2,3,4],["a","b","c","d"],
... [2.3,2.1,2.5,3.1]],var_names=["id","name","length"])
>>> print(myData)
1 a 2.3
2 b 2.1
3 c 2.5
4 d 3.1
>>>
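If the header row from the question is also wanted, a minimal Python 3 sketch along the same lines could look like the following. It assumes every column has the same length and uses a fixed column width of 10; the names `Row`, `rows` and `my_data` are illustrative and not part of the original answer.

```python
from collections import namedtuple


class Data(object):
    def __init__(self, columns, var_names):
        # One namedtuple type whose fields are the variable names.
        self.var_names = list(var_names)
        self.Row = namedtuple('Row', self.var_names)
        # zip(*columns) transposes the column lists into rows.
        self.rows = [self.Row(*values) for values in zip(*columns)]

    def __str__(self):
        # Left-align every cell in a column of width 10 (an assumption).
        fmt = '{:<10}' * len(self.var_names)
        lines = [fmt.format(*self.var_names)]
        lines.extend(fmt.format(*row) for row in self.rows)
        return '\n'.join(line.rstrip() for line in lines)


my_data = Data([[1, 2, 3, 4], ["a", "b", "c", "d"], [2.3, 2.1, 2.5, 3.1]],
               var_names=["id", "name", "length"])
print(my_data)
```

Each row stays addressable by name, for example `my_data.rows[0].id`, so the association between a name and its values is real and not only a matter of formatting.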

If I'm understanding you right I think using a dictionary in place of a list for var_names should do what you want. Use collections.OrderedDict if ordering is important.
>>> import collections
>>> var_names = collections.OrderedDict([('id',[1,2,3,4]),
('name',['a','b','c','d']),
('length',[2.3,2.1,2.5,3.1])])
>>> for name, value in var_names.items():
...     print(name + ': ' + str(value))
...
id: [1, 2, 3, 4]
name: ['a', 'b', 'c', 'd']
length: [2.3, 2.1, 2.5, 3.1]
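To get the row-wise table from the question out of such a dictionary, the columns can be transposed with zip. This is only a sketch under the assumption that all columns have the same length; the names `columns` and `lines` are made up for the example.

```python
from collections import OrderedDict

columns = OrderedDict([('id', [1, 2, 3, 4]),
                       ('name', ['a', 'b', 'c', 'd']),
                       ('length', [2.3, 2.1, 2.5, 3.1])])

# Header first, then one line per row; zip(*...) transposes columns to rows.
lines = [' '.join(columns.keys())]
for row in zip(*columns.values()):
    lines.append(' '.join(str(value) for value in row))
print('\n'.join(lines))
```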

Related

Python pandas: sum item occurences in a string list by item substring

I have this list of strings:
list = ['a.xxx', 'b.yyy', 'c.zzz', 'a.yyy', 'b.xxx', 'a.www']
I'd like to count item occurrences by item.split('.')[0].
Desired output:
a 3
b 2
c 1
setup
I don't like assigning to variable names that shadow built-in classes, so I use l instead of list
import pandas as pd
from collections import Counter
l = ['a.xxx', 'b.yyy', 'c.zzz', 'a.yyy', 'b.xxx', 'a.www']
option 1
pd.value_counts(pd.Series(l).str.split('.').str[0])
option 2
pd.value_counts([x.split('.', 1)[0] for x in l])
option 3
wrap Counter in pd.Series
pd.Series(Counter([x.split('.', 1)[0] for x in l]))
option 4
pd.Series(l).apply(lambda x: x.split('.', 1)[0]).value_counts()
option 5
using find
pd.value_counts([x[:x.find('.')] for x in l])
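All yield (on recent pandas releases the top-level pd.value_counts is deprecated; pd.Series(...).value_counts(), as in option 4, is the equivalent)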
a 3
b 2
c 1
dtype: int64
First of all, list is not a good variable name because you will shadow the built in list. I don't know much pandas, but since it is not required here I'll post an answer anyway.
>>> from collections import Counter
>>> l = ['a.xxx', 'b.yyy', 'c.zzz', 'a.yyy', 'b.xxx', 'a.www']
>>> Counter(x.split('.', 1)[0] for x in l)
Counter({'a': 3, 'b': 2, 'c': 1})
I would try the Counter class from collections. It is a subclass of dict and gives you a dictionary where each value is the number of times the corresponding key was observed:
a = ['a.xxx', 'b.yyy', 'c.zzz', 'a.yyy', 'b.xxx', 'a.www']
from collections import Counter
Counter([item.split(".")[0] for item in a])
gives
Counter({'a': 3, 'b': 2, 'c': 1})
which is what you require
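The Counter-based answers can be condensed into a small standard-library sketch that also prints the result in the requested layout. It uses str.partition instead of split or find, because x[:x.find('.')] drops the last character when an item contains no dot (find returns -1). The names `items` and `counts` are chosen for illustration.

```python
from collections import Counter

items = ['a.xxx', 'b.yyy', 'c.zzz', 'a.yyy', 'b.xxx', 'a.www']

# partition('.') returns (head, sep, tail); head is the whole string
# when no '.' is present, so items without a dot are still counted.
counts = Counter(item.partition('.')[0] for item in items)

# most_common() orders the pairs by descending count.
for prefix, n in counts.most_common():
    print(prefix, n)
```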

Looping over elements of named tuple in python

I have a named tuple which I assign values to like this:
class test(object):
self.CFTs = collections.namedtuple('CFTs', 'c4annual c4perren c3perren ntfixing')
self.CFTs.c4annual = numpy.zeros(shape=(self.yshape, self.xshape))
self.CFTs.c4perren = numpy.zeros(shape=(self.yshape, self.xshape))
self.CFTs.c3perren = numpy.zeros(shape=(self.yshape, self.xshape))
self.CFTs.ntfixing = numpy.zeros(shape=(self.yshape, self.xshape))
Is there a way to loop over the elements of a named tuple? I tried this, but it does not work:
for fld in self.CFTs._fields:
    self.CFTs.fld = numpy.zeros(shape=(self.yshape, self.xshape))
A namedtuple is a tuple, so you can iterate over it like a normal tuple:
>>> from collections import namedtuple
>>> A = namedtuple('A', ['a', 'b'])
>>> for i in A(1,2):
...     print i
...
1
2
However, tuples are immutable, so you cannot change the values in place.
If you need the name of each field, you can use:
>>> a = A(1, 2)
>>> for name, value in a._asdict().iteritems():  # Python 2 only
...     print name
...     print value
...
a
1
b
2
>>> for fld in a._fields:
...     print fld
...     print getattr(a, fld)
...
a
1
b
2
from collections import namedtuple
point = namedtuple('Point', ['x', 'y'])(1,2)
for k, v in zip(point._fields, point):
    print(k, v)
Output:
x 1
y 2
Python 3.6+
You can simply loop over the items as you would a normal tuple:
from collections import namedtuple
MyNamedtuple = namedtuple("MyNamedtuple", "a b")
a_namedtuple = MyNamedtuple(a=1, b=2)
for i in a_namedtuple:
    print(i)
On Python 3, if you need the field name, use items() rather than iteritems():
for name, value in a_namedtuple._asdict().items():
    print(name, value)
Note
If you attempt to use a_namedtuple._asdict().iteritems() on Python 3, it raises AttributeError: 'collections.OrderedDict' object has no attribute 'iteritems' (on Python 3.8+, where _asdict() returns a regular dict, the message names 'dict' instead).
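None of the loops above assign to the fields, which is what the question was after. The loop in the question sets an attribute literally named fld on the namedtuple class, and the fields of an instance cannot be reassigned anyway. One possible approach, sketched here, is to build all values in a loop over _fields and create the instance once. The helper `zeros` and the grid size are hypothetical stand-ins for numpy.zeros and the asker's yshape and xshape, so that the example needs only the standard library.

```python
from collections import namedtuple

CFTs = namedtuple('CFTs', 'c4annual c4perren c3perren ntfixing')

yshape, xshape = 2, 3  # hypothetical grid size


def zeros(rows, cols):
    # Stand-in for numpy.zeros(shape=(rows, cols)).
    return [[0.0] * cols for _ in range(rows)]


# Build every field in one loop over _fields, then create the instance.
cfts = CFTs(**{name: zeros(yshape, xshape) for name in CFTs._fields})

# Replacing one field returns a new tuple; the original is unchanged.
updated = cfts._replace(ntfixing=zeros(1, 1))

for name in cfts._fields:
    print(name, getattr(cfts, name))
```

If the fields really have to be reassigned in place, a plain dict or a regular class may be a better fit than a namedtuple.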

Finding the minimum value for different variables

If I am computing several expressions for different variables, for example:
a = x - y
b = x**2 - y**2
c = (x-y)**2
d = x + y
How can I find the minimum value among all the variables? For example:
a = 4
b = 7
c = 3
d = 10
So the minimum value is 3, for c. How can I make my program do this?
What I have thought of so far:
make a list
append a, b, c, d to the list
sort the list
print list[0], as it will be the smallest value.
The problem is that if I append a, b, c, d to a list, I have to do something like:
lst.append((a,b,c,d))
This makes the list:
[(4,7,3,10)]
so all the values end up under one index only (lst[0]).
Is there an alternative, or any other way to find the minimum?
Language: Python
Thank you
You can find the index of the smallest item like this
>>> L = [4,7,3,10]
>>> min(range(len(L)), key=L.__getitem__)
2
Now that you know the index, you can get the actual item too, e.g. L[2].
Another way, which returns the answer in the form (index, item):
>>> min(enumerate(L), key=lambda x:x[1])
(2, 3)
I think you may be going the wrong way about solving your problem, but it is possible to pull the values of variables from the local namespace if you know their names, e.g.:
>>> a = 4
>>> b = 7
>>> c = 3
>>> d = 10
>>> min(enumerate(['a', 'b', 'c', 'd']), key=lambda x, ns=locals(): ns[x[1]])
(2, 'c')
A better way is to use a dict, so you are not filling your working namespace with these "junk" variables:
>>> D = {}
>>> D['a'] = 4
>>> D['b'] = 7
>>> D['c'] = 3
>>> D['d'] = 10
>>> min(D, key=D.get)
'c'
>>> min(D.items(), key=lambda x:x[1])
('c', 3)
You can see that when the correct data structure is used, the amount of code required is much less.
If you store the numbers in a list, you can use reduce. Because the list is not sorted, finding the minimum takes O(n), and reduce does it in a single pass.
from functools import reduce  # required on Python 3; also available on Python 2.6+
numbers = [999, 1111, 222, -1111]
minimum = reduce(lambda mn, candidate: candidate if candidate < mn else mn, numbers[1:], numbers[0])
Pack the values into a dictionary, find the minimum value, and then find the keys that have that value (there may be more than one minimum):
D = dict(a = 4, b = 7, c = 3, d = 10)
min_val = min(D.values())
for k,v in D.items():
    if v == min_val: print(k)
The built-in function min will do the trick. In your example, min(a,b,c,d) will yield 3.
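Putting the simplest suggestions together, a short sketch could look like this. It also shows why lst.append((a,b,c,d)) produced a single element: append adds its argument as one item, whereas extend adds each value separately. The names `values`, `smallest` and `smallest_name` are made up for the example.

```python
a, b, c, d = 4, 7, 3, 10

# min() accepts the values directly; no list or sorting is needed.
smallest = min(a, b, c, d)

# extend() adds each value as its own element, unlike append((a, b, c, d)).
lst = []
lst.extend((a, b, c, d))

# Keeping names next to values makes it possible to report which one won.
values = {'a': a, 'b': b, 'c': c, 'd': d}
smallest_name = min(values, key=values.get)

print(smallest, smallest_name, min(lst))
```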

using FOR statement on 2 elements at once python

I have the following list of variables and a mastervariable
a = (1,5,7)
b = (1,3,5)
c = (2,2,2)
d = (5,2,8)
e = (5,5,8)
mastervariable = (3,2,5)
I'm trying to check whether at least 2 elements of each variable exist in the master variable, so that the above would report b (3,5) and d (5,2) as having at least 2 elements matching mastervariable. Note that using sets would result in c showing up as matching, but I don't want to count c because only one of its elements is in mastervariable (i.e. 2 shows up only once in mastervariable, not twice).
I currently have the very inefficient:
if current_variable[0] == mastervariable[0]:
    if current_variable[1] == mastervariable[1]:
        True
    elif current_variable[2] == mastervariable[1]:
        True
    #### I don't use OR here because I need to know which variables match.
elif current_variable[1] == mastervariable[0]: ##<-- I'm now checking 2nd element
    etc. etc.
I then continue like the above, checking each element one at a time, which is extremely inefficient. I did this because using a for statement resulted in checking the first element twice, which was incorrect:
for i in a:
    for j in a:
        ### this checked if 1 was in the master variable and not 1,5 or 1,7
Is there a way to use 2 for statements that lets me check 2 elements in a list at once while skipping any element that has already been used? Alternatively, can you suggest an efficient way to do what I'm trying to do?
Edit: mastervariable can have duplicates in it.
For the case where matching elements can be duplicated, so that a set breaks, use Counter as a multiset. The duplicates between a and master are found by:
from collections import Counter
count_a = Counter(a)
count_master = Counter(master)
count_both = count_a + count_master
dups = Counter({e : min((count_a[e], count_master[e])) for e in count_a if count_both[e] > count_a[e]})
The logic is reasonably intuitive: if the combined count of an item in a and master is greater than its count in a alone, the item also occurs in master, and its multiplicity is its count in whichever of a and master has fewer of it.
It gives a Counter of all the duplicates, where the count is their multiplicity. If you want it back as a tuple, you can do tuple(dups.elements()):
>>> a
(2, 2, 2)
>>> master
(1, 2, 2)
>>> dups = Counter({e : min((count_a[e], count_master[e])) for e in count_a if count_both[e] > count_a[e]})
>>> tuple(dups.elements())
(2, 2)
This seems like a good job for sets. Edit: sets aren't suitable, since mastervariable can contain duplicates. Here is a version using Counters.
>>> a = (1,5,7)
>>> b = (1,3,5)
>>> c = (2,2,2)
>>> d = (5,2,8)
>>> e = (5,5,8)
>>> D = dict(a=a, b=b, c=c, d=d, e=e)
>>> from collections import Counter
>>> mastervariable = (5,5,3)  # differs from the question's (3,2,5)
>>> mvc = Counter(mastervariable)
>>> for k,v in D.items():
...     vc = Counter(v)
...     if sum(min(count, vc[item]) for item, count in mvc.items())==2:
...         print(k)
...
b
e
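Both Counter answers compute a multiset intersection by hand. Counter supports this directly through the & operator, which keeps each element with the smaller of its two counts. The sketch below applies it to the data from the question and also records which elements matched; the names `candidates`, `master_count` and `matches` are illustrative.

```python
from collections import Counter

candidates = {
    'a': (1, 5, 7),
    'b': (1, 3, 5),
    'c': (2, 2, 2),
    'd': (5, 2, 8),
    'e': (5, 5, 8),
}
mastervariable = (3, 2, 5)
master_count = Counter(mastervariable)

matches = {}
for name, values in candidates.items():
    # & keeps each element with the smaller of its two counts.
    common = Counter(values) & master_count
    if sum(common.values()) >= 2:
        matches[name] = sorted(common.elements())

print(matches)
```

With mastervariable = (3, 2, 5), only b and d reach two matching elements; c contributes a single 2 because 2 occurs once in mastervariable.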

Python higher-order sequence assignment?

Is there a way to group names together in python, to repeatedly assign to them en masse?
While we can do:
a,b,c = (1,2,3)
I would like to be able to do something like:
names = a,b,c
*names = (3,2,1) # this syntax doesn't work
a,b,c == (3,2,1) #=> True
Is there a built-in syntax for this? If not, I assume it would be possible with an object that overloads its assignment operator. In that case, is there an existing implementation, and would this concept have any unexpected failure modes?
The point is not to use the names as data, but rather to be able to use the actual names as variables that each refer to their own individual item, and to be able to use the list as a list, and to avoid code like:
a = 1
b = 2
c = 3
sequence = (a,b,c)
You should go one level up in your data abstraction. You are not trying to access the entries by their individual names -- you rather use names to denote the whole collection of values, so a simple list might be what you want.
If you want both, a name for the collection and names for the individual items, then a dictionary might be the way to go:
names = "a b c".split()
d = dict(zip(names, (1, 2, 3)))
d.update(zip(names, (3, 2, 1)))
If you need something like this repeatedly, you might want to define a class with the names as attributes:
class X(object):
    def __init__(self, a, b, c):
        self.update(a, b, c)
    def update(self, a, b, c):
        self.a, self.b, self.c = a, b, c
x = X(1, 2, 3)
x.update(3, 2, 1)
print(x.a, x.b, x.c)
This reflects that you want to group a, b and c into a common structure, but keep the option to access them individually by name.
This?
>>> from collections import namedtuple
>>> names = namedtuple( 'names', ['a','b','c'] )
>>> thing= names(3,2,1)
>>> thing.a
3
>>> thing.b
2
>>> thing.c
1
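For the repeated en-masse assignment the question asks about, the same namedtuple type can be reused: _make builds a new instance from any iterable, and the instance still unpacks and compares like a plain tuple. A minimal sketch, with the type name `Names` chosen for illustration:

```python
from collections import namedtuple

Names = namedtuple('Names', ['a', 'b', 'c'])

thing = Names(1, 2, 3)
# _make builds a new instance from any iterable of the right length.
thing = Names._make((3, 2, 1))

# The instance is still a tuple, so it compares and unpacks like one.
a, b, c = thing
print(thing == (3, 2, 1), thing.a, b, c)
```

Each reassignment creates a new tuple and rebinds the one name thing, so the individual field names are written out only once, in the type definition.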
You should use a dict:
>>> d = {"a": 1, "b": 2, "c": 3}
>>> d.update({"a": 8})
>>> print(d)
{'a': 8, 'c': 3, 'b': 2}
I've realised that "exotic" syntax is probably unnecessary. Instead the following achieves what I wanted: (1) to avoid repeating the names and (2) to capture them as a sequence:
sequence = (a,b,c) = (1,2,3)
Of course, this won't allow:
*names = (3,2,1) # this syntax doesn't work
a,b,c == (3,2,1) #=> True
So, it won't facilitate repeated assignment to the same group of names without writing out those names repeatedly (except in a loop).
Well, you shouldn't do this, since it's potentially unsafe, but you can use the exec statement (shown in Python 2 syntax; on Python 3, exec is a function and is called as exec(...)):
>>> names = "a, b, c"
>>> tup = 1,2,3
>>> exec names + "=" + repr(tup)
>>> a, b, c
(1, 2, 3)
Python has such an elegant namespace system:
#!/usr/bin/env python
class GenericContainer(object):
    def __init__(self, *args, **kwargs):
        self._names = []
        self._names.extend(args)
        self.set(**kwargs)
    def set(self, *args, **kwargs):
        for i, value in enumerate(args):
            self.__dict__[self._names[i]] = value
        for name, value in kwargs.items():
            if name not in self._names:
                self._names.append(name)
            self.__dict__[name] = value
    def zip(self, names, values):
        self.set(**dict(zip(names, values)))
def main():
    x = GenericContainer('a', 'b', 'c')
    x.set(1, 2, 3, d=4)
    x.a = 10
    print (x.a, x.b, x.c, x.d,)
    y = GenericContainer(a=1, b=2, c=3)
    y.set(3, 2, 1)
    print (y.a, y.b, y.c,)
    y.set(**dict(zip(('a', 'b', 'c'), (1, 2, 3))))
    print (y.a, y.b, y.c,)
    names = 'x', 'y', 'z'
    y.zip(names, (4, 5, 6))
    print (y.x, y.y, y.z,)
if __name__ == '__main__':
    main()
Each instance of GenericContainer is an isolated namespace. IMHO it is better than messing with the local namespace even if you are programming under a pure procedural paradigm.
Not sure whether this is what you want...
>>> a,b,c = (1,2,3)
>>> names = (a,b,c)
>>> names
(1, 2, 3)
>>> (a,b,c) == names
True
>>> (a,b,c) == (1,2,3)
True
