openpyxl possible lambda usage - python

I was trying to assign value of a cell equal to sum of few other cell. I could achieve this using two line like
dailycell=45
while sheet['E' + str(dailycell)].value is not None:
mysum+=sheet['E'+str(dailycell)].value
sheet['B13']=mysum
dailycell+=1
With my limited knowledge, I tried lambda. Then I get all sort of error. I tried few more iteration but none got me the same result.
sheet['B13']=lambda weeklybudget,sheet['E'+str(dailycell)].value:weeklybudget+=sheet['E'+str(dailycell)]
Is there an alternative to this?

Why don't you just iterate over the whole E column and sum up the values:
sheet["B13"] = sum(c.value or 0 for c in sheet['E'])
If you have a restriction on what row to start from, then just grab the appropriate slice:
sheet["B13"] = sum(c.value or 0 for c in sheet['E'][46:])
If you need to stop summing at the first occurrence of None, you'll need a bit of external help - itertools.takewhile() comes to mind:
from itertools import takewhile
sheet["B13"] = sum(c.value for c in takewhile(lambda x: x.value is not None, sheet['E'][46:]))

Related

selecting a value using a conditional statement based on a list of tuples

I have a list of tuples converted from a dictionary. I am looking to compare a conditional value against the list of tuples(values) whether it is higher or lower starting from the beginning on the list. When this conditional value is lower than a tuple's(value) I want to use that specific tuple for further coding.
Please can somebody give me an insight into how this is achieved?
I am relatively new to coding, self-learning and I am not 100% sure the example would run but for the sake of demonstrating I have tried my best.
`tuple_list = [(12:00:00, £55.50), (13:00:00, £65.50), (14:00:00, £75.50), (15:00:00, £45.50), (16:00:00, £55.50)]
conditional_value = £50
if conditional_value != for x in tuple_list.values()
y = 0
if conditional_value < tuple_list(y)
y++1
else
///"return the relevant value from the tuple_list to use for further coding. I would be
looking to work with £45.50"///`
Thank you.
Just form a new list with a condition:
tuple_list = [("12:00:00", 55.50), ("13:00:00", 65.50), ("14:00:00", 75.50), ("15:00:00", 45.50), ("16:00:00", 55.50)]
threshold = 50
below = [tpl for tpl in tuple_list if tpl[1] < threshold]
print(below)
Which yields
[('15:00:00', 45.5)]
Note that I added quotation marks and removed the currency sign to be able to compare the values. If you happen to have the £ in your actual values, you'll have to preprocess (stripping) them before.
If I'm understanding your question correctly, this should be what you're looking for:
for key, value in tuple_list:
if conditional_value < value:
continue # Skips to next in the list.
else:
# Do further coding.
You can use
tuple_list = [("12:00:00", 55.50), ("13:00:00", 65.50), ("14:00:00", 75.50), ("15:00:00", 45.50), ("16:00:00", 55.50)]
conditional_value = 50
new_tuple_list = list(filter(lambda x: x[1] > conditional_value, tuple_list))
This code will return a new_tuple_list with all items that there value us greater then the conditional_value.

Apply operation and a division operation in the same step using Python

I am trying to get proportion of nouns in my text using the code below and it is giving me an error. I am using a function that calculates the number of nouns in my text and I have the overall word count in a different column.
pos_family = {
'noun' : ['NN','NNS','NNP','NNPS']
}
def check_pos_tag(x, flag):
cnt = 0
try:
for tag,value in x.items():
if tag in pos_family[flag]:
cnt +=value
except:
pass
return cnt
df2['noun_count'] = df2['PoS_Count'].apply(lambda x: check_pos_tag(x, 'noun')/df2['word_count'])
Note: I have used nltk package to get the counts by PoS tags and I have the counts in a dictionary in PoS_Count column in my dataframe.
If I remove "/df2['word_count']" in the first run and get the noun count and include it again and run, it works fine but if I run it for the first time I get the below error.
ValueError: Wrong number of items passed 100, placement implies 1
Any help is greatly appreciated
Thanks in Advance!
As you have guessed, the problem is in the /df2['word_count'] bit.
df2['word_count'] is a pandas series, but you need to use a float or int here, because you are dividing check_pos_tag(x, 'noun') (which is an int) by it.
A possible solution is to extract the corresponding field from the series and use it in your lambda.
However, it would be easier (and arguably faster) to do each operation alone.
Try this:
df2['noun_count'] = df2['PoS_Count'].apply(lambda x: check_pos_tag(x, 'noun')) / df2['word_count']

Permutation of pandas series with all elements in it (itertools)

I trying to get a back a series (or data frame) with permutations of the elements in that list:
stmt_stream_ticker = ("SELECT * FROM ticker;")
ticker = pd.read_sql(stmt_stream_ticker, eng)
which gives me my ticker series
ticker
0 ALJ
1 ALDW
2 BP
3 BPT
then via function I'd like to work my list:
def permus_maker(x):
global i # I tried nonlocal i but this gives me: nonbinding error
permus = itertools.permutations(x, 2)
permus_pairs = []
for i in permus:
permus_pairs.append(i)
return permus_pairs.append(i)
test = permus_maker(ticker)
print(test)
This gives me 'None' back. Any idea what I do wrong?
edit1:
I tested user defined function vs integrated function (%timeit): it takes 5x as long.
list.append() returns None, so return permus_pairs instead of permus_pairs.append(i)
Demo:
In [126]: [0].append(1) is None
Out[126]: True
but you don't really need that function:
In [124]: list(permutations(ticker.ticker, 2))
Out[124]:
[('ALJ', 'ALDW'),
('ALJ', 'BP'),
('ALJ', 'BPT'),
('ALDW', 'ALJ'),
('ALDW', 'BP'),
('ALDW', 'BPT'),
('BP', 'ALJ'),
('BP', 'ALDW'),
('BP', 'BPT'),
('BPT', 'ALJ'),
('BPT', 'ALDW'),
('BPT', 'BP')]
Be aware - if you are working with huge lists/Series/DataFrames, then it would make sense to use permutations iterator iteratively instead of exploding it in memory:
for pair in permutations(ticker.ticker, 2):
# process pair of tickers here ...

Sorting with two digits in string - Python

I am new to Python and I have a hard time solving this.
I am trying to sort a list to be able to human sort it 1) by the first number and 2) the second number. I would like to have something like this:
'1-1bird'
'1-1mouse'
'1-1nmouses'
'1-2mouse'
'1-2nmouses'
'1-3bird'
'10-1birds'
(...)
Those numbers can be from 1 to 99 ex: 99-99bird is possible.
This is the code I have after a couple of headaches. Being able to then sort by the following first letter would be a bonus.
Here is what I've tried:
#!/usr/bin/python
myList = list()
myList = ['1-10bird', '1-10mouse', '1-10nmouses', '1-10person', '1-10cat', '1-11bird', '1-11mouse', '1-11nmouses', '1-11person', '1-11cat', '1-12bird', '1-12mouse', '1-12nmouses', '1-12person', '1-13mouse', '1-13nmouses', '1-13person', '1-14bird', '1-14mouse', '1-14nmouses', '1-14person', '1-14cat', '1-15cat', '1-1bird', '1-1mouse', '1-1nmouses', '1-1person', '1-1cat', '1-2bird', '1-2mouse', '1-2nmouses', '1-2person', '1-2cat', '1-3bird', '1-3mouse', '1-3nmouses', '1-3person', '1-3cat', '2-14cat', '2-15cat', '2-16cat', '2-1bird', '2-1mouse', '2-1nmouses', '2-1person', '2-1cat', '2-2bird', '2-2mouse', '2-2nmouses', '2-2person']
def mysort(x,y):
x1=""
y1=""
for myletter in x :
if myletter.isdigit() or "-" in myletter:
x1=x1+myletter
x1 = x1.split("-")
for myletter in y :
if myletter.isdigit() or "-" in myletter:
y1=y1+myletter
y1 = y1.split("-")
if x1[0]>y1[0]:
return 1
elif x1[0]==y1[0]:
if x1[1]>y1[1]:
return 1
elif x1==y1:
return 0
else :
return -1
else :
return -1
myList.sort(mysort)
print myList
Thanks !
Martin
You have some good ideas with splitting on '-' and using isalpha() and isdigit(), but then we'll use those to create a function that takes in an item and returns a "clean" version of the item, which can be easily sorted. It will create a three-digit, zero-padded representation of the first number, then a similar thing with the second number, then the "word" portion (instead of just the first character). The result looks something like "001001bird" (that won't display - it'll just be used internally). The built-in function sorted() will use this callback function as a key, taking each element, passing it to the callback, and basing the sort order on the returned value. In the test, I use the * operator and the sep argument to print it without needing to construct a loop, but looping is perfectly fine as well.
def callback(item):
phrase = item.split('-')
first = phrase[0].rjust(3, '0')
second = ''.join(filter(str.isdigit, phrase[1])).rjust(3, '0')
word = ''.join(filter(str.isalpha, phrase[1]))
return first + second + word
Test:
>>> myList = ['1-10bird', '1-10mouse', '1-10nmouses', '1-10person', '1-10cat', '1-11bird', '1-11mouse', '1-11nmouses', '1-11person', '1-11cat', '1-12bird', '1-12mouse', '1-12nmouses', '1-12person', '1-13mouse', '1-13nmouses', '1-13person', '1-14bird', '1-14mouse', '1-14nmouses', '1-14person', '1-14cat', '1-15cat', '1-1bird', '1-1mouse', '1-1nmouses', '1-1person', '1-1cat', '1-2bird', '1-2mouse', '1-2nmouses', '1-2person', '1-2cat', '1-3bird', '1-3mouse', '1-3nmouses', '1-3person', '1-3cat', '2-14cat', '2-15cat', '2-16cat', '2-1bird', '2-1mouse', '2-1nmouses', '2-1person', '2-1cat', '2-2bird', '2-2mouse', '2-2nmouses', '2-2person']
>>> print(*sorted(myList, key=callback), sep='\n')
1-1bird
1-1cat
1-1mouse
1-1nmouses
1-1person
1-2bird
1-2cat
1-2mouse
1-2nmouses
1-2person
1-3bird
1-3cat
1-3mouse
1-3nmouses
1-3person
1-10bird
1-10cat
1-10mouse
1-10nmouses
1-10person
1-11bird
1-11cat
1-11mouse
1-11nmouses
1-11person
1-12bird
1-12mouse
1-12nmouses
1-12person
1-13mouse
1-13nmouses
1-13person
1-14bird
1-14cat
1-14mouse
1-14nmouses
1-14person
1-15cat
2-1bird
2-1cat
2-1mouse
2-1nmouses
2-1person
2-2bird
2-2mouse
2-2nmouses
2-2person
2-14cat
2-15cat
2-16cat
You need leading zeros. Strings are sorted alphabetically with the order different from the one for digits. It should be
'01-1bird'
'01-1mouse'
'01-1nmouses'
'01-2mouse'
'01-2nmouses'
'01-3bird'
'10-1birds'
As you you see 1 goes after 0.
The other answers here are very respectable, I'm sure, but for full credit you should ensure that your answer fits on a single line and uses as many list comprehensions as possible:
import itertools
[''.join(r) for r in sorted([[''.join(x) for _, x in
itertools.groupby(v, key=str.isdigit)]
for v in myList], key=lambda v: (int(v[0]), int(v[2]), v[3]))]
That should do nicely:
['1-1bird',
'1-1cat',
'1-1mouse',
'1-1nmouses',
'1-1person',
'1-2bird',
'1-2cat',
'1-2mouse',
...
'2-2person',
'2-14cat',
'2-15cat',
'2-16cat']

TypeError: 'filter' object is not subscriptable

I am receiving the error
TypeError: 'filter' object is not subscriptable
When trying to run the following block of code
bonds_unique = {}
for bond in bonds_new:
if bond[0] < 0:
ghost_atom = -(bond[0]) - 1
bond_index = 0
elif bond[1] < 0:
ghost_atom = -(bond[1]) - 1
bond_index = 1
else:
bonds_unique[repr(bond)] = bond
continue
if sheet[ghost_atom][1] > r_length or sheet[ghost_atom][1] < 0:
ghost_x = sheet[ghost_atom][0]
ghost_y = sheet[ghost_atom][1] % r_length
image = filter(lambda i: abs(i[0] - ghost_x) < 1e-2 and
abs(i[1] - ghost_y) < 1e-2, sheet)
bond[bond_index] = old_to_new[sheet.index(image[0]) + 1 ]
bond.sort()
#print >> stderr, ghost_atom +1, bond[bond_index], image
bonds_unique[repr(bond)] = bond
# Removing duplicate bonds
bonds_unique = sorted(bonds_unique.values())
And
sheet_new = []
bonds_new = []
old_to_new = {}
sheet=[]
bonds=[]
The error occurs at the line
bond[bond_index] = old_to_new[sheet.index(image[0]) + 1 ]
I apologise that this type of question has been posted on SO many times, but I am fairly new to Python and do not fully understand dictionaries. Am I trying to use a dictionary in a way in which it should not be used, or should I be using a dictionary where I am not using it?
I know that the fix is probably very simple (albeit not to me), and I will be very grateful if someone could point me in the right direction.
Once again, I apologise if this question has been answered already
Thanks,
Chris.
I am using Python IDLE 3.3.1 on Windows 7 64-bit.
filter() in python 3 does not return a list, but an iterable filter object. Use the next() function on it to get the first filtered item:
bond[bond_index] = old_to_new[sheet.index(next(image)) + 1 ]
There is no need to convert it to a list, as you only use the first value.
Iterable objects like filter() produce results on demand rather than all in one go. If your sheet list is very large, it might take a long time and a lot of memory to put all the filtered results into a list, but filter() only needs to evaluate your lambda condition until one of the values from sheet produces a True result to produce one output. You tell the filter() object to scan through sheet for that first value by passing it to the next() function. You could do so multiple times to get multiple values, or use other tools that take iterables to do more complex things; the itertools library is full of such tools. The Python for loop is another such a tool, it too takes values from an iterable one by one.
If you must have access to all filtered results together, because you have to, say, index into the results at will (e.g. because this time your algorithm needed to access index 223, index 17 then index 42) only then convert the iterable object to a list, by using list():
image = list(filter(lambda i: ..., sheet))
The ability to access any of the values of an ordered sequence of values is called random access; a list is such a sequence, and so is a tuple or a numpy array. Iterables do not provide random access.
Use list before filter condtion then it works fine. For me it resolved the issue.
For example
list(filter(lambda x: x%2!=0, mylist))
instead of
filter(lambda x: x%2!=0, mylist)
image = list(filter(lambda i: abs(i[0] - ghost_x) < 1e-2 and abs(i[1] - ghost_y) < 1e-2, sheet))

Categories