How can I get elements using lxml

How can I get elements using lxml - python

https://bankchart.kz/spravochniki/reytingi_cbr/2/2019/7
How can I get the text from each column, that is, from the last three blocks with the class col-currency-rate (<div class="col-currency-rate">) in each <div class="row">? I have the table rows, but what do I do next?
>>> tree.xpath('//div[@class="table-currency"]/div[@class="row"]')
[<Element div at 0x7fcac2a47ba8>, <Element div at 0x7fcac2a47c00>, <Element div at 0x7fcac2a47c58>, <Element div at 0x7fcac2a47cb0>, <Element div at 0x7fcac2a47d08>, <Element div at 0x7fcac2a47d60>, <Element div at 0x7fcac2a47db8>, <Element div at 0x7fcac2a47e10>, <Element div at 0x7fcac2a47e68>, <Element div at 0x7fcac2a47ec0>, <Element div at 0x7fcac2a47f18>, <Element div at 0x7fcac2a47f70>, <Element div at 0x7fcac2a47fc8>, <Element div at 0x7fcac2a4e050>, <Element div at 0x7fcac2a4e0a8>, <Element div at 0x7fcac2a4e100>, <Element div at 0x7fcac2a4e158>, <Element div at 0x7fcac2a4e1b0>, <Element div at 0x7fcac2a4e208>, <Element div at 0x7fcac2a4e260>, <Element div at 0x7fcac2a4e2b8>, <Element div at 0x7fcac2a4e310>, <Element div at 0x7fcac2a4e368>, <Element div at 0x7fcac2a4e3c0>, <Element div at 0x7fcac2a4e418>, <Element div at 0x7fcac2a4e470>, <Element div at 0x7fcac2a4e4c8>, <Element div at 0x7fcac2a4e520>]
>>> len(tree.xpath('//div[@class="table-currency"]/div[@class="row"]'))
28
html
<div class="table-currency">
<div class="row"><div class="col col-currency">
2.
<img rel="nofollow" src="https://st6.prosto.im/cache/st6/1/0/5/5/1055/1055.jpg" width="16" height="16" alt="">
<a target="_blank" href="/spravochniki/reytingi_banka/2/1057">
ForteBank
</a></div><div class="col col-headery col-currency-rate"><p>Активы банков, тыс. тенге</p></div><div class="col col-headery col-currency-rate"><p>Прирост за июль 2019 года, тыс. тенге</p></div><div class="col col-headery col-currency-rate"><p>Прирост с начала 2019 года, тыс. тенге</p></div><div class="col col-currency-rate"><p>1 985 956 865</p></div><div class="col col-currency-rate"><p></p><p class="arrow-up">+89 298 547</p><p></p></div><div class="col col-currency-rate"><p></p><p class="arrow-up">+390 999 868</p><p></p></div></div>
<div class="row"><div class="col col-currency">
3.
<img rel="nofollow" src="https://st6.prosto.im/cache/st6/1/0/9/5/1095/1095.png" width="16" height="16" alt="">
<a target="_blank" href="/spravochniki/reytingi_banka/2/1076">
Сбербанк России
</a></div><div class="col col-headery col-currency-rate"><p>Активы банков, тыс. тенге</p></div><div class="col col-headery col-currency-rate"><p>Прирост за июль 2019 года, тыс. тенге</p></div><div class="col col-headery col-currency-rate"><p>Прирост с начала 2019 года, тыс. тенге</p></div><div class="col col-currency-rate"><p>1 983 840 092</p></div><div class="col col-currency-rate"><p></p><p class="arrow-up">+88 853 745</p><p></p></div><div class="col col-currency-rate"><p></p><p class="arrow-up">+119 145 827</p><p></p></div></div>
</div>

A solution using more specific XPath expressions:
from lxml import html
import requests
url = 'https://bankchart.kz/spravochniki/reytingi_cbr/2/2019/7'
doc = html.document_fromstring(requests.get(url).content)
# Each row of the table describes one bank
for row in doc.xpath('//div[@class="table-currency"]/div[@class="row"]'):
    # The bank name is the text of the link inside the row
    bank_name = row.xpath('descendant::a/text()')[0].strip()
    print(bank_name)
    # Keep only the last three "col-currency-rate" cells (the values, not the headers)
    for cur_rate in row.xpath('div[contains(@class, "col-currency-rate")][position() > last() - 3]'):
        print('-', cur_rate.text_content())
    print()
Details:
descendant::a/text() - XPath that extracts the text node of an a element that is a child/descendant of the current row
div[contains(@class, "col-currency-rate")][position() > last() - 3] - XPath that selects the div children whose class attribute contains "col-currency-rate" and keeps only the last three of them (last() is the position of the last matching element, so position() > last() - 3 is true for the final three positions)
The output:
Народный банк Казахстана
- 8 729 518 087
- +101 401 107
- -190 957 466
ForteBank
- 1 985 956 865
- +89 298 547
- +390 999 868
Сбербанк России
- 1 983 840 092
- +88 853 745
- +119 145 827
Kaspi Bank
- 1 907 391 103
- +12 378 770
- +233 318 909
Банк ЦентрКредит
- 1 495 599 542
- +34 795 443
- -14 202 851
АТФБанк
- 1 314 405 536
- +1 661 967
- -19 558 254
First Heartland Jýsan Bank
- 1 217 617 065
- +52 641 777
- -553 564 176
Жилстройсбербанк Казахстана
- 1 148 974 349
- +7 721 823
- +261 041 394
Евразийский банк
- 1 040 820 999
- -25 910 447
- -25 911 373
Ситибанк Казахстан
- 758 117 020
- +48 724 924
- +82 877 576
Банк "Bank RBK"
- 618 310 738
- +21 856 874
- +62 626 834
Альфа-Банк
- 504 777 556
- +17 401 839
- +51 157 130
Altyn Bank («Народный банк Казахстана»)
- 421 018 633
- -20 058 555
- +33 720 048
Нурбанк
- 408 442 557
- +7 065 511
- -18 282 545
Хоум Кредит энд Финанс Банк
- 372 901 871
- -2 127 105
- +33 983 288
Банк Китая в Казахстане
- 324 386 349
- +11 609 880
- +4 997 316
Банк ВТБ
- 184 247 490
- +5 800 194
- +40 725 927
First Heartland Bank (Банк ЭкспоКреди)
- 173 058 018
- -17 261 535
- +16 047 168
Торгово-промышленный Банк Китая в Алматы
- 140 792 847
- +6 365 348
- -26 137 736
Банк Kassa Nova
- 133 910 512
- +954 985
- +4 039 523
Tengri Bank (Punjab National Bank)
- 133 721 602
- +1 136 896
- -485 570
Азия Кредит Банк
- 99 659 306
- -3 790 116
- -21 420 844
Capital Bank Kazakhstan
- 85 702 895
- -3 165 322
- +4 469 187
KZI Bank (Казахстан Зират Интернешнл)
- 65 240 704
- -3 412 060
- -126 750
Шинхан Банк Казахстан
- 43 323 406
- -7 588 366
- +722 399
Исламский Банк "Al-Hilal"
- 30 562 279
- +2 411 098
- -1 430 198
Заман-Банк
- 22 969 984
- -168 105
- +5 544 675
Национальный Банк Пакистана
- 4 705 084
- -20 113
- -131 233
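The effect of the position() > last() - 3 predicate can also be reproduced by filtering in Python and slicing off the last three matches. The following is only a sketch of that idea, not the answer's method: it uses the standard-library xml.etree.ElementTree instead of lxml, and the ROW fragment is a simplified, well-formed stand-in for one row of the page (the header texts are placeholders). The helper name last_three_rates is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for one row of the page (assumed markup)
ROW = """
<div class="row">
  <div class="col col-currency"><a href="#">ForteBank</a></div>
  <div class="col col-headery col-currency-rate"><p>Header A</p></div>
  <div class="col col-headery col-currency-rate"><p>Header B</p></div>
  <div class="col col-headery col-currency-rate"><p>Header C</p></div>
  <div class="col col-currency-rate"><p>1 985 956 865</p></div>
  <div class="col col-currency-rate"><p/><p class="arrow-up">+89 298 547</p><p/></div>
  <div class="col col-currency-rate"><p/><p class="arrow-up">+390 999 868</p><p/></div>
</div>
"""


def last_three_rates(row):
    """Return the text of the last three child divs carrying the rate class."""
    # Token match on the class list (the XPath contains() is a substring match)
    rate_cells = [d for d in row.findall("div")
                  if "col-currency-rate" in d.get("class", "").split()]
    # Slicing [-3:] plays the role of [position() > last() - 3]
    return ["".join(cell.itertext()).strip() for cell in rate_cells[-3:]]


row = ET.fromstring(ROW)
bank_name = row.find(".//a").text.strip()
rates = last_three_rates(row)
print(bank_name, rates)
```

Because the header cells come first in each row, taking the last three matches skips them without having to test for the col-headery class.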

Try this:
import requests
import bs4 as bs
base_url = 'https://bankchart.kz/spravochniki/reytingi_cbr/2/2019/7'
soup = bs.BeautifulSoup(requests.get(base_url).text, 'lxml')
rows = soup.find_all('div', {'class': 'row'})
final = list()
# rows[1:] to skip the header of the columns
for bank in rows[1:]:
    bank_data = list()
    # Bank name
    bank_data.append(bank.find('a').text.strip('\n'))
    # Image
    bank_data.append(bank.find('img')['src'])
    # Value cells only: the exact class string leaves out the "col-headery" cells
    cells = bank.find_all('div', {'class': 'col col-currency-rate'})
    for values in cells:
        data = values.find_all('p')
        for x in data:
            if x.text:
                # All the three values
                bank_data.append(x.text)
    final.append(bank_data)
for x in final:
    print(x)
Check if this works for you.

Related

Finding Common Elements (Amazon SDE-1)

Given two lists V1 and V2 of sizes n and m respectively. Return the list of elements common to both the lists and return the list in sorted order. Duplicates may be there in the output list.
Example:
Input:
5
3 4 2 2 4
4
3 2 2 7
Output:
2 2 3
Explanation:
The first list is {3 4 2 2 4}, and the second list is {3 2 2 7}.
The common elements in sorted order are {2 2 3}
Expected Time complexity : O(N)
My code:
class Solution:
    def common_element(self,v1,v2):
        dict1 = {}
        ans = []
        for num1 in v1:
            dict1[num1] = 0
        for num2 in v2:
            if num2 in dict1:
                ans.append(num2)
        return sorted(ans)
Problem with my code:
Dictionary lookups take constant time, so this brings the time complexity down, but one of the hidden test cases is failing. My logic is simple and straightforward, and everything seems to be on point. What's your take? Is the logic wrong, or is the question description missing some vital details?
New Approach
Now I build two hashmaps/dictionaries, one for each array. If a number is present in both arrays, I take the minimum of its two frequencies and append the number to ans that many times.
class Solution:
    def common_element(self,arr1,arr2):
        dict1 = {}
        dict2 = {}
        ans = []
        for num1 in arr1:
            dict1[num1] = 0
        for num1 in arr1:
            dict1[num1] += 1
        for num2 in arr2:
            dict2[num2] = 0
        for num2 in arr2:
            dict2[num2] += 1
        for number in dict1:
            if number in dict2:
                minFreq = min(dict1[number],dict2[number])
                for _ in range(minFreq):
                    ans.append(number)
        return sorted(ans)
The code is outputting nothing for this test case
Input:
64920
83454 38720 96164 26694 34159 26694 51732 64378 41604 13682 82725 82237 41850 26501 29460 57055 10851 58745 22405 37332 68806 65956 24444 97310 72883 33190 88996 42918 56060 73526 33825 8241 37300 46719 45367 1116 79566 75831 14760 95648 49875 66341 39691 56110 83764 67379 83210 31115 10030 90456 33607 62065 41831 65110 34633 81943 45048 92837 54415 29171 63497 10714 37685 68717 58156 51743 64900 85997 24597 73904 10421 41880 41826 40845 31548 14259 11134 16392 58525 3128 85059 29188 13812.................
Its Correct output is:
4 6 9 14 17 19 21 26 28 32 33 42 45 54 61 64 67 72 77 86 93 108 113 115 115 124 129 133 135 137 138 141 142 144 148 151 154 160 167 173 174 192 193 195 198 202 205 209 215 219 220 221 231 231 233 235 236 238 239 241 245 246 246 247 254 255 257 262 277 283 286 290 294 298 305 305 307 309 311 312 316 319 321 323 325 325 326 329 329 335 338 340 341 350 353 355 358 364 367 369 378 385 387 391 401 404 405 406 406 410 413 416 417 421 434 435 443 449 452 455 456 459 460 460 466 467 469 473 482 496 503 .................
And Your Code's output is:

Please find the solution below:
def sorted_common_elemen(v1, v2):
    res = []
    for elem in v2:
        # Keep elem only if an unmatched copy is still left in v1
        if elem in v1:
            res.append(elem)
            v1.remove(elem)
    return sorted(res)

Your code ignores the number of times a given element occurs in the list. I think this is a good way to fix that:
class Solution:
    def common_element(self, l0, l1):
        li = []
        for i in l0:
            if i in l1:
                l1.remove(i)
                li.append(i)
        return sorted(li)
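The frequency-counting idea from the question (keep each common value as many times as the smaller of its two counts) can also be written with collections.Counter, whose & operator takes the element-wise minimum of two counters. This is a minimal sketch, not the judge's reference solution. The function name common_elements is chosen here for illustration, and unlike the list-based versions above it does not modify its inputs.

```python
from collections import Counter


def common_elements(v1, v2):
    """Return the values common to both lists, with duplicates, in sorted order."""
    # Counter & Counter keeps each key with min(count in v1, count in v2)
    shared = Counter(v1) & Counter(v2)
    # elements() repeats each key as many times as its count
    return sorted(shared.elements())


result = common_elements([3, 4, 2, 2, 4], [3, 2, 2, 7])
print(result)  # [2, 2, 3]
```

Counting both lists is linear in their combined length. The final sort adds a logarithmic factor on the size of the output.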

Python - Print updating counter not working

This is my code. The goal is to print a counter that shows the number of the page being checked, updating on the same line and replacing the old value:
import time
start_page = 500
stop_page = 400
print 'Checking page ',
for n in range(start_page,stop_page,-1):
    print str(n),
    time.sleep(5) # This to simulate the execution of my code
    print '\r',
This doesn't print anything:
$ python test.py
$
I'm using Python 2.7.10, the line that causes problems is probably this print '\r', because if I run this:
import time
start_page = 500
stop_page = 400
print 'Checking page ',
for n in range(start_page,stop_page,-1):
    print str(n),
    time.sleep(5) # This to simulate the execution of my code
    #print '\r',
I have this output:
$ python test.py
Checking page 500 499 498 497 496 495 494 493 492 491 490 489 488 487 486 485 484 483 482 481 480 479 478 477 476 475 474 473 472 471 470 469 468 467 466 465 464 463 462 461 460 459 458 457 456 455 454 453 452 451 450 449 448 447 446 445 444 443 442 441 440 439 438 437 436 435 434 433 432 431 430 429 428 427 426 425 424 423 422 421 420 419 418 417 416 415 414 413 412 411 410 409 408 407 406 405 404 403 402 401
$

Remove the commas after the print expressions:
print 'Checking page ',
print str(n),
print '\r',
PS: Since I was asked: the first thing to notice is that in Python 2 print is a statement, not a function, so it is not interpreted in the same way.
For print in particular, adding a ',' at the end makes it write the content without a trailing newline.
In the case of your program, what it was doing is:
printing 'Checking page' -> no '\n' here
printing n -> no '\n' here
printing '\r' -> again no '\n' here
Since you never sent a newline to the output, the buffered data was not flushed. You can add a sys.stdout.flush() after the print '\r', and see it changing if you want.
More on the print statement here:
https://docs.python.org/2/reference/simple_stmts.html#grammar-token-print_stmt
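The flush suggestion above can be sketched as follows. This is an illustration under stated assumptions, not the asker's program: it writes with sys.stdout.write (which exists in both Python 2 and 3) instead of the print statement, and the function name show_progress and its stream and delay parameters are made up for the example.

```python
import io
import sys
import time


def show_progress(pages, stream=sys.stdout, delay=0.0):
    """Rewrite one status line for every page number in pages."""
    for n in pages:
        # '\r' moves the cursor back to the start of the line, no newline is sent
        stream.write('\rChecking page %d' % n)
        # Without a newline nothing forces the buffer out, so flush explicitly
        stream.flush()
        time.sleep(delay)  # stands in for the real work on each page
    stream.write('\n')


show_progress(range(500, 495, -1))
```

Putting the carriage return in front of the text, rather than after the number, means the last value stays visible while the work for that page runs.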

Beautifulsoup split text in tag by <br/>

Is it possible to split a text from a tag by br tags?
I have this tag contents: [u'+420 777 593 531', <br/>, u'+420 776 593 531', <br/>, u'+420 775 593 531']
And I want to get only numbers.
Any advice?
EDIT:
[x for x in dt.find_next_sibling('dd').contents if x!=' <br/>']
Does not work at all.

You need to test for tags, which are modelled as Tag instances. A Tag object's name attribute holds the tag name, while text nodes (which are NavigableString instances) have no tag name:
[x for x in dt.find_next_sibling('dd').contents if getattr(x, 'name', None) != 'br']
Since you appear to only have text and <br /> elements in that <dd> element, you may as well just get all the contained strings instead:
list(dt.find_next_sibling('dd').stripped_strings)
Demo:
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('''\
... <dt>Term</dt>
... <dd>
... +420 777 593 531<br/>
... +420 776 593 531<br/>
... +420 775 593 531<br/>
... </dd>
... ''')
>>> dt = soup.dt
>>> [x for x in dt.find_next_sibling('dd').contents if getattr(x, 'name', None) != 'br']
[u'\n +420 777 593 531', u'\n +420 776 593 531', u'\n +420 775 593 531', u'\n']
>>> list(dt.find_next_sibling('dd').stripped_strings)
[u'+420 777 593 531', u'+420 776 593 531', u'+420 775 593 531']

Using get_text(strip=True, separator='\n') with str.splitlines:
from bs4 import BeautifulSoup
soup = BeautifulSoup('''\
<dt>Term</dt>
<dd>
+420 777 593 531<br/>
+420 776 593 531<br/>
+420 775 593 531<br/>
</dd>
''', 'html.parser')
print(soup.dd.get_text(strip=True, separator='\n').splitlines())
# ['+420 777 593 531', '+420 776 593 531', '+420 775 593 531']

from bs4 import BeautifulSoup
tag = BeautifulSoup('''
<dd>
+420 777 593 531<br/>
+420 776 593 531<br/>
+420 775 593 531<br/>
</dd>
''', 'html.parser')
Convert this to a string:
str_tag = str(tag)
Now split on the <br/> tag, convert each piece back to BeautifulSoup, and extract the text from it:
numbers = [BeautifulSoup(piece, 'html.parser').text.strip() for piece in str_tag.split('<br/>')]
# output : ['+420 777 593 531', '+420 776 593 531', '+420 775 593 531', '']

how to print only lines which contain a substring

I have a file with strings:
REALSTEP 12342 {2012-7-20 15:10:39};[{416 369 338};{423 432 349};{383 380 357};{399 401 242};{0 454 285};{457 433 115};{419 455 314};{495 534 498};][{428 377 336};{433 456 345};{386 380 363};{384 411
REALSTEP 7191 {2012-7-20 15:10:41};[{416 370 361};{406 417 376};{377 368 359};{431 387 251};{0 461 366};{438 409 134};{429 411 349};{424 505 364};][{423 372 353};{420 433 374};{379 365 356};{431 387 2
REALSTEP 12123 {2012-7-20 15:10:42};[{375 382 329};{386 402 347};{374 378 357};{382 384 259};{0 397 357};{442 424 188};{398 384 356};{392 420 355};][{404 405 359};{420 432 372};{405 408 383};{413 407
REALSTEP 27237 {2012-7-20 15:10:44};[{431 375 329};{416 453 334};{387 382 349};{397 403 248};{0 451 300};{453 422 131};{433 401 317};{434 505 326};][{443 384 328};{427 467 336};{391 386 344};{394 413
FAKE 32290 {2012-7-20 15:10:48};[{424 399 364};{408 446 366};{397 394 389};{415 409 261};{0 430 374};{445 428 162};{432 416 375};{431 473 380};][{424 398 376};{412 436 372};{401 400 390};{417 409 261}
FAKE 32296 {2012-7-20 15:10:53};[{409 326 394};{445 425 353};{401 402 357};{390 424 250};{0 420 353};{447 423 143};{404 436 351};{421 527 420};][{410 332 400};{450 429 356};{402 403 356};{391 425 250}
FAKE 32296 {2012-7-20 15:10:59};[{381 312 387};{413 405 328};{320 387 376};{388 387 262};{0 402 326};{417 418 177};{407 409 335};{443 502 413};][{412 336 417};{446 437 353};{343 417 403};{417 418 258}
FAKE 32295 {2012-7-20 15:11:4};[{377 314 392};{416 403 329};{322 388 375};{385 391 261};{0 403 329};{425 420 168};{414 393 330};{458 502 397};][{408 338 421};{449 435 355};{345 418 403};{413 420 257};
FAKE 32295 {2012-7-20 15:11:9};[{371 318 411};{422 385 333};{342 379 352};{394 395 258};{0 440 338};{418 414 158};{420 445 346};{442 516 439};][{401 342 441};{456 415 358};{367 407 377};{420 420 255};
FAKE 32296 {2012-7-20 15:11:15};[{373 319 412};{423 386 333};{344 384 358};{402 402 257};{0 447 342};{423 416 151};{422 450 348};{447 520 442};][{403 342 442};{456 416 358};{366 409 379};{422 421 255}
REALSTEP 7191 {2012-7-20 15:10:41};[{416 370 361};{406 417 376};{377 368 359};{431 387 251};{0 461 366};{438 409 134};{429 411 349};{424 505 364};][{423 372 353};{420 433 374};{379 365 356};{431 387 2
REALSTEP 12123 {2012-7-20 15:10:42};[{375 382 329};{386 402 347};{374 378 357};{382 384 259};{0 397 357};{442 424 188};{398 384 356};{392 420 355};][{404 405 359};{420 432 372};{405 408 383};{413 407
REALSTEP 27237 {2012-7-20 15:10:44};[{431 375 329};{416 453 334};{387 382 349};{397 403 248};{0 451 300};{453 422 131};{433 401 317};{434 505 326};][{443 384 328};{427 467 336};{391 386 344};{394 413
I read the file with readlines() and want to then loop over the lines and print only when there is a consecutive block of lines larger than 3, containing the string "REALSTEP". So in the example the expected result is:
REALSTEP 12342 {2012-7-20 15:10:39};[{416 369 338};{423 432 349};{383 380 357};{399 401 242};{0 454 285};{457 433 115};{419 455 314};{495 534 498};][{428 377 336};{433 456 345};{386 380 363};{384 411
REALSTEP 7191 {2012-7-20 15:10:41};[{416 370 361};{406 417 376};{377 368 359};{431 387 251};{0 461 366};{438 409 134};{429 411 349};{424 505 364};][{423 372 353};{420 433 374};{379 365 356};{431 387 2
REALSTEP 12123 {2012-7-20 15:10:42};[{375 382 329};{386 402 347};{374 378 357};{382 384 259};{0 397 357};{442 424 188};{398 384 356};{392 420 355};][{404 405 359};{420 432 372};{405 408 383};{413 407
REALSTEP 27237 {2012-7-20 15:10:44};[{431 375 329};{416 453 334};{387 382 349};{397 403 248};{0 451 300};{453 422 131};{433 401 317};{434 505 326};][{443 384 328};{427 467 336};{391 386 344};{394 413
I tried this:
lines = f.readlines()
idx = -1
#loop through all lines in the file
for i, line in enumerate(lines):
    if idx > i:
        continue
    else:
        if "REALSTEP" in line:
            steps = lines[i:i+3]
            #check for block of realsteps
            if is_block(steps, "REALSTEP") == 3:
                #prints the block up to the first next "FAKE STEP"
                lst = get_block(lines[i:-1])
                for l in lst:
                    print l[:200]
                idx = i + len(lst)
                print "next block============"
where the function is_block is this:
def is_block(lines, check):
    #print len(lines)
    bool = 1
    for l in lines:
        if check in l:
            bool = 1
        else:
            bool = 0
        bool = bool + bool
    return bool
and the function get_block:
def get_block(lines):
    lst = []
    for l in lines:
        if "REALSTEP" in l:
            #print l[:200]
            lst.append(l)
        else:
            break
    return lst
While this runs, it prints all lines containing the string "REALSTEP", not only the blocks of more than 3 consecutive lines. The print len(lines) in is_block(lines) is always 10 when the function is called, so that is not it.
I am confused, please help me out here!

Here's a simple solution containing the logic you need:
to_print = []
count = 0
for line in f.readlines():
    if "REALSTEP" in line:
        to_print.append(line)
        count += 1
    else:
        # A non-matching line ends the current block
        if count > 3: print(''.join(to_print))
        count = 0
        to_print = []
# Also print a qualifying block that runs to the end of the file
if count > 3: print(''.join(to_print))
It counts any line that has the string "REALSTEP" in it as valid and prints a block only when more than 3 such lines are consecutive. Produces the desired output.

This part:
...
        if "REALSTEP" in line:
            steps = lines[i:i+3]
            for s in steps:
                print s[:200] # <- right here
...
Whenever you find "REALSTEP" in a line, you retrieve the following three lines and print them right away. That's probably not what you wanted.
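The grouping the question asks for (runs of consecutive "REALSTEP" lines, kept only when longer than 3) can also be expressed with itertools.groupby. This is a hedged sketch rather than a fix of the code above: the function name realstep_blocks, its parameters, and the short sample lines are invented for illustration, and min_len=4 encodes "larger than 3".

```python
from itertools import groupby


def realstep_blocks(lines, marker="REALSTEP", min_len=4):
    """Return the runs of consecutive lines containing marker with >= min_len lines."""
    blocks = []
    # groupby splits the sequence wherever the key (marker present or not) changes
    for is_match, group in groupby(lines, key=lambda line: marker in line):
        block = list(group)
        if is_match and len(block) >= min_len:
            blocks.append(block)
    return blocks


sample = ["REALSTEP 1", "REALSTEP 2", "REALSTEP 3", "REALSTEP 4",
          "FAKE 5", "FAKE 6",
          "REALSTEP 7", "REALSTEP 8", "REALSTEP 9"]
for block in realstep_blocks(sample):
    print("\n".join(block))
```

Because groupby also yields the final run, a block that reaches the end of the file is handled without a separate check after the loop.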

Pandas FloatingPoint Error

I'm getting a floating point error on a simple time series in pandas. I'm trying to do shift operations... but this also happens with the window functions like rolling_mean.
EDIT: For some more info... I tried to actually build this from source yesterday prior to the error. I'm not sure if the error would've occurred prior the build attempt, as I'd never messed around w/ these functions.
EDIT2: I thought I'd fixed this, but when I run this inside python it works, but when it's in ipython I get the error.
EDIT3: Numpy 1.7.0, iPython 0.13, pandas 0.7.3
In [35]: ts = Series(np.arange(12), index=DateRange('1/1/2000', periods=12, freq='T'))
In [36]: ts.shift(0)
Out[36]:
2000-01-03 0
2000-01-04 1
2000-01-05 2
2000-01-06 3
2000-01-07 4
2000-01-10 5
2000-01-11 6
2000-01-12 7
2000-01-13 8
2000-01-14 9
2000-01-17 10
2000-01-18 11
In [37]: ts.shift(1)
Out[37]: ---------------------------------------------------------------------------
FloatingPointError Traceback (most recent call last)
/Users/trenthauck/Repository/work/SQS/analysis/campaign/tv2/data/<ipython-input-37-2b7cec97d440> in <module>()
----> 1 ts.shift(1)
/Library/Python/2.7/site-packages/ipython-0.13.dev-py2.7.egg/IPython/core/displayhook.pyc in __call__(self, result)
236 self.start_displayhook()
237 self.write_output_prompt()
--> 238 format_dict = self.compute_format_data(result)
239 self.write_format_data(format_dict)
240 self.update_user_ns(result)
/Library/Python/2.7/site-packages/ipython-0.13.dev-py2.7.egg/IPython/core/displayhook.pyc in compute_format_data(self, result)
148 MIME type representation of the object.
149 """
--> 150 return self.shell.display_formatter.format(result)
151
152 def write_format_data(self, format_dict):
/Library/Python/2.7/site-packages/ipython-0.13.dev-py2.7.egg/IPython/core/formatters.pyc in format(self, obj, include, exclude)
124 continue
125 try:
--> 126 data = formatter(obj)
127 except:
128 # FIXME: log the exception
/Library/Python/2.7/site-packages/ipython-0.13.dev-py2.7.egg/IPython/core/formatters.pyc in __call__(self, obj)
445 type_pprinters=self.type_printers,
446 deferred_pprinters=self.deferred_printers)
--> 447 printer.pretty(obj)
448 printer.flush()
449 return stream.getvalue()
/Library/Python/2.7/site-packages/ipython-0.13.dev-py2.7.egg/IPython/lib/pretty.pyc in pretty(self, obj)
353 if callable(obj_class._repr_pretty_):
354 return obj_class._repr_pretty_(obj, self, cycle)
--> 355 return _default_pprint(obj, self, cycle)
356 finally:
357 self.end_group()
/Library/Python/2.7/site-packages/ipython-0.13.dev-py2.7.egg/IPython/lib/pretty.pyc in _default_pprint(obj, p, cycle)
473 if getattr(klass, '__repr__', None) not in _baseclass_reprs:
474 # A user-provided repr.
--> 475 p.text(repr(obj))
476 return
477 p.begin_group(1, '<')
/Library/Python/2.7/site-packages/pandas/core/series.pyc in __repr__(self)
696 result = self._get_repr(print_header=True,
697 length=len(self) > 50,
--> 698 name=True)
699 else:
700 result = '%s' % ndarray.__repr__(self)
/Library/Python/2.7/site-packages/pandas/core/series.pyc in _get_repr(self, name, print_header, length, na_rep, float_format)
756 length=length, na_rep=na_rep,
757 float_format=float_format)
--> 758 return formatter.to_string()
759
760 def __str__(self):
/Library/Python/2.7/site-packages/pandas/core/format.pyc in to_string(self)
99
100 fmt_index, have_header = self._get_formatted_index()
--> 101 fmt_values = self._get_formatted_values()
102
103 maxlen = max(len(x) for x in fmt_index)
/Library/Python/2.7/site-packages/pandas/core/format.pyc in _get_formatted_values(self)
90 return format_array(self.series.values, None,
91 float_format=self.float_format,
---> 92 na_rep=self.na_rep)
93
94 def to_string(self):
/Library/Python/2.7/site-packages/pandas/core/format.pyc in format_array(values, formatter, float_format, na_rep, digits, space, justify)
431 justify=justify)
432
--> 433 return fmt_obj.get_result()
434
435
/Library/Python/2.7/site-packages/pandas/core/format.pyc in get_result(self)
528
529 # this is pretty arbitrary for now
--> 530 has_large_values = (np.abs(self.values) > 1e8).any()
531
532 if too_long and has_large_values:
FloatingPointError: invalid value encountered in absolute
In [38]: ts.shift(-1)
Out[38]: ---------------------------------------------------------------------------
FloatingPointError Traceback (most recent call last)
/Users/myusername/Repository/work/SQS/analysis/campaign/tv2/data/<ipython-input-38-314ec815a7c5> in <module>()
----> 1 ts.shift(-1)
/Library/Python/2.7/site-packages/ipython-0.13.dev-py2.7.egg/IPython/core/displayhook.pyc in __call__(self, result)
236 self.start_displayhook()
237 self.write_output_prompt()
--> 238 format_dict = self.compute_format_data(result)
239 self.write_format_data(format_dict)
240 self.update_user_ns(result)
/Library/Python/2.7/site-packages/ipython-0.13.dev-py2.7.egg/IPython/core/displayhook.pyc in compute_format_data(self, result)
148 MIME type representation of the object.
149 """
--> 150 return self.shell.display_formatter.format(result)
151
152 def write_format_data(self, format_dict):
/Library/Python/2.7/site-packages/ipython-0.13.dev-py2.7.egg/IPython/core/formatters.pyc in format(self, obj, include, exclude)
124 continue
125 try:
--> 126 data = formatter(obj)
127 except:
128 # FIXME: log the exception
/Library/Python/2.7/site-packages/ipython-0.13.dev-py2.7.egg/IPython/core/formatters.pyc in __call__(self, obj)
445 type_pprinters=self.type_printers,
446 deferred_pprinters=self.deferred_printers)
--> 447 printer.pretty(obj)
448 printer.flush()
449 return stream.getvalue()
/Library/Python/2.7/site-packages/ipython-0.13.dev-py2.7.egg/IPython/lib/pretty.pyc in pretty(self, obj)
353 if callable(obj_class._repr_pretty_):
354 return obj_class._repr_pretty_(obj, self, cycle)
--> 355 return _default_pprint(obj, self, cycle)
356 finally:
357 self.end_group()
/Library/Python/2.7/site-packages/ipython-0.13.dev-py2.7.egg/IPython/lib/pretty.pyc in _default_pprint(obj, p, cycle)
473 if getattr(klass, '__repr__', None) not in _baseclass_reprs:
474 # A user-provided repr.
--> 475 p.text(repr(obj))
476 return
477 p.begin_group(1, '<')
/Library/Python/2.7/site-packages/pandas/core/series.pyc in __repr__(self)
696 result = self._get_repr(print_header=True,
697 length=len(self) > 50,
--> 698 name=True)
699 else:
700 result = '%s' % ndarray.__repr__(self)
/Library/Python/2.7/site-packages/pandas/core/series.pyc in _get_repr(self, name, print_header, length, na_rep, float_format)
756 length=length, na_rep=na_rep,
757 float_format=float_format)
--> 758 return formatter.to_string()
759
760 def __str__(self):
/Library/Python/2.7/site-packages/pandas/core/format.pyc in to_string(self)
99
100 fmt_index, have_header = self._get_formatted_index()
--> 101 fmt_values = self._get_formatted_values()
102
103 maxlen = max(len(x) for x in fmt_index)
/Library/Python/2.7/site-packages/pandas/core/format.pyc in _get_formatted_values(self)
90 return format_array(self.series.values, None,
91 float_format=self.float_format,
---> 92 na_rep=self.na_rep)
93
94 def to_string(self):
/Library/Python/2.7/site-packages/pandas/core/format.pyc in format_array(values, formatter, float_format, na_rep, digits, space, justify)
431 justify=justify)
432
--> 433 return fmt_obj.get_result()
434
435
/Library/Python/2.7/site-packages/pandas/core/format.pyc in get_result(self)
528
529 # this is pretty arbitrary for now
--> 530 has_large_values = (np.abs(self.values) > 1e8).any()
531
532 if too_long and has_large_values:
FloatingPointError: invalid value encountered in absolute

It works for me in python and IPython 0.12. IPython 0.13 is still in development (see http://ipython.org/), and since the errors you're getting seem to involve formatting in the IPython 0.13 egg, I suspect that might be the cause. Try IPython 0.12 instead; if it works, file a bug report with IPython and then probably stick with 0.12 until 0.13 is more stable.
