Trouble finding certain tag values - python

I'm trying to find the value of several tags using pydicom. For some reason only certain tags work while others do not. Below is a traceback explaining my problem. Can anyone see a way around the int() base 16 problem?
>>> ds['0x18','0x21'].value
'SP'
>>> ds['0x18','13x14'].value
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/space/jazz/1/users/gwarner/anaconda/lib/python2.7/site-packages/pydicom-0.9.9-py2.7.egg/dicom/dataset.py", line 276, in __getitem__
    tag = Tag(key)
  File "/space/jazz/1/users/gwarner/anaconda/lib/python2.7/site-packages/pydicom-0.9.9-py2.7.egg/dicom/tag.py", line 27, in Tag
    arg = (int(arg[0], 16), int(arg[1], 16))
ValueError: invalid literal for int() with base 16: '13x14'

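'13x14' is not a valid representation of a base-16 number.
In Python, a base-16 number is written with '0x' as a prefix, followed by hexadecimal digits (0-9, A-F). An 'x' anywhere else in the string makes int(..., 16) fail.
For example:
0x0, 0x1, 0x001, 0x235, 0xA5F, ..., are all valid base-16 number representations.
This:
ds['0x18','13x14'].value
could be, for example, this:
ds['0x18','0x14'].value
or, if the element you want is 1314, this:
ds['0x18','0x1314'].value
Either form parses as a valid tag; the lookup then succeeds as long as the dataset actually contains that element.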
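The failure can be reproduced without pydicom, because the traceback shows Tag calling int(arg, 16) on each part of the key. The following is a minimal sketch of that conversion; parse_tag is a hypothetical helper written for illustration, not a pydicom function.

```python
def parse_tag(group, element):
    """Hypothetical helper: convert two hex strings to a (group, element) pair of ints."""
    return (int(group, 16), int(element, 16))


print(parse_tag('0x18', '0x21'))    # valid: both strings are hex literals
print(parse_tag('0x18', '0x1314'))  # valid: '0x' prefix, then hex digits

try:
    parse_tag('0x18', '13x14')      # 'x' in the middle is not a hex digit
except ValueError as exc:
    print('rejected:', exc)
```

Checking the strings this way before indexing the dataset separates a malformed key (ValueError) from a well-formed tag that is simply absent from the file.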

Related

Beautiful Soup web scraping and working with integers

I have the following code, which uses BeautifulSoup and Python to scrape some coronavirus statistics and then work out a percentage:
url = "https://www.worldometers.info/coronavirus/"
req = requests.get(url)
bsObj = BeautifulSoup(req.text, "html.parser")
data = bsObj.find_all("div",class_ = "maincounter-number")
totalcases=data[0].text.strip()
recovered=data[2].text.strip()
print(totalcases+3)
percentagerecovered=recovered/totalcases*100
The issue I am having is in producing the required value for the variable percentagerecovered.
I want to work with integers, but the above didn't work, so I tried:
percentagerecovered=int(recovered)/int(totalcases)*100
but it gave this error:
  File "E:\webscraper\webscraper\webscraper.py", line 17, in <module>
    percentagerecovered=int(recovered)/int(totalcases)*100
ValueError: invalid literal for int() with base 10: '6,175,537'
However, when I removed the casting and just tried to print the value, it gave a different error that I am struggling to understand.
I changed it to:
totalcases=data[0].text.strip()
recovered=data[2].text.strip()
print(totalcases+3)
percentagerecovered=recovered/totalcases*100
ERROR
File "webscraper.py", line 16, in <module>
print(totalcases+3)
TypeError: can only concatenate str (not "int") to str
I simply want to obtain those strings using the strip method and then work with them as integers.
Currently, when I pass them without casting, nothing is displayed on the page, but when I cast them to int, I get errors. What am I doing wrong?
I also tried:
totalcases=int(totalcases)
recovered=int(recovered)
but this produced a further error:
File "webscraper.py", line 17, in <module>
totalcases=int(totalcases)
ValueError: invalid literal for int() with base 10: '11,018,642'
I also tried stripping the commas, as suggested in the comments:
totalcases=data[0].text.strip()
recovered=data[2].text.strip()
totalcases=totalcases.strip(",")
totalcases=int(totalcases)
recovered=recovered.strip(",")
recovered=int(recovered)
percentagerecovered=recovered/totalcases*100
ERROR:
totalcases=int(totalcases)
ValueError: invalid literal for int() with base 10: '11,018,684'
I note solutions like the function below (which I haven't tried yet), but they seem unnecessarily complex for what I'm trying to do. What is the best and easiest/most elegant solution?
This seems along the right lines, but still produces an error:
int(totalcases.replace(',', ''))
int(recovered.replace(',', ''))
ERROR:
File "webscraper.py", line 25, in <module>
percentagerecovered=recovered/totalcases*100
TypeError: unsupported operand type(s) for /: 'str' and 'str'
I wrote this little function that returns a number, so you can increment it or do whatever you want with it:
def str_to_int(text=None):
    # Nothing to convert: warn and return None explicitly.
    if text is None:
        print('no text')
        return None
    # Drop the thousands separators and convert what is left.
    return int(''.join(text.split(',')))
For example, if the totalcases value is '11,018,642', you do this:
totalcases = str_to_int('11,018,642')
Now you can do totalcases*100 or anything else with it.
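Another simple way is to remove the commas with replace and assign the converted value back. Note that str.strip only removes leading and trailing characters, and calling int(...) without assigning the result leaves the variable a str: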
totalcases= int(data[0].text.strip().replace(',',''))
recovered = int(data[2].text.strip().replace(',',''))
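The replace-and-convert approach can be tried without the network request. This is a small sketch that uses sample strings copied from the error messages in the question in place of data[n].text; to_int is a hypothetical helper name.

```python
def to_int(text):
    """Convert a string such as '11,018,642' to an int by dropping the commas."""
    return int(text.strip().replace(',', ''))


# Sample strings standing in for data[0].text and data[2].text (assumed values).
totalcases = to_int('11,018,642')
recovered = to_int('6,175,537')

# Both operands are ints now, so the division works and yields a float.
percentagerecovered = recovered / totalcases * 100
print(round(percentagerecovered, 2))
```

The key point is that the converted value is assigned to a name; the string itself is never modified in place.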

Dividing and multiplying Decimal objects in Python

In the following code, both coeff1 and coeff2 are Decimal objects. When I check their type using type(coeff1), I get <class 'decimal.Decimal'>, but when I wrote some test code and checked Decimal objects there, I got decimal.Decimal, without the word class.
coeff1 = system[i].normal_vector.coordinates[i]
coeff2 = system[m].normal_vector.coordinates[i]
x = coeff2/coeff1
print(type(x))
system.xrow_add_to_row(x,i,m)
Another issue arises when I change the first argument of xrow_add_to_row to negative x:
system.xrow_add_to_row(-x,i,m)
I get an InvalidOperation error at a line above the changed code:
<ipython-input-11-ce84b250bafa> in compute_triangular_form(self)
93 coeff1 = system[i].normal_vector.coordinates[i]
94 coeff2 = system[m].normal_vector.coordinates[i]
---> 95 x = coeff2/coeff1
96 print(type(coeff1))
97 system.xrow_add_to_row(-x,i,m)
InvalidOperation: [<class 'decimal.DivisionUndefined'>]
But then again, in my test code I use negative numbers with Decimal objects and it works fine. Any idea what the problem might be? Thanks.
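decimal.DivisionUndefined is the condition signalled when you attempt to divide zero by zero; it surfaces as decimal.InvalidOperation. In your case that means both coeff1 and coeff2 are zero when line 95 runs. Passing -x changes the rows produced by earlier iterations, which is presumably how a zero ends up in that position. It's a bit confusing, as you get a different exception when only the divisor is zero (decimal.DivisionByZero):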
>>> from decimal import Decimal as D
>>> D(0) / D(0)
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    D(0) / D(0)
decimal.InvalidOperation: [<class 'decimal.DivisionUndefined'>]
>>> D(1) / D(0)
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    D(1) / D(0)
decimal.DivisionByZero: [<class 'decimal.DivisionByZero'>]
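If a zero divisor is a legitimate case in the elimination loop, both exceptions shown above can be handled in one place. The following is a sketch under the assumption that the default decimal context is in use (it traps both conditions); ratio_or_none is a hypothetical helper, not part of the decimal module.

```python
from decimal import Decimal, DivisionByZero, InvalidOperation


def ratio_or_none(numerator, denominator):
    """Hypothetical helper: return numerator / denominator, or None if the divisor is zero."""
    try:
        return numerator / denominator
    except (DivisionByZero, InvalidOperation):
        # DivisionByZero: nonzero / 0. InvalidOperation: 0 / 0 (DivisionUndefined).
        return None


print(ratio_or_none(Decimal(6), Decimal(3)))
print(ratio_or_none(Decimal(1), Decimal(0)))
print(ratio_or_none(Decimal(0), Decimal(0)))
```

An explicit check such as `if denominator == 0` before dividing would work equally well and avoids catching InvalidOperation raised for unrelated reasons.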

Is there any way to make Python .format() throw an exception if the data won't fit the field?

I want to normalize floating-point numbers to nn.nn strings, and to do some special handling if the number is out of range.
try:
    norm = '{:5.2f}'.format(f)
except ValueError:
    norm = 'BadData' # actually a bit more complex than this
except it doesn't work: .format silently overflows the 5-character width. Obviously I could length-check norm and raise my own ValueError, but have I missed any way to force format (or the older % formatting) to raise an exception on field-width overflow?
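You cannot achieve this with format() alone: the width in a format spec is a minimum, so longer output is never truncated or rejected. You have to write a custom formatter that raises the exception. For example:
def format_float(num, max_int=5, decimal=2):
    if len(str(num).split('.')[0]) > max_int:  # count characters before the decimal point
        raise ValueError('Integer part of float can have maximum {} digits'.format(max_int))
    return "{:.{}f}".format(num, decimal)  # honour the requested number of decimal places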
Sample run:
>>> format_float(123.456)
'123.46'
>>> format_float(123.4)
'123.40'
>>> format_float(123789.456) # Error: the integer part has more than 5 digits
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in format_float
ValueError: Integer part of float can have maximum 5 digits
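An alternative is the length check mentioned in the question: format first, then compare the result against the field width. This is a minimal sketch, where format_fixed and its parameter names are hypothetical; it counts the sign and the decimal point towards the width, exactly as the format spec does.

```python
def format_fixed(value, width=5, precision=2):
    """Hypothetical helper: format value as fixed-point, raising if it overflows width."""
    text = '{:{w}.{p}f}'.format(value, w=width, p=precision)
    if len(text) > width:
        raise ValueError('{!r} does not fit in {} characters'.format(text, width))
    return text


print(repr(format_fixed(3.14159)))   # padded on the left to 5 characters
print(repr(format_fixed(-1.5)))      # the sign counts towards the width

try:
    format_fixed(123.456)            # formats to 6 characters, so it is rejected
except ValueError as exc:
    print('BadData:', exc)
```

Checking the formatted string rather than the raw number also catches overflow introduced by rounding, which a check on the unformatted integer part can miss.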

TypeError: not all arguments converted during string formatting 11

def main():
    spiral = open('spiral.txt', 'r') # open input text file
    dim = spiral.readline() # read first line of text
    print(dim)
    if (dim % 2 == 0): # check to see if even
        dim += 1 # make odd
I know this is probably very obvious but I can't figure out what is going on. I am reading a file that simply has one number and checking to see if it is even. I know it is being read correctly because it prints out 10 when I call it to print dim. But then it says:
TypeError: not all arguments converted during string formatting
for the line in which I am testing to see if dim is even. I'm sure it's basic but I can't figure it out.
The readline method of file objects always returns a string; it will not convert the number into an integer for you. You need to do this explicitly:
dim = int(spiral.readline())
Otherwise, dim will be a string and doing dim % 2 will cause Python to try to perform string formatting with 2 as an argument:
>>> '10' % 2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting
>>>
Also, print(dim) output 10 instead of '10' because print writes the contents of the string without the quotes; the quotes appear only in the repr that the interactive prompt shows:
>>> print('10')
10
>>>
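The corrected logic can be tried without the input file. In this sketch io.StringIO stands in for open('spiral.txt', 'r'), and the content '10\n' is an assumption based on the value printed in the question.

```python
import io

# io.StringIO stands in for the real file; '10\n' is assumed sample content.
spiral = io.StringIO('10\n')

dim = int(spiral.readline())  # int() ignores the trailing newline
if dim % 2 == 0:              # now an arithmetic modulo, not string formatting
    dim += 1                  # make odd

print(dim)
```

No strip() is needed before int(), since int() accepts surrounding whitespace.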

Float value is equal -1.#IND

A function returns a list of float values. When I plot this list, I see that some of the values are equal to -1.#IND. I also checked the type of those -1.#IND values, and they are of float type too.
But how should I understand these -1.#IND values? What do they represent or stand for?
-1.#IND means indefinite, the result of a floating-point operation that has no defined value. On other platforms, you'd get NaN instead, meaning 'not a number'; -1.#IND is specific to Windows. On Python 2.5 on Linux I get:
>>> 1e300 * 1e300 * 0
-nan
You'll only find this on Python versions 2.5 and earlier, on Windows platforms. The float() code was improved in Python 2.6 and now consistently produces nan for such results, mostly because there was no way to turn 1.#INF and -1.#IND back into an actual float() instance again:
>>> repr(inf)
'1.#INF'
>>> float(_)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): 1.#INF
>>> repr(nan)
'-1.#IND'
>>> float(_)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): -1.#IND
On versions 2.6 and newer this has all been cleaned up and made consistent:
>>> 1e300 * 1e300 * 0
nan
>>> 1e300 * 1e300
inf
>>> 1e300 * 1e300 * -1
-inf
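On a current Python 3, such values can be detected and filtered out before plotting. This is a short sketch using the standard math module; the values list is made-up sample data.

```python
import math

inf = 1e300 * 1e300   # overflows to infinity
nan = inf * 0         # inf * 0 has no defined value, so the result is NaN

values = [1.5, nan, 2.5, inf, -inf]  # made-up sample data

# NaN is the only float that is not equal to itself; math.isnan tests for it directly.
print(nan != nan, math.isnan(nan))

# Keep only finite values, for example before plotting.
finite = [v for v in values if math.isfinite(v)]
print(finite)
```

Comparing with == cannot find NaN entries, which is why math.isnan or math.isfinite is the usual test.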
