Python BeautifulSoup string to float

I have downloaded a data set in XML format from a web page and extracted the values tag using Python's BeautifulSoup library. This gives me unicode values.
graphs = soup.graphs
c = 0
for q in graphs:
    name = q['title']
    data = {}
    for r in graphs.contents[c]:
        print float(str(unicode(r.string)))
        data[r['xid']] = unicode(r.string)
    c = c + 1
    result[name] = [data[k] for k in key]
The source is http://charts.realclearpolitics.com/charts/1171.xml
I want to convert r.string to a float, so I tried:
print float(str(unicode(r.string)))
print float(unicode(r.string))
But I got this error:
  File "<ipython-input-142-cf14a8845443>", line 73
    print float(unicode(r.string)))
                                  ^
SyntaxError: invalid syntax
What should I do?

The first error is unbalanced parentheses:
print float(str(unicode(r.string))))
                                   ^ 4th closing parenthesis here
Second, check whether the value is None before converting it. Otherwise you'll get ValueError: could not convert string to float: None
So the fix is:
if r.string is not None:
    print float(str(unicode(r.string)))

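The syntax error is caused by unbalanced parentheses (remove one from the right). r.string is probably None, hence the conversion error once the syntax is fixed (a ValueError here, because unicode(None) gives the string 'None', which float cannot parse).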
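The None check from the answers can be wrapped in a small helper. This is a minimal Python 3 sketch rather than the original Python 2 code; the name to_float is hypothetical, and it assumes you would rather get None back than an exception when a tag has no usable text.

```python
def to_float(text):
    """Return text as a float, or None when it is missing or not numeric."""
    if text is None:
        # A tag without a single text child has .string set to None.
        return None
    try:
        return float(text)
    except ValueError:
        # Covers empty strings and non-numeric text.
        return None


print(to_float("45.3"))
print(to_float(None))
```

Inside the loop you would call to_float(r.string) and skip the entries that come back as None.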

Related

Even though I assigned zero to the variables, I am getting the error "can't assign to literal"

def car_type(m):
    ar=0,br=0,cr=0
    for i in m:
        if i=='sedan':
            ar+=1
        if i=='SUV':
            br+=1
        if i=='hatchback':
            cr+=1
    l.append(ar)
    l.append(br)
    l.append(cr)
    print(l)
Error:
  File "<ipython-input-7-c1c0dab37c53>", line 2
    ar=0,br=0,cr=0
    ^
SyntaxError: can't assign to literal
What is the issue here?

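You can't assign several variables on one line like that: Python reads ar=0,br=0,cr=0 as a chained assignment whose targets include the tuple 0,br, and a literal such as 0 cannot be assigned to.
The correct syntax for multiple assignments on one line is:
ar, br, cr = 0, 0, 0
If you want to assign the same value to all of them, you can use:
ar = br = cr = 0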
or just break it down into multiple lines:
ar = 0
br = 0
cr = 0
Which version is better depends on the context and the meaning of the variables.
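Put together, the asker's function might look like the sketch below. The descriptive variable names and the returned list are assumptions made for illustration; the original appended to a list named l that is never defined in the posted code.

```python
def car_type(cars):
    """Count sedans, SUVs and hatchbacks in a list of car type names."""
    # Tuple unpacking assigns all three counters on one line.
    sedans, suvs, hatchbacks = 0, 0, 0
    for car in cars:
        if car == 'sedan':
            sedans += 1
        elif car == 'SUV':
            suvs += 1
        elif car == 'hatchback':
            hatchbacks += 1
    return [sedans, suvs, hatchbacks]


print(car_type(['sedan', 'SUV', 'sedan', 'hatchback']))  # [2, 1, 1]
```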

ValueError when stripping a string and attempting to convert an object column to an integer in Python

I am attempting to strip out any characters other than numbers in a column and then convert that column from an object to integer, but I am receiving an error message.
data.dtypes
Column1 object
The column of interest has numbers but also ',' in it, which I believe is preventing it from being converted to an integer. This is what I tried, followed by the error message I got:
data['Column1'] = data['Column1'].str.strip(',').astype(int)
ValueError: invalid literal for int() with base 10: '21,690'
I can edit and post the entire error message if that is helpful, but I'm assuming that the ',' in the column is causing the issue. I don't believe there are any other characters in the column, but I'm not sure.
Ultimately I'd like that column converted to the int data type and stripped of any non-numeric characters.
Edit
I also checked for null values with
data.isnull().sum() * 100 / len(data)
Column1 0.00000
However, trying this results in the following error message:
data['Column1'] = data['Column1'].str.replace(",", "").astype(int)
ValueError: cannot convert float NaN to integer
Not entirely sure why.

Try this:
data['Column1'] = data['Column1'].str.replace(",", "").astype(int)

strip removes only leading and trailing characters. For characters inside the string you could use a regex, but it's simpler to use replace(',', '').
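The difference is easy to see on a plain Python string, using the value from the error message. This sketch uses the built-in str methods rather than the pandas .str accessor, which applies the same operations element by element.

```python
value = '21,690'

# strip only looks at the ends of the string, so the inner comma survives.
stripped = value.strip(',')

# replace removes every occurrence, wherever it appears.
replaced = value.replace(',', '')

print(stripped)       # 21,690
print(int(replaced))  # 21690
```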
Also, as far as debugging, you can do something along these lines:
conversion_none_type_count = 0
conversion_non_int_count = 0
conversion_other_error = 0
conversion_none_types = []
conversion_other_errors = []
def try_convert(cell):
    global conversion_none_type_count
    global conversion_non_int_count
    global conversion_other_error
    global conversion_other_errors
    global conversion_none_types
    # Falsy cells (None, empty string, 0): count them and keep a copy.
    if not cell:
        conversion_none_type_count += 1
        conversion_none_types.append(cell)
        return 0
    try:
        cell_as_float = float(cell)
        cell_as_int = int(cell_as_float)  # raises ValueError for NaN
    except Exception as exception:
        conversion_other_error += 1
        conversion_other_errors.append(type(exception).__name__)
        return 0
    # Values such as 2.5 convert, but lose their fractional part.
    if cell_as_float % 1:
        conversion_non_int_count += 1
    return cell_as_int
# Pass the function itself; apply calls it once per cell.
data['conversion_attempt'] = data['Column1'].apply(try_convert)
You'll now have conversion_none_type_count, conversion_non_int_count, and conversion_other_error as the counts of empty cells, floats that aren't integers, and other errors, respectively, while conversion_none_types will be a list of all the empty or None values in the column, and conversion_other_errors will be a list of the other error types.

Error appears when attempting to create a map with folium in Python

My assignment is to create an html file of a map. The data has already been given to us. However, when I try to execute my code, I get two errors:
"TypeError: 'str' object cannot be interpreted as an integer"
and
"KeyError: 'Latitude'"
this is the code that I've written:
import folium
import pandas as pd
cuny = pd.read_csv('datafile.csv')
print (cuny)
mapCUNY = folium.Map(location=[40.768731, -73.964915])
for index,row in cuny.iterrows():
    lat = row["Latitude"]
    lon = row["Longitude"]
    name = row["TIME"]
    newMarker = folium.Marker([lat,lon], popup=name)
    newMarker.add_to(mapCUNY)
out = input('name: ')
mapCUNY.save(outfile = 'out.html')
When I run it, I get all the data in the python shell and then those two errors mentioned above pop up.
Something must have gone wrong, and I'll admit I'm not at all good with this stuff. Could anyone let me know if they spot error(s) or know what I've done wrong?

Generally, "TypeError: 'str' object cannot be interpreted as an integer" can happen when you try to use a string as an integer.
For example:
num_string = "2"
num = num_string + 1  # This fails with a TypeError because num_string is a string
num = int(num_string) + 1  # This works because num_string is converted to int first
A KeyError means that the key you are requesting does not exist. Perhaps there is no Latitude column, or it is misspelled or capitalized differently in the file.
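One way to track down the KeyError is to compare the column names the code expects with the header that is actually in the file. The sketch below uses only the standard csv module; the helper name missing_columns and the sample file contents are hypothetical, made up to show how a lowercase name or a stray space produces a missing key.

```python
import csv
import io


def missing_columns(csv_text, required):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [name for name in required if name not in header]


# Hypothetical contents: "latitude" is lowercase and " Longitude" has a leading space.
sample = "TIME,latitude, Longitude\n09:00,40.768731,-73.964915\n"
print(missing_columns(sample, ["Latitude", "Longitude", "TIME"]))
# ['Latitude', 'Longitude']
```

With pandas, printing cuny.columns right after read_csv gives the same information.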

Binary To String Conversion

I'm relatively new to python and I'm trying to design a program that takes an input of Binary Data and Converts it to a String of Text. I have the following so far but keep getting the following error: TypeError: unsupported operand type(s) for &: 'str' and 'int' Can anyone see where I'm going wrong? And, if so, advise how to fix it?
a = int(input("please enter the binary you want to convert: "))
for str in a:
    g = [~(chr(str)&~240)|(e&~240)]
    f = 86
    e = 231
    d = bin(chr(str))
    b = (str)
j=(b)
print(j)

There is quite a lot wrong with what you're doing; I'm not sure how you get the error you claim to have, given the other errors in the posted code. In order of reading the function:
Don't call your own variables str, it prevents you from accessing the built-in of the same name. Also, that variable either isn't a str, or causes a TypeError on chr(str).
You can't iterate over an integer with for x in y:; this is also a TypeError.
(The error you report) chr(str) returns a string of length one. This is not a valid operand type for &, hence TypeError.
Another operand, e, has not yet been defined, so that will be a NameError.
Irrespective of that, you never use g again anyway.
Or f - what's that even for?
Now e is defined!
bin(chr(str)) will never work - again, chr returns a string, and bin takes a number as an argument.
b = (str) works, but the parentheses are redundant.
Same with j = (b), which is also not indented far enough to be in the loop.
Neither is print(j).
It is not clear what you are trying to achieve, exactly. If you gave example inputs (what format is the "Binary Data"?) and outputs (and what "String of Text" do you want to get?) along with your actual code and the full error traceback this might be easier.
Edit
With the information provided in your comment, it appears that you are trying to reverse these operations:
a = input("please enter the text you want to hide: ")
for ch in a:
    ## b = (ch) # this still doesn't do anything!
    c = ord(ch)
    ## d = bin(c) # and d is never used
    ## e = 231 # e is only used to calculate g
    f = 86
    ## g = (((c&240) >> 4)|(e&240)) # which is also never used
    h = (((c&15)|(f&240)))
    j = bin(h)
    print(j)
This produces, for example (a == 'foo'):
0b1010110
0b1011111
0b1011111
However, to convert the input '0b1010110' to an integer, you need to supply int with the appropriate base:
>>> int('0b1010110', 2)
86
and you can't iterate over the integer. I think you want something like:
data = input("please enter the binary you want to convert: ")
for j in data.split():
    h = int(j, 2)
    ...
where the input would be e.g. '0b1010110 0b1011111 0b1011111', or just do one at a time:
h = int(input(...), 2)
Note that, as the function is reversed, you will have to define f before trying to go back through the bitwise operation.
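As a sketch of the first step back, the parsing above can be combined with a mask that undoes the (f&240) part. The function name low_nibbles is hypothetical. Note that h = (c&15)|(f&240) keeps only the low four bits of each character, so these values alone are not enough to recover the original text; the high bits would have to come from the g value that the forward code never outputs.

```python
def low_nibbles(data):
    """Parse whitespace-separated binary literals and keep the low four bits of each."""
    # int(token, 2) accepts an optional '0b' prefix when the base is 2.
    return [int(token, 2) & 15 for token in data.split()]


print(low_nibbles('0b1010110 0b1011111 0b1011111'))  # [6, 15, 15]
```

For the 'foo' example these are the low four bits of ord('f') and ord('o').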

Generating LaTeX tables from R summary with RPy and xtable

I am running a few linear model fits in python (using R as a backend via RPy) and I would like to export some LaTeX tables with my R "summary" data.
This thread explains quite well how to do it in R (with the xtable function), but I cannot figure out how to implement this in RPy.
The only relevant thing searches such as "Chunk RPy" or "xtable RPy" returned was this, which seems to load the package in python but not to use it :-/
Here's an example of how I use RPy and what happens.
And this would be the error without bothering to load any data:
from rpy2.robjects.packages import importr
xtable = importr('xtable')
latex = xtable('')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-131-8b38f31b5bb9> in <module>()
----> 1 latex = xtable(res_sum)
      2 print latex
TypeError: 'SignatureTranslatedPackage' object is not callable
I have tried using the stargazer package instead of xtable and I get the same error.

Ok, I solved it, and I'm a bit ashamed to say that it was a total no-brainer.
You just have to call the functions as xtable.xtable() or stargazer.stargazer().

To easily generate TeX data from Python, I wrote the following function:
import re
def tformat(txt, v):
    """Replace the variables between [] in raw text with the contents
    of the named variables. Between the [] there should be a variable name,
    a colon and a formatting specification. E.g. [smin:.2f] would give the
    value of the smin variable printed as a float with two decimal digits.
    :txt: The text to search for replacements
    :v: Dictionary to use for variables.
    :returns: The txt string with variables substituted by their formatted
    values.
    """
    rx = re.compile(r'\[(\w+)(\[\d+\])?:([^\]]+)\]')
    matches = rx.finditer(txt)
    for m in matches:
        nm, idx, fmt = m.groups()
        try:
            if idx:
                idx = int(idx[1:-1])
                r = format(v[nm][idx], fmt)
            else:
                r = format(v[nm], fmt)
            txt = txt.replace(m.group(0), r)
        except KeyError:
            raise ValueError('Variable "{}" not found'.format(nm))
    return txt
You can use any variable name from the dictionary in the text that you pass to this function and have it replaced by the formatted value of that variable.
What I tend to do is to do my calculations in Python, and then pass the output of the globals() function as the second parameter of tformat:
smin = 235.0
smax = 580.0
lst = [0, 1, 2, 3, 4]
t = r'''The strength of the steel lies between \SI{[smin:.0f]}{MPa} and \SI{[smax:.0f]}{MPa}. lst[2] = [lst[2]:d].'''
print(tformat(t, globals()))
Feel free to use this. I put it in the public domain.
Edit: I'm not sure what you mean by "linear model fits", but might numpy.polyfit do what you want in Python?

To resolve your problem, please update stargazer to version 4.5.3, now available on CRAN. Your example should then work perfectly.
