Beautiful Soup web scraping and working with integers - python

I have the following code, which uses BeautifulSoup and Python to scrape some coronavirus stats and then work out a percentage:
url = "https://www.worldometers.info/coronavirus/"
req = requests.get(url)
bsObj = BeautifulSoup(req.text, "html.parser")
data = bsObj.find_all("div",class_ = "maincounter-number")
totalcases=data[0].text.strip()
recovered=data[2].text.strip()
print(totalcases+3)
percentagerecovered=recovered/totalcases*100
The issue I am having is in producing the required value for the variable percentagerecovered.
I want to be working with integers, but the above didn't work, so I tried:
percentagerecovered=int(recovered)/int(totalcases)*100
but it gave this error:
File "E:\webscraper\webscraper\webscraper.py", line 17, in <module>
percentagerecovered=int(recovered)/int(totalcases)*100
ValueError: invalid literal for int() with base 10: '6,175,537'
However, when I removed the casting and just tried to print the value, it gave a different error that I am struggling to understand.
I changed it to:
totalcases=data[0].text.strip()
recovered=data[2].text.strip()
print(totalcases+3)
percentagerecovered=recovered/totalcases*100
ERROR
File "webscraper.py", line 16, in <module>
print(totalcases+3)
TypeError: can only concatenate str (not "int") to str
I simply want to obtain those strings using the strip method and then work with them as integers.
Currently, when I pass them without casting, nothing is displayed on the page, but when I cast them to int, I get errors. What am I doing wrong?
I also tried:
totalcases=int(totalcases)
recovered=int(recovered)
but this produced a further error:
File "webscraper.py", line 17, in <module>
totalcases=int(totalcases)
ValueError: invalid literal for int() with base 10: '11,018,642'
I also tried stripping the comma, as suggested in the comments:
totalcases=data[0].text.strip()
recovered=data[2].text.strip()
totalcases=totalcases.strip(",")
totalcases=int(totalcases)
recovered=recovered.strip(",")
recovered=int(recovered)
percentagerecovered=recovered/totalcases*100
ERROR:
totalcases=int(totalcases)
ValueError: invalid literal for int() with base 10: '11,018,684'
I note solutions like the function below (which I haven't tried yet), but they seem unnecessarily complex for what I'm trying to do. What is the best and easiest/most elegant solution?
This seems along the right lines, but still produces an error:
int(totalcases.replace(',', ''))
int(recovered.replace(',', ''))
ERROR:
File "webscraper.py", line 25, in <module>
percentagerecovered=recovered/totalcases*100
TypeError: unsupported operand type(s) for /: 'str' and 'str'

I wrote this little function that returns a number, so you can add to it or do whatever you want with it:
def str_to_int(text=None):
    if text is None:
        print('no text')
    else:
        text = text.split(',')
        num = int(''.join(text))
        return num
For example, if the number of totalcases is '11,018,642', you do this:
totalcases = str_to_int('11,018,642')
Now you can do totalcases*100 or anything else with it.

Another simple way to do it:
totalcases= int(data[0].text.strip().replace(',',''))
recovered = int(data[2].text.strip().replace(',',''))
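The attempts in the question fail for two separate reasons: strip(",") only removes commas at the start or end of a string, and replace() returns a new string that has to be assigned (or converted) before it is used. A minimal sketch of the whole calculation, using sample strings taken from the error messages above instead of a live request (parse_count is a hypothetical helper name, not part of BeautifulSoup):

```python
def parse_count(text):
    """Convert a counter string such as '11,018,642' to an int."""
    # replace() returns a new string; the original is left unchanged.
    return int(text.strip().replace(",", ""))


# Sample values copied from the error messages in the question.
totalcases = parse_count(" 11,018,642 ")
recovered = parse_count("6,175,537")

# strip(",") leaves interior commas alone, which is why int() still failed.
still_has_commas = "11,018,642".strip(",")

percentagerecovered = recovered / totalcases * 100
print(round(percentagerecovered, 2))
```

With the scraped page, the same helper would be called on data[0].text and data[2].text.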

TypeError: cannot use a string pattern on a bytes-like object: pexpect in Python 3

I am trying to use pexpect module (version 4.8.0) with Python 3.6.8. I get an error
TypeError: cannot use a string pattern on a bytes-like object
Here is my code:
buff = BytesIO()
child = pexpect.spawn(path, args, logfile=buff, timeout=self.PASSPHRASE_TIMEOUT, **kwargs)
while True:
    idx = child.expect([self.PASSPHRASE_RE, pexpect.EOF])
    if idx == 0:
        child.sendline(passphrase)
    elif idx == 1:
        child.wait()
        break
I am getting the error on the line idx = child.expect([self.PASSPHRASE_RE,pexpect.EOF])
Note: I have already tried some of the solutions on StackOverflow, such as:
Passing the encoding='utf-8' parameter to pexpect.spawn.
Replacing pexpect.spawn with pexpect.spawnu.
But no luck; I get the same error again.
Please help, I have already spent 3 days trying to resolve this error.
The issue was the type of PASSPHRASE_RE, which was a compiled pattern (sre.SRE_Pattern) rather than a string.
I converted it to a string using PASSPHRASE_RE.pattern
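The underlying error can be reproduced with the re module alone: a pattern compiled from a str cannot be matched against bytes. The sketch below uses a made-up pattern and made-up output (both are assumptions, not the asker's real values) and shows that .pattern gives back the source string, which can then be re-encoded if a bytes pattern is needed. How pexpect itself treats the string is not shown here.

```python
import re

# Hypothetical pattern and process output, for illustration only.
PASSPHRASE_RE = re.compile(r"Enter passphrase")
output = b"Enter passphrase for key: "

# A str pattern applied to bytes raises the TypeError from the question.
try:
    PASSPHRASE_RE.search(output)
except TypeError as exc:
    error_message = str(exc)

# .pattern is the original source string of the compiled pattern.
source = PASSPHRASE_RE.pattern

# Re-compiling the encoded source gives a pattern that matches bytes.
bytes_pattern = re.compile(source.encode("utf-8"))
match = bytes_pattern.search(output)
print(error_message, match is not None)
```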

Error: in the string to float conversion. float object is not subscriptable

My scores are stored in a JSON file like this:
"score": [0.7503408193588257, 0.43428170680999756]
My code for turning a score into a string looks like this:
score = data[model][id]["score"][0][index]
all = (class_info +" "+ str(score)+" "+x1+" "+y1+" "+x2+" "+y2+" "+"\n")
where x1, y1, x2, y2 and class_info are other variables.
I am getting this error in Python 3.6.9:
File "convert_formet.py", line 28, in main_function
score = data[model][id]["score"][0][index]
TypeError: 'float' object is not subscriptable
Could someone help me out, please?
@CryptoFool told me to do this:
score = data[model][id]["score"][0]
or this:
score = data[model][id]["score"][index]
So, I found my answer.
The chain of indexes has to match the actual structure of the data. If it doesn't, this type of error occurs.
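A minimal sketch of why the original line fails, using the "score" list from the question wrapped in an otherwise made-up JSON object (the surrounding model and id levels are omitted, and index = 1 is an assumed value): "score" is a flat list of floats, so one index reaches a float and a second index on that float raises the error.

```python
import json

# The "score" list is from the question; the wrapper object is assumed.
raw = '{"score": [0.7503408193588257, 0.43428170680999756]}'
entry = json.loads(raw)

index = 1  # hypothetical index value

# One level of indexing is enough: the list holds plain floats.
score = entry["score"][index]

# Indexing twice tries to subscript a float and fails.
try:
    entry["score"][0][index]
except TypeError as exc:
    error_message = str(exc)

print(score, error_message)
```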

ValueError: invalid literal for int() with base 10; Trying to extract an integer from a float

I'm trying to make the program identify a number in a NetCDF file name. I altered the code, but it still gives me the same error and I can't work out why.
The section of the code creating the error is:
Band = int((listofallthefiles[number][listofallthefiles[number].find("M3C" or "M4C" or "M6C")+3:listofallthefiles[number].find("_G16")]))
The path and name of the NETCDF file is:
/Volumes/Anthonys_backup/Hurricane_Dorian/August_28/Channel_13/OR_ABI-L2-CMIPF-M6C13_G16_s20192400000200_e20192400009520_c20192400010004.nc
I'm trying to extract the "13" between "M6C" and "_G16" to save the value, but it's giving me the error message:
ValueError: invalid literal for int() with base 10: 'olumes/Anthonys_backup/Hurricane_Dorian/August_28/Channel_13/OR_ABI-L2-CMIPF-M6C13'
First extract the number from your string, so that int can convert it properly.
It might be easier to use a regex to do so, e.g.:
import re
...
filename = listofallthefiles[number]  # avoid naming a variable str, which shadows the built-in
num = re.findall('.*M6C(.*)_G16', filename)[0]
Now you can convert that to an integer:
val = int(num)
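The reason the original slice starts at 'olumes/...' is that "M3C" or "M4C" or "M6C" evaluates to just "M3C"; find() then returns -1 because that text is absent, and -1 + 3 gives a start index of 2. The sketch below reproduces that with the path from the question and then extracts the band with a regex. The character class [346] is an assumption based on the three prefixes mentioned in the question.

```python
import re

path = ("/Volumes/Anthonys_backup/Hurricane_Dorian/August_28/Channel_13/"
        "OR_ABI-L2-CMIPF-M6C13_G16_s20192400000200_e20192400009520_c20192400010004.nc")

# `or` between non-empty strings returns the first one, so only "M3C" is searched.
needle = "M3C" or "M4C" or "M6C"
start = path.find(needle) + 3          # find() returns -1, so start is 2
broken_slice = path[start:path.find("_G16")]

# Match any of the three mode prefixes and capture the digits before "_G16".
match = re.search(r"M[346]C(\d+)_G16", path)
band = int(match.group(1)) if match else None
print(band)
```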

Error appears when attempting to create a map with folium in Python

My assignment is to create an html file of a map. The data has already been given to us. However, when I try to execute my code, I get two errors:
"TypeError: 'str' object cannot be interpreted as an integer"
and
"KeyError: 'Latitude'"
this is the code that I've written:
import folium
import pandas as pd
cuny = pd.read_csv('datafile.csv')
print (cuny)
mapCUNY = folium.Map(location=[40.768731, -73.964915])
for index,row in cuny.iterrows():
    lat = row["Latitude"]
    lon = row["Longitude"]
    name = row["TIME"]
    newMarker = folium.Marker([lat,lon], popup=name)
    newMarker.add_to(mapCUNY)
out = input('name: ')
mapCUNY.save(outfile = 'out.html')
When I run it, I get all the data in the python shell and then those two errors mentioned above pop up.
Something must have gone wrong, and I'll admit I'm not at all good with this stuff. Could anyone let me know if they spot error(s) or know what I've done wrong?
Generally, "TypeError: 'str' object cannot be interpreted as an integer" can happen when you try to use a string as an integer.
For example:
num_string = "2"
num = num_string+1 # This fails with a TypeError, because num_string is a string
num = int(num_string) + 1 # This does not fail, because num_string is cast to int first
A key error means that the key you are requesting does not exist. Perhaps there is no "Latitude" key, or it is misspelled or capitalized differently.
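One way to check for a mismatched key is to look at the header names before indexing. The sketch below uses the standard csv module instead of pandas so that it runs without extra packages, and its file contents are invented for illustration (the real datafile.csv is not shown in the question). With pandas, the equivalent check would be to inspect cuny.columns.

```python
import csv
import io

# Hypothetical file contents; the real header names are unknown.
sample = io.StringIO("TIME,latitude,longitude\n09:00,40.768731,-73.964915\n")
reader = csv.DictReader(sample)
rows = list(reader)
columns = reader.fieldnames

# The exact key from the question is not present in this sample header.
has_latitude = "Latitude" in columns

# Map normalised names (trimmed, lower-case) back to the real header names.
lookup = {name.strip().lower(): name for name in columns}
lat = float(rows[0][lookup["latitude"]])
lon = float(rows[0][lookup["longitude"]])
print(columns, has_latitude, lat, lon)
```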

Python Operation % not working?

I can't seem to get this working. Can someone point me in the right direction? If I put the values in without prompting, it works, but when I do this I get an error.
username1 = raw_input('Enter Username:\n')
password = raw_input('Enter Password:\n')
r = requests.get("https://linktoasp.net/",auth=HttpNtlmAuth("domain\\%s",password),cookies=jar) % (username1)
Error:
Traceback (most recent call last):
  File "attend_punch.py", line 32, in <module>
    r = requests.get("https://linktoasp.netserver/homeportal/default.aspx",auth=HttpNtlmAuth("domain\\%r",password),cookies=jar) % (username1)
TypeError: unsupported operand type(s) for %: 'Response' and 'str'
You could try this instead
r = requests.get("https://linktoasp.net/", auth=HttpNtlmAuth("domain\\%s" % username1, password), cookies=jar)
What you probably want is:
r = requests.get(
    "https://linktoasp.net/",
    auth=HttpNtlmAuth("domain\\%s" % username1, password),
    cookies=jar)
In order to do string interpolation with %, the % and the value need to immediately follow the string:
"domain\\%s" % username1
rather than just coming later in the line:
HttpNtlmAuth("domain\\%s", ...) % username1
The % symbol can have 2 meanings in Python:
The modulo operator, which gives the remainder of dividing one number by another. It is usually used with two numbers.
The string formatting operator, which comes after a string and replaces its placeholders with actual values. That's what you want, but you are not placing it right after the string, so Python applies % to the Response object returned by requests.get. Response does not support %, so Python raises that exception.
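Both meanings, and the failure from the question, can be seen in a short sketch that needs no network access. FakeResponse is a hypothetical stand-in for the object returned by requests.get, and "alice" is an assumed username:

```python
username1 = "alice"  # hypothetical input value

# String formatting: % directly follows the string that holds the placeholder.
account = "domain\\%s" % username1

# Modulo: % between two numbers gives the remainder.
remainder = 17 % 5


class FakeResponse:
    """Stand-in for a response object, which defines no % operator."""


# Applying % to an object that does not support it raises TypeError.
try:
    FakeResponse() % username1
except TypeError as exc:
    error_message = str(exc)

print(account, remainder, error_message)
```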