Unable to change a 'str' into an 'int', can only use float - Python

I have the code below. This snippet:
pe = df[3][0]
pe = int(pe)
print(pe)
does not work and raises the error that is caught in the except block of the code below:
import requests
import pandas as pd
from bs4 import BeautifulSoup

# headers is defined earlier in the original script.
ticker = input("Please choose a ticker symbol: ")
print("Loading data for " + ticker.upper())
try:
    # Pulling the data for the chosen ticker.
    url = 'https://finviz.com/quote.ashx?t=' + ticker.upper()
    req = requests.get(url, headers=headers)
    table = pd.read_html(req.text, attrs={"class": "snapshot-table2"})
    df = table[0]
    # Pulls company name.
    soup = BeautifulSoup(req.text, 'html.parser')
    for title in soup.find_all('title'):
        print(title.get_text())
    pe = df[3][0]
    pe = int(pe)
    print(pe)
    # Ratios and metrics to be checked.
    data = {
        df[10][10]: ["$" + df[11][10]],  # Stock Price
        df[2][0]: [df[3][0]],  # P/E
        df[2][1]: [df[3][1]],  # Forward P/E
        df[2][2]: [df[3][2]],  # PEG
        df[2][3]: [df[3][3]],  # P/S
        df[2][6]: [df[3][6]],  # P/FCF
        df[0][7]: [df[1][7]],  # Dividend %
        df[6][5]: [df[7][5]]  # ROE
    }
    # Framing the table and printing the results.
    df = pd.DataFrame(data, index=["Stats:"])
    print(df)
except ValueError:
    print("Ticker doesn't exist. Please check your selection.")
This returns:
Loading data for KO
KO The Coca-Cola Company Stock Quote
Ticker doesn't exist. Please check your selection.
When I use this:
pe = int(float(pe))
It works, but it truncates the value to 30 for Coca-Cola, for example. I'd like to get the exact number, but turning it into an int isn't as straightforward as it seems. I did use type() to make sure the original piece of data, df[3][0], is a str.
Any help is appreciated, thank you.
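The root cause can be reproduced in isolation: int() only accepts strings that look like whole numbers, so a decimal string such as "30.15" raises the ValueError that drops you into the except block, and int(float(...)) "works" only because int() then truncates the parsed float. A minimal sketch (the literal "30.15" is just a stand-in for whatever df[3][0] holds):

pe = "30.15"                 # a decimal string, like the scraped P/E value
# int(pe)                    # ValueError: invalid literal for int() with base 10: '30.15'
pe = float(pe)               # parses the exact value: 30.15
print(pe)
print(int(float("30.15")))   # 30 -- int() truncates, which is the "rounding" you saw

If you want the exact number, keep it as a float; an int cannot represent 30.15 by definition.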

Related

Why does Twint's "since" & "until" not work?

I'm trying to get all tweets from 2018-01-01 until now from various firms.
My code runs; however, I do not get the tweets from the requested time range. Sometimes I only get the tweets from today and yesterday, or from mid-April up to now, but not since the beginning of 2018. I then get the message: [!] No more data! Scraping will stop now.
import csv
import twint
import pandas as pd

ticker = []
# Read the company tickers from a csv file into a list.
with open('C:\\Users\\veron\\Desktop\\Test.csv', newline='') as inputfile:
    for row in csv.reader(inputfile):
        ticker.append(row[0])
# Getting tweets for every ticker in the list.
for i in ticker:
    searchstring = f"{i} since:2018-01-01"
    c = twint.Config()
    c.Search = searchstring
    c.Lang = "en"
    c.Pandas = True  # the Config attribute is "Pandas", not "Panda"
    c.Custom["tweet"] = ["date", "username", "tweet"]
    c.Store_csv = True
    c.Output = f"{i}.csv"
    twint.run.Search(c)
    df = pd.read_csv(f"{i}.csv")
    df['company'] = i
    df.to_csv(f"{i}.csv", index=False)
Has anyone had the same issue, and does anyone have a tip?
You need to add the configuration parameter Since separately. For example:
c.Since = "2018-01-01"
Similarly for Until:
c.Until = "2017-12-27"
The official documentation might be helpful.
Since (string) - Filter Tweets sent since date, works only with twint.run.Search (Example: 2017-12-27).
Until (string) - Filter Tweets sent until date, works only with twint.run.Search (Example: 2017-12-27).
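Putting that into the loop from the question, a minimal sketch of the changed configuration (the lower bound is the date from the question; the Until value here is a hypothetical upper bound, and the CSV handling stays exactly as in the original loop):

for i in ticker:
    c = twint.Config()
    c.Search = i              # keyword only; no "since:" operator inside the query string
    c.Since = "2018-01-01"    # date bounds go on the Config object instead
    c.Until = "2021-01-01"    # hypothetical upper bound; omit it to search up to now
    c.Lang = "en"
    c.Custom["tweet"] = ["date", "username", "tweet"]
    c.Store_csv = True
    c.Output = f"{i}.csv"
    twint.run.Search(c)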

How to get $0 value in Webbot using Python

Please find my code below. I am trying to get the Seller Proceeds value from a website. When I inspect the element and run $0.value in the browser console I get 598.08, but I get "Calculate" when I read it like this:
sel_proc = web.find_elements(id="afn-seller-proceeds")[0].text
Full code:
import pandas as pd
from webbot import Browser
from bs4 import BeautifulSoup

web = Browser()
##web.set_window_position(-10000,0)
df = pd.read_excel('sample.xlsx')
soafees = []
fulfees = []
selproc = []
for ind in df.index:
    web.go_to('https://somelink')
    ##web.set_window_position(-10000,0)
    web.click(id='link_continue')
    print("Login Successful")
    asin = df['ASIN'][ind]
    sp = int(df['Selling Price'][ind])
    print(sp)
    cp = int(df['Cost of Product'][ind])
    print(cp)
    web.type(df['ASIN'][ind], into='Enter your product name, UPC, EAN, ISBN or ASIN', clear=True)
    web.click(id='a-autoid-0')
    web.type(sp, tag='input', id='afn-pricing', clear=True)
    web.type(cp, tag='input', id='afn-cost-of-goods', clear=True)
    web.click(id='update-fees-link')
    res = web.find_elements(id="afn-selling-fees")[0].text
    ful_fees = web.find_elements(id="afn-amazon-fulfillment-fees")[0].text
    sel_proc = web.find_elements(id="afn-seller-proceeds")[0].text
    ##sel_proc = web.execute_script('return arguments[0].value;', element)
    print("soa fees : " + res)
    print("Fulfillment fees : " + ful_fees)
    print("Seller Proceeds : " + sel_proc)
    soafees.append(res)
    fulfees.append(ful_fees)
    selproc.append(sel_proc)
print(soafees)
print(fulfees)
print(selproc)
df_soa = pd.DataFrame(soafees, columns=['SOA Fees'])
df_ful = pd.DataFrame(fulfees, columns=['FBA Fees'])
df_sel = pd.DataFrame(selproc, columns=['Seller Proceeds'])
print(df)
print(df_soa)
print(df_ful)
print(df_sel)
Snapshot for reference:
Thanks in advance for your support.
In the sel_proc variable you are storing the element's text. Instead, you should read the attribute that holds the value; I believe, in this case, it should be the "value" attribute.
sel_proc = web.find_elements(id="afn-seller-proceeds")[0].get_attribute(<attribute_name>)
Your code will look something like this:
sel_proc = web.find_elements(id="afn-seller-proceeds")[0].get_attribute("value")
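If get_attribute("value") still comes back empty because the field is only populated by JavaScript after the fees update, reading the live DOM property through a script is a possible fallback. This is a sketch along the lines of the commented-out line in the question, and it assumes webbot exposes Selenium's execute_script:

element = web.find_elements(id="afn-seller-proceeds")[0]
# Ask the browser for the element's current value property directly.
sel_proc = web.execute_script('return arguments[0].value;', element)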

How to encode latitude and longitude using urllib.parse.urlencode?

I'm using Google API to obtain the json data of nearby coffee outlets. To do this, I need to encode the latitude and longitude into the URL.
The required URL: https://maps.googleapis.com/maps/api/place/textsearch/json?query=coffee&location=22.303940,114.170372&radius=1000&maxprice=3&key=myAPIKey
The URL I'm obtaining using urlencode: https://maps.googleapis.com/maps/api/place/textsearch/json?query=coffee&location=22.303940%2C114.170372&radius=1000&maxprice=3&key=myAPIKEY
How can I remove the "%2C" in the URL? (I have shown my code below)
import urllib.parse

serviceurl_placesearch = 'https://maps.googleapis.com/maps/api/place/textsearch/json?'
parameters = dict()
query = input('What are you searching for? ')
parameters['query'] = query
parameters['location'] = "22.303940,114.170372"
while True:
    radius = input('Enter radius of search in meters: ')
    try:
        radius = int(radius)
        parameters['radius'] = radius
        break
    except ValueError:
        print('Please enter a number for radius')
while True:
    maxprice = input('Enter the maximum price level you are looking for (0 to 4): ')
    try:
        maxprice = int(maxprice)
        parameters['maxprice'] = maxprice
        break
    except ValueError:
        print('Valid inputs are 0, 1, 2, 3, 4')
parameters['key'] = API_key  # API_key is defined elsewhere in the original script
url = serviceurl_placesearch + urllib.parse.urlencode(parameters)
I added this piece of code to make the URL work, but I don't consider it a long-term fix; I'm looking for a more long-term solution.
urlparts = url.split('%2C')
url = ','.join(urlparts)
You can pass safe=',' so the comma is not percent-encoded:
import urllib.parse
parameters = {'location': "22.303940,114.170372"}
urllib.parse.urlencode(parameters, safe=',')
Result
location=22.303940,114.170372
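Applied to the code in the question, only the final line needs to change:

url = serviceurl_placesearch + urllib.parse.urlencode(parameters, safe=',')

Under the hood, urlencode forwards safe to its quote_via function (quote_plus by default), and any character listed in safe is left unescaped, which is why the comma in the location value survives intact.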

Parsing a JSON using specific keywords in Python

I'm trying to parse a JSON feed of a site's stock.
The JSON: https://www.ssense.com/en-us/men/sneakers.json
So I want to take some keywords from the user. Then I want to parse the JSON using these keywords to find the name of the item and (in this specific case) return the ID, SKU and the URL.
So for example:
If I inputted "Black Fennec", I want to parse the JSON and find the ID, SKU, and URL of the Black Fennec Sneakers (which have an ID of 3297299, a SKU of 191422M237006, and a URL of /men/product/ps-paul-smith/black-fennec-sneakers/3297299).
I have never attempted anything like this before. Based on some guides that show how to parse JSON, I started out with this:
import requests
import json

r = requests.Session()
stock = r.get("https://www.ssense.com/en-us/men/sneakers.json", headers=headers)  # headers defined elsewhere
obj_json_data = json.loads(stock.text)
However, I am now confused. How do I find the product based on the keywords, and how do I get the ID, SKU, and URL for it?
There are a number of ways to handle the output; I'm not sure what you want to do with it, but this should get you going.
EDIT 1:
import requests

r = requests.Session()
obj_json_data = r.get("https://www.ssense.com/en-us/men/sneakers.json").json()
products = obj_json_data['products']
keyword = input('Enter a keyword: ')
for product in products:
    if keyword.upper() in product['name'].upper():
        name = product['name']
        id_var = product['id']
        sku = product['sku']
        url = product['url']
        print('Product: %s\nID: %s\nSKU: %s\nURL: %s' % (name, id_var, sku, url))
        # If you only want to return the first match, uncomment the next line.
        #break
I also have it set up to store the results in a dataframe and/or a list, just to give some options of where to go with it.
import requests
import pandas as pd

r = requests.Session()
obj_json_data = r.get("https://www.ssense.com/en-us/men/sneakers.json").json()
products = obj_json_data['products']
keyword = input('Enter a keyword: ')
products_found = []
results = pd.DataFrame()
for product in products:
    if keyword.upper() in product['name'].upper():
        name = product['name']
        id_var = product['id']
        sku = product['sku']
        url = product['url']
        temp_df = pd.DataFrame([[name, id_var, sku, url]], columns=['name', 'id', 'sku', 'url'])
        results = results.append(temp_df)
        # list.append mutates in place and returns None, so don't reassign it.
        products_found.append(name)
        print('Product: %s\nID: %s\nSKU: %s\nURL: %s' % (name, id_var, sku, url))
if products_found == []:
    print('Nothing found')
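One caveat, assuming you are on a recent pandas: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on current versions the row-collecting part of the loop above would need the equivalent pattern of building a list and creating the frame once:

rows = []
for product in products:
    if keyword.upper() in product['name'].upper():
        rows.append([product['name'], product['id'], product['sku'], product['url']])
results = pd.DataFrame(rows, columns=['name', 'id', 'sku', 'url'])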
EDIT 2: Here is another way to do it: convert the json to a dataframe, then filter to the rows that have the keyword in the name (this is actually a better solution, in my opinion).
import requests
import pandas as pd
from pandas.io.json import json_normalize

r = requests.Session()
obj_json_data = r.get("https://www.ssense.com/en-us/men/sneakers.json").json()
products = obj_json_data['products']
products_df = json_normalize(products)
keyword = input('Enter a keyword: ')
results = products_df[products_df['name'].str.contains(keyword, case=False)]
#print (results[['name', 'id', 'sku', 'url']])
products_found = list(results['name'])
if products_found == []:
    print('Nothing found')
else:
    print('Found: ' + str(products_found))

KeyError and TypeError in my Python web scraper

Sorry about the vague and confusing title, but there is really no better way for me to summarize my problem in one sentence.
I was trying to get the student and grade information from a French website: http://www.bankexam.fr/resultat/2014/BACCALAUREAT/AMIENS?filiere=BACS
My code is as follows:
import time
import urllib2
from bs4 import BeautifulSoup

regions = {'R\xc3\xa9sultats Bac Amiens 2014': '/resultat/2014/BACCALAUREAT/AMIENS'}
base_url = 'http://www.bankexam.fr'
tests = {'es': '?filiere=BACES', 's': '?filiere=BACS', 'l': '?filiere=BACL'}
for i in regions:
    for x in tests:
        # create the output file
        output_file = open('/Users/student project/' + i + '_' + x + '.txt', 'a')
        time.sleep(2)  # compassionate scraping
        section_url = base_url + regions[i] + tests[x]  # now goes to the x test page of region i
        request = urllib2.Request(section_url)
        response = urllib2.urlopen(request)
        soup = BeautifulSoup(response, 'html.parser')
        content = soup.find('div', id='zone_res')
        for row in content.find_all('tr'):
            if row.td:
                student = row.find_all('td')
                name = student[0].strong.string.encode('utf8').strip()
                try:
                    school = student[1].strong.string.encode('utf8')
                except AttributeError:
                    school = 'NA'
                result = student[2].span.string.encode('utf8')
                output_file.write('%s|%s|%s\n' % (name, school, result))
        # Find the maximum pages to go through
        if soup.find('div', 'pagination'):
            import re
            page_info = soup.find('div', 'pagination')
            pages = []
            for i in page_info.find_all('a', re.compile('elt')):
                try:
                    pages.append(int(i.string.encode('utf8')))
                except ValueError:
                    continue
            max_page = max(pages)
            # Now goes through page 2 to max page
            for i in range(1, max_page):
                page_url = '&p=' + str(i) + '#anchor'
                section2_url = section_url + page_url
                request = urllib2.Request(section2_url)
                response = urllib2.urlopen(request)
                soup = BeautifulSoup(response, 'html.parser')
                content = soup.find('div', id='zone_res')
                for row in content.find_all('tr'):
                    if row.td:
                        student = row.find_all('td')
                        name = student[0].strong.string.encode('utf8').strip()
                        try:
                            school = student[1].strong.string.encode('utf8')
                        except AttributeError:
                            school = 'NA'
                        result = student[2].span.string.encode('utf8')
                        output_file.write('%s|%s|%s\n' % (name, school, result))
A little more description about the code:
I created the 'regions' and 'tests' dictionaries because there are 30 other regions I need to collect; I include just one here as a showcase. I'm only interested in the results of three tests (ES, S, L), hence the 'tests' dictionary.
Two errors keep showing up. One is
KeyError: 2
which is linked to line 12,
section_url = base_url + regions[i] + tests[x]
The other is
TypeError: cannot concatenate 'str' and 'int' objects
which is linked to line 10.
I know there is a lot of information here and I'm probably not listing the most important info for you to help me. But let me know what I can do to fix this!
Thanks
The issue is that you're using the variable i in more than one place.
Near the top of the file, you do:
for i in regions:
So, in some places i is expected to be a key into the regions dictionary.
The trouble comes when you use it again later. You do so in two places:
for i in page_info.find_all('a',re.compile('elt')):
And:
for i in range(1,max_page):
The second of these is what is causing your exceptions, as the integer values that get assigned to i don't appear in the regions dict (nor can an integer be added to a string).
I suggest renaming some or all of those variables. Give them meaningful names, if possible (i is perhaps acceptable for an "index" variable, but I'd avoid using it for anything else unless you're code golfing).
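A minimal sketch of that renaming, with names of my own choosing; only the loop headers and the lines that use them change, and the elided scraping code stays exactly as in the question:

for region_name in regions:
    for test_name in tests:
        output_file = open('/Users/student project/' + region_name + '_' + test_name + '.txt', 'a')
        section_url = base_url + regions[region_name] + tests[test_name]
        # ... first-page scraping unchanged ...
        if soup.find('div', 'pagination'):
            pages = []
            for page_link in page_info.find_all('a', re.compile('elt')):
                try:
                    pages.append(int(page_link.string.encode('utf8')))
                except ValueError:
                    continue
            max_page = max(pages)
            for page_num in range(1, max_page):
                page_url = '&p=' + str(page_num) + '#anchor'
                # ... per-page scraping unchanged ...

With distinct names, the integer page numbers are never used as keys into regions (the source of KeyError: 2) and never concatenated into the output-file path (the source of the TypeError).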
