When I execute this code...
from bs4 import BeautifulSoup
with open("games.html", "r") as page:
doc = BeautifulSoup(page, "html.parser")
titles = doc.select("a.title")
prices = doc.select("span.price-inner")
for game_soup in doc.find_all("div", {"class": "game-options-wrapper"}):
game_ids = (game_soup.button.get("data-game-id"))
for title, price_official, price_lowest in zip(titles, prices[::2], prices[1::2]):
print(title.text + ',' + str(price_official.text.replace('$', '').replace('~', '')) + ',' + str(
price_lowest.text.replace('$', '').replace('~', '')))
The output is...
153356
80011
130187
119003
73502
156474
96592
154207
155123
152790
165013
110837
Call of Duty: Modern Warfare II (2022),69.99,77.05
Red Dead Redemption 2,14.85,13.79
God of War,28.12,22.03
ELDEN RING,50.36,48.10
Cyberpunk 2077,29.99,28.63
EA SPORTS FIFA 23,41.99,39.04
Warhammer 40,000: Darktide,39.99,45.86
Marvels Spider-Man Remastered,30.71,27.07
Persona 5 Royal,37.79,43.32
The Callisto Protocol,59.99,69.41
Need for Speed Unbound,69.99,42.29
Days Gone,15.00,9.01
But I'm trying to get the value next to the other ones on the same line
Expected output:
Call of Duty: Modern Warfare II (2022),69.99,77.05,153356
Red Dead Redemption 2,14.85,13.79,80011
...
Even when adding game_ids to print(), it spams the same game id for each line.
How can I go about resolving this issue?
HTML file: https://jsfiddle.net/m3hqy54x/
I feel like all 3 details (title, price_official, price_lowest) are probably all in a shared container. It would be better to loop through these containers and select the details as sets from each container to make sure the wight prices and titles are being paired up, but I can't tell you how to do that without seeing at least a snippet from (or all of) "games.html"....
Anyway, assuming that '110837\nCall of Duty: Modern Warfare II (2022)' is from the first title here, you can rewrite your last loop as something like:
for z in zip(titles, prices[::2], prices[1::2]):
z, lw = list(z), ''
for i in len(z):
if i == 0: # title
z[0] = ' '.join(w for w in z[0].text.split('\n', 1)[-1] if w)
if '\n' in z[0].text: lw = z[0].text.split('\n', 1)[0]
continue
z[i] = z[i].text.replace('$', '').replace('~', '')
print(','.join(z+[lw]))
Added EDIT: After seeing the html, this is my suggested solution:
for g in doc.select('div[data-container-game-id]'):
gameId = g.get('data-container-game-id')
title = g.select_one('a.title')
if title: title = ' '.join(w for w in title.text.split() if w)
price_official = g.select_one('.price-wrap > div:first-child span.price')
price_lowest = g.select_one('.price-wrap > div:first-child+div span.price')
if price_official:
price_official = price_official.text.replace('$', '').replace('~', '')
if price_lowest:
price_lowest = price_lowest.text.replace('$', '').replace('~', '')
print(', '.join([title, price_official, price_lowest, gameId]))
prints
Call of Duty: Modern Warfare II (2022), 69.99, 77.05, 153356
Red Dead Redemption 2, 14.85, 13.79, 80011
God of War, 28.12, 22.03, 130187
ELDEN RING, 50.36, 48.10, 119003
Cyberpunk 2077, 29.99, 28.63, 73502
EA SPORTS FIFA 23, 41.99, 39.04, 156474
Warhammer 40,000: Darktide, 39.99, 45.86, 96592
Marvel's Spider-Man Remastered, 30.71, 27.07, 154207
Persona 5 Royal, 37.79, 43.32, 155123
The Callisto Protocol, 59.99, 69.41, 152790
Need for Speed Unbound, 69.99, 42.29, 165013
Days Gone, 15.00, 9.01, 110837
Btw, this might look ok for just four values, but if you have a large amount of details that you want to extract, you might want to consider using a function like this.
The question I am trying to answer is
Query if a book title is available and present option of (a) increasing stock level or (b) decreasing the stock level, due to a sale. If the stock level is decreased to zero indicate to the user that the book is currently out of stock.
This is the text file
#Listing showing sample book details
#AUTHOR, TITLE, FORMAT, PUBLISHER, COST?, STOCK, GENRE
P.G. Wodehouse, Right Ho Jeeves, hb, Penguin, 10.99, 5, fiction
A. Pais, Subtle is the Lord, pb, OUP, 12.99, 2, biography
A. Calaprice, The Quotable Einstein, pb, PUP, 7.99, 6, science
M. Faraday, The Chemical History of a Candle, pb, Cherokee, 5.99, 1, science
C. Smith, Energy and Empire, hb, CUP, 60, 1, science
J. Herschel, Popular Lectures, hb, CUP, 25, 1, science
C.S. Lewis, The Screwtape Letters, pb, Fount, 6.99, 16, religion
J.R.R. Tolkein, The Hobbit, pb, Harper Collins, 7.99, 12, fiction
C.S. Lewis, The Four Loves, pb, Fount, 6.99, 7, religion
E. Heisenberg, Inner Exile, hb, Birkhauser, 24.95, 1, biography
G.G. Stokes, Natural Theology, hb, Black, 30, 1, religion
And this is the code i have so far
def Task5():
again = 'y'
while again == 'y':
desc = input('Enter the title of the book you would like to search for: ')
for bookrecord in book_list:
if desc in book_list:
print('Book found')
else:
print('Book not found')
break
again = input('\nWould you like to search again(press y for yes)').lower()
i already have a function which reads from the text file:
book_list = []
def readbook():
infile = open('book_data_file.txt')
for row in infile:
start = 0 # used to start at the beginning of each line
string_builder = []
if not(row.startswith('#')):
for index in range(len(row)):
if row[index] ==',' or index ==len(row)-1:
string_builder.append(row[start:index])
start = index+1
book_list.append(string_builder)
infile.close()
Any one have an idea on how i complete this task? :)
Get the titles from the book_list variable.
titles = [data[1].strip() for data in book_list]
Remove any white-space from the desc variable.
desc = desc.strip()
For instance If I'm searching for Popular Lectures book, but If I type Popular Lectures then I couldn't find it in the book_list Therefore you should remove the white characters from the input.
If the book is avail, then get the book name and stock value from the book_list
info = [(title, int(book_list[idx][5].strip())) for idx, title in enumerate(titles) if desc in title][0]
bk_nm, stock = info
Print the current situation
if stock == 0:
print("{} is currently not avail".format(bk_nm))
else:
print("{} is avail w/ stock {}".format(bk_nm, stock))
Example:
Enter the title of the book you would like to search for: Popular Lectures
Popular Lectures is avail w/ stock 1
Would you like to search again(press y for yes)y
Enter the title of the book you would like to search for: Chemical
The Chemical History of a Candle is avail w/ stock 1
Would you like to search again(press y for yes)n
Code:
book_list = []
def task5():
titles = [data[1].strip() for data in book_list]
again = 'y'
while again == 'y':
desc = input('Enter the title of the book you would like to search for: ')
desc = desc.strip() # Remove any white space
stock = [int(book_list[idx][5].strip()) for idx, title in enumerate(titles) if desc in title][0]
if stock == 0:
print("{} is currently not avail".format(desc))
else:
print("{} is avail w/ stock {}".format(desc, stock))
again = input('\nWould you like to search again(press y for yes)').lower()
def read_txt():
infile = open('book_data_file.txt')
for row in infile:
start = 0 # used to start at the beginning of each line
string_builder = []
if not (row.startswith('#')):
for index in range(len(row)):
if row[index] == ',' or index == len(row) - 1:
string_builder.append(row[start:index])
start = index + 1
book_list.append(string_builder)
infile.close()
if __name__ == '__main__':
read_txt()
task5()
I have sought different articles here about searching data from a list, but nothing seems to be working right or is appropriate in what I am supposed to implement.
I have this pre-created module with over 500 list (they are strings, yes, but is considered as list when called into function; see code below) of names, city, email, etc. The following are just a chunk of it.
empRecords="""Jovita,Oles,8 S Haven St,Daytona Beach,Volusia,FL,6/14/1965,32114,386-248-4118,386-208-6976,joles#gmail.com,http://www.paganophilipgesq.com,;
Alesia,Hixenbaugh,9 Front St,Washington,District of Columbia,DC,3/3/2000,20001,202-646-7516,202-276-6826,alesia_hixenbaugh#hixenbaugh.org,http://www.kwikprint.com,;
Lai,Harabedian,1933 Packer Ave #2,Novato,Marin,CA,1/5/2000,94945,415-423-3294,415-926-6089,lai#gmail.com,http://www.buergimaddenscale.com,;
Brittni,Gillaspie,67 Rv Cent,Boise,Ada,ID,11/28/1974,83709,208-709-1235,208-206-9848,bgillaspie#gillaspie.com,http://www.innerlabel.com,;
Raylene,Kampa,2 Sw Nyberg Rd,Elkhart,Elkhart,IN,12/19/2001,46514,574-499-1454,574-330-1884,rkampa#kampa.org,http://www.hermarinc.com,;
Flo,Bookamer,89992 E 15th St,Alliance,Box Butte,NE,12/19/1957,69301,308-726-2182,308-250-6987,flo.bookamer#cox.net,http://www.simontonhoweschneiderpc.com,;
Jani,Biddy,61556 W 20th Ave,Seattle,King,WA,8/7/1966,98104,206-711-6498,206-395-6284,jbiddy#yahoo.com,http://www.warehouseofficepaperprod.com,;
Chauncey,Motley,63 E Aurora Dr,Orlando,Orange,FL,3/1/2000,32804,407-413-4842,407-557-8857,chauncey_motley#aol.com,http://www.affiliatedwithtravelodge.com
"""
a = empRecords.strip().split(";")
And I have the following code for searching:
import empData as x
def seecity():
empCitylist = list()
for ct in x.a:
empCt = ct.strip().split(",")
empCitylist.append(empCt)
t = sorted(empCitylist, key=lambda x: x[3])
for c in t:
city = (c[3])
print(city)
live_city = input("Enter city: ")
for cy in city:
if live_city in cy:
print(c[1])
# print("Name: "+ c[1] + ",", c[0], "| Current City: " + c[3])
Forgive my idiotic approach as I am new to Python. However, what I am trying to do is user will input the city, then the results should display the employee's last name, first name who are living in that city (I dunno if I made sense lol)
By the way, the code I used above doesn't return any answers. It just loops to the input.
Thank you for helping. Lovelots. <3
PS: the format of the empData is: first name, last name, address, city, country, birthday, zip, phone, and email
You can use the csv module to read easily a file with comma separated values
import csv
with open('test.csv', newline='') as csvfile:
records = list(csv.reader(csvfile))
def search(data, elem, index):
out = list()
for row in data:
if row[index] == elem:
out.append(row)
return out
#test
print(search(records, 'Orlando', 3))
Based on your original code, you can do it like this:
# Make list of list records, sorted by city
t = sorted((ct.strip().split(",") for ct in x.a), key=lambda x: x[3])
# List cities
print("Cities in DB:")
for c in t:
city = (c[3])
print("-", city)
# Define search function
def seecity():
live_city = input("Enter city: ")
for c in t:
if live_city == c[3]:
print("Name: "+ c[1] + ",", c[0], "| Current City: " + c[3])
seecity()
Then, after you understand what's going on, do as #Hoxha Alban suggested, and use the csv module.
The beauty of python lies in list comprehension.
empRecords="""Jovita,Oles,8 S Haven St,Daytona Beach,Volusia,FL,6/14/1965,32114,386-248-4118,386-208-6976,joles#gmail.com,http://www.paganophilipgesq.com,;
Alesia,Hixenbaugh,9 Front St,Washington,District of Columbia,DC,3/3/2000,20001,202-646-7516,202-276-6826,alesia_hixenbaugh#hixenbaugh.org,http://www.kwikprint.com,;
Lai,Harabedian,1933 Packer Ave #2,Novato,Marin,CA,1/5/2000,94945,415-423-3294,415-926-6089,lai#gmail.com,http://www.buergimaddenscale.com,;
Brittni,Gillaspie,67 Rv Cent,Boise,Ada,ID,11/28/1974,83709,208-709-1235,208-206-9848,bgillaspie#gillaspie.com,http://www.innerlabel.com,;
Raylene,Kampa,2 Sw Nyberg Rd,Elkhart,Elkhart,IN,12/19/2001,46514,574-499-1454,574-330-1884,rkampa#kampa.org,http://www.hermarinc.com,;
Flo,Bookamer,89992 E 15th St,Alliance,Box Butte,NE,12/19/1957,69301,308-726-2182,308-250-6987,flo.bookamer#cox.net,http://www.simontonhoweschneiderpc.com,;
Jani,Biddy,61556 W 20th Ave,Seattle,King,WA,8/7/1966,98104,206-711-6498,206-395-6284,jbiddy#yahoo.com,http://www.warehouseofficepaperprod.com,;
Chauncey,Motley,63 E Aurora Dr,Orlando,Orange,FL,3/1/2000,32804,407-413-4842,407-557-8857,chauncey_motley#aol.com,http://www.affiliatedwithtravelodge.com
"""
rows = empRecords.strip().split(";")
data = [ r.strip().split(",") for r in rows ]
then you can use any condition to filter the list, like
print ( [ "Name: " + emp[1] + "," + emp[0] + "| Current City: " + emp[3] for emp in data if emp[3] == "Washington" ] )
['Name: Hixenbaugh,Alesia| Current City: Washington']
I have the following type of document, where each person might have a couple of names and an associated description of features:
New person
name: ana
name: anna
name: ann
feature: A 65-year old woman that has no known health issues but has a medical history of Schizophrenia.
New person
name: tom
name: thomas
name: thimoty
name: tommy
feature: A 32-year old male that is known to be deaf.
New person
.....
What I would like is to read this file in a python dictionary, where each new person is id-ed.
i.e. Person with ID 1 will have the names ['ann','anna','ana']
and will have the feature ['A 65-year old woman that has no known health issues but has a medical history of Schizophrenia.' ]
Any suggestions?
Assuming that your input file is lo.txt. It can be added to dictionary this way:
file = open('lo.txt')
final_data = []
feature = []
names = []
for line in file.readlines():
if ("feature") in line:
data = line.replace("\n","").split(":")
feature=data[1]
final_data.append({
'names': names,
'feature': feature
})
names = []
feature = []
if ("name") in line:
data = line.replace("\n","").split(":")
names.append(data[1])
print final_data
Something like this might work
result = {}
f = open("document.txt")
contents = f.read()
info = contents.split('==== new person ===')
for i in range(len(info)):
info[i].split('\n')
names = []
features = []
for j in range(len(info[i])):
info[i][j].split(':')
if info[i][j][0] == 'name':
names.append(info[i][j][1])
else:
features.append(info[i][j][1])
result[i] = {'names': names,'features': features}
print(result)
This should give you something like:
{0: {'names': ['ana', 'anna', 'ann'], features:['...', '...']}}
e.t.c
Here is code that may work for you:
f = open("documents.txt").readlines()
f = [i.strip('\n') for i in f]
final_condition = f[len(f)-1]
f.remove(final_condition)
names = [i.split(":")[1] for i in f]
the_dict = {}
the_dict["names"] = names
the_dict["features"] = final_condition
print the_dict
All it does is split the names at ":" and take the last element of the resulting list (the names) and keep it for the list names.