Unable to append unwatched EPs to list - python

The idea of the code is to add unwatched episodes to an existing playlist in index order (ep 1 of Show X, ep 1 of Show Z, and so on), regardless of air date:
from plexapi.server import PlexServer
baseurl = 'http://0.0.0.0:0000/'
token = '0000000000000'
plex = PlexServer(baseurl, token)
episode = 0
first_ep_name = []
for x in plex.library.section('Anime').search(unwatched=True):
    try:
        for y in plex.library.section('Anime').get(x.title).episodes()[episode]:
            if plex.library.section('Anime').get(x.title).episodes()[episode].isWatched:
                episode += 1
                first_ep_name.append(y)
            else:
                episode = 0
                first_ep_name.append(y)
    except:
        continue
plex.playlist('Anime Playlist').addItems(first_ep_name)
But when I run it, it always adds watched episodes. If I debug the code in the Thonny IDE it seems to be doing its job, so I am not sure what's wrong with the code.
Any ideas?
I'm thinking that the error might be here:
plex.playlist('Anime Playlist').addItems(first_ep_name)
but according to the documentation addItems expects a list, and my list first_ep_name is already collecting unwatched episodes in the correct order. In theory addItems should recognize the specific episode and not only the series name, but I am not sure anymore.

In case somebody out there is having the same issue with plexapi: I was able to find a way to get this project working properly:
from plexapi.server import PlexServer
baseurl = 'insert plex url here'
token = 'plex token here'
plex = PlexServer(baseurl, token)
anime_plex = []
scrapped_playlist = []
for x in plex.library.section('Anime').search(unwatched=True):
anime_plex.append(x)
while len(anime_plex) > 0:
    episode_list = []
    for y in plex.library.section('Anime').get(anime_plex[0].title).episodes():
        episode_list.append(y)
    ep_checker = True
    while ep_checker:
        if episode_list[0].isWatched:
            episode_list.pop(0)
        else:
            scrapped_playlist.append(episode_list[0])
            episode_list.clear()
            ep_checker = False
    anime_plex.pop(0)
# plex.playlist('Anime Playlist').addItems(scrapped_playlist)
plex.playlist('Anime Playlist').delete()
plex.createPlaylist('Anime Playlist', section='Anime', items=scrapped_playlist)
Basically, what that code does is loop through each anime series I have; while episode X is watched it pops it from the list, until it finds one whose watched flag is False, and that episode is appended to an empty list that is later used for creating the playlist or adding to it.
The last lines of the code can be commented out depending on the purpose: creating the playlist or just adding the items.
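For reference, here is a more compact sketch of the same idea. It assumes the same plexapi behavior used above (Show objects from search() exposing episodes(), and each episode carrying an isWatched flag); treat it as a sketch rather than a drop-in replacement:

from plexapi.server import PlexServer

plex = PlexServer('insert plex url here', 'plex token here')

first_unwatched = []
for show in plex.library.section('Anime').search(unwatched=True):
    # next() yields the first episode whose isWatched flag is False,
    # or None when every episode of the show has been watched
    episode = next((ep for ep in show.episodes() if not ep.isWatched), None)
    if episode is not None:
        first_unwatched.append(episode)

plex.playlist('Anime Playlist').delete()
plex.createPlaylist('Anime Playlist', section='Anime', items=first_unwatched)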


KeyError with Riot API Matchv5 When Trying To Pull Data

I'm trying to pull a list of team and player stats from match IDs. Everything looks fine to me, but when I run my for loops to call the functions that pull the stats I want, they just print the error from my try/except block. I'm still pretty new to Python and this is my first project, so I've tried everything I can think of over the past few days, but no luck. I believe the problem is with my actual pull request, but I'm not sure, since I'm also using a GitHub library I found to help me with the Riot API while I change and update it to get the info I want.
def get_match_json(matchid):
    url_pull_match = "https://{}.api.riotgames.com/lol/match/v5/matches/{}/timeline?api_key={}".format(region, matchid, api_key)
    match_data_all = requests.get(url_pull_match).json()
    # Check to make sure match is long enough
    try:
        length_match = match_data_all['frames'][15]
        return match_data_all
    except IndexError:
        return ['Match is too short. Skipping.']
And then this is a shortened version of the stat function:
def get_player_stats(match_data, player):
    # Get player information at the fifteenth minute of the game.
    player_query = match_data['frames'][15]['participantFrames'][player]
    player_team = player_query['teamId']
    player_total_gold = player_query['totalGold']
    player_level = player_query['level']
There are some other functions in the code as well, but I'm not sure whether they are also faulty or whether they are needed to figure out the error. Here is the for loop that makes the requests and defines the variable match_id:
for matchid_batch in all_batches:
    match_data = []
    for match_id in matchid_batch:
        time.sleep(1.5)
        if match_id == 'MatchId':
            pass
        else:
            try:
                match_entry = get_match_row(match_id)
                if match_entry[0] == 'Match is too short. Skipping.':
                    print('Match', match_id, "is too short.")
                else:
                    match_entry = get_match_row(match_id).reshape(1, -1)
                    match_data.append(np.array(match_entry))
            except KeyError:
                print('KeyError.')
    match_data = np.array(match_data)
    match_data.shape = -1, 17
    df = pd.DataFrame(match_data, columns=column_titles)
    df.to_csv('Match_data_Diamond.csv', mode='a')
    print('Done Batch!')
Since this is my first project, any help would be appreciated. I can't find any info on this particular subject, so I really don't know where to look to learn why it isn't working on my own.
I guess your issue was that the 'frames' array is nested under the 'info' key.
def get_match_json(matchid):
    url_pull_match = "https://{}.api.riotgames.com/lol/match/v5/matches/{}/timeline?api_key={}".format(region, matchid, api_key)
    match_data_all = requests.get(url_pull_match).json()
    try:
        length_match = match_data_all['info']['frames'][15]
        return match_data_all
    except IndexError:
        return ['Match is too short. Skipping.']

def get_player_stats(match_data, player):  # player has to be an int (1-10)
    # Get player information at the fifteenth minute of the game.
    player_query = match_data['info']['frames'][15]['participantFrames'][str(player)]
    # player_team = player_query['teamId'] - it is not possible with this endpoint to get the teamId
    player_total_gold = player_query['totalGold']
    player_level = player_query['level']
    return player_query
This example worked for me. Unfortunately it is not possible to get the teamId through this API endpoint alone. Usually players 1-5 are on team 100 (blue side) and players 6-10 on team 200 (red side).
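As a quick illustration, a hypothetical call could look like this (the match id is a made-up placeholder, and region and api_key are assumed to be defined as in the question):

match_json = get_match_json('EUW1_0000000000')  # hypothetical match id
if isinstance(match_json, dict):  # a list means the match was too short
    stats = get_player_stats(match_json, 3)  # participant 3, usually blue side
    print(stats['totalGold'], stats['level'])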

Find Value Using Selenium using a Variable that Contains String

I am trying to open up several URLs (because they contain data I want to append to a list). I have logic saying "if amount in icl_dollar_amount_l", then run the rest of the code. However, I want the script to run the rest of the code only on the specific amount in the variable amount.
Example:
Selenium opens up X number of links and sees ['144,827.95', '5,199,024.87', '130,710.67'] in icl_dollar_amount_l, but I want it to skip '144,827.95' and '5,199,024.87' and only get the information for '130,710.67', which is already in the amount variable.
Actual results:
It's getting web scraping information for amount '144,827.95' only and not even going on to '5,199,024.87' or '130,710.67'. I only want it to get web scraping information for '130,710.67', because my amount variable has this as the only amount.
print(icl_dollar_amount_l)
['144,827.95', '5,199,024.87', '130,710.67']
print(amount)
'130,710.67'
file2.py
def scrapeBOAWebsite(url, fcg_subject_l, gp_subject_l):
    from ICL_Awk_Checker import rps_amount_l2
    icl_dollar_amount_l = []
    amount_ack_missing_l = []
    file_total_l = []
    body_l = []
    for link in url:
        print(link)
        browser = webdriver.Chrome(options=options,
                                   executable_path=r'\\TEST\user$\TEST\Documents\driver\chromedriver.exe')
        # if 'P2 Cust ID 908554 File' in fcg_subject:
        browser.get(link)
        username = browser.find_element_by_name("dialog:username").get_attribute('value')
        submit = browser.find_element_by_xpath("//*[@id='dialog:continueButton']").click()
        body = browser.find_element_by_xpath("//*[contains(text(), 'Total:')]").text
        body_l.append(body)
        icl_dollar_amount = re.findall('(?:[\£\$\€]{1}[,\d]+.?\d*)', body)[0].split('$', 1)[1]
        icl_dollar_amount_l.append(icl_dollar_amount)
    if not missing_amount:
        logging.info("List is empty")
        print("List is empty")
    count = 0
    for amount in missing_amount:
        if amount in icl_dollar_amount_l:
            body = body_l[count]
            get_file_total = re.findall('(?:[\£\$\€]{1}[,\d]+.?\d*)', body)[0].split('$', 1)[1]
            file_total_l.append(get_file_total)
    return icl_dollar_amount_l, file_date_l, company_id_l, client_id_l, customer_name_l, file_name_l, file_total_l, \
        item_count_l, file_status_l, amount_ack_missing_l
I don't know if I fully understand the problem, but this
if amount in icl_dollar_amount_l:
doesn't give you the position of '130,710.67' in icl_dollar_amount_l; you also need
count = icl_dollar_amount_l.index(amount)
for amount in missing_amount:
    if amount in icl_dollar_amount_l:
        count = icl_dollar_amount_l.index(amount)
        body = body_l[count]
But this will only work if you expect a single matching amount in the list icl_dollar_amount_l. For more elements you would have to use a for loop instead and check every element separately:
for amount in missing_amount:
    for count, item in enumerate(icl_dollar_amount_l):
        if amount == item:
            body = body_l[count]
But frankly, I don't know why you don't check it in the first loop, for link in url:, where you have direct access to icl_dollar_amount and body.
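A rough sketch of that idea, assuming browser, url and missing_amount are defined as in file2.py above (everything else here is illustrative):

import re

file_total_l = []
for link in url:
    browser.get(link)
    body = browser.find_element_by_xpath("//*[contains(text(), 'Total:')]").text
    # extract this page's amount right away, as in the original loop
    icl_dollar_amount = re.findall('(?:[\£\$\€]{1}[,\d]+.?\d*)', body)[0].split('$', 1)[1]
    # keep the total only when this page's amount is one we are looking for
    if icl_dollar_amount in missing_amount:
        file_total_l.append(icl_dollar_amount)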

Why does this for loop return a different sized list than expected?

I'm doing a data analysis project using the spotipy and numpy libraries. I've figured out how to achieve my expected result, but I don't know exactly why a slight change to my code (using a for loop) causes it not to work. Here is my code:
def get_user_playlist(username, playlist_id, sp):
    offset = 0
    playlist_songs = sp.user_playlist_tracks(username, playlist_id, limit=100, fields=None, offset=offset, market=None)['items']
    return playlist_songs

def create_dataframe(playlist_songs):
    playlist_df_columns = ['artist','track_name','id','explicit','duration','danceability','loudness','tempo']
    #audio_analysis_columns = ['danceability','loudness','tempo']
    playlist_df = pd.DataFrame(columns=playlist_df_columns)
    # song = dict object containing song
    playlist_df['artist'] = np.array([song['track']["album"]["artists"][0]["name"] for song in playlist_songs])
    playlist_df['track_name'] = np.array([song['track']['name'] for song in playlist_songs])
    playlist_df['id'] = np.array([song['track']['id'] for song in playlist_songs])
    playlist_df['explicit'] = np.array([song['track']['explicit'] for song in playlist_songs])
    for song in playlist_songs:
        audio_analysis = sp.audio_features(song['track']['id'])
        # returning audio_analysis for testing purposes.
        return audio_analysis
    #return playlist_df
The important part is the for loop. When I run this code, the length of the audio_analysis list is 1:
for song in playlist_songs:
    audio_analysis = sp.audio_features(song['track']['id'])
However, it works when I remove the for loop and do this instead; the length of the audio_analysis list is 94, as expected:
audio_analysis = sp.audio_features(playlist_df['id'])
For reference, here is the code that prints the length:
playlist = get_user_playlist('username', 'playlist_name', sp)
audio_analysis = create_dataframe(playlist)
print(len(audio_analysis))
My question is: why does the for loop not work as I expect? Is my code not accessing the same information? Why isn't using a for loop to access information the same as using the playlist_df['id'] column directly?
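For what it's worth, here is a minimal standalone sketch (not taken from the question) of how return behaves inside a for loop, which matches the difference in list lengths described above:

def first_only(items):
    for item in items:
        return item          # leaves the function during the first iteration

def all_items(items):
    results = []
    for item in items:
        results.append(item) # accumulate inside the loop
    return results           # return once, after the loop finishes

print(first_only([1, 2, 3]))  # 1
print(all_items([1, 2, 3]))   # [1, 2, 3]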

Find all elements with specific text selenium python

I am trying to locate the elements that carry the contact numbers on each site. I was able to create the routine to get the numbers, extract the contact number with the available formats and regex, and use the following code snippet to get the element:
contact_elem = browser.find_elements_by_xpath("//*[contains(text(), '" + phone_num + "')]")
Considering the example of https://www.cssfirm.com/, the contact number appears in 2 locations, the top header and the bottom footer
The element texts accompanying the contact number are as follows :
<h3>CALL US TODAY AT (855) 910-7824</h3> - Footer
<span>Call Us<br>Today</span> (855) 910-7824 - Header
The extracted phone number matches perfectly while printing it out. For some reason, the element from the header part is not being detected.
I tried by searching for elements and even by deleting the footer element from the browser before executing the rest of the code.
What could be the reason for it to go undetected?
P.S.: Below is the amateurish, uncorrected code. Efficiency edits/suggestions are welcome. The same code has been tested with various sites and works fine.
url = 'http://www.cssfirm.com/'
browser.get(url)
parsed = browser.find_element_by_tag_name('html').get_attribute('innerHTML')
s = BeautifulSoup(parsed, 'html.parser')
s = s.decode('utf-8')
phoneNumberRegex = '(\s*(?:\+?(\d{1,4}))?[-. (]*(\d{1,})[-. )]*(\d{3}|[A-Z0-9]+)[-. \/]*(\d{4}|[A-Z0-9]+)[-. \/]?(\d{4}|[A-Z0-9]+)?(?: *x(\d+))?\s*)'
custom_re = ['([0-9]{4,4} )([0-9]{3,3} )([0-9]{4,4})',
             '([0-9]{3,3} )([0-9]{4,4} )([0-9]{4,4})',
             '(\+[0-9]{2,2}-)([0-9]{4,4}-)([0-9]{4,4}-)(0)',
             '(\([0-9]{3,3}\) )([0-9]{3,3}-)([0-9]{4,4})',
             '(\+[0-9]{2,2} )(\(0\)[0-9]{4,4} )([0-9]{4,6})',
             '([0-9]{5,5} )([0-9]{6,6})',
             '(\+[0-9]{2,2}\(0\))([0-9]{4,4} )([0-9]{4,4})',
             '(\+[0-9]{2,2} )([0-9]{3,3} )([0-9]{4,4} )([0-9]{3,3})',
             '([0-9]{3,3}-)([0-9]{3,3}-)([0-9]{4,4})']
phones = []
phones = re.findall(phoneNumberRegex, s)
phone_num_list = ()
phone_num = ''
matched = 0
for phoneHeader in phones:
    #phoneHeader = phoneHeader.decode('utf-8')
    for ph_cnd in phoneHeader:
        for pttrn in custom_re:
            phones = re.findall(pttrn, ph_cnd)
            if(phones):
                phone_num_list = phones
                for x in phone_num_list:
                    phone_num = ''.join(x)
                    try:
                        contact_elem = browser.find_element_by_xpath("//*[contains(text(), '" + phone_num + "')]")
                        phone_num_txt = contact_elem.text
                        if(phone_num_txt):
                            matched = 1
                            break
                    except NoSuchElementException:
                        pass
            if(matched == 1):
                break
        if(matched == 1):
            break
    if(matched == 1):
        break
print("Phone number :", phone_num)  # <-- Perfect output
contact_elem  # <-- empty for header, or just the footer element
EDIT
Code updated; I had forgotten an important piece. Moreover, there is sleep time in between to give the page time to load. Considering it trivial, I haven't included it here, for a quick read.
I found a temporary solution by searching for the partial link text, as the number also appears in a link.
contact_elem2 = browser.find_element_by_partial_link_text(phone_num)
However, this does not answer the generic question as to why that text was ignored within the element.
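One plausible explanation (an educated guess, not verified against that particular page): contains(text(), '...') only tests an element's first text node, so a header element where the number comes after a <br> or a child tag can be missed, whereas contains(., '...') tests the element's entire string value. A hedged sketch, assuming browser and phone_num as in the code above:

# may also match ancestor elements (html, body, ...), so filter the results
contact_elems = browser.find_elements_by_xpath("//*[contains(., '" + phone_num + "')]")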

KeyError and TypeError in my python web scraper

Sorry about this vague and confusing title, but there is really no better way for me to summarize my problem in one sentence.
I was trying to get the student and grade information from a french website. The link is this (http://www.bankexam.fr/resultat/2014/BACCALAUREAT/AMIENS?filiere=BACS)
My code is as follows:
import time
import urllib2
from bs4 import BeautifulSoup

regions = {'R\xc3\xa9sultats Bac Amiens 2014':'/resultat/2014/BACCALAUREAT/AMIENS'}
base_url = 'http://www.bankexam.fr'
tests = {'es':'?filiere=BACES','s':'?filiere=BACS','l':'?filiere=BACL'}
for i in regions:
    for x in tests:
        # create the output file
        output_file = open('/Users/student project/'+ i + '_' + x + '.txt','a')
        time.sleep(2) #compassionate scraping
        section_url = base_url + regions[i] + tests[x] #now goes to the x test page of region i
        request = urllib2.Request(section_url)
        response = urllib2.urlopen(request)
        soup = BeautifulSoup(response,'html.parser')
        content = soup.find('div',id='zone_res')
        for row in content.find_all('tr'):
            if row.td:
                student = row.find_all('td')
                name = student[0].strong.string.encode('utf8').strip()
                try:
                    school = student[1].strong.string.encode('utf8')
                except AttributeError:
                    school = 'NA'
                result = student[2].span.string.encode('utf8')
                output_file.write('%s|%s|%s\n' % (name,school,result))
        # Find the maximum pages to go through
        if soup.find('div','pagination'):
            import re
            page_info = soup.find('div','pagination')
            pages = []
            for i in page_info.find_all('a',re.compile('elt')):
                try:
                    pages.append(int(i.string.encode('utf8')))
                except ValueError:
                    continue
            max_page = max(pages)
        # Now goes through page 2 to max page
        for i in range(1,max_page):
            page_url = '&p='+str(i)+'#anchor'
            section2_url = section_url+page_url
            request = urllib2.Request(section2_url)
            response = urllib2.urlopen(request)
            soup = BeautifulSoup(response,'html.parser')
            content = soup.find('div',id='zone_res')
            for row in content.find_all('tr'):
                if row.td:
                    student = row.find_all('td')
                    name = student[0].strong.string.encode('utf8').strip()
                    try:
                        school = student[1].strong.string.encode('utf8')
                    except AttributeError:
                        school = 'NA'
                    result = student[2].span.string.encode('utf8')
                    output_file.write('%s|%s|%s\n' % (name,school,result))
A little more description about the code:
I created a 'regions' dictionary and a 'tests' dictionary because there are 30 other regions I need to collect, and I just include one here as a showcase. I'm only interested in the results of three tests (ES, S, L), which is why I created the 'tests' dictionary.
Two errors keep showing up. One is
KeyError: 2
and the error is linked to line 12,
section_url = base_url + regions[i] + tests[x]
The other is
TypeError: cannot concatenate 'str' and 'int' objects
and this is linked to line 10.
I know there is a lot of information here and I'm probably not listing the most important info for you to help me, but let me know what I can do to fix this!
Thanks
The issue is that you're using the variable i in more than one place.
Near the top of the file, you do:
for i in regions:
So, in some places i is expected to be a key into the regions dictionary.
The trouble comes when you use it again later. You do so in two places:
for i in page_info.find_all('a',re.compile('elt')):
And:
for i in range(1,max_page):
The second of these is what is causing your exceptions, as the integer values that get assigned to i don't appear in the regions dict (nor can an integer be added to a string).
I suggest renaming some or all of those variables. Give them meaningful names, if possible (i is perhaps acceptable for an "index" variable, but I'd avoid using it for anything else unless you're code golfing).
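A minimal standalone demonstration of the shadowing (the data is made up, but it reproduces the same KeyError):

regions = {'AMIENS': '/resultat/2014/BACCALAUREAT/AMIENS'}

for i in regions:            # here i is a region key (a string)
    for i in range(1, 3):    # the inner loop rebinds i to an integer
        pass
    print(regions[i])        # KeyError: 2 -- i is no longer a region key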
