Python Loop replace a part of string - python

I am working on a project in Jupyter Notebook; this is an oversimplified example.
I have a URL, let's say
url=www.instagram.com/alex
I need to build a table by adding a url column next to the names, using the replace function.
I have a pandas data frame:
Names
John
Cherry
nancy
The result I want from the function:
Names url
John www.instagram.com/john
Cherry www.instagram.com/cherry
nancy www.instagram.com/nancy
What I am doing is:
data["url"] = url
w = data.names.values
def replace()
for i in w,data.iteritems:
for j in range(len(data.url),data.iteritems:
data["url"]=url.replace("alex",i(j))
return data
It throws an error saying I cannot use range as indices. I tried several ways of using integers instead, but I only get results when I manually write i(0), i(1) or i(3).
If I add another for line like
for w in range(len(data.url):
and use i(w),
then every row is set to the i(0) value, which in this example is www.instagram.com/john.
This example is oversimplified. In my project I really need a function, because the URL is very long and the names are user input (the user selects them).

Please check below:
data['url'] = data['Names'].apply(lambda x: url.replace('alex', x.lower()))

data["url"] = "www.instagram.com/" + data["Names"].str.lower()
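Since the question asks specifically for a function, the same replacement can be wrapped in one. This is a minimal sketch in plain Python, not the original poster's code; the names make_url and placeholder are hypothetical and chosen only for illustration.

```python
def make_url(template, name, placeholder="alex"):
    """Return template with placeholder swapped for the lower-cased name."""
    return template.replace(placeholder, name.lower())


url = "www.instagram.com/alex"
names = ["John", "Cherry", "nancy"]

# One URL per name, in the same order as the names
urls = [make_url(url, n) for n in names]
print(urls)
```

With the data frame from the question, the function could be applied column-wise as data["url"] = data["Names"].apply(lambda n: make_url(url, n)).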

Related

Why isn't this function to scrape a table with Selenium working as intended?

I'm trying to scrape the table on this website with Selenium (you need to make a quick account and login to see it). To get the first column, I use the following code, which works:
first_column = driver.find_elements_by_xpath("//div[@class='top-line-table']//tbody//td[2]")
for i in first_column:
    print(i.text)
Prints:
Bernie Sanders
Joe Biden
Michael Bloomberg
Elizabeth Warren
However, when I try making it into a function that I can input the column I want into, it only returns the first value, "Bernie Sanders". Here's my code to define, call, and print the function:
def scrape_column(column):
    raw = driver.find_elements_by_xpath(f"//div[@class='top-line-table']//tbody//td[{column}]")
    for ii in raw:
        return [ii.text]
candidates = scrape_column("2")
print(candidates)
I don't know why it only returns the first value, I've tried a lot of things and it still doesn't work. Help is much appreciated!
The reason is that return ends the function immediately: the first pass through the loop returns a one-element list, and the remaining elements are never visited. If you want all the candidate names as a list, do something like this:
def scrape_column(column):
    names = []
    raw = driver.find_elements_by_xpath(f"//div[@class='top-line-table']//tbody//td[{column}]")
    for ii in raw:
        names.append(ii.text)
    return names
candidates = scrape_column("2")
print('\n'.join(candidates))
Look up str.join; it is pretty neat.
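The difference between the two versions can be reproduced without Selenium. The sketch below uses a plain list in place of the scraped elements; first_only and collect_all are hypothetical names used only for this illustration.

```python
def first_only(items):
    for item in items:
        # return leaves the function on the very first iteration
        return [item]


def collect_all(items):
    collected = []
    for item in items:
        collected.append(item)
    # return only after the loop has visited every item
    return collected


rows = ["Bernie Sanders", "Joe Biden", "Michael Bloomberg", "Elizabeth Warren"]
print(first_only(rows))
print("\n".join(collect_all(rows)))
```

Note that first_only returns None for an empty input, because the loop body never runs and the function falls off the end.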

python nested loop list

I am currently stuck on a nested loop problem. I would greatly appreciate any insight or tips on how to solve it.
I am trying to append some values to a list in a for loop, and that part works. But how can I get the final list as a variable to use in another loop?
Let's say I am extracting something by appending to a list in a for loop:
a=list()
for b in hugo:
    a.append(ids)
    print(a)
gives me
[1]
[1,2]
[1,2,3]
[1,2,3,4]
But I only need the last line, the complete list, as a variable to use in another for loop. Can anybody give me some insight into how to do this? Your help is much appreciated. Thanks in advance.
Edit:
Actually, I am not trying to get someone to do my homework for me. I am just testing some software scripting with Python. Here goes:
I am trying to write a script that extracts the files whose names end in .dat from the ANSA pre-processor, with the correct name and file ID.
For example:
ID Name
1 hugo1.dat
8 hugo2.dat
11 hugo3.dat
18 hugo4.dat
Here is what I have written:
import os
import ansa
from ansa import base
from ansa import constants
from ansa import guitk
def export_include_content():
    directory = guitk.UserInput('Please enter the directory to Output dat files:')
    ishow=list()
    includes=list()
    setna=list()
    iname=list()
    # Collect the "INCLUDE" entities from the software into includes
    includes=base.CollectEntities(deck, None, "INCLUDE")
    # Loop over the includes and keep those whose file name ends in ".dat"
    for include in includes:
        ret=base.GetEntityCardValues(deck, include, 'NAME', 'ID')
        ids=str(ret['ID'])
        setname=ret['NAME']
        if setname.endswith('dat'):
            ishow.append(ids)
            iname.append(setname)
# Print(ishow) gives me
[1]
[1,8]
[1,8,11]
[1,8,11,18]
# print(iname) gives me
[hugo1]
[hugo1,hugo2]
[hugo1,hugo2,hugo3]
[hugo1,hugo2,hugo3,hugo4]
# Now that I got both of my required list of IDs and Names. It's time for me to save the files with the respective IDs and Names.
    for a in ishow:
        test=base.GetEntity(deck,'INCLUDE',int(a))
        print(a)
        file_path_name=directory+"/"+iname
        print(file_path_name)
#print(a) gives me
1
8
11
18
#print(file_path_name) gives me
filepath/[hugo1,hugo2,hugo3,hugo4]
filepath/[hugo1,hugo2,hugo3,hugo4]
filepath/[hugo1,hugo2,hugo3,hugo4]
filepath/[hugo1,hugo2,hugo3,hugo4]
# This is the part I got stuck. I wanted the output to be printed in this order:
1
filepath/hugo1
8
filepath/hugo2
11
filepath/hugo3
18
filepath/hugo4
But it doesn't work for me so far, which is why I am asking whether you can help me solve this problem :) Any help is appreciated! Thanks all.
Your problem is the indentation. Move print(a) out of the loop so it runs once, after the loop has finished:
a=list()
for b in hugo:
    a.append(ids)
print(a)
Use a dictionary instead of two separate lists for the IDs and names of the includes.
The code below creates a dictionary with the include ID as key and the corresponding include's name as value. The dictionary is then used to print the file names.
If you want to save each include as a separate file, first isolate the include using the "Or" API. ANSA then has an output API for each deck to save the files (make sure to enable the optional argument 'save visible'). For NASTRAN, for example, it is OutputNastran; you can look it up in the API search tab of the script editor window.
# Not named dict, because that would shadow the built-in type
include_names={}
for include in includes:
    ret=base.GetEntityCardValues(deck, include, 'NAME', 'ID')
    ids=str(ret['ID'])
    setname=ret['NAME']
    if setname.endswith('.dat'):
        include_names[ids]=setname
for k, v in include_names.items():
    test=base.GetEntity(deck,'INCLUDE',int(k))
    file_path_name=directory+"/"+v
    print(file_path_name)
Hope this helps
Assuming ids is actually just the elements in hugo:
a=[id for id in hugo]
print(a)
Or
a=hugo.copy()
print(a)
Or
print(hugo)
Or
a=hugo
print(a)
Or
string = "["
for elem in hugo:
    # str has no append method; build the string with += instead
    string += str(elem) + ","
print(string[:-1] + "]")
Edit: Added more amazing answers. The last is my personal favourite.
Edit 2:
Answer for your edited question:
This part
for a in ishow:
    test=base.GetEntity(deck,'INCLUDE',int(a))
    print(a)
    file_path_name=directory+"/"+iname
    print(file_path_name)
Needs to be changed to
for i in range(len(ishow)):
    test=base.GetEntity(deck,'INCLUDE',int(ishow[i]))
    file_path_name=directory+"/"+iname[i]
The print statements can stay if you wish.
When you need the same index in several lists, iterate with for i in range(len(a)) so that you can read the same position in each of them.
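Pairing two lists position by position can be tried without ANSA. The sketch below uses the IDs and names from the question as plain lists, with "filepath" as a placeholder directory; it shows the index-based loop from this answer next to the built-in zip, which is an alternative not mentioned in the thread.

```python
ishow = ["1", "8", "11", "18"]
iname = ["hugo1.dat", "hugo2.dat", "hugo3.dat", "hugo4.dat"]
directory = "filepath"  # placeholder, not a real path

# Index-based: the same i selects matching entries in both lists
by_index = [directory + "/" + iname[i] for i in range(len(ishow))]

# zip-based: yields (id, name) pairs directly, with no index bookkeeping
paired = []
for include_id, name in zip(ishow, iname):
    paired.append((int(include_id), directory + "/" + name))
    print(include_id)
    print(directory + "/" + name)
```

Both loops assume the two lists have the same length and order; zip silently stops at the shorter list if they do not.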
Your current code prints on every iteration of the loop. Move the print statement left to the same indent level as the for statement, so that it prints only once, after the loop has finished.
a=list()
for b in hugo:
    a.append(ids)
print(a)

How would I be able to remove this part of the variable?

I am writing a guessing game. Its data lives in a CSV file, so I decided to use pandas. I import the CSV file, pick a random row and put its values into variables for use in the rest of the code, but I can't figure out how to get the data in those variables into the right format.
I've tried splitting the string with split(), but I am quite lost.
import pandas
ar = pandas.read_csv('names.csv')
ar.columns = ["Song Name","Artist","Intials"]
randomsong = ar.sample(1)
songartist = randomsong["Artist"]
songname = (randomsong["Song Name"])
songintials = randomsong["Intials"]
print(songname)
My CSV file looks like this.
Song Name,Artist,Intials
Someone you loved,Lewis Capaldi,SYL
Bad Guy,Billie Eilish,BG
Ransom,Lil Tecca,R
Wow,Post Malone, W
I expect the output to be the name of the song from the csv file. For Example
Bad Guy
Instead the output is
1 Bad Guy
Name: Song Name, dtype:object
If anyone knows the solution please let me know. Thanks
You're getting a Series object as output. You can try
randomsong["Song Name"].to_string(index=False)
Use df['column'].values to get the values of a column.
In your case, songartist = randomsong["Artist"].values[0], because you want only the first element of the returned array.
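To see the difference between the one-row Series and the plain value, the sketch below loads the CSV rows from the question out of an in-memory string (an assumption made so the example runs without names.csv) and reads the single value with .iloc[0], which is equivalent here to .values[0].

```python
import io

import pandas as pd

# CSV content copied from the question; read from memory instead of names.csv
csv_text = """Song Name,Artist,Intials
Someone you loved,Lewis Capaldi,SYL
Bad Guy,Billie Eilish,BG
Ransom,Lil Tecca,R
Wow,Post Malone,W
"""

ar = pd.read_csv(io.StringIO(csv_text))
randomsong = ar.sample(1)  # a DataFrame with exactly one row

# Each column of that row is a one-element Series; .iloc[0] extracts the value
songname = randomsong["Song Name"].iloc[0]
songartist = randomsong["Artist"].iloc[0]
print(songname)
```

Because sample picks a random row, the printed song differs between runs.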

Python - Format a string from a list not working

I want to crawl a webpage for some information. What I've done so far works, but I need to make a request to another URL on the website. I'm trying to format that URL but it's not working. This is what I have so far:
import requests
from lxml import html
name = input("> ")
page = requests.get("http://www.mobafire.com/league-of-legends/champions")
tree = html.fromstring(page.content)
for index, champ in enumerate(champ_list):
    if name == champ:
        y = tree.xpath(".//*[@id='browse-build']/a[{}]/@href".format(index + 1))
        print(y)
        guide = requests.get("http://www.mobafire.com{}".format(y))
        builds = html.fromstring(guide.content)
        print(builds)
        for title in builds.xpath(".//table[@class='browse-table']/tr[2]/td[2]/div[1]/a/text()"):
            print(title)
The user enters a name. If the name matches one in a list (champ_list), the script prints a URL, formats it into the request for the guide variable and fetches further information, but I'm getting errors such as invalid IPv6.
This is the URL it prints (one of them; they are all similar): ['/league-of-legends/champion/ivern-133']
I tried slicing, but it doesn't do anything; I'm probably using it wrong, or it doesn't work in this case. I also tried replace, which doesn't work on lists, so I tried it as:
y = [y.replace("'", "") for y in y] to see whether it would at least remove the quotes, but that didn't work either. What would be another approach to format this properly?
I take it y is the list you want to insert into the string?
Try this:
"http://www.mobafire.com{}".format('/'.join(y))

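The effect of formatting a list into the URL can be checked without any network access. The sketch below uses the value printed in the question; taking y[0] is an alternative to the join shown above and assumes the XPath query returned exactly one match. The stray brackets in the host part are the likely cause of the invalid IPv6 error, since square brackets in a host are reserved for IPv6 literals.

```python
base = "http://www.mobafire.com{}"
y = ["/league-of-legends/champion/ivern-133"]  # value printed in the question

# Formatting the list itself embeds its repr, brackets and quotes included
broken = base.format(y)

# Formatting the first element inserts only the path
fixed = base.format(y[0])

print(broken)
print(fixed)
```
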
Python splitting values from urllib in string

I'm trying to get IP location and other stuff from ipinfodb.com, but I'm stuck.
I want to split all of the values into new strings that I can format how I want later. What I wrote so far is:
resp = urllib2.urlopen('http://api.ipinfodb.com/v3/ip-city/?key=mykey&ip=someip').read()
out = resp.replace(";", " ")
print out
Before I replaced the separators, the output was:
OK;;someip;somecountry;somecountrycode;somecity;somecity;-;42.1975;23.3342;+05:00
So I made it show only
OK someip somecountry somecountrycode somecity somecity - 42.1975;23.3342 +05:00
But this is not very useful, because I don't want all the values in one string. Right now I print out and get the line above; I want to be able to write print country, print city and so on, and get the country, the city, etc. I checked their site: there is a class for that, but it is for a different API version (v2; mine is v3), so I can't use it. Does anyone have an idea how to do this?
PS. Sorry if the answer is obvious or I'm mistaken; I'm new to Python :s
You need to split the resp text by ;:
out = resp.split(';')
Now out is a list of values instead, use indexes to access various items:
print 'Country: {}'.format(out[3])
Alternatively, add format=json to your query string and receive a JSON response from that API:
import json
resp = urllib2.urlopen('http://api.ipinfodb.com/v3/ip-city/?format=json&key=mykey&ip=someip')
data = json.load(resp)
print data['countryName']
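The split approach can be tried offline on the sample response from the question. This sketch is written in Python 3 (the thread's own code is Python 2), and the variable names and field positions are assumptions read off that sample line, not taken from the API documentation.

```python
# Sample response copied from the question, used instead of a live request
resp = "OK;;someip;somecountry;somecountrycode;somecity;somecity;-;42.1975;23.3342;+05:00"

fields = resp.split(";")

# Positions are assumptions based on the sample line above
status = fields[0]
ip = fields[2]
country = fields[3]
latitude = float(fields[8])
longitude = float(fields[9])

print("Country: {}".format(country))
print("Position: {}, {}".format(latitude, longitude))
```

Each value now lives in its own variable and can be printed or formatted separately.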
