I am trying to generate URLs as follows:
http://ergast.com/api/f1/2000/qualifying?limit=10000
I am using Python to generate URLs for the years 2000 to 2015, and to that end, wrote this code snippet:
url = "http://ergast.com/api/f1/"
year = url.join([str(i) + "/qualifying?limit=10000" + "\n" for i in range(1999, 2016)])
print(year)
The output is:
1999/qualifying?limit=10000
http://ergast.com/api/f1/2000/qualifying?limit=10000
http://ergast.com/api/f1/2001/qualifying?limit=10000
http://ergast.com/api/f1/2002/qualifying?limit=10000
http://ergast.com/api/f1/2003/qualifying?limit=10000
http://ergast.com/api/f1/2004/qualifying?limit=10000
......
http://ergast.com/api/f1/2012/qualifying?limit=10000
http://ergast.com/api/f1/2013/qualifying?limit=10000
http://ergast.com/api/f1/2014/qualifying?limit=10000
http://ergast.com/api/f1/2015/qualifying?limit=10000
How do I get rid of the first line? I tried making the range (2000, 2016), but the same thing happened with the first line being 2000 instead of 1999. What am I doing wrong? How can I fix this?
You can use string formatting for this:
url = 'http://ergast.com/api/f1/{0}/qualifying?limit=10000'
print('\n'.join(url.format(year) for year in range(2000, 2016)))
# http://ergast.com/api/f1/2000/qualifying?limit=10000
# http://ergast.com/api/f1/2001/qualifying?limit=10000
# ...
# http://ergast.com/api/f1/2015/qualifying?limit=10000
UPDATE:
Based on OP's comments to pass these urls in requests.get:
import requests
url_tpl = 'http://ergast.com/api/f1/{0}/qualifying?limit=10000'
# use a list comprehension to get all the URLs
all_urls = [url_tpl.format(year) for year in range(2000, 2016)]
for url in all_urls:
    response = requests.get(url)
Instead of using the URL to join the string, use a list comprehension to create the different URLs.
>>> ["http://ergast.com/api/f1/%d/qualifying?limit=10000" % i for i in range(1999, 2016)]
['http://ergast.com/api/f1/1999/qualifying?limit=10000',
'http://ergast.com/api/f1/2000/qualifying?limit=10000',
...
'http://ergast.com/api/f1/2014/qualifying?limit=10000',
'http://ergast.com/api/f1/2015/qualifying?limit=10000']
You could then still use '\n'.join(...) to join all of those into one big string, if you like.
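As a minimal sketch of that last step, the list can be built first and then joined with newlines. The names urls and text are only illustrative, and the range starts at 2000 to match the years asked for in the question.

```python
# Build one URL per year, then join them with newlines.
urls = ["http://ergast.com/api/f1/%d/qualifying?limit=10000" % i
        for i in range(2000, 2016)]
text = "\n".join(urls)
print(text)
```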
You could use the cleaner and more powerful string formatting as follows,
fmt = "http://ergast.com/api/f1/{y}/qualifying?limit=10000"
urls = [fmt.format(y=y) for y in range(2000, 2016)]
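In your code the use of str.join is questionable, because its semantics differ from what you are trying to accomplish. s.join(ls) joins the items of the list ls using the string s as the separator. If ls = [l1, l2, ...], it returns l1 + s + l2 + s + ..., so s appears only between items and never before the first one. The items must already be strings.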
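A small sketch of that behaviour, using a short made-up separator and list so the pattern is easy to see:

```python
# The separator is placed only between items, never before the first one.
sep = "<URL>"
items = ["1999/q\n", "2000/q\n", "2001/q\n"]
joined = sep.join(items)
print(joined)
```

The first line of the output has no separator in front of it, which is the same effect as the stray first line in the question.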
It's good to understand why this is happening.
For that you need to understand the join function; see the docs:
Concatenate a list or tuple of words with intervening occurrences of
sep.
That means your url string is repeated between the words you want to concatenate, which produces the output above, where the first element has no URL in front of it.
What you want is not join, but to concatenate the strings, as you are already doing with the year.
There are different ways to do that, as the other answers show.
You can use string formatting, as pointed out by @AKS, and it should work.
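For comparison, plain concatenation could look like the sketch below. The name base is only illustrative; it holds the prefix that the question stored in url.

```python
base = "http://ergast.com/api/f1/"
# Prepend the base to every year instead of using it as a separator.
urls = [base + str(year) + "/qualifying?limit=10000" for year in range(2000, 2016)]
print("\n".join(urls))
```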
Related
Good evening,
I have a python variable like so
myList = ["['Ben'", " 'Dillon'", " 'Rawr'", " 'Mega'", " 'Tote'", " 'Case']"]
I would like it to look like this instead
myList = ['Ben', 'Dillon', 'Rawr', 'Mega', 'Tote', 'Case']
If I do something like this
','.join(myList)
It gives me what I want but the type is a String
I would also like it to keep the type list. I have tried using the join and split methods, and I have been debugging with type(), which tells me that the original value is a list.
I appreciate any and all help on this.
Join the inner list elements, then call ast.literal_eval() to parse it as a list of strings.
import ast
myList = ast.literal_eval(",".join(myList))
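Run against the list from the question, this could look like the sketch below. The names joined and parsed are only for illustration.

```python
import ast

myList = ["['Ben'", " 'Dillon'", " 'Rawr'", " 'Mega'", " 'Tote'", " 'Case']"]

# Re-insert the commas so the pieces form one list literal again.
joined = ",".join(myList)
# literal_eval parses Python literals only; it does not run arbitrary code.
parsed = ast.literal_eval(joined)

print(type(parsed))
print(parsed)
```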
This can also be done by slicing the strings, which avoids importing ast.
myList[5] = (myList[5])[:-1]  # drop the trailing ] from the last item
for n in range(0, len(myList)):
    myList[n] = (myList[n])[2:-1]  # drop the two leading characters and the closing quote
I need help using Python.
Supposing I have the list [22,23,45].
Is it possible to get an output like this: [22;23:45] ?
It's possible to change the delimiters if you display your list as a string. You can then use the join method. The following example will display your list with ; as a delimiter:
print(";".join(my_list))
This will only work if your list's items are strings, by the way.
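If the items are integers, as in the question, one option is to convert each item with str before joining. A minimal sketch, where my_list is assumed to hold the numbers from the question:

```python
my_list = [22, 23, 45]
# str.join accepts only strings, so convert every item before joining.
result = ";".join(str(item) for item in my_list)
print(result)  # 22;23;45
```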
Even if you have more than one item
str(your_list[:1][0])+";" + ":".join(map(str,your_list[1:])) #'22;23:45'
Not sure why you want to wrap it in a list, but if you do, put the string above inside square brackets. Note that calling list() on a string splits it into single characters.
my list which was [22,23,45] returned [;2;2;,; ;2;3;,; ;4;5;,] for both methods.
To bring more information, I have a variable:
ID= [elements['id'] for elements in country]
Using print(ID), I get [22,23,45], so I suppose that the list is already in this form.
The problem is that I need a different delimiter, because [22,23,45] corresponds to ['EU, UK', 'EU, Italy', 'USA, California'].
The output I wish is [22,23,45] --> ['EU, UK'; 'EU, Italy'; 'USA, California']
I don't know if it's clearer but hope it could help
Try this; I'm not sure exactly what you want.
First solution:
# avoid naming variables list or str, which would shadow the built-ins
numbers = [22, 23, 45]
result = ""
for i in numbers:
    result += "{}{}".format(i, ";")
result = result[:-len(";")]  # drop the trailing delimiter
print(result)
And this is the second solution:
numbers = [22, 23, 45]
print(numbers)
text = str(numbers)
parts = text.split(",")
text = ";".join(parts)
print(text)
I have a list with one item in it, which I then try to dismantle and rebuild.
Not really sure if it is the 'right' way, but for now it will do.
I tried using replace \ substitute, other means of manipulating the list, but it didn't go too far, so this is what I came up with:
This is the list I get : alias_account = ['account-12345']
I then use this code to remove the [' in the front , and '] from the back.
NAME = ('%s' % alias_account).split(',')
for x in NAME:
    key = x.split("-")[0]
    value = x.split("-")[1]
    alias_account = value[:-2]
    alias_account1 = key[2:]
    alias_account = ('%s-%s') % (alias_account1, alias_account)
This works beautifully when running print alias_account.
The problem starts when I have a list that has ['acc-ount-12345'] or ['account']
So my question is, how to include all of the possibilities?
Should I use try\except with other split options?
or is there more fancy split options ?
To access a single list element, you can index its position in square brackets:
alias_account[0]
To hide the quotes marking the result as a string, you can use print():
print(alias_account[0])
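A short sketch showing that indexing does not depend on how many hyphens the name contains. The sample values are the ones mentioned in the question; samples and names are illustrative names.

```python
samples = [['account-12345'], ['acc-ount-12345'], ['account']]

names = []
for alias_account in samples:
    # Take the only element; no splitting or slicing of a string representation.
    name = alias_account[0]
    names.append(name)
    print(name)
```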
I have the following code that is filtering and printing a list. The final output is json that is in the form of name.example.com. I want to substitute that with name.sub.example.com but I'm having a hard time actually doing that. filterIP is a working bit of code that removes elements entirely and I have been trying to re-use that bit to also modify elements, it doesn't have to be handled this way.
def filterIP(fullList):
    regexIP = re.compile(r'\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}$')
    return filter(lambda i: not regexIP.search(i), fullList)
def filterSub(fullList2):
    regexSub = re.compile(r'example\.com, sub.example.com')
    return filter(lambda i: regexSub.search(i), fullList2)
groups = {key : filterSub(filterIP(list(set(items)))) for (key, items) in groups.iteritems() }
print(self.json_format_dict(groups, pretty=True))
This is what I get without filterSub
"type_1": [
"server1.example.com",
"server2.example.com"
],
This is what I get with filterSub
"type_1": [],
This is what I'm trying to get
"type_1": [
"server1.sub.example.com",
"server2.sub.example.com"
],
The statement:
regexSub = re.compile(r'example\.com, sub.example.com')
doesn't do what you think it does. It creates a compiled regular expression that matches the string "example.com" followed by a comma, a space, the string "sub", an arbitrary character, the string "example", an arbitrary character, and the string "com". It does not create any sort of substitution.
Instead, you want to write something like this, using the re.sub function to perform the substitution and using map to apply it:
def filterSub(fullList2):
    regexSub = re.compile(r'example\.com')
    return map(lambda i: re.sub(regexSub, "sub.example.com", i),
               filter(lambda i: re.search(regexSub, i), fullList2))
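A self-contained sketch of the same idea for Python 3, where map and filter return lazy iterators, so the result is wrapped in list(). The sample host names come from the question, the IP entry is made up, and the function name substitute_sub is hypothetical.

```python
import re

def substitute_sub(full_list):
    # Match the literal domain; the dot is escaped so it is not a wildcard.
    regex_sub = re.compile(r'example\.com')
    # Keep only entries containing the domain, then rewrite it.
    matching = filter(regex_sub.search, full_list)
    return list(map(lambda name: regex_sub.sub("sub.example.com", name), matching))

hosts = ["server1.example.com", "server2.example.com", "10.0.0.1"]
result = substitute_sub(hosts)
print(result)
```

Note that an entry that already contains sub.example.com would be rewritten a second time, so this sketch assumes the input has not been processed before.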
If the examples are all truly as simple as those you listed, a regex is probably overkill. A simple solution would be to use string .split and .join. This would likely give better performance.
First split the url at the first period:
url = 'server1.example.com'
split_url = url.split('.', 1)
# ['server1', 'example.com']
Then you can use the sub to rejoin the url:
subbed_url = '.sub.'.join(split_url)
# 'server1.sub.example.com'
Of course you can do the split and the join at the same time
'.sub.'.join(url.split('.', 1))
Or create a simple function:
def sub_url(url):
    return '.sub.'.join(url.split('.', 1))
To apply this to the list you can take several approaches.
A list comprehension:
subbed_list = [sub_url(url)
               for url in url_list]
Map it:
subbed_list = map(sub_url, url_list)
Or my favorite, a generator:
gen_subbed = (sub_url(url)
              for url in url_list)
The last looks like a list comprehension but gives the added benefit that you don't rebuild the entire list. It processes the elements one item at a time as the generator is iterated through. If you decide you do need the list later you can simply convert it to a list as follows:
subbed_list = list(gen_subbed)
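Put together, the approach above can be sketched as follows; url_list here is sample data taken from the question.

```python
def sub_url(url):
    # Split at the first period only, then rejoin with '.sub.' in between.
    return '.sub.'.join(url.split('.', 1))

url_list = ['server1.example.com', 'server2.example.com']

# The generator produces items lazily; list() consumes it.
gen_subbed = (sub_url(url) for url in url_list)
subbed_list = list(gen_subbed)
print(subbed_list)
```

A generator can be consumed only once, so after list(gen_subbed) it yields nothing further.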
I have this script:
import urllib.request
from bs4 import BeautifulSoup
url= 'https://www.inforge.net/xi/forums/liste-proxy.1118/'
soup = BeautifulSoup(urllib.request.urlopen(url), "lxml")
base = ("https://www.inforge.net/xi/")
for tag in soup.find_all('a', {'class':'PreviewTooltip'}):
    links = (tag.get('href'))
    final = base + links
print (final[0])
which takes every link of the topics in this page.
The problem is that when I print(final[0]) the output is:
h
instead of the entire link. Can someone help me with this?
final has a type of str, as such, indexing it in position 0 will result in the first character of the url getting printed, specifically h.
You either need to print all of final if you're using it as a str:
print(final)
or, if you must have a list, make final a list in the for loop by enclosing it in square brackets []:
final = [base + links]
then print(final[0]) will print the first element of the list as you'd expect.
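If the goal is to keep every link rather than only the last one, a common pattern is to append to a list inside the loop. This is a sketch without the network request: hrefs stands in for the values returned by tag.get('href'), and its paths are made up.

```python
base = "https://www.inforge.net/xi/"
# Hypothetical href values; in the real script they come from tag.get('href').
hrefs = ["threads/example-one.1/", "threads/example-two.2/"]

final = []
for links in hrefs:
    final.append(base + links)

print(final[0])
```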
As @Bryan pointed out and I just noticed, it seems like you might be confused about the usage of () in Python. Without a comma , inside the (), they do nothing beyond grouping. If you add the comma, it turns them into tuples (not lists; lists use square brackets []).
So:
base = ("https://www.inforge.net/xi/")
results in base referring to a value of str type while:
base = ("https://www.inforge.net/xi/", )
# which can also be written as:
base = "https://www.inforge.net/xi/",
results in base referring to a value of tuple type with a single element.
The same applies for the name links:
links = (tag.get('href')) # 'str'
links = (tag.get('href'), ) # 'tuple'
If you change links and base to be tuples then final is going to end up as a 2 element tuple after final = base + links is executed. So, in this case you should join the elements inside the tuple during your print call:
print ("".join(final)) # takes all elements in final and joins them together
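The difference is easy to check with type(). A minimal sketch, where the path in the second tuple is hypothetical:

```python
base_str = ("https://www.inforge.net/xi/")      # parentheses only: still a str
base_tuple = ("https://www.inforge.net/xi/",)   # trailing comma: a 1-element tuple

print(type(base_str).__name__)
print(type(base_tuple).__name__)

# Adding two tuples concatenates them; join glues the elements into one string.
final = base_tuple + ("threads/example.1/",)    # hypothetical path
print("".join(final))
```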