How to iterate an expression? - python

This is in relation to web scraping, specifically Scrapy. I want to be able to iterate an expression to create my items. As an example, let's say I import the item class as "item". In order to then store an item, I would have to code something like:
item['item_name'] = response.xpath('xpath')
My response is actually a function so it actually looks something like:
item['item_name'] = eval(xpath_function(n))
This works perfectly. However, how can I iterate this to create multiple items with different names without having to manually name each one? The code below does not work at all (and I didn't expect it to), but should give you an idea of what I am trying to accomplish:
for n in range(1, 10):
    f"item['item_name{n}'] = eval(xpath_function(n))"
Basically, I am trying to create 10 different items named item_name1 through item_name10. I hope that makes sense, and I appreciate any help.

If you are just creating keys for your dictionary based on the value of n you could try something like:
for n in range(10):
    item['item_name' + str(n+1)] = eval(xpath_function(n+1))
If you need to format the number (e.g. include leading zeros), you could use an f-string rather than concatenating the strings as I did.
[NB your for loop as written will only run from 1 to 9, so I have changed this in my answer.]
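The f-string variant mentioned above could look roughly like the sketch below. The body of xpath_function here is a hypothetical stand-in (the real one is not shown in the question), a plain dict stands in for the Scrapy item, and the two-digit zero padding is only an example format.

```python
# Hypothetical stand-in for the asker's xpath_function; the real one is not shown.
def xpath_function(n):
    return f"value-{n}"

# A plain dict stands in for the Scrapy item here (assumption).
item = {}

# range(1, 11) runs from 1 to 10 inclusive.
for n in range(1, 11):
    # {n:02d} pads the number with a leading zero: item_name01 ... item_name10
    item[f"item_name{n:02d}"] = xpath_function(n)

print(list(item)[:3])
```

Whether the keys should be padded at all depends on how they are consumed later; without padding, f"item_name{n}" gives the same keys as the string concatenation in the answer.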

Related

Adding strings together adds brackets and quotation marks

I'm new to programming and trying to learn it by doing small projects. Currently I'm working on a random string generator and I have it 99% done, but I can't get the output to be the way I want it to be.
First, here is the code:
import random
def pwgenerator():
    print("This is a random password generator.")
    print("Enter the length of your password and press enter to generate a password.")
    lenght = int(input())
    template = "abcdefghijklmnopqrstuvwxyz01234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!"  # this is the sample used for choosing random characters
    generator = ""
    for x in range(lenght):  # the join function goes through the str chronologically and I want it fully randomized, so I made this loop
        add_on = str(random.sample(template, 1))
        generator = generator + add_on
        #print(add_on) added this and next one to test if these are already like list or still strings.
        #print(generator)
    print(generator)  # I wanted this to work, but...
    for x in range(lenght):  #...created this, because I thought that I created a list with "generator" and tried to print out a normal string with this
        print(generator[x], end="")
pwgenerator()
The original code was supposed to be this:
import random
def pwgenerator():
    print("This is a random password generator.")
    print("Enter the length of your password and press enter to generate a password.")
    lenght = int(input())
    template = "abcdefghijklmnopqrstuvwxyz01234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!"
    generator = ""
    for x in range(lenght):
        generator = generator + str(random.sample(template, 1))
    print(generator)
pwgenerator()
The problem is that with this original code and, for example, an input of 10, I get this result:
['e']['3']['i']['E']['L']['I']['3']['r']['l']['2']
What I would want as output here is "e3iELI3rl2".
As you can see in the first code, I tried a few things, because it looked to me like I was somehow creating a list with lists as items, each having one entry. So I thought I would just print out each item, but the result was (for a user input/length of 10):
['e']['3']
So it just printed out each character in that list as a string (including the brackets and quotation marks), which I interpret as whatever I created not being a list, but actually still a string.
Doing some research, and assuming I still created a string, I found this from W3Schools. If I understand it correctly, though, I'm doing everything right in trying to add strings together.
Can you please tell me what's going on here, specifically why I get output that looks like a list of lists?
And if you can spare some more time, I'd also like to hear about a better way to do this, but I mainly want to understand what's going on rather than be given a solution. I'd like to find a solution myself. :D
Cheers
PS:
Just in case you are wondering: I'm trying to learn by doing and am currently following the suggested mini projects from HERE. But in this case I read on W3Schools that the "join" method produces chronological results, so I added the additional complication of making it really random.
Okay, so the problem is that random.sample returns a list of strings instead of a string, as you can see below:
template = "abcdefghijklmnopqrstuvwxyz01234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!"
random.sample(template, 1)
Out[5]: ['H']
What actually happened is that your code was concatenating the string form of each list (e.g. ['H'] was converted to "['H']" and then printed on the screen as ['H']).
After changing the function to random.choice it worked fine:
random.choice(template)
Out[6]: 'j'
Switch random.sample to random.choice in your function and it should behave as you expected.
The random.sample() function returns a list of items chosen from the given string.
That's why you're getting a bunch of lists stacked together.
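Putting the two answers together, a minimal sketch of the fixed generator might look like this. The name generate_password and the length parameter (used instead of input() so the function is easy to call) are assumptions, not part of the original code; the template is the asker's.

```python
import random

TEMPLATE = "abcdefghijklmnopqrstuvwxyz01234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!"

# str() of a one-element list keeps the brackets and quotes,
# which is where the ['e']['3']... output came from.
as_text = str(["H"])  # the five characters [ ' H ' ]

def generate_password(length):
    # random.choice returns a single character (a str), so the pieces
    # can be joined directly; each pick is independent of the others.
    return "".join(random.choice(TEMPLATE) for _ in range(length))

password = generate_password(10)
print(password)
```

Because every character is drawn independently, join does not make the result any less random; it only glues the already-random picks together in the order they were drawn.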

Get results from a for loop in Python

I am new to coding in Python and ran into an issue.
I have a list of domain names that I would like to get whois lookup information of.
I am using a for loop to get whois information on every domain in the list called domain_name like this:
for i in domain_name:
    print(whois.whois(i))
I am getting the results printed just fine. But I would like to save those results in a variable that I can make a list or dataframe out of.
How do I go about doing that?
thank you!
A list comprehension is appropriate here; it is useful if you are starting with one list and want to create a new one.
my_results = [whois.whois(i) for i in domain_name]
This will create a new list with the whois results.
Define the list you want to store them in before the loop, then append to it inside the loop:
my_container = []
for domain in domain_name:
    my_container.append(whois.whois(domain))
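If it matters later which result belongs to which domain, one option is to collect the results in a dict keyed by domain instead of a plain list. The sketch below uses a hypothetical fake_whois function in place of whois.whois so that it runs without network access; the domain names and returned fields are made up for illustration.

```python
# Hypothetical stand-in for whois.whois, so the sketch runs offline.
def fake_whois(domain):
    return {"domain_name": domain.upper(), "registrar": "Example Registrar"}

domain_name = ["example.com", "example.org"]

# List of results, in the same order as domain_name.
results_list = [fake_whois(d) for d in domain_name]

# Dict of results, so each lookup stays attached to its domain.
results_by_domain = {d: fake_whois(d) for d in domain_name}

print(results_by_domain["example.org"]["domain_name"])
```

A list of dict-like records such as results_list is also a common starting shape for building a dataframe afterwards.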

How can I remove certain strings from a list based on the strings in another list, if those strings differ slightly? More info below

This seems like a pretty rudimentary question, but I'm wondering because the items in these lists change every so often when a website is scraped...
employees = ['leadership(x)', 'drivers(y)', 'trainers(z)']
Where x,y,z are the number of employees in those specific roles, and are the values that change every so often.
If I know that the strings will always be 'leadership' 'drivers' and 'trainers', just with a difference in what's in between the parentheses, how can I dynamically remove these strings without having to hardcode it every week that I run the program?
The obvious but not so successful solution is...
employees = ['leadership(x)', 'drivers(y)', 'trainers(z)']
unwanted = ['leadership(x)', 'drivers(y)', 'trainers(z)']
for i in unwanted:
    if i in employees:
        employees.remove(i)
This of course fails because the values are hardcoded and are bound to change. Any help with this would be greatly appreciated!
You could do something like
unwanted_prefixes = ['leadership', 'drivers', 'trainers']
unwanted = [s for s in employees if s.split('(')[0] in unwanted_prefixes]
This will make the list of things to delete contain any string beginning with those 3 prefixes and either containing nothing else or immediately followed by a parenthesis.
If that one deletes strings that you want to keep, a more complicated solution follows roughly the same idea, but with a regex:
import re
unwanted_re = re.compile(r'(leadership|drivers|trainers)\(\d+\)')
unwanted = [x for x in employees if unwanted_re.fullmatch(x)]
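Both answers build the list of unwanted strings; to actually drop them, the condition can be inverted so that only the other entries are kept. The sketch below uses made-up sample data (including the extra entries "mechanics(5)" and "drivers(old)") to show where the two approaches differ.

```python
import re

# Hypothetical sample data; the numbers stand in for the scraped counts.
employees = ["leadership(12)", "drivers(40)", "trainers(7)",
             "mechanics(5)", "drivers(old)"]

unwanted_prefixes = {"leadership", "drivers", "trainers"}

# Prefix approach: drop anything whose text before "(" is an unwanted role.
kept_by_prefix = [s for s in employees
                  if s.split("(")[0] not in unwanted_prefixes]

# Regex approach: drop only exact "role(digits)" strings.
unwanted_re = re.compile(r"(leadership|drivers|trainers)\(\d+\)")
kept_by_regex = [s for s in employees if not unwanted_re.fullmatch(s)]

print(kept_by_prefix)
print(kept_by_regex)
```

The prefix version removes "drivers(old)" as well, while the regex version keeps it because the parentheses do not contain digits only; which behaviour is right depends on what the scraped page can actually contain.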

How to make a dynamic argument/variable in a Python loop?

I have a function(x). I also have several dictionaries that will be passed as the argument x of function(). I need to build the name for x dynamically in a for loop. In PHP this used to be simple, but here I cannot find a simple way. I see posts about creating new dictionaries etc., which seems very complicated. Am I missing something? I thought Python was extra simple.
EDIT:
I need this in a dynamic way where a number is replaced with i:
for i in range(1, 100):
    function(x1)
I cannot write function(x1), function(x2), function(x3) 99 times. How can I incorporate i in a simple way without creating dictionaries, lists, etc.?
Maybe it is not possible the way I want, because x1, x2, x3, ... x99 are dictionaries and also objects, so I cannot generate their names simply by taking x and appending i to make x1. Or can I?
EDIT ends.
I need to add i at the end of a dictionary name:
for i in range(1, 100):
    z = 'n'+str(i) # or something like this
    function(m.''.i) # this is only an attempt, which is incorrect.
As others have pointed out in comments, this is probably an indication that you need to refactor the rest of your code so that the data is passed as a list rather than 99 individual variables.
If you do need to access all the values like this, and they're fields on an object, you can use the getattr() function:
for i in range(1, 100):
    z = 'n'+str(i)
    function(getattr(m, z))
However, if at all possible, do modify the rest of the code so that the data is in a list or similar.
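As a rough illustration of both options, the sketch below first keeps the dictionaries in a list and loops over it directly, then shows the getattr() lookup on an object with numbered attributes. The body of function, the sample dictionaries, and the use of SimpleNamespace for m are all assumptions made so the example runs on its own.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the asker's function(x).
def function(d):
    return sum(d.values())

# Preferred: keep the dictionaries in a list and iterate over it.
datasets = [{"a": 1}, {"a": 2}, {"a": 3}]
totals_from_list = [function(d) for d in datasets]

# Fallback: the data already lives in numbered attributes n1, n2, n3 on m.
m = SimpleNamespace(n1={"a": 1}, n2={"a": 2}, n3={"a": 3})
totals_from_attrs = [function(getattr(m, f"n{i}")) for i in range(1, 4)]

print(totals_from_list, totals_from_attrs)
```

With the list, adding a hundredth dictionary needs no change to the loop, which is the main reason the answer recommends refactoring toward it.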

Python list.remove items present in second list

I've searched around and most of the errors I see are when people are trying to iterate over a list and modify it at the same time. In my case, I am trying to take one list, and remove items from that list that are present in a second list.
import pymysql
schemaOnly = ["table1", "table2", "table6", "table9"]
db = pymysql.connect(my connection stuff)
tables = db.cursor()
tables.execute("SHOW TABLES")
tablesTuple = tables.fetchall()
tablesList = []
# I do this because there is no way to remove items from a tuple
# which is what I get back from tables.fetchall
for item in tablesTuple:
    tablesList.append(item)
for schemaTable in schemaOnly:
    tablesList.remove(schemaTable)
When I put various print statements in the code, everything looks proper and like it is going to work. But when it gets to the actual tablesList.remove(schemaTable), I get the dreaded ValueError: list.remove(x): x not in list.
If there is a better way to do this I am open to ideas. It just seemed logical to me to iterate through the list and remove items.
Thanks in advance!
** Edit **
Everyone in the comments and the first answer is correct. The reason this fails is that the conversion from a tuple to a list creates a very badly formatted list, so nothing matches when trying to remove items in the next loop. The solution to this issue was to take the first item from each tuple and put those into a list like so: tablesList = [x[0] for x in tablesTuple]. Once I did this, the second loop worked and the table names were correctly removed.
Thanks for pointing me in the right direction!
I assume that fetchall returns tuples, one for each database row matched.
Now the problem is that the elements in tablesList are tuples, whereas schemaOnly contains strings. Python does not consider these to be equal.
Thus when you attempt to call remove on tablesList with a string from schemaTable, Python cannot find any such value.
You need to inspect the values in tablesList and find a way to convert them to strings. I suspect it would be by simply taking the first element out of each tuple, but I do not have a MySQL database at hand, so I cannot test that.
Regarding your question, if there is a better way to do this: Yes.
Instead of adding items to the list, and then removing them, you can append only the items that you want. For example:
for item in tablesTuple:
    # each row is a one-element tuple, so compare its first element (the table name)
    if item[0] not in schemaOnly:
        tablesList.append(item[0])
Also, schemaOnly can be written as a set, to improve search complexity from O(n) to O(1):
schemaOnly = {"table1", "table2", "table6", "table9"}
This will only be meaningful with big lists, but in my experience it's useful semantically.
And finally, you can write the whole thing in one list comprehension:
tablesList = [item[0] for item in tablesTuple if item[0] not in schemaOnly]
And if you don't need to keep repetitions (or if there aren't any in the first place), you can also do this, with schemaOnly defined as a set as above:
tablesSet = {item[0] for item in tablesTuple} - schemaOnly
This also has the best big-O complexity of all these variations.
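To see the tuple-versus-string mismatch without a database, the sketch below hard-codes rows in the shape the asker's edit describes (one single-element tuple per table). The row values and the snake_case names are illustrative assumptions.

```python
# Hypothetical rows, shaped like the fetchall() result described in the edit.
rows = (("table1",), ("table2",), ("table3",), ("table6",), ("table7",))
schema_only = {"table1", "table2", "table6", "table9"}

# A one-element tuple never equals the string inside it,
# which is why list.remove() could not find a match.
mismatch = ("table1",) == "table1"  # False

# Unpack the names first, then filter.
table_names = [row[0] for row in rows]
remaining = [name for name in table_names if name not in schema_only]

# Set difference gives the same names when order and duplicates do not matter.
remaining_set = set(table_names) - schema_only

print(remaining)
```

Note that "table9" appears in schema_only but not in the rows; filtering with "not in" simply ignores it, whereas calling remove() for it would raise the same ValueError.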
