I have searched quite thoroughly and have not found a suitable solution. I am new to Python and programming, so I appreciate any advice I can get.
I am trying to look up a string in a set of strings. Here is what I am trying to do, but I am not getting the value back:
string_set = {'"123", "456", "789"'}
value = '123'
values_list = []
def fun():
    for i in string_set:
        if i in value:
            output=LookupTables.get('dynamo-table', i, {})
            return output
fun()
With the above, if value is in the string set, the function should return the matching item from my DynamoDB table.
Note: there could be more than 5000 values in my table, so I want to return as early as possible.
Maybe you should remove the extra pair of single quotes first:
string_set = {'"123", "456", "789"'} # this set has just one value '"123", "456", "789"'
string_set_fixed = {"123", "456", "789"}
I'm assuming you just want to check whether 123 is in "123", "456", "789", since you had them wrapped in single quotes.
To represent that, let's use:
strset = {"123", "456", "789"}
What if you have to use that odd variable as it is?
This should make it usable:
strset = {'"123", "456", "789"'}
removed = next(iter(strset))
strset.update((removed).split())
strset.remove(removed)
strset = set([i.strip(",").strip('"') for i in strset])
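Another, shorter way (exec runs whatever the string contains, so use it only on trusted input; it also only rebinds strset when run at module level, not inside a function):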
strset = {'"123", "456", "789"'}
exec(f"strset = {next(iter(strset))}")
print("123" in strset)
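If the string might come from an untrusted source, ast.literal_eval from the standard library is a safer alternative to exec, because it only parses literals. A minimal sketch, assuming the single element always looks like two or more comma-separated quoted strings:

```python
import ast

strset = {'"123", "456", "789"'}
# The lone element parses as a tuple literal: ("123", "456", "789").
# Assumption: it holds at least two items; a single quoted item would parse as a str.
parsed = ast.literal_eval(next(iter(strset)))
strset = set(parsed)
print("123" in strset)
```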
Now, to check if value is in there:
if value in strset:
    #do code here
Try this:
string_set = {"123", "456", "789"}
value = '123'
values_list = []
def fun():
    if value in string_set:
        output = LookupTables.get('dynamo-table', value, {})
        return output
fun()
Explanation:
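Your definition of string_set contains an extraneous pair of ' ', so the set holds one long string instead of three separate ones.
With two strings, i in value is a substring test: it checks whether i occurs anywhere inside value, not whether the two strings are equal. What you want is a membership test of the whole value against the set.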
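The difference between the two kinds of in can be seen in a small sketch (the values are the ones from the question; "23" is only there for illustration):

```python
string_set = {"123", "456", "789"}
value = "123"

# Between two strings, `in` is a substring test.
print("23" in value)        # True: "23" occurs inside "123"
# Against a set, `in` checks whether the whole string is an element.
print("23" in string_set)   # False
print(value in string_set)  # True
```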
I'm trying to use something like a ternary operator in Python to check whether a dictionary value exists and use it, or else leave it blank. For example, in the code below I want to get the values of creator and assignee; if a value doesn't exist I want it to be ''. Is there a way to use a ternary operator in Python?
Here's my code:
in_progress_response = requests.request("GET", url, headers=headers, auth=auth).json()
issue_list = []
for issue in in_progress_response['issues'] :
    # return HttpResponse( json.dumps( issue['fields']['creator']['displayName'] ) )
    issue_list.append(
        {
            "id": issue['id'],
            "key": issue['key'],
            # DOESN'T WORK
            "creator": issue['fields']['creator']['displayName'] ? '',
            "is_creator_active": issue['fields']['creator']['active'] ? '',
            "assignee": issue['fields']['assignee']['displayName'] ? '',
            "is_assignee_active": issue['fields']['assignee']['active'] ? '',
            "updated": issue['fields']['updated'],
        }
    )
return issue_list
Python's ternary operator (a conditional expression) works as follows:
condition = True
foo = 3.14 if condition else 0
But for your particular use case, you should consider using dict.get(). The first argument specifies what you are trying to access, and the second argument specifies a default return value if the key does not exist in the dictionary.
some_dict = {'a' : 1}
foo = some_dict.get('a', '') # foo is 1
bar = some_dict.get('b', '') # bar is ''
You can use .get(…) to try to fetch an item from a dictionary and return an optional default value if the dictionary does not contain the given key. You can thus implement this as:
"creator": issue.get('fields', {}).get('creator', {}).get('displayName', ''),
The same goes for the other items.
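One caveat with chained .get() calls: the default is only used when the key is missing. If a key is present but holds None, .get() returns None and the next .get() raises AttributeError. A minimal sketch of a helper that covers both cases; the name dig and the sample data are hypothetical, and whether the API in the question ever returns such None values is an assumption:

```python
def dig(mapping, *keys, default=''):
    """Follow keys through nested dicts; return default on a missing or None step."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict):
            return default
        current = current.get(key)
    return default if current is None else current

# Hypothetical sample shaped like one issue from the question's response.
issue = {
    'id': '1',
    'fields': {'creator': {'displayName': 'Ann', 'active': True}, 'assignee': None},
}
print(dig(issue, 'fields', 'creator', 'displayName'))         # Ann
print(repr(dig(issue, 'fields', 'assignee', 'displayName')))  # ''
```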
If you want to use something like a ternary,
you can write:
value = issue['fields']['creator']['displayName'] if issue['fields']['creator'] else ""
Basically, what I am trying to do is generate a JSON list of SSH keys (public and private) on a server using Python. I am using nested dictionaries and, while it works to an extent, the problem is that each user's entry displays every other user's keys; I need each user's entry to list only the keys that belong to that user.
Below is my code:
def ssh_key_info(key_files):
    for f in key_files:
        c_time = os.path.getctime(f) # gets the creation time of file (f)
        username_list = f.split('/') # splits on the / character
        user = username_list[2] # index 2 of the split is the user name in /home/<user>/...
        key_length_cmd = check_output(['ssh-keygen','-l','-f', f]) # Run the ssh-keygen command on the file (f)
        attr_dict = {}
        attr_dict['Date Created'] = str(datetime.datetime.fromtimestamp(c_time)) # converts file create time to string
        attr_dict['Key_Length]'] = key_length_cmd[0:5] # assigns the first 5 characters of the key_length_cmd variable
        ssh_user_key_dict[f] = attr_dict
        user_dict['SSH_Keys'] = ssh_user_key_dict
        main_dict[user] = user_dict
A list containing the absolute path of the keys (/home/user/.ssh/id_rsa for example) is passed to the function. Below is an example of what I receive:
{
    "user1": {
        "SSH_Keys": {
            "/home/user1/.ssh/id_rsa": {
                "Date Created": "2017-03-09 01:03:20.995862",
                "Key_Length]": "2048 "
            },
            "/home/user2/.ssh/id_rsa": {
                "Date Created": "2017-03-09 01:03:21.457867",
                "Key_Length]": "2048 "
            },
            "/home/user2/.ssh/id_rsa.pub": {
                "Date Created": "2017-03-09 01:03:21.423867",
                "Key_Length]": "2048 "
            },
            "/home/user1/.ssh/id_rsa.pub": {
                "Date Created": "2017-03-09 01:03:20.956862",
                "Key_Length]": "2048 "
            }
        }
    },
As can be seen, user2's key files are included in user1's output. I may be going about this completely wrong, so any pointers are welcomed.
Thanks for the replies. I read up on nested dictionaries and found that the best answer on this post helped me solve the issue: What is the best way to implement nested dictionaries?
Instead of all the dictionaries, I simplified the code and now have just one dictionary. This is the working code:
class Vividict(dict):
    def __missing__(self, key): # Sets and return a new instance
        value = self[key] = type(self)() # retain local pointer to value
        return value # faster to return than dict lookup
main_dict = Vividict()
def ssh_key_info(key_files):
    for f in key_files:
        c_time = os.path.getctime(f)
        username_list = f.split('/')
        user = username_list[2]
        key_bit_cmd = check_output(['ssh-keygen','-l','-f', f])
        date_created = str(datetime.datetime.fromtimestamp(c_time))
        key_type = key_bit_cmd[-5:-2]
        key_bits = key_bit_cmd[0:5]
        main_dict[user]['SSH Keys'][f]['Date Created'] = date_created
        main_dict[user]['SSH Keys'][f]['Key Type'] = key_type
        main_dict[user]['SSH Keys'][f]['Bits'] = key_bits
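A likely cause of the original problem (an assumption, since the definitions of user_dict and ssh_user_key_dict are not shown) is that the same dictionary objects were reused for every user, so each user ended up pointing at one shared collection of keys. The per-user grouping can also be sketched with collections.defaultdict. The function name group_keys_by_user is hypothetical, and the file system and ssh-keygen calls are left out so the sketch runs on its own:

```python
from collections import defaultdict

def group_keys_by_user(key_files):
    """Group key paths by the user name found in /home/<user>/..."""
    grouped = defaultdict(dict)       # a fresh inner dict for each new user
    for path in key_files:
        user = path.split('/')[2]     # '/home/user1/.ssh/id_rsa' -> 'user1'
        grouped[user][path] = {}      # attributes (date, bits) would go here
    return dict(grouped)

paths = ['/home/user1/.ssh/id_rsa', '/home/user2/.ssh/id_rsa']
print(group_keys_by_user(paths))
```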
I'm looping through a list of web pages with Scrapy. Some of the pages that I scrape return errors. I want to keep track of the various error types, so I have set up my function to first check whether any of a series of error conditions (which I have placed in a dictionary) is true and, if none is, proceed with normal page scraping:
def parse_detail_page(self, response):
    error_value = False
    output = ""
    error_cases = {
        "' pageis not found' in response.body" : 'invalid',
        "'has been transferred' in response.body" : 'transferred',
    }
    for key, value in error_cases.iteritems():
        if bool(key):
            error_value = True
            output = value
    if error_value:
        for field in J1_Item.fields:
            if field == 'case':
                item[field] = id
            else:
                item[field] = output
    else:
        item['case'] = id
        ........................
However, I see that even when none of the error cases apply, the 'invalid' option is selected. What am I doing wrong?
Your conditions (something in response.body) are not evaluated. Instead, you evaluate the truth value of a nonempty string, which is True.
This might work:
def parse_detail_page(self, response):
    error_value = False
    output = ""
    error_cases = {
        "pageis not found" : 'invalid',
        "has been transferred" : 'transferred',
    }
    for key, value in error_cases.iteritems():
        if key in response.body:
            error_value = True
            output = value
            break
    .................
(Must it be "pageis not found" or "page is not found"?)
bool(key) will convert key from a string to a bool.
What it won't do is actually evaluate the condition. You could use eval() for that, but I'd recommend instead storing a list of functions (each returning an object or throwing an exception) rather than your current dict-with-string-keys-that-are-actually-Python-code.
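The list-of-functions idea could look roughly like the sketch below. It is a simplified variant in which each function returns True or False rather than an object or an exception; the names classify and error_checks are hypothetical, and a plain string stands in for response.body:

```python
def classify(body, checks):
    """Return the label of the first check whose predicate matches, else None."""
    for predicate, label in checks:
        if predicate(body):
            return label
    return None

# Each entry pairs a callable with the label to record when it returns True.
# The search strings are copied from the question as written.
error_checks = [
    (lambda body: 'pageis not found' in body, 'invalid'),
    (lambda body: 'has been transferred' in body, 'transferred'),
]

print(classify('This case has been transferred.', error_checks))  # transferred
print(classify('A normal page', error_checks))                    # None
```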
I'm not sure why you are evaluating bool(key) like that. Let's look at your error_cases. You have two keys and two values. "' pageis not found' in response.body" will be the key in the first iteration of your for loop, and "'has been transferred' in response.body" will be the key in the second. Neither of those will be false when you check bool(key), because any non-empty string is truthy.
>>> a = "' pageis not found' in response.body"
>>> bool(a)
True
You need a different test than bool(key) there, or the error branch will always run.
Your conditions are strings, so they are not evaluated.
You could evaluate your strings with eval(key), but that is quite unsafe.
With the help of the operator module, there is no need to evaluate unsafe strings (as long as your conditions stay quite simple).
error['operator'] holds a reference to the contains function, which can be used as a replacement for in. Note the argument order: contains(a, b) is the same as b in a.
from operator import contains
class ...:
    def parse_detail_page(self, response):
        error_value = False
        output = ""
        error_cases = [
            {'search': ' pageis not found', 'operator': contains, 'output': 'invalid' },
            {'search': 'has been transferred', 'operator': contains, 'output': 'transferred' },
        ]
        for error in error_cases:
            # contains(a, b) means "b in a", so the body goes first
            if error['operator'](response.body, error['search']):
                error_value = True
                output = error['output']
                print(output)
        if error_value:
            for field in J1_Item.fields:
                if field == 'case':
                    item[field] = id
                else:
                    item[field] = output
        else:
            item['case'] = id
            ...
I'm writing a Python scraper for OpenData and I have one question: how do I check whether all values are filled in on the site and, if a value is missing, set it to null?
My scraper is here.
I'm currently working on optimizing it.
My variables now look like this:
evcisloval = soup.find_all('td')[3].text.strip()
prinalezival = soup.find_all('td')[5].text.strip()
popisfaplnenia = soup.find_all('td')[7].text.replace('\"', '')
hodnotafaplnenia = soup.find_all('td')[9].text[:-1].replace(",", ".").replace(" ", "")
datumdfa = soup.find_all('td')[11].text
datumzfa = soup.find_all('td')[13].text
formazaplatenia = soup.find_all('td')[15].text
obchmenonazov = soup.find_all('td')[17].text
sidlofirmy = soup.find_all('td')[19].text
pravnaforma = soup.find_all('td')[21].text
sudregistracie = soup.find_all('td')[23].text
ico = soup.find_all('td')[25].text
dic = soup.find_all('td')[27].text
cislouctu = soup.find_all('td')[29].text
And the output:
scraperwiki.sqlite.save(unique_keys=["invoice_id"],
    data={ "invoice_id":number,
           "invoice_price":hodnotafaplnenia,
           "evidence_no":evcisloval,
           "paired_with":prinalezival,
           "invoice_desc":popisfaplnenia,
           "date_received":datumdfa,
           "date_payment":datumzfa,
           "pay_form":formazaplatenia,
           "trade_name":obchmenonazov,
           "trade_form":pravnaforma,
           "company_location":sidlofirmy,
           "court":sudregistracie,
           "ico":ico,
           "dic":dic,
           "accout_no":cislouctu,
           "invoice_attachment":urlfa,
           "invoice_url":url})
I googled it but without success.
First, write a configuration dict of your variables in the form:
conf = {'evidence_no': (3, str.strip),
        'trade_form': (21, None),
        ...}
That is, the key is the output key and the value is a tuple of the index into soup.find_all('td') and an optional function to apply to the result (None if there is none). You don't need those Slavic variable names, which may confuse other SO members.
Then iterate over conf and fill the data dict.
Also, run soup.find_all('td') before the loop.
tds = soup.find_all('td')
data = {}
for name, (num, func) in conf.iteritems():
    text = tds[num].text
    # replace text with None or "NULL" or whatever if needed
    ...
    if func is None:
        data[name] = text
    else:
        data[name] = func(text)
This will remove a lot of duplicated code. Easier to maintain.
Also, I am not sure the strings "NULL" are the best way to write missing data. Doesn't sqlite support Python's real None objects?
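Put together, the configuration-driven loop might look like the sketch below. A plain list of made-up strings stands in for the text of the cells from soup.find_all('td'), the indices are illustrative, and empty cells are stored as None (which Python's sqlite3 module maps to NULL; whether scraperwiki does the same is an assumption):

```python
# Stand-in for [td.text for td in soup.find_all('td')]; the values are made up.
cells = ['', ' 12/2015 ', 'label', '']

# output key -> (cell index, optional clean-up function)
conf = {
    'evidence_no': (1, str.strip),
    'trade_form': (3, None),
}

data = {}
for name, (num, func) in conf.items():
    text = cells[num]
    if func is not None:
        text = func(text)
    data[name] = text or None   # an empty string becomes None

print(data)
```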
Just read your attached link, and it seems what you want is
evcisloval = soup.find_all('td')[3].text.strip() or "NULL"
But be careful: you should only do this with strings. Any falsy value before the or (an empty string, False, None, or 0) will be replaced with "NULL".
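A small sketch of that behaviour; the helper name text_or_null is hypothetical:

```python
def text_or_null(text):
    """Return the stripped text, or the string "NULL" if nothing is left."""
    return text.strip() or "NULL"

print(text_or_null("  42/2015 "))  # 42/2015
print(text_or_null("   "))         # NULL
print(0 or "NULL")                 # NULL (a legitimate 0 is replaced too)
```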
Is there any way to check every element of a list comprehension in a clean and elegant way?
For example, if I have some db result which may or may not have a 'loc' attribute, is there any way to have the following code run without crashing?
db_objs = SQL("query")
top_scores = [{"name":obj.name, "score":obj.score, "latitude":obj.loc.lat, "longitude":obj.loc.lon} for obj in db_objs]
If there is any way to fill these fields in as None, the empty string, or anything else, that would be very nice. Python tends to be a magical thing, so if any of you have sage advice it would be much appreciated.
Clean and unified solution:
from operator import attrgetter as _attrgetter
def attrgetter(attrname, default=None):
    getter = _attrgetter(attrname)
    def wrapped(obj):
        try:
            return getter(obj)
        except AttributeError:
            return default
    return wrapped
GETTER_MAP = {
    "name":attrgetter('name'),
    "score":attrgetter('score'),
    "latitude":attrgetter('loc.lat'),
    "longitude":attrgetter('loc.lon'),
}
def getdict(obj):
    return dict(((k,v(obj)) for (k,v) in GETTER_MAP.items()))
if __name__ == "__main__":
    db_objs = SQL("query")
    top_scores = [getdict(obj) for obj in db_objs]
    print(top_scores)
Try this:
top_scores = [{"name":obj.name,
               "score":obj.score,
               "latitude": obj.loc.lat if hasattr(obj, "loc") else 0,
               "longitude":obj.loc.lon if hasattr(obj, "loc") else 0}
              for obj in db_objs]
Or, in your query set a default value.
It's not pretty, but getattr() should work:
top_scores = [
    {
        "name": obj.name,
        "score": obj.score,
        "latitude": getattr(getattr(obj, "loc", None), "lat", None),
        "longitude": getattr(getattr(obj, "loc", None), "lon", None),
    }
    for obj in db_objs
]
This will set the dict item with key "latitude" to obj.loc.lat (and so on) if it exists; if it doesn't (and even if obj.loc doesn't exist), it'll be set to None.
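To try the nested getattr() approach without a database, here is a small sketch in which types.SimpleNamespace objects stand in for the query results (the sample rows are made up):

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the database rows in the question.
db_objs = [
    SimpleNamespace(name="a", score=10, loc=SimpleNamespace(lat=1.5, lon=2.5)),
    SimpleNamespace(name="b", score=7),  # no 'loc' attribute at all
]

top_scores = [
    {
        "name": obj.name,
        "score": obj.score,
        "latitude": getattr(getattr(obj, "loc", None), "lat", None),
        "longitude": getattr(getattr(obj, "loc", None), "lon", None),
    }
    for obj in db_objs
]
print(top_scores)
```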