Issue with strings and dictionaries in Python

I'm using the click package to get input for one or more variables which get loaded in as a combined dictionary. Each entry is then joined and the combined string is added to the end of a base URL and sent through the requests package to receive some xml data.
Earlier I had an issue with one of the variables that let you search through a range, such as
[value1, value2]
Python added double quotes around it, so the search function didn't operate correctly. I used
.replace('"', '')
on the joined string before combining it with the base URL, and that seemed to fix the problem. The issue now is that input containing more than one word doesn't produce the same output as the actual search engine online. I have to use quotes when I enter the information to keep it as a single argument, but the quotes then get removed by the call above, and I believe that is what is causing the issue.
I think if I have a way to access individual entries of this dictionary and remove the double quotes from only certain entries then that should get the job done. But if I am overlooking something please let me know.
Help is appreciated.
Code added below:
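import click
import requests

@click.command()
@click.option("--variable1")
@click.option("--variable2")
def main(variable1, variable2):  # function name is illustrative
    query_list = [variable1, variable2]
    query = "".join(query_list)
    base_url = "abc.com...."
    # The second positional argument of requests.get is params
    response = requests.get(base_url, query)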
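One possible way to remove the double quotes from only certain entries is to go through the dictionary by key before joining. This is a minimal sketch, not a tested fix for the real search service: the helper name build_query, the parameter strip_quotes_from, and the sample values are all hypothetical.

```python
def build_query(options, strip_quotes_from):
    """Join option values, removing double quotes only for the named keys."""
    parts = []
    for name, value in options.items():
        if name in strip_quotes_from:
            # Only these entries (e.g. the range search) lose their quotes.
            value = value.replace('"', "")
        parts.append(value)
    return "".join(parts)


# Hypothetical input: a range value and a multi-word value, both quoted.
options = {"variable1": '"[value1, value2]"', "variable2": '"two words"'}
query = build_query(options, strip_quotes_from={"variable1"})
print(query)
```

The multi-word entry keeps its quotes because its key is not listed in strip_quotes_from.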


Replace multiple values in string using replace, sub string and find in Python

I'm pulling issues from our Jira project and I need to replace URLs in the description with newly formatted URLs.
The old description contains the old SharePoint server URLs, so I need to change them to our new SharePoint Online URLs.
I decided to use Python to make use of the Atlassian plugin.
Here is a version of how the description currently looks in the Jira issue:
Good day
we need a new validation on the External Reference when doing work pads or amending manually when we do refund, it seems that the user that updates this is using Tab or Enter and therefore the payment files fails ,
we need users to be validated while updating refund reference same way as we limited claims payments for updating invoice numbers
thank you
regards
*BRS & FRS:*
[BRS_FRS_PS_ACC_Payments_v20.0|http://portal.mycompany.local/mycompany/someproject/_layouts/15/start.aspx#/someproject/Forms/AllItems.aspx?RootFolder=%2Fmycompany%2Fsomeproject%2Fsomeproject%2F06%20Solution%20Documentation%2F03%20Accounting%2C%20Etc%2F07%20TIA%2FBRS%20%26%20FRS&View=%7B9A71C976%2D85D3%2D4D34%2D828B%2DE5B1B428EA5E%7D]
[BRS_FRS_PS_ACC_Workpads_Manual_Write_Off_and_Incomming_Paument_v5.0|http://portal.mycompany.local/mycompany/someproject/_layouts/15/start.aspx#/someproject/Forms/AllItems.aspx?RootFolder=%2Fmycompany%2Fsomeproject%2Fsomeproject%2F06%20Solution%20Documentation%2F03%20Accounting%2C%20Etc%2F07%20TIA%2FBRS%20%26%20FRS&View=%7B9A71C976%2D85D3%2D4D34%2D828B%2DE5B1B428EA5E%7D]
*Sign Offs:*
[BRS_FRS_PS_ACC_Payments_v20.0|http://portal.mycompany.local/mycompany/someproject/_layouts/15/start.aspx#/someproject/Forms/AllItems.aspx?RootFolder=%2Fmycompany%2Fsomeproject%2Fsomeproject%2F06%20Solution%20Documentation%2F03%20Accounting%2C%20Etc%2F07%20TIA%2FBRS%20%26%20FRS%2FSign%2Doffs%2FPayments%2FV20%2E0&FolderCTID=0x0120005C60D5FB65C2C84191CB5ACDFD820AA6&View=%7B9A71C976%2D85D3%2D4D34%2D828B%2DE5B1B428EA5E%7D]
[BRS_FRS_PS_ACC_Workpads_Manual_Write_Off_and_Incomming_Paument_v5.0|http://portal.mycompany.local/mycompany/someproject/_layouts/15/start.aspx#/someproject/Forms/AllItems.aspx?RootFolder=%2Fmycompany%2Fsomeproject%2Fsomeproject%2F06%20Solution%20Documentation%2F03%20Accounting%2C%20Etc%2F07%20TIA%2FBRS%20%26%20FRS%2FSign%2Doffs%2FWorkpads%20%26%20incoming%20payments%2FV5%2E0&FolderCTID=0x0120005C60D5FB65C2C84191CB5ACDFD820AA6&View=%7B9A71C976%2D85D3%2D4D34%2D828B%2DE5B1B428EA5E%7D]
*Technical Documentation:*
N/A
*Unit Testing:*
[TU_dd-1821|http://portal.mycompany.local/mycompany/someproject/SitePages/Home.aspx?RootFolder=%2Fmycompany%2Fsomeproject%2Fsomeproject%2F06%20Solution%20Documentation%2F03%20Accounting%2C%20Etc%2F07%20TIA%2FUnit%20Testing&FolderCTID=0x0120005C60D5FB65C2C84191CB5ACDFD820AA6&View=%7B5AF02A9E%2D451A%2D443D%2DB8CA%2DAF7C7ED6F00C%7D]
This is how I pulled in the issue from Jira (my plan is to scan through all issues and update them):
from jira import JIRA
import re
jira = JIRA(server=('https://mycompanydev.atlassian.net'),basic_auth=('user', 'password'))
issue = jira.issue("S1-3000")
print("Ticket nr: ", issue)
olddescription = issue.fields.description
newdescription = olddescription
I managed to change the first part of the URL with this line:
newdescription = newdescription.replace("http://portal.mycompany.local/mycompany/someproject/_layouts/15/start.aspx#/someproject/Forms/AllItems.aspx?RootFolder=%2Fmycompany%2Fsomeproject%2F", "https://somecompany.sharepoint.com/sites/CCPortal/")
and this line:
newdescription = newdescription.replace("http://portal.mycompany.local/mycompany/someproject/SitePages/Home.aspx?RootFolder=%2Fmycompany%2Fsomeproject%2F", "https://mycompany.sharepoint.com/sites/CCPortal/")
This code completes successfully and changes the URL as intended.
Now I need to remove the end of the URL, starting from the string "View=" and the string "FolderCTID=".
My line of code to do this:
newdescription = newdescription.replace(newdescription[newdescription.find("View=")-1:newdescription.find("]")],"")
and:
newdescription = newdescription.replace(newdescription[newdescription.find("FolderCTID="):newdescription.find("]")], "")
For some reason it only does the first 2 URLs.
The result looks like this:
Good day
we need a new validation on the External Reference when doing work pads or amending manually when we do refund, it seems that the user that updates this is using Tab or Enter and therefore the payment files fails ,
we need users to be validated while updating refund reference same way as we limited claims payments for updating invoice numbers
thank you
regards
*BRS & FRS:*
[BRS_FRS_PS_ACC_Payments_v20.0|https://mycompany.sharepoint.com/sites/mycompany/someproject%2F06%20Solution%20Documentation%2F03%20Accounting%2C%20Etc%2F07%20TIA%2FBRS%20%26%20FRS]
[BRS_FRS_PS_ACC_Workpads_Manual_Write_Off_and_Incomming_Paument_v5.0|https://mycompany.sharepoint.com/sites/mycompany/someproject%2F06%20Solution%20Documentation%2F03%20Accounting%2C%20Etc%2F07%20TIA%2FBRS%20%26%20FRS]
*Sign Offs:*
[BRS_FRS_PS_ACC_Payments_v20.0|https://mycompany.sharepoint.com/sites/mycompany/someproject%2F06%20Solution%20Documentation%2F03%20Accounting%2C%20Etc%2F07%20TIA%2FBRS%20%26%20FRS%2FSign%2Doffs%2FPayments%2FV20%2E0&FolderCTID=0x0120005C60D5FB65C2C84191CB5ACDFD820AA6]
[BRS_FRS_PS_ACC_Workpads_Manual_Write_Off_and_Incomming_Paument_v5.0|https://mycompany.sharepoint.com/sites/mycompany/someproject%2F06%20Solution%20Documentation%2F03%20Accounting%2C%20Etc%2F07%20TIA%2FBRS%20%26%20FRS%2FSign%2Doffs%2FWorkpads%20%26%20incoming%20payments%2FV5%2E0&FolderCTID=0x0120005C60D5FB65C2C84191CB5ACDFD820AA6]
*Technical Documentation:*
N/A
*Unit Testing:*
[TU_MV-1821|https://mycompany.sharepoint.com/sites/mycompany/someproject%2F06%20Solution%20Documentation%2F03%20Accounting%2C%20Etc%2F07%20TIA%2FUnit%20Testing&FolderCTID=0x0120005C60D5FB65C2C84191CB5ACDFD820AA6&View=%7B5AF02A9E%2D451A%2D443D%2DB8CA%2DAF7C7ED6F00C%7D]
As you can see, the code removed the first 2 "View=" strings along with everything trailing them to the end of the link.
I can't figure out where I went wrong. I also tried putting this in a while loop, and just repeating the code 5 times as a test.
str.find returns the lowest index where the substring is found. So if newdescription has more than one "]", presumably because it contains more than one link, that means the returned index will only be correct for the first link.
str.find also accepts an optional start/end index to limit the search, so you can use the index of "View=" as an offset for the search for "]":
offset = newdescription.find("View=")
replace_me = newdescription[offset:newdescription.find("]", offset)]
newdescription = newdescription.replace(replace_me, "")
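Since the question already imports re, another option is a single re.sub call that strips the trailing parameters from every link at once. This is a sketch under assumptions: the helper name strip_link_params is hypothetical, and the pattern assumes each parameter starts with "&" or "?" and runs up to the next "&" or the closing "]".

```python
import re

# Assumed shape: "&FolderCTID=..." or "&View=..." (or the same with "?"),
# ending just before the next "&" or the closing "]" of the link.
PARAM_RE = re.compile(r"[&?](?:FolderCTID|View)=[^\]&]*")


def strip_link_params(description):
    """Remove FolderCTID and View parameters from every link in the text."""
    return PARAM_RE.sub("", description)


sample = "[a|https://x/y%2Fz&FolderCTID=0x01&View=%7B9A%7D]\n[b|https://x/q&View=%7B5A%7D]"
print(strip_link_params(sample))
```

Because re.sub replaces every match, no loop or offset bookkeeping is needed. Encoded ampersands such as %26 are left alone, since they do not contain a literal "&".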

Python Iterating Array of URI to retrieve JSON

I have a dynamically created array of URIs that I want to iterate through for retrieval of json data, which I'll then use to search a particular field for a specific value. Unfortunately, I keep getting syntax errors.
for i in list_url_id:{
var t = requests.get(base_url+i+'?limit='+str(count),auth=HTTPBasicAuth(uname,pw)).json()
print(t)
}
If I do a print(i) in the loop, it prints the full URL out properly. I'm lost.
EDIT:
base_url is a URL similar to https://www.abcdef.com:1443
the URI in list_url_id is a URI similar to /v1/messages/list/0293842
I have no issue (as mentioned) concatenating them in a print call, but when the same string is used for requests.get, it returns a nondescript syntax error.
Python does not use braces to delimit blocks; it reads the braces as the start of a dictionary or set literal, which is probably what causes the syntax error.
In Python, the body of a for loop is marked by indentation.
for i in list_url_id:
    t = requests.get(base_url + i + '?limit=' + str(count), auth=HTTPBasicAuth(uname, pw)).json()
    print(t)
This should work for you. Note that var has also been removed, as it is not valid Python syntax. Python variables do not need an explicit declaration to reserve memory space; the declaration happens automatically when you assign a value to a variable.

How to convert a string containing a list to a regular list in Python?

I have the following function call in Apache Airflow:
from airflow.utils.email import send_email
send_email(to=['a#mail.com','b#mail.com'],
subject=....)
This works great.
Now I don't want to hard-code the email list, so I save it as a configurable field that the user can change from the UI.
So I changed my code to:
NOTIFY_LIST = Variable.get("a_emails")
send_email([NOTIFY_LIST],
subject=....)
But this doesn't work.
When I do:
logging.info(NOTIFY_LIST)
logging.info([NOTIFY_LIST])
logging.info(NOTIFY_LIST.split(','))
I see:
'a#mail.com', 'b#mail.com'
[u"'a#mail.com', 'b#mail.com'"]
[u"'a#mail.com'", u" 'b#mail.com'"]
So my problem is that:
['a#mail.com','b#mail.com']
and
[NOTIFY_LIST]
aren't the same.
How can I fix this? I tried every conversion I could think of.
Try the following:
logging.info(NOTIFY_LIST.replace("'", "").split(','))
The problem here is that the elements in the list contain quote marks.
The other answer will fail if there's a ' in the middle of one of the strings. To fix that, I'd use str.strip:
logging.info([s.strip("' ") for s in NOTIFY_LIST.split(',')])
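A minimal sketch that wraps the str.strip approach above in a helper, so the result can be passed straight to send_email. The name parse_email_list is hypothetical, and the sample string mirrors the value logged for NOTIFY_LIST in the question.

```python
def parse_email_list(raw):
    """Split a comma-separated string and strip quotes and spaces from each item."""
    items = [part.strip("' ") for part in raw.split(",")]
    # Drop empty items, e.g. from a trailing comma.
    return [item for item in items if item]


notify_list = parse_email_list("'a#mail.com', 'b#mail.com'")
print(notify_list)
```

Because str.strip only removes characters from the two ends, an apostrophe inside an address is left untouched.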

Output list of links grouped by extension or base URL - built on regex using python.

I've been working on this assignment for a while now. The regex is not particularly difficult, but I don't quite follow how to get the output they want.
Your program should:
Read the html of a webpage (which has been stored as textfile);
Extract all the domains referred to and list all the full http addresses related to these domains;
Extract all the resource types referred to and list all the full http addresses related to these resource types.
Please solve the task using regular expressions and re functions/methods. I suggest using ‘finditer’ and ‘groups’ (there might be other possibilities). Please do not use string functions where re is better suited.
The output is supposed to look like this
www.fairfaxmedia.co.nz
    http://www.fairfaxmedia.co.nz
www.essentialmums.co.nz
    http://www.essentialmums.co.nz/
    http://www.essentialmums.co.nz/
    http://www.essentialmums.co.nz/
www.nzfishingnews.co.nz
    http://www.nzfishingnews.co.nz/
www.nzlifeandleisure.co.nz
    http://www.nzlifeandleisure.co.nz/
www.weatherzone.co.nz
    http://www.weatherzone.co.nz/
www.azdirect.co.nz
    http://www.azdirect.co.nz/
i.stuff.co.nz
    http://i.stuff.co.nz/
ico
    http://static.stuff.co.nz/781/3251781.ico
zip
    http://static2.stuff.co.nz/1392867595/static/jwplayer/skin/Modieus.zip
mp4
    http://file2.stuff.co.nz/1394587586/272/9819272.mp4
I really need help with how to filter stuff out so the output shows up like that?
Create a list of tuples (keyword, url).
Sort it by keyword.
Group the tuples per keyword using itertools.groupby.
For each keyword, print the keyword and then all of its URLs (the URLs printed indented).
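The steps above can be sketched roughly as follows. The two regular expressions are assumptions about how links appear in the stored HTML (absolute http URLs inside double quotes, with a file extension at the very end of the path), and the function names grouped and report are hypothetical.

```python
import re
from itertools import groupby
from operator import itemgetter

# Assumed pattern: a quoted absolute URL, captured as (url, domain, path).
URL_RE = re.compile(r'"(https?://([^/"]+)(/[^"]*)?)"')
# Assumed pattern: a file extension at the very end of the path.
EXT_RE = re.compile(r"\.([A-Za-z0-9]+)$")


def grouped(pairs):
    """Sort (keyword, url) tuples by keyword and collect the URLs per keyword."""
    ordered = sorted(pairs, key=itemgetter(0))
    return [(key, [url for _, url in group])
            for key, group in groupby(ordered, key=itemgetter(0))]


def report(html):
    """Build the text report: domains first, then resource types."""
    domains, extensions = [], []
    for match in URL_RE.finditer(html):
        url, domain, path = match.groups()
        domains.append((domain, url))
        ext = EXT_RE.search(path or "")
        if ext:
            extensions.append((ext.group(1), url))
    lines = []
    for keyword, urls in grouped(domains) + grouped(extensions):
        lines.append(keyword)
        lines.extend("    " + url for url in urls)
    return "\n".join(lines)
```

Sorting before groupby matters because groupby only merges adjacent items with the same key. The sort is stable, so URLs under one keyword keep the order in which they appear in the page.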

BioPython Pubmed Eutils url?

I'm trying to run some queries against Pubmed's Eutils service. If I run them on the website I get a certain number of records returned, in this case 13126 (link to pubmed).
A while ago I bodged together a Python script to build a query to do much the same thing, and the resultant URL returns the same number of hits (link to Eutils result).
Of course, not having any formal programming background, it was all a bit kludgy, so I'm trying to do the same thing using Biopython. I think the following code should do the same thing, but it returns a greater number of hits, 23303.
from Bio import Entrez
Entrez.email = "A.N.Other#example.com"
handle = Entrez.esearch(db="pubmed", term="stem+cell[All Fields]",datetype="pdat", mindate="2012", maxdate="2012")
record = Entrez.read(handle)
print(record["Count"])
I'm fairly sure it's just down to some subtlety in how the URL is being generated, but I can't work out how to see what URL is being generated by Biopython. Can anyone give me some pointers?
Thanks!
EDIT:
It's something to do with how the URL is being generated, as I can get back the original number of hits by modifying the code to include double quotes around the search term, thus:
handle = Entrez.esearch(db='pubmed', term='"stem+cell"[ALL]', datetype='pdat', mindate='2012', maxdate='2012')
I'm still interested in knowing what URL is being generated by Biopython, as it'll help me work out how I have to structure the search term when I want to do more complicated searches.
handle = Entrez.esearch(db="pubmed", term="stem+cell[All Fields]",datetype="pdat", mindate="2012", maxdate="2012")
print(handle.url)
You've solved this already (Entrez likes explicit double quoting round combined search terms), but currently the URL generated is not exposed via the API. The simplest trick would be to edit the Bio/Entrez/__init__.py file to add a print statement inside the _open function.
Update: Recent versions of Biopython now save the URL as an attribute of the returned handle, i.e. in this example try doing print(handle.url)
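To get a feel for how a search term might look once it is URL-encoded, the standard library can be used to build a comparable query string by hand. This is only a sketch and not Biopython's own code: it assumes the parameters are encoded with urllib.parse.urlencode, and preview_esearch_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Public E-utilities esearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def preview_esearch_url(**params):
    """Return the URL that standard form encoding would produce for params."""
    return ESEARCH_URL + "?" + urlencode(params)


print(preview_esearch_url(db="pubmed", term="stem+cell[All Fields]"))
print(preview_esearch_url(db="pubmed", term='"stem+cell"[ALL]'))
```

Under this encoding a literal "+" in the term becomes %2B while a space becomes "+", which may be worth comparing against the URL reported by handle.url.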
