I'm trying to create a script that:
Open Browser
-> Go to a website (login page)
-> Auto-login (filling in the email and password details from a csv file)
-> Close Tab
-> Re-open the website
-> Auto-login again, but with the second account (filling in details from the csv file's SECOND ROW).
...
Re-do the same tasks 50 times (from account 1 to 50, for example)
import pandas as pd
from selenium import webdriver

# Open Browser and go to the facebook login page
browser = webdriver.Chrome(r'C:\Users\Hamza\Desktop\Python\chromedriver')
browser.get('https://facebook.com')

# Import csv file
data = pd.read_excel(r'C:\Users\Hamza\Desktop\testcsv.xlsx')
I actually went to the Facebook website, pulled the page source, and quickly wrote a little something extra to log you in to the website:
import pandas as pd
from selenium import webdriver

# Open Browser and go to the facebook login page
browser = webdriver.Chrome(r'C:\Users\Hamza\Desktop\Python\chromedriver')
browser.get('https://facebook.com')

# Import csv file
data = pd.read_excel(r'C:\Users\Hamza\Desktop\testcsv.xlsx')
Username = data.username
Password = data.password

a = 0
while True:
    # Sends username
    id_box = browser.find_element_by_id('email')
    id_box.send_keys(Username[a])
    # Sends password
    Pass_box = browser.find_element_by_id('pass')
    Pass_box.send_keys(Password[a])
    # Click login
    Send = browser.find_element_by_css_selector('#u_0_3')
    Send.click()
    try:
        # If the login form is still present, the attempt failed: clear it and retry
        id_box = browser.find_element_by_id('email')
        Pass_box = browser.find_element_by_id('pass')
        id_box.clear()
        Pass_box.clear()
    except:
        print("logged in")
        break
    a = a + 1
However, this assumes that your csv file has the data saved in columns named username and password, so you might have to tweak it.
The best way to do this, in my opinion, is to open the CSV file with csv.DictReader.
Here's my code; it may help you. Ignore the field names, they are just from the file I'm working on.
import csv

with open(r'C:\Users\Hamza\Desktop\testcsv.csv', 'rt') as f:
    data = csv.DictReader(f)
    for detail in data:
        numberOfBedrooms = detail['numberOfBedrooms']
        numberOfBathrooms = detail['numberOfBathrooms']
        pricePerMonth = detail['pricePerMonth']
        adress = detail['adress']
        description = detail['description']
        square_feet = detail['square_feet']
        # Locate the field by its id, or fall back on its label text
        bedrooms = driver.find_element_by_xpath('//*[@id="jsc_c_12" or text()="Number of bathrooms"]')
        bedrooms.send_keys(numberOfBathrooms)
Loop through your data, store the values you want in variables, then use the variables with send_keys, just like bedrooms.send_keys(numberOfBathrooms) in the example.
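Putting those pieces together, a minimal sketch of the whole 50-account loop could look like the following. The username/password column names and the Facebook element locators are assumptions carried over from the snippets above, so adjust them to match your CSV and the real page:

import csv
from selenium import webdriver

driver = webdriver.Chrome(r'C:\Users\Hamza\Desktop\Python\chromedriver')

with open(r'C:\Users\Hamza\Desktop\testcsv.csv', 'rt') as f:
    for row in csv.DictReader(f):           # one row per account: username, password
        driver.get('https://facebook.com')  # re-open the login page for each account
        driver.find_element_by_id('email').send_keys(row['username'])
        driver.find_element_by_id('pass').send_keys(row['password'])
        driver.find_element_by_css_selector('#u_0_3').click()
        # ... do whatever is needed while logged in ...
        driver.delete_all_cookies()         # drop the session before the next account

driver.quit()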
I am an absolute beginner when it comes to working with REST APIs in Python. We have received a SharePoint URL which has multiple folders and multiple files inside those folders in the 'Documents' section. I have been provided an 'app_id' and a 'secret_token'.
I am trying to access the .csv files and read them as a dataframe and perform operations.
The code for the operations is ready, after I downloaded the .csv and did it locally, but I need help with how to connect to SharePoint using Python so that I don't have to download such heavy files ever again.
I know there have already been multiple questions about this on Stack Overflow, but none helped me get to where I want.
I did the following, and I am unsure of what to do next:
import json
from office365.runtime.auth.user_credential import UserCredential
from office365.sharepoint.client_context import ClientContext
from office365.runtime.http.request_options import RequestOptions
site_url = "https://<company-name>.sharepoint.com"
ctx = ClientContext(site_url).with_credentials(UserCredential("{app_id}", "{secret_token}"))
Above, for site_url, should I use the whole URL, or is it fine up to ####.com?
This is what I have so far; next, I want to read files from the respective folders and convert them into a dataframe. The files will always be in .csv format.
The example hierarchy of the folders is as follows:
Documents --> Folder A, Folder B
Folder A --> a1.csv, a2.csv
Folder B --> b1.csv, b2.csv
I should be able to move to whichever folder I want and read the files based on my requirement.
Thanks for the help.
This works for me, using a Sharepoint App Identity with an associated client Id and client Secret.
First, I demonstrate authenticating and reading a specific file, then getting a list of files from a folder and reading the first one.
import pandas as pd
import json
import io
from office365.runtime.auth.client_credential import ClientCredential
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File
#Authentication (shown for a 'modern teams site', but I think it should also work for a company.sharepoint.com site):
site="https://<myteams.companyname.com>/sites/<site name>/<sub-site name>"
#Read credentials from a json configuration file:
spo_conf = json.load(open(r"conf\spo.conf", "r"))
client_credentials = ClientCredential(spo_conf["RMAppID"]["clientId"],spo_conf["RMAppID"]["clientSecret"])
ctx = ClientContext(site).with_credentials(client_credentials)
#Read a specific CSV file into a dataframe:
folder_relative_url = "/sites/<site name>/<sub site>/<Library Name>/<Folder Name>"
filename = "MyFileName.csv"
response = File.open_binary(ctx, "/".join([folder_relative_url, filename]))
df = pd.read_csv(io.BytesIO(response.content))
#Get a list of file objects from a folder and read one into a DataFrame:
def getFolderContents(relativeUrl):
    contents = []
    library = ctx.web.get_list(relativeUrl)
    all_items = library.items.filter("FSObjType eq 0").expand(["File"]).get().execute_query()
    for item in all_items:  # type: ListItem
        cur_file = item.file
        contents.append(cur_file)
    return contents
fldrContents = getFolderContents('/sites/<site name>/<sub site>/<Library Name>')
response2 = File.open_binary(ctx, fldrContents[0].serverRelativeUrl)
df2 = pd.read_csv(io.BytesIO(response2.content))
Some References:
Related SO thread.
Office365 library github site.
Getting a list of contents in a doc library folder.
Additional notes following up on comments:
The site path doesn't include the full URL for the site home page (ending in .aspx) - it just ends with the name of the site (or sub-site, if relevant to your case).
You don't need to use a configuration file to store your authentication credentials for the Sharepoint application identity - you could just replace spo_conf["RMAppID"]["clientId"] with the value for the Sharepoint-generated client Id and do similarly for the client Secret. But this is a simple example of what the text of a JSON file could look like:
{
    "MyAppName": {
        "clientId": "my-client-id",
        "clientSecret": "my-client-secret",
        "title": "name_for_application"
    }
}
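For completeness, the no-configuration-file variant mentioned above collapses to a couple of lines; the id and secret strings here are placeholders for the values generated by SharePoint:

from office365.runtime.auth.client_credential import ClientCredential
from office365.sharepoint.client_context import ClientContext

# Placeholder values: paste in the SharePoint-generated client id and secret
client_credentials = ClientCredential("my-client-id", "my-client-secret")
ctx = ClientContext("https://<company-name>.sharepoint.com/sites/<site name>").with_credentials(client_credentials)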
I am doing automation on the website of the company I work for.
I have run into a difficulty. I think it may be simple to solve, but I couldn't come up with a solution.
I have this code example:
from selenium import webdriver
from time import sleep
import csv

URL = 'XXXXX'
URL2 = 'XXXXX'
user = 'XXXXX'
password = 'XXXXX'
filename = './geradores.csv'

def Autonomation():
    driver = webdriver.Ie()
    driver.get(URL)
    driver.find_element_by_name('Login').send_keys(user)
    driver.find_element_by_name('password').send_keys(password)
    sleep(5)
    driver.execute_script("window.open()")
    driver.switch_to.window(driver.window_handles[1])
    driver.get(URL2)
    driver.maximize_window()
    with open(filename, 'r') as f:
        reader = csv.DictReader(f, delimiter=';')
        for linha in reader:
            folder = linha['Pasta']
            rsc = linha['Rascunho']
            driver.find_element_by_link_text('Geradores').click()
            sleep(5)
            driver.find_element_by_name('gErador').send_keys(folder)
            driver.find_element_by_name('bloco').send_keys(rsc)
            driver.find_element_by_id('salva').click()
            driver.find_element_by_link_text('Começo').click()

if __name__ == '__main__':
    while True:  # this part causes the code to reload
        try:
            Autonomation()
        except:
            # driver is local to Autonomation, so it cannot be quit here; just restart
            Autonomation()
The problem I face is that when the code is reloaded automatically, it reads the first line of the CSV again, and it can't save the same folder twice.
What I want: when the code is automatically reloaded, it should start reading on the same line where it stopped.
Example: if the code is reading line 200 when the page timeout is reached and the code is automatically reloaded, it should pick up where it left off, on line 200.
The number of rows in the CSV: 5000K.
Timeout = 40 min.
I even thought of reading the CSV in a separate file and importing that file as a module in Autonomation.
From what I understand:
Your code is doing "data entry" work.
You are reading CSV data to fill in a web page form.
The problem is that the code starts reading the data from the beginning again when the page refreshes.
Try to read the data all at once, and write all the completed records to a file.
Option 1:
The simple option - try to split this large CSV file into smaller files. Say, if you can update 550 records before the page goes for a refresh, then update 500 records per file and wait out the page refresh, as sketched below.
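A minimal sketch of that splitting step, assuming pandas and the semicolon-delimited file from the question (the 500-row chunk size is just an example):

import pandas as pd

# Write the big CSV back out in 500-row pieces
for i, chunk in enumerate(pd.read_csv('./geradores.csv', delimiter=';', chunksize=500)):
    chunk.to_csv('./geradores_part_%04d.csv' % i, sep=';', index=False)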
Option 2:
Can you actually check whether the page is going to refresh? If it is possible, then do this:
Keep a counter of how many records have been updated. When the page is about to refresh, save this counter to a temp file.
Now update your code to check whether that temp file is present and, if so, read the counter, skip that number of records, and continue the work.
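A minimal sketch of that counter-and-temp-file idea, assuming the CSV layout from the question (the checkpoint file name is arbitrary):

import csv
import os

CHECKPOINT = './progress.tmp'  # arbitrary temp-file name

def load_checkpoint():
    # How many records were already processed (0 if no checkpoint yet)
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return int(f.read().strip() or 0)
    return 0

def save_checkpoint(count):
    with open(CHECKPOINT, 'w') as f:
        f.write(str(count))

done = load_checkpoint()
with open('./geradores.csv', 'r') as f:
    reader = csv.DictReader(f, delimiter=';')
    for i, linha in enumerate(reader):
        if i < done:
            continue            # skip rows finished before the reload
        # ... fill in the form for this row, as in the question's code ...
        save_checkpoint(i + 1)  # record progress after each successful row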
So I am trying to download a file from an API, which will be in csv format.
I generate a link from user inputs and store it in a variable exportLink:
import requests
#getProjectName
projectName = raw_input('ProjectName')
#getApiToken
apiToken = "mytokenishere"
#getStartDate
startDate = raw_input('Start Date')
#getStopDate
stopDate = raw_input('Stop Date')
url = "https://api.awrcloud.com/get.php?action=export_ranking&project=%s&token=%s&startDate=%s&stopDate=%s" % (projectName,apiToken,startDate,stopDate)
exportLink = requests.get(url).content
exportLink will store the generated link, which I must then call with another requests.get() command to download the csv file.
When I click the link it opens the download in a browser; is there any way to automate this so it opens the zip and I can begin to edit the csv using Python, i.e. removing some stuff?
If you have a bytes object zipdata that you got with requests.get(url).content, you can extract it file by file into another bytes object:
import zipfile
import io
import csv

with zipfile.ZipFile(io.BytesIO(zipdata)) as z:
    for f in z.filelist:
        csvdata = z.read(f)
and then do something with csvdata
reader = csv.reader(io.StringIO(csvdata.decode()))
...
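Putting the two requests together, an end-to-end sketch could look like this; it assumes the body of the first response is exactly the download link:

import io
import csv
import zipfile
import requests

url = "https://api.awrcloud.com/get.php?action=export_ranking&..."  # built as in the question
exportLink = requests.get(url).content.decode().strip()

zipdata = requests.get(exportLink).content  # download the zip the link points at
with zipfile.ZipFile(io.BytesIO(zipdata)) as z:
    for f in z.filelist:
        rows = list(csv.reader(io.StringIO(z.read(f).decode())))
        # ... edit rows here (e.g. remove unwanted columns), then write them back out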
I have a problem connecting to the Excel API in Windows 10. I use Office365 and, with it, Excel 2016. My goal is to download a CSV file from a client's FTPS server, merge it with the existing files, perform some actions on it (with pandas), and then load the whole data into Excel and do reporting with it. Up to the point of loading it into Excel everything is fine: I managed to do all the steps automatically with Python (sorry if my code looks a little cluttered - I am new to Python).
import subprocess
import os
import ftplib
import fnmatch
import sys
from ftplib import FTP_TLS
from win32com.client import Dispatch
import pandas as pd

filematch = '*.csv'
target_dir = 'cannot tell you the path :-) '

def loginftps(servername, user, passwort):
    ftps = FTP_TLS(servername)
    ftps.login(user=user, passwd=passwort)
    ftps.prot_p()
    ftps.cwd('/changes to some directory')
    for filename in ftps.nlst(filematch):
        target_file_name = os.path.join(target_dir, os.path.basename(filename))
        with open(target_file_name, 'wb') as fhandle:
            ftps.retrbinary('RETR %s' % filename, fhandle.write)

def openExcelApplication():
    xl = Dispatch("Excel.Application")
    xl.Visible = True  # otherwise excel is hidden

def mergeallFilestoOneFile():
    subprocess.call(['prepareData_executable.bat'])

def deletezerorows():
    rohdaten = pd.read_csv("merged.csv", engine="python", index_col=False, encoding='Latin-1', delimiter=";", quoting=3)
    rohdaten = rohdaten.convert_objects(convert_numeric=True)
    rohdaten = rohdaten[rohdaten.UN_PY > 0]
    del rohdaten['deletes something']
    del rohdaten['deletes something']
    rohdaten.to_csv('merged_angepasst.csv', index=False, sep=";")

def rohdatenExcelAuswertung():
    csvdaten = pd.read_csv("merged.csv")

servername = input("please enter FTPS serveradress:")
user = input("Loginname:")
passwort = input("Password:")

loginftps(servername, user, passwort)
mergeallFilestoOneFile()
deletezerorows()
And here I am stuck somehow. I did extensive Google research, but somehow nobody seems to have tried to perform Excel tasks from within Python.
I found this Stack Overflow discussion: Opening/running Excel file from python, but I somehow cannot figure out where my Excel application is stored, so I cannot run the code mentioned in that thread.
What I have is an Excel workbook which has a data connection to my CSV file. I want Python to open MS Excel, refresh the data connection, refresh a PivotTable, and then save and close the file.
Has anybody here ever tried to do something similar and can provide some code to get me started?
Thanks
A small snippet of code that should work for opening an Excel file, updating linked data, saving it, and finally closing it:
from win32com.client import Dispatch
xl = Dispatch("Excel.Application")
xl.Workbooks.Open(Filename='C:\\Users\\Xukrao\\Desktop\\workbook.xlsx', UpdateLinks=3)
xl.ActiveWorkbook.Close(SaveChanges=True)
xl.Quit()
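To also refresh the data connection and the PivotTable before saving, as the question asks, a sketch along these lines should work. RefreshAll, CalculateUntilAsyncQueriesDone and RefreshTable are standard Excel COM members; the sheet and pivot names below are placeholders:

from win32com.client import Dispatch

xl = Dispatch("Excel.Application")
wb = xl.Workbooks.Open(Filename='C:\\Users\\Xukrao\\Desktop\\workbook.xlsx', UpdateLinks=3)

wb.RefreshAll()                      # refresh every data connection in the workbook
xl.CalculateUntilAsyncQueriesDone()  # wait for background queries to finish

# "Sheet1" and "PivotTable1" are placeholder names
wb.Worksheets("Sheet1").PivotTables("PivotTable1").RefreshTable()

wb.Close(SaveChanges=True)
xl.Quit()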
I am trying to find any way possible to get a SharePoint list into Python. I was able to connect to SharePoint and get the XML data using the REST API via this video: https://www.youtube.com/watch?v=dvFbVPDQYyk... but I am not sure how to get the list data into Python. The ultimate goal is to get the SharePoint data and import it into SSMS daily.
Here is what I have so far:
import requests
from requests_ntlm import HttpNtlmAuth
url='URL would go here'
username='username would go here'
password='password would go here'
r=requests.get(url, auth=HttpNtlmAuth(username,password),verify=False)
I believe these would be the next steps. I really only need help getting the data from SharePoint, in Excel/CSV format preferably, and I should be fine from there. But any recommendations would be helpful:
#PARSE XML VIA REST API
#PRINT INTO DATAFRAME AND CONVERT INTO CSV
#IMPORT INTO SQL SERVER
#EMAIL RESULTS
from shareplum import Site
from requests_ntlm import HttpNtlmAuth
server_url = "https://sharepoint.xxx.com/"
site_url = server_url + "sites/org/"
auth = HttpNtlmAuth('xxx\\user', 'pwd')
site = Site(site_url, auth=auth, verify_ssl=False)
sp_list = site.List('list name in my share point')
data = sp_list.GetListItems('All Items', rowlimit=200)
This can be done using SharePlum and pandas.
The following is a working code snippet:
import pandas as pd # importing pandas to write SharePoint list in excel or csv
from shareplum import Site
from requests_ntlm import HttpNtlmAuth
cred = HttpNtlmAuth(#userid_here, #password_here)
site = Site('#sharePoint_url_here', auth=cred)
sp_list = site.List('#SharePoint_list name here') # this creates SharePlum object
data = sp_list.GetListItems('All Items') # this will retrieve all items from list
# this creates a pandas data frame; you can perform any operation
# you like within pandas' capabilities
data_df = pd.DataFrame(data[0:])
data_df.to_excel("data.xlsx")
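If you would rather have CSV than Excel, the same DataFrame writes out with to_csv:

data_df.to_csv("data.csv", index=False)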
please rate if this helps.
Thank you in advance!
I know this doesn't directly answer your question (and you probably have an answer by now), but I would give the SharePlum library a try. It should hopefully simplify the process you have for interacting with SharePoint.
Also, I am not sure if you have a requirement to export the data into a CSV, but you can connect directly to SQL Server and insert your data more directly; see the sketch below.
I would have just added this into the comments but don't have a high enough reputation yet.
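As a sketch of that direct SQL Server route, pandas' to_sql over a SQLAlchemy engine works; the server, database, and table names here are placeholders, and the ODBC driver string depends on what is installed:

import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection details; uses Windows authentication
engine = create_engine(
    "mssql+pyodbc://@my-server/my-database"
    "?driver=ODBC+Driver+17+for+SQL+Server&trusted_connection=yes"
)

data_df = pd.DataFrame(data[0:])  # the SharePlum list data from the answer above
data_df.to_sql("sharepoint_list", engine, if_exists="replace", index=False)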
I can help with most of these issues
import requests
import xml.etree.ElementTree as ET
import csv
from requests_ntlm import HttpNtlmAuth
response = requests.get("your_url", auth=HttpNtlmAuth('xxxx\\username','password'))
tree = ET.ElementTree(ET.fromstring(response.content))
tree.write('file_name_xml.xml')
root = tree.getroot()
#Create csv file
csv_file = open('file_name_csv.csv', 'w', newline='', encoding='mbcs')  # 'mbcs' is the Windows ANSI code page ('ansi' is not a recognized codec name)
csvwriter = csv.writer(csv_file)
col_names = ['Col_1', 'Col_2', 'Col_3', 'Col_n']
csvwriter.writerow(col_names)
field_tag = ['dado_1', 'dado_2', 'dado_3', 'dado_n']
#schema XML microsoft
ns0 = "http://www.w3.org/2005/Atom"
ns1 = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
ns2 = "http://schemas.microsoft.com/ado/2007/08/dataservices"
for member in root:
    if member.tag == '{' + ns0 + '}entry':
        for element in member:
            if element.tag == '{' + ns0 + '}content':
                data_line = []
                for field in element[0]:
                    for count in range(0, len(field_tag)):
                        if field.tag == '{' + ns2 + '}' + field_tag[count]:
                            data_line.append(field.text)
                csvwriter.writerow(data_line)
csv_file.close()