How can I get the username value from the "Last saved by" property of any Windows file?
For example, I can see this info by right-clicking a Word file and opening the Details tab.
Does anybody know how I can get it using Python code?
Following the comment from @user1558604, I searched a bit on Google and reached a solution. I tested it on the extensions .docx, .xlsx, and .pptx.
import zipfile
import xml.dom.minidom

# Open the MS Office file; it is really a ZIP archive containing the XML structure.
filePath = r"C:\Users\Desktop\Perpetual-Draft-2019.xlsx"
document = zipfile.ZipFile(filePath)

# Open/read core.xml (contains the last user and modified date).
uglyXML = xml.dom.minidom.parseString(document.read('docProps/core.xml')).toprettyxml(indent=' ')

# Split lines in order to create a list.
asText = uglyXML.splitlines()

# Loop over the list to get the values you need; in my case lastModifiedBy and the date.
for item in asText:
    if 'lastModifiedBy' in item:
        itemLength = len(item) - 20
        print('Modified by:', item[21:itemLength])
    if 'dcterms:modified' in item:
        itemLength = len(item) - 29
        print('Modified On:', item[46:itemLength])
The result in the console is:
Modified by: adm.UserName
Modified On: 2019-11-08
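If you prefer not to depend on character offsets in the pretty-printed XML, here is a minimal namespace-aware sketch using xml.etree.ElementTree (assuming the standard docProps/core.xml layout of .docx/.xlsx/.pptx packages):

import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by docProps/core.xml in Office Open XML packages.
NS = {
    'cp': 'http://schemas.openxmlformats.org/package/2006/metadata/core-properties',
    'dcterms': 'http://purl.org/dc/terms/',
}

filePath = r"C:\Users\Desktop\Perpetual-Draft-2019.xlsx"  # same sample path as above
with zipfile.ZipFile(filePath) as document:
    root = ET.fromstring(document.read('docProps/core.xml'))

# findtext() returns None if the element is missing.
print('Modified by:', root.findtext('cp:lastModifiedBy', namespaces=NS))
print('Modified On:', root.findtext('dcterms:modified', namespaces=NS))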
I have code in a Jupyter notebook that uses requests to confirm whether a URL exists or not, and then prints the output into a text file. Here is the code for that:
import requests

Instaurl = open("dictionaries/insta.txt", 'w', encoding="utf-8")
cli = ['duolingo', 'ryanair', 'mcguinness.paddy', 'duolingodeutschland', 'duolingobrasil']
exist = []
url = []
for i in cli:
    r = requests.get("https://www.instagram.com/" + i + "/")
    if r.apparent_encoding == 'Windows-1252':
        exist.append(i)
        url.append("instagram.com/" + i + "/")
Instaurl.write("\n".join(url))  # write() takes a string, so join the list first
Instaurl.close()
Let's say that I accidentally added a username to the cli list that already exists in the text file (duolingo, for example). Is there a way to make sure that if requests finds the same URL that is already in the text file, it is not added to the text file again?
Thank you!
You defined a list:
cli = ['duolingo', ...]
It sounds like you would prefer to define a set:
cli = {'duolingo', ...}
That way, duplicates will be suppressed: both for dups in the initial assignment and for any duplicate cli.add(entry) you might attempt later.
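A minimal sketch of the set-based approach (the file name and the apparent_encoding check are taken from your snippet):

import requests

cli = {'duolingo', 'ryanair', 'mcguinness.paddy', 'duolingodeutschland', 'duolingobrasil'}
cli.add('duolingo')  # duplicate: silently ignored, the set still has five entries

urls = set()  # a set here also deduplicates the generated URLs
for i in cli:
    r = requests.get("https://www.instagram.com/" + i + "/")
    if r.apparent_encoding == 'Windows-1252':
        urls.add("instagram.com/" + i + "/")

with open("dictionaries/insta.txt", 'w', encoding="utf-8") as Instaurl:
    Instaurl.write("\n".join(urls))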
I have a question regarding Maximo location hierarchies. I want to delete a location hierarchy using an enterprise service via a CSV file. An additional field, MARKFORDELETE, has been created, and in the CSV file the user needs to enter "1" in the MARKFORDELETE field. I have created the following action automation script on the LOCHIERARCHY object:
from psdi.util.logging import MXLogger
from psdi.util.logging import MXLoggerFactory
from psdi.mbo import MboConstants
from psdi.mbo import MboSetRemote
from psdi.server import MXServer
from psdi.util import MXException

mxServer = MXServer.getMXServer()
userInfo = mbo.getUserInfo()

if launchPoint == 'LOCHIERDEL2':
    locHierarchySet = mbo.getMboSet("LOCHIERARCHYDEL")
    locHierarchySet.setWhere("markfordelete = 1")
    locHierarchySet.reset()
    locHier = locHierarchySet.moveFirst()
    while locHier is not None:
        locHierarchy = locHierarchySet.getMbo(0)
        locAncestorSet = mxServer.getMboSet("LOCANCESTOR", userInfo)
        locAncestorSet.setWhere("location='" + locHierarchy.getString("LOCATION") + "' and ancestor='" + locHierarchy.getString("PARENT") + "' and systemid='" + locHierarchy.getString("SYSTEMID") + "' and siteid='" + locHierarchy.getString("SITEID") + "'")
        locAncestorSet.reset()
        locAnc = locAncestorSet.moveFirst()
        if locAncestorSet.count() == 1:
            locAncestor = locAncestorSet.getMbo(0)
            locAncestor.delete(11l)
            locAncestorSet.save(11l)
        locHierarchy.delete(11l)
        locHierarchySet.save(11l)
        locHierarchySet2 = mbo.getMboSet("LOCHIERARCHYDEL3")
        locHier2 = locHierarchySet2.moveFirst()
        while locHier2 is not None:
            locHierarchy2 = locHierarchySet2.getMbo(0)
            locHierarchy2.delete(11l)
            locHierarchySet2.save()
            locHier2 = locHierarchySet2.moveNext()
        locHier = locHierarchySet.moveNext()
And the following is the CSV file:
EXTSYS1,LOCHIER_DEL,AddChange,EN
LOCATION,PARENT,SYSTEMID,CHILDREN,SITEID,ORGID,MARKFORDELETE
45668,XY_10603,NETWORK,0,ABC,ORG1,1
45668,XY_10604,NETWORK,0,ABC,ORG1,1
45669,XY_10606,NETWORK,0,ABC,ORG1,1
45669,XY_10607,NETWORK,0,ABC,ORG1,1
I created an escalation with an action that uses the above action script and the where clause markfordelete=1. The escalation works fine, and the records from the CSV file above were deleted from the LOCHIERARCHY table. However, ALL records in the LOCANCESTOR table whose SYSTEMID is NETWORK were deleted, and I noticed that when a LOCHIERARCHY record is deleted, a new record is created with a null parent.
Is there something that I have done wrong in writing the code, or have I missed something?
Any suggestions or pointers would be great.
That is out-of-the-box behaviour: Maximo will create new records in the LOCHIERARCHY and LOCANCESTOR tables. Please look into the delete method of Lochierarchy.class.
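If, after verifying that behaviour in Lochierarchy.class, you still want to clean up the re-created rows, a hedged sketch (using only the MboSet calls already present in your script, and assuming the re-created orphans really are identifiable by a null PARENT, which you should confirm first) could run after your main loop:

# Hypothetical cleanup pass: re-query LOCHIERARCHY rows that were re-created
# with a null parent and delete them. The null-PARENT test is an assumption.
orphanSet = mxServer.getMboSet("LOCHIERARCHY", userInfo)
orphanSet.setWhere("parent is null and systemid='NETWORK' and siteid='ABC'")
orphanSet.reset()
orphan = orphanSet.moveFirst()
while orphan is not None:
    orphan.delete(11l)
    orphan = orphanSet.moveNext()
orphanSet.save(11l)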
I have a file that contains text as well as some xml content dumped into it. It looks something like this :
The authentication details : <id>70016683</id><password>password#123</password>
The next step is to send the request.
The request : <request><id>90016133</id><password>password#3212</password></request>
Additional info includes <Address><line1>House no. 341</line1><line2>A B Street</line2><City>Sample city</City></Address>
I am using a Python program to parse this file. I would like to replace the XML parts with a placeholder, xml_obj. The output should look something like this:
The authentication details : xml_obj
The next step is to send the request.
The request : xml_obj
Additional info includes xml_obj
At the same time I would also like to extract the replaced xml text and store it in a list. The list should contain None if the line doesn't have an xml object.
I have tried using regex for this purpose :
import re

xml_tag = re.search(r"<\w*>", line)
if xml_tag:
    start_position = xml_tag.start()
    # Build the matching closing tag, e.g. <id> becomes </id>.
    xml_word = xml_tag.group()[:1] + '/' + xml_tag.group()[1:]
    xml_pattern = r'{}'.format(xml_word)
    stop_position = re.search(xml_pattern, line).end()  # a Match has end(), not stop()
But this code retrieves the start and stop positions of only one XML tag and its content for the first line, and of the entire element for the last line (in the input file). I would like to get all XML content irrespective of the XML structure, and also replace it with 'xml_obj'.
Any advice would be helpful. Thanks in advance.
Edit :
I also want to apply the same logic to files that look like this :
The authentication details : ID <id>70016683</id> Password <password>password#123</password> Authentication details complete
The next step is to send the request.
The request : <request><id>90016133</id><password>password#3212</password></request> Request successful
Additional info includes <Address><line1>House no. 341</line1><line2>A B Street</line2><City>Sample city</City></Address>
The above files may have more than one xml object in a line.
They may also have some plain text after the xml part.
The following is a little convoluted, but assuming that your actual text is correctly represented by the sample in your question, try this:
txt = """[your sample text above]"""

lines = txt.splitlines()
entries = []
new_txt = ''
for line in lines:
    # Mark the first ' <' with a sentinel and split on it.
    entry = line.replace(' <', ' xxx<', 1).split('xxx')
    if len(entry) == 2:
        entries.append(entry[1])
        entry[1] = "xml_obj"
        line = ''.join(entry)
    else:
        entries.append('none')
    new_txt += line + '\n'

for entry in entries:
    print(entry)
print('---')
print(new_txt)
Output:
<id>70016683</id><password>password#123</password>
none
<request><id>90016133</id><password>password#3212</password></request>
<Address><line1>House no. 341</line1><line2>A B Street</line2><City>Sample city</City></Address>
---
The authentication details : xml_obj
The next step is to send the request.
The request : xml_obj
Additional info includes xml_obj
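For the edited case, where a line may contain several separate XML objects and trailing plain text, here is a hedged regex-based sketch (the extract_xml helper and the file-based I/O are mine, not from the question; it assumes well-formed, attribute-free tags as in your samples, and a real XML parser is safer for messier input):

import re

# A run of adjacent elements such as <id>...</id><password>...</password>
# counts as one XML object; elements separated by other text count separately.
XML_RUN = re.compile(r'(?:<(\w+)>.*?</\1>)+')

def extract_xml(path):
    objects = []        # per line: a list of the XML runs found, or None
    cleaned_lines = []
    with open(path, encoding='utf-8') as f:
        for line in f:
            found = [m.group(0) for m in XML_RUN.finditer(line)]
            objects.append(found or None)
            cleaned_lines.append(XML_RUN.sub('xml_obj', line.rstrip('\n')))
    return objects, cleaned_lines

On the edited sample, the first line yields two entries (the <id> and <password> elements) and becomes "ID xml_obj Password xml_obj Authentication details complete".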
I have executed SSH commands on a remote machine using the paramiko library and written the output to a text file. Now I want to extract a few values from the text file. The output in the text file looks as pasted below:
b'\nMS Administrator\n(C) Copyright 2006-2016 LP\n\n[MODE]> SHOW INFO\n\n\nMode: \nTrusted Certificates\n1 Details\n------------\n\tDeveloper ID: MS-00c1\n\tTester ID: ms-00B1\n\tValid from: 2030-01-29T06:51:15Z\n\tValid until: 2030-01-30T06:51:15Z\n\t
How do I get the values of Developer ID and Tester ID? The file is huge.
As suggested by users, I have written the snippet below:
file = open("Output.txt").readlines()
for lines in file:
    word = re.findall('Developer\sID:\s(.*)\n', lines)[0]
    print(word)
I see the error IndexError: list index out of range.
If I remove the index, I see empty output.
file = open("Output.txt").readlines()
developer_id = ""
for line in file:
    if 'Developer ID' in line:
        developer_id = line.split(":")[-1].strip()
print(developer_id)
You can use regular expressions:
import re

text = """\nMS Administrator\n(C) Copyright 2006-2016 LP\n\n[MODE]> SHOW INFO\n\n\nMode: \nTrusted Certificates\n1 Details\n------------\n\tDeveloper ID: MS-00c1\n\tTester ID: ms-00B1\n\tValid from: 2030-01-29T06:51:15Z\n\tValid until: 2030-01-30T06:51:15Z\n\t"""

# re.match only matches at the start of the string; re.search scans all of it,
# and group(1) returns the captured value rather than the whole match.
developerID = re.search(r"Developer ID:\s*(.+)", text).group(1)
testerID = re.search(r"Tester ID:\s*(.+)", text).group(1)
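Since the file is huge, here is a sketch that streams Output.txt line by line instead of loading it all at once (hedged on the format shown above, where the escape sequences may appear as literal text in the dumped bytes repr):

import re

# The captured value stops at a backslash, tab, or real newline, so this works
# whether the '\n' sequences in the dump are literal text or real line breaks.
PATTERN = re.compile(r"(Developer|Tester) ID:\s*([^\\\n\t]+)")

ids = {}
with open("Output.txt") as f:
    for line in f:
        for role, value in PATTERN.findall(line):
            ids[role] = value.strip()

print(ids.get('Developer'), ids.get('Tester'))  # MS-00c1 ms-00B1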
If your output is consistent in format, you can use something as easy as line.split():
developer_id = line.split('\n')[11].lstrip()
tester_id = line.split('\n')[12].lstrip()
Again, this assumes that every line is using the same formatting. Otherwise, use regex as suggested above.
I apologize for not being able to give out the specific URL I'm dealing with. I'm trying to extract some data from a certain site, but it's not organized well enough. However, it does have an "Export to CSV file" button, and the code for that block is:
<input type="submit" name="ctl00$ContentPlaceHolder1$ExportValueCSVButton" value="Export to Value CSV" id="ContentPlaceHolder1_ExportValueCSVButton" class="smallbutton">
In this type of situation, what's the best way to go about grabbing that data when there is no specific URL to the CSV? I'm using Mechanize and BS4.
If you're able to click a button that downloads the data as a CSV, it sounds like you might be able to wget that data and save it on your machine to work with there. I'm not sure if that's what you're getting at here, though; are there any more details you can offer?
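If the button does resolve to a plain GET endpoint, a minimal download sketch with requests (the URL below is a placeholder; check the network tab of your browser's dev tools for the real one):

import requests

CSV_URL = "https://example.com/export.csv"  # hypothetical endpoint

resp = requests.get(CSV_URL)
resp.raise_for_status()
with open("export.csv", "wb") as f:
    f.write(resp.content)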
You should try Selenium. Selenium is a suite of tools for automating web browsers across many platforms, and it can do a lot of things, including clicking buttons.
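A minimal sketch of that (the page URL is a placeholder; the button id comes from the HTML you posted):

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumes chromedriver is on your PATH
driver.get("https://example.com/your-page")  # placeholder URL

# The id is taken from the <input> element in your question.
driver.find_element(By.ID, "ContentPlaceHolder1_ExportValueCSVButton").click()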
Well, you need SOME starting URL to feed br.open() to even start the process.
It appears that you have an aspnetForm type control there and the below code MAY serve as a bit of a starting point, even though it does not work as-is (it's a work in progress...:-).
You'll need to look at the headers and parameters via the network tab of your browser dev tools to see them.
br.open("http://media.ethics.ga.gov/search/Lobbyist/Lobbyist_results.aspx?&Year=2016&LastName=" + letter + "&FirstName=&City=&FilerID=")
soup = BS(br.response().read())
table = soup.find("table", {"id": "ctl00_ContentPlaceHolder1_Results"})  # Need to add error check here...
if table is None:  # No lobbyist with last name starting with 'X' :-)
    continue
records = table.find_all('tr')  # List of all results for this letter

for form in br.forms():
    print("Form name:", form.name)
    print(form)

for row in records:
    rec_print = ""
    span = row.find_all('span', 'lblentry', 'value')
    for sname in span:
        if ',' in sname.get_text():  # They actually have a field named 'comma'!!
            continue
        rec_print = rec_print + sname.get_text() + ","  # Create comma-delimited output
    print(rec_print[:-1])  # Strip final comma
    lnk = row.find('a', 'lblentrylink')
    if lnk is None:  # For some reason, first record is blank.
        continue
    print("Lnk: ", lnk)
    newlnk = lnk['id']
    print("NEWLNK: ", newlnk)
    newstr = lnk['href']
    newctl = newstr[+25:-5]  # Matching placeholder (strip javascript....)
    br.select_form('aspnetForm')  # Tried (nr=0) also...
    print("NEWCTL: ", newctl)
    br["__EVENTTARGET"] = newctl
    response = br.submit(name=newlnk).read()