Validating a saved text file with Python to check if parameters are set

I have a problem validating a text file. I need to check that the parameters I set were saved correctly. The file name is the current date and time, and I need to check that the parameters that were sent appear in this text (log) file. My code is below.
Arguments are passed with argparse, e.g.:
parser.add_argument("freq", type=int)
print('Saving Measurement...')
print(inst.write(':MMEMory:STORe:TRACe 0, "%s"' % timestr)) #Saving file on the inst
time.sleep(1) #Wait for file to save
print('Downloading file from device...')
ftp = FTP('XX.XX.XXX.XXX')
ftp.login()
ftp.retrbinary('RETR %s'% timestr + '.spa', open(timestr + '.spa', 'wb').write) #downloading saved file into a directory where you run script
print('Done, saved as: ' + timestr)
time.sleep(1)
with open(timestr + '.spa') as f:
    if str(args.freq) in f.read():
        print("saved correctly")
ftp.delete(timestr + '.spa') #Delete file from inst
ftp.quit()
I'm not sure if it works for me. Thank you for your help

You could use the re module to find a date pattern inside your file. Here is a small example that searches for dates in the dd-mm-yyyy format:
import re
filepath = 'your-file-path.spa'
regex = r'\d\d-\d\d-\d\d\d\d'  # raw string, so the backslashes reach the regex engine unchanged
with open(filepath, 'r') as f:
    file = f.read()
dates_found = re.findall(regex, file)
# dates_found will be a list with all the dates found in the file
print(dates_found)
You could use any regex you want as the first argument of re.findall(regex, file)
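The same idea can be pointed at a numeric parameter instead of a date. What follows is only a sketch: `value_in_text` is a hypothetical helper, the sample log contents are made up, and it assumes the file stores the frequency as a plain run of digits. The lookarounds matter because a simple substring check would also find 100 inside 1000.

```python
import re

def value_in_text(value: int, text: str) -> bool:
    """Return True if `value` appears in `text` as a whole number.

    The lookarounds stop 100 from matching inside 1000 or 2100.
    """
    pattern = r'(?<!\d)' + re.escape(str(value)) + r'(?!\d)'
    return re.search(pattern, text) is not None

# Made-up log contents, for illustration only
sample = "CenterFreq=1000\nSpan=250\n"
print(value_in_text(1000, sample))
print(value_in_text(100, sample))
```

In the question's script this would replace the `in f.read()` test, with `args.freq` passed as `value`.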


Having trouble using requests to download images off of wiki

I am working on a project where I need to scrape images off the web. To do this, I write the image links to a file and then download each of them to a folder with requests. At first I used Google as the site to scrape, but for several reasons I have decided that Wikipedia is a much better alternative. However, after my first attempt, many of the images couldn't be opened, so I tried again, this time saving each image under a name whose ending matched the ending of its link. More images could be opened this way, but many still could not. When I tested downloading the images myself (individually, outside of the function), they downloaded perfectly, and when I then used my function to download them again, they kept downloading correctly (i.e. I could open them). I am not sure if it is important, but the image endings I generally come across are svg.png and png. I want to know why this is occurring and what I can do to prevent it. I have left some of my code below. Thank you.
Function:
def download_images(file):
    object = file[0:file.index("IMAGELINKS") - 1]
    folder_name = object + "_images"
    dir = os.path.join("math_obj_images/original_images/", folder_name)
    if not os.path.exists(dir):
        os.mkdir(dir)
    with open("math_obj_image_links/" + file, "r") as f:
        count = 1
        for line in f:
            try:
                if line[len(line) - 1] == "\n":
                    line = line[:len(line) - 1]
                if line[0] != "/":
                    last_chunk = line.split("/")[len(line.split("/")) - 1]
                    endings = last_chunk.split(".")[1:]
                    image_ending = ""
                    for ending in endings:
                        image_ending += "." + ending
                    if image_ending == "":
                        continue
                    with open("math_obj_images/original_images/" + folder_name + "/" + object + str(count) + image_ending, "wb") as f:
                        f.write(requests.get(line).content)
                    file = object + "_IMAGEENDINGS.txt"
                    path = "math_obj_image_endings/" + file
                    with open(path, "a") as f:
                        f.write(image_ending + "\n")
                    count += 1
            except:
                continue
        f.close()
Doing this outside of it worked:
with open("test" + image_ending, "wb") as f:
    f.write(requests.get(line).content)
Example of image link file:
https://upload.wikimedia.org/wikipedia/commons/thumb/6/63/Triangle.TrigArea.svg/120px-Triangle.TrigArea.svg.png
https://upload.wikimedia.org/wikipedia/commons/thumb/c/c9/Square_%28geometry%29.svg/120px-Square_%28geometry%29.svg.png
https://upload.wikimedia.org/wikipedia/commons/thumb/3/33/Hexahedron.png/120px-Hexahedron.png
https://upload.wikimedia.org/wikipedia/commons/thumb/2/22/Hypercube.svg/110px-Hypercube.svg.png
https://wikimedia.org/api/rest_v1/media/math/render/svg/5f8ab564115bf2f7f7d12a9f873d9c6c7a50190e
https://en.wikipedia.org/wiki/Special:CentralAutoLogin/start?type=1x1
https:/static/images/footer/wikimedia-button.png
https:/static/images/footer/poweredby_mediawiki_88x31.png
If all the files are indeed in PNG format and the suffix is always .png, you could try something like this:
import requests
from pathlib import Path
u1 = "https://upload.wikimedia.org/wikipedia/commons/thumb/6/63/Triangle.TrigArea.svg/120px-Triangle.TrigArea.svg.png"
r = requests.get(u1)
Path('u1.png').write_bytes(r.content)
My previous answer works for PNGs only.
For SVG files you need to check whether the file contents start with the string "<svg" and create a file with the .svg suffix.
The code below saves the downloaded files in the "downloads" subdirectory.
import requests
from pathlib import Path
# Make sure the output directory exists before writing into it
Path('downloads').mkdir(exist_ok=True)
# urls are stored in a file 'urls.txt'.
with open('urls.txt') as f:
    for i, url in enumerate(f):
        url = url.strip()  # MUST strip the line-ending char(s)!
        try:
            content = requests.get(url).content
        except requests.RequestException:
            print('Cannot download url:', url)
            continue
        # Check if this is an SVG file
        # Note that content is bytes hence the b in b'<svg'
        if content.startswith(b'<svg'):
            ext = 'svg'
        elif url.endswith('.png'):
            ext = 'png'
        else:
            print('Cannot process contents of url:', url)
            continue  # skip unknown types instead of reusing a stale ext
        # Reuse the bytes already downloaded rather than fetching the url again
        Path('downloads', f'url{i}.{ext}').write_bytes(content)
Contents of the urls.txt file:
(the last url is an svg)
https://upload.wikimedia.org/wikipedia/commons/thumb/6/63/Triangle.TrigArea.svg/120px-Triangle.TrigArea.svg.png
https://upload.wikimedia.org/wikipedia/commons/thumb/c/c9/Square_%28geometry%29.svg/120px-Square_%28geometry%29.svg.png
https://upload.wikimedia.org/wikipedia/commons/thumb/3/33/Hexahedron.png/120px-Hexahedron.png
https://upload.wikimedia.org/wikipedia/commons/thumb/2/22/Hypercube.svg/110px-Hypercube.svg.png
https://wikimedia.org/api/rest_v1/media/math/render/svg/5f8ab564115bf2f7f7d12a9f873d9c6c7a50190e
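As a variation on the check above, the type can be decided from the downloaded bytes alone rather than from the URL suffix. This is a sketch and not part of the original answer: `guess_extension` is a hypothetical helper. It relies on the fixed 8-byte signature that starts every PNG file, and it assumes an SVG announces itself with `<svg` somewhere in its first kilobyte, possibly after an XML declaration.

```python
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'  # first 8 bytes of every PNG file

def guess_extension(content: bytes):
    """Guess a file extension from the downloaded bytes, or return None."""
    if content.startswith(PNG_SIGNATURE):
        return 'png'
    # SVG is XML text; it may begin with whitespace or an XML declaration
    head = content[:1024].lstrip().lower()
    if head.startswith(b'<svg') or (head.startswith(b'<?xml') and b'<svg' in head):
        return 'svg'
    return None
```

A `None` result is a useful signal in its own right: it usually means the server sent back something other than an image, such as an HTML error page, which would explain files that cannot be opened.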

Error reading data from the old files, writing it into the new files and then deleting the old files in Python

I'm trying to use the code below to read 5 files from a source folder, write them to a destination, and then delete the files in the source. I get the following error: [Errno 13] Permission denied: 'c:\\data\\AM\\Desktop\\tester1'. By the way, the file looks like this:
import os
import time
source = r'c:\data\AM\Desktop\tester'
destination = r'c:\data\AM\Desktop\tester1'
for file in os.listdir(source):
    file_path = os.path.join(source, file)
    if not os.path.isfile:
        continue
    print(file_path)
    with open (file_path, 'r') as IN, open (destination, 'w') as OUT:
        data ={
            'Power': None,
        }
        for line in IN:
            splitter = (ID, Item, Content, Status) = line.strip().split()
            if Item in data == "Power":
                Content = str(int(Content) * 10)
    os.remove(IN)
I have rewritten your code. I assume you want to multiply the value of Power by 10 and write the updated content to a new file; the code below does that.
Your code had several issues. First and foremost, much of what you intended never made it into the code (writing to a new file, specifying what to write and where, and so on). The original permission error occurred because you were trying to open a directory, rather than a file, for writing.
import os
source = r'c:\data\AM\Desktop\tester'
destination = r'c:\data\AM\Desktop\tester1'
for file in os.listdir(source):
    source_file = os.path.join(source, file)
    destination_file = os.path.join(destination, file)
    if not os.path.isfile(source_file):  # isfile must be called, otherwise nothing is ever skipped
        continue
    print(source_file)
    with open(source_file, 'r') as IN, open(destination_file, 'w') as OUT:
        data = {
            'Power': None,
        }
        for line in IN:
            ID, Item, Content, Status = line.strip().split()
            if Item in data:  # == "Power": #Changed
                Content = str(int(Content) * 10)
                OUT.write(ID+'\t'+Item+'\t'+Content+'\t'+Status+'\n')  #Added to write the content into destination file.
            else:
                OUT.write(line)  #Added to write the content into destination file.
    # Remove the source only after both files are closed (Windows refuses to delete an open file)
    os.remove(source_file)
Hope this works for you.
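For comparison, here is roughly how the same job could look with pathlib. Treat it as a sketch under assumptions: `scale_power` and `move_files` are hypothetical names, and it assumes each line has the four whitespace-separated columns (ID, Item, Content, Status) used above. The point it illustrates is the cause of the permission error: the destination passed to open must be a file path inside the folder, and subdirectories found in the source must be skipped.

```python
from pathlib import Path

def scale_power(line: str, factor: int = 10) -> str:
    """Multiply the Content column by `factor` when the Item column is 'Power'."""
    fields = line.split()
    if len(fields) == 4 and fields[1] == 'Power':
        fields[2] = str(int(fields[2]) * factor)
        return '\t'.join(fields) + '\n'
    return line  # any other line is copied unchanged

def move_files(source: Path, destination: Path) -> list:
    """Rewrite every regular file in `source` into `destination`, then delete the original."""
    destination.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(source.iterdir()):
        if not src.is_file():        # skip subdirectories; open() on one fails with PermissionError on Windows
            continue
        dst = destination / src.name  # a file path inside the folder, never the folder itself
        with src.open() as fin, dst.open('w') as fout:
            for line in fin:
                fout.write(scale_power(line))
        src.unlink()                 # both handles are closed by this point
        written.append(dst)
    return written
```

Keeping the per-line transformation in its own function makes it easy to check on a single string before touching any files.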
I'm not sure what you're going for here, but here's what I could come up with based on the question in the title.
import os
# Takes the text from the old file
with open('old file path.txt', 'r') as f:
    text = f.read()
# Takes text from old file and writes it to the new file
with open('new file path.txt', 'w') as f:
    f.write(text)
# Removes the old text file
os.remove('old file path.txt')
From your description, it sounds like this line fails:
with open (file_path, 'r') as IN, open (destination, 'w') as OUT:
Because of this operation:
open (destination, 'w')
So, you might not have write-access to
c:\data\AM\Desktop\tester1
Set file permission on Windows systems:
https://www.online-tech-tips.com/computer-tips/set-file-folder-permissions-windows/
@Sherin Jayanand
One more question: I wanted to try something out with some pieces of your code. This is what I made of it:
import os
import time
from datetime import datetime
#Make source, destination and archive paths.
source = r'c:\data\AM\Desktop\Source'
destination = r'c:\data\AM\Desktop\Destination'
archive = r'c:\data\AM\Desktop\Archive'
for root, dirs, files in os.walk(source):
    for f in files:
        pads = (root + '\\' + f)
        # print(pads)
        for file in os.listdir(source):
            dst_path=os.path.join(destination, file)
            print(dst_path)
            with open(pads, 'r') as IN, open(dst_path, 'w') as OUT:
                data={'Power': None,
                }
                for line in IN:
                    (ID, Item, Content, Status) = line.strip().split()
                    if Item in data:
                        Content = str(int(Content) * 10)
                        OUT.write(ID+'\t'+Item+'\t'+Content+'\t'+Status+'\n')
                    else:
                        OUT.write(line)
But again I received the same error: Permission denied: 'c:\\data\\AM\\Desktop\\Destination\\C'
How come? Thank you very much!

Python download multiple files within for loop

I have a list of URLs that point to filings from the SEC (e.g., https://www.sec.gov/Archives/edgar/data/18651/000119312509042636/d10k.htm).
My goal is to write a for loop that opens the URLs, requests each document, and saves it to a folder.
However, I need to be able to identify the documents later. That's why I wanted to use the filing-specific number in "https://www.sec.gov/Archives/edgar/data/18651/000119312509042636/d10k.htm" as the document name.
directory = r"\Desktop\10ks"
for url in url_list:
    response = requests.get(url).content
    path = (directory + str(url)[40:-5] +".txt")
    with open(path, "w") as f:
        f.write(response)
        f.close()
But every time, I get the following error message: FileNotFoundError: [Errno 2] No such file or directory:
I really hope you can help me out!
Thanks
import requests
import os
url_list = ["https://www.sec.gov/Archives/edgar/data/18651/000119312509042636/d10k.htm"]
#Create the path Desktop/10ks/
directory = os.path.expanduser("~/Desktop") + "\\10ks"
#Make sure the folder exists; open() does not create missing directories
os.makedirs(directory, exist_ok=True)
for url in url_list:
    #Get the content as string instead of getting it as bytes
    response = requests.get(url).text
    #Replace slash in filename with underscore
    filename = str(url)[40:-5].replace("/", "_")
    #print filename to check if it is correct
    print(filename)
    path = (directory + "\\" + filename +".txt")
    #The with block closes the file, so no explicit f.close() is needed
    with open(path, "w", encoding="utf-8") as f:
        f.write(response)
See the comments in the code.
Backslashes in the filename do not work (Windows treats them as directory separators, and those directories do not exist), since
filename = str(url)[40:-5].replace("/", "\\")
gives me
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\user/Desktop\\10ks\\18651\\000119312509042636\\d10.txt'
See also:
https://docs.python.org/3/library/os.path.html#os.path.expanduser
Get request python as a string
https://docs.python.org/3/library/stdtypes.html#str.replace
This works:
import os
for url in url_list:
    response = requests.get(url).content.decode('utf-8')
    path = (directory + "\\" + str(url)[40:-5] + ".txt").replace('/', '\\')
    # Create the nested folders first; open() does not create missing directories
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w+") as f:
        f.write(response)
The path you were building looked something like \\Desktop\\10ks18651/000119312509042636/d10.txt: there is no separator after 10ks, and the forward slashes from the URL are mixed in with the backslashes. I assume you are working on Windows given those backslashes, so you need to replace the slashes coming from the URL with backslashes and make sure the resulting folders exist.
Also, write expects a string when the file is opened in text mode, so you need to decode the response from bytes to a string.
I hope this helps!
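Slicing with fixed offsets like [40:-5] only works while every URL has exactly the same prefix and extension length. A less brittle option is to take the name from the parsed URL. The sketch below is an illustration, not the original poster's code: `filing_path` is a hypothetical helper, and it assumes the identifying part of each URL is whatever follows the `data` segment.

```python
from pathlib import Path
from urllib.parse import urlparse

def filing_path(url: str, base_dir: Path) -> Path:
    """Build a flat, filesystem-safe file path from a filing URL."""
    parts = urlparse(url).path.strip('/').split('/')
    # Keep what follows ".../edgar/data/": the numeric folders and the document name
    tail = parts[parts.index('data') + 1:] if 'data' in parts else parts
    stem = '_'.join(tail).rsplit('.', 1)[0]  # drop the ".htm" extension
    return base_dir / (stem + '.txt')

url = "https://www.sec.gov/Archives/edgar/data/18651/000119312509042636/d10k.htm"
target = filing_path(url, Path.home() / "Desktop" / "10ks")
print(target.name)
```

Before writing, `target.parent.mkdir(parents=True, exist_ok=True)` creates the folder if it is missing, which avoids the FileNotFoundError from the question.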

How to create and write into a file correctly in Python

I am trying to create a file in a certain directory and name it with today's date.
The file is created, but the title line that I want to write into it is not written.
from datetime import datetime
today = datetime.now().date().strftime('%Y-%m-%d')
g = open(path_prefix+today+'.csv', 'w+')
if os.stat(path_prefix+today+'.csv').st_size == 0: # this checks if file is empty
    g = open(path_prefix+today+'.csv', 'w+')
    g.write('Title\r\n')
path_prefix is just a path to the directory I am saving in /Users/name/Documents/folder/subfolder/
I am expecting a file 2019-08-22.csv to be saved in the directory given by path_prefix with a title as specified in the last line of the code above.
What I am getting is an empty file, and if I run the code again then the title is appended into the file.
As mentioned by @sampie777, I was not closing the file after writing to it, which is why the changes were not saved when I opened the file. Adding close() on an extra line solves the issue I was having:
from datetime import datetime
today = datetime.now().date().strftime('%Y-%m-%d')
g = open(path_prefix+today+'.csv', 'w+')
if os.stat(path_prefix+today+'.csv').st_size == 0: #this checks if file is empty
    g = open(path_prefix+today+'.csv', 'w+')
    g.write('Title\r\n')
    g.close()
I am sure there are plenty of other ways to do this.
You need to close the file before the content is written to it, so call g.close().
I suggest using:
with open(path_prefix+today+'.csv', 'w+') as g:
    g.write('...')
This will automatically handle closing the file for you.
Also, why are you opening the file twice?
Tip: I see you are using path_prefix+today+'.csv' a lot. Create a variable for it, so your code will be a lot easier to maintain.
Suggested refactor of the last lines:
output_file_name = path_prefix + today + '.csv' # I prefer "{}{}.csv".format(path_prefix, today) or "%s%s.csv" % (path_prefix, today)
# A missing file counts as empty; os.stat alone would raise FileNotFoundError on it
is_output_file_empty = not os.path.exists(output_file_name) or os.stat(output_file_name).st_size == 0
with open(output_file_name, 'a') as output_file:
    if is_output_file_empty:
        output_file.write('Title\r\n')
For more information, see this question: Correct way to write line to file?
and maybe also How to check whether a file is empty or not?
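Putting the pieces of this answer together, the "write the title only once" logic could be wrapped in a small function. This is a sketch under assumptions: `append_row` is a hypothetical helper, and the demo writes to a throwaway directory rather than the real path_prefix.

```python
import os
import tempfile

def append_row(path: str, row: str, header: str = 'Title') -> None:
    """Append `row` to a CSV file, writing `header` first if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline='' keeps Python from translating the '\r\n' we write ourselves
    with open(path, 'a', newline='') as f:
        if needs_header:
            f.write(header + '\r\n')
        f.write(row + '\r\n')
    # leaving the with block flushes and closes the file

# Demo in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, '2019-08-22.csv')
    append_row(demo, 'first')
    append_row(demo, 'second')
    with open(demo, newline='') as f:
        print(repr(f.read()))
```

Calling it twice on the same path adds the title only on the first call, and because every write happens inside a with block, nothing is left sitting in an unflushed buffer.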
I haven't used Python in a while, but after a quick bit of research, this seems like it could work:
# - Load imports
import os
from datetime import datetime
# - Get the date
dateToday = datetime.now().date()
# - Set the savePath / path_prefix
savePath = 'C:/Users/name/Documents/folder/subfolder/'
fileName = dateToday.strftime("%Y-%m-%d") # - Convert 'dateToday' to string
# - Join path and file name
completeName = os.path.join(savePath, fileName + ".csv")
# - Check for file
if not os.path.exists(completeName):
    # - If it doesn't exist, write to it; the with block then closes it
    with open(completeName, 'w+') as file:
        file.write('Title\r\n')
else:
    print("File already exists")

Read lines and remove them after read complete

I am new to Python and am trying to write a script that reads a file of email addresses, splits the good emails from the bad ones, and then removes each processed line from the source file.
I have gotten this far, but I have no idea how to remove a line once it has been read.
Any help?
import os
with open('/home/klevin/Desktop/python_test/email.txt', 'rw+') as f:
    for line in f.readlines():
        #print line
        domain = line.split("@")[1]
        #print(domain)
        response = os.system("ping -c 1 " + domain)
        if response == 0:
            print(response)
            file1 = open("good_emails.txt","a")
            file1.write( line )
        else:
            print(response)
            file = open("bad_emails.txt","a")
            file.write( line )
In general I prefer not to read from and write to a file at the same time. So here is what I would do:
1. Open the file for reading.
2. Loop over the emails and do your thing. In the comments below you've clarified you want to test only the first 100 mails, so that is what the code below now does.
3. Close the file.
4. Reopen the file, but this time in write mode, truncating it (throwing away its contents).
5. Write all the remaining (untested) emails to the file.
This effectively removes all mails that have been tested.
The code might look like this:
import os
emails = []
# Opening the file for reading; append mode keeps the results of earlier batches
with open('email.txt', 'r') as f, open("good_emails.txt", "a") as good, open("bad_emails.txt", "a") as bad:
    emails = f.readlines()
    # Only loop over the first 100 mails
    for line in emails[:100]:
        domain = line.split("@")[1].strip()  # strip the trailing newline before pinging
        response = os.system("ping -c 1 " + domain)
        if response == 0:
            print(response)
            good.write( line )
        else:
            print(response)
            bad.write( line )
# Now re-open the file and overwrite it with the remaining emails
with open('email.txt', 'w') as f:
    # Write the remaining emails to the original file
    for e in emails[100:]:
        f.write(e)
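The "take the first N lines, write back the rest" part of the steps above can also be isolated in one function. The following is a sketch, not the answerer's code: `take_batch` is a hypothetical helper, and it swaps in one technique the answer does not use, namely writing the remainder to a temporary file and renaming it over the original with os.replace, so an interruption cannot leave the original half-written.

```python
import os
import tempfile

def take_batch(path: str, size: int) -> list:
    """Remove and return the first `size` lines of the file at `path`."""
    with open(path) as f:
        lines = f.readlines()
    batch, rest = lines[:size], lines[size:]
    # Write the remainder to a temp file in the same directory, then swap it in
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(path)))
    with os.fdopen(fd, 'w') as tmp:
        tmp.writelines(rest)
    os.replace(tmp_path, path)
    return batch
```

With this in place, the main loop would iterate over `take_batch('email.txt', 100)` and append each line to the good or bad file.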
You can't. That's simply not how files work: you cannot just remove a couple of lines from the middle of a file. To achieve what you want, you need to overwrite or replace the file.
So in your code you'd copy good_emails.txt over the original file:
import shutil
import subprocess
with open('email.txt', 'r') as original, open("good_emails.txt", "w") as good, open("bad_emails.txt", "w") as bad:
    for line in original: # no need to readlines()
        domain = line.split("@")[1].strip() # drop the newline so ping gets a clean hostname
        response = subprocess.call(['ping', '-c', '1', domain])
        if response == 0:
            good.write(line)
        else:
            bad.write(line)
# Replace the original only after all three files are closed
shutil.copyfile('good_emails.txt', 'email.txt')
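Because the ping check is slow and depends on the network, it can help to keep the good/bad split separate from the check itself. A minimal sketch, with assumptions stated: `partition_lines` and `has_domain` are hypothetical names, and `has_domain` is only a stand-in that looks at the shape of the address; the real script would pass a function that pings the domain instead.

```python
def partition_lines(lines, is_good):
    """Split `lines` into (good, bad) lists using the `is_good` predicate."""
    good, bad = [], []
    for line in lines:
        (good if is_good(line) else bad).append(line)
    return good, bad

def has_domain(line: str) -> bool:
    """Stand-in check: exactly one '@' with text on both sides."""
    local, sep, domain = line.strip().partition('@')
    return bool(local and sep and domain) and '@' not in domain

good, bad = partition_lines(['a@example.com\n', 'broken\n', 'x@@y\n'], has_domain)
print(good, bad)
```

The two lists can then be written out in one go with `writelines`, and the source file replaced afterwards as shown in the answer.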
