Read and extract information from 3 files (Python)

I am writing Python code to extract information from an XML file, using a function that returns two values. The code works with one file:
import re

def Readfiles(XFile):
    Id = ''
    des = ''
    with open(XFile, "r", encoding="utf-8") as h:
        for line in h:
            wline = line.rstrip("\n")
            # a capture group is needed for group(1) to return the text after the tag
            res = re.search(r"^ID\s{3}(.*)", wline)
            if res:
                Id = res.group(1)
            res = re.search(r"^DE\s{3}(.*)", wline)
            if res:
                des = res.group(1)
    return (Id, des)

# this call fails: three arguments are passed, but Readfiles takes only one
(Identificator, desc) = Readfiles("rte.xml", "pre.xml", "ytl.xml")
print("Nom:", Identificator)
print("Descrip:", desc)
However, I want to read more files at the same time (three XML files in the code above), but it gives me an error.
Thanks for your help.

for f in ("rte.xml", "pre.xml", "ytl.xml"):
    (Identificator,desc)=Readfiles(f)
The error is that Readfiles is called with three arguments but it has only one parameter.
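Note that the loop above overwrites Identificator and desc on every pass, so only the last file's values survive. A minimal sketch of one way to keep all three results follows. The names parse_fields and read_files are hypothetical, and the capture group `(.*)` assumes the wanted text is everything after the three spaces that follow the tag.

```python
import re

# Assumed line format: "ID", three whitespace characters, then the value
ID_RE = re.compile(r"^ID\s{3}(.*)")
DE_RE = re.compile(r"^DE\s{3}(.*)")


def parse_fields(lines):
    """Return (identifier, description) found in an iterable of text lines."""
    identifier = description = ""
    for line in lines:
        line = line.rstrip("\n")
        match = ID_RE.search(line)
        if match:
            identifier = match.group(1)
        match = DE_RE.search(line)
        if match:
            description = match.group(1)
    return identifier, description


def read_files(paths):
    """Parse each file separately and collect the results by file name."""
    results = {}
    for path in paths:
        with open(path, "r", encoding="utf-8") as handle:
            results[path] = parse_fields(handle)
    return results
```

Calling read_files(["rte.xml", "pre.xml", "ytl.xml"]) would then return a dictionary with one (identifier, description) pair per file.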

Related

Need a push to start with a function about text files, I can't figure this out on my own

I don't need the entire code, but I want a push to help me on the way. I've been searching the internet for clues on how to start writing a function like this, but I haven't gotten any further than the name of the function.
So I haven't got the slightest clue on how to start with this, I don't know how to work with text files. Any tips?
These text files are CSV (Comma Separated Values) files. CSV is a simple file format used to store tabular data.
You may explore Python's built-in module called csv.
The following code snippet is an example of loading a .csv file in Python:
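import csv

filename = 'us_population.csv'
# newline='' lets the csv module handle line endings itself
with open(filename, 'r', newline='') as csvfile:
    csvreader = csv.reader(csvfile)
    for row in csvreader:
        print(row)  # each row is a list of strings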
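If the file has a header row, csv.DictReader can be more convenient, because each row becomes a dictionary keyed by column name. The sketch below uses made-up sample data in place of a real file; the column names state and population and the helper load_rows are assumptions for illustration only.

```python
import csv
import io


def load_rows(csvfile):
    """Return the rows of an open CSV file as a list of dicts keyed by the header."""
    return list(csv.DictReader(csvfile))


# Hypothetical sample data standing in for a real file
sample = io.StringIO("state,population\nA,100\nB,250\n")
rows = load_rows(sample)

# Values are read as strings, so convert them before doing arithmetic
total = sum(int(row["population"]) for row in rows)
print(total)
```

With a real file, the same function can be called inside a with open(filename, newline='') block.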

XML Parsing Data with plant_catalog

I am currently having issues with parsing data in my Python class and was wondering if anyone was able to provide a solution to my problem. Here are the instructions for the assignment I'm doing:
XML is the basis for many interfaces and web services. Consequently, reading and manipulating XML data is a common task in software development.
Description
An online plant distributor has recently experienced a shortage in its supply of Anemone plants such that the price has increased by 20%. Their plant catalog is maintained in an XML file and they need a Python utility to find the plant by name, read the current price, change it by the specified percentage, and update the file. Writing this utility is your assignment.
Using Python’s ElementTree XML API, write a Python program to perform the tasks below. Note that your program’s execution syntax must be as follows:
python xmlparse.py plant_catalog.xml plantName percentChange
Using ElementTree, read in this assignment's XML file, plant_catalog.xml, specified by a command line parameter as shown above.
Find the plant by the name passed in as an argument on the command line (plantName above).
Once found, read the current price and adjust it by the command line argument percentChange. Note that this value could be anything in the range of -90 < percentChange < 100.
For example, if you run your script as follows:
python xmlparse.py plant_catalog.xml "Greek Valerian" -20
with the original XML containing:
<PLANT>
    <COMMON>Greek Valerian</COMMON>
    <BOTANICAL>Polemonium caeruleum</BOTANICAL>
    <ZONE>Annual</ZONE>
    <LIGHT>Shade</LIGHT>
    <PRICE>4.36</PRICE>
    <AVAILABILITY>071499</AVAILABILITY>
</PLANT>
The resulting file should contain:
<PLANT>
    <COMMON>Greek Valerian</COMMON>
    <BOTANICAL>Polemonium caeruleum</BOTANICAL>
    <ZONE>Annual</ZONE>
    <LIGHT>Shade</LIGHT>
    <PRICE>3.48</PRICE>
    <AVAILABILITY>071499</AVAILABILITY>
</PLANT>
Note: You may reduce the precision of the calculation if you wish but it isn’t required.
Hints
Since XML is just a text file, you could write the code to read all the data and then decode the XML information. However, I certainly don’t recommend this approach. Instead, let Python do it for you! Using Python’s ElementTree module, parse the file into an “in-memory” representation of the XML data. Once parsed, the root (or starting place) is as simple as requesting it from the tree. Once you have the root, you can call methods to find what you are looking for and modify it appropriately. You'll want to "findall" the plants and, "for" each plant "in" the result, you'll want to compare the name with the name passed on the command line. If you find a match, you'll apply the percentage change and save the result back to the tree.
When you are done with the search you will "write" the tree back to a file. I suggest using a different file name or you will have to re-download the original with each run.
One note of caution: be sure to read about XML in the Distributed Systems text. From doing so and reviewing the data file you will note that there are no attributes in the XML file. Consequently, you do not need to use attribute methods when you attempt this assignment.
The following code snippet will give you a good starting point:
# Calling arguments: plant_catalog.xml plantName percentChange
import xml.etree.ElementTree as ET
import sys
# input parameters
searchName = sys.argv[2]
percent = float(sys.argv[3])
# parse XML data file
tree = ET.parse(sys.argv[1])
root = tree.getroot()
Now here is my code:
import xml
import xml.etree.ElementTree as ET
import sys

searchName = sys.argv[2]
percent = float(sys.argv[3])
tree = ET.parse(sys.argv[1])
root = tree.getroot()

def main():
    with open("plant_catalog.xml", "r") as file:
        data = file.read()
    for plant in root.findall("PLANT"):
        name = plant.find("COMMON").text
        if name == searchName:
            original_price = float(plant.find("PRICE").text)
    with open("plant_catalog - output.xml", "wb") as file:
        file.write(percent)

def change_plant_price(plantName, newPrice):
    root = ET.fromstring(xml)
    plant = root.find(".//*[COMMON='{}']".format(plantName))
    plant.find('PRICE').text = str(newPrice)
    ET.dump(root)

if __name__ == "__main__":
    main()
The problem with my code is that when I run it, I get an error on the file.write(percent) line, which says it needs a bytes-like object instead of a float. I'm not sure what's wrong with the code, but if anyone is able to provide a solution I would greatly appreciate it.
I think what you want is not file.write(percent) but tree.write(file): you want to write the tree to the file. The price stored in the tree also has to be updated before it is written, which the code above never does.
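Putting the pieces together, a minimal sketch of the whole utility might look like the following. It assumes the PLANT elements are direct children of the root (as the findall("PLANT") call in the question implies), rounds the new price to two decimals, and writes to a hypothetical output file name. Note that rounding turns 4.36 reduced by 20% into 3.49, whereas the assignment's example shows 3.48, which suggests truncation there.

```python
import sys
import xml.etree.ElementTree as ET


def change_price(root, plant_name, percent):
    """Adjust PRICE for every PLANT whose COMMON name matches; return how many changed."""
    changed = 0
    for plant in root.findall("PLANT"):
        if plant.findtext("COMMON") == plant_name:
            price = plant.find("PRICE")
            new_price = float(price.text) * (1 + percent / 100)
            price.text = f"{new_price:.2f}"  # assumption: round to two decimals
            changed += 1
    return changed


if __name__ == "__main__" and len(sys.argv) == 4:
    tree = ET.parse(sys.argv[1])
    change_price(tree.getroot(), sys.argv[2], float(sys.argv[3]))
    # Hypothetical output name, so the original catalog is left untouched
    tree.write("plant_catalog_output.xml")
```

The price is changed on the element inside the tree, so tree.write() saves the updated value without any manual file handling.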

How to read Json files in a directory separately with a for loop and performing a calculation

Update: Sorry, it seems my question wasn't asked properly. I am analyzing a transportation network consisting of more than 5000 links. All the data are included in a big CSV file. I have several JSON files, each of which contains a subset of this network. I am trying to loop through all the JSON files individually (i.e. not concatenating them), read each JSON file, extract the matching information from the CSV file, perform a calculation, and save the result along with the file name in a new DataFrame.
This is the code I wrote, but I'm not sure if it's efficient enough.
import glob
import json
import os

import pandas as pd

name = []
percent_of_truck = []
path_to_json = r"\\directory"  # placeholder path, as in the original post
# `final` is the DataFrame already loaded from the main CSV file
z = glob.glob(os.path.join(path_to_json, '*.json'))
for i in z:
    with open(i, 'r') as myfile:
        l = json.load(myfile)
    name.append(i)
    d_2019 = final.loc[final['LINK_ID'].isin(l)]  # retrieve data from main CSV file
    # length-weighted average share of trucks
    avg_m = (d_2019['AADTT16'] / d_2019['AADT16'] * d_2019['Length']).sum() / d_2019['Length'].sum()
    percent_of_truck.append(avg_m)
f = pd.DataFrame()
f['Name'] = name
f['% of truck'] = percent_of_truck
I'm assuming here you just want a dictionary of all the JSON. If so, use the json library (import json), and this code may be of use:
import json

def importSomeJSONFile(f):
    # open the file with a context manager so it is closed after loading
    with open(f) as fh:
        return json.load(fh)

# make sure the file exists in the same directory
example = importSomeJSONFile("example.json")
print(example)
# access a value within this, replacing "name" with the key you want
print(example["name"])
Since you haven't added a schema or any other specific requirements, you can follow this approach to solve your problem in any language you prefer:
1. Get the directory of the JSON files that need to be read.
2. Get the list of all files present in that directory.
3. For each file name returned in step 2:
   a. Read the file.
   b. Parse the JSON from the string.
   c. Perform the required calculation.
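The steps above can be sketched in Python using only the standard library. The function name process_json_dir and the calculate callback are hypothetical; in the question's setting, calculate would be the pandas expression that looks up the link IDs in the main CSV data.

```python
import json
from pathlib import Path


def process_json_dir(directory, calculate):
    """Apply `calculate` to the parsed content of every *.json file in `directory`.

    Returns a list of (file name, result) pairs, one per file, sorted by file name.
    """
    results = []
    # List the JSON files in the directory
    for path in sorted(Path(directory).glob("*.json")):
        # Read the file and parse its JSON content
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Perform the calculation and keep it together with the file name
        results.append((path.name, calculate(data)))
    return results
```

Each file is handled on its own, and the resulting list of pairs can be passed to pd.DataFrame(results, columns=["Name", "% of truck"]) if a DataFrame is wanted.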

Trying to create a Python Script to extract data from .log files

I'm trying to create a Python Script but I'm a bit stuck and can't find what I'm looking for on a Google search as it's quite specific.
I need to run a script on two .log files (auth.log and access.log) to view the following information:
Find how many attempts were made with the bin account,
that is, how many times the bin account was used to try to get into the server.
The logs come from a server that was hacked, and I need to identify how it happened and who is responsible.
Would anyone be able to give me some help in how I go about doing this? I can provide more information if needed.
Thanks in advance.
Edit:
I've managed to print all the times 'bin' appears in the log which is one way of doing it. Does anyone know if I can count how many times 'bin' appears as well?
with open("auth.log") as f:
    for line in f:
        if "bin" in line:
            print(line)
Given that you work with system logs and their format is known and stable, my approach would be something like:
identify a set of keywords (either common, or one per log)
for each log, iterate line by line
once keywords match, add the relevant information from each line in e.g. a dictionary
You could use shell tools (like grep, cut and/or awk) to pre-process the log and extract relevant lines from the log (I assume you only need e.g. error entries).
You can use something like this as a starting point.
If you want to use a tool, you can use ELK (Elasticsearch, Logstash and Kibana).
If not, you have to read the log file first and then apply a regex according to your requirements.
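To count matches rather than print them, as the question's edit asks, one option is to tally lines per keyword in a dictionary-like Counter. This is a minimal sketch: the function name count_keywords, the sample lines, and the pattern "for bin from" are assumptions about what a login attempt looks like in the log, so adapt the pattern to the real auth.log entries.

```python
import re
from collections import Counter


def count_keywords(lines, patterns):
    """Count, per label, how many lines match each regular expression."""
    compiled = {label: re.compile(regex) for label, regex in patterns.items()}
    counts = Counter()
    for line in lines:
        for label, regex in compiled.items():
            if regex.search(line):
                counts[label] += 1
    return counts


# Hypothetical sample lines; a real run would iterate over the open log file
sample = [
    "Failed password for bin from 10.0.0.5 port 4321 ssh2",
    "Accepted password for alice from 10.0.0.7 port 5555 ssh2",
    "Failed password for bin from 10.0.0.5 port 4322 ssh2",
    "session opened for user alice by /usr/bin/sudo",
]
counts = count_keywords(sample, {"bin_attempts": r"\bfor bin from\b"})
print(counts["bin_attempts"])
```

A plain "bin" in line test would also count the last sample line because of /usr/bin, which is why a more specific pattern is used here. For the real file, pass the open file object: with open("auth.log") as f: counts = count_keywords(f, ...).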
If you are interested in extracting some data and saving it to a .txt file, the following sample code might be helpful:
import re

expDate = '2018-11-27'
expTime = '11-21-09'
infile = r"/home/xenial/Datasets/CIVIT/Nov_27/rover/NMND17420010S_"+expDate+"_"+expTime+".LOG"
keep_phrases = ["FINESTEERING"]
# GPS week, then seconds of week, right after the FINESTEERING keyword
pattern = re.compile(r'FINESTEERING,(\d+),(\d+\.\d*)')

with open(infile) as f:
    lines = f.readlines()

with open('/home/xenial/Datasets/CIVIT/Nov_27/rover/GPS_'+expDate+'_'+expTime+'.txt', 'w') as file:
    file.write("gpsWeek,gpsSOW\n")
    for line in lines:
        for phrase in keep_phrases:
            if phrase in line:
                match = pattern.search(line)
                if match:  # skip lines where the numbers are missing
                    gpsWeek, gpsSOW = match.groups()
                    file.write(gpsWeek + ',' + gpsSOW + '\n')
                break
print("------------------------------------")
In my case, FINESTEERING was an interesting keyword in my .log file to extract numbers, including GPS_Week and GPS_Seconds_of_Weeks. You may modify this code to suit your own application.

Parsing any file entered in command line using python

The problem is to read any XML file given by the user on the command line (the format of the XML file stays the same; only the content differs). The file contains a number of test cases, and I need to parse it and generate another XML file as output.
Currently I am using minidom:
document = parse(sys.argv[1])
This can read only one specific file.
I am stuck on this part only; everything else is working fine.
I need to submit it as soon as possible.
sys.argv[1] is the first command-line argument (sys.argv[0] is the script name), so if your command is python foo.py abc.xml def.xml, sys.argv[1] is 'abc.xml'. To handle every file, loop over all the arguments:
for f in sys.argv[1:]:
    # do something for f
    ...
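Combined with the minidom call from the question, the loop could look like this minimal sketch. The helper name root_tags is hypothetical, and reporting the root element's tag name simply stands in for the real processing of the test cases.

```python
import sys
from xml.dom.minidom import parse


def root_tags(paths):
    """Parse each XML file and return {path: tag name of its root element}."""
    tags = {}
    for path in paths:
        document = parse(path)  # same call as in the question, once per file
        tags[path] = document.documentElement.tagName
    return tags


if __name__ == "__main__":
    # Every argument after the script name is treated as an XML file
    for path, tag in root_tags(sys.argv[1:]).items():
        print(path, tag)
```

Run as python foo.py abc.xml def.xml, this parses both files in turn instead of only the first one.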
