Bogus data in output CSV file - python

I'm trying to change the delimiter of a CSV file and write the result to a new file. It should be a simple modification, but it isn't working.
#!/usr/bin/python
# -*- coding: utf-8 -*-
import sys
import csv
def main():
    r = open(sys.argv[1],"r")
    wr = open(sys.argv[2],"a+")
    rea = csv.reader(r, delimiter=',')
    writer = csv.writer(wr,delimiter="|", quotechar="'")
    for row in rea:
        #line = str(row).replace(",","|")
        #writer.writerow("".join(line))
        writer.writerow(row)
        print type(row)
        print row
    r.close()
    wr.close()
if __name__ == '__main__':
    main()
Update:
The console output looks fine:
./csv_read.py fim.csv salida.csv
<type 'list'>
['9/17/18 22:29', 'any', 'la_cuerda.net', 'Sep 17 22:29:29 running yum[37144]: Installed: ImageMagic-toolkit-2.1.7-1.noarch']
But the output file contains the same data three times, in three different ways:
the first way is still wrong: one character per field (wrong format, and the list brackets are included);
the second way puts everything in one cell without splitting it like the original.
This is the content of the input file and the output file:
$ cat Input.csv
Time(GMT),Host,dest,Alert
9/17/18 22:34,any,google.com.mx,monitor: Agent started: 'discovery.channel.org->any'.
9/17/18 22:29,any,la_cuerda.net,Sep 17 22:29:29 running yum[37144]: Installed: ImageMagic-toolkit-2.1.7-1.noarch
$ cat Output.csv
[,'''',T,i,m,e,(,G,M,T,),'''',|, ,'''',H,o,s,t,'''',|, ,'''',d,e,s,t,'''',|, ,'-''',A,l,e,r,t,'''',]
[,'''',9,/,1,7,/,1,8, ,2,2,:,3,4,'''',|, ,'''',a,n,y,'''',|, ,'''',g,o,o,g,l,e,.,c,o,m,.,m,x,'''',|, ,",m,o,n,i,t,o,r,:, ,A,g,e,n,t, ,s,t,a,r,t,e,d,:, ,'''',d,i,s,c,o,v,e,r,y,.,c,h,a,n,n,e,l,.,o,r,g,-,>,a,n,y,'''',.,",]
[,'''',9,/,1,7,/,1,8, ,2,2,:,2,9,'''',|, ,'''',a,n,y,'''',|, ,'''',l,a,_,c,u,e,r,d,a,.,n,e,t,'''',|, ,'''',S,e,p, ,1,7, ,2,2,:,2,9,:,2,9, ,r,u,n,n,i,n,g, ,y,u,m,[,3,7,1,4,4,],:, ,I,n,s,t,a,l,l,e,d,:, ,I,m,a,g,e,M,a,g,i,c,-,t,o,o,l,k,i,t,-,2,.,1,.,7,-,1,.,n,o,a,r,c,h,'''',]
Time(GMT)|Host|dest|Alert
9/17/18 22:34|any|google.com.mx|'monitor: Agent started: ''discovery.channel.org->any''.'
9/17/18 22:29|any|la_cuerda.net|Sep 17 22:29:29 running yum[37144]: Installed: ImageMagic-toolkit-2.1.7-1.noarch
Time(GMT)|Host|dest|Alert
9/17/18 22:34|any|google.com.mx|'monitor: Agent started: ''discovery.channel.org->any''.'
9/17/18 22:29|any|la_cuerda.net|Sep 17 22:29:29 running yum[37144]: Installed: ImageMagic-toolkit-2.1.7-1.noarch
Time(GMT)|Host|dest|Alert
9/17/18 22:34|any|google.com.mx|'monitor: Agent started: ''discovery.channel.org->any''.'
9/17/18 22:29|any|la_cuerda.net|Sep 17 22:29:29 running yum[37144]: Installed: ImageMagic-toolkit-2.1.7-1.noarch

wr = open(sys.argv[2],"a+") is the cause. Each time you run your program, it appends its output to the file. The bogus data you see is from previous runs.
Unless your program is really supposed to append to the output file rather than overwrite it, open the file in wb mode.
Also note that the Python 2 docs for csv.reader and csv.writer require opening the file in binary mode (because the module handles line endings itself). In Python 3, open the files in text mode with newline='' instead.
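For readers on Python 3, a minimal sketch of the same conversion might look like the following. The function name convert_delimiter and its parameters are made up for illustration and are not from the question; the points it shows are the "w" mode, which truncates the output file on every run, and newline="", which the Python 3 csv documentation recommends for files passed to csv.reader and csv.writer.

```python
import csv


def convert_delimiter(src_path, dst_path, src_delim=",", dst_delim="|", quotechar="'"):
    """Copy a CSV file, changing its delimiter (hypothetical helper)."""
    # "w" truncates the output, so rows from earlier runs cannot pile up.
    # newline="" leaves line-ending handling to the csv module.
    with open(src_path, "r", newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=src_delim)
        writer = csv.writer(dst, delimiter=dst_delim, quotechar=quotechar)
        writer.writerows(reader)
```

Running it twice on the same input leaves exactly one copy of the data in the output, which is the behaviour the "a+" mode prevented.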

Related

Automatic script doesn't output data

I am collecting weather data with my Raspberry Pi, just for fun.
If I execute my Python script in the console, everything works fine. But if I add the Python file to crontab to start it after rebooting, it doesn't work. (crontab entry: @reboot python3 /home/pi/Documents/PythonProgramme/WeatherData/weatherdata.py &)
#! /usr/bin/python3
from pyowm import OWM
import csv
import schedule
from datetime import datetime
import time
key = 'XXXXXX'
def weather_request(text):
    owm = OWM(key)
    mgr = owm.weather_manager()
    karlsruhe = mgr.weather_at_place('Karlsruhe, DE').weather
    hamburg = mgr.weather_at_place('Hamburg, DE').weather
    cities = (karlsruhe, hamburg)
    with open('weatherdata.csv', 'a') as file:
        writer = csv.writer(file)
        row = [datetime.now().strftime("%Y-%m-%d %H:%M:%S")]
        for city in cities:
            row.append(city.temperature('celsius')['temp'])
        row.append(round(row[1] - row[2], 2))
        row.append(text)
        writer.writerow(row)
schedule.every().day.at("08:00").do(weather_request, 'morgens')
schedule.every().day.at("13:00").do(weather_request, 'mittags')
schedule.every().day.at("18:00").do(weather_request, 'abends')
while 1:
    schedule.run_pending()
    time.sleep(1)
If I run ps -aef | grep python, it shows that my script is running: pi 337 1 21 10:32 ? 00:00:10 python3 /home/pi/Documents/PythonProgramme/WeatherData/weatherdata.py
But I never get any data. What am I missing?
Thanks in advance!
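Where are you checking for the output file? Cron typically starts jobs in the user's home directory rather than in the script's folder, so a relative path such as 'weatherdata.csv' may end up somewhere you are not looking.
Have you tried opening the file with its full path?
with open('***<fullPath>***weatherdata.csv', 'a') as file: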
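One way to avoid hard-coding the full path is to build it from the location of the script itself. The sketch below assumes the CSV should live next to the script; the helper name output_path is hypothetical and not part of the original code.

```python
from pathlib import Path


def output_path(script_file, filename="weatherdata.csv"):
    """Return an absolute path to `filename` in the script's own directory."""
    # resolve() makes the path absolute, so the result no longer depends on
    # the working directory that cron (or anything else) started us in.
    return Path(script_file).resolve().parent / filename
```

Inside the script this would be used as open(output_path(__file__), 'a'), so the file ends up in the same folder regardless of how the script was launched.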

Unable to read content from tempfile

python: 3.6.8
code:
temp_file = tempfile.NamedTemporaryFile()
getui.download(path=f"{path}{file}", localPath=temp_file.name)
temp_file.seek(0)
import ipdb;ipdb.set_trace()
item = {"file": temp_file, "filename": file}
queue.put(item)
I cannot read anything from tempfile, but can read its content using open, as another file:
ipdb> temp_file.read(10)
b''
ipdb> temp_file.seek(0)
0
ipdb> temp_file.read(10)
b''
ipdb> f = open(temp_file.name)
ipdb> f.read(10)
'6C1D91DB-F'
ipdb>
Why is this happening?
You have not posted enough detail to be able to reproduce this, but the likelihood is that your getui.download is deleting and recreating the file, so it will be writing to a different inode than the one which Python has open (which no longer has any associated directory entry). When you reopen the file, you are now looking at the same inode as was written by getui.download.
To demonstrate by means of example what is likely to be happening, here is an example in which (in Linux) some basic file operations are performed to do a delete-and-recreate (using ctrl-Z to temporarily suspend the Python process while this is done):
>>> temp_file = tempfile.NamedTemporaryFile()
>>> temp_file.name
'/tmp/tmpo4j5k0ul'
>>> os.stat(temp_file.name).st_ino # <=== look at the inode number
42
>>> [[ctrl-Z pressed here]]
[1]+ Stopped python3
$ ls -li /tmp/tmpo4j5k0ul
42 -rw------- 1 myuser mygroup 0 Aug 14 10:33 /tmp/tmpo4j5k0ul
$ rm /tmp/tmpo4j5k0ul # <=== delete
$ echo hello > /tmp/tmpo4j5k0ul # <=== create new file
$ ls -li /tmp/tmpo4j5k0ul # <=== see the new inode number
41 -rw-rw-r-- 1 myuser mygroup 6 Aug 14 10:34 /tmp/tmpo4j5k0ul
$ fg # <=== return to the python session
python3
>>> os.fstat(temp_file.fileno()).st_ino # <=== recheck the inode number
42 # <=== still the old one
>>> temp_file.seek(0)
0
>>> temp_file.read()
b''
>>> f = open(temp_file.name) # <=== reopen from the filename
>>> os.fstat(f.fileno()).st_ino # <=== recheck the inode number
41 # <=== the new one this time
>>> f.read()
'hello\n'
Regarding how to fix this, you might find that your getui.download has an option to pass a file object rather than a file name, or at least to open an existing file for writing rather than deleting and recreating it. Again, without exact details of where getui.download comes from, it is hard to give definite advice, but this is the principle you need to follow.
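The same effect can be reproduced in a single script. This is only a sketch under assumptions: the two download_* functions are hypothetical stand-ins for whatever getui.download does internally, and deleting a file that is still open relies on POSIX behaviour (it would fail on Windows).

```python
import os
import tempfile


def download_recreating(local_path, payload):
    """Hypothetical downloader that deletes the target and creates a new file."""
    os.remove(local_path)
    with open(local_path, "wb") as fh:
        fh.write(payload)


def download_into(fileobj, payload):
    """Hypothetical downloader that writes into an already open file object."""
    fileobj.write(payload)
    fileobj.flush()


def read_after_recreate(payload):
    """Return (bytes seen via the old handle, bytes seen via the file name)."""
    with tempfile.NamedTemporaryFile() as tmp:
        download_recreating(tmp.name, payload)
        tmp.seek(0)
        via_handle = tmp.read()          # old inode, still empty
        with open(tmp.name, "rb") as fresh:
            via_name = fresh.read()      # new inode, holds the payload
    return via_handle, via_name


def read_after_write_into(payload):
    """Return the bytes seen via the handle when the writer reuses it."""
    with tempfile.NamedTemporaryFile() as tmp:
        download_into(tmp, payload)
        tmp.seek(0)
        return tmp.read()
```

With the recreating variant, the original handle stays empty while reopening by name shows the data; with the variant that writes into the existing file object, the original handle sees the data directly.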

How to run Orange.canvas from within a python script and transform a Table to it?

I'm new to Python & Orange, and want to make use of Spyder & Orange to do some data mining.
I currently work with RStudio & Rattle in this way:
...
# build a data frame called "mydata", then pass it to rattle
library(rattle)
rattle(dataset="mydata")
...
It's quite convenient to do some complex data preparation before calling Rattle, and then get the final script back from Rattle. So I'm wondering whether I can work this way with Python.
I found a script that runs a widget from within a Python script, like this:
import Orange
from Orange.widgets.visualize.owruleviewer import OWRuleViewer
from AnyQt.QtWidgets import QApplication
from Orange.classification import CN2Learner
data = Orange.data.Table("titanic")
learner = Orange.classification.CN2Learner()
model = learner(data)
model.instances = data
a = QApplication([])
ow = OWRuleViewer()
ow.set_classifier(model)
ow.show()
a.exec()
So, how can I do the same with Orange.canvas's main window?
Any ideas are appreciated, thanks in advance.
I can now run Orange from within a Python script:
...
# transform a pandas dataframe to an orange table
from Orange.data.pandas_compat import table_from_frame
ot1 = table_from_frame(data)
...
import sys
from Orange.canvas import __main__ as om
sys.exit(om.main(["-l 1","--no-splash","--no-welcome"]))
The question now is how to pass the Orange table ot1 to Orange, put it in a widget, and place that widget on the canvas.
I also modified the source file orangecanvas/main.py to work around the stdout & stderr problem (this bug is fixed in Python 3.7.5):
...
def fix_win_pythonw_std_stream():
    """
    On windows when running without a console (using pythonw.exe without I/O
    redirection) the std[err|out] file descriptors are invalid
    (`http://bugs.python.org/issue706263`_). We `fix` this by setting the
    stdout/stderr to `os.devnull`.
    """
    if sys.platform == "win32" and \
            os.path.basename(sys.executable) == "pythonw.exe":
        # if sys.stdout is None or sys.stdout.fileno() < 0:
        #     sys.stdout = open(os.devnull, "w")
        # if sys.stderr is None or sys.stderr.fileno() < 0:
        #     sys.stderr = open(os.devnull, "w")
        # This bug is fixed in Python3.7.5
        print("win32 pythonw.exe")
...

My Python regex code is not giving me the output as i expected

I am trying to extract two values (Hostname and version) from a text file. When I run my code, it writes data into the CSV file, but the output is not what I expected.
**my input file is .txt and below is the content**
Hostname Router1
version 15.01
code:
line console 0
logging synchronous
exec-timeout 15 1
usb-inactivity-timeout 15
exec prompt timestamp
transport preferred none
Hostname Router2
version 15.02
line vty 0 15
logging synchronous
exec-timeout 15 2
exec prompt timestamp
transport input ssh
transport preferred none
access-class REMOTE_ACCESS in
Hostname Router3
version 15
line console 0
logging synchronous
exec-timeout 15 3
usb-inactivity-timeout 15
exec prompt timestamp
transport preferred none
Hostname Router3
version 15.12
line vty 0 15
logging synchronous
exec-timeout 15 4
exec prompt timestamp
transport input ssh
transport preferred none
access-class REMOTE_ACCESS in
**Above is the sample content in my input text file**
$
import re
import csv
with open('sample5.csv','w',newline='') as output:
    HeaderFields = ['Hostname','version']
    writer = csv.DictWriter(output,fieldnames=HeaderFields)
    writer.writeheader()
with open('testfile.txt','r',encoding='utf-8') as input:
    for line in input.readlines():
        pattern = re.compile(r'Hostname(.*)''|''version(.*)')
        match=pattern.finditer(line)
        for match1 in match:
            with open('sample5.csv', 'a',newline='') as output:
                writer = csv.DictWriter(output, fieldnames=HeaderFields)
                writer.writerow({'Hostname': match1.group(1), 'version':
                                 match1.group(2)})
My expected result in csv is as follows:
Thank You.
Your code fails because each iteration reads only one line (which can contain a hostname or a version, but not both), yet you write a row to the CSV for every match. Instead, read the whole text and match two-line blocks:
a first line Hostname ... followed by a second line version ... The line break is \n on Unix-like systems and \r\n on Windows (classic Mac OS used \r), so the pattern accepts all three. Since each match now spans both lines, you can take both the hostname and the version from the same match object.
with open('testfile.txt','r',encoding='utf-8') as input:
    txt = input.read()
    pattern = re.compile(r'Hostname (.*)(\r\n?|\n)version (.*)')
    match=pattern.finditer(txt)
    for match1 in match:
        with open('sample5.csv', 'a',newline='') as output:
            writer = csv.DictWriter(output, fieldnames=HeaderFields)
            writer.writerow({'Hostname': match1.group(1), 'version':
                             match1.group(3)})
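A self-contained variant of the same idea, for comparison, is sketched below. It is not the answerer's code: the names PAIR, extract_pairs and write_pairs are made up, the pattern is slightly stricter (it anchors Hostname at the start of a line and captures only non-whitespace), and the output file is opened once rather than once per match.

```python
import csv
import re

# One match covers a "Hostname X" line immediately followed by a "version Y" line.
PAIR = re.compile(
    r'^Hostname[ \t]+(\S+)[ \t]*\r?\n[ \t]*version[ \t]+(\S+)',
    re.MULTILINE,
)


def extract_pairs(text):
    """Return one dict per Hostname/version pair found in the text."""
    return [{'Hostname': m.group(1), 'version': m.group(2)}
            for m in PAIR.finditer(text)]


def write_pairs(rows, out):
    """Write the header and all rows to an open text file object."""
    writer = csv.DictWriter(out, fieldnames=['Hostname', 'version'])
    writer.writeheader()
    writer.writerows(rows)
```

It could be called as write_pairs(extract_pairs(txt), output), with output opened via open('sample5.csv', 'w', newline='').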

Create Ipython magic command for saving last console input to file

Remark: I have now found a way to do this. I want to implement my own magic command in IPython that saves the last input to a Python file, in order to build up executable Python code interactively.
I thought about saving it as its own magicfile.py in the IPython startup directory:
#Save this file in the ipython profile startup directory which can be found via:
#import IPython
#IPython.utils.path.locate_profile()
from IPython.core.magic import (Magics, magics_class, line_magic,
                                cell_magic, line_cell_magic)
# The class MUST call this class decorator at creation time
@magics_class
class MyMagics(Magics):
    @line_magic
    def s(self, line):
        import os
        import datetime
        today = datetime.date.today()
        get_ipython().magic('%history -l 1 -t -f history.txt /')
        with open('history.txt', 'r') as history:
            lastinput = history.readline()
        with open('ilog_'+str(today)+'.py', 'a') as log:
            log.write(lastinput)
        os.remove('history.txt')
        print 'Successfully logged to ilog_'+str(today)+'.py!'
# In order to actually use these magics, you must register them with a
# running IPython. This code must be placed in a file that is loaded once
# IPython is up and running:
ip = get_ipython()
# You can register the class itself without instantiating it. IPython will
# call the default constructor on it.
ip.register_magics(MyMagics)
So right now I type a command in IPython, then s, and it is appended to today's log file.
Use the append argument, -a, with %save.
If this is the line you wish to save:
In [10]: print 'airspeed velocity of an unladen swallow: '
Then save it like this:
In [11]: %save -a IPy_session.py 10
The following commands were written to file `IPy_session.py`:
print 'airspeed velocity of an unladen swallow: '
See the IPython %save documentation.
It works by using the IPython %history magic. The history stores the previous inputs, so you just pick the last one and append it to a file named after today's date; that way all inputs from one day end up in one log file. The important lines are:
get_ipython().magic('%history -l 1 -t -f history.txt /')
with open('history.txt', 'r') as history:
    lastinput = history.readline()
with open('ilog_'+str(today)+'.py', 'a') as log:
    log.write(lastinput)
os.remove('history.txt')
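The file-handling half of this can be tried without IPython at all. The sketch below covers only the append-to-a-dated-log step; the helper name append_to_daily_log and its directory and today parameters are hypothetical additions for illustration, and it does not touch the IPython history.

```python
import datetime
from pathlib import Path


def append_to_daily_log(line, directory=".", today=None):
    """Append one input line to ilog_<date>.py and return the log's path."""
    today = today or datetime.date.today()
    # str(date) is the ISO form, e.g. 2018-09-17, as in the original file name.
    log_path = Path(directory) / "ilog_{}.py".format(today)
    with open(log_path, "a") as log:
        # Make sure each logged input ends up on its own line.
        log.write(line if line.endswith("\n") else line + "\n")
    return log_path
```

Inside the magic, the value read from history.txt would be passed as line, so every call during the same day extends the same file.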
