I'm not sure how to word my question exactly, and I have seen some similar questions asked but not exactly what I'm trying to do. If there already is a solution please direct me to it.
Here is what I'm trying to do:
At my work, we have a few pkgs we've built to handle various data types. One I am working with is reading in a csv file into a std_io object (std_io is our all-purpose object class that reads in any type of data file).
I am trying to connect this to another pkg I am writing, so I can make an object in the new pkg and convert it to a std_io object.
The problem is, the std_io object is meant to read an actual file, not take in an object. To get around this, I can basically write my data to temp.csv file then read it into a std_io object.
I am wondering if there is a way to eliminate this step of writing the temp.csv file.
Here is my code:
x #my object
df = x.to_df() #object class method to convert to a pandas dataframe
df.to_csv('temp.csv') #write data to a csv file
std_io_obj = std_read('temp.csv') #read csv file into a std_io object
Is there a way to basically pass what the output of writing the csv file would be directly into std_read? Does this make sense?
The only reason I want to do this is to avoid having to code additional functionality into either of the pkgs to directly accept an object as input.
Hope this was clear, and thanks to anyone who contributes.
For those interested, or who may have this same kind of issue/objective, here's what I did to solve this problem.
I basically just created a temporary named file, linked a .csv filename to this temp file, then passed it into my std_read function which requires a csv filename as an input.
This basically tricks the function into thinking it's taking the name of a real file as an input, and it just opens it as usual and uses csvreader to parse it up.
This is the code:
import tempfile
import os
x #my object I want to convert to a std_io object
text = x.to_df().to_csv() #object class method to convert to a pandas dataframe then generate the 'text' of a csv file
filename = 'temp.csv'
with tempfile.NamedTemporaryFile(dir='.') as f: #temp file in the current directory, so the hard link stays on one filesystem
    f.write(text.encode())
    f.flush() #push the buffered data to disk before std_read opens the link
    os.link(f.name, filename)
    stdio_obj = std_read(filename)
    os.unlink(filename)
FYI - the std_read function essentially just opens the file the usual way, and passes it into csvreader:
with open(filename, 'r') as f:
    rdr = csv.reader(f)
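A variant of the same idea, for anyone who cannot use hard links (both names must live on the same filesystem, and not every platform supports them): write the CSV text into a throwaway directory and hand that path to the reader. This is only a sketch. The std_read below is a hypothetical stand-in that mimics the behaviour described above (open the file, pass it to csv.reader), and read_via_temp_csv is a name made up for illustration.

```python
import csv
import os
import tempfile


def std_read(filename):
    """Hypothetical stand-in for the real std_read: opens a path, parses CSV."""
    with open(filename, 'r', newline='') as f:
        return list(csv.reader(f))


def read_via_temp_csv(csv_text, reader=std_read):
    """Write csv_text to a temporary .csv file and pass its path to reader."""
    # The directory and everything in it is removed when the block exits.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, 'temp.csv')
        with open(path, 'w', newline='') as f:
            f.write(csv_text)
        # The file is closed (and therefore flushed) before the reader opens it.
        return reader(path)


# In the original setting this text would come from x.to_df().to_csv()
text = "a,b\n1,2\n3,4\n"
rows = read_via_temp_csv(text)
print(rows)
```

The file still touches the disk, but nothing is left behind in the working directory and no cleanup code is needed in the caller.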
I don't need the entire code, but I want a push to help me on the way. I've been searching the internet for clues on how to start writing a function like this, but I haven't gotten any further than the name of the function.
So I haven't got the slightest clue on how to start with this, I don't know how to work with text files. Any tips?
These text files are CSV (Comma Separated Values). It is a simple file format used to store tabular data.
You may explore Python's built-in module called csv.
The following code snippet is an example of loading a .csv file in Python:
import csv
filename = 'us_population.csv'
with open(filename, 'r') as csvfile:
    csvreader = csv.reader(csvfile)
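To go one step further than creating the reader, here is a minimal sketch that also pulls the rows out of it. It first writes a tiny sample file so that it runs on its own; the column names and values are made up for illustration and are not real data.

```python
import csv
import os
import tempfile

# Create a small sample file so the example is self-contained.
# The contents are placeholders, not real figures.
tmpdir = tempfile.mkdtemp()
filename = os.path.join(tmpdir, 'us_population.csv')
with open(filename, 'w', newline='') as csvfile:
    csv.writer(csvfile).writerows(
        [['year', 'population'], ['2000', '100'], ['2010', '120']]
    )

with open(filename, 'r', newline='') as csvfile:
    csvreader = csv.reader(csvfile)
    header = next(csvreader)   # first row holds the column names
    rows = list(csvreader)     # remaining rows, each a list of strings

print(header)
print(rows)
```

Every value comes back as a string, so numeric columns need an explicit int() or float() conversion before you do arithmetic on them.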
I am new to Python and working a bit on pickle files.
I already have a pickle file called training.pickle and a txt file called danish.txt.
I would like to import danish.txt into the training.pickle file, but I don't know how to do it.
I have tried something, but I am sure it's wrong :-)
import pickle
file1=open('danish.txt','r')
file2=open('training.pickle','r')
obj=[file1.read(), file2.read()]
outfile.write("obj,training.pickle")
I don't know much about pickle but if you're just trying to add the data from "danish.txt" to the pickle file you should be able to just open the .txt, store the data in a variable, and then write the data in the pickle.
To demonstrate my thinking:
f = open("danish.txt", "r+")
data = f.read()
output = data
f.close() #this reads the .txt file
and then afterward you'd write "output" into the pickle file via whatever method you use to write a string variable to a pickle file.
P.S. As I said, I don't know much about pickle, but if it works anything like writing to a .txt, you'd have to change the r to a w, because r means opening the file in read mode (pickle data is binary, so for the pickle file that would be 'wb'). If it's only open for reading it can't be written to, or at least that's how it works with .txts. Also, if there's no particular reason why you're using a pickle to store data, why not just use a .txt?
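Here is a rough sketch of that read-then-write idea using the pickle module itself. It assumes the object stored in training.pickle is a list, which the question does not say, and it creates its own small sample files in a temporary directory so it can run anywhere; the sample contents are made up.

```python
import os
import pickle
import tempfile

tmpdir = tempfile.mkdtemp()
txt_path = os.path.join(tmpdir, 'danish.txt')
pickle_path = os.path.join(tmpdir, 'training.pickle')

# Sample inputs so the sketch runs on its own (contents are made up).
with open(txt_path, 'w', encoding='utf-8') as f:
    f.write('hej verden\n')
with open(pickle_path, 'wb') as f:
    pickle.dump(['existing entry'], f)   # assumption: the pickle holds a list

# 1. Load the object currently stored in the pickle (binary mode).
with open(pickle_path, 'rb') as f:
    training = pickle.load(f)

# 2. Read the text file and add its contents to the object.
with open(txt_path, 'r', encoding='utf-8') as f:
    training.append(f.read())

# 3. Write the updated object back, replacing the old pickle.
with open(pickle_path, 'wb') as f:
    pickle.dump(training, f)

with open(pickle_path, 'rb') as f:
    result = pickle.load(f)
print(result)
```

If the pickle holds something other than a list (a dict, a trained model, and so on), step 2 has to change to whatever "adding the text" means for that object.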
Is there a way, in the code below, to access the variable utterances_dict outside of the with-block? The code below obviously returns the error: ValueError: I/O operation on closed file.
from csv import DictReader
utterances_dict = {}
utterance_file = 'toy_utterances.csv'
with open(utterance_file, 'r') as utt_f:
    utterances_dict = DictReader(utt_f)
for line in utterances_dict:
    print(line)
I am not an expert on the DictReader implementation, but its documentation leaves room for the reader to parse the file lazily, after construction. That means the underlying file may have to remain open until you are done using the reader. In that case, it would be a problem to use utterances_dict outside of the with block, because the underlying file will be closed by then.
Even if the current implementation of DictReader does in fact parse the whole csv on construction, it doesn't mean their implementation won't change in the future.
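DictReader returns a lazy reader over the csv file: rows are read from the open file only as you iterate, so the file must still be open at that point.
Convert the result to a list of dictionaries inside the with block.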
from csv import DictReader
utterances = []
utterance_file = 'toy_utterances.csv'
with open(utterance_file, 'r') as utt_f:
    utterances = [dict(row) for row in DictReader(utt_f)]
for line in utterances:
    print(line)
I am using
csvFile=open("yelloUSA.csv",'w+')
to make a new csv file.
Now I want to create files named 1 to 10, and I want to use string formatting to build the names. How can I do that?
I used the code below, but it raises an error:
for i in range(0,10):
    csvFile=open("{0}.csv",'w+').format(i)
    writer=csv.writer(csvFile)
format is a method of string objects. You are applying it to the result of the open function, which is a file object. You need to apply it to the filename string instead:
filename = "{0}.csv".format(i)
csvFile=open(filename ,'w+')
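Put back into the loop, the fix could look like the sketch below. A few things here are my own choices rather than part of the question: the files go into a temporary directory (out_dir), the loop uses range(1, 11) so the names run from 1 to 10 as described, and each file gets one placeholder row.

```python
import csv
import os
import tempfile

out_dir = tempfile.mkdtemp()   # hypothetical output directory
created = []
for i in range(1, 11):
    # Format the name first, then open the resulting path.
    filename = os.path.join(out_dir, "{0}.csv".format(i))
    with open(filename, 'w', newline='') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(['file', i])   # placeholder row
    created.append(os.path.basename(filename))

print(created)
```

Using with closes each file before the next one is opened, so you don't end up with ten open file handles.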
So I basically just want to have a list of all the pixel colour values that overlap written in a text file so I can then access them later.
The only problem is that the text file is having (set([ or whatever written with it.
Here's my code:
import cv2
import numpy as np
import time
om=cv2.imread('spectrum1.png')
om=om.reshape(1,-1,3)
om_list=om.tolist()
om_tuple={tuple(item) for item in om_list[0]}
om_set=set(om_tuple)
im=cv2.imread('RGB.png')
im=cv2.resize(im,(100,100))
im= im.reshape(1,-1,3)
im_list=im.tolist()
im_tuple={tuple(item) for item in im_list[0]}
ColourCount= om_set & set(im_tuple)
File= open('Weedlist', 'w')
File.write(str(ColourCount))
Also, if I run this program again but with a different picture for comparison, will it append the data or overwrite it? It's kinda hard to tell when just looking at numbers.
If you replace these lines:
im=cv2.imread('RGB.png')
File= open('Weedlist', 'w')
File.write(str(ColourCount))
with:
import sys
im=cv2.imread(sys.argv[1])
open(sys.argv[1]+'Weedlist', 'w').write(str(list(ColourCount)))
you will get a new file for each input file and also you don't have to overwrite the RGB.png every time you want to try something new.
Files opened with mode 'w' will be overwritten. You can use 'a' to append.
You opened the file with the 'w' mode, write mode, which will truncate (empty) the file when you open it. Use 'a' append mode if you want data to be added to the end each time
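A quick way to see the difference for yourself is the small experiment below. It writes to a scratch file in a temporary directory rather than your real Weedlist, and the strings are placeholders.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'Weedlist')

with open(path, 'w') as f:
    f.write('first run\n')
with open(path, 'w') as f:      # 'w' truncates: 'first run' is gone
    f.write('second run\n')
with open(path, 'a') as f:      # 'a' keeps existing content and adds to the end
    f.write('third run\n')

with open(path) as f:
    contents = f.read()
print(contents)
```

Only the second and third lines survive, because the second open in 'w' mode emptied the file before writing.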
You are writing the str() conversion of a set object to your file:
ColourCount= om_set & set(im_tuple)
File= open('Weedlist', 'w')
File.write(str(ColourCount))
Don't use str to convert the whole object; format your data to a string you find easy to read back again. You probably want to add a newline too if you want each new entry to be added on a new line. Perhaps you want to sort the data too, since a set lists items in an order determined by implementation details.
If comma-separated works for you, use str.join(); your set contains tuples of integer numbers, and it sounds as if you are fine with the repr() output per tuple, so we can re-use that:
with open('Weedlist', 'a') as outputfile:
    output = ', '.join([str(tup) for tup in sorted(ColourCount)])
    outputfile.write(output + '\n')
I used with there to ensure that the file object is automatically closed again after you are done writing; see Understanding Python's with statement for further information on what this means.
Note that if you plan to read this data again, the above is not going to be all that efficient to parse again. You should pick a machine-readable format. If you need to communicate with an existing program, you'll need to find out what formats that program accepts.
If you are programming that other program as well, pick a format that its programming language supports. JSON is widely supported, for example: use the json module and convert your set to a list first. Calling json.dump(sorted(ColourCount), fileobj) followed by fileobj.write('\n') would produce newline-separated JSON objects.
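As a rough sketch of that JSON route, including reading the data back: the ColourCount values below are a made-up stand-in for the set computed in the question, and the file lives in a temporary directory under a name I chose for the example.

```python
import json
import os
import tempfile

# Stand-in for the set of colour tuples computed above (values are made up).
ColourCount = {(0, 0, 255), (255, 0, 0)}

path = os.path.join(tempfile.mkdtemp(), 'Weedlist.json')

# Append one JSON array per run, one run per line.
with open(path, 'a') as fileobj:
    json.dump(sorted(ColourCount), fileobj)
    fileobj.write('\n')

# Read every run back; JSON has no tuples or sets, so rebuild them.
runs = []
with open(path) as fileobj:
    for line in fileobj:
        runs.append({tuple(colour) for colour in json.loads(line)})

print(runs)
```

Each line of the file is a complete JSON document, so another program can process the runs one line at a time without loading the whole file.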
If that other program is coded in Python, consider using the pickle module, which writes Python objects to a file efficiently in a format the same module can load again:
import pickle
with open('Weedlist', 'ab') as picklefile:
    pickle.dump(ColourCount, picklefile)
and reading is as easy as:
sets = []
with open('Weedlist', 'rb') as picklefile:
    while True:
        try:
            # load the next pickled set; EOFError signals the end of the file
            sets.append(pickle.load(picklefile))
        except EOFError:
            break
See Saving and loading multiple objects in pickle file? as to why I use a while True loop there to load multiple entries.
How would you like the data to be written? Replace the final line by
File.write(str(list(ColourCount)))
Maybe you like that more.
If you run that program again, it will overwrite the previous content of the file. If you prefer to append the data, open the file with:
File= open('Weedlist', 'a')