How do I write a list to a file?

How do I write a list whose elements include words from different languages as well as numbers to a file?
s = ['привет', 'hi', 235, 235, 45]
with open('test.txt', 'wb') as f:
    f.write(s)
TypeError: a bytes-like object is required, not 'list'

Since you are opening a .txt file and mention different languages, I will assume that you want to write text containing characters from several scripts to a .txt file.
(If that's not the case and you want to write binary, see the other answers.)
You should specify the file's encoding explicitly; UTF-8 can represent all of the values you have.
s = ['привет', 'hi', 235, 235, 45]
with open("filename.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(str(val) for val in s))

Opening the file with 'wb' puts it in binary mode. I'm not sure that is what you want, since you are writing to a .txt file.
If you are sure you want binary mode, I would suggest using .bin or .dat as the file extension.
Here is one way to write the output as binary (I make several assumptions, because the question gives no details):
import sys
s = ['привет', 'hi', 235, 235, 45]
# Create an empty bytearray
byte_array = bytearray()
# For each item in the list s, convert it to a bytearray, and append it to the
# output bytearray
for item in s:
    if isinstance(item, str):
        byte_array.extend(bytearray(item, 'utf-8'))
    elif isinstance(item, int):
        # Assumption: all ints in the list can be represented with only 2 bytes
        # Assumption: we want to output with the system's byte order
        byte_array.extend(item.to_bytes(2, sys.byteorder, signed=True))
# Here I use the .bin extension, because it is a binary file
with open('test.bin', 'wb') as f:
    f.write(byte_array)
If you instead do not want to write as binary, but want to write as text, see Spiros Gkogkas's answer.

with open('filename.txt', 'w') as file:
    file.write(', '.join([str(item) for item in s]))
It writes every item in a list (first changing every item to a string of course) separated by a comma.

Maybe serialize the list to a JSON string, like this:
import json
dataList = ["po", "op", "oo"]
dataJson = json.dumps(dataList)
# Then your code, opening the file in text mode because dataJson is a str
# with open('test.txt', 'w') as f: f.write(dataJson)
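One reason to prefer JSON over a joined string is that the numbers come back as numbers when the file is read again. Here is a minimal sketch of the round trip, assuming text mode with UTF-8; the file name test.json and the temporary directory are only illustrative:

```python
import json
import os
import tempfile

s = ['привет', 'hi', 235, 235, 45]

# Illustrative path in the system's temporary directory
path = os.path.join(tempfile.gettempdir(), 'test.json')

# ensure_ascii=False keeps non-ASCII text readable in the file
with open(path, 'w', encoding='utf-8') as f:
    json.dump(s, f, ensure_ascii=False)

# Reading the file back restores strings as str and numbers as int
with open(path, encoding='utf-8') as f:
    restored = json.load(f)

print(restored == s)
```

With the comma-joined or newline-joined variants, every item would come back as a string and would have to be converted by hand.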

Related

Python program for writing length of list to file

I have a file list.txt that contains a single list only e.g.
[asd,ask,asp,asq]
The list might be a very long one. I want to create a Python program len.py that reads list.txt and writes the length of the list it contains to the file num.txt. Something like the following:
fin = open("list.txt", "rt")
fout = open("num.txt", "wt")
for list in fin:
    fout.write(len(list))
fin.close()
fout.close()
However this does not work. Can someone point out what needs to be changed? Many thanks.

Use:
with open("list.txt") as f1, open("num.txt", "w") as f2:
    for line in f1:
        line = line.strip('\n[]')
        f2.write(str(len(line.split(','))) + '\n')

with open("list.txt") as fin, open("num.txt", "w") as fout:
    input_data = fin.readline()
    # check if there was any info read from input file
    if input_data:
        # split string into list on comma character
        strs = input_data.replace('[','').split('],')
        lists = [map(int, s.replace(']','').split(',')) for s in strs]
        print(len(lists))
        fout.write(str(len(lists)))
I updated the code to use the with statement from another answer. I also used some code from this answer (How can I convert this string to list of lists?) to count nested lists more correctly.

When Python reads a file in the default (text) mode, it treats the content as a string. So the first step is to convert that string into the appropriate type, and a plain type cast cannot do that.
You can use the standard-library ast module to convert the data.
import ast
fin = open("list.txt", "r")
fout = open("num.txt", "w")
for line in fin.readlines():
    # literal_eval needs a valid Python literal, e.g. ['asd', 'ask'] with quoted items
    # write() needs a string, so convert the length with str()
    fout.write(str(len(ast.literal_eval(line))))
fin.close()
fout.close()
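For unquoted items such as [asd,ask,asp,asq], which are not Python literals, counting by plain string handling is one option. This is only a sketch: count_items is a hypothetical helper name, and it assumes that items never contain commas or brackets themselves.

```python
def count_items(line):
    """Count comma-separated items in a line such as '[asd,ask,asp,asq]'."""
    # Drop the newline, then the surrounding brackets
    inner = line.strip().strip('[]').strip()
    if not inner:
        return 0  # '[]' is an empty list, not a list with one empty item
    return len(inner.split(','))


print(count_items('[asd,ask,asp,asq]\n'))  # 4
print(count_items('[]'))                   # 0
```

The explicit check for an empty string matters because ''.split(',') returns a list with one empty string, which would otherwise be counted as one item.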

Need to convert string to format usable by .hex() or other hex conversion method

I am reading hex data from a .csv file that has multiple rows (example format: FFFDF3FFFBF2FFFAF210FFF0) using the following code:
import csv
with open('c:\\temp\\results.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=",")
    line_count = 0
    file = open('c:\\temp\\sent.csv', 'w')
    for row in csv_reader:
        hex_string = f'{row[0]}'
        bytes_object = bytes.fromhex(hex_string)
        file.write(str(bytes_object) + '\n')
        line_count += 1
    file.close()
The output file contains multiple rows converted to this format (sorry, I'm new to Python, so I'm not sure whether this is a bytearray or what it is actually called): b'\xff\xfd\xf3\xff\xfb\xf2\xff\xfa\xf2\x10\xff\xf0'
I am trying to convert back from this format to the original format by reading the rows of the newly created .csv file (I need to edit readable ASCII in the file and convert it back for use in another program).
file = open('c:\\temp\\sent.csv', 'r')
for row in file:
    byte_string = row
    # hex_object = byte_string.hex()
    # THIS works if I enter the byte array in directly, but not if reading
    # from file: hex_object = byte_string.hex()
    hex_object = b'\xff\xfd\x03\xff\xfb\x03\xff\xfd\x01\xff\xfb\x17\xff\xfa\xff\xf0\xff\xfd\x00\xff\xfb\x00'.hex()
    print(hex_object)
    # print(byte_string)
    # writer.writerow(hex_object)
Is there a way to get this to work? I have tried several encoding methods, but since the data is already in the proper format I really just need to get it into a type that the .hex() method accepts. I am using the latest version of Python, 3.8.1.

You are storing a textual representation of your bytes object and then trying to read it back without conversion to/from binary. Instead you are better off opening the output file in binary format like this:
file = open('c:\\temp\\sent.csv', 'wb')
and write the bytes to file:
bytes_object = bytes.fromhex(hex_string)
file.write(bytes_object)
(no need for newline character).
Then to do the opposite, open the file in binary mode:
with open('c:\\temp\\sent.csv', "rb") as f:
    data = f.read()
    s = data.hex()
    print(s)
Here data is a bytes object, and it has the hex() method you are looking for.
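If the file of textual representations (rows that look like b'\xff\xfd...') already exists and has been edited by hand, one possible way to get the bytes back is ast.literal_eval, which can parse a bytes literal written as text. A small sketch, with an in-memory list standing in for the rows read from the file:

```python
import ast

# Each row is the text that str(bytes_object) produced, as read from the file
rows = [
    "b'\\xff\\xfd\\xf3\\xff\\xfb\\xf2\\xff\\xfa\\xf2\\x10\\xff\\xf0'\n",
]

for row in rows:
    bytes_object = ast.literal_eval(row.strip())  # text -> bytes
    hex_string = bytes_object.hex().upper()       # bytes -> hex digits
    print(hex_string)  # FFFDF3FFFBF2FFFAF210FFF0
```

This keeps one value per row, which the pure binary file does not, but it only works as long as each edited row is still a valid bytes literal.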

filtering a weird text file in python

I have a text file in which each ID line starts with > and the following line(s) contain a sequence of characters. The line after the sequence is another ID line starting with >. For some IDs, however, I have “Sequence unavailable” instead of a sequence. The sequence after an ID line can span one or more lines.
Like this example:
>ENSG00000173153|ENST00000000442|64073050;64074640|64073208;64074651
AAGCAGCCGGCGGCGCCGCCGAGTGAGGGGACGCGGCGCGGTGGGGCGGCGCGGCCCGAGGAGGCGGCGGAGGAGGGGCCGCCCGCGGCCCCCGGCTCACTCCGGCACTCCGGGCCGCTC
>ENSG00000004139|ENST00000003834
Sequence unavailable
I want to filter out those IDs with “Sequence unavailable”. The output should look like this:
output:
>ENSG00000173153|ENST00000000442|64073050;64074640|64073208;64074651
AAGCAGCCGGCGGCGCCGCCGAGTGAGGGGACGCGGCGCGGTGGGGCGGCGCGGCCCGAGGAGGCGGCGGAGGAGGGGCCGCCCGCGGCCCCCGGCTCACTCCGGCACTCCGGGCCGCTC
Do you know how to do that in Python?

Unlike the other answers, I’d strongly recommend against parsing the FASTA format manually. It’s not too hard but there are pitfalls, and it’s completely unnecessary since efficient, well-tested implementations exist:
Use Bio.SeqIO from BioPython; for example:
from Bio import SeqIO
for record in SeqIO.parse(filename, 'fasta'):
    if record.seq != 'Sequenceunavailable':
        SeqIO.write(record, outfile, 'fasta')
Note the missing space in 'Sequenceunavailable': reading the sequences in FASTA format will omit spaces.

How about this:
with open(filename, 'r+') as f:
    data = f.read()
    data = data.split('>')
    result = ['>{}'.format(item) for item in data if item and 'Sequence unavailable' not in item]
    f.seek(0)
    for line in result:
        f.write(line)
    # Cut off what is left of the old, longer content
    f.truncate()
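The question notes that a sequence may span several lines, so here is a rough sketch of the same record-by-record idea that streams the input instead of reading it all at once. The names is_available and filter_records are made up for this example, and the sample IDs are shortened placeholders:

```python
import io


def is_available(record):
    """A record is the ID line followed by its sequence lines."""
    return not any(line.startswith('Sequence unavailable') for line in record[1:])


def filter_records(lines):
    """Yield the lines of every record whose sequence is available."""
    record = []
    for line in lines:
        # A new ID line closes the record collected so far
        if line.startswith('>') and record:
            if is_available(record):
                yield from record
            record = []
        record.append(line)
    # Do not forget the last record in the file
    if record and is_available(record):
        yield from record


sample = io.StringIO(
    '>id1\nAAGC\nGGTT\n>id2\nSequence unavailable\n>id3\nCCGG\n'
)
result = ''.join(filter_records(sample))
print(result)
```

With a real file, the generator can be fed an open file object and its output passed to writelines() on a second, separate output file.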

def main():
    with open('text.txt') as infile:
        filter_file(infile)

def filter_file(sequence_file):
    lines = iter(sequence_file)
    with open('outfile', 'w') as outfile:
        for line in lines:
            if line.startswith('>'):
                # Assumes the sequence occupies exactly one line after the ID line
                sequence = next(lines, '')
                if not sequence.startswith('Sequence unavailable'):
                    # Lines read from a file already end with a newline
                    outfile.write(line + sequence)

main()
I unfortunately can't test this code right now; I wrote it off the top of my head. Please test it and let me know the outcome so I can adjust the code :-)

I don't know how large these files will get, so just in case, I'm doing it without reading the whole file into memory:
with open(filename) as fh:
    with open(filename+'.new', 'w+') as fh_new:
        for idline, geneseq in zip(*[iter(fh)] * 2):
            if geneseq.strip() != 'Sequence unavailable':
                fh_new.write(idline)
                fh_new.write(geneseq)
It works by creating a new file. The zip expression reads the file two lines at a time: idline is the first line of each pair and geneseq the second, so it assumes every sequence occupies exactly one line.
This solution should be relatively cheap in computing power but will create an extra output file.
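To make the zip trick less magical, here is a tiny illustration on a plain list instead of a file; the sample lines are made up:

```python
lines = ['>id1\n', 'AAGC\n', '>id2\n', 'Sequence unavailable\n']

it = iter(lines)
# zip pulls from the same iterator twice per step, so items come out in pairs
pairs = list(zip(it, it))
print(pairs)

# The shorter spelling used in the answer builds the same pairs
same_pairs = list(zip(*[iter(lines)] * 2))
print(same_pairs == pairs)
```

Because [iter(lines)] * 2 repeats a reference to one iterator rather than creating two, each tuple consumes two consecutive items. A trailing unpaired line would be dropped silently.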

open a .json file with multiple dictionaries

I have a problem that I can't solve with python, it is probably very stupid but I didn't manage to find the solution by myself.
I have a .json file where the results of a simulation are stored. The result is stored as a series of dictionaries like
{"F_t_in_max": 709.1800264942982, "F_t_out_max": 3333.1574129603068, "P_elec_max": 0.87088836042046958, "beta_max": 0.38091242406098391, "r0_max": 187.55175182942901, "r1_max": 1354.8636763521174, " speed ": 8}
{"F_t_in_max": 525.61428305710433, "F_t_out_max": 2965.0538075438467, "P_elec_max": 0.80977406754203796, "beta_max": 0.59471606595464666, "r0_max": 241.25371753877008, "r1_max": 688.61786996066826, " speed ": 9}
{"F_t_in_max": 453.71124051199763, "F_t_out_max": 2630.1763649193008, "P_elec_max": 0.64268078173342935, "beta_max": 1.0352896471221695, "r0_max": 249.32706230502498, "r1_max": 709.11415981343885, " speed ": 10}
I would like to open the file and access the values, for example to plot "r0_max" as a function of "speed", but I can't open it unless it contains only one dictionary.
I use
with open('./results/rigid_wing_opt.json') as data_file:
    data = json.load(data_file)
but when the file contains more than one dictionary I get the error
ValueError: Extra data: line 5 column 1 - line 6 column 1 (char 217 - 431)

If your input data is exactly as shown, you should be able to parse each dictionary individually with json.loads. If each dictionary is on its own line, this should be sufficient:
import json
with open('filename', 'r') as handle:
    json_data = [json.loads(line) for line in handle]
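Once each line is parsed, the values for plotting can be pulled out with list comprehensions. A sketch with shortened sample lines standing in for the file; note that the key in the question's data is " speed " with surrounding spaces, so this example strips the keys first:

```python
import json

# Shortened stand-ins for the lines of the results file
raw_lines = [
    '{"r0_max": 187.55, " speed ": 8}',
    '{"r0_max": 241.25, " speed ": 9}',
    '{"r0_max": 249.33, " speed ": 10}',
]

records = []
for line in raw_lines:
    if line.strip():  # skip blank lines
        record = json.loads(line)
        # Normalise keys such as ' speed ' to 'speed'
        records.append({key.strip(): value for key, value in record.items()})

speeds = [r['speed'] for r in records]
r0_max = [r['r0_max'] for r in records]
print(speeds)
print(r0_max)
```

The two lists can then be handed to whatever plotting library is in use.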

I would recommend reading the file line by line and converting each line independently to a dictionary.
You can place each line into a list with the following code:
import ast
# Read all lines into a list
with open(fname) as f:
    content = f.readlines()
# Convert each list item to a dict
content = [ ast.literal_eval( line ) for line in content ]
Or an even shorter version performing the list comprehension on the same line:
import ast
# Read all lines into a list
with open(fname) as f:
    content = [ ast.literal_eval( l ) for l in f.readlines() ]

{...} {...} is not proper JSON. It is two JSON objects separated by whitespace. Unless you can change the format of the input file to correct this, I'd suggest you try something a little different. If the data is as simple as in your example, then you could do something like this:
import json
import re
with open('filename', 'r') as handle:
    text_data = handle.read()
    # Join neighbouring objects with commas, whatever whitespace separates them
    text_data = '[' + re.sub(r'\}\s*\{', '},{', text_data) + ']'
    json_data = json.loads(text_data)
This should work even if your dictionaries are not on separate lines.

That is not valid JSON. You can't have multiple objects at the top level without surrounding them with a list and inserting commas between them.
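If the file cannot be changed, another option is to decode the objects one after another with json.JSONDecoder.raw_decode, which returns the parsed object together with the position where it stopped. A minimal sketch; iter_json_objects is a hypothetical helper name and the sample text is made up:

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value found in text, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while True:
        # raw_decode does not skip leading whitespace, so do it here
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            return
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


text = '{"a": 1} {"a": 2}\n{"a": 3}'
objects = list(iter_json_objects(text))
print(objects)
```

Unlike the regular-expression approach, this does not touch the text, so braces inside string values cannot confuse it.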

Converting to csv from?

I have got a file with the following lines
{"status":"OK","message":"OK","data":[{"type":"addressAccessType","addressAccessId":"0a3f508f-e7c8-32b8-e044-0003ba298018","municipalityCode":"0766","municipalityName":"Hedensted","streetCode":"0072","streetName":"Værnegården","streetBuildingIdentifier":"13","mailDeliverySublocationIdentifier":"","districtSubDivisionIdentifier":"","postCodeIdentifier":"8000","districtName":"Århus","presentationString":"Værnegården 13, 8000 Århus","addressSpecificCount":1,"validCoordinates":true,"geometryWkt":"POINT(553564 6179299)","x":553564,"y":6179299}]}
I want to transform every line into a row of a CSV file with headers, like the following:
status,message,data,addressAccessId,municipalityCode,municipalityName,streetCode,streetName,streetBuildingIdentifier,mailDeliverySublocationIdentifier,districtSubDivisionIdentifier,postCodeIdentifier,districtName,presentationString,addressSpecificCount,validCoordinates,geometryWkt,x,y
OK,OK,data:type,addressAccessType,0a3f508f-e7c8-32b8-e044-0003ba298018,0766,Hedensted,0072,Værnegården,13,,,8000,Århus,Værnegården 13, 8000 Århus,1,true,POINT553564 6179299,553564,6179299
How do I accomplish that? Code and explanation are very welcome. So far, this is what I have come up with, based on this example (How can I convert JSON to CSV?):
x = json.loads(x)
f = csv.writer(open('test.csv', 'wb+'))
# Write CSV Header, If you dont need that, remove this line
f.writerow(['status', 'message', 'type', 'addressAccessId', 'municipalityCode','municipalityName','streetCode','streetName','streetBuildingIdentifier','mailDeliverySublocationIdentifier','districtSubDivisionIdentifier','postCodeIdentifier','districtName','presentationString','addressSpecificCount','validCoordinates','geometryWkt','x','y'])
for x in x:
    f.writerow([x['status'],
                x['message'],
                x['data']['type'],
                x['data']['addressAccessId'],
                x['data']['municipalityCode'],
                x['data']['municipalityName'],
                x['data']['streetCode'],
                x['data']['streetName'],
                x['data']['streetBuildingIdentifier'],
                x['data']['mailDeliverySublocationIdentifier'],
                x['data']['districtSubDivisionIdentifier'],
                x['data']['postCodeIdentifier'],
                x['data']['districtName'],
                x['data']['presentationString'],
                x['data']['addressSpecificCount'],
                x['data']['validCoordinates'],
                x['data']['geometryWkt'],
                x['data']['x'],
                x['data']['y']])
I have looked through and tried a lot of other solutions, including DictWriter, replace() and translate() to remove characters, but have not yet been able to transform the line to my needs. The purpose is to be able to select the fields that are output into a new file, and to transform x and y to a new coordinate system. But for now I'm just trying to parse the above line to a CSV file. Can anyone offer code and an explanation of their code? Thank you very much for your time.
Below are the first few lines of my addresses.txt
{"status":"OK","message":"OK","data":[{"type":"addressAccessType","addressAccessId":"0a3f5081-e039-32b8-e044-0003ba298018","municipalityCode":"0265","municipalityName":"Roskilde","streetCode":"0831","streetName":"Brønsager","streetBuildingIdentifier":"69","mailDeliverySublocationIdentifier":"","districtSubDivisionIdentifier":"Svogerslev","postCodeIdentifier":"4000","districtName":"Roskilde","presentationString":"Brønsager 69, 4000 Roskilde","addressSpecificCount":1,"validCoordinates":true,"geometryWkt":"POINT(690026 6169309)","x":690026,"y":6169309}]}
{"status":"OK","message":"OK","data":[{"type":"addressAccessType","addressAccessId":"0a3f5089-ecab-32b8-e044-0003ba298018","municipalityCode":"0461","municipalityName":"Odense","streetCode":"9505","streetName":"Vægtens Kvarter","streetBuildingIdentifier":"271","mailDeliverySublocationIdentifier":"","districtSubDivisionIdentifier":"Holluf Pile","postCodeIdentifier":"5220","districtName":"Odense SØ","presentationString":"Vægtens Kvarter 271, 5220 Odense SØ","addressSpecificCount":1,"validCoordinates":true,"geometryWkt":"POINT(592191 6135829)","x":592191,"y":6135829}]}
{"status":"OK","message":"OK","data":[{"type":"addressAccessType","addressAccessId":"0a3f507c-adc3-32b8-e044-0003ba298018","municipalityCode":"0165","municipalityName":"Albertslund","streetCode":"0445","streetName":"Skyttehusene","streetBuildingIdentifier":"33","mailDeliverySublocationIdentifier":"","districtSubDivisionIdentifier":"","postCodeIdentifier":"2620","districtName":"Albertslund","presentationString":"Skyttehusene 33, 2620 Albertslund","addressSpecificCount":1,"validCoordinates":true,"geometryWkt":"POINT(711079 6174741)","x":711079,"y":6174741}]}
{"status":"OK","message":"OK","data":[{"type":"addressAccessType","addressAccessId":"0a3f509c-7f57-32b8-e044-0003ba298018","municipalityCode":"0851","municipalityName":"Aalborg","streetCode":"5205","streetName":"Løvstikkevej","streetBuildingIdentifier":"36","mailDeliverySublocationIdentifier":"","districtSubDivisionIdentifier":"","postCodeIdentifier":"9000","districtName":"Aalborg","presentationString":"Løvstikkevej 36, 9000 Aalborg","addressSpecificCount":1,"validCoordinates":true,"geometryWkt":"POINT(552407 6322490)","x":552407,"y":6322490}]}
{"status":"OK","message":"OK","data":[{"type":"addressAccessType","addressAccessId":"0a3f5098-32a6-32b8-e044-0003ba298018","municipalityCode":"0779","municipalityName":"Skive","streetCode":"0462","streetName":"Landevejen","streetBuildingIdentifier":"52","mailDeliverySublocationIdentifier":"","districtSubDivisionIdentifier":"Håsum","postCodeIdentifier":"7860","districtName":"Spøttrup","presentationString":"Landevejen 52, 7860 Spøttrup","addressSpecificCount":1,"validCoordinates":true,"geometryWkt":"POINT(491515 6269739)","x":491515,"y":6269739}]}

Note that the data key holds a list of dictionaries. x['data']['type'] wouldn't work, but x['data'][0]['type'] would. There might be more than one such dictionary in that list, however. I'll assume you want a CSV row per x['data'] dictionary.
Next, it appears you have a UTF-8 BOM on every line; whatever wrote this was not using UTF-8 encoding correctly. We need to strip this marker, the first 3 characters.
Last, JSON strings are always Unicode data, and you have non-ASCII characters in your data, so you'll have to encode to bytestrings again before passing the data to the CSV writer object.
I'd use csv.DictWriter here, with a pre-defined list of field names (the code below is written for Python 2):
import codecs
import csv
import json
fields = [
    'status', 'message', 'type', 'addressAccessId', 'municipalityCode',
    'municipalityName', 'streetCode', 'streetName', 'streetBuildingIdentifier',
    'mailDeliverySublocationIdentifier', 'districtSubDivisionIdentifier',
    'postCodeIdentifier', 'districtName', 'presentationString', 'addressSpecificCount',
    'validCoordinates', 'geometryWkt', 'x', 'y']
with open('test.csv', 'wb') as csvfile, open('jsonfile', 'r') as jsonfile:
    writer = csv.DictWriter(csvfile, fields)
    writer.writeheader()
    for line in jsonfile:
        if line.startswith(codecs.BOM_UTF8):
            line = line[3:]
        entry = json.loads(line)
        for item in entry['data']:
            row = dict(item, status=entry['status'], message=entry['message'])
            row = {k.encode('utf8'): unicode(v).encode('utf8') for k, v in row.iteritems()}
            writer.writerow(row)
The row dictionary is basically a copy of each of the dictionaries in the entry['data'] list, with the status and message keys copied over separately. This makes row a flat dictionary instead.
I also read your input file line by line, as you say that each line contains a separate JSON entry.
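The code above relies on Python 2 features (unicode, iteritems, a CSV file opened in binary mode). On Python 3 the same flattening idea might look roughly like the sketch below. This is not the original author's code: json_lines_to_csv is a made-up function name, the field list is cut down, and the sample line is shortened to keep the example small.

```python
import csv
import io
import json

# Shortened field list for illustration; the full list works the same way
FIELDS = ['status', 'message', 'type', 'addressAccessId', 'streetName', 'x', 'y']


def json_lines_to_csv(json_lines, csv_file, fields=FIELDS):
    # extrasaction='ignore' drops keys not in fields; restval='' fills missing ones
    writer = csv.DictWriter(csv_file, fields, extrasaction='ignore', restval='')
    writer.writeheader()
    for line in json_lines:
        line = line.lstrip('\ufeff').strip()  # drop a leading BOM, if any
        if not line:
            continue
        entry = json.loads(line)
        for item in entry['data']:
            # Flatten: copy status and message next to the item's own keys
            row = dict(item, status=entry['status'], message=entry['message'])
            writer.writerow(row)


sample = [
    '{"status":"OK","message":"OK","data":[{"type":"addressAccessType",'
    '"addressAccessId":"0a3f5081","streetName":"Brønsager",'
    '"x":690026,"y":6169309}]}\n'
]
out = io.StringIO()
json_lines_to_csv(sample, out)
print(out.getvalue())
```

With real files, the output would typically be opened with open('test.csv', 'w', newline='', encoding='utf-8'). In Python 3 no manual encoding of keys and values is needed, because the csv module works with text.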

Open the output file with csv.DictWriter() and define the output header fields as you specified. Use extrasaction='ignore' and restval='' as options.
Look at "Opening A large JSON file in Python with no newlines for csv conversion Python 2.6.6" for help with processing large files, as I had a similar question. Also look at the question that I link to.
I built a similar type of system from JSON using appropriate loops.
For example,
def parse_row(currdata):
    outx = {}
    # currdata is defined earlier to point to the x['data'] dictionary
    for eachx in currdata:
        outx[eachx] = currdata[eachx]
    return outx
where this is in a function with currdata as an argument and called with x['data'][row] as the input argument.
rows = len(x['data'])
for row in range(rows):
    outx = parse_row(x['data'][row])
    # process the row and create output
This should allow you to set up the parsing properly. I cannot copy the actual code into this answer but this should point you to a solution.

We Keep Coding
