Writing csv header removes data from numpy array written below - python

I'm trying to export data to a csv file. It should contain a header (from datastack) and restacked arrays with my data (from datastack). One line in datastack has the same length as dataset. The code below works but it removes parts of the first line from datastack. Any ideas why that could be?
s = ','.join(itertools.chain(dataset)) + '\n'
newfile = 'export.csv'
f = open(newfile,'w')
f.write(s)
numpy.savetxt(newfile, (numpy.transpose(datastack)), delimiter=', ')
f.close()

You a file with the filename 'export.csv' twice, once when you call open() and once when you call numpy.savetxt(). Thus, there are two open file handles competing for the same filename. If you pass the file handle rather than the file name to numpy.savetxt() you avoid this race condition:
s = ','.join(itertools.chain(dataset)) + '\n'
newfile = 'export.csv'
f = open(newfile,'w')
f.write(s)
numpy.savetxt(f, (numpy.transpose(datastack)), delimiter=', ')
f.close()

Related

Slice a given txtfile and write only part of it in a newfile in python

This is my original .txt data:
HKEY_CURRENT_USER\SOFTWARE\7-Zip
HKEY_CURRENT_USER\SOFTWARE\AppDataLow
HKEY_CURRENT_USER\SOFTWARE\Chromium
HKEY_CURRENT_USER\SOFTWARE\Clients
HKEY_CURRENT_USER\SOFTWARE\CodeBlocks
HKEY_CURRENT_USER\SOFTWARE\Discord
HKEY_CURRENT_USER\SOFTWARE\Dropbox
HKEY_CURRENT_USER\SOFTWARE\DropboxUpdate
HKEY_CURRENT_USER\SOFTWARE\ej-technologies
HKEY_CURRENT_USER\SOFTWARE\Evernote
HKEY_CURRENT_USER\SOFTWARE\GNU
And I need to have a new file where the new lines contain only part of those strings, like:
7-Zip
AppDataLow
Chromium
Clients
...
how to do it in python?
Try this:
## read file content as string
with open("file.txt", "r") as file:
string = file.read()
## convert each line to list
lines = string.split("\n")
## write only last part after "\" in each line
with open("new.txt", "w") as file:
for line in lines:
file.write(line.split("\\")[-1] + "\n")
One approach would be to read the entire text file into a Python string. Then use split on each line to find the final path component.
with open('file.txt', 'r') as file:
data = file.read()
lines = re.split(r'\r?\n', data)
output = [x.split("\\")[-1] for x in lines]
# write to file if desired
text = '\n'.join(output)
f_out = open('output.txt', 'w')
f_out.write(text)
f_out.close()

Overwriting lines in text file [duplicate]

How can I insert a string at the beginning of each line in a text file, I have the following code:
f = open('./ampo.txt', 'r+')
with open('./ampo.txt') as infile:
for line in infile:
f.insert(0, 'EDF ')
f.close
I get the following error:
'file' object has no attribute 'insert'
Python comes with batteries included:
import fileinput
import sys
for line in fileinput.input(['./ampo.txt'], inplace=True):
sys.stdout.write('EDF {l}'.format(l=line))
Unlike the solutions already posted, this also preserves file permissions.
You can't modify a file inplace like that. Files do not support insertion. You have to read it all in and then write it all out again.
You can do this line by line if you wish. But in that case you need to write to a temporary file and then replace the original. So, for small enough files, it is just simpler to do it in one go like this:
with open('./ampo.txt', 'r') as f:
lines = f.readlines()
lines = ['EDF '+line for line in lines]
with open('./ampo.txt', 'w') as f:
f.writelines(lines)
Here's a solution where you write to a temporary file and move it into place. You might prefer this version if the file you are rewriting is very large, since it avoids keeping the contents of the file in memory, as versions that involve .read() or .readlines() will. In addition, if there is any error in reading or writing, your original file will be safe:
from shutil import move
from tempfile import NamedTemporaryFile
filename = './ampo.txt'
tmp = NamedTemporaryFile(delete=False)
with open(filename) as finput:
with open(tmp.name, 'w') as ftmp:
for line in finput:
ftmp.write('EDF '+line)
move(tmp.name, filename)
For a file not too big:
with open('./ampo.txt', 'rb+') as f:
x = f.read()
f.seek(0,0)
f.writelines(('EDF ', x.replace('\n','\nEDF ')))
f.truncate()
Note that , IN THEORY, in THIS case (the content is augmented), the f.truncate() may be not really necessary. Because the with statement is supposed to close the file correctly, that is to say, writing an EOF (end of file ) at the end before closing.
That's what I observed on examples.
But I am prudent: I think it's better to put this instruction anyway. For when the content diminishes, the with statement doesn't write an EOF to close correctly the file less far than the preceding initial EOF, hence trailing initial characters remains in the file.
So if the with statement doens't write EOF when the content diminishes, why would it write it when the content augments ?
For a big file, to avoid to put all the content of the file in RAM at once:
import os
def addsomething(filepath, ss):
if filepath.rfind('.') > filepath.rfind(os.sep):
a,_,c = filepath.rpartition('.')
tempi = a + 'temp.' + c
else:
tempi = filepath + 'temp'
with open(filepath, 'rb') as f, open(tempi,'wb') as g:
g.writelines(ss + line for line in f)
os.remove(filepath)
os.rename(tempi,filepath)
addsomething('./ampo.txt','WZE')
f = open('./ampo.txt', 'r')
lines = map(lambda l : 'EDF ' + l, f.readlines())
f.close()
f = open('./ampo.txt', 'w')
map(lambda l : f.write(l), lines)
f.close()

How does the last part of this Python code do what it does?

Essential what this code does is it takes a .txt file, replaces " " with "|"
then after it replaces it, it then goes back in and deletes any rows that start with a "|"
Im having trouble understanding what the last part of this code does.
I understand everything up until the:
output = [line for line in txt if not line.startswith("|")]
f.seek(0)
f.write("".join(output))
f.truncate()
everything before this code ^^ i understand but im not sure how this code does what its doing.
--------this is the full code----------
# This imports the correct package
import datetime
# this creates "today" as the variable that holds todays date
today = datetime.date.today().strftime("%Y%m%d")
# Read the file
with open(r'\\mlgserver04\mlgshare\DataTransfer&AuditCompliance\ARSI Calls\CallLog_ARSI_' + today + '.txt', 'r') as file:
filedata = file.read()
# Replace the target string
filedata = filedata.replace(' ', '|')
# Write the file out again
with open(r'\\mlgserver04\mlgshare\DataTransfer&AuditCompliance\ARSI Calls\CallLog_ARSI_' + today + '.txt', 'w') as file:
file.write(filedata)
# opens the file
with open(r'\\mlgserver04\mlgshare\DataTransfer&AuditCompliance\ARSI Calls\CallLog_ARSI_' + today + '.txt','r+') as f:
txt = f.readlines()
output = [line for line in txt if not line.startswith("|")]
f.seek(0)
f.write("".join(output))
f.truncate()
f.read() returns a blob of data (string or bytes)
On the other hand, f.readlines() returns a LIST of strings or bytes.
output = [line for line in txt if not line.startswith("|")]
This is the same thing as saying
output = []
for line in txt:
if not line.startswith("|"):
output.append(line)
So, make a NEW list of strings, consisting of only the ones that we want to keep.
"".join(output)
Join takes an iterable of strings, and adds them together with the delimiter (in this case, it was ""). so output[0] + "" + output[1] and so on.
Finally, write that final string back to the file.

Removing New Line from CSV Files using Python

I obtain multiple CSV files from API, in which I need to remove New Lines present in the CSV and join the record, consider the data provided below;
My Code to remove the New Line:
## Loading necessary libraries
import glob
import os
import shutil
import csv
## Assigning necessary path
source_path = "/home/Desktop/Space/"
dest_path = "/home/Desktop/Output/"
# Assigning file_read path to modify the copied CSV files
file_read_path = "/home/Desktop/Output/*.csv"
## Code to copy .csv files from one folder to another
for csv_file in glob.iglob(os.path.join(source_path, "*.csv"), recursive = True):
shutil.copy(csv_file, dest_path)
## Code to delete the second row in all .CSV files
for filename in glob.glob(file_read_path):
with open(filename, "r", encoding = 'ISO-8859-1') as file:
reader = list(csv.reader(file , delimiter = ","))
for i in range(0,len(reader)):
reader[i] = [row_space.replace("\n", "") for row_space in reader[i]]
with open(filename, "w") as output:
writer = csv.writer(output, delimiter = ",", dialect = 'unix')
for row in reader:
writer.writerow(row)
I actually copy the CSV files into a new folder and then use the above code to remove any new line present in the file.
You are fixing the csv File, because they have wrong \n the problem here is how
to know if the line is a part of the previous line or not. if all lines starts
with specifics words like in your example SV_a5d15EwfI8Zk1Zr or just SV_ You can do something like this:
import glob
# this is the FIX PART
# I have file ./data.csv(contains your example) Fixed version is in data.csv.FIXED
file_read_path = "./*.csv"
for filename in glob.glob(file_read_path):
with open(filename, "r", encoding='ISO-8859-1') as file, open(filename + '.FIXED', "w", encoding='ISO-8859-1') as target:
previous_line = ''
for line in file:
# check if it's a new line or a part of the previous line
if line.startswith('SV_'):
if previous_line:
target.write( previous_line + '\n')
previous_line = line[:-1] # remove \n
else:
# concatenate the broken part with previous_line
previous_line += line[:-1] # remove \n
# add last line
target.write(previous_line + '\n')
Ouput:
SV_a5d15EwfI8Zk1Zr;QID4;"<span style=""font-size:16px;""><strong>HOUR</strong> Interview completed at:</span>";HOUR;TE;SL;;;true;ValidNumber;0;23.0;0.0;882;-873;0
SV_a5d15EwfI8Zk1Zr;QID6;"<span style=""font-size:16px;""><strong>MINUTE</strong> Interview completed:</span>";MIN;TE;SL;;;true;ValidNumber;0;59.0;0.0;882;-873;0
SV_a5d15EwfI8Zk1Zr;QID8;Number of Refusals - no language<br />For <strong>Zero Refusals - no language</strong> use 0;REFUSAL1;TE;SL;;;true;ValidNumber;0;99.0;0.0;882;-873;0
SV_a5d15EwfI8Zk1Zr;QID10;<strong>DAY OF WEEK:</strong>;WEEKDAY;MC;SACOL;TX;;true;;0;;;882;-873;0
SV_a5d15EwfI8Zk1Zr;QID45;"<span style=""font-size:16px;"">Using points from 0 to 10, how likely would you be recommend Gatwick Airport to a friend or colleague?</span><div> </div>";NPSCORE;MC;NPS;;;true;;0;;;882;-873;
EDITS:
Can Be Simpler using split too, this will fix the file it self:
import glob
# this is the FIX PART
# I have file //data.csv the fixed version in the same file
file_read_path = "./*.csv"
# assuming that all lines starts with SV_
STARTING_KEYWORD = 'SV_'
for filename in glob.glob(file_read_path):
with open(filename, "r", encoding='ISO-8859-1') as file:
lines = file.read().split(STARTING_KEYWORD)
with open(filename, 'w', encoding='ISO-8859-1') as file:
file.write('\n'.join(STARTING_KEYWORD + l.replace('\n', '') for l in lines if l))
Well I'm not sure on the restrictions you have. But if you can use the pandas library , this is simple.
import pandas as pd
data_set = pd.read_csv(data_file,skip_blank_lines=True)
data_set.to_csv(target_file,index=False)
This will create a CSV File will all new lines removed. You can save a lot of time with available libraries.

Python read file by line and write into a different file

SO basically what I am trying to do is that I am trying to make it so I can read a file line by line, and then have a certain text added after the text displayed
For Ex.
Code:
file = open("testlist.txt",'w')
file2 = open("testerlist.txt",'r+')
//This gives me a syntax error obviously.
file.write1("" + file + "" + file2 + "")
Textlist
In my testlist.txt it lists as:
os
Testerlist
In my testerlist.txt it lists as:
010101
I am trying to copy one text from one file and read another file and add it to the beginning of a new file for ex.[accounts.txt].
My End Result
For my end result I am trying to have it be like:
os010101
(btw I have all the correct code, its just that I am using this as an example so if I am missing any values its just because I was to lazy to add it.)
You can use file.read() to read the contents of a file. Then just concatenate the data from two files and write to the output file:
with open("testlist.txt") as f1, open("testerlist.txt") as f2, \
open("accounts.txt", "w") as f3:
f3.write(f1.read().strip() + f2.read().strip())
Note that 'mode' is not required when opening files for reading.
If you need to write the lines in particular order, you could use file.readlines() to read the lines into a list and file.writelines() to write multiple lines to the output file, e.g.:
with open("testlist.txt") as f1, open("testerlist.txt") as f2, \
open("accounts.txt", "w") as f3:
f1_lines = f1.readlines()
f3.write(f1_lines[0].strip())
f3.write(f2.read().strip())
f3.writelines(f1_lines[1:])
Try with something like this:
with open('testlist.txt', 'r') as f:
input1 = f.read()
with open('testerlist.txt', 'r') as f:
input2 = f.read()
output = input1+input2
with open("accounts.txt", "a") as myfile:
myfile.write(output)

Categories