csv file overwritten instead of appended in python [duplicate]

I am trying to append data to a file using numpy's savetxt function. Below is a minimal working example:
#!/usr/bin/env python3
import numpy as np
f = open('asd.dat', 'a')
for iind in range(4):
    a = np.random.rand(10, 10)
    np.savetxt(f, a)
f.close()
The error that I get is about the type of the value being written:
File "/usr/lib/python3/dist-packages/numpy/lib/npyio.py", line 1073,
in savetxt
fh.write(asbytes(format % tuple(row) + newline)) TypeError: must be str, not bytes
This error doesn't occur in Python 2, so I am wondering what the issue could be. Can anyone help me out?

You should open the file in binary mode.
#!/usr/bin/env python3
import numpy as np
f = open('asd.dat', 'ab')
for iind in range(4):
    a = np.random.rand(10, 10)
    np.savetxt(f, a)
f.close()
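The traceback shows that this numpy version's savetxt writes bytes (asbytes(...)), which a text-mode handle rejects under Python 3; a handle opened with 'ab' accepts the bytes and appends them. The same fix written with a context manager, as a minimal sketch, so the handle is closed even if savetxt raises:
import numpy as np

# append four 10x10 blocks to asd.dat; 'ab' = append, binary mode
with open('asd.dat', 'ab') as f:
    for _ in range(4):
        np.savetxt(f, np.random.rand(10, 10))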
Reference: How to write a numpy array to a csv file? (Stack Overflow)

Related

TypeError: NumPy Text File Import

Text File
I'm trying to import the text file above using the following code:
import numpy as np
Climate_data = np.genfromtxt('C:\\Users\\vishv\\Desktop\\Stats With Python\\Stats_with_python\\CSV Files\\Climate Data.txt', delimiter=',', skip_header='1')
print(Climate_data)
But it gives me the following TypeError: 'str' object cannot be interpreted as an integer
Any ideas?
You are passing skip_header as a string; it should be an integer (skip_header=1).
Also, please don't post (even parts of) code as images; it makes it harder to help.
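A corrected call might look like the following sketch (same path as in the question; only skip_header changes):
import numpy as np

# skip_header expects an integer count of header lines, not the string '1'
Climate_data = np.genfromtxt(
    'C:\\Users\\vishv\\Desktop\\Stats With Python\\Stats_with_python\\CSV Files\\Climate Data.txt',
    delimiter=',',
    skip_header=1,
)
print(Climate_data)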

Code in Python that converts .h5 files to .csv

I have a .h5 file that I need to convert to .csv, and this is what I have done:
# coding: utf-8
import numpy as np
import sys
import h5py

file = h5py.File('C:/Users/Sakib/Desktop/VNP46A1.A2020086.h01v01.001.2020087082319.h5', 'r')
a = list(file.keys())
np.savetxt(sys.stdout, file[a[0:]], '%g', ',')
But this generates an error saying 'list' object has no attribute 'encode'.
[P.S. I have not worked with the sys module before. Where will my new csv file be written, and with what name?]
First, you have a small error in the arrangement of the []: there is no need to create a list.
Also, sys.stdout depends on your process's "standard output"; for an interactive process it goes to the screen. If you want to capture the output, you should create a file and write to it. Finally, your format string ('%g') needs to match the data in the HDF5 dataset.
Try this:
import numpy as np
import h5py

h5f = h5py.File('C:/Users/.....h5', 'r')
for a in h5f.keys():
    outf = open('./save_' + a + '.txt', 'w')
    np.savetxt(outf, h5f[a][:], '%g', ',')  # index h5f, not file
    outf.close()
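Since the goal is a .csv file, here is a small variation of the same loop (a sketch, assuming each top-level key maps to a 2-D numeric dataset) that writes one comma-delimited file per key and lets context managers close everything:
import numpy as np
import h5py

with h5py.File('C:/Users/.....h5', 'r') as h5f:
    for name in h5f.keys():
        # one file per top-level dataset; the ',' delimiter makes it CSV
        with open('./save_' + name + '.csv', 'w') as outf:
            np.savetxt(outf, h5f[name][:], '%g', ',')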

How to read a limited number of lines from my file, as Python is giving a MemoryError

I have JSON data of about 7 GB in size, and I want to read just a few lines of that data (NOT ALL THE DATA). When I print all of the data, there is a memory error.
I tried to print it using pandas and numpy, but I couldn't.
import pandas as pd
import numpy as np
df = pd.read_json("xyz.json")
print(df.head())
If the file consists of a huge number of small objects separated by newlines, then read the file line by line and parse each object individually:
import json
import itertools

f = open("abc.json")
for line in itertools.islice(f, 3):
    line = line.strip()
    if not line:
        continue
    print(json.loads(line))
f.close()
This will read only the first 3 objects from abc.json.
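If you then want those few records in a DataFrame, the same islice approach combines with pandas (a sketch, assuming one JSON object per line):
import itertools
import json
import pandas as pd

with open("abc.json") as f:
    # parse only the first 3 non-empty lines
    records = [json.loads(line) for line in itertools.islice(f, 3) if line.strip()]
df = pd.DataFrame(records)
print(df)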
You can read in a chunk of the data with chunksize.
The pandas documentation details how to read a large input line by line. You can make the read_json method return an iterator which will read and return one fragment of the file at a time:
df = pd.read_json("xyz.json", lines=True, chunksize=1)
for chunk in df:
    print(chunk)
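To stop after a few records instead of iterating the whole file, the chunk iterator can be sliced and concatenated (a sketch, assuming xyz.json is in JSON-lines format as lines=True requires):
import itertools
import pandas as pd

reader = pd.read_json("xyz.json", lines=True, chunksize=1)
first_rows = pd.concat(itertools.islice(reader, 3))  # first 3 chunks of one line each
print(first_rows)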
What you need is a JSON reader that treats the input file as a stream (not reading it whole, but only as needed).
import ijson
from itertools import islice

f = open('xyz.json', 'r')
elements = ijson.items(f, '')
for x in islice(elements, 3):
    print(x)
This will print the first 3 objects from the JSON file.
Install it using (Linux)
sudo apt install python3-ijson
or with pip. See ijson on PyPI: https://pypi.org/project/ijson/

"TypeError: data type not understood" while reading csv file

I am attempting to read a .csv file and assign a particular range of values to its own list for indexing purposes:
import numpy as np
import scipy as sp
import matplotlib as plot
import pandas as pd
# read in the data held in the csv file using "read_csv", a function built into the "pandas" module
Ne = pd.read_table('Ionosphere_data.csv', sep='\s+', dtype=np.float64);
print(Ne.shape)
print(np.dtype)
# store each dimension from the csv file into its own array for plotting use
Altitude = Ne[:,1];
Time = Ne[0,:];
# loop through each electron density value with respect to
Ne_max = []
for i in range(0,len(Time)):
    for j in range(0,len(Altitude)):
        Ne_max[j] = np.max(Ne[:,i]);
print Ne_max
#plot(Time,Altitude,Ne_max);
#xaxislabel('Time (hours)')
#yaxislabel('Altitude (km)')
However, when I run this code, IPython displays the error message "TypeError: data type not understood" for line 10. (Another side note: when 'print(np.dtype)' is not included, a separate error message is given for line 13: "TypeError: unhashable type".)
Does anyone know if I am reading the file in incorrectly, or if there is another problem?
In a comment, you say that the line
print(np.dtype)
should be
print(np.dtype(Ne))
That gives the error TypeError: data type not understood. numpy.dtype tries to convert its argument into a numpy data type object. It is not used to inspect the data type of the argument.
For a Pandas DataFrame, use the dtypes attribute:
print(Ne.dtypes)
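As a quick illustration (a sketch reusing the read_table call from the question, and assuming a pandas version that provides DataFrame.to_numpy), the DataFrame reports one dtype per column, while the underlying array has a single dtype:
import numpy as np
import pandas as pd

Ne = pd.read_table('Ionosphere_data.csv', sep=r'\s+', dtype=np.float64)
print(Ne.dtypes)            # one dtype per column of the DataFrame
print(Ne.to_numpy().dtype)  # single dtype of the underlying NumPy array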

numpy.loadtxt does not read file with complex numbers

I am trying to read a file with complex numbers in the form:
data.dat
1.5795219122457646E-11-3.852906516379872E-15i -3.5949335665378405E-12-1.626143709108086E-15i
-6.720365121161621E-15-5.377186331212649E-17i -3.736251476362349E-15-3.0190920417856674E-17i
I use the following code to read the file :
import numpy as np
c_complex = np.loadtxt('data.dat', delimiter='\t', dtype=np.complex128)
But it gives me the following error :
TypeError: complex() argument must be a string or a number, not 'bytes'
What could I do to solve this problem ?
Thank you very much for your help
This seems to have been a bug in older versions of numpy (Issue). Either update your numpy to the latest version from their GitHub repository or use the function numpy.genfromtxt():
c_complex = np.genfromtxt('data.dat', delimiter='\t', dtype=np.complex128)
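Note that the sample rows end in 'i', while Python's complex() expects a 'j' suffix, so the parse may still fail even on a fixed numpy. If so, a per-column converter can rewrite the suffix first (a sketch with a hypothetical parse_complex helper, assuming two tab-separated columns):
import numpy as np

def parse_complex(token):
    # genfromtxt may pass converters bytes or str depending on the numpy version
    if isinstance(token, bytes):
        token = token.decode()
    return complex(token.replace('i', 'j'))

c_complex = np.genfromtxt(
    'data.dat',
    delimiter='\t',
    dtype=np.complex128,
    converters={0: parse_complex, 1: parse_complex},
)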
