Python script converting .dat to JSON

I have a .dat file that I want to use in my script, which draws a scatter graph from the data in that file. I have been manually converting the .dat files to .csv for this purpose, but I find that unsatisfactory.
This is what I am using currently.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
filename = raw_input('Enter filename ')  # Python 2; use input() on Python 3
csv = pd.read_csv(filename)
data=csv[['deformation','stress']]
data=data.astype(float)
x=data['deformation']
y=data['stress']
plt.scatter(x,y,s=0.5)
fit=np.polyfit(x,y,15)
p=np.poly1d(fit)
plt.plot(x,p(x),"r--")
plt.show()
A programmer friend told me that it would be more convenient to convert the file to JSON and use it as such. How would I go about this?

Try using NumPy's file-reading functions:
import numpy as np

# fromfile reads raw binary and needs the dtype of your data
yourArray = np.fromfile('YourData.dat', dtype=float)
# loadtxt parses whitespace-delimited text
yourArray = np.loadtxt('YourData.dat')
loadtxt is more flexible than fromfile for text-based .dat files.
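Since the stated goal is JSON, here is a minimal sketch, assuming the .dat file is whitespace-delimited text with a header row containing the deformation and stress columns (adjust sep and the column names to your data):

import pandas as pd

# Read the whitespace-delimited .dat file (assumed format)
df = pd.read_csv('YourData.dat', sep=r'\s+')
# Write a JSON list of {column: value} records
df.to_json('YourData.json', orient='records')

# The plotting script can then load it back
data = pd.read_json('YourData.json')

That said, read_csv with sep=r'\s+' can feed the plotting script directly, so the JSON step is a convenience rather than a requirement.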

Related

Pandas read_csv gives decimal column numbers

I've been pulling my hair out trying to make a bipartite graph from a csv file, and so far all I have is a pandas matrix that looks like this
My code so far is just:
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# import pyexcel as pe
# import pyexcel.ext.xlsx
from networkx.algorithms import bipartite
mat = pd.read_csv("networkdata3.csv")
# mat = pd.read_excel("networkdata1.xlsx",sheet_name="sheet_name_1")
print(mat.info())  # inspect the parsed DataFrame
sand = nx.from_pandas_adjacency(mat)
and I have no clue what I'm doing wrong. Initially I was trying to read it in as the original xlsx file, but then I converted it to a csv and it started reading. I assume I can't build the graph because the column labels come in as decimal numbers, and the error it spits out claims that the labels don't match up. So how else should I be doing this to actually start making some progress?
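For what it's worth, nx.from_pandas_adjacency requires a square DataFrame whose index and columns carry the same node labels, which a plain read_csv does not produce. A hedged sketch of both routes, assuming the first column of networkdata3.csv holds the row node labels:

import networkx as nx
import pandas as pd
from scipy.sparse import csr_matrix
from networkx.algorithms import bipartite

# Use the first column as the row index so labels line up
mat = pd.read_csv("networkdata3.csv", index_col=0)

# Square adjacency matrix (index must equal columns):
# sand = nx.from_pandas_adjacency(mat)

# Rectangular biadjacency matrix (rows = one node set, columns = the other):
sand = bipartite.from_biadjacency_matrix(csr_matrix(mat.values))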

netCDF file has no variables in Python when importing with xarray

I'm VERY new to xarray, and I tried to import a satellite netCDF file into Python using xarray, with this file: https://tropomi.gesdisc.eosdis.nasa.gov/data//S5P_TROPOMI_Level2/S5P_L2__NO2____HiR.1/2020/003/S5P_OFFL_L2__NO2____20200103T170946_20200103T185116_11525_01_010302_20200105T100506.nc
This is the code I used:
import xarray as xr
import numpy as np
import pandas as pd
tropomi = xr.open_dataset('test2.nc', engine = 'netcdf4')
tropomi
Output: the printed Dataset shows no variables, just 53 global attributes - why is this happening?
Thanks!
I figured it out. When you open the file without specifying a group, you get only the global attributes and no variables. You need to pass group='PRODUCT' to get the data products, like this:
tropomi = xr.open_dataset('test2.nc', group='PRODUCT')
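A quick sketch of the difference; the exact variable names inside PRODUCT depend on the product, so treat the printout as illustrative:

import xarray as xr

root = xr.open_dataset('test2.nc')                      # global attributes only
tropomi = xr.open_dataset('test2.nc', group='PRODUCT')  # the actual data variables
print(tropomi.data_vars)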

Is it possible to create a word cloud by importing data from an Excel file?

I have no idea.
import openpyxl

your_list = []
workbook = openpyxl.load_workbook(filename='crawling.xlsx')
worksheet = workbook.get_sheet_by_name("Sheet")  # deprecated; workbook["Sheet"] on newer openpyxl
for i in worksheet.rows:
    page = i[0].value
    your_list.append(page)
I don't know what to do first. Should I use the openpyxl library, or is it better to just read the document with the open function? I want WordCloud().generate(your_list) to work. Sorry for the messy code.
You can use the pandas library to read the Excel file:
from wordcloud import WordCloud, STOPWORDS
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_excel(r"Youtube04-Eminem.xlsx")  # assuming an .xlsx file; read_excel has no encoding argument
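From there, a minimal sketch of the word cloud itself. WordCloud.generate expects one string rather than a list, and the column name 'CONTENT' is an assumption - substitute whichever column holds your text:

# Join the text column into a single string (generate() takes a string, not a list)
text = " ".join(df['CONTENT'].astype(str))
wordcloud = WordCloud(stopwords=STOPWORDS).generate(text)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()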

How to put many numpy files in one big numpy file, file by file?

I have 166600 numpy files that I want to combine into one big numpy file, file by file.
I mean that the big file must be built up from the beginning: first the first file is read and written, so the big file contains only that one; then the second file is read and appended, so the big file contains the first two files; and so on.
import matplotlib.pyplot as plt
import numpy as np
import glob
import os, sys
fpath = "path_Of_my_final_Big_File"
npyfilespath = "path_of_my_numpy_files"
os.chdir(npyfilespath)
npfiles = glob.glob("*.npy")
npfiles.sort()
all_arrays = np.zeros((166601, 8000))
for i, npfile in enumerate(npfiles):
    all_arrays[i] = np.load(os.path.join(npyfilespath, npfile))
np.save(fpath, all_arrays)
If I understand your question correctly, you can use numpy.concatenate for this:
import matplotlib.pyplot as plt
import numpy as np
import glob
import os, sys
fpath = "path_Of_my_final_Big_File"
npyfilespath = "path_of_my_numpy_files"
os.chdir(npyfilespath)
npfiles = glob.glob("*.npy")
npfiles.sort()
all_arrays = []
for npfile in npfiles:
    all_arrays.append(np.load(os.path.join(npyfilespath, npfile)))
np.save(fpath, np.concatenate(all_arrays))
Depending on the shape of your arrays and the intended concatenation, you might need to specify the axis parameter of concatenate.
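As a side note, if every .npy file holds a 1-D array of the same length (8000 in the question), np.stack gives the same (n_files, 8000) layout as the preallocated array in the question:

big = np.stack(all_arrays)  # shape (len(npfiles), 8000), assuming uniform 1-D inputs
np.save(fpath, big)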

How can my argument be used to create a matlab file?

I am writing a function that converts a 2D Python array into a MATLAB file. Here is my code so far...
def save_array(arr, fname):
    import scipy.io
    import numpy
    out_dict = {}
    out_dict[fname] = arr
    scipy.io.savemat(fname.mat, out_dict)
I want fname to be a string, but I am not sure how I can get the savemat part to work.
import scipy.io
import numpy as np
def save_array(arr, arrname, fname):
    """
    Save an array to a .mat file.

    Inputs:
        arr: ndarray to save
        arrname: name to save the array as (string)
        fname: .mat filename (string)
    """
    out_dict = {arrname: arr}
    scipy.io.savemat(fname, out_dict)

save_array(np.array([1, 2, 3]), 'arr', 'test.mat')
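A quick round-trip check (sketch); note that loadmat returns arrays as at least two-dimensional:

from scipy.io import loadmat

contents = loadmat('test.mat')
print(contents['arr'])  # [[1 2 3]]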
Might be worth doing a Python tutorial or two. This is very basic stuff!
