My lab has a very large directory of SigmaPlot files saved as .JNB. I would like to process the data in these files using Python, but so far I have been unable to read the files into anything interpretable.
I've already tried pretty much every NumPy read function and most of the pandas read functions, and am getting nothing but gibberish.
Does anyone have any advice on reading these files, short of exporting them all to Excel one by one?
I'm currently working on this project: https://github.com/lucasmolinari/unlocker-EX.
It's an Excel unlocker; it works by editing the XML files inside the workbooks (more information on the GitHub page).
The script works fine on workbooks with almost no content, but recently I've been testing some bigger workbooks, and when I open the unlocked file, Excel says it's corrupted. I can't find any difference between the original and the unlocked workbook. I'm 100% sure the problem happens when the script changes the content in the file: I watched every step of the script, and it only stops working after the files are edited.
Does anyone have more knowledge of how these XML files work, or of the structure of Excel workbooks? Or some way to check the differences between the original file and the edited one, to see if it's some formatting problem? I'm really sorry about this question, but I have no idea where to start now; I've tried everything I can.
I changed the script to open files in UTF-8 and tried to find any corrupted character in the edited file, but manually it is too hard to find anything.
Using the ElementTree library solves the problem.
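In case it helps anyone else, here is a minimal sketch of that approach, assuming the protection is a sheetProtection element in xl/worksheets/sheet1.xml (the member path and the unlock function name are illustrative). Parsing and re-serializing the XML with ElementTree keeps the file well-formed, instead of patching the raw text:

import zipfile
import xml.etree.ElementTree as ET

NS = 'http://schemas.openxmlformats.org/spreadsheetml/2006/main'
ET.register_namespace('', NS)  # keep the default namespace so Excel accepts the output

def unlock(src, dst, member='xl/worksheets/sheet1.xml'):
    # a zip archive cannot be edited in place, so copy every entry into a new one
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, 'w', zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == member:
                root = ET.fromstring(data)
                # drop the protection element instead of string-editing the XML
                for prot in root.findall('{%s}sheetProtection' % NS):
                    root.remove(prot)
                data = ET.tostring(root, encoding='UTF-8')
            zout.writestr(item, data)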
So at my work we have to work with .sav files (SPSS files), for standardization purposes.
I'm curious whether I can read SPSS/.sav files into pandas as a CSV and essentially bypass reading them in as .sav.
So, for example, when I read in files and then convert them to CSV, I typically do this:
import pandas as pd

df = pd.read_spss('filepath.sav')
df.to_csv('filepath.csv')
df = pd.read_csv('filepath.csv')
This is extremely inefficient and SLOW, because reading in .sav files is a slow, time-consuming process.
So what I'm wondering is: can I read .sav files as .csv files without needing to first read them in as .sav?
Doesn't pd.read_spss return a DataFrame just like pd.read_csv?
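If so, the CSV round-trip above can simply be dropped; a minimal sketch, using the same illustrative file path as above:

import pandas as pd

# read_spss already returns a DataFrame, so no CSV detour is needed
df = pd.read_spss('filepath.sav')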
You might be interested in this topic. In short, it points to a wrapper around the C library ReadStat that reads SPSS files way faster than pandas.
The link to their GitHub repo is https://github.com/Roche/pyreadstat
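A minimal sketch of reading a .sav file with pyreadstat (the file path is illustrative); read_sav returns the data together with a metadata object:

import pyreadstat

# df is a pandas DataFrame; meta holds SPSS metadata such as variable labels
df, meta = pyreadstat.read_sav('filepath.sav')
print(df.head())
print(meta.column_labels)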
There seem to be a million questions on reading SPSS .sav files via Python, but nothing on reading SPSS .spv files, aka Viewer files.
My aim is to read the information within them (usually frequencies, tables, charts, etc.) and do something fun with it. I know you can export the same information to Excel, but I want to know if I can work directly with the .spv file.
Is this possible?
I have encountered a problem with scipy.io.savemat. I am trying to save a .mat file with an extremely long name, but it is not able to create the file in the directory. An example .mat file name would be:
'MM01_MM02_MM03_hoch_MM03_runter_MM04_MM05_MP02_mr1_MP02_mr2_MR03_MR09_MR04_MR05_MR06_20170623_10-50-49.mat'
I use scipy.io.savemat as shown in the docs, in this way:
import scipy.io as sio

matPATH = 'V:\\Messdatenbank_Powertrain\\MESSDATENBANK_PER_CLICK\\NVH_BILDVERGLEICHE_TMP\\MM01_MM02_MM03_hoch_MM03_runter_MM04_MM05_MP02_mr1_MP02_mr2_MR03_MR09_MR04_MR05_MR06_20170623_10-50-49\\'
stb_path = matPATH.split('\\')[-2]  # last directory component of the path
fileName = stb_path + '_MM_MAX_COMPARE.mat'
sio.savemat(fileName, mm_stb_max_dict, long_field_names=True)
Can anyone suggest how to save files with extremely long names? I cannot find information on this in the scipy docs.
So before I start, I know this is not the proper way to go about doing this, but it is the only method I have for accessing the data I need on the fly.
I have a system which writes telemetry data to a .csv file while it is running. I need to see some of this data while it is being written, but it is not being broadcast in a manner which allows me to do this.
Question: How do I safely read from a CSV file which is being written to?
Typically I would open the file and look at the values, but I am hoping to write a Python script which examines the CSV for me and reports the most recent values written, without compromising the system's ability to write to the file.
I have absolutely NO access to the system or the manner in which it writes to the CSV; I am only able to see that the CSV file is updated as the system runs.
Again, I know this is NOT the right way to do this, but any help you could provide would be extremely helpful.
This is mostly being run in a Windows environment.
You can do something like:
tailf csv_file | python your_script.py
and read from sys.stdin.
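A minimal sketch of the reading side, assuming each appended line is a complete CSV row (the choice of which column to report is illustrative):

import csv
import sys

# rows arrive on stdin as tail emits them; handle each one as it appears
for row in csv.reader(sys.stdin):
    # report the most recent value from, e.g., the last column
    print(row[-1])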