How to open and read .nc files? - Python

I am having a problem opening .nc files and converting them to .csv files, but I still cannot even read them (the first part). I looked at this link and also this link, but I could not work out how to open them. I have written a piece of code and I get an error, which I will post below. To elaborate on the error: the code is able to find the files but is not able to open them.
#from netCDF4 import Dataset # use scipy instead
from scipy.io import netcdf #### <--- This is the library to import.
import os
# Open file in a netCDF reader
directory = './'
#wrf_file_name = directory+'filename'
wrf_file_name = [f for f in sorted(os.listdir('.')) if f.endswith('.nc')]
nc = netcdf.netcdf_file(wrf_file_name,'r')
#Look at the variables available
nc.variables
#Look at the dimensions
nc.dimensions
And the error is:
Error: LAKE00000002-GloboLakes-L3S-LSWT-v4.0-fv01.0.nc is not a valid NetCDF 3 file
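The message itself hints at the cause: scipy.io.netcdf can only read the classic NetCDF-3 format, and the error suggests these GloboLakes files are NetCDF-4/HDF5. A minimal sketch using the netCDF4 package instead (which reads both formats); the loop over file names mirrors the listing above:
from netCDF4 import Dataset
import os

nc_files = [f for f in sorted(os.listdir('.')) if f.endswith('.nc')]
for fname in nc_files:
    nc = Dataset(fname, 'r')        # netCDF4 handles NetCDF-3 and NetCDF-4
    print(nc.variables.keys())      # look at the variables available
    print(nc.dimensions)            # look at the dimensions
    nc.close()
If the end goal is a .csv, xarray is often the shortest route: xarray.open_dataset(fname).to_dataframe().to_csv('out.csv').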

Related

Converting Shapefile to Geojson

In short, I am trying to convert a shapefile to geojson using gdal. Here is the idea:
from osgeo import gdal
def shapefile2geojson(infile, outfile):
    options = gdal.VectorTranslateOptions(format="GeoJSON", dstSRS="EPSG:4326")
    gdal.VectorTranslate(outfile, infile, options=options)
Okay then here is my input & output locations:
infile = r"C:\Users\clay\Desktop\Geojson Converter\arizona.shp"
outfile = r"C:\Users\clay\Desktop\Geojson Converter\arizona.geojson"
Then I call the function:
shapefile2geojson(infile, outfile)
It never saves anywhere I can find it, if it is working at all. It would be nice if it would pull from a file and put the newly converted GeoJSON in the same folder. I am not receiving any errors. I am using Windows and Jupyter Notebook and am a noob. I don't know if I am using this right:
r"C:\Users\clay\Desktop\Geojson Converter\arizona.shp"

Reading only .csv file within a .zip from URL with Pandas?

There is a .csv file contained within a .zip file at a URL that I am trying to read into a Pandas DataFrame; I don't want to download the .zip file to disk but rather read the data directly from the URL. I realize that pandas.read_csv() can only do this if the .csv file is the only file contained in the .zip; however, when I run this:
import pandas as pd
# specify zipped comma-separated values url
zip_csv_url = 'http://www12.statcan.gc.ca/census-recensement/2016/geo/ref/gaf/files-fichiers/2016_92-151_XBB_csv.zip'
df1 = pd.read_csv(zip_csv_url)
I get this:
ValueError: Multiple files found in compressed zip file ['2016_92-151_XBB.csv', '92-151-g2016001-eng.pdf', '92-151-g2016001-fra.pdf']
The contents of the .zip appear to be returned as a list; I'm wondering how I can load the only available .csv file in the .zip into the new DataFrame (df1), since the .zip file at the URL I will be using will only ever have one .csv file within it. Thanks!
N.B.
The corresponding .zip file from a separate URL, containing shapefiles, reads without a problem using geopandas.read_file() when I run this code:
import geopandas as gpd
# specify zipped shapefile url
zip_shp_url = 'http://www12.statcan.gc.ca/census-recensement/2011/geo/bound-limit/files-fichiers/2016/ldb_000b16a_e.zip'
gdf1 = gpd.read_file(zip_shp_url)
Despite a .pdf file also being contained within that .zip.
It would appear that geopandas.read_file() is able to read only the requisite shapefile components for creating the GeoDataFrame while ignoring the unnecessary data files. Since it is based on Pandas, shouldn't Pandas also have the functionality to read only the .csv within a .zip that holds multiple other file types? Any thoughts?
import zipfile
import pandas as pd
from io import BytesIO
from urllib.request import urlopen

# download the archive into memory
resp = urlopen( YOUR_ZIP_LINK )
files_zip = zipfile.ZipFile(BytesIO(resp.read()))
# files_zip.namelist()  # lists the members of the archive

directory_to_extract_to = YOUR_DESTINATION_FOLDER
file = YOUR_csv_FILE_NAME

# extract just the .csv member, then read it with pandas
with files_zip as zip_ref:
    zip_ref.extract(file, directory_to_extract_to)
pd.read_csv(directory_to_extract_to + file)
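If the goal is to avoid writing anything to disk, the archive can also be handled entirely in memory; a sketch that picks out the first .csv member and reads it straight from the zip, using the URL from the question:
import zipfile
import pandas as pd
from io import BytesIO
from urllib.request import urlopen

zip_csv_url = 'http://www12.statcan.gc.ca/census-recensement/2016/geo/ref/gaf/files-fichiers/2016_92-151_XBB_csv.zip'
archive = zipfile.ZipFile(BytesIO(urlopen(zip_csv_url).read()))

# pick the first (here, the only) .csv member of the archive
csv_name = [name for name in archive.namelist() if name.endswith('.csv')][0]
with archive.open(csv_name) as f:
    df1 = pd.read_csv(f)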

Read multiple csv files in a folder

I have multiple .csv files that represent a series of measurements.
I need to plot them in order to compare successive alterations.
I basically want to create a function with which I can read each file into a list, repeat the same data cleaning on every .csv file, and then plot them all together in one comparison graph.
This is a task I need to do to analyse some results. I intend to do it in Python/pandas, as I might need to integrate it into a bigger picture in the future, but for now this is it.
I also have one file that represents background noise, and I want to subtract those values from the other .csv files as well.
import os
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FormatStrFormatter

PATH = r'C:\Users\UserName\Documents\FSC\Folder_name'
FileNames = [os.listdir(PATH)]
for file in FileNames:
    df = pd.read_csv(PATH + file, index_col = 0)
I expected to read every file and store it in the list, but instead I got this error:
FileNotFoundError: [Errno 2] File b'C:\Users\UserName\Documents\FSC\FolderNameFileName.csv' does not exist: b'C:\Users\UserName\Documents\FSC\FolderNameFileName.csv'
Have you used pathlib from the standard library? It makes working with the file system a breeze.
Recommended reading: https://realpython.com/python-pathlib/
Try:
from pathlib import Path
import pandas as pd

files = Path('/your/path/here/').glob('*.csv')  # get all csvs in your dir.
for file in files:
    df = pd.read_csv(file, index_col=0)
    # your plots.
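Building on that, a sketch of the full workflow described in the question: read every .csv, subtract the background-noise measurements, and plot everything on one figure. The file name 'background.csv' and the column name 'value' are hypothetical placeholders; adjust them to match your data:
import pandas as pd
import matplotlib.pyplot as plt
from pathlib import Path

folder = Path(r'C:\Users\UserName\Documents\FSC\Folder_name')
background = pd.read_csv(folder / 'background.csv', index_col=0)  # hypothetical noise file

fig, ax = plt.subplots()
for csv_file in folder.glob('*.csv'):
    if csv_file.name == 'background.csv':
        continue
    df = pd.read_csv(csv_file, index_col=0)
    corrected = df['value'] - background['value']  # hypothetical column name
    corrected.plot(ax=ax, label=csv_file.stem)
ax.legend()
plt.show()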

How to load a data set into Jupyter Notebook

When loading a dataset into Jupyter, I know it takes a few lines of code to load it in:
from tensorflow.contrib.learn.python.learn.datasets import base
import numpy as np

# Data files
IRIS_TRAINING = "iris_training.csv"
IRIS_TEST = "iris_test.csv"

# Load datasets.
training_set = base.load_csv_with_header(filename=IRIS_TRAINING,
                                         features_dtype=np.float32,
                                         target_dtype=np.int)
test_set = base.load_csv_with_header(filename=IRIS_TEST,
                                     features_dtype=np.float32,
                                     target_dtype=np.int)
So why is the error NotFoundError: iris_training.csv still thrown? I feel as though there is more to loading data sets into Jupyter, and I would be grateful for any help on this topic.
I'm following a course through AI Adventures and don't know how to add in the .csv file; the video mentions nothing about how to do it.
Here is the link: https://www.youtube.com/watch?v=G7oolm0jU8I&list=PLIivdWyY5sqJxnwJhe3etaK7utrBiPBQ2&index=3
The issue is that you either need to use the file's absolute path, i.e. C:\path_to_csv\iris_training.csv on Windows or /path_to_csv/iris_training.csv on UNIX/Linux, or you need to place the file in your notebook workspace, i.e. the directory listed in the Jupyter web UI at http://localhost:8888/tree. If you are having trouble finding that directory, just execute the Python code below and place the file in the printed location:
import os
cwd = os.getcwd()
print(cwd)
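Once you know the working directory, a quick sanity check that the notebook can actually see the file (using the file name from the question):
import os

cwd = os.getcwd()
print(os.path.exists(os.path.join(cwd, "iris_training.csv")))  # True once the file is in the right place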
Solution A
If you are working with Python you can use the pandas library to import your .csv file:
import pandas as pd
IRIS_TRAINING = pd.read_csv("../iris_training.csv")
IRIS_TEST = pd.read_csv("../iris_test.csv")
Solution B
import numpy as np
mydata = np.genfromtxt(filename, delimiter=",")  # filename is the path to your .csv file
Read more about python-pandas and python-numpy.

Importing a shapefile to Python with pyshp

I'm trying to import the shapefile "Metropolin_31Jul_0921.shp" into Python using the following code:
import shapefile
stat_area_df = shapefile.Reader("Metropolin_31Jul_0921.shp")
but I keep getting this error:
File "C:\Users\maya\Anaconda3\lib\site-packages\shapefile.py", line 291,
in load
raise ShapefileException("Unable to open %s.dbf or %s.shp." %
(shapeName, shapeName) )
shapefile.ShapefileException: Unable to open Metropolin_31Jul_0921.dbf
or Metropolin_31Jul_0921.shp.
Does anyone know what it means?
I tried adding the directory but it didn't help.
Make sure that the directory the shapefile is located in includes all of the supporting files, such as .dbf and .shx; the .shp will not work without these supporting files.
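A quick way to see which sidecar files are actually present next to the .shp; pyshp needs at least the .shp, .shx and .dbf with the same base name. The folder path below is an assumption, so point it at wherever the shapefile lives:
from pathlib import Path

folder = Path('.')  # adjust to the directory containing the shapefile
print(sorted(p.name for p in folder.glob('Metropolin_31Jul_0921.*')))
# expect to see the .shp, .shx and .dbf listed here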
