I've downloaded an average temperature change dataset formatted like this, but with a lat/long range that spans the entire US:
original csv
I'm trying to convert it into a raster that I can visualize on a map in Python or R, and all the methods I've seen require the lat, long and z fields to be tabular like this: ideal table
Is there a way to do this with the current "grid" format or do I need to transform it into a table? If the latter how can I do that in Excel or python/R?
I tried transposing the data in Excel first, but I'm at a loss for other methods.
Please include sample code/sample data when you ask a question here. Your data set pictured in the PNG was small enough, so I recreated it:
+----------+--------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
| Lat/Long | -179.5 | -179 | -178.5 | -178 | -177.5 | -177 | -176.5 | -176 | -175.5 | -175 |
+----------+--------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
| 18.5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 19 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 19.5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 20 | 0 | 1.524704 | 1.489677 | 1.488556 | 1.485161 | 0 | 0 | 0 | 0 | 0 |
| 20.5 | 0 | 1.484848 | 1.484863 | 1.484833 | 1.484802 | 1.516785 | 1.554611 | 1.5672 | 1.567184 | 0 |
| 21 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 21.5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 22 | 0 | 0 | 0 | 0 | 0 | 0 | 1.586227 | 0 | 0 | 0 |
| 23 | 0 | 0 | 2.718926 | 2.743782 | 2.74353 | 0 | 1.64222 | 1.661705 | 1.720245 | 1.755074 |
| 23.5 | 0 | 0 | 0 | 3.006203 | 3.005981 | 0 | 0 | 0 | 0 | 1.808762 |
+----------+--------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
A problem like this would be better solved in NumPy/pandas in Python, but if you want to do it in Excel, here are the steps I took to arrive at the end result posted below (for reference, a short Python sketch of that alternative comes first, then the Excel steps). I am assuming you are using Excel 365 on a Windows 10 PC, and that you need help with reshaping the data set, not with the raster map itself.
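A minimal pandas sketch of that route, assuming the grid is saved as a CSV under the hypothetical name grid.csv, that blanks should become zeros (mirroring the Find & Replace step below), and that the result is written to a hypothetical unpivoted.csv:
import pandas as pd

wide = pd.read_csv("grid.csv", index_col=0)            # first column = latitude, header row = longitude
wide = wide.fillna(0)                                  # blanks -> 0, same as the Find & Replace step below
tidy = (wide.rename_axis(index="Latitude", columns="Longitude")
            .stack()                                   # wide grid -> one row per (lat, long) pair
            .rename("Temperature")
            .reset_index())
tidy["Longitude"] = tidy["Longitude"].astype(float)    # the column headers are read in as text
tidy.to_csv("unpivoted.csv", index=False)              # hypothetical output file name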
The first problem you have is that there are blanks in your data where zeros should be. I have no idea how big this table is - probably really big - so if you want to do this in Excel, select cell A1, then hold SHIFT and click the last cell in column K. Press CTRL + H to bring up the "Find and Replace" dialog, leave the "Find what" box empty, enter 0 in "Replace with", and click "Replace All".
Format your data as a table in Excel by clicking within the range, and then on the Home tab "Format as Table" in the "Styles" group. The style you pick does not matter. I named the table "Original" (select a cell in the table, then click on the "Table Design" tab which appears in the top right; change the table name in the ribbon under "Properties" on the left).
Click on the "Data" tab while the table is still selected, then select "From Table/Range" in the "Get & Transform Data". This will open Power Pivot. Since you don't want to output this table again from the query, click on the arrow (NOT the button) next to "Close & Load" on the ribbon under "Close" and pick "Close & load to". This brings up a dialog box. Select "Only Create Connection" and then click "OK". If you accidentally hit the button itself, it will create a table on a new worksheet that is identical to the one you started with. You can delete the sheet later, which will convert the output of the query to a connection.
On the "Data" tab, click on "Queries & Connections" in the "Queries & Connections" group. This brings up a sidebar on the right. Double-click the query you just created, which takes you back to the Power Query Editor.
I duplicated the original query, because we want to manipulate it further (right-click the query in the left pane, then select "Duplicate"). Give the new query a descriptive name; I picked "Unpivoted".
Select the first column, which contains the Latitude values. Then click on the arrow next to "Unpivot Columns" on the "Transform" tab and select "Unpivot Other Columns".
As a final step, I renamed the resulting columns "Latitude", "Longitude" and "Temperature", then clicked "Close & Load" to put the table onto its own worksheet.
Here is the resulting data set:
+----------+-----------+-------------+
| Latitude | Longitude | Temperature |
+----------+-----------+-------------+
| 18.5 | -179.5 | 0 |
| 18.5 | -179 | 0 |
| 18.5 | -178.5 | 0 |
| 18.5 | -178 | 0 |
| 18.5 | -177.5 | 0 |
| 18.5 | -177 | 0 |
| 18.5 | -176.5 | 0 |
| 18.5 | -176 | 0 |
| 18.5 | -175.5 | 0 |
| 18.5 | -175 | 0 |
| 19 | -179.5 | 0 |
| 19 | -179 | 0 |
| 19 | -178.5 | 0 |
| 19 | -178 | 0 |
| 19 | -177.5 | 0 |
| 19 | -177 | 0 |
| 19 | -176.5 | 0 |
| 19 | -176 | 0 |
| 19 | -175.5 | 0 |
| 19 | -175 | 0 |
| 19.5 | -179.5 | 0 |
| 19.5 | -179 | 0 |
| 19.5 | -178.5 | 0 |
| 19.5 | -178 | 0 |
| 19.5 | -177.5 | 0 |
| 19.5 | -177 | 0 |
| 19.5 | -176.5 | 0 |
| 19.5 | -176 | 0 |
| 19.5 | -175.5 | 0 |
| 19.5 | -175 | 0 |
| 20 | -179.5 | 0 |
| 20 | -179 | 1.524704 |
| 20 | -178.5 | 1.489677 |
| 20 | -178 | 1.488556 |
| 20 | -177.5 | 1.485161 |
| 20 | -177 | 0 |
| 20 | -176.5 | 0 |
| 20 | -176 | 0 |
| 20 | -175.5 | 0 |
| 20 | -175 | 0 |
| 20.5 | -179.5 | 0 |
| 20.5 | -179 | 1.484848 |
| 20.5 | -178.5 | 1.484863 |
| 20.5 | -178 | 1.484833 |
| 20.5 | -177.5 | 1.484802 |
| 20.5 | -177 | 1.516785 |
| 20.5 | -176.5 | 1.554611 |
| 20.5 | -176 | 1.5672 |
| 20.5 | -175.5 | 1.567184 |
| 20.5 | -175 | 0 |
| 21 | -179.5 | 0 |
| 21 | -179 | 0 |
| 21 | -178.5 | 0 |
| 21 | -178 | 0 |
| 21 | -177.5 | 0 |
| 21 | -177 | 0 |
| 21 | -176.5 | 0 |
| 21 | -176 | 0 |
| 21 | -175.5 | 0 |
| 21 | -175 | 0 |
| 21.5 | -179.5 | 0 |
| 21.5 | -179 | 0 |
| 21.5 | -178.5 | 0 |
| 21.5 | -178 | 0 |
| 21.5 | -177.5 | 0 |
| 21.5 | -177 | 0 |
| 21.5 | -176.5 | 0 |
| 21.5 | -176 | 0 |
| 21.5 | -175.5 | 0 |
| 21.5 | -175 | 0 |
| 22 | -179.5 | 0 |
| 22 | -179 | 0 |
| 22 | -178.5 | 0 |
| 22 | -178 | 0 |
| 22 | -177.5 | 0 |
| 22 | -177 | 0 |
| 22 | -176.5 | 1.586227 |
| 22 | -176 | 0 |
| 22 | -175.5 | 0 |
| 22 | -175 | 0 |
| 23 | -179.5 | 0 |
| 23 | -179 | 0 |
| 23 | -178.5 | 2.718926 |
| 23 | -178 | 2.743782 |
| 23 | -177.5 | 2.74353 |
| 23 | -177 | 0 |
| 23 | -176.5 | 1.64222 |
| 23 | -176 | 1.661705 |
| 23 | -175.5 | 1.720245 |
| 23 | -175 | 1.755074 |
| 23.5 | -179.5 | 0 |
| 23.5 | -179 | 0 |
| 23.5 | -178.5 | 0 |
| 23.5 | -178 | 3.006203 |
| 23.5 | -177.5 | 3.005981 |
| 23.5 | -177 | 0 |
| 23.5 | -176.5 | 0 |
| 23.5 | -176 | 0 |
| 23.5 | -175.5 | 0 |
| 23.5 | -175 | 1.808762 |
+----------+-----------+-------------+
And this is the underlying M code:
let
    Source = Excel.CurrentWorkbook(){[Name="Original"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Lat/Long", type number}, {"-179.5", type any}, {"-179", type number}, {"-178.5", type number}, {"-178", type number}, {"-177.5", type number}, {"-177", type number}, {"-176.5", type number}, {"-176", type number}, {"-175.5", type number}, {"-175", type number}}),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Lat/Long"}, "Attribute", "Value"),
    #"Renamed Columns" = Table.RenameColumns(#"Unpivoted Other Columns",{{"Lat/Long", "Latitude"}, {"Attribute", "Longitude"}, {"Value", "Temperature"}})
in
    #"Renamed Columns"
I hope this is what you are looking for. Please click the check mark by this answer to accept it if it solved your problem.
Given the following NumPy array, how can I find the locations of the highest and lowest values of column 0 within each interval marked in column 1, using NumPy?
import numpy as np
data = np.array([
[1879.289,np.nan],[1879.281,np.nan],[1879.292,1],[1879.295,1],[1879.481,1],[1879.294,1],[1879.268,1],
[1879.293,1],[1879.277,1],[1879.285,1],[1879.464,1],[1879.475,1],[1879.971,1],[1879.779,1],
[1879.986,1],[1880.791,1],[1880.29,1],[1879.253,np.nan],[1878.268,np.nan],[1875.73,1],[1876.792,1],
[1875.977,1],[1876.408,1],[1877.159,1],[1877.187,1],[1883.164,1],[1883.171,1],[1883.495,1],
[1883.962,1],[1885.158,1],[1885.974,1],[1886.479,np.nan],[1885.969,np.nan],[1884.693,1],[1884.977,1],
[1884.967,1],[1884.691,1],[1886.171,1],[1886.166,np.nan],[1884.476,np.nan],[1884.66,1],[1882.962,1],
[1881.496,1],[1871.163,1],[1874.985,1],[1874.979,1],[1871.173,np.nan],[1871.973,np.nan],[1871.682,np.nan],
[1872.476,np.nan],[1882.361,1],[1880.869,1],[1882.165,1],[1881.857,1],[1880.375,1],[1880.66,1],
[1880.891,1],[1880.377,1],[1881.663,1],[1881.66,1],[1877.888,1],[1875.69,1],[1875.161,1],
[1876.697,np.nan],[1876.671,np.nan],[1879.666,np.nan],[1877.182,np.nan],[1878.898,1],[1878.668,1],[1878.871,1],
[1878.882,1],[1879.173,1],[1878.887,1],[1878.68,1],[1878.872,1],[1878.677,1],[1877.877,1],
[1877.669,1],[1877.69,1],[1877.684,1],[1877.68,1],[1877.885,1],[1877.863,1],[1877.674,1],
[1877.676,1],[1877.687,1],[1878.367,1],[1878.179,1],[1877.696,1],[1877.665,1],[1877.667,np.nan],
[1878.678,np.nan],[1878.661,1],[1878.171,1],[1877.371,1],[1877.359,1],[1878.381,1],[1875.185,1],
[1875.367,np.nan],[1865.492,np.nan],[1865.495,1],[1866.995,1],[1866.672,1],[1867.465,1],[1867.663,1],
[1867.186,1],[1867.687,1],[1867.459,1],[1867.168,1],[1869.689,1],[1869.693,1],[1871.676,1],
[1873.174,1],[1873.691,np.nan],[1873.685,np.nan]
])
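(An interval here is a contiguous run of rows whose second column is 1, bounded by NaN rows. As a quick illustration, the runs can be located like this - a small sketch that assumes, as in this sample, that the data starts and ends with NaN rows:)
valid = ~np.isnan(data[:, 1])                            # rows that belong to an interval
edges = np.flatnonzero(np.diff(valid.astype(int))) + 1   # positions where a run starts or stops
starts, stops = edges[::2], edges[1::2]                  # start / stop (exclusive) of each interval
print(list(zip(starts, stops)))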
In the Min/Max column of the table below you can see where the max and min are for each interval.
+-------+----------+-----------+---------+
| index | Value | Intervals | Min/Max |
+-------+----------+-----------+---------+
| 0 | 1879.289 | np.nan | |
| 1 | 1879.281 | np.nan | |
| 2 | 1879.292 | 1 | |
| 3 | 1879.295 | 1 | |
| 4 | 1879.481 | 1 | |
| 5 | 1879.294 | 1 | |
| 6 | 1879.268 | 1 | -1 | min
| 7 | 1879.293 | 1 | |
| 8 | 1879.277 | 1 | |
| 9 | 1879.285 | 1 | |
| 10 | 1879.464 | 1 | |
| 11 | 1879.475 | 1 | |
| 12 | 1879.971 | 1 | |
| 13 | 1879.779 | 1 | |
| 17 | 1879.986 | 1 | |
| 18 | 1880.791 | 1 | 1 | max
| 19 | 1880.29 | 1 | |
| 55 | 1879.253 | np.nan | |
| 56 | 1878.268 | np.nan | |
| 57 | 1875.73 | 1 | -1 |min
| 58 | 1876.792 | 1 | |
| 59 | 1875.977 | 1 | |
| 60 | 1876.408 | 1 | |
| 61 | 1877.159 | 1 | |
| 62 | 1877.187 | 1 | |
| 63 | 1883.164 | 1 | |
| 64 | 1883.171 | 1 | |
| 65 | 1883.495 | 1 | |
| 66 | 1883.962 | 1 | |
| 67 | 1885.158 | 1 | |
| 68 | 1885.974 | 1 | 1 | max
| 69 | 1886.479 | np.nan | |
| 70 | 1885.969 | np.nan | |
| 71 | 1884.693 | 1 | |
| 72 | 1884.977 | 1 | |
| 73 | 1884.967 | 1 | |
| 74 | 1884.691 | 1 | -1 | min
| 75 | 1886.171 | 1 | 1 | max
| 76 | 1886.166 | np.nan | |
| 77 | 1884.476 | np.nan | |
| 78 | 1884.66 | 1 | 1 | max
| 79 | 1882.962 | 1 | |
| 80 | 1881.496 | 1 | |
| 81 | 1871.163 | 1 | -1 | min
| 82 | 1874.985 | 1 | |
| 83 | 1874.979 | 1 | |
| 84 | 1871.173 | np.nan | |
| 85 | 1871.973 | np.nan | |
| 86 | 1871.682 | np.nan | |
| 87 | 1872.476 | np.nan | |
| 88 | 1882.361 | 1 | 1 | max
| 89 | 1880.869 | 1 | |
| 90 | 1882.165 | 1 | |
| 91 | 1881.857 | 1 | |
| 92 | 1880.375 | 1 | |
| 93 | 1880.66 | 1 | |
| 94 | 1880.891 | 1 | |
| 95 | 1880.377 | 1 | |
| 96 | 1881.663 | 1 | |
| 97 | 1881.66 | 1 | |
| 98 | 1877.888 | 1 | |
| 99 | 1875.69 | 1 | |
| 100 | 1875.161 | 1 | -1 | min
| 101 | 1876.697 | np.nan | |
| 102 | 1876.671 | np.nan | |
| 103 | 1879.666 | np.nan | |
| 111 | 1877.182 | np.nan | |
| 112 | 1878.898 | 1 | |
| 113 | 1878.668 | 1 | |
| 114 | 1878.871 | 1 | |
| 115 | 1878.882 | 1 | |
| 116 | 1879.173 | 1 | 1 | max
| 117 | 1878.887 | 1 | |
| 118 | 1878.68 | 1 | |
| 119 | 1878.872 | 1 | |
| 120 | 1878.677 | 1 | |
| 121 | 1877.877 | 1 | |
| 122 | 1877.669 | 1 | |
| 123 | 1877.69 | 1 | |
| 124 | 1877.684 | 1 | |
| 125 | 1877.68 | 1 | |
| 126 | 1877.885 | 1 | |
| 127 | 1877.863 | 1 | |
| 128 | 1877.674 | 1 | |
| 129 | 1877.676 | 1 | |
| 130 | 1877.687 | 1 | |
| 131 | 1878.367 | 1 | |
| 132 | 1878.179 | 1 | |
| 133 | 1877.696 | 1 | |
| 134 | 1877.665 | 1 | -1 | min
| 135 | 1877.667 | np.nan | |
| 136 | 1878.678 | np.nan | |
| 137 | 1878.661 | 1 | 1 | max
| 138 | 1878.171 | 1 | |
| 139 | 1877.371 | 1 | |
| 140 | 1877.359 | 1 | |
| 141 | 1878.381 | 1 | |
| 142 | 1875.185 | 1 | -1 | min
| 143 | 1875.367 | np.nan | |
| 144 | 1865.492 | np.nan | |
| 145 | 1865.495 | 1 | -1 | min
| 146 | 1866.995 | 1 | |
| 147 | 1866.672 | 1 | |
| 148 | 1867.465 | 1 | |
| 149 | 1867.663 | 1 | |
| 150 | 1867.186 | 1 | |
| 151 | 1867.687 | 1 | |
| 152 | 1867.459 | 1 | |
| 153 | 1867.168 | 1 | |
| 154 | 1869.689 | 1 | |
| 155 | 1869.693 | 1 | |
| 156 | 1871.676 | 1 | |
| 157 | 1873.174 | 1 | 1 | max
| 158 | 1873.691 | np.nan | |
| 159 | 1873.685 | np.nan | |
+-------+----------+-----------+---------+
I must specify upfront that this question has already been answered here with a pandas solution. That solution performs reasonably, at about 300 seconds for a table of around 1 million rows. But after some more testing, I found that once the table exceeds 3 million rows, the execution time increases dramatically to over 2500 seconds and more. This is obviously too long for such a simple task. How would the same problem be solved with NumPy?
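(For context, a groupby-based pandas version of the task might look like the sketch below. This is only an illustration of that kind of approach, not necessarily the linked answer.)
import numpy as np
import pandas as pd

df = pd.DataFrame(data, columns=["value", "interval"])
valid = df["interval"].notna()                  # rows that belong to an interval
block = df["interval"].isna().cumsum()          # one id per contiguous block of rows
per_interval = df.loc[valid, "value"].groupby(block[valid])

df["minmax"] = np.nan
df.loc[per_interval.idxmin(), "minmax"] = -1    # row index of each interval's minimum
df.loc[per_interval.idxmax(), "minmax"] = 1     # row index of each interval's maximum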
Here's one NumPy approach (it assumes, as in the sample, that the data begins and ends with NaN separator rows) -
mask = ~np.isnan(data[:,1])                   # True on the rows that belong to an interval
s0 = np.flatnonzero(mask[1:] > mask[:-1])+1   # index where each interval starts
s1 = np.flatnonzero(mask[1:] < mask[:-1])+1   # index where each interval stops (exclusive)
lens = s1 - s0                                # length of each interval
tags = np.repeat(np.arange(len(lens)), lens)  # interval id for every valid row
idx = np.lexsort((data[mask,0], tags))        # argsort the valid rows by interval id, then by value
starts = np.r_[0,lens.cumsum()]               # interval boundaries in that compacted ordering
offsets = np.r_[s0[0], s0[1:] - s1[:-1]]      # NaN rows skipped before each interval
offsets_cumsum = offsets.cumsum()             # total NaN rows before each interval
min_ids = idx[starts[:-1]] + offsets_cumsum   # original row index of each interval's minimum
max_ids = idx[starts[1:]-1] + offsets_cumsum  # original row index of each interval's maximum
out = np.full(data.shape[0], np.nan)
out[min_ids] = -1                             # -1 marks the minimum of each interval
out[max_ids] = 1                              # +1 marks the maximum of each interval
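To sanity-check the result against the Min/Max column in the question, you could, for example, print the flagged row indices (continuing from the snippet above):
print(np.flatnonzero(out == -1))   # row indices marked as interval minima
print(np.flatnonzero(out == 1))    # row indices marked as interval maxima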
So this is a bit of a cheat since it uses scipy:
import numpy as np
from scipy import ndimage

markers = np.isnan(data[:, 1])           # the NaN rows that separate the intervals
groups = np.cumsum(markers)              # running separator count -> one label per block of rows
groups[markers] = 0                      # give the separator rows label 0 so they are left out below
labels = np.unique(groups[groups > 0])   # labels of the interval rows (label 0, the separators, is excluded)
mins, maxs, min_idx, max_idx = ndimage.extrema(
    data[:, 0], labels=groups, index=labels)
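To turn this into the same -1/1 marker vector that the NumPy answer above produces, the returned positions can be scattered back into a full-length array; a small sketch, continuing from the snippet above:
out = np.full(data.shape[0], np.nan)
out[np.asarray(min_idx, dtype=int).ravel()] = -1   # mark each interval's minimum
out[np.asarray(max_idx, dtype=int).ravel()] = 1    # mark each interval's maximum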