I'm getting my feet wet with Python. I have never done any programming before, and I would really appreciate it if someone would explain their answer rather than just posting it, since I want to learn something! Even better would be not posting the answer at all, but just giving hints about what I should look at :)
I have several lists with a lot of values (numbers) on the one side.
On the other side, I have a URL that needs to be updated with the numbers from those lists and then saved into another list for further processing.
#borders of the bbox
longmax = 15.418483  #longitude top right
longmin = 4.953142   #longitude top left
latmax = 54.869808   #latitude top
latmin = 47.236219   #latitude bottom

#longitude
longstep = longmax - longmin
longstepx = longstep / 100  #longitudinal steps the model shall perform

#latitude
latstep = latmax - latmin   #was latmax - longmin, which mixed latitude and longitude
latstepx = latstep / 100    #latitudinal steps the model shall perform

#create list of steps through coordinates, longitude
llong = []
while longmin < longmax:
    longmin += longstepx
    llong.append(longmin)

#create list of steps through coordinates, latitude
llat = []
while latmin < latmax:
    latmin += latstepx
    llat.append(latmin)

#create the URLs and store them in a list -- this is the part that doesn't work
for i in llong:
    print("https://api.flickr.com/services/rest/?method=flickr.photos.search&format=json&api_key=5....lback=1&page=X&per_page=500&bbox=i&accuracy=1&has_geo=1&extras=geo,tags,views,description", sep="")
As you can see, I try to make a request to the REST API from flickr.
What I don't understand is:
How do I get the loop to go through my lists and insert the values from the lists at a certain point in the URL?
How do I tell the loop to save each URL separately after it has inserted the first numbers from the lists "llong" and "llat", and then proceed with the next two numbers?
Any hints?
You can use string formatting to insert whatever you want into your url:
my_list = ["foo", "bar", "foobar"]
for word in my_list:
    print("www.google.com/{}".format(word))

www.google.com/foo
www.google.com/bar
www.google.com/foobar
The {} is used in your string wherever you want to insert.
To save them to a list you can use zip, insert using string formatting and then append to a new list.
urls = []
for lat, lon in zip(llat, llong):
    urls.append("www.google.com/{}{}".format(lat, lon))
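Applied to the flickr question above, the same idea looks roughly like this. This is only a sketch: YOUR_KEY is a placeholder, the llat/llong values stand in for the computed step lists, and the real bbox parameter actually takes four comma-separated values (min longitude, min latitude, max longitude, max latitude), so only the substitution mechanics are shown here:

```python
# Sketch only: build one URL per (lat, lon) pair via str.format.
# YOUR_KEY is a placeholder; llat/llong stand in for the computed step lists.
API_KEY = "YOUR_KEY"
llat = [47.3, 47.4, 47.5]
llong = [5.0, 5.1, 5.2]

urls = []
for lat, lon in zip(llat, llong):
    urls.append("https://api.flickr.com/services/rest/"
                "?method=flickr.photos.search&format=json"
                "&api_key={}&bbox={},{}&has_geo=1".format(API_KEY, lon, lat))

print(urls[0])
```

Each pass through the loop produces one finished URL string and appends it, which answers both parts of the question at once.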
Python string formatting: % vs. .format
I think the .format() method is preferred over the older "www.google.com/%s" % lat syntax.
There are answers here that discuss some of the differences.
The zip function is best explained with an example:
Say we have two lists, l1 and l2:

l1 = [1, 2, 3]
l2 = [4, 5, 6]

If we use zip(l1, l2), the result will be:

[(1, 4), (2, 5), (3, 6)]
Then when we loop over the two zipped lists like below:
for ele_1, ele_2 in zip(l1, l2):
    # first iteration:  ele_1 = 1, ele_2 = 4
    # second iteration: ele_1 = 2, ele_2 = 5, and so on ...
def newUrl(lat, long):
    # str() is needed since lat and long are floats
    return "www.flickr.......lat=" + str(lat) + ".....long=" + str(long) + "...."

myUrls = []
for i in range(len(llat)):  # len(llong) is valid if both have the same size
    myUrls.append(newUrl(llat[i], llong[i]))
I need to create a function that reads the given data and creates a list of tuples, each of which has as its first element the name of the airport and as its second and third elements its geographical coordinates as floats.
airport_data = """
Alexandroupoli 40.855869°N 25.956264°E
Athens 37.936389°N 23.947222°E
Chania 35.531667°N 24.149722°E
Chios 38.343056°N 26.140556°E
Corfu 39.601944°N 19.911667°E
Heraklion 35.339722°N 25.180278°E"""
airports = []
import re
airport_data1 = re.sub("[°N#°E]", "", airport_data)

def process_airports(string):
    airports_temp = list(string.split())
    airports = [tuple(airports_temp[x:x+3]) for x in range(0, len(airports_temp), 3)]
    return airports

print(process_airports(airport_data1))
This is my code so far, but I'm new to Python, so I'm struggling to debug it.
If you want the second and third elements of the tuple to be floats, you have to convert them using the float() function.
One way to do this is creating the tuple with round brackets in your list comprehension and converting the values there:
def process_airports(string):
    airports_temp = string.split()
    airports = [(airports_temp[x], float(airports_temp[x+1]), float(airports_temp[x+2]))
                for x in range(0, len(airports_temp), 3)]
    return airports
This yields a pretty unwieldy expression, so this problem could perhaps be solved more readably with a classical for loop.
Also note that split() already returns a list, so the list() call in your version is redundant.
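As a quick check of the comprehension above (the sample string here is a shortened stand-in for airport_data1, i.e. the input with the degree letters already stripped):

```python
def process_airports(string):
    airports_temp = string.split()
    # group the tokens in threes: name, latitude, longitude
    return [(airports_temp[x], float(airports_temp[x+1]), float(airports_temp[x+2]))
            for x in range(0, len(airports_temp), 3)]

data = "Alexandroupoli 40.855869 25.956264 Athens 37.936389 23.947222"
print(process_airports(data))
# [('Alexandroupoli', 40.855869, 25.956264), ('Athens', 37.936389, 23.947222)]
```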
Further remark: if you just cut off the letters from the coordinates, this might come back to bite you when your airports are in different quadrants.
You need to take into account N/S for latitude and E/W for longitude.
Maybe something like:

def process_airports(string):
    airports = []
    for line in string.split('\n'):
        if not line:
            continue
        name, lat, lon = line.split()
        airports.append((name,
                         float(lat[:-2]) * (1 if lat[-1] == "N" else -1),
                         float(lon[:-2]) * (1 if lon[-1] == "E" else -1)
                         ))
    return airports

>>> process_airports(airport_data)
[('Alexandroupoli', 40.855869, 25.956264), ('Athens', 37.936389, 23.947222), ('Chania', 35.531667, 24.149722), ('Chios', 38.343056, 26.140556), ('Corfu', 39.601944, 19.911667), ('Heraklion', 35.339722, 25.180278)]
Note that this version parses the original airport_data (with the °N/°E suffixes still attached), that the first value on each line is the latitude, and that N and E are taken as the positive directions.
I preferred the double split to make the line/tuple-element distinction explicit.
I have the following polygon of a geographic area that I fetch via a request in CAP/XML format from an API.
The raw data looks like this:
<polygon>22.3243,113.8659 22.3333,113.8691 22.4288,113.8691 22.4316,113.8742 22.4724,113.9478 22.5101,113.9951 22.5099,113.9985 22.508,114.0017 22.5046,114.0051 22.5018,114.0085 22.5007,114.0112 22.5007,114.0125 22.502,114.0166 22.5038,114.0204 22.5066,114.0245 22.5067,114.0281 22.5057,114.0371 22.5051,114.0409 22.5041,114.0453 22.5025,114.0494 22.5023,114.0511 22.5035,114.0549 22.5047,114.0564 22.5059,114.057 22.5104,114.0576 22.512,114.0584 22.5144,114.0608 22.5163,114.0637 22.517,114.0657 22.5172,114.0683 22.5181,114.0717 22.5173,114.0739</polygon>
I store the requested items in a dictionary and then work through them to transform to a GeoJSON list object that is suitable for ingestion into Elasticsearch according to the schema I'm working with. I've removed irrelevant code here for ease of reading.
import json

import requests
import xmltodict

# fetch and store the data in a dictionary
r = requests.get("https://alerts.weather.gov/cap/ny.php?x=0")
xpars = xmltodict.parse(r.text)
json_entry = json.dumps(xpars['feed']['entry'])
dict_entry = json.loads(json_entry)

# transform items if necessary
for entry in dict_entry:
    if entry['cap:polygon']:
        polygon = entry['cap:polygon']
        polygon = polygon.split(" ")
        coordinates = []
        # take the split list items, swap their positions and enclose them in their own arrays
        for p in polygon:
            p = p.split(",")
            p[0], p[1] = float(p[1]), float(p[0])  # swap lon/lat
            coordinates += [p]
        # more code adding fields to a new dict object, not relevant to the question
The output of the p in polygon loop looks like:
[ [113.8659, 22.3243], [113.8691, 22.3333], [113.8691, 22.4288], [113.8742, 22.4316], [113.9478, 22.4724], [113.9951, 22.5101], [113.9985, 22.5099], [114.0017, 22.508], [114.0051, 22.5046], [114.0085, 22.5018], [114.0112, 22.5007], [114.0125, 22.5007], [114.0166, 22.502], [114.0204, 22.5038], [114.0245, 22.5066], [114.0281, 22.5067], [114.0371, 22.5057], [114.0409, 22.5051], [114.0453, 22.5041], [114.0494, 22.5025], [114.0511, 22.5023], [114.0549, 22.5035], [114.0564, 22.5047], [114.057, 22.5059], [114.0576, 22.5104], [114.0584, 22.512], [114.0608, 22.5144], [114.0637, 22.5163], [114.0657, 22.517], [114.0683, 22.5172], [114.0717, 22.5181], [114.0739, 22.5173] ]
Is there a way to do this that is better than O(N^2)? Thank you for taking the time to read.
O(KxNxM)
This process involves three obvious loops. These are:
Checking each entry (K)
Splitting valid entries into points (MxN) and iterating through those points (N)
Splitting those points into respective coordinates (M)
The number of characters in a polygon string is ~MxN, because there are N points each roughly M characters long, so splitting iterates through MxN characters.
Now that we know all of this, let's pinpoint where each occurs.
ENTRIES (K):
    IF:
        SPLIT (MxN)
        POINTS (N):
            COORDS (M)
So, we can finally conclude that this is O(K(MxN + MxN)) which is just O(KxNxM).
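To make the point concrete, here is a sketch of the same transformation written as one pass over a hard-coded sample string (instead of the live API response); each character is handled a constant number of times, so the work is linear in the total input size:

```python
# Sample polygon text in CAP format: "lat,lon lat,lon ..."
polygon = "22.3243,113.8659 22.3333,113.8691 22.4288,113.8691"

# Split into points, split each point, and swap to [lon, lat]
coordinates = [[float(lon), float(lat)]
               for lat, lon in (p.split(",") for p in polygon.split(" "))]

print(coordinates)  # [[113.8659, 22.3243], [113.8691, 22.3333], [113.8691, 22.4288]]
```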
I have a for loop that cycles through and creates 3 data frames, storing them in a dictionary. From each of these data frames I would like to create another dictionary, but I can't figure out how to do this.
Here is the repetitive code without the loop:
Trad = allreports2[allreports2['Trad'].notna()]
Alti = allreports2[allreports2['Alti'].notna()]
Alto = allreports2[allreports2['Alto'].notna()]
Trad_dict = dict(zip(Trad.State, Trad.Position))
Alti_dict = dict(zip(Alti.State, Alti.Position))
Alto_dict = dict(zip(Alto.State, Alto.Position))
As stated earlier, I understand how to make the 3 dataframes by storing them in a dictionary, and I understand what needs to go on the right side of the equals sign in the second statement of the for loop, but not what goes on the left side (denoted below as XXXXXXXXX).
Routes = ['Trad', 'Alti', 'Alto']
dfd = {}
for route in Routes:
    dfd[route] = allreports2[allreports2[route].notna()]
    XXXXXXXXX = dict(zip(dfd[route].State, dfd[route].Position))
(Please note: I am very new to Python and teaching myself so apologies in advance!)
This compromises readability, but this should work.
Routes = ['Trad', 'Alti', 'Alto']
dfd, output = {}, {}
for route in Routes:
    dfd[route] = allreports2[allreports2[route].notna()]
    output[route] = dict(zip(dfd[route].State, dfd[route].Position))
Trad_dict, Alti_dict, Alto_dict = list(output.values())  # Unpack List
Reference
How can I get list of values from dict?
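A minimal sketch of the same pattern with plain lists standing in for the DataFrame columns (the route names are from the question, but the states and positions here are made-up stand-ins for dfd[route].State and dfd[route].Position), showing that output ends up keyed by route name:

```python
# Stand-in data: each route maps to a (states, positions) column pair,
# mimicking dfd[route].State and dfd[route].Position
columns = {
    'Trad': (['NY', 'CA'], [1, 2]),
    'Alti': (['TX'], [3]),
    'Alto': (['FL', 'WA'], [4, 5]),
}

output = {}
for route, (states, positions) in columns.items():
    # same right-hand side as in the question; the left-hand side is output[route]
    output[route] = dict(zip(states, positions))

print(output['Trad'])  # {'NY': 1, 'CA': 2}
```

Since Python 3.7, dicts preserve insertion order, which is what makes the final Trad_dict, Alti_dict, Alto_dict = list(output.values()) unpacking line up with the Routes list.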
This is my third thread in StackOverflow.
I think I already learnt a LOT by reading threads here and clearing my doubts.
I'm trying to transform an Excel table in my own Python script. I've done quite a bit, and now that I'm almost finishing the script, I'm getting an error message that I can't really understand. Here is my code (I tried to give as much information as possible!):
def _sensitivity_analysis(datasource):
    # datasource is a list with data that may be used by the _HBV_model() function
    datasource_length = len(datasource)  # the size of the data time series
    sense_param = parameter_vector  # collects the parameter data from the global vector (parameter_vector)
    sense_index = np.linspace(0, 11, 12)  # vector with the indexes of the parameters to analyze (0 - 11)
    sense_factor = np.linspace(0.5, 2, 31)  # vector with the variance factors that multiply the original parameter value
    ns_sense = []  # list to be filled with Nash-Sutcliffe values (the data for the sensitivity analysis)
    for i in range(sense_factor.shape[0]):  # start column loop
        ns_sense.append([])  # create column in ns_sense matrix
        for j in range(sense_index.shape[0]):  # start row loop
            aux = sense_factor[i] * sense_param[j]  # multiply the param[j] value by the factor[i] value
            print(i, j, aux)  # debug purposes
            sense_param[j] = aux  # substitute the original parameter value with the modified one
            hbv = _HBV_model(datasource, sense_param)  # run the model calculations (works awesomely!)
            sqrdiff = _square_diff()  # square-difference calculations for Nash-Sutcliffe
            average = _qcalc_qmed()  # square-difference calculations for Nash-Sutcliffe [2]
            nasch = _nasch_sutcliff(sqrdiff, average)  # the Nash-Sutcliffe calculation value
            ns_sense[i].insert(j, nasch)  # insert the value into ns_sense(i, j) for further use
            sense_param = np.array([np.float64(catchment_area), np.float64(thresh_temp),
                                    np.float64(degreeday_factor), np.float64(field_capacity),
                                    np.float64(shape_coeficient), np.float64(model_paramC),
                                    np.float64(surfaceflow_param), np.float64(thresh_surface_level),
                                    np.float64(interflow_param), np.float64(baseflow_param),
                                    np.float64(percolation_param), np.float64(soilmoist_param)])  # restore sense_param to its original values
            for i in range(len(datasource)):  # _HBV_model() turns the original data (index = 5) into fully calculated data (index 17),
                for j in range(12):  # so return it to its original state before a new loop
                    datasource[i].pop()  # data is popped out
    print(ns_sense)  # debug purposes
So, when I run _sensitivity_analysis(datasource) I receive this message:
File "<ipython-input-47-c9748eaba818>", line 4, in <module>
aux = sense_factor[i]*sense_param[j]
IndexError: index 3652 is out of bounds for axis 0 with size 31;
I'm totally aware that it is talking about an index that is not accessible because it does not exist.
Explaining my situation: datasource is a list with 3652 entries. But I can't see how the console is trying to access index 3652, as I'm not asking it to do so. The only point where I access such a value is in the final loop:
for i in range(len(datasource)):
I'm really lost. I'd really appreciate it if you could help me, guys! If you need more info, I can give it to you.
Guess: sense_factor = np.linspace(0.5, 2, 31) has 31 elements - you ask for element 3652 and it naturally blows up. i takes this value in the final loop. Rewrite the final loop as:

for k in range(len(datasource)):
    for m in range(12):
        datasource[k].pop()

However, your code has many issues - you should not be using indexes at all; instead, use for loops directly on the arrays.
You reused your variable names, here:
for i in range(sense_factor.shape[0]):
    ...
    for j in range(sense_index.shape[0]):
and then here:
for i in range(len(datasource)):
    for j in range(12):
so in aux = sense_factor[i]*sense_param[j], you're using the wrong value of i, and it's basically a fluke that you're not using the wrong value of j too.
Don't reuse variable names in the same scope.
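A tiny self-contained illustration of the problem (the names here are made up): the inner loop clobbers i, so any use of i after the inner loop within the same outer iteration sees the wrong value.

```python
def buggy():
    results = []
    for i in range(3):
        for i in range(10):  # reuses and clobbers the outer i
            pass
        results.append(i)    # i is now 9, not the outer loop index
    return results

print(buggy())  # [9, 9, 9]
```

In the question, the clobbered i was then used to index sense_factor, which only has 31 elements, hence the IndexError.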
I have a ticker that grabs current information about multiple elements and adds it to a list in the format: trade_list.append([[trade_id, results]]).
Say we're tracking trade_ids 4555, 5555 and 23232; the trade_list will keep ticking away, adding their results to the list, and I then want to find the averages of their results individually.
The code works as such:
Find accounts
for a in accounts:
    find open trades of accounts
    for t in range(len(trades)):
        do some math
        trades_list.append(trade_id, result)
        avernum = 0
        average = []
        for r in range(len(trades_list)):
            average.append(trades_list[r][1])  # this is the value attached to the trade_id
            avernum += 1
        results = float(sum(average) / avernum)
        results_list.append([[trade_id, results]])
This fills out really quickly. This is after two ticks:
print(results_list)
[[[53471, 28.36432]], [[53477, 31.67835]], [[53474, 32.27664]], [[52232, 1908.30604]], [[52241, 350.4758]], [[53471, 28.36432]], [[53477, 31.67835]], [[53474, 32.27664]], [[52232, 1908.30604]], [[52241, 350.4758]]]
These averages will move and change very quickly. I want to use results_list to track and watch them, then compare previous averages to current ones
Thinking:

for r in range(len(results_list)):
    if results_list[r][0] == trade_id:
        restick.append(results_list[r][1])
        resnum = len(restick)
        if restick[resnum] > restick[resnum-1]:
            do fancy things
Here is some short code that does what I think you have described, although I might have misunderstood. You basically do exactly what you say: select everything that has a certain trade_id and return its average:
TID_INDEX = 0
DATA_INDEX = 1

def id_average(t_id, arr):
    filt_arr = [i[DATA_INDEX] for i in arr if i[TID_INDEX] == t_id]
    return sum(filt_arr) / len(filt_arr)
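For example, applied to a shortened version of the results_list shown earlier. Note that each entry there is wrapped in an extra pair of brackets ([[id, value]]), so one level has to be unwrapped before id_average sees plain [id, value] pairs:

```python
TID_INDEX = 0
DATA_INDEX = 1

def id_average(t_id, arr):
    # keep only the values whose trade_id matches, then average them
    filt_arr = [i[DATA_INDEX] for i in arr if i[TID_INDEX] == t_id]
    return sum(filt_arr) / len(filt_arr)

results_list = [[[53471, 28.36432]], [[53477, 31.67835]],
                [[53471, 28.36432]], [[53477, 31.67835]]]

# each entry is [[id, value]], so unwrap the outer list first
flat = [entry[0] for entry in results_list]

print(id_average(53471, flat))  # 28.36432
```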