I have an issue when I try to get the information I need. For example, I have a very big JSON output from a request I made with POST (I can't use GET):
"offers": [
{
"rank": 1,
"provider": {
"id": 6653,
"isLocalProvider": false,
"logoUrl": "https://img.vxcdn.com/i/partner-energy/c_6653.png?v=878adaf9ed",
"userRatings": {
"additonalCustomerRatings": {
"price": {
"percent": 73.80
},
"service": {
"percent": 67.50
},
"switching": {
"percent": 76.37
},
"caption": {
"text": "Zusätzliche Kundenbewertungen"
}
},
I can't show it all because it's very big. As you can see, there is "rank": 1; in this response there are 20 ranks, each with information like content and totalCost, and I need to pick them all, e.g. rank 6's content and totalCost, rank 8's content and totalCost.
So, first of all, here is the Python code I use to get the JSON data:
import requests
import json
url = "https://www.verivox.de/api/energy/offers/electricity/postCode/10555/custom?"
payload="{\"profile\":\"H0\",\"prepayment\":true,\"signupOnly\":true,\"includePackageTariffs\":true,\"includeTariffsWithDeposit\":true,\"includeNonCompliantTariffs\":true,\"bonusIncluded\":\"non-compliant\",\"maxResultsPerPage\":20,\"onlyProductsWithGoodCustomerRating\":false,\"benchmarkTariffId\":741122,\"benchmarkPermanentTariffId\":38,\"paolaLocationId\":\"71085\",\"includeEcoTariffs\":{\"includesNonEcoTariffs\":true},\"maxContractDuration\":240,\"maxContractProlongation\":240,\"usage\":{\"annualTotal\":3500,\"offPeakUsage\":0},\"priceGuarantee\":{\"minDurationInMonths\":0},\"maxTariffsPerProvider\":999,\"cancellationPeriod\":null,\"previewDisplayTime\":null,\"onlyRegionalTariffs\":false,\"sorting\":{\"criterion\":\"TotalCosts\",\"direction\":\"Ascending\"},\"includeSpecialBonusesInCalculation\":\"None\",\"totalCostViewMode\":1,\"ecoProductType\":0}"
headers = {
    'Content-Type': 'application/json',
    'Cookie': '__cfduid=d97a159bb287de284487ebdfa0fd097b41606303469; ASP.NET_SessionId=jfg3y20s31hclqywloocjamz; 0e3a873fd211409ead79e21fffd2d021=product=Power&ReturnToCalcLink=/power/&CustomErrorsEnabled=False&IsSignupWhiteLabelled=False; __RequestVerificationToken=vrxksNqu8CiEk9yV-_QHiinfCqmzyATcGg18dAqYXqR0L8HZNlvoHZSZienIAVQ60cB40aqfQOXFL9bsvJu7cFOcS2s1'
}
response = requests.request("POST", url, headers=headers, data=payload)
jsondata = response.json()
# print(response.text)
This works fine, but when I try to pick out the data I need, as described above, I get
KeyError: 'rank'
Here is the code that raises this error:
dataRank = []
for Rankdata in str(jsondata['rank']):
    dataRank.append({
        'tariff': Rankdata['content'],
        'cost': Rankdata['totalCost'],
        'sumOfOneTimeBonuses': Rankdata['content'],
        'savings': Rankdata['content']
    })
Then I tried another way, just getting one or two values, but that did not work either:
data = response.json()
#print(data)
test = float((data['rank']['totalCost']['content']))
I know my code is not perfect, but this is the first time I have dealt with JSON that is this big and this difficult. I would be very grateful if you could show me, for my case, how to pick the rank 1 - rank 20 data and print it.
Thank you for your help.
If you look closely at the highest level of the JSON, you can see that the value for the key offers is a list of dicts. You can therefore loop through it like this:
for offer in jsondata['offers']:
    print(offer.get('rank'))
    print(offer.get('provider').get('id'))
And the same goes for other keys in the offers.
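For example, to collect the fields you listed from all 20 offers into a list like the one you attempted, something along these lines should work. This is a sketch only: I am assuming content and totalCost sit directly on each offer, as your loop implies, so adjust the keys to wherever they actually live in your response.

dataRank = []
for offer in jsondata['offers']:
    dataRank.append({
        'rank': offer.get('rank'),
        'tariff': offer.get('content'),    # assumed location of "content"
        'cost': offer.get('totalCost'),    # assumed location of "totalCost"
    })

for entry in dataRank:
    print(entry)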
Related
I'm trying to request all the sizes in stock from Zalando. I can't quite figure out how to do it, since the video I'm watching that shows how to request sizes looks different from mine.
The video I watched was this one (Video, 5:30).
Does anyone know how to request the sizes in stock and print the sizes that are in stock?
The site I'm trying to request the sizes from: here
My code looks like this:
import requests
from bs4 import BeautifulSoup as bs

session = requests.session()

def get_sizes_in_stock():
    global session
    endpoint = "https://www.zalando.dk/nike-sportswear-air-max-90-sneakers-ni112o0bt-a11.html"
    response = session.get(endpoint)
    soup = bs(response.text, "html.parser")
I have tried going to View page source and looking for the sizes, but I could not see them in the page source.
I hope someone out there can help me figure out what to do.
The sizes are in the page. I found them in the HTML, in a JavaScript tag, in this format:
{
    "sku": "NI112O0BT-A110090000",
    "size": "42.5",
    "deliveryOptions": [
        {
            "deliveryTenderType": "FASTER"
        }
    ],
    "offer": {
        "price": {
            "promotional": null,
            "original": {
                "amount": 114500
            },
            "previous": null,
            "displayMode": null
        },
        "merchant": {
            "id": "810d1d00-4312-43e5-bd31-d8373fdd24c7"
        },
        "selectionContext": null,
        "isMeaningfulOffer": true,
        "displayFlags": [],
        "stock": {
            "quantity": "MANY"
        }
    },
    "allOffers": [
        {
            "price": {
                "promotional": null,
                "original": {
                    "amount": 114500
                },
                "previous": null,
                "displayMode": null
            },
            "merchant": {
                "id": "810d1d00-4312-43e5-bd31-d8373fdd24c7"
            },
            "selectionContext": null,
            "isMeaningfulOffer": true,
            "displayFlags": [],
            "stock": {
                "quantity": "MANY"
            },
            "deliveryOptions": [
                {
                    "deliveryWindow": "2022-05-23 - 2022-05-25"
                }
            ],
            "fulfillment": {
                "kind": "ZALANDO"
            }
        }
    ]
}
If you parse the html with bs4 you should be able to find the script tag and extract the JSON.
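For instance (a sketch, assuming the object sits in a script tag of type application/json, as the answer below also does; the exact tag and its contents may vary):

import json
import requests
from bs4 import BeautifulSoup as bs

link = "https://www.zalando.dk/nike-sportswear-air-max-90-sneakers-ni112o0bt-a11.html"
r = requests.get(link, headers={"User-Agent": "Mozilla/5.0"})
soup = bs(r.text, "html.parser")

for script in soup.select('script[type="application/json"]'):
    if '"size"' in script.text:
        data = json.loads(script.text)  # deserialize the tag body
        break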
The sizes for the default colour of the shoe are shown in the HTML. Alongside this are the URLs for the other colours. You can extract these into a dictionary and loop over them, making requests and pulling the different colours and their availability, which I think is what you are actually after, as follows (note: I have kept it quite generic to avoid hardcoding keys which change across requests):
import requests, re, json

def get_color_results(link):
    headers = {"User-Agent": "Mozilla/5.0"}
    r = requests.get(link, headers=headers).text
    data = json.loads(re.search(r'(\{"enrichedEntity".*size.*)<\/script', r).group(1))
    results = []
    color = ""
    for i in data["graphqlCache"]:
        if "ern:product" in i:
            if "product" in data["graphqlCache"][i]["data"]:
                if "name" in data["graphqlCache"][i]["data"]["product"]:
                    results.append(data["graphqlCache"][i]["data"]["product"])
                if (
                    color == ""
                    and "color" in data["graphqlCache"][i]["data"]["product"]
                ):
                    color = data["graphqlCache"][i]["data"]["product"]["color"]["name"]
    return (color, results)

link = "https://www.zalando.dk/nike-sportswear-air-max-90-sneakers-ni112o0bt-a11.html"
final = {}
color, results = get_color_results(link)
colors = {
    j["node"]["color"]["name"]: j["node"]["uri"]
    for j in [
        a
        for b in [
            i["family"]["products"]["edges"]
            for i in results
            if "family" in i
            if "products" in i["family"]
        ]
        for a in b
    ]
}
final[color] = {
    j["size"]: j["offer"]["stock"]["quantity"]
    for j in [i for i in results if "simples" in i][0]["simples"]
}
for k, v in colors.items():
    if k not in final:
        color, results = get_color_results(v)
        final[color] = {
            j["size"]: j["offer"]["stock"]["quantity"]
            for j in [i for i in results if "simples" in i][0]["simples"]
        }
print(final)
Explanatory notes from chat:
1. Use the Chrome browser to navigate to the link.
2. Press Ctrl + U to view the page source.
3. Press Ctrl + F to search for 38.5 in the HTML.
4. The first match is the long string you already know about. The string is long and difficult to navigate in the page source, and it is hard to identify which tag it is part of. There are a number of ways I could identify the right script; for now, an easy way would be:
from bs4 import BeautifulSoup as bs

link = 'https://www.zalando.dk/nike-sportswear-air-max-90-sneakers-ni112o0bt-a11.html'
headers = {'User-Agent': 'Mozilla/5.0'}
r = requests.get(link, headers=headers)
soup = bs(r.text, 'lxml')
for i in soup.select('script[type="application/json"]'):
    if '38.5' in i.text:
        print(i)
        break
A slower method would be:
soup.find("script", text=re.compile(r'.*38.5.*'))
5. Whilst I used bs4 to get the right script tag contents, this was so I knew the start and end of the string denoting the JavaScript object I wanted; I then used re to extract it and json to deserialize it into a JSON object. This led to a re-write that uses re rather than bs4, i.e. applying a regex pattern to the entire response text from the request to pull out the same string.
6. I put the entire page source in a regex tool and wrote a regex to return that same string as identified above. See that regex here.
7. Click on the right hand side, match 1 group 1, to see highlighted the same string being returned from the regex as you saw with BeautifulSoup. Two different ways of getting the same string containing the sizes.
8. That is the string whose structure I needed to examine as JSON. See it in a JSON viewer here.
9. You will notice the JSON is deeply nested, with some dictionary keys that are likely dynamic, meaning I needed to write code that could traverse the JSON and use certain more stable keys to pull out the available colours, and, for the default shoe colour, the sizes and availability.
10. There is an expand-all button in that JSON viewer. You can then search with Ctrl + F for 38.5 again.
10a) I noticed that size and availability were for the default shoe colour.
10b) I also noticed that, within the JSON, if I searched for one of the other colours from the dropdown, I could find URIs for each colour of shoe listed.
11. I used Wolf as my search term (as I suspected fewer matches for that term within the JSON).
12. You can see one of the alternate colours and its URI listed above.
13. I visited that URI and found the availability and shoe sizes for that colour in the same place as for the default white shoes.
14. I realised I could make an initial request to get the default colour and sizes with availability, and from that same request extract the other colours and their URIs.
15. I could then make requests to those other URIs and re-use my existing code to extract the sizes/availability for the new colours.
16. This is why I created my get_color_results() function: it is the re-usable code that extracts the sizes and availability from each page.
17. results holds all the matches within the JSON to certain keys I look for, to navigate to the right place to get the sizes and availabilities, as well as the current colour.
18. This code traverses the JSON to get to the right place to extract the data I want to use later:
results = []
color = ""
for i in data["graphqlCache"]:
    if "ern:product" in i:
        if "product" in data["graphqlCache"][i]["data"]:
            if "name" in data["graphqlCache"][i]["data"]["product"]:
                results.append(data["graphqlCache"][i]["data"]["product"])
            if (
                color == ""
                and "color" in data["graphqlCache"][i]["data"]["product"]
            ):
                color = data["graphqlCache"][i]["data"]["product"]["color"]["name"]
The following pulls out the sizes and availability from results:
{
    j["size"]: j["offer"]["stock"]["quantity"]
    for j in [i for i in results if "simples" in i][0]["simples"]
}
For the first request only, the following gets the other shoe colours and their URIs into a dictionary to loop over later:
colors = {
    j["node"]["color"]["name"]: j["node"]["uri"]
    for j in [
        a
        for b in [
            i["family"]["products"]["edges"]
            for i in results
            if "family" in i
            if "products" in i["family"]
        ]
        for a in b
    ]
}
This bit gets all the other colours and their availability:
for k, v in colors.items():
    if k not in final:
        color, results = get_color_results(v)
        final[color] = {
            j["size"]: j["offer"]["stock"]["quantity"]
            for j in [i for i in results if "simples" in i][0]["simples"]
        }
Throughout, I update the dictionary final with the found colour and its associated sizes and availabilities.
Always check whether a hidden API is available; it will save you a lot of time.
In this case I found this API:
https://www.zalando.dk/api/graphql
You can pass it a payload and you get a JSON answer.
import http.client

# I extracted the payload from the network tab of my browser's debugging tools
payload = """[{"id":"0ec65c3a62f6bd0b29a59f22021a44f42e6282b7f8ff930718a1dd5783b336fc","variables":{"id":"ern:product::NI112O0S7-H11"}},{"id":"0ec65c3a62f6bd0b29a59f22021a44f42e6282b7f8ff930718a1dd5783b336fc","variables":{"id":"ern:product::NI112O0RY-A11"}}]"""

conn = http.client.HTTPSConnection("www.zalando.dk")
headers = {
    'content-type': "application/json"
}
conn.request("POST", "/api/graphql", payload, headers)
res = conn.getresponse()
res = res.read()  # json output
For each product, res contains a JSON leaf with the available sizes:
"simples": [
{
"size": "38.5",
"sku": "NI112O0P5-A110060000"
},
{
"size": "44.5",
"sku": "NI112O0P5-A110105000"
},
{
...
It's now easy to extract the information. There is also a field that indicates whether the product has a promotion, which is handy if you want to track a discount.
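Extracting the sizes could then look like this (a sketch: the data/product path down to the "simples" leaf is an assumption based on the usual GraphQL response shape, so adjust it to what res actually contains):

import json

products = json.loads(res)
for product in products:
    # assumed path to the "simples" leaf shown above
    simples = product["data"]["product"]["simples"]
    print([s["size"] for s in simples])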
I'm new to Python and I'm quite stuck (I've gone through multiple other Stack Overflow posts and other sites and still can't get this to work).
I have the below JSON coming out of an API connection:
{
    "results": [
        {
            "group": {
                "mediaType": "chat",
                "queueId": "67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d"
            },
            "data": [
                {
                    "interval": "2021-01-14T13:12:19.000Z/2022-01-14T13:12:19.000Z",
                    "metrics": [
                        {
                            "metric": "nOffered",
                            "qualifier": null,
                            "stats": {
                                "max": null,
                                "min": null,
                                "count": 14,
                                "count_negative": null,
                                "count_positive": null,
                                "sum": null,
                                "current": null,
                                "ratio": null,
                                "numerator": null,
                                "denominator": null,
                                "target": null
                            }
                        }
                    ],
                    "views": null
                }
            ]
        }
    ]
}
and what I'm mainly looking to get out of it is (or at least something close to):
MediaType    QueueId                                 NOffered
chat         67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d    14
Is something like that possible? I've tried multiple things and I either get the whole of this out in one line or just get different errors.
The error you got indicates you missed that some of your values are actually a dictionary within an array.
Assuming you want to flatten your JSON to retrieve the keys mediaType, queueId and count, these can be retrieved with the following sample code:
import json

with open(path_to_json_file, 'r') as f:
    json_dict = json.load(f)

for result in json_dict.get("results"):
    media_type = result.get("group").get("mediaType")
    queue_id = result.get("group").get("queueId")
    # "count" sits inside the "stats" sub-dictionary of each metric
    n_offered = result.get("data")[0].get("metrics")[0].get("stats").get("count")
If your data and metrics lists contain multiple entries, you will have to use a for loop to retrieve every count value accordingly, as sketched below.
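For example (a sketch of that nested-loop variant, using the same structure as the sample JSON above):

for result in json_dict.get("results"):
    for data_item in result.get("data"):
        for metric in data_item.get("metrics"):
            # "count" sits inside the "stats" sub-dictionary
            print(metric.get("metric"), metric.get("stats").get("count"))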
Assuming that the format of the API response is always the same, have you considered hardcoding the extraction of the data you want?
This should work, with response defined as the API output:
response = {
    "results": [
        {
            "group": {
                "mediaType": "chat",
                "queueId": "67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d"
            },
            "data": [
                {
                    "interval": "2021-01-14T13:12:19.000Z/2022-01-14T13:12:19.000Z",
                    "metrics": [
                        {
                            "metric": "nOffered",
                            "qualifier": 'null',
                            "stats": {
                                "max": 'null',
                                "min": 'null',
                                "count": 14,
                                "count_negative": 'null',
                                "count_positive": 'null',
                                "sum": 'null',
                                "current": 'null',
                                "ratio": 'null',
                                "numerator": 'null',
                                "denominator": 'null',
                                "target": 'null'
                            }
                        }
                    ],
                    "views": 'null'
                }
            ]
        }
    ]
}
You can extract the results as follows:
results = response["results"][0]
{
    "mediaType": results["group"]["mediaType"],
    "queueId": results["group"]["queueId"],
    "nOffered": results["data"][0]["metrics"][0]["stats"]["count"]
}
which gives
{
    'mediaType': 'chat',
    'queueId': '67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d',
    'nOffered': 14
}
I have the following code
import requests
import json
import sys

credentials_User = sys.argv[1]
credentials_Password = sys.argv[2]
email = sys.argv[3]

def auth_api(login_User, login_Password):
    gooddata_user = login_User
    gooddata_password = login_Password
    body = json.dumps({
        "postUserLogin": {
            "login": gooddata_user,
            "password": gooddata_password,
            "remember": 1,
            "verify_level": 0
        }
    })
    headers = {
        'Content-Type': 'application/json',
        'Accept': 'application/json'
    }
    url = "https://reports.domain.com/gdc/account/login"
    response = requests.request(
        "POST",
        url,
        headers=headers,
        data=body
    )
    sst = response.headers.get('Set-Cookie')
    return sst

def query_api(cookie, email):
    url = "https://reports.domain.com/gdc/account/domains/domain/users?login=" + email
    body = {}
    headers = {
        'Content-Type': 'application/json',
        'Accept': 'application/json',
        'Cookie': cookie
    }
    response = requests.request(
        "GET",
        url,
        headers=headers,
        data=body
    )
    jsonContent = []
    jsonContent.append({response.text})
    accountSettings = jsonContent[0]
    print(accountSettings)

cookie = auth_api(credentials_User, credentials_Password)
profilehash = query_api(cookie, email)
The code itself works and sends a request to the Gooddata API.
The query_api() function returns JSON similar to the below:
{
    "accountSettings": {
        "items": [
            {
                "accountSetting": {
                    "login": "user@example.com",
                    "email": "user@example.com",
                    "firstName": "First Name",
                    "lastName": "Last Name",
                    "companyName": "Company Name",
                    "position": "Data Analyst",
                    "created": "2020-01-08 15:44:23",
                    "updated": "2020-01-08 15:44:23",
                    "timezone": null,
                    "country": "United States",
                    "phoneNumber": "(425) 555-1111",
                    "old_password": "secret$123",
                    "password": "secret$234",
                    "verifyPassword": "secret$234",
                    "authenticationModes": [
                        "SSO"
                    ],
                    "ssoProvider": "sso-domain.com",
                    "language": "en-US",
                    "ipWhitelist": [
                        "127.0.0.1"
                    ],
                    "links": {
                        "projects": "/gdc/account/profile/{profile_id}/projects",
                        "self": "/gdc/account/profile/{profile_id}",
                        "domain": "/gdc/domains/default",
                        "auditEvents": "/gdc/account/profile/{profile_id}/auditEvents"
                    },
                    "effectiveIpWhitelist": "[ 127.0.0.1 ]"
                }
            }
        ],
        "paging": {
            "offset": 20,
            "count": 100,
            "next": "/gdc/uri?offset=100"
        }
    }
}
The issue I am having is reading specific keys from this JSON dict. I can use accountSettings = jsonContent[0], but that just returns the same JSON.
What I want to do is read the value of the projects key within links.
How would I do this with a dict?
Thanks
Based on your description, you have your value inside a list, not a set (forget about sets: sets are not used with JSON). Inside your list, you either have your content as a single string, which you would then have to parse with json.loads, or it is simply a well-behaved nested data structure already extracted from JSON, but sitting inside a single-element list. The latter seems the most likely.
So, you should be able to do:
accountlink = jsonContent[0]["accountSettings"]["items"][0]["accountSetting"]["login"]
Otherwise, if it is encoded as a JSON string, you have to parse it first:
import json
accountlink = json.loads(jsonContent[0])["accountSettings"]["items"][0]["accountSetting"]["login"]
Now, given your question, I'd say you are at a beginner level as a programmer, or a casual user just using Python to automate something. Either way, I'd recommend you do some exercises before proceeding: it will save you a lot of time. I am not trying to bully or mock you here: this is the best advice I can offer. Look for tutorials that play around in the interactive mode, rather than trying entire programs at once that you'd just copy and paste.
Using the below code fixed the issue:
jsonContent = json.loads(response.text)
print(type(jsonContent))
test = jsonContent["accountSettings"]["items"][0]
test2 = test["accountSetting"]["links"]["self"]
print(test)
print(test2)
I believe this works because I hadn't noticed I was using .append for my jsonContent, which resulted in the data type being something other than it should have been.
Thanks to everyone who tried to help me.
Problem: the output of this code seems to repeat a lot of the same entries in the final list, making it exponentially longer.
The goal is to complete the query and then print the final list with all cities within each region:
[
    {
        "name": "Herat",
        "id": "AF~HER~Herat"
    }
]
[
    {
        "name": "Herat",
        "id": "AF~HER~Herat"
    },
    {
        "name": "Kabul",
        "id": "AF~KAB~Kabul"
    }
]
[
    {
        "name": "Herat",
        "id": "AF~HER~Herat"
    },
    {
        "name": "Kabul",
        "id": "AF~KAB~Kabul"
    },
    {
        "name": "Kandahar",
        "id": "AF~KAN~Kandahar"
    }
]
My goal is to get a list of city IDs. I first do a GET request and parse the JSON response to get the country IDs into a list.
Second: I have a for loop which makes another GET request for the region IDs, but now I need to add each country ID to the API URL. I do that with .format on the GET request, iterate through all the countries and their respective region IDs, and parse and store them in a list.
Third: I have another for loop, which makes another GET request for the city IDs, looping through all cities with the above region ID list, and collects the city IDs that I really need.
Code:
from requests.auth import HTTPBasicAuth
import requests
import json
import urllib3

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

def countries():
    data = requests.get("https://localhost/api/netim/v1/countries/", verify=False, auth=HTTPBasicAuth("admin", "admin"))
    rep = data.json()
    a = []
    for elem in rep['items']:
        a.extend([elem.get("id", "")])
    print(a)
    return a

def regions():
    ids = []
    for c in countries():
        url = requests.get("https://localhost/api/netim/v1/countries/{}/regions".format(c), verify=False, auth=HTTPBasicAuth("admin", "admin"))
        response = url.json()
        for cid in response['items']:
            ids.extend([cid.get("id", "")])
        data = []
        for r in ids:
            url = requests.get("https://localhost/api/netim/v1/regions/{}/cities".format(r), verify=False, auth=HTTPBasicAuth("admin", "admin"))
            response = url.json()
            data.extend([{"name": r.get("name", ""), "id": r.get("id", "")} for r in response['items']])
            print(json.dumps(data, indent=4))
    return data

regions()
print(regions())
You will see the output contains several copies of the same entry.
I'm not a programmer, so I'm not sure where I am getting it wrong.
It looks as though the output you're concerned with might be due to the fact that you're printing data as you iterate through it in the regions() method.
Try removing the line:
print(json.dumps(data, indent=4))
Also, and more importantly: you're setting data to an empty list every time you iterate over an item from countries(). You should declare that variable before the initial loop.
You're already printing the final result when you call the function, so printing as you iterate only really makes sense if you're debugging and need to review the data as you go.
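Putting both fixes together, regions() could look like this (a sketch that keeps the URLs and credentials from the question; data is declared once, and the only print left is the final one):

def regions():
    ids = []
    data = []
    for c in countries():
        url = requests.get("https://localhost/api/netim/v1/countries/{}/regions".format(c),
                           verify=False, auth=HTTPBasicAuth("admin", "admin"))
        for cid in url.json()['items']:
            ids.append(cid.get("id", ""))
    for r in ids:
        url = requests.get("https://localhost/api/netim/v1/regions/{}/cities".format(r),
                           verify=False, auth=HTTPBasicAuth("admin", "admin"))
        data.extend([{"name": city.get("name", ""), "id": city.get("id", "")}
                     for city in url.json()['items']])
    return data

print(json.dumps(regions(), indent=4))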
I am using this Python script to feed my data to Elasticsearch 6.0. How can I store the variable Value with type float in Elasticsearch?
I can't use the metric options for the visualization in Kibana, because all the data is automatically stored as strings:
from elasticsearch import Elasticsearch

# note: row, filename, name and es come from earlier parts of the script
Device = ""
Value = ""
for key, value in row.items():
    Device = key
    Value = value
    print("Dev", Device, "Val:", Value)
    doc = {'Device': Device, 'Measure': Value, 'Sourcefile': filename}
    print(' doc: ', doc)
    es.index(index=name, doc_type='trends', body=doc)
Thanks
EDIT:
After the advice of @Saul, I fixed the problem with the following code:
import os, csv
import time
from elasticsearch import Elasticsearch
#import pandas as pd
import requests

Datum = time.strftime("%Y-%m-%d_")
path = '/home/pi/Desktop/Data'
os.chdir(path)
name = 'test'
es = Elasticsearch()

#'Time': time ,
#url = 'http://localhost:9200/index_name?pretty'
doc = {
    "mappings": {
        "doc": {
            "properties": {
                "device": { "type": "text" },
                "measure": { "type": "text" },
                "age": { "type": "integer" },
                "created": {
                    "type": "date",
                    "format": "strict_date_optional_time||epoch_millis"
                }
            }
        }
    }
}

#headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
#r = requests.post(url, data=json.dumps(data), headers=headers)
r = es.index(index=name, doc_type='trends', body=doc)
print(r)
You need to send an HTTP PUT request using Python requests (index creation in Elasticsearch uses PUT), as follows:
url = "http://localhost:9200/index_name?pretty”
data = {
"mappings": {
"doc": {
"properties": {
"title": { "type": "text" },
"name": { "type": "text" },
"age": { "type": "integer" },
"created": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
}
headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
r = requests.post(url, data=json.dumps(data), headers=headers)
Please replace index_name in the URL with the name of the index you are defining in the Elasticsearch engine.
If you want to delete the index before creating it again, do as follows:
url = "http://localhost:9200/index_name”
data = { }
headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
r = requests.delete(url, data=json.dumps(data), headers=headers)
Please replace index_name in the URL with your actual index name. After deleting the index, create it again with the first code example above, including the mappings you need. Enjoy.
Elasticsearch defines field types in the index mapping. It looks like you probably have dynamic mapping enabled, so when you send data to Elasticsearch for the first time, it makes an educated guess about the shape of your data and the field types.
Once those types are set, they are fixed for that index, and Elasticsearch will continue to interpret your data according to those types no matter what you do in your python script.
To fix this you need to either:
1. Define the index mapping before you load any data. This is the better option, as it gives you complete control over how your data is interpreted: https://www.elastic.co/guide/en/elasticsearch/reference/6.0/mapping.html
2. Make sure that, the first time you send data into the index, you use the correct data types. This relies on dynamic mapping generation, but it will typically do the right thing.
Defining the index mapping is the best option. It's common to do that once off, in Kibana or with curl, or, if you create a lot of indices, with a template.
However, if you want to use Python, you should look at the create or put_mapping functions on IndicesClient, as sketched below.
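For example, with the Python client (a sketch for Elasticsearch 6.x, using the index name, doc type, and field names from the question; Measure is mapped as float so Kibana can use it as a metric):

from elasticsearch import Elasticsearch

es = Elasticsearch()

# create the index with an explicit mapping before indexing any documents
es.indices.create(
    index='test',
    body={
        "mappings": {
            "trends": {
                "properties": {
                    "Device": {"type": "keyword"},
                    "Measure": {"type": "float"},
                    "Sourcefile": {"type": "keyword"}
                }
            }
        }
    }
)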