Problems in trying to decode through statistics - python

My goal is to decode a message using letter-frequency statistics for the French language. To do that, I wrote a first function that builds a string of the letters in decreasing order of occurrence in the text. Then I match the characters by index and replace each one with the corresponding character from the French frequency ordering. The problem is that the text is printed unchanged in the console. I am a beginner in programming, and any help with my mistakes or inaccuracies is appreciated. Thanks!
import string
texte = f"""iwnspa rynjjdj arg sj hjuajhask awaigkhihaj ag sj ongyaonghihaj ps vvha rhaiwa, rdsqajg idjrhpaka
idooa wa xaka pa wn gyadkha pa w’hjbdkonghdj. hw arg ja wa 30 nqkhw 1916 n xagdrlaf, pnjr wa
ohiyhunj. rdj xaka arg sj csua, ag rn oaka arg wa xkdqhrask ps wfiaa pa unfwdkp, sja nsgka qhwwa
ps ohiyhunj. hw agspha n w’sjhqakrhga ps ohiyhunj ds hw rshg sj pdszwa iskrsr aj awaigkhihga
ag aj ongyaonghmsar. hw dzghajg sja whiajia pnjr iar pasv phrihxwhjar aj 1936, nqnjg pa xdskrshqka
rar agspar ns kaxsga ohg (onrrniysraggr hjrghgsga db gaiyjdwduf ). hw rdsghajg aj 1940 sja
gyara pa pdigdkng aj ongyaonghmsar (awwa xdkgnhg rsk par nxxwhinghdjr par ongyaonghmsar n
wn uajaghmsa) ag sj oaodhka pa onrgak aj awaigkhihga. rh wn xkaohaka bsg wnkuaoajg hujdkaa,
wn raidjpa,msh avxwhmsa idooajg sghwhrak war nwuazkar pa zddwa xdsk w’njnwfra par rhujnsv
awaigkhmsar, arg kargaa iawazka. aj 1941, hw arg aoznsiya pnjr war wnzdkngdhkar pa wn idoxnujha
pa gawaxydja ngg zaww. hooaphngaoajg, hw gknqnhwwa rsk par xkdcagr aj whnhrdj nqai war
rakqhiar raikagr, msh wsh bdjg jdgnooajg nzdkpak par msarghdjr pa ikfxgduknxyha. aj 1949, hw
ra onkha; xnk wn rshga, hw nskn gkdhr ajbnjgr. rynjjdj gknqnhwwa nsv wnzdkngdhkar pa zaww
csrms’aj 1971. xnknwwawaoajg n iawn, hw arg nsrrh xkdbarrask ns ohg pa 1958 n 1978. rynjjdj
idoxkajpmsa gdsga pdjjaa, oaoa wn qdhv ds par honuar, xasg ra gknjroaggka n w’nhpa p’sja rshga
pa 0 ag pa 1 (war zhgr), dsqknjg wn qdha nsv idoosjhinghdjr jsoakhmsar ag jdj xwsr njnwduhmsar.
hw odjgka nsrrh idooajg wa bnhg p’ncdsgak iakgnhjr zhgr n sj oarrnua xasg xakoaggka pa qakhbhak
msa war nsgkar djg aga idkkaigaoajg gknjrohr (dj xnkwa pa idpa idkkaigask p’akkaskr). hw n
kais pa jdozkasv ydjjaskr, pdjg wn oapnhwwa jnghdjnwa par rihajiar par onhjr ps xkarhpajg cdyjrdj
aj 1966, ag wa xkhv lfdgd aj 1985. n wn bhj pa rn qha, hw rdsbbka pa wn onwnpha p’nwtyahoak,
ia msh wa idjpshg pnjr sja onhrdj pa kaxdr ps onrrniysraggr. hw f paiapa wa 24 baqkhak 2001, n
w’nua pa 84 njr."""
def count(texte):
    ordre_txt = {}
    order = ""
    texte.lower()
    alfb = list(string.ascii_lowercase)
    for i in range(len(alfb)):
        ordre_txt[alfb[i]] = texte.count(alfb[i])
    ordre_txt = sorted(ordre_txt.items(), key = lambda x: x[1], reverse = True)
    for elt in ordre_txt:
        order += elt[0]
    return order
def trad2(texte):
    ordre_fr = 'esaitnruolhdcgmpvfqbjxzykw'
    ordre_txt = count(texte)
    for i in range(26):
        texte.replace(ordre_txt[i], ordre_fr[i])
    return texte
print(trad2(texte))

str.lower()
Return a copy of the string with all the cased characters converted to lowercase.
Strings are immutable, so the result has to be assigned: use texte = texte.lower() instead of texte.lower() on the 3rd line of def count(texte):.
The same applies to texte.replace(...) in def trad2(texte):, which also returns a new string and leaves texte untouched. Apply str.translate(table) there instead, e.g. as follows:
def trad2(texte):
    ordre_fr = 'esaitnruolhdcgmpvfqbjxzykw'
    ordre_txt = count(texte)
    tr_table = str.maketrans(ordre_txt, ordre_fr)
    return texte.translate(tr_table)
Note on the str.maketrans() static method: it returns a translation table usable for str.translate() … If there are two arguments, they must be strings of equal length, and in the resulting dictionary, each character in x will be mapped to the character at the same position in y.
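To see why a one-pass translation is safer than a loop of replace() calls, even once each result is assigned back, here is a small sketch. The function names are made up for illustration and are not part of the question's code. With chained replacements, a letter that has already been substituted can be substituted again by a later step, while translate() maps each character exactly once.

```python
import string


def letters_by_frequency(text):
    """Return the 26 lowercase letters, most frequent in `text` first."""
    text = text.lower()  # str.lower() returns a new string, so keep the result
    return "".join(sorted(string.ascii_lowercase, key=text.count, reverse=True))


def substitute_sequential(text, src, dst):
    """Chained replace(): a letter rewritten in one pass can be rewritten again later."""
    for old, new in zip(src, dst):
        text = text.replace(old, new)
    return text


def substitute_at_once(text, src, dst):
    """translate(): every character is mapped exactly once."""
    return text.translate(str.maketrans(src, dst))


sample = "abba"
print(substitute_sequential(sample, "ab", "ba"))  # aaaa
print(substitute_at_once(sample, "ab", "ba"))     # baab
print(letters_by_frequency("aab")[:2])            # ab
```

In the sequential version, every a first becomes b, and then every b (including the new ones) becomes a, so the swap is lost.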

Related

How would I get readable text from a webpage using Selenium and Python

I am using Selenium to navigate a webpage, https://gmet.edupage.org/. I have reached the login page, entered the login details, and opened the page whose text I want to read and print to the console. There isn't any text I want to exclude; I just want to print only the visible text, not the code. Unfortunately there is a lot of text, so selecting each element manually and printing its text would take ages. Is there an easy function or way to print only the readable text with Selenium?
First find the body of the page (selected here by its class name) and print the text within it using .text:
txt = driver.find_element_by_class_name('browser_chrome')
print(txt.text)
Output:
Tieto stránky používajú súbory cookie na analýzu návštevnosti stránok. Informácie o používaní našich stránok sú na tento účel zdieľané so spoločnosťou Google.
Ok
PRIHLÁSENIE
GYMNÁZIUM, Metodova 2, Bratislava
Rozvrh
Známky
Písomky / DÚ
Objednávanie obedov
Menu
Aktuálne informácie
O škole
Školský rok
Informácie pre žiakov školy
Tlačivá na stiahnutie
Štúdium v zahraničí
Záujemcovia o štúdium
Projekt FinQ
Školská jedáleň
Školský psychológ
METODKA občianske združenie
Športový klub METODKA
Rada školy
Web Mail
Informácie pre verejnosť
Kontakty
Škola je zapojená
Škola získala certifikát
Naša škola získala v roku 2014 a v roku 2017 a 2020 opakovane obnovila francúzsky certifikát kvality LabelFrancEducation
VIAC »
Pondelok 12. 10. 2020
Počet návštev: 25872260
"Naše obedy"
VIAC »
Škola je zapojená
Škola je zapojená
FinQ – PROGRAM FINANČNÉHO VZDELÁVANIA A ROZVOJA FINANČNEJ KULTÚRY PRE ŠKOLY
VIAC »
Aktuálne informácie
Aktualizácia opatrení ku dňu 1.10.2020
Usmernenie MŠVVaŠ SR k postupu škôl a školských zariadení pri realizácií výchovno vzdelávacieho procesu počas mimoriadnej situácie v súvislosti s ochorením Covid-19
Nad rámec opatrení vyplývajúcich z manuálov opatrení pre školy a školské zariadenia sa bez ohľadu na farbu semaforu predmetnej školy neodporúča realizovať:
organizáciu škôl v prírode a lyžiarskych výcvikov, a to ani dennou formou
organizácia kultúrnych, umeleckých a tanečných aktivít mimo povinného výchovno-vzdelávacieho procesu
VIAC »
Ako postupovať v jednotlivých prípadoch v súvislosti s COVID-19?
Riaditeľka školy zverejňuje pre rodičov, žiakov a učiteľov prehľad ako postupovať v jednotlivých prípadoch pri ochorení COVID-19
Ako postupovať v jednotlivých prípadoch (podozrenie, resp. pozitivita)?
Vyhlásenie zákonného zástupcu o bezinfekčnosti
Vyhlásenie návštevníka školy po súhlase riaditeľky školy
Informácie pre rodičov a žiakov školy
Riaditeľka školy si Vám dovoľuje oznámiť, že aktuálna situácia na škole v súvislosti s pandémiou COVID- 19 je nezmenená od začiatku školského roka, t.j. škola je v zelenej fáze semaforu. Výsledky testovaných žiakov a učiteľov na COVID-19, ktorí nastúpili na začiatku školského roka do školy, boli všetky k dnešnému dňu, 18.9.2020, negatívne.
Podrobné informácie k jednotlivým fázam a im prislúchajúcim opatreniam sú uvedené v „Manuáli pre stredné školy k úprave organizácie a podmienok výchovy a vzdelávania v stredných školách pre školský rok 2020/2021“ zverejnenom na stránke MŠVVaŠ SR na tomto linku: https://www.minedu.sk/stredne-skoly-aktualizovane-1692020/, ďalej len „Manuál“. Podmienky výchovy a vzdelávania na našej škole sú vždy plne v súlade s uvedeným dokumentom, aj po jeho aktualizáciách.
VIAC »
Ukáž sa! Pomôž Eliške získať grant.
Podpor svojim hlasom Elišku Klepkovú zo Septimy A, ktorá je vynikajúca atlétka, výborná žiačka, držiteľka strieborného odznaku DofE.
Bližšie informácie o Eliške a spôsobe hlasovania nájdeš na stránke:
https://www.olympic.sk/ukazsa/eliska-klepkova.
Hlasovať môžeš každý deň do 27.9.2020.
Začiatok školského roka 2020/2021
Vyučovanie začína dňa 2.9.2020.
V tento deň majú žiaci vyučovanie od 8:00 hod. do 9:30 hod. (V tento deň sa obedy v školskej jedálni nevydávajú).
Vstup žiakov do školy je od 7:00 hod. do 8:00 hod. do budovy na Metodovej aj Jelačičovej z ulice aj z dvora, celkovo cez štyri vstupy. Prichádzať môžu priebežne. Pri každom vchode bude pedagogický dozor koordinovať vstup žiakov.
Odporúčame žiakom, aby sa nezdržiavali v priestoroch pred budovami školy a dodržiavali dvojmetrové rozostupy.
Z dôvodu dodržiavania protiepidemických opatrení a odporúčaní Ministerstva školstva, vedy, výskumu a športu SR zákonný zástupca:
VIAC »
OZNAM
Úradné hodiny počas prázdnin
v termíne od 13. 07. 2020 do 21. 08. 2020
sú každú stredu od 9.00 h do 12.00 h.
Ing. Zuzana Vaterková
riaditeľka školy
Informácia pre uchádzačov o štúdium na 8GYM
Všetkým uchádzačom o 8-ročné štúdium oznamujeme, že ku dňu 7.7.2020 sú všetky miesta na prijatie obsadené na 100%.
Ing. Zuzana Vaterková, riaditeľka školy
Kontakt
GYMNÁZIUM
Metodova 2
821 08 BRATISLAVA
+421 2 50 10 24 11
Fotogaléria

how to create groups from string? [duplicate]

I have a string that I split, which gives me 320 elements. I need to create 4 groups: the first three with 100 elements each, and the last one with the remaining 20 elements. Finally, every group must be a string, not a list. How can I do that?
I can do it when I know how many elements I have:
s_l = 'AA AAL AAOI ACLS ADT ADTX ADVM AEL AIG ALEC ALLY ALT AMCX ANGI APA APDN APLS APPS APRN AQUA ARMK ARNC ARVN ATNM ATOM ATRA ATSG AVCT AVT AX AXDX BCLI BE BEAM BJRI BKE BKU BLDR BLNK BMRA BOOT BXS BYD CAKE CALX CAPR CARG CARR CARV CATM CC CCL CELH CEQP CFG CHEF CHRS CIT CLDX CLR CLSK CNK CNST CODX COLB COOP CPE CRS CTVA CUK CVET CVI CVM CYTK DAL DBX DCP DDS DEI DISCA DISCK DK DNB DRNA DVAX DXC ECOM EIGR ELAN ELF ELY ENVA EQ EQT EXEL FE FHI FIXX FL FLWS FMCI FORM FOX FOXA FRTA FUN GBX GIII GM GNMK GOCO GPRE GRAF GRPN GRWG GTHX GWB HALO HCC HCSG HEAR HFC HGV HIBB HMSY HOG HOME HP HSC HTH HWC IMUX IMVT INO INOV INSG INSM INT IOVA IRDM ITCI JELD JWN KMT KODK KPTI KSS KTB KTOS KURA LAKE LB LCA LL LPI LPRO LSCC LYFT MAXR MBOT MCRB MCS MD MDP MGM MGNX MIC MLHR MOS MRSN MTOR MXL MYGN NCLH NCR NK NKTR NLS NMIH NOVA NTLA NTNX NUAN NVST NXTC ODP OFC OKE OMER OMF OMI ONEM OSPN OSUR OXY OZK PACW PD PDCE PDCO PEAK PGNY PLAY PLCE PLT PLUG PPBI PRPL PRTS PRVB PS PSNL PSTX PSXP PTGX PVAC RCUS REAL REZI RKT RMBL RPAY RRGB RRR RVLV RVP RXN SANM SAVE SBGI SC SCPL SEAS SEM SFIX SFM SGMS SGRY SHLL SHOO SHYF SIX SKX SLQT SMCI SNAP SNDX SNV SONO SPAQ SPCE SPR SPWH SPWR SRG SRNE SSNT SSSS STOR SUM SUN SUPN SVMK SWBI SYF SYRS TBIO TCDA TCF TCRR TDC TEX TFFP TGTX THC TMHC TRGP TRIP TSE TUP TVTY UBX UCBI UCTT UFS UNFI UONE UPWK URBN USFD VCRA VERI VIAC VIRT VIVO VREX VSLR VSTO VXRT WAFD WBS WFC WHD WIFI WKHS WORK WORX WRK WRTC WW WWW WYND XEC XENT XPER XRX YELP ZGNX ZUMZ ZYXI'
split_s_l = s_l.split(" ")
part_1 = ' '.join(split_s_l[:100])
part_2 = ' '.join(split_s_l[100:200])
part_3 = ' '.join(split_s_l[200:300])
part_4 = ' '.join(split_s_l[300:])
for part in (part_1, part_2, part_3, part_4):
    print(part)
But I don't know how to do that when the list has an arbitrary number of elements.
For a variable number of items, you can loop using:
sep = ' '
num = 100
split_s_l = s_l.split(sep)
for i in range(0, len(split_s_l), num):
    part = sep.join(split_s_l[i : i+num])
    print(part)
Bear in mind that for the last slice in the example case ([300:400]) it does not matter that there are only 320 elements -- just the last 20 items will be included (no error).
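The slicing behaviour described above can be checked with a short sketch. The words list here is a generated stand-in for the 320 tickers, not the real data:

```python
# Stand-in for the 320 tickers obtained from s_l.split(' ')
words = [f"w{i}" for i in range(320)]
num = 100

# Size of each chunk produced by stepping through the list 100 items at a time
sizes = [len(words[i:i + num]) for i in range(0, len(words), num)]
print(sizes)                           # [100, 100, 100, 20]

# A slice that runs past the end is simply cut short, no IndexError
print(words[300:400] == words[300:])   # True
```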
Something like this?
def break_up(s, nwords, separator):
    words = s.split()
    return [separator.join(words[n:n+nwords]) for n in range(0, len(words), nwords)]
print(break_up('a b c d e f g h', 3, ' '))
Result:
['a b c', 'd e f', 'g h']
Of course, you could call it as print(break_up(s_l, 100, ' '))
The code below builds the groups explicitly:
s_l = 'AA AAL AAOI ACLS ADT ADTX ADVM AEL AIG ALEC ALLY ALT AMCX ANGI APA APDN APLS APPS APRN AQUA ARMK ARNC ARVN ATNM ATOM ATRA ATSG AVCT AVT AX AXDX BCLI BE BEAM BJRI BKE BKU BLDR BLNK BMRA BOOT BXS BYD CAKE CALX CAPR CARG CARR CARV CATM CC CCL CELH CEQP CFG CHEF CHRS CIT CLDX CLR CLSK CNK CNST CODX COLB COOP CPE CRS CTVA CUK CVET CVI CVM CYTK DAL DBX DCP DDS DEI DISCA DISCK DK DNB DRNA DVAX DXC ECOM EIGR ELAN ELF ELY ENVA EQ EQT EXEL FE FHI FIXX FL FLWS FMCI FORM FOX FOXA FRTA FUN GBX GIII GM GNMK GOCO GPRE GRAF GRPN GRWG GTHX GWB HALO HCC HCSG HEAR HFC HGV HIBB HMSY HOG HOME HP HSC HTH HWC IMUX IMVT INO INOV INSG INSM INT IOVA IRDM ITCI JELD JWN KMT KODK KPTI KSS KTB KTOS KURA LAKE LB LCA LL LPI LPRO LSCC LYFT MAXR MBOT MCRB MCS MD MDP MGM MGNX MIC MLHR MOS MRSN MTOR MXL MYGN NCLH NCR NK NKTR NLS NMIH NOVA NTLA NTNX NUAN NVST NXTC ODP OFC OKE OMER OMF OMI ONEM OSPN OSUR OXY OZK PACW PD PDCE PDCO PEAK PGNY PLAY PLCE PLT PLUG PPBI PRPL PRTS PRVB PS PSNL PSTX PSXP PTGX PVAC RCUS REAL REZI RKT RMBL RPAY RRGB RRR RVLV RVP RXN SANM SAVE SBGI SC SCPL SEAS SEM SFIX SFM SGMS SGRY SHLL SHOO SHYF SIX SKX SLQT SMCI SNAP SNDX SNV SONO SPAQ SPCE SPR SPWH SPWR SRG SRNE SSNT SSSS STOR SUM SUN SUPN SVMK SWBI SYF SYRS TBIO TCDA TCF TCRR TDC TEX TFFP TGTX THC TMHC TRGP TRIP TSE TUP TVTY UBX UCBI UCTT UFS UNFI UONE UPWK URBN USFD VCRA VERI VIAC VIRT VIVO VREX VSLR VSTO VXRT WAFD WBS WFC WHD WIFI WKHS WORK WORX WRK WRTC WW WWW WYND XEC XENT XPER XRX YELP ZGNX ZUMZ ZYXI'
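words = s_l.split(" ")
num_of_100_groups = len(words) // 100
groups = []
for i in range(0, num_of_100_groups):
    groups.append(words[i * 100 : (i+1) * 100])
# the remaining words (fewer than 100) form the last group;
# skip it when the word count is an exact multiple of 100
remainder = words[num_of_100_groups * 100:]
if remainder:
    groups.append(remainder)
for sub_group in groups:
    print(' '.join(sub_group))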

Not all keys of dictionary are present in CSV

I am trying to write to a CSV file with Scrapy. I have a dictionary named features. I add keys and then values. Some particular keys are added in a loop like this:
for i in paint_variable:
    print("inside")
    print(str(i[0]).replace(":",""))
    features[str(i[0]).replace(":","")] = ""
    features[str(i[0]).replace(":", "")]=(i[1])
paint_variable is a list. The loop adds the keys to features correctly, as the printed output shows:
title Sahibinden Peugeot 407 2.0 HDi Executive Premium 2006 Model
price 43.000 TL
İlan No: 14215895
İlan Tarihi: 15 Nisan 2020
Marka: Peugeot
Seri: 407
Model: 2.0 HDi Executive Premium
Yıl: 2006
Yakıt Tipi: Dizel
Vites Tipi: Otomatik
Motor Hacmi: 1997 cc
Motor Gücü: 138 hp
Kilometre: 295.000 km
Boya-değişen: Takasa Uygun
Takasa Uygun: Sahibinden
Kimden:
user_name [' Ayhan BOYSAN ']
explanation ['-', '- Tuna Caddesi, Hayır Sokak, Lena Reklam. Hemen cadde üzerinde, köşede araç görülebilir. Arkadaşımın ofisinin önünde durmakta.', '* Muayenesi yok, biteli iki ay kadar oldu, satışta anlaşılırsa yaptırıp vereceğim.', '* Sol arka ABS sensörü arıza veriyor, 150 lira parça fiyatı var, parçacılar açılır açılmaz sipariş verip yaptıracağım.', '* 295000 bakımı yapılmadı. Satışta anlaşılırsa yaptırıp vereceğim.', '* Aküsü 2 yıllık, sorunu yok ama araç kullanılmadığı için bir iki gün ya şarj edilmeli ya da bir iki saat dolaşılmalı.\xa0', '* Düz yolda iyi ama yokuşta çekişi düşük.\xa0', '* Cam tavan perdesinin ikisi eksik. Biz perdesiz kullanıyorduk çünkü 3 numara Ziebart noktasında film yaptırmıştım, çok rahat.', '* Isıtmalı koltukların rolesi değişmesi gerek, koltuk ısıtmaları çalışmıyor, geçen kış arızalandı, aracı kullanmadığımız için\xa0 \xa0 \xa0 \xa0yaptırmaya ihtiyaç duymadık.', '* Ezik yok, çizikler var bazı yerlerde, tampon köşelerinde de resimlerde gözüken yıpranmalar var.\xa0', 'Aracın ARTILARI', '* Orjinal, değişeni yok, kazası yok, macun yok.', '* Kaskolu hala. Boyattığım için sadece 2500 lira trameri var. Tamamen kozmetik amaçlı boyattım, gören usta anlar.', '* Ön cam 1 numara, yanlar 2 numara, cam tavan 3 numara Ziebarth filmi mevcut.', '* Derileri ve döşemeleri çok iyi.', '* Kesinlikle ezik bir araç değil.\xa0', '* Cd Changer mevcut', '* Yedek anahtarı mevcut', '* 17" Jant ve Lastikleri iyi durumda.', '* Xenon - Far Yıkama - Cam Tavan vb.', '* Her yıl bakımını orjinal parça kullanarak yaptım.', 'Evde iki kişiyiz ve bunu babamın kullanımı için almıştım ama yaştan ötürü artık kullanamaz hale geldi, evdeki 3. araç. İlk kez satışa koyuyorum ve masrafları olduğu için bu rakamı yazıyorum. Piyasasının ne olduğunu hepimiz biliyoruz. Ucuz diye kazalı sanmayın.', 'Bedavacı kişiler ararsa eğer karşılıklı şekilde üzülürüz uyarıyorum :)', 'Göz boyamak, kandırmak niyetindeki insanlardan olmadığım için ne biliyorsam yazdım. Yıkamadan resimleri ekledim. 
Buna göre yakışıksız teklifler mesajlar atmayın.', 'Değerinde toprak ile takasa girerim dağın başında olmaması kaydı ile.', 'Aracı görmek isteyenler Bayrampaşa eski cezaevi tarafına USTASI ile gelebilir. Oraya buraya gitmez aracım. Olumlu olumsuz yazdıklarım geçerlidir.', 'MESAJLA SON RAKAM YAZANLARA CEVAP VERMEM. Pazarlık gönül kırmamak adına hak ettiği kadar yapılır.', 'Saat 20.00 den sonra telefonlara bakamayabilirim, mesaj atarsanız dönerim.', 'Teşekkürler']
Sağ Arka Çamurluk Boyanmış
Arka Kaput Boyanmış
Sol Arka Çamurluk Boyanmış
Sağ Arka Kapı Boyanmış
Sağ Ön Kapı Boyanmış
Tavan Orijinal
Sol Arka Kapı Boyanmış
Sol Ön Kapı Boyanmış
Sağ Ön Çamurluk Boyanmış
Motor Kaputu Boyanmış
Sol Ön Çamurluk Boyanmış
Ön Tampon Boyanmış
Arka Tampon Boyanmış
year 2006
feul_type Dizel
gear_type Otomatik
case_type Station wagon
kilometer 295.000 km
vehicle_type Bireysel Araç
color Bej
nationality (TR) Türkiye
warranty Garantisi Yok
first_owner İlk Sahibi Değilim
suitable_for_clearing Takasa Uygun
from_who Sahibinden
annual_mtv 1.124 TL
traction Motor ve Performans
number_of_cylinders Önden Çekiş
maxiumum_power 138 hp
minimum_power 11,2 sn
acceleration 201 km/s
maximumspeed Ortalama Yakıt Tüketimi
The keys added in the loop start after the explanation key, and they are present in the output above. However, they are not written to the CSV.
Screenshot of the resulting CSV: see link [1].
As the screenshot shows, these keys are missing.
Here is the full code, in case anyone wants to reproduce the problem:
from datetime import datetime
import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.spiders import SitemapSpider
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

class Myspider(SitemapSpider):
    name = 'spidername'
    sitemap_urls = ['https://www.arabam.com/sitemap/otomobil_1.xml']
    sitemap_rules = [
        ('/otomobil/', 'parse'),
        # ('/category/', 'parse_category'),
    ]
    custom_settings = {'FEED_FORMAT':'csv','FEED_URI': str(datetime.today().strftime('%d%m%y'))+'.csv'}

    def parse(self,response):
        # follow only ads whose URL contains a year later than 2001
        for td in response.xpath("/html/body/div[3]/div[6]/div[4]/div/div[2]/table/tbody/tr/td[4]/div/a/@href").extract():
            checks = str(td.split("/")[3]).split("-")
            for items in checks:
                if items.isdigit():
                    if int(items) > 2001:
                        url = "https://www.arabam.com/"+ td
                        yield scrapy.Request(url, callback=self.parse_dir_contents)

    def parse_dir_contents(self,response):
        # print("hi here")
        features = {}
        features["ad_url"] = response.request.url
        features["ad_path"] =""
        for paths in response.xpath(" /html/body/div[3]/div[6]/div[1]/div/div/div/div[1]/div/span/a/text()").extract():
            features["ad_path"]+=paths
            features["ad_path"]+="/"
        features["title"] = response.xpath("/html/body/div[3]/div[6]/div[3]/div/div[1]/h1/text()").extract()[0]
        features["price"] = response.xpath("/html/body/div[3]/div[6]/div[3]/div/div[1]/div[1]/div[2]/div[1]/div/span/text()").extract()[0]
        for items in response.xpath("/html/body/div[3]/div[6]/div[3]/div/div[1]/div[1]/div[2]/ul/li/span[1]/text()").extract():
            features[(str(items) )] = ""
        counter = 1
        for items in response.xpath("/html/body/div[3]/div[6]/div[3]/div/div[1]/div[1]/div[2]/ul/li/span[2]/text()").extract():
            query = (response.xpath("/html/body/div[3]/div[6]/div[3]/div/div[1]/div[1]/div[2]/ul/li["+str(counter)+"]/span[1]/text()").extract())
            features[str(query[0])] = items
            counter+=1
        features["user_name"] = response.xpath("/html/body/div[3]/div[6]/div[3]/div/div[3]/div/div/div[1]/p/text()").extract()
        if not features["user_name"]:
            features["user_name"] = response.xpath("/html/body/div[3]/div[6]/div[3]/div/div[3]/div/div/div[1]/a/span/text()").extract()
        features["explanation"] = response.xpath("/html/body/div[3]/div[6]/div[3]/div/div[1]/div[3]/div/div[1]/div/div/p/text()").extract()
        #change this
        paint_variable = []
        for li in response.xpath("/html/body/div[3]/div[6]/div[3]/div/div[1]/div[3]/div/div[2]/div/div[2]/ul[1]/li"):
            paint_variable.append(li.xpath('span/text()').extract())
        # these are the keys that go missing from the CSV
        for i in paint_variable:
            print("inside")
            print(str(i[0]).replace(":",""))
            features[str(i[0]).replace(":","")] = ""
            features[str(i[0]).replace(":", "")]=(i[1])
        options = webdriver.ChromeOptions()
        options.add_argument('headless')
        options.add_argument('window-size=1200x600')
        d = webdriver.Chrome(chrome_options=options,
                             executable_path='/Users/fatima.arshad/Downloads/chromedriver')
        d.get(features["ad_url"])
        # Use send_keys(Keys.HOME) to scroll up to the top of page
        d.find_element_by_tag_name('body').send_keys(
            Keys.END)
        while True:
            d.find_element_by_tag_name('body').send_keys(
                Keys.UP)
            e = d.find_element_by_xpath("/html/body/div[3]/div[6]/div[3]/div/div[1]/div[3]/div/div[3]/div")
            if e.text:
                break
        overview1 = e.text.split("\n")
        features["year"] = overview1[2]
        features["feul_type"] = overview1[4]
        features["gear_type"] = overview1[6]
        features["case_type"] = overview1[8]
        features["kilometer"] = overview1[10]
        features["vehicle_type"] = overview1[12]
        features["color"] = overview1[14]
        features["nationality"] = overview1[16]
        features["warranty"] = overview1[18]
        features["first_owner"] = overview1[20]
        features["suitable_for_clearing"] = overview1[22]
        features["from_who"] = overview1[24]
        features["annual_mtv"] = overview1[26]
        features["traction"] = overview1[29]
        features["number_of_cylinders"] = overview1[31]
        features["maxiumum_power"] = overview1[39]
        features["minimum_power"] = overview1[41]
        features["acceleration"] = overview1[43]
        features["maximumspeed"] = overview1[45]
        print("feature starting")
        for key, value in features.items():
            print(key, value)
        # print("s"+ str(overview1))
        yield features

process = CrawlerProcess({
})
process.crawl(Myspider)
process.start() # the script wi
[1]: https://i.stack.imgur.com/qfS2e.png
You must do one of the following:
Define all output fields in the FEED_EXPORT_FIELDS setting
Use an Item subclass instead of a dict
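A minimal sketch of the first option, assuming the spider keeps the custom_settings from the question. As far as I know, when items are plain dicts and no field list is configured, the CSV exporter takes its columns from the first item it writes, which would explain why keys that only appear on some pages are dropped. The column list below is hypothetical and only illustrative: it mixes a few of the fixed keys with a few of the paint keys seen in the printed output, and a real list would have to name every column you want.

```python
from datetime import datetime

# Hypothetical column list: some fixed keys from the question's `features`
# dict plus some of the per-page paint keys from the printed output.
EXPORT_FIELDS = [
    "ad_url", "ad_path", "title", "price", "user_name", "explanation",
    "Tavan", "Motor Kaputu", "Ön Tampon", "Arka Tampon",
    "year", "color",
]

# Same settings as in the question, with FEED_EXPORT_FIELDS added so the
# CSV header no longer depends on which keys the first item happens to have.
custom_settings = {
    "FEED_FORMAT": "csv",
    "FEED_URI": datetime.today().strftime("%d%m%y") + ".csv",
    "FEED_EXPORT_FIELDS": EXPORT_FIELDS,
}

print(custom_settings["FEED_EXPORT_FIELDS"][:4])
```

This dict would replace the custom_settings attribute on the spider class. Items that lack one of the listed keys should simply get an empty cell in that column.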

Python error: string indices must be integers

I need to work with a small JSON data object in Python, but when I use the code below it doesn't work. What am I doing wrong?
This is for the newest version of Python
import urllib, json
import requests
import json
with open('locaties.json') as json_file:
    data = json.load(json_file)
    for parkeerlocaties in data['parkeerlocaties']:
        for locatie in parkeerlocaties['parkeerlocatie']:
            for title in locatie['title']:
                print("Hello World")
{"parkeerlocaties":[{"parkeerlocatie":{"title":"Fietsenstalling Tolhuisplein","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9032801,52.3824545]}","type":"Fietspunt","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/fiets\/fietsparkeren\/gemeentelijke\/","urltitle":"www.amsterdam.nl\/fiets","adres":"Buiksloterweg 3","postcode":"1031 CC","woonplaats":"Amsterdam","opmerkingen":"Alleen toegankelijk voor abonnementhouders van Tolhuisplein, automatische stalling"}},{"parkeerlocatie":{"title":"Fietsenstalling Paradiso","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8833735,52.3621851]}","type":"Fietspunt","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/fiets\/fietsparkeren\/gemeentelijke\/","urltitle":"www.amsterdam.nl\/fiets","adres":"Weteringschans 4 A","postcode":"1017 SG","woonplaats":"Amsterdam","opmerkingen":"Maximale parkeerduur 28 dagen, stalling met toezicht"}},{"parkeerlocatie":{"title":"Fietsenstalling Zuidplein","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8719467,52.3398642]}","type":"Fietspunt","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/fiets\/fietsparkeren\/gemeentelijke\/","urltitle":"www.amsterdam.nl\/fiets","adres":"Zuidplein 5","postcode":"1077 XV","woonplaats":"Amsterdam","opmerkingen":"Maximale parkeerduur 28 dagen, stalling met toezicht"}},{"parkeerlocatie":{"title":"Fietsenstalling Station Rai (gesloten tot februari 2019)","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8905079,52.339392]}","type":"Fietspunt","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/fiets\/fietsparkeren\/gemeentelijke\/","urltitle":"www.amsterdam.nl\/fiets","adres":"Europaboulevard 4","postcode":"1083 AD","woonplaats":"Amsterdam","opmerkingen":"Sluit voor renovatie op 21 juli 2018. 
Er zijn rond het station extra parkeerplekken voor fiets gemaakt."}},{"parkeerlocatie":{"title":"P+R Zeeburg","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9607015,52.3719632]}","type":"P+R","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeren-reizen\/#h4f9f93f8-875b-4d18-936a-c1eba9d6f198","urltitle":"www.amsterdam.nl\/penr ","adres":"Zuiderzeeweg 46 a","postcode":"1095KJ","woonplaats":"Amsterdam","opmerkingen":"","OV_bus":"bus 37 Noord - Amstelstation vv","OV_tram":"tram 26 Ijburg - Centraal Station vv","OV":"tram;GVB_26_1;08240, bus;GVB_37_2;08134"}},{"parkeerlocatie":{"title":"Weekend P+R VUmc","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8611063,52.3361167]}","type":"P+R","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeren-reizen\/#hdce18cfd-fc8f-4728-be57-2d9a23b494d9","urltitle":"www.amsterdam.nl\/penr","adres":"Gustav Mahlerlaan 3004","postcode":"1081 LA","woonplaats":"Amsterdam","opmerkingen":"","OV_metro":"metro 51 Isolatorweg - Centraal Station vv (maart 2019 t\/m eind 2020), metro 50 met overstap Overamstel op 51 Centraal Station","OV_tram":"tram 24 VU medisch centrum - Centraal Station vv, tram 5 Amstelveen - Van Hallstraat vv","OV":"metro;GVB_50_1;07343;09563, tram;GVB_24_1;07350, tram;GVB_5_1;07410"}},{"parkeerlocatie":{"title":"P+R Bos en Lommer","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8453671,52.379131]}","type":"P+R","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeren-reizen\/#h9434503d-d323-4331-b792-5210ce062c42","urltitle":"www.amsterdam.nl\/penr ","adres":"Leeuwendalersweg 23 b","postcode":"1055JE","woonplaats":"Amsterdam","opmerkingen":"","OV_bus":"bus 21 Geuzenveld - Centraal Station vv","OV_tram":"tram 7 Slotermeer - Azartplein vv","OV":"bus;GVB_21_1;03060, tram;GVB_7_1;03167"}},{"parkeerlocatie":{"title":"P+R 
Sloterdijk","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8384209,52.3900128]}","type":"P+R","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeren-reizen\/#h628fb483-dec3-4a9d-9d52-50136e9639ec","urltitle":"www.amsterdam.nl\/penr","adres":"Piarcoplein 1","postcode":"1043DW","woonplaats":"Amsterdam","opmerkingen":"","OV_bus":"bus 22 Station Sloterdijk - Muiderpoortstation vv","OV_metro":"metro 50 Isolatorweg - Gein vv, overstap 51 op Station Zuid \/ Station RAI \/ Overamstel","OV_tram":"tram 19 Station Sloterdijk - Diemen vv","OV_trein":"Treinen tussen station Sloterdijk en de stations CS, Muiderpoort en Amstel (GVB P+R-kaart niet geldig)","OV":"tram;GVB_19_1;02361;00014, metro;GVB_50_1;02295;09563, metro;GVB_51_1;*09563, bus;GVB_22_1;02367;00001"}},{"parkeerlocatie":{"title":"P+R Olympisch Stadion","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8539215,52.3440266]}","type":"P+R","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeren-reizen\/#h4567b083-9fea-4848-882a-280b6abc7853","urltitle":"www.amsterdam.nl\/penr ","adres":"Olympisch Stadion 44","postcode":"1076DE","woonplaats":"Amsterdam","opmerkingen":"","OV_tram":"tram 24 VU medisch centrum - Centraal Station vv","OV":"tram;GVB_24_1;07121"}},{"parkeerlocatie":{"title":"P+R Johan Cruijff ArenA","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9405734,52.3137551]}","type":"P+R","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeren-reizen\/#h1dfa5189-98e8-42ce-8119-ce74f2451969","urltitle":"www.amsterdam.nl\/penr","adres":"Burgemeester Stramanweg 130","postcode":"1101EP","woonplaats":"Amsterdam","opmerkingen":"","OV_metro":"metro 54 Gein - Centraal Station vv","OV_trein":"Treinen tussen station Bijlmer Arena en stations Amstel, Muiderpoort en Centraal Station (GVB P+R-kaart niet geldig)","OV":"metro;GVB_54_1;09522"}},{"parkeerlocatie":{"title":"Amsterdamse Poort (P21 t\/m 
24)","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9626214,52.3192019]}","type":"CommercieleParkeergarage","url":"https:\/\/www.q-park.nl\/nl-nl\/parkeren\/amsterdam\/amsterdamse-poort-p21\/","urltitle":"Amsterdamse Poort P21","adres":"Bijlmerdreef 700","postcode":"1103DS","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"P18 HES\/ ROC","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9466199,52.3152543]}","type":"Parkeergarage","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeergarages\/parkeergarages\/garage-p18-hes-roc\/","urltitle":"Bekijk P18 HES\/ ROC op www.amsterdam.nl\/parkeergarages","adres":"Fraijlemaborg 131","postcode":"1102CV","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"P1 ArenA","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9405851,52.3137433]}","type":"Parkeergarage","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeergarages\/parkeergarages\/parkeergarage-p1\/","urltitle":"Bekijk P1 ArenA op www.amsterdam.nl\/parkeergarages","adres":"Burgemeester Stramanweg 130","postcode":"1101EP","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"P10 Plaza ArenA","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9409531,52.3080762]}","type":"Parkeergarage","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeergarages\/parkeergarages\/p10-plaza-arena\/","urltitle":"Bekijk P10 Plaza ArenA op www.amsterdam.nl\/parkeergarages","adres":"Herikerbergweg 288","postcode":"1101CT","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"P 3 Mikado","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9413266,52.3103066]}","type":"Parkeergarage","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeergarages\/parkeergarages\/garage-p3-mikado\/","urltitle":"Bekijk P3 Mikado op www.amsterdam.nl\/parkeergarages","adres":"De entree 228","postcode":"1101EE","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"RAI 
Parking","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8921615,52.3383996]}","type":"CommercieleParkeergarage","url":"https:\/\/www.rai.nl\/nl\/contact-bereikbaarheid-en-parkeren\/parkeren-bij-rai-amsterdam\/","urltitle":"Rai Parking","adres":"Europaboulevard 24","postcode":"1078GZ","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Qpark Eurocenter","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8888094,52.3358123]}","type":"CommercieleParkeergarage","url":"https:\/\/www.q-park.nl\/nl-nl\/parkeren\/amsterdam\/eurocenter\/","urltitle":"Qpark Eurocenter ","adres":"Barbara Strozzilaan 342","postcode":"1083HN","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Qpark Mahler","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8723915,52.3377672]}","type":"CommercieleParkeergarage","url":"https:\/\/www.q-park.nl\/nl-nl\/parkeren\/amsterdam\/mahler\/","urltitle":"Qpark Mahler","adres":"Claude Debussylaan 42","postcode":"1082MD","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Olympisch Stadion","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8539215,52.3440266]}","type":"CommercieleParkeergarage","url":"http:\/\/www.p1.nl\/parkeren\/parkeergarage-olympisch-stadion\/","urltitle":"P1 Parkeergarage Olympisch Stadion ","adres":"Olympisch Stadion 44","postcode":"1076DE","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Interparking Oranjekwartier Amsterdam","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.839149,52.3546448]}","type":"CommercieleParkeergarage","url":"http:\/\/www.interparking.nl\/nl-NL\/find-parking\/Oranjekwartier\/","urltitle":"Interparking Oranjekwartier Amsterdam","adres":"Carnapstraat 200","postcode":"1062KZ","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Bomengarage P2 (Boven 't IJ) 
","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9388323,52.3994577]}","type":"Parkeergarage","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeergarages\/parkeergarages\/garage-p2bomengarage\/","urltitle":"Bekijk Bomengarage P2 op www.amsterdam.nl\/parkeergarages","adres":"Buikslotermeerplein 237","postcode":"1025 XB","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Qpark Westergasfabriek","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8662388,52.3847072]}","type":"CommercieleParkeergarage","url":"https:\/\/www.q-park.nl\/nl-nl\/parkeren\/amsterdam\/westergasfabriek\/","urltitle":"Qpark Amsterdam Westergasfabriek","adres":"Van Bleiswijkstraat 8","postcode":"1051DG","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Qpark Europarking","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8766781,52.3699218]}","type":"CommercieleParkeergarage","url":"https:\/\/www.q-park.nl\/nl-nl\/parkeren\/amsterdam\/europarking\/","urltitle":"Qpark Europarking","adres":"Marnixstraat 250","postcode":"1016TL","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Qpark Byzantium","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8793897,52.3618422]}","type":"CommercieleParkeergarage","url":"https:\/\/www.q-park.nl\/nl-nl\/parkeren\/amsterdam\/byzantium\/","urltitle":"Qpark Amsterdam Byzantium","adres":"Tesselschadestraat 1","postcode":"1054ET","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Piet Heingarage","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9173751,52.3773883]}","type":"Parkeergarage","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeergarages\/parkeergarages\/parkeergarage-piet\/","urltitle":"Bekijk Piet Heingarage op www.amsterdam.nl\/parkeergarages","adres":"Piet Heinkade 59","postcode":"1019GM","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Parking Centrum 
Oosterdok","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9092051,52.3761913]}","type":"CommercieleParkeergarage","url":"http:\/\/www.parkingcentrumoosterdok.nl\/","urltitle":"Parking Centrum Oosterdok","adres":"Oosterdoksstraat 150","postcode":"1011AD","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Markenhoven","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.908618,52.3696328]}","type":"Parkeergarage","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeergarages\/parkeergarages\/garage-markenhoven\/","urltitle":"Bekijk Markenhoven op www.amsterdam.nl\/parkeergarages","adres":"Anne Frankstraat 220","postcode":"1011 MP","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"(P1) Parking Waterlooplein","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9043352,52.3689665]}","type":"CommercieleParkeergarage","url":"http:\/\/www.parkereninwaterlooplein.nl\/","urltitle":"Parkeergarage Waterlooplein ","adres":"Valkenburgerstraat 238","postcode":"1011ND","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Stadhuis - Muziektheater","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9018035,52.3670615]}","type":"Parkeergarage","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeergarages\/parkeergarages\/garage-stadhuis\/","urltitle":" Bekijk Stadhuis-Muziektheater op www.amsterdam.nl\/parkeergarages","adres":"Waterlooplein 28","postcode":"1011PG","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Parkeergarage Prins & Keizer","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.891798,52.3622906]}","type":"CommercieleParkeergarage","url":"http:\/\/www.apcoa.nl\/parkeren-in\/amsterdam\/apcoa-parking-prins-keizer.html","urltitle":"Apcoa Parking Prins & Keizer","adres":"Prinsengracht 927","postcode":"1017HL","woonplaats":"Amsterdam","aantal":"140","opmerkingen":""}},{"parkeerlocatie":{"title":"Qpark De 
Bijenkorf","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.895162,52.373881]}","type":"CommercieleParkeergarage","url":"https:\/\/www.q-park.nl\/nl-nl\/parkeren\/amsterdam\/de-bijenkorf\/","urltitle":"Qpark De Bijenkorf","adres":"Beursplein 15","postcode":"1012JW","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Qpark Nieuwendijk","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8944693,52.3764423]}","type":"CommercieleParkeergarage","url":"https:\/\/www.q-park.nl\/nl-nl\/parkeren\/amsterdam\/nieuwendijk\/","urltitle":"Qpark Nieuwendijk","adres":"Nieuwezijds Kolk 18","postcode":"1012PV","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"(P1) Parking Amsterdam Centre","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8970068,52.3785141]}","type":"CommercieleParkeergarage","url":"http:\/\/www.p1.nl\/parkeren\/p1-parking-amsterdam-centre\/","urltitle":"P1 Parking Amsterdam Centre ","adres":"Prins Hendrikkade 20 a","postcode":"1012TL","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Parkeergarage Apcoa Heinekenplein","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8924871,52.3571537]}","type":"CommercieleParkeergarage","url":"http:\/\/www.apcoa.nl\/parkeren-in\/amsterdam\/apcoa-parking-heinekenplein.html","urltitle":"Apcoa garage Heinekenplein ","adres":"Eerste Van der Helststraat 6","postcode":"1072NV","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"Qpark Museumplein","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.8798246,52.3571347]}","type":"CommercieleParkeergarage","url":"https:\/\/www.q-park.nl\/nl-nl\/parkeren\/amsterdam\/museumplein\/","urltitle":"Qpark Museumplein ","adres":"Van Baerlestraat 33 B","postcode":"1071AP","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"P4 en P5 Villa 
ArenA","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9389632,52.3118578]}","type":"Parkeergarage","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeergarages\/parkeergarages\/","urltitle":"Bekijk P4 en P5 Villa ArenA op www.amsterdam.nl\/parkeergarages ","adres":"De entree 7","postcode":"1101BH","woonplaats":"Amsterdam","opmerkingen":""}},{"parkeerlocatie":{"title":"P4 en P5 Villa ArenA","Locatie":"{\"type\":\"Point\",\"coordinates\":[4.9389632,52.3118578]}","type":"Parkeergarage","url":"https:\/\/www.amsterdam.nl\/parkeren-verkeer\/parkeergarages\/parkeergarages\/","urltitle":"Bekijk P4 en P5 Villa ArenA op www.amsterdam.nl\/parkeergarages","adres":"De entree 7","postcode":"1101BH","woonplaats":"Amsterdam","opmerkingen":""}}
The current error message is "TypeError: string indices must be integers", but I expected the code to print the title of every parkeerlocatie.
parkeerlocaties['parkeerlocatie'] is not a list, it's a dictionary. You should use parkeerlocaties['parkeerlocatie']['title'] (the inner key is singular). The title is a string, so there's no reason to iterate over it (unless you want to process it character by character for some reason).
with open('locaties.json') as json_file:
    data = json.load(json_file)
    # Each list element is a dict with a single 'parkeerlocatie' key.
    for parkeerlocaties in data['parkeerlocaties']:
        print('Title: ', parkeerlocaties['parkeerlocatie']['title'])
>>> type(data)
<type 'dict'>
>>> type(data['parkeerlocaties'])
<type 'list'>
So the code could be:
import json

with open('locaties.json') as json_file:
    data = json.load(json_file)

# data['parkeerlocaties'] is a list of {'parkeerlocatie': {...}} dicts.
parkeerlocaties = data['parkeerlocaties']
for parkeerlocatie in parkeerlocaties:
    print(parkeerlocatie['parkeerlocatie']['title'])
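To try this without the file, here is a minimal self-contained sketch. The two-entry `sample` is a trimmed stand-in for locaties.json (an assumption for illustration, not the full data), built so that it has the same nesting as the JSON in the question. It also shows where the TypeError comes from, and that the "Locatie" field is itself a JSON-encoded string that needs a second `json.loads` call.

```python
import json

# Trimmed stand-in for locaties.json: same nesting, only two entries.
# "Locatie" is stored as a JSON string inside the JSON, as in the question.
sample = {
    "parkeerlocaties": [
        {"parkeerlocatie": {
            "title": "Qpark Mahler",
            "Locatie": json.dumps(
                {"type": "Point", "coordinates": [4.8723915, 52.3377672]}),
        }},
        {"parkeerlocatie": {
            "title": "Olympisch Stadion",
            "Locatie": json.dumps(
                {"type": "Point", "coordinates": [4.8539215, 52.3440266]}),
        }},
    ]
}

# Round-trip through text to mimic json.load() on the real file.
data = json.loads(json.dumps(sample))

# Outer list -> dict with one 'parkeerlocatie' key -> inner dict with 'title'.
titles = [entry["parkeerlocatie"]["title"] for entry in data["parkeerlocaties"]]
print(titles)

# Indexing a str with a str is what raises the TypeError from the question.
try:
    titles[0]["title"]
except TypeError as exc:
    print("TypeError:", exc)

# The coordinates need a second decode because "Locatie" is a string.
point = json.loads(data["parkeerlocaties"][0]["parkeerlocatie"]["Locatie"])
lon, lat = point["coordinates"]  # order as stored in the data: lon, lat
print(lon, lat)
```

The exact wording of the TypeError message varies between Python versions, so the sketch only prints it.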

Python way to detect language ISO code

I have millions of sentence fragments and I am trying to determine whether each one is in English, French, Japanese, or German. Is there a Python library that does this?
s1 = 'This is where lies a person'
s2 = 'ボウリング・フォー・コロンバイン(字幕版)'
s3 = 'Ep. 2448 : épisode du 12 mars 2014 (Plus belle la vie, Saison 10, Vol. 6)'
language_of_string(s1) ==> EN
language_of_string(s2) ==> JP
language_of_string(s3) ==> FR
Try langid. The source code is at
https://github.com/saffsd/langid.py
>>> import langid
>>> langid.classify("This is a test")
('en', 0.99999999099035441)
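A possible way to wrap this into the `language_of_string` helper from the question is sketched below. `make_detector` is a hypothetical name, not part of langid. The sketch assumes langid is installed and uses `langid.set_languages` to restrict the candidates to the four languages of interest. A `classify` function can be passed in instead, which makes it possible to try the wrapper without the library. Note that langid returns ISO 639-1 codes, so Japanese comes back as "ja", not "JP" (JP is the country code).

```python
def make_detector(classify=None, languages=("en", "fr", "ja", "de")):
    """Return a language_of_string(text) function (hypothetical helper).

    classify: any callable returning (code, score), like langid.classify.
    If omitted, langid is imported and restricted to `languages` once.
    """
    if classify is None:
        import langid  # third-party: pip install langid
        langid.set_languages(list(languages))
        classify = langid.classify

    def language_of_string(text):
        code, _score = classify(text)
        return code.upper()  # e.g. 'en' -> 'EN', 'ja' -> 'JA'

    return language_of_string


if __name__ == "__main__":
    # Stub classifier so the sketch runs even without langid installed.
    def fake_classify(text):
        return ("fr", 1.0)

    detect = make_detector(classify=fake_classify)
    print(detect("Plus belle la vie"))
```

With millions of fragments, building the detector once and reusing it avoids repeating the language restriction on every call.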
Another option is guess_language:
s1 = 'This is where lies a person'
s2 = 'ボウリング・フォー・コロンバイン(字幕版)'
s3 = 'Ep. 2448 : épisode du 12 mars 2014 (Plus belle la vie, Saison 10, Vol. 6)'
import guess_language
print(guess_language.guessLanguage(s1))
print(guess_language.guessLanguage(s2))
print(guess_language.guessLanguage(s3))
en
ja
fr
