How to sort the websites by their popularity? - python

I'm currently using the script below and can't find a way to sort the websites by their popularity. I'm a beginner.
import random
# Create a dictionary for the hyperlinks
Hypertext = {}
# Create a dictionary for the number of visits
Walk_Number = {}
# A variable for the total number of visits
Total_Walk = 0
# List of websites
Websites = ["A","B","C","D","E","F"]
# The hyperlinks:
# the dictionary has keys (site names)
# that hold lists (hyperlinks)
Hypertext["A"] = ["B","C","E"]
Hypertext["B"] = ["F"]
Hypertext["C"] = ["A","E"]
Hypertext["D"] = ["B","C"]
Hypertext["E"] = ["A","B","C","D","F"]
Hypertext["F"] = ["E"]
print(Hypertext)
# Initialize each site's visit count to 0.0
Walk_Number["A"] = 0.0
Walk_Number["B"] = 0.0
Walk_Number["C"] = 0.0
Walk_Number["D"] = 0.0
Walk_Number["E"] = 0.0
Walk_Number["F"] = 0.0
i = 0
while i < 1000:
    x = random.choice(Websites)
    while random.random() < 0.85:
        Walk_Number[x] = Walk_Number[x] + 1
        Total_Walk = Total_Walk + 1
        x = random.choice(Hypertext[x])
    i = i + 1
print(Walk_Number)
print(Total_Walk)
I tried using the sort() function, but I can't find a way to fit it into the script.

I think by popularity you mean the number of visits you have saved in your Walk_Number dictionary. If you want to re-sort your dictionary by value in descending order, you can do it like this:
def sort_dict_by_value(d, reverse=False):
    return dict(sorted(d.items(), key=lambda x: x[1], reverse=reverse))
print(sort_dict_by_value(Walk_Number, True))
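As a side note on why sort() probably did not fit: sort() is a method of lists, not dictionaries, while sorted() accepts any iterable and returns a new list. The sketch below ranks the sites from most to least visited. The counts are made-up stand-ins for what Walk_Number might hold after the simulation, and the names walk_number, ranking and most_popular are only illustrative.

```python
# Hypothetical visit counts standing in for Walk_Number after the simulation.
walk_number = {"A": 812.0, "B": 1040.0, "C": 655.0, "D": 301.0, "E": 1377.0, "F": 990.0}

# sorted() returns a new list of (site, visits) pairs;
# key picks the visit count and reverse=True puts the largest first.
ranking = sorted(walk_number.items(), key=lambda item: item[1], reverse=True)

# Print a numbered ranking, most popular site first.
for rank, (site, visits) in enumerate(ranking, start=1):
    print(rank, site, int(visits))

# Just the site names, in order of popularity.
most_popular = [site for site, _ in ranking]
```

Keeping the result as a list of pairs avoids relying on dictionary ordering, which is only guaranteed from Python 3.7 onwards.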

Related

convert a tuple of tuples into bytes in python

I'm making a video game in UPBGE (which, for this purpose, is basically Python). To send information between sockets it needs to be in bytes, but the only way I have found is to turn it into a string. The list is created along with the game: I have a script that adds each new entity to the list in the form (Player-entity|data). I want the data to vary between entities, and I have run out of symbols to use as separators when splitting the string back into lists. So I'm trying to send the data as a list, so that I can simply check its components. But because it is a list inside a list (and sometimes nested deeper), bytearray won't work, or at least I can't figure out how to make it work.
I'm unsure how to summarize the code, but this is what I have now to generate that information:
import socket
mi_socket = socket.socket()
mi_socket.bind(('127.0.0.1',55555))
mi_socket.listen()
TP = 0 #Total players
AP = 0 #Actual player
PL = ["host"] #Player List
AE = 0 #Actual entity
TE = 0 #Total entities
EL = ["|PlayerEntity$[Data]|"] #Entity list
PI = [] #Player Intel (Player name, Total Players, Player Code) (section one of the communications)
order = None # Variable reserved for storing commands or identifiers (e.g. the entity name) in the communications
content = None # Variable reserved for storing the received information
incoming = 0 # Variable reserved for parsing the incoming message
def login():
    global AP
    global TP
    global PL
    PJ = str(incoming).split("$")[1] #incoming Player
    if AP <= TP:
        if PJ!=str(PL[int(AP)]): # Name of the player who is connecting
            AP+=1
            login()
            pass
        else:
            identity()
            pass
    if AP > TP:
        PL.append(str("Player" + str(AP))) # Add the player to the list
        TP = int(AP)
def identity(): # Works out what the order is about so it can respond accordingly; if it is an entity, it also records the entity in the list
    global TE
    global AE
    global EL
    global AP
    global PL
    PJ = str(incoming).split("$")[1] # incoming Player
    PE = str(incoming).split("$")[2] # incoming entity
    if AE <= TE:
        # If player name|object name is not EL[AE]([1]|[2])
        if str(str(PL[AP])+"-"+str(PE)) != str(str(EL[AE]).split("|")[1]).split("$")[0]:
            AE+=1
            identity()
            pass
        else:
            EL[AE] = "|"+PL[AP]+"-"+PE+"$"+str(incoming).split("$")[3]+"|"
    if AE > TE:
        EL.append("|"+PL[AP]+"-"+PE+"$"+str(incoming).split("$")[3]+"|")
        TE = int(AE)
def main():
    global AP
    global AE
    global TE
    global incoming
    conexion, ip = mi_socket.accept()
    incoming = conexion.recv(1024)
    login()
    conexion.send(str("#"+str(PL[AP])+"#"+str(EL[1:TE+1])+"#").encode())
    AP=0
    AE=0
    conexion.close()
while True:
    main()
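One possible approach, sketched here on the assumption that the entity data only holds JSON-compatible values (strings, numbers, lists, dictionaries), is to serialize the nested structure with the standard json module instead of inventing separator symbols. The entity names and fields below are made up for illustration and are not taken from the game.

```python
import json

# Hypothetical nested entity data; names and fields are illustrative only.
entities = [
    ("Player1-sword", [10, 20, 30]),
    ("Player1-shield", [5, ["nested", 1.5]]),
]

# Sender side: nested structure -> JSON text -> bytes that a socket can send.
payload = json.dumps(entities).encode("utf-8")

# Receiver side: bytes -> JSON text -> nested lists again.
received = json.loads(payload.decode("utf-8"))

print(type(payload))
print(received)
```

Note that JSON has no tuple type, so tuples come back as lists on the receiving side. A message larger than the recv() buffer would also need to be read in several chunks before decoding.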

Python Function Exception Error

I've written the code you'll find below, but it doesn't work. I get an indentation error. This is the message I received from cmd.
Can you tell me where I made the mistake?
loan = input("indiquer le montant de l'emprunt")
loan = float(loan)
duree = input("indiquer la durée de l'emprunt en mois")
duree = int(duree)
principal = loan / duree
tauxinteret = input("mettre le taux d'intéret")
tauxinteret = float(tauxinteret)
interets = principal * tauxinteret
assurance = input("indiquer le montant des assurances")
assurance = float(assurance)
mensualite = principal + interets + assurance
return mensualite
print("Mr l'esclave, la mensualité à payer s'élève à {} dirhams". format (mensualite)")
calcul_mensualite(mensualite)
Have a look at defining functions in Python (it seems like you tried to do that). After def, you need to indent the lines that follow:
def my_function():
    print("Hello from a function")
my_function()
And here's the link to the basics:
functions
You seem to have left out the def statement of the function. Even with the indentation fixed, you will get another error (SyntaxError: 'return' outside function) because the return statement is not inside a function. The following code should work once the return statement and the call to the undefined function are removed.
loan = input("indiquer le montant de l'emprunt")
loan = float(loan)
duree = input("indiquer la durée de l'emprunt en mois")
duree = int(duree)
principal = loan / duree
tauxinteret = input("mettre le taux d'intéret")
tauxinteret = float(tauxinteret)
interets = principal * tauxinteret
assurance = input("indiquer le montant des assurances")
assurance = float(assurance)
mensualite = principal + interets + assurance
print("Mr l'esclave, la mensualité à payer s'élève à {} dirhams".format(mensualite))
Sorry, everyone. There is one line missing from my code:
def calcul_mensualite (loan, principal, duree,tauxinteret, interets, assurance, mensualite):
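For comparison, here is a minimal sketch of how the function could look with the body indented under def. It assumes the function only needs the four values the user types in (the other parameters in the def line above are computed inside the body), and it uses fixed sample numbers instead of input() so that it runs on its own. The parameter list is a suggestion, not the asker's original signature.

```python
def calcul_mensualite(loan, duree, tauxinteret, assurance):
    """Return the monthly payment, using the same formula as the question."""
    principal = loan / duree              # share of the loan repaid each month
    interets = principal * tauxinteret    # interest on that monthly share
    return principal + interets + assurance


# Every line of the body is indented; the call sits back at the left margin.
mensualite = calcul_mensualite(loan=1200.0, duree=12, tauxinteret=0.05, assurance=10.0)
print("The monthly payment is {} dirhams".format(mensualite))
```

In the real script, the four arguments would come from the input() calls, converted with float() and int() as in the question.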

correct way to implement thread python

I am implementing a simple soccer simulator in Python using threads and a lock. The app works fine, but I have doubts about the way I implemented the threads: it seems to me that the first team has an advantage because its thread is started first.
def jugar(Equipo1, Equipo2):
    # Look up each team's probability of conceding a goal
    prob_encajar_eq1 = Equipo1.probabilidad_encajar()
    prob_encajar_eq2 = Equipo2.probabilidad_encajar()
    def jugar_equipo1(defensa_rival):
        semaforo.acquire()
        if Equipo1.hacer_pases():
            Equipo1.shoot(defensa_rival)
        semaforo.release()
    def jugar_equipo2(defensa_rival):
        semaforo.acquire()
        if Equipo2.hacer_pases():
            Equipo2.shoot(defensa_rival)
        semaforo.release()
    hilo_equipo1 = threading.Thread(name = 'hilo_eq1', target = jugar_equipo1, args = (prob_encajar_eq2,))
    hilo_equipo2 = threading.Thread(name = 'hilo_eq2', target = jugar_equipo2, args = (prob_encajar_eq1,))
    hilo_equipo1.start()
    hilo_equipo2.start()
    hilo_equipo1.join()
    hilo_equipo2.join()
So that both teams make several attempts, I run a loop for a few seconds and call jugar() inside it, which is the function that does the work with threads. This is where I have doubts, because the threads are created again every time jugar() is executed.
if __name__ == '__main__':
    cargar_informacion()
    eqA = Equipo(equipoA, ranking_eqA)
    eqB = Equipo(equipoB, ranking_eqB)
    probabilidades = porcenajes_ranking(ranking_eqA)
    eqA.cargar_probabilidades(probabilidades)
    probabilidades = porcenajes_ranking(ranking_eqB)
    eqB.cargar_probabilidades(probabilidades)
    starttime=time.time()
    tiempo = 0
    # Create the progress bar
    bar = progressbar.ProgressBar(widgets=[
        progressbar.Percentage(),
        progressbar.Bar(),
    ], max_value=100).start()
    # Make the game take approximately 10 s to simulate.
    while tiempo < 10:
        time.sleep(0.3 - ((time.time() - starttime) % 0.3))
        jugar(eqA,eqB)
        tiempo = time.time() - starttime
        bar += 2.8
    bar.finish() # Finish the progress bar
    resultados_finales(eqA, eqB) # Show the final result of the match.
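If the start order really does matter for the result, one possible way to remove the systematic bias is to shuffle the threads before starting them, so that neither team is always first. The sketch below is self-contained and only records which team took its turn first in each round; play_round, take_turn and log are hypothetical names, and the turn itself stands in for the hacer_pases()/shoot() logic.

```python
import random
import threading

lock = threading.Lock()


def play_round(team_names, log):
    """Run one round with one thread per team, started in random order."""
    def take_turn(name):
        with lock:              # the lock is released even if the turn raises
            log.append(name)    # stand-in for the passing and shooting logic

    threads = [threading.Thread(target=take_turn, args=(name,)) for name in team_names]
    random.shuffle(threads)     # neither team is systematically started first
    for t in threads:
        t.start()
    for t in threads:
        t.join()


log = []
for _ in range(200):
    play_round(["eqA", "eqB"], log)

# Each round appends two entries, so every even index is that round's first mover.
first_movers = log[0::2]
print(first_movers.count("eqA"), first_movers.count("eqB"))
```

Creating new Thread objects on every round is expected, since a thread object can only be started once. Using the lock as a context manager is a small safety improvement over separate acquire() and release() calls.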

Pandas + Python: More efficient code

This is my code:
import pandas as pd
import os
import glob as g
archivos = g.glob('C:\Users\Desktop\*.csv')
for archiv in archivos:
    nombre = os.path.splitext(archiv)[0]
    df = pd.read_csv(archiv, sep=",")
    d = pd.to_datetime(df['DATA_LEITURA'], format="%Y%m%d")
    df['FECHA_LECTURA'] = d.dt.date
    del df['DATA_LEITURA']
    df['CONSUMO']=""
    df['DIAS']=""
    df["SUMDIAS"]=""
    df["SUMCONS"]=""
    df["CONSANUAL"] = ""
    ordenado = df.sort_values(['NR_CPE','FECHA_LECTURA', 'HORA_LEITURA'], ascending=True)
    ## Group by CPE
    agrupado = ordenado.groupby('NR_CPE')
    for name, group in agrupado: # Iterate over the groups
        indice = group.index.values
        inicio = indice[0]
        fin = indice[-1]
        # Fill in the first reading of each CPE (there is no previous reading)
        ordenado.CONSUMO.loc[inicio] = 0
        ordenado.DIAS.loc[inicio] = 0
        cont=0
        for i in indice: # Iterate over what is inside each group, i.e. inside each CPE (the readings)
            if i > inicio and i <= fin :
                cont=cont+1
                consumo = ordenado.VALOR_LEITURA[indice[cont]] - ordenado.VALOR_LEITURA[indice[cont-1]]
                dias = (ordenado.FECHA_LECTURA[indice[cont]] - ordenado.FECHA_LECTURA[indice[cont-1]]).days
                ordenado.CONSUMO.loc[i] = consumo
                ordenado.DIAS.loc[i] = dias
    # Compute the sums; the result is a DataFrame object
    dias = agrupado['DIAS'].sum()
    consu = agrupado['CONSUMO'].sum()
    canu = (consu/dias) * 365
    # Counters for the number of occurrences of groups A, B and C
    conta=0
    contb=0
    contc=0
    # Since it is a DF, I have to iterate over it to do the comparison
    print "Grupos:"
    for ind, sumdias in dias.iteritems():
        if sumdias <= 180:
            grupo = "A"
            conta=conta+1
        elif sumdias > 180 and sumdias <= 365:
            grupo = "B"
            contb=contb+1
        elif sumdias > 365:
            grupo = "C"
            contc=contc+1
    print "grupo A: " , conta
    print "grupo B: " , contb
    print "grupo C: " , contc
    # Format the fields so that not all the decimals are shown
    Fdias = dias.map('{:.0f}'.format)
    Fcanu = canu.map('{:.2f}'.format)
    frames = [Fdias, consu, Fcanu]
    concat = pd.concat(frames,axis=1).replace(['inf','nan'],[0,0])
    with open('C:\Users\Documents\RPE_PORTUGAL\Datos.csv','a') as f:
        concat.to_csv(f,header=False,columns=['CPE','DIAS','CONSUMO','CONSUMO_ANUAL'])
    try:
        ordenado.to_excel(nombre+'.xls', columns=["NOME_DISTRITO",
            "NR_CPE","MARCA_EQUIPAMENTO","NR_EQUIPAMENTO","VALOR_LEITURA","REGISTADOR","TIPO_REGISTADOR",
            "TIPO_DADOS_RECOLHIDOS","FACTOR_MULTIPLICATIVO_FINAL","NR_DIGITOS_INTEIRO","UNIDADE_MEDIDA",
            "TIPO_LEITURA","MOTIVO_LEITURA","ESTADO_LEITURA","HORA_LEITURA","FECHA_LECTURA","CONSUMO","DIAS"],
            index=False)
        print (archiv)
        print ("===============================================")
        print ("*****Se ha creado el archivo correctamente*****")
        print ("===============================================")
    except IOError:
        print ("===================================================")
        print ("¡¡¡¡¡Hubo un error en la escritura del archivo!!!!!")
        print ("===================================================")
This takes a file containing energy consumption readings from different dates for every light meter ('NR_CPE') and does some calculations:
Calculate the energy consumption for every 'NR_CPE' by subtracting the previous reading from the next one, and put the result in a new column named 'CONSUMO'.
Calculate the number of days between readings and sum up those days.
Add up the consumption for every 'NR_CPE' and calculate the annual consumption.
Finally, I want to classify every light meter ('NR_CPE') by the number of days its readings cover: A if it has up to 180 days, B between 180 days and 1 year, and C more than a year.
And finally, write the results to two different files.
Any idea how I should re-code this to get the same output faster?
Thank you all.
BTW this is my dataset:
,NOME_DISTRITO,NR_CPE,MARCA_EQUIPAMENTO,NR_EQUIPAMENTO,VALOR_LEITURA,REGISTADOR,TIPO_REGISTADOR,TIPO_DADOS_RECOLHIDOS,FACTOR_MULTIPLICATIVO_FINAL,NR_DIGITOS_INTEIRO,UNIDADE_MEDIDA,TIPO_LEITURA,MOTIVO_LEITURA,ESTADO_LEITURA,DATA_LEITURA,HORA_LEITURA
0,GUARDA,A002000642VW,101,1865411,4834,001,S,1,1,4,kWh,1,1,A,20150629,205600
1,GUARDA,A002000642VW,101,1865411,4834,001,S,1,1,4,kWh,2,2,A,20160218,123300
2,GUARDA,A002000642VJ,122,204534,25083,001,S,1,1,5,kWh,1,1,A,20150629,205700
3,GUARDA,A002000642VJ,122,204534,27536,001,S,1,1,5,kWh,2,2,A,20160218,123200
4,GUARDA,A002000642HR,101,1383899,11734,001,S,1,1,5,kWh,1,1,A,20150629,205600
5,GUARDA,A002000642HR,101,1383899,11800,001,S,1,1,5,kWh,2,2,A,20160218,123000
6,GUARDA,A002000995VM,101,97706436,12158,001,S,1,1,5,kWh,1,3,A,20150713,155300
7,GUARDA,A002000995VM,101,97706436,12163,001,S,1,1,5,kWh,2,2,A,20160129,162300
8,GUARDA,A002000995VM,101,97706436,12163,001,S,1,1,5,kWh,2,2,A,20160202,195800
9,GUARDA,A2000995VM,101,97706436,12163,001,S,1,1,5,kWh,1,3,A,20160404,145200
10,GUARDA,A002000996LV,168,5011703276,3567,001,V,1,1,6,kWh,1,1,A,20150528,205900
11,GUARDA,A02000996LV,168,5011703276,3697,001,V,1,1,6,kWh,2,2,A,20150929,163500
12,GUARDA,A02000996LV,168,5011703276,1287,002,P,1,1,6,kWh,1,1,A,20150528,205900
Generally you want to avoid for loops in pandas.
For example, the first loop, where you calculate total consumption and days, could be rewritten as a groupby apply, something like:
def last_minus_first(df):
    columns_of_interest = df[['VALOR_LEITURA', 'days']]
    diff = columns_of_interest.iloc[-1] - columns_of_interest.iloc[0]
    return diff
df['date'] = pd.to_datetime(df['DATA_LEITURA'], format="%Y%m%d")
# create days column (pd.Timestamp is used because pd.datetime is not available in recent pandas versions)
df['days'] = (df['date'] - pd.Timestamp(1970, 1, 1)).dt.days
consumption = df.groupby('NR_CPE').apply(last_minus_first)
(btw I don't understand why you are subtracting each entry from the previous one; surely for meter readings the total is the same as last minus first?)
Then, given the result of the above as consumption, you can replace your second for loop (for ind, sumdias in dias.iteritems()) with something like this (with numpy imported as np):
pd.cut(consumption.days, [-1, 180, 365, np.inf], labels=['a', 'b', 'c']).value_counts()
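If the per-reading CONSUMO and DIAS columns from the question are still needed for the Excel output, a variation on the same idea is to use a grouped diff() instead of the inner loops. The sketch below is untested against the real files and uses a tiny made-up sample with the question's column names. The GRUPO and CONSUMO_ANUAL column names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample using the same column names as the question's dataset.
df = pd.DataFrame({
    "NR_CPE": ["X1", "X1", "X2", "X2", "X2"],
    "DATA_LEITURA": [20150629, 20160218, 20150713, 20160129, 20170202],
    "HORA_LEITURA": [205600, 123300, 155300, 162300, 195800],
    "VALOR_LEITURA": [4834, 4900, 12158, 12163, 12200],
})
df["FECHA_LECTURA"] = pd.to_datetime(df["DATA_LEITURA"].astype(str), format="%Y%m%d")
df = df.sort_values(["NR_CPE", "FECHA_LECTURA", "HORA_LEITURA"])

# Per-reading differences within each meter; a meter's first reading gets 0.
grouped = df.groupby("NR_CPE")
df["CONSUMO"] = grouped["VALOR_LEITURA"].diff().fillna(0)
df["DIAS"] = grouped["FECHA_LECTURA"].diff().dt.days.fillna(0)

# Per-meter totals and annualized consumption.
totals = df.groupby("NR_CPE")[["DIAS", "CONSUMO"]].sum()
totals["CONSUMO_ANUAL"] = totals["CONSUMO"] / totals["DIAS"] * 365

# Classify meters by days covered: A up to 180, B up to 365, C beyond that.
totals["GRUPO"] = pd.cut(totals["DIAS"], [-1, 180, 365, np.inf], labels=["A", "B", "C"])
print(totals)
print(totals["GRUPO"].value_counts())
```

A meter with a single reading has zero days, so its annualized value would divide by zero. That case would need the same kind of replacement the original script applies before writing the CSV.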

Error when I try to iterate more than once

I've got this program, which calculates k-means for AI:
#! /usr/bin/env python
# -*- coding: utf-8 -*-
from random import sample
from itertools import repeat
from math import sqrt
# Parameters
k = 6
maxit = 2
def leeValoracionesFiltradas (nomFichero = "valoracionesFiltradas.data"):
    lineas = [(l.strip()).split("\t") for l in (open(nomFichero).readlines())]
    diccio = {}
    for l in lineas:
        diccio[int(l[0])] = {}
    for l in lineas:
        diccio[int(l[0])][int(l[1])] = (float(l[2]),float(l[3]))
    return diccio
def distEuclidea(dic1, dic2):
    # Compute the sum of squares over the elements common to both dictionaries
    sum2 = sum([pow(dic1[elem]-dic2[elem], 2)
                for elem in dic1 if elem in dic2])
    return sqrt(sum2)
def similitudEuclidea(dic1, dic2):
    return 1/(1+distEuclidea(dic1, dic2))
def coefPearson(dic1, dic2):
    # Get the elements common to both dictionaries
    comunes = [x for x in dic1 if x in dic2]
    nComunes = float(len(comunes))
    # If there are none in common -> zero
    if nComunes==0:
        return 0
    # Compute the mean of each dictionary
    media1 = sum([dic1[x][1] for x in comunes]) / nComunes
    media2 = sum([dic2[x][1] for x in comunes]) / nComunes
    # Numerator and denominator
    num = sum([(dic1[x][1] - media1) * (dic2[x][1] - media2) for x in comunes])
    den1 = sqrt(sum([pow(dic1[x][1] - media1, 2) for x in comunes]))
    den2 = sqrt(sum([pow(dic2[x][1] - media2, 2) for x in comunes]))
    den = den1 * den2
    # Compute the coefficient
    if den==0:
        return 0
    return num/den
# Given a dictionary {key1 : {key2 : valor}}, computes the k-means clustering
# with k clusters (groups), running maxit iterations, with the specified similarity function
# Returns a tuple
# -{key1: cluster number} with the cluster assignments (which cluster each element belongs to)
# -[{key2: values}] a list with the k centroids (mean of the values for each cluster)
def kmeans (diccionario, k, maxit, similitud = coefPearson):
    # K random points are chosen as the initial centroids
    # Each centroid is {key2 : valor}
    centroides = [diccionario[x] for x in sample(diccionario.keys(), k)]
    # Each key1 is assigned to a cluster number
    previo = None
    asignacion = {}
    # In each iteration, points are assigned to the centroids and new centroids are computed
    for it in range(maxit):
        # Assign points to the nearest centroids
        for key1 in diccionario:
            similitudes = map(similitud,repeat(diccionario[key1],k), centroides)
            asignacion[key1] = similitudes.index(max(similitudes))
        # If the assignment did not change, stop
        if previo == asignacion: break
        previo = asignacion
        # Recompute the centroids (accumulate each key's values for each centroid)
        valores = {x : {} for x in range(k)}
        contadores = {x : {} for x in range(k)}
        for key1 in diccionario:
            grupo = asignacion[key1]
            for key2 in diccionario[key1]:
                if not valores[grupo].has_key(key2):
                    valores [grupo][key2] = 0
                    contadores [grupo][key2] = 0
                valores [grupo][key2] += diccionario[key1][key2][1]
                contadores[grupo][key2] += 1
        # Compute the means (new centroids)
        centroides = []
        for grupo in valores:
            centro = {}
            for key2 in valores[grupo]:
                centro[key2] = round((valores[grupo][key2] / contadores[grupo][key2]),2)
            centroides.append(centro)
        if None in centroides: break
    return (asignacion, centroides)
# Load the ratings dictionary (the ratings have already been filtered)
diccionario = leeValoracionesFiltradas()
# Get the assignments and centroids using Pearson correlation
tupla = kmeans (diccionario, k, maxit)
asignaciones = tupla[0]
centroids = tupla[1]
print asignaciones
print centroids
And when I execute this for example for maxit = 2, it throws:
File "kmeans_dictio.py", line 46, in coefPearson
media2 = sum([dic2[x][1] for x in comunes]) / nComunes
TypeError: 'float' object has no attribute '__getitem__'
How can I fix this?
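It looks like the two arguments have different shapes. dic1 (an entry of diccionario) maps each key to a tuple of two floats, but after the first iteration each centroid (dic2) maps each key to a plain float, the rounded mean computed at the end of kmeans. Both get past this line, because it only compares keys:
comunes = [x for x in dic1 if x in dic2]
Then you are trying to index into that float here:
media2 = sum([dic2[x][1] for x in comunes]) / nComunes
To fix this, look at how dic1 and dic2 are built and make sure their values have the same shape.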
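As one possible way to reconcile the two shapes, the sketch below reads the rating through a small helper that accepts either a tuple or a bare float. It is written for Python 3, and the helper name rating and the sample data are made up for illustration. The alternative would be to store each recomputed centroid value as a tuple so that it matches the input data.

```python
from math import sqrt


def rating(entry):
    """Return the rating from either a (value, rating) tuple or a bare float."""
    return entry[1] if isinstance(entry, tuple) else entry


def coef_pearson(dic1, dic2):
    """Pearson correlation over the keys common to both dictionaries."""
    comunes = [x for x in dic1 if x in dic2]
    n = float(len(comunes))
    if n == 0:
        return 0
    media1 = sum(rating(dic1[x]) for x in comunes) / n
    media2 = sum(rating(dic2[x]) for x in comunes) / n
    num = sum((rating(dic1[x]) - media1) * (rating(dic2[x]) - media2) for x in comunes)
    den1 = sqrt(sum((rating(dic1[x]) - media1) ** 2 for x in comunes))
    den2 = sqrt(sum((rating(dic2[x]) - media2) ** 2 for x in comunes))
    den = den1 * den2
    if den == 0:
        return 0
    return num / den


# Tuples, as read from the data file; plain floats, as built inside kmeans.
usuario = {1: (5.0, 1.0), 2: (3.0, 2.0), 3: (4.0, 3.0)}
centroide = {1: 2.0, 2: 4.0, 3: 6.0}
print(coef_pearson(usuario, centroide))
```

With this helper the same function works for the initial centroids (tuples) and for the recomputed ones (floats), so the second iteration no longer fails on the indexing.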
