Trying to Generate an NFT Using .CSV Metadata and Pandas - python

I have been struggling for a long time to generate my actual NFTs from a .csv file, and finding resources for my hard-coded method (which follows a guide) has been difficult. If anyone could look through what I have and help me figure out what is going on, I would be forever indebted to you!
def generateOneRandRow(ADATvID):
    FILENAME = "ADA Tv" + str(ADATvID)
    NO = ADATvID
    BACKGROUND = randBackground()
    ACCESSORIES = randAccessories()
    HEAD = randHead()
    HAT = randHat()
    BODY = randBody()
    CHEST = randChest()
    ARMS = randArms()
    FACE = randFace()
    singleRow = [FILENAME,NO,BACKGROUND,ACCESSORIES,HEAD,HAT,BODY,CHEST,ARMS,FACE]
testThisRow =["ADA Tv2925","2925","cnft","couchbear","bnw","mullet","damagedorange","bluesuit","greenlightsaber","inlove"]
def checkIfExists(checkRow):
    aData = pd.read_csv('adalist.csv')
    index_list = aData[(aData['Background'] == checkRow[2])] & (aData['Accessories'] == checkRow[3]) & (aData['Head'] == checkRow[4]) & (aData['Hat'] == checkRow[5]) & (aData['Body'] == checkRow[6]) &(aData['Chest'] ==checkRow[7]) & (aData['Arms'] ==checkRow[7]) & (aData['Face'] == checkRow[8]).index.tolist()
    print(index_list)
    if index_list == []:
        return False
    else:
        return True
checkIfExists(testThisRow)
I get error messages when I run this. Please help a Python noob out, and feel free to flame me if it's super obvious. Thanks!

Change:
index_list = aData[(aData['Background'] == checkRow[2])] & (aData['Accessories'] == checkRow[3]) & (aData['Head'] == checkRow[4]) & (aData['Hat'] == checkRow[5]) & (aData['Body'] == checkRow[6]) &(aData['Chest'] ==checkRow[7]) & (aData['Arms'] ==checkRow[7]) & (aData['Face'] == checkRow[8]).index.tolist()
to:
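index_list = aData[(aData['Background'] == checkRow[2]) &
                   (aData['Accessories'] == checkRow[3]) &
                   (aData['Head'] == checkRow[4]) &
                   (aData['Hat'] == checkRow[5]) &
                   (aData['Body'] == checkRow[6]) &
                   (aData['Chest'] == checkRow[7]) &
                   (aData['Arms'] == checkRow[8]) &
                   (aData['Face'] == checkRow[9])].index.tolist()
The closing square bracket now comes after the last condition, so all eight conditions are combined into one mask before the dataframe is indexed. I also changed the last two positions: in your 10-element row, Arms is checkRow[8] and Face is checkRow[9], whereas the original line reads both Chest and Arms from checkRow[7].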
Because you did not provide data, I reproduced your error as follows:
df = pd.DataFrame({'a':[1,'2'], 'b':[5,6]})
df[(df['a']=='2')]&(df['b']==6).index.tolist()
With error:
TypeError: unsupported operand type(s) for &: 'str' and 'int'
Editing the brackets:
df = pd.DataFrame({'a':[1,'2'], 'b':[5,6]})
df[(df['a']=='2')&(df['b']==6)].index.tolist()
With no error.
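A minimal sketch of the same check written without the long chain of conditions, in case it helps. It assumes the trait columns are named as in the question; the small aData frame is a hypothetical stand-in for adalist.csv, and check_if_exists and TRAIT_COLUMNS are names made up for this example.

```python
import pandas as pd

# Hypothetical stand-in for adalist.csv; column names follow the question.
aData = pd.DataFrame({
    "Background": ["cnft", "moon"],
    "Accessories": ["couchbear", "none"],
    "Head": ["bnw", "gold"],
    "Hat": ["mullet", "cap"],
    "Body": ["damagedorange", "blue"],
    "Chest": ["bluesuit", "tee"],
    "Arms": ["greenlightsaber", "none"],
    "Face": ["inlove", "smile"],
})

# Same order as positions 2..9 of the generated row.
TRAIT_COLUMNS = ["Background", "Accessories", "Head", "Hat",
                 "Body", "Chest", "Arms", "Face"]


def check_if_exists(data, check_row):
    """Return True if the traits in check_row[2:] already appear as a row of data."""
    # Skip FILENAME and NO, then label each trait with its column name.
    wanted = pd.Series(check_row[2:], index=TRAIT_COLUMNS)
    # Compare every row against the wanted traits, column by column.
    mask = data[TRAIT_COLUMNS].eq(wanted).all(axis=1)
    return bool(mask.any())


testThisRow = ["ADA Tv2925", "2925", "cnft", "couchbear", "bnw", "mullet",
               "damagedorange", "bluesuit", "greenlightsaber", "inlove"]
newRow = testThisRow[:-1] + ["angry"]  # same traits except the face

print(check_if_exists(aData, testThisRow))
print(check_if_exists(aData, newRow))
```

Pairing the values with column names keeps the row positions and the columns from drifting apart, which is the kind of slip that is easy to make when each index is typed by hand.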

Related

Optimize dataframe filtering on large datasets, pandas

I have a challenge here and, to be honest, no idea how to handle it.
I have a dataframe of 660,000 rows and about 50 columns. I need to filter it very frequently and retrieve the filtered dataframe as fast as possible (the goal is a processing time under 1 second). I'd like to run this locally on a laptop, so my processing power is limited.
I have multiple inputs for filtering the dataframe: some are set manually (see input 1) and some are retrieved from another script (see input 2; the other script is not included here for simplicity).
I was hoping to simply filter the dataset using df[(df.column == filtervalue)]. However, the processing time seems far too long.
Therefore, I am wondering whether there are techniques to reduce the processing time, or whether the only option is a server with more CPU and memory.
Thanks for the help
import pandas as pd
df = pd.read_csv('xxxxxxxx', sep=";", dtype={"id": str,"dataset1": str,"dataset2":str,"myposition":str,"bet_1_preflop":float,"bet_2_preflop":float,"bet_3_preflop":float,"bet_1_flop":float,"bet_2_flop":float,
"bet_3_flop":float,"bet_1_turn":float ,"bet_2_turn":float,"bet_3_turn":float,"bet_1_river":float,"bet_2_river":float, "bet_3_river":float,
"myhand":str,"cards_flop":str,"cards_turn":str,"cards_river":str,"action1_preflop":str,"action2_preflop":str,
"action3_preflop":str,"action4_preflop":str, "action1_flop":str, "action2_flop":str, "action3_flop":str,"action4_flop":str,"action1_turn":str,
"action2_turn":str, "action3_turn":str, "action4_turn":str, "action1_river":str,"action2_river":str, "action3_river":str, "action4_river":str,
"action1_preflop_binary":'Int64', "action2_preflop_binary":'Int64', "action3_preflop_binary":'Int64', "action4_preflop_binary":'Int64',
"action1_flop_binary":'Int64',"action2_flop_binary":'Int64', "action3_flop_binary":'Int64', "action4_flop_binary":'Int64', "action1_turn_binary":'Int64',
"action2_turn_binary":'Int64', "action3_turn_binary":'Int64', "action4_turn_binary":'Int64',"action1_river_binary":'Int64', "action2_river_binary":'Int64',
"action3_river_binary":'Int64', "action4_river_binary":'Int64', "tiers":'Int64',"assorties":str,
"besthand_flop":str,"checker_flop":float,"handtype_flop":str,"topsuite_flop":'Int64',"topcolor_flop":'Int64',"besthand_turn":str,"checker_turn":float,"handtype_turn":str,
"topsuite_turn":'Int64',"topcolor_turn":'Int64',"besthand_river":str,"checker_river":float,"handtype_river":str,"topsuite_river":'Int64',"topcolor_river":'Int64'})
df = df.reset_index()
#Inputs for filters 1
myposition ="sb"
myhand = "ackc"
flop = "ad9d4h"
turn = "8d"
river = "th"
a1_preflop = "r"
a2_preflop = "r"
a3_preflop = "c"
a4_preflop = ""
a1_flop = "r"
a2_flop = "f"
a3_flop = ""
a4_flop = ""
a1_turn = ""
a2_turn = ""
a3_turn = ""
a4_turn = ""
a1_river = ""
a2_river = ""
a3_river = ""
a4_river = ""
#Inputs for filters 2 (from a different script)
tiers
assorties_status
best_allhands_flop[0]
best_allhands_flop[1]
best_allhands_flop[2]
highest_suite_flop
highest_color_flop
best_allhands_turn[0]
best_allhands_turn[1]
best_allhands_turn[2]
highest_suite_turn
highest_color_turn
best_allhands_river[0]
best_allhands_river[1]
best_allhands_river[2]
highest_suite_river
highest_color_river
#filtre_preflop_a1 = df[(df.myposition == myposition) & (df.tiers == tiers) & (df.assorties == assorties_status) & (df.action1_preflop == a1_preflop)]
#filtre_preflop_a2 = df[(df.myposition == myposition) & (df.tiers == tiers) & (df.assorties == assorties_status) & (df.action1_preflop == a1_preflop) & (df.action2_preflop == a2_preflop)]
#filtre_preflop_a3 = df[(df.myposition == myposition) & (df.tiers == tiers) & (df.assorties == assorties_status) & (df.action1_preflop == a1_preflop) & (df.action2_preflop == a2_preflop) & (df.action3_preflop == a3_preflop)]
#filtre_preflop_a4 = df[(df.myposition == myposition) & (df.tiers == tiers) & (df.assorties == assorties_status) & (df.action1_preflop == a1_preflop) & (df.action2_preflop == a2_preflop) & (df.action3_preflop == a3_preflop) & (df.action4_preflop == a4_preflop)]
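One approach that may help, sketched here under assumptions rather than measured on the real data: store the low-cardinality string columns as categoricals, and build each filter on top of the previous mask so the shared conditions are evaluated once. The small frame and its values are hypothetical, and whether this gets under one second on 660,000 rows would need to be timed on the actual file.

```python
import pandas as pd

# Hypothetical miniature of the dataframe; column names follow the question.
df = pd.DataFrame({
    "myposition": ["sb", "bb", "sb", "sb"],
    "tiers": [1, 2, 1, 1],
    "assorties": ["o", "s", "o", "o"],
    "action1_preflop": ["r", "c", "r", "f"],
    "action2_preflop": ["r", "r", "c", "r"],
})

# Columns with few distinct values can be stored as categoricals,
# which are backed by integer codes instead of Python strings.
for col in ["myposition", "assorties", "action1_preflop", "action2_preflop"]:
    df[col] = df[col].astype("category")

# Assumed filter inputs (the question gets some of these from another script).
myposition, tiers, assorties_status = "sb", 1, "o"
a1_preflop, a2_preflop = "r", "r"

# Evaluate the shared conditions once, then narrow the mask step by step.
base = (
    (df["myposition"] == myposition)
    & (df["tiers"] == tiers)
    & (df["assorties"] == assorties_status)
)
mask_a1 = base & (df["action1_preflop"] == a1_preflop)
mask_a2 = mask_a1 & (df["action2_preflop"] == a2_preflop)

filtre_preflop_a1 = df[mask_a1]
filtre_preflop_a2 = df[mask_a2]
print(filtre_preflop_a2)
```

The masks are plain boolean Series, so they can be kept around and reused for the flop, turn and river stages in the same way.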

Another Traceback Error When I Run My Python Code

I get a new traceback error when I run my Python code. It appears to be related to the very last closing parenthesis, and maybe also the last ] in my code.
((df['Location'].str.contains('- Display')) &
df['Lancaster'] != 'L' &
df['Dakota'] == 'D' &
df['Spitfire'] == 'SS' &
df['Hurricane'] != 'H'))
)]
And here is the traceback error I get:
  File "<ipython-input-5-6d53e7e5ec10>", line 31
    )
    ^
SyntaxError: invalid syntax
Here is my latest complete code, John S, which works. I will let you know if I run into more issues. Many thanks for your help:
import pandas as pd
import requests
from bs4 import BeautifulSoup
res = requests.get("http://web.archive.org/web/20070701133815/http://www.bbmf.co.uk/june07.html")
soup = BeautifulSoup(res.content,'lxml')
table = soup.find_all('table')[0]
df = pd.read_html(str(table))
df = df[1]
df = df.rename(columns=df.iloc[0])
df = df.iloc[2:]
df.head(15)
display = df[(df['Location'].str.contains('- Display')) & (df['Dakota'].str.contains('D')) & (df['Spitfire'].str.contains('S')) & (df['Lancaster'] != 'L')]
display
You just have too many brackets:
((df['Location'].str.contains('- Display') &
  (df['Lancaster'] == '') &
  (df['Dakota'] == 'D') &
  (df['Spitfire'] == 'SS') &
  (df['Hurricane'] == '')))
You needed to remove one ')' after each ('- Display'). I have also wrapped each comparison in its own parentheses, because & binds more tightly than == and !=, so without them the expression is not evaluated the way you intend. It looks like you will still have some problems sorting through your data, but this should get you past your syntax error.
Look at this online version to see my edits.
https://onlinegdb.com/Skceaucyr
You need to add ")]" at the end. Each comparison also needs its own parentheses, and the three conditions should be combined with | inside a single df[...] rather than by OR-ing three filtered dataframes. So your variable Southport will now be:
Southport = df[
    (
        df['Location'].str.contains('- Display') &
        (df['Lancaster'] != 'L') &
        (df['Dakota'] == 'D') &
        (df['Spitfire'] == 'S') &
        (df['Hurricane'] == 'H')
    ) | (
        df['Location'].str.contains('- Display') &
        (df['Lancaster'] != 'L') &
        (df['Dakota'] == 'D') &
        (df['Spitfire'] == 'S') &
        (df['Hurricane'] != 'H')
    ) | (
        df['Location'].str.contains('- Display') &
        (df['Lancaster'] != 'L') &
        (df['Dakota'] == 'D') &
        (df['Spitfire'] == 'SS') &
        (df['Hurricane'] != 'H')
    )
]
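To see how the parentheses and the | operator behave, here is a small self-contained sketch. The three-row frame is hypothetical; only the column names and the '- Display' pattern come from the question.

```python
import pandas as pd

# Hypothetical rows standing in for the scraped table.
df = pd.DataFrame({
    "Location": ["Southport - Display", "Duxford - Flypast", "Cosford - Display"],
    "Lancaster": ["", "L", "L"],
    "Dakota": ["D", "D", "D"],
    "Spitfire": ["SS", "S", "S"],
})

is_display = df["Location"].str.contains("- Display")

# Every comparison gets its own parentheses before being combined with &.
mask_a = is_display & (df["Lancaster"] != "L") & (df["Spitfire"] == "SS")
mask_b = is_display & (df["Lancaster"] == "L") & (df["Spitfire"] == "S")

# Combine the boolean masks with |, then index the dataframe once.
either = df[mask_a | mask_b]
print(either)
```

Building each mask as a named variable first also makes it easier to check them one at a time when a filter returns fewer rows than expected.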

Python - For loop - run each line of output

I guess I am running into a beginner problem:
I want to loop over an array and insert the values into lines of code that are then executed.
For the attempt below I get "SyntaxError: can't assign to operator"
#Country-subsets (all countries in dataframe)
for s in country_filter:
    s.lower() + '_immu_edu' = immu_edu.loc[immu_edu['CountryName'] == s]
Thanks for helping!
My expected output would be:
guinea_immu_edu = immu_edu.loc[immu_edu['CountryName'] == "Guinea"]
lao_immu_edu = immu_edu.loc[immu_edu['CountryName'] == "Lao PDR"]
bf_immu_edu = immu_edu.loc[immu_edu['CountryName'] == "Burkina Faso"]
us_immu_edu = immu_edu.loc[immu_edu['CountryName'] == "United States"]
ge_immu_edu = immu_edu.loc[immu_edu['CountryName'] == "Germany"]
Store your values in a dictionary and access using the keys:
my_dict = dict()
for s in country_filter:
    my_dict[s.lower() + '_immu_edu'] = immu_edu.loc[immu_edu['CountryName'] == s]
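Note that s.lower() turns "Lao PDR" into "lao pdr", not the "lao" prefix shown in the expected output. A possible sketch, assuming you supply the short prefixes yourself: short_names is a hypothetical mapping, and the small immu_edu frame is made up for illustration.

```python
import pandas as pd

# Hypothetical miniature of immu_edu; only CountryName comes from the question.
immu_edu = pd.DataFrame({
    "CountryName": ["Guinea", "Lao PDR", "Burkina Faso", "Guinea"],
    "value": [1, 2, 3, 4],
})

# Hypothetical mapping from country name to the short prefix wanted in the key.
short_names = {"Guinea": "guinea", "Lao PDR": "lao", "Burkina Faso": "bf"}

# One dictionary entry per country instead of one variable per country.
subsets = {
    f"{short}_immu_edu": immu_edu.loc[immu_edu["CountryName"] == name]
    for name, short in short_names.items()
}

print(sorted(subsets))
print(subsets["guinea_immu_edu"])
```

Each subset is then available as subsets["lao_immu_edu"] and so on, which also makes it easy to loop over all of them later.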

reading different Excel sheets in Python with if-elif-else

I am trying to read different sheets from an Excel file with an if-elif-else statement, depending on the input, and have written the following code:
import numpy as np
import pandas as pd
def ABSMATDATA(a,b,c,d,Material,Tmpref):
if Material == 2.016:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='H2')
elif Material == 28.016:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='N2')
elif Material == 32.000:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='O2')
elif Material == 32.065:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='S')
elif Material == 18.016:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='H2O')
elif Material == 64.065:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='SO2')
elif Material == 12.001:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='C Graphite')
elif Material == 28.011:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='CO')
elif Material == 44.011:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='CO2')
elif Material == 16.043:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='CH4')
elif Material == 30.070:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='C2H6')
elif Material == 44.097:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='C3H8')
elif Material == 58.124:
df = pd.read_excel('F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name='C4H10')
else:
print('No data for this material available')
df =[list(np.arange(0,1100,100)),list(np.arange(0,11,1)),list(np.arange(0,11,1)),list(np.arange(0,11,1)),list(np.arange(0,11,1))]
return df
I am trying to run the code by calling ABSMATDATA(1,2,3,4,28.011,100) in the IPython console, but it is not giving any output. I was expecting to see df in my Variable Explorer as a 2-dimensional array.
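Your function is not returning anything, and you can cut your code a bit:
def ABSMATDATA(a,b,c,d,Material,Tmpref):
    # Molar mass -> sheet name, taken from the if/elif chain in the question
    material_map = {2.016: 'H2',
                    28.016: 'N2',
                    32.000: 'O2',
                    32.065: 'S',
                    18.016: 'H2O',
                    64.065: 'SO2',
                    12.001: 'C Graphite',
                    28.011: 'CO',
                    44.011: 'CO2',
                    16.043: 'CH4',
                    30.070: 'C2H6',
                    44.097: 'C3H8',
                    58.124: 'C4H10'}
    if Material in material_map:
        # Raw string so the backslashes in the Windows path are taken literally
        df = pd.read_excel(r'F:\MAschinenbau\Bachelorarbeit\ABSMAT.xlsx',sheet_name=material_map[Material])
    else:
        df = [list(np.arange(0,1100,100)),list(np.arange(0,11,1)),list(np.arange(0,11,1)),list(np.arange(0,11,1)),list(np.arange(0,11,1))]
        print('No data for this material available')
    return df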
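Two related points, sketched without the Excel file so the example runs on its own. First, a returned value only stays in the namespace (and so in the Variable Explorer) if the call is assigned to a name. Second, looking up floats by exact equality can miss values that come out of a calculation, so a tolerance may be safer. MATERIAL_SHEETS and sheet_for are hypothetical names, and the tolerance is an assumption.

```python
import math

# Molar mass (g/mol) -> sheet name; a few pairs taken from the question.
MATERIAL_SHEETS = {2.016: "H2", 28.016: "N2", 28.011: "CO", 44.011: "CO2"}


def sheet_for(material, tol=1e-6):
    """Return the sheet name whose molar mass matches material, else None."""
    for molar_mass, sheet in MATERIAL_SHEETS.items():
        if math.isclose(material, molar_mass, abs_tol=tol):
            return sheet
    return None


# Assign the result to a name so it is kept after the call returns.
sheet = sheet_for(28.011)
missing = sheet_for(1.0)
print(sheet, missing)
```

With the real function the same pattern would be df = ABSMATDATA(1,2,3,4,28.011,100), after which df is an ordinary variable in the console session.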

Inverse line graph year count matplotlib pandas python

I'm trying to create a line plot of the counts of three groups (desktop, mobile and tablet), with the years 2014, 2015 and 2016 on the x axis, but I am getting an error.
My code is currently:
#year-by-year change
desktop14 = od.loc[(od.Account_Year_Week >= 201401) & (od.Account_Year_Week <= 201453) & (od.online_device_type_detail == "DESKTOP"), "Gross_Demand_Pre_Credit"]
desktop15 = od.loc[(od.Account_Year_Week >= 201501) & (od.Account_Year_Week <= 201553) & (od.online_device_type_detail == "DESKTOP"), "Gross_Demand_Pre_Credit"]
desktop16 = od.loc[(od.Account_Year_Week >= 201601) & (od.Account_Year_Week <= 201653) & (od.online_device_type_detail == "DESKTOP"), "Gross_Demand_Pre_Credit"]
mobile14 = od.loc[(od.Account_Year_Week >= 201401) & (od.Account_Year_Week <= 201453) & (od.online_device_type_detail == "MOBILE"), "Gross_Demand_Pre_Credit"]
mobile15 = od.loc[(od.Account_Year_Week >= 201501) & (od.Account_Year_Week <= 201553) & (od.online_device_type_detail == "MOBILE"), "Gross_Demand_Pre_Credit"]
mobile16 = od.loc[(od.Account_Year_Week >= 201601) & (od.Account_Year_Week <= 201653) & (od.online_device_type_detail == "MOBILE"), "Gross_Demand_Pre_Credit"]
tablet14 = od.loc[(od.Account_Year_Week >= 201401) & (od.Account_Year_Week <= 201453) & (od.online_device_type_detail == "TABLET"), "Gross_Demand_Pre_Credit"]
tablet15 = od.loc[(od.Account_Year_Week >= 201501) & (od.Account_Year_Week <= 201553) & (od.online_device_type_detail == "TABLET"), "Gross_Demand_Pre_Credit"]
tablet16 = od.loc[(od.Account_Year_Week >= 201601) & (od.Account_Year_Week <= 201653) & (od.online_device_type_detail == "TABLET"), "Gross_Demand_Pre_Credit"]
devicedata = [["Desktop", desktop14.count(), desktop15.count(), desktop16.count()], ["Mobile", mobile14.count(), mobile15.count(), mobile16.count()], ["Tablet", tablet14.count(), tablet15.count(), tablet16.count()]]
df = pd.DataFrame(devicedata, columns=["Device", "2014", "2015", "2016"]).set_index("Device")
plt.show()
I want each line to be one device type, with the x axis showing the change over the years. How do I do this (essentially swapping the axes)?
Any help is greatly appreciated.
Just transpose the dataframe before plotting:
df.transpose().plot()
The years then become the index (the x axis) and each device type becomes its own line.
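A small sketch of what the transpose does to the table, using made-up counts in place of desktop14.count() and the other values from the question:

```python
import pandas as pd

# Hypothetical counts standing in for desktop14.count(), mobile14.count(), etc.
devicedata = [
    ["Desktop", 120, 110, 100],
    ["Mobile", 40, 70, 95],
    ["Tablet", 20, 25, 22],
]
df = pd.DataFrame(devicedata, columns=["Device", "2014", "2015", "2016"]).set_index("Device")

# After transposing, rows are years and columns are device types.
by_year = df.transpose()
print(by_year)

# by_year.plot() would then draw one line per device (this needs matplotlib),
# followed by plt.show() when run as a script.
```

DataFrame.plot draws one line per column and uses the index for the x axis, which is why swapping rows and columns is enough here.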
