python - Looking up a dictionary key in another file with two criteria

After the end of my code, I have a dictionary like so:
{'"WS1"': 1475.9778073075058, '"BRO"': 1554.1437268304624, '"CHA"': 1552.228925324831}
What I want to do is to find each of the keys in a separate file, teams.txt, which is formatted like this:
1901,'BRO','LAD'
1901,'CHA','CHW'
1901,'WS1','MIN'
Using the year, which is 1901, and the team, which is the key of each item in the dictionary, I want to create a new dictionary where the key is the third column in teams.txt if the year and team both match, and the value is the value of the team in the first dictionary.
I figured this would be easiest if I created a function to "lookup" the year and the team, and return "franch", and then apply that function to each key in the dictionary. This is what I have so far, but it gives me a KeyError
def franch(year, team_str):
    team_str = str(team_str)
    with open('teams.txt') as imp_file:
        teams = imp_file.readlines()
    for team in teams:
        (yearID, teamID, franchID) = team.split(',')
        yearID = int(yearID)
        if yearID == year:
            if teamID == team_str:
                break
    franchID = franchID[1:4]
    return franchID
And in the other function with the dictionary that I want to apply this function to:
franch_teams={}
for team in teams:
    team = team.replace('"', "'")
    franch_teams[franch(year, team)] = teams[team]
The ideal output of what I am trying to accomplish would look like:
{'"MIN"': 1475.9778073075058, '"LAD"': 1554.1437268304624, '"CHW"': 1552.228925324831}
Thanks!

Does this code suit your needs?
I am doing an extra check for equality, because your code uses different quote characters in different places.
def almost_equals(one, two):
    one = one.replace('"', '').replace("'", "")
    two = two.replace('"', '').replace("'", "")
    return one == two
def create_data(year, data, text_content):
    """ This function returns new dictionary. """
    content = [line.split(',') for line in text_content.split('\n')]
    res = {}
    for key in data.keys():
        for one_list in content:
            if year == one_list[0] and almost_equals(key, one_list[1]):
                res[one_list[2]] = data[key]
    return res
teams_txt = """1901,'BRO','LAD'
1901,'CHA','CHW'
1901,'WS1','MIN'"""
year = '1901'
data = { '"WS1"': 1475.9778073075058, '"BRO"': 1554.1437268304624, '"CHA"': 1552.228925324831 }
result = create_data(year, data, teams_txt)
And the output:
{"'CHW'": 1552.228925324831, "'LAD'": 1554.1437268304624, "'MIN'": 1475.9778073075058}
Update:
To read from a text file, use this function:
def read_text_file(filename):
    with open(filename) as file_object:
        result = file_object.read()
    return result
teams_txt = read_text_file('teams.txt')

You may try something like:
#!/usr/bin/env python
def clean(_str):
    return _str.strip('"').strip("'")
first = {'"WS1"': 1475.9778073075058, '"BRO"': 1554.1437268304624, '"CHA"': 1552.228925324831}
clean_first = dict()
second = dict()
for k,v in first.items():
    clean_first[clean(k)] = v
with open("teams.txt", "r") as _file:
    lines = _file.readlines()
    for line in lines:
        _,old,new = line.split(",")
        second[new.strip()] = clean_first[clean(old)]
print second
Which gives the expected:
{"'CHW'": 1552.228925324831, "'LAD'": 1554.1437268304624, "'MIN'": 1475.9778073075058}
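Neither answer above uses both criteria as a single lookup key, so here is a minimal Python 3 sketch that does. It builds a dictionary keyed on the (year, team) pair and then renames the keys of the ratings dictionary. The names clean, load_franchises and ratings are made up for illustration, and the three sample lines stand in for teams.txt. As a side note, the KeyError in the question appears to come from teams[team] being evaluated after team has already been rewritten with single quotes, so the rewritten key no longer exists in the original dictionary.

```python
def clean(text):
    """Strip surrounding whitespace and both kinds of quote characters."""
    return text.strip().strip("\"'")


def load_franchises(lines):
    """Map (year, team) -> franchise from lines such as 1901,'BRO','LAD'."""
    lookup = {}
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        year, team, franch = line.split(",")
        lookup[(int(year), clean(team))] = clean(franch)
    return lookup


# Stand-in for the contents of teams.txt (same rows as in the question).
teams_lines = ["1901,'BRO','LAD'\n", "1901,'CHA','CHW'\n", "1901,'WS1','MIN'\n"]
ratings = {'"WS1"': 1475.9778073075058,
           '"BRO"': 1554.1437268304624,
           '"CHA"': 1552.228925324831}

lookup = load_franchises(teams_lines)
year = 1901
# Keep the double-quoted key style that the question asks for.
franch_teams = {'"{}"'.format(lookup[(year, clean(team))]): value
                for team, value in ratings.items()}
print(franch_teams)
```

With a real file, the list of lines could come from open('teams.txt') instead of the hard-coded sample.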

Related

Iterating through the python and pandas loop

What is the best way to look into how the loop works through its iterations?
I have defined 2 functions which have to run one after another (the 2nd takes the result of the 1st and processes it).
Ultimately I need a 2-row pandas dataframe as the output.
Sample code below.
def candle_data (
    figi,
    int1 = candle_resolution,
    from1 = previous_minutemark,
    to1 = last_minutemark
):
    response = market_api.market_candles_get(figi = ticker_figi_test, from_ = from1, to = to1, interval = int1)
    if response.status_code == 200:
        return response.parse_json().dict()
    else:
        return print(response.parse_error())
def response_to_pandas_df (response):
    df_candles = pd.DataFrame(response['payload'])
    df_candles = pd.json_normalize(df_candles['candles'])
    df_candles = df_candles[df_candles['time'] >= previous_minutemark]
    df_candles = df_candles[['c', 'figi','time']]
    df_candles_main = df_candles_template.append(df_candles)
    return df_candles_main
Then I call the functions in a loop:
ticker_figi_list = ["BBG000CL9VN6", "BBG000R7Z112"]
df_candles_main = df_candles_template
for figi in ticker_figi_list:
    response = candle_data(figi)
    df_candles_main = response_to_pandas_df(response)
But in return I get only 1 row of data, for the 1st FIGI in the list.
I suppose the cause may be that I defined candle_data() with ticker_figi_test, which contains only 1 value, but I'm not sure how to work around this.
Thank you in advance.

It looks like the problem is that you are calling the API with figi = ticker_figi_test. I assume ticker_figi_test is equal to the first figi in your list, so you aren't actually calling the API with a different figi on each iteration. Try changing the call to the following:
response = market_api.market_candles_get(figi = figi, from_ = from1, to = to1, interval = int1)
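A second thing worth checking, judging only from the code in the question: the loop reassigns df_candles_main on every iteration, and response_to_pandas_df always appends to the unchanged template, so only the last FIGI's rows would survive. The sketch below shows the accumulate-in-a-loop pattern in plain Python 3. fake_candles_get is a hypothetical stand-in for the real API call and returns made-up rows; with pandas, the collected pieces would typically be combined once at the end (for example with pd.concat).

```python
def fake_candles_get(figi):
    """Hypothetical stand-in for market_api.market_candles_get (made-up data)."""
    return {"payload": {"candles": [{"c": 1.0, "figi": figi, "time": "t0"}]}}


def candle_rows(figi):
    # Use the figi argument itself, not a fixed global such as ticker_figi_test.
    response = fake_candles_get(figi)
    return response["payload"]["candles"]


ticker_figi_list = ["BBG000CL9VN6", "BBG000R7Z112"]
all_rows = []
for figi in ticker_figi_list:
    # Extend the running result instead of overwriting it on each iteration.
    all_rows.extend(candle_rows(figi))

print([row["figi"] for row in all_rows])
```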

Python return value without inverted commas

I have csv file:
shack_imei.csv:
shack, imei
F10, "5555"
code:
reader = csv.reader(open("shack_imei.csv", "rb"))
my_dict = dict(reader)
shack = raw_input('Enter Shack:')
print shack
def get_imei_from_entered_shack(shack):
    for key, value in my_dict.iteritems():
        if key == shack:
            return value
list = str(get_imei_from_entered_shack(shack))
print list
which gives me "5555"
But I need this value in a list structure like this:
["5555"]
I've tried a lot of different methods, and they all end up with extra ' or " characters.
EDIT 1:
new simpler code:
reader = csv.reader(open("shack_imei.csv", "rb"))
my_dict = dict(reader)
shack = raw_input('Enter Shack:')
imei = my_dict[shack]
print imei
"5555"
list(imei) gives me ['"5555"'], I need it to be ["5555"]

You can change your return statement:
shack = raw_input('Enter Shack:')
print shack
def get_imei_from_entered_shack(shack):
    for key, value in my_dict.iteritems():
        if key == shack:
            return [str(value)]
list = get_imei_from_entered_shack(shack)
print list

As far as I understand, you want to create a list containing the returned string, which you do with [ ]
list = [str(get_imei_from_entered_shack(shack))]

There are a few problems with this code, which are too long to tackle in comments.
my_dict
my_dict = dict(reader) only works well if this csv is a collection of unique keys and values. If there are duplicate keys, later rows silently overwrite earlier ones.
get_imei_from_entered_shack
Why this special method, instead of just asking my_dict for the correct value? Even if you don't want it to throw an exception when you ask for a shack that doesn't exist, you can use the dict.get(<key>, <default>) method:
my_dict.get(shack, None)
does the same as your 4-line method.
list
Don't name variables the same as builtins.
list2
If you want a list, you can do [<value>]. Note that list(<value>) on a string splits it into single characters, and it fails outright if you have replaced list with your own variable assignment.

reader = csv.reader(open("shack_imei.csv", "rb"))
my_dict = dict(reader)
shack = raw_input('Enter Shack:')
imei = my_dict[shack]
imei = imei.replace('"',"")
IMEI_LIST =[]
IMEI_LIST.append(imei)
print IMEI_LIST
['5555']
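The stray quotes seem to come from the blank after the comma in the csv file: with the default settings, csv.reader only treats a double quote as a quoting character when it is the first character of a field. A small Python 3 sketch, assuming the file really looks like the sample in the question, shows how the skipinitialspace option of the csv module avoids the manual replace. The io.StringIO object is only a stand-in for the real file.

```python
import csv
import io

# Stand-in for shack_imei.csv, same content as in the question.
csv_text = 'shack, imei\nF10, "5555"\n'

# skipinitialspace=True makes the reader ignore the blank after each comma,
# so "5555" is recognised as a quoted field and the quotes are dropped.
reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
my_dict = dict(reader)

imei_list = [my_dict["F10"]]
print(imei_list)
```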

A log file which contains information like < timestamp , customer-id , page-id ,list of titles>

Write code to print all the unique customers who visited in the last hour.
My try:
import datetime
def find_repeated_customer():
    file_obj = open(" my file path","r")
    customer_last_visit = {}
    repeat_customer = set()
    while line in file_obj:
        timestamp,customer_id,page_id = line.split(" : ")
        last_visit = customer_last_vist.get(customer_id,None)
        if not last_visit:
            customer_last_visit[customer_id] = last_visit
        else:
            # assuming time stamp looks like 2016-10-29 01:03:26.947000
            year,month,date = timestamp.split(" ")[0].split("-")
            current_visit = datetime.date(year,month,date)
            day_diff = current_visit - last_visit
            if day_diff >=1:
                repeat_customer.add(customer_id)
            customer_last_visit[customer_id] = current_visit
I am completely failing to get my desired output. With this I can get the customers who repeated within the last day, but how do I get unique users?

You can't do this kind of manipulation in one pass. You have to pass through the lines once to collect the customers, and only then can you check who came once. In another pass, you check whether the current customer is in the list of once-only customers and do something with them.
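The two passes described above can be sketched in Python 3 roughly as follows. This is only an illustration under assumptions: the line layout "timestamp : customer-id : page-id" and the timestamp format are taken from the question's code, while the function name visits_last_hour, the sample lines and the value of now are invented. The first pass counts visits inside the last hour, and the second pass goes over the counts to pick either all distinct customers or only those seen exactly once.

```python
from collections import Counter
from datetime import datetime, timedelta


def visits_last_hour(lines, now):
    """Count visits per customer id within the hour before `now`."""
    cutoff = now - timedelta(hours=1)
    counts = Counter()
    for line in lines:
        # Assumed layout: "timestamp : customer-id : page-id"
        timestamp, customer_id, _rest = line.strip().split(" : ", 2)
        visited = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f")
        if cutoff <= visited <= now:
            counts[customer_id] += 1
    return counts


# Invented sample log lines, for illustration only.
sample = [
    "2016-10-29 00:01:00.000000 : c1 : p1",
    "2016-10-29 00:30:00.000000 : c2 : p1",
    "2016-10-29 00:45:00.000000 : c2 : p2",
    "2016-10-29 01:00:00.000000 : c3 : p3",
]
now = datetime(2016, 10, 29, 1, 3, 26)

counts = visits_last_hour(sample, now)
distinct = sorted(counts)                                     # every customer seen
once_only = sorted(c for c, n in counts.items() if n == 1)    # seen exactly once
print(distinct, once_only)
```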

How do I instantiate a group of objects from a text file?

I have some log files that look like many lines of the following:
<tickPrice tickerId=0, field=2, price=201.81, canAutoExecute=1>
<tickSize tickerId=0, field=3, size=25>
<tickSize tickerId=0, field=8, size=534349>
<tickPrice tickerId=0, field=2, price=201.82, canAutoExecute=1>
I need to define a class of type tickPrice or tickSize. I will need to decide which to use before doing the definition.
What would be the Pythonic way to grab these values? In other words, I need an effective way to reverse str() on a class.
The classes are already defined and just contain the presented variables, e.g., tickPrice.tickerId. I'm trying to find a way to extract these values from the text and set the instance attributes to match.
Edit: Answer
This is what I ended up doing-
with open(commandLineOptions.simulationFilename, "r") as simulationFileHandle:
    for simulationFileLine in simulationFileHandle:
        (date, time, msgString) = simulationFileLine.split("\t")
        if ("tickPrice" in msgString):
            msgStringCleaned = msgString.translate(None, ''.join("<>,"))
            msgList = msgStringCleaned.split(" ")
            msg = message.tickPrice()
            msg.tickerId = int(msgList[1][9:])
            msg.field = int(msgList[2][6:])
            msg.price = float(msgList[3][6:])
            msg.canAutoExecute = int(msgList[4][15:])
        elif ("tickSize" in msgString):
            msgStringCleaned = msgString.translate(None, ''.join("<>,"))
            msgList = msgStringCleaned.split(" ")
            msg = message.tickSize()
            msg.tickerId = int(msgList[1][9:])
            msg.field = int(msgList[2][6:])
            msg.size = int(msgList[3][5:])
        else:
            print "Unsupported tick message type"

I'm not sure how you want to dynamically create objects in your namespace, but the following will at least dynamically create objects based on your loglines:
Take your line:
line = '<tickPrice tickerId=0, field=2, price=201.81, canAutoExecute=1>'
Remove chars that aren't interesting to us, then split the line into a list:
line = line.translate(None, ''.join('<>,'))
line = line.split(' ')
Name the potential class attributes for convenience:
line_attrs = line[1:]
Then create your object (name, base tuple, dictionary of attrs):
tickPriceObject = type(line[0], (object,), { key:value for key,value in [at.split('=') for at in line_attrs]})()
Prove it works as we'd expect:
print(tickPriceObject.field)
# 2

Approaching the problem with regex, but with the same result as tristan's excellent answer (and stealing his use of the type constructor that I will never be able to remember)
import re
class_instance_re = re.compile(r"""
    <(?P<classname>\w[a-zA-Z0-9]*)[ ]
    (?P<arguments>
    (?:\w[a-zA-Z0-9]*=[0-9.]+[, ]*)+
    )>""", re.X)
objects = []
for line in whatever_file:
    result = class_instance_re.match(line)
    if result is None:
        continue  # skip lines that are not tick messages
    classname = result.group('classname')
    arguments = result.group('arguments')
    # build a class on the fly and instantiate it, as in tristan's answer
    new_obj = type(classname, (object,),
                   dict([s.split('=') for s in arguments.split(', ')]))()
    objects.append(new_obj)
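Both answers leave every attribute as a string and create new classes rather than using the ones the asker already has. A minimal Python 3 sketch of the alternative is below: it looks the class up by name and sets converted values with setattr. The two empty classes are stand-ins for the asker's existing message.tickPrice and message.tickSize, and CLASSES, convert and parse_line are hypothetical helper names.

```python
import re


class tickPrice:
    """Stand-in for the asker's existing class."""


class tickSize:
    """Stand-in for the asker's existing class."""


CLASSES = {"tickPrice": tickPrice, "tickSize": tickSize}
LINE_RE = re.compile(r"<(\w+) (.*)>")


def convert(text):
    """Turn '2' into an int and '201.81' into a float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def parse_line(line):
    """Return a populated instance, or None for unsupported lines."""
    match = LINE_RE.match(line.strip())
    if match is None or match.group(1) not in CLASSES:
        return None
    obj = CLASSES[match.group(1)]()
    for pair in match.group(2).split(", "):
        name, value = pair.split("=")
        setattr(obj, name, convert(value))
    return obj


msg = parse_line("<tickPrice tickerId=0, field=2, price=201.81, canAutoExecute=1>")
print(type(msg).__name__, msg.field, msg.price)
```

Lines that carry a date and time prefix, as in the asker's own solution, would need that prefix split off before calling parse_line.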

append in a list causing value overwrite

I am facing a peculiar problem. I will describe it briefly below.
Suppose I have this piece of code:
class MyClass:
    __postBodies = []
    .
    .
    .
        for the_file in os.listdir("/dir/path/to/file"):
            file_path = os.path.join(folder, the_file)
            params = self.__parseFileAsText(str(file_path)) #reads the file and gets some parsed data back
            dictData = {'file':str(file_path), 'body':params}
            self.__postBodies.append(dictData)
            print self.__postBodies
            dictData = None
            params = None
The problem is that when I print params and dictData for each file, they have different values every time (the right thing), but as soon as the append occurs and I print __postBodies, a strange thing happens. If there are three files, say A, B, C, then
the first time, __postBodies has the content [{'body':{A dict with some data related to file A}, 'file':'path/of/A'}]
the second time it becomes [{'body':{A dict with some data related to file B}, 'file':'path/of/A'}, {'body':{A dict with some data related to file B}, 'file':'path/of/B'}]
and the third time [{'body':{A dict with some data related to file C}, 'file':'path/of/A'}, {'body':{A dict with some data related to file C}, 'file':'path/of/B'}, {'body':{A dict with some data related to file C}, 'file':'path/of/C'}]
So, you see, the 'file' key is working fine. Strangely, only the 'body' key is getting overwritten for all the entries with the last one appended.
Am I making a mistake? Is there something I have to do? Please point me in a direction.
Sorry if I am not very clear.
EDIT ------------------------
The return from self.__parseFileAsText(str(file_path)) call is a dict that I am inserting as 'body' in the dictData.
EDIT2 ----------------------------
As you asked, this is the code, but I have checked that the params = self.__parseFileAsText(str(file_path)) call returns a different dict every time.
def __parseFileAsText(self, fileName):
    i = 0
    tempParam = StaticConfig.PASTE_PARAMS
    tempParam[StaticConfig.KEY_PASTE_PARAM_NAME] = ""
    tempParam[StaticConfig.KEY_PASTE_PARAM_PASTEFORMAT] = "text"
    tempParam[StaticConfig.KEY_PASTE_PARAM_EXPIREDATE] = "N"
    tempParam[StaticConfig.KEY_PASTE_PARAM_PRIVATE] = ""
    tempParam[StaticConfig.KEY_PASTE_PARAM_USER] = ""
    tempParam[StaticConfig.KEY_PASTE_PARAM_DEVKEY] = ""
    tempParam[StaticConfig.KEY_PASTE_FORMAT_PASTECODE] = ""
    for line in fileinput.input([fileName]):
        temp = str(line)
        temp2 = temp.strip()
        if i == 0:
            postValues = temp2.split("|||")
            if int(postValues[(len(postValues) - 1)]) == 0 or int(postValues[(len(postValues) - 1)]) == 2:
                tempParam[StaticConfig.KEY_PASTE_PARAM_NAME] = str(postValues[0])
                if str(postValues[1]) == '':
                    tempParam[StaticConfig.KEY_PASTE_PARAM_PASTEFORMAT] = 'text'
                else:
                    tempParam[StaticConfig.KEY_PASTE_PARAM_PASTEFORMAT] = postValues[1]
                if str(postValues[2]) != "N":
                    tempParam[StaticConfig.KEY_PASTE_PARAM_EXPIREDATE] = str(postValues[2])
                tempParam[StaticConfig.KEY_PASTE_PARAM_PRIVATE] = str(postValues[3])
                tempParam[StaticConfig.KEY_PASTE_PARAM_USER] = StaticConfig.API_USER_KEY
                tempParam[StaticConfig.KEY_PASTE_PARAM_DEVKEY] = StaticConfig.API_KEY
            else:
                tempParam[StaticConfig.KEY_PASTE_PARAM_USER] = StaticConfig.API_USER_KEY
                tempParam[StaticConfig.KEY_PASTE_PARAM_DEVKEY] = StaticConfig.API_KEY
            i = i+1
        else:
            if tempParam[StaticConfig.KEY_PASTE_FORMAT_PASTECODE] != "" :
                tempParam[StaticConfig.KEY_PASTE_FORMAT_PASTECODE] = str(tempParam[StaticConfig.KEY_PASTE_FORMAT_PASTECODE])+"\n"+temp2
            else:
                tempParam[StaticConfig.KEY_PASTE_FORMAT_PASTECODE] = temp2
    return tempParam

You are likely returning the same dictionary with every call to MyClass.__parseFileAsText(). A couple of common ways this might be happening:
__parseFileAsText() accepts a mutable default argument (the dict that you eventually return)
You modify an attribute of the class or instance and return that instead of creating a new one each time
Making sure that you are creating a new dictionary on each call to __parseFileAsText() should fix this problem.
Edit: Based on your updated question with the code for __parseFileAsText(), your issue is that you are reusing the same dictionary on each call:
tempParam = StaticConfig.PASTE_PARAMS
...
return tempParam
On each call you are modifying StaticConfig.PASTE_PARAMS, and the end result is that all of the body dictionaries in your list are actually references to StaticConfig.PASTE_PARAMS. Depending on what StaticConfig.PASTE_PARAMS is, you should change that top line to one of the following:
# StaticConfig.PASTE_PARAMS is an empty dict
tempParam = {}
# All values in StaticConfig.PASTE_PARAMS are immutable
tempParam = dict(StaticConfig.PASTE_PARAMS)
If any values in StaticConfig.PASTE_PARAMS are mutable, you could use copy.deepcopy but it would be better to populate tempParam with those default values on your own.
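The aliasing effect can be reproduced in a few lines of Python 3. In this sketch PASTE_PARAMS is a hypothetical module-level stand-in for StaticConfig.PASTE_PARAMS, and parse_shared and parse_copied are invented names: the first hands back the shared dictionary itself, the second works on a deep copy.

```python
import copy

# Hypothetical stand-in for StaticConfig.PASTE_PARAMS.
PASTE_PARAMS = {"name": "", "tags": []}


def parse_shared(name):
    params = PASTE_PARAMS          # same object on every call
    params["name"] = name
    return params


def parse_copied(name):
    params = copy.deepcopy(PASTE_PARAMS)   # independent copy per call
    params["name"] = name
    return params


shared = [parse_shared(n) for n in ("A", "B", "C")]
copied = [parse_copied(n) for n in ("A", "B", "C")]

print([p["name"] for p in shared])   # every entry shows the last name written
print([p["name"] for p in copied])   # each entry keeps its own name
print(shared[0] is shared[2])        # the shared entries are one and the same dict
```

The first list shows the behaviour from the question, where every 'body' ends up with the data of the last file.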

What if __postBodies wasn't a class attribute, as it is defined now, but just an instance attribute?
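For reference, the difference that question hints at looks like this in a short Python 3 sketch. SharedBodies and OwnBodies are invented classes, not the asker's code. Note that switching to an instance attribute would not by itself explain the overwritten 'body' values, since those come from reusing one dictionary; it only stops separate instances from sharing one list.

```python
class SharedBodies:
    bodies = []  # class attribute: one list shared by every instance


class OwnBodies:
    def __init__(self):
        self.bodies = []  # instance attribute: a fresh list per instance


a, b = SharedBodies(), SharedBodies()
a.bodies.append("from a")

c, d = OwnBodies(), OwnBodies()
c.bodies.append("from c")

print(b.bodies)  # b sees what a appended
print(d.bodies)  # d is unaffected by c
```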
