Collection items in Python

I have a collection of items like the one below in my MongoDB database:
{u'Keywords': [[u'european', 7], [u'bill', 5], [u'uk', 5], [u'years', 4], [u'brexit', 4]], u'Link': u'http://www.bbc.com/news/uk-politics-39042876', u'date': datetime.datetime(2017, 2, 21, 22, 47, 7, 463000), u'_id': ObjectId('58acc36b3040a218bc62c6d3')}
.....
These come from a MongoDB query:
mydb = client['BBCArticles']
##mydb.adminCommand({'setParameter': True, 'textSearchEnabled': True})
my_collection = mydb['Articles']
print 'Articles containing higher occurrences of the keyword are sorted as follows:'
for doc in my_collection.find({"Keywords":{"$elemMatch" : {"$elemMatch": {"$in": [keyword.lower()]}}}}):
    print doc
However, I want to print the documents as follows:
doc1
Keywords: european, bill, uk
Link: "http://www.bbc.com/"
doc2
....

Since your collection looks like a list of dictionaries, you can iterate over it with a for loop. If you want only the first three keywords and the scheme and host of the URL, this should work (Python 2):
# c = your collection as a list of dictionaries, e.g. list(my_collection.find(...))
from urlparse import urlparse
for n in range(len(c)):
    print 'doc{n}'.format(n=n+1)
    for k, v in c[n].iteritems():
        if k == 'Keywords':
            print k+':', ', '.join([str(kw[0]) for kw in v[0:3]])
        if k == 'Link':
            parsed_uri = urlparse( v )
            domain = '{uri.scheme}://{uri.netloc}/'.format(uri=parsed_uri)
            print k+':', '"{0}"\n'.format(domain)
prints:
doc1
Keywords: european, bill, uk
Link: "http://www.bbc.com/"
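
For readers on Python 3, the same formatting can be sketched as below. This is only an illustration under stated assumptions: format_doc and sample_docs are hypothetical names, and the sample list stands in for whatever my_collection.find(...) yields.

```python
from urllib.parse import urlparse


def format_doc(doc, position, top_n=3):
    """Return the 'docN / Keywords / Link' text for one document."""
    # Each keyword entry is a [word, count] pair; keep only the word.
    words = [pair[0] for pair in doc["Keywords"][:top_n]]
    parsed = urlparse(doc["Link"])
    domain = "{0}://{1}/".format(parsed.scheme, parsed.netloc)
    return 'doc{0}\nKeywords: {1}\nLink: "{2}"'.format(
        position, ", ".join(words), domain
    )


# Hypothetical stand-in for the documents yielded by my_collection.find(...).
sample_docs = [
    {
        "Keywords": [["european", 7], ["bill", 5], ["uk", 5], ["years", 4], ["brexit", 4]],
        "Link": "http://www.bbc.com/news/uk-politics-39042876",
    }
]

for position, doc in enumerate(sample_docs, start=1):
    print(format_doc(doc, position))
```

Looking the two fields up by key, rather than looping over every key of the document, keeps the Keywords line ahead of the Link line regardless of dictionary ordering.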

Related

How to extract information from atom feed based on condition?

I have output of API request in given below.
From each atom:entry I need to extract
<c:series href="http://company.com/series/product/123"/>
<c:series-order>2020-09-17T00:00:00Z</c:series-order>
<f:assessment-low precision="0">980</f:assessment-low>
I tried to extract them into separate lists with BeautifulSoup, but that didn't work because some entries have a date but no price (I've shown an example below). How can I extract them conditionally, or at least put N/A for entries where the price is omitted?
soup = BeautifulSoup(request.text, "html.parser")
date = soup.find_all('c:series-order')
value = soup.find_all('f:assessment-low')
quot=soup.find_all('c:series')
p_day = []
p_val = []
q_val=[]
for i in date:
    p_day.append(i.text)
for j in value:
    p_val.append(j.text)
for j in quot:
    q_val.append(j.get('href'))
d2={'date': p_day,
    'price': p_val,
    'quote': q_val
    }
and
<atom:feed xmlns:atom="http://www.w3.org/2005/Atom" xmlns:a="http://company.com/ns/assets" xmlns:c="http://company.com/ns/core" xmlns:f="http://company.com/ns/fields" xmlns:s="http://company.com/ns/search">
<atom:id>http://company.com/search</atom:id>
<atom:title> COMPANYSearch Results</atom:title>
<atom:updated>2022-11-24T19:36:19.104414Z</atom:updated>
<atom:author>COMPANY</atom:author>
<atom:generator> COMPANY/search Endpoint</atom:generator>
<atom:link href="/search" rel="self" type="application/atom"/>
<s:first-result>1</s:first-result>
<s:max-results>15500</s:max-results>
<s:selected-count>212</s:selected-count>
<s:returned-count>212</s:returned-count>
<s:query-time>PT0.036179S</s:query-time>
<s:request version="1.0">
<s:scope>
<s:series>http://company.com/series/product/123</s:series>
</s:scope>
<s:constraints>
<s:compare field="c:series-order" op="ge" value="2018-10-01"/>
<s:compare field="c:series-order" op="le" value="2022-11-18"/>
</s:constraints>
<s:options>
<s:first-result>1</s:first-result>
<s:max-results>15500</s:max-results>
<s:order-by key="commodity-name" direction="ascending" xml:lang="en"/>
<s:no-currency-rate-scheme>no-element</s:no-currency-rate-scheme>
<s:precision>embed</s:precision>
<s:include-last-commit-time>false</s:include-last-commit-time>
<s:include-result-types>live</s:include-result-types>
<s:relevance-score algorithm="score-logtfidf"/>
<s:lang-data-missing-scheme>show-available-language-content</s:lang-data-missing-scheme>
</s:options>
</s:request>
<s:facets/>
<atom:entry>
<atom:title>http://company.com/series-item/product/123-pricehistory-20200917000000</atom:title>
<atom:id>http://company.com/series-item/product/123-pricehistory-20200917000000</atom:id>
<atom:updated>2020-09-17T17:09:43.55243Z</atom:updated>
<atom:relevance-score>60800</atom:relevance-score>
<atom:content type="application/vnd.icis.iddn.entity+xml"><a:price-range>
<c:id>http://company.com/series-item/product/123-pricehistory-20200917000000</c:id>
<c:version>1</c:version>
<c:type>series-item</c:type>
<c:created-on>2020-09-17T17:09:43.55243Z</c:created-on>
<c:descriptor href="http://company.com/descriptor/price-range"/>
<c:domain href="http://company.com/domain/product"/>
<c:released-on>2020-09-17T21:30:00Z</c:released-on>
<c:series href="http://company.com/series/product/123"/>
<c:series-order>2020-09-17T00:00:00Z</c:series-order>
<f:assessment-low precision="0">980</f:assessment-low>
<f:assessment-high precision="0">1020</f:assessment-high>
<f:mid precision="1">1000</f:mid>
<f:assessment-low-delta>0</f:assessment-low-delta>
<f:assessment-high-delta>+20</f:assessment-high-delta>
<f:delta-type href="http://company.com/ref-data/delta-type/regular"/>
</a:price-range></atom:content>
</atom:entry>
<atom:entry>
<atom:title>http://company.com/series-item/product/123-pricehistory-20200910000000</atom:title>
<atom:id>http://company.com/series-item/product/123-pricehistory-20200910000000</atom:id>
<atom:updated>2020-09-10T18:57:55.128308Z</atom:updated>
<atom:relevance-score>60800</atom:relevance-score>
<atom:content type="application/vnd.icis.iddn.entity+xml"><a:price-range>
<c:id>http://company.com/series-item/product/123-pricehistory-20200910000000</c:id>
<c:version>1</c:version>
<c:type>series-item</c:type>
<c:created-on>2020-09-10T18:57:55.128308Z</c:created-on>
<c:descriptor href="http://company.com/descriptor/price-range"/>
<c:domain href="http://company.com/domain/product"/>
<c:released-on>2020-09-10T21:30:00Z</c:released-on>
<c:series href="http://company.com/series/product/123"/>
<c:series-order>2020-09-10T00:00:00Z</c:series-order>
for example here is no price
<f:delta-type href="http://company.com/ref-data/delta-type/regular"/>
</a:price-range></atom:content>
</atom:entry>

Try iterating per entry, use the XML parser to get a proper result, and check whether the element exists:
soup = BeautifulSoup(request.text,'xml')
data = []
for i in soup.select('entry'):
    data.append({
        'date':i.find('series-order').text,
        'value': i.find('assessment-low').text if i.find('assessment-low') else None,
        'quot': i.find('series').get('href')
    })
data
or with html.parser, which keeps the namespace prefix as part of the tag name:
soup = BeautifulSoup(request.text,'html.parser')
data = []
for i in soup.find_all('atom:entry'):
    data.append({
        'date':i.find('c:series-order').text,
        'value': i.find('f:assessment-low').text if i.find('f:assessment-low') else None,
        'quot': i.find('c:series').get('href')
    })
data
Output:
[{'date': '2020-09-17T00:00:00Z',
'value': '980',
'quot': 'http://company.com/series/product/123'},
{'date': '2020-09-10T00:00:00Z',
'value': None,
'quot': 'http://company.com/series/product/123'}]
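
If BeautifulSoup is not a requirement, the same per-entry check can be sketched with the standard library's xml.etree.ElementTree instead. This is a swapped-in technique, not the approach above. The namespace URIs are copied from the feed in the question; SAMPLE_FEED is a shortened, hand-written stand-in for request.text, and extract_entries is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map taken from the xmlns declarations on <atom:feed>.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "a": "http://company.com/ns/assets",
    "c": "http://company.com/ns/core",
    "f": "http://company.com/ns/fields",
}

# Shortened stand-in for request.text: one entry with a price, one without.
SAMPLE_FEED = """<atom:feed xmlns:atom="http://www.w3.org/2005/Atom" xmlns:a="http://company.com/ns/assets" xmlns:c="http://company.com/ns/core" xmlns:f="http://company.com/ns/fields">
  <atom:entry>
    <atom:content><a:price-range>
      <c:series href="http://company.com/series/product/123"/>
      <c:series-order>2020-09-17T00:00:00Z</c:series-order>
      <f:assessment-low precision="0">980</f:assessment-low>
    </a:price-range></atom:content>
  </atom:entry>
  <atom:entry>
    <atom:content><a:price-range>
      <c:series href="http://company.com/series/product/123"/>
      <c:series-order>2020-09-10T00:00:00Z</c:series-order>
    </a:price-range></atom:content>
  </atom:entry>
</atom:feed>"""


def extract_entries(feed_text, missing="N/A"):
    """Return one dict per atom:entry, filling absent fields with `missing`."""
    root = ET.fromstring(feed_text)
    rows = []
    for entry in root.findall("atom:entry", NS):
        series = entry.find(".//c:series", NS)
        rows.append({
            "date": entry.findtext(".//c:series-order", default=missing, namespaces=NS),
            "price": entry.findtext(".//f:assessment-low", default=missing, namespaces=NS),
            "quote": series.get("href") if series is not None else missing,
        })
    return rows


for row in extract_entries(SAMPLE_FEED):
    print(row)
```

Because every entry produces exactly one row, the date, price and quote values stay aligned even when a price is missing.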

You can try this:
Split your request.text on <atom:entry>.
Deal with each section separately.
Use enumerate to identify the section each value came from.
entries = request.text.split("<atom:entry>")
p_day = []
p_val = []
q_val=[]
for i, entry in enumerate(entries):
    soup = BeautifulSoup(entry, "html.parser")
    date = soup.find_all('c:series-order')
    value = soup.find_all('f:assessment-low')
    quot=soup.find_all('c:series')
    for d in date:
        p_day.append([i, d.text])
    for v in value:
        p_val.append([i, v.text])
    for q in quot:
        q_val.append([i, q.get('href')])
d2={'date': p_day,
    'price': p_val,
    'quote': q_val
    }
print(d2)
OUTPUT:
{'date': [[1, '2020-09-17T00:00:00Z'], [2, '2020-09-10T00:00:00Z']],
'price': [[1, '980']],
'quote': [[1, 'http://company.com/series/product/123'],
[2, 'http://company.com/series/product/123']]}

Getting the sum of a csv column without pandas in python

I have a csv file passed into a function as a string:
csv_input = """
quiz_date,location,size
2022-01-01,london_uk,134
2022-01-02,edingburgh_uk,65
2022-01-01,madrid_es,124
2022-01-02,london_uk,125
2022-01-01,edinburgh_uk,89
2022-01-02,madric_es,143
2022-01-02,london_uk,352
2022-01-01,edinburgh_uk,125
2022-01-01,madrid_es,431
2022-01-02,london_uk,151"""
I want to print the sum of how many people were surveyed in each city by date, so something like:
Date. City. Pop-Surveyed
2022-01-01. London. 134
2022-01-01. Edinburgh. 214
2022-01-01. Madrid. 555
2022-01-02. London. 628
2022-01-02. Edinburgh. 65
2022-01-02. Madrid. 143
As I can't import pandas on my machine (I can't install it without internet access), I thought I could use a defaultdict to store the value of each city by date:
from collections import defaultdict
survey_data = csv_input.split()[1:]
survey_data = [survey.split(',') for survey in survey_data]
survey_sum = defaultdict(dict)
for survey in survey_data:
    date = survey[0]
    city = survey[1].split("_")[0]
    quantity = survey[-1]
    survey_sum[date][city] += quantity
print(survey_sum)
But doing this returns a KeyError:
KeyError: 'london'
When I was hoping to have a defaultdict of
{'2022-01-01': {'london': 134}, {'edinburgh': 214}, {'madrid': 555}},
{'2022-01-02': {'london': 628}, {'edinburgh': 65}, {'madrid': 143}}
Is there a way to create a default dict that gives a structure so I could then iterate over to print out each column like above?

Try:
csv_input = """\
quiz_date,location,size
2022-01-01,london_uk,134
2022-01-02,edingburgh_uk,65
2022-01-01,madrid_es,124
2022-01-02,london_uk,125
2022-01-01,edinburgh_uk,89
2022-01-02,madric_es,143
2022-01-02,london_uk,352
2022-01-01,edinburgh_uk,125
2022-01-01,madrid_es,431
2022-01-02,london_uk,151"""
header, *rows = (
    tuple(map(str.strip, line.split(",")))
    for line in map(str.strip, csv_input.splitlines())
)
tmp = {}
for date, city, size in rows:
    key = (date, city.split("_")[0])
    tmp[key] = tmp.get(key, 0) + int(size)
out = {}
for (date, city), size in tmp.items():
    out.setdefault(date, []).append({city: size})
print(out)
Prints:
{
"2022-01-01": [{"london": 134}, {"madrid": 555}, {"edinburgh": 214}],
"2022-01-02": [{"edingburgh": 65}, {"london": 628}, {"madric": 143}],
}

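The KeyError occurs because defaultdict(dict) creates an empty plain dict for each new date, and a plain dict has no default value for a missing city. Changing
survey_sum = defaultdict(dict)
to
survey_sum = defaultdict(lambda: defaultdict(int))
and converting the quantity to a number with quantity = int(survey[-1]) returns
defaultdict(<function survey_sum.<locals>.<lambda> at 0x100edd8b0>, {'2022-01-01': defaultdict(<class 'int'>, {'london': 134, 'madrid': 555, 'edinburgh': 214}), '2022-01-02': defaultdict(<class 'int'>, {'edingburgh': 65, 'london': 628, 'madric': 143})})
which you can then iterate over to print each row.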
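
Put together, the nested defaultdict approach might look like the sketch below. The CSV text is a shortened sample rather than the full data from the question, and sum_by_date_and_city is a hypothetical helper name.

```python
from collections import defaultdict

# Shortened sample, not the full data from the question.
csv_input = """
quiz_date,location,size
2022-01-01,london_uk,134
2022-01-01,madrid_es,124
2022-01-02,london_uk,125
2022-01-01,madrid_es,431
2022-01-02,london_uk,352
"""


def sum_by_date_and_city(text):
    """Return {date: {city: total}} without any third-party imports."""
    # The outer default creates an inner defaultdict(int) per new date,
    # and the inner default starts every new city at 0.
    totals = defaultdict(lambda: defaultdict(int))
    lines = [line.strip() for line in text.strip().splitlines()]
    for line in lines[1:]:  # skip the header row
        date, location, size = line.split(",")
        city = location.split("_")[0]
        totals[date][city] += int(size)  # int() so the values add up as numbers
    return totals


totals = sum_by_date_and_city(csv_input)
for date in sorted(totals):
    for city, surveyed in totals[date].items():
        print(date, city.capitalize(), surveyed)
```

Misspelled locations in the input are summed under their own keys, so cleaning the city names first is worth considering if the source data is hand-typed.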

Python: Adding letter plus sequential numbers as values to a dictionary key

I want to create a dictionary like this:
{'cat': ['anm_0', 'anm_1', 'anm_2', ... 'anm_99'],
'dog': ['anm_100', 'anm_101', 'anm_102', ... 'anm_199'],
'snake': ['anm_200', 'anm_201', 'anm_202', ... 'anm_299']}
I tried to manually define them this way:
anmdict = {}
anmdict['cat'] = "anm_" + str((list(range(0,100))))
but that outputs:
{'cat': 'anm_[0, 1, 2, 3, 4, 5, ...]'}
instead of the output with anm_0, anm_1, etc as different values.

You can use range:
_k = iter(['cat', 'dog', 'snake'])
new_d = {next(_k):[f'anm_{c}' for c in range(i, i+100)] for i in range(0, 300, 100)}
Output:
{'cat': ['anm_0', 'anm_1', 'anm_2', 'anm_3', 'anm_4', 'anm_5', 'anm_6', 'anm_7', 'anm_8', 'anm_9', 'anm_10', 'anm_11', 'anm_12', 'anm_13', 'anm_14', 'anm_15', 'anm_16', 'anm_17', 'anm_18', 'anm_19', 'anm_20', 'anm_21', 'anm_22', 'anm_23', 'anm_24', 'anm_25', 'anm_26', 'anm_27', 'anm_28', 'anm_29', 'anm_30', 'anm_31', 'anm_32', 'anm_33', 'anm_34', 'anm_35', 'anm_36', 'anm_37', 'anm_38', 'anm_39', 'anm_40', 'anm_41', 'anm_42', 'anm_43', 'anm_44', 'anm_45', 'anm_46', 'anm_47', 'anm_48', 'anm_49', 'anm_50', 'anm_51', 'anm_52', 'anm_53', 'anm_54', 'anm_55', 'anm_56', 'anm_57', 'anm_58', 'anm_59', 'anm_60', 'anm_61', 'anm_62', 'anm_63', 'anm_64', 'anm_65', 'anm_66', 'anm_67', 'anm_68', 'anm_69', 'anm_70', 'anm_71', 'anm_72', 'anm_73', 'anm_74', 'anm_75', 'anm_76', 'anm_77', 'anm_78', 'anm_79', 'anm_80', 'anm_81', 'anm_82', 'anm_83', 'anm_84', 'anm_85', 'anm_86', 'anm_87', 'anm_88', 'anm_89', 'anm_90', 'anm_91', 'anm_92', 'anm_93', 'anm_94', 'anm_95', 'anm_96', 'anm_97', 'anm_98', 'anm_99'], 'dog': ['anm_100', 'anm_101', 'anm_102', 'anm_103', 'anm_104', 'anm_105', 'anm_106', 'anm_107', 'anm_108', 'anm_109', 'anm_110', 'anm_111', 'anm_112', 'anm_113', 'anm_114', 'anm_115', 'anm_116', 'anm_117', 'anm_118', 'anm_119', 'anm_120', 'anm_121', 'anm_122', 'anm_123', 'anm_124', 'anm_125', 'anm_126', 'anm_127', 'anm_128', 'anm_129', 'anm_130', 'anm_131', 'anm_132', 'anm_133', 'anm_134', 'anm_135', 'anm_136', 'anm_137', 'anm_138', 'anm_139', 'anm_140', 'anm_141', 'anm_142', 'anm_143', 'anm_144', 'anm_145', 'anm_146', 'anm_147', 'anm_148', 'anm_149', 'anm_150', 'anm_151', 'anm_152', 'anm_153', 'anm_154', 'anm_155', 'anm_156', 'anm_157', 'anm_158', 'anm_159', 'anm_160', 'anm_161', 'anm_162', 'anm_163', 'anm_164', 'anm_165', 'anm_166', 'anm_167', 'anm_168', 'anm_169', 'anm_170', 'anm_171', 'anm_172', 'anm_173', 'anm_174', 'anm_175', 'anm_176', 'anm_177', 'anm_178', 'anm_179', 'anm_180', 'anm_181', 'anm_182', 'anm_183', 'anm_184', 'anm_185', 'anm_186', 'anm_187', 'anm_188', 'anm_189', 
'anm_190', 'anm_191', 'anm_192', 'anm_193', 'anm_194', 'anm_195', 'anm_196', 'anm_197', 'anm_198', 'anm_199'], 'snake': ['anm_200', 'anm_201', 'anm_202', 'anm_203', 'anm_204', 'anm_205', 'anm_206', 'anm_207', 'anm_208', 'anm_209', 'anm_210', 'anm_211', 'anm_212', 'anm_213', 'anm_214', 'anm_215', 'anm_216', 'anm_217', 'anm_218', 'anm_219', 'anm_220', 'anm_221', 'anm_222', 'anm_223', 'anm_224', 'anm_225', 'anm_226', 'anm_227', 'anm_228', 'anm_229', 'anm_230', 'anm_231', 'anm_232', 'anm_233', 'anm_234', 'anm_235', 'anm_236', 'anm_237', 'anm_238', 'anm_239', 'anm_240', 'anm_241', 'anm_242', 'anm_243', 'anm_244', 'anm_245', 'anm_246', 'anm_247', 'anm_248', 'anm_249', 'anm_250', 'anm_251', 'anm_252', 'anm_253', 'anm_254', 'anm_255', 'anm_256', 'anm_257', 'anm_258', 'anm_259', 'anm_260', 'anm_261', 'anm_262', 'anm_263', 'anm_264', 'anm_265', 'anm_266', 'anm_267', 'anm_268', 'anm_269', 'anm_270', 'anm_271', 'anm_272', 'anm_273', 'anm_274', 'anm_275', 'anm_276', 'anm_277', 'anm_278', 'anm_279', 'anm_280', 'anm_281', 'anm_282', 'anm_283', 'anm_284', 'anm_285', 'anm_286', 'anm_287', 'anm_288', 'anm_289', 'anm_290', 'anm_291', 'anm_292', 'anm_293', 'anm_294', 'anm_295', 'anm_296', 'anm_297', 'anm_298', 'anm_299']}

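str(list(range(0,100))) turns the whole list into one string, so you get a single value instead of 100 separate ones. Change this
anmdict['cat'] = "anm_" + str((list(range(0,100))))
to this, which builds one string per number:
anmdict['cat'] = ["anm_" + str(i) for i in range(0, 100)]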
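
To build all three keys in one go, the per-number comprehension can be combined with enumerate. The following is a minimal sketch; build_blocks and its block_size and prefix parameters are hypothetical names chosen for illustration.

```python
def build_blocks(names, block_size=100, prefix="anm_"):
    """Map each name to its own consecutive block of prefixed numbers."""
    return {
        # The i-th name gets the numbers i*block_size .. (i+1)*block_size - 1.
        name: [f"{prefix}{n}" for n in range(i * block_size, (i + 1) * block_size)]
        for i, name in enumerate(names)
    }


anmdict = build_blocks(["cat", "dog", "snake"])
print(anmdict["dog"][:3])  # first three labels for 'dog'
```

Deriving the start of each block from the position of the name avoids keeping a separate iterator in step with the range.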

Extract the list of running IP's from AWS

I need to extract a list of all running IPs from AWS, and along with each IP I also want the tag value and the region name (or availability zone). The list should look as follows:
ex : [[('23.1.1.141', ' Production LB', 'us-east-1'), ('47.2.14.93', 'sung2-cloud-trial01-LBi-b89671', 'us-west-2'),................]]
The list consists of the following:
<'Running IP'>,<'Tag-value'>,<AvailabilityZone>
I am using boto3, and I believe I can achieve this with describe_instances(), so I tried it as follows:
def gather_public_ip():
    regions = ['us-west-2', 'eu-central-1', 'ap-southeast-1']
    combined_list = [] ##This needs to be returned
    for region in regions:
        instance_information = {}
        ip_dict = {}
        client = boto3.client('ec2', aws_access_key_id=ACCESS_KEY, aws_secret_access_key=SECRET_KEY,
                              region_name=region, )
        instance_dict = client.describe_instances().get('Reservations')
        for instance in instance_dict:
            .....
            #WHAT CODE COMES HERE???
gather_public_ip()
What exactly should I be looking into for this? I am not sure how to do it, please help.
NEW ANSWER RESPONSE:
Traceback (most recent call last):
  File "C:/Users/nisingh/PycharmProjects/untitled1/more_specific.py", line 35, in <module>
    gather_public_ip()
  File "C:/Users/nisingh/PycharmProjects/untitled1/more_specific.py", line 27, in gather_public_ip
    ipaddress = instance['PublicIpAddress'] # Which one do you need?
KeyError: 'PublicIpAddress'
{u'Monitoring': {u'State': 'disabled'}, u'PublicDnsName': '', u'RootDeviceType': 'ebs', u'State': {u'Code': 80, u'Name': 'stopped'}, u'EbsOptimized': False, u'LaunchTime': datetime.datetime(2015, 5, 2, 16, 34, 17, tzinfo=tzutc()), u'PrivateIpAddress': '10.11.32.10', u'ProductCodes': [], u'VpcId': 'vpc-8ae766ef', u'StateTransitionReason': 'User initiated (2016-01-29 22:20:18 GMT)', u'InstanceId': 'i-5e8ae8a8', u'ImageId': 'ami-5189a661', u'PrivateDnsName': 'ip-10-11-32-10.us-west-2.compute.internal', u'KeyName': 'pdx01-cump-050215', u'SecurityGroups': [{u'GroupName': 'pdx01-clv-jumpeng', u'GroupId': 'sg-50abbf35'}], u'ClientToken': 'inYin1430584456904', u'SubnetId': 'subnet-57e45b20', u'InstanceType': 't2.micro', u'NetworkInterfaces': [{u'Status': 'in-use', u'MacAddress': '06:91:7c:86:7c:d5', u'SourceDestCheck': True, u'VpcId': 'vpc-8ae766ef', u'Description': 'Primary network interface', u'NetworkInterfaceId': 'eni-7d566d0b', u'PrivateIpAddresses': [{u'Primary': True, u'PrivateIpAddress': '10.11.32.10'}], u'Attachment': {u'Status': 'attached', u'DeviceIndex': 0, u'DeleteOnTermination': True, u'AttachmentId': 'eni-attach-08d40929', u'AttachTime': datetime.datetime(2015, 5, 2, 16, 34, 17, tzinfo=tzutc())}, u'Groups': [{u'GroupName': 'pdx01-cloud-dev-jumpeng', u'GroupId': 'sg-50abbf35'}], u'SubnetId': 'subnet-57e45b20', u'OwnerId': '547316675291', u'PrivateIpAddress': '10.11.32.10'}], u'SourceDestCheck': True, u'Placement': {u'Tenancy': 'default', u'GroupName': '', u'AvailabilityZone': 'us-west-2b'}, u'Hypervisor': 'xen', u'BlockDeviceMappings': [{u'DeviceName': '/dev/sda1', u'Ebs': {u'Status': 'attached', u'DeleteOnTermination': True, u'VolumeId': 'vol-0990f21b', u'AttachTime': datetime.datetime(2015, 5, 2, 16, 34, 21, tzinfo=tzutc())}}], u'Architecture': 'x86_64', u'StateReason': {u'Message': 'Client.UserInitiatedShutdown: User initiated shutdown', u'Code': 'Client.UserInitiatedShutdown'}, u'RootDeviceName': '/dev/sda1', u'VirtualizationType': 'hvm', u'Tags': 
[{u'Value': 'pdx01-jump02-DEPRECATED', u'Key': 'Name'}, {u'Value': 'Cloud Eng Jump Box', u'Key': 'Description'}], u'AmiLaunchIndex': 0}

I think it is all about finding the information in the response and grabbing it:
import boto3
def gather_public_ip():
    regions = ['us-west-2', 'eu-central-1', 'ap-southeast-1']
    combined_list = [] ##This needs to be returned
    for region in regions:
        instance_information = [] # I assume this is a list, not dict
        client = boto3.client('ec2', aws_access_key_id=ACCESS_KEY, aws_secret_access_key=SECRET_KEY,
                              region_name=region, )
        instance_dict = client.describe_instances().get('Reservations')
        for reservation in instance_dict:
            for instance in reservation['Instances']: # This is rather not obvious
                if instance['State']['Name'] == 'running':
                    try:
                        ipaddress = instance['PublicIpAddress']
                    except KeyError:
                        continue # no public IP on this instance, move on to the next one
                    tagValue = instance['Tags'][0]['Value'] # 'Tags' is a list, took the first element, you might wanna switch this
                    zone = instance['Placement']['AvailabilityZone']
                    info = ipaddress, tagValue, zone
                    instance_information.append(info)
        combined_list.append(instance_information)
    return combined_list
gather_public_ip()
I hope this is what you are looking for. I also hope you do NOT copy this straight off, but look at how I extracted the info from the response dict described in the docs.

Try ip_addr = response['Reservations'][0]['Instances'][0]['PublicIpAddress'] for an IP address, where response = client.describe_instances(). The describe_instances() response is a dictionary whose 'Reservations' value is a list that contains further dictionaries, each with its own 'Instances' list, and so on.
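
The traversal of that nested response can be tried out without touching AWS. The sketch below works on a plain dictionary shaped like a describe_instances() response; running_public_ips and fake_response are hypothetical names, the sample values are made up, and choosing the tag whose key is 'Name' is an assumption about which tag value is wanted.

```python
def running_public_ips(response, tag_key="Name"):
    """Collect (public IP, tag value, availability zone) for running instances."""
    found = []
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            if instance["State"]["Name"] != "running":
                continue
            ip = instance.get("PublicIpAddress")
            if ip is None:
                continue  # running, but without a public address
            # 'Tags' is a list of {'Key': ..., 'Value': ...} dicts and may be absent.
            tags = {tag["Key"]: tag["Value"] for tag in instance.get("Tags", [])}
            zone = instance["Placement"]["AvailabilityZone"]
            found.append((ip, tags.get(tag_key, ""), zone))
    return found


# Hand-written stand-in for client.describe_instances(); values are made up.
fake_response = {
    "Reservations": [
        {
            "Instances": [
                {
                    "State": {"Name": "running"},
                    "PublicIpAddress": "203.0.113.10",
                    "Placement": {"AvailabilityZone": "us-west-2b"},
                    "Tags": [{"Key": "Name", "Value": "web-01"}],
                },
                {
                    # Stopped instances carry no 'PublicIpAddress' key.
                    "State": {"Name": "stopped"},
                    "Placement": {"AvailabilityZone": "us-west-2b"},
                    "Tags": [{"Key": "Name", "Value": "jump-box"}],
                },
            ]
        }
    ]
}

print(running_public_ips(fake_response))
```

With a real client, passing Filters=[{'Name': 'instance-state-name', 'Values': ['running']}] to describe_instances() should let EC2 do the state filtering server-side, though the missing-IP check is still needed.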

Getting the index information of a specific string in a nested list? In python 3

I have a list containing lots of nested lists, each holding a student's name and their test score. I was wondering how to retrieve the index (position) of the sublist that contains a specific student's scores by searching with the variable 'name', which the user enters when the quiz starts. The sublists are created by importing a text file; the code I used for that is:
with open('class1quizscoreboard.txt') as scoreboard:
    for line in scoreboard:
        importedscores.append(line.strip().split(','))
This works and the list looks like this:
[['EmilyScott ', ' 3'], ['Student Name ', ' 2'], ['Another Student ', ' 1'], ['Third Student ', ' 10']]
So I want to find the sublist by searching with the variable name, then find the position of that sublist IF the name is already in the list of scores. I would then add another test score to that sublist, but if the name is not there, just create another sublist.
I thought of approaching this with an IF inside a FOR loop, but it didn't work:
for sublist in importedscores:
    if sublist[0] == search:
        print ("Found it!"), sublist
It returned 'not found' even though I used a name I knew was definitely in the list.
I am probably approaching this in completely the wrong way, but the end goal is to be able to add a student's new score to their existing list if they already have one.
Thank you for any help

By dictionary:
Read the text file with the csv module, because the file has a well-defined structure.
Create a dictionary from the file content, where the key is the name and the value is the score as an integer.
Add the new score for an existing student.
Create a new entry for a new student.
Demo:
import csv
import pprint
p = "input34.txt"
with open(p, newline="") as fp:
    root = csv.reader(fp, delimiter=',')
    result = {}
    for i in root:
        # strip() removes the spaces around the name and the score
        result[i[0].strip()] = int(i[1].strip())
print("Debug1 result:")
pprint.pprint(result)
search_key = "Test student"
add_value = 5
if search_key in result:
    result[search_key] = result[search_key] + add_value
else:
    result[search_key] = add_value
print("Debug2 result:")
pprint.pprint(result)
search_key = "Another Student"
add_value = 5
if search_key in result:
    result[search_key] = result[search_key] + add_value
else:
    result[search_key] = add_value
print("Debug3 result:")
pprint.pprint(result)
Output:
vivek#vivek:~/Desktop/stackoverflow$ python 34.py
Debug1 result:
{'Another Student': 1,
'Emily Scott': 3,
'Student Name': 2,
'Third Student': 10}
Debug2 result:
{'Another Student': 1,
'Emily Scott': 3,
'Student Name': 2,
'Test student': 5,
'Third Student': 10}
Debug3 result:
{'Another Student': 6,
'Emily Scott': 3,
'Student Name': 2,
'Test student': 5,
'Third Student': 10}
By list:
Demo:
import csv
import pprint
p = "input34.txt"
with open(p, newline="") as fp:
    root = csv.reader(fp, delimiter=',')
    result = []
    for i in root:
        # strip() removes the spaces around the name and the score
        result.append([i[0].strip(), int(i[1].strip())])
print("Debug1 result:")
pprint.pprint(result)
search_key = "Test student"
add_value = 5
add_flag = False
for i, j in enumerate(result):
    if search_key == j[0]:
        j[1] = j[1] + add_value
        add_flag = True
        break
if not add_flag:
    result.append([search_key, add_value])
print("Debug2 result:")
pprint.pprint(result)
search_key = "Another Student"
add_value = 5
add_flag = False
for i, j in enumerate(result):
    if search_key == j[0]:
        j[1] = j[1] + add_value
        add_flag = True
        break
if not add_flag:
    result.append([search_key, add_value])
print("Debug3 result:")
pprint.pprint(result)
Output:
vivek#vivek:~/Desktop/stackoverflow$ python 34.py
Debug1 result:
[['Emily Scott', 3],
['Student Name', 2],
['Another Student', 1],
['Third Student', 10]]
Debug2 result:
[['Emily Scott', 3],
['Student Name', 2],
['Another Student', 1],
['Third Student', 10],
['Test student', 5]]
Debug3 result:
[['Emily Scott', 3],
['Student Name', 2],
['Another Student', 6],
['Third Student', 10],
['Test student', 5]]
Use collections.defaultdict to simplify the code, so there is no need to check whether the key is in the dictionary.
e.g.:
>>> import collections
>>> result = collections.defaultdict(int)
>>> result["student 1"] = 10
>>> result["student 2"] = 20
>>> print(result)
defaultdict(<class 'int'>, {'student 1': 10, 'student 2': 20})
>>> result["student 2"] = result["student 2"] + 2
>>> print(result)
defaultdict(<class 'int'>, {'student 1': 10, 'student 2': 22})
>>> result["student 3"] = result["student 3"] + 5
>>> print(result)
defaultdict(<class 'int'>, {'student 1': 10, 'student 2': 22, 'student 3': 5})
>>>

Try this:
def find_index(name, scores):
    for i, j in enumerate(scores):
        if name in j:
            return i
for i, j in importedscores:
    if name == i:
        ind = find_index(name, importedscores)
        importedscores[ind].append(score)
In the find_index function, i is the index and j is the sublist: if the name is in the sublist, the function returns the index.
In the for loop, each sublist is unpacked into i and j, so i is the name and j is the score. Both comparisons need an exact match, so strip the surrounding spaces from the imported names first.
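
The imported names in the question carry a trailing space ('EmilyScott '), which is a likely reason an exact comparison with the typed name fails. A minimal sketch of a lookup that strips whitespace before comparing is shown below; add_score is a hypothetical helper name, and the sample list is a shortened copy of the one in the question.

```python
def add_score(scores, name, score):
    """Append score to the student's sublist, or start a new sublist.

    Returns the index of the sublist that now holds the score.
    """
    wanted = name.strip()
    for index, entry in enumerate(scores):
        # Compare stripped names so 'EmilyScott ' matches 'EmilyScott'.
        if entry[0].strip() == wanted:
            entry.append(score)
            return index
    scores.append([wanted, score])
    return len(scores) - 1


importedscores = [['EmilyScott ', ' 3'], ['Student Name ', ' 2']]
print(add_score(importedscores, 'EmilyScott', 7))   # index of the existing sublist
print(add_score(importedscores, 'New Student', 4))  # index of the newly added sublist
print(importedscores)
```

Returning the index answers the original question about the position of the sublist, while the append handles both the existing-student and new-student cases in one call.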
