Sorting an Array Using a Sorting Algorithm in Python

As part of my project, I want to build a database that sorts people by age, computed from their birthdates.
import datetime
profile = (
    ('Joe', 'Clark', '1989-11-20'),
    ('Charlie', 'Babbitt', '1988-11-20'),
    ('Frank', 'Abagnale', '2002-11-20'),
    ('Bill', 'Clark', '2009-11-20'),
    ('Alan', 'Clark', '1925-11-20'),
)
age_list = []
for prof in profile:
    date = prof[2]
    datem = datetime.datetime.strptime(date, "%Y-%m-%d")
    tod = datem.day
    mos = datem.month
    yr = datem.year
    today_date = datetime.datetime.now()
    dob = datetime.datetime(yr, mos, tod)
    time_diff = today_date - dob
    Age = time_diff.days // 365
    age_list.append(Age)
def insertionsort(age_list):
    for him in range(1, len(age_list)):
        call = him - 1
        # check the index first so age_list[-1] is never compared by accident
        while call >= 0 and age_list[call] > age_list[call + 1]:
            age_list[call], age_list[call + 1] = age_list[call + 1], age_list[call]
            call -= 1
insertionsort(age_list)
print("")
print("\t\t\t\t\t\t\t\t\t\t\t---Insertion Sort---")
print("Sorted Array of Age: ", age_list)
The output is:
---Insertion Sort---
Sorted Array of Age: [12, 19, 32, 33, 96]
But that's not what I want. I don't want just the age; I want the other elements included in the output as well.
So instead of the output above, what I want is:
---Insertion Sort---
Sorted Array of Age: [Bill, Clark, 12]
[Frank, Abagnale, 19]
[Joe, Clark, 32]
[Charlie, Babbitt, 33]
[Alan, Clark, 96]
Thank you in advance!

As you want to keep your own insertion sort implementation, I would suggest putting the date of birth as the first tuple member: that way you can just compare tuples in your sorting implementation. The date of birth is in fact a better value to sort by (but reversed) than the age, as the date has more precision (day) compared to the age (year).
Secondly, your algorithm to calculate the age is error prone, as not all years have 365 days. Use the code as provided in this question:
import datetime
def calculate_age(born):
    today = datetime.date.today()
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))
def insertionsort(lst):
    for i, value in enumerate(lst):
        for j in range(i - 1, -1, -1):
            if lst[j] > value:  # this will give a sort in descending order
                break
            lst[j], lst[j + 1] = lst[j + 1], lst[j]
# Your example data as a list
profiles = [
    ('Joe', 'Clark', '1989-11-20'),
    ('Charlie', 'Babbitt', '1988-11-20'),
    ('Frank', 'Abagnale', '2002-11-20'),
    ('Bill', 'Clark', '2009-11-20'),
    ('Alan', 'Clark', '1925-11-20'),
]
# Put date of birth first, and append age
profiles = [(dob, first, last, calculate_age(datetime.datetime.strptime(dob, "%Y-%m-%d")))
            for first, last, dob in profiles]
insertionsort(profiles)
print(profiles)
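To see why dividing by 365 is error prone, here is a minimal sketch that compares the two calculations on a fixed pair of dates. The function names and the sample dates are made up for illustration; the calendar formula is the one used in calculate_age above.

```python
from datetime import date

def age_by_days(born, today):
    """Approximate age: elapsed days divided by 365 (ignores leap days)."""
    return (today - born).days // 365

def age_by_calendar(born, today):
    """Calendar age: subtract one if this year's birthday has not happened yet."""
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))

born = date(2000, 11, 20)
day_before_birthday = date(2020, 11, 19)  # hypothetical "today" for illustration

print(age_by_days(born, day_before_birthday))      # 20
print(age_by_calendar(born, day_before_birthday))  # 19
```

The five leap days between the two dates push the day count past 20 * 365, so the division reports age 20 one day before the 20th birthday.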

results = sorted(profile, key=lambda x: datetime.datetime.strptime(x[2], "%Y-%m-%d"))

You could do it like this. Note that the strptime function may not be necessary for you but it implicitly validates the format of the date in your input data. Also note that because the dates are in the form of YYYY-MM-DD they can be sorted lexically to give the desired result.
from datetime import datetime
from dateutil.relativedelta import relativedelta
profile = (
    ('Joe', 'Clark', '1989-11-20'),
    ('Charlie', 'Babbitt', '1988-11-20'),
    ('Frank', 'Abagnale', '2002-11-20'),
    ('Bill', 'Clark', '2009-11-20'),
    ('Alan', 'Clark', '1925-11-20')
)
for person in sorted(profile, key=lambda e: e[2], reverse=True):
    age = relativedelta(datetime.today(), datetime.strptime(person[2], '%Y-%m-%d')).years
    print(f'{person[0]}, {person[1]}, {age}')
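The claim that YYYY-MM-DD strings sort lexically in date order can be checked with a small sketch like the one below. It uses the birthdates from the question; the non-padded strings in the second part are made up to show where the shortcut stops working.

```python
from datetime import datetime

# Birthdates from the question, as zero-padded YYYY-MM-DD strings
dobs = ['1989-11-20', '1988-11-20', '2002-11-20', '2009-11-20', '1925-11-20']

by_string = sorted(dobs)
by_date = sorted(dobs, key=lambda s: datetime.strptime(s, '%Y-%m-%d'))
print(by_string == by_date)  # True

# Made-up strings without zero padding: string order no longer matches date order
loose = ['2009-2-1', '2009-11-20']
print(sorted(loose))  # ['2009-11-20', '2009-2-1']
```

So the string shortcut is only safe when every date is zero-padded and in year-month-day order.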

Related

How to tell if two dates in a list are consecutive in Python?

I have a list of date strings, sorted in order:
sorteddates =['2017-04-26', '2017-05-05', '2017-05-10', '2017-05-11', '2017-05-16']
I tried using the code below to find consecutive dates, but I am having a difficult time understanding it. I want to find which two dates are consecutive. Only two dates.
dates = [datetime.strptime(d, "%Y-%m-%d") for d in sorteddates]
date_ints = set([d.toordinal() for d in dates])

Convert the list from str to datetime -- still in sorted order.
Iterate through the list; for each item, check to see whether the next item is one day later -- datetime has timedelta values as well.
Some code:
import datetime
# Convert list to datetime; you've shown you can do that part.
dates = [datetime.datetime.strptime(d, "%Y-%m-%d") for d in sorteddates]
one_day = datetime.timedelta(days=1)
for today, tomorrow in zip(dates, dates[1:]):
    if today + one_day == tomorrow:
        print("SUCCESS")

If I understand your question correctly, to get first pair of consecutive dates you can check if their delta is 1 day:
from datetime import datetime
sorteddates =['2017-04-26', '2017-05-05', '2017-05-10', '2017-05-11', '2017-05-16']
dates = [datetime.strptime(d, "%Y-%m-%d") for d in sorteddates]
d = next(((d1, d2) for d1, d2 in zip(dates, dates[1:]) if (d2 - d1).days == 1), None ) # <-- returns pair or None if no consecutive dates are found
print(d)
Prints:
(datetime.datetime(2017, 5, 10, 0, 0), datetime.datetime(2017, 5, 11, 0, 0))
Or formatted:
if d:
    print([datetime.strftime(i, "%Y-%m-%d") for i in d])
Prints:
['2017-05-10', '2017-05-11']
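Since the question already experimented with toordinal(), here is a hedged variant of the same check that compares day ordinals and returns every consecutive pair instead of only the first. The helper name consecutive_pairs is made up for this sketch, and date.fromisoformat assumes Python 3.7 or later.

```python
from datetime import date

def consecutive_pairs(iso_dates):
    """Return every adjacent pair of dates that are exactly one day apart."""
    days = sorted(date.fromisoformat(s) for s in iso_dates)
    return [(a.isoformat(), b.isoformat())
            for a, b in zip(days, days[1:])
            if b.toordinal() - a.toordinal() == 1]

sorteddates = ['2017-04-26', '2017-05-05', '2017-05-10', '2017-05-11', '2017-05-16']
print(consecutive_pairs(sorteddates))  # [('2017-05-10', '2017-05-11')]
```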

closest date when looping over array. python [duplicate]

If I have an array of dates like the following:
array = [{'date': '09-Jul-2018'},
{'date': '09-Aug-2018'},
{'date': '09-Sep-2018'}]
and I have a date such as 17-Aug-2018, can anyone advise the best way to find the closest date that is always in the past?
I have tried the following, but to no avail:
closest_date
for i in range(len(array)):
    if(date > array[i].date and date < array[i + 1].date):
        closest_date = array[i]

Here is yet another approach:
from datetime import datetime
convert = lambda e: datetime.strptime(e, '%d-%b-%Y')
array = [{'date': '09-Jul-2018'},
         {'date': '09-Aug-2018'},
         {'date': '09-Sep-2018'}]
ref = convert("17-Aug-2018")
transform = ((convert(elem['date']), elem['date']) for elem in array)
_, closest_date = max((elem for elem in transform if (elem[0] - ref).days < 0), key = lambda e: e[0])
print(closest_date)
Output is
09-Aug-2018
Hope this helps.

My approach first converts your list of dicts into datetime objects, then sorts them by distance from the input date, pushing any date that is not in the past to the end.
from datetime import datetime
input_dt = datetime.strptime('17-Aug-2018', '%d-%b-%Y')
sorted(
    map(lambda date: datetime.strptime(date['date'], '%d-%b-%Y'), array),
    key=lambda dt: (input_dt - dt).total_seconds() if dt < input_dt else float("inf"),
)[0].strftime('%d-%b-%Y')

This is one approach.
Example:
import datetime
array = [{'date': '09-Jul-2018'},
         {'date': '09-Aug-2018'},
         {'date': '09-Sep-2018'}]
to_check = "17-Aug-2018"
to_check = datetime.datetime.strptime(to_check, "%d-%b-%Y")
closest_dates = []
for date in array:
    date_val = datetime.datetime.strptime(date["date"], "%d-%b-%Y")
    if date_val <= to_check:
        closest_dates.append({(to_check - date_val).days: date["date"]})
# list(...) is needed on Python 3, where dict.items() returns a view
print(min(closest_dates, key=lambda x: list(x.items())[0]))
Output:
{8: '09-Aug-2018'}

If the dates in your dictionaries are date objects, here is a way to do it:
from datetime import date
closest_date = min([x['date'] for x in array])
target = date(2018, 8, 17)  # renamed so it does not shadow the date class
for element in array:
    current_date = element['date']
    if current_date < target and current_date > closest_date:
        closest_date = current_date
# Output : datetime.date(2018, 8, 9)
If your dates are still strings, here is a way to convert them easily:
from datetime import datetime
array = [{'date': datetime.strptime(s['date'], '%d-%b-%Y').date()} for s in array]

I would advise you to use array operations in NumPy, which are often faster on large inputs :D. I would do it this way:
import numpy as np
import datetime
dates = np.array(list(map(lambda d: datetime.datetime.strptime(d["date"], "%d-%b-%Y"), array)))
differences = dates - datetime.datetime.strptime("17-Aug-2018", "%d-%b-%Y")
differences = np.vectorize(lambda d: d.days)(differences)
differences[differences >= 0] = -9e9
most_recent_date = dates[np.argmax(differences)]

How to rank tuple value in nested dictionary

I am trying to write a function that accepts a database and a year, calculates the ranking of the names based on their count, and then updates the database.
database = {('Spot','DOG'): {2013: (612, 1), 2014: (598, 3)},
('Princess','CAT'): {2013: (713, 2)},
('Smokey', 'CAT'): {2013: (523, 1), 2014: (514, 1)},
('Fido', 'DOG'): {2013: (558, 2), 2014: (655, 1)},
('Sparky','DOG'): {2104: (572, 2)}}
I have to rank the cats and dogs separately with the most popular name being rank 1, so descending order.
I'm only allowed to use basic expressions and statements, len, range, enumerate, int, get, items, keys, values, pop, copy, popitem, update, append, insert, extend, min, max, index, split, join, replace, sorted, sort, reversed, reverse, and sets. I can also write my own functions and use them.
I can't import modules, use lambda, or use anything besides what's on the allowed list. That is why I am stuck. This is my first programming class. I have no idea how to sort the tuple. I know how to sort a nested dictionary (there was a great post on here for that), but I am stuck at this tuple. Could you please help?
def rank_names_for_one_year(db, year):
    list = []
    list_2 = []
    for k in db:
        if 'MALE' in k:
            for key in db[k]:
                if year == key:
                    for n in db[k][year]:
                        list.append(n)
                        break
        if 'FEMALE' in k:
            for key in db[k]:
                if year == key:
                    for n in db[k][year]:
                        list_2.append(n)
                        break
    list.sort()
    new_list = [()]
    list_2.sort()
    new_list_2 = [()]
    for k in db:
        if 'MALE' in k:
            for i in range(len(list)):
                new_list.append((list[i],i+1))
            for key in db[k]:
                if year == key:
                    for n in db[k][year]:
                        for i in range(len(new_list)):
                            for j in range(len(new_list[i])):
                                if new_list[i][j] == n:
                                    db[k][year] = new_list[i]
        if 'FEMALE' in k:
            for i in range(len(list_2)):
                new_list_2.append((list_2[i],i+1))
            for key in db[k]:
                if year == key:
                    for n in db[k][year]:
                        for i in range(len(new_list_2)):
                            for j in range(len(new_list_2[i])):
                                if new_list_2[i][j] == n:
                                    db[k][year] = new_list_2[i]

database = {('TEST', 'DOG'): {2013: (612, 1), 2014: (598, 3)},
            ('Spot', 'DOG'): {2013: (612, 1), 2014: (598, 3)},
            ('Princess', 'CAT'): {2013: (713, 2)},
            ('Princess1', 'CAT'): {2013: (713, 2)},
            ('Smokey', 'CAT'): {2013: (523, 1), 2014: (514, 1)},
            ('Smokey1', 'CAT'): {2013: (523, 1), 2014: (514, 1)},
            ('Fido', 'DOG'): {2013: (558, 2), 2014: (655, 1)},
            ('Sparky', 'DOG'): {2104: (572, 2)}}
def rank_names_for_one_year(db, year):
    temp = {'DOG': [], 'CAT': []}
    for k, v in db.items():
        if year not in v:
            continue
        temp[k[1]].append((v[year][0], k[0]))
    for animal_type, v in temp.items():
        rank = 0
        countPrev = -1
        for i, (count, name) in enumerate(reversed(sorted(v))):
            if countPrev != count:
                rank = i + 1
                countPrev = count
            db[(name, animal_type)][year] = (count, rank)
rank_names_for_one_year(database, 2013)
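The answer above comes without an explanation, so here is a minimal sketch of just its tie handling, pulled out into a hypothetical helper named rank_counts. It assumes the counts are non-negative (the -1 sentinel relies on that) and uses only names from the allowed list: sorted, reversed, and enumerate.

```python
def rank_counts(counts):
    """Map each count to its rank; the highest count gets rank 1.

    Equal counts share a rank, and the rank after a tie skips ahead
    (for example 1, 1, 3, 4).
    """
    ranks = {}
    rank = 0
    previous = -1  # sentinel; assumes real counts are never negative
    for i, count in enumerate(reversed(sorted(counts))):
        if count != previous:
            rank = i + 1
            previous = count
        ranks[count] = rank
    return ranks

print(rank_counts([612, 558, 612, 572]))  # {612: 1, 572: 3, 558: 4}
```

The rank only advances when the count changes, which is why the two entries with 612 both end up at rank 1 and the next entry gets rank 3.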

How can I calculate the average of a list of tuples in python?

I have a list of tuples in the format:
[(security, price paid, number of shares purchased)....]
[('MSFT', '$39.458', '1,000'), ('AAPL', '$638.416', '200'), ('FOSL', '$52.033', '1,000'), ('OCZ', '$5.26', '34,480'), ('OCZ', '$5.1571', '5,300')]
I want to consolidate the data so that each security is listed only once.
[(Name of Security, Average Price Paid, Number of shares owned), ...]

I used a dictionary as the output.
lis=[('MSFT', '$39.458', '1,000'), ('AAPL', '$638.416', '200'), ('FOSL', '$52.033', '1,000'), ('OCZ', '$5.26', '34,480'), ('OCZ', '$5.1571', '5,300')]
dic={}
for x in lis:
    if x[0] not in dic:
        price=float(x[1].strip('$'))
        nos=int("".join(x[2].split(',')))
        #print(nos)
        dic[x[0]]=[price,nos]
    else:
        price=float(x[1].strip('$'))
        nos=int("".join(x[2].split(',')))
        dic[x[0]][1]+=nos
        dic[x[0]][0]=(dic[x[0]][0]+price)/2
print(dic)
Output:
{'AAPL': [638.416, 200], 'OCZ': [5.20855, 39780], 'FOSL': [52.033, 1000], 'MSFT': [39.458, 1000]}
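Note that averaging the running value with each new price treats every purchase equally, regardless of how many shares it covered, and it stops being a plain mean once a security appears three or more times. If "average price paid" is meant per share, a share-weighted version might look like the sketch below. That reading of the question is an assumption, and the function name consolidate is made up for illustration.

```python
from collections import defaultdict

def consolidate(trades):
    """Return (security, share-weighted average price, total shares) per security."""
    totals = defaultdict(lambda: [0.0, 0])  # security -> [total cost, total shares]
    for security, price, shares in trades:
        price = float(price.lstrip('$'))
        shares = int(shares.replace(',', ''))
        totals[security][0] += price * shares
        totals[security][1] += shares
    return [(name, round(cost / count, 4), count)
            for name, (cost, count) in totals.items()]

trades = [('MSFT', '$39.458', '1,000'), ('AAPL', '$638.416', '200'),
          ('FOSL', '$52.033', '1,000'), ('OCZ', '$5.26', '34,480'),
          ('OCZ', '$5.1571', '5,300')]
for row in consolidate(trades):
    print(row)
```

With this weighting OCZ comes out near $5.2463 per share rather than the simple mean of 5.20855, because most of the shares were bought at the higher price.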

It's not very clear what you're trying to do. Some example code would help, along with some information of what you've tried. Even if your approach is dead wrong, it'll give us a vague idea of what you're aiming for.
In the meantime, perhaps numpy's numpy.mean function is appropriate for your problem? I would suggest transforming your list of tuples into a numpy array and then applying the mean function on a slice of said array.
That said, it does work on any list-like data structure, and you can specify along which axis you would like to perform the average.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.mean.html
EDIT:
From what I've gathered, your list of tuples organizes data in the following manner:
(name, dollar amount, weight)
I'd start by using numpy to transform your list of tuples into an array. From there, find the unique values in the first column (the names):
import numpy as np
# 'tag' and 'tag2' are placeholder names; dtype=object keeps the numbers numeric
a = np.array([('tag', 23.00, 5), ('tag2', 25.00, 10)], dtype=object)
unique_tags = np.unique(a[:, 0])  # note the slicing of the array: first column
Now calculate the mean for each tag
meandic = {}
for element in unique_tags:
    rows = a[a[:, 0] == element]  # identify which rows are tagged with element
    meandic[element] = np.mean([t[1] * t[2] for t in rows])
Please note that this code is untested. I may have gotten small details wrong. If you can't figure something out, just leave a comment and I'll gladly correct my mistake. You'll have to remove '$' and convert strings to floats where necessary.

>>> lis
[('MSFT', '$39.458', '1,000'), ('AAPL', '$638.416', '200'), ('FOSL', '$52.033', '1,000'), ('OCZ', '$5.26', '34,480'), ('OCZ', '$5.1571', '5,300')]
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for i in lis:
...     amt = float(i[1].strip('$'))
...     num = int(i[2].replace(",", ""))
...     d[i[0]].append((amt, num))
...
>>> for i in d.items():
...     average_price = sum([s[0] for s in i[1]]) / len(i[1])
...     total_shares = sum([s[1] for s in i[1]])
...     print((i[0], average_price, total_shares))
...
('MSFT', 39.458, 1000)
('AAPL', 638.416, 200)
('FOSL', 52.033, 1000)
('OCZ', 5.20855, 39780)

Here you go:
the_list = [('msft', '$31', 5), ('msft', '$32', 10), ('aapl', '$100', 1)]
clean_list = map(lambda x: (x[0], float(x[1][1:]), int(x[2])), the_list)
out = {}
for name, price, shares in clean_list:
    if name not in out:
        out[name] = [price * shares, shares]  # total price paid, total shares
    else:
        out[name][0] += price * shares
        out[name][1] += shares
# put the output in the requested format
# not forgetting to calculate avg price paid
# out contains total # shares and total price paid
nice_out = [(name, "$%0.2f" % (out[name][0] / out[name][1]), out[name][1])
            for name in out.keys()]
print(nice_out)
>>> [('msft', '$31.67', 15), ('aapl', '$100.00', 1)]

Insert an item into sorted list in Python

I'm creating a class where one of the methods inserts a new item into the sorted list. The item is inserted in the correct (sorted) position in the sorted list. I'm not allowed to use any built-in list functions or methods other than [], [:], +, and len, though. This is the part that's really confusing to me.
What would be the best way in going about this?

Use the insort function of the bisect module:
import bisect
a = [1, 2, 4, 5]
bisect.insort(a, 3)
print(a)
Output
[1, 2, 3, 4, 5]

Hint 1: You might want to study the Python code in the bisect module.
Hint 2: Slicing can be used for list insertion:
>>> s = ['a', 'b', 'd', 'e']
>>> s[2:2] = ['c']
>>> s
['a', 'b', 'c', 'd', 'e']

You should use the bisect module. Note that the list needs to be sorted before you call bisect.insort_left.
Compared with appending and re-sorting, the timing difference is pretty big.
>>> import bisect
>>> import timeit
>>> l = [0, 2, 4, 5, 9]
>>> bisect.insort_left(l,8)
>>> l
[0, 2, 4, 5, 8, 9]
>>> timeit.timeit("l.append(8); l = sorted(l)",setup="l = [4,2,0,9,5]; import bisect; l = sorted(l)",number=10000)
1.2235019207000732
>>> timeit.timeit("bisect.insort_left(l,8)",setup="l = [4,2,0,9,5]; import bisect; l=sorted(l)",number=10000)
0.041441917419433594

I'm learning algorithms right now, so I wondered how the bisect module is written.
Here is the code from the bisect module for inserting an item into a sorted list, which uses binary search (dichotomy):
def insort_right(a, x, lo=0, hi=None):
    """Insert item x in list a, and keep it sorted assuming a is sorted.
    If x is already in a, insert it to the right of the rightmost x.
    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid+1
    a.insert(lo, x)
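The question only allows [], [:], +, and len, so a.insert() is off the table. As a rough sketch, the same binary search can be combined with slicing to build a new list instead. The function name insert_sorted is made up for this example, and it returns a new list rather than modifying the original in place.

```python
def insert_sorted(items, x):
    """Return a new sorted list with x inserted, using only [], [:], + and len."""
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < items[mid]:
            hi = mid
        else:
            lo = mid + 1
    # lo is now the insertion point: everything before it is <= x
    return items[:lo] + [x] + items[lo:]

print(insert_sorted([1, 2, 4, 5], 3))  # [1, 2, 3, 4, 5]
print(insert_sorted([], 7))            # [7]
print(insert_sorted([1, 2], 9))        # [1, 2, 9]
```

The search itself takes O(log n) comparisons, but building the new list with slices still copies every element, so the whole insert remains O(n).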

If there are no artificial restrictions, bisect.insort() should be used as described by stanga. However, as Velda mentioned in a comment, most real-world problems go beyond sorting pure numbers.
Fortunately, as commented by drakenation, the solution applies to any comparable objects. For example, bisect.insort() also works with a custom dataclass that implements __lt__():
from bisect import insort
from dataclasses import dataclass
@dataclass
class Person:
    first_name: str
    last_name: str
    age: int
    def __lt__(self, other):
        return self.age < other.age
persons = []
insort(persons, Person('John', 'Doe', 30))
insort(persons, Person('Jane', 'Doe', 28))
insort(persons, Person('Santa', 'Claus', 1750))
# [Person(first_name='Jane', last_name='Doe', age=28), Person(first_name='John', last_name='Doe', age=30), Person(first_name='Santa', last_name='Claus', age=1750)]
However, in the case of tuples, it would be desirable to sort by an arbitrary key. By default, tuples are sorted by their first item (first name), then by the next item (last name), and so on.
As a solution you can manage an additional list of keys:
from bisect import bisect
persons = []
ages = []
def insert_person(person):
    age = person[2]
    i = bisect(ages, age)
    persons.insert(i, person)
    ages.insert(i, age)
insert_person(('John', 'Doe', 30))
insert_person(('Jane', 'Doe', 28))
insert_person(('Santa', 'Claus', 1750))
Official solution: The documentation of bisect.insort() refers to a recipe that shows how to use the function to implement this functionality in a custom class SortedCollection, so that it can be used as follows:
>>> s = SortedCollection(key=itemgetter(2))
>>> for record in [
... ('roger', 'young', 30),
... ('angela', 'jones', 28),
... ('bill', 'smith', 22),
... ('david', 'thomas', 32)]:
... s.insert(record)
>>> pprint(list(s)) # show records sorted by age
[('bill', 'smith', 22),
('angela', 'jones', 28),
('roger', 'young', 30),
('david', 'thomas', 32)]
Following is the relevant extract of the class required to make the example work. Basically, the SortedCollection manages an additional list of keys in parallel to the items list to find out where to insert the new tuple (and its key).
from bisect import bisect_left
class SortedCollection(object):
    def __init__(self, iterable=(), key=None):
        self._given_key = key
        key = (lambda x: x) if key is None else key
        decorated = sorted((key(item), item) for item in iterable)
        self._keys = [k for k, item in decorated]
        self._items = [item for k, item in decorated]
        self._key = key
    def __getitem__(self, i):
        return self._items[i]
    def __iter__(self):
        return iter(self._items)
    def insert(self, item):
        'Insert a new item. If equal keys are found, add to the left'
        k = self._key(item)
        i = bisect_left(self._keys, k)
        self._keys.insert(i, k)
        self._items.insert(i, item)
Note that list.insert() as well as bisect.insort() have O(n) complexity. Thus, as commented by nz_21, manually iterating through the sorted list, looking for the right position, would be just as good in terms of complexity. In fact, simply sorting the array after inserting a new value will probably be fine, too, since Python's Timsort has a worst-case complexity of O(n log(n)) and runs close to O(n) on input that is already almost sorted. For completeness, however, note that a balanced binary search tree (BST) would allow insertions in O(log(n)) time.

This is a possible solution for you:
a = [15, 12, 10]
b = sorted(a)
print(b)  # --> b = [10, 12, 15]
c = 13
for i in range(len(b)):
    if b[i] > c:
        break
else:
    i = len(b)  # c is larger than every element, so it goes at the end
d = b[:i] + [c] + b[i:]
print(d)  # --> d = [10, 12, 13, 15]

# function to insert a number in a sorted list
def pstatement(value_returned):
    return print('new sorted list =', value_returned)
def insert(input, n):
    print('input list = ', input)
    print('number to insert = ', n)
    print('range to iterate is =', len(input))
    first = input[0]
    print('first element =', first)
    last = input[-1]
    print('last element =', last)
    if first > n:
        list = [n] + input[:]
        return pstatement(list)
    elif last < n:
        list = input[:] + [n]
        return pstatement(list)
    else:
        for i in range(len(input)):
            if input[i] > n:
                break
        list = input[:i] + [n] + input[i:]
        return pstatement(list)
# Input values
listq = [2, 4, 5]
n = 1
insert(listq, n)

Well, there are many ways to do this. Here is a simple, naive program that does it using the built-in Python function sorted():
def sorted_inserter():
    list_in = []
    n1 = int(input("How many items in the list : "))
    for i in range (n1):
        e1 = int(input("Enter numbers in list : "))
        list_in.append(e1)
    print("The input list is : ",list_in)
    print("Any more items to be inserted ?")
    n2 = int(input("How many more numbers to be added ? : "))
    for j in range (n2):
        e2= int(input("Add more numbers : "))
        list_in.append(e2)
    list_sorted=sorted(list_in)
    print("The sorted list is: ",list_sorted)
sorted_inserter()
The output is
How many items in the list : 4
Enter numbers in list : 1
Enter numbers in list : 2
Enter numbers in list : 123
Enter numbers in list : 523
The input list is : [1, 2, 123, 523]
Any more items to be inserted ?
How many more numbers to be added ? : 1
Add more numbers : 9
The sorted list is: [1, 2, 9, 123, 523]

To add to the existing answers: when you want to insert an element into a list of tuples where the first element is comparable and the second is not, you can use the key parameter of the bisect.insort function (available since Python 3.10) as follows:
import bisect
class B:
    pass
a = [(1, B()), (2, B()), (3, B())]
bisect.insort(a, (3, B()), key=lambda x: x[0])
print(a)
Without the key function, the code would throw a TypeError, because bisect.insort would try to compare the second elements of the tuples as a tie breaker, and those aren't comparable by default.

This is one way to build the list and insert a value into the sorted list:
a = []
num = int(input('How many numbers: '))
for n in range(num):
    numbers = int(input('Enter values:'))
    a.append(numbers)
b = sorted(a)
print(b)
c = int(input("enter value:"))
for i in range(len(b)):
    if b[i] > c:
        break
else:
    i = len(b)  # c is larger than every element, so it goes at the end
d = b[:i] + [c] + b[i:]
print(d)
