Changing Values in a Histogram Python - python

I am trying to write a function 'add_to_hist' that takes a character and a histogram and adds an occurrence of that character to the histogram. If the character doesn't already have an entry, it should create one. For example:
>>>hist = [['B',1],['a',3],['n',2],['!',1]]
>>>add_to_hist('a',hist)
>>>hist
Should return: [['B', 1], ['a', 4], ['n', 2], ['!', 1]]
Here is what I have so far:
def add_to_hist(x,hist):
if x in hist:
hist['a'] = hist['a'] + 1
return hist
else: hist.append(x)
return (hist)

You chose to represent your histogram as a list of 2-element lists; however, in the else branch of your add_to_hist function you append the item itself. You should append [x, 1] instead. For the same reason, you cannot check if x in hist, because x is an item (a str), but the elements of hist are lists.
There are also other errors in your function (incorrect indentation; use of 'a' instead of x).
In case using a list of lists is not a requirement, there are better ways to do this, for example using a dict instead of a list, a defaultdict(int), or the collections.Counter class.
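As a rough sketch of the dict-based alternatives mentioned above, collections.Counter takes care of the "create the entry if it is missing" case on its own. The helper name add_to_hist_dict and the sample string are made up for illustration:

```python
from collections import Counter

def add_to_hist_dict(x, hist):
    """Hypothetical helper: add one occurrence of x to a Counter-based histogram."""
    hist[x] += 1  # a Counter returns 0 for missing keys, so no membership test is needed
    return hist

hist = Counter("Banana!")    # counts: B=1, a=3, n=2, !=1
add_to_hist_dict('a', hist)  # existing entry is incremented
add_to_hist_dict('z', hist)  # new entry is created
print(hist['a'], hist['z'])  # 4 1
```

A defaultdict(int) would behave the same way for this particular use.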
It looks like your code has been written with dictionaries in mind:
hist = {'B': 1, 'a': 4, 'n': 2, '!': 1}
def add_to_hist(x, hist):
    if x in hist:
        hist[x] = hist[x] + 1
    else:
        hist[x] = 1
    return hist

It'd be easier to represent your histogram as a dictionary, as then you could directly access elements that match x.
However, as is (assuming you're forced to use lists), here's how you'd solve this problem:
def add_to_hist(x, hist):
    for i in range(len(hist)):
        if x == hist[i][0]:
            hist[i][1] = hist[i][1] + 1
            return hist
    hist.append([x, 1])
    return hist
Since hist is a list of lists, you cannot directly do "x in hist": that compares x with each element of hist (each of which is a list), so it will never match. You have to loop through hist, take each element, and compare x with its first item. If you find a match, add one to the second item of that inner list and return.
Now, if you run through the entire for loop without finding a matching element, you know it doesn't exist, so you can append a new value to the hist.
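To see the search-then-append logic end to end, here is a small sketch that builds the histogram from the question one character at a time. It iterates over the entries directly instead of using range(len(hist)), which is a stylistic choice and not part of the answer above; the input string "Banana!" is an assumption that happens to reproduce the question's data:

```python
def add_to_hist(x, hist):
    # Look for an existing [char, count] entry and bump its count.
    for entry in hist:
        if entry[0] == x:
            entry[1] += 1
            return hist
    # No entry matched, so start a new one with a count of 1.
    hist.append([x, 1])
    return hist

hist = []
for ch in "Banana!":
    add_to_hist(ch, hist)
print(hist)  # [['B', 1], ['a', 3], ['n', 2], ['!', 1]]
```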

Related

python list comprehension with cls

I encountered a snippet of code like the following:
array = ['a', 'b', 'c']
ids = [array.index(cls.lower()) for cls in array]
I'm confused about two points:
What does [... for cls in array] mean? Since cls is a reserved keyword for class, why not just use [... for s in array]?
Why bother writing something this complicated instead of just [i for i in range(len(array))]?
I believe this code is written by someone more experienced with python than me, and I believe he must have some reason for doing so...
cls is not a reserved word for class. That would be a very poor choice of name by the language designer. Many programmers may use it by convention but it is no more reserved than the parameter name self.
If you use distinct upper and lower case characters in the list, you will see the difference:
array = ['a', 'b', 'c', 'B','A','c']
ids = [array.index(cls.lower()) for cls in array]
print(ids)
[0, 1, 2, 1, 0, 2]
The value at position 3 is 1 instead of 3 because the first occurrence of the lowercase 'b' is at index 1. Similarly, the value at the last position is 2 instead of 5 because the first 'c' is at index 2.
This list comprehension requires that the array always contain a lowercase instance of every uppercase letter. For example ['a', 'B', 'c'] would make it crash. Hopefully there are other safeguards in the rest of the program to ensure that this requirement is always met.
A safer, and more efficient way to write this would be to build a dictionary of character positions before going through the array to get indexes. This would make the time complexity O(n) instead of O(n^2). It could also help make the process more robust.
array = ['a', 'b', 'c', 'B','A','c','Z']
firstchar = {c:-i for i,c in enumerate(array[::-1],1-len(array))}
ids = [firstchar.get(c.lower()) for c in array]
print(ids)
[0, 1, 2, 1, 0, 2, None]
The firstchar dictionary contains the first index in array containing a given letter. It is built by going backward through the array so that the smallest index remains when there are multiple occurrences of the same letter.
{'Z': 6, 'c': 2, 'A': 4, 'B': 3, 'b': 1, 'a': 0}
Then, going through the array to form ids, each character finds the corresponding index in O(1) time by using the dictionary.
Using the .get() method allows the list comprehension to survive an upper case letter without a corresponding lowercase value in the list. In this example it returns None but it could also be made to return the letter's index or the index of the first uppercase instance.
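One possible way to get the "first uppercase instance" fallback described above is to pass a default to .get(). This is only a sketch: the dictionary is built here with setdefault in a forward pass (a variation, not the reversed enumerate used in the answer), and the name first_index is made up:

```python
array = ['a', 'b', 'c', 'B', 'A', 'c', 'Z']

# Map each character to the first index where it appears;
# setdefault keeps the earliest position and ignores later repeats.
first_index = {}
for i, c in enumerate(array):
    first_index.setdefault(c, i)

# Prefer the lowercase letter's position; if there is none,
# fall back to the first position of the character itself.
ids = [first_index.get(c.lower(), first_index[c]) for c in array]
print(ids)  # [0, 1, 2, 1, 0, 2, 6]
```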
Some developers might be experienced, but actually terrible with the code they write and just "skate on by".
Having said that, your suggested code for question #2 would give a different result if the list contained any element twice. The original code returns the index of the first occurrence of each element, whereas yours gives each individual item's own index. The results would also differ if the array elements weren't all lowercase.
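A tiny example (with a made-up list containing a duplicate) shows where the two versions diverge:

```python
array = ['a', 'b', 'a']

# index() always reports the first occurrence, so the repeated 'a' maps to 0 again.
by_index = [array.index(s.lower()) for s in array]

# range(len(...)) simply numbers the positions.
by_range = [i for i in range(len(array))]

print(by_index)  # [0, 1, 0]
print(by_range)  # [0, 1, 2]
```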

Given an array of ints length 3, figure out which is larger, the first or last element in the array, and set all the other elements to be that value

Given an array of ints length 3, figure out which is larger, the first or last element in the array, and set all the other elements to be that value. Return the changed array
def max_end3(nums):
if nums[0:] > nums[:-1]:
# this is where I am lost
return print(nums)
elif nums[:-1] > nums[0:]:
# this is where I am lost
return print(nums)
max_end3([1, 2, 3])
max_end3([11, 5, 9])
max_end3([2, 11, 3])
I am missing something and I can't seem to remember it. There is a +1 somewhere that allows me to iterate through each element and modify them as it goes. I cannot remember how to assign the larger value to every item in the list.
Here is a link to the problem https://codingbat.com/prob/p135290
I appreciate any assistance.
Thank You.
You can find the maximum value using the built-in max and form a list of the maximum values:
def max_end3(nums):
    m = max(nums[0], nums[-1])
    print([m for _ in nums])
You could also use a list comprehension to replace all the values, something like
nums = [max(nums[0], nums[-1]) for _ in nums]
Also, the return statements in your code aren't indented.
Check this out
def max_end3(nums):
    return 3 * [max(nums[0], nums[-1])]
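The answers above all build a new list. If the goal is to modify the list that was passed in, as the question's wording ("set all the other elements") suggests, a loop over the indices is one way to do it. This is a sketch; the name max_end3_in_place is made up to keep it apart from the versions above:

```python
def max_end3_in_place(nums):
    # Pick the larger of the first and last elements.
    m = max(nums[0], nums[-1])
    # Overwrite every position of the same list object.
    for i in range(len(nums)):
        nums[i] = m
    return nums

data = [11, 5, 9]
result = max_end3_in_place(data)
print(data)  # [11, 11, 11]
```

Because the list is changed in place, data and result refer to the same object.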

Finding first time value occurs in an array when you don't know what it is

I have a very long array (over 2 million values) with repeating value. It looks something like this:
array = [1,1,1,1,......,2,2,2.....3,3,3.....]
With a bunch of different values. I want to create individual arrays for each group of points. IE: an array for the ones, an array for the twos, and so forth. So something that would look like:
array1 = [1,1,1,1...]
array2 = [2,2,2,2.....]
array3 = [3,3,3,3....]
.
.
.
.
None of the values occur an equal amount of time however, and I don't know how many times each value occurs. Any advice?
Assuming that repeated values are grouped together (otherwise you simply need to sort the list), you can create a nested list (rather than a new list for every different value) using itertools.groupby:
from itertools import groupby
array = [1,1,1,1,2,2,2,3,3]
[list(v) for k,v in groupby(array)]
[[1, 1, 1, 1], [2, 2, 2], [3, 3]]
Note that this is more convenient than creating n new lists dynamically (as shown for instance in this post): you have no idea how many lists will be created, and you would have to refer to each list by its name rather than by simply indexing a nested list.
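If looking groups up by their value is more useful than indexing by position, the same groupby call can feed a dictionary. This is a sketch with made-up, unsorted sample data; sorting first handles the case where equal values are not already adjacent, at the cost of an O(n log n) sort:

```python
from itertools import groupby

array = [3, 1, 2, 1, 3, 3, 2, 1]  # made-up sample, deliberately unsorted

# groupby only merges adjacent equal items, hence the sort.
groups = {key: list(run) for key, run in groupby(sorted(array))}

print(groups)     # {1: [1, 1, 1], 2: [2, 2], 3: [3, 3, 3]}
print(groups[3])  # [3, 3, 3]
```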
You can use bisect.bisect_left to find the indices of the first occurrence of each element. This works only if the list is sorted:
from bisect import bisect_left
def count_values(l, values=None):
    if values is None:
        values = range(1, l[-1]+1)  # Default assume list is [1..n]
    counts = {}
    consumed = 0
    val_iter = iter(values)
    curr_value = next(val_iter)
    next_value = next(val_iter)
    while True:
        ind = bisect_left(l, next_value, consumed)
        counts[curr_value] = ind - consumed
        consumed = ind
        try:
            curr_value, next_value = next_value, next(val_iter)
        except StopIteration:
            break
    counts[next_value] = len(l) - consumed
    return counts
l = [1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3]
print(count_values(l))
# {1: 9, 2: 8, 3: 7}
This avoids scanning the entire list, trading that for a binary search for each value. Expect this to be more performant where there are very many of each element, and less performant where there are few of each element.
Well, it seems to be wasteful and redundant to create all those arrays, each of which just stores repeating values.
You might want to just create a dictionary of unique values and their respective counts.
From this dictionary, you can always selectively create any of the individual arrays easily, whenever you want, and whichever particular one you want.
To create such a dictionary, you can use:
from collections import Counter
my_counts_dict = Counter(my_array)
Once you have this dict, you can get the number of 23's, for example, with my_counts_dict[23].
And if this returns 200, you can create your list of 200 23's with:
my_list23 = [23]*200
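Put together, the Counter approach might look like the sketch below. The sample data and the rebuild helper are made up for illustration:

```python
from collections import Counter

my_array = [1, 1, 1, 1, 2, 2, 2, 3, 3]  # made-up sample data
my_counts_dict = Counter(my_array)

def rebuild(value):
    """Hypothetical helper: recreate the list of repeated values on demand."""
    # A Counter returns 0 for values it never saw, which yields an empty list.
    return [value] * my_counts_dict[value]

print(my_counts_dict[2])  # 3
print(rebuild(2))         # [2, 2, 2]
print(rebuild(23))        # []
```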
Use this code (PHP):
<?php
$arrayName = array(2,2,5,1,1,1,2,3,3,3,4,5,4,5,4,6,6,6,7,8,9,7,8,9,7,8,9);
$arr = array();
foreach ($arrayName as $value) {
    $arr[$value][] = $value;
}
sort($arr);
print_r($arr);
?>
Solution with no helper functions:
array = [1,1,2,2,2,3,4]
result = [[array[0]]]
for i in array[1:]:
    if i == result[-1][-1]:
        result[-1].append(i)
    else:
        result.append([i])
print(result)
# [[1, 1], [2, 2, 2], [3], [4]]

Optimize search to find next matching value in a list

I have a program that goes through a list and, for each object, finds the next instance that has a matching value. When it does, it prints out the location of both objects. The program runs perfectly fine, but when I run it with a large volume of data (~6,000,000 objects in the list) it takes much too long. If anyone could provide insight into how I can make the process more efficient, I would greatly appreciate it.
def search(list):
    original = list
    matchedvalues = []
    count = 0
    for x in original:
        targetValue = x.getValue()
        count = count + 1
        copy = original[count:]
        for y in copy:
            if (targetValue == y.getValue()):
                print(str(x.getLocation()) + "," + str(y.getLocation()))
                break
Perhaps you can make a dictionary that contains a list of indexes that correspond to each item, something like this:
values = [1,2,3,1,2,3,4]
from collections import defaultdict
def get_matches(x):
    my_dict = defaultdict(list)
    for ind, ele in enumerate(x):
        my_dict[ele].append(ind)
    return my_dict
Result:
>>> get_matches(values)
defaultdict(<type 'list'>, {1: [0, 3], 2: [1, 4], 3: [2, 5], 4: [6]})
Edit:
I added this part, in case it helps:
values = [1,1,1,1,2,2,3,4,5,3]
def get_next_item_ind(x, ind):
    my_dict = get_matches(x)
    indexes = my_dict[x[ind]]
    temp_ind = indexes.index(ind)
    if len(indexes) > temp_ind + 1:
        return indexes[temp_ind + 1]
    return None
Result:
>>> get_next_item_ind(values, 0)
1
>>> get_next_item_ind(values, 1)
2
>>> get_next_item_ind(values, 2)
3
>>> get_next_item_ind(values, 3)
>>> get_next_item_ind(values, 4)
5
>>> get_next_item_ind(values, 5)
>>> get_next_item_ind(values, 6)
9
>>> get_next_item_ind(values, 7)
>>> get_next_item_ind(values, 8)
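Note that get_next_item_ind rebuilds the dictionary on every call. For a list with millions of entries, one alternative is to compute the next matching index for every position in a single backward pass. This is a sketch on plain values, not the approach from the answer above; the function name next_match_indices is made up, and for the objects in the question you would compare x.getValue() instead of the raw items:

```python
def next_match_indices(values):
    """For each position, return the index of the next equal value, or None."""
    next_idx = [None] * len(values)
    last_seen = {}  # value -> nearest index to the right seen so far
    # Walk from the end so last_seen always holds the closest later match.
    for i in range(len(values) - 1, -1, -1):
        v = values[i]
        next_idx[i] = last_seen.get(v)
        last_seen[v] = i
    return next_idx

values = [1, 1, 1, 1, 2, 2, 3, 4, 5, 3]
matches = next_match_indices(values)
print(matches)  # [1, 2, 3, None, 5, None, 9, None, None, None]
```

This does one dictionary lookup and one store per element, so the running time grows linearly with the length of the list.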
There are a few ways you could make this search more efficient by minimising additional memory use (particularly when your data is BIG):
You can operate directly on the list you are passing in, so you don't need original = list or copy = original[count:].
You can test against slices of the original list and use enumerate(p) to iterate. You won't need the extra variable count, and enumerate(p) is efficient in Python.
Re-implemented, this would become:
def search(p):
    # iterate over p
    for i, value in enumerate(p):
        # if value occurs more than once, print locations
        # do not re-test values that have already been tested (if value not in p[:i])
        if value not in p[:i] and value in p[(i + 1):]:
            # the last number printed is the offset of the next match within p[(i + 1):]
            print(value, ':', i, p[(i + 1):].index(value))
v = [1,2,3,1,2,3,4]
search(v)
1 : 0 2
2 : 1 2
3 : 2 2
Implementing it this way will only print out the values / locations where a value is repeated (which I think is what you intended in your original implementation).
Other considerations:
More than 2 occurrences of a value: if the value repeats many times in the list, you might want to implement a function to walk recursively through the list. As it is, the question doesn't address this, and it may be that it doesn't need to in your situation.
Using a dictionary: I completely agree with Akavall above; dictionaries are a great way of looking up values in Python, especially if you need to look up values again later in the program. This will work best if you construct a dictionary instead of a list when you originally create the data. But if you are only doing this once, it is going to cost you more time to construct the dictionary and query it than simply iterating over the list as described above.
Hope this helps!

Python Values in Lists

I am using Python 3.0 to write a program. In this program I deal a lot with lists which I haven't used very much in Python.
I am trying to write several if statements about these lists, and I would like to know how to look at just a specific value in the list. I also would like to be informed of how one would find the placement of a value in the list and input that in an if statement.
Here is some code to better explain that:
count = list.count(1)
if count > 1
(This is where I would like to have it look at where the 1 is that the count is finding)
Thank You!
Check out the documentation on sequence types and list methods.
To look at a specific element in the list you use its index:
>>> x = [4, 2, 1, 0, 1, 2]
>>> x[3]
0
To find the index of a specific value, use list.index():
>>> x.index(1)
2
Some more information about exactly what you are trying to do would be helpful, but it might be helpful to use a list comprehension to get the indices of all elements you are interested in, for example:
>>> [i for i, v in enumerate(x) if v == 1]
[2, 4]
You could then do something like this:
ones = [i for i, v in enumerate(your_list) if v == 1]
if len(ones) > 1:
    # each element in ones is an index in your_list where the value is 1
    pass
Also, naming a variable list is a bad idea because it conflicts with the built-in list type.
edit: In your example you use your_list.count(1) > 1, this will only be true if there are two or more occurrences of 1 in the list. If you just want to see if 1 is in the list you should use 1 in your_list instead of using list.count().
You can use list.index() to find elements in the list besides the first one, but you would need to take a slice of the list starting from one element after the previous match, for example:
your_list = [4, 2, 1, 0, 1, 2]
i = -1
while True:
    try:
        i = your_list[i+1:].index(1) + i + 1
        print("Found 1 at index", i)
    except ValueError:
        break
This should give the following output:
Found 1 at index 2
Found 1 at index 4
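list.index() also accepts an optional start position, which avoids creating a new slice on every iteration. A sketch of the same loop using that argument (the positions list is added here only to collect the results):

```python
your_list = [4, 2, 1, 0, 1, 2]

positions = []
start = 0
while True:
    try:
        # Search from `start` onwards; the returned index is
        # relative to the whole list, not to the start position.
        i = your_list.index(1, start)
    except ValueError:
        break  # no further occurrences
    positions.append(i)
    start = i + 1

print(positions)  # [2, 4]
```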
First off, I would strongly suggest reading through a beginner’s tutorial on lists and other data structures in Python: I would recommend starting with Chapter 3 of Dive Into Python, which goes through the native data structures in a good amount of detail.
To find the position of an item in a list, you have two main options, both using the index method. First off, checking beforehand:
numbers = [2, 3, 17, 1, 42]
if 1 in numbers:
    index = numbers.index(1)
    # Do something interesting
Your other option is to catch the ValueError thrown by index:
numbers = [2, 3, 17, 1, 42]
try:
    index = numbers.index(1)
except ValueError:
    # The number isn't here
    pass
else:
    # Do something interesting
    pass
One word of caution: avoid naming your lists list: quite aside from not being very informative, it’ll shadow Python’s native definition of list as a type, and probably cause you some very painful headaches later on.
You can find the index of the element like this:
idx = lst.index(1)
And then access the element like this:
e = lst[idx]
If what you want is the next element:
n = lst[idx+1]
Now, you have to be careful - what happens if the element is not in the list? a way to handle that case would be:
try:
    idx = lst.index(1)
    n = lst[idx+1]
except ValueError:
    # do something if the element is not in the list
    pass
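One more case worth guarding against: if the element is the last item in the list, lst[idx+1] raises IndexError, not ValueError. A sketch that handles both cases (the helper name next_after and the choice to return None are assumptions for illustration):

```python
def next_after(lst, target):
    """Hypothetical helper: return the item following the first `target`, or None."""
    try:
        idx = lst.index(target)
        return lst[idx + 1]
    except ValueError:
        return None  # target is not in the list
    except IndexError:
        return None  # target is the last item, so nothing follows it

print(next_after([4, 2, 1, 0], 1))  # 0
print(next_after([4, 2, 1], 1))     # None
print(next_after([4, 2], 1))        # None
```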
list.index(x)
Return the index in the list of the first item whose value is x. It is an error if there is no such item.
--
In the docs you can find some more useful functions on lists: http://docs.python.org/tutorial/datastructures.html#more-on-lists
--
Added suggestion after your comment: Perhaps this is more helpful:
for idx, value in enumerate(your_list):
    # `idx` will contain the index of the item and `value` will contain the value at index `idx`
    pass
