Python: Dictionary to Spare Vector - python

I am new to Python and programming in general. I was working on Pyschool exercises Topic 8, Q 11 on converting Dictionary to Spare Vectore.
I was asked to Write a function that converts a dictionary back to its sparese vector representation.
Examples
>>> convertDictionary({0: 1, 3: 2, 7: 3, 12: 4})
[1, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, 0, 4]
>>> convertDictionary({0: 1, 2: 1, 4: 2, 6: 1, 9: 1})
[1, 0, 1, 0, 2, 0, 1, 0, 0, 1]
>>> convertDictionary({})
[]
I have attempted many times. Below is the latest code I have:
def convertDictionary(dictionary):
k=dictionary.keys()
v=dictionary.values()
result=[]
for i in range(0,max(k)):
result.append(0)
for j in k:
result[j]=v[k.index(j)]
return result
The returned error is:
Traceback (most recent call last):
File "Code", line 8, in convertDictionary
IndexError: list assignment index out of range
Could anyone help me? Thank you so much!

Something like this should suffice:
M = max(dictionary, default=0)
vector = [dictionary.get(i, 0) for i in range(M)]
Translated into a plain old for-loop
M = max(dictionary, default=0)
vector = []
for i in range(M):
vector.append(dictionary.get(i, 0))
The get method lets you provide a default as a second argument in case the key is missing. Once you get more advance you could use a defaultdict
Edit: the default parameter for max requires Python >3.4 . You can either use exception handling (generally prefered) or explicit checks for empty dictionary to deal with that case if you have earlier versions.

Your code works logically fine, but you have a problem of indentation. Your function should be:
def convertDictionary(dictionary):
k=dictionary.keys()
v=dictionary.values()
result=[]
for i in range(0,max(k)):
result.append(0)
for j in k:
result[j]=v[k.index(j)]
return result
The problem is that your second for was inside of the first. What you want is to build a list with max(k) elements and then put the right values into it. Then, the two for loops should be one after the other, rather than one inside of the other.

Related

lists and for loop syntax

so I'm a beginner programmer and I'm trying to build a python program to print the Fibonacci sequence.
my code is as follows:
fib_sequence = [0,1,1]
def fib_add(x):
fib_seq.insert(x, int(fibseq[x-1]+fibseq[x-2])
for n in range(2,10):
fib_add(n)
print(fib_seq)
the program says there is a syntax error at the colon on
for n in range(2,10):
I don't know how to correct it
Interestingly, that is not where the syntax error is. It is the preceding line that is the problem:
fib_seq.insert(x, int(fibseq[x-1]+fibseq[x-2]))
This line was missing closing parentheses. What happens in such cases is that, since the parentheses were not closed, the Python interpreter continues looking for more stuff to put in the expression. It hits the for in the next line and continues all the way to right before the colon. At this point, there is a way to continue the code which is still valid.
Then, it hits the colon. There is no valid Python syntax which allows a colon there, so it stops and raises an error at the first token which is objectively in the wrong place. In terms of your intention, however, we can see that the mistake was actually made earlier.
Also, as noted in a comment, your original list was named fib_sequence, while in the rest of your code you reference fib_list. This will raise a NameError.
You have to place your for loop code inside main. Also as the other answer suggests, you must add another parenthesis after
fib_seq.insert(x, int(fibseq[x-1]+fibseq[x-2]))
if __name__ == '__main__':
for n in range(2,10):
fib_add(n)
print(fib_seq)
While you have some useful answers, you might look into generators, they make Python a powerful language:
def fibonacci():
x, y = 0, 1
while True:
yield x
x, y = y, x + y
for x in fibonacci():
if x >= 10:
break
print(x)
This prints
0
1
1
2
3
5
8
Here is the corrected code:
fib_seq = [0,1,1]
def fib_add(x):
fib_seq.insert(x, int(fib_seq[x-1]+fib_seq[x-2]))
for n in range(3,10):
fib_add(n)
print(fib_seq)
Resulting Output:
[0, 1, 1, 2]
[0, 1, 1, 2, 3]
[0, 1, 1, 2, 3, 5]
[0, 1, 1, 2, 3, 5, 8]
[0, 1, 1, 2, 3, 5, 8, 13]
[0, 1, 1, 2, 3, 5, 8, 13, 21]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

Python Dictonary KeyError when key exist

I'm new to python, and I'm doing a homework with it.
I'm using PyEDA with Python3 on Debian 8.
I created a variable named:
R = exprvars('r', n, n)
with n = 4, this gives me:
farray([[r[0,0], r[0,1], r[0,2], r[0,3]],
[r[1,0], r[1,1], r[1,2], r[1,3]],
[r[2,0], r[2,1], r[2,2], r[2,3]],
[r[3,0], r[3,1], r[3,2], r[3,3]]])
Then, after some logic, I create an CNF boolean function f and a BDD with it using:
f = expr2bdd(f)
Then, the expression:
U = f.satisfy_one()
gives me:
{r[2,1]: 0, r[3,2]: 1, r[1,1]: 1, r[0,2]: 0, r[0,3]: 1, r[2,2]: 0, r[2,3]: 0, r[3,3]: 0, r[3,1]: 0, r[1,2]: 0, r[0,1]: 0, r[1,0]: 0, r[2,0]: 1, r[3,0]: 0, r[0,0]: 0, r[1,3]: 0}
But here is what I can't understand:
I was expecting
U[R[0,0]]
To return a 0, but instead it gives me
KeyError: r[0,0]
What is the problem? R[0,0] gives me r[0,0] and the dictionary do have it as a key.
[edit]
When I said R[0,0] gives me r[0,0], this means that I printed it using pdb, placing a breakpoint right after U = f.satisfy_one():
(Pdb) p R[0,0]
r[0,0]
This needs more details.
How do you say, R[0,0] gives me r[0,0]? Did you print them?
Try to verify if you are looking for same
assert list(U.keys())[0] == R[2,1]
Look the the values
print(list(type(U.keys())[0])
print(type(R[2,1])
See if they match, only if they match, will you be able to gather it.
Also, check if U itself has any methods to do the query for you.

NetworkX largest component no longer working?

according to networkx documentation, connected_component_subgraphs(G) returns a sorted list of all components. Thus the very first one should be the largest component.
However, when I try to get the largest component of a graph G using the example code on the documentation page
G=nx.path_graph(4)
G.add_edge(5,6)
H=nx.connected_component_subgraphs(G)[0]
I get
TypeError: 'generator' object has no attribute '__getitem__'
It used to work on my other computer with earlier versions of networkx (1.7 I think, not 100% sure)
Now I am using a different computer with python 2.7.7 and networkx 1.9. Is it a version problem?
I have written a small function with a couple lines myself to find the largest component, just wondering why this error came out.
BTW, I can get the components by converting the generator object to a list.
components = [comp for comp in nx.connected_components(G)]
But the list is not sorted by component size as stated in the documentation.
example:
G = nx.Graph()
G.add_edges_from([(1,2),(1,3),(4,5)])
G.add_nodes_from(range(6,20))
components = [comp for comp in nx.connected_components(G)]
component_size = [len(comp) for comp in components]
print G.number_of_nodes(), G.number_of_edges(), component_size
G = nx.Graph()
G.add_edges_from([(1000,2000),(1000,3000),(4000,5000)])
G.add_nodes_from(range(6,20))
components = [comp for comp in nx.connected_components(G)]
component_size = [len(comp) for comp in components]
print G.number_of_nodes(), G.number_of_edges(), component_size
output:
19 3 [3, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
19 3 [2, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
looks like when the node names are large numbers and when there are a bunch of single nodes, the returned subgraphs are not sorted properly
The networkx-1.9 documentation is here http://networkx.github.io/documentation/networkx-1.9/reference/generated/networkx.algorithms.components.connected.connected_components.html#networkx.algorithms.components.connected.connected_components
The interface was changed to return a generator (as you figured out). The example in the documentation shows how to do what you ask.
Generate a sorted list of connected components, largest first.
>> G = nx.path_graph(4)
>>> G.add_path([10, 11, 12])
>>> sorted(nx.connected_components(G), key = len, reverse=True)
[[0, 1, 2, 3], [10, 11, 12]]
or
>>> sorted(nx.connected_component_subgraphs(G), key = len, reverse=True)
As for version 2.4:
nx.connected_component_subgraphs(G) is removed.
Instead to get the same result use:
connected_subgraphs = [G.subgraph(cc) for cc in nx.connected_components(G)]
And to get the giant component:
gcc = max(nx.connected_components(G), key=len)
giantC = G.subgraph(gcc)

Python: is index() buggy at all?

I'm working through this thing on pyschools and it has me mystified.
Here's the code:
def convertVector(numbers):
totes = []
for i in numbers:
if i!= 0:
totes.append((numbers.index(i),i))
return dict((totes))
Its supposed to take a 'sparse vector' as input (ex: [1, 0, 1 , 0, 2, 0, 1, 0, 0, 1, 0])
and return a dict mapping non-zero entries to their index.
so a dict with 0:1, 2:1, etc where x is the non zero item in the list and y is its index.
So for the example number it wants this: {0: 1, 9: 1, 2: 1, 4: 2, 6: 1}
but instead gives me this: {0: 1, 4: 2} (before its turned to a dict it looks like this:
[(0, 1), (0, 1), (4, 2), (0, 1), (0, 1)]
My plan is for i to iterate through numbers, create a tuple of that number and its index, and then turn that into a dict. The code seems straightforward, I'm at a loss.
It just looks to me like numbers.index(i) is not returning the index, but instead returning some other, unsuspected number.
Is my understanding of index() defective? Are there known index issues?
Any ideas?
index() only returns the first:
>>> a = [1,2,3,3]
>>> help(a.index)
Help on built-in function index:
index(...)
L.index(value, [start, [stop]]) -> integer -- return first index of value.
Raises ValueError if the value is not present.
If you want both the number and the index, you can take advantage of enumerate:
>>> for i, n in enumerate([10,5,30]):
... print i,n
...
0 10
1 5
2 30
and modify your code appropriately:
def convertVector(numbers):
totes = []
for i, number in enumerate(numbers):
if number != 0:
totes.append((i, number))
return dict((totes))
which produces
>>> convertVector([1, 0, 1 , 0, 2, 0, 1, 0, 0, 1, 0])
{0: 1, 9: 1, 2: 1, 4: 2, 6: 1}
[Although, as someone pointed out though I can't find it now, it'd be easier to write totes = {} and assign to it directly using totes[i] = number than go via a list.]
What you're trying to do, it could be done in one line:
>>> dict((index,num) for index,num in enumerate(numbers) if num != 0)
{0: 1, 2: 1, 4: 2, 6: 1, 9: 1}
Yes your understanding of list.index is incorrect. It finds the position of the first item in the list which compares equal with the argument.
To get the index of the current item, you want to iterate over with enumerate:
for index, item in enumerate(iterable):
# blah blah
The problem is that .index() looks for the first occurence of a certain argument. So for your example it always returns 0 if you run it with argument 1.
You could make use of the built in enumerate function like this:
for index, value in enumerate(numbers):
if value != 0:
totes.append((index, value))
Check the documentation for index:
Return the index in the list of the first item whose value is x. It is
an error if there is no such item.
According to this definition, the following code appends, for each value in numbers a tuple made of the value and the first position of this value in the whole list.
totes = []
for i in numbers:
if i!= 0:
totes.append((numbers.index(i),i))
The result in the totes list is correct: [(0, 1), (0, 1), (4, 2), (0, 1), (0, 1)].
When turning it into again, again, the result is correct, since for each possible value, you get the position of its first occurrence in the original list.
You would get the result you want using i as the index instead:
result = {}
for i in range(len(numbers)):
if numbers[i] != 0:
result[i] = numbers[i]
index() returns the index of the first occurrence of the item in the list. Your list has duplicates which is the cause of your confusion. So index(1) will always return 0. You can't expect it to know which of the many instances of 1 you are looking for.
I would write it like this:
totes = {}
for i, num in enumerate(numbers):
if num != 0:
totes[i] = num
and avoid the intermediate list altogether.
Riffing on #DSM:
def convertVector(numbers):
return dict((i, number) for i, number in enumerate(numbers) if number)
Or, on re-reading, as #Rik Poggi actually suggests.

Assigning multiple array indices at once in Python/Numpy

I'm looking to quickly (hopefully without a for loop) generate a Numpy array of the form:
array([a,a,a,a,0,0,0,0,0,b,b,b,0,0,0, c,c,0,0....])
Where a, b, c and other values are repeated at different points for different ranges. I'm really thinking of something like this:
import numpy as np
a = np.zeros(100)
a[0:3,9:11,15:16] = np.array([a,b,c])
Which obviously doesn't work. Any suggestions?
Edit (jterrace answered the original question):
The data is coming in the form of an N*M Numpy array. Each row is mostly zeros, occasionally interspersed by sequences of non-zero numbers. I want to replace all elements of each such sequence with the last value of the sequence. I'll take any fast method to do this! Using where and diff a few times, we can get the start and stop indices of each run.
raw_data = array([.....][....])
starts = array([0,0,0,1,1,1,1...][3, 9, 32, 7, 22, 45, 57,....])
stops = array([0,0,0,1,1,1,1...][5, 12, 50, 10, 30, 51, 65,....])
last_values = raw_data[stops]
length_to_repeat = stops[1]-starts[1]
Note that starts[0] and stops[0] are the same information (which row the run is occurring on). At this point, since the only route I know of is what jterrace suggest, we'll need to go through some contortions to get similar start/stop positions for the zeros, then interleave the zero start/stop with the values start/stops, and interleave the number 0 with the last_values array. Then we loop over each row, doing something like:
for i in range(N)
values_in_this_row = where(starts[0]==i)[0]
output[i] = numpy.repeat(last_values[values_in_this_row], length_to_repeat[values_in_this_row])
Does that make sense, or should I explain some more?
If you have the values and repeat counts fully specified, you can do it this way:
>>> import numpy
>>> values = numpy.array([1,0,2,0,3,0])
>>> counts = numpy.array([4,5,3,3,2,2])
>>> numpy.repeat(values, counts)
array([1, 1, 1, 1, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 3, 3, 0, 0])
you can use numpy.r_:
>>> np.r_[[a]*4,[b]*3,[c]*2]
array([1, 1, 1, 1, 2, 2, 2, 3, 3])

Categories