Python: List Dictionary Comprehension


I have the following code:
letters = 'defghijklmno'
K = {letters[i]:(i*i-1) for i in range(len(letters))}
I understand that I'm iterating over the indices of letters and how the value is calculated, but I'm confused about how the key gets set to the individual characters of the string, especially because I'm indexing letters to produce my key. Basically, I'm just trying to figure out how Python evaluates this expression.

That dict comprehension is basically a synonym for:
k = {}
for i in range(len(letters)):
    k[letters[i]] = i*i - 1
The difference is that the comprehension runs in its own scope, so its loop variable does not leak into the outer scope:
>>> letters = 'defghijklmno'
>>> K = {letters[i]:(i*i-1) for i in range(len(letters))}
>>> i # was defined in an inner scope
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'i' is not defined
>>> k = {}
>>> for i in range(len(letters)):
...     k[letters[i]] = i*i - 1
...
>>> i # still defined!
11

Explanation:
>>> letters = 'defghijklmno'
>>> list(range(len(letters)))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
This means that
>>> [letters[i] for i in range(len(letters))]
['d', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o']
At the same time
>>> [(i*i-1) for i in range(len(letters))]
[-1, 0, 3, 8, 15, 24, 35, 48, 63, 80, 99, 120]
So, your dictionary comprehension builds a dict from the pairs 'd':-1, 'e':0, 'f':3, ... (etc).

Well, first of all, this is a rather bad way of doing it. Looping by indices is a really bad practice in Python (it's slower, and horrible to read), so the much better way is this:
letters = 'defghijklmno'
K = {letter: (i*i-1) for i, letter in enumerate(letters)}
All this is is a simple dictionary comprehension. When we loop over a string, we get the individual characters making it up. We use the enumerate() builtin to give us matching numbers, and then produce a dictionary from the letter to the number squared, minus one.
If you are struggling with the comprehension itself, it's equivalent to a for loop (except usually a bit faster), and I recommend you watch my video for a complete explanation with examples of dictionary comprehensions alongside their cousins (list/set comprehensions and generator expressions).
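To make the enumerate() step concrete, here is a small sketch of what it hands to the comprehension. The names pairs and K_by_index are only illustrative and are not part of the original code.

```python
letters = 'defghijklmno'

# enumerate() yields (index, character) pairs, one per character.
pairs = list(enumerate(letters))
print(pairs[:3])  # [(0, 'd'), (1, 'e'), (2, 'f')]

# The enumerate-based comprehension and the index-based one build the same dict.
K = {letter: i * i - 1 for i, letter in enumerate(letters)}
K_by_index = {letters[i]: i * i - 1 for i in range(len(letters))}
print(K == K_by_index)  # True
```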

To understand it, it helps to look at the individual parts of what happens. A for i in range(len(letters)) loop does not loop over the individual characters of letters, but over the indices of the string. That works because you can access individual characters of a string using their index. So letters[0] refers to the first character, letters[1] to the second, and letters[len(letters)-1] to the last.
So, let’s look at the keys of the dictionary individually:
>>> [letters[i] for i in range(len(letters))]
['d', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o']
So you get all the letters individually in the original order.
Now, let’s look at the values of the dictionary:
>>> [(i*i-1) for i in range(len(letters))]
[-1, 0, 3, 8, 15, 24, 35, 48, 63, 80, 99, 120]
So, now we have both keys and values; all that the dictionary comprehension does now is link those keys to the values—in the order above.
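If it helps, the "linking" step can be imitated by hand. This is only a sketch of the idea, not how the interpreter actually implements a comprehension: build the two lists from above, pair them up with zip(), and compare the result with the comprehension (keys, values and paired are names made up for the illustration).

```python
letters = 'defghijklmno'

keys = [letters[i] for i in range(len(letters))]
values = [i * i - 1 for i in range(len(letters))]

# zip() pairs the first key with the first value, the second with the second, ...
paired = dict(zip(keys, values))

K = {letters[i]: (i * i - 1) for i in range(len(letters))}
print(paired == K)   # True
print(paired['f'])   # 3
```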

The second line is a dictionary comprehension. It is like a normal list comprehension or generator expression, except that it generates key-value pairs, which are then used to form a dictionary.
The code is roughly equivalent to
letters = 'defghijklmno'
K = {}
for i in range(len(letters)):
    key = letters[i]
    val = (i*i-1)
    K[key] = val

You could rewrite the dict comprehension as a loop like this:
K = {} # empty dict
for i in range(len(letters)): # i goes from 0 to 11
    K[letters[i]] = i*i-1
so the individual iterations perform
K['d'] = -1
K['e'] = 0
K['f'] = 3
# ...
and so on. The dict comprehension is just a shorter (and, in the opinion of most Python programmers, more elegant) way to write this loop.

For every i from i == 0 to i == 11 (index of the last letter in letters), an entry is added to the resulting dictionary where the key is letters[i] and its associated value is i*i-1. This gives:
K['d'] == -1
K['e'] == 0
K['f'] == 3
and so on.

You're not actually iterating over the letters of letters, per se; rather, you're iterating over the length of letters, by varying i from 0, to 1, to 2, ..., to 11. As you vary i, you create a dictionary entry whose key is the ith letter of letters and whose value is i*i - 1.
In other words, you create a dictionary, each entry of which consists of a letter (key) k from letters, paired with a value equal to k's index squared, minus 1.
You can read the dictionary comprehension in plain English as: the dictionary of all letters (keys) k from letters with index i, paired with the value i*i - 1.

Related

Find all indices of each letter in a string

I'm trying to get a list consisting of the indexes for each item of another sequence.
Sounds easy enough in theory.
a = 'string of letters'
b = [a.index(x) for x in a]
But it doesn't work. I've tried list comprehensions, simple for loops, using enumerate etc, but every time b will return the same index for duplicates in a.
That is, 's' in a, for example, will return '0' in b for both the first and last item because they're the same character.
I'm guessing it's a cache or something like that, as a way for Python to speed things up.
In any case, I can't figure this out and I'd appreciate some help as to how I can get this working as well as maybe an explanation of why this happens.

Thanks a lot for the input. I did figure it out with enumerate, actually.
To elaborate, I had two lists, a and b. a contains both uppercase and lowercase characters. b consists of the same characters as a, but shifted by a certain number of positions, like in a cipher.
I wanted to keep the case of the characters in b at the same position after the 'encoding', but for that I needed the index of each uppercase character in a.
Anyway, it was as simple as this:
a = 'tEXt'
c = [x for x, y in enumerate(a) if y.isupper()]
b = ['x', 't', 't', 'e']  # the encoded version of a, returned from a different place as a string, but converted here to a list
for x in c:
    b[x] = b[x].upper()
b = ''.join(b)
print(b)  # xTTe

.index just returns the first occurrence of a character in a string - this has nothing to do with caches. It seems like you just want the list of numbers from 0 until your string length-1:
b = list(range(len(a)))
You do not mention why you need this, but it's pretty rare to need something like this in Python. Note that in Python 3 range returns a special type of its own representing an immutable sequence of numbers, so you need to explicitly convert it to a list if you do actually need one.
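A quick sketch of the difference, using the string from the question (first_positions, own_positions and s_positions are illustrative names):

```python
a = 'string of letters'

# str.index always scans from the start, so a repeated character
# reports the position of its first occurrence every time.
first_positions = [a.index(ch) for ch in a]
print(first_positions[0], first_positions[-1])  # 0 0  (both are 's')

# enumerate() pairs every character with its own position instead.
own_positions = [i for i, ch in enumerate(a)]
print(own_positions == list(range(len(a))))  # True

# Filtering on the character gives all positions of one letter.
s_positions = [i for i, ch in enumerate(a) if ch == 's']
print(s_positions)  # [0, 16]
```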

I refactored the code you posted as an answer, let me know if I understood things correctly.
from typing import List
def copy_case(a: str, b: str) -> str:
    res_chars: List[str] = []
    curr_a: str
    curr_b: str
    for curr_a, curr_b in zip(a, b):
        if curr_a.isupper():
            curr_b = curr_b.upper()
        else:
            curr_b = curr_b.lower()
        res_chars.append(curr_b)
    return ''.join(res_chars)
print(copy_case('tEXt', 'xTTe'))

One approach could be to build a dictionary, iterating over the distinct letters in the string and using re.finditer to obtain the index of all occurrences in the string. So going step by step:
import re
a = 'string of letters'
We can find the unique letters in the string by taking a set:
letters = set(a.replace(' ',''))
# {'e', 'f', 'g', 'i', 'l', 'n', 'o', 'r', 's', 't'}
Then we could use a dictionary comprehension to build the dictionary, in which each value is a list generated by iterating over all match instances returned by re.finditer:
{w: [m.start() for m in re.finditer(w, a)] for w in letters}
{'i': [3],
'o': [7],
'f': [8],
'l': [10],
'g': [5],
'e': [11, 14],
't': [1, 12, 13],
's': [0, 16],
'n': [4],
'r': [2, 15]}
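One caveat worth keeping in mind: re.finditer treats its first argument as a regular expression, so characters such as '.' or '(' would not be matched literally. A hedged sketch of the same idea with re.escape added, using an example string of my own rather than the one from the question:

```python
import re

a = 'a.b.c (a)'  # made-up input containing regex metacharacters
letters = set(a.replace(' ', ''))

# re.escape makes each character match literally, so '.' and '(' are safe.
positions = {
    w: [m.start() for m in re.finditer(re.escape(w), a)]
    for w in letters
}
print(positions['.'])  # [1, 3]
print(positions['a'])  # [0, 7]
```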

A dict is probably better than a list for this purpose:
foo = {x: [] for x in a}  # creates a dict whose keys are the unique characters in a
for i, x in enumerate(a):
    foo[x].append(i)  # adds each index to the list for its character
For example, for the string 'abababababa':
{'a': [0, 2, 4, 6, 8, 10], 'b': [1, 3, 5, 7, 9]}

Sounds like you're trying to get a list of the indices of each input char as an output. So, for s, you would get [0, 16], or something along those lines.
So for each input char, you would add its position to the right list.
Storing the results in a dict seems like a good approach, so, something like:
def index_dict(stringy):
    d = {}
    for index, char in enumerate(stringy):
        if char not in d:
            d[char] = []
        d[char].append(index)
    return d
The index() method always finds the first occurrence. You need to find all occurrences. So, the above function will give you a dict whose keys are the chars of your input string, and the value for each key is a list of indices where that char is found.
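The same idea can also be written with collections.defaultdict, which removes the explicit "if char not in d" check. This is just an alternative sketch; index_dict_default is a name I made up for it.

```python
from collections import defaultdict

def index_dict_default(text):
    # defaultdict(list) creates an empty list the first time a key is seen.
    positions = defaultdict(list)
    for index, char in enumerate(text):
        positions[char].append(index)
    return dict(positions)  # convert back to a plain dict for the caller

result = index_dict_default('string of letters')
print(result['s'])  # [0, 16]
print(result['t'])  # [1, 12, 13]
```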

String Compression using a Dictionary

Code:
'''
Program to compress a string using dictionaries.
Input sample--> 'AAAABBBBCCCCDDDDaa'
Output sample--> 'A4B4C4D4a2'
'''
# Function declaration
def string_compression(str):
    d = {}
    x = []
    # Generating Key-Value pair for entire string
    for i in str:
        if i not in d.keys():
            d[i] = 1
        else:
            d[i] = d[i] + 1
    # Copying Key Value Pairs in a list
    for key,value in d.items():
        x.append(key)
        x.append(value)
    # Printing a Cocktail list of Strings and Integers
    print(x)
    # Converting Integers in list x to Strings and Joining them
    for i in x[1::2]:
        x[i] = str(x[i])
    print(''.join(x))
    #print(''.join(map(str, x)))
y = 'AAAABBBBCCCCDDDDaa'
string_compression(y) # Function Call
Output:
['A', 4, 'B', 4, 'C', 4, 'D', 4, 'a', 2]
Traceback (most recent call last):
  File "string_compression.py", line 36, in <module>
    string_compression(y) # Function Call
  File "string_compression.py", line 30, in string_compression
    x[i] = str(x[i])
TypeError: 'str' object is not callable
I'm able to print x as a list, but I'm unable to print the list in string format.
I also tried all the solutions suggested in my previous post, "Converting list of Integers and Strings to purely a string".
But those solutions only work if I run them in a new file containing just these two lines:
x = ['A', 4, 'B', 4, 'C', 4, 'D', 4, 'a', 2]
print(''.join(map(str, x)))
Why do none of these methods work in the code above? Is there a concept I'm missing?

Rename the parameter from str, which shadows the built-in str type, to something else, perhaps s, and switch to map when joining:
def string_compression(s):
    d = {}
    x = []
    # Generating Key-Value pair for entire string
    for i in s:
        if i not in d.keys():
            d[i] = 1
        else:
            d[i] = d[i] + 1
    # Copying Key Value Pairs in a list
    for key,value in d.items():
        x.append(key)
        x.append(value)
    # Printing a Cocktail list of Strings and Integers
    print(x)
    # map(str, x) converts the integer counts to strings while joining,
    # so no separate conversion loop is needed
    print(''.join(map(str, x)))
y = 'AAAABBBBCCCCDDDDaa'
string_compression(y) # Function Call
OUTPUT:
['A', 4, 'B', 4, 'C', 4, 'D', 4, 'a', 2]
A4B4C4D4a2

Your particular problem is that you're using the name str for the argument of your function, which shadows the builtin str() function you're trying to call.
However, there are also some bugs in your algorithm: dictionaries aren't guaranteed to preserve insertion order before Python 3.7 (CPython 3.6 does so only as an implementation detail), so you aren't guaranteed to get the characters back in the same order. Also, if the input string is aaaaabbbbbaaaaa, you'll end up with something like a10b5, which isn't reversible to aaaaabbbbbaaaaa. (I'm not going to go into how strings with numbers can't be "compressed" reversibly with this, but that's also an issue.)
I'd do something like this:
def string_rle(s):
    parts = []
    for c in s:
        if not parts or c != parts[-1][0]:
            parts.append([c, 1])
        else:
            parts[-1][-1] += 1
    return ''.join('%s%d' % (part[0], part[1]) for part in parts)
print(string_rle('AAAABBBBCCCCDDDDaa'))
The idea is that parts holds an ordered list of pairs of a character and a count, and for each character in the original string, we see if the current pair (parts[-1]) has the same character; if it does, we increment the counter, otherwise, we generate a new pair.
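For comparison, here is a sketch of the same run-length idea using itertools.groupby, plus a decoder to show that the result round-trips. Both rle_encode and rle_decode are names I made up, and the decoder assumes the original text contains no digits (the limitation mentioned above).

```python
import re
from itertools import groupby

def rle_encode(s):
    # groupby() yields one group per run of consecutive equal characters.
    return ''.join(f'{char}{len(list(run))}' for char, run in groupby(s))

def rle_decode(encoded):
    # Assumption: the original text had no digits, so every run is encoded
    # as one non-digit character followed by its decimal count.
    pairs = re.findall(r'(\D)(\d+)', encoded)
    return ''.join(char * int(count) for char, count in pairs)

encoded = rle_encode('aaaaabbbbbaaaaa')
print(encoded)                                   # a5b5a5
print(rle_decode(encoded) == 'aaaaabbbbbaaaaa')  # True
```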

You have shadowed the built-in 'str' by using it as the name of a function argument:
def string_compression(str): # your argument shadows the built-in 'str'
so then when you try to call the 'str' function here:
x[i] = str(x[i])
you are actually calling the argument of 'string_compression', which is a string and therefore not callable.
Rename the 'str' argument to something else.
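BTW, this for loop:
for i in x[1::2]:
    x[i] = str(x[i])
iterates over the counts themselves (4, 4, 4, 4, 2) and uses them as indexes, so it touches x[4] and x[2] rather than the integers in the list, which is not what you are looking for. You need to iterate over the indexes of the integers, not the integers themselves:
for i in range(1, len(x), 2):
    x[i] = str(x[i])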
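A stripped-down sketch of the shadowing effect, independent of the compression code (shadowed and renamed are throwaway names for the demonstration):

```python
def shadowed(str):
    # Inside this function the name str is the argument, not the built-in type.
    try:
        return str(4)
    except TypeError as exc:
        return f'TypeError: {exc}'

def renamed(s):
    # With a different parameter name, str is the built-in again.
    return str(4)

print(shadowed('abc'))  # TypeError: 'str' object is not callable
print(renamed('abc'))   # 4
```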

Permutations using a multidict

I'm trying to put together a code that replaces unique characters in a given input string with corresponding values in a dictionary in a combinatorial manner while preserving the position of 'non' unique characters.
For example, I have the following dictionary:
d = {'R':['A','G'], 'Y':['C','T']}
How would I go about replacing all instances of 'R' and 'Y' while producing all possible combinations of the string but maintaining the positions of 'A' and 'C'?
For instance, the input 'ARCY' would generate the following output:
'AACC'
'AGCC'
'AACT'
'AGCT'
Hopefully that makes sense. If anyone can point me in the right directions, that would be great!

Given the dictionary, we can state a rule that tells us what letters are possible at a given position in the output. If the original letter from the input is in the dictionary, we use the value; otherwise, there is a single possibility - the original letter itself. We can express that very neatly:
def candidates(letter):
    d = {'R':['A','G'], 'Y':['C','T']}
    return d.get(letter, [letter])
Knowing the candidates for each letter (which we can get by mapping our candidates function onto the letters in the pattern), we can create the Cartesian product of candidates, and collapse each result (which is a tuple of single-letter strings) into a single string by simply ''.joining them.
import itertools
def substitute(pattern):
    return [
        ''.join(result)
        for result in itertools.product(*map(candidates, pattern))
    ]
Let's test it:
>>> substitute('ARCY')
['AACC', 'AACT', 'AGCC', 'AGCT']
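To see the intermediate steps, here is a small sketch that spells out the candidates per position before taking the product. The names per_position and combos are illustrative, and the dictionary is the one from the question.

```python
import itertools

d = {'R': ['A', 'G'], 'Y': ['C', 'T']}
pattern = 'ARCY'

# One list of possible letters for each position in the pattern.
per_position = [d.get(letter, [letter]) for letter in pattern]
print(per_position)  # [['A'], ['A', 'G'], ['C'], ['C', 'T']]

# product() picks one letter from each position, in every possible way.
combos = list(itertools.product(*per_position))
print(len(combos))   # 4  (1 * 2 * 1 * 2)
print(combos[0])     # ('A', 'A', 'C', 'C')
```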

The following generator function produces all of your desired strings, using enumerate, zip, itertools.product, a list comprehension and argument list unpacking, all of which are very handy Python tools/concepts you should read up on:
from itertools import product
def multi_replace(s, d):
    indexes, replacements = zip(*[(i, d[c]) for i, c in enumerate(s) if c in d])
    # indexes: (1, 3)
    # replacements: (['A', 'G'], ['C', 'T'])
    l = list(s) # turn s into sth. mutable
    # iterate over cartesian product of all replacement tuples ...
    for p in product(*replacements):
        for index, replacement in zip(indexes, p):
            l[index] = replacement
        yield ''.join(l)
d = {'R': ['A', 'G'], 'Y': ['C', 'T']}
s = 'ARCY'
for perm in multi_replace(s, d):
    print(perm)
AACC
AACT
AGCC
AGCT
s = 'RRY'
AAC
AAT
AGC
AGT
GAC
GAT
GGC
GGT

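Change ARCY into a list of the options for each position and use the code below (the variable is called options rather than list, so that the built-in list is not shadowed):
import itertools as it
options = [['A'], ['A','G'],['C'],['C','T']]
[''.join(item) for item in it.product(*options)]
or
import itertools as it
options = ['A', 'AG','C', 'CT']
[''.join(item) for item in it.product(*options)]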

going through a dictionary and printing its values in sequence

def display_hand(hand):
    for letter in hand.keys():
        for j in range(hand[letter]):
            print(letter, end=' ')
This prints something like: b e h q u w x, which is the desired output.
How can I modify this code to produce the output only once the function has finished its loops?
Something like the code below causes me problems, as I can't get rid of the list punctuation (commas and single quotes) when printing the output:
def display_hand(hand):
    dispHand = []
    for letter in hand.keys():
        for j in range(hand[letter]):
            ##code##
    print(dispHand)
UPDATE
I find John's answer very elegant. Allow me, however, to expand on Kugel's response:
Kugel's approach answered my question. However, I kept running into an additional issue: the function would always return None in addition to printing the output. Reason: whenever you don't explicitly return a value from a function in Python, None is implicitly returned. I couldn't find a way to explicitly return the hand. With Kugel's approach I got closer, but the hand is still buried in a for loop.

You can do this in one line with a generator expression that has two for clauses:
print(' '.join(letter for letter, count in hand.items() for i in range(count)))
Let's break that down piece by piece. I'll use a sample dictionary that has a couple of counts greater than 1, to show the repetition part working.
>>> hand
{'h': 3, 'b': 1, 'e': 2}
Get the letters and counts in a form that we can iterate over.
>>> list(hand.items())
[('h', 3), ('b', 1), ('e', 2)]
Now just the letters.
>>> [letter for letter, count in hand.items()]
['h', 'b', 'e']
Repeat each letter count times.
>>> [letter for letter, count in hand.items() for i in range(count)]
['h', 'h', 'h', 'b', 'e', 'e']
Use str.join to join them into one string.
>>> ' '.join(letter for letter, count in hand.items() for i in range(count))
'h h h b e e'
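If the goal is to return the text instead of printing it (which avoids getting None back from the function), one possible sketch uses collections.Counter. The function name hand_to_string is my own, and the order shown relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 on.

```python
from collections import Counter

def hand_to_string(hand):
    # Counter.elements() repeats each key as many times as its count.
    return ' '.join(Counter(hand).elements())

hand = {'h': 3, 'b': 1, 'e': 2}
text = hand_to_string(hand)  # the caller decides whether to print it
print(text)  # h h h b e e
```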

Your ##code## perhaps?
dispHand.append(letter)
Update:
To print your list then:
for item in dispHand:
    print(item, end=' ')

Another option, without a nested loop:
"".join((x+' ') * y for x, y in hand.items()).strip()

Use
" ".join(sequence)
to print a sequence without commas and the enclosing brackets.
If you have integers or other non-string items in the sequence, convert them to strings first:
" ".join(str(x) for x in sequence)

Using an index to get an item

I have a list in python ('A','B','C','D','E'), how do I get which item is under a particular index number?
Example:
Say it was given 0, it would return A.
Given 2, it would return C.
Given 4, it would return E.

What you show, ('A','B','C','D','E'), is not a list, it's a tuple (the round parentheses instead of square brackets show that). Nevertheless, whether you index a list or a tuple (to get the one item at an index), in either case you append the index in square brackets.
So:
thetuple = ('A','B','C','D','E')
print(thetuple[0])
prints A, and so forth.
Tuples (unlike lists) are immutable, so you couldn't assign to thetuple[0] etc. (as you could assign to an indexing of a list). However, you can definitely just access ("get") the item by indexing in either case.
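A short sketch of both points, reading by index and the difference in mutability (thelist and outcome are names added for the illustration):

```python
thetuple = ('A', 'B', 'C', 'D', 'E')
thelist = ['A', 'B', 'C', 'D', 'E']

# Reading by index works the same way for both.
print(thetuple[2], thelist[2])  # C C

# Assigning by index works for the list ...
thelist[0] = 'Z'

# ... but raises TypeError for the tuple.
try:
    thetuple[0] = 'Z'
    outcome = 'assigned'
except TypeError:
    outcome = 'TypeError'
print(outcome)  # TypeError
```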

values = ['A', 'B', 'C', 'D', 'E']
values[0] # returns 'A'
values[2] # returns 'C'
# etc.

You can use the __getitem__(key) method, which is what the square-bracket syntax calls under the hood:
>>> iterable = ('A', 'B', 'C', 'D', 'E')
>>> key = 4
>>> iterable.__getitem__(key)
'E'

Same as any other language, just pass index number of element that you want to retrieve.
#!/usr/bin/env python
x = [2,3,4,5,6,7]
print(x[5])

You can use pop(), but note that it also removes the item from the list:
x=[2,3,4,5,6,7]
print(x.pop(2))
The output is 4, and x is now [2, 3, 5, 6, 7].
