Append values to a set in Python

How do I add values to an existing set?

your_set.update(your_sequence_of_values)
e.g., your_set.update([1, 2, 3, 4]). Or, if you have to produce the values in a loop for some other reason,
for value in ...:
    your_set.add(value)
But, of course, doing it in bulk with a single .update call is faster and handier, when otherwise feasible.

Define a set
a = set()
Use add to append single values
a.add(1)
a.add(2)
Use update to add elements from tuples, sets, lists or frozen-sets
a.update([3, 4])
>>> print(a)
{1, 2, 3, 4}
Note: Set elements must be hashable. Lists and sets are mutable and therefore unhashable, so you cannot add a list or another set as an element of a set. You can, however, add the elements from lists and sets, as demonstrated with the .update method.
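To make the hashability note concrete, here is a minimal sketch. The variable names are illustrative only, and the exact wording of the TypeError message varies between Python versions, so it is printed rather than relied upon.

```python
s = {1, 2}

# Lists and sets are mutable, hence unhashable: add() rejects them.
for bad in ([3, 4], {3, 4}):
    try:
        s.add(bad)
    except TypeError as exc:
        print(type(bad).__name__, "->", exc)

# Immutable containers are hashable and can be stored as single elements.
s.add((3, 4))
s.add(frozenset({5, 6}))

# update() adds the *elements* of each iterable instead.
s.update([7, 8], {9})
print(len(s))
```

If a set really has to be stored inside another set, converting it to a frozenset first is the usual route.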

You can also use the | operator to concatenate two sets (union in set theory):
>>> my_set = {1}
>>> my_set = my_set | {2}
>>> my_set
{1, 2}
Or a shorter form using |=:
>>> my_set = {1}
>>> my_set |= {2}
>>> my_set
{1, 2}
Note: In versions prior to Python 2.7, use set([...]) instead of {...}.

Use update like this:
keep.update(newvalues)

This question is the first one that shows up on Google when one looks up "Python how to add elements to set", so it's worth noting explicitly that, if you want to add a whole string to a set, it should be added with .add(), not .update().
Say you have a string foo_str whose contents are 'this is a sentence', and you have some set bar_set equal to set().
If you do bar_set.update(foo_str), the contents of your set will be {'t', 'a', ' ', 'e', 's', 'n', 'h', 'c', 'i'}.
If you do bar_set.add(foo_str), the contents of your set will be {'this is a sentence'}.

The way I like to do this is to convert both the original set and the values I'd like to add into lists, add them, and then convert them back into a set, like this:
setMenu = {"Eggs", "Bacon"}
print(setMenu)
> {'Bacon', 'Eggs'}
setMenu = set(list(setMenu) + list({"Spam"}))
print(setMenu)
> {'Bacon', 'Spam', 'Eggs'}
setAdditions = {"Lobster", "Sausage"}
setMenu = set(list(setMenu) + list(setAdditions))
print(setMenu)
> {'Lobster', 'Spam', 'Eggs', 'Sausage', 'Bacon'}
This way I can also easily add multiple sets using the same logic. I get a TypeError: unhashable type: 'set' if I pass a list of sets to the .update() method, although .update() does accept several sets as separate arguments.
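For comparison, a sketch of combining several sets without the list round-trip. It relies only on the fact that set.update() and set.union() accept any number of iterables; the names menu, additions, specials and many are made up for illustration.

```python
menu = {"Eggs", "Bacon"}
additions = {"Lobster", "Sausage"}
specials = {"Spam"}

# update() takes any number of iterables and mutates the set in place.
menu.update(additions, specials)

# union() does the same but returns a new set, leaving the original alone.
combined = {"Eggs", "Bacon"}.union(additions, specials)

# For an arbitrary list of sets, unpack the list into the call.
many = [{"Toast"}, {"Beans", "Spam"}]
everything = set().union(*many)

print(sorted(menu))
print(sorted(everything))
```

The TypeError appears only when a set is passed as an element, for example inside a list handed to update(), because each element must be hashable.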

I just wanted to add a quick note here. I was looking for the fastest of these three methods:
Using the set.add() function
Using the set.update() function
Using the "|" operator
In my test, set.add() turned out to be the most efficient of the three, whether adding a single value or multiple values.
Here is the result of the test:
set.add() Took: 0.5208224999951199
set.update() Took: 0.6461397000239231
"|" operator Took: 0.7649438999942504
PS: If you want to know more about the analysis, check here: Fastest way to append values to set.
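Timings like these depend heavily on the machine, the Python version, and exactly what is measured, so it is worth re-running the comparison yourself. The following is only a sketch of one way to do that with timeit; the function names, the 1000-element input and the repeat count are assumptions, not the setup used for the figures above.

```python
import timeit

VALUES = list(range(1000))  # hypothetical workload

def with_add():
    s = set()
    for v in VALUES:
        s.add(v)
    return s

def with_update():
    s = set()
    s.update(VALUES)
    return s

def with_union_operator():
    s = set()
    s |= set(VALUES)
    return s

# Time each approach over the same number of runs.
timings = {
    f.__name__: timeit.timeit(f, number=200)
    for f in (with_add, with_update, with_union_operator)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f}s")
```

All three functions build the same set, so any difference in the printed times reflects call overhead rather than the amount of work.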

For me, in Python 3, it's working simply in this way:
keep = keep.union((0,1,2,3,4,5,6,7,8,9,10))
I don't know if it may be correct...

keep.update((0,1,2,3,4,5,6,7,8,9,10))
Or
keep.update(np.arange(11))

Related

numpy.unique gives wrong output for list of sets

I have a list of sets given by,
sets1 = [{1},{2},{1}]
When I find the unique elements in this list using numpy's unique, I get
np.unique(sets1)
Out[18]: array([{1}, {2}, {1}], dtype=object)
As can be seen, the result is wrong, as {1} is repeated in the output.
When I change the order in the input by making similar elements adjacent, this doesn't happen.
sets2 = [{1},{1},{2}]
np.unique(sets2)
Out[21]: array([{1}, {2}], dtype=object)
Why does this occur? Or is there something wrong in the way I have done?
What happens here is that the np.unique function is based on the np._unique1d function from NumPy (see the code here), which itself uses the .sort() method.
Now, sets compare with < as a proper-subset test, not by the values they contain, so sorting a list of sets that each contain a single integer will not order them by that integer. So we will have (and that is not what we want):
sets = [{1},{2},{1}]
sets.sort()
print(sets)
# > [{1},{2},{1}]
# ie. the list has not been "sorted" like we want it to
Now, as you have pointed out, if the list of sets is already ordered in the way you want, np.unique will work (since you would have sorted the list beforehand).
One specific solution (though, please be aware that it will only work for a list of sets that each contain a single integer) would then be:
np.unique(sorted(sets, key=lambda x: next(iter(x))))
That is because set is an unhashable type.
{1} is {1} # will give False
You can use Python's collections.Counter if you can convert each set to a tuple, like below:
from collections import Counter
sets1 = [{1},{2},{1}]
Counter([tuple(a) for a in sets1])
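An alternative sketch that avoids both sorting and tuples: frozenset is hashable and compares by content, so it can serve as the deduplication key for sets of any size. This is a plain-Python substitute for np.unique, not something NumPy provides; unique_frozen and unique_in_order are illustrative names.

```python
sets1 = [{1}, {2}, {1}]

# frozenset is hashable and compares by content, so it can be deduplicated.
unique_frozen = set(map(frozenset, sets1))

# Order-preserving variant: dict keys keep first-seen order (Python 3.7+).
unique_in_order = [set(fs) for fs in dict.fromkeys(map(frozenset, sets1))]
print(unique_in_order)  # [{1}, {2}]
```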

Find difference between list and set

I am trying to find differences between MongoDB records. After performing my queries, I end up with a set of unique results (by applying set()).
Now, I want to compare a new extraction with the set that I just defined to see if there are any new additions to the record.
What I have done now is the following:
unique_documents = set([str(i) for i in dict_of_uniques[my_key]])
all_documents = [str(i) for i in (dict_of_all_docs[my_key])]
Basically I am trying to compare the string version of a dict among the two variables.
I have tried several approaches, among them unique_documents.difference(all_documents), but it keeps returning an empty set. I know for a fact that the all_documents variable contains two new entries in the record. I would like to know which ones they are.
Thank you,
If all_documents is the set with new elements that you want to get as the result, then you need to reverse the order of the arguments to the difference method.
unique_documents = set([str(i) for i in dict_of_uniques[my_key]])
all_documents = set([str(i) for i in (dict_of_all_docs[my_key])])
all_documents.difference(unique_documents)
See how the order matters:
>>> x = set([1,2,3])
>>> y = set([3,4,5])
>>> x.difference(y)
{1, 2}
>>> y.difference(x)
{4, 5}
difference gives you the elements of the first set that are not present in the second set.
If you want to see things that were either added or removed, you can use symmetric_difference. This function is described as "symmetric" because it gives the same results regardless of argument order.
>>> x.symmetric_difference(y)
{1, 2, 4, 5}
>>> y.symmetric_difference(x)
{1, 2, 4, 5}
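Applied to the situation in the question, the direction of the call can be sketched as follows. The strings in known and latest are hypothetical stand-ins for the stringified documents; only the call pattern matters.

```python
# Hypothetical stand-ins for the stringified documents in the question.
known = {"{'id': 1}", "{'id': 2}"}
latest = ["{'id': 1}", "{'id': 2}", "{'id': 3}", "{'id': 4}"]

# difference() accepts any iterable, so `latest` can stay a list here...
removed = known.difference(latest)

# ...but the receiver must be a set, so convert the list to find additions.
added = set(latest).difference(known)

print(removed)
print(sorted(added))
```

Here removed is empty, which matches the empty result described in the question, while added holds the two new entries.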
It is hard to tell without a description of the dictionary structure but your code seems to be comparing single keys only. If you want to compare the content of both dictionaries, you need to get all the values:
currentData = set( str(rec) for rec in dict_of_all_docs.values() )
changedKeys = [k for k,value in dict_of_fetched.items() if str(value) not in currentData]
This doesn't seem very efficient, but without more information on the data structure it is hard to make a better suggestion. If your records can already be matched by a dictionary key, you probably don't need to use a set at all. A simple loop should do.
Rather than unique_documents.difference(all_documents) use all_documents.difference(unique_documents)
More on Python Sets

Repeat string in one element in a set using Python3

I was making the difference between two sets to see what had changed when I noticed that whenever I had one element which was a repeated sequence it would be represented as only one char.
Example:
>>> set("aaaa")
{'a'}
How can I represent it as {'aaaa'} so I can make the diff between two sets and get the right value? If it's not possible using sets, what is the easiest way to compare two data structures and get the diff in python3?
Example:
>>> a = set(['red', 'blue', 'green'])
>>> b = set(['red', 'green'])
>>> a
{'green', 'red', 'blue'}
>>> b
{'red', 'green'}
>>> a - b
{'blue'}
You don't call set on a single str unless you want to treat the str as a sequence of individual characters to be uniquified (that is, set('aaaa') is equivalent to set(['a', 'a', 'a', 'a']), because Python str are iterables of their own characters). If you want a set containing only "aaaa", you do either {'aaaa'} (set literal syntax) or set(['aaaa']) (set constructor wrapping a one-element sequence containing the str). The former is more efficient when you have a fixed number of items to put in a set, the latter works with existing iterables (e.g. set(mysequence)).
In Python 3.5+, you can use literal syntax more flexibly, creating sets from a combination of single values and iterables, e.g. {'aaaa', *mysequence} where mysequence is [1, 2, 3] would be equivalent to typing {'aaaa', 1, 2, 3}.
If you're trying to set a variable to {'aaaa'} like in your example, simply change the set assignment from set("aaaa") to set(['aaaa']).

What is the difference between curly brace and square bracket in Python?

What is the difference between curly braces and square brackets in Python?
A ={1,2}
B =[1,2]
When I print A and B on my terminal, I see no real difference. Is that right?
Sometimes I have also noticed code that uses {} and [] to initialize different variables.
E.g. A=[], B={}
Is there any difference there?
Curly braces create dictionaries or sets. Square brackets create lists.
They are called literals; a set literal:
aset = {'foo', 'bar'}
or a dictionary literal:
adict = {'foo': 42, 'bar': 81}
empty_dict = {}
or a list literal:
alist = ['foo', 'bar', 'bar']
empty_list = []
To create an empty set, you can only use set().
Sets are collections of unique elements and you cannot order them. Lists are ordered sequences of elements, and values can be repeated. Dictionaries map keys to values, keys must be unique. Set and dictionary keys must meet other restrictions as well, so that Python can actually keep track of them efficiently and know they are and will remain unique.
There is also the tuple type, using a comma for 1 or more elements, with parentheses being optional in many contexts:
atuple = ('foo', 'bar')
another_tuple = 'spam',
empty_tuple = ()
WARNING_not_a_tuple = ('eggs')
Note the comma in the another_tuple definition; it is that comma that makes it a tuple, not the parentheses. WARNING_not_a_tuple is not a tuple, as it has no comma. Without the comma, the parentheses only group the expression, and all you have left is a string.
See the data structures chapter of the Python tutorial for more details; lists are introduced in the introduction chapter.
Literals for containers such as these are also called displays, and the syntax allows for procedural creation of the contents based on looping, called comprehensions.
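A brief sketch of the comprehension forms mentioned above, one per bracket type. The variable names are illustrative only.

```python
words = ["foo", "bar", "bar"]

lengths_list = [len(w) for w in words]      # list comprehension
unique_words = {w for w in words}           # set comprehension
word_lengths = {w: len(w) for w in words}   # dict comprehension

print(lengths_list)
print(sorted(unique_words))
print(word_lengths)
```

The only syntactic difference between the set and dict forms is the key: value pair, which mirrors the difference between the two literals.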
They create different types.
>>> type({})
<type 'dict'>
>>> type([])
<type 'list'>
>>> type({1, 2})
<type 'set'>
>>> type({1: 2})
<type 'dict'>
>>> type([1, 2])
<type 'list'>
These two braces are used for different purposes. If you just want a list to contain some elements and organize them by index numbers (starting from 0), just use the [] and add elements as necessary. {} are special in that you can give custom id's to values like a = {"John": 14}. Now, instead of making a list with ages and remembering whose age is where, you can just access John's age by a["John"].
The [] is called a list and {} is called a dictionary (in Python). Dictionaries map keys to values, which lets you look data up by a meaningful key instead of a numeric index.
However, there is a catch to dictionaries. In Python versions before 3.7, the data that you put in the dictionary is not guaranteed to stay in the same order as before. Hence, when you go through each value one by one, it may not be in the order you expect. There is a special dictionary to get around this, but you have to add this line from collections import OrderedDict and replace {} with OrderedDict(). Since Python 3.7, plain dictionaries preserve insertion order, so you will rarely need to worry about that.
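A small sketch of the ordering behaviour, assuming Python 3.7 or later, where plain dicts are guaranteed to keep insertion order. OrderedDict is still available and adds reordering helpers such as move_to_end(); the names and ages are made up.

```python
from collections import OrderedDict

ages = {"John": 14, "Alice": 30, "Bob": 25}
print(list(ages))  # keys come back in insertion order on Python 3.7+

ordered = OrderedDict([("John", 14), ("Alice", 30), ("Bob", 25)])
ordered.move_to_end("John")  # reordering helper specific to OrderedDict
print(list(ordered))
```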

Python: Looping starts from final item and ends with the first one

Is there any "pythonic way" to tell python to loop in a string (or list) starting from the last item and ending with the first one?
For example, for the word Hans I want Python to read or sort it as snaH.
Next, how can I tell Python the following: now, in the resulting string, search for 'a'; if 'n' follows 'a', put '.' after 'n', and then print the letters in their original order.
The clearest and most pythonic way to do this is to use the reversed() builtin.
wrong_way = [1, 2, 3, 4]
for item in reversed(wrong_way):
    print(item)
Which gives:
4
3
2
1
This is the best solution as not only will it generate a reversed iterator naturally, but it can also call the dedicated __reversed__() method if it exists, allowing for a more efficient reversal in some objects.
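To illustrate the __reversed__() hook, here is a minimal sketch using a hypothetical Countdown class that is not part of the original answer. reversed() calls the hook when it exists, so the class can produce the reverse order directly instead of building an intermediate list.

```python
class Countdown:
    """Hypothetical iterable that counts n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return iter(range(self.n, 0, -1))

    def __reversed__(self):
        # Called by reversed(); yields 1..n without building a list first.
        return iter(range(1, self.n + 1))


forward = list(Countdown(3))
backward = list(reversed(Countdown(3)))
word_reversed = "".join(reversed("Hans"))

print(forward)        # [3, 2, 1]
print(backward)       # [1, 2, 3]
print(word_reversed)  # snaH
```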
You can use wrong_way[::-1] to reverse a list, but this is a lot less readable in code, and potentially less efficient. It does, however, show the power of list slicing.
Note that reversed() returns an iterator, so if you want to do this with a string, you will need to convert your result back to a string, which is fortunately easy, as you just do:
"".join(iterator)
e.g:
"".join(reversed(word))
The str.join() method takes an iterable of strings and joins every element into one string, using the calling string as the separator, so here we use the empty string to place them back-to-back.
How about this?
>>> s = "Hans"
>>> for c in s[::-1]:
...     print(c)
...
s
n
a
H
Alternatively, if you want a new string that's the reverse of the first, try this:
>>> "".join(reversed("Hans"))
'snaH'
Sure, just use list_name[::-1]. e.g.
>>> l = ['one', 'two', 'three']
>>> for i in l[::-1]:
...     print(i)
...
three
two
one
