set() in Python - python

I have used the set() function but I am confused.
x = set("car")
print(x)
Why does this code output: "a", "c", "r" and not "car"? Thanks for your answers.

The set() function converts an iterable into a set. Because a string is an iterable collection of characters (well... of shorter, one-character strings), set("car") becomes the set of its characters, e.g. {"c", "r", "a"}. (Sets are unordered, so the order you see is arbitrary.)
There are three ways you can make a set containing the string "car":
Make the set directly using braces:
my_string = "car"
my_set = {my_string}
Put the string into another iterable and convert that
my_string = "car"
temp_tuple = (my_string,)
my_set = set(temp_tuple)
Make an empty set and add the string to it
my_string = "car"
my_set = set()
my_set.add(my_string)
If you just want a set containing specifically the word "car" you can construct this as a literal:
my_set = {"car"}
but I expect the actual problem you're trying to solve is more complex than that!
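To make the difference concrete, here is a minimal sketch comparing the approaches described above. The variable names are chosen only for illustration.

```python
# set() iterates over its argument, so a string is split into characters.
chars = set("car")

# Three equivalent ways to build a set that holds the whole string.
my_string = "car"
from_braces = {my_string}
from_tuple = set((my_string,))
from_add = set()
from_add.add(my_string)

print(chars == {"c", "a", "r"})               # True
print(from_braces == from_tuple == from_add)  # True
print(len(chars), len(from_braces))           # 3 1
```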

It's because of the main properties of set in Python.
They can't contain duplicates
They're unordered
So things like this one just happen.
>>> set("car")
{'a', 'r', 'c'}
I suggest reading this useful documentation about Python's set:
Python - Set
Programiz - Python's Set
W3Schools - Python's Set
If you really want an ordered set, this may be useful to you:
OrderedSet
I would also suggest checking the other answers.
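If all you need is the unique characters in their original order, one option that avoids a third-party package is dict.fromkeys. This is a swapped-in technique, not the OrderedSet package mentioned above, and it assumes Python 3.7+, where dicts preserve insertion order.

```python
word = "banana"

# dict keys are unique and keep insertion order (Python 3.7+),
# so this drops duplicates without losing the original order.
unique_in_order = list(dict.fromkeys(word))
print(unique_in_order)  # ['b', 'a', 'n']
```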

If you just want one item in the set, do this:
x = set(["car"])
print(x)
## or create a set directly like this
x = {"car"}
set() takes an iterable, and a string is an iterable, so it gets split into characters.
If you pass a list, even one with a single item, that item stays whole.

Related

python string to list (special list)

I'm trying to convert this string into a list. How can I do that, please?
My string :
x = "[(['xyz1'], 'COM95'), (['xyz2'], 'COM96'), (['xyz3'], 'COM97'), (['xyz4'], 'COM98'), (['xyz5'], 'COM99'), (['xyz6'], 'COM100')]"
I want to convert it to a list, so that:
print(list[0])
Output : (['xyz1'], 'COM95')
If you have this string instead of a list, that presumes it is coming from somewhere outside your control (otherwise you'd just make a proper list). If the string is coming from a source outside your program, eval() is dangerous: it will gladly run any code passed to it. In this case you can use ast.literal_eval(), which is safer (but make sure you understand the warning in the docs):
import ast
x = "[(['xyz1'], 'COM95'), (['xyz2'], 'COM96'), (['xyz3'], 'COM97'), (['xyz4'], 'COM98'), (['xyz5'], 'COM99'), (['xyz6'], 'COM100')]"
l = ast.literal_eval(x)
Which gives an l of:
[(['xyz1'], 'COM95'),
(['xyz2'], 'COM96'),
(['xyz3'], 'COM97'),
(['xyz4'], 'COM98'),
(['xyz5'], 'COM99'),
(['xyz6'], 'COM100')]
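As a rough illustration of why ast.literal_eval() is the safer choice, the sketch below feeds it both a plain literal and a string containing a function call. The hostile_text value is a made-up example, not something from the question.

```python
import ast

safe_text = "[(['xyz1'], 'COM95'), (['xyz2'], 'COM96')]"
parsed = ast.literal_eval(safe_text)
print(parsed[0])  # (['xyz1'], 'COM95')

# Hypothetical hostile input: a function call is not a literal,
# so literal_eval refuses it instead of executing it.
hostile_text = "__import__('os').getcwd()"
try:
    ast.literal_eval(hostile_text)
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```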
If the structure is uniformly a list of tuples with a one-element list of strings and an individual string, you can manually parse it using the single quote as a separator. This will give you one string value every other component of the split (which you can access using a striding subscript). You can then build the actual tuple from pairing of two values:
tuples = [([a],s) for a,s in zip(*[iter(x.split("'")[1::2])]*2)]
print(tuples[0])
(['xyz1'], 'COM95')
Note that this does not cover the case where an individual string contains a single quote that needed escaping.
Do you mean converting a list-like string into a list? Maybe you can use eval().
For example:
a="[1,2,3,4]"
a=eval(a)
Then a becomes a list.
To convert it to a list, use x = eval(x).
print(list[0]) will not give you what you want, because list is Python's built-in type, not your variable.
You should do print(x[0]) instead.

How to remove extra quotation marks from each element list in python

I have list in the form:
list=["'A'","'B'","'C'"]
and would like change this list to the form:
list=['A','B','C']
I have tried the following;
for i in list:
    str.replace("''",'')
however, this returned an error of "replace() takes at least 2 arguments (1 given)"
I'd be grateful for any help on how to achieve the removing of quotation marks, or if anyone can tell me how to alter my code to make it successful.
Thank you.
[item.strip("'") for item in lst]
*Please note it's advisable to avoid naming lists 'list' and so the variable has been renamed to 'lst' here.
All you need to do is remove the ' character from each element using the strip method:
for i in range(len(list)):
    list[i] = list[i].strip("'")
You can just strip off the quote at the beginning and end of the string, you can use list-comprehension to do it for each of the items:
>>> [i.strip("'") for i in lst]
['A', 'B', 'C']
You are calling a method of the str builtin class/type instead of calling it for an instance. Use i.replace(...) instead.
Why does it happen? Because a method has a reserved first parameter, conventionally named self, which is a reference to the instance, in your case an instance of the str class. Since str is a builtin and therefore always available, writing str.replace(...) looks up the replace() method on the class itself, but without an instance the call is missing its first argument.
# notice this
replace(self, ...)
Help on method_descriptor:
replace(self, old, new, count=-1, /)
    Return a copy with all occurrences of substring old replaced by new.
      count
        Maximum number of occurrences to replace.
        -1 (the default value) means replace all occurrences.
    If the optional argument count is given, only the first count occurrences are
    replaced.
You can however call it with str.replace() just fine - if you really want to - but you first need to supply the instance e.g. like this:
str.replace(i, "''", '')
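A small sketch of the point above: calling the method through the class works as long as the instance is passed explicitly, and it gives the same result as the usual instance call. The names item, via_class and via_instance are illustrative only.

```python
item = "'A'"

# Calling through the class: the instance must be passed explicitly.
via_class = str.replace(item, "'", "")
# The usual form: the instance is passed as self automatically.
via_instance = item.replace("'", "")
print(via_class, via_instance)  # A A

# Leaving out the instance raises a TypeError, as in the question,
# because the first string is taken as self and one argument is missing.
try:
    str.replace("''", "")
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```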
Also, immutability plays a role here: calling replace() on a string returns a new string with the characters replaced, but does not store it back in the list. Instead you need to assign by index:
mylist = ["'A'","'B'","'C'"]
for idx, item in enumerate(mylist):
    mylist[idx] = item.replace("''", '')
And by this point it's still trying to remove two single-quotes, thus either replace only a single one (replace("'", "")) or use this answer with strip().
You're using str (the class) instead of i (the instance). The reason you're getting an error about the number of arguments is because there's an implicit first argument, self.
As well, strings aren't mutable, so you'll either need to assign back into the list by index or overwrite it, which I'd recommend since it's simpler.
Assign back
L = ["'A'", "'B'", "'C'"]
for idx, s in enumerate(L):
    L[idx] = s.replace("'", "")
print(L) # -> ['A', 'B', 'C']
Overwrite
L = ["'A'", "'B'", "'C'"]
L[:] = [s.replace("'", "") for s in L]
print(L) # -> ['A', 'B', 'C']
This uses a full-slice assignment so that you keep the same list object. If you don't mind replacing it, you can simplify it to this:
L = [s.replace("'", "") for s in L]
Sidenote: I'm using L because list is a bad variable name since it shadows the builtin list type. C.f. TypeError: 'list' object is not callable in python
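The strip() and replace() answers above give the same result for the list in the question, but they are not interchangeable in general. A short sketch with a made-up second sample shows the difference:

```python
samples = ["'A'", "'it's'"]

# strip() only removes quote characters at the two ends of each string.
stripped = [s.strip("'") for s in samples]
# replace() removes every quote character, including ones in the middle.
replaced = [s.replace("'", "") for s in samples]

print(stripped)  # ['A', "it's"]
print(replaced)  # ['A', 'its']
```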

Python Replace Character In List

I have a Python list that looks like the below:
list = ['|wwwwwwwwadawwwwwwwwi', '|oooooooocFcooooooooi']
I access the letter in the index I want by doing this:
list[y][x]
For example, list[1][10] returns F.
I would like to replace F with a value. Thus changing the string in the list.
I have tried list[y][x] = 'o' but it throws the error:
self.map[y][x] = 'o'
TypeError: 'str' object does not support item assignment
Can anybody help me out? Thanks.
As @Marcin says, Python strings are immutable. If you have a specific character/substring you want to replace, there is str.replace. You can also use lists of characters instead of strings, as described here, if you want to be able to change one particular character.
If you want something like string.replace, but for an index rather than a substring, you can do something like:
def replaceOneCharInString(string, index, newString):
    return string[:index] + newString + string[index+len(newString):]
You would need to do some length checking, though.
Edit: forgot string before the brackets on string[index+len(newString):]. Woops.
Since python strings are immutable, they cannot be modified. You need to make new ones. One way is as follows:
tmp_list = list(a_list[1])
tmp_list[10] = 'o' # simulates: list[1][10]='o'
new_str = ''.join(tmp_list)
#Gives |oooooooococooooooooi
# substitute the string in your list
a_list[1] = new_str
As Marcin says, strings are immutable in Python, so you cannot assign to individual characters in an existing string. The reason you can index them is that they are sequences. Thus
for c in "ABCDEF":
    print(c)
will work, and print each character of the string on a separate line.
To achieve what you want you need to build a new string. For example, here is a brute-force approach to replacing a single character of a string:
def replace_1(s, index, c):
    return s[:index] + c + s[index+1:]
Which you can use thus:
self.map[y] = replace_1(self.map[y], x, 'o')
This will work because self.map is list, which is mutable by design.
Let's use L to represent the "list", since list is a built-in type in Python.
L= ['|wwwwwwwwadawwwwwwwwi', '|oooooooocFcooooooooi']
L[1]='|oooooooococooooooooi'
print(L)
Unfortunately, changing a character inside a string object in place is not supported. The proper way is to replace the list element with a new string object.
Output
['|wwwwwwwwadawwwwwwwwi', '|oooooooococooooooooi']
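Putting the answers above together, here is a minimal sketch of both options: rebuilding the string by slicing, and keeping each row as a list of characters so that single cells can be assigned directly. The helper name set_char and the variable names are hypothetical.

```python
rows = ['|wwwwwwwwadawwwwwwwwi', '|oooooooocFcooooooooi']

def set_char(grid, y, x, ch):
    """Replace the character at column x of row y with ch (hypothetical helper)."""
    row = grid[y]
    if not 0 <= x < len(row):
        raise IndexError("column index out of range")
    # Strings are immutable, so build a new string and store it in the list.
    grid[y] = row[:x] + ch + row[x + 1:]

set_char(rows, 1, 10, 'o')
print(rows[1])  # |oooooooococooooooooi

# Alternative: keep rows as lists of characters, which are mutable.
char_grid = [list(row) for row in rows]
char_grid[0][10] = 'o'
first_row = ''.join(char_grid[0])
print(first_row)  # |wwwwwwwwaoawwwwwwwwi
```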

How to find an item with a specific start string in a set

I have a set of ~10 million items which look something like this:
1234word:something
4321soup:ohnoes
9cake123:itsokay
[...]
Now I need to quickly check whether an item with a specific start is in the set.
For example
x = "4321soup"
is x+* in a_set:
print ("somthing that looks like " +x +"* is in the set!")
How do I accomplish this? I've considered using a regex, but I have no clue whether it is even possible in this scenario.
^4321soup.*$
Yes, it is possible. Try match: if the result is not None, you have it; if it is None, you don't.
Do not forget to set the m and g flags.
See demo.
http://regex101.com/r/lS5tT3/28
Use str.startswith instead of a regex if you only want to match the start of the string, especially given that you have ~10 million items.
#!/usr/bin/python
s = "1234word:something"
print(s.startswith('1234'))
In Python, assuming your contents are in a file named "mycontentfile":
>>> with open("mycontentfile","r") as myfile:
...     data=myfile.read()
...
>>> for item in data.split("\n"):
...     if item.startswith("4321soup"):
...         print(item.strip())
...
4321soup:ohnoes
In this case, what matters is how to iterate over the set efficiently.
Since you have to check every item until you find a match, the best way is to create a generator (in generator-expression form) and consume it until you find a result.
To accomplish this, I would use next():
a_set = set(['1234word:something','4321soup:ohnoes','9cake123:itsokay',]) #a huge set
prefix = '4321soup' #prefix you want to search
next((x for x in a_set if x.startswith(prefix)), False) #pass a generator with the desired match condition; next() returns the first match, or False if the generator is exhausted
Hash sets are very good for checking the existence of a complete element. In your task you need to check for the existence of a starting part, not a complete element. That's why it is better to use a tree or a sorted sequence instead of a hash mechanism (the internal implementation of Python's set).
However, according to your examples, it looks like you want to check the whole part before ':'. For that purpose you can build a set of these first parts, which is then well suited to membership checks:
items = set(x.split(':')[0] for x in a_set) # a_set can be any iterable
def is_in_the_set(x):
    return x in items
is_in_the_set("4321soup") # True
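The sorted-sequence idea mentioned in this answer can be sketched with the standard bisect module. This is one possible approach, not the answerer's code: it assumes the items fit in memory as a sorted list, and has_prefix is a hypothetical helper name.

```python
from bisect import bisect_left

def has_prefix(sorted_items, prefix):
    """Return True if any string in sorted_items starts with prefix."""
    # Every string that starts with prefix sorts at or after prefix itself,
    # so the first candidate is the item at the insertion point.
    i = bisect_left(sorted_items, prefix)
    return i < len(sorted_items) and sorted_items[i].startswith(prefix)

a_set = {'1234word:something', '4321soup:ohnoes', '9cake123:itsokay'}
sorted_items = sorted(a_set)  # sort once; each lookup is then O(log n)

print(has_prefix(sorted_items, '4321soup'))  # True
print(has_prefix(sorted_items, '4321soap'))  # False
```

Unlike the set of first parts, this works for any prefix length, at the cost of keeping a sorted copy of the data.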
I'm currently thinking that the most reasonable solution would be
something like a sorted tree of dicts (key = x and value = y) and the
tree is sorted by the dicts keys. - no clue how to do that though –
Daedalus Mythos
No need for a tree of dicts ... just a single dictionary would do. If you have the key:value pairs stored in a dictionary, let's say itemdict, you can write
x = "4321soup"
if x in itemdict:
    print ("something that looks like "+x+"* is in the set!")
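For completeness, here is one way the dictionary in this answer could be built from the original items. It is a sketch that assumes every item contains at least one ':' and that the part before the first ':' is the key; itemdict is the name used in the answer, the rest is illustrative.

```python
a_set = {'1234word:something', '4321soup:ohnoes', '9cake123:itsokay'}

# Split each item once at the first ':' into a (key, value) pair.
itemdict = dict(item.split(':', 1) for item in a_set)

x = "4321soup"
found = x in itemdict    # average O(1) hash lookup on the key
print(found)             # True
print(itemdict[x])       # ohnoes
```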

What is the best way to create a string array in python?

I'm relatively new to Python and its libraries, and I was wondering how I might create a string array with a preset size. It's easy in Java, but I was wondering how I might do this in Python.
So far all I can think of is
strs = ['']*size
And somehow, when I try to call string methods on it, the debugger gives me an error saying operation X does not exist in object tuple.
And if it was in java this is what I would want to do.
String[] ar = new String[size];
Arrays.fill(ar,"");
Please help.
Error code
strs[sum-1] = strs[sum-1].strip('\(\)')
AttributeError: 'tuple' object has no attribute 'strip'
Question: How might I do what I can normally do in Java in Python while still keeping the code clean.
In python, you wouldn't normally do what you are trying to do. But, the below code will do it:
strs = ["" for x in range(size)]
In Python, the tendency is usually to use a non-fixed-size list (that is to say, items can be appended to or removed from it dynamically). If you follow this, there is no need to allocate a fixed-size collection ahead of time and fill it with empty values. Rather, as you get or create strings, you simply add them to the list. When it comes time to remove values, you simply remove the appropriate value from the list. I would imagine you can probably use this technique here. For example (in Python 2.x syntax):
>>> temp_list = []
>>> print temp_list
[]
>>>
>>> temp_list.append("one")
>>> temp_list.append("two")
>>> print temp_list
['one', 'two']
>>>
>>> temp_list.append("three")
>>> print temp_list
['one', 'two', 'three']
>>>
Of course, some situations might call for something more specific. In your case, a good idea may be to use a deque. Check out the post here: Python, forcing a list to a fixed size. With this, you can create a deque which has a fixed maximum size. Once it is full, appending a new value to the end removes the first element (head of the deque). This may work for what you need, but I don't believe this is considered the "norm" for Python.
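A minimal sketch of the bounded deque described above, using collections.deque with its maxlen argument. The variable name recent and the sample words are illustrative.

```python
from collections import deque

# A deque that holds at most three items.
recent = deque(maxlen=3)
for word in ["one", "two", "three", "four"]:
    recent.append(word)  # once full, the oldest item is dropped from the left

print(list(recent))  # ['two', 'three', 'four']
print(len(recent))   # 3
```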
The simple answer is, "You don't." At the point where you need something to be of fixed length, you're either stuck on old habits or writing for a very specific problem with its own unique set of constraints.
One way to create a string array in Python is with the help of the NumPy library.
Example:
import numpy as np
arr = np.chararray((rows, columns))
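This allocates an array of one-character byte strings whose contents are not guaranteed to be initialized, so assign values using indexing or slicing before reading them. Note that the NumPy documentation describes chararray as existing for backward compatibility and not recommended for new development.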
Are you trying to do something like this?
>>> strs = [s.strip('\(\)') for s in ['some\\', '(list)', 'of', 'strings']]
>>> strs
['some', 'list', 'of', 'strings']
But what is the reason to use a fixed size? There is no actual need in Python for fixed-size arrays (lists): you can always grow a list using append or extend, shrink it using pop, or at least use slicing.
x = ['' for _ in range(10)]
strlist =[{}]*10
strlist[0] = set()
strlist[0].add("Beef")
strlist[0].add("Fish")
strlist[1] = {"Apple", "Banana"}
strlist[1].add("Cherry")
print(strlist[0])
print(strlist[1])
print(strlist[2])
print("Array size:", len(strlist))
print(strlist)
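One thing to watch with the multiplication pattern used above: it repeats references to a single object. That is harmless for immutable strings and for slots that are reassigned, as in the snippet above, but it matters if the shared object is mutated in place. A short sketch with illustrative names:

```python
size = 3

shared = [[]] * size                     # three references to ONE list
independent = [[] for _ in range(size)]  # three separate lists

shared[0].append("Beef")
independent[0].append("Beef")

print(shared)       # [['Beef'], ['Beef'], ['Beef']]
print(independent)  # [['Beef'], [], []]

# With immutable strings, multiplication is safe: assignment rebinds one slot.
strs = [''] * size
strs[0] = "Beef"
print(strs)         # ['Beef', '', '']
```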
The error message says it all: strs[sum-1] is a tuple, not a string. If you show more of your code someone will probably be able to help you. Without that we can only guess.
Sometimes I need an empty char array. You cannot use "np.empty(size)" because an error will be reported if you fill in chars later. So I usually do something quite clumsy, but it is still one way to do it:
import numpy as np
# Suppose you want a size N char array
charlist = [' ']*N # other preset character is fine as well, like 'x'
chararray = np.array(charlist)
# Then you change the content of the array
chararray[somecondition1] = 'a'
chararray[somecondition2] = 'b'
The bad part of this is that your array has default values (if you forget to change them).
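import re

def _remove_regex(input_text, regex_pattern):
    # Remove every substring of input_text that matches regex_pattern
    findregs = re.finditer(regex_pattern, input_text)
    for i in findregs:
        # escape the matched text so re.sub treats it literally
        input_text = re.sub(re.escape(i.group().strip()), '', input_text)
    return input_text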
regex_pattern = r"\buntil\b|\bcan\b|\bboat\b"
_remove_regex("row and row and row your boat until you can row no more", regex_pattern)
\w means that it matches word characters, a|b means match either a or b, \b represents a word boundary
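For this particular pattern the loop is not strictly needed. As a simpler sketch under the same assumptions (same pattern and sample sentence as in the answer's snippet), a single re.sub call removes all matches, and the leftover double spaces can be collapsed afterwards:

```python
import re

regex_pattern = r"\buntil\b|\bcan\b|\bboat\b"
text = "row and row and row your boat until you can row no more"

# Remove every match in one pass.
removed = re.sub(regex_pattern, "", text)
# Collapse the runs of spaces left behind by the removed words.
cleaned = " ".join(removed.split())
print(cleaned)  # row and row and row your you row no more
```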
If you want to take input from the user, here is the code.
If each string is given on a new line:
strs = [input() for i in range(size)]
If the strings are separated by spaces:
strs = input().split()
