output for str.join() method is not consistent - python

Let's assign two variables:
>>> a_id = 'c99faf24275d476d84e0c8f0ad953582'
>>> u_id = '59958a11a6ad4d8b39707a70'
Right output:
>>> a_id+u_id
'c99faf24275d476d84e0c8f0ad95358259958a11a6ad4d8b39707a70'
Wrong output:
>>> str.join(a_id,u_id)
'5c99faf24275d476d84e0c8f0ad9535829c99faf24275d476d84e0c8f0ad9535829c99faf24275d476d84e0c8f0ad9535825c99faf24275d476d84e0c8f0ad9535828c99faf24275d476d84e0c8f0ad953582ac99faf24275d476d84e0c8f0ad9535821c99faf24275d476d84e0c8f0ad9535821c99faf24275d476d84e0c8f0ad953582ac99faf24275d476d84e0c8f0ad9535826c99faf24275d476d84e0c8f0ad953582ac99faf24275d476d84e0c8f0ad953582dc99faf24275d476d84e0c8f0ad9535824c99faf24275d476d84e0c8f0ad953582dc99faf24275d476d84e0c8f0ad9535828c99faf24275d476d84e0c8f0ad953582bc99faf24275d476d84e0c8f0ad9535823c99faf24275d476d84e0c8f0ad9535829c99faf24275d476d84e0c8f0ad9535827c99faf24275d476d84e0c8f0ad9535820c99faf24275d476d84e0c8f0ad9535827c99faf24275d476d84e0c8f0ad953582ac99faf24275d476d84e0c8f0ad9535827c99faf24275d476d84e0c8f0ad9535820'
Now consider this case, the output is correct now:
>>> a="asdf"
>>> b="asdfsdfsd"
>>> str.join(a,b)
'aasdfsasdfdasdffasdfsasdfdasdffasdfsasdfd'
Confirming the type of all variables in the example:
>>> type(a)
<class 'str'>
>>> type(a_id)
<class 'str'>
>>> type(u_id)
<class 'str'>
Edit
I just realized that the output in the second case was not what I expected either. I was using the join method the wrong way.

str.join(a, b) is equivalent to a.join(b), provided a is a str object and b is an iterable of strings. Strings are always iterable: iterating over a string yields each of its characters in turn.
This is basically "insert a copy of a between every two adjacent elements of b (as an iterable)", so if a and b are both strings, a copy of a is inserted between every pair of adjacent characters in b. For example:
>>> str.join(".", "123456")
'1.2.3.4.5.6'
If you simply want to concatenate two strings, + is enough, not join:
>>> "." + "123456"
'.123456'
If you really want join, put the strings in a list and use an empty string as "delimiter":
>>> str.join('', ['123', '456', '7890'])
'1234567890'
>>> ''.join(['123', '456', '7890'])
'1234567890'

Related

How to index into the string result of a dictionary item reference?

What is the best syntax to reference the characters in a reference to dictionary item? How do I make it indexable?
>>> myd = {'abc':'123','def':'456','ghi':'789'}
>>> myd
{'def': '456', 'ghi': '789', 'abc': '123'}
>>> type(myd)
<class 'dict'>
>>> s=myd['def']
>>> s
'456'
>>> type(s)
<class 'str'>
>>> s[0]
'4'
>>> s[2]
'6'
>>> myd['def'].[0]
SyntaxError: invalid syntax
myd['def'] returns the string '456'. You can access a specific index of a string using the same bracket notation that most languages use for arrays. Hence, myd['def'][0] will return the string '4'.
Just remove the . and it will work.
You have not actually indexed your string. myd['def'] returns a string, and you then need to use [] to index or slice it, here with [0]. Adding a . before the bracket is simply a syntax error in Python.
This link describes slicing strings
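As a small sketch of the point made in the answers, subscripts can be chained directly on the dictionary lookup, with no dot in between. The variable names on the left are only illustrative.

```python
myd = {'abc': '123', 'def': '456', 'ghi': '789'}

# The lookup returns a str, which can be indexed or sliced right away.
first_char = myd['def'][0]    # single character by index
last_char = myd['def'][-1]    # negative indexes count from the end
first_two = myd['def'][:2]    # a slice returns a substring

print(first_char, last_char, first_two)
```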

Assigning empty value or string in Python

I would like to understand if there is a difference between assigning an empty value and an empty output, as follows:
1> Assigning a value like this
string = ""
2> An empty value returned as output
string = "abcd:"
str1, str2 = string.split(':')
In other words, is there a difference in values of 'string' in 1> and 'str2' in 2>? And how would a method see the value of 'str2' if it is passed as an argument?
Checking equality with ==
>>> string = ""
>>> s = "abcd:"
>>> str1, str2 = s.split(':')
>>> str1
'abcd'
>>> str2
''
>>> str2 == string
True
Maybe you were trying to compare with is. This is for testing identity: a is b is equivalent to id(a) == id(b).
Or check both strings for emptiness:
>>> not str2
True
>>> not string
True
>>>
So that both are empty ...
>>> string1 = ""
>>> string2 = "abcd:"
>>> str1, str2 = string2.split(':')
>>> str1
'abcd'
>>> str2
''
>>> string1 == str2
True
No. There is no difference between the two empty strings. They would behave the same in all cases.
In other words, is there a difference in values of 'string' in 1> and 'str2' in 2>?
No, there is no difference, both are empty strings "".
And how would a method see the value of 'str2' if it is passed as an argument?
The method would see it as a string of length 0, in other words, an empty string.
If you check id(string) in case 1 and id(str2) in case 2, you will typically get the same value in CPython, meaning both names refer to the same string object.
def mine(str1, str2):
    print str1, str2
With the method above you can call mine(*string.split(':')), which splits 'abcd:' and passes str1 = 'abcd' and str2 = ''.
You can see for yourself.
>>> s1 = ''
>>> s2 = 'abcd:'
>>> s3, s4 = s2.split(':')
>>> s1 == s4
True
>>> string = ""
>>> id(string)
2458400
>>> print string
>>> string = "abcd:"
>>> str1, str2 = string.split(':')
>>> print str1
abcd
>>> print str2
>>> id(str2)
2458400
>>> type(string)
<type 'str'>
>>> type(str2)
<type 'str'>
No, there is no difference.
Strings in Python are immutable objects, so the value of an empty string never changes. However, in some cases two string objects with the same value can have different identities (the identity of an object is its memory address in CPython, and you can get it with id(obj)). So, to answer your question:
print id(string) == id(str2) # Can output either True or False
print string == str2 # Will always output True
Note that most of the time id(string) should be equal to id(str2) :).
You can read about the Data Model in the Python Language Reference for further details. I am quoting the text which is pertinent to the question:
Types affect almost all aspects of object behavior. Even the
importance of object identity is affected in some sense: for immutable
types, operations that compute new values may actually return a
reference to any existing object with the same type and value, while
for mutable objects this is not allowed. E.g., after a = 1; b = 1, a
and b may or may not refer to the same object with the value one,
depending on the implementation, but after c = []; d = [], c and d are
guaranteed to refer to two different, unique, newly created empty
lists. (Note that c = d = [] assigns the same object to both c and d.)
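The distinction between value and identity discussed above can be sketched as follows. This assumes Python 3, and describe is a hypothetical helper standing in for "a method that receives str2 as an argument".

```python
string = ""
s = "abcd:"
str1, str2 = s.split(':')

# Equality compares values and is always True for two empty strings.
same_value = (str2 == string)

# Identity is an implementation detail for immutable objects,
# so code should not rely on the result of this comparison.
same_object = (str2 is string)

# For mutable objects, two separate literals are guaranteed to be distinct.
c = []
d = []
lists_equal = (c == d)
lists_identical = (c is d)


def describe(value):
    """Hypothetical helper: report what a function sees when given value."""
    return "empty string" if value == "" else "non-empty: %r" % value


print(same_value, lists_equal, lists_identical)
print(describe(str2))
```

Whatever same_object turns out to be, describe cannot tell the two empty strings apart, which is why == (or a truthiness check) is the right test.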

python convert unicode to string

I get my results from sqlite in Python as tuples like this: (u'PR:000017512',)
However, I want to print it as 'PR:000017512'. At first, I tried to select the first item in the tuple by using index [0], but the printed result is still u'PR:000017512'. Then I used str() to convert it and nothing changed. How can I print this without the u''?
You're confusing the string representation with its value. When you print a unicode string the u doesn't get printed:
>>> foo=u'abc'
>>> foo
u'abc'
>>> print foo
abc
Update:
Since you're dealing with a tuple, you don't get off this easy: You have to print the members of the tuple:
>>> foo=(u'abc',)
>>> print foo
(u'abc',)
>>> # If the tuple really only has one member, you can just subscript it:
>>> print foo[0]
abc
>>> # Join is a more realistic approach when dealing with iterables:
>>> print '\n'.join(foo)
abc
Don't see the problem:
>>> x = (u'PR:000017512',)
>>> print x
(u'PR:000017512',)
>>> print x[0]
PR:000017512
>>>
The string is in unicode format, but it still means PR:000017512
Check out the docs on String literals
http://docs.python.org/2/reference/lexical_analysis.html#string-literals
In [22]: unicode('foo').encode('ascii','replace')
Out[22]: 'foo'
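For comparison, here is a minimal sketch of the same situation under Python 3, where str is already Unicode and no u prefix appears. The in-memory database, the table name proteins and the column name pid are assumptions made up for the illustration.

```python
import sqlite3

# Hypothetical in-memory table holding the value from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE proteins (pid TEXT)")
conn.execute("INSERT INTO proteins VALUES (?)", ("PR:000017512",))
row = conn.execute("SELECT pid FROM proteins").fetchone()
conn.close()

# A fetched row is a tuple; its repr includes the parentheses, quotes and comma.
shown_as_tuple = repr(row)

# Subscripting the tuple gives the string itself, which prints without quotes.
value = row[0]

print(shown_as_tuple)
print(value)
```

The point is the same as in the answers above: the decoration belongs to the representation of the container, not to the string's value.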

How to properly convert list of one element to a tuple with one element

>>> list=['Hello']
>>> tuple(list)
('Hello',)
Why is the result of the above statements ('Hello',) and not ('Hello')? I would have expected it to be the latter.
You've got it right. In python if you do:
a = ("hello")
a will be a string, since the parentheses in this context are used for grouping things together. It is actually the comma which makes a tuple, not the parentheses (parentheses are just needed to avoid ambiguity in certain situations like function calls)...
a = "Hello","goodbye" #Look Ma! No Parenthesis!
print (type(a)) #<type 'tuple'>
a = ("Hello")
print (type(a)) #<type 'str'>
a = ("Hello",)
print (type(a)) #<type 'tuple'>
a = "Hello",
print (type(a)) #<type 'tuple'>
And finally (and most direct for your question):
>>> a = ['Hello']
>>> b = tuple(a)
>>> print (type(b)) #<type 'tuple'> -- It thinks it is a tuple
>>> print (b[0]) #'Hello' -- It acts like a tuple too -- Must be :)
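The same checks can be written as a short script. This is only a sketch, and the variable names are illustrative (items is used instead of list so the built-in is not shadowed).

```python
items = ['Hello']
as_tuple = tuple(items)

grouped = ('Hello')    # parentheses only group: this is a str
single = ('Hello',)    # the trailing comma makes a one-element tuple
bare = 'Hello',        # the parentheses are optional

# A one-element tuple can be unpacked back into its single value.
(only,) = as_tuple

print(type(grouped).__name__, type(single).__name__, type(bare).__name__)
print(as_tuple == single, only)
```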

Alternative to python string item assignment

What is the best / correct way to use item assignment for python string ?
i.e. s = "ABCDEFGH"; s[1] = 'a'; s[-1] = 'b'?
Normal way will throw : 'str' object does not support item assignment
Strings are immutable. That means you can't assign to them at all. You could use formatting:
>>> s = 'abc{0}efg'.format('d')
>>> s
'abcdefg'
Or concatenation:
>>> s = 'abc' + 'd' + 'efg'
>>> s
'abcdefg'
Or replacement (thanks Odomontois for reminding me):
>>> s = 'abc0efg'
>>> s.replace('0', 'd')
'abcdefg'
But keep in mind that all of these methods create copies of the string, rather than modifying it in-place. If you want in-place modification, you could use a bytearray -- though that will only work for plain ascii strings, as alexis points out.
>>> b = bytearray(b'abc0efg')
>>> b[3] = ord('d')
>>> b
bytearray(b'abcdefg')
Or you could create a list of characters and manipulate that. This is probably the most efficient and correct way to do frequent, large-scale string manipulation:
>>> l = list('abc0efg')
>>> l[3] = 'd'
>>> l
['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> ''.join(l)
'abcdefg'
And consider the re module for more complex operations.
String formatting and list manipulation are the two methods that are most likely to be correct and efficient IMO -- string formatting when only a few insertions are required, and list manipulation when you need to frequently update your string.
Since strings are "immutable", you get the effect of editing by constructing a modified version of the string and assigning it over the old value. If you want to replace or insert to a specific position in the string, the most array-like syntax is to use slices:
s = "ABCDEFGH"
s = s[:3] + 'd' + s[4:] # Change D to d at position 3
It's more likely that you want to replace a particular character or string with another. Do that with re, again collecting the result rather than modifying in place:
import re
s = "ABCDEFGH"
s = re.sub("DE", "--", s)
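The slicing approach can be wrapped in a small function that also handles negative indexes, which covers both assignments from the question. replace_at is a hypothetical helper name, not something from the standard library.

```python
def replace_at(s, index, new_char):
    """Return a copy of s with the character at index replaced."""
    # Normalize negative indexes the way normal string indexing does.
    if index < 0:
        index += len(s)
    if not 0 <= index < len(s):
        raise IndexError("string index out of range")
    # Build a new string; the original is left untouched.
    return s[:index] + new_char + s[index + 1:]


s = "ABCDEFGH"
s = replace_at(s, 1, 'a')    # stands in for s[1] = 'a'
s = replace_at(s, -1, 'b')   # stands in for s[-1] = 'b'
print(s)  # AaCDEFGb
```

Each call creates a new string, so for many edits in a row the list-of-characters approach above is likely to be cheaper.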
I guess this Object could help:
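class Charray(list):
    def __init__(self, mapping=()):
        "A character array."
        # Numbers are accepted too and converted to their string form.
        if isinstance(mapping, (int, float)):
            mapping = str(mapping)
        list.__init__(self, mapping)
    def __getitem__(self, i):
        # Python 3 routes slicing through __getitem__ (there is no __getslice__).
        result = list.__getitem__(self, i)
        return Charray(result) if isinstance(i, slice) else result
    def __setitem__(self, i, x):
        # Only single-character strings may be assigned.
        if not isinstance(x, str) or len(x) > 1:
            raise TypeError
        list.__setitem__(self, i, x)
    def __repr__(self):
        return "charray['%s']" % "".join(self)
    def __str__(self):
        return "".join(self)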
For example:
>>> carray = Charray("Stack Overflow")
>>> carray
charray['Stack Overflow']
>>> carray[:5]
charray['Stack']
>>> carray[-8:]
charray['Overflow']
>>> str(carray)
'Stack Overflow'
>>> carray[6] = 'z'
>>> carray
charray['Stack zverflow']
s = "ABCDEFGH"; s[1] = 'a'; s[-1] = 'b'
You can do it like this:
s=s[0:1]+'a'+s[2:]
This is much simpler than the other, more complex ways.
