Get Python tuple in different format

I have a python tuple like so,
((1420455415000L, 2L), (1420545729000L, 3L), (1420653453000L, 2L))
I want to convert it into this format:
[[1420455415000, 2], [1420545729000, 3], [1420653453000, 2]]
Please note that I also want to remove the 'L' suffix, which is automatically removed when I convert this tuple to a dict. I have converted the tuple of tuples to a list using:
def listit(t):
    return list(map(listit, t)) if isinstance(t, (list, tuple)) else t
but the L still remains. That is a problem because I am sending the data to JavaScript.
How can I do this?

If you're passing the data to JavaScript, you can do this trivially with the json (JavaScript Object Notation) module:
>>> import json
>>> json.dumps(((1420455415000L, 2L), (1420545729000L, 3L), (1420653453000L, 2L)))
'[[1420455415000, 2], [1420545729000, 3], [1420653453000, 2]]'
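For readers on Python 3, where the separate long type and its L suffix no longer exist, a minimal sketch of the same round trip might look like this. The names rows, payload and restored are illustrative and not from the question.

```python
import json

# Nested tuples of integers, as in the question (Python 3 has no L suffix).
rows = ((1420455415000, 2), (1420545729000, 3), (1420653453000, 2))

# json.dumps serializes tuples as JSON arrays, so no manual conversion is needed.
payload = json.dumps(rows)
print(payload)

# Parsing the text back yields lists of lists.
restored = json.loads(payload)
print(restored)
```

The string in payload is what you would hand to the JavaScript side.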

To get the output in your question you could use
t = ((1420455415000L, 2L), (1420545729000L, 3L), (1420653453000L, 2L))
l = [map(int,x) for x in t]
The conversion from long to int only produces an int if the value is less than or equal to sys.maxint. Otherwise the result stays a long. The conversion is not necessary, though, because the L only denotes the type in the repr and is not part of the value.
If you are passing the data to JavaScript, the conversion to JSON makes more sense.

'L' merely indicates the variable's type, in this case a long integer. Whichever way you send the data, it will behave as an int.
That said, if you really don't want to see that 'L', you would need to change the type to a plain integer with a simple int() call.
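As a rough sketch of that int() conversion applied to the nested tuple from the question: it is written so that it also runs on Python 3, where every integer is already an int, and to_int_lists is a hypothetical helper name rather than anything from the original answer.

```python
def to_int_lists(pairs):
    """Return a list of lists with every value passed through int()."""
    return [[int(value) for value in pair] for pair in pairs]

# Same data as in the question, written without the Python 2 L suffix.
t = ((1420455415000, 2), (1420545729000, 3), (1420653453000, 2))
converted = to_int_lists(t)
print(converted)
```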

L denotes that the number is of type long. If you are 100% sure that the number is within the range a Python int can hold (so that int() really returns an int rather than falling back to long, which happens when the number is very large), you can simply convert it using int(num). But please note that L is just part of the repr: it does not show up when the number is converted to a string with str() or printed (print converts via str() internally); it only shows up when using repr().
Example -
>>> i = 2L
>>> i
2L
>>> int(i)
2
>>> print i
2
>>> str(i)
'2'
>>> i
2L
In your case, to convert longs to int inside a list use -
>>> l = [1L , 2L , 3L]
>>> print l
[1L, 2L, 3L]
>>> l = map(int, l)
>>> l
[1, 2, 3]
>>> print l
[1, 2, 3]
If it's possible that the lists have sublists, use a recursive function such as -
def convertlist(l):
    if isinstance(l, (list, tuple)):
        return list(map(convertlist, l))
    elif isinstance(l, long):
        return int(l)
    else:
        return l
>>> l = [1L , 2L , [3L]]
>>> convertlist(l)
[1, 2, [3]]

Related

Split string to various data types

I would like to convert the following string:
s = '1|2|a|b'
to
[1, 2, 'a', 'b']
Is it possible to do the conversion in one line?

Is it possible to do the conversion in one line?
YES, It is possible. But how?
Algorithm for the approach
Split the string into its constituent parts using str.split. The output of this is
>>> s = '1|2|a|b'
>>> s.split('|')
['1', '2', 'a', 'b']
Now we have solved half the problem. Next we need to loop through the split string and check whether each element is a string or an int. For this we use
A list comprehension, which is for the looping part
str.isdigit for finding if the element is an int or a str.
The list comprehension can be easily written as [i for i in s.split('|')]. But how do we add an if clause there? This is covered in One-line list comprehension: if-else variants. Once we know which elements are ints and which are not, we can easily call the builtin int on the former.
Hence the final code will look like
[int(i) if i.isdigit() else i for i in s.split('|')]
Now for a small demo,
>>> s = '1|2|a|b'
>>> [int(i) if i.isdigit() else i for i in s.split('|')]
[1, 2, 'a', 'b']
As we can see, the output is as expected.
Note that this approach is not suitable if there are many types to be converted.
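To make that limitation concrete, here is a small sketch (the input string is made up for illustration) showing what the same one-liner does with a negative number and a float:

```python
s = '1|-2|3.4|a'

# str.isdigit() is True only when every character is a digit,
# so the sign in '-2' and the dot in '3.4' make it return False.
parsed = [int(i) if i.isdigit() else i for i in s.split('|')]
print(parsed)  # [1, '-2', '3.4', 'a']
```

Only the plain '1' is converted; the other numeric-looking pieces stay strings.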

You cannot handle negative numbers or lots of mixed types in one line, but you could use a function that works for multiple types using ast.literal_eval:
from ast import literal_eval

def f(s, delim):
    for ele in s.split(delim):
        try:
            # literal_eval parses Python literals such as ints, floats and lists
            yield literal_eval(ele)
        except (ValueError, SyntaxError):
            # not a literal (e.g. a bare word): keep the original string
            yield ele

s = '1|-2|a|b|3.4'
print(list(f(s, "|")))
[1, -2, 'a', 'b', 3.4]

Another way is to use the built-in map function:
>>> s='1|2|a|b'
>>> l = map(lambda x: int(x) if x.isdigit() else x, s.split('|'))
>>> l
[1, 2, 'a', 'b']
In Python 3:
>>> s='1|2|a|b'
>>> l = list(map(lambda x: int(x) if x.isdigit() else x, s.split('|')))
>>> l
[1, 2, 'a', 'b']
Since map in Python 3 returns an iterator rather than a list, you must convert it to a list.

It is possible to do arbitrarily many or complex conversions "in a single line" if you're allowed a helper function. Python does not natively have a "convert this string to the type that it should represent" function, because what it "should" represent is vague and may change from application to application.
import json

def convert(value):
    # try each converter in order; the first one that succeeds wins
    converters = [int, float, json.loads]
    for converter in converters:
        try:
            return converter(value)
        except (TypeError, ValueError):
            pass
    # here we assume if all converters failed, it's just a string
    return value

s = "1|2.3|a|[4,5]"
result = [convert(x) for x in s.split("|")]
# result == [1, 2.3, 'a', [4, 5]]

If you have all kinds of data types (more than str and int), I believe this does the job.
s = '1|2|a|b|[1, 2, 3]|(1, 2, 3)'
print [eval(x) if not x.isalpha() else x for x in s.split("|")]
# [1, 2, 'a', 'b', [1, 2, 3], (1, 2, 3)]
This fails if there are elements such as "b1", and eval should only ever be used on trusted input.
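One possible way around the "b1" problem, sketched here with ast.literal_eval swapped in for eval (parse_token is a hypothetical helper name): anything that is not a valid Python literal is simply kept as a string.

```python
from ast import literal_eval

def parse_token(token):
    """Return the Python literal that token spells, or token itself."""
    try:
        return literal_eval(token)
    except (ValueError, SyntaxError):
        # Bare words such as 'a' or 'b1' are not literals; keep them as text.
        return token

s = '1|2|a|b1|[1, 2, 3]|(1, 2, 3)'
parsed = [parse_token(x) for x in s.split('|')]
print(parsed)
```

Unlike eval, literal_eval only accepts literals, so it does not execute arbitrary expressions found in the input.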

Error casting to float, then int in python

Strange error happening.
I know of the issue with trying to cast strings with decimals directly into ints:
int(float('0.0'))
works, while
int('0.0')
does not. However, I'm still getting an error that I can't seem to figure out:
field = line.strip().split()
data[k,:] = [int(float(k)) for k in field[1:]]
ValueError: invalid literal for long() with base 10: '0.0'
Any ideas what could be happening here? The script seems to be thinking it's a cast to long instead of float. Any way to convince it otherwise?
Thanks in advance!
EDIT: the data line is of the form:
'c1c9r2r8\t0.0\t3.4\t2.1\t9.0\n'

It appears that what is happening is that the list comprehension is polluting your namespace.
For example:
k = 0
[k for k in range(10)]
After executing the above code in python 2.x the value of k will be 9 (the last value that was produced by range(10)).
I'll simplify your code to show you what is happening.
>>> l = [None, None, None]
>>> k = 0
>>> l[k] = [k for k in range(3)]
>>> print k, l
2 [None, None, [0, 1, 2]]
You see that l[k] evaluated to l[2] rather than l[0]. To avoid this namespace pollution either do not use the same variable names in a list comprehension as you do in the outer code, or use python 3.x where inner variables of list comprehensions no longer escape to the outer code.
For python 2.x your code should be modified to be something like:
data[k,:] = [int(float(_k)) for _k in field[1:]]
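For comparison, a minimal sketch of the same experiment on Python 3 (an assumption: this block is meant to be run with a Python 3 interpreter), where the comprehension variable lives in its own scope:

```python
k = 0
squares = [k * k for k in range(3)]

# On Python 3 the comprehension has its own scope, so the outer k is untouched.
print(k, squares)
```

On Python 2 the same lines would leave k equal to 2, which is the leak described above.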

>>> line = 'c1c9r2r8\t0.0\t3.4\t2.1\t9.0\n'
>>> field = line.strip().split()
>>> field
['c1c9r2r8', '0.0', '3.4', '2.1', '9.0']
>>> [int(x) for x in map(float, field[1:])]
[0, 3, 2, 9]

Your error is coming from the left-hand side of the assignment data[k, :] = .... Here you're trying to index a NumPy array (data) with a string (k). NumPy tries to do an implicit conversion of that string to a usable integer index, and fails. For example:
>>> import numpy as np
>>> data = np.arange(12).reshape(3, 4)
>>> data['3.4', :] = 6
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for long() with base 10: '3.4'
Use an integer instead of a string here, and the problem should go away.
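A hedged sketch of what keeping the row index an integer could look like. To stay dependency-free it uses a plain list of rows in place of the NumPy array, and the second input line is made up for illustration; row and value are hypothetical names.

```python
lines = [
    'c1c9r2r8\t0.0\t3.4\t2.1\t9.0\n',
    'c2c7r1r5\t1.0\t4.6\t0.2\t7.9\n',  # made-up second line for illustration
]

# A list of rows stands in for the NumPy array here.
data = [[0] * 4 for _ in lines]

for row, line in enumerate(lines):
    field = line.strip().split()
    # row stays an integer index; value is the comprehension's own variable.
    data[row][:] = [int(float(value)) for value in field[1:]]

print(data)
```

Because the index and the comprehension variable have different names, the index can never be overwritten with a string, whichever Python version runs the loop.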

Using max() on a list that contains numbers and strings

I have a list that is like this:
self.thislist = ['Name', 13, 'Name1', 160, 'Name2', 394]
Basically the list has a name and a number after it, and I'm trying to find out the highest number in the list, which is 394. But for some reason, it picks a name as this.
if max(self.thislist) > 150:
    this = max(self.thislist)  # so this should be 394
    position = self.thislist.index(this)  # the index of it
    temponary = position - 1  # this is so we can find the name that is associated with it
    name = self.thislist[temponary]  # and this retrieves the name
and it retrieves, for example, 'Name', when it should be 394.
So the point is to retrieve a name and the number associated with that name. Any ideas?

By calling max, you're asking it to compare all the values.
In Python 2.x, most values can be compared to each other, even if they're of different types; the result is consistent but arbitrary and implementation-specific (in CPython, it mostly comes down to comparing the names of the type objects themselves), and that's rarely if ever useful to you.
In Python 3.x, most values of unrelated types can't be compared to each other, so you'd just get a TypeError instead of a useless answer. But the solution is the same.
If you want to compare the numbers and ignore the names, you can filter out all non-numbers, take only every second element, use a key function that converts all non-numbers to something smaller than any number, or almost anything else that avoids trying to compare the names and the numbers. For example:
if max(self.thislist[1::2]) > 150:
As a side note, using data structures like this is going to make a lot of things more complicated. It seems like what you really want here is not a list of alternating names and numbers, but a dict mapping names to numbers, or a list of name-number pairs, or something similar. Then you could write things more readably. For example, after this:
self.thisdict = dict(zip(self.thislist[::2], self.thislist[1::2]))
… you can do things like:
if max(self.thisdict.itervalues()) > 150:
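Building on the dict idea, a small sketch of how the name and its number could be retrieved together. It is written for Python 3, and scores, best_name and best_value are illustrative names.

```python
thislist = ['Name', 13, 'Name1', 160, 'Name2', 394]

# Pair each name (even positions) with the number that follows it.
scores = dict(zip(thislist[::2], thislist[1::2]))

# max over the keys, ranked by their values, gives the name with the top number.
best_name = max(scores, key=scores.get)
best_value = scores[best_name]

if best_value > 150:
    print(best_name, best_value)
```

This assumes the names are unique; duplicate names would overwrite each other in the dict.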

In Python 2, you can compare values of different types, and they will then be compared by the name of the type. Since str comes after int, any string will compare higher than any integer. Since this doesn't make any sense, Python 3 has wisely removed this feature.
In order to get what you want, use a custom key:
>>> thislist = ['Name', 13, 'Name1', 160, 'Name2', 394]
>>> max(thislist, key = lambda x: x if isinstance(x, (int, long, float)) else 0)
394
(This assumes that there is at least one positive number in the list)

That's because strings always compare greater than integers in Python 2. You can use a custom key function to fix that:
>>> lst = ['Name', 13, 'Name1', 160, 'Name2', 394]
>>> max(lst, key=lambda x: (isinstance(x, (int, float)), x))
394
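To see why that key works, here is a sketch that spells the lambda out as a named function (number_first is a hypothetical name) and prints the keys max actually compares:

```python
lst = ['Name', 13, 'Name1', 160, 'Name2', 394]

def number_first(x):
    # False sorts before True, so every string ranks below every number;
    # ties on the first element are broken by the value itself.
    return (isinstance(x, (int, float)), x)

keys = [number_first(x) for x in lst]
print(keys)

largest = max(lst, key=number_first)
print(largest)
```

Since a string key and a number key always differ in their first element, the string and the number themselves are never compared, so this also runs on Python 3.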

I would do it like this:
>>> thislist = ['Name', 13, 'Name1', 160, 'Name2', 394]
>>> names, numbers = thislist[0::2], thislist[1::2]
>>> max(zip(numbers, names))
(394, 'Name2')

You'll want to filter the strings out of the list first, as values of different types have an arbitrary (but consistent) ordering (in Python 2.x), as can be checked easily:
>>> 'foo' > 1
True
I'd just filter with a generator expression that only pulls out the numbers and pass that to max:
import numbers
max(x for x in self.thislist if isinstance(x, numbers.Number))
demo:
>>> lst = ['foo', 1, 1.6, 'bar']
>>> max(x for x in lst if isinstance(x, numbers.Number))
1.6
>>> lst = ['foo', 1, 1.6, 2, 'bar']
>>> max(x for x in lst if isinstance(x, numbers.Number))
2

What does [u'abcd', u'bcde'] mean in Python?

Used a loop to add a bunch of elements to a list with
mylist = []
for x in otherlist:
mylist.append(x[0:5])
But instead of the expected result ['x1','x2',...], I got: [u'x1', u'x2',...]. Where did the u's come from and why? Also is there a better way to loop through the other list, inserting the first six characters of each element into a new list?

The u means the strings are unicode objects; you probably will not need to worry about it. The loop can also be written more compactly:
mylist.extend(x[:5] for x in otherlist)

The u means unicode. It's Python's internal string representation (from version ... ?).
Most times you don't need to worry about it. (Until you do.)
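As a quick check of the "no need to worry" claim, a sketch for Python 3.3 or later, where the u prefix is still accepted but has no effect because every str is already Unicode:

```python
text = u'x1'   # the u prefix is still accepted in Python 3.3+

# In Python 3 every str is Unicode, so the prefix changes nothing
# and does not appear in the repr.
print(text == 'x1')
print(repr([text, u'x2']))
```

On Python 2 the same list would be displayed as [u'x1', u'x2'], which is the output the question asks about.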

The answers above me already answered the "u" part - that the strings are Unicode strings. About whether there's a better way to extract the leading characters of each item in a list (note that n[0:5] yields the first five characters; use n[0:6] for six):
>>> a = ["abcdefgh", "012345678"]
>>> b = map(lambda n: n[0:5], a)
>>> for x in b:
...     print(x)
...
abcde
01234
So, map applies a function (lambda n: n[0:5]) to each element of a and returns the results of the function for every element. More precisely, in Python 3, it returns an iterator, so the function gets called only as many times as needed (i.e. if your list has 5000 items, but you only pull 10 from the result b, lambda n: n[0:5] gets called only 10 times). In Python 2, map builds the whole list up front, so you need to use itertools.imap instead to get this lazy behaviour.
>>> a = [1, 2, 3]
>>> def plusone(x):
...     print("called with {}".format(x))
...     return x + 1
...
>>> b = map(plusone, a)
>>> print("first item: {}".format(next(b)))
called with 1
first item: 2
Of course, you can apply the function "eagerly" to every element by calling list(b), which will give you a normal list with the function applied to each element on creation.
>>> b = map(plusone, a)
>>> list(b)
called with 1
called with 2
called with 3
[2, 3, 4]

How do I do what strtok() does in C, in Python?

I am learning Python and trying to figure out an efficient way to tokenize a string of numbers separated by commas into a list. Well formed cases work as I expect, but less well formed cases not so much.
If I have this:
A = '1,2,3,4'
B = [int(x) for x in A.split(',')]
B results in [1, 2, 3, 4]
which is what I expect, but if the string is something more like
A = '1,,2,3,4,'
If I'm using the same list comprehension expression for B as above, I get an exception. I think I understand why (some of the "x" string values are not integers), but I'm thinking there would be a way to parse this quite elegantly, such that tokenization of the string works a bit more directly, like strtok(A,",\n\t") would have done when called iteratively in C.
To be clear about what I am asking: I am looking for an elegant/efficient/typical way in Python to have all of the following example strings:
A='1,,2,3,\n,4,\n'
A='1,2,3,4'
A=',1,2,3,4,\t\n'
A='\n\t,1,2,3,,4\n'
return with the same list of:
B=[1,2,3,4]
via some sort of compact expression.

How about this:
A = '1, 2,,3,4 '
B = [int(x) for x in A.split(',') if x.strip()]
x.strip() trims whitespace from the string, which will make it empty if the string is all whitespace. An empty string is "false" in a boolean context, so it's filtered by the if part of the list comprehension.
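A quick sketch that runs this comprehension against the four example strings from the question (parse_ints is a hypothetical wrapper name):

```python
def parse_ints(text):
    """Split on commas and convert every non-blank piece to int."""
    # int() itself tolerates surrounding whitespace such as '4\n'.
    return [int(x) for x in text.split(',') if x.strip()]

cases = ['1,,2,3,\n,4,\n', '1,2,3,4', ',1,2,3,4,\t\n', '\n\t,1,2,3,,4\n']
results = [parse_ints(case) for case in cases]
print(results)
```

Each case yields [1, 2, 3, 4] because pieces that are empty or whitespace-only are dropped before int() sees them.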

Generally, I try to avoid regular expressions, but if you want to split on a bunch of different things, they work. Try this:
import re
result = [int(x) for x in filter(None, re.split('[,\n\t]', A))]

Mmm, functional goodness (with a bit of generator expression thrown in):
a = "1,2,,3,4,"
print map(int, filter(None, (i.strip() for i in a.split(','))))
For full functional joy:
import string
a = "1,2,,3,4,"
print map(int, filter(None, map(string.strip, a.split(','))))

For the sake of completeness, I will answer this seven year old question:
The C program that uses strtok:
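#include <stdio.h>
#include <string.h>

int main(void)
{
    char myLine[] = "This is;a-line,with pieces";
    char *p;
    /* the first call passes the buffer; later calls pass NULL to continue */
    for (p = strtok(myLine, " ;-,"); p != NULL; p = strtok(NULL, " ;-,"))
    {
        printf("piece=%s\n", p);
    }
    return 0;
}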
can be accomplished in python with re.split as:
import re
myLine="This is;a-line,with pieces"
for p in re.split(r"[ ;\-,]", myLine):
    print("piece=" + p)
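One difference worth noting: strtok skips runs of consecutive delimiters, whereas re.split on a single-character class returns empty strings between them. A minimal sketch of a closer imitation, assuming a hypothetical helper named strtok_like that takes the delimiters as a plain string:

```python
import re

def strtok_like(text, delims):
    """Split text on runs of any character in delims, dropping empty tokens."""
    # re.escape keeps characters such as '-' literal inside the character class.
    pattern = '[' + re.escape(delims) + ']+'
    return [token for token in re.split(pattern, text) if token]

print(strtok_like("This is;a-line,with pieces", " ;-,"))
print(strtok_like('1,,2,3,\n,4,\n', ',\n\t'))
```

The trailing + collapses adjacent delimiters, and the final filter removes the empty string that re.split leaves when the text starts or ends with a delimiter.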

This will work, and never raise an exception, if all the numbers are ints. The isdigit() call is false if there's a decimal point in the string.
>>> nums = ['1,,2,3,\n,4\n', '1,2,3,4', ',1,2,3,4,\t\n', '\n\t,1,2,3,,4\n']
>>> for n in nums:
...     [int(i.strip()) for i in n.split(',') if i.strip().isdigit()]
...
[1, 2, 3, 4]
[1, 2, 3, 4]
[1, 2, 3, 4]
[1, 2, 3, 4]

How about this?
>>> a = "1,2,,3,4,"
>>> map(int,filter(None,a.split(",")))
[1, 2, 3, 4]
filter will remove all false values (i.e. empty strings), which are then mapped to int.
EDIT: Just tested this against the above posted versions, and it seems to be significantly faster, 15% or so compared to the strip() one and more than twice as fast as the isdigit() one

Why accept inferior substitutes that cannot segfault your interpreter? With ctypes you can just call the real thing! :-)
# strtok in Python
from ctypes import c_char_p, cdll
try:
    libc = cdll.LoadLibrary('libc.so.6')
except OSError:
    libc = cdll.LoadLibrary('msvcrt.dll')
libc.strtok.restype = c_char_p
dat = c_char_p("1,,2,3,4")
sep = c_char_p(",\n\t")
result = [libc.strtok(dat, sep)] + list(iter(lambda: libc.strtok(None, sep), None))
print(result)

Why not just wrap in a try except block which catches anything not an integer?
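A sketch of what that suggestion might look like in practice; parse_ints_lenient is a hypothetical name, and silently skipping every piece that int() rejects is an assumption about the desired behaviour:

```python
def parse_ints_lenient(text, sep=','):
    """Convert each piece to int, skipping pieces that int() rejects."""
    numbers = []
    for piece in text.split(sep):
        try:
            numbers.append(int(piece))
        except ValueError:
            # Empty or whitespace-only pieces (and any non-integer text) are skipped.
            continue
    return numbers

print(parse_ints_lenient('\n\t,1,2,3,,4\n'))
print(parse_ints_lenient('1,,2,3,\n,4,\n'))
```

The trade-off is that genuinely malformed tokens such as 'abc' are also dropped without any warning, which may or may not be what you want.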

I was desperately in need of a strtok equivalent in Python, so I developed a simple one on my own:
def strtok(val, delim):
    token_list = []
    token_list.append(val)
    # split the current tokens again on each delimiter character in turn
    for key in delim:
        nList = []
        for token in token_list:
            subTokens = [x for x in token.split(key) if x.strip()]
            nList = nList + subTokens
        token_list = nList
    return token_list

I'd guess regular expressions are the way to go: http://docs.python.org/library/re.html
