Break statement in Python

I am trying to break out of a for loop, but for some reason the following doesn't work as expected:
for out in dbOutPut:
    case_id = out['case_id']
    string = out['subject']
    vectorspace = create_vector_space_model(case_id, string, tfidf_dict)
    vectorspace_list.append(vectorspace)
    case_id_list.append(case_id)
    print len(case_id_list)
    if len(case_id_list) >= kcount:
        print "true"
        break
It just keeps iterating until the end of dbOutPut. What am I doing wrong?

I'm guessing, based on your previous question, that kcount is a string, not an int. Note that when you compare an int with a string, (in CPython version 2) the int is always less than the string because 'int' comes before 'str' in alphabetic order:
In [12]: 100 >= '2'
Out[12]: False
If kcount is a string, then the solution is to add a type to the argparse argument:
import argparse
parser=argparse.ArgumentParser()
parser.add_argument('-k', type = int, help = 'number of clusters')
args=parser.parse_args()
print(type(args.k))
print(args.k)
running
% test.py -k 2
yields
<type 'int'>
2
This confusing error would not arise in Python3. There, comparing an int and a str raises a TypeError.
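If changing the argparse definition is not an option, converting the value once before the loop should have the same effect. This is only a sketch in Python 3 syntax: the function name first_k and the sample data are made up for illustration, and kcount is assumed to arrive as a string such as "3".

```python
def first_k(items, kcount):
    """Collect items until kcount of them have been gathered, then stop."""
    # Convert once, up front, so the comparison below is int against int.
    kcount = int(kcount)
    collected = []
    for item in items:
        collected.append(item)
        if len(collected) >= kcount:
            break
    return collected


result = first_k(range(10), "3")
print(result)
```

With the conversion in place the break fires as soon as the list reaches the requested length, regardless of whether the caller passed "3" or 3.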

Could it be that kcount is actually a string rather than an integer? In that case len(case_id_list) >= kcount can never be true, because in Python 2 an integer always compares as less than a string.
See string to int comparison in python question for more details.

Related

Why is my PySpark function2 giving error and function 1 working fine, logically they both are doing the same thing ? can someone help me understand?

I am trying to write two functions that convert string data in an RDD to float and then find the average sepal length for the iris dataset. One of the two functions works fine, but the second one gives an error. Can someone help me understand what mistake I am making here?
is_float = lambda x: x.replace('.','',1).isdigit() and "." in x
def getSapellen(str2):
    if isinstance(str2, float):
        return str2
    attlist=str2.split(",")
    if is_float(attlist[0]):
        return float(attlist[0])
    else:
        return 0.0
SepalLenAvg=irisRDD.reduce(lambda x,y: getSapellen(x) + getSapellen(y)) \
    /(irisRDD.count()-1)
print(SepalLenAvg)
The above chunk of code works. I am not able to figure out the mistake in the part below:
def getSapellen2(str2):
    if ( str2.find("Sepal") != -1):
        return str2
    attlist=str2.split(",")
    if isinstance(attlist[0],str):
        return float(attlist[0])
    else:
        return 0.0
SepalLenAvg=irisRDD.reduce(lambda x,y: getSapellen2(x)+ getSapellen2(y)) \
    /(irisRDD.count()-1)
print(SepalLenAvg)
On running the second method I get the following error:
TypeError: can only concatenate str (not "float") to str
This error means you are trying to add a string and a float together. The only place where you add things in the code is the lambda passed to irisRDD.reduce.
That means that in at least one instance, calling getSapellen2(x) + getSapellen2(y) makes one call return a str and the other a float.
If you look at the first if statement, it has return str2, which returns a string, while all other branches return numbers.
In other words, the condition str2.find("Sepal") != -1 in getSapellen2 is true at least once (presumably for the header row), and the function then hands back the string itself. getSapellen never does this: it returns its argument unchanged only when isinstance(str2, float) is true, so every value it returns is a float. Return a float from that branch (for example 0.0) instead of str2.
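A minimal sketch of a corrected helper, assuming the header row contains the text "Sepal" and the first comma-separated field is the sepal length. It uses functools.reduce on a plain list as a stand-in for irisRDD.reduce, so the name get_sepal_len and the sample rows are made up for illustration.

```python
from functools import reduce


def get_sepal_len(value):
    """Return the sepal length of a CSV row as a float."""
    # reduce feeds partial sums back into the lambda, so pass floats through.
    if isinstance(value, float):
        return value
    # Header row: contribute 0.0 instead of returning the string itself.
    if "Sepal" in value:
        return 0.0
    first_field = value.split(",")[0]
    try:
        return float(first_field)
    except ValueError:
        # Anything that is not a number counts as zero, as in getSapellen.
        return 0.0


# Hypothetical sample rows standing in for the contents of irisRDD.
rows = ["SepalLength,SepalWidth,Species", "5.1,3.5,setosa", "4.9,3.0,setosa"]
total = reduce(lambda x, y: get_sepal_len(x) + get_sepal_len(y), rows)
average = total / (len(rows) - 1)
print(average)
```

The float check at the top matters as much as the header check: without it, a partial sum passed back in by reduce has no find or split method and the call fails in a different way.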

Conversion From String To Int

I'm communicating with a modem via COM port to receive CSQ values.
response = ser.readline()
csq = response[6:8]
print type(csq)
returns the following:
<type 'str'> and csq is a string with a value from 10-20
For further calculation I try to convert "csq" into an integer, but
i=int(csq)
returns following error:
invalid literal for int() with base 10: ''
A slightly more pythonic way:
i = int(csq) if csq else None
Your error message shows that you are trying to convert an empty string into an int which would cause problems.
Wrap your code in an if statement to check for empty strings:
if csq:
    i = int(csq)
else:
    i = None
Note that empty objects (empty lists, tuples, sets, strings etc) evaluate to False in Python.
As an alternative, you can put your code inside a try-except block:
try:
    i = int(csq)
except ValueError:
    # csq was empty or not a number
    i = None
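Instead of slicing at fixed positions, the value could also be pulled out with a regular expression, which copes with blank or unexpected lines. This is a sketch only: it assumes the modem answers in the usual "+CSQ: <rssi>,<ber>" form and that the line is already a str (bytes read from the port under Python 3 would need decoding first). The function name parse_csq is made up.

```python
import re


def parse_csq(response):
    """Extract the signal quality value from a '+CSQ: <rssi>,<ber>' line.

    Returns None when the line does not contain a CSQ value.
    """
    match = re.search(r"\+CSQ:\s*(\d+)", response)
    if match is None:
        return None
    return int(match.group(1))


print(parse_csq("+CSQ: 15,99\r\n"))
print(parse_csq("\r\n"))
```

Returning None for lines without a value leaves it to the caller to decide whether to read another line or give up.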

string type check in python

I'm trying to do a type check with an element of a pandas dataframe which appears to be a string:
type(atmdf.ix[row]['bid'])
<type 'str'>
however, if I do a type check I get False:
type(atmdf.ix[row]['bid']) is 'str'
False
even with the isinstance I get the same unexpected result:
isinstance(type(atmdf.ix[row]['bid']), str)
False
where am I wrong?
P.S. the content of the dataframe is something like this:
atmdf.ix[row]['bid']
'28.5'
thank you!
You have to test the string itself with isinstance, not the type:
In [2]: isinstance('string', str)
Out[2]: True
So in your case (leaving out the type(..)): isinstance(atmdf.ix[row]['bid'], str).
Your first check did not work because you have to compare to str (the type) not 'str' (a string).
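The different checks side by side, as a small sketch. The variable value is just a stand-in for atmdf.ix[row]['bid'], so no dataframe is needed to try it.

```python
value = '28.5'  # stand-in for the dataframe element

checks = {
    # Comparing the type object against the type str works.
    "type(value) is str": type(value) is str,
    # Comparing the type object against the text 'str' does not.
    "type(value) == 'str'": type(value) == 'str',
    # isinstance wants the object itself as its first argument.
    "isinstance(value, str)": isinstance(value, str),
    # Passing type(value) asks whether the type object is a string.
    "isinstance(type(value), str)": isinstance(type(value), str),
}

for expression, outcome in checks.items():
    print(expression, outcome)
```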

Type of the positional parameters in python

I'm quite new to Python programming and I come from a Unix/Linux administration and shell scripting background. I'm trying to write a program in Python which accepts command line arguments and, depending on their type (int, str), performs a certain action. However, in my case the input is always being treated as a string. Please advise.
#!/usr/bin/python
import os,sys,string
os.system('clear')
# function definition
def fun1(a):
    it = type(1)
    st = type('strg')
    if type(a) == it:
        c = a ** 3
        print ("Cube of the give int value %d is %d" % (a,c))
    elif type(a) == st:
        b = a+'.'
        c = b * 3
        print ("Since given input is string %s ,the concatenated output is %s" % (a,c))
a=sys.argv[1]
fun1(a)
Command line arguments to programs are always given as strings (this is true not only for Python but for at least all C-related languages). This means that when you give a number like "1" as an argument, you need to explicitly convert it into an integer. In your case, you could try converting it and assume it is a string if this does not work:
try:
    v = int(a)
    #... do int related stuff
except ValueError:
    #... do string related stuff
    pass
This is bad design though. It would be better to let the user decide whether the argument should be interpreted as a string; after all, every int given by the user is also a valid string. You could for example use something like argparse and specify two different arguments given with "-i" for int and "-s" for string.
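The "-i" / "-s" idea could look roughly like the sketch below. The destination names number and text and the helper functions build_parser and describe are made up for illustration; the argument lists are passed to parse_args explicitly so the example runs without a real command line.

```python
import argparse


def build_parser():
    """Build a parser where the user states how the value should be read."""
    parser = argparse.ArgumentParser(description="Cube an int or repeat a string")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-i", dest="number", type=int,
                       help="treat the value as an int")
    group.add_argument("-s", dest="text", type=str,
                       help="treat the value as a string")
    return parser


def describe(args):
    """Return the message for whichever option was given."""
    if args.number is not None:
        return "Cube of %d is %d" % (args.number, args.number ** 3)
    return "Repeated string: %s" % ((args.text + ".") * 3)


parser = build_parser()
print(describe(parser.parse_args(["-i", "4"])))
print(describe(parser.parse_args(["-s", "ab"])))
```

In a real script parse_args() would be called without arguments so that it reads sys.argv. The mutually exclusive group makes argparse reject a call that gives both options or neither.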
First of all, the input will always be treated as a string.
You could use argparse:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("cube", type=int,
                    help="Cube of the give int value ")
args = parser.parse_args()
answer = args.cube**3
print answer
python prog.py 4
64
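All integers have an __int__ attribute and strings do not, so you could use that attribute to tell an int from a string. Note that the attribute name has to be passed to hasattr as a string, and that floats have __int__ as well:
if hasattr(intvalue, '__int__'):
    print "Integer"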
Another option is to let ast.literal_eval parse the argument, so that 4 becomes an int and a quoted value such as "'abc'" becomes a str:
import argparse, ast
parser = argparse.ArgumentParser(description="Process a single item (int/str)")
parser.add_argument('item', type=ast.literal_eval,
                    help='item may be an int or a string')
item = parser.parse_args().item
if isinstance(item, int):
    c = item ** 3
    print("Cube of the give int value %d is %d" % (item,c))
elif isinstance(item, str):
    b = item + '.'
    c = b * 3
    print("Since given input is string %s ,the concatenated output is %s"
          % (item,c))
else:
    pass # print error

Python: If is running even when condition is not met

import imaplib, re
import os
while(True):
    conn = imaplib.IMAP4_SSL("imap.gmail.com", 993)
    conn.login("xxx", "xxxx")
    unreadCount = re.search("UNSEEN (\d+)", conn.status("INBOX", "(UNSEEN)")[1][0]).group(1)
    print unreadCount
    if unreadCount > 10:
        os.system('ls')
Even when unreadCount is < 10, it runs the command 'ls'. Why?
You might want to coerce that value to an integer, as per:
unreadCount = int (re.search (blah, blah, blah).group (1))
The call to re.search is returning a string and, if you have a look at the following transcript:
>>> x = "7"
>>> if x > 10:
...     print "yes"
...
yes
>>> if int(x) > 10:
...     print "yes"
...
>>> x = 7
>>> if x > 10:
...     print "yes"
...
>>>
you'll see why that's not such a good idea.
The reason you're seeing this (what you might call bizarre) behaviour can be gleaned from the manual at the bottom of 5.3:
CPython implementation detail: Objects of different types except numbers are ordered by their type names; objects of the same types that don’t support proper comparison are ordered by their address.
Since the type of "7" is str and the type of 10 is int, it's simply comparing the type names ("str" is always greater than "int" in an alpha order), leading to some interesting things like:
>>> "1" > 99999999999999999999999
True
>>> "1" == 1
False
That implementation detail was still in effect up until at least 2.7.2. Python 3 changed it: there, an ordering comparison between a str and an int raises a TypeError (the clause has been removed from the relevant documentation section), although at the time of writing the documentation there still stated:
Most other objects of built-in types compare unequal unless they are the same object; the choice whether one object is considered smaller or larger than another one is made arbitrarily but consistently within one execution of a program.
So it's probably not something you should rely on.
Try this:
if int(unreadCount) > 10:
    os.system('ls')
You're comparing a string to an integer:
>>> '10' > 10
True
This may be shocking; whether the string is coerced to an integer or the integer is coerced to a string, in both cases the result should have been False. The truth is that neither happens, and the ordering is arbitrary. From the language reference:
Most other objects of built-in types compare unequal unless they are the same object; the choice whether one object is considered smaller or larger than another one is made arbitrarily but consistently within one execution of a program.
This will solve your problem:
unreadCount = int(re.search(...).group(1))
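Put together, the parsing step might look like the sketch below, written in Python 3 syntax. The function name unseen_count and the sample response line are made up for illustration; the real line comes from conn.status("INBOX", "(UNSEEN)")[1][0] and may differ in detail.

```python
import re


def unseen_count(status_line):
    """Return the UNSEEN count from an IMAP STATUS response line as an int."""
    # Under Python 3 imaplib hands back bytes, so decode before matching.
    if isinstance(status_line, bytes):
        status_line = status_line.decode("ascii")
    match = re.search(r"UNSEEN (\d+)", status_line)
    if match is None:
        raise ValueError("no UNSEEN count in %r" % status_line)
    return int(match.group(1))


sample = b'"INBOX" (UNSEEN 7)'  # hypothetical response line
count = unseen_count(sample)
print(count > 10)
```

Because the function returns an int, the comparison against 10 is a plain numeric one.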
