Here is a piece of the dataset:
18,8,307,130,3504,12,70,1,chevrolet
15,8,350,165,3693,11.5,70,1,buick
18,8,318,150,3436,11,70,1,plymouth
16,8,304,150,3433,12,70,1,amc
17,8,302,140,3449,10.5,70,1,ford
15,8,429,198,4341,10,70,1,ford
14,8,454,220,4354,9,70,1,chevrolet
14,8,440,215,4312,8.5,70,1,plymouth
Here is the code:
data = sc.textFile("hw6/auto_mpg_original.csv")
records = data.map(lambda x: x.split(","))
hp = float(records.map(lambda x: x[3]))
disp = np.array(float(records.map(lambda x: x[2])))
final_data_1 = LabeledPoint(hp, disp)
Here is the error:
Traceback (most recent call last):
File "/home/cloudera/Desktop/hw6.py", line 41, in <module>
hp = float(records.map(lambda x: x[3]))
TypeError: float() argument must be a string or a number
This seems basic, but I'm really having trouble tracking down a solution.
Check the type of what records.map() returns: it is probably an RDD, not a string or a number, so float() cannot convert it. Apply float() inside the map() instead, e.g.:
hp = records.map(lambda x: float(x[3]))
But you will need to .collect() the results before using them as local Python values, e.g.:
hp = records.map(lambda x: float(x[3])).collect()
disp = np.array(records.map(lambda x: float(x[2])).collect())
There is a problem with the input from the CSV: the column is either empty or contains a non-numeric value.
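To check for empty or non-numeric fields before converting, something like the following might help. It is only a sketch in plain Python: to_float is a made-up helper name, and the third row is a hypothetical record with a missing horsepower value, not a row from the posted sample. The same helper could in principle be passed to the RDD's map() in place of float.

```python
def to_float(text):
    """Return float(text), or None when the field is empty or non-numeric."""
    try:
        return float(text)
    except ValueError:
        return None

rows = [
    "18,8,307,130,3504,12,70,1,chevrolet",
    "15,8,350,165,3693,11.5,70,1,buick",
    "25,4,98,?,2046,19,71,1,ford",  # hypothetical row with a missing horsepower
]

# Split each line into fields, then convert column 3 (horsepower) safely.
records = [line.split(",") for line in rows]
hp = [to_float(fields[3]) for fields in records]
print(hp)  # [130.0, 165.0, None]
```

Rows that come back as None can then be filtered out or inspected before building any model input.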
I have a problem: when I run the code, it shows an error:
Traceback (most recent call last):
File "C:\Users\server\PycharmProjects\Publictest2\main.py", line 19, in <module>
Distance = radar.route.distance(Starts, End, modes='transit')
File "C:\Users\server\PycharmProjects\Publictest2\venv\lib\site-packages\radar\endpoints.py", line 612, in distance
(origin_lat, origin_lng) = origin
ValueError: too many values to unpack (expected 2)
My Code:
from radar import RadarClient
import pandas as pd
API_key = 'API'
radar = RadarClient(API_key)
file = pd.read_excel('files')
file['AntGeo'] = Sourced[['Ant_lat', 'Ant_long']].apply(','.join, axis=1)
file['BaseGeo'] = Sourced[['Base_lat', 'Base_long']].apply(','.join, axis=1)
antpoint = file['AntGeo']
basepoint = file['BaseGeo']
for antpoint in antpoint:
    dist = radar.route.distance(antpoint, basepoint, modes='transit')
    dist = dist['routes'][0]['distance']
    dist = dist / 1000
Firstly, the traceback does not match the code sample you posted.
It is apparent you are working with the python library for the Radar API.
Your corresponding line 19 is dist= radar.route.distance(antpoint , basepoint, modes='transit')
From the radar-python 'pypi manual', your route should be referenced as:
## Routing
radar.route.distance(origin=[lat,lng], destination=[lat,lng], modes='car', units='metric')
Without seeing your dataset (file), one can nonetheless expect the following:
Your antpoint and basepoint must each be a two-item list (or tuple).
For instance, your antpoint ought to have a coordinate like [40.7041029, -73.98706]
See the radar-python manual
See lines 11 and 13 in your code:
file['AntGeo'] = Sourced[['Ant_lat', 'Ant_long']].apply(','.join, axis=1)
file['BaseGeo'] = Sourced[['Base_lat', 'Base_long']].apply(','.join, axis=1)
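As a rough illustration of why the joined string fails and a two-item pair works, here is a sketch that does not call the Radar API at all. parse_point is a made-up helper name, and the coordinate is the one quoted above.

```python
def parse_point(text):
    """Split a 'lat,lng' string into a (lat, lng) tuple of floats."""
    lat, lng = (float(part) for part in text.split(","))
    return lat, lng

joined = "40.7041029,-73.98706"

# Unpacking the raw string iterates over its characters, so it fails
# in the same way as the line inside radar/endpoints.py.
try:
    origin_lat, origin_lng = joined
except ValueError as exc:
    print(exc)  # too many values to unpack (expected 2)

# Unpacking the parsed pair works: it holds exactly two values.
origin_lat, origin_lng = parse_point(joined)
print(origin_lat, origin_lng)
```

Each row's origin and destination would need to be converted this way (or kept as numeric columns in the first place) before being passed on.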
Your error is occurring at this part:
Distance = radar.route.distance(Starts, End, modes='transit')
(origin_lat, origin_lng) = origin
First of all, check how many values "origin" actually contains; I guess it does not match the two values that are expected.
I have fetched a list using pandas, but the numbers come back as a single numeric string. I am trying to convert it to a list of integers.
excel_frame = read_excel(args.path, sheet_name=1, verbose=True, na_filter=False)
data_need = excel_frame['Dependencies'].tolist()
print(data_need)
intStr = data_need.split(',')
map_list = map(int, intStr)
print(map_list)
I am getting the following error.
$python ExcelCellCSVRead.py -p "C:\MyCave\iso\SDG\Integra\Intest\first.xlsx"
Reading sheet 1
['187045, 187046']
Traceback (most recent call last):
File "ExcelCellCSVRead.py", line 31, in <module>
intStr = data_need.split(',')
AttributeError: 'list' object has no attribute 'split'
The target output must be like this -> [187045, 187046]. The current output is coming out like this ->['187045, 187046']
I am pretty sure I have followed the suggested approach to resolve the issue, yet it is throwing an error.
Regards
The problem is:
data_need = excel_frame['Dependencies'].tolist()
returns a list, and a list has no split() method; only the strings inside it do.
Change your existing code to this:
intStr = data_need[0].split(',') ## if you have only 1-element in data_need
map_list = list(map(int, intStr))
print(map_list)
Tested on your sample:
In [1000]: data_need = ['187045, 187046']
In [1001]: intStr = data_need[0].split(',')
In [1002]: map_list = list(map(int, intStr))
In [1003]: print(map_list)
[187045, 187046]
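If the column can hold more than one cell, a slightly more general version might look like this. It is a sketch only: parse_int_cells is a made-up name, and it assumes every cell is a comma-separated string of integers like the sample above.

```python
def parse_int_cells(cells):
    """Flatten a list of 'a, b, c' strings into one list of ints."""
    result = []
    for cell in cells:
        # int() ignores surrounding whitespace, so ' 187046' parses fine;
        # blank pieces (e.g. from a trailing comma) are skipped.
        result.extend(int(part) for part in cell.split(",") if part.strip())
    return result

print(parse_int_cells(['187045, 187046']))  # [187045, 187046]
```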
I am getting more and more confused with Python.
When I try it on one row, it works, but when I run it on all the rows of one column, it shows an error.
I want to use the function convert_hex_to_int for each row in the column,
but it shows me this error:
Traceback (most recent call last):
File "C:/Users/ranic/.PyCharmCE2018.3/config/scratches/scratch_2.py", line 59, in <module>
result_print = (convert_hex_to_int(hex_int, 4))
File "C:/Users/r/.PyCharmCE2018.3/config/scratches/scratch_2.py", line 32, in convert_hex_to_int
splitted = [hex(n)[2:][i:i + interval] for i in range(0, len(hex(n)[2:]), interval)]
TypeError: 'str' object cannot be interpreted as an integer
Here is my code:
cnxn = pyodbc.connect(conn_str)
cnxn.add_output_converter(pyodbc.SQL_VARBINARY, hexToString)
cursor = cnxn.cursor()
def convert_hex_to_int(n:int, interval:int):
    splitted = [hex(n)[2:][i:i + interval] for i in range(0, len(hex(n)[2:]), interval)]
    return [int(hex(unpack('<H', pack('>H', int(i, 16)))[0]), 16) for i in splitted]
try:
    cursor.execute(query)
    row=cursor.fetchval()
    row_list=[]
    while row is not None:
        row=cursor.fetchval()
        hex_int = int(row, 16)
        result_print = (convert_hex_to_int(hex_int, 4))
        result_float = [float("{0:.2f}".format((i) * 10 ** -2)) for i in result_print]
        row_list.append(result_float)
    print(row_list)
Please leave any comment if I miss something, thanks in advance.
When I debugged it, it shows something like this:
Debugged screen
*Sorry, I had to attach the image as a link: it is the debugger screen, I can't copy the code from it, and I am a new user.
**Edit: I think it has to do with calling .fetchval twice, but I'm not too sure.
If the line
[hex(n)[2:][i:i + interval] for i in range(0, len(hex(n)[2:]), interval)]
results in
TypeError: 'str' object cannot be interpreted as an integer
then n must not be an integer.
Observe, if n is '0x94069206':
>>> hex('0x94069206')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'str' object cannot be interpreted as an integer
As the code is taking slices of n it looks as if n needs to be a string, so the line should be:
splitted = [n[2:][i:i + interval] for i in range(0, len(n[2:]), interval)]
It follows that the function signature should be
def convert_hex_to_int(n:str, interval:int)
On the other hand, if n is an int then the next line needs to be reworked.
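For the string case, the whole function could be sketched roughly as below. This is an assumption about what the original intends (byte-swapping each 16-bit chunk), not a confirmed fix: it accepts the hex string with or without a leading 0x, and it drops the int(hex(...), 16) round trip, which returns the same number it was given.

```python
from struct import pack, unpack

def convert_hex_to_int(n: str, interval: int):
    """Split a hex string into chunks and byte-swap each 16-bit chunk."""
    # Drop an optional '0x' prefix so only the hex digits are sliced.
    digits = n[2:] if n.lower().startswith("0x") else n
    splitted = [digits[i:i + interval] for i in range(0, len(digits), interval)]
    # Pack each chunk big-endian, then read it back little-endian.
    return [unpack('<H', pack('>H', int(chunk, 16)))[0] for chunk in splitted]

print(convert_hex_to_int('0x94069206', 4))  # [1684, 1682]
```

With interval 4, each chunk fits the unsigned 16-bit 'H' format; larger intervals would need a different format code.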
So I was coding some problems in Python on this page:
https://www.codewars.com/kata/airport-arrivals-slash-departures-number-1/train/python
The code works fine on my computer, but when I submit it on the site, I come across this bug.
Note that it's Python 3.4.3.
def flap_display(lines, rotors):
    baseString = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ?!##&()|<>.:=-+*/0123456789"
    res = []
    baseLen = len(baseString)
    lineLen = len(lines)
    sLen = len(rotors)
    carrier = 0
    for item in range(0 , sLen):
        if (item < lineLen):
            carrier =carrier + rotors[item]
            tmp = baseString.index(lines[item])
            tmp = tmp + carrier
            tmp = tmp % baseLen
            res.append( baseString[tmp] )
    resS = ''.join(res)
    return resS
print (flap_display("CAT", [1,13,27]))
All the website gave me is this:
Traceback:
in
in flap_display
TypeError: unsupported operand type(s) for +: 'int' and 'list'
Now I want to know if my code is incorrect or if it's just the site being buggy.
Problem is solved! Thanks to mr.kuro
sum requires an iterable: a sequence of items, such as a list. You gave it a single integer. If you want to add up all the integers in rotors, you can do that outside of a loop, with
carrier = sum(rotors)
More to your code, just add up the items you wanted:
carrier = sum(rotors[:lineLen])
This adds the first lineLen elements of rotors, allowing you to get rid of that pesky if statement.
Can you adapt the rest of the loop logic to take proper advantage of that?
The traceback should look like this:
Traceback (most recent call last):
File "test1.py", line 17, in
print (flap_display("CAT", [1,13,27]))
File "test1.py", line 10, in flap_display
carrier =carrier + sum(rotors[item])
TypeError: 'int' object is not iterable
And, as the traceback says, in line
carrier =carrier + sum(rotors[item])
rotors[item] will apparently be an int, so you can't call sum on it, hence the Error.
Replace the above line with:
carrier = carrier + rotors[item]
Or, just skip the loop, and do:
carrier = sum(rotors)
It should be okay now.
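Note that the posted loop adds each rotor value to a running carrier, so every character is shifted by the sum of the rotors up to its position. One way to keep that behaviour without the index bookkeeping is sketched below. It assumes rotors is a flat list of ints, as in the posted call, and reuses the alphabet string exactly as it appears in the question.

```python
from itertools import accumulate

BASE = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ?!##&()|<>.:=-+*/0123456789"

def flap_display(line, rotors):
    """Shift each character by the running total of the rotor values."""
    result = []
    # accumulate() yields the running sums 1, 14, 41 for [1, 13, 27];
    # zip() stops at the shorter input, replacing the explicit length check.
    for char, carrier in zip(line, accumulate(rotors)):
        result.append(BASE[(BASE.index(char) + carrier) % len(BASE)])
    return "".join(result)

print(flap_display("CAT", [1, 13, 27]))  # DOG
```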
The main thing the code should do is open a file and compute the median. This is my code:
def medianStrat(lst):
count = 0
test = []
for line in lst:
test += line.split()
for i in lst:
count = count +1
if count % 2 == 0:
x = count//2
y = lst[x]
z = lst[x-1]
median = (y + z)/2
return median
if count %2 == 1:
x = (count-1)//2
return lst[x] # Where the problem persists
def main():
lst = open(input("Input file name: "), "r")
print(medianStrat(lst))
Here is the error I get:
Traceback (most recent call last):
File "C:/Users/honte_000/PycharmProjects/Comp Sci/2015/2015/storelocation.py", line 30, in <module>
main()
File "C:/Users/honte_000/PycharmProjects/Comp Sci/2015/2015/storelocation.py", line 28, in main
print(medianStrat(lst))
File "C:/Users/honte_000/PycharmProjects/Comp Sci/2015/2015/storelocation.py", line 24, in medianStrat
return lst[x]
TypeError: '_io.TextIOWrapper' object is not subscriptable
I know lst[x] is causing this problem, but I'm not too sure how to solve it.
So what could be the solution, or what could be done instead to make the code work?
You can't index (__getitem__) a _io.TextIOWrapper object. What you can do is work with a list of lines. Try this in your code:
lst = open(input("Input file name: "), "r").readlines()
Also, you aren't closing the file object; this would be better:
with open(input("Input file name: "), "r") as lst:
    print(medianStrat(lst.readlines()))
The with statement ensures that the file gets closed.
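Once the lines are in a list, the median itself could be computed along these lines. This is a sketch under an assumption the question does not confirm, namely that the file holds whitespace-separated numbers; median_from_lines is a made-up name. The values are converted to float and sorted first, since a median needs ordered numbers rather than raw text lines.

```python
def median_from_lines(lines):
    """Return the median of all whitespace-separated numbers in the lines."""
    values = sorted(float(token) for line in lines for token in line.split())
    if not values:
        raise ValueError("no numbers found")
    mid = len(values) // 2
    if len(values) % 2 == 1:
        return values[mid]                       # odd count: middle value
    return (values[mid - 1] + values[mid]) / 2   # even count: mean of the two middle values

print(median_from_lines(["3 1\n", "2\n"]))  # 2.0
```

It accepts any iterable of strings, so the result of readlines() can be passed in directly.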
A basic error on my end; sharing in case anyone else finds it useful. The difference between data types is really important! Just because it looks like JSON doesn't mean it has been parsed as JSON. I ended up on this answer after learning this the hard way.
The open file stream needs to be parsed with Python's json.load before it becomes a dict; until then it is still just text. Once it is a dict, it can be loaded into a DataFrame.
import json
import logging
import pandas as pd
def load_json():  # this function loads json and returns it as a dataframe
    with open("1lumen.com.json", "r") as io_str:
        data = json.load(io_str)  # parse the file contents into a dict
    df = pd.DataFrame.from_dict(data)
    logging.info(df.columns.tolist())
    return df
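The difference between "text that looks like JSON" and parsed JSON can be seen in a small sketch like this one. The sample data is made up, and io.StringIO only stands in for an opened file so the example runs without one.

```python
import io
import json

raw = '{"name": ["a", "b"], "score": [1, 2]}'  # made-up sample data
stream = io.StringIO(raw)  # stands in for an opened file

# Before parsing it is plain text; after json.load it is a dict.
data = json.load(stream)
print(type(raw).__name__, type(data).__name__)  # str dict
print(data["score"])  # [1, 2]
```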