xlrd read number as string - python

I am trying to read a long number (6425871003976) from an xls file, but Python keeps truncating it because it reads it as a number, not a string (6.42587100398e+12). Is there any way to read it directly as a string even though it is stored as a number in the xls file?
values = sheet.row_values(rownum)
In values it appears correctly (6425871003976.0), but when I try values[0] it has already switched to the incorrect value.
Solution:
This was my solution using repr():
if isinstance(values[1], float):
    code_str = repr(values[1]).split(".")[0]
else:
    code_str = values[1]
product_code = code_str.strip(' \t\n\r')

It's the same value. All that's different is how the value is being printed to screen. The scientific notation you get is because to print the number str is called on it. When you print the list of values the internal __str__ method of the list calls repr on each of its elements. Try print(repr(values[0])) instead.

Use print("%d" % values[0]).
%d formats the value as an integer, so no exponent or decimal point is printed.
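The answers above can be combined into a small sketch. It assumes the cell value arrives as a Python float, which is what xlrd returns for numeric cells; cell_to_text is a hypothetical helper name used only for illustration and is not part of xlrd.

```python
def cell_to_text(value):
    """Return a cell value as text, without a trailing '.0' for whole numbers.

    Hypothetical helper for illustration; not part of xlrd.
    """
    if isinstance(value, float) and value.is_integer():
        # Go through int so neither scientific notation nor '.0' appears.
        return str(int(value))
    return str(value).strip()


raw = 6425871003976.0            # a numeric cell after reading
print(cell_to_text(raw))         # 6425871003976
print("%d" % raw)                # 6425871003976
print(cell_to_text(" AB-12 "))   # AB-12
```

The stored value is the same in every case; only the conversion to text differs.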

This example reads the value of each cell (in your case an int) and converts it to a string with the str function. Note that it uses openpyxl, which reads .xlsx files rather than .xls files.
from openpyxl import load_workbook
wb = load_workbook(filename='xls.xlsx', read_only=True)
ws = wb['Sheet1']
for row in ws.rows:
    for cell in row:
        cell_str = str(cell.value)


pd.to_numeric could not convert string to float

def openfiles():
    file1 = tkinter.filedialog.askopenfilename(filetypes=(("Text Files",".csv"),("All files","*")))
    read_text = pd.read_csv(file1)
    displayed_file.insert(tk.END, read_text)
    read_text['OPCODE'] = pd.to_numeric(read_text['OPCODE'], errors='coerce').fillna(0.0)
    read_text['ADDRESS'] = pd.to_numeric(read_text['ADDRESS'], errors='coerce').fillna(0.0)
    classtype1 = np.argmax(model.predict(read_text), axis=-1)
    tab2_display_text.insert(tk.END, read_text)
When I run this code it shows "could not convert string to float".
Link to the CSV file that is used as the dataframe: https://github.com/Yasir1515/Learning/blob/main/Book2%20-%20Copy.csv
Link to the complete code (the problematic code is at lines 118-119): https://github.com/Yasir1515/Learning/blob/main/PythonApplication1.py
In your data, ADDRESS is a hexadecimal number and OPCODE is a list of hexadecimal numbers. I don't know why you would want to convert hex numbers to float; you should convert them to integers.
The to_numeric method is not suitable for converting a hex string to an integer, or for handling a list of hex numbers. You need to write helper functions:
def hex2int(x):
    try:
        return int(x, 16)
    except (TypeError, ValueError):
        # Not a valid hex string (or not a string at all): fall back to 0
        return 0

def hex_list2int_list(zz):
    return [hex2int(el) for el in zz.split()]
Now replace the relevant lines:
read_text['OPCODE'] = read_text['OPCODE'].apply(hex_list2int_list)
read_text['ADDRESS'] = read_text['ADDRESS'].apply(hex2int)
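As a quick check of how such helpers behave, the following sketch redefines them and runs them on made-up sample strings. The values are illustrative assumptions and are not taken from the CSV file.

```python
def hex2int(x):
    """Parse a hex string; return 0 when it cannot be parsed."""
    try:
        return int(x, 16)
    except (TypeError, ValueError):
        return 0


def hex_list2int_list(zz):
    """Parse a whitespace-separated string of hex numbers."""
    return [hex2int(el) for el in zz.split()]


print(hex2int("1A"))                      # 26
print(hex2int("0x401000"))                # 4198400 (a 0x prefix is accepted with base 16)
print(hex_list2int_list("88 99 FF zz"))   # [136, 153, 255, 0]
```

Falling back to 0 hides malformed input, so it may be worth logging or counting the failures if a genuine 0 is a meaningful value in the data.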
I looked at your CSV file. The OPCODE column holds a long string of numbers separated by spaces (' '), so you cannot cast that kind of value to a numeric type (the string '88 99 77 66' is not a number). One solution is to split the values in the OPCODE column into separate rows and then apply to_numeric; afterwards you can do your manipulation and return the data to its previous form.
What I suggest is:
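read_text = pd.read_csv(file1)
# One row per opcode: the Series index holds the opcodes, the values repeat the address
new_df = pd.concat([pd.Series(row['ADDRESS'], row['OPCODE'].split(' '))
                    for _, row in read_text.iterrows()]).reset_index()
new_df.columns = ['OPCODE', 'ADDRESS']
new_df['OPCODE'] = pd.to_numeric(new_df['OPCODE'], errors='coerce').fillna(0.0)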

How to store list in rows of mysql using Python/Flask?

I am getting some values from an HTML form and storing them in a list. The list looks like this:
["string1", "string2", "string3", "string4", "string5"]
I want to store these values in rows of a MySQL table, but I am not sure how to do it.
What I have done so far:
descrip = []
descrip.append(description1)
descrip.append(description2)
descrip.append(description3)
descrip.append(description4)
descrip.append(description5)
for r in descrip:
    result_descrp = db.execute("""INSERT INTO description(id,description) VALUES (1,%s)""", ((descrip)))
return render_template('forms/success.html')
But I am getting this error:
TypeError: not all arguments converted during string formatting
First, your query has a single %s placeholder, which expects a single value, but you pass it the whole list.
I also don't know the type of the description column in your schema. If you just want to save the string representation of the list in the database, you can convert the list to a string with str(descrip).
MySQL also supports a JSON column type (as does MariaDB).
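To illustrate storing one row per list element, here is a minimal sketch. It uses the standard-library sqlite3 module with an in-memory database purely as a runnable stand-in for MySQL. sqlite3 uses ? placeholders, whereas MySQL drivers typically use %s. The table layout and values are assumptions for illustration.

```python
import json
import sqlite3

descrip = ["string1", "string2", "string3", "string4", "string5"]

conn = sqlite3.connect(":memory:")  # stand-in for the real MySQL connection
conn.execute("CREATE TABLE description (id INTEGER, description TEXT)")

# One parameter tuple per row: each placeholder receives exactly one value.
rows = [(1, text) for text in descrip]
conn.executemany("INSERT INTO description (id, description) VALUES (?, ?)", rows)

stored = [r[0] for r in conn.execute(
    "SELECT description FROM description ORDER BY rowid")]
print(stored)

# Alternative: keep the whole list in a single column as JSON text.
as_json = json.dumps(descrip)
print(json.loads(as_json) == descrip)
```

The key point is that the number of values in each parameter tuple matches the number of placeholders in the query.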
descrip = []
descrip.append(description1)
descrip.append(description2)
descrip.append(description3)
descrip.append(description4)
descrip.append(description5)
for r in range(5):
    if descrip[r]:
        # The number of columns must match the number of %s placeholders and values
        result_add_event = db.execute("""INSERT INTO event_description(event_id,description,created_at) VALUES (%s,%s,%s)""", (id, descrip[r], timestamp))
The code above worked fine. :)
Special thanks to #shiva and also to those who helped me.

What format does data input have to be for Python's json.loads?

I'm trying to use json.loads to parse data in a Redshift database table. I've stripped out the function to test in a Python script and am having trouble understanding what's happening.
The code I'm using is:
import json
j="'['Bars', 'American (Traditional)', 'Nightlife', 'Restaurants']'"
def trythis(item, ascending):
    if not j:
        return '1'
    try:
        arr = json.loads(j)
    except ValueError:
        return '2'
    if not ascending:
        arr = sorted(arr, reverse=True)
    else:
        arr = sorted(arr)
    return json.dumps(arr)
print trythis(j, True)
And this is returning 2.
I've tried changing the input variable to j="['Bars', 'American (Traditional)', 'Nightlife', 'Restaurants']" but that hasn't worked. What format does my entry value need to be?
Your input string j is not valid JSON. JSON doesn't allow the use of single quotes (') to denote string values.
Try switching the quotes: '["Bars", "American (Traditional)", "Nightlife", "Restaurants"]'
The JSON specification is an excellent resource for determining if your input is valid JSON. You can find it here: http://www.json.org/
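A short sketch of the difference, using the list from the question. The ast.literal_eval fallback is not part of the answer above; it is shown only as one possible option, on the assumption that the stored strings are Python-style literals that cannot be changed at the source.

```python
import ast
import json

# Valid JSON: strings use double quotes.
valid_json = '["Bars", "American (Traditional)", "Nightlife", "Restaurants"]'
descending = sorted(json.loads(valid_json), reverse=True)
print(descending)

# A Python-style literal with single quotes is rejected by json.loads.
python_literal = "['Bars', 'American (Traditional)', 'Nightlife', 'Restaurants']"
try:
    json.loads(python_literal)
    rejected = False
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    rejected = True
print(rejected)

# ast.literal_eval parses Python literals without executing code.
arr = ast.literal_eval(python_literal)
result = json.dumps(sorted(arr))
print(result)
```

Note that the extra outer quotes in the original j (a quoted string inside a quoted string) would need to be removed before either parser could accept it.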

Assign strings to IDs in Python

I am reading a text file with python, formatted where the values in each column may be numeric or strings.
When those values are strings, I need to assign a unique ID of that string (unique across all the strings under the same column; the same ID must be assigned if the same string appears elsewhere under the same column).
What would be an efficient way to do it?
Use a defaultdict with a default value factory that generates new ids:
ids = collections.defaultdict(itertools.count().next)
ids['a'] # 0
ids['b'] # 1
ids['a'] # 0
When you look up a key in a defaultdict, if it's not already present, the defaultdict calls a user-provided default value factory to get the value and stores it before returning it.
itertools.count() creates an iterator that counts up from 0, so itertools.count().next is a bound method that produces a new integer whenever you call it.
Combined, these tools produce a dict that returns a new integer whenever you look up something you've never looked up before.
Here is the defaultdict answer updated for Python 3, where .next is now .__next__, and for pylint compliance, where calling "magic" __*__ methods directly is discouraged:
ids = collections.defaultdict(functools.partial(next, itertools.count()))
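A self-contained Python 3 sketch of the defaultdict approach, applied to a made-up column of values (the sample data is an assumption for illustration):

```python
import collections
import functools
import itertools

# Each lookup of an unseen key calls next() on the shared counter.
ids = collections.defaultdict(functools.partial(next, itertools.count()))

column = ["red", "blue", "red", "green", "blue"]  # made-up sample values
encoded = [ids[value] for value in column]

print(encoded)     # [0, 1, 0, 2, 1]
print(dict(ids))   # {'red': 0, 'blue': 1, 'green': 2}
```

For IDs that are unique per column rather than across the whole file, one such defaultdict per column would be needed.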
Create a set, and then add strings to the set. This will ensure that strings are not duplicated; then you can use enumerate to get a unique id of each string. Use this ID when you are writing the file out again.
Here I am assuming the second column is the one you want to scan for text or integers.
import csv

seen = set()
with open('somefile.txt') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        try:
            int(row[1])
        except ValueError:
            seen.add(row[1])  # adds string to set
# print the unique ids for each string
for id, text in enumerate(seen):
    print("{}: {}".format(id, text))
Now you can take the same logic and replicate it across each column of your file. If you know the number of columns in advance, you can keep a list of sets. Suppose the file has three columns:
unique_strings = [set(), set(), set()]
with open('file.txt') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        for column, value in enumerate(row):
            try:
                int(value)
            except ValueError:
                # It is not an integer, so it must be
                # a string
                unique_strings[column].add(value)
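A variation on the same idea, sketched under the assumption that the IDs should be assigned while reading and in order of first appearance. Because set iteration order is arbitrary, IDs taken from enumerate over a set may differ between runs. This sketch keeps one dict per column instead, and uses made-up in-memory data in place of the file.

```python
import csv
import io

# Made-up sample data standing in for the text file.
sample = io.StringIO("1,apple,x\n2,pear,7\n3,apple,y\n")

id_maps = []        # one {string: id} dict per column, created on demand
encoded_rows = []
for row in csv.reader(sample, delimiter=','):
    while len(id_maps) < len(row):
        id_maps.append({})
    encoded = []
    for column, value in enumerate(row):
        try:
            encoded.append(int(value))
        except ValueError:
            mapping = id_maps[column]
            # setdefault assigns the next free id only for unseen strings
            encoded.append(mapping.setdefault(value, len(mapping)))
    encoded_rows.append(encoded)

print(encoded_rows)   # [[1, 0, 0], [2, 1, 7], [3, 0, 1]]
```

In a column that mixes numbers and strings, an assigned ID can coincide with a genuine numeric value, so the two may need to be kept apart (for example with an offset or a separate column).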

Python - Producing custom filename during creation

So I have a script that produces files named for each key in a dictionary. (script below)
for key, values in sequencelist.items():
    with open(key, 'w') as out:
        for value in values:
            out.write('\n'.join(value.split()) + '\n')
Can someone help me modify the code above to do more? I would like to append some plain text to the filename and also add the current position (out of len(dict.keys())) using range(). See my script below, which doesn't work! :)
for key, values in sequencelist.items():
    for i in range(len(sequencelist.keys())):
        j = i+1
        with open('OTU(%j)' +'_' + key +'.txt' %j, 'w') as out:
            for value in values:
                out.write('\n'.join(value.split()) + '\n')
So the first file created would be OTU(1)_key.txt
I am sure the with open() line is 100% wrong.
Could someone also link me to something to read on how %j is used to refer to the variable j from the previous line? I was trying to use code from this Stack Overflow answer (Input a text file and write multiple output files in Python), which came with no explanation.
Try the following
for count, (key, values) in enumerate(sequencelist.items()):
    with open('OTU(%d)_%s.txt' % (count+1, str(key)), 'w') as out:
        for value in values:
            out.write('\n'.join(value.split()) + '\n')
I swapped the ordering of your open call with your value iteration so you don't get len(sequencelist) files for each value. It seemed like your j argument was not required after this change. The enumerate call makes the count part of the for loop increment each time the loop repeats (it doesn't have to be called count).
The %d asks for an integer and the %s for a string; str() will convert most keys nicely. If your key is an instance of some custom class, you'll want to give it a nicer string form, as otherwise you'll get something like <class __main__.Test at 0x00000....>.
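On the %j part of the question: the letter after % is a conversion type (such as d or s), not a variable name, and the values are supplied by the tuple to the right of the % operator. In the original attempt, % also binds more tightly than +, so it was applied only to '.txt'. A small sketch with made-up data shows the working form next to the equivalent str.format call:

```python
sequencelist = {"alpha": ["A C G"], "beta": ["T T"]}  # made-up sample data

names = []
for count, key in enumerate(sequencelist, start=1):
    percent_style = 'OTU(%d)_%s.txt' % (count, key)      # printf-style formatting
    format_style = 'OTU({})_{}.txt'.format(count, key)   # str.format equivalent
    assert percent_style == format_style
    names.append(percent_style)

print(names)   # ['OTU(1)_alpha.txt', 'OTU(2)_beta.txt']
```

Passing start=1 to enumerate removes the need for count+1.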
