I have one column in my DataFrame with codes like this: 2106080119283699, 2104985492880938. Because of that, I converted the column to string:
File['mtcn']=File['mtcn'].values.astype(str)
I did this to avoid values like 5.47506e+09 appearing when I export the DataFrame to Excel.
When I open the Excel file the column looks fine, but if I click into a cell the value changes.
I know I can change the cell type to Text in Excel so that this never happens, but I want to know whether there is a way to do the conversion in Python, so that after exporting to Excel the code keeps its structure (for example 2106080119283699) even when I click into the cell.
Regards
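A minimal sketch of one way to keep such codes as text, assuming the column holds whole numbers with no missing values. The helper name codes_as_text is hypothetical. Excel keeps only 15 significant digits for numeric cells, so a 16-digit code has to reach the workbook as a string rather than a number.

```python
import pandas as pd


def codes_as_text(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return a copy of df in which `column` holds the codes as digit strings.

    Hypothetical helper: assumes whole-number values and no missing entries.
    """
    out = df.copy()
    # Cast to int64 first so a float column does not turn into text
    # such as "2.106080119283699e+15" when converted to str.
    out[column] = out[column].astype("int64").astype(str)
    return out


df = pd.DataFrame({"mtcn": [2106080119283699, 2104985492880938]})
text_df = codes_as_text(df, "mtcn")
```

Calling text_df.to_excel('codes.xlsx', index=False) afterwards (the file name is an assumption) should write the values as text cells, since the column now holds Python strings rather than numbers.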
Related
I have a date column in a CSV file, as shown below:
23/6/2011 7:00
21/4/1998 05:00
17/02/1990
11/01/1985 30:30:01
26/02/1976
45:42:7
The problem is that when I double-click a cell in the CSV, the actual date value is displayed correctly, e.g. 15/02/2010 10:30:00.
My CSV looks like this:
I cannot fix this manually because I have 20-30 CSV files, each with many rows like this.
When I read the column into a pandas DataFrame and apply to_datetime as below, I get a ParserError:
df['Date'] = pd.to_datetime(df['Date'])
ParserError: hour must be in 0..23: 55:45.0
How can I make pandas read the actual value rather than the value the CSV displays?
I tried changing the format of the CSV file in Excel, but that doesn't help.
Basically, I want pandas to read the value shown on double-click, not the display value.
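A small sketch that may help narrow this down: pandas reads the text stored in the file, not what Excel renders, so loading the column as plain strings shows exactly what was saved. The sample data and the regular expression are assumptions chosen to resemble the values in the question.

```python
import io

import pandas as pd

# Stand-in for one of the CSV files; in practice pass the file path instead.
raw_csv = "Date\n23/6/2011 7:00\n17/02/1990\n45:42:7\n"

# dtype=str keeps every cell exactly as it is written in the file.
df = pd.read_csv(io.StringIO(raw_csv), dtype=str)

# Flag rows that do not start with a day/month/year date.
looks_like_date = df["Date"].str.match(r"\d{1,2}/\d{1,2}/\d{4}")
bad_rows = df.loc[~looks_like_date, "Date"].tolist()
```

If bad_rows contains values such as 45:42:7, the file itself stores the truncated text, and the full date is no longer available for pandas to read.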
After scraping, I put the information in a dataframe and want to export it to a .csv, but one of the three columns ("Content") comes out empty in the .csv file. This is weird, since all three columns are visible in the dataframe, see screenshot.
Screenshot dataframe
Line I use to convert:
df.to_csv('filedestination.csv')
Inspecting the df returns objects:
Inspecting dataframe
Does anyone know how it is possible that the last column, "Content" does not show any data in the .csv file?
Screenshot .csv file
After some suggestions, it seems that the data is there when the file is opened as .txt. How is it possible that Excel does not show the data properly?
Screenshot .txt file data
What is the data type of the Content column?
If it is not a string, you can convert it to a string and then call df.to_csv.
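The conversion suggested above could be sketched as follows. Stripping surrounding whitespace is an extra assumption, not part of the answer: scraped text often starts with newlines, which can make a cell look empty in Excel even though the data is in the file. The sample row is made up.

```python
import io

import pandas as pd

# Made-up scraped row whose Content starts with blank lines.
df = pd.DataFrame({"Title": ["a"], "Content": ["\n\n  scraped text \n"]})

clean = df.copy()
# Force the column to str, then drop leading/trailing whitespace and newlines.
clean["Content"] = clean["Content"].astype(str).str.strip()

# Write to an in-memory buffer here; pass a file path in real use.
buffer = io.StringIO()
clean.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```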
Sometimes this happens: the view and the export differ. Try resetting the index before exporting to .csv/Excel. This always works for me.
df = df.reset_index(drop=True)  # reset_index returns a new DataFrame, so assign it back
then,
df.to_csv(r'file location/filename.csv')
I am reading an excel sheet and plucking data from rows containing the given PO.
import pandas as pd
xlsx = pd.ExcelFile('Book2.xlsx')
df = pd.read_excel(xlsx)
PO_arr = ['121121','212121']
for PO in PO_arr:
    PO_DATA = df.loc[df['PONUM'] == PO]
    for line_num in range(1, PO_DATA['POLINENUM'].max() + 1):
        ...  # loop body not shown in the original post
When I take this Excel sheet straight from its source, my code works fine. But when I cut out only the rows I want and paste them to a new spreadsheet with the exact same formatting and read this new spreadsheet, I have to change PO_DATA to look for an integer instead of a string as such:
PO_DATA = df.loc[df['PONUM'] == int(PO)]
If I don't, I get the warning below, and PO_DATA is an empty dataframe.
C:\...\pandas\core\ops\array_ops.py:253: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
res_values = method(rvalues)
I checked the cell formatting in Excel and in both cases, they are formatted as 'General' cells.
What is going on that makes it so when I chop up my spreadsheet, I have to look for an integer and not a string? What do I have to do to make it work for sheets I've created and pasted relevant data into instead of only sheets from the source?
Excel can do some funky formatting when copy and paste (Ctrl+C, Ctrl+V) is used.
I am sure you tried these but...
A) Copy with Ctrl+C, then paste into the new sheet/file with Ctrl+Alt+V, V, Enter (Paste Special, Values).
B) Use the Format Painter in Excel (it looks like a paintbrush on the Home tab): select the properly formatted cells first, double-click Format Painter, move to your new file/sheet, and select the cells whose format should conform.
C) Select the new file/table you pasted into, click the purple eraser icon in the top options in Excel, and clear all formats.
Update: I found an old related thread that didn't necessarily answer the question but solved the problem.
You can force pandas to import values as a certain data type when reading from Excel by using the converters argument of read_excel.
df = pd.read_excel(xlsx, converters={'POLINENUM':int,'PONUM':int})
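An alternative sketch, assuming the PO numbers should stay strings as in PO_arr: normalise the column to str once, so the comparison works whether Excel stored the cells as numbers or as text. The in-memory frame below stands in for the result of read_excel and its values are made up.

```python
import pandas as pd

# Stand-in for pd.read_excel(...), where the pasted sheet yields integer PO numbers.
df = pd.DataFrame({"PONUM": [121121, 212121, 121121], "POLINENUM": [1, 1, 2]})

# Make the column the same type as the lookup values.
df["PONUM"] = df["PONUM"].astype(str)

PO_arr = ["121121", "212121"]
matches = {po: df.loc[df["PONUM"] == po] for po in PO_arr}
```

Passing dtype={'PONUM': str} to read_excel is another way to get string values at read time. Comparing an integer column against a string is what produces the empty result and the FutureWarning quoted in the question.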
I have an excel sheet and I am reading the excel sheet using pandas in python.
Now I want to read the Excel file based on a column: if the column has a value, that row should be skipped; if the column is empty, the row should be read and its values stored in a list.
Here is a screenshot
Excel Example
In the image above, when the uniqueidentifier is yes the row should not be read, but if it is empty, reading should start from that row.
How can I do that in Python, and how do I get the index so that, after performing some function, I can write back to the blank unique identifier column to mark the row as read?
This is possible for csv files. There you could do
iter_csv = pd.read_csv('file.csv', iterator=True, chunksize=100000)
df = pd.concat([chunk[chunk['UniqueIdentifier'] == 'True'] for chunk in iter_csv])
But pd.read_excel does not offer an option to return an iterator object. Maybe some other Excel readers do, but I don't know which ones. Nevertheless, you could export your Excel file as CSV and use the solution for CSV files.
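For a sheet that fits in memory, a sketch that skips chunking altogether: load everything, select the rows whose identifier is empty, and use their index labels to mark them afterwards. The Name column and the sample values are hypothetical. In real use the frame would come from pd.read_excel and be written back with to_excel.

```python
import pandas as pd

# Stand-in for pd.read_excel(...); empty Excel cells arrive as missing values.
df = pd.DataFrame(
    {"Name": ["a", "b", "c"], "UniqueIdentifier": ["yes", None, None]}
)

# Rows that have not been processed yet.
unread = df[df["UniqueIdentifier"].isna()]
values = unread["Name"].tolist()

# ... process `values` here ...

# unread.index holds the original row labels, so the same rows can be marked.
df.loc[unread.index, "UniqueIdentifier"] = "yes"
```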
I have a DataFrame with an ID column, which is usually an 8-digit number. I write this DataFrame to a CSV file, which is then uploaded and processed by an external application that reads the ID column as a string.
A problem occurs when the external application misinterprets the column type as a number instead of text. To get around this, I have to open the CSV file manually in my application of choice (LibreOffice Calc), go to Data -> "Text to Columns", and set the column type to Text. Uploading the file manually after doing this avoids the issue. I've tried setting the column type in my code:
dataframe_df['id'] = dataframe_df['id'].astype(str)
but this does not stop the error from occurring. Is there some step I'm missing here?
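One thing that may be worth trying, sketched under assumptions: a CSV file carries no column types, so astype(str) alone leaves the written text unchanged. Passing quoting=csv.QUOTE_NONNUMERIC makes pandas wrap string values in quotes, which some readers take as a hint that the field is text. Whether the external application honours this is not known. The amount column and the sample values are made up.

```python
import csv
import io

import pandas as pd

dataframe_df = pd.DataFrame({"id": [12345678, 87654321], "amount": [1.5, 2.0]})
dataframe_df["id"] = dataframe_df["id"].astype(str)

# Write to an in-memory buffer here; pass a file path in real use.
buffer = io.StringIO()
# Quote every non-numeric field, so the string ids are written as "12345678".
dataframe_df.to_csv(buffer, index=False, quoting=csv.QUOTE_NONNUMERIC)
csv_text = buffer.getvalue()
```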