How to align numbers to the right in Excel using pandas - Python

I have written a program in Python using pandas that creates an Excel file which shows numbers and strings in a few sheets. I want the numbers to be aligned to the right.
Right now it looks like this:
Test name    Failed\passed    Running time
Test A       Passed           3
How can I align the number column to the right?
I tried using this:
df1.style.set_properties(**{'text-align': 'right'})
But it did not fix the problem.
Thanks.

Take a look at this one: https://xlsxwriter.readthedocs.io/example_pandas_column_formats.html
Maybe this one will work for you:
format1 = workbook.add_format({'align': 'right', 'bold': True, 'bottom': 6})
and then pass it to the worksheet, for example via set_column (as in the linked example) or conditional_format.
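For what it's worth, here is a minimal sketch of that approach with pandas and the xlsxwriter engine. The file name, sheet name and column width are placeholders, and the DataFrame just mirrors the table in the question:

import pandas as pd

# Small DataFrame mirroring the table in the question
df1 = pd.DataFrame({'Test name': ['Test A'],
                    'Failed\\passed': ['Passed'],
                    'Running time': [3]})

with pd.ExcelWriter('report.xlsx', engine='xlsxwriter') as writer:
    df1.to_excel(writer, sheet_name='Sheet1', index=False)
    workbook = writer.book
    worksheet = writer.sheets['Sheet1']
    # Right-aligned cell format, as in the snippet above (bold/border left out)
    right_fmt = workbook.add_format({'align': 'right'})
    # Apply it to the "Running time" column (column C); 15 is an arbitrary width
    worksheet.set_column('C:C', 15, right_fmt)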

Related

pandas.to_datetime() does not filter when used with loc[] and comparison operator

I downloaded a .csv file to do some practice. A column named "year_month" is a string with the format "YYYY-MM".
By doing:
df = pd.read_csv('C:/..../migration_flows.csv',parse_dates=["year_month"])
"year_month" is Dtype=object. So far so good.
By doing:
df["year_month"] = pd.to_datetime(df["year_month"],format='%Y-%m-%d')
it is converted to datetime64[ns]. So far so good.
I try to filter certain dates by doing:
filtered_df = df.loc[(df["year_month"]>= pd.Timestamp(2018-1-1))]
The filter returns the whole column as if nothing had happened; for instance, the result still starts from the date "2001-01-01".
Any thoughts on how to filter properly? Many thanks
How about this:
df.loc[(df["year_month"]>= pd.to_datetime('2018-01-01'))]
or
df.loc[(df["year_month"]>= pd.Timestamp('2018-01-01'))]
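For what it's worth, the likely reason the original attempt returned everything: 2018-1-1 is plain integer arithmetic, so it evaluates to 2016, and pandas interprets a bare integer passed to pd.Timestamp as nanoseconds since the Unix epoch, which lands in 1970 and is therefore earlier than every row. A quick check:

import pandas as pd

print(2018 - 1 - 1)                 # 2016 (integer subtraction, not a date)
print(pd.Timestamp(2018-1-1))       # 1970-01-01 00:00:00.000002016
print(pd.Timestamp('2018-01-01'))   # 2018-01-01 00:00:00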

How to turn 'string' into 'BuiltInCategory' in Revit Python Shell when reading excel file

I am trying to import data from Excel into Revit Python Shell in order to verify whether some parameters exist in the Revit file for selected categories of objects. But I am having some problems with the first 'for' loop (it looks into the first column of the Excel sheet and gets the categories).
The first step to achieving what I want is to get all the elements of the categories that I want to analyze. I've tried a lot of things, but I always end up with the error shown in the screenshot below (got string instead of built-in category). I searched for a method to transform a string into a BuiltInCategory, but I did not find anything.
Does anyone know how to deal with this? Is there a way to transform a string into a BuiltInCategory, or is there another solution?
Thank you!
[screenshot of the error]
This question is more a C# than a Revit API one. Search for 'c# enum string convert', which turns up, e.g., convert a string to an enum in C#. In this case:
BuiltInCategory bic = (BuiltInCategory) Enum.Parse( typeof( BuiltInCategory ), "some_string" );
Looks like the problem is with these lines:
fec = FilteredElementCollector(doc)
cat = fec.OfCategory(i.Value2)
In this instance, i.Value2 is a string, but fec.OfCategory expects a BuiltInCategory, as you noted earlier.
Jeremy's answer will convert a string to a BuiltInCategory (thanks Jeremy, I had no idea you could do that!) like this:
bic = BuiltInCategory.Parse(BuiltInCategory, "OST_PlumbingFixtures")
So in your example it would be:
fec.OfCategory(BuiltInCategory.Parse(BuiltInCategory, i.Value2.split('.')[1]))
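To put that into the context of the original loop, here is a hypothetical sketch for Revit Python Shell (IronPython). It assumes the Excel cells hold strings like "BuiltInCategory.OST_PlumbingFixtures", that doc and the Excel worksheet COM object are already available, and it uses System.Enum.Parse, which is the same mechanism under the hood:

from System import Enum
from Autodesk.Revit.DB import FilteredElementCollector, BuiltInCategory

for row in range(2, 12):                     # hypothetical row range of the category column
    raw = worksheet.Cells[row, 1].Value2     # e.g. "BuiltInCategory.OST_PlumbingFixtures"
    if not raw:
        continue
    name = raw.split('.')[-1]                # -> "OST_PlumbingFixtures"
    bic = Enum.Parse(BuiltInCategory, name)  # string -> BuiltInCategory
    elements = (FilteredElementCollector(doc)
                .OfCategory(bic)
                .WhereElementIsNotElementType()
                .ToElements())
    print('{}: {} elements'.format(name, len(elements)))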

(Python) manually copy/paste data from pandas table without copying the index

I've been looking around but could not find a similar post, so I thought I'd give it a go.
I wrote a pandas program that successfully displays the resulting DataFrame as a pandas table in a tkinter textbox. The aim is that the user can select the data and copy/paste it into an (existing) Excel sheet. When doing this, the index is always copied as well. I was wondering if one could programmatically select the complete table except the index?
I know that one can save to Excel or other formats with index=False, but I could not find a kind of df.select....index=False. I hope my explanation is more or less clear ;-)
Thanks a lot
[screenshot]
You could use the DataFrame's to_string function; here you can pass index=False as one of the parameters. For example, say we have this df:
import pandas as pd
df = pd.DataFrame({'a': ['yes', 'no', 'yes' ], 'b': [10, 5, 20]})
print(df.to_string(index = False))
this would give you:
  a   b
yes  10
 no   5
yes  20
Hope this helps!
I finally found it.
Instead of using something like self.mytable.copy('columns') to select everything and then switch to Excel and paste it, I use this line of code, which does exactly what I need:
df.to_clipboard(sep="\t", index=False)
The sep="\t" makes it split across columns in Excel.
Hopefully someone can use this at some stage.
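If you want to wire that into the tkinter GUI directly, a sketch along these lines should work (the DataFrame and widget names are placeholders, and a working system clipboard is assumed):

import tkinter as tk
import pandas as pd

df = pd.DataFrame({'a': ['yes', 'no', 'yes'], 'b': [10, 5, 20]})

root = tk.Tk()
# Copies the table without the index; sep="\t" keeps the Excel columns separate
copy_button = tk.Button(root, text="Copy table",
                        command=lambda: df.to_clipboard(sep="\t", index=False))
copy_button.pack()
root.mainloop()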

Python: how to import Excel cell's displayed (formatted) value instead of real value

For example, a cell has a real value of 1.96. The cell is formatted with rounding so that it displays 2 in Excel.
Another example, a cell has a value of 5, but it is formatted to display currency, so $5.
There could be any type of formatting. So I cannot know advance what formatting I am looking for.
How do I import these displayed values with Python? I've looked into Pandas and openpyxl but couldn't find a way.
I can do it semi-manually by creating a custom VBA module, following Duncan O'Donnell's answer (the last one) here: https://superuser.com/questions/678934/how-can-i-get-the-displayed-value-of-a-cell-in-ms-excel-for-text-that-was-conv
But I need to fully automate and do it in Python. Any help is appreciated, thanks!
Openpyxl has a feature called number_format which could come in handy. I basically played with the output of dir() to get to this.
# Let's say we have a value in cell C2 which is 1.96 but formatted to display 2
fmt = ws['C2'].number_format
print(fmt)
# '#'
The '#' indicates the cell is formatted to display as a whole number.
# Create a conditional:
if fmt == "#":
    r = round(ws['C2'].value)
    print(r)
# 2
It's a bit of a hack, but it should help with your use case.
For the dollar part, I will refer you to this stack overflow post, as I believe it captures your use case quite well.
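Building on the number_format idea, a rough sketch that covers the two examples from the question might look like this (the file name is hypothetical, and real spreadsheets can use far more format codes than the few handled here):

from openpyxl import load_workbook

wb = load_workbook('example.xlsx')   # hypothetical file
ws = wb.active

def displayed_value(cell):
    fmt = cell.number_format
    if fmt in ('#', '0'):              # whole-number formats -> 1.96 shows as 2
        return str(round(cell.value))
    if '$' in fmt:                     # simple currency formats -> 5 shows as $5
        return '${:,.0f}'.format(cell.value)
    return str(cell.value)             # fall back to the stored value

print(displayed_value(ws['C2']))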
I was able to solve this with VBA... so I guess there is no easy way with Python after all.

Long numbers conversion format

The conversion from XML to a CSV file is done by some code and the specifications that I have added.
As a result I get a CSV file, and once I open it I see some weird numbers that look something like this:
1,25151E+21
Is there any way to eliminate this and show the whole numbers? The code itself that parses the XML to CSV is working fine, so I'm assuming it is an Excel thing.
I don't want to go and do something manually every time I generate a new CSV file.
Additional
The entire code can be found HERE and I have only long numbers in Quality
for qu in sn.findall('.//Qualify'):
    repeated_values['qualify'] = qu.text
CSV doesn't pass any cell formatting rules to Excel. Hence, if you open a CSV that has very large numbers in it, the default cell formatting will likely be Scientific. You can try changing the cell formatting to Number, and if that changes the view to the entire number like you want, consider using XlsxWriter to apply cell formatting to the document while writing to .xlsx instead of CSV.
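As a sketch of that suggestion (the file name and column width are placeholders, and the column name "Qualify" is borrowed from the snippet above):

import pandas as pd

df = pd.DataFrame({'Qualify': [1251510000000000000000]})

with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1', index=False)
    workbook = writer.book
    worksheet = writer.sheets['Sheet1']
    # '0' displays the value as a plain integer instead of scientific notation
    num_fmt = workbook.add_format({'num_format': '0'})
    worksheet.set_column('A:A', 25, num_fmt)

Keep in mind that Excel only keeps about 15 significant digits for numeric cells, so if these values are identifiers rather than quantities, the string approach below is safer.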
I often end up running a lambda on dataframes with this issue when I bring in csv, fwf, etc, for ETL and back out to XLSX. In my case they are all account numbers, so it's pretty bad when Excel helpfully overrides it to scientific notation.
If you don't mind the long number being a string, you can do this:
import numpy as np

# First I force it to be an int column as I import everything as objects for unrelated reasons
df.thatlongnumber = df.thatlongnumber.astype(np.int64)
# Then I convert that to a string (note the assignment back, otherwise apply's result is discarded)
df.thatlongnumber = df.thatlongnumber.apply(lambda x: '{:d}'.format(x))
Let me know if this is useful at all.
Scientific notation is a pain. What I've used before to handle situations like this is to cast it into a float and then use a format specifier; something like this should work (note that a float only carries about 15-16 significant digits, so the trailing digits will not match the original exactly):
a = "1,25151E+21"
print(f"{float(a.replace(',', '.')):.0f}")
>>> 1251510000000000065536
