Here's the problem: if you don't use a check-state column in ObjectListView, there's always a gap at the start of every row:
It is especially noticeable when you want to use a small column to show row numbers:
This does not look good at all. How do you make each row extend all the way to the left edge?
I've just been adding an extra, empty first column that has no valueGetter and a width of 0 so it doesn't show; that way the following column starts on the left edge.
Add the following as the first column.
ObjectListView.ColumnDefn(title="", valueGetter="", maximumWidth=0)
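For context, here is a minimal sketch of how the column list might look with this dummy column in front (the Row class, the other column names, and the parent widget are made up for illustration; keyword names may vary slightly between ObjectListView versions):
import wx
from ObjectListView import ObjectListView, ColumnDefn

class Row:  # hypothetical model object, only for illustration
    def __init__(self, number, name):
        self.number = number
        self.name = name

olv = ObjectListView(parent, style=wx.LC_REPORT)  # 'parent' is an existing wx window
olv.SetColumns([
    # Dummy first column: no real valueGetter and zero width, so it never shows.
    ColumnDefn(title="", valueGetter="", maximumWidth=0),
    ColumnDefn(title="#", valueGetter="number", width=40),
    ColumnDefn(title="Name", valueGetter="name", width=200),
])
olv.SetObjects([Row(1, "first"), Row(2, "second")])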
My question is about Python pandas. Suppose we have a pandas DataFrame df with two columns, "A" and "B". I subset it to select all values of column "B" where the entries of column "A" are equal to "value". Why does the order of the following commands not matter?
df[df["A"]=="value"]["B"]
returns the same result as
df["B"][df["A"]=="value"]
Can you please explain why these work the same? Thanks.
The df["A"]=="value" part of your code returns a pandas Series containing Boolean values in accordance to the condition ("A" == "value").
By puting a series mask (a filter, basically) on your DataFrame returns a DataFrame containining only the values on the rows where you've had True in your Series mask.
So, in your first code ( df[df["A"]=="value"]["B"] ), you are applying the specific mask on the DataFrame, obtaining only the rows where the column "A" was equal to "value", then you are extracting the "B" column from your DataFrame.
In your second code, you are first selecting the column "B", then you are selecting only the rows where the column "A" == "value" in the initial DataFrame.
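For example, with a tiny made-up frame, both orders give exactly the same Series:
import pandas as pd

df = pd.DataFrame({"A": ["value", "other", "value"], "B": [1, 2, 3]})

rows_then_column = df[df["A"] == "value"]["B"]
column_then_rows = df["B"][df["A"] == "value"]

# Both expressions return the values 1 and 3 at index labels 0 and 2.
print(rows_then_column.equals(column_then_rows))  # True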
Hope this helps!
The first form selects rows and then the column, whereas the second one selects the column and then the rows.
You can select rows and column simultaneously with loc:
df.loc[df['A'] == 'value', 'B']
Please read the "Evaluation order matters" section of the pandas documentation.
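A small sketch (with made-up data) of why this matters, especially when assigning values:
import pandas as pd

df = pd.DataFrame({"A": ["value", "other", "value"], "B": [1, 2, 3]})

# One selection step: rows and column at the same time.
selected = df.loc[df["A"] == "value", "B"]

# For assignment, .loc updates df directly; chained indexing such as
# df[df["A"] == "value"]["B"] = 0 may act on a temporary copy and
# trigger SettingWithCopyWarning instead of changing df.
df.loc[df["A"] == "value", "B"] = 0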
Imagine you have two sheets of standard-sized printer paper in portrait mode. You want to get a very particular rectangle whose top-left corner is in the exact center of a sheet of paper, with width 1cm and height 3cm.
First way:
Step #1: For the first sheet, you make two vertical cuts with a pair of scissors, one exactly in the middle and one a centimeter to the right of the middle. You discard the left and right pieces, keeping only the 1cm wide strip. This is conceptually similar to performing df["B"] on a dataframe to select only the Series that is column B.
Step #2: You then make two horizontal cuts with the pair of scissors, one exactly in the middle and one three centimeters below the first cut. You discard the top and bottom pieces, keeping only the 3cm high (and 1cm wide) rectangle. This is conceptually similar to starting with the Series df["B"] (let's call this Series X) and then performing X[df["A"]=="value"] to obtain the rows within X that satisfy the logical condition df["A"]=="value".
Second way:
Step #1: For the second sheet, you make two horizontal cuts with a pair of scissors, one exactly in the middle and one three centimeters below the middle. You discard the top and bottom pieces, keeping only the 3cm high strip. This is conceptually similar to performing df[df["A"]=="value"] on a dataframe to select only the rows that satisfy the logical condition df["A"]=="value".
Step #2: You then make two vertical cuts with the pair of scissors, one exactly in the middle and one a centimeter to the right of the first cut. You discard the left and right pieces, keeping only the 1cm wide (and 3cm high) rectangle. This is conceptually similar to starting with the DataFrame df[df["A"]=="value"] (let's call this DataFrame Y) and then performing Y["B"] to obtain the column "B" within Y.
Observation:
The thought experiment above shows that whether we cut the paper vertically then horizontally, or instead cut it horizontally then vertically, we will end up with an identical rectangular result from exactly the same location (horizontally and vertically) in either case.
Conclusion:
The intuition required to understand the answer to your question is almost completely analogous to the more tangible example using paper. One slight difference is that the rows selected by df[df["A"]=="value"] may not be contiguous, so rather than being analogous to a 3cm high slice of paper, they may be analogous to multiple parallel horizontal strips (i.e., multiple groups of contiguous rows).
I'm working with a DataFrame containing various data points of customer data. I'm looking to replace any junk phone numbers with a blank value. Right now I'm struggling to find an efficient way to find potential junk values, such as a phone number like 111-111-1111, and replace that specific value with a blank entry.
I currently have a fairly ugly solution where I go through three fields (home phone, cell phone, and work phone), locate the index values of the rows and columns in question, and then replace those values.
With regards to actually finding junk values in a DataFrame, is there a better approach than what I am currently doing?
row_index = dataset[dataset['phone'].str.contains('11111')].index
column_index = dataset.columns.get_loc('phone')
Afterwards, I zip these up and cycle through them in a for loop, using dataset.iat[row_index, column_index] = ''. The row and column index variables also have the junk values from the 'cellphone' and 'workphone' columns appended to them.
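For clarity, the loop I'm describing looks roughly like this (a simplified sketch; the column names are the ones mentioned above, the data is made up):
import pandas as pd

# Simplified stand-in data; the real dataset has many more columns.
dataset = pd.DataFrame({
    "phone": ["1111111111", "5551234567"],
    "cellphone": ["5559876543", "1111111111"],
    "workphone": ["1111111111", "5552223333"],
})

row_indexes, column_indexes = [], []
for col in ["phone", "cellphone", "workphone"]:
    junk_rows = dataset[dataset[col].str.contains("11111")].index
    row_indexes.extend(junk_rows)
    column_indexes.extend([dataset.columns.get_loc(col)] * len(junk_rows))

# Blank out each junk value one by one.
# iat needs integer positions; with the default RangeIndex the labels double as positions.
for row, col in zip(row_indexes, column_indexes):
    dataset.iat[row, col] = ""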
The pandas where method tends to be quick:
dataset['phone'] = dataset['phone'].where(~dataset['phone'].str.contains('11111'), None)
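Since there are three phone fields in play, one way to extend this (a sketch; the 'cellphone' and 'workphone' column names are taken from the question) is to loop over them:
# Apply the same masking to every phone column, writing a blank string for junk values.
for col in ['phone', 'cellphone', 'workphone']:
    dataset[col] = dataset[col].where(~dataset[col].str.contains('11111'), '')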
I am trying to use pandas dropna() on the dates in column 2 so that the first date ends up at index 0.
df=pd.read_excel(excel_filepath,sheetname='Consolidated Balance Sheets',header=None)
Did you try:
df.iloc[0,2] = df.iloc[2,8]
If you would like to squeeze just column 2, so that the NaN values at the top disappear (or rather move down), you can do it as follows:
import numpy as np

# Collect the non-NaN values of column '2', blank out the column,
# then write the collected values back starting at the top.
non_na_values = df['2'].dropna().to_list()
df['2'] = np.nan
df.iloc[:len(non_na_values), df.columns.get_loc('2')] = non_na_values
I know this is a very unconventional way to do it, but the problem is also very unusual: normally the column values within a row are related somehow, so in most cases an operation like this doesn't make sense. That is probably also why there is most likely no more elegant way to solve something like this. I guess you have already thought about sorting the data.
Note: You may have to remove the quotes around the 2 (it depends on what type your column name actually is).
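To illustrate the effect on a tiny made-up frame (using an integer column label, as header=None would produce):
import numpy as np
import pandas as pd

# Toy frame: column 2 has its only value at the bottom.
df = pd.DataFrame({0: ["a", "b", "c"], 2: [np.nan, np.nan, 7.0]})

# Squeeze column 2: collect the non-NaN values, clear the column, write them back at the top.
non_na_values = df[2].dropna().to_list()
df[2] = np.nan
df.iloc[:len(non_na_values), df.columns.get_loc(2)] = non_na_values

print(df[2].tolist())  # [7.0, nan, nan]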
Can you please suggest an easy way to convert time periods to the corresponding indexes?
I have a function that picks entries from data frames based on numerical indexes (from the 10th to the 20th row, say) and that I cannot change. At the same time, my data frame has time indexes, and I have picked parts of it based on timestamps. How can I convert those timestamps to the corresponding numerical indexes?
Thanks a lot,
Alex
Adding some examples:
small_df.index[1]
Out[894]: Timestamp('2019-02-08 07:53:33.360000')
small_df.index[10]
Out[895]: Timestamp('2019-02-08 07:54:00.149000')
These two timestamps delimit the period I want to pick from a second data frame that also has a time index, but I want to do the picking with numerical indexing.
That means I first need to find which numerical indexes correspond to the time period above.
Based on the comment above, this might be quite close to what I need:
start = second_dataframe.index.get_loc(pd.Timestamp(small_df.index[1]))
end = second_dataframe.index.get_loc(pd.Timestamp(small_df.index[10]))
picked_rows = second_dataframe[start:end]
Is there a better way to do that?
I believe you need Index.get_loc if you need the position:
small_df.index.get_loc(pd.Timestamp('2019-02-08 07:53:33.360000'))
1
EDIT: If the values always match, it is possible to take the timestamps from the first DataFrame and extract the rows of the second one with DataFrame.loc:
start = small_df.index[1]
end = small_df.index[10]
picked_rows = second_dataframe.loc[start:end]
Or:
start = pd.Timestamp(small_df.index[1])
end = pd.Timestamp(small_df.index[10])
picked_rows = second_dataframe.loc[start:end]
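If the downstream function really requires integer positions, as the question describes, a sketch combining get_loc with iloc might look like this (assuming the boundary timestamps exist in second_dataframe's index):
# Positions of the boundary timestamps inside second_dataframe's index.
start = second_dataframe.index.get_loc(small_df.index[1])
end = second_dataframe.index.get_loc(small_df.index[10])

# iloc slices by position and excludes the end row, so add 1 to make it
# inclusive, matching what .loc[start:end] returns with timestamps.
picked_rows = second_dataframe.iloc[start:end + 1]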
An example of the text file is shown in the picture.
According to the file, the direction of the data changes after the word 'chapter'.
In other words, the reading direction changes from horizontal to vertical.
In order to solve this problem, I found read_fwf in the pandas module and applied it, but it failed.
linefwf = pandas.read_fwf('File.txt', widths=[33,33,33], header=None, nwors = 3)
The gap between the categories (Chapter, Title, Assignment) is 33 characters.
But the command (linefwf) prints every line of the page, including the horizontal categories such as Title, Date, and Reservation, as well as blank lines.
Please, I want to know how to export the vertical data only.
Let me take a stab in the dark: you wish to turn this table into a column (aka "vertical category"), ignoring the other columns?
I didn't have your precise text, so I guesstimated it. My column widths were different than yours ([11,21,31]) and I omitted the nwors argument (you probably meant to use nrows, but it's superfluous in this case). While the column spec isn't very precise, a few seconds of fiddling left me with a workable DataFrame:
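Roughly, the read looked like this (a sketch using the guessed widths; the file name is taken from the question):
import pandas as pd

# Fixed-width read with guessed column widths; header=None keeps the raw rows as data.
df = pd.read_fwf('File.txt', widths=[11, 21, 31], header=None)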
This is pretty typical of read-in datasets. Let's clean it up slightly, by giving it real column names, and taking out the separator rows:
# Use the first raw row as the column names, then keep only the data rows (labels 2 through 6).
df.columns = list(df.loc[0])
df = df.loc[2:6]
This gives the DataFrame proper column names and drops the header and separator rows, leaving df with only the data rows.
We won't take the time to reindex the rows. Assuming we want the values of a column, we can get them by indexing:
df['Chapter']
Yields:
2 1-1
3 1-2
4 1-3
5 1-4
6 1-5
Name: Chapter, dtype: object
Or, if you want it not as a pandas.Series but as a native Python list:
list(df['Chapter'])
Yields:
['1-1', '1-2', '1-3', '1-4', '1-5']