Spotfire 5.5
Hi, I am looking for a way to color code or group columns together in a Spotfire cross-table. I have three categories (nearest, any, all) and three columns associated with each category. Is there a way I can visually group these columns with their corresponding category?
Is there a way to change column heading color?
Is there a way to put a border around the three column groups?
Can I display their category above the three corresponding columns?
Thanks
You can color values by category using Properties->Color->Add Rule, where many conditions are available to apply to your visualization.
I'm running 7.0.1.9 and there is still no customization on table headers - besides choosing the font. Using rules to color table values would be your best bet, but it does not apply to the headers themselves, so you're pretty much stuck.
I have seen this example of custom CSS styling for creating an export tool, but that requires a much more complex approach (basically you'll reconstruct the whole analysis by HTML coding) and I'm not quite sure it would allow for header customization... anyway, here it is: http://stn.spotfire.com/stn/Tutorials/HowToCreateExportTool.aspx
1. Is there a way to change column heading color?
Ans: You can change the color for column headers. Please find the script below:
if ($('#my-style').length) {
    // do nothing, style sheet already exists
} else {
    // add CSS to the body
    $('<style>', { id: 'my-style' }).text(
        '.sf-element-table-cell.sfc-column-header { background-color:#566573;color:white;text-align:right;} ' +
        '.sf-element-table-cell.sfc-row { background-color:#F4F6F6;color:Black;text-align:right;}'
    ).appendTo('body');
}
2. Is there a way to put a border around the three column groups?
Ans: You can do this if these columns are stored as rows in a single column. If not, you can perform an Unpivot to bring them into one column, then build a cross table using the category column (nearest, any, all) as the column reference.
3. Can I display their category above the three corresponding columns?
Ans: You will get the category above the columns once you perform the operation above.
Please let me know if you need any inputs.
Related
I've got multiple excels and I need a specific value but in each excel, the cell with the value changes position slightly. However, this value is always preceded by a generic description of it which remains constant in all excels.
I was wondering if there was a way to ask Python to grab the value to the right of the element containing the string "xxx".
Try iterating over the Excel files (I guess you loaded each as a separate pandas object?),
something like for df in [dataframe1, dataframe2...dataframeN].
Then you could pick the column you need (if the column stays constant), e.g. df['columnX'], and find which index it has:
df.index[df['columnX']=="xxx"]. It may make sense to add .tolist() at the end, so that if "xxx" is a value that appears more than once, you get all occurrences in a list.
The last step would be to take the index+1 to get the value you want.
Hope it was helpful.
In general I would highly suggest to be more specific in your questions and provide code / examples.
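When the column holding the description also moves between files, one option is to search every cell rather than a fixed column. The following is only a minimal sketch: it assumes each sheet was loaded without a header row (for example with pd.read_excel(path, header=None)), that the description cell equals "xxx" exactly, and that the wanted value sits one column to the right. The helper name values_right_of and the sample sheet are invented for illustration.

```python
import pandas as pd

def values_right_of(df, label):
    """Return the cell one column to the right of every cell equal to `label`."""
    # Compare as strings so numeric and empty cells never raise.
    mask = df.astype(str).eq(label).to_numpy()
    found = []
    for row, col in zip(*mask.nonzero()):
        if col + 1 < df.shape[1]:  # skip a label sitting in the last column
            found.append(df.iat[row, col + 1])
    return found

# Invented sample sheet; real data would come from the Excel files.
sheet = pd.DataFrame([
    ["report", None, None],
    [None, "xxx", "1250"],
    ["other", "7", None],
])
print(values_right_of(sheet, "xxx"))
```

Returning a list keeps every hit, so a sheet where the description appears twice does not silently lose a value.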
Each day I receive many different files from different vendors, and the sizes are vastly different. I am looking for some dynamic code that will decide what is relevant across all files. I would like to think thru how to break this file into components (df1, df2, df3 for example) which will make it easier for analysis.
Basically the first 6 lines are for overall information about the store (df1).
The 2nd component is reserved for specific item sales (starting on row 9, ending in a DIFFERENT row in every file), and I'm not sure how to capture that. I have tried something along the lines of
numb = df.loc['Type of payment'].index[0] - 2
but it is bringing in the tuple instead of the row location (int). How can I save upperrange and lowerrange as dynamic ints so that each day it will bring in the correct df2 data I am looking for?
The same problem exists at the bottom under "Type of payment" - you will notice that crypto is included for the 1st day but not the 2nd. I need to find a way to get a dynamic range to remove erroneous info and keep the integrity of the rest. I think finding the lowerrange will allow me to capture from that point to the end of the sheet, but I'm open to suggestions.
df = pd.read_csv('GMSALES.csv', skipfooter=2)
upperrange = df.loc['Item Number'] #brings in tuple
lowerrange = df.loc['Type of payment'] #brings in tuple
df1 = df.iloc[:,7] #this works
df2 = df.iloc[:('upperrange':'lowerrange')] # this is what I would like to get to
df3 = df.iloc[:(lowerrange:)] # this is what I would like to get to
Your organizational problem is that your data comes in as a spreadsheet that is used for physical organization more than functional organization. The "columns" are merely typographical tabs. The file contains several types of heterogeneous data; you are right in wanting to reorganize this into individual data frames.
Very simply, you need to parse the file, customer by customer -- either before or after reading it into RAM.
From your current organization, this involves simply scanning the "df2" range of your heterogeneous data frame. I think that the simplest way is to start from row 7 and look for "Item Number" in column A; that is your row of column names. Then scan until you find a row with nothing in column A; back up one row, and that gives you lowerrange.
Repeat with the payments: find the next row with "Type of payment". I will assume that you have some way to discriminate payment types from fake data, such as a list of legal payment types (strings). Scan from "Type of Payment" until you find a row with something other than a legal payment type; the previous row is your lowerrange for df3.
Can you take it from there?
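The scan described above could be sketched roughly as follows. This assumes the file was read with no header row (e.g. pd.read_csv('GMSALES.csv', header=None)), that the labels "Item Number" and "Type of payment" appear in column A exactly as written, and that a list of legal payment types is available. The names find_label and split_report, the legal_payments values and the sample rows are all made up for the illustration.

```python
import pandas as pd

def find_label(col_a, label, start=0):
    """Return the first integer position >= start where column A equals label."""
    for pos in range(start, len(col_a)):
        if col_a.iloc[pos] == label:
            return pos
    raise ValueError(f"{label!r} not found in column A")

def split_report(raw, legal_payments):
    # Column A as clean strings; empty cells become "".
    col_a = raw.iloc[:, 0].fillna("").astype(str).str.strip()
    upperrange = find_label(col_a, "Item Number")
    # Walk down until the next row has nothing in column A.
    lowerrange = upperrange
    while lowerrange + 1 < len(col_a) and col_a.iloc[lowerrange + 1] != "":
        lowerrange += 1
    payments = find_label(col_a, "Type of payment", lowerrange + 1)
    # Keep payment rows only while they name a legal payment type.
    end = payments
    while end + 1 < len(col_a) and col_a.iloc[end + 1].lower() in legal_payments:
        end += 1
    df1 = raw.iloc[:upperrange]                # store information block
    df2 = raw.iloc[upperrange:lowerrange + 1]  # header row plus item rows
    df3 = raw.iloc[payments:end + 1]           # header row plus payment rows
    return df1, df2, df3

# Invented sample standing in for one day's file.
raw = pd.DataFrame([
    ["Store", "GM"], ["Date", "2020-01-01"], [None, None],
    ["Item Number", "Qty"], ["A1", "3"], ["B2", "5"], [None, None],
    ["Type of payment", "Total"], ["cash", "10"], ["credit", "4"],
    ["Generated by POS", None],
])
df1, df2, df3 = split_report(raw, legal_payments={"cash", "credit", "debit"})
print(len(df1), len(df2), len(df3))
```

Both boundaries are plain integers, so they can be passed straight to iloc, and the item block can end on a different row every day.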
I am trying to add a column from one dataframe to another,
df.head()
street_map2[["PRE_DIR","ST_NAME","ST_TYPE","STREET_ID"]].head()
The PRE_DIR is just the prefix of the street name. What I want to do is add the column STREET_ID at the associated street to df. I have tried a few approaches but my inexperience with pandas and the comparison of strings is getting in the way,
street_map2['STREET'] = df["STREET"]
street_map2['STREET'] = np.where(street_map2['STREET'] == street_map2["ST_NAME"])
The above code raises a "ValueError: Length of values does not match length of index". I've also tried using street_map2['STREET'].str in street_map2["ST_NAME"].str. Can anyone think of a good way to do this? (Note: it doesn't need to be 100% accurate, just get most, and it can be completely different from the approach tried above.)
EDIT: Thank you to all who have tried so far. I have not resolved the issue yet. Here is some more data,
street_map2["ST_NAME"]
I have tried this approach as suggested but still have some indexing problems,
def get_street_id(street_name):
    return street_map2[street_map2['ST_NAME'].isin(df["STREET"])].iloc[0].ST_NAME
df["STREET_ID"] = df["STREET"].map(get_street_id)
df["STREET_ID"]
This throws this error,
If it helps the data frames are not the same length. Any more ideas or a way to fix the above would be greatly appreciated.
For you to do this, you need to merge these dataframes. One way to do it is:
df.merge(street_map2, left_on='STREET', right_on='ST_NAME')
What this will do is: it will look for equal values in ST_NAME and STREET columns and fill the rows with values from the other columns from both dataframes.
Check this link for more information: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html
Also, the strings on the columns you try to merge on have to match perfectly (case included).
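Because the keys have to match exactly, one common workaround is to build a normalized key column on both sides before merging. This is only a sketch under assumptions: the normalize helper, the temporary KEY column and the sample rows are invented here, and the cleaning steps (upper-casing, trimming, collapsing whitespace) are just examples of what might be needed for real street names.

```python
import pandas as pd

def normalize(names):
    """Upper-case, trim and collapse whitespace so near-identical names compare equal."""
    return names.astype(str).str.upper().str.strip().str.replace(r"\s+", " ", regex=True)

# Invented sample data with the column names used in the question.
df = pd.DataFrame({"STREET": ["main ", "Oak", "Elm"]})
street_map2 = pd.DataFrame({"ST_NAME": ["MAIN", "OAK"], "STREET_ID": [101, 202]})

# One row per normalized name, so the merge cannot duplicate rows of df.
lookup = (
    street_map2.assign(KEY=normalize(street_map2["ST_NAME"]))[["KEY", "STREET_ID"]]
    .drop_duplicates("KEY")
)
merged = (
    df.assign(KEY=normalize(df["STREET"]))
    .merge(lookup, on="KEY", how="left")  # a left merge keeps unmatched streets as NaN
    .drop(columns="KEY")
)
print(merged)
```

Using how="left" keeps the result the same length as df even though the two frames differ in length, which makes the unmatched streets easy to inspect afterwards.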
You can do something like this, with a map function:
df["STREET_ID"] = df["STREET"].map(get_street_id)
Where get_street_id is defined as a function that, given a value from df["STREET"], will return a value to insert into the new column:
(disclaimer; currently untested)
def get_street_id(street_name):
    return street_map2[street_map2["ST_NAME"] == street_name].iloc[0].STREET_ID
We get a dataframe of street_map2 filtered by where the ST_NAME column is the same as the street name:
street_map2[street_map2["ST_NAME"] == street_name]
Then we take the first row of that with iloc[0], and return its STREET_ID value. (Note that iloc[0] raises an IndexError when no row matches.)
We can then add that error-tolerance that you've addressed in your question by updating the indexing operation:
...
street_map2[street_map2["ST_NAME"].str.contains(street_name)]
...
or perhaps,
...
street_map2[street_map2["ST_NAME"].str.startswith(street_name)]
...
Or, more flexibly:
...
street_map2[
    street_map2["ST_NAME"].str.lower().str.replace("street", "st").str.startswith(street_name.lower().replace("street", "st"))
]
...
...which will lowercase both values, convert, for example, "street" to "st" (so the mapping is more likely to overlap) and then check whether the mapped name starts with the street name.
If this is still not working for you, you may unfortunately need to come up with a more accurate mapping dataset between your street names! It is very possible that the street names are just too different to easily match with string comparisons.
(If you're able to provide some examples of street names and where they should overlap, we may be able to help you better develop a "fuzzy" match!)
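As one possible starting point for such a "fuzzy" match, the standard library's difflib can pick the most similar known name. This is a rough sketch, not a tuned solution: the helper closest_street_id, the name_to_id dictionary and the cutoff of 0.8 are all assumptions chosen for illustration.

```python
import difflib

def closest_street_id(street_name, name_to_id, cutoff=0.8):
    """Return the ID of the known street name most similar to street_name, or None."""
    key = street_name.strip().lower()
    matches = difflib.get_close_matches(key, list(name_to_id), n=1, cutoff=cutoff)
    return name_to_id[matches[0]] if matches else None

# Hypothetical lookup; with the question's data it might be built as
# dict(zip(street_map2["ST_NAME"].str.lower(), street_map2["STREET_ID"]))
name_to_id = {"main": 101, "oak": 202}
print(closest_street_id("Mainn", name_to_id))
print(closest_street_id("Zzzz", name_to_id))
```

It could then be applied with df["STREET"].map(lambda s: closest_street_id(s, name_to_id)). Returning None for weak matches avoids the error that iloc[0] raises on an empty result.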
Alright, I managed to figure it out, but the solution probably won't be too helpful if you aren't in the exact same situation with the same data. Bernardo Alencar's answer was essentially correct, except I was unable to apply an operation on the strings while doing the merge (I am still not sure if there is a way to do it). I found another dataset that had the street names formatted similarly to the first. I then merged the first with the third, new data frame. After this I had the first and second both with a "STREET_ID" column. Then I finally managed to merge the second one with the combined one by using,
temp = combined["STREET_ID"]
CrimesToMapDF = street_maps.merge(temp, left_on='STREET_ID', right_on='STREET_ID')
Thus getting the desired final data frame with associated street ID's
I have a huge dataset with 1000+ columns. Most of them contain NaNs or just a few values. Manually sifting through each column is an unreasonable waste of time. How can I estimate column diversity, top frequent values, etc. with a single command?
First, you need to get what a single column contains, so you can use a list comprehension like this:
column = [array[i] for i in range(0, len(array), STEP)]
where STEP = the number of columns in your file.
Then you can do whatever you want with that. To answer your questions:
you can use e.g. max(column) - min(column), which gives the range of a numeric column as a rough measure of diversity.
To get top common values I suggest you look there:
click
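If the data is already in a pandas DataFrame, df.describe(include="all") gives a quick overview in one call. For a table sorted by how empty each column is, a small helper can be sketched as below; the function name summarize_columns, the output column names and the sample data are invented for this example.

```python
import pandas as pd

def summarize_columns(df):
    """One row per column: non-null count, % missing, distinct values, most frequent value."""
    rows = []
    for name in df.columns:
        col = df[name]
        counts = col.value_counts(dropna=True)  # sorted by frequency, NaN excluded
        rows.append({
            "column": name,
            "non_null": int(col.notna().sum()),
            "pct_missing": float(col.isna().mean() * 100),
            "n_unique": int(col.nunique(dropna=True)),
            "top_value": counts.index[0] if len(counts) else None,
            "top_freq": int(counts.iloc[0]) if len(counts) else 0,
        })
    # Emptiest columns first, so they are easy to spot and drop.
    return pd.DataFrame(rows).set_index("column").sort_values("non_null")

# Invented sample: one sparse column and one completely empty column.
sample = pd.DataFrame({"a": [1, 1, None], "b": [None, None, None]})
summary = summarize_columns(sample)
print(summary)
```

With 1000+ columns the summary itself is a DataFrame, so it can be filtered, for instance summary[summary["non_null"] > 0], instead of being read by eye.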
As noted here: http://docs.wxwidgets.org/trunk/classwx_list_box.html
Notice that currently TAB characters in list box items text are not handled consistently under all platforms, so they should be replaced by spaces to display strings properly everywhere. The list box doesn't support any other control characters at all.
So far in my experience while using Python 2.7 32-bit in Windows 7, using \t within the string of a wxListBox selection has no effect, as expected.
I have a bunch of rows from the database and I have multiple columns that I want to display (and eventually use on selection of one or more row) within a row in wxListBox. For now I am using spaces as recommended as the delimiter between values in the string. However, this is not really ideal since the columns are variable length.
Is there an alternative to the \t that is not a simple delimiter? The point here is to have all of the columns for each row presented neatly i.e.
column1       value1          value2
column442142  values24234234  val2
rather than
column1 value1 value2
column442142 values24234234 val2
wxGrid comes to mind but I don't think that would work for me because I don't want to be able to select specific cells within a row (I can't seem to find the function to disable that), I only want the user to be able to select a row or multiple rows.
My advice would be to use wxListCtrl for the simple multicolumn data display or wxDataViewCtrl if you need more features.
FWIW you can use wxGrid::SetSelectionMode() with wxGridSelectRows argument to disable cell selection but wxGrid is arguably still not the best control to use for something like this.
See this slide from my lectures for a brief summary of different controls.
print " 4 whitespaces replace a tab"
print "%20s"%some_string_padded_to_20_chars
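The fixed width of 20 above can be replaced by widths computed from the data, so that each column is only as wide as its longest value. This is a small sketch with an invented helper name, format_rows; it is written for Python 3 and assumes every row has the same number of values. The result only lines up when the list box uses a fixed-width font.

```python
def format_rows(rows, gap=2):
    """Pad every value to its column's widest entry and join with `gap` spaces."""
    n_cols = len(rows[0])
    widths = [max(len(str(row[i])) for row in rows) for i in range(n_cols)]
    lines = []
    for row in rows:
        cells = [str(cell).ljust(width) for cell, width in zip(row, widths)]
        lines.append((" " * gap).join(cells).rstrip())
    return lines

rows = [
    ("column1", "value1", "value2"),
    ("column442142", "values24234234", "val2"),
]
for line in format_rows(rows):
    print(line)
```

Each returned string can be passed as one list box item, while the original row tuple can be kept alongside it (for example in a parallel list) for use when a row is selected.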