Print out table in a format specified by user in Python? - python

The program starts as how many rows? how many coloumns? Alignment of each coloumn?(Left(L), Centre(C), Right(R)). Then accept entries(data in table) from the user. The entries should be printed in the format specified by the user? Here's what I have done so far:
rows = input("How many rows?")
coloumns = input("How many coloumns?")
alignment = raw_input("Enter alignment of each table?")
entry = raw_input("Enter rows x cols entries:")
print entry
I think I have to format entry in such a way that it comes out exactly how the user wants. How can I do it? Thanks

This block of code referenced from http://ginstrom.com/scribbles/2007/09/04/pretty-printing-a-table-in-python/ will help you.
import locale
locale.setlocale(locale.LC_NUMERIC, "")
def format_num(num):
"""Format a number according to given places.
Adds commas, etc. Will truncate floats into ints!"""
try:
inum = int(num)
return locale.format("%.*f", (0, inum), True)
except (ValueError, TypeError):
return str(num)
def get_max_width(table, index):
"""Get the maximum width of the given column index"""
return max([len(format_num(row[index])) for row in table])
def pprint_table(out, table):
"""Prints out a table of data, padded for alignment
#param out: Output stream (file-like object)
#param table: The table to print. A list of lists.
Each row must have the same number of columns. """
col_paddings = []
for i in range(len(table[0])):
col_paddings.append(get_max_width(table, i))
for row in table:
# left col
print >> out, row[0].ljust(col_paddings[0] + 1),
# rest of the cols
for i in range(1, len(row)):
col = format_num(row[i]).rjust(col_paddings[i] + 2)
print >> out, col,
print >> out
table = [["", "taste", "land speed", "life"],
["spam", 300101, 4, 1003],
["eggs", 105, 13, 42],
["lumberjacks", 13, 105, 10]]
import sys
out = sys.stdout
pprint_table(out, table)
In your case, because you are collecting inputs of rows, columns, alignment and entry in the table you can plug them in to construct your table variable.
len(table[0]) is equivalent to number of columns (-1 to prevent counting in the "y-axis" labels, also known as table index).
len(table) is
equivalent to your number of rows (-1 to prevent counting in the table header).
col_padding (alignment) is
dynamically computed using rjust and ljust methods while calculating a particular column.
And each element in your table list can be
updated using standard python list syntax.

Related

Using gspread, trying to add a column at the end of Google Sheet that already exists

Here is the code I am working with.
dfs=dfs[['Reserved']] #the column that I need to insert
dfs=dfs.applymap(str) #json did not accept the nan so needed to convert
sh=gc.open_by_key('KEY') #would open the google sheet
sh_dfs=sh.get_worksheet(0) #getting the worksheet
sh_dfs.insert_rows(dfs.values.tolist()) #inserts the dfs into the new worksheet
Running this code would insert the rows at the first column of the worksheet but what I am trying to accomplish is adding/inserting the column at the very last, column p.
In your situation, how about the following modification? In this modification, at first, the maximum column is retrieved. And, the column number is converted to the column letter, and the values are put to the next column of the last column.
From:
sh_dfs.insert_rows(dfs.values.tolist())
To:
# Ref: https://stackoverflow.com/a/23862195
def colnum_string(n):
string = ""
while n > 0:
n, remainder = divmod(n - 1, 26)
string = chr(65 + remainder) + string
return string
values = sh_dfs.get_all_values()
col = colnum_string(max([len(r) for r in values]) + 1)
sh_dfs.update(col + '1', dfs.values.tolist(), value_input_option='USER_ENTERED')
Note:
If an error like exceeds grid limits occurs, please insert the blank column.
Reference:
update

How to insert a value in google sheets through python just when the cell is empty/null

I'm trying to automate googlesheets through python, and every time my DF query runs, it inserts the data with the current day.
To put it simple, when a date column is empty, it have to be fulfilled with date when the program runs. The image is:
EXAMPLE IMAGE
I was trying to do something like it:
ws = client.open("automation").worksheet('sheet2')
ws.update(df_h.fillna('0').columns.values.tolist())
I'm not able to fulfill just the empty space, seems that or all the column is replaced, or all rows, etc.
Solved it thorugh another account:
ws_date_pipe = client.open("automation").worksheet('sheet2')
# Range of date column (targeted one, which is the min range)
next_row_min = str(len(list(filter(None, ws_date_pipe.col_values(8))))+1)
# Range of first column (which is the max range)
next_row_max = str(len(list(filter(None, ws_date_pipe.col_values(1)))))
cell_list = ws_date_pipe.range(f"H{next_row_min}:H{next_row_max}")
cell_values = []
# Difference between max-min ranges, space that needs to be fulfilled
for x in range(0, ((int(next_row_max)+1)-int(next_row_min)), 1):
iterator = x
iterator = datetime.datetime.now().strftime("%Y-%m-%d")
iterator = str(iterator)
cell_values.append(iterator)
for i, val in enumerate(cell_values):
cell_list[i].value = val
# If date range len "next_row_min" is lower than the first column, then fill.
if int(next_row_min) < int(next_row_max)+1:
ws_date_pipe.update_cells(cell_list)
print(f'Saved to csv file. {datetime.datetime.now().strftime("%Y-%m-%d")}')

Change Column values in pandas applying another function

I have a data frame in pandas, one of the columns contains time intervals presented as strings like 'P1Y4M1D'.
The example of the whole CSV:
oci,citing,cited,creation,timespan,journal_sc,author_sc
0200100000236252421370109080537010700020300040001-020010000073609070863016304060103630305070563074902,"10.1002/pol.1985.170230401","10.1007/978-1-4613-3575-7_2",1985-04,P2Y,no,no
...
I created a parsing function, that takes that string 'P1Y4M1D' and returns an integer number.
I am wondering how is it possible to change all the column values to parsed values using that function?
def do_process_citation_data(f_path):
global my_ocan
my_ocan = pd.read_csv("citations.csv",
names=['oci', 'citing', 'cited', 'creation', 'timespan', 'journal_sc', 'author_sc'],
parse_dates=['creation', 'timespan'])
my_ocan = my_ocan.iloc[1:] # to remove the first row iloc - to select data by row numbers
my_ocan['creation'] = pd.to_datetime(my_ocan['creation'], format="%Y-%m-%d", yearfirst=True)
return my_ocan
def parse():
mydict = dict()
mydict2 = dict()
i = 1
r = 1
for x in my_ocan['oci']:
mydict[x] = str(my_ocan['timespan'][i])
i +=1
print(mydict)
for key, value in mydict.items():
is_negative = value.startswith('-')
if is_negative:
date_info = re.findall(r"P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?$", value[1:])
else:
date_info = re.findall(r"P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?$", value)
year, month, day = [int(num) if num else 0 for num in date_info[0]] if date_info else [0,0,0]
daystotal = (year * 365) + (month * 30) + day
if not is_negative:
#mydict2[key] = daystotal
return daystotal
else:
#mydict2[key] = -daystotal
return -daystotal
#print(mydict2)
#return mydict2
Probably I do not even need to change the whole column with new parsed values, the final goal is to write a new function that returns average time of ['timespan'] of docs created in a particular year. Since I need parsed values, I thought it would be easier to change the whole column and manipulate a new data frame.
Also, I am curious what could be a way to apply the parsing function on each ['timespan'] row without modifying a data frame, I can only assume It could be smth like this, but I don't have a full understanding of how to do that:
for x in my_ocan['timespan']:
x = parse(str(my_ocan['timespan'])
How can I get a column with new values? Thank you! Peace :)
A df['timespan'].apply(parse) (as mentioned by #Dan) should work. You would need to modify only the parse function in order to receive the string as an argument and return the parsed string at the end. Something like this:
import pandas as pd
def parse_postal_code(postal_code):
# Splitting postal code and getting first letters
letters = postal_code.split('_')[0]
return letters
# Example dataframe with three columns and three rows
df = pd.DataFrame({'Age': [20, 21, 22], 'Name': ['John', 'Joe', 'Carla'], 'Postal Code': ['FF_222', 'AA_555', 'BB_111']})
# This returns a new pd.Series
print(df['Postal Code'].apply(parse_postal_code))
# Can also be assigned to another column
df['Postal Code Letter'] = df['Postal Code'].apply(parse_postal_code)
print(df['Postal Code Letter'])

Equivalent of arcpy.Statistics_analysis using NumPy (or other)

I am having a problem (I think memory related) when trying to do an arcpy.Statistics_analysis on an approximately 40 million row table. I am trying to count the number of non-null values in various columns of the table per category (e.g. there are x non-null values in column 1 for category A). After this, I need to join the statistics results to the input table.
Is there a way of doing this using numpy (or something else)?
The code I currently have is like this:
arcpy.Statistics_analysis(input_layer, output_layer, "'Column1' COUNT; 'Column2' COUNT; 'Column3' COUNT", "Categories")
I am very much a novice with arcpy/numpy so any help much appreciated!
You can convert a table to a numpy array using the function arcpy.da.TableToNumPyArray. And then convert the array to a pandas.DataFrame object.
Here is an example of code (I assume you are working with Feature Class because you use the term null values, if you work with shapefile you will need to change the code as null values are not supported are replaced with a single space string (' '):
import arcpy
import pandas as pd
# Change these values
gdb_path = 'path/to/your/geodatabase.gdb'
table_name = 'your_table_name'
cat_field = 'Categorie'
fields = ['Column1','column2','Column3','Column4']
# Do not change
null_value = -9999
input_table = gdb_path + '\\' + table_name
# Convert to pandas DataFrame
array = arcpy.da.TableToNumPyArray(input_table,
[cat_field] + fields,
skip_nulls=False,
null_value=null_value)
df = pd.DataFrame(array)
# Count number of non null values
not_null_count = {field: {cat: 0 for cat in df[cat_field].unique()}
for field in fields}
for cat in df[cat_field].unique():
_df = df.loc[df[cat_field] == cat]
len_cat = len(_df)
for field in fields:
try: # If your field contains integrer or float
null_count = _df[field].value_counts()[int(null_value)]
except IndexError: # If it contains text (string)
null_count = _df[field].value_counts()[str(null_value)]
except KeyError: # There is no null value
null_count = 0
not_null_count[field][cat] = len_cat - null_count
Concerning joining the results to the input table without more information, it's complicated to give you an exact answer that will meet your expectations (because there are multiple columns, so it's unsure which value you want to add).
EDIT:
Here is some additional code following your clarifications:
# Create a copy of the table
copy_name = '' # name of the copied table
copy_path = gdb_path + '\\' + copy_name
arcpy.Copy_management(input_table, copy_path)
# Dividing copy data with summary
# This step doesn't need to convert the dict (not_null_value) to a table
with arcpy.da.UpdateCursor(copy_path, [cat_field] + fields) as cur:
for row in cur:
category = row[0]
for i, fld in enumerate(field):
row[i+1] /= not_null_count[fld][category]
cur.updateRow(row)
# Save the summary table as a csv file (if needed)
df_summary = pd.DataFrame(not_null_count)
df_summary.index.name = 'Food Area' # Or any name
df_summary.to_csv('path/to/file.csv') # Change path
# Summary to ArcMap Table (also if needed)
arcpy.TableToTable_conversion('path/to/file.csv',
gdb_path,
'name_of_your_new_table')

How to zip values into a table with uneven lists? (DataNitro)

I'm attempting to get the last 5 orders from currency exchanges through their respective JSON API. Everything is working except for the fact there are some coins that have less than 5 orders (ask/bid) which causes some errors in the table write to Excel.
Here is what I have now:
import grequests
import json
import itertools
active_sheet("Livecoin Queries")
urls3 = [
'https://api.livecoin.net/exchange/order_book?
currencyPair=RBIES/BTC&depth=5',
'https://api.livecoin.net/exchange/order_book?
currencyPair=REE/BTC&depth=5',
]
requests = (grequests.get(u) for u in urls3)
responses = grequests.map(requests)
CellRange("B28:DJ48").clear()
def make_column(catalog_response, name):
column = []
catalog1 = catalog_response.json()[name]
quantities1, rates1 = zip(*catalog1)
for quantity, rate in zip(quantities1, rates1):
column.append(quantity)
column.append(rate)
return column
bid_table = []
ask_table = []
for response in responses:
try:
bid_table.append(make_column(response,'bids'))
ask_table.append(make_column(response,'asks'))
except (KeyError,ValueError,AttributeError):
continue
Cell(28, 2).table = zip(*ask_table)
Cell(39, 2).table = zip(*bid_table)
I've isolated the list of links down to just two with "REE" coin being the issue here.
I've tried:
for i in itertools.izip_longest(*bid_table):
#Cell(28, 2).table = zip(*ask_table)
#Cell(39, 2).table = zip(*i)
print(i)
Which prints out nicely in the terminal:
itertools terminal output
NOTE: As of right now "REE" has zero bid orders so it ends up creating an empty list:
empty list terminal output
When printing to excel I get a lot of strange outputs. None of which resemble what it looks like in the terminal. The way the information is set up in Excel requires it to be Cell(X,X).table
My question is, how do I make zipping with uneven lists play nice with tables in DataNitro?
EDIT1:
The problem is arising at catalog_response.json()[name]
def make_column(catalog_response, name):
column = []
catalog1 = catalog_response.json()[name]
#quantities1, rates1 = list(itertools.izip_longest(*catalog1[0:5]))
print(catalog1)
#for quantity, rate in zip(quantities1, rates1):
# column.append(quantity)
# column.append(rate)
#return column
Since there are zero bids there is not even an empty list created which is why I'm unable to zip them together.
ValueError: need more than 0 values to unpack
I suggest that you build the structure myTable that you intend to write back to excel.
It should be a list of lists
myTable = []
myRow = []
…build each myRow from your code…
if the length of the list for myRow is too short, pad with proper number of [None] elements
in your case if len(myRow) is 0 you need to append two “None” items
myRow.append(None)
myRow.append(None)
add the row to the output table
myTable.append(myRow)
so when ready you have a well formed nn x n table to output via:
Cell(nn, n).table = myTable

Categories