Print python list in groups of 3 - python

I have a list of 107 names that I would like to print in groups of 3 or so, with the names on each line separated by a tab and a newline after each group, until the end of the list. How can I do this?
With for item in list: print item I only get one name per line, of course. That is fine, I guess, but I'd like to fit more in the console at once, so I'd like to print 3 or so names on each line as I go through the list. So instead of:
name1
name2
name3
name4
name5
name6
I would get:
name1 name2 name3
name4 name5 name6
It's kind of hard to search for an answer to this. I haven't been able to come up with quite what I need, or anything that I could understand; most things I did find just deal with len() or range() and confused me. Is there some simple way to do this? Thank you!
[edit:update] Using @inspectorG4dget's example of:
for i in range(0, len(listnames), 5):
    print '\t\t'.join(listnames[i:i+5])
I get the following: http://www.pasteall.org/pic/show.php?id=41159
How can I get that cleaned up so everything is nicely aligned in each column? Is what I want easy to do?

1)
li = ['sea','mountain','desert',
      'Emma','Cathy','Kate',
      'ii','uuuuuuuuuuuuuuuuuuu','aaa',
      'round','flat','sharp',
      'blueberry','banana','apple',
      'red','purple','white',
      'hen','tiger']
a,b = divmod(len(li),3)
itn = iter(li).next
print ''.join('%s\t%s\t%s\n' % (itn(),itn(),itn())
              for i in xrange(a))\
      + ('%s\t%s\t\n' % (itn(),itn()) if b==2
         else '%s\t\n' % itn() if b==1
         else '')
result
sea mountain desert
Emma Cathy Kate
ii uuuuuuuuuuuuuuuuuuu aaa
round flat sharp
blueberry banana apple
red purple white
hen tiger
2)
And to align in columns whose width depends on the longest element of the list:
li = ['sea','mountain','desert',
      'Emma','Cathy','Kate',
      'HH','VVVVVVV','AAA',
      'round','flat','sharp',
      'blueberry','banana','apple',
      'red','purple','white',
      'hen','tiger']
maxel = max(len(el) for el in li)
a,b = divmod(len(li),3)
itn = iter(li).next
form = '%%-%ds\t%%-%ds\t%%-%ds\n' % (maxel,maxel,maxel)
print ''.join(form % (itn(),itn(),itn())
              for i in xrange(a))\
      + ('%%-%ds\t%%-%ds\t\n' %(maxel,maxel) % (itn(),itn()) if b==2
         else '%%-%ds\t\n' % maxel % itn() if b==1
         else '')
result
sea mountain desert
Emma Cathy Kate
HH VVVVVVV AAA
round flat sharp
blueberry banana apple
red purple white
hen tiger
3)
To align in columns, with the width of each column depending on the longest element in it:
li = ['sea','mountain','desert',
      'Emma','Cathy','Kate',
      'HH','VVVVVVV','AAA',
      'round','flat','sharp',
      'nut','banana','apple',
      'red','purple','white',
      'hen','tiger']
maxel0 = max(len(li[i]) for i in xrange(0,len(li),3))
maxel1 = max(len(li[i]) for i in xrange(1,len(li),3))
maxel2 = max(len(li[i]) for i in xrange(2,len(li),3))
a,b = divmod(len(li),3)
itn = iter(li).next
form = '%%-%ds\t%%-%ds\t%%-%ds\n' % (maxel0,maxel1,maxel2)
print ''.join(form % (itn(),itn(),itn())
              for i in xrange(a))\
      + ('%%-%ds\t%%-%ds\t\n' %(maxel0,maxel1) % (itn(),itn()) if b==2
         else '%%-%ds\t\n' % maxel0 % itn() if b==1
         else '')
result
sea mountain desert
Emma Cathy Kate
HH VVVVVVV AAA
round flat sharp
nut banana apple
red purple white
hen tiger
4)
I've modified the algorithm to generalize it to any number of columns.
The desired number of columns must be passed as the argument nc:
from itertools import imap,islice
li = ['sea','mountain','desert',
      'Emma','Cathy','Kate',
      'HH','VVVVVVV','AAA',
      'round','flat','sharp',
      'nut','banana','apple',
      'heeeeeeeeeeen','tiger','snake'
      'red','purple','white',
      'atlantic','pacific','antarctic',
      'Bellini']
print 'len of li == %d\n' % len(li)
def cols_print(li,nc):
    maxel = tuple(max(imap(len,islice(li,st,None,nc)))
                  for st in xrange(nc))
    nblines,tail = divmod(len(li),nc)
    stakes = (nc-1)*['%%-%ds\t'] + ['%%-%ds']
    form = ''.join(stakes) % maxel
    itn = iter(li).next
    print '\n'.join(form % tuple(itn() for g in xrange(nc))
                    for i in xrange(nblines))
    if tail:
        print ''.join(stakes[nc-tail:]) % maxel[0:tail] % tuple(li[-tail:]) + '\n'
    else:
        print
for nc in xrange(3,8):
    cols_print(li,nc)
    print '-----------------------------------------------------------'
result
len of li == 24
sea mountain desert
Emma Cathy Kate
HH VVVVVVV AAA
round flat sharp
nut banana apple
heeeeeeeeeeen tiger snakered
purple white atlantic
pacific antarctic Bellini
-----------------------------------------------------------
sea mountain desert Emma
Cathy Kate HH VVVVVVV
AAA round flat sharp
nut banana apple heeeeeeeeeeen
tiger snakered purple white
atlantic pacific antarctic Bellini
-----------------------------------------------------------
sea mountain desert Emma Cathy
Kate HH VVVVVVV AAA round
flat sharp nut banana apple
heeeeeeeeeeen tiger snakered purple white
atlantic pacific antarctic Bellini
-----------------------------------------------------------
sea mountain desert Emma Cathy Kate
HH VVVVVVV AAA round flat sharp
nut banana apple heeeeeeeeeeen tiger snakered
purple white atlantic pacific antarctic Bellini
-----------------------------------------------------------
sea mountain desert Emma Cathy Kate HH
VVVVVVV AAA round flat sharp nut banana
apple heeeeeeeeeeen tiger snakered purple white atlantic
pacific antarctic Bellini
-----------------------------------------------------------
But I prefer a display in which the columns are separated not by tabs but by a given number of characters.
In the following code, I chose to separate the columns by 2 characters: that is the 2 in the line
maxel = tuple(max(imap(len,islice(li,st,None,nc)))+2
The code
from itertools import imap,islice
li = ['sea','mountain','desert',
      'Emma','Cathy','Kate',
      'HH','VVVVVVV','AAA',
      'round','flat','sharp',
      'nut','banana','apple',
      'heeeeeeeeeeen','tiger','snake'
      'red','purple','white',
      'atlantic','pacific','antarctic',
      'Bellini']
print 'len of li == %d\n' % len(li)
def cols_print(li,nc):
    maxel = tuple(max(imap(len,islice(li,st,None,nc)))+2
                  for st in xrange(nc))
    nblines,tail = divmod(len(li),nc)
    stakes = nc*['%%-%ds']
    form = ''.join(stakes) % maxel
    itn = iter(li).next
    print '\n'.join(form % tuple(itn() for g in xrange(nc))
                    for i in xrange(nblines))
    if tail:
        print ''.join(stakes[nc-tail:]) % maxel[0:tail] % tuple(li[-tail:]) + '\n'
    else:
        print
for nc in xrange(3,8):
    cols_print(li,nc)
    print 'mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm'
the result
len of li == 24
sea mountain desert
Emma Cathy Kate
HH VVVVVVV AAA
round flat sharp
nut banana apple
heeeeeeeeeeen tiger snakered
purple white atlantic
pacific antarctic Bellini
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
sea mountain desert Emma
Cathy Kate HH VVVVVVV
AAA round flat sharp
nut banana apple heeeeeeeeeeen
tiger snakered purple white
atlantic pacific antarctic Bellini
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
sea mountain desert Emma Cathy
Kate HH VVVVVVV AAA round
flat sharp nut banana apple
heeeeeeeeeeen tiger snakered purple white
atlantic pacific antarctic Bellini
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
sea mountain desert Emma Cathy Kate
HH VVVVVVV AAA round flat sharp
nut banana apple heeeeeeeeeeen tiger snakered
purple white atlantic pacific antarctic Bellini
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
sea mountain desert Emma Cathy Kate HH
VVVVVVV AAA round flat sharp nut banana
apple heeeeeeeeeeen tiger snakered purple white atlantic
pacific antarctic Bellini
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
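For readers on Python 3, where iter(li).next, xrange and imap no longer exist, the same per-column alignment can be sketched roughly as follows. This is not the code above but a rewrite under my own assumptions: the name format_columns and the gap parameter are made up for illustration, and padding is done with str.ljust instead of % formatting.

```python
from itertools import zip_longest


def format_columns(items, ncols, gap=2):
    """Return items laid out row by row in ncols left-aligned columns."""
    # Width of each column: its longest entry plus the gap.
    widths = [max(len(s) for s in items[c::ncols]) + gap
              for c in range(min(ncols, len(items)))]
    # ncols references to one iterator: each row consumes ncols items,
    # and a short last row is padded with empty strings.
    rows = zip_longest(*[iter(items)] * ncols, fillvalue='')
    lines = []
    for row in rows:
        line = ''.join(cell.ljust(w) for cell, w in zip(row, widths))
        # Drop the trailing padding of the last cell in the row.
        lines.append(line.rstrip())
    return '\n'.join(lines)


names = ['sea', 'mountain', 'desert', 'Emma', 'Cathy', 'Kate', 'hen', 'tiger']
print(format_columns(names, 3))
```

Returning a string rather than printing inside the function makes it easy to check the layout or write it to a file.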

You could also try this:
from itertools import izip
l = ['name1', 'name2', 'name3', 'name4', 'name5', 'name6']
for t in izip(*[iter(l)]*3):
    print '\t'.join(t)
name1 name2 name3
name4 name5 name6
If you're not certain that the list length will be a multiple of 3, you could use izip_longest, applying the same idea:
from itertools import izip_longest as izipl
l = ['name1', 'name2', 'name3', 'name4', 'name5', 'name6', 'name7']
for t in izipl(fillvalue='', *[iter(l)]*3):
    print '\t'.join(t)
name1 name2 name3
name4 name5 name6
name7
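On Python 3, izip and izip_longest are gone: the built-in zip is already lazy, and izip_longest is available as itertools.zip_longest. A minimal sketch of the same idea, where the helper name grouped_lines is my own and the trailing separators left by the fill values are stripped:

```python
from itertools import zip_longest


def grouped_lines(items, n, sep='\t'):
    """Yield one sep-joined line per group of n consecutive items."""
    # n references to the same iterator make zip_longest consume
    # n items per output tuple; a short final group is padded with ''.
    for group in zip_longest(*[iter(items)] * n, fillvalue=''):
        yield sep.join(group).rstrip(sep)


names = ['name1', 'name2', 'name3', 'name4', 'name5', 'name6', 'name7']
for line in grouped_lines(names, 3):
    print(line)
```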

This should do it:
In [12]: L
Out[12]: ['name1', 'name2', 'name3', 'name4', 'name5', 'name6']
In [13]: for i in range(0,len(L),3): print ' '.join(L[i:i+3])
name1 name2 name3
name4 name5 name6
EDIT: to get everything into a fixed width, here is some code that I wrote a while ago to turn columnar data into a table. All you have to do is columnize your data and call this old code:
from collections import defaultdict
from itertools import izip_longest

def tabularize(infilepath, outfilepath, delim='\t', largeFile=False):
    """ Return nothing
    Write into the file in outfilepath, the contents of infilepath, expressed in tabular form.
    The tabular form is similar to the way in which SQL tables are displayed.
    If largeFile is set to True, then no caching of lines occurs. However, two passes of the infile are required"""
    if largeFile:
        # first pass: compute the column widths without caching the lines
        widths = [w+2 for w in getWidths(infilepath, delim)]
    else:
        with open(infilepath) as infile:
            lines = [line.strip().split(delim) for line in infile.readlines() if line.strip()]
        widths = [max([len(row) for row in rows])+2 for rows in izip_longest(*lines, fillvalue="")]
    with open(outfilepath, 'w') as outfile:
        outfile.write("+")
        for width in widths:
            outfile.write('-'*width + "+")
        outfile.write('\n')
        if largeFile:
            # second pass: stream the rows from the file again
            infile = open(infilepath)
            lines = (line.strip().split(delim) for line in infile if line.strip())
        for line in lines:
            outfile.write("|")
            for col,width in izip_longest(line,widths, fillvalue=""):
                outfile.write("%s%s%s|" %(' '*((width-len(col))//2), col, ' '*((width+1-len(col))//2)))
            outfile.write('\n+')
            for width in widths:
                outfile.write('-'*width + "+")
            outfile.write('\n')
        if largeFile:
            infile.close()

def getWidths(infilepath, delim):
    answer = defaultdict(int)
    with open(infilepath) as infile:
        for line in infile:
            cols = line.strip().split(delim)
            lens = map(len, cols)
            for i,l in enumerate(lens):
                if answer[i] < l:
                    answer[i] = l
    return [answer[k] for k in sorted(answer)]

def main(L, n, infilepath, outfilepath):
    iterator = iter(L)
    with open(infilepath, 'w') as infile:
        # write the list as rows of n tab-separated names
        for row in izip_longest(*[iterator]*n, fillvalue=''):
            infile.write('\t'.join(row)+'\n')
    largeFile = len(L) > 10**6
    tabularize(infilepath, outfilepath, delim='\t', largeFile=largeFile)

Try using itertools; I think it's a much simpler solution.
from itertools import izip_longest
def grouper(n, iterable, fillvalue=None):
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)
names = ['name1', 'name2', 'name3', 'name4', 'name5', 'name6']
for item1 in grouper(3, names, ''):
    print '\t'.join(item1)
Result:
name1 name2 name3
name4 name5 name6

Related

re.IGNORECASE flag not working with .str.extract

I have the dataframe below and have created a column to categorise rows based on specific text within a string.
However, when I pass the re.IGNORECASE flag, it is still case-sensitive. Why?
Dataframe
import re
import pandas as pd
test_data = {
    "first_name": ['Bruce', 'Clark', 'Bruce', 'James', 'Nanny', 'Dot'],
    "last_name": ['Lee', 'Kent', 'Banner', 'Bond', 'Mc Phee', 'Cotton'],
    "title": ['mr', 'mr', 'mr', 'mr', 'mrs', 'mrs'],
    "text": ["He is a Kung Fu master", "Wears capes and tight Pants", "Cocktails shaken not stirred", "angry Green man", "suspect scottish accent", "East end legend"],
    "age": [32, 33, 28, 30, 42, 80]
}
df = pd.DataFrame(test_data)
code
category_dict = {
    "Kung Fu":"Martial Art",
    "capes":"Clothing",
    "cocktails": "Drink",
    "green": "Colour",
    "scottish": "Scotland",
    "East": "Direction"
}
df['category'] = (
    df['text'].str.extract(
        fr"\b({'|'.join(category_dict.keys())})\b",
        flags=re.IGNORECASE)[0].map(category_dict))
Expected output
first_name last_name title text age category
0 Bruce Lee Mr He is a Kung Fu master 32 Martial Art
1 Clark Kent Mr Wears capes and tight Pants 33 Clothing
2 Bruce Banner Mr Cocktails shaken not stirred 28 Drink
3 James Bond Mr angry Green man 30 Colour
4 Nanny Mc Phee Mrs suspect scottish accent 42 Scotland
5 Dot Cotton Mrs East end legend 80 Direction
I have searched the docs and have found no pointers, so any help would be appreciated!
Here is one way to do it.
The issue you're facing is that while extract ignores case, the lookup of the extracted string in the dictionary is still case-sensitive.
# create a dictionary with lower case keys
cd = {k.lower(): v for k,v in category_dict.items()}
# alternately, you can convert the category_dict keys to lower case
# I duplicated the dictionary, in case you need to keep the original keys
# convert the extracted word to lowercase and then map with the lowercase dict
df['category'] = (
    df['text'].str.extract(
        fr"\b({'|'.join((category_dict.keys()))})\b",
        flags=re.IGNORECASE)[0].str.lower().map(cd))
df
first_name last_name title text age category
0 Bruce Lee mr He is a Kung Fu master 32 Martial Art
1 Clark Kent mr Wears capes and tight Pants 33 Clothing
2 Bruce Banner mr Cocktails shaken not stirred 28 Drink
3 James Bond mr angry Green man 30 Colour
4 Nanny Mc Phee mrs suspect scottish accent 42 Scotland
5 Dot Cotton mrs East end legend 80 Direction
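The mismatch described above can be reproduced without pandas, using only the re module. In this sketch the shortened dictionary and the use of re.escape on the keys are my own additions (the escaping is just a precaution in case a key ever contains a regex metacharacter):

```python
import re

category_dict = {"Kung Fu": "Martial Art", "cocktails": "Drink", "green": "Colour"}
pattern = re.compile(
    r"\b(" + "|".join(map(re.escape, category_dict)) + r")\b",
    flags=re.IGNORECASE,
)

# Lower-cased copy of the dictionary, as in the answer above.
lowered = {key.lower(): value for key, value in category_dict.items()}

match = pattern.search("Cocktails shaken not stirred")
found = match.group(1)             # the text exactly as it appears in the input
print(found)                       # Cocktails
print(category_dict.get(found))    # None, because the key is 'cocktails'
print(lowered.get(found.lower()))  # Drink
```

The regex finds the word regardless of case, but it returns the capitalised text from the input, which is not a key of the original dictionary.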

Classifying Excel data row by row in n-level columns

I have a problem classifying the data in an Excel file that is spread over several columns and rows. I need to turn each merged cell into its own row, with the values from the next column placed beside it, as in these pictures:
Input:
Output for Dairy:
Summary:
First we take the Dairy row, then we go to the second column and get the data next to Dairy. Then, next to Milk to Mr. 1, we get Butter to Mrs. 1 and Butter to Mrs. 2, and so on.
After that we want to export the result to an Excel file as in the Output picture.
I have written code that gets the first-column data and finds all the data next to it, but I need to change it to get the data row by row as in the Output picture:
import pandas
import openpyxl
import xlwt
from xlwt import Workbook
df = pandas.read_excel('excel.xlsx')
result_first_level = []
for i, item in enumerate(df[df.columns[0]].values, 2):
    if pandas.isna(item):
        result_first_level[-1]['index'] = i
    else:
        result_first_level.append(dict(name=item, index=i, levels_name=[]))
for level in df.columns[1:]:
    move_index = 0
    for i, obj in enumerate(result_first_level):
        if i == 0:
            for item in df[level].values[0:obj['index'] - 1]:
                if pandas.isna(item):
                    move_index += 1
                    continue
                else:
                    obj['levels_name'].append(item)
                    move_index += 1
        else:
            for item in df[level].values[move_index:obj['index'] - 1]:
                if pandas.isna(item):
                    move_index += 1
                    continue
                else:
                    obj['levels_name'].append(item)
                    move_index += 1
# Workbook is created
wb = Workbook()
# add_sheet is used to create sheet.
sheet1 = wb.add_sheet('Sheet 1')
style = xlwt.easyxf('font: bold 1')
move_index = 0
for item in result_first_level:
    for member in item['levels_name']:
        sheet1.write(move_index, 0, item['name'], style)
        sheet1.write(move_index, 1, member)
        move_index += 1
wb.save('test.xls')
download Input File excel from here
Thanks for helping!
First, forward-fill your data so that blank cells take the last valid value, then create an ordered collection using pd.CategoricalDtype to sort the product column. Next, iterate over the columns pairwise and rename each pair of columns so the pieces can be concatenated. The last step is to sort the rows by product value.
import pandas as pd
# Prepare your dataframe
df = pd.read_excel('input.xlsx').dropna(how='all')
df.update(df.iloc[:, :-1].ffill())
df = df.drop_duplicates()
# Get keys to sort data in the final output
cats = pd.CategoricalDtype(df.T.melt()['value'].dropna().unique(), ordered=True)
# Group pairwise values
data = []
for cols in zip(df.columns, df.columns[1:]):
    col_mapping = dict(zip(cols, ['product', 'subproduct']))
    data.append(df[list(cols)].rename(columns=col_mapping))
# Merge all data
out = pd.concat(data).drop_duplicates().dropna() \
        .astype(cats).sort_values('product').reset_index(drop=True)
Output:
>>> cats
CategoricalDtype(categories=['Dairy', 'Milk to Mr.1', 'Butter to Mrs.1',
'Butter to Mrs.2', 'Cheese to Miss 2 ', 'Cheese to Mr.2',
'Milk to Miss.1', 'Milk to Mr.5', 'yoghurt to Mr.3',
'Milk to Mr.6', 'Fruits', 'Apples to Mr.6',
'Limes to Miss 5', 'Oranges to Mr.7', 'Plumbs to Miss 5',
'apple for mr 2', 'Foods & Drinks', 'Chips to Mr1',
'Jam to Mr 2.', 'Coca to Mr 5', 'Cookies to Mr1.',
'Coca to Mr 7', 'Coca to Mr 6', 'Juice to Miss 1',
'Jam to Mr 3.', 'Ice cream to Miss 3.', 'Honey to Mr 5',
'Cake to Mrs. 2', 'Honey to Miss 2',
'Chewing gum to Miss 7.'], ordered=True)
>>> out
product subproduct
0 Dairy Milk to Mr.1
1 Dairy Cheese to Mr.2
2 Milk to Mr.1 Butter to Mrs.1
3 Milk to Mr.1 Butter to Mrs.2
4 Butter to Mrs.2 Cheese to Miss 2
5 Cheese to Mr.2 Milk to Miss.1
6 Cheese to Mr.2 yoghurt to Mr.3
7 Milk to Miss.1 Milk to Mr.5
8 yoghurt to Mr.3 Milk to Mr.6
9 Fruits Apples to Mr.6
10 Fruits Oranges to Mr.7
11 Apples to Mr.6 Limes to Miss 5
12 Oranges to Mr.7 Plumbs to Miss 5
13 Plumbs to Miss 5 apple for mr 2
14 Foods & Drinks Chips to Mr1
15 Foods & Drinks Juice to Miss 1
16 Foods & Drinks Cake to Mrs. 2
17 Chips to Mr1 Jam to Mr 2.
18 Chips to Mr1 Cookies to Mr1.
19 Jam to Mr 2. Coca to Mr 5
20 Cookies to Mr1. Coca to Mr 6
21 Cookies to Mr1. Coca to Mr 7
22 Juice to Miss 1 Honey to Mr 5
23 Juice to Miss 1 Jam to Mr 3.
24 Jam to Mr 3. Ice cream to Miss 3.
25 Cake to Mrs. 2 Chewing gum to Miss 7.
26 Cake to Mrs. 2 Honey to Miss 2
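To make the steps of this answer easier to follow, here is a rough sketch of the same idea in plain Python, without pandas. It is only an illustration under my own assumptions: rows are equal-length lists, None marks a blank cell, and the function name parent_child_pairs and the tiny sample data are invented for the example.

```python
def parent_child_pairs(rows):
    """Return unique (product, subproduct) pairs from hierarchical rows.

    rows: list of equal-length lists, with None marking a blank cell.
    """
    ncols = len(rows[0])
    last = [None] * ncols   # last value seen in each column
    order = {}              # first-seen rank of each value, read row by row
    pairs = []
    for row in rows:
        filled = list(row)
        # Forward fill every column except the last one.
        for c in range(ncols - 1):
            if filled[c] is None:
                filled[c] = last[c]
            last[c] = filled[c]
        for value in filled:
            if value is not None:
                order.setdefault(value, len(order))
        # Pair each column with its right-hand neighbour.
        for pair in zip(filled, filled[1:]):
            if None not in pair and pair not in pairs:
                pairs.append(pair)
    # Sort by the first-seen rank of the product (stable for ties).
    return sorted(pairs, key=lambda p: order[p[0]])


rows = [
    ['Dairy', 'Milk', 'Butter'],
    [None, None, 'Cream'],
    [None, 'Cheese', None],
    ['Fruits', 'Apples', 'Limes'],
]
for product, subproduct in parent_child_pairs(rows):
    print(product, '->', subproduct)
```

The order of rows within one product may differ from what the pandas version produces, since that depends on how pandas sorts ties.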

How to split two strings into different columns in Python with Pandas?

I am new to this, and I need to split a column that contains two strings into 2 columns, like this:
Initial dataframe:
Full String
0 Orange Juice
1 Pink Bird
2 Blue Ball
3 Green Tea
4 Yellow Sun
Final dataframe:
First String Second String
0 Orange Juice
1 Pink Bird
2 Blue Ball
3 Green Tea
4 Yellow Sun
I tried this, but it doesn't work:
df['First String'] , df['Second String'] = df['Full String'].str.split()
and this:
df['First String', 'Second String'] = df['Full String'].str.split()
How can I make it work? Thank you!!!
The key here is to include the parameter expand=True in your str.split() to expand the split strings into separate columns.
Type it like this:
df[['First String','Second String']] = df['Full String'].str.split(expand=True)
Output:
Full String First String Second String
0 Orange Juice Orange Juice
1 Pink Bird Pink Bird
2 Blue Ball Blue Ball
3 Green Tea Green Tea
4 Yellow Sun Yellow Sun
Have you tried this solution?
https://stackoverflow.com/a/14745484/15320403
df = pd.DataFrame(df['Full String'].str.split(' ', n=1).tolist(), columns = ['First String', 'Second String'])
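What both answers do, row by row, can be sketched in plain Python: split each string once, then turn the list of pairs into two columns. The variable names here are made up for illustration, and zip(*...) stands in for the column expansion that pandas does for you.

```python
full_strings = ['Orange Juice', 'Pink Bird', 'Blue Ball']

# maxsplit=1 splits at the first run of whitespace only, so every row
# gives at most two parts, like the split limited to one cut above.
pairs = [s.split(maxsplit=1) for s in full_strings]

# Transpose the list of [first, second] pairs into two columns.
first, second = zip(*pairs)
print(first)   # ('Orange', 'Pink', 'Blue')
print(second)  # ('Juice', 'Bird', 'Ball')
```

Without the transposition, pairs is a list with one entry per row, which is why it cannot be unpacked directly into two column names.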

Assign specific nominal values randomly to rows using pandas

I want to assign some selected nominal values randomly to rows. For example:
I have three nominal values ["apple", "orange", "banana"].
Before assigning these values randomly to the rows:
**Name Fruit**
Jack
Julie
Juana
Jenny
Christina
Dickens
Robert
Cersei
After assigning these values randomly to the rows:
**Name Fruit**
Jack Apple
Julie Orange
Juana Apple
Jenny Banana
Christina Orange
Dickens Orange
Robert Apple
Cersei Banana
How can I do this using pandas dataframe?
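You can use np.random.choice with your values (the pd.np alias is no longer available in recent pandas versions, so import numpy directly):
import numpy as np
vals = ["apple", "orange", "banana"]
df['Fruit'] = np.random.choice(vals, len(df))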
>>> df
Name Fruit
0 Jack apple
1 Julie orange
2 Juana apple
3 Jenny orange
4 Christina apple
5 Dickens banana
6 Robert orange
7 Cersei orange
You can create a DataFrame in pandas and then assign random choices using numpy
ex2 = pd.DataFrame({'Name':['Jack','Julie','Juana','Jenny','Christina','Dickens','Robert','Cersei']})
ex2['Fruits'] = np.random.choice(['Apple','Orange','Banana'],ex2.shape[0])
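If numpy is not at hand, the standard library offers the same kind of draw with replacement through random.choices. A small sketch, where the shortened name list and the fixed seed are only there to make the example repeatable:

```python
import random

names = ['Jack', 'Julie', 'Juana', 'Jenny']
fruits = ['apple', 'orange', 'banana']

rng = random.Random(0)  # fixed seed only so that reruns give the same draw
# One independent pick per name; the same fruit can appear several times.
assigned = dict(zip(names, rng.choices(fruits, k=len(names))))
print(assigned)
```

The resulting list of picks can be assigned to a dataframe column in the same way as the numpy array in the answers above.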

Transfer string to data frame Python

I have a large string which I have to transfer into a data frame. For example the string is:
meals_string = "APPETIZERS Southern Fried Quail with
Greens,Huckleberries,Pecans & Blue Cheese 14.00 Park Avenue Cafe
Chopped Salad Goat Feta Cheese,Nigoise Olives,Marinated White [...]
ENTREES Horseradish Crusted Canadian Salmon,Potato Fritters, Marinated
Cucumbers,Chive Vinaigrette 27.00 Sautéed Prawns with Mushroom
Tortellini,Grilled Tomato Vinaigrette & Sweet Corn 29.50"
meals = meals_string.splitlines()
This gives me the variable meals as a list, but I am stuck on how to convert the string into a dataframe with 3 columns: Category, Meal_name, Price.
A relatively simple parser for your string can be built and then passed directly to pandas.DataFrame like:
Code:
def meal_string_parser(meal_string):
    category = ''
    meal = []
    price = 0
    for word in meal_string.split():
        if word:
            try:
                price = float(word)
                yield category, ' '.join(meal), price
                meal = []
            except ValueError:
                # this is not a number, so not a price
                if word.upper() == word and word.isalnum():
                    # found category
                    category = word
                else:
                    meal.append(word)
    if meal:
        yield category, ' '.join(meal), price
Test Code:
meals_string = """
APPETIZERS
Southern Fried Quail with Greens,Huckleberries,Pecans & Blue Cheese 14.00
Park Avenue Cafe Chopped Salad Goat Feta Cheese,Nigoise Olives,Marinated White 13.00
ENTREES
Horseradish Crusted Canadian Salmon,Potato Fritters, Marinated Cucumbers,Chive Vinaigrette 27.00
Sautéed Prawns with Mushroom Tortellini,Grilled Tomato Vinaigrette & Sweet Corn 29.50
"""
import pandas as pd
df = pd.DataFrame(meal_string_parser(meals_string),
                  columns='Category Meal_name Price'.split())
print(df)
Results:
Category Meal_name Price
0 APPETIZERS Southern Fried Quail with Greens,Huckleberries... 14.0
1 APPETIZERS Park Avenue Cafe Chopped Salad Goat Feta Chees... 13.0
2 ENTREES Horseradish Crusted Canadian Salmon,Potato Fri... 27.0
3 ENTREES Sautéed Prawns with Mushroom Tortellini,Grille... 29.5
