Generating multiple CSV files from a list in pandas, Python

I'm trying to create a new dataframe for each possible combination in 'combinations', reading the values from an existing dataframe. Here is an example of that dataframe:
+-------------------------------+-----+----------+---------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+---------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| Species | OGT | Domain | A | C | D | E | F | G | H | I | K | L | M | N | P | Q | R | S | T | V | W | Y |
+-------------------------------+-----+----------+---------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+---------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| Aeropyrum pernix | 95 | Archaea | 9.7659115711 | 0.6720465616 | 4.3895390781 | 7.6501943794 | 2.9344881615 | 8.8666657183 | 1.5011817208 | 5.6901432494 | 4.1428307243 | 11.0604191603 | 2.21143353 | 1.9387130928 | 5.1038552753 | 1.6855017182 | 7.7664358772 | 6.266067034 | 4.2052190807 | 9.2692433532 | 1.318690698 | 3.5614200159 |
| Argobacterium fabrum | 26 | Bacteria | 11.5698896021 | 0.7985475923 | 5.5884500155 | 5.8165463343 | 4.0512504104 | 8.2643271309 | 2.0116736244 | 5.7962804605 | 3.8931525401 | 9.9250463349 | 2.5980609708 | 2.9846761128 | 4.7828063605 | 3.1262365491 | 6.5684282943 | 5.9454781844 | 5.3740045968 | 7.3382308193 | 1.2519739683 | 2.3149400984 |
| Anaeromyxobacter dehalogenans | 27 | Bacteria | 16.0337898849 | 0.8860252895 | 5.1368827707 | 6.1864992608 | 2.9730203513 | 9.3167603253 | 1.9360386851 | 2.940143349 | 2.3473650439 | 10.898494736 | 1.6343905351 | 1.5247123262 | 6.3580285706 | 2.4715303021 | 9.2639057482 | 4.1890063803 | 4.3992339725 | 8.3885969061 | 1.2890166336 | 1.8265589289 |
| Aquifex aeolicus | 85 | Bacteria | 5.8730327277 | 0.795341216 | 4.3287799008 | 9.6746388172 | 5.1386954322 | 6.7148035486 | 1.5438364179 | 7.3358775924 | 9.4641440609 | 10.5736658776 | 1.9263080969 | 3.6183861236 | 4.0518679067 | 2.0493569604 | 4.9229955632 | 4.7976564501 | 4.2005259246 | 7.9169763709 | 0.9292167138 | 4.1438942987 |
| Archaeoglobus fulgidus | 83 | Archaea | 7.8742687687 | 1.1695110027 | 4.9165979364 | 8.9548767369 | 4.568636662 | 7.2640358917 | 1.4998752909 | 7.2472039919 | 6.8957233203 | 9.4826333048 | 2.6014466253 | 3.206476915 | 3.8419576418 | 1.7789787933 | 5.7572748236 | 5.4763351139 | 4.1490633048 | 8.6330814159 | 1.0325605451 | 3.6494619148 |
+-------------------------------+-----+----------+---------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+---------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
Here is my code at the moment.
import itertools
import pandas as pd
letters = ['G','A','L','M','F','W','K','Q','E','S','P','V','I','C','Y','H','R','N','D','T']
combinations = [''.join(i) for j in range(1,len(letters) + 1) for i in itertools.combinations(letters,r=j)]
df = pd.read_csv('COMPLETECOPYFORR.csv')
for combination in combinations:
    new_df = df[['Species', 'OGT']]
    new_df['Sum of percentage'] = df[list(combination)]
    new_df.to_csv(combination + '.csv')
The desired output is roughly one million CSV files (20 letters give 2^20 - 1 = 1,048,575 non-empty combinations), each named after its combination, so
G.csv, A.csv, through to GALMFWKQESPVICYHRNDT.csv, with each file looking like this:
Species OGT Sum of percentage
------------------------------- ----- -------------------
Aeropyrum pernix 95 23.4353
Anaeromyxobacter dehalogenans 26 20.3232
Argobacterium fabrum 27 14.2312
Aquifex aeolicus 85 15.0403
Archaeoglobus fulgidus 83 34.0532

It looks like you need to sum the selected columns row-wise:
new_df['Sum of percentage'] = df[list(combination)].sum(axis=1)
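Putting that fix into the loop, a minimal sketch could look like the following. The small dataframe and the three-letter list are stand-ins for COMPLETECOPYFORR.csv and the full list of twenty letters, and build_tables is a hypothetical helper name, not something from the question. Taking a .copy() of the two base columns is an extra precaution against pandas' SettingWithCopyWarning.

```python
import itertools
import pandas as pd

# Tiny stand-in for COMPLETECOPYFORR.csv with three of the twenty columns.
df = pd.DataFrame({
    "Species": ["Aeropyrum pernix", "Aquifex aeolicus"],
    "OGT": [95, 85],
    "G": [8.87, 6.71],
    "A": [9.77, 5.87],
    "L": [11.06, 10.57],
})
letters = ["G", "A", "L"]


def build_tables(df, letters):
    """Yield (name, table) for every non-empty combination of columns."""
    for r in range(1, len(letters) + 1):
        for combo in itertools.combinations(letters, r):
            # Copy so the new column is added to an independent frame.
            table = df[["Species", "OGT"]].copy()
            # Row-wise sum of the columns named in this combination.
            table["Sum of percentage"] = df[list(combo)].sum(axis=1)
            yield "".join(combo), table


tables = dict(build_tables(df, letters))

# Writing the files is left commented out to avoid side effects:
# for name, table in tables.items():
#     table.to_csv(name + ".csv", index=False)
```

With all twenty letters this produces 2^20 - 1 = 1,048,575 tables, so writing each one as it is generated is probably preferable to collecting them all in a dict first.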

Related

clean and split dataframe rows in different rows

I have a DataFrame of user information with Name as the index and columns such as Mail, Birthday, Genre, User and Age, but some cells contain lists instead of single values, for example:
Name | Mail | Birthday | Genre | User | Subscription | Age | Comments
A | A#gmail.com | [1-1-1990, | [M, | [Z, | Y | [33,| -
| | 1-1-2000, | F] | A] | | 23,|
| | 1-1-1998] | | | | 25]|
B | B#gmail.com | [1-1-1970, | F | B | [Y, | [53,| -
| | 13-2-1998]| | | N] | 24]|
C | [C#gmail.com, | [1-1-1985, | [M, | C | [Y, | [38,| -
| C2#gmail.com] | 1-1-1975] | F] | | N] | 53]|
D | D#gmail.com | 1-1-1980 | M | [D, | Y | 43 | -
| | | | Q, | | |
| | | | R, | | |
| | | | T] | | |
E | [E#gmail.com, | 2-6-1975 | F | [E, | [Y, | 48 |
| G#gmail.com] | | | G] | N] | |
| | | | | | |
I want to split these rows into separate rows, getting something like this depending on the values and the cases:
Name | Mail | Birthday | Genre | User | Subscription | Age | Comments
A | A#gmail.com | 1-1-1990 | M | Z | Y | 33 | -
A2 | A#gmail.com | 1-1-1998 | F | A | Y | 25 | -
| | | | | | |
B | B#gmail.com | 13-2-1998 | F | B | Y | 24 | -
| | | | | | |
C | C#gmail.com | 1-1-1985 | M | C | Y | 38 | -
C2 | C2#gmail.com | 1-1-1975 | F | C2 | N | 53 | -
| | | | | | |
D | D#gmail.com | 1-1-1980 | M | D | Y | 43 | -
| | | | | | |
E | E#gmail.com | 2-6-1975 | F | E | Y | 48 | -
E2 | G#gmail.com | 2-6-1975 | F | G | N | 48 | -
| | | | | | |
The different possible cases:
Remove placeholder dates such as 1-1-1970 and 1-1-2000, together with the ages that belong to those dates.
If only the User column holds a list and none of the other columns do, discard the list and use the mail (the part before the #) as the user.
If there is only one user but two or more values in other columns, take the mail (the part before the #) as the user.
Split the rows that contain lists; if a column does not hold a list, keep the same value in both rows.
I don't know if it's possible; I just took the data from a badly organized database.
I got the first DataFrame using the function from the answer to "How can i combine and pull apart rows of a DataFrame?".
I tried to split these rows with this function:
def split_list_cols(row):
    split_rows = []
    for col, val in row.items():
        if isinstance(val, list):
            for item in val:
                new_row = row.copy()
                new_row[col] = item
                split_rows.append(new_row)
            row[col] = None
    return split_rows
but it did not work well.
From this dataframe
Name | Mail | Birthday | Genre | User | Subscription | Age | Comments
E | [E#gmail.com, | 2-6-1975 | F | [E, | [Y, | 48 | -
| G#gmail.com] | | | G] | N] | |
| | | | | | |
gives me:
Name | Mail | Birthday | Genre | User | Subscription | Age | Comments
E | E#gmail.com | 2-6-1975 | F | [E, | [Y, | 48 | -
| | | | G] | N] | |
E | G#gmail.com | 2-6-1975 | F | [E, | [Y, | 48 | -
| | | | G] | N] | |
E | None | 2-6-1975 | F | E | [Y, | 48 |
| | | | | N] | |
E | None | 2-6-1975 | F | G | [Y, | 48 |
| | | | | N] | |
E | None | 2-6-1975 | F | None | Y | 48 |
| | | | | | |
| | | | | | |
E | None | 2-6-1975 | F | None | N | 48 |
and it should give:
E | E#gmail.com | 2-6-1975 | F | E | Y | 48 | -
E2 | G#gmail.com | 2-6-1975 | F | G | N | 48 | -
| | | | | | |
It's hard to understand all of the cases, but from what I understood you could do this:
df = df.explode(["Birthday", "Genre", "User", "Subscription"])
df = df[(df["Birthday"].ne("1-1-1970")) & (df["Birthday"].ne("1-1-2000"))]
df = df.drop_duplicates(ignore_index=True)
df["Name"] = df["Name"] + df.groupby("Name").cumcount().add(1).astype(str)
df["User"] = df["User"] + df.groupby("User").cumcount().add(1).astype(str)
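One caveat, as far as I know: exploding several columns at once needs pandas 1.3 or newer and requires the lists in each row to have matching lengths, so rows whose list columns differ in length are expected to raise a ValueError until they are cleaned up. For the single-row example from the question, where Mail, User and Subscription all hold two items, a minimal sketch could look like this. The dataframe is rebuilt by hand here with Name as an ordinary column, which is an assumption about how the real data is stored.

```python
import pandas as pd

# The single-row example from the question, rebuilt by hand.
df = pd.DataFrame({
    "Name": ["E"],
    "Mail": [["E#gmail.com", "G#gmail.com"]],
    "Birthday": ["2-6-1975"],
    "Genre": ["F"],
    "User": [["E", "G"]],
    "Subscription": [["Y", "N"]],
    "Age": [48],
    "Comments": ["-"],
})

# Explode the list columns together; scalar columns are repeated in each new row.
list_cols = ["Mail", "User", "Subscription"]
out = df.explode(list_cols, ignore_index=True)

# Number repeated names: the first occurrence keeps its name,
# later occurrences get the suffix 2, 3, ...
n = out.groupby("Name").cumcount()
out["Name"] = out["Name"] + n.add(1).astype(str).where(n > 0, "")
```

This yields the rows E and E2 as in the expected output. The other cases (placeholder dates, rows with lists of different lengths) would still need their own handling before the explode step.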

Reading data to python dataframe

I am struggling with reading data into a pandas DataFrame. I am an R programmer trying to do things in Python. How would I read the following data into a pandas DataFrame? The data is the raw result of an API call.
Thanks.
b'{"mk_id":"1200011617609","doc_type":"sales_order","opr_code":"0","count_code":"1051885/2022","doc_date":"2022-08-23+02:00","partner":{"mk_id":"400020633177","business_entity":"false","taxpayer":"false","foreign_county":"true","customer":"Emilia Chabadova","street":"Ga\xc5\xa1tanov\xc3\xa1 2915/13","street_number":"2915/13","post_number":"92101","place":"Pie\xc5\xa1\xc5\xa5any","country":"Slovakia","count_code":"5770789334526546744","partner_contact":{"gsm":"+421949340254","email":"emily.chabadova#gmail.com"},"mk_address_id":"400020530565","country_iso_2":"SK","buyer":"true","supplier":"false"},"receiver":{"mk_id":"400020633177","business_entity":"false","taxpayer":"false","foreign_county":"true","customer":"Emilia Chabadova","street":"Ga\xc5\xa1tanov\xc3\xa1 2915/13","street_number":"2915/13","post_number":"92101","place":"Pie\xc5\xa1\xc5\xa5any","country":"Slovakia","count_code":"5770789334526546744","partner_contact":{"gsm":"+421949340254","email":"emily.chabadova#gmail.com"},"mk_address_id":"400020530565","country_iso_2":"SK","buyer":"true","supplier":"false"},"currency_code":"EUR","status_code":"Zaklju\xc4\x8dena","doc_created_email":"stifter.rok#gmail.com","buyer_order":"SK-956103","warehouse":"glavno","delivery_type":"Gls_sk","product_list":[{"count_code":"54","mk_id":"266405022384","code":"MSS","name":"Mousse","unit":"kos","amount":"1","price":"16.66","price_with_tax":"19.99","tax":"200"},{"count_code":"53","mk_id":"266405022383","code":"MIT","name":"Mitt","unit":"kos","amount":"1","price":"0","tax":"200"},{"count_code":"48","mk_id":"266404892511","code":"TM","name":"Tanning mist","name_desc":"TM","unit":"kos","amount":"1","price":"0","tax":"200"}],"extra_column":[{"name":"tracking_number","value":"91114278162"}],"sum_basic":"16.66","sum_tax_200":"3.33","sum_all":"19.99","sum_paid":"19.99","profit_center":"SHINE BROWN, PROIZVODNJA, TRGOVINA IN STORITVE, D.O.O.","bank_ref_number":"10518852022","method_of_payment":"Pla\xc4\x8dilo po 
povzetju","order_create_ts":"2022-08-23T09:43:00+02:00","created_ts":"2022-08-23T11:59:14+02:00","shipped_date":"2022-08-24+02:00","doc_link_list":[{"mk_id":"266412181173","count_code":"SK-MK-36044","doc_type":"sales_bill_foreign"},{"mk_id":"400015161112","count_code":"1043748/2022","doc_type":"warehouse_packing_list"},{"mk_id":"1200011617609","count_code":"1051885/2022","doc_type":"sales_order"}]}'
You can start by doing something like this:
result = {"mk_id":"1200011617609","doc_type":"sales_order","opr_code":"0","count_code":"1051885/2022","doc_date":"2022-08-23+02:00","partner":{"mk_id":"400020633177","business_entity":"false","taxpayer":"false","foreign_county":"true","customer":"Emilia Chabadova","street":"Ga\xc5\xa1tanov\xc3\xa1 2915/13","street_number":"2915/13","post_number":"92101","place":"Pie\xc5\xa1\xc5\xa5any","country":"Slovakia","count_code":"5770789334526546744","partner_contact":{"gsm":"+421949340254","email":"emily.chabadova#gmail.com"},"mk_address_id":"400020530565","country_iso_2":"SK","buyer":"true","supplier":"false"},"receiver":{"mk_id":"400020633177","business_entity":"false","taxpayer":"false","foreign_county":"true","customer":"Emilia Chabadova","street":"Ga\xc5\xa1tanov\xc3\xa1 2915/13","street_number":"2915/13","post_number":"92101","place":"Pie\xc5\xa1\xc5\xa5any","country":"Slovakia","count_code":"5770789334526546744","partner_contact":{"gsm":"+421949340254","email":"emily.chabadova#gmail.com"},"mk_address_id":"400020530565","country_iso_2":"SK","buyer":"true","supplier":"false"},"currency_code":"EUR","status_code":"Zaklju\xc4\x8dena","doc_created_email":"stifter.rok#gmail.com","buyer_order":"SK-956103","warehouse":"glavno","delivery_type":"Gls_sk","product_list":[{"count_code":"54","mk_id":"266405022384","code":"MSS","name":"Mousse","unit":"kos","amount":"1","price":"16.66","price_with_tax":"19.99","tax":"200"},{"count_code":"53","mk_id":"266405022383","code":"MIT","name":"Mitt","unit":"kos","amount":"1","price":"0","tax":"200"},{"count_code":"48","mk_id":"266404892511","code":"TM","name":"Tanning mist","name_desc":"TM","unit":"kos","amount":"1","price":"0","tax":"200"}],"extra_column":[{"name":"tracking_number","value":"91114278162"}],"sum_basic":"16.66","sum_tax_200":"3.33","sum_all":"19.99","sum_paid":"19.99","profit_center":"SHINE BROWN, PROIZVODNJA, TRGOVINA IN STORITVE, D.O.O.","bank_ref_number":"10518852022","method_of_payment":"Pla\xc4\x8dilo po 
povzetju","order_create_ts":"2022-08-23T09:43:00+02:00","created_ts":"2022-08-23T11:59:14+02:00","shipped_date":"2022-08-24+02:00","doc_link_list":[{"mk_id":"266412181173","count_code":"SK-MK-36044","doc_type":"sales_bill_foreign"},{"mk_id":"400015161112","count_code":"1043748/2022","doc_type":"warehouse_packing_list"},{"mk_id":"1200011617609","count_code":"1051885/2022","doc_type":"sales_order"}]}
pd.DataFrame([result])
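Since the API returns bytes rather than a dict, another option is to parse the payload with json.loads, which accepts UTF-8 encoded bytes directly, and then flatten it with pd.json_normalize. The sketch below uses a shortened stand-in for the response (same structure, fewer fields), and the "order." prefix is an arbitrary choice for illustration.

```python
import json
import pandas as pd

# Shortened stand-in for the API response: same nesting, fewer fields.
raw = (
    b'{"mk_id":"1200011617609","doc_type":"sales_order",'
    b'"partner":{"customer":"Emilia Chabadova","place":"Pie\xc5\xa1\xc5\xa5any"},'
    b'"product_list":[{"code":"MSS","name":"Mousse","price":"16.66"},'
    b'{"code":"MIT","name":"Mitt","price":"0"}]}'
)

# json.loads decodes UTF-8 bytes itself, so no manual .decode() is needed.
order = json.loads(raw)

# One row per order; nested dicts become dotted columns such as "partner.place".
orders = pd.json_normalize(order)

# One row per product, with selected order-level fields attached to each row.
products = pd.json_normalize(
    order,
    record_path="product_list",
    meta=["mk_id", "doc_type"],
    meta_prefix="order.",
)
```

All values in this payload are strings, so numeric columns such as price would still need a conversion, for example with pd.to_numeric.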
Here is a way using BytesIO and pd.json_normalize:
from ast import literal_eval
from io import BytesIO
import pandas as pd
data = b'{"mk_id":"1200011617609","doc_type":"sales_order","opr_code":"0","count_code":"1051885/2022","doc_date":"2022-08-23+02:00","partner":{"mk_id":"400020633177","business_entity":"false","taxpayer":"false","foreign_county":"true","customer":"Emilia Chabadova","street":"Ga\xc5\xa1tanov\xc3\xa1 2915/13","street_number":"2915/13","post_number":"92101","place":"Pie\xc5\xa1\xc5\xa5any","country":"Slovakia","count_code":"5770789334526546744","partner_contact":{"gsm":"+421949340254","email":"emily.chabadova#gmail.com"},"mk_address_id":"400020530565","country_iso_2":"SK","buyer":"true","supplier":"false"},"receiver":{"mk_id":"400020633177","business_entity":"false","taxpayer":"false","foreign_county":"true","customer":"Emilia Chabadova","street":"Ga\xc5\xa1tanov\xc3\xa1 2915/13","street_number":"2915/13","post_number":"92101","place":"Pie\xc5\xa1\xc5\xa5any","country":"Slovakia","count_code":"5770789334526546744","partner_contact":{"gsm":"+421949340254","email":"emily.chabadova#gmail.com"},"mk_address_id":"400020530565","country_iso_2":"SK","buyer":"true","supplier":"false"},"currency_code":"EUR","status_code":"Zaklju\xc4\x8dena","doc_created_email":"stifter.rok#gmail.com","buyer_order":"SK-956103","warehouse":"glavno","delivery_type":"Gls_sk","product_list":[{"count_code":"54","mk_id":"266405022384","code":"MSS","name":"Mousse","unit":"kos","amount":"1","price":"16.66","price_with_tax":"19.99","tax":"200"},{"count_code":"53","mk_id":"266405022383","code":"MIT","name":"Mitt","unit":"kos","amount":"1","price":"0","tax":"200"},{"count_code":"48","mk_id":"266404892511","code":"TM","name":"Tanning mist","name_desc":"TM","unit":"kos","amount":"1","price":"0","tax":"200"}],"extra_column":[{"name":"tracking_number","value":"91114278162"}],"sum_basic":"16.66","sum_tax_200":"3.33","sum_all":"19.99","sum_paid":"19.99","profit_center":"SHINE BROWN, PROIZVODNJA, TRGOVINA IN STORITVE, D.O.O.","bank_ref_number":"10518852022","method_of_payment":"Pla\xc4\x8dilo po 
povzetju","order_create_ts":"2022-08-23T09:43:00+02:00","created_ts":"2022-08-23T11:59:14+02:00","shipped_date":"2022-08-24+02:00","doc_link_list":[{"mk_id":"266412181173","count_code":"SK-MK-36044","doc_type":"sales_bill_foreign"},{"mk_id":"400015161112","count_code":"1043748/2022","doc_type":"warehouse_packing_list"},{"mk_id":"1200011617609","count_code":"1051885/2022","doc_type":"sales_order"}]}'
df = pd.DataFrame(BytesIO(data))
df[0] = df[0].str.decode("utf-8").apply(literal_eval)
df = pd.json_normalize(
    data=df.pop(0),
    record_path="product_list",
    meta=["mk_id", "doc_type", "opr_code", "count_code", "doc_date", "currency_code",
          "status_code", "doc_created_email", "buyer_order", "warehouse", "delivery_type"],
    meta_prefix="meta."
)
print(df.to_markdown())
| | count_code | mk_id | code | name | unit | amount | price | price_with_tax | tax | name_desc | meta.mk_id | meta.doc_type | meta.opr_code | meta.count_code | meta.doc_date | meta.currency_code | meta.status_code | meta.doc_created_email | meta.buyer_order | meta.warehouse | meta.delivery_type |
|---:|-------------:|-------------:|:-------|:-------------|:-------|---------:|--------:|-----------------:|------:|:------------|--------------:|:----------------|----------------:|:------------------|:-----------------|:---------------------|:-------------------|:-------------------------|:-------------------|:-----------------|:---------------------|
| 0 | 54 | 266405022384 | MSS | Mousse | kos | 1 | 16.66 | 19.99 | 200 | nan | 1200011617609 | sales_order | 0 | 1051885/2022 | 2022-08-23+02:00 | EUR | Zaključena | stifter.rok#gmail.com | SK-956103 | glavno | Gls_sk |
| 1 | 53 | 266405022383 | MIT | Mitt | kos | 1 | 0 | nan | 200 | nan | 1200011617609 | sales_order | 0 | 1051885/2022 | 2022-08-23+02:00 | EUR | Zaključena | stifter.rok#gmail.com | SK-956103 | glavno | Gls_sk |
| 2 | 48 | 266404892511 | TM | Tanning mist | kos | 1 | 0 | nan | 200 | TM | 1200011617609 | sales_order | 0 | 1051885/2022 | 2022-08-23+02:00 | EUR | Zaključena | stifter.rok#gmail.com | SK-956103 | glavno | Gls_sk |

Averaging and plotting a non-systematic/arranged data using Python/Pandas

I have unordered data (the rows are not sorted by x) as follows:
+-------------+------------------+
| x | y |
+-------------+------------------+
| 0.049098 | 82854.2105263158 |
| 0.049058 | 82472.2368421053 |
| 0.066427 | 84358.3421052632 |
| 0.066465 | 83842.9210526316 |
| 0.06095 | 71843.6052631579 |
| 0.060989 | 71951.7368421053 |
| 0.066999 | 84068.5526315789 |
| 0.067037 | 83808.5263157895 |
| 0.089523 | 101753.684210526 |
| 0.089483 | 101556.842105263 |
| 0.084839 | 97206.7105263158 |
| 0.084876 | 97108.8421052632 |
| 0.063842 | 88679.5263157895 |
| 0.063802 | 88309.9210526316 |
| 0.037228 | 11944.3947368421 |
| 0.037268 | 11993.3421052632 |
| 0.029821 | 8830.4131578947 |
| 0.029861 | 8822.0815789474 |
| 0.03014 | 8938.6973684211 |
| 0.03018 | 8964.6868421053 |
| 0.00817 | 138170 |
| 0.0081363 | 137640 |
| 0.093207 | 103233.947368421 |
| 0.093244 | 103177.631578947 |
| 0.097776 | 106011.578947368 |
| 0.097814 | 106073.421052632 |
| 0.0089158 | 135440 |
| 0.0088818 | 136660 |
| 0.02952 | 90309.3421052632 |
| 0.029481 | 89523 |
| 0.034049 | 10589.8342105263 |
| 0.034089 | 10666.1973684211 |
| 0.086063 | 98010.6315789474 |
| 0.0861 | 98108.9736842105 |
| 0.045509 | 82204.8947368421 |
| 0.045469 | 81673.8947368421 |
| 0.03045 | 87057.7105263158 |
| 0.030411 | 89830.2105263158 |
| 0.0072205 | 5150.6763157895 |
| 0.0072587 | 5151.1710526316 |
| 0.068255 | 83407.7894736842 |
| 0.068293 | 83492.1052631579 |
| 0.06145 | 73967.8684210526 |
| 0.061488 | 74132.5789473684 |
| 0.027424 | 8204.1210526316 |
| 0.027464 | 8205.0763157895 |
| 0.027184 | 8141.3947368421 |
| 0.027224 | 8146.002631579 |
| 0.046346 | 81611.4736842105 |
| 0.046306 | 81550.8947368421 |
| 0.058526 | 58270.5526315789 |
| 0.058564 | 58725.2631578947 |
| 0.051402 | 29829.4473684211 |
| 0.05144 | 29684.2105263158 |
| 0.0014855 | 5757.1894736842 |
| 0.0015227 | 5742.7289473684 |
| 0.068954 | 91718 |
| 0.068914 | 91719.2894736842 |
| 0.091635 | 104896.052631579 |
| 0.091595 | 104854.210526316 |
| 0.038524 | 82972.3421052632 |
| 0.038484 | 83128.8684210526 |
| 0.0094275 | 133930 |
| 0.0093933 | 133770 |
| 0.098839 | 105576.842105263 |
| 0.098833 | 105552.105263158 |
| 0.0087119 | 136020 |
| 0.0086779 | 136080 |
| 0.049537 | 82553.5789473684 |
| 0.049497 | 82109.6578947368 |
| 0.0099132 | 5289.7473684211 |
| 0.0099519 | 5290.3421052632 |
| 0.069273 | 91867.1842105263 |
| 0.069233 | 91812 |
| 0.039564 | 12888.1578947368 |
| 0.039603 | 13071.3947368421 |
| 0.023351 | 7234.2473684211 |
| 0.023391 | 7244.1789473684 |
| 0.085419 | 94590.4210526316 |
| 0.085379 | 94557.6578947368 |
| 0.077463 | 90603.6315789474 |
| 0.0775 | 90673.7105263158 |
| 0.015378 | 5389.0578947369 |
| 0.015417 | 5376.9263157895 |
| 0.090262 | 101246.315789474 |
| 0.090299 | 101226.578947368 |
| 0.090969 | 101686.052631579 |
| 0.091006 | 101689.210526316 |
| 0.040275 | 13386.1578947368 |
| 0.040314 | 13415.2368421053 |
| 0.065053 | 84553.3157894737 |
| 0.065091 | 84354.8157894737 |
| 0.041064 | 13609.8684210526 |
| 0.041103 | 13574 |
| 0.028143 | 8369.052631579 |
| 0.028183 | 8367.3710526316 |
| 0.041057 | 81182.5789473684 |
| 0.041018 | 82941.2368421053 |
| 0.049623 | 23859.1315789474 |
| 0.049662 | 24037.2368421053 |
| 0.036984 | 83875.2368421053 |
| 0.036945 | 84167.6315789474 |
| 0.083188 | 94965.8947368421 |
| 0.083148 | 94978.3421052632 |
| 0.058336 | 57671 |
| 0.058374 | 57815.0263157895 |
| 0.089369 | 100686.315789474 |
| 0.089406 | 100711.052631579 |
| 0.069592 | 92141.2368421053 |
| 0.069552 | 92152.5789473684 |
| 0.025793 | 94431.7894736842 |
| 0.025755 | 94648.2368421053 |
| 0.098749 | 105945.526315789 |
| 0.098743 | 105963.421052632 |
| 0.0098043 | 132600 |
| 0.00977 | 133260 |
| 0.060082 | 62819.4210526316 |
| 0.06012 | 62454.0789473684 |
| 0.033059 | 86467 |
| 0.03302 | 86483.9736842105 |
| 0.0036852 | 153970 |
| 0.003653 | 154510 |
| 0.0085422 | 136250 |
| 0.0085083 | 136650 |
| 0.056689 | 84997.6578947368 |
| 0.056649 | 85002 |
| 0.096125 | 105024.736842105 |
| 0.096162 | 105118.421052632 |
| 0.021898 | 101320 |
| 0.02186 | 101500 |
| 0.076134 | 94468.3157894737 |
| 0.076094 | 94467.5789473684 |
| 0.053904 | 42192.3157894737 |
| 0.053942 | 42010.052631579 |
| 0.098707 | 106005 |
| 0.098701 | 106008.947368421 |
| 0.049546 | 23698.6052631579 |
| 0.049584 | 23714.5263157895 |
| 0.067636 | 90985 |
| 0.067596 | 90934 |
| 0.053851 | 84300.7368421053 |
| 0.053811 | 83469.3157894737 |
| 0.075979 | 88750.5 |
| 0.076016 | 88815.7894736842 |
| 0.071428 | 93062.2894736842 |
| 0.071388 | 93069.7368421053 |
| 0.0036544 | 5232.8736842105 |
| 0.0036921 | 5239.252631579 |
| 0.047332 | 17610.8947368421 |
| 0.047371 | 17750.7894736842 |
| 0.03975 | 82067.8684210526 |
| 0.03971 | 83408.6578947368 |
| 0.0038426 | 5238.65 |
| 0.0038802 | 5240.4131578947 |
| 0.014999 | 116980 |
| 0.014964 | 116700 |
| 0.06467 | 84685.7368421053 |
| 0.064709 | 84580.6578947368 |
| 0.047137 | 17372.6578947368 |
| 0.047176 | 17571.9473684211 |
| 0.083467 | 96096.2368421053 |
| 0.083504 | 96070.5263157895 |
| 0.037576 | 82819.5789473684 |
| 0.037536 | 84310.5526315789 |
| 0.016188 | 114852.368421053 |
| 0.016152 | 116141.842105263 |
| 0.016477 | 113262.631578947 |
| 0.016441 | 113310 |
| 0.02246 | 100100 |
| 0.022423 | 100270 |
| 0.005354 | 5153.0815789474 |
| 0.005392 | 5142.1552631579 |
| 0.0284 | 90929 |
| 0.028362 | 91147.2631578947 |
| 0.083626 | 94884.6578947368 |
| 0.083586 | 94890.3421052632 |
| 0.06532 | 89854.8947368421 |
| 0.06528 | 89501 |
| 0.094505 | 106278.947368421 |
| 0.094465 | 106231.578947368 |
| 0.076387 | 88890.1315789474 |
| 0.076424 | 89590.1578947368 |
| 0.055207 | 47767.2894736842 |
| 0.055245 | 47816.7631578947 |
| 0.013148 | 121650 |
| 0.013113 | 122170 |
| 0.058754 | 59487.1842105263 |
| 0.058792 | 59770.1578947368 |
| 0.089592 | 100777.631578947 |
| 0.089629 | 100783.421052632 |
| 0.038813 | 12513.2368421053 |
| 0.038852 | 12835.4473684211 |
| 0.040621 | 82482.0789473684 |
| 0.040581 | 82730.3947368421 |
| 0.079242 | 92299.3157894737 |
| 0.079279 | 92281.5789473684 |
| 0.088328 | 98621.8947368421 |
| 0.088288 | 98518.9736842105 |
| 0.044673 | 81556.3157894737 |
| 0.044633 | 81528.5789473684 |
| 0.0033321 | 155138.947368421 |
| 0.0033001 | 155460 |
| 0.056449 | 84852 |
| 0.056409 | 84876.5 |
| 0.056289 | 84755 |
| 0.056249 | 84690.4473684211 |
| 0.083825 | 94857.6052631579 |
| 0.083786 | 94850.7105263158 |
| 0.097551 | 105735.526315789 |
| 0.097589 | 105823.947368421 |
| 0.086991 | 98877.4473684211 |
| 0.087028 | 98838.8421052632 |
| 0.017095 | 111495.263157895 |
| 0.017058 | 111965.263157895 |
| 0.071779 | 82918.1315789474 |
| 0.071817 | 82398.5789473684 |
| 0.054287 | 43715.9736842105 |
| 0.054326 | 43708.8421052632 |
| 0.082112 | 95046.4210526316 |
| 0.082072 | 95044 |
| 0.068369 | 83746.3157894737 |
| 0.068407 | 83829.8684210526 |
| 0.027104 | 8103.8631578947 |
| 0.027144 | 8118.85 |
| 0.011047 | 129060 |
| 0.011012 | 128640 |
| 0.084582 | 94701.0263157895 |
| 0.084543 | 94672.7368421053 |
| 0.050495 | 83239.7631578947 |
| 0.050455 | 83211.9210526316 |
| 0.087251 | 99002.3421052632 |
| 0.087288 | 99021.7368421053 |
| 0.060446 | 86629 |
| 0.060406 | 86340 |
| 0.036315 | 83643.4210526316 |
| 0.036275 | 83463.7368421053 |
| 0.030699 | 9294.2289473684 |
| 0.030739 | 9340.0578947369 |
| 0.039682 | 13377.1842105263 |
| 0.039722 | 13303.4736842105 |
| 0.071364 | 83794.9210526316 |
| 0.071401 | 83361.5 |
| 0.080319 | 94917.5 |
| 0.080279 | 94918.3157894737 |
| 0.072505 | 93484.7894736842 |
| 0.072465 | 93468.6315789474 |
| 0.034743 | 84187.2894736842 |
| 0.034704 | 84421.552631579 |
| 0.066159 | 89952.6842105263 |
| 0.066119 | 89935 |
| 0.007565 | 5184.3157894737 |
| 0.0076033 | 5181.4763157895 |
| 0.020563 | 6035.7105263158 |
| 0.020603 | 6043.1578947369 |
| 0.046007 | 15889.9736842105 |
| 0.046046 | 15788.5526315789 |
| 0.017351 | 5446.0736842105 |
| 0.01739 | 5445.7078947369 |
| 0.05481 | 84160.3684210526 |
| 0.05477 | 84138.1578947368 |
| 0.063714 | 84575.8947368421 |
| 0.063752 | 84555.3684210526 |
| 0.038338 | 12504.8421052632 |
| 0.038377 | 12557.4736842105 |
| 0.011037 | 5331.7342105263 |
| 0.011076 | 5325.8736842105 |
| 0.002124 | 158640 |
| 0.0020924 | 158730 |
| 0.014356 | 5376.3447368421 |
| 0.014396 | 5381.3842105263 |
| 0.066359 | 90386 |
| 0.066319 | 90076.0789473684 |
| 0.017715 | 109750 |
| 0.017678 | 109140 |
| 0.013846 | 5385.45 |
| 0.013886 | 5373.5157894737 |
| 0.032356 | 87047.1842105263 |
| 0.032317 | 87202.2368421053 |
| 0.00022592 | 6227.5342105263 |
| 0.00026284 | 6525.5394736842 |
| 0.026944 | 8170.5052631579 |
| 0.026984 | 8161.0552631579 |
| 0.067684 | 84077.5 |
| 0.067722 | 84406.1578947368 |
| 0.00056151 | 154602.631578947 |
| 0.00053058 | 154025.263157895 |
| 0.01369 | 5392.0289473684 |
| 0.013729 | 5388.5763157895 |
| 0.069712 | 92249.6315789474 |
| 0.069672 | 92203 |
| 0.090336 | 101292.631578947 |
| 0.041221 | 13574.5 |
| 0.041261 | 13649.3157894737 |
| 0.088248 | 98377.2631578947 |
| 0.032747 | 86214.6578947368 |
| 0.032707 | 86199.7105263158 |
| 0.018063 | 5426.3657894737 |
| 0.018102 | 5400.5368421053 |
| 0.037221 | 82838 |
| 0.037182 | 82739.0526315789 |
| 0.079316 | 92217.8421052632 |
| -0.00010567 | 22281.2894736842 |
| -0.00010105 | 16333.2105263158 |
| 0.073491 | 86313.1315789474 |
| 0.073528 | 85860.3157894737 |
| 0.069113 | 91703.1052631579 |
| 0.069073 | 91686.2894736842 |
| 0.077489 | 94662.4210526316 |
| 0.077449 | 94656.2631578947 |
| 0.064242 | 89154.7368421053 |
| 0.064202 | 88866.8947368421 |
| 0.032512 | 86583 |
| 0.032473 | 86524 |
| 0.062564 | 88165 |
| 0.062524 | 87488.0789473684 |
| 0.036792 | 11964.3947368421 |
| 0.036831 | 11973.1315789474 |
| 0.027823 | 8258.2552631579 |
| 0.027863 | 8251.5105263158 |
| 0.078008 | 94725.7105263158 |
| 0.077968 | 94721.5263157895 |
| 0.071747 | 93315.2105263158 |
| 0.071707 | 93235.3421052632 |
| 0.04546 | 15240.2631578947 |
| 0.045499 | 15235.0526315789 |
| 0.044482 | 14225 |
| 0.044521 | 14200.3947368421 |
| 0.024748 | 7701.9210526316 |
| 0.024788 | 7717.5815789474 |
| 0.046545 | 82522.7105263158 |
| 0.046505 | 83732.5789473684 |
| 0.033646 | 85096.8947368421 |
| 0.033607 | 85049 |
| 0.042923 | 82511.9473684211 |
| 0.042883 | 82387.2894736842 |
| 0.03076 | 88367.9210526316 |
| 0.030722 | 88564.7105263158 |
| 0.087696 | 99199.4473684211 |
| 0.087733 | 99362.7368421053 |
| 0.026106 | 8012.5921052632 |
| 0.026145 | 8007.1894736842 |
| 0.098174 | 106645.263157895 |
| 0.098134 | 106593.421052632 |
| 0.086654 | 95285.7368421053 |
| 0.086614 | 95258.9210526316 |
| 0.047382 | 82383.3421052632 |
| 0.047342 | 81833.7105263158 |
| 0.085247 | 97339.8157894737 |
| 0.085284 | 97644.3157894737 |
| 0.04409 | 14122.2368421053 |
| 0.044129 | 14113.7368421053 |
| 0.047901 | 81497 |
| 0.047861 | 81418.3157894737 |
| 0.026825 | 92594.1315789474 |
| 0.026787 | 92662.3684210526 |
| 0.088625 | 100252.184210526 |
| 0.088662 | 100399.131578947 |
| 0.029461 | 8646.752631579 |
| 0.029501 | 8660.6105263158 |
| 0.01543 | 115970 |
| 0.015394 | 115980 |
| 0.021836 | 6580.3368421053 |
| 0.021876 | 6564.8184210526 |
| 0.057528 | 85296 |
| 0.057488 | 85006 |
| 0.032903 | 86497 |
| 0.032864 | 86557 |
| 0.042874 | 13655.0263157895 |
| 0.042913 | 13685.4473684211 |
| 0.026175 | 94264.5 |
| 0.026136 | 94268 |
| 0.023931 | 97683.052631579 |
| 0.023893 | 97162.5789473684 |
| 0.086434 | 98644.9736842105 |
| 0.086471 | 98392.6052631579 |
| 0.0097586 | 5293.0263157895 |
| 0.0097972 | 5290.9763157895 |
| 0.0019695 | 5567.652631579 |
| 0.0020067 | 5540.9947368421 |
| 0.062218 | 78864.3947368421 |
| 0.062257 | 79100.7105263158 |
| 0.085658 | 94532.6052631579 |
| 0.085618 | 94520.1842105263 |
| 0.07039 | 92623.2631578947 |
| 0.070351 | 92675.2894736842 |
| 0.096459 | 106720 |
| 0.096419 | 106790.789473684 |
| 0.059167 | 85700 |
| 0.059127 | 85778 |
| 0.034272 | 84811.6842105263 |
| 0.034233 | 84255.4736842105 |
| 0.024987 | 7870.9342105263 |
| 0.025027 | 7889.0263157895 |
| 0.037743 | 11876 |
| 0.037783 | 12060.9473684211 |
| 0.042991 | 13413.9210526316 |
| 0.043031 | 13357.1315789474 |
| 0.097974 | 107210.263157895 |
| 0.097935 | 107250.263157895 |
| 0.095788 | 104838.947368421 |
| 0.095825 | 104789.473684211 |
| 0.098491 | 105937.894736842 |
| 0.098529 | 105891.842105263 |
| 0.062124 | 87670 |
| 0.062084 | 87664 |
| 0.046864 | 82101.0263157895 |
| 0.046824 | 82290.8157894737 |
| 0.016087 | 5395.5868421053 |
| 0.016127 | 5394.0657894737 |
| 0.065015 | 84096.0526315789 |
| 0.090918 | 104228.684210526 |
| 0.090878 | 104018.421052632 |
| 0.065664 | 84754.8947368421 |
| 0.065702 | 84734.4473684211 |
| 0.012619 | 123970 |
| 0.012584 | 124412.631578947 |
| 0.064561 | 89115.0263157895 |
| 0.064521 | 88996.7368421053 |
| 0.048382 | 20419.7368421053 |
| 0.048421 | 20430.5526315789 |
| 0.07741 | 94648.2631578947 |
| 0.010148 | 131200 |
| 0.010114 | 131180 |
| 0.031421 | 88356.7105263158 |
| 0.031382 | 88397.8157894737 |
| 0.0018949 | 5593.8789473684 |
| 0.0019322 | 5588.7631578947 |
| 0.056048 | 50006.8157894737 |
| 0.056086 | 50654.4736842105 |
| 0.024468 | 7700.4473684211 |
| 0.024508 | 7719.8421052632 |
| 0.078566 | 94788.3947368421 |
| 0.078526 | 94786.3421052632 |
| 0.089045 | 100410 |
| 0.089005 | 100279.210526316 |
| 0.026225 | 7953.7 |
| 0.026265 | 7956.9105263158 |
| 0.021636 | 100290 |
| 0.021598 | 100420 |
| 0.013715 | 120830 |
| 0.01368 | 121700 |
| 0.05765 | 55683.2631578947 |
| 0.057689 | 55756.9210526316 |
| 0.0025295 | 5411.2236842105 |
| 0.0025668 | 5389.6973684211 |
| 0.046704 | 82967 |
| 0.046665 | 82828.552631579 |
| 0.050615 | 81974 |
| 0.050575 | 82113 |
| 0.071907 | 93276.4210526316 |
| 0.071867 | 93275.5526315789 |
| 0.077569 | 94671.2631578947 |
| 0.077529 | 94667.1578947368 |
| 0.057079 | 53966 |
| 0.057117 | 54124.1842105263 |
| 0.010649 | 5281.5394736842 |
| 0.010688 | 5270.6394736842 |
| 0.051093 | 29069.0263157895 |
| 0.051132 | 29441.1315789474 |
| 0.019257 | 105790 |
| 0.01922 | 105980 |
| 0.059931 | 62444.0789473684 |
| 0.059968 | 62989.5263157895 |
| 0.087491 | 96730.8684210526 |
| 0.087451 | 96664.7894736842 |
| 0.091379 | 101955.263157895 |
| 0.091416 | 101995.526315789 |
| 0.095143 | 106449.210526316 |
| 0.095103 | 106461.578947368 |
| 0.087889 | 97553.9210526316 |
| 0.08785 | 97504 |
| 0.024428 | 7661.7315789474 |
| 0.014788 | 5407.6947368421 |
| 0.014828 | 5407.6315789474 |
| 0.083541 | 96009.3947368421 |
| 0.037664 | 12100.8157894737 |
| 0.037704 | 11912.6842105263 |
| 0.070311 | 92669.6842105263 |
| 0.031458 | 9692.202631579 |
| 0.031498 | 9714.7368421053 |
| 0.072561 | 84201.3421052632 |
| 0.072598 | 84699.2105263158 |
| 0.067876 | 91029.6578947368 |
| 0.067836 | 91104.0263157895 |
| 0.035998 | 11560.7105263158 |
| 0.036037 | 11357.4473684211 |
| 0.042963 | 82322 |
| 0.022912 | 100400 |
| 0.022874 | 99962.6842105263 |
| 0.0064626 | 143520 |
| 0.0064294 | 143710 |
| 0.085499 | 94484.4210526316 |
| 0.085459 | 94543.5 |
| 0.091528 | 102023.684210526 |
+-------------+------------------+
When I plot it as a line plot, I get a really messy plot.
If I plot it with symbols (a scatter plot) instead, it makes more sense:
The scatter plot clearly shows two separate curves (upper and lower). I want to average every 5-10 adjacent points on the upper curve and on the lower curve separately to make each curve smoother (as indicated by the arrows in the scatter plot), and then draw the result as a line plot. The main difficulty is that, because the rows are in random order, pandas averages points that are adjacent in the file, which may belong to either the upper or the lower curve, rather than points that are adjacent along one curve. I tried to write a small script to plot it but could not achieve the desired output.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import math
dataframe1 = pd.read_csv('excel_file_trial.csv')
plt.figure(1) #
plt.scatter(dataframe1.iloc[:,0], dataframe1.iloc[0:,1], s=1, marker="o", facecolors='none', edgecolor='black', linewidths=2)
plt.xlabel("x", fontsize=17)
plt.ylabel("y", fontsize=17)
plt.tight_layout()
plt.savefig("trialv0.png", bbox_inches = "tight", format='png', dpi=300)
# figure 3
plt.figure(3) #
plt.plot(dataframe1.iloc[:,0], dataframe1.iloc[0:,1], color="green", linewidth=3.5)
plt.xlabel("x", fontsize=17)
plt.ylabel("y", fontsize=17)
plt.tight_layout()
plt.savefig("trialv2.png", bbox_inches = "tight", format='png', dpi=300)
Could anyone please help/suggest in this regard. Many thanks in advance.
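One possible approach, sketched under stated assumptions rather than offered as a definitive answer: sort the rows by x first, label each point as "upper" or "lower" by comparing it with a wide centred rolling mean of all points, and then average adjacent points within each branch only. The function name smooth_two_branches, the window parameter, the column names "x" and "y" and the synthetic demo data are all hypothetical. The labelling step assumes the rolling mean of the mixed points falls between the two curves, so it becomes unreliable wherever the curves come close together.

```python
import numpy as np
import pandas as pd

def smooth_two_branches(df, xcol, ycol, window=5):
    """Sort by x, split points into two branches, smooth each branch separately."""
    # Sort so that neighbouring rows are neighbours along the x axis.
    df = df.sort_values(xcol).reset_index(drop=True)
    # Assumption: a wide centred rolling mean of all points runs between
    # the two curves, so it can serve as a dividing line.
    divider = df[ycol].rolling(2 * window + 1, center=True, min_periods=1).mean()
    df["branch"] = np.where(df[ycol] > divider, "upper", "lower")
    # Average adjacent points within each branch only.
    df["y_smooth"] = df.groupby("branch")[ycol].transform(
        lambda s: s.rolling(window, center=True, min_periods=1).mean()
    )
    return df

# Synthetic two-curve data in random row order (not the data from the question).
x = np.linspace(0, 1, 40)
demo = pd.DataFrame({
    "x": np.concatenate([x, x + 0.001]),
    "y": np.concatenate([100 - 10 * x, 10 + 10 * x]),
}).sample(frac=1, random_state=0)

out = smooth_two_branches(demo, "x", "y")
```

Each branch can then be drawn as its own line, for example by looping over out.groupby("branch") and calling plt.plot on the "x" and "y_smooth" columns of each group.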

Using pandas as a distance matrix and then getting a sub-dataframe of relevant distances

I have created a pandas df that has the distances between location i and location j. Beginning with a start point P1 and end point P2, I want to find the sub-dataframe (distance matrix) that has one axis of the df having P1, P2 and the other axis having the rest of the indices.
I'm using a pandas DataFrame because I think it's the most efficient way:
dm_dict = ...  # distance matrix in dict form where you can call dm_dict[i][j] and get the distance from i to j
dm_df = pd.DataFrame.from_dict(dm_dict)
P1 = dm_df.max(axis=0).idxmax()
P2 = dm_df[P1].idxmax()
route = [P1, P2]
remaining_locs = dm_df[dm_df[~dm_df.isin(route)].isin(route)]
while not_done:
    # go through the remaining_locs until all the locations have been added
    ...
There are no error messages, but the remaining_locs df is full of NaNs rather than being a df with the distances.
Using dm_df[~dm_df.isin(route)].isin(route) seems to give me a boolean df that is accurate.
Sample data (technically it should be the haversine distance, but the Euclidean distance is fine for filling up the matrix):
import numpy

def dist(i, j):
    a = numpy.array((i[1], i[2]))
    b = numpy.array((j[1], j[2]))
    return numpy.linalg.norm(a - b)

locations = [
    ("Ottawa", 45.424722, -75.695),
    ("Edmonton", 53.533333, -113.5),
    ("Victoria", 48.428611, -123.365556),
    ("Winnipeg", 49.899444, -97.139167),
    ("Fredericton", 49.899444, -97.139167),
    ("StJohns", 47.561389, -52.7125),
    ("Halifax", 44.647778, -63.571389),
    ("Toronto", 43.741667, -79.373333),
    ("Charlottetown", 46.238889, -63.129167),
    ("QuebecCity", 46.816667, -71.216667),
    ("Regina", 50.454722, -104.606667),
    ("Yellowknife", 62.442222, -114.3975),
    ("Iqaluit", 63.748611, -68.519722),
]
dm_dict = {i: {j: dist(i, j) for j in locations if j != i} for i in locations}
It looks like you want scipy's distance_matrix:
import pandas as pd
from scipy.spatial import distance_matrix

df = pd.DataFrame(locations)
x = df[[1, 2]]  # latitude and longitude columns
dm = pd.DataFrame(distance_matrix(x, x),
                  index=df[0],
                  columns=df[0])
Output:
+----------------+------------+------------+------------+------------+--------------+------------+------------+------------+----------------+-------------+------------+--------------+-----------+
| | Ottawa | Edmonton | Victoria | Winnipeg | Fredericton | StJohns | Halifax | Toronto | Charlottetown | QuebecCity | Regina | Yellowknife | Iqaluit |
+----------------+------------+------------+------------+------------+--------------+------------+------------+------------+----------------+-------------+------------+--------------+-----------+
| 0 | | | | | | | | | | | | | |
+----------------+------------+------------+------------+------------+--------------+------------+------------+------------+----------------+-------------+------------+--------------+-----------+
| Ottawa | 0.000000 | 38.664811 | 47.765105 | 21.906059 | 21.906059 | 23.081609 | 12.148481 | 4.045097 | 12.592181 | 4.689667 | 29.345960 | 42.278586 | 19.678657 |
| Edmonton | 38.664811 | 0.000000 | 11.107987 | 16.759535 | 16.759535 | 61.080146 | 50.713108 | 35.503607 | 50.896264 | 42.813477 | 9.411122 | 8.953983 | 46.125669 |
| Victoria | 47.765105 | 11.107987 | 0.000000 | 26.267600 | 26.267600 | 70.658378 | 59.913580 | 44.241193 | 60.276176 | 52.173796 | 18.867990 | 16.637528 | 56.945306 |
| Winnipeg | 21.906059 | 16.759535 | 26.267600 | 0.000000 | 0.000000 | 44.488147 | 33.976105 | 18.802741 | 34.206429 | 26.105163 | 7.488117 | 21.334745 | 31.794214 |
| Fredericton | 21.906059 | 16.759535 | 26.267600 | 0.000000 | 0.000000 | 44.488147 | 33.976105 | 18.802741 | 34.206429 | 26.105163 | 7.488117 | 21.334745 | 31.794214 |
| StJohns | 23.081609 | 61.080146 | 70.658378 | 44.488147 | 44.488147 | 0.000000 | 11.242980 | 26.933071 | 10.500284 | 18.519147 | 51.974763 | 63.454538 | 22.625084 |
| Halifax | 12.148481 | 50.713108 | 59.913580 | 33.976105 | 33.976105 | 11.242980 | 0.000000 | 15.827902 | 1.651422 | 7.946971 | 41.444115 | 53.851052 | 19.731392 |
| Toronto | 4.045097 | 35.503607 | 44.241193 | 18.802741 | 18.802741 | 26.933071 | 15.827902 | 0.000000 | 16.434995 | 8.717042 | 26.111037 | 39.703942 | 22.761342 |
| Charlottetown | 12.592181 | 50.896264 | 60.276176 | 34.206429 | 34.206429 | 10.500284 | 1.651422 | 16.434995 | 0.000000 | 8.108112 | 41.691201 | 53.767927 | 18.320711 |
| QuebecCity | 4.689667 | 42.813477 | 52.173796 | 26.105163 | 26.105163 | 18.519147 | 7.946971 | 8.717042 | 8.108112 | 0.000000 | 33.587610 | 45.921044 | 17.145385 |
| Regina | 29.345960 | 9.411122 | 18.867990 | 7.488117 | 7.488117 | 51.974763 | 41.444115 | 26.111037 | 41.691201 | 33.587610 | 0.000000 | 15.477744 | 38.457705 |
| Yellowknife | 42.278586 | 8.953983 | 16.637528 | 21.334745 | 21.334745 | 63.454538 | 53.851052 | 39.703942 | 53.767927 | 45.921044 | 15.477744 | 0.000000 | 45.896374 |
| Iqaluit | 19.678657 | 46.125669 | 56.945306 | 31.794214 | 31.794214 | 22.625084 | 19.731392 | 22.761342 | 18.320711 | 17.145385 | 38.457705 | 45.896374 | 0.000000 |
+----------------+------------+------------+------------+------------+--------------+------------+------------+------------+----------------+-------------+------------+--------------+-----------+
I am pretty sure this is what I wanted:
filtered = dm_df.filter(items=route,axis=1).filter(items=set(locations).difference(set(route)), axis=0)
filtered is a df with [2 rows x 10 columns] and then I can find the minimum value from there
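For comparison, here is a minimal sketch of the same selection done with label-based indexing on a name-indexed matrix. It also hints at why the original attempt produced NaNs: indexing a DataFrame with a boolean DataFrame keeps the shape and masks every False cell with NaN, and isin tests cell values, not row or column labels. The four-city subset and the names rest, sub and closest are illustrative choices, and plain NumPy broadcasting stands in for the distance function.

```python
import numpy as np
import pandas as pd

# Four of the cities from the sample data: (name, latitude, longitude).
locations = [
    ("Ottawa", 45.424722, -75.695),
    ("Edmonton", 53.533333, -113.5),
    ("Victoria", 48.428611, -123.365556),
    ("StJohns", 47.561389, -52.7125),
]
names = [name for name, _, _ in locations]
coords = np.array([(lat, lon) for _, lat, lon in locations])

# Pairwise Euclidean distances via broadcasting, labelled by city name.
diff = coords[:, None, :] - coords[None, :, :]
dm = pd.DataFrame(np.sqrt((diff ** 2).sum(axis=-1)), index=names, columns=names)

# The two locations that are farthest apart form the start of the route.
P1 = dm.max(axis=0).idxmax()
P2 = dm[P1].idxmax()
route = [P1, P2]

# Select by label: route endpoints on one axis, all other locations on the other.
rest = [name for name in dm.columns if name not in route]
sub = dm.loc[route, rest]

# The remaining location that is closest to either endpoint.
closest = sub.min(axis=0).idxmin()
```

Because sub is selected by row and column labels, it contains only real distances, and taking the minimum over it is straightforward.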

Python maze generator explanation

Welcome. Can someone explain to me what happens in this code? I would like to know exactly how it works (it comes from http://rosettacode.org/wiki/Maze_generation#Python).
from random import shuffle, randrange

def make_maze(w = 16, h = 8):
    vis = [[0] * w + [1] for _ in range(h)] + [[1] * (w + 1)]
    ver = [["|  "] * w + ['|'] for _ in range(h)] + [[]]
    hor = [["+--"] * w + ['+'] for _ in range(h + 1)]

    def walk(x, y):
        vis[y][x] = 1

        d = [(x - 1, y), (x, y + 1), (x + 1, y), (x, y - 1)]
        shuffle(d)
        for (xx, yy) in d:
            if vis[yy][xx]: continue
            if xx == x: hor[max(y, yy)][x] = "+  "
            if yy == y: ver[y][max(x, xx)] = "   "
            walk(xx, yy)

    walk(randrange(w), randrange(h))
    for (a, b) in zip(hor, ver):
        print(''.join(a + ['\n'] + b))

make_maze()
I know nothing about maze generation, but I also got curious about how this piece of code works. Here are some insights:
These two lines print the maze:
for (a, b) in zip(hor, ver):
    print(''.join(a + ['\n'] + b))
So what happens if we put these lines right after the three lines that define vis, ver and hor? We get this:
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
You can even put these two lines right before the recursive call walk(xx,yy) and see some steps of maze evolution:
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+ +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+ +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+ +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+ +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+ +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+ +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+ + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+ + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+ + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+--+ + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+ + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+ + + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+ + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+ + + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+ + + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+ + + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+ + + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+ +--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+ + + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+ + + +--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | |
+--+--+ +--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+ +--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | | | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
[...]
Now let's focus on walk(x,y). As its name and the printed output suggest, this function walks around the maze, removing the walls in a random fashion so as to build a path.
This call:
walk(randrange(w), randrange(h))
initializes the walk in a random location within the maze. Every cell in the grid is visited exactly once; the visited cells are marked in vis.
This line initializes an array with all four neighbors of the current cell:
d = [(x - 1, y), (x, y + 1), (x + 1, y), (x, y - 1)]
These are visited in a random order (thanks to shuffle(d))
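A detail worth spelling out here is how the code avoids walking off the grid. The vis list has one extra column and one extra row filled with 1, so any neighbor outside the grid already looks "visited"; an index of -1 wraps around to that extra column or row. The small grid size and the helper name neighbours below are chosen only for illustration.

```python
w, h = 4, 3
# Same construction as in make_maze: an extra column and an extra row of 1s.
vis = [[0] * w + [1] for _ in range(h)] + [[1] * (w + 1)]

def neighbours(x, y):
    # Left, down, right, up, in the same order as in walk().
    return [(x - 1, y), (x, y + 1), (x + 1, y), (x, y - 1)]

# Neighbors of the top-left corner that are rejected by `if vis[yy][xx]`.
blocked_top_left = [(xx, yy) for xx, yy in neighbours(0, 0) if vis[yy][xx]]
# -> [(-1, 0), (0, -1)]: index -1 wraps around to the border of 1s

# Neighbors of the bottom-right corner that are rejected.
blocked_bottom_right = [(xx, yy) for xx, yy in neighbours(w - 1, h - 1) if vis[yy][xx]]
# -> [(3, 3), (4, 2)]: the extra row and the extra column
```

This is why walk() needs no explicit bounds checks.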
And these two lines are the ones that remove the walls as the maze path is being built:
# remove horizontal wall, "+--" turns into "+  "
if xx == x: hor[max(y, yy)][x] = "+  "
# remove vertical wall, "|  " turns into "   "
if yy == y: ver[y][max(x, xx)] = "   "
There is more to say about this algorithm (see Jongware's comment), but as far as understanding this particular piece of code goes, these are the kind of things you can do.
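One way to check the claim that every cell is visited exactly once is to count the removed walls: since each cell other than the starting one is entered through exactly one removed wall, a w by h grid should lose w * h - 1 walls. The sketch below is a lightly instrumented variant of the code from the question, not part of the Rosetta Code original; the function name make_maze_text and the removed counter are additions for illustration, and the maze is returned as a string instead of being printed.

```python
from random import shuffle, randrange

def make_maze_text(w=16, h=8):
    """Build a maze like make_maze, but return (text, number_of_removed_walls)."""
    vis = [[0] * w + [1] for _ in range(h)] + [[1] * (w + 1)]
    ver = [["|  "] * w + ["|"] for _ in range(h)] + [[]]
    hor = [["+--"] * w + ["+"] for _ in range(h + 1)]
    removed = 0

    def walk(x, y):
        nonlocal removed
        vis[y][x] = 1
        d = [(x - 1, y), (x, y + 1), (x + 1, y), (x, y - 1)]
        shuffle(d)
        for xx, yy in d:
            if vis[yy][xx]:
                continue  # already visited, or outside the grid
            if xx == x:
                hor[max(y, yy)][x] = "+  "  # open the wall above or below
            if yy == y:
                ver[y][max(x, xx)] = "   "  # open the wall to the left or right
            removed += 1
            walk(xx, yy)

    walk(randrange(w), randrange(h))
    text = "\n".join("".join(a) + "\n" + "".join(b) for a, b in zip(hor, ver))
    return text, removed

maze, removed = make_maze_text(5, 4)
print(maze)
print("walls removed:", removed)
```

Because the walk is recursive, very large grids may hit Python's recursion limit; the sizes used here are far below it.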
