How to create a pandas DataFrame from a nested JSON dictionary with increasing keys - python

I have a JSON file with the following structure:
{'0': {'transaction': [{'transaction_key': '406.l.657872.tr.374',
'transaction_id': '374',
'type': 'add/drop',
'status': 'successful',
'timestamp': '1639593953'},
{'players': {'0': {'player': [[{'player_key': '406.p.100006'},
{'player_id': '100006'},
{'name': {'full': 'Dallas',
'first': 'Dallas',
'last': '',
'ascii_first': 'Dallas',
'ascii_last': ''}},
{'editorial_team_abbr': 'Dal'},
{'display_position': 'DEF'},
{'position_type': 'DT'}],
{'transaction_data': [{'type': 'add',
'source_type': 'freeagents',
'destination_type': 'team',
'destination_team_key': '406.l.657872.t.10',
'destination_team_name': 'Team 1'}]}]},
'1': {'player': [[{'player_key': '406.p.24793'},
{'player_id': '24793'},
{'name': {'full': 'Julio Jones',
'first': 'Julio',
'last': 'Jones',
'ascii_first': 'Julio',
'ascii_last': 'Jones'}},
{'editorial_team_abbr': 'Ten'},
{'display_position': 'WR'},
{'position_type': 'O'}],
{'transaction_data': {'type': 'drop',
'source_type': 'team',
'source_team_key': '406.l.657872.t.10',
'source_team_name': 'Team 1',
'destination_type': 'waivers'}}]},
'count': 2}}]},
'1': {'transaction': [{'transaction_key': '406.l.657872.tr.373',
'transaction_id': '373',
'type': 'add/drop',
'status': 'successful',
'timestamp': '1639575496'},
{'players': {'0': {'player': [[{'player_key': '406.p.32722'},
{'player_id': '32722'},
{'name': {'full': 'Cam Akers',
'first': 'Cam',
'last': 'Akers',
'ascii_first': 'Cam',
'ascii_last': 'Akers'}},
{'editorial_team_abbr': 'LAR'},
{'display_position': 'RB'},
{'position_type': 'O'}],
{'transaction_data': [{'type': 'add',
'source_type': 'freeagents',
'destination_type': 'team',
'destination_team_key': '406.l.657872.t.5',
'destination_team_name': 'Team 2'}]}]},
'1': {'player': [[{'player_key': '406.p.100007'},
{'player_id': '100007'},
{'name': {'full': 'Denver',
'first': 'Denver',
'last': '',
'ascii_first': 'Denver',
'ascii_last': ''}},
{'editorial_team_abbr': 'Den'},
{'display_position': 'DEF'},
{'position_type': 'DT'}],
{'transaction_data': {'type': 'drop',
'source_type': 'team',
'source_team_key': '406.l.657872.t.5',
'source_team_name': 'Team 2',
'destination_type': 'waivers'}}]},
'count': 2}}]},
'2': {'transaction': [{'transaction_key': '406.l.657872.tr.372',
'transaction_id': '372',
'type': 'add/drop',
'status': 'successful',
'timestamp': '1639575448'},
{'players': {'0': {'player': [[{'player_key': '406.p.33413'},
{'player_id': '33413'},
{'name': {'full': 'Travis Etienne',
'first': 'Travis',
'last': 'Etienne',
'ascii_first': 'Travis',
'ascii_last': 'Etienne'}},
{'editorial_team_abbr': 'Jax'},
{'display_position': 'RB'},
{'position_type': 'O'}],
{'transaction_data': [{'type': 'add',
'source_type': 'freeagents',
'destination_type': 'team',
'destination_team_key': '406.l.657872.t.5',
'destination_team_name': 'Team 2'}]}]},
'1': {'player': [[{'player_key': '406.p.24815'},
{'player_id': '24815'},
{'name': {'full': 'Mark Ingram II',
'first': 'Mark',
'last': 'Ingram II',
'ascii_first': 'Mark',
'ascii_last': 'Ingram II'}},
{'editorial_team_abbr': 'NO'},
{'display_position': 'RB'},
{'position_type': 'O'}],
{'transaction_data': {'type': 'drop',
'source_type': 'team',
'source_team_key': '406.l.657872.t.5',
'source_team_name': 'Team 2',
'destination_type': 'waivers'}}]},
'count': 2}}]}}
These are transactions for a fantasy football league, and I'd like to organize each transaction into a dataframe. However, I'm running into issues normalizing the data. I figure I need to loop over the entries, but I'm stuck and would appreciate any suggestions. Thank you.
Ideally, I'm looking to summarize each transaction with the following dataframe structure:
transaction_id  type      added           pos_1  dropped         pos_2  timestamp
374             add/drop  Dallas          DEF    Julio Jones     WR     1639593953
373             add/drop  Cam Akers       RB     Denver          DEF    1639575496
372             add/drop  Travis Etienne  RB     Mark Ingram II  RB     1639575448
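One possible approach, as a minimal sketch rather than a definitive answer: loop over the outer keys, treat each 'transaction' list as a [metadata, players] pair, and build one row per transaction. The sample below is a trimmed copy of the structure shown in the question, and the helper name `summarize` is made up for illustration. It assumes every transaction has at most one added and one dropped player, and it handles 'transaction_data' being a list for adds and a plain dict for drops.

```python
import pandas as pd

# Trimmed sample with the same nesting as the data in the question.
data = {
    '0': {'transaction': [
        {'transaction_id': '374', 'type': 'add/drop', 'timestamp': '1639593953'},
        {'players': {
            '0': {'player': [
                [{'player_key': '406.p.100006'},
                 {'name': {'full': 'Dallas'}},
                 {'display_position': 'DEF'}],
                {'transaction_data': [{'type': 'add'}]}]},
            '1': {'player': [
                [{'player_key': '406.p.24793'},
                 {'name': {'full': 'Julio Jones'}},
                 {'display_position': 'WR'}],
                {'transaction_data': {'type': 'drop'}}]},
            'count': 2}}]},
    '1': {'transaction': [
        {'transaction_id': '373', 'type': 'add/drop', 'timestamp': '1639575496'},
        {'players': {
            '0': {'player': [
                [{'player_key': '406.p.32722'},
                 {'name': {'full': 'Cam Akers'}},
                 {'display_position': 'RB'}],
                {'transaction_data': [{'type': 'add'}]}]},
            '1': {'player': [
                [{'player_key': '406.p.100007'},
                 {'name': {'full': 'Denver'}},
                 {'display_position': 'DEF'}],
                {'transaction_data': {'type': 'drop'}}]},
            'count': 2}}]},
}


def summarize(transaction):
    """Flatten one [metadata, {'players': ...}] pair into a single row dict."""
    meta, players = transaction[0], transaction[1]['players']
    row = {'transaction_id': meta['transaction_id'], 'type': meta['type'],
           'added': None, 'pos_1': None, 'dropped': None, 'pos_2': None,
           'timestamp': meta['timestamp']}
    for key, entry in players.items():
        if key == 'count':  # bookkeeping field, not a player
            continue
        attrs, move = entry['player']
        # attrs is a list of one-key dicts; merge them into one dict.
        info = {k: v for d in attrs for k, v in d.items()}
        tdata = move['transaction_data']
        if isinstance(tdata, list):  # adds are wrapped in a list, drops are not
            tdata = tdata[0]
        name, pos = info['name']['full'], info['display_position']
        if tdata['type'] == 'add':
            row['added'], row['pos_1'] = name, pos
        elif tdata['type'] == 'drop':
            row['dropped'], row['pos_2'] = name, pos
    return row


# Visit the outer keys ('0', '1', ...) in numeric order.
rows = [summarize(data[k]['transaction']) for k in sorted(data, key=int)]
df = pd.DataFrame(rows)
print(df)
```

With the full file, `data` would come from `json.load` instead of the inline sample; the loop itself should not need to change as long as the nesting matches.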

Related

How can I append a dictionary to my existing key "processed_data" in MongoDB using pymongo

I am trying to append a dictionary to my existing key "processed_data", whose value is a list of dictionaries. I tried several methods from previously asked questions, but none of them worked. This is my schema.
{'_id': ObjectId('5fe46a5b7468e3498124fcbe'), 'metadata': {'_id': ObjectId('5fe4500c7b2c03decd86334f'), 'type': 'VIDEO', 'id': 'o6st4ces9Wg"},"qoeUrl":{"baseUrl":"https://s.youtube.com/api/stats/qoe?cl=348521801', 'user_id': 'fc3240b2d7ef9d33bbb04fd7203e35ea9da54ffb', 'name': 'City Ak47', 'thumbnail': 'https://i.ytimg.com/vi/o6st4ces9Wg/hqdefault.jpg', 'title': 'Alex Bhatti ki Video Viral Ho Gie | How To Become Tiktok Star | City AK47 - YouTube', 'publication_date': 'Sep 17, 2020', 'channel_id': 'UCuo6tBl2MfkWvMPyCqph2LA', 'channel_name': 'City Ak47', 'scrape_date': '2020-12-24 08:23:17.390018', 'regions_allowed': 'AD,AE,AF,AG,AI,AL,AM,AO,AQ,AR,AS,AT,AU,AW,AX,AZ,BA,BB,BD,BE,BF,BG,BH,BI,BJ,BL,BM,BN,BO,BQ,BR,BS,BT,BV,BW,BY,BZ,CA,CC,CD,CF,CG,CH,CI,CK,CL,CM,CN,CO,CR,CU,CV,CW,CX,CY,CZ,DE,DJ,DK,DM,DO,DZ,EC,EE,EG,EH,ER,ES,ET,FI,FJ,FK,FM,FO,FR,GA,GB,GD,GE,GF,GG,GH,GI,GL,GM,GN,GP,GQ,GR,GS,GT,GU,GW,GY,HK,HM,HN,HR,HT,HU,ID,IE,IL,IM,IN,IO,IQ,IR,IS,IT,JE,JM,JO,JP,KE,KG,KH,KI,KM,KN,KP,KR,KW,KY,KZ,LA,LB,LC,LI,LK,LR,LS,LT,LU,LV,LY,MA,MC,MD,ME,MF,MG,MH,MK,ML,MM,MN,MO,MP,MQ,MR,MS,MT,MU,MV,MW,MX,MY,MZ,NA,NC,NE,NF,NG,NI,NL,NO,NP,NR,NU,NZ,OM,PA,PE,PF,PG,PH,PK,PL,PM,PN,PR,PS,PT,PW,PY,QA,RE,RO,RS,RU,RW,SA,SB,SC,SD,SE,SG,SH,SI,SJ,SK,SL,SM,SN,SO,SR,SS,ST,SV,SX,SY,SZ,TC,TD,TF,TG,TH,TJ,TK,TL,TM,TN,TO,TR,TT,TV,TW,TZ,UA,UG,UM,US,UY,UZ,VA,VC,VE,VG,VI,VN,VU,WF,WS,YE,YT,ZA,ZM,ZW', 'views': '663962', 'is_family_friendly': 'true', 'category': 'Entertainment', 'tags': ['AmirFilms', 'Alex Bhatti ki Video Viral Ho Gie | How To Become Tiktok Star | City AK47', 'Tiktok star', 'Tiktok', 'Alex tiktokr', 'Alex bhatti tiktok star', 'Alex bhatti', 'Ayesha bukhari', 'Viral video', 'New video', 'Leak vidro', 'Ayesha leak video', 'Alex bhatti leak video', 'News', 'Tiktik funny video'], 'language': 'en-US', 'width': '480', 'height': '360', 'job_id': '539f61c4183c46448a75cfb65dc40926'}, 'results': {'unique_word_freq': [{'text': 'hai', 'value': 6}, {'text': 'famous', 'value': 4}, {'text': 'allah', 'value': 3}, 
{'text': 'kar', 'value': 3}, {'text': 'gy', 'value': 3}, {'text': 'ye', 'value': 3}, {'text': 'yeh', 'value': 2}, {'text': 'ka', 'value': 2}, {'text': 'video', 'value': 2}, {'text': 'asee', 'value': 2}, {'text': 'nhi', 'value': 2}, {'text': 'ho', 'value': 2}, {'text': 'tum', 'value': 2}, {'text': 'jao', 'value': 2}, {'text': 'kitna', 'value': 1}, {'text': 'budsoor', 'value': 1}, {'text': 'gundgi', 'value': 1}, {'text': 'dher', 'value': 1}, {'text': 'khusra', 'value': 1}, {'text': 'tiktok', 'value': 1}, {'text': 'kunjuro', 'value': 1}, {'text': 'zanano', 'value': 1}, {'text': 'kaam', 'value': 1}, {'text': 'usko', 'value': 1}, {'text': 'hadyat', 'value': 1}, {'text': 'de', 'value': 1}, {'text': 'ameen', 'value': 1}, {'text': '😔', 'value': 1}, {'text': 'kahn', 'value': 1}, {'text': 'puri', 'value': 1}, {'text': 'kotta', 'value': 1}, {'text': 'ٹک', 'value': 1}, {'text': 'ٹاک', 'value': 1}, {'text': 'ایپ', 'value': 1}, {'text': 'پر', 'value': 1}, {'text': 'پاکستان', 'value': 1}, {'text': 'میں', 'value': 1}, {'text': 'مکمل', 'value': 1}, {'text': 'پابندی', 'value': 1}, {'text': 'لگنی', 'value': 1}, {'text': 'چاہیے', 'value': 1}, {'text': 'leaked', 'value': 1}, {'text': 'purpose', 'value': 1}, {'text': 'fame', 'value': 1}, {'text': 'views', 'value': 1}, {'text': 'mean', 'value': 1}, {'text': 'people', 'value': 1}, {'text': 'like', 'value': 1}, {'text': 'kinda', 'value': 1}, {'text': 'cheap', 'value': 1}, {'text': 'acts', 'value': 1}, {'text': 'inki', 'value': 1}, {'text': 'maa', 'value': 1}, {'text': 'bhano', 'value': 1}, {'text': 'sath', 'value': 1}, {'text': 'bhi', 'value': 1}, {'text': 'hoo', 'value': 1}, {'text': 'pak', 'value': 1}, {'text': 'ko', 'value': 1}, {'text': 'bohot', 'value': 1}, {'text': 'bari', 'value': 1}, {'text': 'sazaa', 'value': 1}, {'text': 'dee', 'value': 1}, {'text': 'duniyan', 'value': 1}, {'text': 'hee', 'value': 1}, {'text': 'dikhaee', 'value': 1}, {'text': 'pata', 'value': 1}, {'text': 'khha', 'value': 1}, {'text': 'jay', 'value': 1}, {'text': 
'kiyamat', 'value': 1}, {'text': 'din', 'value': 1}, {'text': 'logo', 'value': 1}, {'text': 'hisab', 'value': 1}, {'text': 'lena', 'value': 1}, {'text': 'log', 'value': 1}, {'text': 'sidah', 'value': 1}, {'text': 'janat', 'value': 1}, {'text': 'chaly', 'value': 1}, {'text': 'baaz', 'value': 1}, {'text': 'ap', 'value': 1}, {'text': 'bakwas', 'value': 1}, {'text': 'band', 'value': 1}, {'text': 'kareen', 'value': 1}, {'text': 'larka', 'value': 1}, {'text': 'bharva', 'value': 1}, {'text': 'bs', 'value': 1}, {'text': 'pakar', 'value': 1}, {'text': 'gal', 'value': 1}, {'text': 'ma', 'value': 1}, {'text': 'dala', 'value': 1}, {'text': 'gaya', 'value': 1}, {'text': 'bahut', 'value': 1}, {'text': 'ghatiya', 'value': 1}, {'text': 'insan', 'value': 1}, {'text': 'tu', 'value': 1}, {'text': 'chakka', 'value': 1}, {'text': 'alex', 'value': 1}, {'text': 'bhatti', 'value': 1}], 'polarity_freq': [{'date': '2020-12-03', 'total': 4, 'positive': 3, 'negative': 1}, {'date': '2020-12-10', 'total': 9, 'positive': 8, 'negative': 1}, {'date': '2020-12-17', 'total': 2, 'positive': 2, 'negative': 0}, {'date': '2020-12-21', 'total': 1, 'positive': 1, 'negative': 0}, {'date': '2020-12-22', 'total': 2, 'positive': 1, 'negative': 1}], 'polarity_dist': [{'name': 'positive', 'value': '15'}, {'name': 'negative', 'value': '3'}], 'assoc': []}, 'processed_data': [{'index': 0, '_id': ObjectId('5fe4500c7b2c03decd863350'), 'channel_id': '/channel/UCg7rf8yXy8wqVxlbnErgdyg', 'clean_text': 'kitna budsoor hai yeh gundgi ka dher khusra', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwniDMBIClPo0sPLX5RDOLPHTJhECMOub-fC0ZTVY6Q=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-22 08:23:17', 'id': 'UgxzkGuC2JpeaZD7El14AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '2 days ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Kitna budsoor hai yeh gundgi ka dher khusra.', 
'tokens': ['kitna', 'budsoor', 'hai', 'yeh', 'gundgi', 'ka', 'dher', 'khusra'], 'tokens_no_swords': ['kitna', 'budsoor', 'hai', 'yeh', 'gundgi', 'ka', 'dher', 'khusra'], 'tran_text': 'kitna budsoor hai yeh gundgi ka dher khusra .', 'type': 'COMMENT', 'user_id': 'f7961259b974ba9fae934410fca2e939d3493038', 'user_name': 'jimmi khan', 'video_id': 'o6st4ces9Wg', 'is_hate': '1', 'date': '2020-12-22'}, {'index': 1, '_id': ObjectId('5fe4500c7b2c03decd863351'), 'channel_id': '/channel/UCg7rf8yXy8wqVxlbnErgdyg', 'clean_text': 'tiktok kunjuro zanano ka kaam hai', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwniDMBIClPo0sPLX5RDOLPHTJhECMOub-fC0ZTVY6Q=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-22 08:23:17', 'id': 'UgwntMkhi7J2l2N3MZJ4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '2 days ago (edited)', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Tiktok kunjuro r zanano ka kaam hai.', 'tokens': ['tiktok', 'kunjuro', 'zanano', 'ka', 'kaam', 'hai'], 'tokens_no_swords': ['tiktok', 'kunjuro', 'zanano', 'ka', 'kaam', 'hai'], 'tran_text': 'tiktok kunjuro r zanano ka kaam hai .', 'type': 'COMMENT', 'user_id': 'f7961259b974ba9fae934410fca2e939d3493038', 'user_name': 'jimmi khan', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-22'}, {'index': 2, '_id': ObjectId('5fe4500c7b2c03decd863352'), 'channel_id': '/channel/UCMDNByou1B62upgmnv-UQMw', 'clean_text': 'allah usko hadyat de ameen 😔', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwnik2uW0mzYoagKEYX1_kGY3HDhYd3Ni6UlOxSEHOA=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-21 08:23:17', 'id': 'UgxZrbzomoOLyGEGAjp4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '3 days ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Allah usko hadyat de ameen 
😔', 'tokens': ['allah', 'usko', 'hadyat', 'de', 'ameen', '😔'], 'tokens_no_swords': ['allah', 'usko', 'hadyat', 'de', 'ameen', '😔'], 'tran_text': 'allah usko hadyat de ameen 😔', 'type': 'COMMENT', 'user_id': 'da9fe12c7945488a70f56355f8c122d2f35231c5', 'user_name': 'neha Rajput', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-21'}, {'index': 3, '_id': ObjectId('5fe4500c7b2c03decd863353'), 'channel_id': '/channel/UCkl4U918shu8CroBno8-aJg', 'clean_text': '', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwniom0S4ta4uSnNx7yD69NfR4TmOqXPpYxv6_Q=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-17 08:23:17', 'id': 'UgzAfvcluRdyX9yi-JJ4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '1 week ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': '420', 'tokens': [], 'tokens_no_swords': [], 'tran_text': '420', 'type': 'COMMENT', 'user_id': '04b4dd4534a4acf47ba876387d752eda8d3087f6', 'user_name': 'Shahid Khankarachi', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-17'}, {'index': 5, '_id': ObjectId('5fe4500c7b2c03decd863355'), 'channel_id': '/channel/UCoL0h9EyBTNSvKIWIxl6WIg', 'clean_text': 'kahn hai yeh puri video', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwnjNJAzxyS9mOk-R7TF5ICxa0_EQbtgcL3z2Yg=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-17 08:23:17', 'id': 'Ugxx-JvihbK7P8Y8u5x4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '1 week ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Kahn hai yeh puri video', 'tokens': ['kahn', 'hai', 'yeh', 'puri', 'video'], 'tokens_no_swords': ['kahn', 'hai', 'yeh', 'puri', 'video'], 'tran_text': 'kahn hai yeh puri video', 'type': 'COMMENT', 'user_id': '3bc410f7e5133b61e2f2cc790ce6ae2692397778', 'user_name': 'ALISHA ZOYA', 'video_id': 
'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-17'}, {'index': 6, '_id': ObjectId('5fe4500c7b2c03decd863356'), 'channel_id': '/channel/UCsd6TX3yWpNYK55hawyi8qw', 'clean_text': 'kotta', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwnj19uWVIJ75wx27KLjDGDcsVcGtzVtp8SRQ0w=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-10 08:23:17', 'id': 'UgyhHfBJDWHtL73E71N4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '2 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Kotta', 'tokens': ['kotta'], 'tokens_no_swords': ['kotta'], 'tran_text': 'kotta', 'type': 'COMMENT', 'user_id': 'e83b422e66c1bd722306aee6715c3846c32e506b', 'user_name': 'Shakeel Khan', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-10'}, {'index': 7, '_id': ObjectId('5fe4500c7b2c03decd863357'), 'channel_id': '/channel/UCX6LjA5LbC7xMO19yyM7m0Q', 'clean_text': 'ٹک ٹاک ایپ پر پاکستان میں مکمل پابندی لگنی چاہیے', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwni4WpKnrXzHmw2VwT0z5aYnM0T5IhRN0DG3Pmsg=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-10 08:23:17', 'id': 'UgwFILpwDAQKYA9ioMV4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'en', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '2 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'ٹک ٹاک ایپ پر پاکستان میں مکمل پابندی لگنی چاہیے', 'tokens': ['ٹک', 'ٹاک', 'ایپ', 'پر', 'پاکستان', 'میں', 'مکمل', 'پابندی', 'لگنی', 'چاہیے'], 'tokens_no_swords': ['ٹک', 'ٹاک', 'ایپ', 'پر', 'پاکستان', 'میں', 'مکمل', 'پابندی', 'لگنی', 'چاہیے'], 'tran_text': 'ٹک ٹاک ایپ پر پاکستان میں مکمل پابندی لگنی چاہیے', 'type': 'COMMENT', 'user_id': '9ca6083ff6234bd94fc218ef27d12c8b91c2fa33', 'user_name': 'Wahab Mirza', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-10'}, {'index': 8, '_id': ObjectId('5fe4500c7b2c03decd863358'), 
'channel_id': '/channel/UCKMvpfSppW24ixWCOmJDu_g', 'clean_text': 'he leaked this video on purpose to get fame and views i mean people like them do these kinda cheap acts to get famous', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwniL8sePcWPsqDg6AOaLsW4nf14XDW3132kC0Q=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-10 08:23:17', 'id': 'Ugz7IdhD6s8zCJ6vHNt4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'en', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '2 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'He leaked this video on purpose to get fame and views...I mean people like them do these kinda cheap acts to get famous.', 'tokens': ['he', 'leaked', 'this', 'video', 'on', 'purpose', 'to', 'get', 'fame', 'and', 'views', 'i', 'mean', 'people', 'like', 'them', 'do', 'these', 'kinda', 'cheap', 'acts', 'to', 'get', 'famous'], 'tokens_no_swords': ['leaked', 'video', 'purpose', 'fame', 'views', 'mean', 'people', 'like', 'kinda', 'cheap', 'acts', 'famous'], 'tran_text': 'he leaked this video on purpose to get fame and views ... 
i mean people like them do these kinda cheap acts to get famous .', 'type': 'COMMENT', 'user_id': '3f75585892685df3ae4b3d733d9795a719b2d528', 'user_name': 'Ana T', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-10'}, {'index': 9, '_id': ObjectId('5fe4500c7b2c03decd863359'), 'channel_id': '/channel/UCSUGRfHKn5qCNN4TKG3MAkw', 'clean_text': '', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwng6eAHeRd7CcM8mmkCHCA8VI2tqmMNPb1q1MA=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-10 08:23:17', 'id': 'UgwMvbYR29V4ISyic_d4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'en', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '2 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': '03065455318', 'tokens': [], 'tokens_no_swords': [], 'tran_text': '03065455318', 'type': 'COMMENT', 'user_id': '23ac482ba9b36182915c502c13d4cd45b7f7bf1f', 'user_name': 'Ali Rizwan', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-10'}, {'index': 10, '_id': ObjectId('5fe4500c7b2c03decd86335a'), 'channel_id': '/channel/UC-fWQ2vkmngdVliZDkRSiiQ', 'clean_text': 'inki maa bhano sath bhi asee hoo', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwngl0V2Zy_AUGUyIZpMbrBDxqL6pq5AcdF4hNg=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-10 08:23:17', 'id': 'UgzEMbiRNk2ywQHlgxR4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '2 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Inki maa bhano k sath bhi Asee hoo', 'tokens': ['inki', 'maa', 'bhano', 'sath', 'bhi', 'asee', 'hoo'], 'tokens_no_swords': ['inki', 'maa', 'bhano', 'sath', 'bhi', 'asee', 'hoo'], 'tran_text': 'inki maa bhano k sath bhi asee hoo', 'type': 'COMMENT', 'user_id': '2c7e490aa5d0ceca9340c92c3577fa75d3e5a8d3', 'user_name': 'M wali Yousuf', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': 
'2020-12-10'}, {'index': 11, '_id': ObjectId('5fe4500c7b2c03decd86335b'), 'channel_id': '/channel/UC-fWQ2vkmngdVliZDkRSiiQ', 'clean_text': 'allah pak asee ko bohot bari sazaa dee or duniyan me hee dikhaee', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwngl0V2Zy_AUGUyIZpMbrBDxqL6pq5AcdF4hNg=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-10 08:23:17', 'id': 'Ugyo9YlNa7zuVsSQlZh4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '2 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Allah Pak Asee ko Bohot bari Sazaa Dee Or Duniyan me hee dikhaee', 'tokens': ['allah', 'pak', 'asee', 'ko', 'bohot', 'bari', 'sazaa', 'dee', 'or', 'duniyan', 'me', 'hee', 'dikhaee'], 'tokens_no_swords': ['allah', 'pak', 'asee', 'ko', 'bohot', 'bari', 'sazaa', 'dee', 'duniyan', 'hee', 'dikhaee'], 'tran_text': 'allah pak asee ko bohot bari sazaa dee or duniyan me hee dikhaee', 'type': 'COMMENT', 'user_id': '2c7e490aa5d0ceca9340c92c3577fa75d3e5a8d3', 'user_name': 'M wali Yousuf', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-10'}, {'index': 12, '_id': ObjectId('5fe4500c7b2c03decd86335c'), 'channel_id': '/channel/UCphcNEEoxrp08DARCX7dSNQ', 'clean_text': 'pata nhi famous ho kar khha jay gy kiyamat din allah famous logo hisab lena hai ye nhi tum log famous ho gy or sidah janat chaly jao gy baaz a jao', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwnibNbjcM0UMLW2aTnOD3jfJXlaq2Iq5_hMg3Q-O=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-10 08:23:17', 'id': 'UgzfXCaZHDMSCKPqlKB4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': '1', 'orig_lang': 'unknown', 'published_time_display': '2 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Pata nhi famous ho Kar khha jay gy. 
Kiyamat k din Allah n famous logo c b hisab lena hai ye nhi k tum log famous ho gy or sidah janat m chaly jao gy. Baaz a jao', 'tokens': ['pata', 'nhi', 'famous', 'ho', 'kar', 'khha', 'jay', 'gy', 'kiyamat', 'din', 'allah', 'famous', 'logo', 'hisab', 'lena', 'hai', 'ye', 'nhi', 'tum', 'log', 'famous', 'ho', 'gy', 'or', 'sidah', 'janat', 'chaly', 'jao', 'gy', 'baaz', 'a', 'jao'], 'tokens_no_swords': ['pata', 'nhi', 'famous', 'ho', 'kar', 'khha', 'jay', 'gy', 'kiyamat', 'din', 'allah', 'famous', 'logo', 'hisab', 'lena', 'hai', 'ye', 'nhi', 'tum', 'log', 'famous', 'ho', 'gy', 'sidah', 'janat', 'chaly', 'jao', 'gy', 'baaz', 'jao'], 'tran_text': 'pata nhi famous ho kar khha jay gy . kiyamat k din allah n famous logo c b hisab lena hai ye nhi k tum log famous ho gy or sidah janat m chaly jao gy . baaz a jao', 'type': 'COMMENT', 'user_id': 'da5c845fbd0a39db29a99a9d620bd8c266956065', 'user_name': 'Rida Khan', 'video_id': 'o6st4ces9Wg', 'is_hate': '1', 'date': '2020-12-10'}, {'index': 13, '_id': ObjectId('5fe4500c7b2c03decd86335d'), 'channel_id': '/channel/UCnVVsV2fd3P0lS9QClU5DCA', 'clean_text': 'ap bakwas band kareen', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwniJxcGaZeKzmvDSUGeX5vFZo3m_ZXQ_yC7-Kw=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-10 08:23:17', 'id': 'UgwzEHvn_WWTbPOzl7x4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '2 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Ap bakwas band kareen', 'tokens': ['ap', 'bakwas', 'band', 'kareen'], 'tokens_no_swords': ['ap', 'bakwas', 'band', 'kareen'], 'tran_text': 'ap bakwas band kareen', 'type': 'COMMENT', 'user_id': 'e51ff2c592a9ad2fe8f6f373c6a2dab117f2c2e9', 'user_name': 'ahmad muaaz', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-10'}, {'index': 14, '_id': ObjectId('5fe4500c7b2c03decd86335e'), 'channel_id': '/channel/UCtbUvUvL0qrREfEzUvJUQKQ', 
'clean_text': 'ye larka bharva', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwnjKBfmUKuMzFCtwM-KuKAfq_5y0RA7iez5w9Q=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-10 08:23:17', 'id': 'UgzvW63udc6CEwnsb_J4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '2 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'ye larka bharva', 'tokens': ['ye', 'larka', 'bharva'], 'tokens_no_swords': ['ye', 'larka', 'bharva'], 'tran_text': 'ye larka bharva', 'type': 'COMMENT', 'user_id': '7d7c7165c38c05a96c335421faf6ca3eb9eb1722', 'user_name': 'Rana Waqas', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-10'}, {'index': 15, '_id': ObjectId('5fe4500c7b2c03decd86335f'), 'channel_id': '/channel/UCieduNjSrF2DPawdZh_HesQ', 'clean_text': 'bs kar do tum', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwniOEn-mQTkzQu5ybCc6gjFqSlK8eQF-4RsB6w=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-03 08:23:17', 'id': 'Ugx9gyNwYeVV5DDKsJV4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '3 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'bs kar do tum', 'tokens': ['bs', 'kar', 'do', 'tum'], 'tokens_no_swords': ['bs', 'kar', 'tum'], 'tran_text': 'bs kar do tum', 'type': 'COMMENT', 'user_id': '7339de48d380c9efe854dc9b6660a8fe22c28448', 'user_name': 'sami ali ali', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-03'}, {'index': 16, '_id': ObjectId('5fe4500c7b2c03decd863360'), 'channel_id': '/channel/UCaPZsZzHcOMiDgZ3rkALMFg', 'clean_text': 'is pakar kar gal ma dala gaya', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwnhv4vXDX16Pi0veGMZVUtqiYiYq_XOUp2yTvQ=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-03 08:23:17', 'id': 'UgzC4DaDahEFvjUsJiN4AaABAg', 'job_id': 
'539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '3 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Is pakar kar gal ma dala gaya', 'tokens': ['is', 'pakar', 'kar', 'gal', 'ma', 'dala', 'gaya'], 'tokens_no_swords': ['pakar', 'kar', 'gal', 'ma', 'dala', 'gaya'], 'tran_text': 'is pakar kar gal ma dala gaya', 'type': 'COMMENT', 'user_id': 'ae974ed633daab66164b5dcee9340e2ed0b1c455', 'user_name': 'munir gill', 'video_id': 'o6st4ces9Wg', 'is_hate': '1', 'date': '2020-12-03'}, {'index': 17, '_id': ObjectId('5fe4500c7b2c03decd863361'), 'channel_id': '/channel/UCqGRwUGDEBY98v0PcA9BpUQ', 'clean_text': 'bahut ghatiya insan hai', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwnhMts6KGq4VtnvbDuVVatNlFduO6jmHbIRX6A=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-03 08:23:17', 'id': 'UgyyCAdkym_IDptSbNZ4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '3 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Bahut ghatiya insan hai', 'tokens': ['bahut', 'ghatiya', 'insan', 'hai'], 'tokens_no_swords': ['bahut', 'ghatiya', 'insan', 'hai'], 'tran_text': 'bahut ghatiya insan hai', 'type': 'COMMENT', 'user_id': '331593ec91edb449953d775229e7a91727415976', 'user_name': 'Asif Bhatti', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-03'}, {'index': 18, '_id': ObjectId('5fe4500c7b2c03decd863362'), 'channel_id': '/channel/UCBZ0mLPPioFWW1i-kvmZBnA', 'clean_text': 'ye tu chakka hai alex bhatti', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwnggrksT4HvfysI9VkzPzsKIXkcJsPfmWvvNyg=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-03 08:23:17', 'id': 'Ugwh57O9lzDJgzCvKJV4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '3 weeks ago', 
'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Ye tu chakka hai alex bhatti', 'tokens': ['ye', 'tu', 'chakka', 'hai', 'alex', 'bhatti'], 'tokens_no_swords': ['ye', 'tu', 'chakka', 'hai', 'alex', 'bhatti'], 'tran_text': 'ye tu chakka hai alex bhatti', 'type': 'COMMENT', 'user_id': 'ee99e20e5128b5fc14c1972d55625585cf4d0237', 'user_name': 'Khizar Rao', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-03'}]}
I want to append the following dictionary to the list stored under the "processed_data" key.
{'index': 19, '_id': ObjectId('5fe4500c7b2c03decd863362'), 'channel_id': '/channel/UCBZ0mLPPioFWW1i-kvmZBnA', 'clean_text': 'ye tu chakka hai alex bhatti', 'comment_user_image': 'https://yt3.ggpht.com/ytc/AAUvwnggrksT4HvfysI9VkzPzsKIXkcJsPfmWvvNyg=s48-c-k-c0xffffffff-no-rj-mo', 'datetime': '2020-12-03 08:23:17', 'id': 'Ugwh57O9lzDJgzCvKJV4AaABAg', 'job_id': '539f61c4183c46448a75cfb65dc40926', 'lang': 'ro-ur', 'likes': 0, 'orig_lang': 'unknown', 'published_time_display': '3 weeks ago', 'replies': None, 'reply_to': None, 'scrape_date': '2020-12-24 08:23:17.955821', 'text': 'Ye tu chakka hai alex bhatti', 'tokens': ['ye', 'tu', 'chakka', 'hai', 'alex', 'bhatti'], 'tokens_no_swords': ['ye', 'tu', 'chakka', 'hai', 'alex', 'bhatti'], 'tran_text': 'ye tu chakka hai alex bhatti', 'type': 'COMMENT', 'user_id': 'ee99e20e5128b5fc14c1972d55625585cf4d0237', 'user_name': 'Khizar Rao', 'video_id': 'o6st4ces9Wg', 'is_hate': '0', 'date': '2020-12-03'}
Thanks!
Use update_one() with $push:
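from pymongo import MongoClient
from bson import ObjectId

db = MongoClient()['mydatabase']
# Set up a minimal document to update (skip this if the document already exists).
db.mycollection.insert_one({'_id': ObjectId('5fe46a5b7468e3498124fcbe'),
                            'processed_data': []})
# The dictionary to append; the remaining fields are omitted here for brevity.
update = {'index': 19, '_id': ObjectId('5fe4500c7b2c03decd863362'), 'channel_id': 'etc.'}
# $push appends `update` to the end of the 'processed_data' array.
db.mycollection.update_one({'_id': ObjectId('5fe46a5b7468e3498124fcbe')},
                           {'$push': {'processed_data': update}})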
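If several dictionaries need to be appended at once, MongoDB's `$each` modifier can be combined with `$push`. The sketch below only builds the update document in plain Python so it runs without a server; the helper name `build_push_update` and the placeholder comment dictionaries are made up for illustration.

```python
def build_push_update(field, documents):
    """Return an update document that appends every item in `documents` to `field`."""
    return {'$push': {field: {'$each': list(documents)}}}


# Placeholder entries standing in for real comment dictionaries.
new_comments = [
    {'index': 19, 'text': 'first placeholder comment'},
    {'index': 20, 'text': 'second placeholder comment'},
]

update_doc = build_push_update('processed_data', new_comments)
print(update_doc)

# With a live connection this would be passed to update_one, for example:
#   result = db.mycollection.update_one({'_id': some_id}, update_doc)
# A result.matched_count of 0 would mean the filter found no document.
```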

Use the column of a dataframe that has a list of dictionaries to create other columns for the dataframe

I have a column in my dataframe of type object that has values like:
for i in df3['placeholders'][:10]:
    print(i)
Output:
[{'type': 'experience', 'label': '0-1 Yrs'}, {'type': 'salary', 'label': '1,00,000 - 1,25,000 PA.'}, {'type': 'location', 'label': 'Chennai'}]
[{'type': 'date', 'label': '08 October - 13 October'}, {'type': 'salary', 'label': 'Not disclosed'}, {'type': 'location', 'label': 'Chennai'}]
[{'type': 'education', 'label': 'B.Com'}, {'type': 'salary', 'label': 'Not disclosed'}, {'type': 'location', 'label': 'Mumbai Suburbs, Navi Mumbai, Mumbai'}]
[{'type': 'experience', 'label': '0-2 Yrs'}, {'type': 'salary', 'label': '50,000 - 2,00,000 PA.'}, {'type': 'location', 'label': 'Chennai'}]
[{'type': 'experience', 'label': '0-1 Yrs'}, {'type': 'salary', 'label': '2,00,000 - 2,25,000 PA.'}, {'type': 'location', 'label': 'Bengaluru(JP Nagar)'}]
[{'type': 'experience', 'label': '0-3 Yrs'}, {'type': 'salary', 'label': '80,000 - 2,00,000 PA.'}, {'type': 'location', 'label': 'Hyderabad'}]
[{'type': 'experience', 'label': '0-5 Yrs'}, {'type': 'salary', 'label': 'Not disclosed'}, {'type': 'location', 'label': 'Hyderabad'}]
[{'type': 'experience', 'label': '0-1 Yrs'}, {'type': 'salary', 'label': '1,25,000 - 2,00,000 PA.'}, {'type': 'location', 'label': 'Mumbai'}]
[{'type': 'date', 'label': '08 October - 17 October'}, {'type': 'salary', 'label': 'Not disclosed'}, {'type': 'location', 'label': 'Pune(Bavdhan)'}]
[{'type': 'experience', 'label': '0-2 Yrs'}, {'type': 'salary', 'label': 'Not disclosed'}, {'type': 'location', 'label': 'Jaipur'}]
[{'type': 'experience', 'label': '0-0 Yrs'}, {'type': 'salary', 'label': '1,00,000 - 1,50,000 PA.'}, {'type': 'location', 'label': 'Delhi NCR(Sector-81 Noida)'}]
I want to add more columns to my existing dataframe by extracting features from this column, such that:
the value of "type" becomes the column name
the value of "label" becomes the value under that column
The final expected output:
df.head(3)
Output:
.....  experience  salary                   location                             date                     education
.....  0-1 Yrs     1,00,000 - 1,25,000 PA.  Chennai                              nan                      nan
.....  nan         Not disclosed            Chennai                              08 October - 13 October  nan
.....  nan         Not disclosed            Mumbai Suburbs, Navi Mumbai, Mumbai  nan                      B.Com
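One way to get this shape, offered as a sketch and not necessarily the approach the accepted answer used: turn each list of {'type', 'label'} dictionaries into a single {type: label} dictionary, build a DataFrame from those, and join it back onto the original. The `job_id` column is a hypothetical stand-in for the other columns of `df3`.

```python
import pandas as pd

# First three rows of the column from the question; job_id is hypothetical.
df3 = pd.DataFrame({
    'job_id': [1, 2, 3],
    'placeholders': [
        [{'type': 'experience', 'label': '0-1 Yrs'},
         {'type': 'salary', 'label': '1,00,000 - 1,25,000 PA.'},
         {'type': 'location', 'label': 'Chennai'}],
        [{'type': 'date', 'label': '08 October - 13 October'},
         {'type': 'salary', 'label': 'Not disclosed'},
         {'type': 'location', 'label': 'Chennai'}],
        [{'type': 'education', 'label': 'B.Com'},
         {'type': 'salary', 'label': 'Not disclosed'},
         {'type': 'location', 'label': 'Mumbai Suburbs, Navi Mumbai, Mumbai'}],
    ],
})

# One {type: label} dict per row; types missing from a row become NaN.
expanded = pd.DataFrame(
    [{d['type']: d['label'] for d in row} for row in df3['placeholders']],
    index=df3.index,
)

# Attach the new columns to the existing dataframe.
result = df3.join(expanded)
print(result.drop(columns='placeholders'))
```

Passing `index=df3.index` keeps the new rows aligned with the original ones even if the dataframe does not have a default 0..n-1 index.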
The first answer worked.
[EDIT 2]
Later, I tried the same code suggested in the first response on a new dataset with the same issue, and I got the following error:
<ipython-input-23-ad8e644044af> in <listcomp>(.0)
----> 1 new_columns = set([d['Name'] for l in dfr.RatingDistribution.values for d in l ])
2 # Make a dict of dicts
3 col_val_dict = {}
4 for col_name in new_columns:
5 col_val_dict[col_name] = {}
TypeError: 'float' object is not iterable
My Input column:
RatingDistribution
[{'Name': 'Work-Life Balance', 'count': 5}, {'Name': 'Skill Development', 'count': 5}, {'Name': 'Salary & Benefits', 'count': 5}, {'Name': 'Job Security', 'count': 5}, {'Name': 'Company Culture', 'count': 5}, {'Name': 'Career Growth', 'count': 5}, {'Name': 'Work Satisfaction', 'count': 5}]
[{'Name': 'Work-Life Balance', 'count': 4}, {'Name': 'Skill Development', 'count': 5}, {'Name': 'Salary & Benefits', 'count': 4}, {'Name': 'Job Security', 'count': 4}, {'Name': 'Company Culture', 'count': 3}, {'Name': 'Career Growth', 'count': 3}, {'Name': 'Work Satisfaction', 'count': 5}]
[{'Name': 'Work-Life Balance', 'count': 3}, {'Name': 'Skill Development', 'count': 4}, {'Name': 'Salary & Benefits', 'count': 5}, {'Name': 'Job Security', 'count': 4}, {'Name': 'Company Culture', 'count': 5}, {'Name': 'Career Growth', 'count': 4}, {'Name': 'Work Satisfaction', 'count': 4}]
[{'Name': 'Work-Life Balance', 'count': 5}, {'Name': 'Skill Development', 'count': 5}, {'Name': 'Salary & Benefits', 'count': 5}, {'Name': 'Job Security', 'count': 5}, {'Name': 'Company Culture', 'count': 5}, {'Name': 'Career Growth', 'count': 5}, {'Name': 'Work Satisfaction', 'count': 5}]
[{'Name': 'Work-Life Balance', 'count': 3}, {'Name': 'Skill Development', 'count': 5}, {'Name': 'Salary & Benefits', 'count': 3}, {'Name': 'Job Security', 'count': 3}, {'Name': 'Company Culture', 'count': 3}, {'Name': 'Career Growth', 'count': 3}, {'Name': 'Work Satisfaction', 'count': 4}]
[{'Name': 'Work-Life Balance', 'count': 3}, {'Name': 'Skill Development', 'count': 5}, {'Name': 'Salary & Benefits', 'count': 5}, {'Name': 'Job Security', 'count': 1}, {'Name': 'Company Culture', 'count': 3}, {'Name': 'Career Growth', 'count': 1}, {'Name': 'Work Satisfaction', 'count': 1}]
My code:
new_columns = set([d['Name'] for l in dfr.RatingDistribution.values for d in l ])
# Make a dict of dicts
col_val_dict = {}
for col_name in new_columns:
    col_val_dict[col_name] = {}
    # For each column name look to see if a row has that as a type
    # If so, get the label for that dict
    # otherwise fill it with NaN
    for i,l in enumerate(dfr.placeholders.values):
        the_label = [d['count'] for d in l if d['Name'] == col_name]
        if the_label:
            col_val_dict[col_name][i] = the_label[0]
        else:
            col_val_dict[col_name][i] = np.NaN
# Merge this new dfa with the old one
merged_dfa = pd.concat([dfr,pd.DataFrame(col_val_dict)],axis='columns')
dfr.shape
I'm getting the error on the very first line, and I can't figure out why it raises the float error.
Please help.
import numpy as np
import pandas as pd
# Get the unique types (column names)
new_columns = set([d['type'] for l in df3.placeholders.values for d in l ])
# Make a dict of dicts
col_val_dict = {}
for col_name in new_columns:
    col_val_dict[col_name] = {}
    # For each column name look to see if a row has that as a type
    # If so, get the label for that dict
    # otherwise fill it with NaN
    for i,l in enumerate(df3.placeholders.values):
        the_label = [d['label'] for d in l if d['type'] == col_name]
        if the_label:
            col_val_dict[col_name][i] = the_label[0]
        else:
            col_val_dict[col_name][i] = np.nan
# Merge this new df with the old one
merged_df = pd.concat([df3,pd.DataFrame(col_val_dict)],axis='columns')
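Regarding the error in [EDIT 2]: "'float' object is not iterable" on that line usually means at least one row of RatingDistribution is missing. pandas stores a missing cell as NaN, which is a float, so the inner "for d in l" has nothing to iterate over. Note also that the second loop in the question's code still reads dfr.placeholders rather than dfr.RatingDistribution. The following is a minimal sketch of one way to guard against missing rows. It assumes the cells are real Python lists (not strings read from a CSV), and the sample frame and the to_counts helper are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the second row is missing, so pandas holds NaN (a float) there.
dfr = pd.DataFrame({
    "RatingDistribution": [
        [{"Name": "Job Security", "count": 5}, {"Name": "Career Growth", "count": 4}],
        np.nan,
        [{"Name": "Job Security", "count": 1}],
    ]
})

def to_counts(cell):
    """Turn one cell (a list of dicts, or NaN) into a {Name: count} dict."""
    if not isinstance(cell, list):
        return {}  # missing row: no ratings, so every new column is NaN for it
    return {d["Name"]: d["count"] for d in cell}

# One dict per row; pandas lines the keys up as columns and fills gaps with NaN.
ratings = pd.DataFrame([to_counts(c) for c in dfr["RatingDistribution"]], index=dfr.index)
merged = pd.concat([dfr, ratings], axis="columns")
print(merged)
```

Building one dict per row also avoids scanning the whole column once per new column name.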

Group List of Dicts by Key in Python

Looking for a Pythonic way to iterate over a list of Dicts and group them by a certain key.
E.g. a list like this should be grouped by position
[
{'Name': 'Bradley Greer', 'Position': 'Software Engineer', 'Office': 'London', 'Age': '41', 'Start date': '2012/10/13', 'Salary': '$132,000'},
{'Name': 'Brenden Wagner', 'Position': 'Software Engineer', 'Office': 'San Francisco', 'Age': '28', 'Start date': '2011/06/07', 'Salary': '$206,850'},
{'Name': 'Bruno Nash', 'Position': 'Software Engineer', 'Office': 'London', 'Age': '38', 'Start date': '2011/05/03', 'Salary': '$163,500'},
{'Name': 'Cara Stevens', 'Position': 'Sales Assistant', 'Office': 'New York', 'Age': '46', 'Start date': '2011/12/06', 'Salary': '$145,600'},
{'Name': 'Donna Snider', 'Position': 'Customer Support', 'Office': 'New York', 'Age': '27', 'Start date': '2011/01/25', 'Salary': '$112,000'},
{'Name': 'Doris Wilder', 'Position': 'Sales Assistant', 'Office': 'Sydney', 'Age': '23', 'Start date': '2010/09/20', 'Salary': '$85,600'},
{'Name': 'Gavin Joyce', 'Position': 'Sales Assistant', 'Office': 'Edinburgh', 'Age': '42', 'Start date': '2010/12/22', 'Salary': '$92,575'},
{'Name': 'Herrod Chandler', 'Position': 'Sales Assistant', 'Office': 'San Francisco', 'Age': '59', 'Start date': '2012/08/06', 'Salary': '$137,500'}
]
would result in something like this
[
{
'Position': 'Software Engineer',
'Items': [
{'Name': 'Bradley Greer', 'Position': 'Software Engineer', 'Office': 'London', 'Age': '41', 'Start date': '2012/10/13', 'Salary': '$132,000'},
{'Name': 'Brenden Wagner', 'Position': 'Software Engineer', 'Office': 'San Francisco', 'Age': '28', 'Start date': '2011/06/07', 'Salary': '$206,850'},
{'Name': 'Bruno Nash', 'Position': 'Software Engineer', 'Office': 'London', 'Age': '38', 'Start date': '2011/05/03', 'Salary': '$163,500'},
]
},
{
'Position': 'Sales Assistant',
'Items': [
{'Name': 'Cara Stevens', 'Position': 'Sales Assistant', 'Office': 'New York', 'Age': '46', 'Start date': '2011/12/06', 'Salary': '$145,600'},
{'Name': 'Doris Wilder', 'Position': 'Sales Assistant', 'Office': 'Sydney', 'Age': '23', 'Start date': '2010/09/20', 'Salary': '$85,600'},
{'Name': 'Gavin Joyce', 'Position': 'Sales Assistant', 'Office': 'Edinburgh', 'Age': '42', 'Start date': '2010/12/22', 'Salary': '$92,575'},
{'Name': 'Herrod Chandler', 'Position': 'Sales Assistant', 'Office': 'San Francisco', 'Age': '59', 'Start date': '2012/08/06', 'Salary': '$137,500'}
]
},
{
'Position': 'Customer Support',
'Items': [
{'Name': 'Donna Snider', 'Position': 'Customer Support', 'Office': 'New York', 'Age': '27', 'Start date': '2011/01/25', 'Salary': '$112,000'},
]
}
]
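You can use itertools.groupby. Note that groupby only groups consecutive items, so the list has to be sorted by the grouping key first; otherwise 'Sales Assistant' would show up as two separate groups here. The groups then come out in sorted key order.
items is your input list
import itertools
key_func = lambda x: x['Position']
output = []
# sorted() returns a new list, so items itself is left unchanged
for k, v in itertools.groupby(sorted(items, key=key_func), key=key_func):
    output += [{
        'Position': k,
        'Items': list(v)
    }]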
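If you want a single expression, here is one. It rescans the whole list once per distinct key, so it is slower than the previous solution on large inputs, and because it iterates over a set the order of the groups is not guaranteed:
people = items  # your list of dicts
key = "Position"  # the key to group by
output = [
    {key: k, "Items": [person for person in people if person[key] == k]}
    for k in {person[key] for person in people}
]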
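Another option, sketched here as an alternative to the answers above, is to collect the groups in a plain dict in a single pass. This needs no sorting and keeps the groups in the order their key first appears, which matches the order in the question's expected output. The group_by name and the shortened sample records are made up for illustration.

```python
def group_by(records, key):
    """Group a list of dicts by record[key], keeping groups in first-seen order."""
    groups = {}  # regular dicts preserve insertion order (Python 3.7+)
    for record in records:
        groups.setdefault(record[key], []).append(record)
    return [{key: k, "Items": v} for k, v in groups.items()]

# Shortened, hypothetical sample in the same shape as the question's list
items = [
    {"Name": "Bradley Greer", "Position": "Software Engineer"},
    {"Name": "Cara Stevens", "Position": "Sales Assistant"},
    {"Name": "Donna Snider", "Position": "Customer Support"},
    {"Name": "Doris Wilder", "Position": "Sales Assistant"},
]
output = group_by(items, "Position")
for group in output:
    print(group["Position"], [person["Name"] for person in group["Items"]])
```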

create dictionary of values based on matching keys in list from nested dictionary

I have a nested dictionary called mainlookup with up to 300 items, from TYPE1 to TYPE300:
mainlookup = {'TYPE1': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
'TYPE2': [{'Song': 'Jazz', 'Type': 'Slow', 'Price': '5'}],
'TYPE37': [{'Song': 'Country', 'Type': 'Fast', 'Price': '7'}]}
The input list is searched against the lookup based on the strings TYPE1, TYPE2 and so on:
input_list = ['thissong-fav-user:type1-chan-44-John',
'thissong-fav-user:type1-chan-45-kelly-md',
'thissong-fav-user:type2-rock-45-usa',
'thissong-fav-user:type737-chan-45-patrick-md',
'thissong-fav-user:type37-chan-45-kelly-md']
I want to find the TYPE string in each entry of input_list and then create a dictionary as shown below:
Output_Desired = {'thissong-fav-user:type1-chan-44-John': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
'thissong-fav-user:type1-chan-45-kelly-md': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
'thissong-fav-user:type2-rock-45-usa': [{'Song': 'Jazz', 'Type': 'Slow', 'Price': '5'}],
'thissong-fav-user:type37-chan-45-kelly-md': [{'Song': 'Country', 'Type': 'Fast', 'Price': '7'}]}
Note: thissong-fav-user:type737-chan-45-patrick-md in the list has no match, so I want to create a separate list for values that are not found in mainlookup:
Notfound_list = ['thissong-fav-user:type737-chan-45-patrick-md', and so on..]
Appreciate your help.
You can try this:
mainlookup = {'TYPE1': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
'TYPE2': [{'Song': 'Jazz', 'Type': 'Slow', 'Price': '5'}], 'TYPE37': [{'Song': 'Country', 'Type': 'Fast', 'Price': '7'}]}
input_list = ['thissong-fav-user:type1-chan-44-John',
'thissong-fav-user:type1-chan-45-kelly-md', 'thissong-fav-user:type737-chan-45-kelly-md']
dct={i:mainlookup[i.split(':')[1].split('-')[0].upper()] for i in input_list if i.split(':')[1].split('-')[0].upper() in mainlookup.keys()}
Notfoundlist=[i for i in input_list if i not in dct.keys() ]
print(dct)
print(Notfoundlist)
Output:
{'thissong-fav-user:type1-chan-44-John': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}], 'thissong-fav-user:type1-chan-45-kelly-md': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}]}
['thissong-fav-user:type737-chan-45-kelly-md']
An answer using regular expressions:
import re
from pprint import pprint
input_list = ['thissong-fav-user:type1-chan-44-John', 'thissong-fav-user:type1-chan-45-kelly-md', 'thissong-fav-user:type2-rock-45-usa', 'thissong-fav-user:type737-chan-45-patrick-md', 'thissong-fav-user:type37-chan-45-kelly-md']
mainlookup = {'TYPE2': {'Song': 'Reggaeton', 'Type': 'Hard', 'Price': '30'}, 'TYPE1': {'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}, 'TYPE737': {'Song': 'Jazz', 'Type': 'Hard', 'Price': '99'}, 'TYPE37': {'Song': 'Rock', 'Type': 'Soft', 'Price': '1'}}
pattern = re.compile('type[0-9]+')
matches = [re.search(pattern, x).group(0) for x in input_list]
result = {x: [mainlookup[matches[i].upper()]] for i, x in enumerate(input_list)}
pprint(result)
Output:
{'thissong-fav-user:type1-chan-44-John': [{'Price': '10',
'Song': 'Rock',
'Type': 'Hard'}],
'thissong-fav-user:type1-chan-45-kelly-md': [{'Price': '10',
'Song': 'Rock',
'Type': 'Hard'}],
'thissong-fav-user:type2-rock-45-usa': [{'Price': '30',
'Song': 'Reggaeton',
'Type': 'Hard'}],
'thissong-fav-user:type37-chan-45-kelly-md': [{'Price': '1',
'Song': 'Rock',
'Type': 'Soft'}],
'thissong-fav-user:type737-chan-45-patrick-md': [{'Price': '99',
'Song': 'Jazz',
'Type': 'Hard'}]}
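The regular-expression answer above adds TYPE737 to the lookup, so it never hits the not-found case; with the question's original mainlookup it would raise a KeyError. Below is a minimal sketch that keeps the question's data and collects unmatched entries separately. It assumes the type token always follows a colon, as in the sample strings; the variable names found and not_found are illustrative.

```python
import re

# Lookup and inputs taken from the question
mainlookup = {
    "TYPE1": [{"Song": "Rock", "Type": "Hard", "Price": "10"}],
    "TYPE2": [{"Song": "Jazz", "Type": "Slow", "Price": "5"}],
    "TYPE37": [{"Song": "Country", "Type": "Fast", "Price": "7"}],
}
input_list = [
    "thissong-fav-user:type1-chan-44-John",
    "thissong-fav-user:type1-chan-45-kelly-md",
    "thissong-fav-user:type2-rock-45-usa",
    "thissong-fav-user:type737-chan-45-patrick-md",
    "thissong-fav-user:type37-chan-45-kelly-md",
]

# Assumption: the type token directly follows a colon, e.g. "...user:type37-..."
pattern = re.compile(r":(type\d+)")

found, not_found = {}, []
for entry in input_list:
    match = pattern.search(entry)
    lookup_key = match.group(1).upper() if match else None
    if lookup_key in mainlookup:
        found[entry] = mainlookup[lookup_key]
    else:
        not_found.append(entry)  # no type token, or the type is not in the lookup

print(found)
print(not_found)
```

Because \d+ consumes all the digits, type737 is looked up as TYPE737 and is not confused with TYPE7 or TYPE37.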

Multiple Nested Dictionaries into Dataframe as one single row

I am working on a nested dictionary and not able to put it in a dataframe.
This is what it looks like:
{'props': {'initialProps': {'statusCode': 0, '$isMobile': False, '$isIOS': None, '$isAndroid': False, '$host': 'www.tiktok.com', '$pageUrl': '/#alanwalkermusic/video/6816594007829859589', '$language': 'en', '$originalLanguage': 'en', '$languageList': [{'value': 'id', 'alias': 'id-ID', 'label': 'Bahasa Indonesia', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'de', 'alias': 'de-DE', 'label': 'Deutsch', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'en', 'alias': 'en', 'label': 'English', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'es', 'alias': 'es', 'label': 'Español', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'fr', 'alias': 'fr', 'label': 'Français', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'it', 'alias': 'it-IT', 'label': 'Italiano', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'pl', 'alias': 'pl-PL', 'label': 'Polski', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'pt_BR', 'alias': 'pt-BR', 'label': 'Português', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'vi', 'alias': 'vi-VN', 'label': 'Tiếng Việt', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'tr', 'alias': 'tr-TR', 'label': 'Türkçe', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'ru', 'alias': 'ru-RU', 'label': 'Русский', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'hi', 'alias': 'hi-IN', 'label': 'हिन्दी', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'ko', 'alias': 'ko-KR', 'label': '한국어', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'ja', 'alias': 'ja-JP', 'label': '日本語', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'zh_Hant', 'alias': 'zh-Hant-TW', 'label': '繁體中文', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'ar', 'alias': 'ar', 'label': 'العربية', 'children': [{'value': 'default', 'label': ''}]}], '$region': 'DE', '$appId': 1233, '$os': 'windows', '$baseURL': 'm.tiktok.com', 
'$downloadLink': {'amazon': {'visible': True, 'normal': 'https://www.amazon.com/dp/B0117U0G3M/'}, 'google': {'visible': True, 'normal': 'https://www.tiktok.com/download-link/af/com.zhiliaoapp.musically'}, 'apple': {'visible': True, 'normal': 'https://www.tiktok.com/download-link/af/id835599320'}}, '$abTestVersion': {'clientParameters': '{}', 'clientVersionName': '', 'versionName': '1618717', 'parameters': '{}', 'startTime': '', 'endTime': ''}, '$appType': 'm', '$gray': {'upload': False, 'tea': False}, '$reflowType': 't', '$legalList': [{'title': 'TikTok.com Cookies Policy', 'key': 'tiktok-website-cookies-policy', 'href': 'https://www.tiktok.com/legal/tiktok-website-cookies-policy?lang=en'}, {'title': 'Open Source', 'key': 'open-source', 'href': 'https://www.tiktok.com/legal/open-source?lang=en'}, {'title': 'Virtual Items', 'key': 'virtual-items', 'href': 'https://www.tiktok.com/legal/virtual-items?lang=en'}, {'title': 'Intellectual Property Policy', 'key': 'copyright-policy', 'href': 'https://www.tiktok.com/legal/copyright-policy?lang=en'}, {'title': 'Law Enforcement', 'key': 'law-enforcement', 'href': 'https://www.tiktok.com/legal/law-enforcement?lang=en'}, {'title': 'Privacy Policy', 'key': 'privacy-policy', 'href': 'https://www.tiktok.com/legal/privacy-policy?lang=en'}, {'title': 'Terms of Service', 'key': 'terms-of-use', 'href': 'https://www.tiktok.com/legal/terms-of-use?lang=en'}], '$botType': 'others', '$config': {'covidBanner': {'open': True, 'url': 'https://www.tiktok.com/safety/resources/covid-19', 'background': 'rgba(125,136,227,1)'}}, '$wid': '6819614602332063233'}, 'pageProps': {'serverCode': 200, 'pageState': {'regionAppId': 1233, 'os': 'windows', 'region': 'DE', 'baseURL': 'm.tiktok.com', 'appType': 't', 'fullUrl': 'https://www.tiktok.com/node/share/video/#alanwalkermusic/6816594007829859589'}, 'videoData': {'itemInfos': {'id': '6816594007829859589', 'video': {'urls': 
['https://v19.muscdn.com/afefda4362dccd13b93a4d8e79ab5338/5ea477c7/video/tos/useast2a/tos-useast2a-ve-0068c004/7f922f16b9534b80be6523cbb8ba2ce2/?a=1233&br=2652&bt=1326&cr=0&cs=0&dr=0&ds=3&er=&l=202004251147400101150770371E5160EE&lr=tiktok_m&qs=0&rc=MzY3cXc1OjxwdDMzNjczM0ApNDtnNTY1Ojw2N2k8NDU4ZWdoZXBmLjE0ZGBfLS0vMTZzczZjYDEvL2BgMDAzXy0uMC86Yw%3D%3D&vl=&vr='], 'videoMeta': {'width': 540, 'height': 960, 'ratio': 11, 'duration': 11}}, 'covers': ['https://p16-va-default.akamaized.net/obj/tos-maliva-p-0068/8187e5fe9a0f684529910363f5dbe428'], 'authorId': '84453035275386880', 'coversOrigin': ['https://p16-va-default.akamaized.net/obj/tos-maliva-p-0068/7dd18717d8aa4916abc5699c25c09b3e_1587111976'], 'text': 'I’m that friend who can’t stand still when taking photos', 'commentCount': 300, 'diggCount': 14145, 'playCount': 91447, 'shareCount': 29, 'createTime': '1587111974', 'isActivityItem': False, 'warnInfo': [], 'liked': False, 'commentStatus': 0, 'showNotPass': False}, 'authorInfos': {'verified': True, 'secUid': 'MS4wLjABAAAAhS4j-WG_Zq5WJfdV35QEqorNPzc_kkFVNPQmGM6mQIWEg6FOgjS-Dl9eISlYvPDc', 'uniqueId': 'alanwalkermusic', 'userId': '84453035275386880', 'nickName': 'Alan Walker', 'covers': ['https://p16.muscdn.com/img/musically-maliva-obj/1649466056705029~c5_100x100.jpeg']}, 'musicInfos': {'musicId': '6805465027164768258', 'musicName': 'Heading Home', 'authorName': 'Alan Walker, Ruben', 'covers': ['https://p16.muscdn.com/img/musically-maliva-obj/1649466056705029~c5_100x100.jpeg']}, 'authorStats': {'followerCount': 2147815, 'heartCount': '10903873'}, 'challengeInfoList': [], 'duetInfo': '0', 'textExtra': []}, 'shareUser': {'secUid': '', 'userId': '', 'uniqueId': '', 'nickName': '', 'signature': '', 'covers': [], 'coversMedium': [], 'coversLarger': [], 'isSecret': False}, 'shareMeta': {'title': 'Alan Walker on TikTok', 'desc': 'I’m that friend who can’t stand still when taking photos', 'image': {'url': 
'http://p16-va-default.akamaized.net/img/tos-maliva-p-0068/8187e5fe9a0f684529910363f5dbe428~tplv-tiktok-play.image', 'width': 540, 'height': 960}}, 'statusCode': 0, 'langList': [{'value': 'id', 'alias': 'id-ID', 'label': 'Bahasa Indonesia', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'de', 'alias': 'de-DE', 'label': 'Deutsch', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'en', 'alias': 'en', 'label': 'English', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'es', 'alias': 'es', 'label': 'Español', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'fr', 'alias': 'fr', 'label': 'Français', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'it', 'alias': 'it-IT', 'label': 'Italiano', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'pl', 'alias': 'pl-PL', 'label': 'Polski', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'pt_BR', 'alias': 'pt-BR', 'label': 'Português', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'vi', 'alias': 'vi-VN', 'label': 'Tiếng Việt', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'tr', 'alias': 'tr-TR', 'label': 'Türkçe', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'ru', 'alias': 'ru-RU', 'label': 'Русский', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'hi', 'alias': 'hi-IN', 'label': 'हिन्दी', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'ko', 'alias': 'ko-KR', 'label': '한국어', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'ja', 'alias': 'ja-JP', 'label': '日本語', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'zh_Hant', 'alias': 'zh-Hant-TW', 'label': '繁體中文', 'children': [{'value': 'default', 'label': ''}]}, {'value': 'ar', 'alias': 'ar', 'label': 'العربية', 'children': [{'value': 'default', 'label': ''}]}], 'webId': '6819614602168567298', 'requestId': '23892479587815260433', 'videoObjectPageProps': {'videoProps': {'url': 
'https://www.tiktok.com/#alanwalkermusic/video/6816594007829859589', 'name': 'Alan Walker(#alanwalkermusic) on TikTok I’m that friend who can’t stand still when taking photos', 'description': 'Alan Walker(#alanwalkermusic) has created a short video on TikTok with music Heading Home. I’m that friend who can’t stand still when taking photos', 'keywords': 'alanwalkermusic, Alan Walker,', 'thumbnailUrl': ['https://p16-va-default.akamaized.net/obj/tos-maliva-p-0068/8187e5fe9a0f684529910363f5dbe428', 'https://p16-va-default.akamaized.net/obj/tos-maliva-p-0068/7dd18717d8aa4916abc5699c25c09b3e_1587111976'], 'uploadDate': '2020-04-17T08:26:14.000Z', 'contentUrl': 'https://v19.muscdn.com/afefda4362dccd13b93a4d8e79ab5338/5ea477c7/video/tos/useast2a/tos-useast2a-ve-0068c004/7f922f16b9534b80be6523cbb8ba2ce2/?a=1233&br=2652&bt=1326&cr=0&cs=0&dr=0&ds=3&er=&l=202004251147400101150770371E5160EE&lr=tiktok_m&qs=0&rc=MzY3cXc1OjxwdDMzNjczM0ApNDtnNTY1Ojw2N2k8NDU4ZWdoZXBmLjE0ZGBfLS0vMTZzczZjYDEvL2BgMDAzXy0uMC86Yw%3D%3D&vl=&vr=', 'embedUrl': 'https://www.tiktok.com/embed/v2/6816594007829859589', 'commentCount': '300', 'duration': 'PT11S', 'audio': {'name': 'Heading Home - Alan Walker, Ruben', 'author': 'Alan Walker, Ruben', 'mainEntityOfPage': {'#type': 'ItemPage', '#id': 'https://www.tiktok.com/music/Heading-Home-6805465027164768258'}}, 'creator': {'#type': 'Person', 'name': 'Alan Walker', 'alternateName': 'alanwalkermusic', 'url': 'https://www.tiktok.com/#alanwalkermusic', 'interactionStatistic': [{'#type': 'InteractionCounter', 'interactionType': {'#type': 'http://schema.org/LikeAction'}, 'userInteractionCount': '10903873'}, {'#type': 'InteractionCounter', 'interactionType': {'#type': 'http://schema.org/FollowAction'}, 'userInteractionCount': 2147815}]}, 'width': 540, 'height': 960, 'interactionStatistic': [{'#type': 'InteractionCounter', 'interactionType': {'#type': 'http://schema.org/WatchAction'}, 'userInteractionCount': 91447}, {'#type': 'InteractionCounter', 'interactionType': 
{'#type': 'http://schema.org/LikeAction'}, 'userInteractionCount': 14145}, {'#type': 'InteractionCounter', 'interactionType': {'#type': 'http://schema.org/ShareAction'}, 'userInteractionCount': 29}]}, 'pageProps': {'type': 'ItemPage', 'id': 'https://www.tiktok.com/#alanwalkermusic/video/6816594007829859589'}, 'breadcrumbProps': {'urlList': [{'name': 'TikTok', 'url': 'https://www.tiktok.com'}, {'name': 'Alan Walker(#alanwalkermusic) Official | TikTok', 'url': 'https://www.tiktok.com/#alanwalkermusic'}, {'name': 'Alan Walker(#alanwalkermusic) on TikTok I’m that friend who can’t stand still when taking photos', 'url': 'https://www.tiktok.com/#alanwalkermusic/video/6816594007829859589'}]}}, 'metaParams': {'title': 'Alan Walker(#alanwalkermusic) on TikTok I’m that friend who can’t stand still when taking photos', 'keywords': 'alanwalkermusic, Alan Walker,', 'description': 'Alan Walker(#alanwalkermusic) has created a short video on TikTok with music Heading Home. I’m that friend who can’t stand still when taking photos', 'robotsContent': 'index, follow'}, 'isSSR': True, 'pageOptions': {'header': {'showUpload': True, 'type': 'webapp'}, 'headOptions': None}}, 'pathname': '/share/video'}, 'page': '/share/video', 'query': {'uniqueId': 'alanwalkermusic', 'id': '6816594007829859589', '$initialProps': {'$wid': '6819614602332063233'}, 'webtoken': 'tac=\'i+2gv2ay5eehis!i#piis"yZl!%s"l"u&kLs#l l#vr*charCodeAtx0[!cb^i$1em7b*0d#>>>s j\\uffeel s#0y\\xe8g,&qnfme|ms l vk.}l ,(dfijxdaamvk5}l ,(dfijxdaam,$lwcamk\\xadl ,(dfijxdaam,$lwcams!l!v,\\\'nfmosCkmx,*~bgyad>r}~[!c0b<k$t jql!v,\\\'nfmosCkmx,%cokm3[!c0d#===k$t jMl!v,\\\'nfmosCkmx,0xefc.:9&*.4+2-0.[!c0d#===k"t f z[ cb|1d<y\\u01ea,(|fY\\x7f~d`hs g,(lfi~ah`{ms!g,&qnfme|ms"g,)gk}ejo{\\x7fcms#,)|doikgauus$ul"d\\\',typeofl$d#===v!kA}l"vl mx[ c,/T\\x7fsxvwa6#qw~tk#d#!==v!k\\\\}gr&Object,)yxdxbzv`tml mvr$callxl"[!c,/T\\x7fsxvwa6#qw~tk#d#!==v!k4}ul!d\\\',typeofl$d#===v!kG}l!vl mx[ 
cv,\\\'nfmosCkmx,(Lfi~ah`{[!c0b<v!k4}ul#d\\\',typeofl$d#===v!kD}l#vl mx[ c,2I|v\\x7fstl9Tzjty~TNP~d#!==v!k=}ugr(locationd\\\',typeofl$d#===k"t fv!k4}l!,,hbmz}t|gYzrrm!!!kjg,\\\'oaz~d~tms%ul%d\\\',typeofl$d#===v!kB}l%vl mx[ c,0K~pyqvb7PpiosogBd#!==k"t f z[ cb|1d<y\\xe2g,&qnfme|ms l vk-}l ,\\\'dggyd`hmvk7}l ,\\\'dggyd`hm,\\\'aa{oiyjmk"t ugr.InstallTriggerd\\\',typeof,)|doikgauud#!==kwl vkn}l ,*e~xh|Xyuf{ml ,*cebh|Xyuf{mb-\\x94b>v!kF}l ,+dyyk}Xt{t|aml ,+bbck}Xt{t|amb-\\x94b>k"t f z[ cb|1d<y\\xa0g,&akgkkgms g,&Iebli\\x7fms!ul d\\\',typeof,)|doikgauud#!==vkh}l!,)yxdxbzv`tm,(|fY\\x7f~d`hmvr$callxl ,\\\'wzfin\\x7f~m[!c,0K~pyqvb7hkuxynmBd#=== z[ cb|1d<y\\u011eg,(lfi~ah`{ms g,)gk}ejo{\\x7fcms!g,&qnfme|ms"fv!k4}l ,,hbmz}t|gYzrrm!!!k\\xd7,\\\'wd|mbb~l!d"in!v!kI}l!,\\\'wd|mbb~mg,+[`xif~P`aulmd*instanceof!v!k1},(Wybjbyabl"d"inv!k4},+hmab_xp|g{xl"d"inv!k4},+TScghxe\\x7frfpl"d"inv!kP},%Dscafl"d"in,8[xtm}nLzNEGQMKAdGG^NTY\\x1ckl"d"inb< f z[ cb|1d<y\\u02a1g,&qnfme|ms g,)gk}ejo{\\x7fcms!g,&Iebli\\x7fms"l!,)~oih\\x7fgyucmk"t (\\x80,.jjvx|vDgyg}knbl"d"inkfl"v,.jjvx|vDgyg}knbmxl!,)~oih\\x7fgyucgr&Objectn vuq%valuevfq(writable[#c}) %{s#t ,4KJarz}hrjxl#EWCOQDRB,3LKfs{}wsnqB{iAMWBP#,;DCj{}DSKUAWyTK[C[XrHZ^RFZ[[,7HGn\\x7fyxowiES}PGWOW\\\\vL^BN,5JI`}{~iuk{m\\x7fRAQMURxNG,3LKsnsjpl~nB{iAMWBP#,2MLpg\\x7fa}kEnrjl~PQGG,5JI`}{~iuk{m\\x7fTLTVDVWMM,1NMwf|`rjF\\x7fm}qk~TD,4KJert|tripAjNVPBTUCC,4KJpo|ksmyoAjNVPBTUCC[+s#,)Vyn`h`fe|,,olbcCt~vz|cz,6ID}u\\x7fuuhs#ieg|v#EHZMOY[#s$l$*%s%l%u&k4s&l$l&ms\\\'l l\\\'mk"t j\\x06l#*%s%l%u&k?s&l#l&ms\\\'l ,(lfi~ah`{ml\\\'mk"t j\\ufffbl ,(lfi~ah`{m*%s%l%u&kls&l&vr%matchxgr&RegExp$*\\\\$[a-z]dc_$ n"[!cvk:}l ,(lfi~ah`{ml&m,&efkaoTmk"t j\\uffcef z[ cb|1d<y\\u024fg,&qnfme|ms gr&RegExp$+constructor$!in"vr$testxl ,+CX#BJ|t\\x7fvzam[!cv!k\\xb2}yZl vr(toStringx[ c,AzMAN#ES\\x08zKMM_G}U\\\\]GQ{YCQ_SX]IWP.\\x1cd#=== l ,&ufnhxbm!v!kd}ugr&safarid\\\',typeof,)|doikgauud#!==vk=}gr&safari,0`da{Zzb~~pyzhtqqm&k\\u010e(=l v,,c}kaTpfrvtermxzzzz[$c}) %{s!t (\\x85l 
,.}jcb{|zFbxjx}~mvr\\\'setItemx,+xc`kDuhZvfp, ["c}l ,.}jcb{|zFbxjx}~mvr*removeItemx,+xc`kDuhZvfp[!c}) \\x7f{s!l!,$gjbbmgr,DOMException,2CF[AWH]AY^YY[[\\x7fdpqmd#===vkC}l ,.}jcb{|zFbxjx}~m,&jbfn~cm0d#===k"t f fv!k>}g,(lfi~ah`{m,,hbmz}t|gYzrrm!!k`l ,)`doiukkTSm!vkJ}l ,,\\\\bgadt`Vbpxcmv!k4}l ,.C\\\\#~{}`pdRn|tomk"t f z[ cb|1d<y\\u0165g,(lfi~ah`{ms g,)gk}ejo{\\x7fcms!,(|fY\\x7f~d`hs",\\\'nfmosCks#fv!k4}l ,,hbmz}t|gYzrrm!!!k\\u0113l v,-n|jqewVxp{rvmmx,&eff\\x7fkx[!cs$l$,)}eOmyoZB]mvl"mx[ cv,\\\'umyfjohmxgr&RegExp$#\\\\s*$!gn"$ ["cvl#mx,*djxdxjs~vv[!c0b<v!kh}l!l"mvl"mx[ cvr\\\'replacexgr&RegExp$#\\\\s*$!gn"$ ["cvl#mx,*djxdxjs~vv[!c0b<v!kP}l!,\\\'wd|mbb~mvl"mx[ c,4Ozt}}zn;LqkxIOcQVD_zd#!== f z[ cb|1d<1b|\'', '0fea6a13c52b4d4725368f24b045ca84': {}, 'aa59d67c2123f094d0d6798ffe651c4d': {}}, 'buildId': '1.0.1.1363', 'assetPrefix': '//s16.tiktokcdn.com/tiktok/falcon', 'isFallback': False, 'customServer': True}
I can reach any of the values like this:
import json
parsed = json.loads(source_code)
props = parsed['props']['pageProps']['videoData']
follower = props['authorStats']['followerCount']
heartcount = props['authorStats']['heartCount']
AuthorID = props['itemInfos']['authorId']
Commentcount = props['itemInfos']['commentCount']
CreateTime = props['itemInfos']['createTime']
But I was not able to put all of them into a dataframe with one row, with follower, heartcount, etc. as columns.
Thank you for your help!
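One possible approach, sketched under the assumption that only a handful of scalar values are needed: collect them into a single dict and pass a one-element list to pandas.DataFrame, which yields exactly one row. The video_data dict below is a trimmed-down stand-in for parsed['props']['pageProps']['videoData'], with values copied from the dump above; the column names are illustrative.

```python
import pandas as pd

# Trimmed-down stand-in for parsed['props']['pageProps']['videoData']
video_data = {
    "itemInfos": {
        "authorId": "84453035275386880",
        "commentCount": 300,
        "createTime": "1587111974",
    },
    "authorStats": {"followerCount": 2147815, "heartCount": "10903873"},
}

# Option 1: pick the values by hand; a list holding one dict gives one row.
row = {
    "follower": video_data["authorStats"]["followerCount"],
    "heartcount": video_data["authorStats"]["heartCount"],
    "AuthorID": video_data["itemInfos"]["authorId"],
    "Commentcount": video_data["itemInfos"]["commentCount"],
    "CreateTime": video_data["itemInfos"]["createTime"],
}
df = pd.DataFrame([row])
print(df)

# Option 2: flatten every nested dict at once; column names become dotted paths
# such as "authorStats.followerCount".
flat = pd.json_normalize(video_data)
print(flat.columns.tolist())
```

Option 1 gives control over the column names; Option 2 is convenient when most of the nested fields are wanted, although list-valued fields stay as lists inside a single cell.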
