Exception/Error Handling in Python

I have created a class that, at the end of the task, should hold two lists: one for tweets and one for tweet labels.
After initialization, I want to load the tweets from one file and their labels from another. After loading, I want to check that each tweet is a valid JSON object; if a tweet fails to parse, I want to remove it along with its associated label, if labels were provided. Labels are either 'pos' or 'neg'.
import json

class flu_tweets:
    def __init__(self):
        self.tweets = []  # init create empty tweet list
        self.labels = []  # init create empty label list

    def load(self, tweets_filename, labels_filename=''):
        open_tweet_file = open(tweets_filename, 'r')
        for tweet in open_tweet_file:
            if tweet != '\n' and tweet != '\r\n':
                self.tweets.extend([tweet])
        if labels_filename != '':
            open_label_file = open(labels_filename, 'r')
            for label in open_label_file:
                if label != '\n' and label != '\r\n':
                    if label[3:] != '\n':
                        self.labels.extend([label[:3]])
                    else:
                        self.labels.extend([label])
            open_label_file.close()
        index_tweet = 0
        for tweet in self.tweets:
            try:
                json.loads(tweet)
                index_tweet += 1
                break
            except:
                print(index_tweet-1)
                self.tweets.pop(index_tweet-1)
                if self.labels != []:
                    self.labels.pop(index_tweet-1)
        open_tweet_file.close()
Right now the method doesn't do that: when I check the list afterwards, it still contains non-JSON objects.
Below is a copy of the text file that contains the tweets:
{"created_at":"Fri Oct 20 14:35:19 +0000 2017","id":921384339421745153,"id_str":"921384339421745153","text":"RT #alvindchipmunk: Dont let the DNC slide with no handcuffs. https://t.co/h72q7lGAHF","source":"\u003ca href=\"http://twitter.com/download/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":3435638633,"id_str":"3435638633","name":"alwaystrump","screen_name":"rodilosso_patty","location":"New Jersey, USA","url":null,"description":"Let's not give the media the Race war they want. POTUS we have your back! MAGA","translator_type":"none","protected":false,"verified":false,"followers_count":2770,"friends_count":2048,"listed_count":183,"favourites_count":85745,"statuses_count":152520,"created_at":"Sat Aug 22 16:27:18 +0000 2015","utc_offset":null,"time_zone":null,"geo_enabled":true,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","profile_background_tile":false,"profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http://pbs.twimg.com/profile_images/773719268026257410/AuXU_l-D_normal.jpg","profile_image_url_https":"https://pbs.twimg.com/profile_images/773719268026257410/AuXU_l-D_normal.jpg","profile_banner_url":"https://pbs.twimg.com/profile_banners/3435638633/1461499638","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweeted_status":{"created_at":"Fri Oct 20 12:27:04 +0000 
2017","id":921352064160034816,"id_str":"921352064160034816","text":"Dont let the DNC slide with no handcuffs. https://t.co/h72q7lGAHF","display_text_range":[0,41],"source":"\u003ca href=\"http://twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":35962023,"id_str":"35962023","name":"alvin maldonado","screen_name":"alvindchipmunk","location":"Bunnell, FL","url":"http://alvindchipmunk-theconservativecomet.blogspot.com/","description":"Artist-musician-Medical Professional-Patriot-Guns-God-Country & 2Unite with others of like mind 2 re-elect Trump, redecorate DC making d USA gr8 as it still is","translator_type":"none","protected":false,"verified":false,"followers_count":1660,"friends_count":2051,"listed_count":43,"favourites_count":2001,"statuses_count":16294,"created_at":"Tue Apr 28 02:43:18 +0000 2009","utc_offset":-14400,"time_zone":"Eastern Time (US & 
Canada)","geo_enabled":true,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"9818A1","profile_background_image_url":"http://pbs.twimg.com/profile_background_images/627772771741732864/MHLgViA4.jpg","profile_background_image_url_https":"https://pbs.twimg.com/profile_background_images/627772771741732864/MHLgViA4.jpg","profile_background_tile":true,"profile_link_color":"981CEB","profile_sidebar_border_color":"DE3C88","profile_sidebar_fill_color":"E887E8","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http://pbs.twimg.com/profile_images/542852260422639616/75bqMWY3_normal.jpeg","profile_image_url_https":"https://pbs.twimg.com/profile_images/542852260422639616/75bqMWY3_normal.jpeg","profile_banner_url":"https://pbs.twimg.com/profile_banners/35962023/1481464103","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"quoted_status_id":920847040132792320,"quoted_status_id_str":"920847040132792320","quoted_status":{"created_at":"Thu Oct 19 03:00:17 +0000 2017","id":920847040132792320,"id_str":"920847040132792320","text":"I'm sick of all the evidence against the Democrats and no handcuffs. Retweet-\nif you agree!\n\n#realDonaldTrump \ud83c\uddfa\ud83c\uddf8","source":"\u003ca href=\"http://twitter.com/download/android\" rel=\"nofollow\"\u003eTwitter for Android\u003c/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":889982846428782592,"id_str":"889982846428782592","name":"c\u2113\u03b9\u03b7\u0442\u03c3\u03b7 \u043c\u03b9c\u043d\u03b1\u03b5\u2113","screen_name":"crusher614","location":"*not of this world","url":"https://www.youtube.com/channel/UCthNh_qChVqAh_zamo0IQxQ","description":"FMR U.S. 
Border Patrol | New Mexico SWAT Operator | Legend Who Lives Rent Free In The Minds of Liberals World Wide. #MAGA \u03bc\u03bf\u03bb\u1f7c\u03bd \u03bb\u03b1\u03b2\u03ad III","translator_type":"none","protected":false,"verified":false,"followers_count":19821,"friends_count":135,"listed_count":73,"favourites_count":16032,"statuses_count":7708,"created_at":"Tue Jul 25 22:57:00 +0000 2017","utc_offset":-25200,"time_zone":"America/Phoenix","geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"000000","profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","profile_background_tile":false,"profile_link_color":"19CF86","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"000000","profile_text_color":"000000","profile_use_background_image":false,"profile_image_url":"http://pbs.twimg.com/profile_images/916414292882219008/fvSIJCC6_normal.jpg","profile_image_url_https":"https://pbs.twimg.com/profile_images/916414292882219008/fvSIJCC6_normal.jpg","profile_banner_url":"https://pbs.twimg.com/profile_banners/889982846428782592/1506312325","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"quote_count":551,"reply_count":703,"retweet_count":15439,"favorite_count":13676,"entities":{"hashtags":[],"urls":[],"user_mentions":[{"screen_name":"realDonaldTrump","name":"Donald J. 
Trump","id":25073877,"id_str":"25073877","indices":[93,109]}],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"en"},"is_quote_status":true,"quote_count":0,"reply_count":0,"retweet_count":1,"favorite_count":0,"entities":{"hashtags":[],"urls":[{"url":"https://t.co/h72q7lGAHF","expanded_url":"https://twitter.com/crusher614/status/920847040132792320","display_url":"twitter.com/crusher614/sta\u2026","indices":[42,65]}],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en"},"quoted_status_id":920847040132792320,"quoted_status_id_str":"920847040132792320","quoted_status":{"created_at":"Thu Oct 19 03:00:17 +0000 2017","id":920847040132792320,"id_str":"920847040132792320","text":"I'm sick of all the evidence against the Democrats and no handcuffs. Retweet-\nif you agree!\n\n#realDonaldTrump \ud83c\uddfa\ud83c\uddf8","source":"\u003ca href=\"http://twitter.com/download/android\" rel=\"nofollow\"\u003eTwitter for Android\u003c/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":889982846428782592,"id_str":"889982846428782592","name":"c\u2113\u03b9\u03b7\u0442\u03c3\u03b7 \u043c\u03b9c\u043d\u03b1\u03b5\u2113","screen_name":"crusher614","location":"*not of this world","url":"https://www.youtube.com/channel/UCthNh_qChVqAh_zamo0IQxQ","description":"FMR U.S. Border Patrol | New Mexico SWAT Operator | Legend Who Lives Rent Free In The Minds of Liberals World Wide. 
#MAGA \u03bc\u03bf\u03bb\u1f7c\u03bd \u03bb\u03b1\u03b2\u03ad III","translator_type":"none","protected":false,"verified":false,"followers_count":19821,"friends_count":135,"listed_count":73,"favourites_count":16032,"statuses_count":7708,"created_at":"Tue Jul 25 22:57:00 +0000 2017","utc_offset":-25200,"time_zone":"America/Phoenix","geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"000000","profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","profile_background_tile":false,"profile_link_color":"19CF86","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"000000","profile_text_color":"000000","profile_use_background_image":false,"profile_image_url":"http://pbs.twimg.com/profile_images/916414292882219008/fvSIJCC6_normal.jpg","profile_image_url_https":"https://pbs.twimg.com/profile_images/916414292882219008/fvSIJCC6_normal.jpg","profile_banner_url":"https://pbs.twimg.com/profile_banners/889982846428782592/1506312325","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"quote_count":551,"reply_count":703,"retweet_count":15439,"favorite_count":13676,"entities":{"hashtags":[],"urls":[],"user_mentions":[{"screen_name":"realDonaldTrump","name":"Donald J. 
Trump","id":25073877,"id_str":"25073877","indices":[93,109]}],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"en"},"is_quote_status":true,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"urls":[{"url":"https://t.co/h72q7lGAHF","expanded_url":"https://twitter.com/crusher614/status/920847040132792320","display_url":"twitter.com/crusher614/sta\u2026","indices":[62,85]}],"user_mentions":[{"screen_name":"alvindchipmunk","name":"alvin maldonado","id":35962023,"id_str":"35962023","indices":[3,18]}],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en","timestamp_ms":"1508510119668"}
{"created_at":"Fri Oct 20 14:35:19 +0000 2017","id":921384340113670149,"id_str":"921384340113670149","text":"RT #mitchelmusso: My girl is so fine! but gaw}
The last tweet is incomplete, so I expect my function to raise an error and remove it.
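One likely culprit in the load method above is the break inside the try block: it leaves the validation loop as soon as the first tweet parses, so later tweets are never checked. Popping from self.tweets while iterating over it would also skip elements. A minimal sketch of an alternative, assuming one tweet per line and labels in the same order as the tweets; the function name filter_valid_json is hypothetical and not part of the original class:

```python
import json

def filter_valid_json(tweets, labels=None):
    """Keep only tweets that parse as JSON, plus their labels if given."""
    kept_tweets, kept_labels = [], []
    for index, tweet in enumerate(tweets):
        try:
            json.loads(tweet)
        except json.JSONDecodeError:
            # Malformed tweet: skip it and its label
            continue
        kept_tweets.append(tweet)
        if labels:
            kept_labels.append(labels[index])
    return kept_tweets, kept_labels

tweets = ['{"text": "ok"}', '{"text":"RT broken}', '{"text": "fine"}']
labels = ['pos', 'neg', 'pos']
good_tweets, good_labels = filter_valid_json(tweets, labels)
```

Building new lists instead of popping in place keeps the tweet and label indices aligned, and catching only json.JSONDecodeError avoids hiding unrelated errors the way a bare except would.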

Related

How to convert this data to a json format and store it in json file

I have sample data in this format:
['{"created_at":"Sat Mar 16 16:10:43 +0000 2019","id":1106950932569288707,"id_str":"1106950932569288707","text":"Hillary and her #nitwit adherents claim she won the 2016 election. \"Participation trophies\" don\'t mean a thing in t\u2026 https:\/\/t.co\/zUUSMWXqja","source":"\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c\/a\u003e","truncated":true,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":864531280934907905,"id_str":"864531280934907905","name":"\ud83c\uddfa\ud83c\uddf8\ud83c\uddfa\ud83c\uddf8 SSDD \ud83c\uddfa\ud83c\uddf8\ud83c\uddfa\ud83c\uddf8","screen_name":"EdGullion","location":"Objective Reality","url":null,"description":"\ud83c\uddfa\ud83c\uddf8\ud83c\uddfa\ud83c\uddf8#Trump | #BuildThatWall | #AmericaFirst | \ud83d\udeabLibs | \ud83d\udeabDems | Block #Nitwits | http:\/\/Gab.com \ud83c\uddfa\ud83c\uddf8\ud83c\uddfa\ud83c\uddf8","translator_type":"none","protected":false,"verified":false,"followers_count":5173,"friends_count":5171,"listed_count":4,"favourites_count":18478,"statuses_count":41837,"created_at":"Tue May 16 17:21:34 +0000 
2017","utc_offset":null,"time_zone":null,"geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"000000","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"1B95E0","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"000000","profile_text_color":"000000","profile_use_background_image":false,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/959783845246615552\/nJBLKbFS_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/959783845246615552\/nJBLKbFS_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/864531280934907905\/1541463166","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"quoted_status_id":1106777688574849024,"quoted_status_id_str":"1106777688574849024","quoted_status":{"created_at":"Sat Mar 16 04:42:18 +0000 2019","id":1106777688574849024,"id_str":"1106777688574849024","text":"Stacey Abrams claimed that she won her election against Republican Gov. Brian Kemp, during an event yesterday.\n\n\u201cI\u2026 https:\/\/t.co\/swEmfwTo8S","source":"\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c\/a\u003e","truncated":true,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":835676806455832576,"id_str":"835676806455832576","name":"\ud83c\uddfa\ud83c\uddf8 Jack Ralph \ud83c\uddfa\ud83c\uddf8","screen_name":"NevadaJack2","location":"Carson City, NV","url":null,"description":"Vietnam Era #Veteran (USAF 61-65), married 57 years, father of 4, grand of 7, retired IT guy. 
YUUGE #Trump supporter. #MAGA #TRUMP2020 #WWG1WGA","translator_type":"none","protected":false,"verified":false,"followers_count":59620,"friends_count":57299,"listed_count":55,"favourites_count":213014,"statuses_count":125563,"created_at":"Sun Feb 26 02:24:11 +0000 2017","utc_offset":null,"time_zone":null,"geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"000000","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"E81C4F","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"000000","profile_text_color":"000000","profile_use_background_image":false,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/852694196804780034\/Ylah4YQT_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/852694196804780034\/Ylah4YQT_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/835676806455832576\/1492132580","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"extended_tweet":{"full_text":"Stacey Abrams claimed that she won her election against Republican Gov. Brian Kemp, during an event yesterday.\n\n\u201cI did win my election,\u201d she said, according to ABC News reporter Adam Kelsey. 
\u201cI just didn\u2019t get to have the job.\u201d \n\nhttps:\/\/t.co\/HCyafbiNw2","display_text_range":[0,253],"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/HCyafbiNw2","expanded_url":"https:\/\/dailycaller.com\/2019\/03\/15\/stacey-abrams-georgia-election\/","display_url":"dailycaller.com\/2019\/03\/15\/sta\u2026","indices":[230,253]}],"user_mentions":[],"symbols":[]}},"quote_count":184,"reply_count":708,"retweet_count":411,"favorite_count":473,"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/swEmfwTo8S","expanded_url":"https:\/\/twitter.com\/i\/web\/status\/1106777688574849024","display_url":"twitter.com\/i\/web\/status\/1\u2026","indices":[116,139]}],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en"},"quoted_status_permalink":{"url":"https:\/\/t.co\/K7j2fKQy1g","expanded":"https:\/\/twitter.com\/NevadaJack2\/status\/1106777688574849024","display":"twitter.com\/NevadaJack2\/st\u2026"},"is_quote_status":true,"extended_tweet":{"full_text":"Hillary and her #nitwit adherents claim she won the 2016 election. \"Participation trophies\" don\'t mean a thing in the real world. What a screw up generation the Left has loosed on our nation.","display_text_range":[0,191],"entities":{"hashtags":[{"text":"nitwit","indices":[16,23]}],"urls":[],"user_mentions":[],"symbols":[]}},"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[{"text":"nitwit","indices":[16,23]}],"urls":[{"url":"https:\/\/t.co\/zUUSMWXqja","expanded_url":"https:\/\/twitter.com\/i\/web\/status\/1106950932569288707","display_url":"twitter.com\/i\/web\/status\/1\u2026","indices":[117,140]}],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"en","timestamp_ms":"1552752643135"}',
'{"created_at":"Sat Mar 16 16:10:43 +0000 2019","id":1106950932569288707,"id_str":"1106950932569288707","text":"Hillary and her #nitwit adherents claim she won the 2016 election. \"Participation trophies\" don\'t mean a thing in t\u2026 https:\/\/t.co\/zUUSMWXqja","source":"\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c\/a\u003e","truncated":true,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":864531280934907905,"id_str":"864531280934907905","name":"\ud83c\uddfa\ud83c\uddf8\ud83c\uddfa\ud83c\uddf8 SSDD \ud83c\uddfa\ud83c\uddf8\ud83c\uddfa\ud83c\uddf8","screen_name":"EdGullion","location":"Objective Reality","url":null,"description":"\ud83c\uddfa\ud83c\uddf8\ud83c\uddfa\ud83c\uddf8#Trump | #BuildThatWall | #AmericaFirst | \ud83d\udeabLibs | \ud83d\udeabDems | Block #Nitwits | http:\/\/Gab.com \ud83c\uddfa\ud83c\uddf8\ud83c\uddfa\ud83c\uddf8","translator_type":"none","protected":false,"verified":false,"followers_count":5173,"friends_count":5171,"listed_count":4,"favourites_count":18478,"statuses_count":41837,"created_at":"Tue May 16 17:21:34 +0000 
2017","utc_offset":null,"time_zone":null,"geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"000000","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"1B95E0","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"000000","profile_text_color":"000000","profile_use_background_image":false,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/959783845246615552\/nJBLKbFS_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/959783845246615552\/nJBLKbFS_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/864531280934907905\/1541463166","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"quoted_status_id":1106777688574849024,"quoted_status_id_str":"1106777688574849024","quoted_status":{"created_at":"Sat Mar 16 04:42:18 +0000 2019","id":1106777688574849024,"id_str":"1106777688574849024","text":"Stacey Abrams claimed that she won her election against Republican Gov. Brian Kemp, during an event yesterday.\n\n\u201cI\u2026 https:\/\/t.co\/swEmfwTo8S","source":"\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c\/a\u003e","truncated":true,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":835676806455832576,"id_str":"835676806455832576","name":"\ud83c\uddfa\ud83c\uddf8 Jack Ralph \ud83c\uddfa\ud83c\uddf8","screen_name":"NevadaJack2","location":"Carson City, NV","url":null,"description":"Vietnam Era #Veteran (USAF 61-65), married 57 years, father of 4, grand of 7, retired IT guy. 
YUUGE #Trump supporter. #MAGA #TRUMP2020 #WWG1WGA","translator_type":"none","protected":false,"verified":false,"followers_count":59620,"friends_count":57299,"listed_count":55,"favourites_count":213014,"statuses_count":125563,"created_at":"Sun Feb 26 02:24:11 +0000 2017","utc_offset":null,"time_zone":null,"geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"000000","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"E81C4F","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"000000","profile_text_color":"000000","profile_use_background_image":false,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/852694196804780034\/Ylah4YQT_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/852694196804780034\/Ylah4YQT_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/835676806455832576\/1492132580","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"extended_tweet":{"full_text":"Stacey Abrams claimed that she won her election against Republican Gov. Brian Kemp, during an event yesterday.\n\n\u201cI did win my election,\u201d she said, according to ABC News reporter Adam Kelsey. 
\u201cI just didn\u2019t get to have the job.\u201d \n\nhttps:\/\/t.co\/HCyafbiNw2","display_text_range":[0,253],"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/HCyafbiNw2","expanded_url":"https:\/\/dailycaller.com\/2019\/03\/15\/stacey-abrams-georgia-election\/","display_url":"dailycaller.com\/2019\/03\/15\/sta\u2026","indices":[230,253]}],"user_mentions":[],"symbols":[]}},"quote_count":184,"reply_count":708,"retweet_count":411,"favorite_count":473,"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/swEmfwTo8S","expanded_url":"https:\/\/twitter.com\/i\/web\/status\/1106777688574849024","display_url":"twitter.com\/i\/web\/status\/1\u2026","indices":[116,139]}],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en"},"quoted_status_permalink":{"url":"https:\/\/t.co\/K7j2fKQy1g","expanded":"https:\/\/twitter.com\/NevadaJack2\/status\/1106777688574849024","display":"twitter.com\/NevadaJack2\/st\u2026"},"is_quote_status":true,"extended_tweet":{"full_text":"Hillary and her #nitwit adherents claim she won the 2016 election. \"Participation trophies\" don\'t mean a thing in the real world. What a screw up generation the Left has loosed on our nation.","display_text_range":[0,191],"entities":{"hashtags":[{"text":"nitwit","indices":[16,23]}],"urls":[],"user_mentions":[],"symbols":[]}},"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[{"text":"nitwit","indices":[16,23]}],"urls":[{"url":"https:\/\/t.co\/zUUSMWXqja","expanded_url":"https:\/\/twitter.com\/i\/web\/status\/1106950932569288707","display_url":"twitter.com\/i\/web\/status\/1\u2026","indices":[117,140]}],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"en","timestamp_ms":"1552752643135"}']
How can I convert this to json format and store it in a json file?
You can use the json module:
import json
group_data = {
    "A": "A",
    "B": "B",
    "C": "C"
}
with open("out.json", "w", encoding="utf-8") as make_file:
    json.dump(group_data, make_file, ensure_ascii=False, indent="\t")
Use json.loads and json.dump:
import json
# 'data' is in the same format you posted
path = "myjson.json"
with open(path, "w") as f:
    json.dump([json.loads(d) for d in data], f)
or simply, since each element of data is already JSON text:
with open(path, "w") as f:
    f.write("[{}]".format(",".join(data)))
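As a self-contained illustration of the json.loads / json.dump approach, here is a small round-trip sketch. The two short strings in data and the temporary file path are stand-ins chosen for the example; the real list would hold the full tweet strings.

```python
import json
import os
import tempfile

# Stand-in for the list of JSON strings from the question (assumed shape)
data = ['{"id": 1, "text": "first"}', '{"id": 2, "text": "second"}']

# Parse each string so the file holds one JSON array of objects
records = [json.loads(d) for d in data]

path = os.path.join(tempfile.mkdtemp(), "tweets.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)

# Read the file back to confirm it is valid JSON
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)
```

Parsing first means a malformed string raises an error at write time instead of producing a file that cannot be read back.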

Get information from a xml imdb response with no tags

I'm working on a movie database and getting responses from IMDb. The response comes in XML format, but it has no child tags: all of the information is packed into the attributes of a single element. How can I extract each piece of data?
Here's how the response shows up:
<?xml version="1.0" encoding="UTF-8"?><root response="True">
<movie title="Batman" year="1989" rated="PG-13" released="23 Jun 1989" runtime="126 min" genre="Action, Adventure" director="Tim Burton" writer="Bob Kane (Batman characters), Sam Hamm (story), Sam Hamm (screenplay), Warren Skaaren (screenplay)" actors="Michael Keaton, Jack Nicholson, Kim Basinger, Robert Wuhl" plot="Gotham City. Crime boss Carl Grissom (Jack Palance) effectively runs the town but there's a new crime fighter in town - Batman (Michael Keaton). Grissom's right-hand man is Jack Napier (Jack Nicholson), a brutal man who is not entirely sane... After falling out between the two Grissom has Napier set up with the Police and Napier falls to his apparent death in a vat of chemicals. However, he soon reappears as The Joker and starts a reign of terror in Gotham City. Meanwhile, reporter Vicki Vale (Kim Basinger) is in the city to do an article on Batman. She soon starts a relationship with Batman's everyday persona, billionaire Bruce Wayne." language="English, French, Spanish" country="USA, UK" awards="Won 1 Oscar. Another 8 wins & 26 nominations." poster="https://m.media-amazon.com/images/M/MV5BMTYwNjAyODIyMF5BMl5BanBnXkFtZTYwNDMwMDk2._V1_SX300.jpg" metascore="69" imdbRating="7.6" imdbVotes="302,842" imdbID="tt0096895" type="movie" />
</root>
Here is my answer to your question
xmlRaw="""< ?xml
version = "1.0"
encoding = "UTF-8"? > < root
response = "True" >
< movie
title = "Batman"
year = "1989"
rated = "PG-13"
released = "23 Jun 1989"
runtime = "126 min"
genre = "Action, Adventure"
director = "Tim Burton"
writer = "Bob Kane (Batman characters), Sam Hamm (story), Sam Hamm (screenplay), Warren Skaaren (screenplay)"
actors = "Michael Keaton, Jack Nicholson, Kim Basinger, Robert Wuhl"
plot = "Gotham City. Crime boss Carl Grissom (Jack Palance) effectively runs the town but there's a new crime fighter in town - Batman (Michael Keaton). Grissom's right-hand man is Jack Napier (Jack Nicholson), a brutal man who is not entirely sane... After falling out between the two Grissom has Napier set up with the Police and Napier falls to his apparent death in a vat of chemicals. However, he soon reappears as The Joker and starts a reign of terror in Gotham City. Meanwhile, reporter Vicki Vale (Kim Basinger) is in the city to do an article on Batman. She soon starts a relationship with Batman's everyday persona, billionaire Bruce Wayne."
language = "English, French, Spanish"
country = "USA, UK"
awards = "Won 1 Oscar. Another 8 wins & 26 nominations."
poster = "https://m.media-amazon.com/images/M/MV5BMTYwNjAyODIyMF5BMl5BanBnXkFtZTYwNDMwMDk2._V1_SX300.jpg"
metascore = "69"
imdbRating = "7.6"
imdbVotes = "302,842"
imdbID = "tt0096895"
type = "movie" / >
< / root >"""
def getValue(xml, value):
    # Return the first line of xml that contains value, or None if no line does
    for line in xml.split('\n'):
        if value in line:
            return line
    return None

print(getValue(xmlRaw, 'title'))
print(getValue(xmlRaw, 'year'))
print(getValue(xmlRaw, 'rated'))
print(getValue(xmlRaw, 'released'))
# add more as you need the data
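Since the values live in attributes of the movie element, a standard XML parser can also read them directly. The sketch below uses xml.etree.ElementTree from the standard library on a shortened copy of the response. It assumes the response is well-formed XML, so the ampersand in the awards attribute is written as &amp; here; a raw & would make the parser raise an error.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed copy of the response (assumption: & is escaped)
xml_text = '''<?xml version="1.0" encoding="UTF-8"?><root response="True">
<movie title="Batman" year="1989" rated="PG-13" awards="Won 1 Oscar. Another 8 wins &amp; 26 nominations." imdbID="tt0096895" type="movie" />
</root>'''

root = ET.fromstring(xml_text.encode("utf-8"))
movie = root.find("movie")

# Element.attrib maps attribute names to their values
info = dict(movie.attrib)
print(info["title"], info["year"], info["rated"])
```

This returns the attribute values themselves rather than whole lines of text, and it does not depend on how the response is split across lines.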

for x in y loop capture instance value

If I have a string like the following:
output = '''
Certificate 1:
Valid from: Mon Jun 12 14:58:50 EDT 2017
Valid until: Wed Jun 12 15:28:50 EDT 2019
Certificate 2:
Valid from: Mon Jun 12 15:00:43 EDT 2017
Valid until: Wed Jun 12 15:30:43 EDT 2019
'''
I want to differentiate between the two values when I convert to unixtime. How do I tell it when it's Certificate 1 or Certificate 2?
This is what I have so far. It works for getting the two dates, but I don't know how to tell that the first result belongs to Certificate 1.
for line in output.splitlines():
    if 'Valid until' in line:
        environment = '???'
        valid_until_time = (line.split(':', 1)[1]).strip()[4:]
        valid_until_time = valid_until_time.replace(' EDT', '')
        unixtime = time.mktime(datetime.strptime(valid_until_time, '%b %d %H:%M:%S %Y').timetuple())
        send_to_zabbixsender(zabbix_executable,
                             zabbix_config,
                             item_key='{0}.expirydate'.format(environment),
                             item_value=unixtime)
Just store the current certificate as state when you encounter it:
import time
from datetime import datetime
output = '''
Certificate 1:
Valid from: Mon Jun 12 14:58:50 EDT 2017
Valid until: Wed Jun 12 15:28:50 EDT 2019
Certificate 2:
Valid from: Mon Jun 12 15:00:43 EDT 2017
Valid until: Wed Jun 12 15:30:43 EDT 2019
'''
current_certificate = 0
for line in output.splitlines():
    if line.startswith("Certificate"):
        current_certificate = int(line.split()[1].rstrip(":"))
    if 'Valid until' in line:
        environment = '???'
        valid_until_time = (line.split(':', 1)[1]).strip()[4:]
        valid_until_time = valid_until_time.replace(' EDT', '')
        unixtime = time.mktime(datetime.strptime(valid_until_time, '%b %d %H:%M:%S %Y').timetuple())
        print("{}: {}".format(current_certificate,unixtime))
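This standalone example prints the following (time.mktime interprets the parsed time in the local time zone, so the exact values depend on where it runs):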
1: 1560346130.0
2: 1560346243.0
There are many ways to solve this problem.
One way is simply to check whether the current line belongs to certificate 1 or 2:
for line in output.splitlines():
    if 'Certificate 1' in line: cert1Bool = True
    if 'Certificate 2' in line: cert1Bool = False
Then carry on with the rest of your code, checking cert1Bool as needed.
I would just keep track of the current certificate outside of the for loop.
e.g.:
certificate = ''
for line in output.splitlines():
    if 'Certificate' in line:
        # remember which certificate the following lines belong to
        certificate = line
    elif 'Valid until' in line:
        environment = certificate
        valid_until_time = (line.split(':', 1)[1]).strip()[4:]
        valid_until_time = valid_until_time.replace(' EDT', '')
        unixtime = time.mktime(datetime.strptime(valid_until_time, '%b %d %H:%M:%S %Y').timetuple())
        send_to_zabbixsender(zabbix_executable,
                             zabbix_config,
                             item_key='{0}.expirydate'.format(environment),
                             item_value=unixtime)
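Pulling the answers above together, here is a minimal self-contained sketch that remembers the current certificate and collects one expiry timestamp per certificate. The function name parse_expiry_times is hypothetical, and the sketch assumes every timestamp is in EDT (UTC-4); it attaches that offset explicitly so the result does not depend on the local time zone of the machine running it, unlike time.mktime.

```python
from datetime import datetime, timedelta, timezone

# Assumption: all timestamps in the output are EDT, i.e. UTC-4.
EDT = timezone(timedelta(hours=-4))


def parse_expiry_times(output):
    """Return {certificate_number: expiry as Unix time} (hypothetical helper)."""
    expiry = {}
    current_certificate = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith('Certificate'):
            # "Certificate 1:" -> 1
            current_certificate = int(line.split()[1].rstrip(':'))
        elif line.startswith('Valid until') and current_certificate is not None:
            # "Valid until: Wed Jun 12 15:28:50 EDT 2019" -> "Jun 12 15:28:50 2019"
            text = line.split(':', 1)[1].strip()[4:].replace(' EDT', '')
            parsed = datetime.strptime(text, '%b %d %H:%M:%S %Y')
            expiry[current_certificate] = parsed.replace(tzinfo=EDT).timestamp()
    return expiry


sample = '''
Certificate 1:
Valid from: Mon Jun 12 14:58:50 EDT 2017
Valid until: Wed Jun 12 15:28:50 EDT 2019
Certificate 2:
Valid from: Mon Jun 12 15:00:43 EDT 2017
Valid until: Wed Jun 12 15:30:43 EDT 2019
'''

for number, unixtime in sorted(parse_expiry_times(sample).items()):
    print('{}: {}'.format(number, unixtime))
```

Each value could then be passed to send_to_zabbixsender with an item key derived from the certificate number.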

python, django, sending data to a template

EDIT: Updated code. I'm trying to create a new nested dictionary from the original results.
However, the dictionary is currently not updating, it's only adding/editing the last value,
so my current context just has Greg in it and no one else.
My current code is below:
# Create your views here.
def index(request):
    ### Get all the Policies ###
    context = {}
    for objPolicy in objPolicyData['escalation_policies']:
        strPolicyName = objPolicy['name']
        if strPolicyName.lower().find('test') == -1:
            context['strPolicyName'] = strPolicyName
            obj = {}
            for objOnCall in objPolicy['on_call']:
                obj['strLevel'] = objOnCall['level']
                obj['strStartDate'] = getDate(objOnCall['start'])
                obj['strStartTime'] = getTime(objOnCall['start'])
                obj['strEndDate'] = getDate(objOnCall['end'])
                obj['strEndTime'] = getTime(objOnCall['end'])
                objUser = objOnCall['user']
                obj['strUsername'] = objUser['name']
                obj['strUserMobile'] = getUserMobile(objUser['id'])
            context['objUsers'] = obj
    return render(request, 'oncall/rota.html', context)
sample data would be
Network Policy
Level 1: John Smith
Start date: 27 April
Start time: 8am
end Date: 05 May
end time: 8am
Level 2: Bob Smith
Start date: 27 April
Start time: 8am
end Date: 05 May
end time: 8am
Server Policy
Level 1: Jane Doe
Start date: 23 April
Start time: 8am
end Date: 02 May
end time: 8am
Level 2: Greg Brad
Start date: 23 April
Start time: 8am
end Date: 02 May
end time: 8am
and so on...
Update:
@Alix, your current solution gives me the output below. I think I need nested lists, as the level 2 engineer gets posted twice instead of level 1 and level 2. The policy names are also missing for each one.
#!/usr/bin/python
# -*- coding: utf-8 -*-
{'policies': [{
'strStartTime': '09:00AM',
'strEndTime': '09:00AM',
'strLevel': 2,
'strUserMobile': u'01234 5678',
'strEndDate': 'Monday 02 May',
'strUsername': u'John Smith',
'strStartDate': 'Monday 25 April',
}, {
'strStartTime': '09:00AM',
'strEndTime': '09:00AM',
'strLevel': 2,
'strUserMobile': u'01234 5678',
'strEndDate': 'Monday 02 May',
'strUsername': u'John Smith',
'strStartDate': 'Monday 25 April',
}, {
'strStartTime': '09:00AM',
'strEndTime': '05:00PM',
'strLevel': 1,
'strUserMobile': u'011151588',
'strEndDate': 'Thursday 28 April',
'strUsername': u'Jane Doe',
'strStartDate': 'Thursday 28 April',
}, {
'strStartTime': '05:00PM',
'strEndTime': '03:30PM',
'strLevel': 1,
'strUserMobile': 'User does not have a company phone no',
'strEndDate': 'Thursday 28 April',
'strUsername': u'Fred Perry',
'strStartDate': 'Wednesday 27 April',
}, {
'strStartTime': '09:00AM',
'strEndTime': '07:00AM',
'strLevel': 1,
'strUserMobile': 'User does not have a company phone no',
'strEndDate': 'Tuesday 03 May',
'strUsername': u'Sally Cinomon',
'strStartDate': 'Monday 25 April',
}]}
Just expanding my comment above with how to use the data in template:
You can send your data within render:
return render(request, "oncall/rota.html", {"policies": objPolicyData['escalation_policies']})
Then, in your template file, you can do something like this:
{% for policy in policies %}
    {% for objOnCall in policy.on_call %}
        <p> Level: {{ objOnCall.level }} </p>
        <p> Start Time: {{ objOnCall.start }} </p>
    {% endfor %}
{% endfor %}
UPDATE
Regarding your last update to the question, you said:
however the dictionary is currently not updating, its only
adding/editing the last value
That is right, because you don't have a list that holds your policy objects. You only assign the last value in the loop to the dictionary, which is why you get only the last object.
This should do the work:
# Create your views here.
def index(request):
    ### Get all the Policies ###
    policies = []
    for objPolicy in objPolicyData['escalation_policies']:
        strPolicyName = objPolicy['name']
        if strPolicyName.lower().find('test') == -1:
            policy = {}
            policy['strPolicyName'] = strPolicyName # add policy name here
            policy['objUsers'] = [] # define an empty list for users
            for objOnCall in objPolicy['on_call']:
                obj = {} # new dict per on-call entry, so earlier entries are not overwritten
                obj['strLevel'] = objOnCall['level']
                obj['strStartDate'] = getDate(objOnCall['start'])
                obj['strStartTime'] = getTime(objOnCall['start'])
                obj['strEndDate'] = getDate(objOnCall['end'])
                obj['strEndTime'] = getTime(objOnCall['end'])
                objUser = objOnCall['user']
                obj['strUsername'] = objUser['name']
                obj['strUserMobile'] = getUserMobile(objUser['id'])
                policy['objUsers'].append(obj) # add each user to the users list of this policy
            policies.append(policy) # finally, append the prepared policy to the main policies list
    context = {"policies": policies}
    return render(request, 'oncall/rota.html', context)
Now you can do anything you want with this list inside a for loop in the template (see my example above, iterating over policy.objUsers instead of policy.on_call).
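To see the shape of the nested structure without running Django, here is a minimal sketch that builds the same kind of list as a plain function. The name build_policies and the sample data are hypothetical, and the getDate, getTime and getUserMobile helpers from the question are left out because their implementations are not shown; the raw start and end values are stored instead.

```python
def build_policies(policy_data):
    """Build a list of policies, each holding a list of on-call users.

    Hypothetical helper; it only assumes the 'escalation_policies',
    'name', 'on_call', 'level', 'start', 'end' and 'user' keys used
    in the question.
    """
    policies = []
    for policy in policy_data['escalation_policies']:
        if 'test' in policy['name'].lower():
            continue  # skip test policies, as the original view does
        users = []
        for on_call in policy['on_call']:
            # A new dict per entry, so earlier entries are not overwritten.
            users.append({
                'strLevel': on_call['level'],
                'strStart': on_call['start'],
                'strEnd': on_call['end'],
                'strUsername': on_call['user']['name'],
            })
        policies.append({'strPolicyName': policy['name'], 'objUsers': users})
    return policies


# Made-up sample data in the same shape as the question's input.
sample = {'escalation_policies': [
    {'name': 'Network Policy', 'on_call': [
        {'level': 1, 'start': '27 April 8am', 'end': '05 May 8am',
         'user': {'id': 'U1', 'name': 'John Smith'}},
        {'level': 2, 'start': '27 April 8am', 'end': '05 May 8am',
         'user': {'id': 'U2', 'name': 'Bob Smith'}},
    ]},
    {'name': 'Test Policy', 'on_call': []},
]}

policies = build_policies(sample)
print(policies[0]['strPolicyName'],
      [user['strUsername'] for user in policies[0]['objUsers']])
```

In the view, the result would be passed to render as {"policies": policies}, exactly as in the code above.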
I don't think this is a good question.
There are many kinds of solutions for your goal,
some of them in the Django documentation itself.
This is just one sample:
context = dict()
for objOnCall in objPolicy['on_call']:
    obj = dict()
    obj['strLevel'] = objOnCall['level']
    obj['strStartDate'] = getDate(objOnCall['start'])
    obj['strStartTime'] = getTime(objOnCall['start'])
    obj['strEndDate'] = getDate(objOnCall['end'])
    obj['strEndTime'] = getTime(objOnCall['end'])
    objUser = objOnCall['user'] # bind the user before reading its fields
    obj['objUser'] = objUser
    obj['strUsername'] = objUser['name']
    obj['strUserMobile'] = getUserMobile(objUser['id'])
    context[objUser['name']] = obj # one context entry per user name
return render(request, 'oncall/rota.html', context)

Organize by Twitter unique identifier using python

I have a CSV file with each line containing information pertaining to a particular tweet (i.e. each line contains Lat, Long, User_ID, tweet and so on). I need to read the file and organize the tweets by the User_ID. I am trying to end up with a given User_ID attached to all of the tweets with that specific ID.
Here is what I want:
user_id: 'lat', 'long', 'tweet'
: 'lat', 'long', 'tweet'
user_id2: 'lat', 'long', 'tweet'
: 'lat', 'long', 'tweet'
: 'lat', 'long', 'tweet'
and so on...
This is a snip of my code that reads in the CSV file and creates a list:
UID = []
myID = []
ID = []
f = None
with open(csv_in,'rU') as f:
    myreader = csv.reader(f, delimiter=',')
    for row in myreader:
        # Assign columns in csv to variables.
        latitude = row[0]
        longitude = row[1]
        user_id = row[2]
        user_name = row[3]
        date = row[4]
        time = row[5]
        tweet = row[6]
        flag = row[7]
        compound = row[8]
        Vote = row[9]
        # Read variables into separate lists.
        UID.append(user_id + ', ' + latitude + ', ' + longitude + ', ' + user_name + ', ' + date + ', ' + time + ', ' + tweet + ', ' + flag + ', ' + compound)
myID = ', '.join(UID)
ID = myID.split(', ')
I'd suggest you use pandas for this. It will allow you not only to list your tweets by user_id, as in your question, but also to do many other manipulations quite easily.
As an example, take a look at this Python notebook from NLTK. At the end of it, you can see an operation very close to yours: reading a CSV file containing tweets,
In [25]:
import pandas as pd

tweets = pd.read_csv('tweets.20150430-223406.tweet.csv', index_col=2, header=0, encoding="utf8")
You can also find a simple operation: looking for the tweets of a certain user,
In [26]:
tweets.loc[tweets['user.id'] == 557422508]['text']
Out[26]:
id
593891099548094465 VIDEO: Sturgeon on post-election deals http://...
593891101766918144 SNP leader faces audience questions http://t.c...
Name: text, dtype: object
For listing the tweets by user_id, you would simply do something like the following (this is not in the original notebook),
In [9]:
tweets.set_index('user.id')[0:4]
Out[9]:
created_at favorite_count in_reply_to_status_id in_reply_to_user_id retweet_count retweeted text truncated
user.id
107794703 Thu Apr 30 21:34:06 +0000 2015 0 NaN NaN 0 False RT #KirkKus: Indirect cost of the UK being in ... False
557422508 Thu Apr 30 21:34:06 +0000 2015 0 NaN NaN 0 False VIDEO: Sturgeon on post-election deals http://... False
3006692193 Thu Apr 30 21:34:06 +0000 2015 0 NaN NaN 0 False RT #LabourEoin: The economy was growing 3 time... False
455154030 Thu Apr 30 21:34:06 +0000 2015 0 NaN NaN 0 False RT #GregLauder: the UKIP east lothian candidat... False
Hope it helps.
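If pandas is not an option, a minimal sketch with only the standard library can produce the layout asked for in the question. It assumes the column order from the question's snippet (latitude, longitude and user ID in columns 0 to 2, the tweet text in column 6) and no header row; group_by_user and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def group_by_user(csv_file):
    """Map each user_id to a list of (latitude, longitude, tweet) tuples."""
    tweets_by_user = defaultdict(list)
    for row in csv.reader(csv_file, delimiter=','):
        latitude, longitude, user_id, tweet = row[0], row[1], row[2], row[6]
        tweets_by_user[user_id].append((latitude, longitude, tweet))
    return dict(tweets_by_user)


# Made-up rows in the question's column order; a real file would be
# opened with open(csv_in, newline='') instead of io.StringIO.
sample = io.StringIO(
    '40.1,-74.2,111,alice,2015-04-30,21:34:06,"first tweet, with comma",0,0.5,pos\n'
    '41.3,-73.9,222,bob,2015-04-30,21:35:10,second tweet,0,0.1,neg\n'
    '40.2,-74.3,111,alice,2015-04-30,21:36:42,third tweet,1,-0.2,neg\n'
)

grouped = group_by_user(sample)
for user_id, entries in grouped.items():
    print(user_id, entries)
```

Letting csv.reader split the rows, rather than joining and re-splitting on ', ', keeps tweets that contain commas intact.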
