I need to update a list in Python which looks like this:
data = [{' Customers ','null,blank '},{' CustomersName ','max=50,null,blank '},{' CustomersAddress ','max=150,blank '},{' CustomersActive ','Active '}]
I want to write a lambda expression that stores Customers, CustomersName, etc. in a list with the surrounding whitespace removed.
I am absolutely new to Python and don't have much knowledge of it yet!
As I see it, you have declared what looks like a dictionary inside a list, but the dict syntax is wrong: it should be {"key": "value"}. With curly braces and no colons, each element is actually a set. I assume you need to change each element to a list, like this:
data = [[' Customers ','null,blank '],[' CustomersName ','max=50,null,blank '],[' CustomersAddress ','max=150,blank '],[' CustomersActive ','Active ']]
Then the following would get you your desired result:
data_NameExtracted = [x[0].strip() for x in data]
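Since the question asks specifically for a lambda expression, here is a map()/lambda equivalent of that comprehension, a sketch using the list-of-lists form of the data:

```python
data = [[' Customers ', 'null,blank '],
        [' CustomersName ', 'max=50,null,blank '],
        [' CustomersAddress ', 'max=150,blank '],
        [' CustomersActive ', 'Active ']]

# map() applies the lambda to each inner list; strip() removes the
# surrounding whitespace from the first element.
data_NameExtracted = list(map(lambda pair: pair[0].strip(), data))
print(data_NameExtracted)
# ['Customers', 'CustomersName', 'CustomersAddress', 'CustomersActive']
```

The comprehension version is generally considered more idiomatic, but both produce the same list.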
You can't put this inside a lambda expression by itself, but you can use a comprehension (or the equivalent generator expression) like this:
# Please note that I have used tuples instead of sets,
# because sets are unordered
data = [
(' Customers ','null,blank '),
(' CustomersName ','max=50,null,blank '),
(' CustomersAddress ','max=150,blank '),
(' CustomersActive ','Active ')
]
# Indexing is not allowed for set objects
values = [item[0].strip() for item in data]
see:
https://wiki.python.org/moin/Generators
https://docs.python.org/3/tutorial/datastructures.html#sets
EDIT:
If you want to use dictionaries, you could use something like this:
data = [
{' Customers ': 'null,blank '},
{' CustomersName ': 'max=50,null,blank '},
{' CustomersAddress ': 'max=150,blank '},
{' CustomersActive ': 'Active '}
]
# Expecting a single value in each dict. Note that dict.values()
# is not subscriptable in Python 3, so unpack it with next(iter(...)).
values = [next(iter(item.values())).strip() for item in data]
The following is the GitHub link for Python's Pandas package.
https://github.com/pandas-dev/pandas
I would like to find the source code for a specific method (for instance, iterrows). What would be the file path for this?
Python, in general, is easily introspectable. You can use the inspect module if you want to do this programmatically. So, for example:
In [8]: import pandas as pd
In [9]: import inspect
In [10]: pd.DataFrame.iterrows
Out[10]: <function pandas.core.frame.DataFrame.iterrows(self)>
In [11]: inspect.getsourcefile(pd.DataFrame.iterrows)
Out[11]: '/Users/juan/anaconda3/envs/py38/lib/python3.8/site-packages/pandas/core/frame.py'
So you can go to pandas/core/frame.py. Note that this won't always work, e.g. if the method is written in C as an extension, but it should for Python source code. In fact, you can even get the source code lines using inspect.getsourcelines, which returns a tuple of (lines, line_number):
In [12]: inspect.getsourcelines(pd.DataFrame.iterrows)
Out[12]:
([' def iterrows(self):\n',
' """\n',
' Iterate over DataFrame rows as (index, Series) pairs.\n',
'\n',
' Yields\n',
' ------\n',
' index : label or tuple of label\n',
' The index of the row. A tuple for a `MultiIndex`.\n',
' data : Series\n',
' The data of the row as a Series.\n',
'\n',
' it : generator\n',
' A generator that iterates over the rows of the frame.\n',
'\n',
' See Also\n',
' --------\n',
' itertuples : Iterate over DataFrame rows as namedtuples of the values.\n',
' items : Iterate over (column name, Series) pairs.\n',
'\n',
' Notes\n',
' -----\n',
'\n',
' 1. Because ``iterrows`` returns a Series for each row,\n',
' it does **not** preserve dtypes across the rows (dtypes are\n',
' preserved across columns for DataFrames). For example,\n',
'\n',
" >>> df = pd.DataFrame([[1, 1.5]], columns=['int', 'float'])\n",
' >>> row = next(df.iterrows())[1]\n',
' >>> row\n',
' int 1.0\n',
' float 1.5\n',
' Name: 0, dtype: float64\n',
" >>> print(row['int'].dtype)\n",
' float64\n',
" >>> print(df['int'].dtype)\n",
' int64\n',
'\n',
' To preserve dtypes while iterating over the rows, it is better\n',
' to use :meth:`itertuples` which returns namedtuples of the values\n',
' and which is generally faster than ``iterrows``.\n',
'\n',
' 2. You should **never modify** something you are iterating over.\n',
' This is not guaranteed to work in all cases. Depending on the\n',
' data types, the iterator returns a copy and not a view, and writing\n',
' to it will have no effect.\n',
' """\n',
' columns = self.columns\n',
' klass = self._constructor_sliced\n',
' for k, v in zip(self.index, self.values):\n',
' s = klass(v, index=columns, name=k)\n',
' yield k, s\n'],
860)
Generally, you can also just print the function/method, look at the information in its string representation, and pretty much figure it out:
In [19]: pd.DataFrame.iterrows
Out[19]: <function pandas.core.frame.DataFrame.iterrows(self)>
So just from that you could see it is in pandas.core.frame.
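If you'd rather not import inspect at all, the same information lives on the function's code object. This is a minimal sketch with a stand-in function; the same attributes exist on any pure-Python method such as pd.DataFrame.iterrows:

```python
def sample():
    """Stand-in for any function defined in Python source."""
    return 42

# co_filename is the file the function was defined in, and
# co_firstlineno is the line its definition starts on; inspect's
# getsourcefile/getsourcelines are built on this same information.
print(sample.__code__.co_filename)
print(sample.__code__.co_firstlineno)
```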
This site and this one have a button with a (source) link. I usually just Google the method I need and add the word "source".
I am currently scraping into a database where I get minimum orders of product like: "3 Boxes", "1 Kilogram", "9 Cases".
I would like to eliminate all words accompanying numbers and get only the numbers.
My code to filter those exceptions is:
import pandas as pd
min_order = element.find_element_by_class_name('gallery-offer-minorder').find_element_by_tag_name('span').text.replace(' Pieces', '').replace(' Piece', '').replace(' Units', '').replace(
' Unit', '').replace(' Sets', '').replace(' Set', '').replace(' Pairs', '').replace(' Pair', '').replace('Boxes', '').replace('Box', '').replace('Bags', '').replace('Bag', '').replace('Carton', '').replace('Acre', '').replace('Kilograms', '').replace('Kilogram', '')
My code worked for all the cases I tried until I hit an exception I hadn't anticipated. I want to know whether there is a way to do this with fewer lines of code and eliminate all letters.
You can split the text and take only the first part, which is the number:
min_order = element.find_element_by_class_name('gallery-offer-minorder').find_element_by_tag_name('span').text.split(" ")[0]
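If the number and unit are not always separated by a single space, a regex-based version (a sketch on sample strings) strips every non-digit character instead of relying on a word list:

```python
import re

samples = ["3 Boxes", "1 Kilogram", "9 Cases", "10 Metric Tons"]
# \D matches any non-digit character, so everything except the
# numbers is removed, whatever the unit happens to be.
numbers = [re.sub(r'\D', '', s) for s in samples]
print(numbers)  # ['3', '1', '9', '10']
```

The same re.sub(r'\D', '', ...) call can be applied to the .text you get back from Selenium.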
[' D:\\2019-09-06_ECOSTRESS_L3_ET_PT-JPL_06627_012_20190906T075008_0601_02_ETinst.tif',
 ' D:\\2019-09-06_ECOSTRESS_L3_ET_PT-JPL_06627_012_20190906T075008_0601_02_ETinstUncertainty.tif',
 ' D:\\2019-09-06_ECOSTRESS_L3_ET_PT-JPL_06627_013_20190906T075100_0601_01_ETinst.tif',
 ' D:\\2019-09-06_ECOSTRESS_L3_ET_PT-JPL_06627_013_20190906T075100_0601_01_ETinstUncertainty.tif',
 ' D:\\2019-09-14_ECOSTRESS_L3_ET_PT-JPL_06749_010_20190914T043343_0601_02_ETinst.tif',
 ' D:\\2019-09-14_ECOSTRESS_L3_ET_PT-JPL_06749_010_20190914T043343_0601_02_ETinstUncertainty.tif',
 ' D:\\2019-10-29_ECOSTRESS_L3_ET_PT-JPL_07451_013_20191029T104129_0601_01_ETinst.tif',
 ' D:\\2019-10-29_ECOSTRESS_L3_ET_PT-JPL_07451_013_20191029T104129_0601_01_ETinstUncertainty.tif',
 ' D:\\2019-11-13_ECOSTRESS_L3_ET_PT-JPL_07680_016_20191113T050013_0601_01_ETinst.tif',
 ' D:\\2019-11-13_ECOSTRESS_L3_ET_PT-JPL_07680_016_20191113T050013_0601_01_ETinstUncertainty.tif',
 ' D:\\2019-11-13_ECOSTRESS_L3_ET_PT-JPL_07680_017_20191113T050105_0601_01_ETinst.tif',
 ' D:\\2019-11-13_ECOSTRESS_L3_ET_PT-JPL_07680_017_20191113T050105_0601_01_ETinstUncertainty.tif',
 ' D:\\2019-12-17_ECOSTRESS_L3_ET_PT-JPL_08207_001_20191217T034051_0601_01_ETinst.tif',
 ' D:\\2019-12-17_ECOSTRESS_L3_ET_PT-JPL_08207_001_20191217T034051_0601_01_ETinstUncertainty.tif']
I have a list of file paths in a python list. I want to find all files which belong to the same date e.g.4 files have names beginning with '2019-09-06'. How can I divide this list into smaller lists each of which has files with the same date in the beginning? I do not know the common dates so the solution should be able to find them dynamically.
You can use a dictionary, with the date as the key, to store the files that share a date:
from collections import defaultdict
dates = defaultdict(list)
for path in paths:
    # path[4:14] is the 'YYYY-MM-DD' prefix (the leading " D:\" is 4 chars)
    key, val = path[4:14], path[14:]
    dates[key].append(val)
import re
paths = [' D:\\2019-09-06_ECOSTRESS_L3_ET_PT-JPL_06627_012_20190906T075008_0601_02_ETinst.tif',
         ' D:\\2019-09-06_ECOSTRESS_L3_ET_PT-JPL_06627_012_20190906T075008_0601_02_ETinstUncertainty.tif',
         ' D:\\2019-09-06_ECOSTRESS_L3_ET_PT-JPL_06627_013_20190906T075100_0601_01_ETinst.tif',
         ' D:\\2019-09-06_ECOSTRESS_L3_ET_PT-JPL_06627_013_20190906T075100_0601_01_ETinstUncertainty.tif',
         ' D:\\2019-09-14_ECOSTRESS_L3_ET_PT-JPL_06749_010_20190914T043343_0601_02_ETinst.tif',
         ' D:\\2019-09-14_ECOSTRESS_L3_ET_PT-JPL_06749_010_20190914T043343_0601_02_ETinstUncertainty.tif',
         ' D:\\2019-10-29_ECOSTRESS_L3_ET_PT-JPL_07451_013_20191029T104129_0601_01_ETinst.tif',
         ' D:\\2019-10-29_ECOSTRESS_L3_ET_PT-JPL_07451_013_20191029T104129_0601_01_ETinstUncertainty.tif']
arrays = []
dates = input().split(" ")  # type the dates separated by spaces
for date in dates:
    array = []
    for path in paths:
        match = re.search(date, path)
        if match is not None:
            array.append(path)
    arrays.append(array)  # one list of matching paths per date
I used a regular expression to group these. You can type the dates you want into the input prompt. This gives you a list of lists, one per date.
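Since you said you don't know the dates in advance, the regex can also discover them for you. This sketch (with abbreviated example paths) extracts the leading YYYY-MM-DD from each path and groups the paths under it, so no manual input is needed:

```python
import re
from collections import defaultdict

paths = [' D:\\2019-09-06_A_ETinst.tif',
         ' D:\\2019-09-06_A_ETinstUncertainty.tif',
         ' D:\\2019-09-14_B_ETinst.tif']

groups = defaultdict(list)
for path in paths:
    # The first dashed date in the path is the file's date.
    match = re.search(r'\d{4}-\d{2}-\d{2}', path)
    if match:
        groups[match.group()].append(path)

# One sub-list per date, discovered dynamically.
arrays = list(groups.values())
print(sorted(groups))  # ['2019-09-06', '2019-09-14']
```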
I've massaged the data into the list structure…
[['keychain: "keychainname.keychain-db"',
'version: 512',
'class: 0x0000000F ',
'attributes:\n long string containing : and \n that needs to be split up into a list (by newline) of dictionaries (by regex and/or split() function) '],
['keychain: "keychainname.keychain-db"',
'version: 512',
'class: 0x0000000F ',
'attributes:\n long string that needs to be split up '],
['keychain: "keychainname.keychain-db"',
'version: 512',
'class: 0x0000000F ',
'attributes:\n long string that needs to be split up']]
I'm trying to use a comprehension to take each item in the list and split it into a dictionary with the format…
[{'keychain': 'keychainname.db',
  'version': '512',
  'class': '0x0000000F',
  'attributes': '\n long string containing : and \n that needs to be split up into a dictionary (by newline) of dictionaries (by regex and/or split() function) '}]
The following for loop seems to work…
newdata = []
for item in data:
    eachdict = {}
    for each in item:
        new = each.split(':', 1)
        eachdict[new[0]] = new[1]
    newdata.append(eachdict)
But my attempt at a comprehension does not…
newdata = [[{key:value for item in data} for line in item] for key, value in (line.split(':', 1))]
This comprehension runs, but it doesn't have the nesting done correctly…
newdata = [{key:value for item in data} for key, value in (item.split(':', 1),)]
I've just started learning comprehensions and I've been able to use them successfully to get the data into the above format of a nested list, but I'm struggling to understand the nesting when I'm going down three levels and switching from list to dictionary.
I'd appreciate some pointers on how to tackle the problem.
For bonus points, I'll need to split the long string inside the attributes key into a dictionary of dictionaries as well. I'd like to be able to reference the 'alis' key, the 'labl' key, and so on. I can probably figure that out on my own once I learn how to use nested comprehensions in the example above.
attributes:\n
"alis"<blob>="com.company.companyvpn.production.vpn.5D5AF9C525C25350E9CD5CEBED824BFD60E42110"\n
"cenc"<uint32>=0x00000003 \n
"ctyp"<uint32>=0x00000001 \n
"hpky"<blob>=0xB7262C7D5BCC976744F8CA6DE5A80B755622D434 "\\267&,}[\\314\\227gD\\370\\312m\\345\\250\\013uV"\\3244"\n
"issu"<blob>=0x306E3128302606035504030C1F4170706C6520436F72706F726174652056504E20436C69656E7420434120313120301E060355040B0C1743657274696669636174696F6E20417574686F7269747931133011060355040A0C0A4170706C6520496E632E310B3009060355040613025553 "0n1(0&\\006\\003U\\004\\003\\014\\037Company Corporate VPN Client CA 11 0\\036\\006\\003U\\004\\013\\014\\027Certification Authority1\\0230\\021\\006\\003U\\004\\012\\014\\012Company Inc.1\\0130\\011\\006\\003U\\004\\006\\023\\002US"\n
"labl"<blob>="com.company.companyvpn.production.vpn.5D5AF9C525C25350E9CD5CEBED824BFD60E42110"\n
"skid"<blob>=0xB7262C7D5BCC976744F8CA6DE5A80B755622D434 "\\267&,}[\\314\\227gD\\370\\312m\\345\\250\\013uV"\\3244"\n "snbr"<blob>=0x060A02F6F9880D69 "\\006\\012\\002\\366\\371\\210\\015i"\n
"subj"<blob>=0x3061315F305D06035504030C56636F6D2E6170706C652E6973742E64732E6170706C65636F6E6E656374322E70726F64756374696F6E2E76706E2E35443541463943353235433235333530453943443543454245443832344246443630453432313130 "0a1_0]\\006\\003U\\004\\003\\014Vcom.company.companyvpn.production.vpn.5D5AF9C525C25350E9CD5CEBED824BFD60E42110"'
For context…
I'm using the output of "security dump-keychain" on the Mac to make a nice Python data structure to find keys. The check_output of this command is a long string with some inconsistent formatting and embedded newlines that I need to clean up to get the data into a list of dictionaries of dictionaries.
For those interested in Mac admin topics, this is so we can remove items that save the Active Directory password when the AD password is reset so that the account doesn't get locked by, say, Outlook presenting the old password to Exchange over and over.
Here is one possible approach:
data = [['keychain: "keychainname.keychain-db"', 'version: 512', 'class: 0x0000000F ', 'attributes:\n long string containing : and \n that needs to be split up into a list (by newline) of dictionaries (by regex and/or split() function) '], ['keychain: "keychainname.keychain-db"', 'version: 512', 'class: 0x0000000F ', 'attributes:\n long string that needs to be split up '], ['keychain: "keychainname.keychain-db"', 'version: 512', 'class: 0x0000000F ', 'attributes:\n long string that needs to be split up']]
result = [dict([item.split(':', 1) for item in items]) for items in data]
>>> [{'keychain': ' "keychainname.keychain-db"', 'version': ' 512', 'class': ' 0x0000000F ', 'attributes': '\n long string containing : and \n that needs to be split up into a list (by newline) of dictionaries (by regex and/or split() function) '}, {'keychain': ' "keychainname.keychain-db"', 'version': ' 512', 'class': ' 0x0000000F ', 'attributes': '\n long string that needs to be split up '}, {'keychain': ' "keychainname.keychain-db"', 'version': ' 512', 'class': ' 0x0000000F ', 'attributes': '\n long string that needs to be split up'}]
The split breaks each individual string into a key, value pair.
The inner list comprehension loops through the key, value pairs of an individual record and dict() assembles them into a dictionary; the outer list comprehension loops through each record in the main list.
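For the bonus part, each line of the attributes blob has the shape "name"<type>=value, so a regex applied per line can build the nested dictionary. This is a sketch only; the attribute values below are shortened placeholders, not the real dump:

```python
import re

# Shortened stand-in for one record's 'attributes' value.
attributes = ('\n    "alis"<blob>="com.company.vpn.ID"'
              '\n    "cenc"<uint32>=0x00000003 '
              '\n    "labl"<blob>="com.company.vpn.ID"')

attrs = {}
for line in attributes.splitlines():
    # Capture the quoted name, the <type>, and everything after '='.
    m = re.match(r'"(\w+)"<(\w+)>=(.*)', line.strip())
    if m:
        name, vtype, value = m.groups()
        attrs[name] = {'type': vtype, 'value': value.strip()}

print(attrs['labl']['value'])  # "com.company.vpn.ID"
```

You can then reference attrs['alis'], attrs['labl'], and so on, and hang this dict off the 'attributes' key of each record.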
I need to clean a list of strings containing names. I need to remove titles and then things like 's, etc. The code below works OK, but I'd like to transform it into two list comprehensions. My attempts, like [name.replace(e, '') for name in names_ for e in replace], didn't work; I'm definitely missing something. I will appreciate your help!
names = ['Mrs Marple', 'Maj Gen Smith', "Tony Dobson's"]
replace = ['Mrs ', 'Maj ', 'Gen ']
names_new = []
for name in names:
    for e in replace:
        name = name.replace(e, '')
    names_new.append(name)

names_final = []
for name in names_new:
    if name.endswith("'s"):
        name = name[:-2]
        names_final.append(name)
    else:
        names_final.append(name)
print(names_final)
You can use re.sub() to do exactly what you want:
import re
names = ['Mrs Marple', 'Maj Gen Smith', "Tony Dobson's"]
replace = ['Mrs ', 'Maj ', 'Gen ']
names = [re.sub(r'(Mrs\s|Maj\s|Gen\s|\'s$)', '', x) for x in names]
print(names)
Output:
['Marple', 'Smith', 'Tony Dobson']
The problem is due to the name = name.replace(e, '') statement in the for loop. Since we can't use an assignment inside a comprehension, you used a bare name.replace(e, ''), but replace() is not in-place, because strings in Python are immutable.
The solution I have written is based on reduce; it replaces all occurrences of the elements of the replace sequence in turn.
from functools import reduce
names = ['Mrs Marple', 'Maj Gen Smith', "Tony Dobson's"]
replace = ['Mrs ','Maj ','Gen ']
result = [reduce(lambda str, e: str.replace(e, ''), replace, name) for name in names]
Here is the result
print(result)
['Marple', 'Smith', "Tony Dobson's"]
The solution by @chrisz works, but if the replace list is generated on the fly or is very long, hand-writing the regex pattern isn't practical. This reduce solution works in pretty much any scenario.
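That said, a pattern can also be built from an arbitrary replace list at runtime using re.escape, which neutralizes any regex metacharacters in the fragments, a sketch:

```python
import re

names = ['Mrs Marple', 'Maj Gen Smith', "Tony Dobson's"]
replace = ['Mrs ', 'Maj ', 'Gen ', "'s"]

# Escape each fragment so it is matched literally, then join the
# fragments with | into a single alternation pattern.
pattern = re.compile('|'.join(map(re.escape, replace)))
result = [pattern.sub('', name) for name in names]
print(result)  # ['Marple', 'Smith', 'Tony Dobson']
```

Note that including "'s" in the list here also handles the possessive, so a single comprehension covers both cleanup passes.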