I have some variables and I need to compare each of them and fill three lists according to the comparison: if var == 1, add a 1 to lista_a; if var == 2, add a 1 to lista_b; and so on. For example:
inx0=2 inx1=1 inx2=1 inx3=1 inx4=4 inx5=3 inx6=1 inx7=1 inx8=3 inx9=1
inx10=2 inx11=1 inx12=1 inx13=1 inx14=4 inx15=3 inx16=1 inx17=1 inx18=3 inx19=1
inx20=2 inx21=1 inx22=1 inx23=1 inx24=2 inx25=3 inx26=1 inx27=1 inx28=3 inx29=1
lista_a=[]
lista_b=[]
lista_c=[]
#this example is the comparison for the first variable inx0
#and the same for inx1, inx2, etc...
for k in range(1,30):
    if inx0==1:
        lista_a.append(1)
    elif inx0==2:
        lista_b.append(1)
    elif inx0==3:
        lista_c.append(1)
I need to get:
#lista_a = [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
#lista_b = [1,1,1]
#lista_c = [1]
Your inx* variables should almost certainly be a list to begin with:
inx = [2,1,1,1,4,3,1,1,3,1,2,1,1,1,4,3,1,1,3,1,2,1,1,1,2,3,1,1,3,1]
Then, to find out how many 2's it has:
inx.count(2)
If you must, you can build a new list out of that:
list_a = [1]*inx.count(1)
list_b = [1]*inx.count(2)
list_c = [1]*inx.count(3)
but it seems silly to keep a list of ones. Really the only data you need to keep is a single integer (the count), so why bother carrying around a list?
An alternate approach to get the lists of ones would be to use a defaultdict:
from collections import defaultdict
d = defaultdict(list)
for item in inx:
    d[item].append(1)
In this case, what you want as list_a can be accessed as d[1], list_b as d[2], etc.
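As a rough illustration of the defaultdict approach, here is a minimal sketch run on a shortened sample (only the first ten values of the data, chosen to keep the output small). The names list_a, list_b and list_c are just aliases for the dictionary entries.

```python
from collections import defaultdict

# Shortened sample: the first ten values only (an assumption for brevity)
inx = [2, 1, 1, 1, 4, 3, 1, 1, 3, 1]

d = defaultdict(list)
for item in inx:
    d[item].append(1)  # a missing key starts out as an empty list

list_a, list_b, list_c = d[1], d[2], d[3]
print(len(list_a), len(list_b), len(list_c))  # 6 1 2

# Reading a key that never occurred returns an empty list (and creates the key)
print(d[5])  # []
```

Note that merely reading a missing key inserts it into the dictionary, which may matter if you later iterate over the keys.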
Or, as stated in the comments, you could get the counts using a collections.Counter:
from collections import Counter #python2.7+
counts = Counter(inx)
list_a = [1]*counts[1]
list_b = [1]*counts[2]
...
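For reference, a small self-contained sketch of the Counter variant on the full list from above. It only shows the counts themselves; the printed numbers come from counting the values in that list by hand.

```python
from collections import Counter

inx = [2, 1, 1, 1, 4, 3, 1, 1, 3, 1,
       2, 1, 1, 1, 4, 3, 1, 1, 3, 1,
       2, 1, 1, 1, 2, 3, 1, 1, 3, 1]

counts = Counter(inx)
print(counts[1], counts[2], counts[3])  # 18 4 6

# A value that never occurs counts as 0 instead of raising KeyError
print(counts[7])  # 0

# Lists of ones, if they are really needed
list_a = [1] * counts[1]
```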
I have a variable that holds one list after another (a sequence of lists).
my code:
>>> text = File(txt) #creates text object from text name
>>> names = text.name_parser() #invokes parser method to extract names from text object
My name_parser() stores names into a list self.names=[]
example:
>>> variable = my_method(txt)
output:
>>> variable
>>> [jacob, david], [jacob, hailey], [judy, david], ...
I want to merge them into a single list while retaining the duplicate values
desired output:
>>> [jacob, david, jacob, hailey, judy, david, ...]
Here's a very simple approach to this.
variable = [['a','b','c'], ['d','e','f'], ['g','h','i']]
fileNames = ['one.txt','two.txt','three.txt']
file_map = {}  # renamed from "dict" to avoid shadowing the built-in
count = 0
for lset in variable:
    for letters in lset:
        file_map[letters] = fileNames[count]
    count += 1
print(file_map)
I hope this helps
#!/usr/bin/python3
# function to iterate through the nested lists of file names
def fun(a):
    for i in a:
        for ls in i:
            # "with" closes each file once it has been read
            with open(ls) as f:
                for x in f:
                    print(x)
variable ={ "a": "text.txt", "b": "text1.txt" , "c":"text2.txt" , "d": "text3.txt"}
myls = [variable["a"], variable["b"]], [variable["c"], variable["d"]]
fun(myls)
print("Execution Completed")
You can use the itertools module to turn your list of lists into a flat list:
import itertools
foo = list(itertools.chain.from_iterable(variable))
After that you can iterate over the new variable however you like.
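A quick sketch of the flattening step, using made-up sample data in place of the parser's real output (the strings below are placeholders taken from the question's example):

```python
from itertools import chain

# Hypothetical stand-in for what name_parser() returns
variable = [["jacob", "david"], ["jacob", "hailey"], ["judy", "david"]]

# chain.from_iterable walks each inner list in turn; duplicates are kept
flat = list(chain.from_iterable(variable))
print(flat)
# ['jacob', 'david', 'jacob', 'hailey', 'judy', 'david']
```

This flattens exactly one level of nesting, which matches the list-of-lists shape shown in the question.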
Well, if your variable is a list of lists, then you can try something like this:
file_dict = {}
for idx, files in enumerate(variable):
    # you can create some dictionary to bind indices to words
    # or use any library for this, I believe there are a few
    file_name = f'{idx+1}.txt'
    for file in files:
        file_dict[file] = [file_name]
I have the following list:
lines
['line_North_Mid', 'line_South_Mid',
'line_North_South', 'line_Mid_South',
'line_South_North','line_Mid_North' ]
I would like to couple them in a tuple list as follows, with respect to their names:
tuple_list
[('line_Mid_North', 'line_North_Mid'),
('line_North_South', 'line_South_North'),
('line_Mid_South', 'line_South_Mid')]
I thought maybe I could do a string search in the elements of lines, but it won't be efficient. Is there a better way to order the elements of lines so that the result looks like tuple_list?
Pairing criteria:
Two elements are coupled if both contain the same two area names (out of 'North', 'Mid', 'South').
E.g.: 'line_North_Mid' should be coupled with 'line_Mid_North'
Try this:
from itertools import combinations
tuple_list = [i for i in combinations(lines,2) if i[0].split('_')[1] == i[1].split('_')[2] and i[0].split('_')[2] == i[1].split('_')[1]]
or I think this is better:
[i for i in combinations(lines,2) if i[0].split('_')[1:] == i[1].split('_')[1:][::-1]]
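To make the second variant easier to follow, here is a self-contained sketch with the comparison pulled out into a helper. The helper name is_reverse_pair is made up for illustration. Since combinations visits every pair, this does a quadratic number of comparisons.

```python
from itertools import combinations

lines = ['line_North_Mid', 'line_South_Mid',
         'line_North_South', 'line_Mid_South',
         'line_South_North', 'line_Mid_North']

def is_reverse_pair(a, b):
    """True if the area names of a are those of b in reversed order."""
    return a.split('_')[1:] == b.split('_')[1:][::-1]

tuple_list = [(a, b) for a, b in combinations(lines, 2) if is_reverse_pair(a, b)]
print(tuple_list)
# [('line_North_Mid', 'line_Mid_North'),
#  ('line_South_Mid', 'line_Mid_South'),
#  ('line_North_South', 'line_South_North')]
```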
An order-agnostic O(n) solution is possible using collections.defaultdict. The idea is to use as our dictionary keys the last 2 components of your strings delimited by '_', appending values from your input list. Then extract values and convert to a list of tuples.
from collections import defaultdict
L = ['line_North_Mid', 'line_South_Mid',
'line_North_South', 'line_Mid_South',
'line_South_North', 'line_Mid_North']
dd = defaultdict(list)
for item in L:
    dd[frozenset(item.rsplit('_', maxsplit=2)[1:])].append(item)
res = list(map(tuple, dd.values()))
# [('line_North_Mid', 'line_Mid_North'),
# ('line_South_Mid', 'line_Mid_South'),
# ('line_North_South', 'line_South_North')]
You can use the following list comprehension:
lines = ['line_Mid_North', 'line_North_Mid',
'line_North_South', 'line_South_North',
'line_Mid_South', 'line_South_Mid']
[(j,i) for i in lines for j in lines if j not in i
if set(j.split('_')[1:]) < set(i.split('_'))][::2]
[('line_Mid_North', 'line_North_Mid'),
('line_North_South', 'line_South_North'),
('line_Mid_South', 'line_South_Mid')]
I suggest you write a function that returns the same key for strings that are supposed to be together (a grouping key).
def key(s):
    # ignore first part and sort other 2 parts, so they will always be in same order
    _, part_1, part_2 = s.split('_')
    return tuple(sorted([part_1, part_2]))
Then you have to use some grouping method; I used defaultdict for example:
import collections
lines = [
    'line_North_Mid', 'line_South_Mid',
    'line_North_South', 'line_Mid_South',
    'line_South_North','line_Mid_North',
]
dd = collections.defaultdict(list)
for s in lines:
    dd[key(s)].append(s) # those with same key get grouped
print(list(tuple(v) for v in dd.values()))
# [
# ('line_North_Mid', 'line_Mid_North'),
# ('line_South_Mid', 'line_Mid_South'),
# ('line_North_South', 'line_South_North'),
# ]
I have a dataframe with categorical variables. I want to convert them to numerical values using the following logic:
I have two lists: one contains the distinct categorical values in the column, and the second contains the value for each category. Now I need to map these values in place of the categorical values.
For example:
List_A = ['A','B','C','D','E']
List_B = [3,2,1,1,2]
I need to replace A with 3, B with 2, C and D with 1 and E with 2.
Is there any way to do this in Python?
I can do this by applying multiple for loops but I am looking for some easier way or some direct function if there is any.
Any help is very much appreciated, Thanks in Advance.
Create a mapping dict
List_A = ['A','B','C','D','E',]
List_B = [3,2,1,1,2]
d=dict(zip(List_A, List_B))
new_list=['A','B','C','D','E','A','B']
new_mapped_list=[d[v] for v in new_list if v in d]
new_mapped_list
Or define a function and use map
List_A = ['A','B','C','D','E',]
List_B = [3,2,1,1,2]
d=dict(zip(List_A, List_B))
def mapper(value):
    if value in d:
        return d[value]
    return None
new_list=['A','B','C','D','E','A','B']
list(map(mapper,new_list))  # in Python 3, map returns an iterator, so wrap it in list()
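As a side note, the lookup-with-fallback in mapper is what dict.get already does, so a shorter variant is possible. A minimal sketch, where 'Z' is a made-up value with no mapping and -1 is an arbitrary placeholder default:

```python
List_A = ['A', 'B', 'C', 'D', 'E']
List_B = [3, 2, 1, 1, 2]
d = dict(zip(List_A, List_B))

new_list = ['A', 'B', 'Z', 'E']  # 'Z' is hypothetical and has no mapping

# d.get returns None for unknown keys instead of raising KeyError
mapped = [d.get(v) for v in new_list]
print(mapped)  # [3, 2, None, 2]

# A second argument supplies a different fallback value
mapped_default = [d.get(v, -1) for v in new_list]
print(mapped_default)  # [3, 2, -1, 2]
```

Unlike the filtered list comprehension earlier, this keeps the output the same length as the input.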
Suppose df is your data frame and "Category" is the name of the column holding your categories:
df.loc[df.Category == "A", "Category"] = 3
df.loc[(df.Category == "B") | (df.Category == "E"), "Category"] = 2
df.loc[(df.Category == "C") | (df.Category == "D"), "Category"] = 1
If you only need to replace the values in one list with the values of the other, and the structure is as you describe (two lists of the same length, matched by position), then you only need this:
list_a = []
list_a = list_b
A more convoluted solution would be like this, with a function that will create a dictionary that you can use on other lists:
# we make a function
def convert_list(ls_a,ls_b):
    dic_new = {}
    for letter,number in zip(ls_a,ls_b):
        dic_new[letter] = number
    return dic_new
This will make a dictionary with the combinations you need. You pass in the two lists, then you can use that dictionary on other lists:
List_A = ['A','B','C','D','E']
List_B = [3,2,1,1,2]
dic_new = convert_list(List_A, List_B)
other_list = ['a','b','c','d']
for letter in other_list:
    print(dic_new[letter.upper()])
# prints
3
2
1
1
cheers
You could use an encoder from the scikit-learn machine learning module:
OneHotEncoder
LabelEncoder
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html
The pandas "hard" way:
https://stackoverflow.com/a/29330853/9799449
If a dictionary contains something to which you can hold a reference, you can default-or-update it with one dictionary lookup:
d.setdefault('k', []).append(2)
However, modifying dictionary entries in the same manner is not possible if they're numbers:
d.setdefault('k', 0) += 1 # doesn't work
Instead, you need to do two dict lookups, one for read and one for write:
d['a'] = d.get('a', 0) + 1
This doesn't seem like a great idea for dictionaries with a huge number of keys. So, is there a way to do a default-or-update operation on dictionaries containing numbers? Or, phrased another way, what's the most performant way to apply a default-or-update operation on such dictionaries?
A quick test suggests that collections.defaultdict is about 2.5 times faster than your double-lookup (tested on Python 2.6):
>>> import timeit
>>> s1 = "d = dict((str(n), 0) for n in range(1000000))"
>>> timeit.repeat("d['a'] = d.get('a', 0) + 1", setup=s1)
[0.17711305618286133, 0.17411494255065918, 0.17812514305114746]
>>> s2 = """
... from collections import defaultdict
... d = defaultdict(int, ((str(n), 0) for n in range(1000000)))
... """
>>> timeit.repeat("d['a'] += 1", setup=s2)
[0.07185506820678711, 0.07294416427612305, 0.12155508995056152]
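For comparison, a small sketch of the three counting styles side by side, on a made-up list of keys. It only shows that they produce the same result and makes no claim about relative speed.

```python
from collections import Counter, defaultdict

keys = ["a", "b", "a", "c", "a"]  # made-up sample data

# Plain dict: one lookup to read, one to write
plain = {}
for k in keys:
    plain[k] = plain.get(k, 0) + 1

# defaultdict(int): a missing key starts at int() == 0
dd = defaultdict(int)
for k in keys:
    dd[k] += 1

# Counter: a dict subclass built for tallying
counter = Counter(keys)

print(plain)  # {'a': 3, 'b': 1, 'c': 1}
print(plain == dict(dd) == dict(counter))  # True
```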
I want to append several variables to a list. The number of variables varies. All variables start with "volume". I was thinking maybe a wildcard or something would do it. But I couldn't find anything like this. Any ideas how to solve this? Note in this example it is three variables, but it could also be five or six or anything.
volumeA = 100
volumeB = 20
volumeC = 10
vol = []
vol.append(volume*)
You can use extend to append any iterable to a list:
vol.extend((volumeA, volumeB, volumeC))
Relying on the prefix of your variable names has a bad code smell to me, but you can do it. (The order in which values are appended is undefined.)
vol.extend(value for name, value in locals().items() if name.startswith('volume'))
If order is important (IMHO, still smells wrong):
vol.extend(value for name, value in sorted(locals().items(), key=lambda item: item[0]) if name.startswith('volume'))
Although you can do
vol = []
vol += [val for name, val in globals().items() if name.startswith('volume')]
# replace globals() with locals() if this is in a function
a much better approach would be to use a dictionary instead of similarly-named variables:
volume = {
'A': 100,
'B': 20,
'C': 10
}
vol = []
vol += volume.values()
Note that in the latter case the order of items is unspecified in Python versions before 3.7 (from 3.7 on, dicts preserve insertion order), so you could get [100,10,20] or [10,20,100]. To add items in key order, use:
vol += [volume[key] for key in sorted(volume)]
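A short runnable sketch of the dictionary approach. The keys are deliberately inserted out of alphabetical order here (an assumption made only to show the effect of sorting):

```python
# Keys inserted out of order on purpose
volume = {'C': 10, 'A': 100, 'B': 20}

vol = []
vol += [volume[key] for key in sorted(volume)]  # iterate keys alphabetically
print(vol)  # [100, 20, 10]

# Adding a fourth or fifth volume needs no new variable name
volume['D'] = 5
print([volume[key] for key in sorted(volume)])  # [100, 20, 10, 5]
```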
EDIT removed filter from list comprehension as it was highlighted that it was an appalling idea.
I've changed it so it's not too similar to all the other answers.
volumeA = 100
volumeB = 20
volumeC = 10
lst = map(lambda x : x[1], filter(lambda x : x[0].startswith('volume'), globals().items()))
print lst
Output
[100, 10, 20]
Do you want to add the variables' names as well as their values?
output=[]
output.append([(k,v) for k,v in globals().items() if k.startswith('volume')])
or just the values:
output.append([v for k,v in globals().items() if k.startswith('volume')])
If I understand the question correctly, you are trying to append the values of several variables to a list. See the example below.
Assuming:
email = 'example#gmail.com'
pwd='Mypwd'
values = []  # not named "list", which would shadow the built-in
values.append(email)
values.append(pwd)
for row in values:
    print(row)
# the output is :
#example#gmail.com
#Mypwd
Hope this helps, thank you.