I have a Python script that imports a CSV file, and based on the file imported, I have a list of the column indexes of the file.
I am trying to match the indexes in FILESTRUCT to the CSV file and then replace the data in those columns with newly generated data. Here is a code snippet:
This is just a parsed CSV file returned from my fileParser method:
PARSED = fileParser()
This is a list of CSV column positions:
FILESTRUCT = [6,7,8,9,47]
This is the script that is in question:
def deID(PARSED, FILESTRUCT):
    for item in PARSED:
        for idx, lis in enumerate(item):
            if idx == FILESTRUCT[0]:
                lis = dataGen.firstName()
            elif idx == FILESTRUCT[1]:
                lis = dataGen.lastName()
            elif idx == FILESTRUCT[2]:
                lis = dataGen.email()
            elif idx == FILESTRUCT[3]:
                lis = dataGen.empid()
            elif idx == FILESTRUCT[4]:
                lis = dataGen.ssnGen()
            else:
                continue
    return(PARSED)
I have verified that it is correctly matching the indices (idx) with the integers in FILESTRUCT by adding a print statement at the end of each if statement. That works perfectly.
The problem is that when I return(PARSED) it is not returning it with the new generated values, it is instead, returning the original PARSED input values. I assume that I am probably messing something up with how I use the enumerate method in my second loop, but I do not understand the enumerate method well enough to really know what I am messing up here.
You can use
item[idx] = dataGen.firstName()
to modify the underlying item. The reason is that enumerate() yields (index, value) tuples, so lis is just a local name bound to the value; assigning to lis rebinds that name and leaves the list you passed in untouched.
Given your example above you may not even need enumerate, because you never read lis at all. So you could also just do
for i in range(len(item)):
    # your if .. elif statements go here ...
    item[i] = dataGen.firstName()
On a side note, the elif statements in your code will become unwieldy once you start adding more conditions and columns. Maybe consider making FILESTRUCT a dictionary like:
FILESTRUCT = {
    6: dataGen.firstName,
    7: dataGen.lastName,
    ....
}
...
for idx in range(len(item)):
    if idx in FILESTRUCT:
        item[idx] = FILESTRUCT[idx]()
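To make the dictionary idea concrete, here is a minimal sketch written for Python 3. The fake_first_name and fake_last_name functions and the two-row sample are hypothetical stand-ins for dataGen and for whatever fileParser returns; only the in-place assignment row[idx] = ... is the point.

```python
import csv
import io

# Hypothetical stand-ins for the dataGen functions in the question.
def fake_first_name():
    return "Jane"

def fake_last_name():
    return "Doe"

# Column position -> function that generates the replacement value.
FILESTRUCT = {1: fake_first_name, 2: fake_last_name}

def de_id(rows, filestruct):
    """Overwrite the mapped columns of every row in place and return the rows."""
    for row in rows:
        for idx, generate in filestruct.items():
            if idx < len(row):  # skip rows that are too short
                row[idx] = generate()  # assign through the index, not a loop variable
    return rows

# Assumed sample data standing in for the parsed CSV file.
sample = io.StringIO("1,Alice,Smith,NY\n2,Bob,Jones,LA\n")
parsed = [row for row in csv.reader(sample)]
result = de_id(parsed, FILESTRUCT)
print(result)
```

Looping over the dictionary instead of over every column also means each row only touches the columns that actually need replacing.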
So PARSED is an iterable, and item is an element of it and is also an iterable, and you want to make changes to PARSED by changing elements of item.
So let's do a test.
a = [1, 2, 3]
print 'Before:'
print a
for i, e in enumerate(a):
    e += 10
print 'After:'
print a
for e in a:
    e += 10
print 'Again:'
print a
a[0] += 10
print 'Finally:'
print a
The results are:
Before:
[1, 2, 3]
After:
[1, 2, 3]
Again:
[1, 2, 3]
Finally:
[11, 2, 3]
So a is not changed by reassigning the loop variables; only assigning through the index (a[0] += 10) modifies the list.
You aren't returning a changed variable: nothing in the loop ever modifies PARSED. Instead, build a new list as you loop through the rows and return that new list.
You can't change the values in a loop like that. It is kind of like expecting this to return all x's:
def demo():
    demo_data = "A string with some words"
    for letter in demo_data:
        letter = "x"
    return demo_data
It won't; it will return "A string with some words".
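If the goal is a changed copy, one option is to build a new value rather than reassigning the loop variable. This is only a sketch for Python 3; the helper names mask and add_ten are made up for illustration.

```python
def mask(text):
    """Return a new string in which every character is replaced by 'x'."""
    return "".join("x" for _ in text)

def add_ten(values):
    """Return a new list; the input list is left unchanged."""
    return [v + 10 for v in values]

demo_data = "A string with some words"
masked = mask(demo_data)
print(masked)

numbers = [1, 2, 3]
bumped = add_ten(numbers)
print(bumped)
```

For strings this is the only route, since strings are immutable; for lists you can either build a new list like this or assign through the index.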
l = [1,2,3,4,5,'1','2','3','4','nag','nag','venkat',5,6,7]
l1 = []
for i in l:
    if (str(i) not in l1) and (i not in l1):
        l1.append(i)
print l1
I want to clean my list. My list contains numbers and strings. In the above list l I have both 1 and "1". I want to remove either 1 or "1". I want the output to be [1, 2, 3, 4, 5, "nag", "venkat", 6, 7].
Confirmed in IDLE that this provides the output you're looking for. Also, I updated the names of some of your variables to be a little easier to understand.
my_list = [1,2,3,4,5,'1','2','3','4','nag','nag','venkat',5,6,7]
output_list = []
for i in my_list:
    try:
        if (str(i) not in output_list) and (int(i) not in output_list):
            output_list.append(i)
    except ValueError:
        if i not in output_list:
            output_list.append(i)
print output_list
In Python it's common practice to use variables assuming that they're a certain type and just catch errors, instead of going through the process of checking the type (int, str, etc) on each one. Here, inside the try statement, I'm assuming the loop variable i is either an int or a str that contains only numbers. Provided that's the case, this section works fine.
However, we know that the list contains some strings of letters, and for those the int(i) call inside the try block will raise a ValueError. The except block catches that and, knowing that this error results from an attempt to cast a string of letters to an int, we can now safely assume that the loop variable i refers to a string of letters, which we then check against the output_list and append if needed. I hope that helps.
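As a possible variation (not the approach above, just a sketch for Python 3), you can normalise every item to a string key and remember the keys in a set. This assumes the list only holds ints and strings; the name dedupe_mixed is made up for illustration.

```python
def dedupe_mixed(items):
    """Keep the first occurrence of each value, treating 1 and '1' as equal."""
    seen = set()
    result = []
    for item in items:
        key = str(item)  # normalise so that 1 and "1" share one key
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result

my_list = [1, 2, 3, 4, 5, '1', '2', '3', '4', 'nag', 'nag', 'venkat', 5, 6, 7]
cleaned = dedupe_mixed(my_list)
print(cleaned)
```

The set lookup avoids rescanning the output list for every item, which starts to matter once the list gets long.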
There's a way with list comprehensions, you create a new list, but this example only works if you know what you want to remove:
l1 = [i for i in l if i != "1" if i != "2" if i != "3" if i != "4"]
#output
[1, 2, 3, 4, 5, 'nag', 'nag', 'venkat', 5, 6, 7]
or for example only removing the string "1" it would be
l1 = [i for i in l if i != "1"]
This could probably be wrapped in a function with a loop so that a single if condition removes all such elements. I'm not sure, though; I'd go with coralv's way anyway.
I am having trouble with list comprehension in Python
Basically I have code that looks like this
output = []
for i, num in enumerate(test):
    loss_ = do something
    test_ = do something else
    output.append(sum(loss_*test_)/float(sum(loss_)))
How can I write this using a list comprehension such as:
[sum(loss_*test_)/float(sum(loss_)) for i, num in enumerate(test)]
However, I don't know how to assign the values of loss_ and test_.
You can use a nested list comprehension to define those values:
output = [sum(loss_*test_)/float(sum(loss_))
          for loss_, test_ in ((do something, do something else)
                               for i, num in enumerate(test))]
Of course, whether that's any more readable is another question.
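Since the "do something" parts are not shown, here is a runnable sketch of the same nested pattern for Python 3 with invented placeholders. compute_loss, compute_test and weighted are hypothetical, and plain lists are used instead of arrays, so the element-wise product is written out with zip.

```python
# Hypothetical stand-ins for the "do something" steps in the question.
def compute_loss(num):
    return [num, num]

def compute_test(num):
    return [num, num + 2]

def weighted(loss_, test_):
    """Weighted average of test_ using loss_ as the weights."""
    return sum(l * t for l, t in zip(loss_, test_)) / float(sum(loss_))

test = [1, 3]

# The inner generator yields (loss_, test_) pairs; the outer comprehension unpacks them.
output = [weighted(loss_, test_)
          for loss_, test_ in ((compute_loss(num), compute_test(num))
                               for num in test)]
print(output)
```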
As Yaroslav mentioned in the comments, list comprehensions don't allow you to save a value into a variable directly.
However, they do allow you to call functions.
I've made a very basic example (because the sample you provided is too incomplete to test), but it should show how you can still execute code in a list comprehension.
def loss():
    print "loss"
    return 1
def test():
    print "test"
    return 5
output = [loss()*test() for i in range(10)]
print output
which in this case will result in the list [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
I hope this somehow shows how you could end up with the behaviour that you were looking for.
ip_list = string.split(" ")  # split the string into a list using a space separator
for i in range(len(ip_list)):  # len(ip_list) returns the number of items in the list: 4
    # range(4) resolves to 0, 1, 2, 3
    if i % 2 == 0:
        ip_list[i] += "-"  # if i is even, append a hyphen to the current IP string
    else:
        ip_list[i] += ","  # otherwise append a comma
print("".join(ip_list)[:-1])  # "".join(ip_list) joins the list back into a string;
                              # [:-1] trims the last character of the result (the extra comma)
I've seen a lot of variations of this question from things as simple as remove duplicates to finding and listing duplicates. Even trying to take bits and pieces of these examples does not get me my result.
My question is how am I able to check if my list has a duplicate entry? Even better, does my list have a non-zero duplicate?
I've had a few ideas -
#empty list
myList = [None] * 9
#all the elements in this list are None
#fill part of the list with some values
myList[0] = 1
myList[3] = 2
myList[4] = 2
myList[5] = 4
myList[7] = 3
#coming from C, I attempt to use a nested for loop
j = 0
k = 0
for j in range(len(myList)):
    for k in range(len(myList)):
        if myList[j] == myList[k]:
            print "found a duplicate!"
            return
If this worked, it would find the duplicate (None) in the list. Is there a way to ignore the None or 0 case? I do not care if two elements are 0.
Another solution I thought of was to turn the list into a set and compare the lengths of the set and the list to determine whether there is a duplicate, but running set(myList) not only removes duplicates, it reorders the elements as well. I could keep separate copies, but that seems redundant.
Try changing the actual comparison line to this (the j != k check stops each element from matching itself):
if j != k and myList[j] == myList[k] and myList[j] not in [None, 0]:
I'm not certain whether you are trying to ascertain that a duplicate exists, or to identify the items that are duplicated (if any). Here is a Counter-based solution for the latter:
# Python 2.7
from collections import Counter
#
# Rest of your code
#
counter = Counter(myList)
dupes = [key for (key, value) in counter.iteritems() if value > 1 and key]
print dupes
The Counter object will automatically count occurrences of each item in your iterable. The list comprehension that builds dupes filters out all items appearing only once, and also items whose boolean evaluation is False (this filters out both 0 and None).
If your purpose is only to identify that duplication has taken place (without enumerating which items were duplicated), you could use the same method and test dupes:
if dupes: print "Something in the list is duplicated"
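For Python 3, where iteritems() no longer exists, a roughly equivalent sketch could look like this. The sample list is assumed, and None and 0 are excluded explicitly rather than by truthiness.

```python
from collections import Counter

my_list = [1, None, None, 2, 2, 4, None, 3, None]

counts = Counter(my_list)
# Keep values seen more than once, skipping None and 0 explicitly.
dupes = [value for value, n in counts.items()
         if n > 1 and value is not None and value != 0]
print(dupes)

if dupes:
    print("Something in the list is duplicated")
```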
If you simply want to check whether the list contains duplicates, this function returns 'Duplicate' as soon as it finds an element that occurs more than once (and None otherwise).
my_list = [1, 2, 2, 3, 4]
def check_list(arg):
    for i in arg:
        if arg.count(i) > 1:
            return 'Duplicate'
print check_list(my_list) == 'Duplicate' # prints True
To remove dups and keep order while ignoring 0 and None (the not ele test skips every falsy value, so if other falsy values such as '' should be treated like ordinary values, you will need to test explicitly for None and 0):
print [ele for ind, ele in enumerate(lst) if ele not in lst[:ind] or not ele]
If you just want the first dup:
for ind, ele in enumerate(lst[:-1]):
    if ele in lst[ind+1:] and ele:
        print(ele)
        break
Or store seen in a set:
seen = set()
for ele in lst:
    if ele in seen:
        print(ele)
        break
    if ele:
        seen.add(ele)
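The set-based loop can also be wrapped in a function that returns the duplicate instead of printing it. This is a sketch for Python 3 that assumes the values are hashable; first_nonzero_duplicate is a made-up name.

```python
def first_nonzero_duplicate(values):
    """Return the first truthy value that repeats, or None if there is none."""
    seen = set()
    for value in values:
        if not value:  # ignore None, 0 and other falsy entries
            continue
        if value in seen:
            return value
        seen.add(value)
    return None

my_list = [1, None, None, 2, 2, 4, None, 3, None]
found = first_nonzero_duplicate(my_list)
print(found)
print(first_nonzero_duplicate([0, 0, None, 5]))
```

Because it stops at the first repeat, it does a single pass and never copies or reorders the original list.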
You can use collections.defaultdict and specify a condition, such as non-zero / Truthy, and specify a threshold. If the count for a particular value exceeds the threshold, the function will return that value. If no such value exists, the function returns False.
from collections import defaultdict
def check_duplicates(it, condition, thresh):
    dd = defaultdict(int)
    for value in it:
        dd[value] += 1
        if condition(value) and dd[value] > thresh:
            return value
    return False
L = [1, None, None, 2, 2, 4, None, 3, None]
res = check_duplicates(L, condition=bool, thresh=1) # 2
Note in the above example the function bool will not consider 0 or None for threshold breaches. You could also use, for example, lambda x: x != 1 to exclude values equal to 1.
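In my opinion, this is the simplest solution I could come up with, and it should work with any list. The only downside is that it does not count the duplicates; it just returns True or False. It compares every pair of elements using itertools.combinations:
from itertools import combinations
def has_duplicates(mylist):
    return any(k == j for k, j in combinations(mylist, 2))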
Here's a bit of code that shows how to remove None and 0 from the set. Using discard() instead of remove() avoids a KeyError when the value is not present.
l1 = [0, 1, 1, 2, 4, 7, None, None]
l2 = set(l1)
l2.discard(None)
l2.discard(0)
I am using Python 3.0 to write a program. In this program I deal a lot with lists which I haven't used very much in Python.
I am trying to write several if statements about these lists, and I would like to know how to look at just a specific value in the list. I also would like to be informed of how one would find the placement of a value in the list and input that in an if statement.
Here is some code to better explain that:
count = list.count(1)
if count > 1:
    (This is where I would like to have it look at where the 1 is that the count is finding)
Thank You!
Check out the documentation on sequence types and list methods.
To look at a specific element in the list you use its index:
>>> x = [4, 2, 1, 0, 1, 2]
>>> x[3]
0
To find the index of a specific value, use list.index():
>>> x.index(1)
2
More information about exactly what you are trying to do would help, but a list comprehension can get the indices of all the elements you are interested in, for example:
>>> [i for i, v in enumerate(x) if v == 1]
[2, 4]
You could then do something like this:
ones = [i for i, v in enumerate(your_list) if v == 1]
if len(ones) > 1:
    # each element in ones is an index in your_list where the value is 1
    pass
Also, naming a variable list is a bad idea because it conflicts with the built-in list type.
edit: In your example you use your_list.count(1) > 1, this will only be true if there are two or more occurrences of 1 in the list. If you just want to see if 1 is in the list you should use 1 in your_list instead of using list.count().
You can use list.index() to find elements in the list besides the first one, but you would need to take a slice of the list starting from one element after the previous match, for example:
your_list = [4, 2, 1, 0, 1, 2]
i = -1
while True:
    try:
        i = your_list[i+1:].index(1) + i + 1
        print("Found 1 at index", i)
    except ValueError:
        break
This should give the following output:
Found 1 at index 2
Found 1 at index 4
First off, I would strongly suggest reading through a beginner’s tutorial on lists and other data structures in Python: I would recommend starting with Chapter 3 of Dive Into Python, which goes through the native data structures in a good amount of detail.
To find the position of an item in a list, you have two main options, both using the index method. First off, checking beforehand:
numbers = [2, 3, 17, 1, 42]
if 1 in numbers:
    index = numbers.index(1)
    # Do something interesting
Your other option is to catch the ValueError thrown by index:
numbers = [2, 3, 17, 1, 42]
try:
    index = numbers.index(1)
except ValueError:
    # The number isn't here
    pass
else:
    # Do something interesting
    pass
One word of caution: avoid naming your lists list: quite aside from not being very informative, it’ll shadow Python’s native definition of list as a type, and probably cause you some very painful headaches later on.
You can find out in which index is the element like this:
idx = lst.index(1)
And then access the element like this:
e = lst[idx]
If what you want is the next element:
n = lst[idx+1]
Now, you have to be careful: what happens if the element is not in the list? One way to handle that case would be:
try:
    idx = lst.index(1)
    n = lst[idx+1]
except ValueError:
    # do something if the element is not in the list
    pass
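One more edge case worth a sketch: if the element is the last item, lst[idx+1] raises IndexError rather than ValueError. A small helper for Python 3 (next_after is a hypothetical name) could cover both cases.

```python
def next_after(lst, target, default=None):
    """Return the element following the first `target`, or `default`."""
    try:
        idx = lst.index(target)  # raises ValueError if target is absent
        return lst[idx + 1]      # raises IndexError if target is the last item
    except (ValueError, IndexError):
        return default

print(next_after([4, 2, 1, 0], 1))
print(next_after([4, 2, 1], 1))
print(next_after([4, 2], 1))
```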
list.index(x)
Return the index in the list of the first item whose value is x. It is an error if there is no such item.
--
In the docs you can find some more useful functions on lists: http://docs.python.org/tutorial/datastructures.html#more-on-lists
--
Added suggestion after your comment: Perhaps this is more helpful:
for idx, value in enumerate(your_list):
    # `idx` will contain the index of the item and `value` will contain the value at index `idx`
    pass
I am iterating over a list and I want to print out the index of the item if it meets a certain condition. How would I do this?
Example:
testlist = [1,2,3,5,3,1,2,1,6]
for item in testlist:
    if item == 1:
        print position
Hmmm. There was an answer with a list comprehension here, but it's disappeared.
Here:
[i for i,x in enumerate(testlist) if x == 1]
Example:
>>> testlist
[1, 2, 3, 5, 3, 1, 2, 1, 6]
>>> [i for i,x in enumerate(testlist) if x == 1]
[0, 5, 7]
Update:
Okay, you want a generator expression, we'll have a generator expression. Here's the list comprehension again, in a for loop:
>>> for i in [i for i,x in enumerate(testlist) if x == 1]:
...     print i
...
0
5
7
Now we'll construct a generator...
>>> (i for i,x in enumerate(testlist) if x == 1)
<generator object at 0x6b508>
>>> for i in (i for i,x in enumerate(testlist) if x == 1):
...     print i
...
0
5
7
and niftily enough, we can assign that to a variable, and use it from there...
>>> gen = (i for i,x in enumerate(testlist) if x == 1)
>>> for i in gen: print i
...
0
5
7
And to think I used to write FORTRAN.
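Two properties of the generator version may be worth a quick sketch (Python 3 syntax): it is lazy, so next() can stop at the first match, and it can only be consumed once.

```python
testlist = [1, 2, 3, 5, 3, 1, 2, 1, 6]

# next() pulls items only until the first match; the second argument is the
# fallback returned when nothing matches.
first = next((i for i, x in enumerate(testlist) if x == 1), None)
missing = next((i for i, x in enumerate(testlist) if x == 42), None)
print(first, missing)

# A generator is exhausted after one pass.
gen = (i for i, x in enumerate(testlist) if x == 1)
first_pass = list(gen)
second_pass = list(gen)
print(first_pass, second_pass)
```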
What about the following?
print testlist.index(element)
If you are not sure whether the element to look for is actually in the list, you can add a preliminary check, like
if element in testlist:
    print testlist.index(element)
or
print(testlist.index(element) if element in testlist else None)
or the "pythonic way", which I don't like so much because the code is less clear, but which is sometimes more efficient,
try:
    print testlist.index(element)
except ValueError:
    pass
Use enumerate:
testlist = [1,2,3,5,3,1,2,1,6]
for position, item in enumerate(testlist):
    if item == 1:
        print position
for i in xrange(len(testlist)):
    if testlist[i] == 1:
        print i
xrange instead of range as requested (see comments).
Here is another way to do this:
try:
    idx = testlist.index(1)
    print idx
except ValueError:
    print "Not Found"
Try the below:
testlist = [1,2,3,5,3,1,2,1,6]
position=0
for i in testlist:
    if i == 1:
        print(position)
    position=position+1
[x for x in range(len(testlist)) if testlist[x]==1]
If your list is large and you expect the value at only a few indices, this code could run much faster, because list.index() does the scanning internally and the Python-level loop body only runs once per match.
lookingFor = 1
i = 0
index = 0
try:
    while i < len(testlist):
        index = testlist.index(lookingFor,i)
        i = index + 1
        print index
except ValueError: #testlist.index() cannot find lookingFor
    pass
If you expect to find the value a lot you should probably just append "index" to a list and print the list at the end to save time per iteration.
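A sketch of that "append to a list" variant for Python 3, using the optional start argument of list.index(); the function name indices_of is made up for illustration.

```python
def indices_of(values, target):
    """Collect every index of `target` by restarting list.index after each hit."""
    found = []
    start = 0
    while True:
        try:
            start = values.index(target, start)
        except ValueError:  # no further occurrence
            return found
        found.append(start)
        start += 1

testlist = [1, 2, 3, 5, 3, 1, 2, 1, 6]
positions = indices_of(testlist, 1)
print(positions)
```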
I think it might be useful to use the curselection() method from the Tkinter library:
from Tkinter import *
listbox.curselection()
This method works on Tkinter listbox widgets, so you'll need to construct one of those instead of a list.
It returns the selected position like this:
('0',) (although later versions of Tkinter may return a list of ints instead)
That is the first position; the number changes according to the item's position.
For more information, see this page:
http://effbot.org/tkinterbook/listbox.htm
Greetings.
Why complicate things?
testlist = [1,2,3,5,3,1,2,1,6]
for position, item in enumerate(testlist):
    if item == 1:
        print position
Here is a complete example: input_list holds series1 (input_list[0]), in which you want to look up each element of series2 (input_list[1]) and get its index if it exists in series1.
Note: if your condition is simple, it can go in the lambda expression.
input_list = [[1,2,3,4,5,6,7],[1,3,7]]
series1 = input_list[0]
series2 = input_list[1]
idx_list = list(map(lambda item: series1.index(item) if item in series1 else None, series2))
print(idx_list)
output:
[0, 2, 6]
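If series1 is long, each series1.index(item) call rescans it from the start. A possible alternative, sketched for Python 3, is to build a value-to-index dictionary once; the extra 9 in series2 is added here only to show the None case.

```python
series1 = [1, 2, 3, 4, 5, 6, 7]
series2 = [1, 3, 7, 9]

# Build value -> first index once, so each lookup is a dictionary access
# rather than a fresh scan of series1.
position = {}
for i, value in enumerate(series1):
    position.setdefault(value, i)  # keep the first index if a value repeats

idx_list = [position.get(item) for item in series2]
print(idx_list)
```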
l = list(map(int,input().split(",")))
num = int(input())
for i in range(len(l)):
    if l[i] == num:
        print(i)
Explanation:
Line 1 reads a list of integers "l" (separated by commas).
Line 2 reads an integer "num".
The for loop on line 3 traverses the list and, whenever an element equals num, prints the index of that element.
testlist = [1,2,3,5,3,1,2,1,6]
num = 1
for item in range(len(testlist)):
    if testlist[item] == num:
        print(item)
testlist = [1,2,3,5,3,1,2,1,6]
for idx, value in enumerate(testlist):
    if value == 1:
        print idx
I guess that's exactly what you want. ;-)
'idx' will always be the index of the current value in the list.