All,
I've recently picked up Python and am currently learning to work with lists. I'm using a test file containing several lines of tab-separated characters and passing it to my Python program.
The aim of my Python script is to insert each line into a list using its length as the index, which I expected would keep the list sorted automatically. I am considering the most basic case and am not concerned about any complex cases.
My Python code is below:
import sys

newList = []
for line in sys.stdin:
    data = line.strip().split('\t')
    size = len(data)
    newList.insert(size, data)
for i in range(len(newList)):
    print(newList[i])
My 'test' file below;
2 2 2 2
1
3 2
2 3 3 3 3
3 3 3
My expectation of the output of the python script is to print the contents of the list in the following order sorted by length;
['1']
['3', '2']
['3', '3', '3']
['2', '2', '2', '2']
['2', '3', '3', '3', '3']
However, when I pass in my test file to my python script, I get the following;
cat test | ./listSort.py
['2', '2', '2', '2']
['1']
['3', '2']
['3', '3', '3']
['2', '3', '3', '3', '3']
The first line of the output ['2', '2', '2', '2'] is incorrect. I'm trying to figure out why it isn't being printed at the 4th line (because of length 4 which would mean that it would have been inserted into the 4th index of the list). Could someone please provide some insight into why this is? My understanding is that I am inserting each 'data' into the list using 'size' as the index which means when I print out the contents of the list, they would be printed in sorted order.
Thanks in advance!
Inserting into lists works quite differently from what you think:
>>> newList = []
>>> newList.insert(4, 4)
>>> newList
[4]
>>> newList.insert(1, 1)
>>> newList
[4, 1]
>>> newList.insert(2, 2)
>>> newList
[4, 1, 2]
>>> newList.insert(5, 5)
>>> newList
[4, 1, 2, 5]
>>> newList.insert(3, 3)
>>> newList
[4, 1, 2, 3, 5]
>>> newList.insert(0, 0)
>>> newList
[0, 4, 1, 2, 3, 5]
Hopefully you can see two things from this example:
The list indices are 0-based. That is to say, the first entry has index 0, the second has index 1, etc.
list.insert(idx, val) inserts val at the position that currently has index idx, and shifts the item at that position and everything after it one position to the right. If idx is larger than the current length of the list, the new item is silently added in the last position.
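Applied to the test file from the question, the same rule explains the output. A minimal sketch in Python 3, with the five rows hard-coded instead of read from stdin (the name rows is only for illustration):

```python
# Rows from the question's test file, in file order (hard-coded for illustration).
rows = [
    ['2', '2', '2', '2'],
    ['1'],
    ['3', '2'],
    ['2', '3', '3', '3', '3'],
    ['3', '3', '3'],
]

newList = []
for data in rows:
    # insert() treats an index past the end as len(newList),
    # so the very first row lands at position 0, not position 4.
    newList.insert(len(data), data)

for row in newList:
    print(row)
```

The first row is inserted into an empty list, so it ends up at index 0, and none of the later calls inserts anything in front of it.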
There are several ways to implement the functionality you want:
If you know the maximum line length in advance (5 in your test file), you can allocate the list beforehand, and simply assign to the elements of the list instead of inserting:
newList = [None] * 5
for line in sys.stdin:
    data = line.strip().split('\t')
    size = len(data)
    newList[size - 1] = data
for i in range(len(newList)):
    print(newList[i])
If you only know a reasonable upper bound of the line length, you can also do this, but you need to have some way to remove the None entries afterwards.
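One possible way to drop the unused slots is a list comprehension that keeps only the entries that were assigned. This is a sketch in Python 3; the pre-filled newList stands in for the result of the loop above with an assumed upper bound of 8 slots, and filled is a name introduced here:

```python
# Hypothetical state of the pre-allocated list after reading the test file
# with an upper bound of 8 slots: the last three were never assigned.
newList = [
    ['1'],
    ['3', '2'],
    ['3', '3', '3'],
    ['2', '2', '2', '2'],
    ['2', '3', '3', '3', '3'],
    None,
    None,
    None,
]

# Keep only the slots that actually hold a row.
filled = [row for row in newList if row is not None]

for row in filled:
    print(row)
```

Comparing with `is not None` rather than relying on truthiness matters here, because an empty row `[]` would otherwise be dropped as well.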
Use a dictionary:
newList = {}
for line in sys.stdin:
    data = line.strip().split('\t')
    size = len(data)
    newList[size - 1] = data
for i in range(len(newList)):
    print(newList[i])
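The loop over range(len(newList)) above relies on the line lengths forming a run 1, 2, ..., n without gaps. A hedged variant in Python 3 that iterates over the sorted keys instead is sketched below; rows is hard-coded illustrative data, by_length and ordered are names introduced here, and rows of equal length still overwrite each other:

```python
# Illustrative input with a gap: there are no rows of length 2 or 3.
rows = [
    ['2', '2', '2', '2'],
    ['1'],
    ['2', '3', '3', '3', '3'],
]

by_length = {}
for data in rows:
    # Use the length itself as the key; a later row of the same length
    # replaces an earlier one.
    by_length[len(data)] = data

# Iterating over the sorted keys works even when some lengths are missing.
ordered = [by_length[size] for size in sorted(by_length)]

for row in ordered:
    print(row)
```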
Add elements to the list as necessary, which is probably a little bit more involved:
newList = []
for line in sys.stdin:
    data = line.strip().split('\t')
    size = len(data)
    if len(newList) < size:
        newList.extend([None] * (size - len(newList)))
    newList[size - 1] = data
for i in range(len(newList)):
    print(newList[i])
I believe I've figured out the answer to my question, thanks to mkrieger1. I append to the list and then sort it using the length as the key;
import sys

newList = []
for line in sys.stdin:
    data = line.strip().split('\t')
    newList.append(data)
newList.sort(key=len)
for i in range(len(newList)):
    print(newList[i])
I got the output I wanted;
./listSort.py < test
['1']
['3', '2']
['3', '3', '3']
['2', '2', '2', '2']
['2', '3', '3', '3', '3']
Related
I want to generate a nested 2 level list from the input numbers. The end of the line is 'enter'.
a = [[i for i in input().split()] for i in input().split (sep = '\ n')]
In this case, this takes only the second line.
For example:
1 2 3
4 5 6
7 8 9
It will output like this:
[['4', '5', '6']]
I want to get the final result like this:
[['1', '2', '3'], ['4', '5', '6'], ['7', '8', '9']]
Please help me find the mistake. Thanks.
One way to do it, assuming the whole input is already in a string called data, would be:
[x.split() for x in data.splitlines()]
Or if you want the items to be ints:
[[int(n) for n in line.split()] for line in data.splitlines()]
Code:
a = [[j for j in i.split()] for i in input().split(sep = '\n')]
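You want the inner list to iterate over the elements of the outer list, instead of calling input() a second time.
Besides, remove the extra space in '\ n'. Note that a single input() call reads only one line, so this yields one row per call.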
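Since input() returns a single line without its trailing newline, one hedged way to get all rows at once is to read the whole text first and then split it into lines. A minimal sketch in Python 3; parse_rows and sample are names introduced here, and the sample string stands in for the real input:

```python
def parse_rows(text):
    """Split multi-line text into rows of whitespace-separated tokens."""
    return [line.split() for line in text.splitlines()]


# Stand-in for the three lines typed by the user.
sample = "1 2 3\n4 5 6\n7 8 9"

a = parse_rows(sample)
print(a)  # [['1', '2', '3'], ['4', '5', '6'], ['7', '8', '9']]
```

With real input, passing sys.stdin.read() (after import sys) instead of sample reads everything up to end of input.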
I have this code:
list1 = input()
list2 = input()
unique = list(set(list1).intersection(list2))
print(len(unique))
and I want to find the unique numbers that occur in both lists.
However, when I enter the lists [1,2,3,4,5,6] and [6,5,4,3,2,1] it returns 7, instead of 6.
When I edit my code to:
list1 = [1,2,3,4,5,6]
list2 = [6,5,4,3,2,1]
unique = list(set(list1).intersection(list2))
print(len(unique))
It outputs 6 correctly. What is going on in my user input code?
Because input returns a string, and by constructing a set from a string you get a set of its characters:
list1 = '1,2,3,4,5,6'
print(set(list1))
# {',', '1', '2', '3', '4', '5', '6'}
list2 = '6,5,4,3,2,1'
set(list2)
# {',', '1', '2', '3', '4', '5', '6'}
The commas are included, resulting in:
list(set(list1).intersection(list2))
# [',', '1', '6', '5', '3', '4', '2']
You are not converting your second input to a set.
I'd write it as follows:
set1 = set(input())
set2 = set(input())
unique = list(set1.intersection(set2))
print(len(unique))
input returns a string, and you have to cast or better parse it first.
One option is json, e.g.
import json
def parse():
    return json.loads(input())
list1 = parse() #"[1,2,3,4,5,6]" parsed to [1,2,3,4,5,6]
list2 = parse() #"[6,5,4,3,2,1]" parsed to [6,5,4,3,2,1]
unique = list(set(list1).intersection(set(list2)))
print(len(unique))
Obviously you could use eval(input()), but that is not safe, as it allows executing arbitrary code. Another possible parsing function that does not require brackets could be
def parse():
    ip = input()
    return [int(element) for element in ip.split(',')]
list1 = parse() #"1,2,3,4,5,6" parsed to [1,2,3,4,5,6]
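If the input should keep its Python list syntax, the standard library's ast.literal_eval is a commonly used alternative to eval, since it only accepts literals and does not execute code. A sketch in Python 3, assuming the text is passed in as an argument (in the original program it would come from input()); the type check is an addition made here:

```python
import ast


def parse(text):
    # literal_eval only evaluates Python literals (lists, numbers, strings, ...)
    # and raises an error for anything else, such as function calls.
    value = ast.literal_eval(text)
    if not isinstance(value, list):
        raise ValueError("expected a list literal")
    return value


list1 = parse("[1,2,3,4,5,6]")
list2 = parse("[6,5,4,3,2,1]")
unique = set(list1).intersection(list2)
print(len(unique))  # 6
```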
I'm having some trouble converting type 'str' to numbers. I use a separate text-file containing the following numbers 1, 2, 3, 4, 5, 6 and then I import these numbers into python and save them as a list. However, by doing this, I get a list of strings as follows: ['1', '2', '3', '4', '5', '6']. I want to convert this list of strings so the list represents numbers, i.e. the output should be [1, 2, 3, 4, 5, 6].
My code is:
def imported_numbers(filename):
    with open(filename, 'r') as f:
        contents = f.read().splitlines()
    print(contents)
imported_numbers('sample.txt')
Is there a specific command to do this?
IMO it's more pythonic to say
str_list = ['1', '2', '3']
new_list = [int(n) for n in str_list]
If you're not sure all of the strings will be valid numbers, you need to add appropriate error handling.
You can use map:
l = ['1', '2', '3', '4', '5', '6']
new_l = list(map(int, l)) # or just map(int, l) in Python 2
will return
[1, 2, 3, 4, 5, 6]
This can throw an error if there are strings that cannot be converted to numbers though:
l = ['1', '2', '3', '4', '5', 'lkj']
list(map(int, l))
ValueError: invalid literal for int() with base 10: 'lkj'
So make sure your input is valid and/or wrap it into a try/except.
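What the try/except could look like depends on what should happen to bad entries. A minimal sketch in Python 3, assuming invalid strings should simply be skipped; to_ints is a name introduced here:

```python
def to_ints(strings):
    """Convert a list of strings to ints, skipping entries that are not valid integers."""
    numbers = []
    for s in strings:
        try:
            numbers.append(int(s))
        except ValueError:
            # Assumption: invalid entries are dropped silently.
            # Re-raising or collecting them for a report may suit other cases better.
            continue
    return numbers


print(to_ints(['1', '2', '3', '4', '5', 'lkj']))  # [1, 2, 3, 4, 5]
```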
this is my code:
positions = []
for i in lines[2]:
    if i not in positions:
        positions.append(i)
print(positions)
print(lines[1])
print(lines[2])
the output is:
['1', '2', '3', '4', '5']
['is', 'the', 'time', 'this', 'ends']
['1', '2', '3', '4', '1', '5']
I want the variable "positions" to be ['2','3','4','1','5'],
so instead of dropping the second occurrence of a duplicate in "lines[2]", it should drop the first.
You can reverse your list, create the positions and then reverse it back, as mentioned by @tobias_k in the comments:
lst = ['1', '2', '3', '4', '1', '5']
positions = []
for i in reversed(lst):
    if i not in positions:
        positions.append(i)
list(reversed(positions))
# ['2', '3', '4', '1', '5']
You'll need to first detect what values are duplicated before you can build positions. Use a collections.Counter() object to test if a value has been seen more than once:
from collections import Counter
counts = Counter(lines[2])
positions = []
for i in lines[2]:
    counts[i] -= 1
    if counts[i] == 0:
        # only add if this is the 'last' value
        positions.append(i)
This'll work for any number of repetitions of values; only the last value to appear is ever used.
You could also reverse the list, and track what you have already seen with a set, which is faster than testing against the list:
positions = []
seen = set()
for i in reversed(lines[2]):
    if i not in seen:
        # only add if this is the first time we see the value
        positions.append(i)
        seen.add(i)
positions = positions[::-1]  # reverse the output list
Both approaches require two iterations; the first to create the counts mapping, the second to reverse the output list. Which is faster will depend on the size of lines[2] and the number of duplicates in it, and whether or not you are using Python 3 (where Counter performance was significantly improved).
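A more compact variant of the reversing idea is sketched below. It is not one of the approaches above but a swapped-in technique: it relies on dict keeping insertion order, which is guaranteed from Python 3.7 on, and keep_last is a name introduced here:

```python
def keep_last(items):
    """Remove duplicates, keeping the last occurrence of each value in its original order."""
    # dict.fromkeys keeps the first occurrence of each key, so running it
    # over the reversed input keeps the last occurrence of the original.
    deduplicated = dict.fromkeys(reversed(items))
    # Reverse again to restore the original direction.
    return list(deduplicated)[::-1]


print(keep_last(['1', '2', '3', '4', '1', '5']))  # ['2', '3', '4', '1', '5']
```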
You can use a dictionary to save the last position of each element and then build a new list from that information:
>>> data=['1', '2', '3', '4', '1', '5']
>>> temp={ e:i for i,e in enumerate(data) }
>>> sorted(temp, key=lambda x:temp[x])
['2', '3', '4', '1', '5']
>>>
I have been studying list comprehensions, so I decided to write something with a for loop, which looks like this:
babe = 122132323
b = [n for n in babe]
print b
When I run the above code, it gives me this error: TypeError: 'int' object is not iterable
I have looked into similar errors, but I don't know what is wrong with my code. I would really appreciate it if anyone could tell me how to overcome this error and make the code work.
Integers are not sequences and are not iterable. You have to make it a string.
In [60]: babe = '122132323'
In [61]: b = [n for n in babe] # takes each character from left to right, binds it to `n`, and collects n
In [62]: b
Out[62]: ['1', '2', '2', '1', '3', '2', '3', '2', '3']
or simply use the built-in list function.
In [63]: list(babe)
Out[63]: ['1', '2', '2', '1', '3', '2', '3', '2', '3']
The reason the object you have is not iterable is because it is not a string. It is a single number, i.e an integer.
If you on the other hand, had a string, say
babe = '122132323'
b = [n for n in babe]
print b
It will print
['1', '2', '2', '1', '3', '2', '3', '2', '3']
To iterate on an integer value, you need [n for n in range(babe)]. That tells Python to use the numbers from zero to babe-1 as loop counter values. Assuming you want to have an actual number, of course...
What values of n are you looking for?
1, 2, 2, 1, 3, 2, 3, 2, 3
or
1, 2, 3, 4, ..., 122132323?
For the first you need
[n for n in '122132323']
and for the second you need
[n for n in range(babe)]
(or just list(range(babe))).
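If the digits are wanted as integers rather than one-character strings, the two steps can be combined by converting the number to a string and each character back to int. A sketch in Python 3 (so print is a function), assuming babe is a non-negative integer, since a minus sign would not convert:

```python
babe = 122132323

# str(babe) is iterable character by character; int() turns each digit back into a number.
digits = [int(ch) for ch in str(babe)]

print(digits)  # [1, 2, 2, 1, 3, 2, 3, 2, 3]
```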