Read specific sequence of lines in Python

I have a sample file that looks like this:
#XXXXXXXXX
VXVXVXVXVX
+
ZZZZZZZZZZZ
#AAAAAA
YBYBYBYBYBYBYB
ZZZZZZZZZZZZ
...
I wish to only read the lines that fall on the index 4i+2, where i starts at 0. So I should read the VXVXV... line (4*0+2 = 2) and the YBYB... line (4*1+2 = 6) in the snippet above. I need to count the number of 'V's, 'X's, 'Y's and 'B's and store the counts in a pre-existing dict.
fp = open(fileName, "r")
lines = fp.readlines()
for i in xrange(1, len(lines),4):
    for c in str(lines(i)):
        if c == 'V':
            some_dict['V'] +=1
Can someone explain how I can avoid going off the end of the list and only read the lines at index 4*i+2 of the lines list?

Can't you just slice the list of lines?
lines = fp.readlines()
interesting_lines = lines[2::4]
Edit for others questioning how it works:
The "full" slice syntax is three parts: start:end:step
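The start is the starting index, or 0 by default. Thus, for 4 * i + 2 with i == 0, that is index 2. (Indices count from 0, so if the question means the second line of each record, the start would be 1 instead.)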
The end is the ending index, or len(sequence) by default. Slices go up to but not including the last index.
The step is the increment between chosen items, 1 by default. Normally, a slice like 3:7 would return elements 3,4,5,6 (and not 7). But when you add a step parameter, you can do things like "step by 4".
Doing "step by 4" means start+0, start+4, start+8, start+12, ... which is what the OP wants, so long as the start parameter is chosen correctly.
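A small sketch of the step slice on an in-memory list. The `lines` list here is made up for illustration and stands in for the result of `fp.readlines()`; it shows both a start of 2 and a start of 1 so the difference between 0-based indices and 1-based line numbers is visible.

```python
# Hypothetical stand-in for fp.readlines() on a file with 4-line records.
lines = ["#X\n", "VXVX\n", "+\n", "ZZZZ\n",
         "#A\n", "YBYB\n", "+\n", "ZZZZ\n"]

# start=2, end omitted (len(lines)), step=4 -> indices 2, 6, 10, ...
by_index_2 = lines[2::4]

# If "line 2" means the second line (1-based), the 0-based start is 1.
second_of_each_record = lines[1::4]

print(by_index_2)             # ['+\n', '+\n']
print(second_of_each_record)  # ['VXVX\n', 'YBYB\n']
```

A slice never raises IndexError, even when the list length is not a multiple of the step.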

You can do one of the following:
Start xrange at 0, then add 2 to i in the inner loop (stopping early enough that i + 2 stays in range):
for i in xrange(0, len(lines) - 2, 4):
    for c in str(lines[i + 2]):
        if c == 'V':
            some_dict['V'] += 1
Start xrange at 2, then index with i the way your original program does:
for i in xrange(2, len(lines), 4):
    for c in str(lines[i]):
        if c == 'V':
            some_dict['V'] += 1

I'm not quite clear on what you're trying to do here--- are you actually just trying to only read the lines you want from disk? (In which case you've gone wrong from the start, because readlines() reads the whole file.) Or are you just trying to filter the list of lines to pick out the ones you want?
I'll assume the latter. In which case, the easiest thing to do would be to just use a listcomp to filter the lines by index, e.g. something simple like:
indices = [x[0] * 4 + 2 for x in enumerate(lines)]
filtered_lines = [lines[i] for i in indices if len(lines) > i]
and there you go, you've got just the lines you want, no index errors or anything silly like that. Then you can separate out and simplify the rest of your code to do the counting, just operating on the filtered list.
(just slightly edited the first list comp to be a little more idiomatic)
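If the goal really is to avoid holding the whole file in memory, one option that the answers here do not mention is `itertools.islice`, which applies the same start/step idea to the file iterator. This is only a sketch: `io.StringIO` stands in for an open file, and the start of 1 assumes the question counts lines from 1. The file is still read from beginning to end, but only one line is held at a time.

```python
import io
from itertools import islice

# io.StringIO stands in for an open file so the sketch is self-contained.
fp = io.StringIO("#X\nVXVX\n+\nZZZZ\n#A\nYBYB\n+\nZZZZ\n")

# Take every 4th line starting at 0-based index 1, one line at a time.
wanted = [line.rstrip("\n") for line in islice(fp, 1, None, 4)]
print(wanted)  # ['VXVX', 'YBYB']
```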

I already gave a similar answer to another question: How would I do this in a file?
A better solution (avoiding explicit nested for loops) would be
fp = open(fileName, "r")
def addToDict(letter):
    someDict[letter] += 1
[addToDict(c) for a in fp.readlines()[2::4] for c in a if c == 'V']
I tried to make this an anonymous function without success, if someone can do that it would be excellent.
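An alternative to the helper function, sketched under the assumption that the selected lines are already in a list: `collections.Counter` counts every character in one pass, and the totals can then be added to the pre-existing dict. The names `interesting_lines` and `some_dict` and their contents are illustrative.

```python
from collections import Counter

# Hypothetical data: the already-selected lines and a pre-existing dict.
interesting_lines = ["VXVXVXVXVX\n", "YBYBYBYBYBYBYB\n"]
some_dict = {"V": 0, "X": 0, "Y": 0, "B": 0}

# Count every character once, then copy over only the letters of interest.
counts = Counter("".join(interesting_lines))
for letter in some_dict:
    some_dict[letter] += counts[letter]

print(some_dict)  # {'V': 5, 'X': 5, 'Y': 7, 'B': 7}
```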

Related

Python: Display all even numbers between two integers inclusive, with a limit on the number of times the code accepts input

Link to question: https://www.spoj.com/UTPEST22/problems/UTP_Q2/
From what I understand, the question is divided into 2 parts:
Input
The first input is an integer that limits the number of times the user can provide a set of integers.
From the second line onwards, the user provides the sequence of integers up until the specified limit.
The set of integers are arranged in ascending order, separated only by a space each.
Output
For each sequence of integers, the integers in it are looped over. Only those that are even are printed as outputs horizontally.
from sys import stdin
for x in range(1, 1+ int(input())):
    # number of cases, number of times the user is allowed to provide input
    for line in stdin:
        num_list = line.split()
        # remove the space in between the integers
        num = [eval(i) for i in num_list]
        # change the list of numbers into integers each
        for numbers in range(num[0], num[-1] + 1):
            # the first integer is the lower bound
            # the second the upper bound
            if numbers % 2 == 0:
                # modulo operation to check whether are the integers fully divisible
                print(numbers, end = ' ')
                # print only the even numbers, horizontally
Can anyone please provide some insight into how to change my code, specifically the loop? I'm quite confused by it. Screenshot of the result.
Any help will be appreciated.

You can restructure the code into the following steps:
Read each case. You can use the input function here. A list comprehension can be used to read each line, split it into the lower and upper bound and then convert these to integers.
Process each case. Use the lower and upper bounds to display the even numbers in that range.
Using loops: Here is an example solution that is similar to your attempt:
n = int(input())
cases = []
for case in range(n):
    cases.append([int(x) for x in input().split()])
for case in cases:
    for val in range(case[0], case[1] + 1):
        if val % 2 == 0:
            print(val, end=' ')
    print()
This will produce the following desired output:
4 6 8 10 12 14 16 18 20
2 4 6 8 10 12 14 16
-4 -2 0 2 4 6
100 102 104 106 108
Simplify using unpacking: You can simplify this further by unpacking range. You can learn more about unpacking here.
n = int(input())
cases = []
for case in range(n):
    cases.append([int(x) for x in input().split()])
for case in cases:
    # start from the first even number at or above the lower bound
    lower = case[0] if case[0] % 2 == 0 else case[0] + 1
    print(*range(lower, case[1] + 1, 2))
Simplify using bit-wise operators: You can simplify this further using the bit-wise & operator. You can learn more about this operator here.
n = int(input())
cases = []
for case in range(n):
    cases.append([int(x) for x in input().split()])
for case in cases:
    print(*range(case[0] + (case[0] & 1), case[1] + 1, 2))
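To see why the bit-wise version works: `n & 1` is 1 for odd `n` and 0 for even `n`, so adding it bumps an odd lower bound up to the next even number. A minimal sketch, with the helper names `first_even_at_or_above` and `evens_between` invented here for illustration:

```python
def first_even_at_or_above(n):
    # n & 1 is 1 for odd n and 0 for even n, so odd values are bumped up by one.
    return n + (n & 1)

def evens_between(lo, hi):
    # Inclusive upper bound, stepping by 2 from the first even value.
    return list(range(first_even_at_or_above(lo), hi + 1, 2))

print(evens_between(3, 20))  # [4, 6, 8, 10, 12, 14, 16, 18, 20]
print(evens_between(-5, 6))  # [-4, -2, 0, 2, 4, 6]
```

This also holds for negative integers in Python, since `-5 & 1` evaluates to 1.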

First, ask the user for a range as many times as they specified. Split each input line and use tuple unpacking to get the first and second items of the list that split returns, then append the range to the ranges list as a Python range object so that it is easier to iterate over later.
After everything has been entered, iterate over the ranges list, then over each range, and print only the even numbers. Then call print again to move to a new line in the console, and you're done.
ranges = []
for _ in range(int(input())):
    start, end = input().split()
    ranges.append(range(int(start), int(end) + 1))
for r in ranges:
    for number in r:
        if number % 2 == 0:
            print(number, end=" ")
    print()

Here's my solution:
n = int(input())
my_list = []
for i in range(n):
    new_line = input().split()
    new_line = [int(x) for x in new_line]
    my_list.append(new_line)
for i in my_list:
    new_string = ""
    for j in range(i[0], i[1]+1):
        if (not(j % 2)): new_string += f"{j} "
    print(new_string)

Read the first value from stdin using input(). Convert to an integer and create a for loop based on that value.
Read a line from stdin using input(). Assumption is that there will be two whitespace delimited tokens per line each of which represents an integer.
Which gives us:
N = int(input()) # number of inputs
for _ in range(N):
    line = input()
    lo, hi = map(int, line.split())
    print(*range(lo+(lo&1), hi+1, 2))

There are a few issues here but I'm guessing this is what's happening: you run the code, you enter the first two lines of input (4, followed by 3 20) and the code doesn't print anything so you press Enter again and then you get the results printed, but you also get the error.
Here is what's actually happening:
You enter the input and the program prints everything as you expect but, you can't see it. Because for some reason when you use sys.stdin then print without a new line, stdout does not flush. I found this similar issue with this (Doing print("Enter text: ", end="") sys.stdin.readline() does not work properly in Python-3).
Then when you hit Enter again, you're basically sending your program a new input line which contains nothing (equivalent to string of ""). Then you try to split that string which is fine (it will just give you an empty list) and then try to get the first element of that list by calling num[0] but there are no elements in the list so it raises the error you see.
So you can fix that issue by changing your print statement to print(numbers, end = ' ', flush=True). This will force Python to show you what you have printed in terminal. If you still try to Enter and send more empty lines, however, you will still get the same error. You can fix that by putting an if inside your for loop and check if line == "" then do nothing.
There is still an issue with your program though. You are not printing new lines after each line of output. You print all of the numbers then you should go to the newline and print the answer for the next line of input. That can be fixed by putting a print() outside the for loop for numbers in range(num[0], num[-1] + 1):.
That brings us to the more important part of this answer: how could you have figured this out by yourself? Here's how:
Put logs in your code. This is especially important when solving this type of problems on SPOJ, Codeforces, etc. So when your program is doing that weird thing, the first thing you should do is to print your variables before the line of the error to see what their value is. That would probably give you a clue. Equally important: when your program didn't show you your output and you pressed Enter again out of confusion, you should've gotten curious about why it didn't print it. Because according to your program logic it should have; so there already was a problem there.
And finally some side notes:
I wouldn't use eval like that. What you want there is int not eval and it's just by accident that it's working in the same way.
We usually put comments above the line not below.
Since it appears you're doing competitive programming, I would suggest using input() and print() instead of stdin and stdout to avoid the confusions like this one.
SPOJ and other competitive programming websites read your stdout completely independently of your stdin. That means, if you forget to put that empty print() statement and don't print new lines, your program would look fine in your terminal because you would press Enter before giving the next input line but in reality, you are sending the new line to the stdin and SPOJ will only read stdout section, which does not have a new line. I would suggest actually submitting this code without the new line to see what I mean, then add the new line.
In my opinion, the variable numbers does not have a good name in your code. numbers implies a list of values but it's only holding one number.
Your first for loop does not need to start from 1 and go up to the number, does it? It only needs to iterate that many times so you may as well save yourself some code and write range(int(input())).
In the same for loop, you don't really care about the value of variable x - it's just a counter. A standard practice in these situations is that you put _ as your variable. That will imply that the value of this variable doesn't really matter, it's just a counter. So it would look like this: for _ in range(int(input())).
I know some of these are really extra to what you asked for, but I figured you're learning programming and trying to compete in contests so thought give some more suggestions. I hope it helped.
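Putting the suggestions from this answer together, here is one possible sketch rather than a definitive solution. It assumes each case is a line with a lower and an upper bound, and it wraps the logic in a hypothetical `solve` function that takes the whole input as text (for a judge you could pass it `sys.stdin.read()`), so it can be tried without typing input by hand.

```python
def solve(text):
    """Return one output line of even numbers per input case."""
    # Drop empty lines up front instead of indexing into an empty list later.
    rows = [row for row in text.splitlines() if row.strip()]
    n_cases = int(rows[0])
    out = []
    for row in rows[1:1 + n_cases]:
        lo, hi = map(int, row.split())  # int, not eval
        evens = (str(v) for v in range(lo, hi + 1) if v % 2 == 0)
        out.append(" ".join(evens))
    return out

sample = "2\n3 20\n1 16\n"
for line in solve(sample):
    # One print per case, so every case ends with a newline.
    print(line, flush=True)
```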

Steps for simple solution
# taking input (number of rows) in the outer loop
# l = lower limit, u = upper limit, read from the user, then converted to integers using the map function
# applying the logic for even numbers
# finally printing the even numbers of each case on a single line
for j in range(int(input())):
    l, u = map(int, input().split())  # no prompt text: a judge would read it as output
    for i in range(l, u+1):
        if i%2 == 0: # logic for even number
            print(i, end=' ')
    print()  # end the line after each case

(Python) I have fundamental misunderstandings which I am trying to address. Can somebody help, via the following "left rotation" example?

I am a novice programmer and I feel that I have very fundamental misunderstandings which I am trying to identify. In this question, I am going to be working out my logic to a coding challenge in order to help you identify the misconceptions I am experiencing. The context of this question is the Hackerrank Interview Kit problem "Left Rotation". https://www.hackerrank.com/challenges/ctci-array-left-rotation/problem?h_l=interview&playlist_slugs%5B%5D=interview-preparation-kit&playlist_slugs%5B%5D=arrays&h_r=next-challenge&h_v=zen
It says the following in regards to the input format:
The first line contains two space-separated integers n and d, the size of a and the number of left rotations.
The second line contains space-separated integers, each an a[i].
Then it gives an example input:
5 4
1 2 3 4 5
My approach to this problem:
What I want is to create 2 loops. One loop to repeat the following process d times, (ie. 4 times), and a nested loop that goes through the whole second input string (1 2 3 4 5), and does the following:
sets the value of the first index (ie: 1) of the second input string (ie: 1 2 3 4 5) to a variable called temp. (I was thinking the line of code that does this would look something like this: temp = a[1]. But is this correct syntactically? Can i do this? Because will python recognize 1 2 3 4 5 as an array automatically, or do I have to convert this second input line from a string to an array, first?)
Next, for every index of the second input line EXCEPT for the last, change the value of that index to equal the value of the next index in the sequence. So, a[1] = 2, a[2] = 3, a[3] = 4, and a[4] = 5. My logic is that at the end of this step, the line should now be this: 2 3 4 5 5. (I was thinking the line of code that does this would look something like this:
for i in len(a-1):
i = [i + 1].
(Is this correct syntactically? I feel like there is something wrong with this line, but I am not sure what it is.)
Set the last index of the array to the temp value. (I was thinking the line of code that does this would be the following: a[n] = temp. Again, unsure about my syntax.) By my logic, I would imagine that the input line now looks like this: 2 3 4 5 1. (Also, how does python know what n is? I understand that the first input line is in the format n,d, but how does python know what those variables stand for? How do I tell it?)
This process happens d times. So the final result to return once the loop finishes would be the rotated array, which would be: 5 1 2 3 4. But I am even confused on how to write the line for the return statement.
With this logic in mind, here is my code:
def rotLeft(a, d):
    i = 0
    newArr = []
    for i in range (d):
        temp = a[1]
        for i in len(a-1):
            i = [i + 1]
        a[n] = temp
    a = newArr
    return newArr
if __name__ == '__main__':
    fptr = open(os.environ['OUTPUT_PATH'], 'w')
    first_multiple_input = input().rstrip().split()
    n = int(first_multiple_input[0])
    d = int(first_multiple_input[1])
    a = list(map(int, input().rstrip().split()))
    result = rotLeft(a, d)
    fptr.write(' '.join(map(str, result)))
    fptr.write('\n')
    fptr.close()
I am getting a runtime error, which I know can be caused by a number of things.
Overall, I am wondering if someone can help me identify any places where I am going wrong in my syntax or logic. I think one of my main points of confusion is the relationship between the input strings and the values being passed into the function. Any contributions are sincerely appreciated. Thank you.

When you find yourself stuck it's never a bad idea to step back and go to the basics.
I suggest starting your problem by writing down a few of the possible cases on paper and trying to spot patterns, before you write ANY code.
Write your code in iterations, starting with something very simple and slowly work towards your end goal.
Loop over the indices of the existing array and, for each one, append to the new list the element found at that index plus the shift amount, applying the modulus of the list's length so that the index 'wraps' around.
Here's a helpful link if you aren't sure how modulus works
def rotLeft(a, d):
    new_list = []
    for i in range(len(a)):
        new_list.append(a[(i+d) % len(a)])
    return new_list

There's a number of syntax errors, but HackerRank likely wouldn't accept an approach where you'd perform one left rotation at a time. (The input size is too big for your algorithm to handle in a small amount of time. If you're just starting to program, don't worry too much about what this means; Ewan Brown is right that correctness is infinitely more important than efficiency. This is solely meant to serve as an explanation to motivate why I'm giving a different approach as opposed to debugging the one you've laid out.)
Instead, what you should do is create a new array that rotates an element by d (since you know the offset in advance). This is faster: in essence, you'd be performing the left rotation for each element all at once, rather than over many steps.
def rotLeft(a, d):
    result = [0] * len(a)
    for index, element in enumerate(a):
        result[((index - d) + len(a)) % len(a)] = element
    return result
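For comparison, the one-rotation-at-a-time idea from the question can also be written so that it runs. This is a sketch for learning purposes only (the name `rot_left_stepwise` is made up, and as noted above this approach is likely too slow for large inputs). The comments mark the points the question was unsure about.

```python
def rot_left_stepwise(a, d):
    a = list(a)          # work on a copy so the caller's list is untouched
    n = len(a)           # n comes from the list itself; no need to pass it in
    if n == 0:
        return a         # nothing to rotate
    for _ in range(d):   # repeat one left rotation d times
        temp = a[0]      # the first element is at index 0, not 1
        for i in range(n - 1):
            a[i] = a[i + 1]   # shift each element one place to the left
        a[n - 1] = temp  # the last valid index is n - 1, not n
    return a

print(rot_left_stepwise([1, 2, 3, 4, 5], 4))  # [5, 1, 2, 3, 4]
```

The input line only becomes a list because the calling code does `list(map(int, input().rstrip().split()))`; the function itself never sees the raw string.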

Ewan Brown nailed it. For completeness, I'll add an even more concise implementation using itertools:
from itertools import cycle, islice
def rotLeft(a, d):
    return islice(cycle(a), d, len(a)+d)
cycle is a functional and abstract way to express the idea of "wrapping around" a collection.
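One thing worth knowing about the `islice` version is that it returns a lazy iterator rather than a list. The sketch below shows that, next to a plain slicing variant that is not from the answers above and assumes 0 <= d <= len(a); both function names are illustrative.

```python
from itertools import cycle, islice

def rot_left_lazy(a, d):
    # Returns a lazy iterator; nothing is copied until it is consumed.
    return islice(cycle(a), d, len(a) + d)

def rot_left_slices(a, d):
    # Same result as a list, assuming 0 <= d <= len(a).
    return a[d:] + a[:d]

a = [1, 2, 3, 4, 5]
print(list(rot_left_lazy(a, 4)))  # [5, 1, 2, 3, 4]
print(rot_left_slices(a, 4))      # [5, 1, 2, 3, 4]
```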

Order of array items changing when being printed

I was writing a Python program which includes printing an array created from user input in the order the user entered each item. Unfortunately, I have had a few problems with that: once it repeated the first item twice in one of the sets, and in another set it put the last 2 items at the beginning.
I checked the array in the shell and the array contained the right amount of items in the right order, so I don't know what is going on. My script looks something like this:
i = 1
lines = []
for i in range (1, (leng + 1)):
    lines.append(input())
input() # The data stripped is not used, the input is a wait for the user to be ready.
i = 0
for i in range (0, (leng + 1)):
    print(lines[i - len(lines)])
My searches found nothing for my purposes (but then again, I might not have used the correct search term, like in my last question).
Please answer or find a duplicate if existing. I'd like an answer.

Don't you just want this?
for line in lines:
    print(line)
EDIT
As an explanation of what's wrong with your code... you're looping one too many times (leng+1 instead of leng). Then you're using i - len(lines), which is a negative index that counts from the end and so is just the equivalent of i. On the extra iteration, though, i - len(lines) is 0, so the first item is printed a second time. Another fix for your code could be:
for i in range(len(lines)):
    print(lines[i])
SECOND EDIT
Rewriting your full code to what I think is the simplest, most idiomatic version:
# store leng lines
lines = [input() for _ in range(leng)]
# wait for user to be ready
input()
# print all the lines
for line in lines:
    print(line)
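The effect of the extra iteration can be reproduced without any user input. In this small sketch the `lines` values are made up; the indexing expression is the one from the question.

```python
lines = ["first", "second", "third"]
leng = len(lines)

# Same indexing as the question: i - len(lines) runs from -3 up to 0.
printed = [lines[i - len(lines)] for i in range(0, leng + 1)]
print(printed)  # ['first', 'second', 'third', 'first']
```

Indices -3, -2 and -1 count from the end and give the items in order; the final index 0 gives the first item again.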

My Python code is only selecting half of a list's contents?

I'm very new to Python, and I'm going through some example projects I found online but I'm stuck on my palindrome checker at the moment.
Right now, my code takes a word as an input, splits it in half, saves each part into separate variables, makes both of the variables lists, and from there it SHOULD reverse the second list so I can compare it to the first, but from what I've gathered trying to fix it, it's only appending half of the selection to the new list.
For example, if I enter "racecar", it'll split it into "race" and "ecar" just fine, but then when I go to reverse "ecar" it only gives me back "['c', 'e']". (Also, if I switch the variables around to reverse the first half, I get the same error)
I've been trying to figure it out for quite a while now and I'm not making any progress so some help would be very much appreciated!
Ninja Edit: If there's an easier way to do this (which I'm sure there is) I'd love to know, but I still want to figure out what I've done wrong in the code I already have so I can try to learn from it
Here's my code so far:
print "Please enter a word you want to check is a palindrome"
input = raw_input('> ')
#Gets lengths of input
full_length = len(input)
split_length = len(input) / 2
#If word has an even length split like this
if full_length % 2 == 0:
    first_half = input[0: split_length]
    second_half = input[split_length:full_length]
#If word does not have even length split like this
else:
    first_half = input[0:split_length+1]
    second_half = input[split_length:full_length]
#Make both halves lists
first_half_list = list(first_half)
print first_half_list
second_half_list = list(second_half)
print second_half_list
# Reverse second half
rev_second_half = []
for x in second_half_list:
    current_letter = second_half_list[0]
    second_half_list.remove(second_half_list[0])
    rev_second_half.insert(0, current_letter)
print rev_second_half
"""
#Check to see if both lists are identical
#If they are identical
print "This word is a palindrome!"
#If they are not identical
print "This word is not a palindrome."
"""
And this is the output I get when I enter 'racecar':
racecar
['r','a','c','e']
['e','c','a','r']
['c', 'e']

There's a lot of unnecessary work going on. No need to convert to lists; the interpreter can manage this all for you. No need to manually reverse a string; use slicing. No need to manually declare the indices of the first and last characters in your string; the interpreter knows where they are. Here's a fixed version of the code; you can view a demo at IDE One:
input = 'racecar'
#Gets lengths of input
full_length = len(input)
split_length = len(input) / 2
#If word has an even length split like this
if full_length % 2 == 0:
    first_half = input[:split_length]
    second_half = input[split_length:]
#If word does not have even length split like this
else:
    first_half = input[:split_length+1]
    second_half = input[split_length:]
print first_half
print second_half
rev_second_half = second_half[::-1]
print rev_second_half
race
ecar
race
Notice the way that the second half is getting reversed, by using a slice with a negative iteration step? You can just do that once, to your source string, and compare the result to the original. Now you have a one line method to check if a string is a palindrome: input == input[::-1]
A bit more on slicing syntax (you might like to check out this question). The colons separate the three arguments, which are start : end : step. The first two create a range which includes start and everything between it and end, but not end itself. Not specifying step causes an assumption of 1, and with a positive step, not specifying start or end causes the interpreter to assume you mean "use 0" and "use len", respectively. With a negative step the defaults flip: an omitted start means the last element and an omitted end means "keep going past the first element", so the slice walks backwards by the magnitude of step. That is why input[::-1] reverses the whole string, whereas input[0:len(input):-1] is empty.
If you want to omit arguments and specify a range with a slice, you need to include the colons, so the interpreter can tell which arguments are omitted. For example, input[-1] will return the last element of input, because no colons means you're specifying an index, and negative means "go backwards from the end". By contrast, input[:-1] is a slice that stops before the last element, so print input[:-1] would yield "raceca" if your input was "racecar".
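A quick check of how slice defaults behave with a negative step, written for Python 3 (the snippets in this answer use Python 2 print statements). The variable `word` and the helper `is_palindrome` are illustrative names.

```python
word = "racecar"

assert word[::-1] == "racecar"     # omitted bounds, step -1: full reversal
assert word[0:len(word):-1] == ""  # start 0, going backwards: nothing to take
assert word[-1] == "r"             # single index: last character
assert word[:-1] == "raceca"       # slice: everything except the last character
assert "hello"[::-1] == "olleh"

def is_palindrome(s):
    # A string is a palindrome when it equals its own reversal.
    return s == s[::-1]

print(is_palindrome("racecar"), is_palindrome("python"))  # True False
```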
As for what was going wrong with your code, the problem is in your reversing loop.
for x in second_half_list:
    current_letter = second_half_list[0]
    second_half_list.remove(second_half_list[0])
    rev_second_half.insert(0, current_letter)
You're removing items from the list you're iterating through. Don't do that, it's a great way to cause problems; it's why you're only getting half the list in this case. There's also needless copying going on, though that won't cause incorrect results. Finally, you're not using your iterated variable at all, which is a sure sign of some sort of problem with your loop code. Here, if you fixed the list mutation but continued using second_half_list[0], you'd get that letter repeated len(second_half_list) times. If you really need to actually reverse a list, you can do it like this instead:
for x in second_half_list:
    rev_second_half.insert(0, x)
But you should only actually iterate the list if you need some sort of side effects during the iteration. For a pure reversal in python, you want this, which will perform better:
rev_second_half = list(reversed(second_half_list))

To reverse the string (not in place):
rev_second_half = second_half_list[::-1]
To extend:
I'd suggest keeping the halves as strings, as you can then just compare them with ==, and the above reversing technique also works on strings.

The reason you're only getting two values is you're mutating your list while you iterate on it -- you just shouldn't do this, if only because it's a pain to reason about. As an example:
In [34]: nums = range(5) # [0, 1, 2, 3, 4]
In [35]: for num in nums:
   ....:     print "num", num
   ....:     print "nums", nums
   ....:     nums.remove(nums[0])
   ....:
num 0
nums [0, 1, 2, 3, 4]
num 2
nums [1, 2, 3, 4]
num 4
nums [2, 3, 4]
Notice that this only looped three times. The first time through, everything's dandy, but you remove the first element. However, Python's looping logic thinks it has to go to the second item -- but you removed the first item! Does that mean the second item now, or the second item when things started? For Python's internals, it means the second item now -- which is the third item when things started (i.e. the value 2). From there, stuff just snowballs.
The lesson here is don't mutate a list while you iterate on it. Just use the other means for reversing folks have mentioned here.
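The same effect can be reproduced with the letters from the question. The sketch below is in Python 3 and uses made-up variable names (`work`, `buggy`, `fixed`); the first loop mirrors the question's reversing loop, the second iterates without mutating.

```python
second_half_list = ["e", "c", "a", "r"]

# Mutating while iterating: the loop's internal index advances while the list shrinks.
buggy = []
work = list(second_half_list)
for x in work:
    buggy.insert(0, work[0])
    work.remove(work[0])
print(buggy)  # ['c', 'e']

# Iterating over an untouched copy visits every item.
fixed = []
for x in list(second_half_list):
    fixed.insert(0, x)
print(fixed)  # ['r', 'a', 'c', 'e']
```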

Difference of consecutive float numbers in a column

I have a list of floating point numbers in a file in column like this:
123.456
234.567
345.678
How can I generate an output file by subtracting from the value on each line the value just above it? For the input file above, the output generated should be:
123.456-123.456
234.567-123.456
345.678-234.567
The first value should return zero, but every other value should have the value just above it subtracted from it. This is not a homework question. This is a small requirement of my bigger problem and I am stuck at this point. Help much appreciated. Thanks !!

This will work:
diffs = [0] + [j - data[i] for i,j in enumerate(data[1:])]
So, assuming data.txt contains:
123.456
234.567
345.678
then
with open('data.txt') as f:
    data = f.readlines()
diffs = [0] + [float(j) - float(data[i]) for i,j in enumerate(data[1:])]
print diffs
will yield
[0, 111.111, 111.11099999999999]
This answer assumes you want to keep the computed values for further processing.
If at some point you want to write these out to a file, line by line:
with open('result.txt', 'w') as outf:
    for i in diffs:
        outf.write('{0:12.5f}\n'.format(i))
and adjust the field widths to suit your needs (right now 12 spaces reserved, 5 after the decimal point), written out to file result.txt.
UPDATE: Given (from the comments below) that there is possibly too much data to hold in memory, this solution should work. Python 2.6 doesn't allow opening both files in the same with, hence the separate statements.
with open('result2.txt', 'w') as outf:
    outf.write('{0:12.5f}\n'.format(0.0))
    prev_item = 0
    with open('data.txt') as inf:
        for i, item in enumerate(inf):
            item = float(item.strip())
            val = item - prev_item
            if i > 0:
                outf.write('{0:12.5f}\n'.format(val))
            prev_item = item
Has a bit of a feel of a hack. Doesn't create a huge list in memory though.

Given a list of values:
[values[i] - values[i-1] if i > 0 else 0.0 for i in range(len(values))]
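An equivalent way to write this without index arithmetic is to pair each value with its predecessor using `zip`. This is a sketch with the three sample values from the question hard-coded in `values`.

```python
values = [123.456, 234.567, 345.678]

# Pair each value with its predecessor; the first line has none, so it gets 0.0.
diffs = [0.0] + [cur - prev for prev, cur in zip(values, values[1:])]

for d in diffs:
    print("{0:.3f}".format(d))
```

Formatting to three decimals hides the small floating-point error that shows up in the raw differences.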

Instead of list comprehensions or generator expressions, why not write your own generator that can have arbitrarily complex logic and easily operate on enormous data sets?
from itertools import imap
def differences(values):
    yield 0 # The initial 0 you wanted
    iterator = imap(float, values)
    last = iterator.next()
    for value in iterator:
        yield value - last
        last = value
with open('data.txt') as f:
    data = f.readlines()
with open('outfile.txt', 'w') as f:
    for value in differences(data):
        f.write('%s\n' % value)
If data holds just a few values, the benefit wouldn't necessarily be so clear (although the explicitness of the code itself might be nice next year when you have to come back and maintain it). But suppose data was a stream of values from a huge (or infinite!) source and you wanted to process the first thousand values from it:
diffs = differences(enormousdataset)
for count in xrange(1000):
    print diffs.next()
Finally, this plays well with data sources that aren't indexable. Solutions that track index numbers to look up values don't play well with the output of generators.
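The snippets in this answer are Python 2 (`imap`, `iterator.next()`, `xrange`). A rough Python 3 equivalent might look like the sketch below; it differs slightly in that an empty input yields nothing at all, and the `sample` list is made-up data standing in for lines from a file.

```python
from itertools import islice

def differences(values):
    iterator = map(float, values)  # map is lazy in Python 3, like imap
    try:
        last = next(iterator)
    except StopIteration:
        return  # empty input: yield nothing
    yield 0.0  # the first value has nothing above it
    for value in iterator:
        yield value - last
        last = value

# Any iterable of numeric strings works, including an open file object.
sample = ["1.5\n", "4.0\n", "3.0\n"]
print(list(islice(differences(sample), 1000)))  # [0.0, 2.5, -1.0]
```

`islice(..., 1000)` takes at most the first thousand results and stops early without error if the source is shorter.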
