For range iteration - python

I'm trying to iterate a certain number of lines from a text file. I've tried doing so by using different combinations of i += 1 and for loops, but it just seems to print out all of the lines from the text file.
def showEntries(amount):
    print("---------- SUBSCRIPTIONS ----------\n")
    with open('lib/names.txt','r') as names:
        for line in names:
            for x in range(0,2):
                print(line)
    print("---------- END OF LIST ----------")

You can use itertools.islice() for this.
from itertools import islice
def showEntries(filename, amount):
    with open(filename,'r') as names:
        for line in islice(names, amount):
            print(line)
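As a minimal sketch of how islice() behaves when the file has fewer lines than requested, the example below uses io.StringIO as a stand-in for an open file. The helper name first_lines and the sample data are assumptions made for illustration, not part of the original answer.

```python
import io
from itertools import islice

def first_lines(lines, amount):
    """Return up to `amount` lines, with trailing newlines removed."""
    return [line.rstrip("\n") for line in islice(lines, amount)]

# io.StringIO stands in for an open text file (hypothetical sample data).
sample = io.StringIO("alice\nbob\ncarol\n")
print(first_lines(sample, 2))   # ['alice', 'bob']

sample.seek(0)
print(first_lines(sample, 10))  # only three lines exist, so all three come back
```

islice() simply stops when the underlying iterator is exhausted, so no extra check for short files is needed.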

I think you want something like:
for line_no, line in enumerate(names):
    if line_no >= amount: # if we've seen enough already
        break # stop looping
    print(line) # otherwise show the current line
For example:
>>> names = ["John Cleese", "Graham Chapman", "Terry Gilliam", "Eric Idle", "Terry Jones", "Michael Palin"]
>>> for index, name in enumerate(names):
...     if index >= 3:
...         break
...     print(name)
...
John Cleese
Graham Chapman
Terry Gilliam
See the docs on enumerate for more information.

I believe this is what you want to do ...
a. Get an email address as input from the user and add it to a file. As an aside ... you may want to do the following before you write it to the file...
- ensure that the email address is a valid address (use regex to enforce this)
- ensure that the email address does not exist already (you don't want a file full of duplicate addresses)
etc...
and,
b. Print out a specified number of results from your subscriber list. You need to either put your
Here's a quick example of what you can do to read x number of lines from a file. I've used a lot of syntactic sugar, which you can choose not to use.
def printLines(filename, numlines):
    i = 0
    with open(filename) as f:
        for line in f:
            if i < numlines:
                print(line)
                i += 1
            else:
                break

Your code is really close; you just need to let the number of lines you want control the loop, not the number of lines in the file:
def showEntries(amount):
    print("---------- SUBSCRIPTIONS ----------\n")
    with open('lib/names.txt', 'r') as names:
        for _ in range(amount):
            print(next(names).rstrip())
    print("---------- END OF LIST ----------")
To be more robust, the inner portion of the loop should probably be wrapped in a try/except so that a file containing fewer than the specified number of lines is handled gracefully (next() raises StopIteration when the file runs out).
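The try/except mentioned above could look roughly like this. It is only a sketch: the function name show_entries, the choice to pass in an already open file object, and the returned count are assumptions made so the example can run without a real file.

```python
import io

def show_entries(names, amount):
    """Print up to `amount` lines from the open file-like object `names`."""
    print("---------- SUBSCRIPTIONS ----------\n")
    shown = 0
    for _ in range(amount):
        try:
            line = next(names)
        except StopIteration:
            break  # the file ran out of lines before `amount` was reached
        print(line.rstrip())
        shown += 1
    print("---------- END OF LIST ----------")
    return shown  # number of lines actually printed

# Asking for five lines from a two-line "file" prints both and stops cleanly.
show_entries(io.StringIO("ann@example.com\nbob@example.com\n"), 5)
```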

Related

How to find whether a integer is between first two columns of a file without using any for loop

I have a file with integers in its first two columns.
File name: file.txt
col_a,col_b
1001021,1010045
2001021,2010045
3001021,3010045
4001021,4010045 and so on
Now, using Python, I get a variable var_a = 2002000.
How do I find the range within which var_a lies in "file.txt"?
Expected output: 2001021,2010045
I have tried the code below:
with open("file.txt","r") as a:
    a_line = a.readlines()
for line in a_line:
    line_sp = line.split(',')
    if int(line_sp[0]) < var_a < int(line_sp[1]):
        print('%s,%s' % (line_sp[0], line_sp[1].strip()))
Since the file has more than a million records, this is time-consuming. Is there a better way to do the same without a for loop?

Since the file has more than a million records, this is time-consuming. Is there a better way to do the same without a for loop?
Unfortunately, you have to iterate over all the records in the file, and the only way you can achieve that is some kind of for loop. So the complexity of this task will always be at least O(n).

It is better to read your file line by line (not all into memory) and store its contents as range objects, so you can look up multiple numbers. Ranges are stored quite efficiently, and you only have to read the file once to check more than one number.
Since Python 3.7, dictionaries preserve insertion order. If your file is sorted, you only iterate the dictionary until the first range that contains the number; for numbers that are not in any range, you iterate the whole dictionary.
Create file:
fn = "n.txt"
with open(fn, "w") as f:
    f.write("""1001021,1010045
2001021,2010045
3001021,3010045
garbage

4001021,4010045""")
Process file:
fn = "n.txt"
# read in
data = {}
with open(fn) as f:
    for nr,line in enumerate(f):
        line = line.strip()
        if line:
            try:
                start,stop = map(int, line.split(","))
                data[nr] = range(start,stop+1)
            except ValueError as e:
                pass # print(f"Bad data ({e}) in line {nr}")
look_for_nums = [800, 1001021, 3001039, 4010043, 9999999]
for look_for in look_for_nums:
    items_checked = 0
    for nr,rng in data.items():
        items_checked += 1
        if look_for in rng:
            print(f"Found {look_for} it in line {nr} in range: {rng.start},{rng.stop-1}", end=" ")
            break
    else:
        print(f"{look_for} not found", end=" ")
    print(f"after {items_checked } checks")
Output:
800 not found after 4 checks
Found 1001021 it in line 0 in range: 1001021,1010045 after 1 checks
Found 3001039 it in line 2 in range: 3001021,3010045 after 3 checks
Found 4010043 it in line 5 in range: 4001021,4010045 after 4 checks
9999999 not found after 4 checks
There are better ways to store such a ranges file, e.g. in a tree-like data structure. Research k-d trees to get even faster results if you need them. They partition the ranges in a smarter way, so you do not need a linear search to find the right bucket.
The answer to "Data Structure to store Integer Range, Query the ranges and modify the ranges" provides more things to research.
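One simple way to avoid the linear search, short of a full tree structure, is a binary search over sorted range starts. The sketch below uses the standard bisect module and assumes the ranges do not overlap and are inclusive at both ends; the helper names build_index and find_range are made up for this example.

```python
from bisect import bisect_right

def build_index(pairs):
    """Sort (start, stop) pairs and return parallel lists of starts and ranges."""
    ranges = sorted(pairs)
    starts = [start for start, _ in ranges]
    return starts, ranges

def find_range(starts, ranges, value):
    """Return the (start, stop) pair containing value (inclusive), or None."""
    i = bisect_right(starts, value) - 1  # last range starting at or before value
    if i >= 0 and ranges[i][0] <= value <= ranges[i][1]:
        return ranges[i]
    return None

# Assumption: ranges are non-overlapping, as in the sample file.
starts, ranges = build_index([
    (1001021, 1010045),
    (2001021, 2010045),
    (3001021, 3010045),
    (4001021, 4010045),
])
print(find_range(starts, ranges, 2002000))  # (2001021, 2010045)
print(find_range(starts, ranges, 800))      # None
```

Reading the file is still O(n), but each lookup afterwards is O(log n), which pays off when many numbers are checked against the same file.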

Assuming each line in the file has the correct format, you can do something like following.
var_a = 2002000
with open("file.txt") as file:
    for l in file:
        a,b = map(int, l.split(',', 1)) # each line must have only two comma separated numbers
        if a < var_a < b:
            print(l) # use the line as you want
            break # if you need only the first occurrence, break the loop now
Note that you'll have to do additional verifications/workarounds if the file format is not guaranteed.
Obviously you have to iterate through all the lines (in the worst case). But we don't load all the lines into memory at once, so as soon as the answer is found, the rest of the file is ignored without being read (assuming you are looking only for the first match).

Extract Email: Name: Phone: from adjacent lines from file of similar records

I have a text file with a list similar to this:
These are all on separate lines in my text file
Email: jonsmith#emailaddie.com
Name: Jon Smith
Phone Number: 555-1212
Email: jonsmith#emailaddie.com
Name: Jon Smith
Phone Number: 555-1212
Email: jonsmith#emailaddie.com
Name: Jon Smith
Phone Number: 555-1212
I am attempting to take the [email, name, phone] groups and export them to another text file, with each group on a separate line.
Here is what I have tried so far. (If I can get it to print to the terminal correctly, I know how to write to another file.)
I am running Ubuntu Linux.
import re
stuff = list()
#get line
with open("a2.txt", "r") as ins:
    array = []
    for line in ins:
        if re.match("Email Address: ", line):
            array.append(line)
            if re.match("Phone Number: ", line):
                array.append(line)
                if re.match("Name: ", line):
                    array.append(line)
                    print(line)

As indicated in comments, you are looking at the same line through the nested if statements. No line in your sample matches all three regular expressions, so that code would never extract anything. There is no need to use regular expressions here, anyway; the simple line.startswith() is entirely sufficient for looking for individual static strings or small sets of static strings.
Instead, you want something like this:
array = []
for line in ins:
    if line.startswith('Email Address:'):
        array.append(<<Capture the rest of the line>>)
    elif line.startswith('Name: '):
        array.append(<<Capture the rest of the line>>)
    elif line.startswith('Phone Number: '):
        array.append(<<Capture the rest of the line>>)
        print(array)
        array = []
This simple structure is sufficient if the lines are always in exactly the same order. If you have to cope with missing optional lines or mixed order, the program will need to be slightly more complex.
You will notice that this code (with partial pseudo-code) still is rather repetitive. You want to avoid repeating yourself, so a slightly better program might just loop over the expected phrases in sequence.
fields = ('Email Address: ', 'Name: ' , 'Phone Number: ')
index = 0
array = []
for line in ins:
    if line.startswith(fields[index]):
        array.append(line[len(fields[index]):-1])
    else:
        raise ValueError('Expected {0}, got {1}'.format(fields[index], line))
    index += 1
    if index >= len(fields):
        print(array)
        array = []
        index = 0
This is slightly harder to read at first, but you should quickly be able to make sense of it. We have an index which tells us what value out of fields to expect, and print the collected information and wrap the index back to zero when we run out of fields. This also conveniently lets us refer to the length of the expected string, which we need when we extract the substring after it out of the line. (The -1 is to get rid of the newline character which exists at the end of every line we read.)
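For a version you can run without a file, here is a minimal sketch of the same idea. It assumes the prefixes from the sample data ("Email: " rather than "Email Address: "), strips the newline with rstrip() instead of slicing off the last character, and collects each group into a list; the function name parse_records is hypothetical.

```python
import io

# Assumption: prefixes as they appear in the question's sample data.
FIELDS = ("Email: ", "Name: ", "Phone Number: ")

def parse_records(lines):
    """Group consecutive Email/Name/Phone Number lines into lists of values."""
    records, current = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines between records
        expected = FIELDS[len(current)]  # the field we should see next
        if not line.startswith(expected):
            raise ValueError("Expected {0!r}, got {1!r}".format(expected, line))
        current.append(line[len(expected):])
        if len(current) == len(FIELDS):
            records.append(current)
            current = []
    return records

sample = io.StringIO(
    "Email: jonsmith#emailaddie.com\n"
    "Name: Jon Smith\n"
    "Phone Number: 555-1212\n"
)
for record in parse_records(sample):
    print(", ".join(record))  # jonsmith#emailaddie.com, Jon Smith, 555-1212
```

Here the length of the current group plays the role of the index, so there is no separate counter to reset.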

If you are sure that the parameters (Email, Name, and Phone Number) always arrive in the given order, this code will work fine; if not, handle that case in the else branch. There you can save the incomplete values or raise an exception.
with open("path to the file") as fh:
    # Track the status for a group
    counter = 0
    # List of all information
    all_info = []
    # Containing information of current group
    current_info = ""
    for line in fh:
        if line.startswith("Email:") and counter == 0:
            counter = 1
            current_info = "{}{},".format(current_info, line)
        elif line.startswith("Name:") and counter == 1:
            counter = 2
            current_info = "{}{},".format(current_info, line)
        elif line.startswith("Phone Number:") and counter == 2:
            counter = 0
            all_info.append("{}{},".format(current_info, line).replace("\n",""))
            current_info = ""
        else:
            # You can handle incomplete information here.
            counter = 0
            current_info = ""

How to check a list for a string then select the rest of that item,

At the moment I have a text file with people who swim and their times, such as this,
jack 12
sarah 20
ben 4
Now I would like to be able to search this for, say, sarah and have it return her time.
This is what I currently have:
def Timers(swimmer):
    myFile = open("race.txt","r")
    lists = []
    for eachLine in myFile:
        lists += [eachLine.rstrip("\n")]
So I compiled them all into a single list. I know I can check the list to see whether a swimmer is there, but I cannot work out how to select just the time.
At this point I know that if I get, say, sarah 12, I can use split and then format it to get the time.
Thank you for the help.

You want a dict (a Python mapping) instead, and to read the file only once:
def Timers():
    with open("race.txt","r") as myFile:
        swimmers = {}
        for eachLine in myFile:
            if eachLine.strip():
                swimmer, timer = eachLine.split()
                swimmers[swimmer] = timer
    return swimmers
The .split() call splits the line on whitespace, giving you a name and a timer string for each line.
Now Timers() returns a mapping containing all swimmer names as the keys, and their times as values. You can simply look up each and every swimmer:
timers = Timers()
print(timers['sarah'])
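If the times should be numbers rather than strings, a small variation converts them while loading. This is a sketch only: the name load_times is hypothetical, io.StringIO stands in for race.txt, and each line is assumed to hold exactly one name and one integer.

```python
import io

def load_times(lines):
    """Map each swimmer's name to an integer time, skipping blank lines."""
    times = {}
    for line in lines:
        if line.strip():
            swimmer, time = line.split()
            times[swimmer] = int(time)
    return times

times = load_times(io.StringIO("jack 12\nsarah 20\nben 4\n"))
print(times["sarah"])     # 20
print(times.get("zoe"))   # None, because zoe is not in the data
```

Using .get() for the lookup avoids a KeyError when the swimmer is not in the file.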

Another approach to the problem:
def Timer(swimmer):
    myFile = open("race.txt", "r")
    lists = myFile.readlines()
    found = [l for l in lists if l.startswith(swimmer)][0] # Gets first found swimmer
    time = found.split()[-1] # Gets last item (e.g. the time) in the split list
    myFile.close()
    return time
print(Timer('jack'))
This works even if the swimmer is specified with both first and last name. I used the same way to open the file as you did. But you really should use the with-statement as in the previous answer!

python file list search (I have two matching strings but python doesn't think they are equal)

if your input is john why isn't the if statement kicking in????
studentname.txt
john 34
paul 37
poop 45
above is whats in studentname.txt
b=a
name = input('students name : ')
list1=[]
file=open('studentname.txt','r')
for (a) in file:
    list1.append(a)
    b=a[:-3]
why isn't this next if statement tripping if name entered is 'john' for instance??
    if name == b:
        print(a)
file.close

You are picking up newlines. Depending on the OS you created the file on, you'll have different newline characters. The safest way to rid yourself of this is:
a = a.rstrip()
That will take care of any trailing whitespace.
You could also do:
for a in map(lambda x: x.rstrip(), file):
Also, don't name your variable 'file'. In Python 2 that is the name of the built-in file type, and the assignment shadows it for the rest of your script.
Finally, you might prefer to handle files like this:
with open("studentname.txt", 'r') as testfile:
    for item in (line.rstrip() for line in testfile):
        print(item)
No need to close the file; the with statement closes it when the block ends.

Try this:
for a in file.readlines():
    student, _, score = a.strip().partition(' ')
    if name == student:
        print(a)
It is cleaner in that it doesn't rely on a 2-digit value and is more expressive than arbitrary indexes. It also strips carriage returns and newlines.

Alternatively, you can use a.strip()[:-3], which will trim all whitespace characters before taking the substring.

Your immediate problem is as others have mentioned that you are not aware of the \n at the end of your data. print and the repr built-in function are your friends; use them:
if name != b:
    print(repr(name), repr(b))
whereupon the cause of the problem becomes obvious.
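A short illustration of what repr reveals, using a hard-coded line in place of one read from studentname.txt (the variable names are only for this sketch):

```python
line = "john 34\n"          # a line as read from the file, newline included
b = line[:-3]               # drops "34\n" but keeps the space after the name
print(repr(b))              # 'john '
print(b == "john")          # False

name = line.split()[0]      # split() discards all surrounding whitespace
print(name == "john")       # True
```

The trailing newline shifts the slice by one character, so the comparison is made against 'john ' with a trailing space.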
Here is some (untested) code that illustrates better practice when handling simple data file formats like yours. It is intended to cope with blank/empty lines, unterminated last line, and real-life possibilities like:
Jack 9
Jill 100
Billy Bob 99
Decimus 1.23
Numberless
without crashing or running amok.
with open('studentname.txt', 'r') as f:  # universal newlines are the default in Python 3
    for line_number, line in enumerate(f, 1):
        line = line.rstrip('\n')
        fields = line.split()
        nf = len(fields)
        if nf == 0:
            continue  # blank/empty line
        if nf == 1:
            print('Only 1 field in line', line_number, repr(line))
            continue
        dataname = ' '.join(fields[:-1])
        try:
            datanumber = int(fields[-1])
        except ValueError:
            print('Invalid number', repr(fields[-1]), 'in line',
                  line_number, repr(line))
            continue
        list1.append((dataname, datanumber))
        if name == dataname:
            print(repr(dataname), datanumber)
Note file.close evaluates to a method/function object, which does nothing. You need to call it: file.close(). However now that you are using the with statement, it will look after closing the file, so just delete that file.close line.

Python- File Parsing

Write a program which reads a text file called input.txt which contains an arbitrary number of lines of the form "<city>, <country>", then records this information using a dictionary, and finally outputs to the screen a list of countries represented in the file and the number of cities contained.
For example, if input.txt contained the following:
New York, US
Angers, France
Los Angeles, US
Pau, France
Dunkerque, France
Mecca, Saudi Arabia
The program would output the following (in some order):
Saudi Arabia : 1
US : 2
France : 3
My code:
from os import dirname
def parseFile(filename, envin, envout = {}):
    exec "from sys import path" in envin
    exec "path.append(\"" + dirname(filename) + "\")" in envin
    envin.pop("path")
    lines = open(filename, 'r').read()
    exec lines in envin
    returndict = {}
    for key in envout:
        returndict[key] = envin[key]
    return returndict
I get a Syntax error: invalid syntax... when I use my file name
i used file name input.txt

I don't understand what you are trying to do, so I can't really explain how to fix it. In particular, why are you execing the lines of the file? And why write exec "foo" instead of just foo? I think you should go back to a basic Python tutorial...
Anyway, what you need to do is:
open the file using its full path
for line in file: process the line and store it in a dictionary
return the dictionary
That's it, no exec involved.
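The three steps above can be sketched as follows. The function name count_cities is hypothetical, and splitting on the last comma is an assumption about the "city, country" format shown in the question.

```python
def count_cities(path):
    """Return a dict mapping each country to the number of cities listed."""
    counts = {}
    with open(path) as f:
        for line in f:
            if "," not in line:
                continue  # skip blank or malformed lines
            # Split on the last comma and trim whitespace around both parts.
            city, country = (part.strip() for part in line.rsplit(",", 1))
            counts[country] = counts.get(country, 0) + 1
    return counts
```

Printing is left to the caller, for example by looping over count_cities("input.txt").items().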

Yup, that's a whole lot of crap you either don't need or shouldn't do. Here's how I'd do it prior to Python 2.7 (from 2.7 on, use collections.Counter as shown in the other answers). Mind you, this returns the dictionary containing the counts rather than printing it; you'd have to do that externally. I'd also prefer not to give a complete solution for homework, but it's already been done, so I suppose there's no real damage in explaining a bit about it.
def parseFile(filename):
    with open(filename, 'r') as fh:
        lines = fh.readlines()
    d={}
    for country in [line.split(',')[1].strip() for line in lines]:
        d[country] = d.get(country,0) + 1
    return d
Let's break that down a bit, shall we?
with open(filename, 'r') as fh:
    lines = fh.readlines()
This is how you'd normally open a text file for reading. It will raise an IOError exception if the file doesn't exist or you don't have permissions or the likes, so you'll want to catch that. readlines() reads the entire file and splits it into lines, each line becomes an element in a list.
d={}
This simply initializes an empty dictionary
for country in [line.split(',')[1].strip() for line in lines]:
Here is where the fun starts. The bracket-enclosed part on the right is called a list comprehension, and it basically generates a list for you. What it says, in plain English, is "for each element 'line' in the list 'lines', take that line, split it on each comma, take the second element (index 1) of the list you get from the split, strip off any whitespace from it, and use the result as an element in the new list".
Then, the left part of it just iterates over the generated list, giving the name 'country' to the current element in the scope of the loop body.
d[country] = d.get(country,0) + 1
Ok, ponder for a second what would happen if instead of the above line, we'd used the following:
d[country] = d[country] + 1
It'd crash, right (KeyError exception), because d[country] doesn't have a value the first time around.
So we use the get() method, all dictionaries have it. Here's the nifty part - get() takes an optional second argument, which is what we want to get from it if the element we're looking for doesn't exist. So instead of crashing, it returns 0, which (unlike None) we can add 1 to, and update the dictionary with the new count. Then we just return the lot of it.
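A tiny demonstration of the difference, using made-up country names:

```python
counts = {}
try:
    counts["France"] = counts["France"] + 1   # the key does not exist yet
except KeyError as exc:
    print("KeyError:", exc)                    # KeyError: 'France'

# get() supplies 0 for a missing key, so the first increment works.
for country in ["France", "US", "France"]:
    counts[country] = counts.get(country, 0) + 1
print(counts)                                  # {'France': 2, 'US': 1}
```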
Hope it helps.

I would use a defaultdict plus a list to maintain the structure of the information, so that additional statistics can be derived.
import collections
def parse_cities(filepath):
    countries_cities_map = collections.defaultdict(list)
    with open(filepath) as fd:
        for line in fd:
            # strip each part so " US" and "US" count as the same country
            values = [value.strip() for value in line.split(',')]
            if len(values) == 2:
                city, country = values
                countries_cities_map[country].append(city)
    return countries_cities_map
def format_cities_per_country(countries_cities_map):
    for country, cities in countries_cities_map.items():
        print(" {ncities} Cities found in {country} country".format(country=country, ncities=len(cities)))
if __name__ == '__main__':
    import sys
    filepath = sys.argv[1]
    format_cities_per_country(parse_cities(filepath))

import collections
def readFile(fname):
    with open(fname) as inf:
        # skip blank lines so every tuple has a city and a country
        return [tuple(s.strip() for s in line.split(",")) for line in inf if line.strip()]
def countCountries(city_list):
    return collections.Counter(country for city,country in city_list)
def main():
    cities = readFile("input.txt")
    countries = countCountries(cities)
    print("{0} cities found in {1} countries:".format(len(cities), len(countries)))
    for country, num in countries.items():
        print("{country}: {num}".format(country=country, num=num))
if __name__=="__main__":
    main()
