I am new to python programming and I'm getting runtime error with my code. Any help is appreciated.
import statistics
tc = int(input())
while tc > 0:
n = int(input())
arr = input()
l = list(map(int, arr.split(' ')))
print("{} {} {}".format(statistics.mean(l), statistics.median(l), statistics.mode(l)))
tc = tc - 1
Error
StatisticsError: no unique mode; found 2 equally common values
Input Format
First line consists of a single integer T denoting the number of test cases.
First line of each test case consists of a single integer N denoting the size of the array.
Following line consists of N space-separated integers Ai denoting the elements in the array.
Output Format
For each test case, output a single line containing three-separated integers denoting the Mean, Median and Mode of the array
Sample Input
1
5
1 1 2 3 3
Sample Output
2 2 1
You could add a variable mode surrounded by a try...except and if statistics has an error get the mode a different way.
try:
mode=statistics.mode(l)
except:
mode=max(set(l),key=l.count)
print("{} {} {}".format(statistics.mean(l), statistics.median(l), mode))
Hello Kartik Madaan,
Try this code,
Using Python 3.X
import statistics
from statistics import mean,median,mode
tc = int(input())
while tc > 0:
n = int(input())
arr = input()
l = list(map(int,arr.split()))
mod = max(set(l), key=l.count)
print(int(mean(l)),int(median(l)),mod)
tc = tc - 1
I hope my answer is helpful.
If any query so comment please.
Related
So, this is the output that I want from the input:
Input:
5
4
3
7
6
Output:
P[1]5
P[2]4
P[3]3
P[4]7
P[5]6
Lowest Number = 3
Highest Number = 7
But I got this with my code
[P1] 4
[P2] 4
[P3] 4
[P4] 4
[P5] 4
Lowest Number = 3
Highest Number = 7
This is my code:
X = []
for i in range (0,5):
X.append(int(input()))
for j in range (0,5):
print("[P"+str(j+1)+"]", i)
print("Lowest Number = ",min(X))
print("Highest Number = ",max(X))
I got a few questions:
Where do all the '4's come from?
Is there a way to remove the space between [P1] and 4 so it should be [P1]4 (from the wrong output) or [P1]5 (from the right output)?
Thank you.
ad 1. You printed variable i, which has set to 4 as the last number of the first range. I recommand to use enumerate function and iterate trought the list X, which is a better practice, because you can change a length of X.
ad 2. Use a better way to control your output (.format, f-string since Python 3.6).
Here's example about .format:
X = []
for i in range (0,5):
X.append(int(input()))
for no, number in enumerate(X, start=1):
print("[P{0}]{1}".format(no, number))
print("Highest Number = {0}".format(max(X)))
print("Lowest Number = {0}".format(min(X)))
A here's example about f-string:
X = []
for i in range (0,5):
X.append(int(input()))
for no, number in enumerate(X, start=1):
print(f"[P{no}]{number}")
print(f"Highest Number = {max(X)}")
print(f"Lowest Number = {min(X)}")
For example, here's a quick review of formatting: https://realpython.com/python-string-formatting/
Well, according to your code
X = []
for i in range (0,5):
X.append(int(input()))
for j in range (0,5):
print("[P"+str(j+1)+"]", i)#there is the problem
print("Highest Number = ",max(X))
print("Lowest Number = ",min(X))
you are calling i instead of X[j]
I am trying to solve a problem of multiplication. I know that Python supports very large numbers and it can be done but what I want to do is
Enter 2 numbers as strings.
Multiply those two numbers in the same manner as we used to do in school.
Basic idea is to convert the code given in the link below to Python code but I am not very good at C++/Java. What I want to do is to understand the code given in the link below and apply it for Python.
https://www.geeksforgeeks.org/multiply-large-numbers-represented-as-strings/
I am stuck at the addition point.
I want to do it it like in the image given below
So I have made a list which stores the values of ith digit of first number to jth digit of second. Please help me to solve the addition part.
def mul(upper_no,lower_no):
upper_len=len(upper_no)
lower_len=len(lower_no)
list_to_add=[] #saves numbers in queue to add in the end
for lower_digit in range(lower_len-1,-1,-1):
q='' #A queue to store step by step multiplication of numbers
carry=0
for upper_digit in range(upper_len-1,-1,-1):
num2=int(lower_no[lower_digit])
num1=int(upper_no[upper_digit])
print(num2,num1)
x=(num2*num1)+carry
if upper_digit==0:
q=str(x)+q
else:
if x>9:
q=str(x%10)+q
carry=x//10
else:
q=str(x%10)+q
carry=0
num=x%10
print(q)
list_to_add.append(int(''.join(q)))
print(list_to_add)
mul('234','567')
I have [1638,1404,1170] as a result for the function call mul('234','567') I am supposed to add these numbers but stuck because these numbers have to be shifted for each list. for example 1638 is supposed to be added as 16380 + 1404 with 6 aligning with 4, 3 with 0 and 8 with 4 and so on. Like:
1638
1404x
1170xx
--------
132678
--------
I think this might help. I've added a place variable to keep track of what power of 10 each intermediate value should be multiplied by, and used the itertools.accumulate function to produce the intermediate accumulated sums that doing so produces (and you want to show).
Note I have also reformatted your code so it closely follows PEP 8 - Style Guide for Python Code in an effort to make it more readable.
from itertools import accumulate
import operator
def mul(upper_no, lower_no):
upper_len = len(upper_no)
lower_len = len(lower_no)
list_to_add = [] # Saves numbers in queue to add in the end
place = 0
for lower_digit in range(lower_len-1, -1, -1):
q = '' # A queue to store step by step multiplication of numbers
carry = 0
for upper_digit in range(upper_len-1, -1, -1):
num2 = int(lower_no[lower_digit])
num1 = int(upper_no[upper_digit])
print(num2, num1)
x = (num2*num1) + carry
if upper_digit == 0:
q = str(x) + q
else:
if x>9:
q = str(x%10) + q
carry = x//10
else:
q = str(x%10) + q
carry = 0
num = x%10
print(q)
list_to_add.append(int(''.join(q)) * (10**place))
place += 1
print(list_to_add)
print(list(accumulate(list_to_add, operator.add)))
mul('234', '567')
Output:
7 4
7 3
7 2
1638
6 4
6 3
6 2
1404
5 4
5 3
5 2
1170
[1638, 14040, 117000]
[1638, 15678, 132678]
My program is meant to calculate the standard deviation for 5 values given by the users. There is an issue with my code when getting the input in a for loop. Why is that?
givenValues = []
def average(values):
for x in range(0, 6):
total = total + values[x]
if(x==5):
average = total/x
return average
def sqDiff(values):
totalSqDiff = 0
sqDiff = []
av = average(values)
for x in range(0,6):
sqDiff[x] = (values[x] - av)**2
totalSqDiff = totalSqDiff + sqDiff[x]
avSqDiff = totalSqDiff / 5
SqDiffSquared = avSqDiff**2
return SqDiffSquared
for counter in range(0,6):
givenValues[counter] = float(input("Please enter a value: "))
counter = counter + 1
sqDiffSq = sqDiff(givenValues)
print("The standard deviation for the given values is: " + sqDiffSq)
There are several errors in your code.
Which you can easily find out by reading the errormessages your code produces:
in the Function average
insert the line total = 0
you are using it before asigning it.
List appending
Do not use for example
sqDiff[x] = (values[x] - av)**2
You can do this when using dict's but not lists! Since python cannot be sure that the list indices will be continuously assigned use sqDiff.append(...) instead.
Do not concatenate strings with floats. I recommend to read the PEP 0498
(https://www.python.org/dev/peps/pep-0498/) which gives you an idea on how string could/should be formated in python
I have a big text file of 13 GB with 158,609,739 lines and I want to randomly select 155,000,000 lines.
I have tried to scramble the file and then cut the 155000000 first lines, but it's seem that my ram memory (16GB) isn't enough big to do this. The pipelines i have tried are:
shuf file | head -n 155000000
sort -R file | head -n 155000000
Now instead of selecting lines, I think is more memory efficient delete 3,609,739 random lines from the file to get a final file of 155000000 lines.
As you copy each line of the file to the output, assess its probability that it should be deleted. The first line should have a 3,609,739/158,609,739 chance of being deleted. If you generate a random number between 0 and 1 and that number is less than that ratio, don't copy it to the output. Now the odds for the second line are 3,609,738/158,609,738; if that line is not deleted, the odds for the third line are 3,609,738/158,609,737. Repeat until done.
Because the odds change with each line processed, this algorithm guarantees the exact line count. Once you've deleted 3,609,739 the odds go to zero; if at any time you would need to delete every remaining line in the file, the odds go to one.
You could always pre-generate which line numbers (a list of 3,609,739 random numbers selected without replacement) you plan on deleting, then just iterate through the file and copy to another, skipping lines as necessary. As long as you have space for a new file this would work.
You could select the random numbers with random.sample
E.g.,
random.sample(xrange(158609739), 3609739)
Proof of Mark Ransom's Answer
Let's use numbers easier to think about (at least for me!):
10 items
delete 3 of them
First time through the loop we will assume that the first three items get deleted -- here's what the probabilities look like:
first item: 3 / 10 = 30%
second item: 2 / 9 = 22%
third item: 1 / 8 = 12%
fourth item: 0 / 7 = 0 %
fifth item: 0 / 6 = 0 %
sixth item: 0 / 5 = 0 %
seventh item: 0 / 4 = 0 %
eighth item: 0 / 3 = 0 %
ninth item: 0 / 2 = 0 %
tenth item: 0 / 1 = 0 %
As you can see, once it hits zero, it stays at zero. But what if nothing is getting deleted?
first item: 3 / 10 = 30%
second item: 3 / 9 = 33%
third item: 3 / 8 = 38%
fourth item: 3 / 7 = 43%
fifth item: 3 / 6 = 50%
sixth item: 3 / 5 = 60%
seventh item: 3 / 4 = 75%
eighth item: 3 / 3 = 100%
ninth item: 2 / 2 = 100%
tenth item: 1 / 1 = 100%
So even though the probability varies per line, overall you get the results you are looking for. I went a step further and coded a test in Python for one million iterations as a final proof to myself -- remove seven items from a list of 100:
# python 3.2
from __future__ import division
from stats import mean # http://pypi.python.org/pypi/stats
import random
counts = dict()
for i in range(100):
counts[i] = 0
removed_failed = 0
for _ in range(1000000):
to_remove = 7
from_list = list(range(100))
removed = 0
while from_list:
current = from_list.pop()
probability = to_remove / (len(from_list) + 1)
if random.random() < probability:
removed += 1
to_remove -= 1
counts[current] += 1
if removed != 7:
removed_failed += 1
print(counts[0], counts[1], counts[2], '...',
counts[49], counts[50], counts[51], '...',
counts[97], counts[98], counts[99])
print("remove failed: ", removed_failed)
print("min: ", min(counts.values()))
print("max: ", max(counts.values()))
print("mean: ", mean(counts.values()))
and here's the results from one of the several times I ran it (they were all similar):
70125 69667 70081 ... 70038 70085 70121 ... 70047 70040 70170
remove failed: 0
min: 69332
max: 70599
mean: 70000.0
A final note: Python's random.random() is [0.0, 1.0) (doesn't include 1.0 as a possibility).
I believe you're looking for "Algorithm S" from section 3.4.2 of Knuth (D. E. Knuth, The Art of Computer Programming. Volume 2: Seminumerical Algorithms, second edition. Addison-Wesley, 1981).
You can see several implementations at http://rosettacode.org/wiki/Knuth%27s_algorithm_S
The Perlmonks list has some Perl implementations of Algorithm S and Algorithm R that might also prove useful.
These algorithms rely on there being a meaningful interpretation of floating point numbers like 3609739/158609739, 3609738/158609738, etc. which might not have sufficient resolution with a standard Float datatype, unless the Float datatype is implemented using numbers of double precision or larger.
Here's a possible solution using Python:
import random
skipping = random.sample(range(158609739), 3609739)
input = open(input)
output = open(output, 'w')
for i, line in enumerate(input):
if i in skipping:
continue
output.write(line)
input.close()
output.close()
Here's another using Mark's method:
import random
lines_in_file = 158609739
lines_left_in_file = lines_in_file
lines_to_delete = lines_in_file - 155000000
input = open(input)
output = open(output, 'w')
try:
for line in input:
current_probability = lines_to_delete / lines_left_in_file
lines_left_in_file -= 1
if random.random < current_probability:
lines_to_delete -= 1
continue
output.write(line)
except ZeroDivisionError:
print("More than %d lines in the file" % lines_in_file)
finally:
input.close()
output.close()
I wrote this code before seeing that Darren Yin has expressed its principle.
I've modified my code to take the use of name skipping (I didn't dare to choose kangaroo ...) and of keyword continue from Ethan Furman whose code's principle is the same too.
I defined default arguments for the parameters of the function in order that the function can be used several times without having to make re-assignement at each call.
import random
import os.path
def spurt(ff,skipping):
for i,line in enumerate(ff):
if i in skipping:
print 'line %d excluded : %r' % (i,line)
continue
yield line
def randomly_reduce_file(filepath,nk = None,
d = {0:'st',1:'nd',2:'rd',3:'th'},spurt = spurt,
sample = random.sample,splitext = os.path.splitext):
# count of the lines of the original file
with open(filepath) as f: nl = sum(1 for _ in f)
# asking for the number of lines to keep, if not given as argument
if nk is None:
nk = int(raw_input(' The file has %d lines.'
' How many of them do you '
'want to randomly keep ? : ' % nl))
# transfer of the lines to keep,
# from one file to another file with different name
if nk<nl:
with open(filepath,'rb') as f,\
open('COPY'.join(splitext(filepath)),'wb') as g:
g.writelines( spurt(f,sample(xrange(0,nl),nl-nk) ) )
# sample(xrange(0,nl),nl-nk) is the list
# of the counting numbers of the lines to be excluded
else:
print ' %d is %s than the number of lines (%d) in the file\n'\
' no operation has been performed'\
% (nk,'the same' if nk==nl else 'greater',nl)
With the $RANDOM variable you can get a random number between 0 and 32,767.
With this, you could read in each line, and see if $RANDOM is less than 155,000,000 / 158,609,739 * 32,767 (which is 32,021), and if so, let the line through.
Of course, this wouldn't give you exactly 150,000,000 lines, but pretty close to it depending on the normality of the random number generator.
EDIT: Here is some code to get you started:
#!/bin/bash
while read line; do
if (( $RANDOM < 32021 ))
then
echo $line
fi
done
Call it like so:
thatScript.sh <inFile.txt >outFile.txt
I have the following file I'm trying to manipulate.
1 2 -3 5 10 8.2
5 8 5 4 0 6
4 3 2 3 -2 15
-3 4 0 2 4 2.33
2 1 1 1 2.5 0
0 2 6 0 8 5
The file just contains numbers.
I'm trying to write a program to subtract the rows from each other and print the results to a file. My program is below and, dtest.txt is the name of the input file. The name of the program is make_distance.py.
from math import *
posnfile = open("dtest.txt","r")
posn = posnfile.readlines()
posnfile.close()
for i in range (len(posn)-1):
for j in range (0,1):
if (j == 0):
Xp = float(posn[i].split()[0])
Yp = float(posn[i].split()[1])
Zp = float(posn[i].split()[2])
Xc = float(posn[i+1].split()[0])
Yc = float(posn[i+1].split()[1])
Zc = float(posn[i+1].split()[2])
else:
Xp = float(posn[i].split()[3*j+1])
Yp = float(posn[i].split()[3*j+2])
Zp = float(posn[i].split()[3*j+3])
Xc = float(posn[i+1].split()[3*j+1])
Yc = float(posn[i+1].split()[3*j+2])
Zc = float(posn[i+1].split()[3*j+3])
Px = fabs(Xc-Xp)
Py = fabs(Yc-Yp)
Pz = fabs(Zc-Zp)
print Px,Py,Pz
The program is calculating the values correctly but, when I try to call the program to write the output file,
mpipython make_distance.py > distance.dat
The output file (distance.dat) only contains 3 columns when it should contain 6. How do I tell the program to shift what columns to print to for each step j=0,1,....
For j = 0, the program should output to the first 3 columns, for j = 1 the program should output to the second 3 columns (3,4,5) and so on and so forth.
Finally the len function gives the number of rows in the input file but, what function gives the number of columns in the file?
Thanks.
Append a , to the end of your print statement and it will not print a newline, and then when you exit the for loop add an additional print to move to the next row:
for j in range (0,1):
...
print Px,Py,Pz,
print
Assuming all rows have the same number of columns, you can get the number of columns by using len(row.split()).
Also, you can definitely shorten your code quite a bit, I'm not sure what the purpose of j is, but the following should be equivalent to what you're doing now:
for j in range (0,1):
Xp, Yp, Zp = map(float, posn[i].split()[3*j:3*j+3])
Xc, Yc, Zc = map(float, posn[i+1].split()[3*j:3*j+3])
...
You don't need to:
use numpy
read the whole file in at once
know how many columns
use awkward comma at end of print statement
use list subscripting
use math.fabs()
explicitly close your file
Try this (untested):
with open("dtest.txt", "r") as posnfile:
previous = None
for line in posnfile:
current = [float(x) for x in line.split()]
if previous:
delta = [abs(c - p) for c, p in zip(current, previous)]
print ' '.join(str(d) for d in delta)
previous = current
just in case your dtest.txt grows larger and you don't want to redirect your output but rather write to distance.dat, especially, if you want to use numpy. Thank #John for pointing out my mistake in the old code ;-)
import numpy as np
pos = np.genfromtxt("dtest.txt")
dis = np.array([np.abs(pos[j+1] - pos[j]) for j in xrange(len(pos)-1)])
np.savetxt("distance.dat",dis)