Python split on the partition list - python

I am currently working on a project that gets all partitions on a disk with Ubuntu 20.
def get_partitions():
    """
    This function returns a list of partition objects.
    """
    partitions = []
    for line in open('/proc/partitions'):
        if line.startswith('major'):
            continue
        fields = line.split()
        partitions.append(partition(
            int(fields[0]),
            int(fields[1]),
            int(fields[3]),
            fields[5]
        ))
    return partitions
But I get this error:
Traceback (most recent call last):
  File "/home/mathieu-s/Documents/opt/repo/dosm/disk/disk_scanner.py", line 69, in <module>
    print(get_partitions())
  File "/home/mathieu-s/Documents/opt/repo/dosm/disk/disk_scanner.py", line 62, in get_partitions
    int(fields[0]),
IndexError: list index out of range
Can someone help me?

Error explanations
Main error
When you read partition data from /proc/partitions on Ubuntu 20.04, you get output roughly like this:
major minor #blocks name

7 0 5956 loop0
7 1 4 loop1
7 2 9240 loop2
7 3 9244 loop3
7 4 151112 loop4
7 5 135924 loop5
7 6 283688 loop6
7 7 63580 loop7
259 0 500107608 nvme0n1
259 1 834560 nvme0n1p1
259 2 8388608 nvme0n1p2
259 3 490883072 nvme0n1p3
7 8 101824 loop8
Notice that the second line is empty, but your code doesn't handle that case: splitting an empty line yields an empty list, so fields[0] raises IndexError.
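For reference, a quick check of why the blank line breaks the original code. This is plain str.split behavior, nothing project-specific:

```python
# str.split() with no arguments splits on runs of any whitespace and
# drops empty strings, so a blank or whitespace-only line yields an
# empty list, and indexing fields[0] on it raises IndexError.
print(''.split())                                # []
print('\n'.split())                              # []
print('   7        0       5956 loop0'.split())  # ['7', '0', '5956', 'loop0']
```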
Second error
A data row looks like this:
major minor #blocks name
7 0 5956 loop0
When you split a data row, you get only four fields: major, minor, #blocks, and name. Your code converts fields[3] to int, but fields[3] is the partition name, and converting a name to int will fail. And fields[5] does not exist at all in Ubuntu 20.04's /proc/partitions, so accessing it raises another IndexError.
Possible solution
Fix the first error: the empty line
To solve the main error, you can simply extend your if condition like this:
if line.startswith('major') or line.startswith('\n'):
Fix the second error: field numbers
To solve the second error, change the indices in the append call:
partitions.append(partition(
    int(fields[0]),
    int(fields[1]),
    int(fields[2]),
    fields[3]
))
Full code of a possible solution:
def get_partitions():
    """
    This function returns a list of partitions on the disk.
    """
    partitions = []
    for line in open('/proc/partitions'):
        if line.startswith('major') or line.startswith('\n'):
            continue
        fields = line.split()
        partitions.append(partition(
            int(fields[0]),
            int(fields[1]),
            int(fields[2]),
            fields[3]
        ))
    return partitions
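Putting it together, here is a runnable sketch of the fixed parser. Partition is a hypothetical namedtuple standing in for the asker's partition class, and the sample text imitates /proc/partitions so the sketch runs anywhere:

```python
from collections import namedtuple

# Hypothetical stand-in for the asker's `partition` class.
Partition = namedtuple('Partition', ['major', 'minor', 'blocks', 'name'])

def parse_partitions(lines):
    """Parse /proc/partitions-style lines, skipping the header and blanks."""
    partitions = []
    for line in lines:
        fields = line.split()
        # A blank line splits to [], and the header row starts with 'major',
        # so one check covers both non-data cases.
        if not fields or fields[0] == 'major':
            continue
        partitions.append(Partition(int(fields[0]), int(fields[1]),
                                    int(fields[2]), fields[3]))
    return partitions

sample = """major minor  #blocks  name

   7        0       5956 loop0
 259        0  500107608 nvme0n1
""".splitlines()

print(parse_partitions(sample))
```

Checking `not fields` instead of startswith('\n') is slightly more robust, since it also skips lines containing only spaces or tabs.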

Related

list index out of range reading files

I am a beginner in Python and need some help. I have run into a problem when trying to manipulate some dat-files.
I have created 159 .dat files (refitted_to_digit0_0.dat, refitted_to_digit0_1.dat, ... refitted_to_digit0_158.dat) containing time-series data in two columns (timestep, value) of 2999 rows. In my Python program I have created a list of these files with filelist_refit_0 = glob.glob('refitted_to_digit0_*')
plist_refit_0=[]
I now try to load the second column of each of the 159 files into plist_refit_0, so that each place in the list contains an array of the 2999 values (second column) that I will use for further manipulation. I have created a for-loop for this and use len(filelist_refit_0) as the range, the length being 159 (files 0-158).
However, when I run this I get an error message: list index out of range.
I have tried a lower range for the for-loop, and it seems to work up to range 66 but not above that. filelist_refit_0[66] refers to refitted_to_digit0_158.dat, filelist_refit_0[67] refers to refitted_to_digit0_16.dat, and filelist_refit_0[158] refers to refitted_to_digit0_99.dat. Instead of being sorted in ascending numeric order 0->158, filelist_refit_0 seems to hold the files in lexicographic order: refitted_to_digit0_0.dat first, then refitted_to_digit0_1.dat, then refitted_to_digit0_10.dat, then refitted_to_digit0_100.dat, then refitted_to_digit0_101.dat, so that refitted_to_digit0_158.dat ends up at position 66. However, I still don't understand why the interpreter reports the index as out of range above 66 when len(filelist_refit_0) is 159 and there really are 159 files, whatever their order. If anyone can explain this and has advice on how to solve the problem, I'd highly appreciate it! Thanks for your help.
I have tried the following to understand the sorting:
print len(filelist_refit_0) => 159
print filelist_refit_0[66] => refitted_to_digit0_158.dat
print filelist_refit_0[67] => refitted_to_digit0_16.dat
print filelist_refit_0[158] => refitted_to_digit0_99.dat
print filelist_refit_0[0] => refitted_to_digit0_0.dat
I have "manually" loaded the files and it seems to work for most indices, e.g.
t, p = loadtxt(filelist_refit_0[65], usecols=(0, 1), unpack=True)
plist_refit_0.append(p)
t, p = loadtxt(filelist_refit_0[67], usecols=(0, 1), unpack=True)
plist_refit_0.append(p)
print plist_refit_0[0]
print plist_refit_0[1]
BUT it does not work for index 66:
t, p = loadtxt(filelist_refit_0[66], usecols=(0, 1), unpack=True)
plist_refit_0.append(p)
Then I get error: list index out of range.
As can be seen above, it refers to refitted_to_digit0_158.dat, which is the last file. I have looked into the file and it looks exactly the same as all the other files, with the same number of columns and row elements (2999). Why is this entry different?
Python 2:
filelist_refit_0 = glob.glob('refitted_to_digit0_*')
plist_refit_0 = []
for i in range(len(filelist_refit_0)):
    t, p = loadtxt(filelist_refit_0[i], usecols=(0, 1), unpack=True)
    plist_refit_0.append(p)
Traceback (most recent call last):
  File "test.py", line 107, in <module>
    t,p=loadtxt(filelist_refit_0[i],usecols=(0,1),unpack=True)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/npyio.py", line 1092, in loadtxt
    for x in read_data(_loadtxt_chunksize):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/npyio.py", line 1012, in read_data
    vals = [vals[j] for j in usecols]
IndexError: list index out of range
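No answer is shown for this related question, but the ordering the asker describes is glob's lexicographic order; a numeric sort key restores 0..158 order. This is a sketch that assumes the fixed filename pattern from the question; note that reordering alone would likely not fix the loadtxt IndexError itself, which points at a row with too few columns in whichever file it hits:

```python
import re

# Hypothetical filename list standing in for glob.glob's unsorted output.
files = ['refitted_to_digit0_{}.dat'.format(i) for i in (0, 1, 10, 100, 158, 2)]

def numeric_key(name):
    # Extract the trailing integer from names like 'refitted_to_digit0_158.dat'.
    match = re.search(r'_(\d+)\.dat$', name)
    return int(match.group(1)) if match else -1

files_sorted = sorted(files, key=numeric_key)
print(files_sorted)  # ascending by the trailing number: 0, 1, 2, 10, 100, 158
```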

TypeError: unorderable types: Tree() < int() (This runs fine in Python 2)

I ran this Huffman coding code in Python 2 and it works smoothly. However, in Python 3, it gives me the error above. I know that the types are not the same (and hence incomparable), but how should I fix this?
Note that the error specifically points at q.put((kiri[0]+kanan[0],node)); I believe the issue lies in the comparison made in the priority queue.
Example of input that causes the error:
3
1
2
3
The first line refers to the number of characters. The next lines show the frequency of the first character, second character and so on.
Note that the code somehow runs if the first line is less than 3. For example:
2
1
2
works just fine
Any help is greatly appreciated. Thank you!
n = list(map(int, input().split()))
n = n[0]
li = [None]*n
for i in range(n):
    inp = list(map(int, input().split()))
    li[i] = inp[0]
char = [None]*n
index = 1
for i in range(n):
    char[i] = index
    index += 1
freq = list(zip(li, char))
import queue
class Tree:
    def __init__(self, kanan, kiri):
        self.kanan = kanan
        self.kiri = kiri
    def anak(self):
        return int((self.kanan, self.kiri))
q = queue.PriorityQueue()
for nilai in freq:
    q.put(nilai)
size = q.qsize()
for i in range(size, 1, -1):
    kanan = q.get()
    kiri = q.get()
    node = Tree(kanan, kiri)
    q.put((kiri[0]+kanan[0], node))
huffmantree = q.get()
def traverse(huffmantree, st, pref):
    if isinstance(huffmantree[1].kanan[1], Tree):
        traverse(huffmantree[1].kanan, st, pref+"0")
    else:
        st[huffmantree[1].kanan[1]] = pref+"0"
    if isinstance(huffmantree[1].kiri[1], Tree):
        traverse(huffmantree[1].kiri, st, pref+"1")
    else:
        st[huffmantree[1].kiri[1]] = pref+"1"
    return st
binarystring = traverse(huffmantree, {}, "")
for i in freq:
    print(binarystring[i[1]])
The basic problem is that you're pushing different types into your tree nodes. The second element of your ordered pair (2-tuple) is an integer as long as you have a simple tree. When you get to a child (grandchild) node, however, you try to use a Tree object as the second element.
Sorting into the tree requires a well-defined ordering. A Tree and an int have no such ordering.
Very simply, you must decide what the other element should be, and keep that consistent in your data handling. Here's some basic debugging instrumentation:
for nilai in freq:
    print("nilai insert", nilai)
    q.put(nilai)
size = q.qsize()
for i in range(size, 1, -1):
    kanan = q.get()
    kiri = q.get()
    node = Tree(kanan, kiri)
    new_val = kiri[0] + kanan[0]
    print("1st", new_val)
    print("2nd", node)
    q.put((new_val, node))
Execution for your 3-1-2-3 case:
$ python3 so.py
3
1
2
3
nilai insert (1, 1)
nilai insert (2, 2)
nilai insert (3, 3)
1st 3
2nd <__main__.Tree object at 0x7efd60c9a160>
Traceback (most recent call last):
  File "so.py", line 40, in <module>
    q.put((new_val,node))
  File "/usr/lib64/python3.4/queue.py", line 146, in put
    self._put(item)
  File "/usr/lib64/python3.4/queue.py", line 230, in _put
    heappush(self.queue, item)
TypeError: unorderable types: Tree() < int()
The critical difference is between the first and second insertions:
q.put(nilai) # Simple pair, such as (1, 1)
and
q.put((new_val,node))
where the second element is a Tree, which cannot be ordered against the plain integers already in the queue.

Python nested loop - table index as variable

I am not a Python programmer, but I need to use some methods from the SciPy library. I just want to repeat the inner loop several times, but with a changing table index. Here is my code so far:
from scipy.stats import pearsonr

fileName = open('ILPDataset.txt', 'r')
attributeValue, classValue = [], []
for index in range(0, 10, 1):
    for line in fileName.readlines():
        data = line.split(',')
        attributeValue.append(float(data[index]))
        classValue.append(float(data[10]))
    print(index)
    print(pearsonr(attributeValue, classValue))
And I am getting the following output:
0
(-0.13735062681256097, 0.0008840631556260505)
1
(-0.13735062681256097, 0.0008840631556260505)
2
(-0.13735062681256097, 0.0008840631556260505)
3
(-0.13735062681256097, 0.0008840631556260505)
4
(-0.13735062681256097, 0.0008840631556260505)
5
(-0.13735062681256097, 0.0008840631556260505)
6
(-0.13735062681256097, 0.0008840631556260505)
7
(-0.13735062681256097, 0.0008840631556260505)
8
(-0.13735062681256097, 0.0008840631556260505)
9
(-0.13735062681256097, 0.0008840631556260505)
As you can see, the index is changing, but the result of the function is always as if the index were 0.
When I run the script several times, changing the index value by hand like this:
attributeValue.append(float(data[0]))
attributeValue.append(float(data[1]))
...
attributeValue.append(float(data[9]))
everything is OK and I get correct results, but I can't do it in one loop statement. What am I doing wrong?
EDIT:
Test file:
62,1,6.8,3,542,116,66,6.4,3.1,0.9,1
40,1,1.9,1,231,16,55,4.3,1.6,0.6,1
63,1,0.9,0.2,194,52,45,6,3.9,1.85,2
34,1,4.1,2,289,875,731,5,2.7,1.1,1
34,1,4.1,2,289,875,731,5,2.7,1.1,1
34,1,6.2,3,240,1680,850,7.2,4,1.2,1
20,1,1.1,0.5,128,20,30,3.9,1.9,0.95,2
84,0,0.7,0.2,188,13,21,6,3.2,1.1,2
57,1,4,1.9,190,45,111,5.2,1.5,0.4,1
52,1,0.9,0.2,156,35,44,4.9,2.9,1.4,1
57,1,1,0.3,187,19,23,5.2,2.9,1.2,2
38,0,2.6,1.2,410,59,57,5.6,3,0.8,2
38,0,2.6,1.2,410,59,57,5.6,3,0.8,2
30,1,1.3,0.4,482,102,80,6.9,3.3,0.9,1
17,0,0.7,0.2,145,18,36,7.2,3.9,1.18,2
46,0,14.2,7.8,374,38,77,4.3,2,0.8,1
Expected results of pearsonr for 9 script runs:
data[0] (0.06050513030608389, 0.8238536636813034)
data[1] (-0.49265895172303803, 0.052525691067199995)
data[2] (-0.5073312383613632, 0.0448647312201305)
data[3] (-0.4852842899321005, 0.056723468068371544)
data[4] (-0.2919584357031029, 0.27254138535817224)
data[5] (-0.41640591455640696, 0.10863082761524119)
data[6] (-0.46954072465442487, 0.0665061785375443)
data[7] (0.08874739193909209, 0.7437895010751641)
data[8] (0.3104260624799073, 0.24193152445774302)
data[9] (0.2943030868699842, 0.26853066217221616)
Turn each line of the file into a list of floats
data = []
with open('ILPDataset.txt') as fileName:
    for line in fileName:
        line = line.strip()
        line = line.split(',')
        line = [float(item) for item in line[:11]]
        data.append(line)
Transpose the data so that each list in data has the column values from the original file. data --> [[column 0 items], [column 1 items],[column 2 items],...]
data = zip(*data)          # for Python 2.x
# data = list(zip(*data))  # for Python 3.x
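As a quick illustration of the transpose step on a tiny 2x3 matrix (nothing dataset-specific):

```python
# zip(*rows) pairs up the nth element of every row, i.e. it turns
# rows into columns.
data = [[1, 2, 3],
        [4, 5, 6]]
cols = list(zip(*data))
print(cols)  # [(1, 4), (2, 5), (3, 6)]
```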
Correlate:
for n in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]:
    corr = pearsonr(data[n], data[10])
    print('data[{}], {}'.format(n, corr))
@wwii's answer is very good. Just one suggestion: list(zip(*data)) seems a bit of overkill to me. zip is really for composing lists of variable types and potentially variable lengths into tuples, which in this case then have to be turned back into a list with list().
So why not just use a simple transpose operation, which is what this really is?
import numpy

# ...
data = numpy.transpose(data)
which does the same job, probably faster (not measured) and more deterministically.
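Neither answer spells out why the original loop printed identical results: fileName.readlines() consumes the file, so every outer iteration after the first reads nothing, and pearsonr keeps seeing only the data accumulated for index 0. A minimal sketch of that behavior, using an in-memory file so it runs anywhere:

```python
import io

f = io.StringIO('1,2\n3,4\n')
print(len(f.readlines()))  # 2 -- the first pass reads every line
print(len(f.readlines()))  # 0 -- the file object is now exhausted
f.seek(0)                  # rewinding the file restores the lines
print(len(f.readlines()))  # 2
```

So the original loop could also be repaired by calling fileName.seek(0) at the top of each outer iteration, though reading the file once, as in the answer above, is cheaper.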

Why is this Python code producing a Runtime Error on Ideone?

import sys

def func():
    T = int(next(sys.stdin))
    for i in range(0, T):
        N = int(next(sys.stdin))
        print(N)

func()
Here I read the input T for the for loop; iterating over T, it gives a runtime error (time: 0.1, memory: 10088, signal: -1) again and again. I have tried using sys.stdin.readline(); it gives the same error.
I looked at your code at http://ideone.com/8U5zTQ. The code itself looks fine, but your input can't be processed.
Because it is:
5 24 2
which will be the string:
"5 24 2"
This is not an int, even if you try to cast it. So you could transform it into a list with:
inputlist = next(sys.stdin).strip().split(" ")
to get the integers you put on one line into a list, then loop over that.
After that, the code would still be stuck in the loop because it wants two more integers, but at least you get some output.
Since I am not totally sure what you are trying to achieve, you could now iterate over that list and print your inputs:
inputlist = next(sys.stdin).strip().split(" ")
for i in inputlist:
    print(i)
Another solution would be to just put one number per line; that would work as well.
so instead of
5 24 2
you put in
5
24
2
Further advice
On Ideone you also have an error traceback at the bottom of the page:
Traceback (most recent call last):
  File "./prog.py", line 8, in <module>
  File "./prog.py", line 3, in func
ValueError: invalid literal for int() with base 10: '1 5 24 2\n'
which shows that it can't handle your input.
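If the goal is the original program (read T, then T numbers), an approach that handles both layouts (everything on one line, or one number per line) is to tokenize all of standard input at once. A sketch; the helper name and the stream parameter are mine, added so it is easy to test without real stdin:

```python
import io

def read_cases(stream):
    # read() grabs everything; split() with no arguments splits on any
    # whitespace, so newlines and spaces are treated the same.
    tokens = stream.read().split()
    t = int(tokens[0])                     # first token: number of cases
    return [int(tok) for tok in tokens[1:1 + t]]

# Works whether the numbers share a line or not:
print(read_cases(io.StringIO('3 5 24 2')))      # [5, 24, 2]
print(read_cases(io.StringIO('3\n5\n24\n2\n'))) # [5, 24, 2]
```

In a real submission you would call read_cases(sys.stdin).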

reading an array with missing data and spaces in the first column

I have a .txt file I want to read using Python. The file is an array containing data on comets, about 3000 rows in total; I copied 3 of the rows below.
P/2011 U1 PANSTARRS 1.54 0.5 14.21 145.294 352.628 6098.07
P/2011 VJ5 Lemmon 4.12 0.5 2.45 139.978 315.127 5904.20 *
149P/Mueller 4 3.67 0.1 5.32 85.280 27.963 6064.72
I am reading the array using the following code:
import numpy as np
list_comet = np.genfromtxt('jfc_master.txt', dtype=None)
I am facing 2 different problems:
First, in row 1 the name of the comet is P/2011 U1 PANSTARRS. If I type
list_comet[0][1], the result will be P/2011. How should I tell Python how to read the name of each comet? Note that the longest name is 31 characters. So what is the command to tell Python that column 1 is 31 characters long?
Second, in row 2 the value of the last column is *. When I read the file I get an error which says:
Line #2941 (got 41 columns instead of 40)
(Note that the above data is not the complete data; the total number of columns in my original data is 38.) I guess I am getting this error due to the * found in certain rows. How can I fix this problem?
You didn't mention what data structure you're looking for, i.e. what operations you intend to perform on the parsed data. In the simplest case, you could massage the file into a list of 8-tuples, the last element being either '*' or an empty string. That is as simple as:
def tokenize(s):
    if s[-1] == '*':
        return s.rsplit(None, 7)
    else:
        return s.rsplit(None, 6) + ['']
tokens = (tokenize(line.rstrip()) for line in open('so21712204.txt'))
To be fair, this doesn't make tokens a list of 8-tuples but rather a generator (which is more space-efficient) of lists, each of which has 8 elements.
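Applied to the three sample rows from the question, the tokenizer yields 8 fields per row, keeping the multi-word names intact. A self-contained sketch with the rows inlined instead of read from a file:

```python
def tokenize(s):
    # Starred rows already carry 8 whitespace-separated fields; plain rows
    # get an empty flag appended so every result has the same shape.
    if s[-1] == '*':
        return s.rsplit(None, 7)
    return s.rsplit(None, 6) + ['']

rows = [
    'P/2011 U1 PANSTARRS 1.54 0.5 14.21 145.294 352.628 6098.07',
    'P/2011 VJ5 Lemmon 4.12 0.5 2.45 139.978 315.127 5904.20 *',
    '149P/Mueller 4 3.67 0.1 5.32 85.280 27.963 6064.72',
]
tokens = [tokenize(r) for r in rows]
print(tokens[0][0])   # 'P/2011 U1 PANSTARRS' -- multi-word name survives
print(tokens[1][-1])  # '*'
```

Splitting from the right is the trick: everything left of the six (or seven) rightmost fields is treated as the name, however many spaces it contains.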
