Storing contents of a text file in 2D array [duplicate] - python

This question already has answers here:
Python read file into 2d list
(9 answers)
Closed 1 year ago.
I have a text file as follows:
id1, 0, 0, 0
id2, 0, 0, 0
id3, 0, 0, 0
I want to store this data in an array, but I also need each value to be an individual element in the corresponding line's array.
So the output should look like this:
[["id1", 0, 0, 0], ["id2", 0, 0, 0], ["id3", 0, 0, 0]...]
data = "data.txt"
dataArray = []
file = open(data, 'r')
contents = file.read()
file.close()
lines = contents.split("\n")
for line in lines:
    dataArray.append(line)
I couldn't find a way to split the string and store the data after this part.

Assuming every element of a line is separated by a comma followed by a single space, you can just change your for loop to:
for line in lines:
    dataArray.append(line.split(", "))
Here the line is further split on a comma followed by a space. split() itself returns a list.
If that is not the case (that is, if the spacing is not uniform), you can combine split(), strip() and a list comprehension like this:
for line in lines:
    dataArray.append([e.strip() for e in line.split(',')])
Here strip() is required because each element returned by split(',') may have whitespace before or after it.
If you need the numbers converted to int, modify it further to:
for line in lines:
    dataArray.append([int(e) if e.strip().isnumeric() else e.strip() for e in line.split(',')])
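One caveat with isnumeric(): it returns False for negative numbers such as -1, so those would stay strings. Below is a minimal sketch of an alternative, assuming that every field int() accepts should become an integer. The helper names to_value and parse_lines are made up for illustration and do not come from the answer above.

```python
def to_value(text):
    """Return int(text) when possible, otherwise the stripped string."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        return text


def parse_lines(lines):
    """Split each non-empty line on commas and convert every field."""
    return [[to_value(field) for field in line.split(",")]
            for line in lines if line.strip()]


# Sample lines in the question's format, plus a negative number and a blank line.
sample = ["id1, 0, 0, 0", "id2, -1, 5, 0", ""]
data_array = parse_lines(sample)
print(data_array)
```

Skipping blank lines also avoids the empty trailing entry that contents.split("\n") produces when the file ends with a newline.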

Your data is in csv format which is supported by python:
(sos_config) ~/wk/cliosoft/projects/sos_config $ ipython
Python 3.9.4 (default, Apr 9 2021, 01:03:21)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.23.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import csv
In [2]: !cat /tmp/data.csv
id1, 0, 0, 0
id2, 0, 0, 0
id3, 0, 0, 0
In [3]: with open('/tmp/data.csv') as f:
   ...:     reader = csv.reader(f)
   ...:     data_array = [row for row in reader]
   ...:
In [4]: data_array
Out[4]:
[['id1', ' 0', ' 0', ' 0'],
['id2', ' 0', ' 0', ' 0'],
['id3', ' 0', ' 0', ' 0']]
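The output above still carries the space after each comma, and every field is a string. As a rough sketch, csv.reader accepts skipinitialspace=True to drop that space, and the numeric columns can then be converted. It assumes the first column is always an id and the remaining columns are always integers; io.StringIO only stands in for the opened file.

```python
import csv
import io

# io.StringIO stands in for open('/tmp/data.csv') so the sketch is self-contained.
sample = io.StringIO("id1, 0, 0, 0\nid2, 0, 0, 0\nid3, 0, 0, 0\n")

# skipinitialspace=True ignores the whitespace that follows each delimiter.
reader = csv.reader(sample, skipinitialspace=True)

# Keep the first column as a string and convert the rest to int.
data_array = [[row[0]] + [int(x) for x in row[1:]] for row in reader]
print(data_array)
```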

You can try this; it handles integers and strings separately and also takes care of the newline characters:
with open(data, 'r') as f:
    content = f.read().splitlines()
result = [
    [
        int(k) if k.strip().isnumeric() else k
        for k in item.strip().split(',')
    ]
    for item in content
]


Removing line break and writing lists without square brackets and comas to a text file in python

I'm facing a few issues with regard to writing some arguments to a text file. Below are the outputs I need to see in my text file.
1. I want to write an output like this to the text file.
Input:
Hello
World
Output:
HelloWorld
2. I want to write an output like this into a text file.
Input:
[1, 2, 3, 4, 5]
Output:
1,2,3,4,5
I tried several ways to do this but couldn't find a proper way.
Hope to seek some help.
The Code:
progressList = [120, 0, 0] #A variable which wont change. (ie this variable only has this value))
resultList = ['Progress', 'Trailer'] #Each '' represents one user input
#loop for progress
with open("data.txt", "a") as f: # Used append as per my requirement
    i = 0 #iterator
    while i < len(resultList):
        # f.write(resultList)
        if resultList[i] == "Progress":
            j = 0
            f.write("Progress - ")
            for j in range(3):
                while j < 2:
                    f.write(', ', join(progressList[j]))
                    break
                if j == 2:
                    f.write(progressList[j], end='')
                    break
Output (textfile):
Progress - 120, 0, 0
Thanks.
1st case would be something like this
>>> s = '''hello
... world'''
>>> ''.join(s.split())
'helloworld'
>>>
2nd one is funny
>>> s = "[1, 2, 3, 4, 5]"
>>> exec ('a = ' + s)
>>> ','.join([str(i) for i in a])
'1,2,3,4,5'
hope it helps
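For comparison, here is a sketch of both cases that uses ast.literal_eval in place of exec and writes the results out. It assumes the input words arrive as a list of strings and the numbers arrive as the text of a list. io.StringIO stands in for the file opened with open("data.txt", "a") so the example has no side effects, and the variable names are illustrative only.

```python
import ast
import io

words = ["Hello", "World"]
list_text = "[1, 2, 3, 4, 5]"   # a list stored as text
progress_list = [120, 0, 0]

# 1. Join without any separator to drop the line break.
joined_words = "".join(words)

# 2. Parse the text into a real list, then join the items with commas.
numbers = ast.literal_eval(list_text)
joined_numbers = ",".join(str(n) for n in numbers)

# The "Progress - 120, 0, 0" line from the question.
progress_line = "Progress - " + ", ".join(str(n) for n in progress_list)

# A real file object from open("data.txt", "a") has the same write() method.
out = io.StringIO()
out.write(joined_words + "\n")
out.write(joined_numbers + "\n")
out.write(progress_line + "\n")
print(out.getvalue(), end="")
```

str.join only accepts strings, which is why each number goes through str() first.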

Read List in List [duplicate]

This question already has answers here:
How to convert string representation of list to a list
(19 answers)
Closed 5 months ago.
I have a text file with 3 lines of data in it.
[1, 2, 1, 1, 3, 1, 1, 2, 1, 3, 1, 1, 1, 3, 3]
[1, 1, 3, 3, 3, 1, 1, 1, 1, 2, 1, 1, 1, 3, 3]
[1, 2, 3, 1, 3, 1, 1, 3, 1, 3, 1, 1, 1, 3, 3]
I try to open it and read the data:
with open("rafine.txt") as f:
    l = [line.strip() for line in f.readlines()]
f.close()
Now I have a list in a list.
If I write print(l[0]), it shows [1, 2, 1, 1, 3, 1, 1, 2, 1, 3, 1, 1, 1, 3, 3]
But I want to get the numbers in it.
So when I write print(l[0][0])
I want to see 1, but it shows [
How can I fix this?
You can use literal_eval to parse the lines from the file & build the matrix:
from ast import literal_eval
with open("test.txt") as f:
    matrix = []
    for line in f:
        row = literal_eval(line)
        matrix.append(row)
print(matrix[0][0])
print(matrix[1][4])
print(matrix[2][8])
result:
1
3
1
import json
with open("rafine.txt") as f:
    for line in f.readlines():
        line = json.loads(line)
        print(line)
The best approach depends on what assumption you make about the data in your text file:
ast.literal_eval
If the data in your file is formatted the same way it would be inside Python source code, the best approach is to use literal_eval:
from ast import literal_eval
data = [] # will contain list of lists
with open(filename) as f:
    for line in f:
        row = literal_eval(line)
        data.append(row)
or, the short version:
with open(filename) as f:
    data = [literal_eval(line) for line in f]
re.findall
If you can make few assumptions about the data, using regular expressions to find all digits might be a way forward. The below builds lists by simply extracting any digits in the text file, regardless of separators or other characters in the file:
import re
data = [] # will contain list of lists
with open(filename) as f:
    for line in f:
        row = [int(i) for i in re.findall(r'\d+', line)]
        data.append(row)
or, in short:
with open(filename) as f:
    data = [[int(i) for i in re.findall(r'\d+', line)] for line in f]
handwritten parsing
If neither option is suitable, you can always parse by hand, tailored to the exact format:
data = [] # will contain list of lists
with open(filename) as f:
    for line in f:
        row = [int(i) for i in line.strip()[1:-1].split(", ")]
        data.append(row)
strip() removes the trailing newline, then [1:-1] removes the first and last characters (the brackets), and split(", ") splits the rest into a list. for i in ... iterates over the items in this list (assigning each item to i) and int(i) converts i to an integer.

Python - convert edge list to adjacency matrix

I have data in the following format:
user,item,rating
1,1,3
1,2,2
2,1,2
2,4,1
and so on
I want to convert this in matrix form
So, the out put is like this
Item--> 1,2,3,4....
user
1 3,2,0,0....
2 2,0,0,1
....and so on..
How do I do this in python?
THanks
data = [
    (1,1,3),
    (1,2,2),
    (2,1,2),
    (2,4,1),
]
#import csv
#with open('data.csv') as f:
#    next(f) # Skip header
#    data = [tuple(map(int, row)) for row in csv.reader(f)]
n = max(max(user, item) for user, item, rating in data) # Get size of matrix
matrix = [[0] * n for _ in range(n)] # n x n list of lists filled with zeros
for user, item, rating in data:
    matrix[user-1][item-1] = rating # Convert to 0-based index.
for row in matrix:
    print(row)
prints
[3, 2, 0, 0]
[2, 0, 0, 1]
[0, 0, 0, 0]
[0, 0, 0, 0]
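The answer above builds a square n x n matrix, which leaves empty rows when there are fewer users than items. Here is a sketch of a rectangular variant (one row per user, one column per item), assuming user and item ids are positive integers starting at 1. The function name edges_to_matrix is made up for illustration.

```python
def edges_to_matrix(edges):
    """Build a users x items rating matrix from (user, item, rating) triples."""
    n_users = max(user for user, _, _ in edges)
    n_items = max(item for _, item, _ in edges)
    # A fresh inner list per row, so the rows do not share storage.
    matrix = [[0] * n_items for _ in range(n_users)]
    for user, item, rating in edges:
        matrix[user - 1][item - 1] = rating  # ids are 1-based, indexes 0-based
    return matrix


data = [(1, 1, 3), (1, 2, 2), (2, 1, 2), (2, 4, 1)]
matrix = edges_to_matrix(data)
for row in matrix:
    print(row)
```

For the sample data this yields two rows of four columns, matching the layout sketched in the question.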
A different approach from @falsetru's.
Do you read from a file and write to a file?
Maybe work with a dictionary (infile and outfile are assumed to be open file objects):
from collections import defaultdict
valdict = defaultdict(int)
nuser = 0
nitem = 0
next(infile) # skip the "user,item,rating" header line
for line in infile:
    user, item, rating = (int(x) for x in line.strip().split(","))
    valdict[user, item] = rating
    nuser = max(nuser, user)
    nitem = max(nitem, item)
# header row: one column per item
towrite = ",".join(str(j) for j in range(1, nitem + 1)) + "\n"
for i in range(1, nuser + 1):
    towrite += str(i)
    for j in range(1, nitem + 1):
        towrite += "," + str(valdict[i, j])
    towrite += "\n"
outfile.write(towrite)

how to configure variable for read_in txt file like '0x12c'

In my c1.txt file, i have something like
var0 = '0x000000000000004080'
And in my Python main file:
def getVarFromFile(filename):
    import imp
    f = open(filename)
    global data
    data = imp.load_source('data', '', f)
    f.close()
getVarFromFile('c1.txt')
var0 is a 72-bit variable, and in my Python file I want to split it into 12 variables of 6 bits each. How can I do that?
Since var0 is a hex string, it seems I can't do
x = int(data.var0) & 0x3F
Thanks
I think all you need to do in your last line is provide a base for the integer conversion:
values_from_hex = int(data.var0, 16)
Then you can divide the value up into your 6-bit values (least significant bits first):
six_bit_values = [values_from_hex >> i*6 & 0x3f for i in range(12)]
For a value of '0x000000000000004080', this gets [0, 2, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0].
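A self-contained sketch of the two steps above, assuming the value is available as a hex string. The function name split_six_bit and its count parameter are hypothetical and only serve the illustration.

```python
def split_six_bit(hex_text, count=12):
    """Split a hex string into `count` 6-bit fields, least significant first."""
    # With base 16, int() accepts an optional "0x" prefix.
    value = int(hex_text, 16)
    # Shift field i down to the low bits, then mask off everything above bit 5.
    return [(value >> (i * 6)) & 0x3F for i in range(count)]


fields = split_six_bit('0x000000000000004080')
print(fields)
```

The explicit parentheses are not strictly required, since >> binds more tightly than &, but they make the order of operations easier to read.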

Python - Parsing Columns and Rows

I am running into some trouble with parsing the contents of a text file into a 2D array/list. I cannot use built-in libraries, so have taken a different approach. This is what my text file looks like, followed by my code
1,0,4,3,6,7,4,8,3,2,1,0
2,3,6,3,2,1,7,4,3,1,1,0
5,2,1,3,4,6,4,8,9,5,2,1
def twoDArray():
    network = [[]]
    filename = open('twoDArray.txt', 'r')
    for line in filename.readlines():
        col = line.split(line, ',')
        row = line.split(',')
    network.append(col,row)
    print "Network = "
    print network
if __name__ == "__main__":
    twoDArray()
I ran this code but got this error:
Traceback (most recent call last):
File "2dArray.py", line 22, in <module>
twoDArray()
File "2dArray.py", line 8, in twoDArray
col = line.split(line, ',')
TypeError: an integer is required
I am using the comma to separate both row and column as I am not sure how I would differentiate between the two - I am confused about why it is telling me that an integer is required when the file is made up of integers
Well, I can explain the error. You're using str.split() and its usage pattern is:
str.split(separator, maxsplit)
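You're calling it as line.split(string, separator), which isn't a valid call: the second argument, ',', lands in the maxsplit position, and maxsplit must be an integer, hence the error. Here is a direct link to the Python docs for this: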
http://docs.python.org/library/stdtypes.html#str.split
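To see the difference between the two argument positions, here is a small sketch run on one sample line. The variable names are illustrative, and the exact wording of the TypeError message depends on the Python version, so it is printed and not quoted here.

```python
line = "1,0,4,3\n"

# sep is the delimiter; without maxsplit every comma produces a split.
all_fields = line.strip().split(",")

# maxsplit=1 stops after the first split and keeps the rest intact.
first_and_rest = line.strip().split(",", 1)

# Passing a string where maxsplit belongs reproduces the question's error.
try:
    line.split(line, ",")
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

print(all_fields)
print(first_and_rest)
```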
To directly answer your question, there is a problem with the following line:
col = line.split(line, ',')
If you check the documentation for str.split, you'll find the description to be as follows:
str.split([sep[, maxsplit]])
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most
maxsplit splits are done (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified, then there is no limit on the number of splits (all possible splits are made).
This is not what you want. You are not trying to specify the number of splits you want to make.
Consider replacing your for loop and network.append with this:
for line in filename.readlines():
    # line is a string representing the values for this row
    row = line.split(',')
    # row is the list of number strings for this row, such as ['1', '0', '4', ...]
    cols = [int(x) for x in row]
    # cols is the list of numbers for this row, such as [1, 0, 4, ...]
    network.append(cols)
    # Put this row into network, such that network is [[1, 0, 4, ...], [...], ...]
"""I cannot use built-in libraries""" -- do you really mean "cannot" as in you have tried to use the csv module and failed? If so, say so. Do you mean that "may not" as in you are forbidden to use a built-in module by the terms of your homework assignment? If so, say so.
Here is an answer that works. It doesn't leave a newline attached to the end of the last item in each row. It converts the numbers to int so that you can use them for whatever purpose you have. It fixes other errors that nobody else has mentioned.
def twoDArray():
    network = []
    # filename = open('twoDArray.txt', 'r')
    # "filename" is a very weird name for a file HANDLE
    f = open('twoDArray.txt', 'r')
    # for line in filename.readlines():
    # readlines reads the whole file into memory at once.
    # That is quite unnecessary.
    for line in f: # just iterate over the file handle
        line = line.rstrip('\n') # remove the newline, if any
        # col = line.split(line, ',')
        # wrong args, as others have said.
        # In any case, only 1 split call is necessary
        row = line.split(',')
        # now convert string to integer
        irow = [int(item) for item in row]
        # network.append(col,row)
        # list.append expects only ONE arg
        # indentation was wrong; you need to do this once per line
        network.append(irow)
    print("Network = ")
    print(network)
if __name__ == "__main__":
    twoDArray()
Splitting each line and converting its items is all you need:
network = []
filename = open('twoDArray.txt', 'r')
for line in filename.readlines():
    network.append([int(x) for x in line.split(',')])
This gives you
[
[1,0,4,3,6,7,4,8,3,2,1,0],
[2,3,6,3,2,1,7,4,3,1,1,0],
[5,2,1,3,4,6,4,8,9,5,2,1]
]
Or do you need some other structure as output? Please add what you need as output.
class TwoDArray(object):
    @classmethod
    def fromFile(cls, fname, *args, **kwargs):
        splitOn = kwargs.pop('splitOn', None)
        mode = kwargs.pop('mode', 'r')
        with open(fname, mode) as inf:
            return cls([line.strip('\r\n').split(splitOn) for line in inf], *args, **kwargs)
    def __init__(self, data=[[]], *args, **kwargs):
        dataType = kwargs.pop('dataType', lambda x:x)
        super(TwoDArray,self).__init__()
        self.data = [[dataType(i) for i in line] for line in data]
    def __str__(self, fmt=str, endrow='\n', endcol='\t'):
        return endrow.join(
            endcol.join(fmt(i) for i in row) for row in self.data
        )
def main():
    network = TwoDArray.fromFile('twoDArray.txt', splitOn=',', dataType=int)
    print("Network =")
    print(network)
if __name__ == "__main__":
    main()
The input format is simple, so the solution should be simple too:
network = [list(map(int, line.split(','))) for line in open(filename)]
print(network)
The csv module doesn't provide an advantage in this case:
import csv
print([list(map(int, row)) for row in csv.reader(open(filename, newline=''))])
If you need float instead of int:
print(list(csv.reader(open(filename, newline=''), quoting=csv.QUOTE_NONNUMERIC)))
If you are working with numpy arrays:
import numpy
print(numpy.loadtxt(filename, dtype='i', delimiter=','))
See Why NumPy instead of Python lists?
All examples produce arrays equal to:
[[1 0 4 3 6 7 4 8 3 2 1 0]
[2 3 6 3 2 1 7 4 3 1 1 0]
[5 2 1 3 4 6 4 8 9 5 2 1]]
Read the data from the file. Here's one way:
f = open('twoDArray.txt', 'r')
buffer = f.read()
f.close()
Parse the data into a table:
table = [list(map(int, row.split(','))) for row in buffer.strip().split("\n")]
>>> print(table)
[[1, 0, 4, 3, 6, 7, 4, 8, 3, 2, 1, 0], [2, 3, 6, 3, 2, 1, 7, 4, 3, 1, 1, 0], [5, 2, 1, 3, 4, 6, 4, 8, 9, 5, 2, 1]]
Perhaps you want the transpose instead:
transpose = list(zip(*table))
>>> print(transpose)
[(1, 2, 5), (0, 3, 2), (4, 6, 1), (3, 3, 3), (6, 2, 4), (7, 1, 6), (4, 7, 4), (8, 4, 8), (3, 3, 9), (2, 1, 5), (1, 1, 2), (0, 0, 1)]
