converting a list of strings into a tuple - python

Hello, I am trying to build a dictionary in Python.
Python needs to read a text file that contains tab-separated values such as:
good buono
What I have done so far is open the file with the file function, replace the tabs, and add inverted commas to create a list, so it looks like
["('good', 'buono')", "('afternoon', 'pomeriggo')",... and so on
The problem is that each translation pair is a string, not a tuple. When I look at the first element (index 0), it shows the value as
"('good', 'buono')"
which is a string. I need to call dict() to convert the list into a dictionary, but I cannot because it is a list of strings (it has to be a list of tuples).
So how can I convert that list of strings into a list of tuples?

Assuming that every pair of words is on a separate line in your file, you could do this:
mydict = {line.split()[0]: line.split()[1] for line in myfile}
This transforms a file like
good buono
afternoon pomeriggo
thanks grazie
into
{"good": "buono", "afternoon": "pomeriggo", "thanks": "grazie"}
No need for tuples along the way.
.split() splits the input string on whitespace, removing any leading/trailing whitespace:
>>> " good\tbuono \n".split()
['good', 'buono']
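A slightly more defensive variant of the same idea, as a rough sketch. The helper name load_pairs and the in-memory sample are assumptions made for illustration; the only change in behaviour is that blank or malformed lines are skipped instead of raising an IndexError.

```python
import io


def load_pairs(lines):
    """Build a dict from lines of the form 'word translation'."""
    pairs = {}
    for line in lines:
        parts = line.split()  # splits on any whitespace, including tabs
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        pairs[key] = value
    return pairs


# io.StringIO stands in for an open text file here (hypothetical sample data)
sample = io.StringIO("good\tbuono\n\nafternoon pomeriggo\nthanks grazie\n")
mydict = load_pairs(sample)
print(mydict)
```

With a real file you would pass the open file object instead of the StringIO sample.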

with open('input.txt') as f:
    result = map(str.split, f)
    # -> pairs such as ['good', 'buono'], ['afternoon', 'pomeriggo']
    d = dict(result)
    # -> {'good': 'buono', 'afternoon': 'pomeriggo'}

ast will work as @onemach suggested, but you may want to read the strings in as tuples instead.
Snippet:
li = []
with open("file.txt", 'r') as f:
    for line in f:
        li.append(tuple(line.rstrip('\n').split('\t')))
This splits each line on the tab and places the parts into a tuple, which is then appended to a list.
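For the list of strings exactly as shown in the question, the ast route mentioned above could look roughly like this. The sample list is copied from the question; the variable names are only illustrative. ast.literal_eval parses Python literals only, so it is a safer choice than eval for this kind of input.

```python
import ast

# The list of strings from the question
strings = ["('good', 'buono')", "('afternoon', 'pomeriggo')"]

# literal_eval turns each string into the tuple it spells out
tuples = [ast.literal_eval(s) for s in strings]

# A list of 2-tuples can be passed straight to dict()
translations = dict(tuples)
print(tuples)
print(translations)
```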

Related

Converting comma separated .txt file into dictionary

I have a comma delimited .txt file and I need to convert the list into key:value pairs in python 3:
Below is the .txt file:
1988,0.4891
1989,0.2830
1990,0.4312
1991,0.1251
1992,0.0181
1993,0.6182
1994,0.1587
1995,0.1409
1996,0.1505
1997,0.0994
1998,0.1631
1999,0.0330
2000,0.0523
2001,-0.0798
2002,0.1107
2003,0.2308
2004,0.0484
2005,0.0114
2006,0.1088
2007,0.0228
2008,0.1538
2009,0.0038
2010,0.1085
2011,-0.0631
2012,-0.1581
2013,0.2538
2014,0.1377
2015,0.0199
2016,-0.0392
2017,0.0433
2018,-0.0154
Here is my python code:
import csv
answer = {}
with open("annual_chesapeakeCapital_diversifiedProgramLV.txt") as infile:
    keys = infile.readline().split(",")
    values = infile.readline().split("\n")
    answer = dict(zip(keys, values))
print(answer)
What am I doing wrong?
You can use csv.reader to read the rows into a sequence of two-item lists, so that you can pass it to the dict constructor to build the dict you're looking for:
with open("annual_chesapeakeCapital_diversifiedProgramLV.txt") as infile:
    answer = dict(csv.reader(infile))
To specifically answer your question, what you are doing wrong is that readline() only reads one line - you need to iterate reading all the lines and splitting each of them, and using the result of each split to create entries in the dictionary.
BTW dictionaries can by definition only have one entry for each key, so have you planned what will happen if two lines have the same value before the comma?
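To make the duplicate-key point concrete, here is a small sketch under some assumptions: the sample rows (including the repeated year) are made up, and converting the year to int and the value to float is my own choice, not something the question asked for. The later row wins, and the clash is recorded so you can decide what to do about it.

```python
import csv
import io

# Hypothetical sample with a repeated year; io.StringIO stands in for the file
sample = io.StringIO("1988,0.4891\n1989,0.2830\n1988,0.1000\n")

answer = {}
duplicates = []
for year, value in csv.reader(sample):
    key = int(year)
    if key in answer:
        duplicates.append(key)  # the later value overwrites, but note the clash
    answer[key] = float(value)

print(answer)
print(duplicates)
```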
With pandas it's easy
import pandas as pd
df = pd.read_csv("annual_chesapeakeCapital_diversifiedProgramLV.txt", header=None)
answer = dict(zip(df[0].values, df[1].values))

Getting specific element(s) from each line after reading from a file

After I read from file:
with open(fileName) as f:
    for line in f:
        print(line.split(","))  # split each line into a list
How do I get some specific element(s) from those lists?
For example, only elements with index[0 to 3], but discard/ignore any elements after that.
If you want to save the first three items in each line, you could use a list comprehension
with open(fileName) as f:
    firstitems = [line.rstrip().split(",")[0:3] for line in f]
Note that rstrip() removes the trailing newline, which would otherwise stay attached to the last item when a line has three or fewer items. Note also that the "items" are all strings, even if they look like other types. If you want integers, for example, you will need to convert them to integers.
Then you can print them:
for line in firstitems:
    print(line)
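If the items are meant to be integers, the conversion can go inside the same comprehension. This is only a sketch: the sample data is invented, and it assumes every kept field really is an integer (int() raises ValueError otherwise).

```python
import io

# Hypothetical sample; io.StringIO stands in for the open file
sample = io.StringIO("1,2,3,4,5\n6,7\n8,9,10,11\n")

first_items = [
    [int(item) for item in line.rstrip().split(",")[0:3]]
    for line in sample
]
print(first_items)
```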
Try the code below (note that it takes the first three comma-separated fields of the whole file, not of each line):
with open('f.txt') as f:
    print('\n'.join(f.read().split(',')[0:3]))

Removing newline characters in a txt files

I'm doing Euler Problems and am at problem #8 and wanted to just copy this huge 1000-digit number to a numberToProblem8.txt file and then just read it into my script but I can't find a good way to remove newlines from it. With that code:
hugeNumberAsStr = ''
with open('numberToProblem8.txt') as f:
    for line in f:
        aSingleLine = line.strip()
        hugeNumberAsStr.join(aSingleLine)
print(hugeNumberAsStr)
I'm using print() only to check whether it works and, well, it doesn't. It doesn't print anything. What's wrong with my code? I remove all the trash with strip() and then use join() to add the cleaned line to hugeNumberAsStr (I need a string to join those lines; I'm going to use int() later on), and this is repeated for all the lines.
Here is the .txt file with a number in it.
What about something like:
hugeNumberAsStr = open('numberToProblem8.txt').read()
hugeNumberAsStr = hugeNumberAsStr.strip().replace('\n', '')
Or even:
hugeNumberAsStr = ''.join([d for d in hugeNumberAsStr if d.isdigit()])
I was able to simplify it to the following to get the number from that file:
>>> int(open('numberToProblem8.txt').read().replace('\n',''))
731671765313306249192251196744265747423553491949349698352031277450632623957831801698480186947885184385861560789112949495459501737958331952853208805511125406987471585238630507156932909632952274430435576689664895044524452316173185640309871112172238311362229893423380308135336276614282806444486645238749303589072962904915604407723907138105158593079608667017242712188399879790879227492190169972088809377665727333001053367881220235421809751254540594752243525849077116705560136048395864467063244157221553975369781797784617406495514929086256932197846862248283972241375657056057490261407972968652414535100474821663704844031998900088952434506585412275886668811642717147992444292823086346567481391912316282458617866458359124566529476545682848912883142607690042242190226710556263211111093705442175069416589604080719840385096245544
You need to do hugeNumberAsStr += aSingleLine instead of hugeNumberAsStr.join(..)
str.join() joins the items of the iterable passed to it, using the string it is called on as the separator, and returns the result. It doesn't update hugeNumberAsStr in place, as you seem to expect. You want to build a new string without the \n characters, so you need to append each cleaned line to the string.
The join method for strings simply takes an iterable object and concatenates each part together. It then returns the resulting concatenated string. As stated in help(str.join):
join(...)
    S.join(iterable) -> str
    Return a string which is the concatenation of the strings in the
    iterable. The separator between elements is S.
Thus the join method really does not do what you want.
The concatenation line should be more like:
hugeNumberAsStr += aSingleLine
Or even:
hugeNumberAsStr += line.strip()
which gets rid of the extra line of code doing the strip.
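For comparison, join() does fit this job when it is called on the separator (here the empty string) and given all the cleaned lines at once. A minimal sketch, with a short made-up sample standing in for the 1000-digit file:

```python
import io

# Hypothetical three-line sample; io.StringIO stands in for the open file
sample = io.StringIO("73167\n17653\n13306\n")

# '' is the separator, the generator yields each line without its newline
huge_number_as_str = "".join(line.strip() for line in sample)
huge_number = int(huge_number_as_str)
print(huge_number)
```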

Converting file data to nested list

I have a string of numbers that I would like to convert to a nested list. So far, I have
with open('lifedata.txt') as f:
    table_data = [line.split() for line in f]
print(table_data)
If the text document consists of numbers ordered like this,
0000000
0010000
0001000
0111000
0000000
0000000
The code I have so far only creates a nested list that looks like, [['0000000'], ['0010000'], ['0001000'], ['0111000'], ['0000000'], ['0000000']]
But instead, I want it to be [[0,0,0,0,0,0,0],[],[]] and so on. I also do not know how to convert the strings into integers. I'm just very confused about how I should manipulate the original text document to do what I want.
This is what is happening:
>>> "0000000".split()
['0000000']
Instead, call int() on every character in each string:
[[int(c) for c in line.strip()] for line in f]
Or, via map():
[list(map(int, line.strip())) for line in f]
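Put together with the file handling, that could look like the sketch below. The sample grid is shortened and contains a blank line on purpose; skipping blank lines is my own assumption, added so that a stray empty line does not produce an empty row.

```python
import io

# Hypothetical sample; io.StringIO stands in for open('lifedata.txt')
sample = io.StringIO("0000000\n0010000\n\n0111000\n")

table_data = [
    [int(c) for c in line.strip()]  # one int per character
    for line in sample
    if line.strip()  # ignore blank lines
]
print(table_data)
```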
Use this; it will return a list containing the text of each line of the file (as strings, with their trailing newlines):
open('lifedata.txt').readlines()

Writing from one file to another

I've been stuck on this Python homework problem for awhile now: "Write a complete python program that reads 20 real numbers from a file inner.txt and outputs them in sorted order to a file outter.txt."
Alright, so what I do is:
f=open('inner.txt','r')
n=f.readlines()
n.replace('\n',' ')
n.sort()
x=open('outter.txt','w')
x.write(print(n))
So my thought process is: open the text file, n is the list of lines read from it, I replace all the newline characters so it can be properly sorted, then I open the text file I want to write to and print the list to it. The first problem is that it won't let me replace the newline characters, and the second problem is that I can't write a list to a file.
I just tried this:
>>> x= "34\n"
>>> print(int(x))
34
So, you shouldn't have to filter out the "\n" like that, but can just put it into int() to convert it into an integer. This is assuming you have one number per line and they're all integers.
You then need to store each value into a list. A list has a .sort() method you can use to then sort the list.
EDIT:
Forgot to mention: as others have already said, you need to iterate over the values in n, as it's a list, not a single item.
Here's a step by step solution that fixes the issues you have :)
Opening the file, nothing wrong here.
f=open('inner.txt','r')
n is now a list of each line:
n=f.readlines()
Don't forget to close the file once you have read from it:
f.close()
There is no list.replace method, so I suggest changing the readlines() line to n = f.read(). Then, this will work (don't forget to reassign n, as strings are immutable):
n = n.replace('\n','')
You still only have a string full of numbers. However, instead of replacing the newline character, I suggest splitting the string on whitespace, which also avoids an empty string after a trailing newline:
n = n.split()
Then, convert these strings to numbers (use float() instead of int() if the values are real numbers):
n = [int(x) for x in n]
Now, these two will work:
n.sort()
x=open('outter.txt','w')
You want to write the numbers themselves, so use this:
x.write('\n'.join(str(i) for i in n))
Finally, close the file:
x.close()
Using a context manager (the with statement) is good practice as well, when handling files:
with open('inner.txt', 'r') as f:
    n = f.read()  # do stuff with f
# automatically closed at the end
I guess real means float. So you have to convert your results to float to sort properly.
raw_lines = f.readlines()
floats = map(float, raw_lines)
Then you have to sort it. To write the result back, you have to convert to strings and join with line endings:
sorted_floats = sorted(floats)
sorted_as_string = map(str, sorted_floats)
result = '\n'.join(sorted_as_string)
Finally you have to write the result to the destination.
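The float-based steps can be strung together as in the sketch below. The function name sort_numbers and the sample values are assumptions for illustration, and a temporary directory is used only so the example runs without touching real files; the file names inner.txt and outter.txt come from the question.

```python
import os
import tempfile


def sort_numbers(src_path, dst_path):
    """Read one number per line, sort numerically, write one per line."""
    with open(src_path) as src:
        numbers = [float(line) for line in src if line.strip()]
    numbers.sort()
    with open(dst_path, "w") as dst:
        dst.write("\n".join(str(x) for x in numbers) + "\n")
    return numbers


with tempfile.TemporaryDirectory() as tmp:
    inner = os.path.join(tmp, "inner.txt")
    outter = os.path.join(tmp, "outter.txt")
    with open(inner, "w") as f:
        f.write("3.5\n-1.25\n10\n0.5\n")  # hypothetical input
    sorted_numbers = sort_numbers(inner, outter)
    with open(outter) as f:
        written = f.read()

print(written)
```

Sorting the floats rather than the raw lines matters: as strings, "10" would sort before "3.5".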
OK, let's look step by step at what you want to do.
First: Read some integers out of a textfile.
Pythonic Version:
fileNumbers = [int(line) for line in open(r'inner.txt', 'r').readlines()]
Easy to get version:
fileNumbers = list()
with open(r'inner.txt', 'r') as fh:
    for singleLine in fh.readlines():
        fileNumbers.append(int(singleLine))
What it does:
Open the file
Read each line, convert it to int (because readlines returns string values) and append it to the list fileNumbers
Second: Sort the list
fileNumbers.sort()
What it does:
The sort method sorts the list by its values, e.g. [5,3,2,4,1] -> [1,2,3,4,5]
Third: Write it to a new textfile
with open(r'outter.txt', 'w') as fh:
    for entry in fileNumbers:
        fh.write('{0}\n'.format(entry))
