I have a .csv file and I need to extract some information from it. The file contains two variables, "data" and "notes", and I need the values assigned to them.
When I open the .csv file, it shows these lines:
data =
[0,1,2,3,4,5,3,2,3,4,5,]
notes = [{"text": "Hello", "position":(2,3)}, {"text": "Bye", "position":(4,5)}]
To open the file I use:
import csv
class A():
    def __init__(self):
        # Some stuff in here
        pass
    def get_data(self):
        file = open(self.file_name, "r")
        data = csv.reader(file, delimiter="\t")
        rows = [row for row in data]
Now, to read the information in data, I just write:
for line in row[1][0]:
    try:
        value_list = int(line)
        print value_list
    except ValueError:
        pass
With this I can build another list from these values and print it. Now I need to read the data from "notes", which, as you can see, is a list with dictionaries as elements. I need to read the "position" element inside each dictionary and print it.
This is the code that I have:
for notes in row[3][0]:
    if notes["position"]:
        print notes["position"]
But this gives me the following error:
TypeError: string indices must be integers, not str
How can I access these elements of each dictionary and print them? I hope you can help me.
This is the .csv file from where I am trying to get the information.
You can change the last part of your code to:
for note in eval(rows[3][0].strip("notes = ")):
    if note["position"]:
        print note["position"]
If you need the position to be an actual tuple instead of a string, you can change the last line to:
print tuple(note["position"])
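Note that eval will execute whatever the file contains, and str.strip("notes = ") removes any of those characters from both ends rather than the literal prefix. A minimal sketch of a safer variant, assuming every line has the form name = <Python literal> as in the question, could use ast.literal_eval instead. It is written in Python 3 syntax, and parse_assignment is a hypothetical helper name, not something from the original code:

```python
import ast

def parse_assignment(line):
    """Split 'name = <literal>' and safely evaluate the literal part."""
    # partition splits on the first "=" only, i.e. the assignment itself
    name, _, literal = line.partition("=")
    return name.strip(), ast.literal_eval(literal.strip())

line = 'notes = [{"text": "Hello", "position":(2,3)}, {"text": "Bye", "position":(4,5)}]'
name, notes = parse_assignment(line)
for note in notes:
    print(note["position"])
```

With this approach each "position" value comes back as a tuple, so it can be indexed or unpacked directly.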
I have a script that appends to a list from a text file. I then use ''.join(mylist) to convert to type str so I can query a DynamoDB table for the said str. This seems to work until I query the table. I notice I am getting empty responses. After printing out each str, I notice they are being returned vertically. How can I format the string properly so my calls to DynamoDB are successful?
import boto3
from boto3.dynamodb.conditions import Key, Attr
dynamo = boto3.resource('dynamodb')
table = dynamo.Table('mytable')
s3.Bucket('instances').download_file('MissingInstances.txt')
with open('MissingInstances.txt', 'r') as f:
    for line in f:
        missing_instances = []
        missing_instances.append(line)
    unscanned = ''.join(missing_instances)
    for i in unscanned:
        print(i)
        response = table.query(KeyConditionExpression=Key('EC2').eq(i))
        items = response['Items']
        print(items)
Contents of MissingInstances.txt:
i-xxxxxx
i-yyyyyy
i-zzzzzz
etc etc
Output of print(i):
i
-
x
x
x
x
x
i
-
y
y
y
y
y
etc etc
Output of print(items):
[]
[]
[]
etc etc
Desired output:
i-xxxxxx
i-yyyyyy
etc etc
Your problem isn't actually with the print function, but with how you are iterating your for loops. I've annotated your code below, added a tip to save you some time, and included some code to get you over this hurdle. Here is a resource for for loops, and here is another resource for using lists.
Here is your code, with annotations of what's happening:
# Import libraries, prepare the data
import boto3
from boto3.dynamodb.conditions import Key, Attr
dynamo = boto3.resource('dynamodb')
table = dynamo.Table('mytable')
s3.Bucket('instances').download_file('MissingInstances.txt')
# Open the text file, which has one instance name and a newline character per line
with open('MissingInstances.txt', 'r') as f:
    # For each line in the text file
    for line in f:
        # (For each line) Create an empty list called missing_instances
        missing_instances = []
        # Append this line to the empty list
        missing_instances.append(line)
    # Concatenate all the current values of the list into one string (no separator)
    # (There is only one value because you have been overwriting the list every loop)
    unscanned = ''.join(missing_instances)
At this point in the code, you have looped through and written over missing_instances every iteration of your loop, so you are left with only the last instance.
# This should print the whole list of missing_instances
>>> print(*missing_instances)
i-cccccc
# This should print the whole unscanned string
>>> print(unscanned)
i-cccccc
Next, you loop through unscanned:
# For each letter in the string unscanned
for i in unscanned:
    # Print the letter
    print(i)
    # Query using the letter (The rest of this won't work for obvious reasons)
    response = table.query(KeyConditionExpression=Key('EC2').eq(i))
    items = response['Items']
    print(items)
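The difference between looping over a joined string and looping over a list can be seen in a small sketch. The two instance IDs below are hypothetical placeholders standing in for the lines of the text file; no DynamoDB call is made:

```python
# Hypothetical lines, as they would be read from the text file
lines = ["i-xxxxxx\n", "i-yyyyyy\n"]

joined = "".join(lines)
chars = [c for c in joined]               # iterating a str yields one character at a time
ids = [line.rstrip() for line in lines]   # iterating a list yields whole elements

print(chars[:3])
print(ids)
```

The first print shows single characters, which is what ends up being sent to the query, while the second shows the complete IDs with the trailing newlines removed.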
You don't need to join the list to convert to string
"I have a script that appends to a list from a text file. I then use ''.join(mylist) to convert to type str so I can query a DynamoDB table for the said str"
For example:
If you have this list:
missing_instances = ['i-xxxxxx','i-yyyyyy','i-zzzzzz']
You can see its data type is list:
>>> print(type(missing_instances))
<class 'list'>
But if you are looking at an element of that list (eg. the first element), the element's data type is str:
>>> print(type(missing_instances[0]))
<class 'str'>
This code loops through the text file and queries each line to the database:
# Import libraries, prepare the data
import boto3
from boto3.dynamodb.conditions import Key, Attr
dynamo = boto3.resource('dynamodb')
table = dynamo.Table('mytable')
s3.Bucket('instances').download_file('MissingInstances.txt')
# Open the text file
with open('MissingInstances.txt', 'r') as f:
    # Create a new list
    missing_instances = []
    # Loop through lines in the text file
    for line in f:
        # Append each line to the missing_instances list, removing the newlines
        missing_instances.append(line.rstrip())
# CHECKS
# Print the whole list of missing_instances, each element on a new line
print(*missing_instances, sep='\n')
# Print the data type of missing_instances
print(type(missing_instances))
# Print the data type of the first element of missing_instances
print(type(missing_instances[0]))
# Loop through the list missing_instances
# For each string element of missing_instances
for i in missing_instances:
    # Print the element
    print(i)
    # Query the element
    response = table.query(KeyConditionExpression=Key('EC2').eq(i))
    # Save the response
    items = response['Items']
    # Print the response
    print(items)
# The with block has already closed the file, so this call is redundant but harmless
f.close()
Try stripping off the newline characters before appending the lines to the list.
For example:
missing_instances.append(line.rstrip())
Print automatically introduces a new line on each call. It does not work like Java's System.out#print(String). For example, when I run this, I get this:
for c in 'adf':
    print(c)
a
d
f
This is because in Python, strings are iterable: looping over a string yields one character at a time.
I'm not sure what your code is in fact trying to do. I'm not familiar with this Boto3 library. But let's say the part i-xxxxx is decomposed to i and xxxxx, which I term id and other_stuff. Then,
for the_id in ids:
    print(f'{the_id}-{ids}')
I am trying to copy values separated by ":" from text files.
I have 50+ text files that contain data in this form:
Type: Assume
Number: 123456
Name: Assume
Phone Number: 000-000
Email Address: any#gmail.com
Mailing Address: Assume
I am trying to get the data values from multiple text files into a csv in this format:
Type    Number  Name    Phone    email          Mailing Address
Assume  123456  Assume  000-000  any#gmail.com  Assume
Here is the code:
import re
import csv
file_h = open("out.csv","a")
csv_writer = csv.writer(file_h)
def writeHeading(file_content):
    list_of_headings = []
    for row in file_content:
        key = str(row.split(":")[0]).strip()
        list_of_headings.append(key)
    csv_writer.writerow(tuple(list_of_headings))
def writeContents(file_content):
    list_of_data = ['Number']
    for row in file_content:
        value = str(row.split(":")[1]).strip()
        list_of_data.append(value)
    csv_writer.writerow(tuple(list_of_data))
def convert_txt_csv(filename):
    file_content = open(filename,"r").readlines()
    return file_content
list_of_files = ["10002.txt","10003.txt","10004.txt"]
# for writing heading once
file_content = convert_txt_csv(list_of_files[0])
writeHeading(file_content)
# for writing contents
for file in list_of_files:
    file_content = convert_txt_csv(file)
    writeContents(file_content)
file_h.close()
Here is the error:
Traceback (most recent call last):
  File "Magnet.py", line 37, in <module>
    writeContents(file_content)
  File "Magnet.py", line 20, in writeContents
    value = str(row.split(":")[1]).strip()
IndexError: list index out of range
Your code probably encounters a blank line at the end of the first file, or any line that doesn't have a : in it, so when you try to split it into key/values it complains as it didn't get a list of expected length. You can fix that easily by checking if there is a colon on the current line, i.e.:
for row in file_content:
    if ":" not in row:  # or you can do the split and check len() of the result
        continue
    key = row.split(":")[0].strip()
    list_of_headings.append(key)
But while the task you're attempting looks extremely simple, keep in mind that your approach assumes that all the files are alike, with an equal number of key: value pairs in the same order.
You'd be much better off by storing your parsed data in a dict and then using csv.DictWriter() to do your bidding.
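A minimal sketch of that idea follows, assuming every record line has the form "Key: value". The helper names parse_record and write_records and the sample lines are hypothetical, and the demo writes to an in-memory buffer so it runs without any files:

```python
import csv
import io

def parse_record(lines):
    """Turn 'Key: value' lines into a dict, skipping lines without a colon."""
    record = {}
    for line in lines:
        if ":" not in line:
            continue
        key, _, value = line.partition(":")  # split on the first colon only
        record[key.strip()] = value.strip()
    return record

def write_records(records, out_file):
    """Write dicts as CSV rows; the header is the union of keys in first-seen order."""
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval fills in columns that a given record does not have
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return fieldnames

# Hypothetical sample, including a blank line like the one that caused the IndexError
sample = ["Type: Assume\n", "Number: 123456\n", "\n", "Phone Number: 000-000\n"]
buffer = io.StringIO()
write_records([parse_record(sample)], buffer)
print(buffer.getvalue())
```

For real files, each text file could be opened and passed to parse_record, and out_file could be open("out.csv", "w", newline=""). Because the rows are dicts, files with missing fields or a different field order still line up under the right headers.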
I know I am missing the obvious here, but I have the following Python code in which I am trying to:
1. Take a specified JSON file containing multiple strings as an input.
2. Start at line 1 and look for the key value of "content_text".
3. Add the key value to a new dictionary and write said dictionary to a new file.
4. Repeat 1-3 on additional JSON files.
import json
def OpenJsonFileAndPullData (JsonFileName, JsonOutputFileName):
    output_file=open(JsonOutputFileName, 'w')
    result = []
    with open(JsonFileName, 'r') as InputFile:
        for line in InputFile:
            Item=json.loads(line)
            my_dict={}
            print item
            my_dict['Post Content']=item.get('content_text')
            my_dict['Type of Post']=item.get('content_type')
            print my_dict
            result.append(my_dict)
    json.dumps(result, output_file)
OpenJsonFileAndPullData ('MyInput.json', 'MyOutput.txt')
However, when run I receive this error:
AttributeError: 'str' object has no attribute 'get'
Python is case-sensitive.
Item = json.loads(line) # variable "Item"
my_dict['Post Content'] = item.get('content_text') # another variable "item"
By the way, why don't you load the whole file as JSON at once?
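A minimal sketch with the variable name used consistently, assuming each input line holds one JSON object (Python 3 syntax; pull_post_data and the sample line are hypothetical):

```python
import json

def pull_post_data(json_lines):
    """Parse one JSON object per line and keep only the two fields of interest."""
    result = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        item = json.loads(line)  # same name, same case, everywhere below
        result.append({
            "Post Content": item.get("content_text"),
            "Type of Post": item.get("content_type"),
        })
    return result

# Hypothetical sample line standing in for the input file
sample = ['{"content_text": "hello", "content_type": "status"}\n']
print(json.dumps(pull_post_data(sample)))
```

Also note that json.dumps returns a string and does not write to a file; to write the result out, json.dump(result, output_file) is the call that takes a file object.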
Here is the code for the program I have written so far. I am trying to calculate the efficiency of NBA players for a class project. When I run the program on a comma-delimited file that contains all the stats, instead of splitting on each comma it creates a list entry from the entire line of the stat file. I either get an index out of range error, or each character is treated as an index point instead of the separate fields. I am new to this, but it seems it should create a list for each line in the file, with the fields as elements of that list, so that I get a list of lists. I hope I have made myself understood.
Here is the code:
def get_data_list (file_name):
    data_file = open(file_name, "r")
    data_list = []
    for line_str in data_file:
        # strip end-of-line, split on commas, and append items to list
        line_str.strip()
        line_str.split(',')
        print(line_str)
        data_list.append(line_str)
    print(data_list)
file_name1 = input("File name: ")
result_list = get_data_list (file_name1)
print(result_list)
I do not see how to post the data file for you to look at and try it with, but any file of numbers that are comma-delimited should work.
If there is a way to post the data file or email to you for you to help me with it I would be happy to do so.
Boliver
Strings are immutable objects, which means you can't change them in place: any operation on a string returns a new one. Now look at your code:
line_str.strip() # returns a string
line_str.split(',') # returns a list of strings
data_list.append(line_str) # appends original 'line_str' (i.e. the entire line)
You could solve this by:
stripped = line_str.strip()
data = stripped.split(',')
data_list.append(data)
Or by chaining the string operations:
data = line_str.strip().split(',')
data_list.append(data)
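Put together, a sketch of the whole function might look like this. It takes any iterable of lines (an open file works) so it can be tried without a data file, and the sample lines are hypothetical. It also returns the list, which the function in the question never does, so result_list there ends up as None:

```python
def get_data_list(lines):
    """Return a list of lists: one inner list of fields per input line."""
    data_list = []
    for line_str in lines:
        # strip() and split() each return a new object, so keep the result
        fields = line_str.strip().split(",")
        data_list.append(fields)
    return data_list

# Hypothetical sample lines standing in for the stats file
sample = ["player1,10,5\n", "player2,8,7\n"]
print(get_data_list(sample))
```

The fields come back as strings, so numeric columns still need int() or float() before any efficiency calculation.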
I have code in which I first convert a .csv file into multiple lists, and then I have to create a subset of the original file containing only the rows with a particular word in column 5.
I am trying to use the following code to do so, but it gives me a syntax error for the if statement. Can anyone tell me how to fix this?
import csv
with open('/Users/jadhav/Documents/Hubble files/m4_hubble_1.csv') as f:
    bl = [[],[],[],[],[]]
    reader = csv.reader(f)
    for r in reader:
        for c in range(5):
            bl[c].append(r[c])
print "The files have now been sorted into lists"
name = 'HST_10775_64_ACS_WFC_F814W_F606W'
for c in xrange(0,1):
    if bl[4][c]!='HST_10775_64_ACS_WFC_F814W_F606W'
    print bl[0][c]
You need a colon after your if test, and you need to indent the body of the if statement:
if bl[4][c]!='HST_10775_64_ACS_WFC_F814W_F606W':
    print bl[0][c]
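For the stated goal of keeping only the rows with a particular word in column 5, a sketch that filters rows directly, rather than building per-column lists, could look like this. It uses Python 3 syntax and tests for equality, which is an assumption since the question's code uses !=. The helper name rows_matching and the sample data are hypothetical:

```python
import csv
import io

def rows_matching(csv_file, column, wanted):
    """Return the rows whose value in the given column index equals wanted."""
    return [
        row for row in csv.reader(csv_file)
        if len(row) > column and row[column] == wanted  # skip rows that are too short
    ]

# Hypothetical in-memory CSV standing in for the real file
sample = io.StringIO("a,b,c,d,KEEP\ne,f,g,h,DROP\n")
print(rows_matching(sample, 4, "KEEP"))
```

Column 5 of the file is index 4 here because Python indexes from zero.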