I'm trying to create a script that reads a JSON file and uses its values to select particular folders and files and save them somewhere else.
My JSON is as follows:
{
"source_type": "folder",
"tar_type": "gzip",
"tar_max_age": "10",
"source_include": {"/opt/myapp/config", "/opt/myapp/db, /opt/myapp/randomdata"}
"target_type": "tar.gzip",
"target_path": "/home/user/targetA"
}
So far, I have this Python code:
import time
import os
import tarfile
import json

source_config = '/opt/myapp/config.JSON'
target_dir = '/home/user/targetA'

def main():
    with open('source_config', "r").decode('utf-8') as f:
        data = json.loads('source_config')
    for f in data["source_include", str]:
        full_dir = os.path.join(source, f)
        tar = tarfile.open(os.path.join(backup_dir, f + '.tar.gzip'), 'w:gz')
        tar.add(full_dir)
        tar.close()
    for oldfile in os.listdir(backup_dir):
        if str(oldfile.time) < 20:
            print("int(oldfile.time)")
My traceback is:
Traceback (most recent call last):
File "/Users/user/Documents/myapp/test/point/test/Test2.py", line 16, in <module>
with open('/opt/myapp/config.json', "r").decode('utf-8') as f:
AttributeError: 'file' object has no attribute 'decode'
How do I fix this?
You are trying to call .decode() directly on the file object; you'd normally call that on data read from the file instead. For JSON, however, you don't need to decode anything yourself: the json library handles that for you.
Use json.load() (no s) to load directly from the file object:
with open(source_config) as f:
    data = json.load(f)
Next, iterate over the source_include entries directly:
for entry in data["source_include"]:
    base_filename = os.path.basename(entry)
    tar = tarfile.open(os.path.join(backup_dir, base_filename + '.tar.gzip'), 'w:gz')
    tar.add(entry)
    tar.close()
Your JSON also needs to be fixed, so that your source_include is an array, rather than a dictionary:
{
"source_type": "folder",
"tar_type": "gzip",
"tar_max_age": "10",
"source_include": ["/opt/myapp/config", "/opt/myapp/db", "/opt/myapp/randomdata"],
"target_type": "tar.gzip",
"target_path": "/home/user/targetA"
}
Next, you loop over the results of os.listdir(), which are plain strings (relative filenames with no path). Strings do not have a .time attribute; to read file timestamps you'll have to use os.stat() calls instead:
for filename in os.listdir(backup_dir):
    path = os.path.join(backup_dir, filename)
    stats = os.stat(path)
    if stats.st_mtime < time.time() - 20:
        # file was modified more than 20 seconds ago
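Putting these fixes together, a minimal working sketch of the whole script might look like the following. Note the assumptions: tar_max_age is interpreted as a number of days (the units are a guess from the config key), the archives land in target_path, and the helper name run_backup is mine.

```python
import json
import os
import tarfile
import time

def run_backup(config_path):
    # json.load() reads and decodes the file object for you
    with open(config_path) as f:
        config = json.load(f)

    backup_dir = config["target_path"]
    max_age_days = int(config["tar_max_age"])  # assumption: age is in days

    # Archive each include path into its own .tar.gz in the target directory
    for entry in config["source_include"]:
        archive = os.path.join(backup_dir, os.path.basename(entry) + '.tar.gz')
        with tarfile.open(archive, 'w:gz') as tar:
            tar.add(entry)

    # Delete archives whose modification time is older than the cutoff
    cutoff = time.time() - max_age_days * 86400
    for filename in os.listdir(backup_dir):
        path = os.path.join(backup_dir, filename)
        if filename.endswith('.tar.gz') and os.stat(path).st_mtime < cutoff:
            os.remove(path)
```

Using the with statement on tarfile.open() also guarantees each archive is closed even if tar.add() raises.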
I am trying to generate a JSON file with my Python script.
The goal is to parse a CSV file, get some data, do some operations/elaborations, and then generate a JSON file.
When I run the script the JSON generation seems to run smoothly, but as soon as the first row is parsed the script stops with the following error:
Traceback (most recent call last):
  File "c:\xampp\htdocs\mix_test.py", line 37, in <module>
    data.append({"name": file_grab ,"status": "progress"})
    ^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'append'
Below the code:
import json
import requests
import shutil
from os.path import exists
from pathlib import Path

timestr = time.strftime("%Y%m%d")
dest_folder = (r'C:\Users\Documents\mix_test_python')
filename = []

# read filename and path with pandas extension
df = pd.read_csv(
    r'C:\Users\Documents\python_test_mix.csv', delimiter=';')

data = []
for ind in df.index:
    mode = (df['Place'][ind])
    source_folder = (df['File Path'][ind])
    file_grab = (df['File Name'][ind])
    code = (df['Event ID'][ind])
    local_file_grab = os.path.join(dest_folder, file_grab)
    remote_file_grab = os.path.join(source_folder, file_grab)

    ### generate json ########
    ##s = json.dumps(test)##
    data.append({"name": file_grab ,"status": "progress"})
    with open(r'C:\Users\Documents\test.json','w') as f:
        json.dump(data, f, indent=4)
        f.close

    #### detect if it is ftp ######
    print(mode, source_folder, remote_file_grab)
Could you help me understand what I am doing wrong?
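The AttributeError means data is a string at the moment .append() runs, so somewhere in the code not shown data must be rebound from the list to a string (for example via something like data = json.dumps(...) inside the loop). A minimal sketch of the intended pattern, with hypothetical rows standing in for the parsed CSV and the file written once after the loop:

```python
import json

# Hypothetical parsed rows standing in for df.iterrows() / the CSV data
rows = [{'File Name': 'a.mp3'}, {'File Name': 'b.mp3'}]

data = []  # must stay a list; never rebind this name to a string
for row in rows:
    file_grab = row['File Name']
    data.append({"name": file_grab, "status": "progress"})

# Write once, after the loop, instead of rewriting the file every iteration
with open('test.json', 'w') as f:
    json.dump(data, f, indent=4)
```

Writing the file after the loop also avoids re-serializing the growing list on every row.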
I have 1000 JSON files, and I need to change the value of a specific line to a numeric sequence across all the files.
An example: the specific line is
"name": "carl 00",
and I need it to become the following:
File 1
"name": "carl 1",
File 2
"name": "carl 2",
File 3
"name": "carl 3",
What is the right script to achieve the above using Python?
This should do the trick, though you're not very clear about how the data is stored in the actual JSON files, so I've listed two different approaches. The first parses a JSON file into a Python dict, manipulates the data, turns it back into a string, and saves it. The second is what I think you mean by "line": split the file's text into a list of lines, change the line you want, rejoin the full string, and save it.
This also assumes your JSON files are in the same folder as the Python script.
import os
import json

my_files = [name1, name2, name3, ...]  # ['file_name.json', ...]
folder_path = os.path.dirname(__file__)

for i, name in enumerate(my_files, start=1):  # start=1 so names begin at "carl 1"
    path = f'{folder_path}/{name}'
    with open(path, 'r') as f:
        json_text = f.read()

    # if you know the key(s) in the json file...
    json_dict = json.loads(json_text)
    json_dict['name'] = json_dict['name'].replace('00', str(i))
    new_json_str = json.dumps(json_dict)

    # if you know the line number in the file...
    line_list = json_text.split('\n')
    line_list[line_number - 1] = line_list[line_number - 1].replace('00', str(i))
    new_json_str = '\n'.join(line_list)

    with open(path, 'w') as f:
        f.write(new_json_str)
Based on your edit, this is what you want:
import os
import json

my_files = [f'{i}.json' for i in range(1, 1001)]
folder_path = os.path.dirname(__file__)  # put this .py file in same folder as json files

for i, name in enumerate(my_files, start=1):  # start=1 so names begin at "carl 1"
    path = f'{folder_path}/{name}'
    with open(path, 'r') as f:
        json_text = f.read()
    json_dict = json.loads(json_text)
    json_dict['name'] = f'carl {i}'
    # include these lines if you want "symbol" and "subtitle" changed
    json_dict['symbol'] = f'carl {i}'
    json_dict['subtitle'] = f'carl {i}'
    new_json_str = json.dumps(json_dict)
    with open(path, 'w') as f:
        f.write(new_json_str)
Without knowing more, the loop below will accomplish the post's requirements.
name = 'carl'
for i in range(1, 1001):
    print(f'name: {name} {i}')
I am trying to generate DAG files using the Python code below.
The code takes two inputs:
a set of JSON config files, looped over
a template that provides the lines to which the variables are applied
I can successfully create the output files, but the variables copied from the template file do not change. When each file is created, I want the JSON variables to be passed into it dynamically.
json file:
{
"DagId": "dag_file_xyz",
"Schedule": "'#daily'",
"Processed_file_name":"xyz1",
"Source_object_name":"'xyz2,}
Template:
processed_file = xyzOperator(
    task_id=processed_file_name,
    source_bucket=bucket_path,
    destination_bucket=destination_bucket,
    source_object=source_object_name,
    destination_object=destination_object_name,
    delimiter='.csv',
    move_object=False
Generate file code:
import json
import os
import shutil
import fileinput
import ctypes

config_filepath = ('C:\\xyz\\')
dag_template_filename = 'C:\\dagfile\\xyztest.py'

for filename in os.listdir(config_filepath):
    print(filename)
    f = open(config_filepath + filename)
    print(f)
    config = json.load(f)
    new_filename = 'dags/' + config['DagId'] + '.py'
    print(new_filename)
    shutil.copyfile(dag_template_filename, new_filename)
    for line in fileinput.input(new_filename, inplace=True):
        print(line)
        line.replace("dag_id", "'" + config['DagId'] + "'")
        line.replace("scheduletoreplace", config['Schedule'])
        line.replace("processed_file_name", config['Processed_file_name'])
        line.replace("source_object_name", config['Source_object_name'])
        line.replace("destination_object_name", config['Destination_object_name'])
        print(line, end="")
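The likely cause: str.replace() returns a new string rather than modifying the line in place, so the results of those calls are discarded, and the extra print(line) writes each template line out unchanged before the replacements even run. A sketch of the inner loop with the results assigned back (same config keys as above; the helper name render_dag_file is mine):

```python
import fileinput

def render_dag_file(path, config):
    """Rewrite the copied template in place, substituting config values."""
    replacements = {
        "dag_id": "'" + config['DagId'] + "'",
        "scheduletoreplace": config['Schedule'],
        "processed_file_name": config['Processed_file_name'],
        "source_object_name": config['Source_object_name'],
        "destination_object_name": config['Destination_object_name'],
    }
    # inplace=True redirects print() into the file being read
    with fileinput.input(path, inplace=True) as lines:
        for line in lines:
            for old, new in replacements.items():
                line = line.replace(old, new)  # assign the result back
            print(line, end="")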
How do I define the folder name when saving a JSON file?
I tried adding myfoldername inside open(), but it did not work.
I also tried myfoldername/myfilename in the filename definition.
Error:
TypeError: an integer is required (got type str)
Code:
import json
# Testing file save
dictionary_data = {"a": 1, "b": 2}
filename = "myfilename" + time.strftime("%Y%m%d-%H%M%S") + ".json"
a_file = open("myfoldername",filename, "w")
json.dump(dictionary_data, a_file)
a_file.close()
This should do the trick.
Use pathlib to manage paths
Create the parent dir if it does not exist with mkdir
Open the file thanks to the with statement
import json
import time
from pathlib import Path

# Testing file save
dictionary_data = {"a": 1, "b": 2}
filename = Path("myfoldername") / f"myfilename{time.strftime('%Y%m%d-%H%M%S')}.json"

# create the parent dir if it does not exist
filename.parent.mkdir(parents=True, exist_ok=True)

with open(filename, "w") as a_file:
    json.dump(dictionary_data, a_file)
I have a file folder of 1000+ JSON metadata files. I have created a list of the file paths and I'm trying to:
for each file path, read the JSON file
pull in only the key/value pairs I'm interested in
store it in a variable or save it in a way that I can insert into MongoDB using pymongo
I have been successful listing the file paths to a variable and loading ONE JSON doc (from one file path). The problem is I need to do over a thousand, and I get an error when trying to incorporate the list of file paths and a loop.
Here's what I've tried so far:
import pymongo
import json

filename = r"C:\Users\Documents\FileFolder\randomFile.docx.json"
with open(filename, "r", encoding = "utf8") as f:
    json_doc = json.load(f)

new_jsonDoc = dict()
for key in {'Application-Name', 'Author', 'resourceName', 'Line-Count', 'Page-Count', 'Paragraph-Count', 'Word-Count'}:
    new_jsonDoc[key] = json_doc[0][key]
Sample output:
{'Application-Name': 'Microsoft Office Word',
'Author': 'Sample, John Q.',
'Character Count': '166964',
'Line-Count': '1391',
'Page-Count': '103',
'Paragraph-Count': '391',
'Word-Count': '29291',
'resourceName': 'randomFile.docx'}
Now when I add the loop:
for file in list_JsonFiles:  # this is list of file paths created by os.walk
    # if I do a print(type(file)) here, file type is a string
    with open(file, "r") as f:
        # print(type(file)) = string, print(type(f)) = TextIOWrapper
        json_doc = json.loads(f)
        ### TypeError: the JSON object must be str, bytes or bytearray, not TextIOWrapper ###
How can I get my loop working? Is my approach wrong?
Figured the TypeError out:
for file in list_JsonFiles:
    with open(file, "r", encoding = "utf8") as f:
        json_doc = json.load(f)
    new_jsonDoc = dict()
    for key in {'Application-Name', 'Author', 'resourceName', 'Line-Count', 'Page-Count', 'Paragraph-Count', 'Word-Count'}:
        if key in json_doc[0]:
            new_jsonDoc[key] = json_doc[0][key]
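To finish the original goal of inserting into MongoDB, the per-file extraction can be wrapped in a helper and the resulting list handed to pymongo's insert_many() in one call. A sketch under assumptions: the documents are lists whose first element holds the metadata (as in the code above), and the database/collection names "metadata"/"files" are placeholders.

```python
import json

KEYS = {'Application-Name', 'Author', 'resourceName', 'Line-Count',
        'Page-Count', 'Paragraph-Count', 'Word-Count'}

def extract_fields(json_doc, keys=KEYS):
    """Keep only the key/value pairs of interest from one parsed document."""
    record = json_doc[0] if isinstance(json_doc, list) else json_doc
    return {k: record[k] for k in keys if k in record}

def load_docs(paths):
    """Read every JSON file and return a list of trimmed dicts."""
    docs = []
    for path in paths:
        with open(path, "r", encoding="utf8") as f:
            docs.append(extract_fields(json.load(f)))
    return docs

# Inserting into MongoDB (assumes a local server; the database and
# collection names below are hypothetical):
# from pymongo import MongoClient
# client = MongoClient("mongodb://localhost:27017")
# client.metadata.files.insert_many(load_docs(list_JsonFiles))
```

insert_many() is a single round trip per batch, which matters at 1000+ documents; inserting one document per file with insert_one() would be much slower.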