Inserting specific columns of csv file into mongodb collection using python script - python

I have a python script to insert a csv file into mongodb collection
import pymongo
import pandas as pd
import json
client = pymongo.MongoClient("mongodb://localhost:27017")
df = pd.read_csv("iris.csv")
data = df.to_dict(oreint = "records")
db = client["Database name"]
db.CollectionName.insert_many(data)
Here all the columns of csv files are getting inserted into mongo collection. How can I achieve a usecase where I want to insert only specific columns of csv file in the mongo collection .
What changes I can make to existing code.
Lets say I also have database already created in my Mongo. Will this command work even if the database is present (db = client["Database name"])

Have you checked out pymongoarrow? the latest release has write support where you can import a csv file into mongodb. Here are the release notes and documentation. You can also use mongoimport to import a csv file, documentation is here, but I can't see any way to exclude fields like the way you can with pymongoarrow.

Related

Load the data from oracle database to csv files using python

I have written python script to fetch the data from oracle database and load that data to csv file.
import datetime
import pandas as pd
import cx_Oracle
con = cx_Oracle.connect('SYSTEM/oracle123#localhost:1521/xe')
c = con.cursor()
sql = "select * from covid_data"
res = c.execute(sql)
t = pd.read_sql(sql,con)
t.to_csv(r'C:\Users\abc\covid.csv')
I want the script to be run everyday and load to csv file. But the challenge which I am facing here is I want to fetch the data on daily basis but the data in csv file should be available for only that particular day. The contents of previous day should not be seen the next day
I found solution for this.
In the below line, we have to add write mode, so that it will not append the data to csv file instead it overwrites.
t.to_csv(r'C:\Users\abc\covid.csv', mode='w')

Python Importing a CSV file into a sqlite3 remove replicates

I have a CSV file and I want to import this file into my sqlite3 database using Python. The column names of the CSV is the same with the column names of the database table, the following is the code i am using now.
df = pandas.read_csv(Data.csv)
df.to_sql(table_name, conn, index=False)
However it seems the command will import all data into the database, I am trying to only input the data that does not already exist in the database. Is there a way to do that without iterating every row of the csv or database?
Use the if_exists parameter.
df = pandas.read_csv(Data.csv)
df.to_sql(table_name,conn,if_exists='append',index=False)

Update a SQLite3 database using CSVs and script automation

I have a sqlite database that is populated with values from csv files. I would like to create a script that when run:
deletes the old tables
creates new tables with the same schema (with newly updated values)
I noticed that sqlite script files don't accept ".mode csv" or .import "csv". Is there a way to automate this is with a script of some sort?
If you want a Python approach, you can use to_sql method from the pandas package to write to SQLite. Pandas can replace existing tables and automatically generate the schema based on the CSV file read.
import sqlite3
import pandas as pd
conn = sqlite3.connect('my.db')
# read the csv file
df = pd.read_csv("my.csv")
# write to SQLite
df.to_sql("my_tbl", conn, if_exists="replace")
conn.close()

Converting CSV to DB for SQL

I am trying to convert a .csv file I've download into a .db so that I can analyze it in DBreaver with SQLite3.
I'm using Anaconda Prompt and python within it.
Can anyone point out where I'm mistaken?
import pandas as pd
import sqlite 3
df = pd.read_csv('0117002-eng.csv')
df.to_sql('health', conn)
And I just haven't been able to figure out how to set up conn appropriately. All the guides I've read have you do something like:
conn = sqlite3.connect("file.db")
But, as I mentioned I have only the csv file. And when I did try to do that, it also doesn't work.

Dynamically creating table from csv file using psycopg2

I would like to get some understanding on the question that I was pretty sure was clear for me. Is there any way to create table using psycopg2 or any other python Postgres database adapter with the name corresponding to the .csv file and (probably the most important) with columns that are specified in the .csv file.
I'll leave you to look at the psycopg2 library properly - this is off the top of my head (not had to use it for a while, but IIRC the documentation is ample).
The steps are:
Read column names from CSV file
Create "CREATE TABLE whatever" ( ... )
Maybe INSERT data
import os.path
my_csv_file = '/home/somewhere/file.csv'
table_name = os.path.splitext(os.path.split(my_csv_file)[1])[0]
cols = next(csv.reader(open(my_csv_file)))
You can go from there...
Create a SQL query (possibly using a templating engine for the fields and then issue the insert if needs be)

Categories