I have basic csv report that is produced by other team on a daily basis, each report has 50k rows, those reports are saved on sharedrive everyday. And I have Oracle DB.
I need to create autoscheduled process (or at least less manual) to import those csv reports to Oracle DB. What solution would you recommend for it?
I did not find such solution in SQL Developer, since it is upload from file and not a query. I was thinking about python cron script, that will autoran on a daily basis and transform csv report to txt with needed SQL syntax (insert into...) and then python will connect to Oracle DB and will ran txt file as SQL command and insert data.
But this looks complicated.
Maybe you know other solution that you would recommend yo use?
Create an external table to allow you to access the content of the CSV as if it were a regular table. This assumes the file name does not change day-to-day.
Create a scheduled job to import the data in that external table and do whatever you want with it.
One common blocking issue that prevents using 'external tables' is that external tables require the data to be on the computer hosting the database. Not everyone has access to those servers. Or sometimes the external transfer of data to that machine + the data load to the DB is slower than doing a direct path load from the remote machine.
SQL*Loader with direct path load may be an option: https://docs.oracle.com/en/database/oracle/oracle-database/19/sutil/oracle-sql-loader.html#GUID-8D037494-07FA-4226-B507-E1B2ED10C144 This will be faster than Python.
If you do want to use Python, then read the cx_Oracle manual Batch Statement Execution and Bulk Loading. There is an example of reading from a CSV file.
Related
I'm building a python app that that can help authorised users to retrieve sample data from production mysql databases to cloud data stores for auditing and tracking purposes. I'm able provide data as CSV files.
However, the auditors now need to import this data into databases in SQL format. For this, I'm currently exporting the query results manually into SQL format using Dbeaver so that it can be ingested by the users by executing the .sql files. It will be great if I can use any libraries that enable this feature in my app.
I tried searching the Dbeaver code base to check for any libraries but could not identify. Please suggest if these is a smarter pythonic way to achieve this?
I have a requirement like I have an ods file with some data and I want insert that data into a table. This scenario need to be done via procedure call because we have to validate some fields in the ods file. Steps for the requirement. For this, we have two tables like Staging and main table. The staging table contains validation failed records and the main table contains success records.
Note: How to do this using python scripting. This will be automate on a daily basis
Step 1:Place the file in a specified location.
Step 2:Pick up file from specified location and call the procedure to insert the records.
Step 3: While calling the procedure needs to handle validation for some fields. Only validation success records needs to be stored in Mani_table. Records which are failed in validation those records need to be stored in the Staging table.
Step 4: Automation script need to be done on daily basis.
You can move the files across the folders with python's shutil module.
shutil.move("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
Check the more details of it here
You can periodically run the python scripts in multiple ways! I can't comment on the efficiency of these methods, but you can use python's apscheduler. More details of it here
You can use python's pyexcel-ods to read ods files.
Since you haven't added your work, I can't help you more than this!
I was wondering if there is a way to allow a user to export a SQLite database as a .csv file, make some changes to it in a program like Excel, then upload that .csv file back to the table it came from using a record UPDATE method.
Currently I have a client that needed an inventory and pricing management system for their e-commerce store. I designed a database system and logic in Python 3 and SQLite. The system from a programming standpoint works flawlessly.
The problem I have is that there are some less then technical office staff that need to edit things like product markup within the database. Currently, I have them setup with SQLite DB Browser, from there they can edit products one at a time and write the changes to the database. They can also export tables to a .csv file for data manipulation in Excel.
The main issue is getting that .csv file back into the table it was exported from using an UPDATE method. When importing a .csv file to a table in SQLite DB Browser there is no way to perform an update import. It can only insert new rows by default and do to my table constraints that is a problem.
I like SQLite DB Browser because it is clean and simple and does exactly what I need. However, as soon as you have to edit more then one thing at a time and filter information in more complicated ways it starts to lack the functionality needed.
Is there a solution out there for SQLite DB Browser to tackle this problem? Is there a better software option all together to interact with a SQLite database that would give me that last bit of functionality?
Have you tried SQLiteForExcel? however, some coding is required.
So after researching some off the shelf options I found that the Devart Excel Add Ins did exactly what I needed. They are paid add ins, however, they seem to support almost all modern databases including SQlite. Once the add in is installed you can connect to a database and manipulate the data returned just like normal in Excel including bulk edits and advanced filtering, all changes are highlighted and can easily be written to the database with one click.
Overall I thought it was a pretty solid solution and everyone seems to be very happy with it as it made interacting with a database intuitive and non threatening to the more technically challenged.
New to Pandas & SQL. Haven't found an answer specific to this config, and not sure if standard SQL wisdom applies when introducing pandas to the mix.
Doing a school project that involves ~300 gb of data in ~6gb .csv chunks.
School advised syncing data via dropbox, but this seemed impractical for a 4-person team.
So, current solution is AWS EC2 & RDS instance (MySQL, I think it'll be, 1 table).
What I wanted to confirm before we start setting it up:
If multiple users are working with (and occasionally modifying) the data, can this arrangement manage conflicts? e.g., if user A uses pandas to construct a dataframe from a query, are the records in that query frozen if user B tries to work with them?
My assumption is that the data in the frame are in memory, and the records in the SQL database are free to be modified by others until the dataframe is written back to the db, but I'm hoping that either I'm wrong or there's a simple solution here (like a random sample query for each user or something).
A pandas DataFrame object does not interact directly with the db. Once you read it in it sits in memory locally. You would have to use a method like DataFrame.to_sql to write your changes back to the MySQL DB. For more information on reading and writing to SQL tables, see the pandas documentation here.
I am trying to determine how to best insert users into active directory from a SQL server table.
I figured I could use the LDAP sever to do a insert, but the research iv done would suggest otherwise and that I could only pull data from active directory to SQL server.
Then I thought I could use a python program to query the table and spit out a CSV file to then do a bulk insert but I am not sure if this would modify existing users if data changes.
Any insight would be appreciated
Here's a general idea of the algorithm:
Load user data from SQL Server
Convert it into an LDIF (LDAP Data Interchange Format) file
Import the LDIF file into Active Directory using the LDIFDE command-line tool
Python, or any other programming language, can help you with step 2. Notice that the details of the conversion are very specific to how your data is represented. You'll have to carefully map each data base field into an LDAP attribute, and determine the classes to be used in the LDAP objects.
Will the above modify existing users? yes, of course. You could write the LDIF in such a way that it updates the existing data, or if that's a problem you could verify first if an user exists in the Active Directory and don't add those changes to the LDIF file.
Alternatively
You could use CSVDE for importing data in CSV format, but anyway you'll have to design a mapping strategy for each one of the fields that you want to import into Active Directory.