phpMyAdmin CSV Upload Replace Data Not Working - sql

I have created a database and table in phpMyAdmin.
I am importing the data from a csv file.
This works fine and adds the data correctly.
However each time I upload I want to replace the existing data.
I have ticked the box "Replace table data with file". The upload works fine, but it doesn't replace the existing rows; it simply appends the new data as new rows below the old data.
Any ideas why this is happening?

This appears to be misleading text: the option adds an "ON DUPLICATE KEY UPDATE" clause to the generated INSERT statements rather than truncating the table prior to the insert. See the bug report at: https://sourceforge.net/p/phpmyadmin/bugs/4891/
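As a rough illustration of the difference (the table and column names here are only placeholders), the ticked option effectively generates statements like the first one below, which only overwrites rows that collide with an existing key, whereas a true replace needs the table emptied first:

-- What "Replace table data with file" effectively produces per CSV row:
-- existing rows are only overwritten when the imported row hits the same key.
INSERT INTO my_table (id, name, price)
VALUES (1, 'Widget', 9.99)
ON DUPLICATE KEY UPDATE name = VALUES(name), price = VALUES(price);

-- To genuinely replace the contents, empty the table before importing:
TRUNCATE TABLE my_table;
-- ...then run the CSV import as usual.

Until the behaviour changes, running the TRUNCATE manually (for example from the SQL tab) before each upload is the simplest workaround.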

Related

Insert row from an Excel file into a PostgreSQL database with Python and Psycopg2

I'm importing an Excel file via Python for the first time, to insert the data into a PostgreSQL database and later maintain it with updates or deletes.
However, I am not able to insert several rows; I have already tried the executemany() command, but it did not work.
How can I write this script?
I'm doing it the way shown above to insert the data from the worksheet below:
However, when executing the script it inserts data that does not match the content. I managed to do it another way as well, but that one inserted only the last row.
I expected the content inserted into the database to be all the rows contained in the Excel file.
I would also like to know how I can change the data later.
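For reference (the table and column names below are only placeholders), the statement the script needs to hand to executemany() is a single parameterized INSERT; psycopg2 then runs it once for every tuple in the list of rows read from the worksheet:

-- Parameterized INSERT passed to cursor.executemany();
-- the %s markers are psycopg2 placeholders, filled from each row tuple.
INSERT INTO products (code, description, price)
VALUES (%s, %s, %s);

If only the last row ends up in the table, it is usually because the execute call sits outside the loop over the rows, or because only a single tuple is being passed in.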

Problem appending CSV upload to existing BigQuery table

I've been used to quickly uploading a CSV file to append data to an existing table in BigQuery.
I've made the new table name the same as the existing table, and I've then had options to overwrite or append data to the existing table.
This seems to have changed in the past few days and there is a new BigQuery console UI.
When I try and create a new table from a CSV file upload, under the table name field it currently says:
Unicode letters, marks, numbers, connectors, dashes or spaces allowed.
The job will create the specified destination table if needed, or the
table must be empty if it already exists.
However, when I try and create a table with the same name as an existing table (even though the existing table is empty), I get a red warning saying:
Table already exists
Does anyone know if this feature has now been removed or how to easily append data?
The long way round is to upload the CSV to a new table, then query the new table with the destination set to append to or overwrite the existing table. Not ideal, particularly having to define a new table schema.
In order to append a CSV file to an existing BigQuery table when using the Console, please follow the instructions below:
In the Explorer panel, expand your project and select a dataset.
Expand the Actions option and click Open.
In the details panel, click Create table.
On the Create table page, in the Source section:
For Create table from, select Upload.
Browse for the file on your system.
On the Create table page, in the Destination section:
For Dataset name, choose the appropriate dataset.
In the Schema section, for Auto detect, check Schema and input parameters to enable schema auto detection. Alternatively, you can manually enter the schema definition.
Click Advanced options.
For Write preference, choose Append to table.
Please review this document that expands on the same topic.
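If the console upload still refuses the table name, the "long way round" described in the question can also be written as a single query once the CSV has been loaded into a throwaway table; the project, dataset, and table names below are placeholders:

-- Append everything from the freshly uploaded staging table
-- to the existing table (BigQuery standard SQL).
INSERT INTO `myproject.mydataset.existing_table`
SELECT *
FROM `myproject.mydataset.csv_upload_staging`;

This assumes the staging table's columns line up with the existing table's schema; the staging table can be deleted afterwards.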

Using SSIS Package, How to validate the source records for duplicate before inserting?

SQL Server 2012: using a SSIS package, how to validate the source records for duplicate before inserting?
Our source file is a .csv file. We are seeing duplicate records loaded into the staging table.
At present, we are following a manual process for loading the data.
How do we validate the source file data against the destination table before loading, and load only the valid records? Duplicate records can be loaded not only because the source file contains duplicates but also because the same file can be reloaded into the staging table.
We do not truncate the staging table; we keep the records as they are.
Second question: how do we pick up the name of the source file and pass it into the load? Possibly by having a derived column "FileName" that gets loaded along with the raw data into the staging table.
The typical load pattern I use in this case is:
Prepare a staging table that matches the source file
In SSIS run a SQL Task with TRUNCATE StagingTable; (which clears it out)
Then, run a data flow task that loads the entire data file into the staging table
Lastly, merge the staging table into the final table.
I prefer to do this last step in a SQL Task also:
INSERT INTO FinalTable (PrimaryKey, Column1, Column2, Column3)
SELECT PrimaryKey, Column1, Column2, Column3
FROM StagingTable SRC
WHERE NOT EXISTS (
    SELECT * FROM FinalTable TGT WHERE TGT.PrimaryKey = SRC.PrimaryKey
);
If you prefer a graphical UI, and you don't mind the extra network traffic and slower processing time, you can do the same type of merge operation using lookups. You can even use the SCD component, but I strongly discourage its use.
Whether you do it in T-SQL or the UI, you need a key that can be used to uniquely identify the records (referred to as PrimaryKey in my example). If you don't have this key, there is no way to deduplicate.
Note that in this example you have a 'real' staging table whose only purpose is to get the data file into the database. Then you have a final table that contains the final, consistent result.
Also note that this pattern only adds new rows - it will not update existing rows if they change in the data file.
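If changed rows do need to be picked up as well, the same staging/final pattern can be extended with an update keyed on the same column. A rough sketch, reusing the placeholder names from above:

-- Overwrite existing rows in the final table with the values
-- from the latest file load (T-SQL UPDATE ... FROM join syntax).
UPDATE TGT
SET TGT.Column1 = SRC.Column1,
    TGT.Column2 = SRC.Column2,
    TGT.Column3 = SRC.Column3
FROM FinalTable TGT
INNER JOIN StagingTable SRC
    ON SRC.PrimaryKey = TGT.PrimaryKey;

Running this update before the insert-where-not-exists gives a simple upsert; a single MERGE statement is the other common option.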
Given your exact scenario (loading the same file again), I would first check whether that file's data has already been loaded into the staging table. If you do that, you don't have to worry about checking for duplicates at the record level.
How are you setting the connection to the file? In most of the data loads I have dealt with, I designed a Foreach Loop container where the file name/path is populated into a user variable. As you said, you can then use a Derived Column transform to add a new column that gets its value from that variable. If you don't have the file name in a user variable, you can use an Expression Task in the control flow to populate it.
To cover your exact requirement, I would use the step above to store the file name in the table. You could even normalize it into a separate table instead of storing a long file name on every data record. Once you have all the file names in the database, you can simply run an "Execute SQL" task at the beginning to see whether that file name is already in the database.
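A rough sketch of that "Execute SQL" check, assuming a hypothetical dbo.LoadedFiles table and an SSIS variable mapped to the ? parameter:

-- Returns the number of previous loads of this file; map the result to an
-- SSIS variable and use a precedence constraint to skip the data flow when it is > 0.
SELECT COUNT(*) AS AlreadyLoaded
FROM dbo.LoadedFiles
WHERE FileName = ?;

The same table gets a row inserted for each file after a successful load.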
Two years back I faced the same problem when importing TSV files.
I tried many other solutions, but the best I could come up with was a C# script that performs this validation.
What I did as a solution:
Create one C# DataTable object in memory with primary key constraints,
like:
DataColumn[] keyColumn = new DataColumn[30]; // one slot per key column, filled in a loop
keyColumn[intJ] = dtFilterdPK.Columns["Column name"];
dtFilterdPK.PrimaryKey = keyColumn; // register the key so duplicate rows are rejected
Then try to add rows from your CSV to this DataTable one by one.
Whenever a row duplicates the primary key, the add will throw a constraint error.
Handle this error in a try...catch block and log the duplication error according to your logging requirements.
Skip those error records so they are not added to the DataTable object.
At last, import the DataTable into your table with a bulk import,
Like:
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(myConnection))
{
    bulkCopy.DestinationTableName = "Your DB Table Name"; // assign destination table name
    bulkCopy.WriteToServer(dtToBeImport);                 // write into the actual table
}
Hope this will help you.

Crate db cannot query data in a shard

I have an instance of Crate 1.0.2 and I dropped a table from it. Then I re-created the table with the same name and a slightly modified schema, and imported data using the COPY FROM command. The file passed to COPY FROM contains 10,000 records, and the command runs OK.
When I check the table tab in the Crate web console, it shows many partitions added, each holding a few records. If I add up the number-of-records column on this tab, it comes close to 10k, but when I run "select count(*) from mytable", it returns only around 8,000 records. On further investigation I found that there are certain partitions whose data cannot be queried at all.
Has anyone seen this problem? Does it have anything to do with dropping the table and re-creating it with the same name? I also observed that when a table is dropped, not all files related to that table are deleted from path.data. Could these leftover directories be the reason those partitions are not queryable? While importing, I saw a "Document already exists" exception, although I know my data does not have any duplicate values for the primary key column.
Some questions to clarify the issue:
Have you run refresh table mytable after your copy command has finished?
Are you sure that with the new schema of the table, there are no duplicate records?
Since 1.x versions are not supported anymore, could you try with CrateDB 2.1.6, which is the current stable version, to see if the problem persists?
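For reference, a quick way to check the first question above (the table name is taken from the question):

-- Make the rows written by COPY FROM visible to queries, then recount.
REFRESH TABLE mytable;
SELECT count(*) FROM mytable;

CrateDB only refreshes tables periodically, so a count run immediately after a bulk COPY FROM can come up short until the table is refreshed.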

Create SQL trigger if data exists in table

I am new to SQL.
What is the best way to create a TXT file if a table has more than 0 records?
The code to remove or add records to this table already exists.
I am looking for a way to create a trigger file (with no content in it) in a specific network folder.
Preferably, I would want this TXT file to be removed at the end of the day, so the process can repeat itself every morning.
In an AFTER INSERT trigger, do a SELECT COUNT(*) from the table, or query one of the system catalog views. If it's greater than zero, call a stored proc that drops a file onto your share drive.
To create the file you could build a small package, call PowerShell or bcp (after enabling xp_cmdshell, though), or create a CLR function (after enabling CLR). Since the latter two require changing a server setting, you could just create a package.
And since there is no data you actually need to export, you just create a blank file!
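A minimal sketch of that idea, assuming a hypothetical dbo.MyQueue table, that xp_cmdshell has been enabled, and a placeholder UNC path:

-- Creates an empty flag file whenever rows are added and the table is non-empty.
CREATE TRIGGER trg_MyQueue_FlagFile
ON dbo.MyQueue
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    IF EXISTS (SELECT 1 FROM dbo.MyQueue)
    BEGIN
        -- "type nul > file" writes a zero-byte file on Windows
        EXEC master..xp_cmdshell 'type nul > \\networkshare\folder\trigger.txt', no_output;
    END
END;

The end-of-day cleanup could be a SQL Agent job that deletes the file, for example via xp_cmdshell 'del \\networkshare\folder\trigger.txt'.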