When extracting a table's schema and data, I can right-click the database and go to Tasks -> Generate Scripts, which gives me all the data from the table, including the CREATE script, which is good.
This, though, gives me all of the data in the table - can it be changed to give me only some of the data? E.g. only rows after a certain dtmTimeStamp?
Thanks,
I would recommend extracting your data into a separate table using a query and then running Generate Scripts on that table. Alternatively, you can extract the data separately into a flat file using the Export Data wizard (include your column headers and use comma separators with double-quote field delimiters).
To make a copy of your table:
SELECT Col1, Col2
INTO CloneTable
FROM MyTable
WHERE Col3 = @Condition
(Thanks to @MarkD for adding that)
Context
I am receiving CSV files in S3, which do not always follow the same schema and/or order. For example, sometimes files look like:
foo, bar, bla
hi , 007, 42
bye, 008, 44
But other times, they can look like (bar can be missing):
foo, bla
hi , 42
bye, 44
Now let's say I'm only interested in getting the foo column, regardless of what else is there. But I can't really count on the order of the columns in the CSV, so on some days foo could be the first column, while on other days it could be the third. By the way, I am using Snowflake as a database.
What I have tried to do
I created a destination table like:
CREATE TABLE woof.meow (foo TEXT);
Then I tried to use Snowflake's COPY INTO command to copy data from the CSV into the table I created. The catch here is that I tried to do it the same way I normally do for Parquet files (matching by column names!), like:
COPY INTO woof.meow
FROM '@STAGES.MY_S3_BUCKET_STAGE/'
FILE_FORMAT = (
    TYPE = CSV,
    COMPRESSION = GZIP
)
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
But sadly I always got: error: Insert value list does not match column list expecting 1 but got 0
Some research led me to this section of the docs (about MATCH_BY_COLUMN_NAME), which revealed that CSV is not supported:
This copy option is supported for the following data formats:
- JSON
- Avro
- ORC
- Parquet
Desired objective
How can I copy data from the stage (containing a CSV file on S3) to a pre-created table based on column names?
I am happy to provide any further information if needed.
You are trying to insert CSV (comma-separated values) data into a single text column. To my knowledge, the column order in your source data files must be the same as the column order you defined for the target table in Snowflake: if you have foo, bar, and bla as columns in the source CSV file, your target table must also have those columns, in the same order as in the source files.
If you are unsure which columns may appear in your source file, I would recommend transforming the file to JSON (that is my choice; you can pick another option such as Avro) and loading its content into a VARIANT column in Snowflake.
That way you would not need to worry about the order of columns in the source files: you would store the data as JSON/Avro in the target table and use Snowflake's JSON-handling mechanisms to convert the JSON values into columns (flatten the JSON into a relational table).
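A minimal sketch of that approach, assuming the files have already been converted to newline-delimited JSON and staged at the same location (the meow_raw table name is illustrative):

```sql
-- Land the raw JSON records into a single VARIANT column.
CREATE TABLE woof.meow_raw (v VARIANT);

COPY INTO woof.meow_raw
FROM '@STAGES.MY_S3_BUCKET_STAGE/'
FILE_FORMAT = (TYPE = JSON);

-- Pull out foo by name, regardless of where it sat in the original file;
-- the path lookup simply yields NULL for records where foo is absent.
INSERT INTO woof.meow
SELECT v:foo::TEXT FROM woof.meow_raw;
```

The key point is that the `v:foo` path lookup is by name, so the physical column order of the original CSV no longer matters.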
I'm new to PostgreSQL and am looking for some guidance and best practice.
I have created a table by importing data from a csv file. I then altered the table by creating multiple generated columns like this:
ALTER TABLE master
ADD office VARCHAR(50)
GENERATED ALWAYS AS (CASE WHEN LEFT(location,4)='Chic' THEN 'CHI'
ELSE LEFT(location,strpos(location,'_')-1) END) STORED;
But when I try to import new data into the table I get the following error:
ERROR: column "office" is a generated column
DETAIL: Generated columns cannot be used in COPY.
My goal is to be able to import new data each day to the table and have the generated columns automatically populate in order to transform the data as I would like. How can I do so?
CREATE TEMP TABLE master (location VARCHAR);
ALTER TABLE master
ADD office VARCHAR
GENERATED ALWAYS AS (
CASE
WHEN LEFT(location, 4) = 'Chic' THEN 'CHI'
ELSE LEFT(location, strpos(location, '_') - 1)
END
) STORED;
--INSERT INTO master (location) VALUES ('Chicago');
--INSERT INTO master (location) VALUES ('New_York');
COPY master (location) FROM $$d:\cities.csv$$ CSV;
SELECT * FROM master;
Is this the structure and the behaviour you are expecting? If not, please provide more details regarding your table structure, your importable data and your importing commands.
Also, maybe when you try to import the CSV file the columns are not linked properly, or maybe the delimiter is not set correctly. Try to specify each column in the exact order that they appear in your CSV file.
https://www.postgresql.org/docs/12/sql-copy.html
Note: d:\cities.csv contains:
Chicago
New_York
EDIT:
If columns positions are mixed up between table and csv, the following operation may come in handy:
1. CREATE TEMPORARY TABLE tmp (csv_column_1 <data_type>, csv_column_2 <data_type>, ...); (including ALL csv columns)
2. COPY tmp FROM '/path/to/file.csv' CSV;
3. INSERT INTO master (location, other_info, ...) SELECT csv_column_3, csv_column_7, ... FROM tmp;
Importing data using an intermediate table may slow things down a little, but gives you a lot of flexibility.
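Put together, the steps above can be sketched as follows (the column names and file path are illustrative):

```sql
-- Stage the CSV in a temporary table whose columns mirror the file exactly.
CREATE TEMPORARY TABLE tmp (csv_column_1 TEXT, csv_column_2 TEXT, csv_column_3 TEXT);

COPY tmp FROM '/path/to/file.csv' CSV;

-- Copy only the columns you need into master, in the order master expects;
-- generated columns (such as office) are simply left out of the column list,
-- so PostgreSQL computes them itself on insert.
INSERT INTO master (location)
SELECT csv_column_3 FROM tmp;

DROP TABLE tmp;
```

Because the generated column never appears in the INSERT column list, the "Generated columns cannot be used in COPY" error does not arise.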
I was getting the same error when importing to PG from a CSV. I found that even though my column was generated, I still had to have it in the imported data; I just left it empty. It worked fine once the column name was in there and mapped to my DB column name.
I have a dataset that I receive on a weekly basis, this dataset is a single column of unique identifiers. Currently this dataset is gathered manually by our support staff. I am trying to query this dataset (CSV file) in my WHERE clause of a SQL Query.
In order to add this dataset to my query I do some data transformation to tweak the formatting, the reformatted data is then pasted directly into the WHERE IN part of my query. Ideally I would have the ability to import this list to the SQL query directly potentially bypassing the manual effort involved in the data formatting and swapping between programs.
I am just wondering if this is possible, have tried my best to scour the internet and have had no luck finding any reference to this functionality.
Using WHERE IN makes this more complex than it needs to be. Store the IDs you want to filter on in a table called MyTableFilters, with a column holding the ID values to filter by, and join MyTable to MyTableFilters on ID. The join will cause MyTable to return rows only when its ID also appears in MyTableFilters:
SELECT *
FROM MyTable A
JOIN MyTableFilters F ON A.ID = F.ID
Since you don't really need to do any transformations or data manipulation of what you want to ETL, you could also easily truncate the table and use BULK INSERT to keep MyTableFilters up to date:
TRUNCATE TABLE dbo.MyTableFilters
BULK INSERT dbo.MyTableFilters
FROM 'X:\MyFilterTableIDSourceFile.csv'
WITH
(
FIRSTROW = 1,
DATAFILETYPE='widechar', -- UTF-16
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
TABLOCK,
KEEPNULLS -- Treat empty fields as NULLs.
)
I'm guessing that you currently have something like the following:
SELECT *
FROM MyTable t
WHERE t.UniqueID in ('ID12','ID345','ID84')
My recommendation would be to create table in which to store the IDs referenced in the WHERE clause. So for the above, your table would look like this:
UniqueID
========
ID12
ID345
ID84
Supposing the table is called UniqueIDs, the original query then becomes:
SELECT *
FROM MyTable t
WHERE t.UniqueID in (SELECT u.UniqueID FROM UniqueIDs u)
The question you're asking is then how to populate the UniqueIDs table. You need some means to expose that table to your users. There are several ways you could go about that. A lazy but relatively effective solution would be a simple MS Access database with that table as a "linked" table. You may need to be careful about permissions.
Alternatively, assuming you're wedded to the CSV, set up an SSIS job that clears down the table and then imports from the CSV into the UniqueIDs table.
I have a table with the structure:
I also have a csv containing all the coupon codes only.
I want to use the same values for the other fields (except id of course).
What would be the SQL required to insert these rows?
Using phpMyAdmin, you can insert data from CSV only if it contains all of the required (NOT NULL) column values; the others, when missing, will be filled with defaults, NULL, or auto-incremented values. With this tool it is not possible to insert only the codes from the CSV and use the same values for the other fields. So you have two options: either set the default values of the columns not being updated from the CSV to the desired ones, or create a PHP import script that does this for you (if you do not want to change the DB table schema).
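If you have file-import privileges on the MySQL server, a third option is a plain SQL load that supplies the shared values itself. This is only a sketch: the table name (coupons) and the extra columns (discount, active) are illustrative, since the actual table structure was not shown.

```sql
LOAD DATA INFILE '/path/to/codes.csv'
INTO TABLE coupons
-- Each line of the CSV holds one coupon code.
(code)
-- Fill the remaining columns with the same value for every inserted row;
-- the auto-increment id column is left alone.
SET discount = 10,
    active   = 1;
```

Any column not named in the column list or the SET clause falls back to its default, just as with the phpMyAdmin import.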
I have 12 columns with +/- 2000 rows in a sqlite DB.
Now I want to add a 13th column with the same amount of rows.
If I import the text from a CSV file, it gets added after the existing rows (leaving me with a 4000-row table).
How can I avoid adding it underneath these rows?
Do I need to create a script that runs through each row of the table and adds the text from the CSV file to each row?
If you have the code that imported the original data, and if the data has not changed in the meantime, you could just drop the table and reimport it.
Otherwise, you indeed have to create a script that looks up the corresponding record in the table and updates it.
You could also import the new data into a temporary table, and then copy the values over with a command like this:
UPDATE MyTable
SET NewColumn = (SELECT NewColumn
FROM TempTable
WHERE ID = MyTable.ID)
I ended up using RazorSQL, a great program.
http://www.razorsql.com/