INSERT SELECT with different table/column structure - SQL

I am trying to create an INSERT SELECT statement that inserts and converts data from Imported_table to Destination_table.
Imported_table
+------------------+-----------------------+
| Id (varchar(10)) | genre (varchar(4000)) |
+------------------+-----------------------+
| 6                | Comedy                |
| 5                | Comedy                |
| 1                | Action                |
+------------------+-----------------------+
Destination_table (how it should look)
+-----------------------------+----------------------------+
| genre_name (PK,varchar(50)) | description (varchar(255)) |
+-----------------------------+----------------------------+
| Comedy                      | Description of Comedy      |
| Action                      | Description of Action      |
+-----------------------------+----------------------------+
Imported_table.Id isn't used at all but is still present in this (old) table.
Destination_table.genre_name is a primary key and should be unique (distinct).
Destination_table.description is built with CONCAT('Description of ', genre).
My best attempt:
INSERT INTO testdb.dbo.Destination_table (genre_name, description)
SELECT DISTINCT Genre,
LEFT(Genre,50) AS genre_name,
CAST(CONCAT('Description of ',Genre) AS varchar(255)) AS description
FROM MYIMDB.dbo.Imported_table
Gives the error: The select list for the INSERT statement contains more items than the insert list. The number of SELECT values must match the number of INSERT columns.
Thanks in advance.

The main error in your query is that you are trying to insert three columns into a destination table that has only two. That being said, I would just use LEFT for both inserted values and take as much space as the new table can hold:
INSERT INTO testdb.dbo.Destination_table (genre_name, description)
SELECT DISTINCT
    LEFT(Genre, 50),
    'Description of ' + LEFT(Genre, 240) -- 240 + 15 = 255
FROM MYIMDB.dbo.Imported_table;
As a side note, the original genre field is 4000 characters wide, and your new table structure runs the risk of throwing away a lot of information. It is not clear whether you are concerned with this, but it is worth pointing out.
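If you want to see how much truncation would actually occur before running the insert, a quick sanity check (a sketch, assuming the same table and column names) is:

SELECT COUNT(*) AS would_be_truncated
FROM MYIMDB.dbo.Imported_table
WHERE LEN(Genre) > 50; -- rows whose genre_name would lose characters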

This means your SELECT list (Genre, genre_name, description) and your INSERT list (genre_name, description) don't match. You need to SELECT the same number of fields as you specify in your INSERT.
Try this:
INSERT INTO testdb.dbo.Destination_table (genre_name, description)
SELECT DISTINCT Genre,
    CAST(CONCAT('Description of ', Genre) AS varchar(255)) AS description
FROM MYIMDB.dbo.Imported_table;

You have 3 columns in your SELECT but only 2 in your INSERT list. Try:
INSERT INTO testdb.dbo.Destination_table (genre_name, description)
SELECT DISTINCT LEFT(Genre, 50) AS genre_name,
    CAST(CONCAT('Description of ', Genre) AS varchar(255)) AS description
FROM MYIMDB.dbo.Imported_table;

Related

An SQL query that uses values from two columns in a Between Operator and adds these two columns as a class for the result

In one table, I have a column that contains a letter and another that contains a letter of later alphabetical order, like 'A' for the former and 'R' for the latter. I want to use these two columns in a BETWEEN operator to search for words in another table that start with a letter from the first column and end with a letter from the second. So in my example, 'Air' would fit this requirement. The problem is I also need to add these two columns to the results, so that for my example the query would return 'Air' with 'A' and 'R' from the other table as two columns in my results. Sorry I can't be more explicit, as the data is sensitive.
Based on what you have described, here is one way to get the output.
create table t(id int, start_letter varchar(1), end_letter varchar(1));
create table search_data(words varchar(50));

insert into t values(1, 'A', 'R');
insert into search_data values('Air');
insert into search_data values('Amour');
insert into search_data values('Arogant');

select *
from search_data a
join t b
  on lower(substring(a.words, 1, 1)) = lower(b.start_letter)
 and lower(substring(reverse(a.words), 1, 1)) = lower(b.end_letter);
+-------+----+--------------+------------+
| words | id | start_letter | end_letter |
+-------+----+--------------+------------+
| Air | 1 | A | R |
| Amour | 1 | A | R |
+-------+----+--------------+------------+
db fiddle link
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=82cf80f4b76cb740ae56db8f236bfd46
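As a side note, the same check can be written with a single LIKE predicate instead of two substring comparisons (a sketch under the same single-character assumption; note that a one-letter word would not match, since the pattern requires both a first and a last character):

select *
from search_data a
join t b
  on lower(a.words) like lower(b.start_letter) + '%' + lower(b.end_letter);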

SQL Insert only the rows that have data in a specific column

I am basically a noob at this and have gotten this far from Google searches alone. This is an Access VBA and SQL inventory database.
I have a table that I populate via a barcode scanner, which looks like the following:
PartNo | SerialNo | Qty | Vehicle
-------+----------+-----+---------
test   |          | 1   | H2
test2  |          | 1   | H2
test3  | test3s/n | 1   | H2
test3  | test4s/n | 1   | H2
test   |          | 1   | H2
I am trying to update 2 tables from this, or insert if the PartNo doesn't exist.
tblPerm2 has PartNo as primary key
tblPerm1 has PartNo, SerialNo, Qty and Vehicle
PartNo must exist in tblPerm2 to be added to tblPerm1
I can get the PartNo inserted into tblPerm2 no problem, but I'm running into problems with tblPerm1.
I'm following user Parfait's example here: Update Existing Access Records from CSV Import, native to MS Access or in VB.NET.
I've tried an INSERT and an INSERT with a join. The code below adds everything to tblPerm1, including rows with no SerialNo. How can I insert only the rows from tblTemp that have a serial number?
INSERT INTO tblPerm1 (PartNo, SerialNo, Qty, Vehicle)
SELECT tblTemp.PartNo, tblTemp.SerialNo, tblTemp.Qty, tblTemp.Vehicle
FROM tblTemp
WHERE tblTemp.SerialNo IS NOT NULL;
I expect this to only insert the 2 'test3' rows, but all rows are inserted.
SELECT DISTINCT behaves the same, except there is only one entry for 'test'.
Once this is done, I'll delete from tblTemp and continue on updating and inserting. Maybe there is a better way?
Thanks in advance
Are the SerialNo values actually empty strings instead of NULL?
If this works, then yes they are:
INSERT INTO tblPerm1 (PartNo, SerialNo, Qty, Vehicle)
SELECT tblTemp.PartNo, tblTemp.SerialNo, tblTemp.Qty, tblTemp.Vehicle
FROM tblTemp
WHERE tblTemp.SerialNo <> '';
See How to check for Is not Null And Is not Empty string in SQL server? for more on checking for empty strings, with or without counting whitespace (though the details vary depending on which database engine you are running).
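If the column may hold NULL, empty strings, or whitespace-only values interchangeably, one defensive variant covers all three at once (a sketch in Access SQL, relying on Null & "" yielding an empty string):

INSERT INTO tblPerm1 (PartNo, SerialNo, Qty, Vehicle)
SELECT tblTemp.PartNo, tblTemp.SerialNo, tblTemp.Qty, tblTemp.Vehicle
FROM tblTemp
WHERE Trim(tblTemp.SerialNo & "") <> "";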

Create a table without knowing its columns in SQL

How can I create a table without knowing in advance how many and what columns it exactly holds?
The idea is that I have a table DATA that has 3 columns : ID, NAME, and VALUE
What I need is a way to get multiple values depending on the value of NAME - I can't do it with simple WHERE or JOIN (because I'll need other values - with other NAME values - later on in my query).
Because of the way this table is constructed I want to PIVOT it in order to transform every distinct value of NAME into a column so it will be easier to get to it in my later search.
What I want now is to somehow save this to a temp table / variable so I can use it later on to join with the result of another query...
So example:
Columns:
CREATE TABLE MainTab
(
    id int,
    nameMain varchar(max),
    notes varchar(max)
);

CREATE TABLE SecondTab
(
    id int,
    id_mainTab int,
    nameSecond varchar(max),
    notes varchar(max)
);

CREATE TABLE DATA
(
    id int,
    id_second int,
    name varchar(max),
    value varchar(max)
);
Now some example data from the table DATA:
| id | id_second | name       | value           |
|----|-----------|------------|-----------------|
| 1  | 5550      | number     | 111115550       |
| 2  | 6154      | address    | 1, First Avenue |
| 3  | 1784      | supervisor | John Smith      |
| 4  | 3467      | function   | Marketing       |
| 5  | 9999      | start_date | 01/01/2000      |
| .. | ...       | ...        | ...             |
Now imagine that 'name' has A LOT of different values, and in one query I'll need to get a lot of different values depending on the value of 'name'...
That's why I pivot it, so that number, address, supervisor, function, start_date, ... become columns.
This I do dynamically because of the amount of possible columns - it would take me a while to write all of them in an 'IN' statement - and I don't want to have to remember to add it manually every time a new 'name' value gets added...
Therefore I followed http://sqlhints.com/2014/03/18/dynamic-pivot-in-sql-server/
The thing is now that I want the result of my execute(@query) to be stored in a temp table / variable. I want to use it later on to join it with MainTab...
It would be nice if I could use @cols (which holds the values of DATA.name), but I can't seem to figure out a way to do this.
ADDITIONALLY:
If I use the non-dynamic way (writing down all the values manually after 'IN'), I still need to create a column called status. In this column (so far it's NULL everywhere, because that value doesn't exist in my unpivoted table) I want to have 'open' or 'closed', depending on the date (let's say I have start_date and end_date):
CASE
    WHEN end_date < GETDATE() THEN 'closed'
    ELSE 'open'
END AS status
Where can I put this statement? Let's say my main query looks like this:
SELECT *
FROM (SELECT id_second, name, value, id FROM TABLE_DATA) src
PIVOT (max(value) FOR name IN ([number], [address], [supervisor], [function], [start_date], [end_date], [status])) AS pivotTab
JOIN SecondTab ON SecondTab.id = pivotTab.id_second
JOIN MainTab ON MainTab.id = SecondTab.id_mainTab
WHERE pivotTab.status = 'closed';
Well, as far as I can understand, you have some select statement and just need to "dump" its result into a temporary table. In this case you can use the SELECT ... INTO syntax, like:
select .....
into #temp_table
from ....
This will create a temporary table according to the columns in the select statement and populate it with the data returned by that statement.
See MSDN for reference.
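One caveat when combining this with the dynamic pivot: a local temp table created inside EXECUTE(@query) disappears when that dynamic batch ends. A common workaround (a sketch, assuming @query holds the generated pivot statement) is to have the dynamic SQL write into a global temp table instead:

-- SELECT ... INTO ##pivotTab inside the dynamic string: a global temp table (##)
-- survives the end of the EXEC batch, unlike a local #temp table
SET @query = 'SELECT ... INTO ##pivotTab FROM ... /* pivot goes here */';
EXEC (@query);

-- the pivoted result can now be joined like any other table
SELECT *
FROM ##pivotTab p
JOIN SecondTab s ON s.id = p.id_second;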

Which search strategy in SQL (postgres) to use in order to find similar strings

I have a table with records in one column that differ only in how they are written.
So how can I find those and save the corresponding id's in a new table?
e.g. I have the following records in a column for cities.
Id name
1 berlin
2 ber lin
3 ber-lin
4 Berlin
5 Hamburg
6 New York
7 NewYork
So my first approach would be to remove any special characters, including whitespace, then lowercase everything, see which records match, and then write the ids to a new table?
What would be the best and most reliable way to find matches?
If removing some characters (' ' and '-' in the example) and lower-casing is enough to identify duplicates:
CREATE TABLE tbl_folded AS
SELECT lower(translate(name, ' -', '')) AS base_name
, array_agg(id) AS ids
FROM tbl
GROUP BY 1;
SQL Fiddle.
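With the sample data, this yields one row per folded name, e.g. base_name 'berlin' with ids {1,2,3,4} (the order inside each array is not guaranteed without an ORDER BY in array_agg).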
translate() is particularly useful to replace (or remove) a list of single characters.
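For example, characters in the from-list that have no counterpart in the to-list are simply removed:

SELECT lower(translate('Ber- lin', ' -', '')); -- returns 'berlin'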
Use CREATE TABLE AS to create a new table from the results of a query.
An overview of Postgres' pattern matching capabilities in this related answer on dba.SE:
Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL
This could certainly be optimized, but it works:
CREATE TABLE test (id INT(9) NOT NULL AUTO_INCREMENT PRIMARY KEY, name VARCHAR(50) NOT NULL);
INSERT INTO test (name) VALUES ('berlin');
INSERT INTO test (name) VALUES ('ber lin');
INSERT INTO test (name) VALUES ('ber-lin');
INSERT INTO test (name) VALUES ('Berlin');
INSERT INTO test (name) VALUES ('Hamburg');
INSERT INTO test (name) VALUES ('New York');
INSERT INTO test (name) VALUES ('NewYork');

CREATE TABLE tmp_clean_text (id INT(9) NOT NULL, name VARCHAR(50) NOT NULL);
INSERT INTO tmp_clean_text (id, name)
SELECT id, REPLACE(REPLACE(LOWER(name), ' ', ''), '-', '') FROM test;

CREATE TABLE results (name VARCHAR(50) NOT NULL);
INSERT INTO results (name) SELECT DISTINCT name FROM tmp_clean_text;

UPDATE results SET results.name = CONCAT(results.name, ' ', (
    SELECT GROUP_CONCAT(tmp_clean_text.id)
    FROM tmp_clean_text
    WHERE tmp_clean_text.name = results.name
));

DROP TABLE tmp_clean_text;
It looks to me like you're trying for low edit distance. When I had a similar problem with low-quality, manually entered data, I used a list of "correct" place names (perhaps "New York" in your sample data), and then used a cross join of all the rows of bad data and all of the correct names, computed the edit distance for each pairing, and took the minimum for each pairing as the "match."
PostgreSQL includes the Levenshtein edit distance function in its fuzzystrmatch library, as others have mentioned.
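If the extension is not yet installed, enabling and testing it is a one-liner each (assuming sufficient privileges):

CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;
SELECT levenshtein('ber lin', 'berlin'); -- 1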
Edit: here's some code, assuming cities contains the data in the post and normalized_cities contains (HAMBURG, BERLIN, NEWYORK) per the later comment:
select distinct id, name,
       first_value(normalized_name) over (partition by id order by edit_distance)
from (
    select id, name, normalized_name,
           levenshtein(upper(name), normalized_name) as edit_distance
    from cities cross join normalized_cities
) all_pairs;
id | name | first_value
----+----------+-------------
1 | berlin | BERLIN
2 | ber lin | BERLIN
3 | ber-lin | BERLIN
4 | Berlin | BERLIN
5 | Hamburg | HAMBURG
6 | New York | NEWYORK
7 | NewYork | NEWYORK
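Note that the cross join computes one distance per (name, normalized_name) pair, which is fine for short lists but grows as the product of the two table sizes; for larger data sets, the pg_trgm extension with a trigram similarity index is the usual way to prune candidate pairs first.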

SQL Insert trigger access old values cleanly?

I'm new to SQL beyond basic queries / inserts (as you'll see quickly as you read further)
Here's a (very) simplified example.
I have table 'person' like:
UID | NAME | AGE | LOCATION
45 | bob | 23 | Canada
31 | bill | 20 | Romania
and a second table 'person_history' like:
UID | PID | NAME | AGE | LOCATION
- | - | - | - | -
when I update a row in this table, like
update person set age = 10 where UID = 45;
I want my trigger to fire, access the existing values in person, push them into the second table, and then continue with the original update.
The way I can think to do this is:
Select uid, name, age, location
into v_uid, v_name, v_age, v_location
from person
where uid = :new.uid;
then do the insert like
Insert into person_history(UID, PID, NAME, AGE, LOCATION)
VALUES (sequence.nextval, v_uid, v_name, v_age, v_location);
but this seems like a very round-about way of doing it - especially if the table has 50 columns.
Is this the correct method, and is there a more elegant way of approaching this problem.
Again, keep in mind how new I am to all of this, so examples would be really helpful.
Why do you do this part?
Select uid, name, age, location
into v_uid, v_name, v_age, v_location
from person
where uid = :new.uid;
Just use the :old pseudo-record, which already holds the row's existing values in an update trigger:
INSERT INTO person_history (UID, PID, NAME, AGE, LOCATION)
VALUES (sequence.nextval, :old.uid, :old.name, :old.age, :old.location);
Regarding the quantity of columns, I'm not sure it is possible to do it in a "smarter" way.
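For completeness, a minimal sketch of the whole trigger (assuming Oracle, the tables as posted, and a sequence called person_history_seq; the names are illustrative):

CREATE OR REPLACE TRIGGER person_history_trg
BEFORE UPDATE ON person
FOR EACH ROW
BEGIN
  -- :old exposes the row as it was before the update, so no extra SELECT
  -- against person is needed (which would also risk an ORA-04091
  -- mutating-table error in a row-level trigger)
  INSERT INTO person_history (UID, PID, NAME, AGE, LOCATION)
  VALUES (person_history_seq.NEXTVAL, :old.uid, :old.name, :old.age, :old.location);
END;
/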