Create an SQL query from two tables in postgresql - sql

I have two tables as shown in the image. I want to create a SQL query in postgresql to get the pkey and minimum count for each unique 'pkey' in table 1 where 'name1' is not present in the array of column 'name' in table 2.
'name' is an array.

You can use ANY to check whether a given element is present in the name array.
create table t1 (pkey int, cnt int);
create table t2 (pkey int, name text[]);
insert into t1 values (1, 11),(1, 9),(2, 14),(2, 15),(3, 21),(3,16);
insert into t2 values
(1, array['name1','name2']),
(1, array['name3','name2']),
(2, array['name4','name1']),
(2, array['name5','name2']),
(3, array['name2','name3']),
(3, array['name4','name5']);
select pkey
from t2
where 'name1' = any(name);
| pkey |
| ---: |
| 1 |
| 2 |
select t1.pkey, min(cnt) count
from t1
where not exists (select 1
from t2
where t2.pkey = t1.pkey
and 'name1' = any(name))
group by t1.pkey;
pkey | count
---: | ----:
3 | 16

Related

Nested Insert query

I'm trying to write a nested INSERT INTO query: the values that are inserted into "Table1" should also be inserted into a new table, "Table3".
"Table3" has exactly the same columns as "Table1", but without the second insert into value into "Table2".
"Table1" already contains old data, but only the new data inserted in "Table1" should be inserted in "Table3".
-- insert new rows in History
INSERT INTO "table1"
([icao24], [callsign])
SELECT
CurInserts.icao24
,CurInserts.[callsign]
FROM
(
-- INSERT new rows FROM temptable IN currenttable
INSERT INTO "table2"
([icao24], [callsign])
OUTPUT
inserted.[icao24]
,inserted.[callsign]
SELECT
T.[icao24]
,T.[callsign]
FROM #TempTable T
LEFT JOIN "table2" Cur
ON T.[icao24] = Cur.[icao24]
WHERE Cur.[icao24] IS NULL
) CurInserts;
You could first insert the inserted rows into a table variable or a temporary table.
Then insert into the two other tables from that table variable.
Demonstration
declare @tmp table (icao24 int, callsign int);
declare @ins table (icao24 int, callsign int);
insert into @tmp (icao24, callsign) values (1,10),(2,20),(3,30);
insert into @ins (icao24, callsign)
select icao24, callsign
from
(
insert into table1 (icao24, callsign)
output inserted.icao24, inserted.callsign
select tmp.icao24, tmp.callsign
from @tmp tmp
left join table1 t on t.icao24 = tmp.icao24
where t.icao24 is null
) q;
insert into table2 (icao24, callsign)
select icao24, callsign
from @ins;
insert into table3 (icao24, callsign)
select icao24, callsign
from @ins;
select * from table1;
select * from table2;
select * from table3;
icao24 | callsign
-----: | -------:
1 | 10
2 | 20
3 | 30
icao24 | callsign
-----: | -------:
1 | 10
2 | 20
3 | 30
icao24 | callsign
-----: | -------:
2 | 20
3 | 30
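The staging idea above can also be tried outside SQL Server. The following is a minimal sketch using Python's built-in sqlite3 module, not the answer's exact method: it does not use an OUTPUT clause, so the new rows are computed into a staging table before the insert rather than captured from it. The pre-existing rows in table1 and table2 are an assumption, chosen so that the final contents match the result tables shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (icao24 int, callsign int);
    CREATE TABLE table2 (icao24 int, callsign int);
    CREATE TABLE table3 (icao24 int, callsign int);

    -- Assumed pre-existing data (not stated in the original demo)
    INSERT INTO table1 VALUES (1, 10);
    INSERT INTO table2 VALUES (1, 10);

    -- Incoming rows, as in the demo
    CREATE TEMP TABLE tmp (icao24 int, callsign int);
    INSERT INTO tmp VALUES (1, 10), (2, 20), (3, 30);

    -- Stage only the rows that are not yet in table1
    CREATE TEMP TABLE ins AS
    SELECT tmp.icao24, tmp.callsign
    FROM tmp
    LEFT JOIN table1 t ON t.icao24 = tmp.icao24
    WHERE t.icao24 IS NULL;

    -- Fan the staged rows out to all three tables
    INSERT INTO table1 SELECT icao24, callsign FROM ins;
    INSERT INTO table2 SELECT icao24, callsign FROM ins;
    INSERT INTO table3 SELECT icao24, callsign FROM ins;
""")

def contents(table):
    # Table names here are fixed literals from this sketch, not user input.
    return conn.execute(
        f"SELECT icao24, callsign FROM {table} ORDER BY icao24"
    ).fetchall()

t1 = contents("table1")
t2 = contents("table2")
t3 = contents("table3")
print(t1, t2, t3)
```

Only table3 receives nothing but the new rows, because it starts out empty in this sketch.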

How to avoid duplicate entries in Hive?

Create a table with primary key in Hive.
Insert the identical data record several times.
How can you prevent the same data record (primary key) from being inserted more than once, without using a second temporary table?
drop table t1;
CREATE TABLE IF NOT EXISTS `t1` (
`ID` BIGINT DEFAULT SURROGATE_KEY(),
`Name` STRING NOT NULL DISABLE NOVALIDATE,
CONSTRAINT `PK_t1` PRIMARY KEY (`ID`) DISABLE NOVALIDATE);
select * from t1;
+--------+----------+
| t1.id | t1.name |
+--------+----------+
+--------+----------+
insert into t1 values (1, "Hi");
insert into t1 values (1, "Hi");
insert into t1 values (1, "Hi");
select * from t1;
+--------+----------+
| t1.id | t1.name |
+--------+----------+
| 1 | Hi |
| 1 | Hi |
| 1 | Hi |
+--------+----------+
I tried unsuccessfully with a merge:
MERGE INTO t1
USING (select * from t1) sub
ON sub.id != t1.id
WHEN not matched then insert values (2, "World");

SQLite query - filter name where each associated id is contained within a set of ids

I'm trying to work out a query that will find me all of the distinct Names whose LocationIDs are in a given set of ids. The catch is if any of the LocationIDs associated with a distinct Name are not in the set, then the Name should not be in the results.
Say I have the following table:
ID | LocationID | ... | Name
-----------------------------
1 | 1 | ... | A
2 | 1 | ... | B
3 | 2 | ... | B
I need a query similar to
SELECT DISTINCT Name FROM table WHERE LocationID IN (1, 2);
The problem with the above is that it only checks whether the LocationID is 1 OR 2, so it returns the following:
A
B
But what I need it to return is
B
Since B is the only Name where both of its LocationIDs are in the set (1, 2)
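You can try writing two subqueries:
one counts the distinct LocationIDs that match your condition across the whole table;
the other counts the distinct matching LocationIDs for each Name.
Then join them on the count, so that only Names whose count equals the overall matching count are returned.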
Schema (SQLite v3.17)
CREATE TABLE T(
ID int,
LocationID int,
Name varchar(5)
);
INSERT INTO T VALUES (1, 1,'A');
INSERT INTO T VALUES (2, 1,'B');
INSERT INTO T VALUES (3, 2,'B');
Query #1
SELECT t2.Name
FROM
(
SELECT COUNT(DISTINCT LocationID) cnt
FROM T
WHERE LocationID IN (1, 2)
) t1
JOIN
(
SELECT COUNT(DISTINCT LocationID) cnt,Name
FROM T
WHERE LocationID IN (1, 2)
GROUP BY Name
) t2 on t1.cnt = t2.cnt;
| Name |
| ---- |
| B |
You can just use aggregation. Assuming no duplicates in your table:
SELECT Name
FROM table
WHERE LocationID IN (1, 2)
GROUP BY Name
HAVING COUNT(*) = 2;
If Name/LocationID pairs can be duplicated, use HAVING COUNT(DISTINCT LocationID) = 2.
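As a quick check of the aggregation approach, here is a minimal sketch using Python's built-in sqlite3 module. The table name T follows the first answer; the extra Name 'C' (with LocationIDs 1, 2 and 3) is a hypothetical addition to show the difference between "covers every id in the set" and "has no id outside the set". The second query is one possible stricter variant, not something taken from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (ID int, LocationID int, Name varchar(5));
    INSERT INTO T VALUES (1, 1, 'A');
    INSERT INTO T VALUES (2, 1, 'B');
    INSERT INTO T VALUES (3, 2, 'B');
    -- Hypothetical extra rows: C covers the set but also has LocationID 3
    INSERT INTO T VALUES (4, 1, 'C');
    INSERT INTO T VALUES (5, 2, 'C');
    INSERT INTO T VALUES (6, 3, 'C');
""")

# Names that have every LocationID in the set (1, 2)
covering = [row[0] for row in conn.execute("""
    SELECT Name
    FROM T
    WHERE LocationID IN (1, 2)
    GROUP BY Name
    HAVING COUNT(DISTINCT LocationID) = 2
    ORDER BY Name
""")]

# Stricter variant: additionally reject Names with any LocationID outside the set
strict = [row[0] for row in conn.execute("""
    SELECT Name
    FROM T
    GROUP BY Name
    HAVING COUNT(DISTINCT CASE WHEN LocationID IN (1, 2) THEN LocationID END) = 2
       AND SUM(CASE WHEN LocationID NOT IN (1, 2) THEN 1 ELSE 0 END) = 0
    ORDER BY Name
""")]

print(covering)  # ['B', 'C']
print(strict)    # ['B']
```

Which variant is appropriate depends on whether a Name with additional LocationIDs outside the set should still qualify.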

SQL Server 2014: SELECT only those rows that match all rows in another table

I am a beginner in SQL Server. I am trying to solve this problem:
Select all (distinct) "item_id" from "ItemTag" table whose corresponding "tag_id" values match at least "all" values in the "UserTagList" table.
I tried the join below, but instead of the result I got, the query should return only item_id 5, since it has both tag_ids 3 and 4.
Any help is deeply appreciated.
Below is the SQL schema
SQL Server 2014 schema setup:
CREATE TABLE UserTagList (id INT);
INSERT INTO UserTagList (id)
VALUES (3), (4);
CREATE TABLE ItemTag (id INT, item_id INT, tag_id INT);
INSERT INTO ItemTag (id, item_id, tag_id)
VALUES (1, 5, 3), (2, 5, 4), (3, 5, 6), (4, 6, 3), (5, 7, 4);
Query 1:
SELECT i.item_id, i.tag_id
FROM ItemTag AS i
JOIN UserTagList AS u ON i.tag_id = u.id
Results:
| item_id | tag_id |
|---------|--------|
| 5 | 3 |
| 5 | 4 |
| 6 | 3 |
| 7 | 4 |
This is a query where group by and having are useful:
select i.item_id
from ItemTag i join
UserTagList u
on i.tag_id = u.id
group by i.item_id
having count(*) = (select count(*) from UserTagList);
I would use a left join here with aggregation:
SELECT t1.item_id
FROM ItemTag t1
LEFT JOIN UserTagList t2
ON t1.tag_id = t2.id
GROUP BY t1.item_id
HAVING SUM(CASE WHEN t2.id IS NULL THEN 1 ELSE 0 END) = 0
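The logic here is that if, for a given item_id group, one or more of its tags has no match in the UserTagList table, the sum in the HAVING clause counts that unmatched row and the group is filtered out. Note that this keeps the items whose tags all appear in UserTagList (item_id 6 and 7 in the sample data) and drops item_id 5 because of its tag_id 6, so it tests the reverse containment of the one asked for.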
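To see how the two answers behave on the question's sample data, here is a small sketch using Python's built-in sqlite3 module. SQLite stands in for SQL Server 2014 here, which is an assumption of this sketch; the statements used are plain joins and aggregates that both engines accept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UserTagList (id INT);
    INSERT INTO UserTagList (id) VALUES (3), (4);
    CREATE TABLE ItemTag (id INT, item_id INT, tag_id INT);
    INSERT INTO ItemTag (id, item_id, tag_id)
    VALUES (1, 5, 3), (2, 5, 4), (3, 5, 6), (4, 6, 3), (5, 7, 4);
""")

# GROUP BY / HAVING: items that match every tag in UserTagList
having_items = [row[0] for row in conn.execute("""
    SELECT i.item_id
    FROM ItemTag i
    JOIN UserTagList u ON i.tag_id = u.id
    GROUP BY i.item_id
    HAVING COUNT(*) = (SELECT COUNT(*) FROM UserTagList)
    ORDER BY i.item_id
""")]

# LEFT JOIN variant: items that have no tag outside UserTagList
left_join_items = [row[0] for row in conn.execute("""
    SELECT t1.item_id
    FROM ItemTag t1
    LEFT JOIN UserTagList t2 ON t1.tag_id = t2.id
    GROUP BY t1.item_id
    HAVING SUM(CASE WHEN t2.id IS NULL THEN 1 ELSE 0 END) = 0
    ORDER BY t1.item_id
""")]

print(having_items)     # [5]
print(left_join_items)  # [6, 7]
```

The COUNT(*) comparison assumes that neither table contains duplicate tag rows; with duplicates, counting distinct tag_id values would be the safer comparison.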

Merge two rows in SQL

Assuming I have a table containing the following information:
FK | Field1 | Field2
=====================
3 | ABC | *NULL*
3 | *NULL* | DEF
is there a way I can perform a select on the table to get the following
FK | Field1 | Field2
=====================
3 | ABC | DEF
Thanks
Edit: Fix field2 name for clarity
Aggregate functions may help you out here. Aggregate functions ignore NULLs (at least that's true on SQL Server, Oracle, and Jet/Access), so you could use a query like this (tested on SQL Server Express 2008 R2):
SELECT
FK,
MAX(Field1) AS Field1,
MAX(Field2) AS Field2
FROM
table1
GROUP BY
FK;
I used MAX, but any aggregate which picks one value from among the GROUP BY rows should work.
Test data:
CREATE TABLE table1 (FK int, Field1 varchar(10), Field2 varchar(10));
INSERT INTO table1 VALUES (3, 'ABC', NULL);
INSERT INTO table1 VALUES (3, NULL, 'DEF');
INSERT INTO table1 VALUES (4, 'GHI', NULL);
INSERT INTO table1 VALUES (4, 'JKL', 'MNO');
INSERT INTO table1 VALUES (4, NULL, 'PQR');
Results:
FK Field1 Field2
-- ------ ------
3 ABC DEF
4 JKL PQR
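The same test data can be run through Python's built-in sqlite3 module as a quick sanity check. SQLite is not one of the engines named in the answer, so treating it as equivalent is an assumption of this sketch; its MAX aggregate also skips NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (FK int, Field1 varchar(10), Field2 varchar(10));
    INSERT INTO table1 VALUES (3, 'ABC', NULL);
    INSERT INTO table1 VALUES (3, NULL, 'DEF');
    INSERT INTO table1 VALUES (4, 'GHI', NULL);
    INSERT INTO table1 VALUES (4, 'JKL', 'MNO');
    INSERT INTO table1 VALUES (4, NULL, 'PQR');
""")

# MAX ignores NULLs, so each group collapses to its largest non-NULL value
rows = conn.execute("""
    SELECT FK, MAX(Field1) AS Field1, MAX(Field2) AS Field2
    FROM table1
    GROUP BY FK
    ORDER BY FK
""").fetchall()

print(rows)  # [(3, 'ABC', 'DEF'), (4, 'JKL', 'PQR')]
```

For FK 4 the two columns come from different source rows ('JKL' and 'PQR'), which is worth keeping in mind when a group has more than one non-NULL value per column.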
There are a few ways depending on some data rules that you have not included, but here is one way using what you gave.
SELECT
t1.Field1,
t2.Field2
FROM Table1 t1
LEFT JOIN Table1 t2 ON t1.FK = t2.FK AND t2.Field1 IS NULL
Another way:
SELECT
t1.Field1,
(SELECT t2.Field2 FROM Table1 t2 WHERE t2.FK = t1.FK AND t2.Field1 IS NULL) AS Field2
FROM Table1 t1
There might be neater methods, but the following could be one approach:
SELECT t.fk,
(
SELECT t1.Field1
FROM `table` t1
WHERE t1.fk = t.fk AND t1.Field1 IS NOT NULL
LIMIT 1
) Field1,
(
SELECT t2.Field2
FROM `table` t2
WHERE t2.fk = t.fk AND t2.Field2 IS NOT NULL
LIMIT 1
) Field2
FROM `table` t
WHERE t.fk = 3
GROUP BY t.fk;
Test Case:
CREATE TABLE `table` (fk int, Field1 varchar(10), Field2 varchar(10));
INSERT INTO `table` VALUES (3, 'ABC', NULL);
INSERT INTO `table` VALUES (3, NULL, 'DEF');
INSERT INTO `table` VALUES (4, 'GHI', NULL);
INSERT INTO `table` VALUES (4, NULL, 'JKL');
INSERT INTO `table` VALUES (5, NULL, 'MNO');
Result:
+------+--------+--------+
| fk | Field1 | Field2 |
+------+--------+--------+
| 3 | ABC | DEF |
+------+--------+--------+
1 row in set (0.01 sec)
Running the same query without the WHERE t.fk = 3 clause, it would return the following result-set:
+------+--------+--------+
| fk | Field1 | Field2 |
+------+--------+--------+
| 3 | ABC | DEF |
| 4 | GHI | JKL |
| 5 | NULL | MNO |
+------+--------+--------+
3 rows in set (0.01 sec)
I had a similar problem. The difference was that I needed far more control over what I was returning, so I ended up with a simple, clear, but rather long query. Here is a simplified version of it based on your example.
select main.id, Field1_Q.Field1, Field2_Q.Field2
from
(
select distinct id
from Table1
)as main
left outer join (
select id, max(Field1) as Field1
from Table1
where Field1 is not null
group by id
) as Field1_Q on main.id = Field1_Q.id
left outer join (
select id, max(Field2) as Field2
from Table1
where Field2 is not null
group by id
) as Field2_Q on main.id = Field2_Q.id
;
The trick here is that the first select 'main' selects the rows to display. Then you have one select per field. What is being joined on should be all of the same values returned by the 'main' query.
Be warned: those other queries need to return only one row per id, or you will be ignoring data.
If one row has a value in the Field1 column and the other rows have NULL there, then this query might work.
SELECT
FK,
MAX(Field1) as Field1,
MAX(Field2) as Field2
FROM
(
select FK,ISNULL(Field1,'') as Field1,ISNULL(Field2,'') as Field2 from table1
)
tbl
GROUP BY FK
My case is I have a table like this
---------------------------------------------
|company_name|company_ID|CA | WA |
---------------------------------------------
|Costco | 1 |NULL | 2 |
---------------------------------------------
|Costco | 1 |3 |Null |
---------------------------------------------
And I want it to be like below:
---------------------------------------------
|company_name|company_ID|CA | WA |
---------------------------------------------
|Costco | 1 |3 | 2 |
---------------------------------------------
Most code is almost the same:
SELECT
company_name,
company_ID,
MAX(CA) AS CA,
MAX(WA) AS WA
FROM
table1
GROUP BY company_name,company_ID
The only difference is the GROUP BY: if you put two column names into it, the rows are grouped by each distinct pair of values.
SELECT Q.FK
,ISNULL(T1.Field1, T2.Field2) AS Field
FROM (SELECT FK FROM Table1
UNION
SELECT FK FROM Table2) AS Q
LEFT JOIN Table1 AS T1 ON T1.FK = Q.FK
LEFT JOIN Table2 AS T2 ON T2.FK = Q.FK
If there is only one table, write Table1 instead of Table2.