SQL Server select column names from multiple tables

I have three tables in SQL Server with following structure:
col1 col2 a1 a2 ... an,
col1 col2 b1 b2 ... bn,
col1 col2 c1 c2 ... cn
The first two columns, col1 and col2, are the same in all three tables; however, the tables have different numbers of columns.
I need to select the column names of the tables, and the result I'm trying to achieve is the following:
col1, col2, a1, b1, c1, a2, b2, c2 ...
Is there a way to do it?

It's possible, but the col1 and col2 values of the three tables are each combined into a single column.
For example
SELECT A.col1 +'/' +B.col1 +'/' + C.col1 As Col1 ,
A.col2 +'/' +B.col2 +'/' + C.col2 As col2 ,a1, b1, c1, a2, b2, c2 ,
* FROM A
INNER JOIN B
ON A.ID =B.ID
INNER JOIN C
ON C.ID = B.ID

SQL-Server is not the right tool to create a generic resultset. The engine needs to know what's coming out in advance. Well, you might try to find a solution with dynamic SQL...
I want to suggest two different approaches.
Both would work with any number of tables, as long as all of them have the columns col1 and col2 with appropriate types.
Let's create a simple mockup scenario first:
CREATE TABLE #mockup1 (col1 INT,col2 INT,SomeMore1 VARCHAR(100),SomeMore2 VARCHAR(100));
INSERT INTO #mockup1 VALUES(1,1,'blah 1.1','blub 1.1')
,(1,2,'blah 1.2','blub 1.2')
,(1,100,'not in t2','not in t2');
CREATE TABLE #mockup2 (col1 INT,col2 INT,OtherType1 INT,OtherType2 DATETIME);
INSERT INTO #mockup2 VALUES(1,1,101,GETDATE())
,(1,2,102,GETDATE()+1)
,(1,200,200,GETDATE()+200);
--You can add as many tables as you need
A very pragmatic approach:
Try this simple FULL OUTER JOIN:
SELECT *
FROM #mockup1 m1
FULL OUTER JOIN #mockup2 m2 ON m1.col1=m2.col1 AND m1.col2=m2.col2
--add more tables here
The result
+------+------+-----------+-----------+------+------+------------+-------------------------+
| col1 | col2 | SomeMore1 | SomeMore2 | col1 | col2 | OtherType1 | OtherType2 |
+------+------+-----------+-----------+------+------+------------+-------------------------+
| 1 | 1 | blah 1.1 | blub 1.1 | 1 | 1 | 101 | 2019-03-08 10:53:20.257 |
+------+------+-----------+-----------+------+------+------------+-------------------------+
| 1 | 2 | blah 1.2 | blub 1.2 | 1 | 2 | 102 | 2019-03-09 10:53:20.257 |
+------+------+-----------+-----------+------+------+------------+-------------------------+
| 1 | 100 | not in t2 | not in t2 | NULL | NULL | NULL | NULL |
+------+------+-----------+-----------+------+------+------------+-------------------------+
| NULL | NULL | NULL | NULL | 1 | 200 | 200 | 2019-09-24 10:53:20.257 |
+------+------+-----------+-----------+------+------+------------+-------------------------+
But you will have to deal with non-unique column names... (This is where a dynamically created statement can help.)
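One way to prepare such a dynamic statement is to build the select list outside the database. The following is a minimal sketch, not the answer's own method: it assumes the non-key column names of each table are already known (for instance read from sys.columns), the function name and the alias scheme `[A_a1]` are hypothetical, and the key columns are wrapped in COALESCE because a FULL OUTER JOIN can leave them NULL on either side. It produces the interleaved order asked for in the question (a1, b1, c1, a2, ...).

```python
from itertools import zip_longest

def interleaved_select_list(key_cols, tables):
    """Return select-list items: key columns first, then the remaining
    columns of every table interleaved (a1, b1, c1, a2, b2, c2, ...).

    `tables` maps a table alias to its non-key column names.
    Identifiers are bracket-quoted; names are assumed not to contain ']'.
    """
    parts = []
    for key in key_cols:
        # With FULL OUTER JOIN the key may be NULL in any one table.
        candidates = ", ".join(f"[{alias}].[{key}]" for alias in tables)
        parts.append(f"COALESCE({candidates}) AS [{key}]")
    # zip_longest pads shorter tables with None, which is skipped below.
    for group in zip_longest(*tables.values()):
        for alias, col in zip(tables, group):
            if col is not None:
                # Prefix with the alias so every output name is unique.
                parts.append(f"[{alias}].[{col}] AS [{alias}_{col}]")
    return parts

# Hypothetical column lists of three tables with different widths.
tables = {"A": ["a1", "a2", "a3"], "B": ["b1"], "C": ["c1", "c2"]}
select_items = interleaved_select_list(["col1", "col2"], tables)
print(",\n".join(select_items))
```

The resulting text would still have to be combined with the FROM and JOIN clauses and executed as dynamic SQL; the sketch only covers the column ordering and naming.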
A generic approach using container type XML
Whenever you do not know the result in advance, you can pack the result into a container. This keeps a clear structure on the RDBMS side and leaves the question of how to deal with this set to the consumer.
The CTE reads all existing pairs of col1 and col2.
Each table's row(s) for a pair of values are returned as XML.
Pairs that do not exist in one of the tables show up as NULL for that table.
Try this out:
WITH AllDistinctCol1Col2Values AS
(
SELECT col1,col2 FROM #mockup1
UNION ALL
SELECT col1,col2 FROM #mockup2
--add all your tables here
)
SELECT col1,col2
,(SELECT * FROM #mockup1 x WHERE c1c2.col1=x.col1 AND c1c2.col2=x.col2 FOR XML PATH('row'),TYPE) AS Content1
,(SELECT * FROM #mockup2 x WHERE c1c2.col1=x.col1 AND c1c2.col2=x.col2 FOR XML PATH('row'),TYPE) AS Content2
FROM AllDistinctCol1Col2Values c1c2
GROUP BY col1,col2;
The result
+------+------+-----------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
| col1 | col2 | Content1 | Content2 |
+------+------+-----------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
| 1 | 1 | <row><col1>1</col1><col2>1</col2><SomeMore1>blah 1.1</SomeMore1><SomeMore2>blub 1.1</SomeMore2></row> | <row><col1>1</col1><col2>1</col2><OtherType1>101</OtherType1><OtherType2>2019-03-08T11:03:49.877</OtherType2></row> |
+------+------+-----------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
| 1 | 2 | <row><col1>1</col1><col2>2</col2><SomeMore1>blah 1.2</SomeMore1><SomeMore2>blub 1.2</SomeMore2></row> | <row><col1>1</col1><col2>2</col2><OtherType1>102</OtherType1><OtherType2>2019-03-09T11:03:49.877</OtherType2></row> |
+------+------+-----------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
| 1 | 100 | <row><col1>1</col1><col2>100</col2><SomeMore1>not in t2</SomeMore1><SomeMore2>not in t2</SomeMore2></row> | NULL |
+------+------+-----------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
| 1 | 200 | NULL | <row><col1>1</col1><col2>200</col2><OtherType1>200</OtherType1><OtherType2>2019-09-24T11:03:49.877</OtherType2></row> |
+------+------+-----------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
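On the consumer side, each XML cell then has to be unpacked. A minimal sketch, assuming the client receives the Content1/Content2 values as strings (or None for NULL); the helper name is hypothetical and all values come back as text, so type conversion is left to the caller:

```python
import xml.etree.ElementTree as ET

def row_xml_to_dicts(xml_text):
    """Turn '<row>...</row><row>...</row>' (or None) into a list of dicts."""
    if xml_text is None:
        # The col1/col2 pair does not exist in this table.
        return []
    # Several sibling <row> elements have no single root element,
    # so wrap them before parsing.
    root = ET.fromstring(f"<rows>{xml_text}</rows>")
    return [{child.tag: child.text for child in row} for row in root]

content1 = ("<row><col1>1</col1><col2>1</col2>"
            "<SomeMore1>blah 1.1</SomeMore1><SomeMore2>blub 1.1</SomeMore2></row>")
print(row_xml_to_dicts(content1))
print(row_xml_to_dicts(None))
```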

Related

How to update Hive table rows

I have a table that looks like this:
id | Col2 | Col3 | Text
--------------------------
1 | ... | ... | "abc"
2 | ... | ... | "def"
3 | ... | ... | "ghi"
4 | ... | ... | "jkl"
And another table that looks like this:
id | Text
-------------
1 | "qwe"
2 | "rty"
And I want to end up with a table that looks like this:
id | Col2 | Col3 | Text
--------------------------
1 | ... | ... | "qwe"
2 | ... | ... | "rty"
3 | ... | ... | "ghi"
4 | ... | ... | "jkl"
where the original values for col2 and col3 are maintained. Essentially, I want to use the values from table 2 to update the values of table 1 where ids are the same.
I tried:
SELECT
A.id,
col1,
col2,
A.text
FROM table1 AS A
LEFT JOIN (
SELECT
id,
text
FROM table2
) AS B
ON A.product_id = B.product_id
But this just returned the original table. Is there a way to achieve what I want in Presto/Hive?
You are selecting Text from table A; it should come from table B, or be NVL(B.text, A.Text) if you want to take the value from table B when it exists and keep the original value otherwise (see the comment in the code):
INSERT OVERWRITE TABLE table1
SELECT
A.id,
col1,
col2,
NVL(B.text, A.Text) as Text -- Take Text from table B; if it does not exist, keep A.Text
FROM table1 AS A
LEFT JOIN table2 AS B ON A.product_id = B.product_id
You can use coalesce(B.text, A.Text) instead of NVL; as @PiotrFindeisen mentioned, it works on both Presto and Hive.
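The effect of the COALESCE over a LEFT JOIN can be checked on a small scale. This is only a sketch: it uses Python's built-in sqlite3 as a stand-in engine (not Hive or Presto), joins on id as in the sample tables, and fills Col2/Col3 with placeholder values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, col2 TEXT, col3 TEXT, text TEXT);
    CREATE TABLE table2 (id INTEGER, text TEXT);
    INSERT INTO table1 VALUES (1,'x','y','abc'), (2,'x','y','def'),
                              (3,'x','y','ghi'), (4,'x','y','jkl');
    INSERT INTO table2 VALUES (1,'qwe'), (2,'rty');
""")

rows = conn.execute("""
    SELECT a.id, a.col2, a.col3,
           COALESCE(b.text, a.text) AS text  -- prefer table2, fall back to table1
    FROM table1 AS a
    LEFT JOIN table2 AS b ON a.id = b.id
    ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

Ids 1 and 2 pick up the text from table2, while ids 3 and 4 keep their original text because the join finds no match for them.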

Oracle SQL statement without duplicates

I have a requirement to write a SQL statement to return 2 columns, however there cannot be duplicates in either of these columns. For example:
|---------------------|------------------|
| 10 | A |
|---------------------|------------------|
| 11 | B |
|---------------------|------------------|
| 12 | C |
|---------------------|------------------|
| 13 | A | <--- Don't return
|---------------------|------------------|
Using distinct doesn't work, since the row highlighted above is distinct. It also doesn't matter which of the duplicates is returned.
Does anyone know of a way to do this? It feels as though I'm missing something obvious.
Thanks.
You can number the rows within each col2 value using ROW_NUMBER() and keep only the rows where rn = 1.
CREATE TABLE T(
col1 int,
col2 varchar(5)
);
insert into t values (10,'A');
insert into t values (11,'B');
insert into t values (12,'C');
insert into t values (13,'A');
Query 1:
SELECT t1.col1,t1.col2
FROM (
SELECT t1.*,ROW_NUMBER() OVER(PARTITION BY col2 ORDER BY col1) rn
FROM T t1
)t1
WHERE t1.rn = 1
Results:
| COL1 | COL2 |
|------|------|
| 10 | A |
| 11 | B |
| 12 | C |
If you just want the lowest value from the first column, do:
SELECT MIN(column1), column2
FROM YourTable
GROUP BY column2
This is not possible in one query, because each column has a different number of unique values.
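Note that the queries above only remove duplicates in the second column. If the first column can contain duplicates too, one option is a greedy pass over the rows on the client. A minimal sketch, assuming it does not matter which of the duplicates survives (as stated in the question); the function name is hypothetical:

```python
def unique_in_both_columns(rows):
    """Keep a row only if neither of its two values was kept before."""
    seen_first, seen_second = set(), set()
    kept = []
    for first, second in rows:
        if first in seen_first or second in seen_second:
            continue  # one of the values is already in the result
        seen_first.add(first)
        seen_second.add(second)
        kept.append((first, second))
    return kept

rows = [(10, "A"), (11, "B"), (12, "C"), (13, "A")]
print(unique_in_both_columns(rows))
```

The result depends on the order of the input rows, so a different order can keep a different (and possibly larger or smaller) set of rows.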

Issue in joining 2 datasets

I have two datasets like below:
1:
+----+------+------+------------+
| Id | Col1 | Col2 | Col3       |
+----+------+------+------------+
| 1  | abc  | 0    | 01/01/2010 |
| 2  | def  | 10   | 10/10/2011 |
+----+------+------+------------+
2:
+----+------+------+------------+
| Id | Col4 | Col5 | Col6       |
+----+------+------+------------+
| 1  | abc  | 0    | 01/01/2010 |
| 5  | xyz  | 12   | 5/6/2013   |
+----+------+------+------------+
Now I want to combine both these into a single dataset which shows something like this:
+----+------+------+------------+------+------+------------+
| ID | Col1 | Col2 | Col3       | Col4 | Col5 | Col6       |
+----+------+------+------------+------+------+------------+
| 1  | abc  | 0    | 01/01/2010 | abc  | 0    | 01/01/2010 |
| 2  | def  | 10   | 10/10/2011 | null | null | null       |
| 5  | null | null | null       | xyz  | 12   | 5/6/2013   |
+----+------+------+------------+------+------+------------+
The issue is that not all ids in dataset 1 are in dataset 2 and vice versa. What I need is all data from datasets 1 and 2, with the ids common to both shown once and the columns of dataset 2 placed next to those of dataset 1, as shown above. I have used a pipe as the separator.
Any inputs are highly appreciated. I tried everything, such as full outer join, inner join, CTE etc., but nothing is working.
CREATE TABLE #TEMP1 (ID INT, Col1 VARCHAR(100), Col2 INT, Col3 DATETIME)
CREATE TABLE #TEMP2 (ID INT, Col4 VARCHAR(100), Col5 INT, Col6 DATETIME)
INSERT INTO #TEMP1 VALUES (1,'abc',0,'1/1/2010')
INSERT INTO #TEMP1 VALUES (1,'def',0,'1/1/2010')
INSERT INTO #TEMP2 VALUES (1,'abc',0,'1/1/2010')
INSERT INTO #TEMP2 VALUES (1,'def',0,'1/1/2010')
SELECT DISTINCT A.ID,A.Col1,A.Col2,A.Col3,B.Col4,B.Col5,B.Col6
FROM #TEMP1 A
FULL OUTER JOIN #TEMP2 B ON A.ID = B.ID
Thanks.
Try the SQL below:
select t1.Id , Col1 , Col2 , Col3 , Col4 , Col5 , Col6
from temp1 t1 left join temp2 t2
on t1.Id=t2.Id
union
select t2.Id , Col1 , Col2 , Col3 , Col4 , Col5 , Col6
from temp1 t1 right join temp2 t2
on t1.Id=t2.Id
I also tried it on a fiddle for you:
http://sqlfiddle.com/#!2/d60a1e/5
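The LEFT JOIN / UNION emulation of a full outer join can be tried against the sample data from the question. This is a sketch using Python's built-in sqlite3 as a stand-in engine; the right join is written as a second LEFT JOIN with the tables swapped, which is an equivalent form that also runs on engines without RIGHT JOIN, and the dates are stored as plain text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (Id INTEGER, Col1 TEXT, Col2 INTEGER, Col3 TEXT);
    CREATE TABLE temp2 (Id INTEGER, Col4 TEXT, Col5 INTEGER, Col6 TEXT);
    INSERT INTO temp1 VALUES (1,'abc',0,'01/01/2010'), (2,'def',10,'10/10/2011');
    INSERT INTO temp2 VALUES (1,'abc',0,'01/01/2010'), (5,'xyz',12,'5/6/2013');
""")

rows = conn.execute("""
    SELECT t1.Id, Col1, Col2, Col3, Col4, Col5, Col6
    FROM temp1 t1 LEFT JOIN temp2 t2 ON t1.Id = t2.Id
    UNION            -- UNION (not UNION ALL) drops the row matched on both sides
    SELECT t2.Id, Col1, Col2, Col3, Col4, Col5, Col6
    FROM temp2 t2 LEFT JOIN temp1 t1 ON t1.Id = t2.Id
    ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```

Because UNION removes duplicate rows, this form would also collapse genuinely identical rows in the source data, which a true FULL OUTER JOIN would keep.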

display records based on ranks and also delete duplicated data

I have a table like this:
+------+------+------+------+
| col1 | col2 | col3 | rank |
+------+------+------+------+
| 1 | A | X | 4 |
| 2 | C | Y | 3 |
| 2 | C | Y | 3 |
| | A | X | 3 |
| 1 | B | Z | 2 |
+------+------+------+------+
(5 rows)
I need output like this:
+------+------+------+------+
| col1 | col2 | col3 | rank |
+------+------+------+------+
| 1 | A | X | 4 |
| 2 | C | Y | 3 |
| 1 | B | Z | 2 |
+------+------+------+------+
So I wrote the query below:
select col1,col2,col3,rank,dense_rank() over(order by rank desc) from table1;
but it's not giving the proper output.
Try this:
select a.col1,a.col2,a.col3,max(a.rank) as rank
from [dbo].[5] a join [dbo].[5] b
on a.col1=b.col1 group by a.col1,a.col2,a.col3
Looks like you need aggregation with max():
select
col1,col2,col3,
max(rnk)
from table1
group by col1,col2,col3
If you could have different values of col1 for one combination of col2, col3, then distinct on is what you need:
select distinct on (col2, col3)
col1,col2,col3,
rnk
from table1
order by col2, col3, rnk desc
The following should match what you are looking for:
select col1,col2,col3,rank,dense_rank() over(order by rank desc) from table1
WHERE col1 IS NOT NULL
GROUP BY 1, 2, 3, 4;
You can also use positional references (1, 2, ...) in your ORDER BY clause if you want one.
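The "one row per (col2, col3), highest rank wins" logic behind the DISTINCT ON answer can be sketched outside the database as well. This is a minimal illustration under the assumption that (col2, col3) identifies a record and the row with the empty col1 is a lower-ranked duplicate; the function name is hypothetical.

```python
def best_row_per_key(rows):
    """Keep, for each (col2, col3) pair, the row with the highest rank."""
    best = {}
    for col1, col2, col3, rank in rows:
        key = (col2, col3)
        if key not in best or rank > best[key][3]:
            best[key] = (col1, col2, col3, rank)
    # Highest rank first, like the expected output in the question.
    return sorted(best.values(), key=lambda r: r[3], reverse=True)

rows = [
    (1, "A", "X", 4),
    (2, "C", "Y", 3),
    (2, "C", "Y", 3),
    (None, "A", "X", 3),  # the row with the empty col1
    (1, "B", "Z", 2),
]
for row in best_row_per_key(rows):
    print(row)
```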

TSQL select the value from two rows that has higher priority and is not null

I am trying to consolidate two rows of the same table, where each row has a priority.
The value of interest is the value having priority 1 if it is not NULL; otherwise the value with priority 0.
An example data source could be:
| Id | GroupId | Priority | Col1 | Col2 | Col3 | ... | Coln |
-----------------------------------------------------------------
| 1 | 1 | 0 | NULL | 4711 | 3.41 | ... | f00 |
| 2 | 1 | 1 | NULL | NULL | 2.83 | ... | bar |
| 3 | 2 | 0 | NULL | 4711 | 3.41 | ... | f00 |
| 4 | 2 | 1 | 23 | NULL | 2.83 | ... | NULL |
and I want to have:
| GroupId | Col1 | Col2 | Col3 | ... | Coln |
-------------------------------------------------
| 1 | NULL | 4711 | 2.83 | ... | bar |
| 2 | 23 | 4711 | 2.83 | ... | f00 |
Is there a generic way in TSQL without the need to check each column explicitly?
SELECT
t1.GroupId,
ISNULL(t2.Col1, t1.Col1) as Col1,
ISNULL(t2.Col2, t1.Col2) as Col2,
ISNULL(t2.Col3, t1.Col3) as Col3,
...
ISNULL(t2.Coln, t1.Coln) as Coln
FROM mytable t1
JOIN mytable t2 ON t1.GroupId = t2.GroupId
WHERE
t1.Priority = 0 AND
t2.Priority = 1
Regards
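The per-column rule in the query above (take the priority 1 value unless it is NULL, otherwise the priority 0 value) can be expressed without naming each column if the rows are processed as dictionaries. A minimal sketch, assuming the rows are fetched to the client first; the function name and its parameters are hypothetical, and the elided columns between Col3 and Coln are left out of the sample data.

```python
from collections import defaultdict

def consolidate(rows, key="GroupId", priority="Priority", skip=("Id",)):
    """Per group, take each column from the highest-priority row
    whose value for that column is not None."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    result = []
    for group_id, members in groups.items():
        members.sort(key=lambda r: r[priority], reverse=True)
        merged = {key: group_id}
        for col in members[0]:
            if col in (key, priority) or col in skip:
                continue
            # First non-None value in priority order, else None.
            merged[col] = next((m[col] for m in members if m[col] is not None), None)
        result.append(merged)
    return result

rows = [
    {"Id": 1, "GroupId": 1, "Priority": 0, "Col1": None, "Col2": 4711, "Col3": 3.41, "Coln": "f00"},
    {"Id": 2, "GroupId": 1, "Priority": 1, "Col1": None, "Col2": None, "Col3": 2.83, "Coln": "bar"},
    {"Id": 3, "GroupId": 2, "Priority": 0, "Col1": None, "Col2": 4711, "Col3": 3.41, "Coln": "f00"},
    {"Id": 4, "GroupId": 2, "Priority": 1, "Col1": 23, "Col2": None, "Col3": 2.83, "Coln": None},
]
for merged in consolidate(rows):
    print(merged)
```

Inside T-SQL itself, a comparable column-agnostic approach would need dynamic SQL or UNPIVOT/PIVOT, since a static statement has to name its columns.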
I'll elaborate on the ROW_NUMBER() solution that @KM suggested, since IMO it's the best solution for this. (In CTE form for easier readability.)
WITH cte AS (
SELECT
t1.GroupId,
t1.Col1,
t1.Col2,
ROW_NUMBER() OVER(PARTITION BY t1.GroupId ORDER BY t1.Priority DESC) AS [row_id] -- highest priority first within each group
FROM
mytable t1
)
SELECT
*
FROM
cte
WHERE
row_id = 1
That will give you the row with the highest priority (according to your rules) for each GroupId in mytable.
ROW_NUMBER and RANK are two of my favorite TSQL tricks. http://msdn.microsoft.com/en-us/library/ms186734.aspx
edit: Another favorite of mine is PIVOT/UNPIVOT which you can use to transpose rows/columns which is another way of going about this type of problem. http://msdn.microsoft.com/en-us/library/ms177410.aspx
I think this would do what you are asking for without using isnull for every column
select
*
from
mytable t1
where
priority=(select max(priority) from mytable where groupid=t1.groupid group by groupid)