Where clause between union all in sql? - sql

I have a query that stacks data vertically using UNION. Below are the two sample tables:
create table #temp1(_row_ord int,CID int,_data varchar(10))
insert #temp1
values
(1,1001,'text1'),
(2,1001,'text2'),
(4,1002,'text1'),
(5,1002,'text2')
create table #temp2(_row_ord int,CID int,_data varchar(10))
insert #temp2
values
(1,1001,'sample1'),
(2,1001,'sample2'),
(4,1002,'sample1'),
(5,1002,'sample2')
--My query
select * from #temp1
union
select * from #temp2 where CID in (select CID from #temp1)
order by _row_ord,CID
drop table #temp1,#temp2
So my current output is:
I want the rows for each client (CID) grouped together, but I can't work out how to apply a WHERE clause across the UNION to do that.
My desired output:
Any help? ORDER BY isn't helping either.

I can imagine you want all of the rows for a CID sorted by _row_ord from the first table before the ones from the second table. And the CID should be the outermost sort criteria.
If that's right, you can select literals from your tables. Let the literal for the first table be less than that of the second table. Then first sort by CID, then that literal and finally by _row_ord.
SELECT cid,
_data
FROM (SELECT 1 s,
_row_ord,
cid,
_data
FROM #temp1
UNION ALL
SELECT 2 s,
_row_ord,
cid,
_data
FROM #temp2) x
ORDER BY cid,
s,
_row_ord;
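The sort-literal idea can be tried without a SQL Server instance. The sketch below is only an illustration: it assumes Python's built-in sqlite3 module as a stand-in engine and plain tables named temp1 and temp2 in place of the #temp tables.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for the demo);
# temp1/temp2 replace the #temp1/#temp2 temp tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (_row_ord INT, cid INT, _data TEXT);
    CREATE TABLE temp2 (_row_ord INT, cid INT, _data TEXT);
    INSERT INTO temp1 VALUES
        (1, 1001, 'text1'), (2, 1001, 'text2'),
        (4, 1002, 'text1'), (5, 1002, 'text2');
    INSERT INTO temp2 VALUES
        (1, 1001, 'sample1'), (2, 1001, 'sample2'),
        (4, 1002, 'sample1'), (5, 1002, 'sample2');
""")

# s = 1 tags rows from temp1 and s = 2 tags rows from temp2, so sorting by
# (cid, s, _row_ord) keeps each client's temp1 rows ahead of its temp2 rows.
rows = conn.execute("""
    SELECT cid, _data
    FROM (SELECT 1 AS s, _row_ord, cid, _data FROM temp1
          UNION ALL
          SELECT 2 AS s, _row_ord, cid, _data FROM temp2) x
    ORDER BY cid, s, _row_ord
""").fetchall()

for row in rows:
    print(row)
```

The literal column s is used only for ordering and is dropped by the outer select list.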

If I understand correctly, you need the output sorted so that #temp1 rows appear before #temp2 rows for each cid value.
What you could do is generate an additional column, ordnum, that assigns a value per table just for sorting purposes, and then drop it in the outer select statement.
select cid, _data
from (
select 1 as ordnum, *
from #temp1
union all
select 2 as ordnum, *
from #temp2 t2
where exists (
select 1
from #temp1 t1
where t1.cid = t2.cid
)
) q
order by cid, ordnum, _row_ord
I have also rewritten your where condition as an equivalent one using the exists operator, which should run faster.
Output
cid _data
1001 text1
1001 text2
1001 sample1
1001 sample2
1002 text1
1002 text2
1002 sample1
1002 sample2

Use WITH (a common table expression). Here is my first try with your SQL:
create table #temp1(_row_ord int,CID int,_data varchar(10))
insert #temp1
values
(1,1001,'text1'),
(2,1001,'text2'),
(4,1002,'text1'),
(5,1002,'text2')
create table #temp2(_row_ord int,CID int,_data varchar(10))
insert #temp2
values
(1,1001,'sample1'),
(2,1001,'sample2'),
(4,1002,'sample1'),
(5,1002,'sample2');
WITH result( _row_ord, CID,_data) AS
(
--My query
select * from #temp1
union
select * from #temp2 where CID in (select CID from #temp1)
)
select * from result order by CID, _data
drop table #temp1,#temp2
result
_row_ord CID _data
1 1001 sample1
2 1001 sample2
1 1001 text1
2 1001 text2
4 1002 sample1
5 1002 sample2
4 1002 text1
5 1002 text2

Union is placed between two result set blocks and combines them into a single result set block. If you want a where clause on a particular block, you can put it inside that block:
-- filter only the first block
select a from a where a = 1
union
select z from z
-- filter only the second block
select a from a
union
select z from z where z = 1
-- filter both blocks
select a from a where a = 1
union
select z from z where z = 1
The first query in a union defines column names in the output. You can wrap an output in brackets, alias it and do a where on the whole lot:
select * from
(
select a as newname from a where a = 1
union
select z from z where z = 2
) o
where o.newname = 3
It is important to note that a.a and z.z are combined into a new column, o.newname. As a result, a where on o.newname filters rows from both a and z (the rows from z are also stacked into the newname column). The outer query knows only about o.newname; it knows nothing of a or z.
Side note: the query above produces nothing, because the union only outputs rows where a.a is 1 or z.z is 2 as o.newname. The outer where then keeps only rows equal to 3, and no such rows exist.
select * from
(
select a as newname from a
union
select z from z
) o
where o.newname = 3
This query will pick up any rows in a or z where a.a is 3 or z.z is 3, thanks to the filtering of the resulting union
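The difference between filtering inside the blocks and filtering the combined column can be checked with a small sketch. It assumes Python's built-in sqlite3 module and made-up sample values in tables a and z; neither comes from the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample data is invented for illustration: a holds 1 and 3, z holds 2, 3 and 4.
conn.executescript("""
    CREATE TABLE a (a INT);
    CREATE TABLE z (z INT);
    INSERT INTO a VALUES (1), (3);
    INSERT INTO z VALUES (2), (3), (4);
""")

# Each block is filtered first (only 1 and 2 survive), so the outer
# filter on newname = 3 finds nothing.
inner_then_outer = conn.execute("""
    SELECT * FROM (
        SELECT a AS newname FROM a WHERE a = 1
        UNION
        SELECT z FROM z WHERE z = 2
    ) o
    WHERE o.newname = 3
""").fetchall()

# No inner filters: the outer filter sees values from both tables.
# UNION (without ALL) removes the duplicate 3.
outer_only = conn.execute("""
    SELECT * FROM (
        SELECT a AS newname FROM a
        UNION
        SELECT z FROM z
    ) o
    WHERE o.newname = 3
""").fetchall()

print(inner_then_outer)
print(outer_only)
```

With UNION ALL instead of UNION, the second query would return one row per matching source row rather than a single deduplicated row.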

Related

Insert into table gives different results

I inserted all the rows of a view to a delta table, but after running a query for a particular value I get below results.
Can someone explain to me how this is possible?
A is View and B is Delta(databricks) table.
1.
Select count(*) from A
result: 102.321
2.
Select * from A where id = '11'
id date unit
11 2022-09-02 4
3.
Insert Into B Select * From A
OR
CREATE OR REPLACE TABLE B AS (SELECT * FROM A)
4.
Select count(*) from B
result: 101.372
5.
Select * from B where id = '11'
id date unit
11 2022-09-02 2

SQL grouping by distinct values in a multi-value string column

I want to perform a group-by based on the distinct values in a string column that holds multiple values.
The column contains a list of strings in a standard format, separated by commas. The only possible values are a, b, c and d.
For example the column collection (type: String) contains:
Row 1: ["a","b"]
Row 2: ["b","c"]
Row 3: ["b","c","a"]
Row 4: ["d"]
The expected output is a count of unique values:
collection | count
a | 2
b | 3
c | 2
d | 1
For everything below I used this table:
create table tmp (
id INT auto_increment,
test VARCHAR(255),
PRIMARY KEY (id)
);
insert into tmp (test) values
("a,b"),
("b,c"),
("b,c,a"),
("d")
;
If the possible values are only a, b, c, d, you can try one of these:
Take note that this only works if no value is a substring of another (for example test and test_new), because test would then also match every test_new row and the counts would be wrong.
select collection, COUNT(*) as count from tmp JOIN (
select CONCAT("%", tb.collection, "%") as like_collection, collection from (
select "a" COLLATE utf8_general_ci as collection
union select "b" COLLATE utf8_general_ci as collection
union select "c" COLLATE utf8_general_ci as collection
union select "d" COLLATE utf8_general_ci as collection
) tb
) tb1
ON tmp.test LIKE tb1.like_collection
GROUP BY tb1.collection;
Which will give you the result you want
collection | count
a | 2
b | 3
c | 2
d | 1
or you can try this one
SELECT
(SELECT COUNT(*) FROM tmp WHERE test LIKE '%a%') as a_count,
(SELECT COUNT(*) FROM tmp WHERE test LIKE '%b%') as b_count,
(SELECT COUNT(*) FROM tmp WHERE test LIKE '%c%') as c_count,
(SELECT COUNT(*) FROM tmp WHERE test LIKE '%d%') as d_count
;
The result would be like this
a_count | b_count | c_count | d_count
2 | 3 | 2 | 1
What you need to do is first explode the collection column into separate rows (like a flatMap operation). In Redshift the only way to generate new rows is to JOIN, so let's CROSS JOIN your input table with a static table of consecutive numbers and keep only the numbers less than or equal to the number of elements in the collection. Then we'll use the split_part function to read the item at the correct index. Once we have the exploded table, we'll do a simple GROUP BY.
If your items are stored as JSON array strings ('["a", "b", "c"]') then you can use JSON_ARRAY_LENGTH and JSON_EXTRACT_ARRAY_ELEMENT_TEXT instead of REGEXP_COUNT and SPLIT_PART respectively.
with
index as (
select 1 as i
union all select 2
union all select 3
union all select 4 -- could be substituted with 'select row_number() over () as i from arbitrary_table limit 4'
),
agg as (
select 'a,b' as collection
union all select 'b,c'
union all select 'b,c,a'
union all select 'd'
)
select
split_part(collection, ',', i) as item,
count(*)
from index,agg
where regexp_count(agg.collection, ',') + 1 >= index.i -- only get rows where number of items matches
group by 1
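As a language-neutral check of the explode-then-group idea, here is a minimal Python sketch. It is not Redshift code; the list comprehension plays the role of the CROSS JOIN plus split_part step, and Counter plays the role of GROUP BY with count(*). The sample strings are the ones from the agg CTE above.

```python
from collections import Counter

# Same sample rows as the agg CTE in the query above.
collections = ["a,b", "b,c", "b,c,a", "d"]

# "Explode": one entry per (row, item) pair, like CROSS JOIN + split_part.
items = [item for row in collections for item in row.split(",")]

# "GROUP BY item" with a count per group.
counts = Counter(items)

for item, n in sorted(counts.items()):
    print(item, n)
```

Because each row is split on the delimiter rather than matched with LIKE, values that are substrings of one another would not be confused here.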

Combine three columns from different tables into one row

I am new to SQL and am trying to combine a column value from each of three different tables into one row in DB2 Warehouse on Cloud. Each table consists of only one row and has a unique column name. So all I want is to join these three into one row, keeping their original column names.
Each table is built from a statement that looks like this:
SELECT SUM(FUEL_TEMP.FUEL_MLAD_VALUE) AS FUEL
FROM
(SELECT ML_ANOMALY_DETECTION.MLAD_METRIC AS MLAD_METRIC, ML_ANOMALY_DETECTION.MLAD_VALUE AS FUEL_MLAD_VALUE, ML_ANOMALY_DETECTION.TAG_NAME AS TAG_NAME, ML_ANOMALY_DETECTION.DATETIME AS DATETIME, DATA_CONFIG.SYSTEM_NAME AS SYSTEM_NAME
FROM ML_ANOMALY_DETECTION
INNER JOIN DATA_CONFIG ON
(ML_ANOMALY_DETECTION.TAG_NAME =DATA_CONFIG.TAG_NAME AND
DATA_CONFIG.SYSTEM_NAME = 'FUEL')
WHERE ML_ANOMALY_DETECTION.MLAD_METRIC = 'IFOREST_SCORE'
AND ML_ANOMALY_DETECTION.DATETIME >= (CURRENT DATE - 9 DAYS)
ORDER BY DATETIME DESC)
AS FUEL_TEMP
I have tried JOIN, INNER JOIN, UNION/UNION ALL, but can't get it to work as it should. How can I do this?
Use a cross-join like this:
create table table1 (field1 char(10));
create table table2 (field2 char(10));
create table table3 (field3 char(10));
insert into table1 values('value1');
insert into table2 values('value2');
insert into table3 values('value3');
select *
from table1
cross join table2
cross join table3;
Result:
field1 field2 field3
---------- ---------- ----------
value1 value2 value3
A cross join joins all the rows on the left with all the rows on the right. You will end up with a product of rows (table1 rows x table2 rows x table3 rows). Since each table only has one row, you will get (1 x 1 x 1) = 1 row.
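The row-count arithmetic can be confirmed with a short sketch. It assumes Python's built-in sqlite3 module rather than DB2, and the extra row 'value2b' is invented to show how the product grows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (field1 TEXT);
    CREATE TABLE table2 (field2 TEXT);
    CREATE TABLE table3 (field3 TEXT);
    INSERT INTO table1 VALUES ('value1');
    INSERT INTO table2 VALUES ('value2');
    INSERT INTO table3 VALUES ('value3');
""")

query = "SELECT * FROM table1 CROSS JOIN table2 CROSS JOIN table3"

# 1 x 1 x 1 rows: a single combined row with all three columns.
one_row = conn.execute(query).fetchall()

# Adding a second row to table2 (hypothetical data) gives 1 x 2 x 1 = 2 rows.
conn.execute("INSERT INTO table2 VALUES ('value2b')")
two_rows = conn.execute(query).fetchall()

print(one_row)
print(len(two_rows))
```

This is why the cross join is only safe here as long as each source query is guaranteed to return exactly one row, as the SUM aggregates in the question do.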
Using UNION should solve your problem. Something like this:
SELECT
WarehouseDB1.WarehouseID AS TheID,
'A' AS TheSystem,
WarehouseDB1.TheValue AS TheValue
FROM WarehouseDB1
UNION
SELECT
WarehouseDB2.WarehouseID AS TheID,
'B' AS TheSystem,
WarehouseDB2.TheValue AS TheValue
FROM WarehouseDB2
UNION
SELECT
WarehouseDB3.WarehouseID AS TheID,
'C' AS TheSystem,
WarehouseDB3.TheValue AS TheValue
FROM WarehouseDB3
I'll adapt the code to your table names and columns if you tell me what they are. This kind of query would return something like the following:
TheID TheSystem TheValue
1 A 10
2 A 20
3 B 30
4 C 40
5 C 50
As long as your column names match in each query, you should get the desired results.

How to insert multiple rows from one column?

I want to insert multiple rows from one column by splitting the column value, but I have to do it without cursors because of performance issues.
Every value is split into 6-character chunks. Each chunk is then split into pieces of 3, 1 and 2 characters, which go into different columns of table B.
I think giving a sample will clarify my question:
Table A
ID Value
1 ABCDEFGHJKLM
2 NOPRST
3 NULL VALUE
I want to insert these values into table B like this format
Table B
ID Value1 Value2 Value3
1 ABC D EF
1 GHJ K LM
2 NOP R ST
Assuming a maximum value length of 600 characters (100 rows per value):
insert into tableB
select id, substr(value,n*6+1,3), substr(value,n*6+4,1), substr(value,n*6+5,2)
from tableA
join (select level-1 as n from dual connect by level <= 100)
on length(value) > n*6;
select ID,
SUBSTR(value,number*6+1,3),
SUBSTR(value,number*6+4,1),
SUBSTR(value,number*6+5,2)
from yourtable,
(select 0 as number union select 1 union select 2 union select 3 union select 4
union select 5 union select 6) as numbers
/* etc up to the max length of your string /6 */
where LEN(value)>number*6
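Both answers above join the source table to a small set of offsets and cut each 6-character chunk with substring calls. A minimal sketch of that approach follows, assuming Python's built-in sqlite3 module as the engine and a helper table named numbers (a hypothetical name) holding the offsets 0 to 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INT, value TEXT);
    INSERT INTO tableA VALUES (1, 'ABCDEFGHJKLM'), (2, 'NOPRST'), (3, NULL);
    -- Offsets table: needs one row per possible chunk (0 .. max_length/6 - 1).
    CREATE TABLE numbers (n INT);
    INSERT INTO numbers VALUES (0), (1), (2);
""")

# A value joins to offset n only when it is longer than n*6 characters,
# so each value yields exactly one row per 6-character chunk.
# length(NULL) is NULL, so the NULL value joins to nothing.
rows = conn.execute("""
    SELECT id,
           substr(value, n*6 + 1, 3),
           substr(value, n*6 + 4, 1),
           substr(value, n*6 + 5, 2)
    FROM tableA
    JOIN numbers ON length(value) > n*6
    ORDER BY id, n
""").fetchall()

for row in rows:
    print(row)
```

Wrapping the select in an INSERT INTO tableB ... SELECT would load the rows in a single set-based statement, with no cursor or loop.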
Try this (it is T-SQL, so please convert it to Oracle SQL as needed).
Even though it uses a while loop, each iteration does a bulk insert, and the loop runs once per 6-character chunk of the longest value in the table.
declare @max_len int = 0;
declare @counter int = 0;
declare @col_index int = 1;
select @max_len = MAX(len(Value)) from TableA
-- one bulk insert per 6-character chunk, up to the longest value
while (@max_len/6 > @counter)
begin
set @counter = @counter + 1
Insert into TableB
select ID, substring(Value,@col_index,3),
substring(Value,@col_index+3,1),
substring(Value,@col_index+4,2)
from TableA where len(Value) >= @col_index -- skip values with no chunk at this offset
set @col_index = @col_index + 6
end

How do you find a missing number in a table field starting from a parameter and incrementing sequentially?

Let's say I have an sql server table:
NumberTaken CompanyName
2 Fred
3 Fred
4 Fred
6 Fred
7 Fred
8 Fred
11 Fred
I need an efficient way to pass in a parameter [StartingNumber] and to count from [StartingNumber] sequentially until I find a number that is missing.
For example notice that 1, 5, 9 and 10 are missing from the table.
If I supplied the parameter [StartingNumber] = 1, it would check to see if 1 exists, if it does it would check to see if 2 exists and so on and so forth so 1 would be returned here.
If [StartingNumber] = 6 the function would return 9.
In c# pseudo code it would basically be:
int ctr = [StartingNumber]
while([SELECT NumberTaken FROM tblNumbers Where NumberTaken = ctr] != null)
ctr++;
return ctr;
The problem with that code is that it seems really inefficient if there are thousands of numbers in the table. Also, I can write it in C# code or in a stored procedure, whichever is more efficient.
Thanks for the help
Fine, if this question isn't going to be closed, I may as well Copy and paste my answer from the other one:
I called my table Blank, and used the following:
declare @StartOffset int = 2
; With Missing as (
select @StartOffset as N where not exists(select * from Blank where ID = @StartOffset)
), Sequence as (
select @StartOffset as N from Blank where ID = @StartOffset
union all
select b.ID from Blank b inner join Sequence s on b.ID = s.N + 1
)
select COALESCE((select N from Missing),(select MAX(N)+1 from Sequence))
You basically have two cases - either your starting value is missing (so the Missing CTE will contain one row), or it's present, so you count forwards using a recursive CTE (Sequence), and take the max from that and add 1
Tables:
create table Blank (
ID int not null,
Name varchar(20) not null
)
insert into Blank(ID,Name)
select 2 ,'Fred' union all
select 3 ,'Fred' union all
select 4 ,'Fred' union all
select 6 ,'Fred' union all
select 7 ,'Fred' union all
select 8 ,'Fred' union all
select 11 ,'Fred'
go
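The recursive-CTE approach above can be exercised with the sketch below. It assumes Python's built-in sqlite3 module instead of SQL Server, and it folds the Missing CTE into a COALESCE fallback, which is a simplification for this sketch rather than the answer's exact query. The function name first_missing is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Blank (ID INT NOT NULL, Name TEXT NOT NULL);
    INSERT INTO Blank VALUES
        (2, 'Fred'), (3, 'Fred'), (4, 'Fred'),
        (6, 'Fred'), (7, 'Fred'), (8, 'Fred'), (11, 'Fred');
""")


def first_missing(start):
    """Return the first number >= start that is not present in Blank.ID."""
    row = conn.execute(
        """
        WITH RECURSIVE seq(n) AS (
            -- anchor: only produces a row when the start value is taken
            SELECT ID FROM Blank WHERE ID = :start
            UNION ALL
            -- step: follow the run of consecutive IDs
            SELECT b.ID FROM Blank b JOIN seq s ON b.ID = s.n + 1
        )
        -- an empty seq means start itself is free, so fall back to it
        SELECT COALESCE((SELECT MAX(n) + 1 FROM seq), :start)
        """,
        {"start": start},
    ).fetchone()
    return row[0]


print(first_missing(1))
print(first_missing(6))
```

The recursion walks only the run of consecutive IDs that begins at the start value, so its cost depends on the length of that run rather than on the size of the table, provided ID is indexed.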
I would create a temp table containing all numbers from StartingNumber to EndNumber and LEFT JOIN to it to receive the list of rows not contained in the temp table.
If NumberTaken is indexed you could do it with a join on the same table:
select T.NumberTaken -1 as MISSING_NUMBER
from myTable T
left outer join myTable T1
on T.NumberTaken= T1.NumberTaken+1
where T1.NumberTaken is null and t.NumberTaken >= STARTING_NUMBER
order by T.NumberTaken
EDIT
Edited to get 1 too
1> select 1+ID as ID from #b as b
where not exists (select 1 from #b where ID = 1+b.ID)
2> go
ID
-----------
5
9
12
Take max(1+ID) and/or add your starting value to the where clause, depending on what you actually want.