SQL Query get common column with diff values in other columns

I am not very fluent with SQL. I'm just facing a little issue in writing the best and most efficient SQL query. I have a table with a composite key of columns A and B, as shown below:
A  B  C
1  1  4
1  2  5
1  3  3
2  2  4
2  1  5
3  1  4
So what I need is to find the rows where column C has both values 4 and 5 (4 and 5 are just examples) for a particular value of column A. Here, 4 and 5 are both present for two A values (1 and 2). For A value 3, 4 is present but 5 is not, hence we cannot take it.
My explanation is so confusing. I hope you get it.
After this, I need to find only those where the B value for 4 (the first number) is less than the B value for 5 (the second number). In this case, for A=1, row 1 (A=1, B=1, C=4) has a B value less than row 2 (A=1, B=2, C=5), so we take it. For A=2, row 1 (A=2, B=2, C=4) has a B value greater than row 2 (A=2, B=1, C=5), hence we cannot take it.
I hope someone gets it and helps. Thanks.

Rows containing both c = 4 and c = 5 for a given a, where ordering by b and ordering by c put the two rows in the same order (i.e. the c = 4 row also has the smaller b).
select a, b, c
from (
    select tbl.*,
           count(*) over (partition by a) cnt,
           row_number() over (partition by a order by b) brn,
           row_number() over (partition by a order by c) crn
    from tbl
    where c in (4, 5)
) t
where cnt = 2 and brn = crn;
EDIT
If the order of the parameters matters, the position of each parameter must be set explicitly. Compare the b ordering to the explicit parameter position:
with params(val, pos) as (
    select 4, 2 union all
    select 5, 1
)
select a, b, c
from (
    select tbl.*,
           count(*) over (partition by a) cnt,
           row_number() over (partition by a order by b) brn,
           p.pos
    from tbl
    join params p on tbl.c = p.val
) t
where cnt = (select count(*) from params) and brn = pos;

I assume you want the values of a where this is true. If so, you can use aggregation:
select a
from t
where c in (4, 5)
group by a
having count(distinct c) = 2;
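If you also need the second condition from the question (the b of the c = 4 row must be smaller than the b of the c = 5 row), here is a minimal sketch extending the same aggregation, assuming the table is named t as above:
select a
from t
where c in (4, 5)
group by a
having count(distinct c) = 2
   and min(case when c = 4 then b end) < min(case when c = 5 then b end);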

Related

Snowflake: Repeating rows based on column value

How to repeat rows based on a column value in Snowflake using SQL?
I tried a few methods, such as dual and connect by, but they did not work.
I have two columns: Id and Quantity.
For each ID, there are different values of Quantity.
So if you have a count, you can use a generator:
with ten_rows as (
    select row_number() over (order by null) as rn
    from table(generator(rowcount => 10))
), data(id, count) as (
    select * from values
        (1, 2),
        (2, 4)
)
select
    d.*,
    r.rn
from data as d
join ten_rows as r
    on d.count >= r.rn
order by 1, 3;
ID  COUNT  RN
1   2      1
1   2      2
2   4      1
2   4      2
2   4      3
2   4      4
OK, let's start by generating some data. We will create 10 rows, each with a QTY. The QTY will be randomly chosen as 1 or 2.
Next we want to duplicate the rows with a QTY of 2 and leave the QTY = 1 rows as they are.
Obviously you can change all the parameters to suit your needs - this solution works super fast and, in my opinion, way better than table generation.
Simply stack SPLIT_TO_TABLE() and REPEAT() with a LATERAL join and voila:
-- Any non-empty delimiter string works for the REPEAT/SPLIT_TO_TABLE trick.
WITH TEN_ROWS AS (
    SELECT ROW_NUMBER() OVER (ORDER BY NULL) SOME_ID, UNIFORM(1, 2, RANDOM()) QTY
    FROM TABLE(GENERATOR(ROWCOUNT => 10))
)
SELECT TEN_ROWS.*
FROM TEN_ROWS,
     LATERAL SPLIT_TO_TABLE(REPEAT('x', QTY - 1), 'x') ALTERNATIVE_APPROACH;
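The same trick can be pointed at the Id/Quantity table from the question. A minimal sketch, assuming a table named my_table(id, quantity) (the table name is hypothetical); SPLIT_TO_TABLE exposes an INDEX column that doubles as the repeat counter:
-- my_table is an assumed name; the ',' delimiter is arbitrary.
SELECT t.id,
       t.quantity,
       s.index AS rn
FROM my_table t,
     LATERAL SPLIT_TO_TABLE(REPEAT(',', t.quantity - 1), ',') s
ORDER BY 1, 3;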

Counting the amount of object with the same value in another column

I want to count the number of occurrences of a value and then attach that count back to each row, and I couldn't find any good way to do it.
So 1 table would look like
id | value
1  | a
2  | a
3  | b
4  | a
5  | b
6  | b
7  | c
8  | c
9  | a
which I would like to result in:
id | value | count
1  | a     | 4
2  | a     | 4
3  | b     | 3
4  | a     | 4
5  | b     | 3
6  | b     | 3
7  | c     | 2
8  | c     | 2
9  | a     | 4
I can only find answers using GROUP BY, so any help is appreciated. This should also be matched to another table, so if the result is joinable that would be helpful as well.
If your RDBMS supports window functions, there is no need to join: you can just do a window count:
select t.*, count(*) over(partition by value) cnt from mytable t
-- Alternative without window functions: join an aggregated subquery.
select t.id, t.value, tmp.cnt
from your_table t
join (
    select value, count(*) as cnt
    from your_table
    group by value
) tmp on tmp.value = t.value
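To cover the "matched to another table" part of the question, here is a minimal sketch joining the counted result to a second table; other_table and its value column are assumed names:
select o.*, t.id, t.cnt
from (
    select mytable.*, count(*) over (partition by value) as cnt
    from mytable
) t
join other_table o on o.value = t.value;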

How to check the possibility of groups union in a sequence order

I have a table with the following columns:
ID_group, ID_elements
For example with the following records:
1, 1
1, 2
2, 2
2, 4
2, 5
2, 6
3, 7
And I have sets of the elements, for example: 1,2,5; 1,5,2; 1,2,4; 2,7;
I need to check (true or false) that there exists a common group for each pair of adjacent elements.
For example elements:
1,2,5 -> true [i.e. elements 1,2 has common group 1 and elements 2,5 has common group 2]
1,5,2 -> false [i.e. 1,5 do not have a common group, unlike 5,2; the result is false because of the 1,5 pair]
1,2,4 -> true
2,7 -> false
First, we need a list of pairs. We can get this by taking your set as an array, turning each element into a row with unnest and then making pairs by matching each row with its previous row using lag.
with nums as (
    select *
    from unnest(array[1,2,5]) i
)
select lag(i) over () a, i b
from nums
offset 1;
a | b
---+---
1 | 2
2 | 5
(2 rows)
Then we join each pair with each matching row. To avoid counting duplicate data rows twice, we count only the distinct rows.
with nums as (
    select *
    from unnest(array[1,2,5]) i
), pairs as (
    select lag(i) over () a, i b
    from nums
    offset 1
)
select
    count(distinct (id_group, id_elements)) = (select count(*) from pairs)
from pairs
join foo on foo.id_group = a and foo.id_elements = b;
This works on any size array.
dbfiddle
Checking whether the elements in a set evaluate to true or not can also be done via a procedure/function: take the set as a string, split it into substrings, and return the required result (a record can be used for multiple entries). As a plain SQL query, below is a sample that can be used as a workaround; you can adjust it to your requirement.
-- "foo" is the table from the answer above; adjust names as needed.
select case
           when (select count(*)
                 from (select id_group
                       from foo
                       where id_elements in (1, 2, 5)
                       group by id_group
                       having count(distinct id_elements) = 3) grp) > 0
           then 'true'
           else 'false'
       end as result;
@Schwern, thank you, it helped. But I changed the condition join foo on foo.id_group = a because, as I understand it, a is an element's ID, not a group's. I changed that section to the following:
join foo fA on fA.id_elements = a
join foo fB on fB.id_elements = b and fA.id_group = fB.id_group;
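For the procedure/function route mentioned above, here is a minimal sketch of a helper that checks a single pair, assuming PostgreSQL and the foo(id_group, id_elements) table used earlier; the function name is hypothetical:
-- Hypothetical helper: true when the two elements share at least one group.
create or replace function pair_has_common_group(el_a int, el_b int)
returns boolean
language sql
as $$
    select exists (
        select 1
        from foo fa
        join foo fb on fb.id_group = fa.id_group
        where fa.id_elements = el_a
          and fb.id_elements = el_b
    );
$$;

-- Example: the 1,2 pair from the question returns true.
select pair_has_common_group(1, 2);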

Populate a sql table with duplicate data except for one column

I have a SQL table:
Levels
LevelId Min Product
1 x 1
2 y 1
3 z 1
4 a 1
I need to duplicate the same data in the database, changing only the Product id from 1 to 2, 3, ..., 40.
example
LevelId Min Product
1 x 2
2 y 2
3 z 2
4 a 2
I could do something like
INSERT INTO dbo.Levels SELECT TOP 4 * FROM dbo.Levels
but that would just copy paste the data.
Is there a way I can copy the data and paste it changing only the Product value?
You're most of the way there - you just need to take one more logical step:
INSERT INTO dbo.Levels (LevelID, Min, Product)
SELECT LevelID, Min, 2 FROM dbo.Levels WHERE Product = 1
...will duplicate your rows with a different product ID.
Also consider that WHERE Product = 1 is going to be more reliable than TOP 4. Once you have more than four rows in the table, you cannot guarantee that TOP 4 will return the same four rows unless you also add an ORDER BY to the SELECT. WHERE Product = ..., however, will always return the same rows, and will continue to work even if you add an extra row with a product ID of 1 (whereas you'd have to consider changing TOP 4 to TOP 5, and so on, if extra rows are added).
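For completeness, a minimal illustration of the ORDER BY point; this is only deterministic because of the ORDER BY, and WHERE Product = 1 is still the simpler choice:
INSERT INTO dbo.Levels (LevelId, Min, Product)
SELECT TOP 4 LevelId, Min, 2
FROM dbo.Levels
ORDER BY LevelId;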
You can generate the product ids and then load them in:
with cte as (
    select 2 as n
    union all
    select n + 1
    from cte
    where n < 40
)
insert into dbo.Levels ([Min], Product)
select [Min], cte.n as product
from dbo.Levels l
cross join cte
where l.Product = 1;
This assumes that LevelId is an identity column that auto-increments on insert. If not:
with cte as (
    select 2 as n
    union all
    select n + 1
    from cte
    where n < 40
)
insert into dbo.Levels (LevelId, [Min], Product)
select l.LevelId + (cte.n - 1) * 4, [Min], cte.n as product
from dbo.Levels l
cross join cte
where l.Product = 1;
INSERT INTO dbo.Levels (LevelId, Min, Product)
SELECT TOP 4
LevelId,
Min,
2
FROM dbo.Levels
You can include expressions in the SELECT statement, either hard-coded values or something like Product + 1 or anything else.
I expect you probably wouldn't want to insert the LevelId though, but left that there to match your sample. If you don't want that just remove it from the INSERT and SELECT sections.
You could use a CROSS JOIN against a numbers table, for example.
WITH
L0 AS(SELECT 1 AS C UNION ALL SELECT 1 AS O), -- 2 rows
L1 AS(SELECT 1 AS C FROM L0 AS A CROSS JOIN L0 AS B), -- 4 rows
Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS N FROM L1)
SELECT
lvl.[LevelID],
lvl.[Min],
num.[N]
FROM dbo.[Levels] lvl
CROSS JOIN Nums num
This would duplicate each row 4 times (once per value of N).
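To cover the Product values 2 through 40 from the question, the numbers table just needs more rows and a filter. A hedged sketch building on the CTE above; like the earlier answer, it assumes LevelId is an identity column:
WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1),               -- 2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A CROSS JOIN L0 AS B),  -- 4 rows
L2 AS (SELECT 1 AS C FROM L1 AS A CROSS JOIN L1 AS B),  -- 16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A CROSS JOIN L2 AS B),  -- 256 rows
Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM L3)
INSERT INTO dbo.Levels ([Min], Product)
SELECT lvl.[Min], num.N
FROM dbo.Levels lvl
CROSS JOIN Nums num
WHERE lvl.Product = 1
  AND num.N BETWEEN 2 AND 40;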

SQL random number that doesn't repeat within a group

Suppose I have a table:
HH SLOT RN
--------------
1 1 null
1 2 null
1 3 null
--------------
2 1 null
2 2 null
2 3 null
I want to set RN to be a random number between 1 and 10. It's ok for the number to repeat across the entire table, but it's bad to repeat the number within any given HH. E.g.,:
HH SLOT RN_GOOD RN_BAD
--------------------------
1 1 9 3
1 2 4 8
1 3 7 3 <--!!!
--------------------------
2 1 2 1
2 2 4 6
2 3 9 4
This is on Netezza, if it makes any difference. This one is a real head-scratcher for me. Thanks in advance!
To get a random number between 1 and the number of rows in the hh, you can use:
select hh, slot, row_number() over (partition by hh order by random()) as rn
from t;
The larger range of values is a bit more challenging. The following calculates a table (called randoms) with numbers and a random position in the same range. It then uses slot to index into the position and pull the random number from the randoms table:
with nums as (
    select 1 as n union all select 2 union all select 3 union all select 4 union all select 5 union all
    select 6 union all select 7 union all select 8 union all select 9
),
randoms as (
    select n, row_number() over (order by random()) as pos
    from nums
)
select t.hh, t.slot, hnum.n
from (select hh, randoms.n, randoms.pos
      from (select distinct hh
            from t
           ) t cross join
           randoms
     ) hnum join
     t
     on t.hh = hnum.hh and
        t.slot = hnum.pos;
Here is a SQLFiddle that demonstrates this in Postgres, which I assume is close enough to Netezza to have matching syntax.
I am not an expert on SQL, but I would probably do something like this:
1. Initialize a counter CNT = 1.
2. Create a table such that you sample 1 row randomly from each group, plus a count of rows with a null RN, say C_NULL_RN.
3. With probability C_NULL_RN/(10-CNT+1) for each row, assign CNT as RN.
4. Increment CNT and go to step 2.
Well, I couldn't get a slick solution, so I did a hack:
1. Created a new integer field called rand_inst.
2. Assigned a random number to each empty slot.
3. Updated rand_inst to be the instance number of that random number within the household. E.g., if I get two 3's, the second 3 will have rand_inst set to 2.
4. Updated the table to assign a different random number anywhere rand_inst > 1.
5. Repeated the assignment and update until we converged on a solution.
Here's what it looks like. Too lazy to anonymise it, so the names are a little different from my original post:
/* Iterative hack to fill 6 slots with a random number between 1 and 13.
A random number *must not* repeat within a household_id.
*/
update c3_lalfinal a
set a.rand_inst = b.rnum
from (
select household_id
,slot_nbr
,row_number() over (partition by household_id,rnd order by null) as rnum
from c3_lalfinal
) b
where a.household_id = b.household_id
and a.slot_nbr = b.slot_nbr
;
update c3_lalfinal
set rnd = CAST(0.5 + random() * (13-1+1) as INT)
where rand_inst>1
;
/* Repeat until this query returns 0: */
select count(*) from (
select household_id from c3_lalfinal group by 1 having count(distinct(rnd)) <> 6
) x
;
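A set-based sketch that avoids the retry loop, building on the row_number-over-random idea from the first answer. It assumes the same c3_lalfinal(household_id, slot_nbr, rnd) layout and the Netezza-style UPDATE ... FROM used above:
-- Shuffle the 13 candidate values per household and give slot k the k-th candidate,
-- so a value can never repeat within a household.
update c3_lalfinal a
set a.rnd = b.candidate
from (
    select h.household_id,
           c.candidate,
           row_number() over (partition by h.household_id order by random()) as pos
    from (select distinct household_id from c3_lalfinal) h
    cross join (
        select 1 as candidate union all select 2 union all select 3 union all
        select 4 union all select 5 union all select 6 union all select 7 union all
        select 8 union all select 9 union all select 10 union all select 11 union all
        select 12 union all select 13
    ) c
) b
where a.household_id = b.household_id
  and a.slot_nbr = b.pos;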