I have the following problem:
I need to sort some products so that one of them lands on a specific row and the others are in random order.
So if I have products A, B, C and D, I need, for example, B to be the third product while the others can be in any random order, like:
C 1
A 2
B 3
D 4
My best attempt so far is (3 is a dynamic value):
SELECT
product_name,
CASE
WHEN product = 'B' THEN 3
ELSE ( CASE WHEN rownum < 3 THEN rownum ELSE rownum + 1 END )
END sorting
FROM
products
ORDER BY
sorting ASC;
but I'm not always getting the desired outcome.
Any help or lead is appreciated.
This is rather tricky, but you can use row_number() and a bunch of arithmetic:
select p.*
from (select p.*,
             row_number() over (order by (case when product = 'B' then 2 else 1 end),
                                         dbms_random.value
                               ) as seqnum
      from products p
     ) p
order by (case when seqnum < 3 then seqnum end),
         (case when product = 'B' then 1 else 2 end),
         seqnum;
The logic is:
Enumerate the values randomly, with the special value going last.
Put in the rows with lower values.
Put in the row with the special value.
Put in the rest of the rows.
The above uses a subquery so that the random sequence number is computed once and then reused in the outer ORDER BY. You can do this without a subquery as:
order by (case when row_number() over (order by (case when product = 'B' then 2 else 1 end),
                                                dbms_random.value) < 3
               then dbms_random.value
               else 2 -- bigger than any dbms_random.value, which is always below 1
          end),
         (case when product = 'B' then 1 else 2 end),
         dbms_random.value;
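As a rough illustration of the four steps listed in this answer, here is a minimal Python sketch of the same placement logic outside the database. The function name place_at and its parameters are made up for this example. It assumes the special item occurs exactly once and that the target position is no larger than the number of rows.

```python
import random

def place_at(items, special, position, rng=random):
    """Shuffle the other items, then force `special` into the 1-based `position`."""
    others = [x for x in items if x != special]
    rng.shuffle(others)                # enumerate the other rows randomly
    head = others[:position - 1]       # rows with the lower sequence numbers
    tail = others[position - 1:]       # the rest of the rows
    return head + [special] + tail     # the special row goes in between

result = place_at(["A", "B", "C", "D"], "B", 3)
print(result)  # B is always third; the other three appear in random order
```

The position is an ordinary argument here, which mirrors the "3 is a dynamic value" requirement from the question.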
To those who have helped me before: I use SAS 9.4 a lot for my day-to-day work, but there are times when I need to use SQL Server.
I have an output table with 2 variables (attached output.csv)
output table
ID, GROUP, DATE
The table has 830 rows:
330 have a "C" group
150 have a "A" group
50 have a "B" group
the remaining 300 have group as "TEMP"
Within SQL I do not know how to programmatically work out the total volume of A+B+C. The aim is to update the "TEMP" rows so that there is an equal number of "A" and "B" rows, 250 of each (half of the rows that are not "C"),
so that the table totals:
330 have a "C" group
250 have a "A" group
250 have a "B" group
You want to proportion the "temp" to get equal amounts of "A" and "B".
So, the idea is to count up everything in A, B, and Temp and divide by 2. That is the final group size. Then you can use arithmetic to allocate the rows in Temp to the two groups:
select t.*,
       (case when seqnum + cnt_a <= final_group_size then 'A' else 'B' end) as allocated_group
from (select t.*, row_number() over (order by newid()) as seqnum
      from t
      where [group] = 'Temp'
     ) t cross join
     (select (cnt_a + cnt_b + cnt_temp) / 2 as final_group_size,
             g.*
      from (select sum(case when [group] = 'A' then 1 else 0 end) as cnt_a,
                   sum(case when [group] = 'B' then 1 else 0 end) as cnt_b,
                   sum(case when [group] = 'Temp' then 1 else 0 end) as cnt_temp
            from t
           ) g
     ) g;
SQL Server makes it easy to put this into an update:
with toupdate as (
      select t.*,
             (case when seqnum + cnt_a <= final_group_size then 'A' else 'B' end) as allocated_group
      from (select t.*, row_number() over (order by newid()) as seqnum
            from t
            where [group] = 'Temp'
           ) t cross join
           (select (cnt_a + cnt_b + cnt_temp) / 2 as final_group_size,
                   g.*
            from (select sum(case when [group] = 'A' then 1 else 0 end) as cnt_a,
                         sum(case when [group] = 'B' then 1 else 0 end) as cnt_b,
                         sum(case when [group] = 'Temp' then 1 else 0 end) as cnt_temp
                  from t
                 ) g
           ) g
     )
update toupdate
    set [group] = allocated_group;
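To check the allocation arithmetic against the numbers in the question, here is a small Python sketch. The function name allocate is hypothetical, and the counts (150 in A, 50 in B, 300 in Temp) are taken from the question.

```python
def allocate(cnt_a, cnt_b, cnt_temp):
    """Return the group assigned to each Temp row, in seqnum order 1..cnt_temp."""
    # Integer division, matching what SQL Server does with integer counts.
    final_group_size = (cnt_a + cnt_b + cnt_temp) // 2
    return [
        "A" if seqnum + cnt_a <= final_group_size else "B"
        for seqnum in range(1, cnt_temp + 1)
    ]

groups = allocate(cnt_a=150, cnt_b=50, cnt_temp=300)
total_a = 150 + groups.count("A")
total_b = 50 + groups.count("B")
print(total_a, total_b)  # 250 250
```

The first 100 Temp rows (by random sequence number) go to A and the remaining 200 go to B, which is what brings both groups to 250.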
I'd go with a TOP-style update. Note that ORDER BY is not allowed in a subquery without TOP, so the random pick goes into a CTE:
-- A already has 150 rows, so it needs 100 of the 300 'Temp' rows to reach 250
with pick as (select top (100) * from [TableName] where [Group] = 'Temp' order by newid())
update pick set [Group] = 'A';
-- the remaining 200 'Temp' rows bring B from 50 to 250
update [TableName] set [Group] = 'B' where [Group] = 'Temp';
I am trying to select 3 rows into 3 columns, but I get NULL values.
Here is my code so far:
SELECT * FROM
(
SELECT t_k
FROM m_t_k
WHERE p_id = 5 and t_k_id in (1,2,7)
) src
PIVOT(
MAX()
for t_k in ([1],[2],[3])
) piv
This is the result of the query without the PIVOT,
and I want those rows to be in 3 columns.
You could use ROW_NUMBER and a Cross Tab to achieve this. This is a bit of a guess, based on the query and image we have though, so it is untested:
SELECT MAX(CASE WHEN RN = 1 THEN sq.t_k END) AS term_key1,
       MAX(CASE WHEN RN = 2 THEN sq.t_k END) AS term_key2,
       MAX(CASE WHEN RN = 3 THEN sq.t_k END) AS term_key3
FROM (SELECT t_k,
             ROW_NUMBER() OVER (ORDER BY t_k) AS RN
      FROM m_t_k
      WHERE p_id = 5
        AND t_k_id IN (1, 2, 7)) sq;
I have a table with two columns in postgresql: original id and duplicate id.
Sample data:
original_id duplicate_id
1 1
2 2
3 3
4 4
5 5
6 6
I would like to randomly split this table 50/50, so that I can put a specific tag on each half.
Desired result:
original_id duplicate_id tag
1 1 control
2 2 treatment
3 3 treatment
4 4 control
5 5 treatment
6 6 control
What is important:
1. The selection has to be random
2. The split has to be 50/50 (or the closest to this if the number of rows is odd)
You can select half of the rows in a random order with this query:
select *
from my_table
order by random()
limit (select count(*) / 2 from my_table);
Use it to tag the rows:
with control as (
select *
from my_table
order by random()
limit (select count(*)/ 2 from my_table)
)
select
*,
case when t in (select t from control t) then 'control' else 'treatment' end
from my_table t;
Working example in rextester.
You can use row_number() OVER (ORDER BY random()) to assign a random number to each record. Then use it in a CASE to assign either the tag 'control' or 'treatment', depending on whether the number is less than or equal to half of the count of rows in the table.
For a SELECT that looks like this:
SELECT original_id,
duplicate_id,
CASE
WHEN rn <= (SELECT count(*) / 2
FROM elbat) THEN
'control'
ELSE
'treatment'
END tag
FROM (SELECT original_id,
duplicate_id,
row_number() OVER (ORDER BY random()) rn
FROM elbat) x;
If you want an UPDATE (I'm not sure on this), and assuming that the pair of original_id and duplicate_id is unique, it could look like:
UPDATE elbat t
SET tag = CASE
WHEN rn <= (SELECT count(*) / 2
FROM elbat) THEN
'control'
ELSE
'treatment'
END
FROM (SELECT original_id,
duplicate_id,
row_number() OVER (ORDER BY random()) rn
FROM elbat) x
WHERE x.original_id = t.original_id
AND x.duplicate_id = t.duplicate_id;
db<>fiddle
(BTW, that SELECT result on the Fiddle gives a nice example, that the order of the rows returned can be totally different from the physical one, if the optimizer likes it better that way.)
I would use window functions:
select t.*,
       (case when seqnum <= cnt / 2
             then 'treatment' else 'control'
        end) as tag
from (select t.*,
             count(*) over () as cnt,
             row_number() over (order by random()) as seqnum
      from t
     ) t;
Actually, random is random. So, you don't need the count. You can use modulo arithmetic instead:
select t.*,
(case when row_number() over (order by random()) % 2 = 1
then 'treatment' else 'control'
end) as tag
from t;
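The modulo variant can be tried out with a small script. The sketch below is only an illustration: it uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL (an assumption, and it needs SQLite 3.25 or later for window functions), with a 7-row table t modelled on the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (original_id integer, duplicate_id integer)")
conn.executemany("insert into t values (?, ?)", [(i, i) for i in range(1, 8)])  # 7 rows

rows = conn.execute("""
    select original_id,
           case when rn % 2 = 1 then 'treatment' else 'control' end as tag
    from (select original_id,
                 row_number() over (order by random()) as rn
          from t) x
""").fetchall()

tags = [tag for _, tag in rows]
print(tags.count("treatment"), tags.count("control"))  # 4 3
```

Because the row numbers run 1..n without gaps, the odd and even groups can differ by at most one row, which is the closest possible split for an odd row count. Which rows land in which group changes on every run.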
You can make random() generate the values 1 or 2 using the formula (random() + 1)::int. Each row is tagged independently, so the split is only approximately 50/50:
select t.*,
case (random() + 1)::int
when 1 then 'treatment'
else 'control'
end as tag
from t;
In general, floor(random() * (upper_limit - lower_limit + 1))::int + lower_limit generates integers between lower_limit and upper_limit (inclusive) with equal probability. A plain (random() * (upper_limit - lower_limit) + lower_limit)::int also stays in that range, but the cast rounds to the nearest integer, so the two end values come up only half as often as the ones in between. With just two values that makes no difference, but if you want to generate e.g. four random values, use the floor form:
select t.*,
       case floor(random() * 4)::int + 1
           when 1 then 'treatment'
           when 2 then 'control'
           when 3 then 'something'
           else 'some other thing'
       end as tag
from t;
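One thing worth checking with any cast-based formula is how the resulting values are distributed, because a PostgreSQL cast from double precision to int rounds rather than truncates. The Python sketch below compares a rounding variant with a floor-based variant. To keep it deterministic, it feeds both with evenly spaced stand-ins for random() instead of real random numbers; that substitution is an assumption made for the illustration.

```python
import math
from collections import Counter

# Evenly spaced stand-ins for random() on [0, 1); the half-step offset avoids exact .5 ties.
samples = [(i + 0.5) / 1200 for i in range(1200)]

rounded = Counter(round(r * 3 + 1) for r in samples)       # mimics (random() * 3 + 1)::int
floored = Counter(math.floor(r * 4) + 1 for r in samples)  # mimics floor(random() * 4)::int + 1

print(sorted(rounded.items()))  # [(1, 200), (2, 400), (3, 400), (4, 200)]
print(sorted(floored.items()))  # [(1, 300), (2, 300), (3, 300), (4, 300)]
```

The rounding variant gives the end values 1 and 4 half the weight of 2 and 3, while the floor variant spreads the four values evenly.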
Looking for help with an SQL query to turn a history table into flat file format with up to 5 instances of results on table B. I have only shown 2 instances in the results. For a bonus point can these be sorted by EFF_DATE ascending?!
My query so far is
SELECT a.REFNO, a.M_NAME, b.EFF_DATE, b.VAL
FROM TABLEA a INNER JOIN TABLEB b ON (a.REFNO=b.REFNO)
WHERE a.REFNO = '1'
This is fine for returning one result per row, but how do I modify it so that up to 5 EFF_DATE and VAL instances are repeated on one row? The dates can be any date, and ideally I would like them sorted ascending from left to right. Only those rows in TABLEB where VAL > 0 should be included.
If you know the number of columns you want in the history, then you can use conditional aggregation or pivot. The challenge is not having a column for the pivot.
You can easily generate one, though, using ROW_NUMBER():
SELECT a.REFNO, a.M_NAME,
       MAX(CASE WHEN seqnum = 1 THEN b.EFF_DATE END) as EFF_DATE_1,
       MAX(CASE WHEN seqnum = 1 THEN b.VAL END) as VAL_1,
       MAX(CASE WHEN seqnum = 2 THEN b.EFF_DATE END) as EFF_DATE_2,
       MAX(CASE WHEN seqnum = 2 THEN b.VAL END) as VAL_2,
       MAX(CASE WHEN seqnum = 3 THEN b.EFF_DATE END) as EFF_DATE_3,
       MAX(CASE WHEN seqnum = 3 THEN b.VAL END) as VAL_3
       -- add the same pair of expressions for seqnum = 4 and seqnum = 5
FROM TABLEA a INNER JOIN
     (SELECT b.*,
             ROW_NUMBER() OVER (PARTITION BY REFNO ORDER BY EFF_DATE) as seqnum
      FROM TABLEB b
      WHERE b.VAL > 0 -- only rows with a positive VAL, as the question asks
     ) b
     ON a.REFNO = b.REFNO
WHERE a.REFNO = '1'
GROUP BY a.REFNO, a.M_NAME;
If you don't know the number of columns in the output, then you will need dynamic SQL or to do the formatting at the application layer.
I have a table Test with two columns.
Id Value
1 A
1 B
1 C
I want to get the result like below,
Id Value1 Value2 value3
1 A B C
How can I do this in SQL Server?
This is a pivot, but you don't have a column for the pivoting. row_number() can provide that. I usually use conditional aggregations for this.
select id,
       max(case when seqnum = 1 then value end) as value1,
       max(case when seqnum = 2 then value end) as value2,
       max(case when seqnum = 3 then value end) as value3
from (select t.*,
             row_number() over (partition by id order by (select null)) as seqnum
      from Test t
     ) t
group by id;
Note that SQL tables represent unordered sets. So, there is no information about ordering and the values could be in any order. If a column does specify the ordering, then include that in the order by rather than select null.
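For a quick way to try the row_number() plus conditional aggregation pattern, here is a minimal sketch. It runs the query in SQLite through Python's sqlite3 module rather than in SQL Server (an assumption made so the example is self-contained; it needs SQLite 3.25 or later). It also orders by value instead of (select null), assuming that alphabetical order is the order wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, value text)")
conn.executemany("insert into t values (?, ?)", [(1, "A"), (1, "B"), (1, "C")])

row = conn.execute("""
    select id,
           max(case when seqnum = 1 then value end) as value1,
           max(case when seqnum = 2 then value end) as value2,
           max(case when seqnum = 3 then value end) as value3
    from (select t.*,
                 row_number() over (partition by id order by value) as seqnum
          from t) x
    group by id
""").fetchone()

print(row)  # (1, 'A', 'B', 'C')
```

Each CASE picks out the single row with the matching sequence number, and MAX() collapses the NULLs from the other rows, so every id ends up on one output row.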