Order grouped table by id user sql - sql

I want to order a grouped statement using as reference the number chosen by a specific user.
SELECT *
FROM likes
WHERE /**/
GROUP BY type
TABLE
id_user type
1420 1
1421 3
1422 3
1424 7
1425 4
1426 2
1427 1
Expected result (with what user 1425 chose at the end):
1
2
3
7
4 //chosen by id_user 1425
I want the row with the number chosen by the user to come last. I just can't figure that out.

You can aggregate and use a conditional max for ordering, like so:
select type
from likes
group by type
order by max(case when id_user = 1425 then 1 else 0 end), type
If any row for the given type has an id_user that matches the chosen value, the conditional max returns 1, which puts that group last in the result set. The second ordering criterion breaks ties among the groups that do not fulfill the condition.
If you are running MySQL, you can simplify the order by clause a little:
order by max(id_user = 1425), type
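To make the behaviour easy to check, here is a minimal sketch that rebuilds the sample data from the question and runs the portable form of the query. The column types are an assumption, since the question does not state them.

```sql
-- Hypothetical setup mirroring the sample data from the question
-- (column types are assumed).
CREATE TABLE likes (id_user INT, type INT);

INSERT INTO likes (id_user, type) VALUES
  (1420, 1), (1421, 3), (1422, 3), (1424, 7),
  (1425, 4), (1426, 2), (1427, 1);

-- Groups containing user 1425 get flag 1 and sort after all flag-0 groups;
-- the type column then orders the groups within each flag value.
SELECT type
FROM likes
GROUP BY type
ORDER BY MAX(CASE WHEN id_user = 1425 THEN 1 ELSE 0 END), type;
-- Returns 1, 2, 3, 7, 4
```

Only type 4 contains a row for user 1425, so it is the only group whose flag is 1 and it ends up last.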

Related

Why different sequence is observed for same SQL query on different DBMS?

How does ORDER BY work when there are multiple rows with the same name?
I executed a query:
SELECT * FROM XYZ ORDER BY NAME;
On Oracle I get the following result:
Name    | Uid            | OwnerId
------------------------------------------
Test123 | QuuNWWzUJKmZPC | iotNmQNGJKmZPC
Test123 | NULL           | NULL
On SQL Server I get the following result:
Name    | Uid            | OwnerId
------------------------------------------
Test123 | NULL           | NULL
Test123 | QuuNWWzUJKmZPC | iotNmQNGJKmZPC
Why is a different sequence shown? In Oracle the row with NULL appears 2nd whereas in SQL Server it appeared 1st. Is there any default behavior for each DBMS?
Is there any way to make SQL Server's result look like Oracle's?
An unordered result set is delivered with a non-deterministic ordering and it is possible that the order can change with every execution of the query.
If you apply an ORDER BY clause that defines a total ordering of the result set then that result set will be delivered in a deterministic ordering and the output will be identical with every execution of the query (assuming the underlying data set is unchanged).
If you apply an ORDER BY clause that defines a partial ordering of the result set then the result set will be partly ordered and partly non-deterministic.
For example, if you have the data set:
Name  | Value
-------------
Alice | 1
Beryl | 2
Beryl | 3
Alice | 4
Alice | 5
Beryl | 6
and you use:
SELECT *
FROM table_name
ORDER BY name, value
This defines a total ordering of the result set: rows are first ordered by name, which on its own is only a partial ordering, and then, within each set of rows with the same name, by value. Because the values are unique, no ties remain. This outputs:
Name  | Value
-------------
Alice | 1
Alice | 4
Alice | 5
Beryl | 2
Beryl | 3
Beryl | 6
If you do not define a total ordering:
SELECT *
FROM table_name
ORDER BY name
Then the rows will be ordered by name, but within each set of rows with the same name the individual rows can be in ANY order. This is a partial ordering and could output:
Name  | Value
-------------
Alice | 5
Alice | 4
Alice | 1
Beryl | 3
Beryl | 6
Beryl | 2
but equally could output:
Name  | Value
-------------
Alice | 5
Alice | 1
Alice | 4
Beryl | 6
Beryl | 2
Beryl | 3
Is there any way to make SQL Server's result look like Oracle's?
Do not apply a partial ordering; add columns to your ORDER BY clause to make it a total ordering and then the RDBMSes should output the result set in the same order.
SELECT *
FROM XYZ
ORDER BY
NAME ASC,
UID ASC NULLS LAST;
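Note that NULLS LAST is accepted by Oracle but not by SQL Server, which sorts NULLs first in ascending order. A minimal sketch of the same total ordering written for SQL Server puts a CASE expression in front of the column. The table definition below is an assumption made only so the example is self-contained.

```sql
-- Hypothetical setup for SQL Server, mirroring the two rows from the question
-- (column types are assumed).
CREATE TABLE XYZ (Name VARCHAR(50), Uid VARCHAR(50), OwnerId VARCHAR(50));

INSERT INTO XYZ (Name, Uid, OwnerId) VALUES
  ('Test123', NULL, NULL),
  ('Test123', 'QuuNWWzUJKmZPC', 'iotNmQNGJKmZPC');

SELECT *
FROM XYZ
ORDER BY
  Name ASC,
  CASE WHEN Uid IS NULL THEN 1 ELSE 0 END ASC,  -- rows with a non-NULL Uid first, NULL last
  Uid ASC;                                      -- orders the non-NULL Uid values
```

With this ordering the row with the non-NULL Uid comes first, which matches the Oracle output shown in the question. The ordering is only total if Uid is unique within each Name.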

Athena array aggregate and filter multiple columns on condition

I have data as shown below.
uuid | movie  | data
--------------------------------------------------------------------------------------
1    | movie1 | {title=rental, label=GA, price=50, feetype=rental, hidden=false}
1    | movie1 | {title=tax, label=GA, price=25, feetype=service-fees, hidden=true}
1    | movie1 | {title=rental, label=GA, price=50, feetype=rental, hidden=false}
1    | movie1 | {title=tax, label=GA, price=25, feetype=service-fees, hidden=true}
2    | movie3 | {title=rental, label=VIP, price=100, feetype=rental, hidden=false}
2    | movie3 | {title=tax, label=VIP, price=25, feetype=service-fees, hidden=true}
2    | movie3 | {title=promo, label=VIP, price=10, feetype=discount, hidden=false}
And this is what I want the result to look like:
uuid | total_fee | total_discount | discount_type
-------------------------------------------------
1    | 150       | 0              | NA
2    | 125       | 10             | promo
I tried using
SELECT uuid
, sum("fee"."price") "total_fee"
, array_agg(distinct("fee"."feetype")) "fee_type"
, array_agg(distinct("fee"."title")) "fee_name"
This gives the result as shown below,
uuid | total_fee | fee_type       | fee_name
--------------------------------------------
1    | 100       | [rental]       | [rental]
1    | 50        | [service-fees] | [tax]
2    | 100       | [rental]       | [rental]
2    | 25        | [service-fees] | [tax]
2    | 10        | [discount]     | [promo]
Now how do I aggregate on total_fee and filter fee_name based on fee_type?
I tried using
, CASE WHEN regexp_like(array_join(fee_type, ','), 'discount') THEN sum("fee") ELSE 0 END "discount"
but that resulted in
SYNTAX_ERROR: line 207:6: '(CASE WHEN "regexp_like"("array_join"(fee_type, ','), 'discount') THEN "sum"("fee") ELSE 0 END)' must be an aggregate expression or appear in GROUP BY clause
You should be able to do something like this:
SELECT
uuid,
SUM(fee.price) AS total_fee,
SUM(fee.price) FILTER (WHERE fee.feetype = 'discount') AS total_discount,
ARBITRARY(fee.title) FILTER (WHERE fee.feetype = 'discount') AS discount_type
FROM …
GROUP BY uuid
(I'm assuming the data column in your example is the same as the fee column in your query).
Aggregate functions support a FILTER clause that selects the rows to include into the aggregation. This can also be achieved by e.g. SUM(IF(fee.feetype = 'discount', fee.price, 0)), which is more compact but not as elegant.
The ARBITRARY aggregate function picks an arbitrary value from the group. I don't know if that's appropriate in your case, but I assume that there will only be one discount row per group. If there is more than one, you might want to use ARRAY_AGG with the DISTINCT clause (e.g. ARRAY_AGG(DISTINCT fee.title)) to get them all.
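One detail worth noting: an aggregate with a FILTER clause returns NULL when no row in the group matches, whereas the desired output shows 0 and NA for uuid 1. A minimal sketch of one way to handle that is to wrap the filtered aggregates in COALESCE. The inline VALUES list below is a hypothetical stand-in for the elided FROM clause, with the struct fields flattened into plain columns.

```sql
-- Hypothetical inline data replacing the real table; the struct fields
-- from the question are flattened into plain columns for illustration.
SELECT
  uuid,
  SUM(price) AS total_fee,
  -- COALESCE turns the NULL produced by an empty filter into a default value
  COALESCE(SUM(price) FILTER (WHERE feetype = 'discount'), 0) AS total_discount,
  COALESCE(ARBITRARY(title) FILTER (WHERE feetype = 'discount'), 'NA') AS discount_type
FROM (
  VALUES
    (1, 'rental', 'rental',       50),
    (1, 'tax',    'service-fees', 25),
    (1, 'rental', 'rental',       50),
    (1, 'tax',    'service-fees', 25),
    (2, 'rental', 'rental',       100),
    (2, 'tax',    'service-fees', 25),
    (2, 'promo',  'discount',     10)
) AS fees (uuid, title, feetype, price)
GROUP BY uuid
ORDER BY uuid
```

Here SUM(price) includes the discount row, so total_fee for uuid 2 comes out as 135. If the 125 from the desired output is what is wanted, adding FILTER (WHERE feetype <> 'discount') to that sum should give it, though whether discounts belong in the total is an assumption about the requirements.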

SQL SUM only when each value within a group is greater than 0

Here is a sample data set:
ID Value
1 421
1 532
1 642
2 3413
2 0
2 5323
I want a query that, in this case, only sums ID=1 because all of its values are greater than 0. I cannot use a WHERE statement that says WHERE Value > 0 because then ID=2 would still return a value. I feel like this may be an instance where I could possibly use a OVER(PARTITION BY...) statement, but I am not familiar enough to use it creatively.
As an aside, I don't simply add a WHERE ID = 1 statement because this needs to cover a much larger data set.
Just use having:
select id, sum(value)
from t
group by id
having min(value) > 0;
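The reasoning is that if the smallest value in a group is greater than 0, every value in the group is. A minimal sketch against the sample data follows; the table name t comes from the query above and the column types are assumed.

```sql
-- Hypothetical setup mirroring the sample data set (column types are assumed).
CREATE TABLE t (id INT, value INT);

INSERT INTO t (id, value) VALUES
  (1, 421), (1, 532), (1, 642),
  (2, 3413), (2, 0), (2, 5323);

-- HAVING is evaluated per group after aggregation, so a single 0
-- in a group removes the whole group from the result.
SELECT id, SUM(value) AS total
FROM t
GROUP BY id
HAVING MIN(value) > 0;
-- Returns one row: id = 1, total = 1595
```

Unlike WHERE value > 0, which would only drop the single zero row and still sum the rest of ID 2, the HAVING condition rejects the group as a whole.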

Sort by specific order, including NULL, postgresql

best explained with an example:
So I have users table:
id name product
1 second NULL
2 first 27
3 first 27
4 last 6
5 second NULL
And I would like to order them in this product order: [27,NULL, 6]
So I will get:
id name product
2 first 27
3 first 27
1 second NULL
5 second NULL
4 last 6
(notice user id 3 can be before user id 2 since they both have the same product value)
Now without NULL I could do it like that:
SELECT id FROM users ORDER BY users.product=27, users.product=6;
How can I do it with NULL ?
p.s.
I would like to do that for many records so it should be efficient.
You can use case to produce custom sort order:
select id
from users
order by case
    when product = 27 then 1
    when product is null then 2
    when product = 6 then 3
end
As a note, you can follow your original approach. You just need a NULL-safe comparison:
SELECT id
FROM users
ORDER BY (NOT users.product IS DISTINCT FROM 27)::int DESC,
(users.product IS NULL)::int DESC,
(NOT users.product IS DISTINCT FROM 6)::int DESC;
The reason your version has unexpected results is that the first comparison can return NULL, which is ordered separately from "true" and "false".
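For reference, here is a minimal PostgreSQL sketch of the CASE approach run against the sample data. The column types, the ELSE branch and the id tie-breaker are assumptions added for illustration; they are not part of the answers above.

```sql
-- Hypothetical setup mirroring the users table from the question
-- (column types are assumed).
CREATE TABLE users (id INT, name TEXT, product INT);

INSERT INTO users (id, name, product) VALUES
  (1, 'second', NULL),
  (2, 'first',  27),
  (3, 'first',  27),
  (4, 'last',   6),
  (5, 'second', NULL);

SELECT id, name, product
FROM users
ORDER BY
  CASE
    WHEN product = 27    THEN 1
    WHEN product IS NULL THEN 2
    WHEN product = 6     THEN 3
    ELSE 4               -- assumption: any other product sorts after the listed ones
  END,
  id;                    -- assumption: tie-breaker to make the order deterministic
-- Returns ids 2, 3, 1, 5, 4
```

The explicit IS NULL branch is what makes this work: product = NULL never evaluates to true, so NULL has to be tested with IS NULL rather than compared with =.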

how to select one tuple in rows based on variable field value

I'm quite new to SQL and I'd like to make a SELECT statement that retrieves only the first row of each set, based on a column value. I'll try to make it clearer with a table example.
Here is my table data :
chip_id | sample_id
-------------------
1 | 45
1 | 55
1 | 5986
2 | 453
2 | 12
3 | 4567
3 | 9
I'd like to have a SELECT statement that fetches the first row for each chip_id (1, 2, 3).
Like this :
chip_id | sample_id
-------------------
1 | 45 or 55 or whatever
2 | 12 or 453 ...
3 | 9 or ...
How can I do this?
Thanks
I'd probably:
set a variable = 0
order your table by chip_id
read the table in row by row
if the row's chip_id > variable, store the row in a result array and set the variable to that chip_id
loop till done
return your result array
Though depending on your DB, query and versions, you'll probably get unpredictable/unreliable results.
You can get one value using row_number():
select chip_id, sample_id
from (select chip_id, sample_id,
             row_number() over (partition by chip_id order by rand()) as seqnum
      from your_table
     ) t
where seqnum = 1
This returns a random value. In SQL, tables are inherently unordered, so there is no concept of "first". You need an auto incrementing id or creation date or some way of defining "first" to get the "first".
If you have such a column, then replace rand() with the column.
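As a minimal sketch of that replacement, the example below orders by sample_id itself, purely as an assumption, since the sample table has no creation date or auto-incrementing column. The table name chip_samples and the column types are hypothetical.

```sql
-- Hypothetical setup mirroring the table from the question
-- (table name and column types are assumed).
CREATE TABLE chip_samples (chip_id INT, sample_id INT);

INSERT INTO chip_samples (chip_id, sample_id) VALUES
  (1, 45), (1, 55), (1, 5986),
  (2, 453), (2, 12),
  (3, 4567), (3, 9);

-- Number the rows within each chip_id by sample_id and keep row 1,
-- which here is the smallest sample_id per chip.
SELECT chip_id, sample_id
FROM (
  SELECT chip_id, sample_id,
         ROW_NUMBER() OVER (PARTITION BY chip_id ORDER BY sample_id) AS seqnum
  FROM chip_samples
) t
WHERE seqnum = 1
ORDER BY chip_id;
-- Returns (1, 45), (2, 12), (3, 9)
```

With a real "first" column such as a creation timestamp, only the ORDER BY inside the OVER clause would change.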
Provided I understood your output, if you are using PostgreSQL 9.0 or later, you can use this:
SELECT chip_id,
       string_agg(sample_id::text, ' or ')
FROM your_table
GROUP BY chip_id
You need to group your data with a GROUP BY query.
When you group, generally you want the max, the min, or some other values to represent your group. You can do sums, count, all kind of group operations.
For your example, you don't seem to want a specific group operation, so the query could be as simple as this one :
SELECT chip_id, MAX(sample_id)
FROM your_table
GROUP BY chip_id
This way you are retrieving the maximum sample_id for each of the chip_id.