Sort by data from multiple columns - sql

I store customer reviews of my products in a SQL table that looks something like this:
durability | cost | appearance
----------------------------------
5 | 3 | 4
2 | 4 | 2
1 | 5 | 5
Each value is a score out of five in one of the three categories.
When I print this information on a page, I'd like to sort the reviews in descending order by each review's average score.
SELECT *
FROM reviews
SORT BY (durability+cost+appearance)/3 DESC
Obviously this doesn't work, but is there a way to get this result? I don't want to add an average column to the table because, outside of this one small application, it would serve no purpose.

Use ORDER BY instead of SORT BY:
SELECT *
FROM reviews
ORDER BY (durability+cost+appearance)/3 DESC
EDIT:
To see the value used for ordering, add it as one more column in the SELECT clause:
SELECT *,(durability+cost+appearance)/3 as OrderValue
FROM reviews
ORDER BY (durability+cost+appearance)/3 DESC
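Sample output (the order value is truncated here because the division is performed on integers):
DURABILITY | COST | APPEARANCE | ORDERVALUE
5 | 3 | 4 | 4
1 | 5 | 5 | 3
2 | 4 | 2 | 2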
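As a rough illustration, here is a minimal sketch of the same idea run against an in-memory SQLite database from Python. The rows come from the question; the avg_score alias and the division by 3.0 (to keep the fractional part of the average) are my own assumptions, not part of the original answer.

```python
import sqlite3

# In-memory database holding the three reviews from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (durability INTEGER, cost INTEGER, appearance INTEGER)"
)
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?)",
    [(5, 3, 4), (2, 4, 2), (1, 5, 5)],
)

# Order by the computed average without storing it in the table.
# Dividing by 3.0 instead of 3 avoids integer truncation.
rows = conn.execute(
    """
    SELECT durability, cost, appearance,
           (durability + cost + appearance) / 3.0 AS avg_score
    FROM reviews
    ORDER BY avg_score DESC
    """
).fetchall()

for row in rows:
    print(row)
```

With integer division, two reviews whose averages differ only after the decimal point would compare as equal, so the untruncated value gives a more faithful ordering.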

Related

Split a quantity into multiple rows with limit on quantity per row

I have a table of ids and quantities that looks like this:
dbo.Quantity
id | qty
-------
1 | 3
2 | 6
I would like to split the quantity column into multiple rows and number them, with a configurable limit on the maximum quantity allowed in each numbered bucket.
So for a limit of 2, the expected output is:
dbo.DesiredResult
id | qty | bucket
---------------
1 | 2 | 1
1 | 1 | 2
2 | 1 | 2
2 | 2 | 3
2 | 2 | 4
2 | 1 | 5
In other words,
Running SELECT id, SUM(qty) as qty FROM dbo.DesiredResult GROUP BY id should return the original table (dbo.Quantity).
Running
SELECT id, SUM(qty) as qty FROM dbo.DesiredResult GROUP BY bucket
should give you this table.
id | qty | bucket
------------------
1 | 2 | 1
1 | 2 | 2
2 | 2 | 3
2 | 2 | 4
2 | 1 | 5
I feel I can do this imperatively with cursors, looping over each row and keeping a counter that increments and resets as the "max" for each bucket is filled. But this feels very "anti-SQL", and I suspect there is a better way.
One approach is a recursive CTE, which emulates a cursor stepping through the rows sequentially.
Another approach that comes to mind is to represent your data as intervals and compute intersections of intervals.
Represent this:
id | qty
-------
1 | 3
2 | 6
as intervals [0;3), [3;9) with ids being their labels
0123456789
|--|-----|
 1    2     - id
It is easy to generate this set of intervals using running total SUM() OVER().
Represent your buckets also as intervals [0;2), [2;4), [4;6), etc. with their own labels
0123456789
|-|-|-|-|-|
 1 2 3 4 5  - bucket
It is easy to generate this set of intervals using a table of numbers.
Intersect these two sets of intervals preserving information about their labels.
Working with sets should be possible in a set-based SQL query, rather than a sequential cursor or recursion.
It is a bit too much for me to write down the actual query right now. But it is quite possible that ideas similar to those discussed in Packing Intervals by Itzik Ben-Gan may be useful here.
Actually, once you have your quantities represented as intervals, you can generate the required number of rows/buckets on the fly from the table of numbers using CROSS APPLY.
Imagine we transformed your Quantity table into Intervals:
Start | End | ID
0 | 3 | 1
3 | 9 | 2
And we also have a table of numbers - a table Numbers with column Number with values from 0 to, say, 100K.
For each Start and End of the interval we can calculate the corresponding bucket number by dividing the value by the bucket size and rounding down or up.
Something along these lines:
SELECT
    Intervals.ID
    ,A.qty
    ,A.Bucket
FROM
    Intervals
    CROSS APPLY
    (
        SELECT
            Numbers.Number + 1 AS Bucket
            ,@BucketSize AS qty
            -- qty equals @BucketSize when the bucket lies completely within the Start and End boundaries;
            -- it should be adjusted for the first and last buckets of the interval
        FROM Numbers
        WHERE
            Numbers.Number >= Start / @BucketSize
            AND Numbers.Number < End / @BucketSize + 1
    ) AS A
;
You'll need to check the formulas for off-by-one errors and adjust them.
You'll also need some CASE WHEN logic to calculate the correct qty for the buckets that fall on the lower or upper boundary of the interval.
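One way the interval idea could be written out in full is sketched below. It is only a sketch under several assumptions: it runs on SQLite (3.25 or later, for window functions) from Python rather than on SQL Server, a recursive CTE stands in for the Numbers table, a plain join replaces CROSS APPLY (which SQLite does not have), and the names quantity, start_pos, end_pos and the :size parameter are made up for the example. The boundary adjustment is done with two-argument MIN/MAX instead of CASE WHEN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quantity (id INTEGER PRIMARY KEY, qty INTEGER)")
conn.executemany("INSERT INTO quantity VALUES (?, ?)", [(1, 3), (2, 6)])

SPLIT_SQL = """
WITH RECURSIVE
intervals AS (
    -- Running total turns each row into a half-open interval [start_pos, end_pos).
    SELECT id,
           SUM(qty) OVER (ORDER BY id) - qty AS start_pos,
           SUM(qty) OVER (ORDER BY id)       AS end_pos
    FROM quantity
),
numbers(n) AS (
    -- Bucket n covers [n * size, (n + 1) * size); generate enough buckets
    -- to cover the grand total.
    SELECT 0
    UNION ALL
    SELECT n + 1 FROM numbers
    WHERE (n + 1) * :size < (SELECT SUM(qty) FROM quantity)
)
SELECT i.id,
       -- Length of the overlap between the interval and the bucket.
       MIN(i.end_pos, (n.n + 1) * :size) - MAX(i.start_pos, n.n * :size) AS qty,
       n.n + 1 AS bucket
FROM intervals AS i
JOIN numbers AS n
  ON n.n * :size < i.end_pos
 AND (n.n + 1) * :size > i.start_pos
ORDER BY bucket, i.id
"""

rows = conn.execute(SPLIT_SQL, {"size": 2}).fetchall()
for row in rows:
    print(row)
```

The join condition keeps only the interval/bucket pairs that actually overlap, so each output row is one intersection of the two sets of intervals described above.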
Use a recursive CTE:
with cte as (
select id, 1 as n, qty
from t
union all
select id, n + 1, qty
from cte
where n + 1 < qty
)
select id, n
from cte;

Count results in SQL statement additional row

I am trying to get 3% of total membership, which the code below does, but the result comes back with two rows: one has the percentage and the other is 0. I'm not sure why, or how to get rid of it.
select
sum(Diabetes_FLAG) * 100 / (select round(count(medicaid_no) * 0.03) as percent
from membership) AS PERCENT_OF_Dia
from
prefinal
group by
Diabetes_Flag
I'm not sure why it brought back a second row; I only need the percentage, not the second row.
What am I doing wrong?
Output:
PERCENT_OF_DIA
1 11.1111111111111
2 0
SELECT sum(Diabetes_FLAG)*100 / (SELECT round(count(medicaid_no)*0.03) as percentt
FROM membership) AS PERCENT_OF_Dia
FROM prefinal
WHERE Diabetes_FLAG = 1
-- GROUP BY Diabetes_Flag is not needed here, because the WHERE clause already limits the rows to a single flag value.
Remove the group by if you want one row:
select sum(Diabetes_FLAG)*100/( SELECT round(count(medicaid_no)*0.03) as percentt
from membership) AS PERCENT_OF_Dia
from prefinal;
When you include group by Diabetes_FLAG, it creates a separate row for each value of Diabetes_FLAG. Based on your results, I'm guessing that it takes on the values 0 and 1.
"Not sure why it brought back a second row"
This is how a GROUP BY query works. The GROUP BY clause groups data by a given column: it collects all values of this column, builds the distinct set of these values, and displays one row for each distinct value.
Please consider this simple demo: http://sqlfiddle.com/#!9/3a38df/1
SELECT * FROM prefinal;
| Diabetes_Flag |
|---------------|
| 1 |
| 1 |
| 5 |
Usually the GROUP BY column is listed in the SELECT clause too, like this:
SELECT Diabetes_Flag, sum(Diabetes_Flag)
FROM prefinal
GROUP BY Diabetes_Flag;
| Diabetes_Flag | sum(Diabetes_Flag) |
|---------------|--------------------|
| 1 | 2 |
| 5 | 5 |
As you can see, GROUP BY displays two rows, one for each unique value of the Diabetes_Flag column.
If you remove the Diabetes_Flag column from the SELECT clause, you will get the same result as above, but without this column:
SELECT sum(Diabetes_Flag)
FROM prefinal
GROUP BY Diabetes_Flag;
| sum(Diabetes_Flag) |
|--------------------|
| 2 |
| 5 |
So the reason that you get 2 rows is that Diabetes_Flag has 2 distinct values in the table.
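To make the difference concrete, here is a small sketch using SQLite from Python. The five flag values are made-up sample data (the real prefinal table is not shown in the question), chosen so that the flag takes the values 0 and 1 as guessed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prefinal (Diabetes_Flag INTEGER)")
# Hypothetical sample data: two members flagged, three not flagged.
conn.executemany(
    "INSERT INTO prefinal VALUES (?)", [(1,), (1,), (0,), (0,), (0,)]
)

# With GROUP BY: one output row per distinct flag value (0 and 1).
grouped = conn.execute(
    "SELECT SUM(Diabetes_Flag) FROM prefinal "
    "GROUP BY Diabetes_Flag ORDER BY Diabetes_Flag"
).fetchall()

# Without GROUP BY: the whole table is a single group, so one row.
ungrouped = conn.execute("SELECT SUM(Diabetes_Flag) FROM prefinal").fetchall()

print(grouped)
print(ungrouped)
```

The row for the 0 group sums to 0, which is where the extra "0" row in the question's output comes from.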

How does DISTINCT interact with ORDER BY?

Consider the two tables below:
user:
ID | name
---+--------
1 | Alice
2 | Bob
3 | Charlie
event:
order | user
------+------------
1 | 1 (Alice)
2 | 2 (Bob)
3 | 3 (Charlie)
4 | 3 (Charlie)
5 | 2 (Bob)
6 | 1 (Alice)
If I run the following query:
SELECT DISTINCT user FROM event ORDER BY "order" DESC;
will it be guaranteed that I get the results in the following order?
1 (Alice)
2 (Bob)
3 (Charlie)
If the last three rows of event are selected, I know this is the order I get, because it would be ordering 4, 5, 6 in descending order. But if the first three rows are selected, and DISTINCT then prevents the last three from being loaded for consideration, I would get the reverse order.
Is this behavior well defined in SQL? Which of the two will happen? What about in SQLite?
No, it will not be guaranteed.
See Itzik Ben-Gan's Logical Query Processing Phases poster for MS SQL Server. It moves between sites; at the time of writing it could be found at https://accessexperts.com/wp-content/uploads/2015/07/Logical-Query-Processing-Poster.pdf .
DISTINCT precedes ORDER BY and TOP, and SQL Server is free to return either the 1 | 1 (Alice) row or the 6 | 1 (Alice) row for Alice. So any of (1,2,3), (1,4,5) and so on are valid results of DISTINCT.
Here's a query solution that I believe solves your problem.
SELECT
MAX([order]) AS MaxOrd
, [user]
FROM Event
GROUP BY [User]
ORDER BY MaxOrd DESC
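Since the question also asks about SQLite, here is a sketch of the same GROUP BY approach run on SQLite from Python. The table and data follow the question; double-quoted identifiers replace the square brackets, and the max_ord alias is my own naming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "order" is a keyword, so both column names are quoted for safety.
conn.execute('CREATE TABLE event ("order" INTEGER, "user" INTEGER)')
conn.executemany(
    "INSERT INTO event VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 3), (4, 3), (5, 2), (6, 1)],
)

# One row per user, ordered by that user's most recent event.
rows = conn.execute(
    """
    SELECT "user", MAX("order") AS max_ord
    FROM event
    GROUP BY "user"
    ORDER BY max_ord DESC
    """
).fetchall()

print(rows)
```

Because each user is reduced to a single, explicitly chosen order value before sorting, the result no longer depends on which duplicate row the engine happens to keep.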

TSQL change in query to and query

I have a one-to-many relationship table:
ReviewId | EffectId
1 | 2
1 | 5
1 | 8
2 | 2
2 | 5
2 | 9
2 | 3
3 | 3
3 | 2
3 | 9
On the site, the user selects one or more effects, and I return all the relevant reviews.
I do this with an IN query.
For example, if the user selects effects 2 and 5, my query is:
select ReviewId from table_name where EffectId in (2,5)
Now I need to get all the reviews that contain both effects, that is, all reviews that have effect 2 and effect 5.
What is the best query for this?
It is important to me that the query runs as quickly as possible.
For this I can also change the table schema if needed, for example by adding a cached field that contains all the effects separated by commas, like this:
ReviewId | cachedEffects
1 | ,2,5,8
2 | ,2,5,9,3,
3 | ,3,2,9
You can do it this way:
select reviewid
from
tbl
where effectid in (2,5)
group by reviewid
having count(distinct effectid) > 1
count (distinct effectid) is used to ensure that the results contain only those reviewIDs which have multiple records with different values of effectID. The where clause is used to filter out based on your filter condition of having both 2 and 5.
The key thing to note here is that we are grouping by reviewID, and also using the count of distinct effectID values to ensure that only those records which have both 2 and 5 are returned. If we did not do so, the query would return all rows which have effectID equal to either 2 or 5.
For improving performance, you could create an index on reviewID.
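The `> 1` check above works for exactly two selected effects. A sketch of how it might be generalized to any number of selected effects is to compare the distinct count with the number of requested ids. The example below uses SQLite from Python; the review_effects table name and the reviews_with_all helper are hypothetical names introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE review_effects (ReviewId INTEGER, EffectId INTEGER)")
conn.executemany(
    "INSERT INTO review_effects VALUES (?, ?)",
    [(1, 2), (1, 5), (1, 8),
     (2, 2), (2, 5), (2, 9), (2, 3),
     (3, 3), (3, 2), (3, 9)],
)


def reviews_with_all(connection, effect_ids):
    """Return the ReviewIds that have every one of the given EffectIds."""
    wanted = sorted(set(effect_ids))
    if not wanted:
        return []
    # One "?" placeholder per requested effect id.
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT ReviewId
        FROM review_effects
        WHERE EffectId IN ({placeholders})
        GROUP BY ReviewId
        HAVING COUNT(DISTINCT EffectId) = ?
        ORDER BY ReviewId
    """
    return [row[0] for row in connection.execute(sql, (*wanted, len(wanted)))]


print(reviews_with_all(conn, [2, 5]))
print(reviews_with_all(conn, [2, 9, 3]))
```

Since the WHERE clause only lets the requested ids through, a distinct count equal to the number of requested ids means the review has all of them.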

PostgreSQL - repeating rows from LIMIT OFFSET

I noticed some repeating rows in a paginated recordset.
When I run this query:
SELECT "students".*
FROM "students"
ORDER BY "students"."status" asc
LIMIT 3 OFFSET 0
I get:
| id | name | status |
| 1 | foo | active |
| 12 | alice | active |
| 4 | bob | active |
Next query:
SELECT "students".*
FROM "students"
ORDER BY "students"."status" asc
LIMIT 3 OFFSET 3
I get:
| id | name | status |
| 1 | foo | active |
| 6 | cindy | active |
| 2 | dylan | active |
Why does "foo" appear in both queries?
Because all rows that are returned have the same value for the status column. In that case the database is free to return the rows in any order it wants.
If you want a reproducible ordering, you need to add a second column to your ORDER BY clause to break the ties, e.g. the id column:
SELECT students.*
FROM students
ORDER BY students.status asc,
students.id asc
If two rows have the same value for the status column, they will be sorted by the id.
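The tie-breaker can be tried out with a small sketch like the one below, which uses SQLite from Python rather than PostgreSQL. The first five students come from the question; the sixth row ("erin", id 9) and the page helper are hypothetical additions made only so that two full pages exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        (1, "foo", "active"),
        (12, "alice", "active"),
        (4, "bob", "active"),
        (6, "cindy", "active"),
        (2, "dylan", "active"),
        (9, "erin", "active"),  # hypothetical extra row
    ],
)


def page(limit, offset):
    """Return the ids on one page, ordered by status and then by id."""
    sql = (
        "SELECT id FROM students "
        "ORDER BY status ASC, id ASC LIMIT ? OFFSET ?"
    )
    return [row[0] for row in conn.execute(sql, (limit, offset))]


first_page = page(3, 0)
second_page = page(3, 3)
print(first_page, second_page)
```

Because id is unique, the combined sort key is unique too, so every row has exactly one position and the pages cannot overlap.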
For more details, see the PostgreSQL documentation (http://www.postgresql.org/docs/8.3/static/queries-limit.html):
When using LIMIT, it is important to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise you will get an unpredictable subset of the query's rows. You might be asking for the tenth through twentieth rows, but tenth through twentieth in what ordering? The ordering is unknown, unless you specified ORDER BY.
The query optimizer takes LIMIT into account when generating a query plan, so you are very likely to get different plans (yielding different row orders) depending on what you give for LIMIT and OFFSET. Thus, using different LIMIT/OFFSET values to select different subsets of a query result will give inconsistent results unless you enforce a predictable result ordering with ORDER BY. This is not a bug; it is an inherent consequence of the fact that SQL does not promise to deliver the results of a query in any particular order unless ORDER BY is used to constrain the order.
select * from(
Select "students".*
from "students"
order by "students"."status" asc
limit 6
) as temp limit 3 offset 0;
select * from(
Select "students".*
from "students"
order by "students"."status" asc
limit 6
) as temp limit 3 offset 3;
Here 6 is the total number of records under examination.