Get Items with smallest vote count including 0 in PostgreSQL

I am working on a project where I need to get the 2 items with the fewest votes. I have 2 tables: an Items table and a Votes table with a foreign key, ItemId, that references Items.
I have this query:
SELECT id FROM (
SELECT "ItemId" AS id,
count("ItemId") AS total
FROM "Votes"
WHERE "ItemId" IN (
SELECT id FROM "Items"
WHERE date("Items"."createdAt") = date('2015-05-26 18:30:00.565+00')
AND "Items"."region" = 'west'
)
GROUP BY "ItemId" ORDER BY total LIMIT 2
) x;
This works in some respects, but it doesn't include the Items whose vote count is 0 (items with no rows in "Votes"). Is there a better way to do this?
Thanks. Please let me know if you need more info.
PostgreSQL: 9.4

Something like this should work:
SELECT id,
coalesce((SELECT count(*) FROM "Votes" WHERE "ItemId" = "Items".id), 0) as total
FROM "Items"
WHERE date("Items"."createdAt") = date('2015-05-26 18:30:00.565+00')
AND "Items"."region" = 'west'
ORDER BY total LIMIT 2

If an item has not been voted for, the "Votes" table has no rows for it. Because your query starts from "Votes", it never sees that item, so the item is missing from the result.
You need to select from "Items" and then LEFT JOIN to "Votes" grouped by "ItemId" and the count of votes for it. Like this, all the items will be considered, also those for which no votes have been cast. Use the coalesce() function to convert NULLs to 0:
SELECT "Items".id, coalesce(x.total, 0) AS cnt
FROM "Items"
LEFT JOIN (
SELECT "ItemId" AS id, count("ItemId") AS total
FROM "Votes"
GROUP BY "ItemId") x USING (id)
WHERE date("Items"."createdAt") = '2015-05-26'::date
AND "Items"."region" = 'west'
ORDER BY cnt
LIMIT 2;
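To see the effect of the LEFT JOIN on items without votes, here is a minimal sketch that runs the same query shape against an in-memory SQLite database through Python's sqlite3 module. The sample rows are made up, the createdAt filter is left out, and COUNT(*) is used inside the subquery; the original question targets PostgreSQL 9.4, so treat this only as an illustration of the join logic.

```python
import sqlite3

# Hypothetical sample data: item 3 has no votes at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Items" (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE "Votes" ("ItemId" INTEGER REFERENCES "Items"(id));
    INSERT INTO "Items" (id, region)
        VALUES (1, 'west'), (2, 'west'), (3, 'west'), (4, 'east');
    INSERT INTO "Votes" ("ItemId") VALUES (1), (1), (1), (2), (4);
""")

# Start from "Items" so every item appears; items without votes get
# NULL from the joined subquery, which COALESCE turns into 0.
least_voted = conn.execute("""
    SELECT "Items".id, COALESCE(x.total, 0) AS cnt
    FROM "Items"
    LEFT JOIN (
        SELECT "ItemId" AS id, COUNT(*) AS total
        FROM "Votes"
        GROUP BY "ItemId"
    ) x USING (id)
    WHERE "Items".region = 'west'
    ORDER BY cnt, "Items".id
    LIMIT 2
""").fetchall()

print(least_voted)
```

The extra "Items".id in the ORDER BY is only there to make ties deterministic in the sketch.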

Related

How to deselect duplicate entries in a query?

I've got a query like this:
SELECT *
FROM RecipeTable, RecipeIngredientTable, SyncRecipeIngredientTable
WHERE RecipeTable.recipe_id = SyncRecipeIngredientTable.recipe_id
AND RecipeIngredientTable.recipe_ingredient_id =
SyncRecipeIngredientTable.recipe_ingredient_id
AND RecipeIngredientTable.recipe_item_name in ("ayva", "pirinç", "su")
GROUP by RecipeTable.recipe_id
HAVING COUNT(*) >= 3;
and this query returns a result like this:
As you can see in the image, there are 3 duplicate, unnecessary entries (no, I can't delete them because of the multiple foreign keys). How can I exclude these duplicate entries from the query result? In the end I want to return 6 entries, not 9.
What you want to eliminate in the result set is not duplication of recipe_id values but recipe_name values.
You just need to partition by recipe_name using the ROW_NUMBER() analytic function:
SELECT recipe_id, author_name ...
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY recipe_name) AS rn,
sr.recipe_id, author_name ...
FROM SyncRecipeIngredientTable sr
JOIN RecipeIngredientTable ri
ON ri.recipe_ingredient_id = sr.recipe_ingredient_id
JOIN RecipeTable rt
ON rt.recipe_id = sr.recipe_id
WHERE ri.recipe_item_name IN ('ayva', 'pirinç', 'su')
) t
WHERE rn = 1
This way, only the row with rn = 1 is kept for each recipe_name. (An ORDER BY clause can be added to the analytic function after the PARTITION BY clause if a specific record needs to be picked.)
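A small, runnable sketch of the same idea, using Python's sqlite3 module with an in-memory database. The single recipes table and its rows are hypothetical stand-ins for the three joined tables in the question, and the example assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

# Hypothetical data: 'pilav' and 'komposto' each appear twice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipes (recipe_id INTEGER PRIMARY KEY, recipe_name TEXT);
    INSERT INTO recipes VALUES
        (1, 'pilav'), (2, 'pilav'),
        (3, 'komposto'), (4, 'komposto'),
        (5, 'dolma');
""")

# Number the rows inside each recipe_name group, then keep the first one.
# ORDER BY recipe_id inside OVER() makes the choice deterministic:
# the lowest recipe_id wins.
unique_recipes = conn.execute("""
    SELECT recipe_id, recipe_name
    FROM (
        SELECT recipe_id, recipe_name,
               ROW_NUMBER() OVER (PARTITION BY recipe_name
                                  ORDER BY recipe_id) AS rn
        FROM recipes
    ) AS numbered
    WHERE rn = 1
    ORDER BY recipe_id
""").fetchall()

print(unique_recipes)
```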

Merging two query results in a materialized view

I'm trying to merge two SELECT results into one view.
The first query returns the IDs of all registered users.
The second query goes through an entire table and counts how many victories a player has and returns the id of the player and number of wins.
What I'm trying to do now is to merge these two results, so that if the user has wins it states how many but if he doesn't then it says 0.
I tried doing it like this:
SELECT profile.user_id
FROM profile
FULL JOIN ( SELECT player_game_data.user_id,
count(player_game_data.user_id) AS wins
FROM player_game_data
WHERE player_game_data.is_winner = 1
GROUP BY player_game_data.user_id) t2 ON profile.user_id::text = t2.user_id::text;
But in the end it only returns the IDs of the players, and there isn't a count column:
What am I doing wrong?
Is this what you want?
select p.*,
(select count(*)
from player_game_data pg
where pg.user_id = p.user_id and pg.is_winner = 1
) as num_wins
from profile p;
Or, if all users have played at least one game, you can use conditional aggregation:
select pg.user_id,
count(*) filter (where pg.is_winner = 1)
from player_game_data pg
group by pg.user_id;
Or, if is_winner only takes on the values of 0 and 1:
select pg.user_id, sum(pg.is_winner)
from player_game_data pg
group by pg.user_id;
Thanks for the help Gordon. I've got it to work now.
The final query looks like this :
SELECT p.user_id,
( SELECT count(*) AS count
FROM player_game_data pg
WHERE pg.user_id::text = p.user_id::text AND pg.is_winner = 1) AS wins,
( SELECT count(*) AS count
FROM player_game_data pg
WHERE pg.user_id::text = p.user_id::text AND pg.is_winner = 0) AS losses,
( SELECT count(*) AS count
FROM player_game_data pg
WHERE pg.user_id::text = p.user_id::text) AS games_played
FROM profile p;
And when I run it I get the result that I wanted:
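For comparison, the three correlated subqueries can probably be collapsed into one LEFT JOIN with conditional aggregation. The sketch below is not the query from the thread; it is an alternative formulation, run against an in-memory SQLite database via Python's sqlite3 with made-up rows, and it uses SUM(CASE ...) rather than FILTER so it does not depend on a recent SQLite version.

```python
import sqlite3

# Hypothetical data: carol is registered but has never played.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profile (user_id TEXT PRIMARY KEY);
    CREATE TABLE player_game_data (user_id TEXT, is_winner INTEGER);
    INSERT INTO profile VALUES ('alice'), ('bob'), ('carol');
    INSERT INTO player_game_data VALUES
        ('alice', 1), ('alice', 0), ('alice', 1), ('bob', 0);
""")

# One pass over player_game_data. The LEFT JOIN keeps users without
# games; for them pg.* is NULL, so both CASE expressions fall through
# to 0 and COUNT(pg.user_id) counts nothing.
stats = conn.execute("""
    SELECT p.user_id,
           SUM(CASE WHEN pg.is_winner = 1 THEN 1 ELSE 0 END) AS wins,
           SUM(CASE WHEN pg.is_winner = 0 THEN 1 ELSE 0 END) AS losses,
           COUNT(pg.user_id) AS games_played
    FROM profile p
    LEFT JOIN player_game_data pg ON pg.user_id = p.user_id
    GROUP BY p.user_id
    ORDER BY p.user_id
""").fetchall()

print(stats)
```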

UPDATE FROM subquery using the same table in subquery's WHERE

I have 2 integer fields in a table "users": leg_count and leg_length. The first one stores the number of legs a user has and the second one stores their total length.
Each leg that belongs to a user is stored in a separate table, since a typical internet user can have anywhere from zero to infinitely many legs:
CREATE TABLE legs (
user_id int not null,
length int not null
);
I want to recalculate the statistics for all users in one query, so I try:
UPDATE users SET
leg_count = subquery.count, leg_length = subquery.length
FROM (
SELECT COUNT(*) as count, SUM(length) as length FROM legs WHERE legs.user_id = users.id
) AS subquery;
and get "subquery in FROM cannot refer to other relations of same query level" error.
So I have to do
UPDATE users SET
leg_count = (SELECT COUNT(*) FROM legs WHERE legs.user_id = users.id),
leg_length = (SELECT SUM(length) FROM legs WHERE legs.user_id = users.id)
which makes the database perform 2 SELECTs for each row, although the required data could be calculated in one SELECT:
SELECT COUNT(*), SUM(length) FROM legs;
Is it possible to optimize my UPDATE query to use only one SELECT subquery?
I use PostgreSQL, but I believe a solution exists for any SQL dialect.
TIA.
I would do:
WITH stats AS
( SELECT COUNT(*) AS cnt
, SUM(length) AS totlength
, user_id
FROM legs
GROUP BY user_id
)
UPDATE users
SET leg_count = cnt, leg_length = totlength
FROM stats
WHERE stats.user_id = users.id
You could use PostgreSQL's extended update syntax:
update users as u
set leg_count = aggr.cnt
, leg_length = aggr.length
from (
select legs.user_id
, count(*) as cnt
, sum(length) as length
from legs
group by
legs.user_id
) as aggr
where u.id = aggr.user_id
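One thing to keep in mind with the join-based updates above: a user with no rows in legs has no matching aggregate row, so that user's leg_count and leg_length are left untouched. A different option, not taken from the answers here, is the row-value form of SET, which assigns both columns from a single correlated subquery. PostgreSQL accepts this form from 9.5 onwards and SQLite from 3.15. The sketch below runs it in SQLite through Python's sqlite3; the sample rows are made up.

```python
import sqlite3

# Hypothetical data: user 3 has no legs at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        leg_count INTEGER,
        leg_length INTEGER
    );
    CREATE TABLE legs (user_id INT NOT NULL, length INT NOT NULL);
    INSERT INTO users (id) VALUES (1), (2), (3);
    INSERT INTO legs VALUES (1, 80), (1, 82), (2, 90);
""")

# One correlated subquery per user fills both columns at once.
# COALESCE turns the NULL sum of an empty set into 0.
conn.execute("""
    UPDATE users
    SET (leg_count, leg_length) = (
        SELECT COUNT(*), COALESCE(SUM(length), 0)
        FROM legs
        WHERE legs.user_id = users.id
    )
""")

rows = conn.execute(
    "SELECT id, leg_count, leg_length FROM users ORDER BY id"
).fetchall()
print(rows)
```

Whether this or the aggregate-then-join form is faster depends on the data and the planner, so it is worth checking with EXPLAIN on the real tables.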

Getting row number for query

I have a query which will return one row. Is there any way I can find the row index of the row I'm querying when the table is sorted?
I've tried rowid but got #582 when I was expecting row #7.
Eg:
CategoryID Name
I9GDS720K4 CatA
LPQTOR25XR CatB
EOQ215FT5_ CatC
K2OCS31WTM CatD
JV5FIYY4XC CatE
--> C_L7761O2U CatF <-- I want this row (#5)
OU3XC6T19K CatG
L9YKCYAYMG CatH
XKWMQ7HREG CatI
I've tried rowid with unexpected results:
SELECT rowid FROM Categories WHERE CategoryID = 'C_L7761O2U' ORDER BY Name
EDIT: I've also tried J Cooper's suggestion (below), but the row numbers just aren't right.
using (var cmd = conn.CreateCommand()) {
cmd.CommandText = string.Format(@"SELECT (SELECT COUNT(*) FROM Recipes AS t2 WHERE t2.RecipeID <= t1.RecipeID) AS row_Num
FROM Recipes AS t1
WHERE RecipeID = 'FB3XSAXRWD'
ORDER BY Name");
cmd.Parameters.AddWithValue("@recipeId", id);
idx = Convert.ToInt32(cmd.ExecuteScalar());
}
Here is a way to get the row number in SQLite:
SELECT CategoryID,
Name,
(SELECT COUNT(*)
FROM mytable AS t2
WHERE t2.Name <= t1.Name) AS row_Num
FROM mytable AS t1
ORDER BY Name, CategoryID;
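To get the position of a single row rather than numbering the whole table, the same counting subquery can be combined with a WHERE on the ID. The following is a minimal sketch using Python's sqlite3 with the category rows from the question, inserted out of order on purpose. It assumes Name values are unique; the count is 1-based, so subtract 1 if a zero-based index is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (CategoryID TEXT PRIMARY KEY, Name TEXT);
    INSERT INTO Categories VALUES
        ('XKWMQ7HREG', 'CatI'), ('C_L7761O2U', 'CatF'),
        ('I9GDS720K4', 'CatA'), ('OU3XC6T19K', 'CatG'),
        ('LPQTOR25XR', 'CatB'), ('L9YKCYAYMG', 'CatH'),
        ('EOQ215FT5_', 'CatC'), ('JV5FIYY4XC', 'CatE'),
        ('K2OCS31WTM', 'CatD');
""")

# Count how many rows sort at or before the target row by Name.
# The comparison must use the sort column (Name), not the ID.
row_num = conn.execute("""
    SELECT (SELECT COUNT(*)
            FROM Categories AS t2
            WHERE t2.Name <= t1.Name) AS row_num
    FROM Categories AS t1
    WHERE t1.CategoryID = ?
""", ("C_L7761O2U",)).fetchone()[0]

print(row_num)
```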
Here's a funny trick you can use in Spatialite to get the order of values. If you use the count() function with a WHERE clause limiting to only values >= the current value, then the count will actually give the order. So if I have a point layer called "mypoints" with columns "value" and "val_order" then:
SELECT value, (
SELECT count(*) FROM mypoints AS my
WHERE my.value>=mypoints.value) AS val_order
FROM mypoints
ORDER BY value DESC;
Gives the descending order of the values.
I can update the "val_order" column this way:
UPDATE mypoints SET val_order = (
SELECT count(*) FROM mypoints AS my
WHERE my.value>=mypoints.value
);
What you are asking can be interpreted in two different ways, but I'm assuming you want to sort the resulting table and then number those rows according to the sort.
declare @resultrow int
-- number every row first, then filter; filtering first would always yield 1
select @resultrow = rn
from (
select CategoryID, row_number() OVER (ORDER BY Name Asc) as rn
from Categories
) numbered
where CategoryID = 'C_L7761O2U'
select @resultrow

select least row per group in SQL

I am trying to select the minimum price for each condition category. I did some searching and wrote the code below. However, it shows null for the selected fields. Any solution?
SELECT Sales.Sale_ID, Sales.Sale_Price, Sales.Condition
FROM Items
LEFT JOIN Sales ON ( Items.Item_ID = Sales.Item_ID
AND Sales.Expires_DateTime > NOW( )
AND Sales.Sale_Price = (
SELECT MIN( s2.Sale_Price )
FROM Sales s2
WHERE Sales.`Condition` = s2.`Condition` ) )
WHERE Items.ISBN =9780077225957
A little more complicated solution, but one that includes your Sale_ID is below.
SELECT TOP 1 Sale_Price, Sale_ID, Condition
FROM Sales
WHERE Sale_Price IN (SELECT MIN(Sale_Price)
FROM Sales
WHERE
Expires_DateTime > NOW()
AND
Item_ID IN
(SELECT Item_ID FROM Items WHERE ISBN = 9780077225957)
GROUP BY Condition )
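The 'TOP 1' is there in case more than 1 sale had the same minimum price and you only wanted one returned. Note that TOP 1 is SQL Server syntax (the MySQL equivalent is LIMIT 1 at the end of the query) and that it limits the whole result to a single row, not one row per condition.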
(internal query taken directly from @Michael Ames answer)
If you don't need Sales.Sale_ID, this solution is simpler:
SELECT MIN(Sale_Price), Condition
FROM Sales
WHERE Expires_DateTime > NOW()
AND Item_ID IN
(SELECT Item_ID FROM Items WHERE ISBN = 9780077225957)
GROUP BY Condition
Good luck!
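If the Sale_ID is needed for the cheapest sale in every condition, one common approach is to join Sales back to the per-condition minimums. Below is a minimal sketch of that pattern using Python's sqlite3 with an in-memory database. The rows are made up, Item_ID 10 stands in for the ISBN lookup, and the Expires_DateTime filter is omitted to keep the example short, so this is an illustration of the technique rather than a drop-in MySQL query.

```python
import sqlite3

# Hypothetical data: item 10 has two 'new' and two 'used' sales.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (
        Sale_ID INTEGER PRIMARY KEY,
        Item_ID INTEGER,
        Sale_Price INTEGER,
        "Condition" TEXT
    );
    INSERT INTO Sales VALUES
        (1, 10, 25, 'new'), (2, 10, 20, 'new'),
        (3, 10, 12, 'used'), (4, 10, 15, 'used'),
        (5, 11, 5, 'used');
""")

# The derived table m holds the minimum price per condition for the
# item; joining it back to Sales recovers the matching Sale_ID.
# If two sales tie on the minimum price, both are returned.
cheapest = conn.execute("""
    SELECT s.Sale_ID, s.Sale_Price, s."Condition"
    FROM Sales AS s
    JOIN (
        SELECT "Condition", MIN(Sale_Price) AS min_price
        FROM Sales
        WHERE Item_ID = ?
        GROUP BY "Condition"
    ) AS m
      ON m."Condition" = s."Condition"
     AND m.min_price = s.Sale_Price
    WHERE s.Item_ID = ?
    ORDER BY s."Condition"
""", (10, 10)).fetchall()

print(cheapest)
```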