UPDATE FROM subquery using the same table in subquery's WHERE - sql

I have 2 integer fields in a table "users": leg_count and leg_length. The first one stores the number of legs a user has and the second one stores their total length.
Each leg that belongs to a user is stored in a separate table, since a typical internet user can have anywhere from zero to infinitely many legs:
CREATE TABLE legs (
user_id int not null,
length int not null
);
I want to recalculate the statistics for all users in one query, so I try:
UPDATE users SET
leg_count = subquery.count, leg_length = subquery.length
FROM (
SELECT COUNT(*) as count, SUM(length) as length FROM legs WHERE legs.user_id = users.id
) AS subquery;
and get "subquery in FROM cannot refer to other relations of same query level" error.
So I have to do
UPDATE users SET
leg_count = (SELECT COUNT(*) FROM legs WHERE legs.user_id = users.id),
leg_length = (SELECT SUM(length) FROM legs WHERE legs.user_id = users.id)
which makes the database perform two SELECTs for each row, although the required data could be calculated in one SELECT:
SELECT COUNT(*), SUM(length) FROM legs;
Is it possible to optimize my UPDATE query to use only one SELECT subquery?
I use PostgreSQL, but I believe a solution exists for any SQL dialect.
TIA.

I would do:
WITH stats AS
( SELECT COUNT(*) AS cnt
, SUM(length) AS totlength
, user_id
FROM legs
GROUP BY user_id
)
UPDATE users
SET leg_count = cnt, leg_length = totlength
FROM stats
WHERE stats.user_id = users.id

You could use PostgreSQL's extended update syntax:
update users as u
set leg_count = aggr.cnt
, leg_length = aggr.length
from (
select legs.user_id
, count(*) as cnt
, sum(length) as length
from legs
group by
legs.user_id
) as aggr
where u.id = aggr.user_id
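A third option, not taken from the answers above, is to keep the correlated subquery but assign both columns from it at once with a row-value SET. The sketch below runs it through Python's sqlite3 module against an in-memory SQLite database as a stand-in for PostgreSQL; the sample rows and the COALESCE default are assumptions made for illustration. As far as I know, PostgreSQL 9.5 and later accept the same SET (...) = (subquery) form.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    leg_count INTEGER NOT NULL DEFAULT 0,
    leg_length INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE legs (user_id INTEGER NOT NULL, length INTEGER NOT NULL);
-- Hypothetical sample data: user 3 has no legs at all.
INSERT INTO users (id) VALUES (1), (2), (3);
INSERT INTO legs (user_id, length) VALUES (1, 80), (1, 90), (2, 100);
""")

# One correlated subquery per user row fills both columns together.
# COALESCE turns the NULL sum of a leg-less user into 0.
conn.execute("""
UPDATE users
SET (leg_count, leg_length) = (
    SELECT COUNT(*), COALESCE(SUM(length), 0)
    FROM legs
    WHERE legs.user_id = users.id
)
""")

rows = conn.execute(
    "SELECT id, leg_count, leg_length FROM users ORDER BY id"
).fetchall()
print(rows)
```

One design difference worth noting: the UPDATE ... FROM variants above only touch users that have at least one row in legs, so a user whose legs were all deleted keeps stale statistics. The correlated form resets such users to zero.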

Related

Including count combinations with null value in SQL

I have one dataset and am trying to list every combination of its grouping values along with a count. However, I cannot figure out how to include the combinations that have no rows. For example, "Longitudinal?" can be "no" and "Cohort" can be "11-20", but Region 1 had no patients of that age, so that combination is missing from the result. How can I show a 0 for the count?
Here is the code:
SELECT "s_safe_005prod"."ig_eligi_group1"."site_name" AS "Site Name",
"s_safe_005prod"."ig_eligi_group1"."il_eligi_ellong" AS "Longitudinal?",
"s_safe_005prod"."ig_eligi_group1"."il_eligi_elcohort" AS "Cohort",
count(*) AS "count"
FROM "s_safe_005prod"."ig_eligi_group1"
GROUP BY "s_safe_005prod"."ig_eligi_group1"."site_name",
"s_safe_005prod"."ig_eligi_group1"."il_eligi_ellong",
"s_safe_005prod"."ig_eligi_group1"."il_eligi_elcohort"
ORDER BY "s_safe_005prod"."ig_eligi_group1"."site_name",
"s_safe_005prod"."ig_eligi_group1"."il_eligi_ellong" ASC,
"s_safe_005prod"."ig_eligi_group1"."il_eligi_elcohort" ASC
Create a cross join across the unique values from each of the three grouping fields to create a set of all possible combinations. Then left join that to the counts you have originally and coalesce null values to zero.
WITH groups AS
(
SELECT a.site_name, b.longitudinal, c.cohort
FROM (SELECT DISTINCT site_name FROM s_safe_005prod.ig_eligi_group1) a,
(SELECT DISTINCT il_eligi_ellong AS longitudinal FROM s_safe_005prod.ig_eligi_group1) b,
(SELECT DISTINCT il_eligi_elcohort AS cohort FROM s_safe_005prod.ig_eligi_group1) c
),
dat AS
(
SELECT site_name,
il_eligi_ellong AS longitudinal,
il_eligi_elcohort AS cohort,
count(*) AS "count"
FROM s_safe_005prod.ig_eligi_group1
GROUP BY site_name,
il_eligi_ellong,
il_eligi_elcohort
)
SELECT groups.site_name,
groups.longitudinal,
groups.cohort,
COALESCE(dat."count",0) AS "count"
FROM groups
LEFT JOIN dat ON groups.site_name = dat.site_name
AND groups.longitudinal = dat.longitudinal
AND groups.cohort = dat.cohort;
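The same cross join, left join and COALESCE pattern can be tried on a small scale. This is only a sketch: it uses Python's sqlite3 module with an in-memory database, a hypothetical two-column table named eligibility, and made-up rows in which Region 1 has nobody in the 11-20 cohort.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified stand-in for ig_eligi_group1.
CREATE TABLE eligibility (site_name TEXT, cohort TEXT);
INSERT INTO eligibility VALUES
    ('Region 1', '0-10'), ('Region 1', '0-10'),
    ('Region 2', '0-10'), ('Region 2', '11-20');
""")

query = """
WITH combos AS (
    -- Every site paired with every cohort, whether or not rows exist.
    SELECT s.site_name, c.cohort
    FROM (SELECT DISTINCT site_name FROM eligibility) AS s
    CROSS JOIN (SELECT DISTINCT cohort FROM eligibility) AS c
),
counts AS (
    SELECT site_name, cohort, COUNT(*) AS n
    FROM eligibility
    GROUP BY site_name, cohort
)
SELECT combos.site_name, combos.cohort, COALESCE(counts.n, 0) AS n
FROM combos
LEFT JOIN counts
       ON counts.site_name = combos.site_name
      AND counts.cohort = combos.cohort
ORDER BY combos.site_name, combos.cohort
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The combination ('Region 1', '11-20') appears with a count of 0 because it exists in combos but has no matching row in counts.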

Merging two query results in a materialized view

I'm trying to merge two SELECT results into one view.
The first query returns the IDs of all registered users.
The second query goes through an entire table and counts how many victories a player has and returns the id of the player and number of wins.
What I'm trying to do now is to merge these two results, so that if the user has wins it states how many but if he doesn't then it says 0.
I tried doing it like this:
SELECT profile.user_id
FROM profile
FULL JOIN ( SELECT player_game_data.user_id,
count(player_game_data.user_id) AS wins
FROM player_game_data
WHERE player_game_data.is_winner = 1
GROUP BY player_game_data.user_id) t2 ON profile.user_id::text = t2.user_id::text;
But in the end it only returns the IDs of the players, and there is no count column.
What am I doing wrong?
Is this what you want?
select p.*,
(select count(*)
from player_game_data pg
where pg.user_id = p.user_id and pg.is_winner = 1
) as num_wins
from profile p;
Or, if all users have played at least one game, you can use conditional aggregation:
select pg.user_id,
count(*) filter (where pg.is_winner = 1)
from player_game_data pg
group by pg.user_id;
Or, if is_winner only takes on the values of 0 and 1:
select pg.user_id, sum(pg.is_winner)
from player_game_data pg
group by pg.user_id;
Thanks for the help Gordon. I've got it to work now.
The final query looks like this :
SELECT p.user_id,
( SELECT count(*) AS count
FROM player_game_data pg
WHERE pg.user_id::text = p.user_id::text AND pg.is_winner = 1) AS wins,
( SELECT count(*) AS count
FROM player_game_data pg
WHERE pg.user_id::text = p.user_id::text AND pg.is_winner = 0) AS losses,
( SELECT count(*) AS count
FROM player_game_data pg
WHERE pg.user_id::text = p.user_id::text) AS games_played
FROM profile p;
And when I run it I get the result that I wanted.
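As an alternative to the three correlated subqueries in the final query, the same three numbers can probably be computed in one pass with a LEFT JOIN and conditional aggregation. The following is a minimal sketch using Python's sqlite3 module; the table contents are invented, and the ::text casts are dropped on the assumption that both user_id columns share a type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE profile (user_id TEXT PRIMARY KEY);
CREATE TABLE player_game_data (user_id TEXT NOT NULL, is_winner INTEGER NOT NULL);
-- Hypothetical data: carol is registered but has never played.
INSERT INTO profile VALUES ('alice'), ('bob'), ('carol');
INSERT INTO player_game_data VALUES
    ('alice', 1), ('alice', 0), ('alice', 1), ('bob', 0);
""")

stats = conn.execute("""
SELECT p.user_id,
       SUM(CASE WHEN pg.is_winner = 1 THEN 1 ELSE 0 END) AS wins,
       SUM(CASE WHEN pg.is_winner = 0 THEN 1 ELSE 0 END) AS losses,
       COUNT(pg.user_id) AS games_played  -- ignores the NULL row of non-players
FROM profile AS p
LEFT JOIN player_game_data AS pg ON pg.user_id = p.user_id
GROUP BY p.user_id
ORDER BY p.user_id
""").fetchall()
print(stats)
```

Because the query starts from profile and uses a LEFT JOIN, users without any games still appear, with all three counts equal to 0.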

My query is returning duplicates

I have written an SQL query to filter for a number of conditions, and have used distinct to find only unique records.
Specifically, I need only for the AccountID field to be unique, there are multiple AddressClientIDs for each AccountID.
The query works, but it still produces some duplicates.
Further caveats are:
There are multiple trans for each AccountID
There can be trans record both Y and N for an AccountID
I only want to return AccountIDs whose transactions all have statuses other than the two specified, which is why I used NOT IN.
I would like to find only unique values for the AccountID column.
If anyone could help refine the query below, it would be much appreciated.
SELECT AFS_Account.AddressClientID
,afs_transunit.AccountID
,SUM(afs_transunit.Units)
FROM AFS_TransUnit
,AFS_Account
WHERE afs_transunit.AccountID IN (
-- Gets accounts which only have non post statuses
SELECT DISTINCT accountid
FROM afs_trans
WHERE accountid NOT IN (
SELECT accountid
FROM afs_trans
WHERE STATUS IN (
'POSTPEND'
,'POSTWAIT'
)
)
-- This gets the unique accountIDs which only have transactions with Y status,
-- and removes any which have both Y and N.
AND AccountID IN (
SELECT DISTINCT accountid
FROM afs_trans
WHERE IsAllocated = 'Y'
AND accountid NOT IN (
SELECT DISTINCT AccountID
FROM afs_trans
WHERE IsAllocated = 'N'
)
)
)
AND AFS_TransUnit.AccountID = AFS_Account.AccountID
GROUP BY afs_transunit.AccountID
,AFS_Account.AddressClientID
HAVING SUM(afs_transunit.Units) > 100
Thanks.
Since you confirmed that you have a one-to-many relationship between the two tables on the AccountID column, you could take the MAX of AccountID per AddressClientID to get distinct values:
SELECT afa.AddressClientID
,MAX(aft.AccountID)
,SUM(aft.Units)
FROM AFS_TransUnit aft
INNER JOIN AFS_Account afa ON aft.AccountID = afa.AccountID
GROUP BY afa.AddressClientID
HAVING SUM(aft.Units) > 100
AND MAX(aft.AccountID) IN (
-- Gets accounts which only have non post statuses
-- This gets the unique accountIDs which only have transactions with Y status,
-- and removes any which have both Y and N.
SELECT DISTINCT accountid
FROM afs_trans a
WHERE [STATUS] NOT IN ('POSTPEND','POSTWAIT')
AND a.accountid IN (
SELECT t.accountid
FROM (
SELECT accountid
,max(isallocated) AS maxvalue
,min(isallocated) AS minvalue
FROM afs_trans
GROUP BY accountid
) t
WHERE t.maxvalue = 'Y'
AND t.minvalue = 'Y'
)
)
SELECT AFS_Account.AddressClientID
,afs_transunit.AccountID
,SUM(afs_transunit.Units)
FROM AFS_TransUnit
INNER JOIN AFS_Account ON AFS_TransUnit.AccountID = AFS_Account.AccountID
INNER JOIN afs_trans ON afs_trans.accountid = afs_transunit.accountid
WHERE afs_trans.STATUS NOT IN ('POSTPEND','POSTWAIT')
-- AND afs_trans.isallocated = 'Y'
GROUP BY afs_transunit.AccountID
,AFS_Account.AddressClientID
HAVING SUM(afs_transunit.Units) > 100
and max(afs_trans.isallocated) = 'Y'
and min(afs_trans.isallocated) = 'Y'
Modified your query with ANSI SQL join syntax. As you are joining the tables, you just need to specify the conditions without using the sub-queries you have.
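One thing to watch with the join-based version: joining afs_trans row by row to afs_transunit repeats each unit row once per transaction, which can inflate SUM(Units), and a WHERE on STATUS drops individual rows rather than whole accounts. A possible way around both, sketched below with Python's sqlite3 module, is to reduce afs_trans to one row per eligible account first and join that. The column types and sample rows are assumptions, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical minimal versions of the three tables.
CREATE TABLE afs_account (accountid INTEGER, addressclientid INTEGER);
CREATE TABLE afs_trans (accountid INTEGER, status TEXT, isallocated TEXT);
CREATE TABLE afs_transunit (accountid INTEGER, units INTEGER);
INSERT INTO afs_account VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO afs_trans VALUES
    (1, 'DONE', 'Y'), (1, 'DONE', 'Y'),      -- eligible
    (2, 'DONE', 'Y'), (2, 'POSTPEND', 'Y'),  -- has an excluded status
    (3, 'DONE', 'Y'), (3, 'DONE', 'N');      -- has both Y and N
INSERT INTO afs_transunit VALUES (1, 60), (1, 70), (2, 500), (3, 500);
""")

rows = conn.execute("""
WITH eligible AS (
    -- One row per account: only 'Y' allocations, no POSTPEND/POSTWAIT.
    SELECT accountid
    FROM afs_trans
    GROUP BY accountid
    HAVING MIN(isallocated) = 'Y'
       AND MAX(isallocated) = 'Y'
       AND SUM(CASE WHEN status IN ('POSTPEND', 'POSTWAIT') THEN 1 ELSE 0 END) = 0
)
SELECT a.addressclientid, u.accountid, SUM(u.units) AS total_units
FROM afs_transunit AS u
JOIN eligible AS e ON e.accountid = u.accountid
JOIN afs_account AS a ON a.accountid = u.accountid
GROUP BY u.accountid, a.addressclientid
HAVING SUM(u.units) > 100
""").fetchall()
print(rows)
```

Account 1 has two transactions and two unit rows. Joining afs_trans directly would count each unit row twice and report 260 instead of 130.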

Get Items with smallest vote count including 0 in postgresql

I am working on a project where I need to get the 2 items with the fewest votes. I have 2 tables: an Items table and a Votes table with a foreign key ItemId.
I have this query:
SELECT id FROM (
SELECT "ItemId" AS id,
count("ItemId") AS total
FROM "Votes"
WHERE "ItemId" IN (
SELECT id FROM "Items"
WHERE date("Items"."createdAt") = date('2015-05-26 18:30:00.565+00')
AND "Items"."region" = 'west'
)
GROUP BY "ItemId" ORDER BY total LIMIT 2
) x;
Which in some respects is fine but it doesn't include the Items with the count being null or 0. Is there a better way to do this?
Thanks. Please let me know if you need more info.
Postgresql: 9.4
Something like this should work:
SELECT id,
coalesce((SELECT count(*) FROM "Votes" WHERE "ItemId" = "Items".id), 0) as total
FROM "Items"
WHERE date("Items"."createdAt") = date('2015-05-26 18:30:00.565+00')
AND "Items"."region" = 'west'
ORDER BY total LIMIT 2
If an item has not been voted for, then the "Votes" table will not return anything for it and therefore the main query does not display the item at all.
You need to select from "Items" and then LEFT JOIN to "Votes" grouped by "ItemId" and the count of votes for it. Like this, all the items will be considered, also those for which no votes have been cast. Use the coalesce() function to convert NULLs to 0:
SELECT "Items".id, coalesce(x.total, 0) AS cnt
FROM "Items"
LEFT JOIN (
SELECT "ItemId" AS id, count("ItemId") AS total
FROM "Votes"
GROUP BY "ItemId") x USING (id)
WHERE date("Items"."createdAt") = '2015-05-26'::date
AND "Items"."region" = 'west'
ORDER BY cnt
LIMIT 2;
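A shorter variant of the same idea joins the votes directly and counts the joined column, which is 0 for items with no votes. The sketch below uses Python's sqlite3 module; the lowercase table names, the missing createdAt filter, and the sample rows are simplifications assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical simplified schema.
CREATE TABLE items (id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE votes (item_id INTEGER NOT NULL REFERENCES items (id));
INSERT INTO items VALUES (1, 'west'), (2, 'west'), (3, 'west'), (4, 'east');
INSERT INTO votes VALUES (1), (1), (1), (2);
""")

least_voted = conn.execute("""
SELECT i.id, COUNT(v.item_id) AS total  -- counts only matched vote rows
FROM items AS i
LEFT JOIN votes AS v ON v.item_id = i.id
WHERE i.region = 'west'
GROUP BY i.id
ORDER BY total, i.id
LIMIT 2
""").fetchall()
print(least_voted)
```

COUNT(v.item_id) is used rather than COUNT(*) because the LEFT JOIN still produces one row for an item without votes, and COUNT(*) would report 1 for it.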

Why does this mysql query give me garbage results?

I'm trying to get the total number of points a user has, as well as the current month's points. When a user gets a point, it is logged in the points table with a timestamp. The total ignores the timestamp, while the current month's total only counts points whose timestamp is on or after the first day of the month.
SELECT user_id, user_name, sum(tpoints.point_points) as total_points, sum(mpoints.point_points) as month_points
FROM users
LEFT JOIN points tpoints
ON users.user_id = tpoints.point_userid
LEFT JOIN points mpoints
ON (users.user_id = mpoints.point_userid AND mpoints.point_date > '$this_month')
WHERE user_id = 1
GROUP BY user_id
points table structure
CREATE TABLE IF NOT EXISTS `points` (
`point_userid` int(11) NOT NULL,
`point_points` int(11) NOT NULL,
`point_date` int(11) NOT NULL,
KEY `point_userid` (`point_userid`),
KEY `point_date` (`point_date`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
This results in a very large number that's equal to the sum of all points multiplied by the number of rows that match the query.
I need to achieve this without the use of subqueries or multiple queries.
Try:
SELECT user_id, user_name, sum(point_points) as total_points, sum( case when point_date > '$this_month' then point_points else 0 end ) as month_points
FROM users
LEFT JOIN points
ON users.user_id = points.point_userid
WHERE user_id = 1
GROUP BY user_id, user_name
SELECT user_id, user_name,
(
SELECT SUM(points.point_points)
FROM points
WHERE points.point_userid = users.user_id
) AS total_points,
(
SELECT SUM(points.point_points)
FROM points
WHERE points.point_userid = users.user_id
AND points.point_date > '$this_month'
) AS month_points
FROM users
WHERE user_id = 1
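To see why the double join in the question inflates both sums, and how a single join with a CASE expression avoids it, here is a small sketch using Python's sqlite3 module. The sample rows and the threshold value 150 (standing in for '$this_month') are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT);
CREATE TABLE points (point_userid INTEGER, point_points INTEGER, point_date INTEGER);
INSERT INTO users VALUES (1, 'ann');
-- Three point rows; two of them are newer than the threshold 150.
INSERT INTO points VALUES (1, 5, 100), (1, 7, 200), (1, 3, 300);
""")
this_month = 150  # hypothetical stand-in for '$this_month'

# Joining points twice pairs every total row with every monthly row
# (3 x 2 = 6 rows), so each sum is multiplied by the other side's row count.
inflated = conn.execute("""
SELECT SUM(t.point_points), SUM(m.point_points)
FROM users AS u
LEFT JOIN points AS t ON u.user_id = t.point_userid
LEFT JOIN points AS m ON u.user_id = m.point_userid AND m.point_date > ?
WHERE u.user_id = 1
GROUP BY u.user_id
""", (this_month,)).fetchone()

# A single join keeps one row per point; CASE selects the monthly ones.
correct = conn.execute("""
SELECT SUM(p.point_points),
       SUM(CASE WHEN p.point_date > ? THEN p.point_points ELSE 0 END)
FROM users AS u
LEFT JOIN points AS p ON u.user_id = p.point_userid
WHERE u.user_id = 1
GROUP BY u.user_id
""", (this_month,)).fetchone()

print(inflated, correct)
```

Passing the threshold as a bound parameter instead of interpolating it into the SQL string is a choice made for this sketch, not something the original queries do.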