Get "latest" row after GROUP BY over multiple tables - sql

I'd prefer to use the first query listed below and just group by stories.id, but I get the following error:
ERROR: column "u.first_name" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT "s".*, "u"."first_name", "u"."last_name", ("i"."filen...
The second query works but does not group by stories.id and generates the wrong results. Is it possible to select from multiple tables and not group by all of them?
The table panels also has a column updated_at. I would like to get the newest file per story according to panels.updated_at.
SELECT
"s".*,
"u"."first_name",
"u"."last_name",
("i"."filename" || '.' || "i"."extension") AS "file"
FROM
"stories" "s"
LEFT JOIN "panels" "p" ON("p"."story_id" = "s"."id")
LEFT JOIN "users" "u" ON("s"."user_id" = "u"."uid")
LEFT JOIN "images" "i" ON ("p"."image_id" = "i"."id")
WHERE
"s"."complete" = false AND
"s"."created_by" = 205700489
GROUP BY
"s"."id"
ORDER BY
"s"."created_at" DESC
SELECT
"s".*,
"u"."first_name",
"u"."last_name",
("i"."filename" || '.' || "i"."extension") AS "file"
FROM
"stories" "s"
LEFT JOIN "panels" "p" ON("p"."story_id" = "s"."id")
LEFT JOIN "users" "u" ON("s"."user_id" = "u"."uid")
LEFT JOIN "images" "i" ON ("p"."image_id" = "i"."id")
WHERE
"s"."complete" = false AND
"s"."created_by" = 205700489
GROUP BY
"s"."id",
"u"."first_name",
"u"."last_name", "i"."filename",
"i"."extension"
ORDER BY
"s"."created_at" DESC

Updated after clarification of the question:
SELECT DISTINCT ON (s.created_at, s.id)
s.*
,u.first_name
,u.last_name
,concat_ws('.', i.filename, i.extension) AS file
FROM stories s
LEFT JOIN users u ON u.uid = s.user_id
LEFT JOIN panels p ON p.story_id = s.id
LEFT JOIN images i ON i.id = p.image_id
WHERE s.complete = false
AND s.created_by = 205700489
ORDER BY s.created_at DESC, s.id, p.updated_at DESC;
Grouping by primary key requires PostgreSQL 9.1 or later.
I use concat_ws(), because I don't know which columns might be NULL. If both i.filename and i.extension are defined NOT NULL, you can simplify.
The additional ORDER BY item p.updated_at DESC ensures that the "newest" file is picked per story. The query technique is explained in full under this related question:
Select first row in each GROUP BY group?
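To make the row selection concrete, here is a minimal Python sketch of what DISTINCT ON does with the ORDER BY above: sort the joined rows, then keep the first row of each (created_at, id) group. The sample rows and the helper name distinct_on_story are hypothetical, and integers stand in for timestamps; this only emulates the semantics and is not how PostgreSQL executes the query.

```python
from itertools import groupby

# Hypothetical joined rows: (story_id, story_created_at, file, panel_updated_at).
# Integers stand in for timestamps to keep the sketch short.
rows = [
    (1, 100, "a.png", 10),
    (1, 100, "b.png", 30),
    (1, 100, "c.png", 20),
    (2, 200, "d.png", 5),
    (2, 200, "e.png", 7),
]

def distinct_on_story(joined_rows):
    # Emulates: ORDER BY s.created_at DESC, s.id, p.updated_at DESC
    ordered = sorted(joined_rows, key=lambda r: (-r[1], r[0], -r[3]))
    # Emulates: DISTINCT ON (s.created_at, s.id), i.e. keep the first row per group
    return [next(group) for _, group in groupby(ordered, key=lambda r: (r[1], r[0]))]

newest_per_story = distinct_on_story(rows)
print(newest_per_story)
```

Because p.updated_at DESC is the last sort key, the first row in each group is the one with the most recently updated panel.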

You can write something like:
SELECT
"s".*,
(SELECT "u"."first_name"
FROM "users" "u"
WHERE "s"."user_id" = "u"."uid"
LIMIT 1) ,
(SELECT "u"."last_name"
FROM "users" "u"
WHERE "s"."user_id" = "u"."uid"
LIMIT 1),
(SELECT "i"."filename" || '.' || "i"."extension"
FROM "panels" "p"
JOIN "images" "i" ON ("p"."image_id" = "i"."id")
WHERE "p"."story_id" = "s"."id"
LIMIT 1) AS "file"
FROM
"stories" "s"
WHERE
"s"."complete" = false AND
"s"."created_by" = 205700489
ORDER BY
"s"."created_at" DESC
It will get only 1 record from "users" and "panels" JOIN "images" per record in "stories" .
Add ORDER BY, extra WHERE or some aggregates to get what you need from "users" and "panels" JOIN "images"
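As a rough illustration of adding ORDER BY to the subquery, the sketch below runs the same idea against an in-memory SQLite database from Python. The table contents are made up, the schema is reduced to the columns the subquery needs, and SQLite stands in for PostgreSQL here, so treat it as a sketch under those assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stories (id INTEGER PRIMARY KEY, created_at TEXT);
    CREATE TABLE images  (id INTEGER PRIMARY KEY, filename TEXT, extension TEXT);
    CREATE TABLE panels  (id INTEGER PRIMARY KEY, story_id INTEGER,
                          image_id INTEGER, updated_at TEXT);
    INSERT INTO stories VALUES (1, '2012-01-01'), (2, '2012-02-01'), (3, '2012-03-01');
    INSERT INTO images  VALUES (1, 'old', 'png'), (2, 'new', 'png'), (3, 'only', 'jpg');
    INSERT INTO panels  VALUES (1, 1, 1, '2012-01-05'),
                               (2, 1, 2, '2012-01-09'),
                               (3, 2, 3, '2012-02-02');
""")

newest_files = conn.execute("""
    SELECT s.id,
           (SELECT i.filename || '.' || i.extension
            FROM panels p
            JOIN images i ON p.image_id = i.id
            WHERE p.story_id = s.id
            ORDER BY p.updated_at DESC  -- newest panel first
            LIMIT 1) AS file
    FROM stories s
    ORDER BY s.created_at DESC
""").fetchall()
print(newest_files)
```

A story without panels still appears, with NULL (None in Python) as its file, which matches what the LEFT JOIN versions return.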
Update: You can also use something like this:
SELECT *
FROM (
SELECT DISTINCT ON ("s"."id")
"s".*,
"u"."first_name",
"u"."last_name",
("i"."filename" || '.' || "i"."extension") AS "file"
FROM
"stories" "s"
LEFT JOIN "panels" "p" ON("p"."story_id" = "s"."id")
LEFT JOIN "users" "u" ON("s"."user_id" = "u"."uid")
LEFT JOIN "images" "i" ON ("p"."image_id" = "i"."id")
WHERE
"s"."complete" = false AND
"s"."created_by" = 205700489
ORDER BY
"s"."id"
) t ORDER BY "t"."created_at" DESC
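It will leave only one row for every distinct "s"."id". Without an additional ORDER BY item such as "p"."updated_at" DESC in the inner query, which row is kept for each story is arbitrary.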

Related

Unsure why ORA-00918 column ambiguously defined is appearing

I'm unsure as to why I'm getting the ORA-00918 error message appearing when I type in the following code.
I can't see which column is ambiguously defined.
What I want to do is create a table that pulls in the b.site_code value where the work_header_no, work_version_no and site_number values in queries A and B match.
Code is below
SELECT
a.statement.statement_date,
a.sw_header.organise_code,
a.organisation.organise_name,
a.PermitRef,
a.actual_inspection.logged_time,
a.insp_category.insp_category_name,
a.actual_inspection.insp_number,
a.actual_inspection.site_number,
a.inspection_outcome.insp_outcome_name,
a.insp_category.insp_charge,
a.actual_inspection.insp_notes,
a.actual_inspection.work_header_no,
a.actual_inspection.insp_time,
b.site_code
FROM
(select
statement.statement_date,
sw_header.organise_code,
organisation.organise_name,
CAST(
organisation.external_ref_2 ||''||
sw_header.works_ref||'.'||
sw_notice_header.app_seq_no||'.'||
sw_notice_header.ext_version_no
as VARCHAR (40)) as PermitRef,
actual_inspection.logged_time,
insp_category.insp_category_name,
actual_inspection.insp_number,
actual_inspection.site_number,
inspection_outcome.insp_outcome_name,
insp_category.insp_charge,
actual_inspection.insp_notes,
actual_inspection.work_header_no,
actual_inspection.insp_time,
sw_notice_header.work_header_no,
sw_notice_header.work_version_no,
actual_inspection.site_number
from
actual_inspection
inner join sw_header on
actual_inspection.work_header_no = sw_header.work_header_no
inner join sw_notice_header on
sw_header.work_header_no = sw_notice_header.work_header_no
and sw_header.work_version_no = sw_notice_header.work_version_no
inner join insp_category on
actual_inspection.insp_category_code = insp_category.insp_category_code
inner join inspection_outcome on
actual_inspection.insp_outcome_code = inspection_outcome.insp_outcome_code
inner join organisation on
sw_header.organise_code = organisation.organise_code
inner join statement on
organisation.organise_code = statement.organise_code
and organisation.statement_number = statement.statement_no
where
actual_inspection.notice_type_code = '2600' and
actual_inspection.insp_outcome_code != 'O40'
order by
actual_inspection.logged_time)
a
JOIN
(
select
sns.work_header_no,
sns.work_version_no,
sns.site_number,
sns.site_code
from
sw_notice_site sns
)
b
ON a.work_header_no = b.work_header_no and
a.work_version_no = b.work_version_no and
a.site_number = b.site_number
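Subquery A selects actual_inspection.site_number twice, and selects work_header_no from both actual_inspection and sw_notice_header, so a.site_number and a.work_header_no are ambiguous in the outer query. Remove the duplicate columns from this list: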
actual_inspection.site_number,
inspection_outcome.insp_outcome_name,
insp_category.insp_charge,
actual_inspection.insp_notes,
actual_inspection.work_header_no,
actual_inspection.insp_time,
sw_notice_header.work_header_no,
sw_notice_header.work_version_no,
actual_inspection.site_number
There is also no need to write a.statement.statement_date. Inside the outer query the column is simply a.statement_date.
Change all the other a. columns in the same way.
Your whole query should look like this:
SELECT -- removed table names from all the columns
A.STATEMENT_DATE,
A.ORGANISE_CODE,
A.ORGANISE_NAME,
A.PERMITREF,
A.LOGGED_TIME,
A.INSP_CATEGORY_NAME,
A.INSP_NUMBER,
A.SITE_NUMBER,
A.INSP_OUTCOME_NAME,
A.INSP_CHARGE,
A.INSP_NOTES,
A.WORK_HEADER_NO,
A.INSP_TIME,
B.SITE_CODE
FROM
(
SELECT
STATEMENT.STATEMENT_DATE,
SW_HEADER.ORGANISE_CODE,
ORGANISATION.ORGANISE_NAME,
CAST(ORGANISATION.EXTERNAL_REF_2
|| ''
|| SW_HEADER.WORKS_REF
|| '.'
|| SW_NOTICE_HEADER.APP_SEQ_NO
|| '.'
|| SW_NOTICE_HEADER.EXT_VERSION_NO AS VARCHAR(40)) AS PERMITREF,
ACTUAL_INSPECTION.LOGGED_TIME,
INSP_CATEGORY.INSP_CATEGORY_NAME,
ACTUAL_INSPECTION.INSP_NUMBER,
--ACTUAL_INSPECTION.SITE_NUMBER, -- commented this as it is there in statement twice
INSPECTION_OUTCOME.INSP_OUTCOME_NAME,
INSP_CATEGORY.INSP_CHARGE,
ACTUAL_INSPECTION.INSP_NOTES,
ACTUAL_INSPECTION.WORK_HEADER_NO,
ACTUAL_INSPECTION.INSP_TIME,
--SW_NOTICE_HEADER.WORK_HEADER_NO, -- commented this as it is there in statement twice
SW_NOTICE_HEADER.WORK_VERSION_NO,
ACTUAL_INSPECTION.SITE_NUMBER
FROM
ACTUAL_INSPECTION
INNER JOIN SW_HEADER ON ACTUAL_INSPECTION.WORK_HEADER_NO = SW_HEADER.WORK_HEADER_NO
INNER JOIN SW_NOTICE_HEADER ON SW_HEADER.WORK_HEADER_NO = SW_NOTICE_HEADER.WORK_HEADER_NO
AND SW_HEADER.WORK_VERSION_NO = SW_NOTICE_HEADER.WORK_VERSION_NO
INNER JOIN INSP_CATEGORY ON ACTUAL_INSPECTION.INSP_CATEGORY_CODE = INSP_CATEGORY.INSP_CATEGORY_CODE
INNER JOIN INSPECTION_OUTCOME ON ACTUAL_INSPECTION.INSP_OUTCOME_CODE = INSPECTION_OUTCOME.INSP_OUTCOME_CODE
INNER JOIN ORGANISATION ON SW_HEADER.ORGANISE_CODE = ORGANISATION.ORGANISE_CODE
INNER JOIN STATEMENT ON ORGANISATION.ORGANISE_CODE = STATEMENT.ORGANISE_CODE
AND ORGANISATION.STATEMENT_NUMBER = STATEMENT.STATEMENT_NO
WHERE
ACTUAL_INSPECTION.NOTICE_TYPE_CODE = '2600'
AND ACTUAL_INSPECTION.INSP_OUTCOME_CODE != 'O40'
ORDER BY
ACTUAL_INSPECTION.LOGGED_TIME
) A
JOIN (
SELECT
SNS.WORK_HEADER_NO,
SNS.WORK_VERSION_NO,
SNS.SITE_NUMBER,
SNS.SITE_CODE
FROM
SW_NOTICE_SITE SNS
) B ON A.WORK_HEADER_NO = B.WORK_HEADER_NO
AND A.WORK_VERSION_NO = B.WORK_VERSION_NO
AND A.SITE_NUMBER = B.SITE_NUMBER;
Cheers!!
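A quick way to spot this kind of problem in a long select list is to compare the names each item exposes to the outer query. The following Python sketch does that for the columns quoted above; the helper names output_name and duplicate_columns are hypothetical, and the parsing is deliberately naive (it only handles table.column and a trailing AS alias).

```python
import re
from collections import Counter

def output_name(expr):
    """Return the column name a select-list item exposes to the outer query."""
    alias = re.search(r"\s+as\s+(\w+)\s*$", expr, re.IGNORECASE)
    if alias:
        return alias.group(1).lower()
    # No alias: the exposed name is the part after the last dot.
    return expr.strip().split(".")[-1].lower()

def duplicate_columns(select_list):
    """List the exposed names that occur more than once."""
    counts = Counter(output_name(e) for e in select_list)
    return sorted(name for name, n in counts.items() if n > 1)

inner_select = [
    "actual_inspection.site_number",
    "inspection_outcome.insp_outcome_name",
    "insp_category.insp_charge",
    "actual_inspection.insp_notes",
    "actual_inspection.work_header_no",
    "actual_inspection.insp_time",
    "sw_notice_header.work_header_no",
    "sw_notice_header.work_version_no",
    "actual_inspection.site_number",
]

duplicates = duplicate_columns(inner_select)
print(duplicates)
```

Any name reported here is exposed twice by the inline view, which is what makes a reference such as a.site_number ambiguous.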

Count left join and use it in where

I currently have the following sql (Generated by typeorm).
SELECT
"media"."permissionOwner" AS "media_permissionOwner",
"media"."permissionGroup" AS "media_permissionGroup",
"media"."permissionOther" AS "media_permissionOther",
"media"."id" AS "media_id",
"media"."filename" AS "media_filename",
"media"."mime" AS "media_mime",
"media"."description" AS "media_description",
"media"."length" AS "media_length",
(
SELECT
1
FROM
"user" "user"
WHERE
"user"."id" = $1 LIMIT 1
)
AS "user"
FROM
"media_object" "media"
LEFT JOIN
"media_object_groups_group" "media_groups"
ON "media_groups"."mediaObjectId" = "media"."id"
LEFT JOIN
"group" "groups"
ON "groups"."id" = "media_groups"."groupId"
LEFT JOIN
"media_object_owner_user" "media_owner"
ON "media_owner"."mediaObjectId" = "media"."id"
LEFT JOIN
"user" "owner"
ON "owner"."id" = "media_owner"."userId"
WHERE
(
"media"."permissionGroup" >= $2
AND
COUNT("groups") = 0 -- How can I accomplish this
)
OR
(
"media"."permissionGroup" >= $3
AND groups # > user_groups
)
OR
(
owner # > ARRAY[user]
AND "media"."permissionOwner" >= $4
)
OR "media"."permissionOther" >= $5 -- PARAMETERS: [23,4,4,4,4]
The problem is, I can't use HAVING, because it doesn't seem to support this kind of combined condition the way WHERE does.
I tried a subquery for the COUNT part, but then I can't access the "groups" alias inside the sub-select count query.
I am not even sure if the other part of the query works, but first I want to solve the "count" issue.

How to find records that have any duplicate data using Active Record

How to find records with duplicate values in any column using Activerecord or SQL?
SELECT leads.id, leads.name, leads.email, leads.created_at, array_agg(tn2.id) as ids
FROM "leads" join leads tn2
on leads.name = tn2.name
or leads.cpf_cnpj = tn2.cpf_cnpj
or leads.email = tn2.email
or leads.phone -> 'cellphone' = tn2.phone -> 'cellphone'
or leads.phone -> 'residence' = tn2.phone -> 'residence'
or leads.phone -> 'commercial' = tn2.phone -> 'commercial'
GROUP BY leads.id ORDER BY leads.created_at DESC
Using array_agg I want only the ids of the duplicated records, but it gives me the ids of all records.
Currently, I'm using PostgreSQL.
How to find records with duplicate values in any column?
SELECT l.id, l.name, l.email, l.created_at
FROM leads l
WHERE EXISTS (
SELECT 1
FROM leads
WHERE id <> l.id
AND (
name = l.name
OR cpf_cnpj = l.cpf_cnpj
OR email = l.email
OR phone->'cellphone' = l.phone->'cellphone'
OR phone->'residence' = l.phone->'residence'
OR phone->'commercial' = l.phone->'commercial'
)
);
But it seems like you want something different:
How to get an array of IDs for each row from rows that have the same value in at least one of several given columns, youngest entry first?
SELECT l.id, l.name, l.email, l.created_at
, array_agg(l2.id ORDER BY l2.created_at DESC NULLS LAST) AS dupe_ids
FROM leads l
JOIN leads l2 ON l2.id <> l.id
AND (
l2.name = l.name
OR l2.cpf_cnpj = l.cpf_cnpj
OR l2.email = l.email
OR l2.phone->'cellphone' = l.phone->'cellphone'
OR l2.phone->'residence' = l.phone->'residence'
OR l2.phone->'commercial' = l.phone->'commercial'
)
GROUP BY l.id
ORDER BY l.created_at DESC NULLS LAST;
Assuming id is the primary key.
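The self-join can be tried out on a small data set. The sketch below uses an in-memory SQLite database from Python, reduces the comparison to two hypothetical columns (name and email), and builds the per-row id lists in Python because SQLite has no array_agg; it is an illustration of the join condition, not a replacement for the PostgreSQL query.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leads (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    INSERT INTO leads VALUES
        (1, 'Ann',  'ann@example.com'),
        (2, 'Bob',  'bob@example.com'),
        (3, 'Ann',  'other@example.com'),
        (4, 'Carl', 'bob@example.com'),
        (5, 'Dee',  NULL),
        (6, 'Eve',  NULL);
""")

# Pair every lead with every *other* lead that shares a name or an email.
pairs = conn.execute("""
    SELECT l.id, l2.id
    FROM leads l
    JOIN leads l2 ON l2.id <> l.id
                 AND (l2.name = l.name OR l2.email = l.email)
    ORDER BY l.id, l2.id
""").fetchall()

# Stand-in for array_agg(l2.id) ... GROUP BY l.id
dupe_ids = defaultdict(list)
for lead_id, other_id in pairs:
    dupe_ids[lead_id].append(other_id)
dupe_ids = dict(dupe_ids)
print(dupe_ids)
```

The condition l2.id <> l.id keeps a row from matching itself, which is why the original query returned ids for all records. Rows 5 and 6 do not match each other because NULL = NULL is not true.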

Column is invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause

Ok here's my View (vw_LiftEquip)
SELECT dbo.tbl_equip_swl_unit.unit_id,
dbo.tbl_equip_swl_unit.unit_name,
dbo.tbl_equip_swl_unit.archived,
dbo.tbl_categories.category_id,
dbo.tbl_categories.categoryName,
dbo.tbl_categories.parentCategory,
dbo.tbl_categories.sub_category,
dbo.tbl_categories.desc_category,
dbo.tbl_categories.description,
dbo.tbl_categories.miscellaneous,
dbo.tbl_categories.category_archived,
dbo.tbl_equip_swl_unit.unit_name AS Expr1,
dbo.tbl_categories.categoryName AS Expr2,
dbo.tbl_categories.description AS Expr3,
dbo.tbl_equip_depts.dept_name,
dbo.tbl_equip_man.man_name,
dbo.tbl_Lifting_Gear.e_defects AS Expr7,
dbo.tbl_Lifting_Gear.e_defects_desc AS Expr8,
dbo.tbl_Lifting_Gear.e_defects_date AS Expr9,
dbo.tbl_equipment.equipment_id,
dbo.tbl_equipment.e_contract_no,
dbo.tbl_equipment.slID,
dbo.tbl_equipment.e_entered_by,
dbo.tbl_equipment.e_serial,
dbo.tbl_equipment.e_model,
dbo.tbl_equipment.e_description,
dbo.tbl_equipment.e_location_id,
dbo.tbl_equipment.e_owner_id,
dbo.tbl_equipment.e_department_id,
dbo.tbl_equipment.e_manafacture_id,
dbo.tbl_equipment.e_manDate1,
dbo.tbl_equipment.e_manDate2,
dbo.tbl_equipment.e_manDate3,
dbo.tbl_equipment.e_dimensions,
dbo.tbl_equipment.e_test_no,
dbo.tbl_equipment.e_firstDate1,
dbo.tbl_equipment.e_firstDate2,
dbo.tbl_equipment.e_firstDate3,
dbo.tbl_equipment.e_prevDate1,
dbo.tbl_equipment.e_prevDate2,
dbo.tbl_equipment.e_prevDate3,
dbo.tbl_equipment.e_insp_frequency,
dbo.tbl_equipment.e_swl,
dbo.tbl_equipment.e_swl_unit_id,
dbo.tbl_equipment.e_swl_notes,
dbo.tbl_equipment.e_cat_id,
dbo.tbl_equipment.e_sub_id,
dbo.tbl_equipment.e_parent_id,
dbo.tbl_equipment.e_last_inspector,
dbo.tbl_equipment.e_last_company,
dbo.tbl_equipment.e_deleted AS Expr11,
dbo.tbl_equipment.e_deleted_desc AS Expr12,
dbo.tbl_equipment.e_deleted_date AS Expr13,
dbo.tbl_equipment.e_deleted_insp AS Expr14,
dbo.tbl_Lifting_Gear.e_defects_action AS Expr15,
dbo.tbl_equipment.e_rig_location,
dbo.tbl_Lifting_Gear.e_add_type AS Expr17,
dbo.tbl_Lifting_Gear.con_id,
dbo.tbl_Lifting_Gear.lifting_date,
dbo.tbl_Lifting_Gear.lifting_ref_no,
dbo.tbl_Lifting_Gear.e_id,
dbo.tbl_Lifting_Gear.inspector_id,
dbo.tbl_Lifting_Gear.lift_testCert,
dbo.tbl_Lifting_Gear.lift_rig_location,
dbo.tbl_Lifting_Gear.inspected,
dbo.tbl_Lifting_Gear.lifting_through,
dbo.tbl_Lifting_Gear.liftingNDT,
dbo.tbl_Lifting_Gear.liftingTest,
dbo.tbl_Lifting_Gear.e_defects,
dbo.tbl_Lifting_Gear.e_defects_desc,
dbo.tbl_Lifting_Gear.e_defects_date,
dbo.tbl_Lifting_Gear.e_defects_action,
dbo.tbl_Lifting_Gear.lift_department_id,
dbo.tbl_Lifting_Gear.lifting_loc
FROM dbo.tbl_equipment
INNER JOIN dbo.tbl_equip_swl_unit
ON dbo.tbl_equipment.e_swl_unit_id = dbo.tbl_equip_swl_unit.unit_id
INNER JOIN dbo.tbl_categories
ON dbo.tbl_equipment.e_cat_id = dbo.tbl_categories.category_id
INNER JOIN dbo.tbl_equip_depts
ON dbo.tbl_equipment.e_department_id = dbo.tbl_equip_depts.dept_id
INNER JOIN dbo.tbl_equip_man
ON dbo.tbl_equipment.e_manafacture_id = dbo.tbl_equip_man.man_id
INNER JOIN dbo.vwSubCategory
ON dbo.tbl_equipment.e_sub_id = dbo.vwSubCategory.category_id
INNER JOIN dbo.vwDescCategory
ON dbo.tbl_equipment.e_cat_id = dbo.vwDescCategory.category_id
INNER JOIN dbo.tbl_Lifting_Gear
ON dbo.tbl_equipment.equipment_id = dbo.tbl_Lifting_Gear.e_id
And here's the select statement with subquery that I am using:
SELECT *
FROM vw_LiftEquip
WHERE lifting_loc = ? AND
con_id = ? AND
EXPR11 =
'N'(
SELECT MAX(lifting_date) AS maxLift
FROM vw_LiftEquip
WHERE e_id = equipment_id
)
ORDER BY lifting_ref_no,
category_id,
e_swl,
e_serial
I get the error :
Column "vw_LiftEquip.category_id" is invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause.
I can't see why it's returning that error. This is admittedly the first time I've run a subquery on such a complex view, and I am a bit lost. Thanks in advance for any help. I have looked through the similar posts and can find no answers to this one; sorry if I am just being dumb.
You are missing AND between EXPR11 = 'N' and (SELECT MAX(...
Otherwise, it looks OK. MAX without GROUP BY is allowed if you have no other columns in the SELECT
Update: @hvd also noted that you have nothing to compare to MAX(lifting_date). See comment
Update 2,
SELECT *
FROM vw_LiftEquip v1
CROSS JOIN
(
SELECT MAX(lifting_date) AS maxLift
FROM vw_LiftEquip
WHERE e_id = equipment_id
) v2
WHERE v1.lifting_loc = ? AND
v1.con_id = ? AND
v1.EXPR11 = 'N'
ORDER BY v1.lifting_ref_no,
v1.category_id,
v1.e_swl,
v1.e_serial
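The remark about having nothing to compare MAX(lifting_date) with suggests the filter was meant to keep only the latest lift per piece of equipment. A minimal sketch of that reading, assuming the intended condition is lifting_date = (correlated subquery), run on a made-up three-column table in SQLite from Python (the table name lifts and its data are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lifts (e_id INTEGER, lifting_date TEXT, deleted TEXT);
    INSERT INTO lifts VALUES
        (1, '2020-01-01', 'N'),
        (1, '2021-06-01', 'N'),
        (2, '2019-03-01', 'N'),
        (2, '2018-03-01', 'N'),
        (3, '2022-01-01', 'Y');
""")

latest_lifts = conn.execute("""
    SELECT v1.e_id, v1.lifting_date
    FROM lifts v1
    WHERE v1.deleted = 'N'
      AND v1.lifting_date = (SELECT MAX(v2.lifting_date)  -- latest lift ...
                             FROM lifts v2
                             WHERE v2.e_id = v1.e_id)     -- ... for the same equipment
    ORDER BY v1.e_id
""").fetchall()
print(latest_lifts)
```

The inner query uses its own alias (v2) so that the correlation v2.e_id = v1.e_id compares the inner row with the outer row rather than two columns of the same row.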

Get num of ROWS in other table

I have two tables, EXERCISE and EXERCISEUSER. I need to list all exercise entries and add a field to the query that tells me whether that exercise exists in the table EXERCISEUSER. In other words, I need to know if the user did that exercise. If so, it will have a row in EXERCISEUSER.
My current query is:
SELECT
"E".*,
"T"."NAME" AS "LEVEL"
FROM
"EXERCISE" AS "E"
INNER JOIN
"EXERCISETYPE" AS "T"
ON
E.STO_FK_EXERCISETYPEEXERCISE = T.PK_EXERCISETYPE
INNER JOIN
"LEVEL" AS "L"
ON
L.PK_LEVEL = E.STO_FK_LEVELEXERCISE
WHERE
(
E.STATUS = 1)
AND (
L.STATUS = 1)
AND (
L.PK_LEVEL = 5)
ORDER BY
"T"."ORDER" ASC
I will provide PK_USER too.
Thanks!
Well, I used a subquery and got the result I wanted.
SELECT
"E".*,
"T"."NAME" AS "LEVEL",
( SELECT COUNT(*) FROM STOUSER.EXERCISEUSER AS EU WHERE EU.STO_FK_EXERCISEEXERCISEUSER = E.PK_EXERCISE AND EU.STO_FK_USEREXERCISEUSER = 5978 ) AS MAKE_EXER_NUM
FROM
"STOUSER"."EXERCISE" AS "E"
INNER JOIN
"STOUSER"."EXERCISETYPE" AS "T"
ON
E.STO_FK_EXERCISETYPEEXERCISE = T.PK_EXERCISETYPE
INNER JOIN
"STOUSER"."LEVEL" AS "L"
ON
L.PK_LEVEL = E.STO_FK_LEVELEXERCISE
WHERE
(
E.STATUS = 1)
AND (
L.STATUS = 1)
AND (
L.PK_LEVEL = 5)
ORDER BY
"T"."ORDER" ASC
Thanks!
I think this should be done with a LEFT OUTER JOIN.
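A minimal sketch of the LEFT OUTER JOIN variant, run against an in-memory SQLite database from Python. The shortened column names (pk_exercise, fk_exercise, fk_user) and the sample rows are hypothetical; only the user id 5978 is taken from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE exercise (pk_exercise INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE exerciseuser (fk_exercise INTEGER, fk_user INTEGER);
    INSERT INTO exercise VALUES (1, 'squat'), (2, 'plank'), (3, 'lunge');
    INSERT INTO exerciseuser VALUES (1, 5978), (1, 5978), (2, 1234);
""")

exercise_counts = conn.execute("""
    SELECT e.pk_exercise, e.name, COUNT(eu.fk_exercise) AS make_exer_num
    FROM exercise e
    LEFT OUTER JOIN exerciseuser eu
           ON eu.fk_exercise = e.pk_exercise
          AND eu.fk_user = ?          -- filter on the user inside ON, not WHERE
    GROUP BY e.pk_exercise, e.name
    ORDER BY e.pk_exercise
""", (5978,)).fetchall()
print(exercise_counts)
```

The user filter sits in the ON clause so that exercises the user never did are still listed with a count of 0. Moving it to WHERE would turn the outer join into an inner join and drop those rows.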