Converting Nested SQL to ORM in Django - sql

I have a query like this:
SELECT
*,
(
SELECT
COALESCE(json_agg(product_attribute), '[]')
FROM
(
SELECT
*
FROM
optimus_productattribute as product_attribute
WHERE
product.id = product_attribute.product_id
)
AS product_attribute
)
AS product_atttribute
FROM
optimus_product as product inner join optimus_store as store on product.store_id = store.id
and I want to convert it to the Django ORM.
I have tried JSONBAgg, but it complains that it can only be used with a single column:
Product.objects.filter(store_id=787).annotate(attributes=Coalesce(JSONBAgg(ProductAttribute.objects.filter(product=OuterRef('pk')).values_list('id', 'uuid')),[]))

Related

Convert aliases and distinct subquery from SQLite to SQLAlchemy

I am trying to convert a SQLite statement into Python SQLAlchemy to be used with FastAPI. I am not sure how to convert a query this complex, with the aliases s and p for the single prices table.
Here is the SQLite query:
SELECT s.security_id, p.price, MAX(p.price_datetime) price_datetime
FROM (SELECT DISTINCT security_id FROM prices) s
LEFT JOIN prices p ON p.security_id = s.security_id AND p.price_datetime <= '2022-08-10 19:00:00.000000'
GROUP BY s.security_id;
Here is my attempt so far:
# starting attempt so far
select(models.Price.security_id, models.Price.price, func.max(models.Price.price_datetime), models.Price.price_datetime)
My first question is why you have such a complicated query. Selecting distinct security_id values, joining back on them, and then grouping by security_id makes no sense to me.
I have come up with this much simpler version, which in my tests returns the same result.
SELECT security_id, price, MAX(price_datetime) price_datetime
FROM prices
WHERE price_datetime <= '2022-02-01'
GROUP BY security_id;
This is then fairly easy to translate to SQLAlchemy:
stmt = (
select(
Price.security_id,
Price.price,
func.max(Price.price_datetime).label("price_datetime"),
)
.filter(Price.price_datetime <= '2022-02-01')
.group_by(Price.security_id)
)
After OP's comment:
SELECT s.id, p.price, MAX(p.price_datetime) AS price_datetime
FROM security AS s
LEFT JOIN prices as p
ON s.id = p.security_id AND p.price_datetime <= '2021-02-01'
GROUP BY s.id;
which should translate to
stmt = (
select(
Security.id,
Price.price,
func.max(Price.price_datetime).label("price_datetime"),
)
.join(
Price,
and_(
Security.id == Price.security_id,
Price.price_datetime <= "2021-02-01",
),
isouter=True,
)
.group_by(Security.id)
)
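To see what the LEFT JOIN version returns, here is a minimal sketch that runs the same SQL through Python's built-in sqlite3 module. The table layout and sample rows are hypothetical, made up only for illustration. Note that p.price is not aggregated: SQLite takes it from the row that holds the MAX value, whereas stricter databases such as PostgreSQL would reject the query, so treat this as a SQLite-only illustration.

```python
import sqlite3


def latest_price_per_security(cutoff):
    """Return one (security_id, price, price_datetime) row per security."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE security (id INTEGER PRIMARY KEY);
        CREATE TABLE prices (
            security_id INTEGER,
            price REAL,
            price_datetime TEXT
        );
        INSERT INTO security (id) VALUES (1), (2);
        -- Hypothetical sample data: security 2 deliberately has no prices.
        INSERT INTO prices VALUES
            (1, 10.0, '2021-01-01'),
            (1, 12.0, '2021-01-15'),
            (1, 15.0, '2021-03-01');
        """
    )
    rows = conn.execute(
        """
        SELECT s.id, p.price, MAX(p.price_datetime) AS price_datetime
        FROM security AS s
        LEFT JOIN prices AS p
            ON s.id = p.security_id AND p.price_datetime <= ?
        GROUP BY s.id
        ORDER BY s.id
        """,
        (cutoff,),
    ).fetchall()
    conn.close()
    return rows


rows = latest_price_per_security("2021-02-01")
```

Because the date condition sits in the ON clause rather than in WHERE, a security with no price before the cutoff still appears in the result, with NULL (None in Python) for price and price_datetime.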

BQ SQL join with a table with a name that is derived from a query

I have some fields that contain a date. That date is then used to look up a table whose name corresponds to the date in that field. I'm doing a join to get other fields, but the question is how to treat the date field as a variable that can then be used to perform the join.
Here is the example query:
with tab1 as (
select
product_id,
start_date,
from `project.user.table`
)
select * from tab1 inner join `project2.table2.{start_date}` as B on tab1.product_id = B.p_id
After suggestions I have tried the following query to tighten things up, but it is sadly not working.
with tab1 as (
select
cast(product_id as INT64) as product_id_64,
cast(FORMAT_DATE('%Y%m%d', CAST(start_date AS DATE)) as STRING) as start_date_string
from `project.user.table`
)
select * from `user2.dataset.*` b
inner join tab1
on b._TABLE_SUFFIX = tab1.start_date_string
It led to the following error:
Error running query. Cannot read field of type STRING as INT64 Field: GTIN
If your table is a native BigQuery table, you can try using a wildcard table:
WITH tab1 as (
select
product_id,
start_date,
from `project.user.table`
)
SELECT *
FROM `<yourproject>.<yourdataset>.*` b
INNER JOIN tab1
ON b._TABLE_SUFFIX = tab1.start_date
AND b.p_id = tab1.product_id
Unfortunately, I can't test this.

must appear in the group by clause in sql

I have a SQL statement and I am trying to add an ORDER BY clause. When I add it, I get an error:
ERROR: column "items.id" must appear in the GROUP BY clause or be used in an aggregate function
My query is:
WITH "has_children_cte"
AS (SELECT DISTINCT "parent_id" AS "item_id",
1 AS "has_children"
FROM "items")
SELECT "item_category_id",
Count(*) AS "count"
FROM "items"
INNER JOIN "items" AS "root_item"
ON ( "root_item"."id" = "items"."root_id" )
LEFT JOIN "item_types"
ON ( "items"."item_type_id" = "item_types"."id" )
LEFT JOIN "item_categories"
ON ( "item_categories"."id" = "item_types"."item_category_id" )
INNER JOIN "order_items"
ON ( "items"."order_item_id" = "order_items"."id" )
INNER JOIN "orders"
ON ( "order_items"."order_id" = "orders"."id" )
LEFT JOIN "has_children_cte"
ON ( "items"."id" = "has_children_cte"."item_id" )
WHERE ( ( "items"."parent_id" IS NULL )
AND ( "items"."state" != 'discarded' ) )
GROUP BY "item_category_id"
ORDER BY "items"."id";
I added the ORDER BY "items"."id" clause and then got this error. When I tried adding items.id to the GROUP BY, I got wrong results.
Unfortunately, I have not been able to resolve this error.
The ORDER BY (logically) takes place after the aggregation. And after the aggregation, "items"."id" is not available in each row.
So just use an aggregation function:
ORDER BY MIN("items"."id")
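As a minimal sketch of the same idea, the snippet below runs a cut-down version of the query through Python's built-in sqlite3 module. The single items table and its rows are hypothetical sample data. SQLite is more permissive than PostgreSQL and would not raise the original error, so this only shows that ordering by an aggregate gives one well-defined sort key per group.

```python
import sqlite3


def count_by_category():
    """Count items per category, ordered by each category's smallest item id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE items (id INTEGER PRIMARY KEY, item_category_id TEXT);
        -- Hypothetical sample rows.
        INSERT INTO items VALUES
            (1, 'b'), (2, 'a'), (3, 'b'), (4, 'a'), (5, 'c');
        """
    )
    rows = conn.execute(
        """
        SELECT item_category_id, COUNT(*) AS count
        FROM items
        GROUP BY item_category_id
        ORDER BY MIN(id)  -- one value per group, so it is valid after aggregation
        """
    ).fetchall()
    conn.close()
    return rows


rows = count_by_category()
```

Category 'b' comes first because its smallest id (1) is lower than that of 'a' (2) and 'c' (5).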

Unable to convert this legacy SQL into Standard SQL in Google BigQuery

I am not able to convert this legacy SQL into standard BigQuery SQL, as I don't know what else needs to change here. (This query fails during validation if I choose standard SQL as the BigQuery dialect.)
SELECT
lineitem.*,
proposal_lineitem.*,
porder.*,
company.*,
product.*,
proposal.*,
trafficker.name,
salesperson.name,
rate_card.*
FROM (
SELECT
*
FROM
dfp_data.dfp_order_lineitem
WHERE
DATE(end_datetime) >= DATE(DATE_ADD(CURRENT_TIMESTAMP(), -1, 'YEAR'))
OR end_datetime IS NULL ) lineitem
JOIN (
SELECT
*
FROM
dfp_data.dfp_order) porder
ON
lineitem.order_id = porder.id
LEFT JOIN (
SELECT
*
FROM
adpoint_data.dfp_proposal_lineitem) proposal_lineitem
ON
lineitem.id = proposal_lineitem.dfp_lineitem_id
JOIN (
SELECT
*
FROM
dfp_data.dfp_company) company
ON
porder.advertiser_id = company.id
LEFT JOIN (
SELECT
*
FROM
adpoint_data.dfp_product) product
ON
proposal_lineitem.product_id=product.id
LEFT JOIN (
SELECT
*
FROM
adpoint_data.dfp_proposal) proposal
ON
proposal_lineitem.proposal_id=proposal.id
LEFT JOIN (
SELECT
*
FROM
adpoint_data.dfp_rate_card) rate_card
ON
proposal_lineitem.ratecard_id=rate_card.id
LEFT JOIN (
SELECT
id,
name
FROM
dfp_data.dfp_user) trafficker
ON
porder.trafficker_id =trafficker.id
LEFT JOIN (
SELECT
id,
name
FROM
dfp_data.dfp_user) salesperson
ON
porder. salesperson_id =salesperson.id
Most likely the error you are getting is something like this:
Duplicate column names in the result are not supported. Found duplicate(s): name
Legacy SQL automatically renames trafficker.name and salesperson.name in your SELECT statement to trafficker_name and salesperson_name respectively, which eliminates the duplicate column names.
Standard SQL behaves differently and treats both columns as being named name, which produces the duplication. To avoid it, you just need to provide aliases, as in the example below:
SELECT
lineitem.*,
proposal_lineitem.*,
porder.*,
company.*,
product.*,
proposal.*,
trafficker.name AS trafficker_name,
salesperson.name AS salesperson_name,
rate_card.*
FROM ( ...
You can easily check the behavior explained above using the simplified dummy queries below:
#legacySQL
SELECT
porder.*,
trafficker.name,
salesperson.name
FROM (
SELECT 1 order_id, 'abc' order_name, 1 trafficker_id, 2 salesperson_id
) porder
LEFT JOIN (SELECT 1 id, 'trafficker' name) trafficker
ON porder.trafficker_id =trafficker.id
LEFT JOIN (SELECT 2 id, 'salesperson' name ) salesperson
ON porder. salesperson_id =salesperson.id
and
#standardSQL
SELECT
porder.*,
trafficker.name AS trafficker_name,
salesperson.name AS salesperson_name
FROM (
SELECT 1 order_id, 'abc' order_name, 1 trafficker_id, 2 salesperson_id
) porder
LEFT JOIN (SELECT 1 id, 'trafficker' name) trafficker
ON porder.trafficker_id =trafficker.id
LEFT JOIN (SELECT 2 id, 'salesperson' name ) salesperson
ON porder. salesperson_id =salesperson.id
Note: if you have more duplicate names, you need to alias all of them too.
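To illustrate why unique output names matter on the client side, here is a minimal sketch that runs an aliased dummy query through Python's built-in sqlite3 module and turns each row into a dict keyed by column name. SQLite is only a stand-in here and does not reproduce BigQuery's duplicate-name error; the helper run_as_dicts is a hypothetical name introduced for this example.

```python
import sqlite3

# Same shape as the dummy standard SQL query, with explicit aliases.
QUERY = """
SELECT
    porder.order_id,
    trafficker.name AS trafficker_name,
    salesperson.name AS salesperson_name
FROM (SELECT 1 AS order_id, 1 AS trafficker_id, 2 AS salesperson_id) AS porder
LEFT JOIN (SELECT 1 AS id, 'trafficker' AS name) AS trafficker
    ON porder.trafficker_id = trafficker.id
LEFT JOIN (SELECT 2 AS id, 'salesperson' AS name) AS salesperson
    ON porder.salesperson_id = salesperson.id
"""


def run_as_dicts(sql):
    """Run sql and return (column_names, rows as dicts keyed by column name)."""
    conn = sqlite3.connect(":memory:")
    cursor = conn.execute(sql)
    names = [col[0] for col in cursor.description]
    rows = [dict(zip(names, row)) for row in cursor.fetchall()]
    conn.close()
    return names, rows


names, rows = run_as_dicts(QUERY)
```

With the aliases in place, every output column has a distinct name, so no value is overwritten when rows are converted to dicts. Without them, two columns both called name would collide on the same key.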

INNER JOINING THE TABLE ITSELF GIVES No column name was specified for column 2

SELECT *
FROM
construction AS T2
INNER JOIN
(
SELECT project,MAX(report_date)
FROM construction
GROUP BY project
) AS R
ON T2.project=R.project AND T2.report_date=R.report_date
I am getting this error. Please help.
No column name was specified for column 2 of 'R'
You need to add an alias for MAX(report_date):
SELECT *
FROM construction AS T2
INNER JOIN
(
SELECT project,MAX(report_date) AS report_date
FROM construction
GROUP BY project
) AS R
ON T2.project = R.project
AND T2.report_date = R.report_date;
In SQL Server you can also name the columns in the derived table's alias:
SELECT *
FROM construction AS T2
INNER JOIN
(
SELECT project,MAX(report_date)
FROM construction
GROUP BY project
) AS R(project, report_date)
ON T2.project = R.project
AND T2.report_date = R.report_date;
You should give MAX(report_date) an alias such as report_date, because your derived table R has two columns, project and MAX(report_date), and the second one has no name.
You are getting this error because you have not specified a column name for the aggregate in the inner query.
You have to write your query as:
SELECT *
FROM construction
INNER JOIN
(
SELECT project, MAX(report_date) AS Max_ReportDate
FROM construction
GROUP BY project
) Max_construction
ON construction.project = Max_construction.project
AND construction.report_date = Max_construction.Max_ReportDate
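The aliased self-join can be tried out with a minimal sketch using Python's built-in sqlite3 module. The sample rows and the extra status column are hypothetical, added only so the result shows which row was picked. SQLite reports a missing alias differently from SQL Server, so this only demonstrates the corrected query.

```python
import sqlite3


def latest_reports():
    """Return the most recent construction row for each project."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE construction (project TEXT, report_date TEXT, status TEXT);
        -- Hypothetical sample rows.
        INSERT INTO construction VALUES
            ('bridge', '2023-01-01', 'started'),
            ('bridge', '2023-02-01', 'halfway'),
            ('tunnel', '2023-01-15', 'started');
        """
    )
    rows = conn.execute(
        """
        SELECT T2.project, T2.report_date, T2.status
        FROM construction AS T2
        INNER JOIN (
            -- The alias gives the aggregate a name the ON clause can refer to.
            SELECT project, MAX(report_date) AS report_date
            FROM construction
            GROUP BY project
        ) AS R
            ON T2.project = R.project
            AND T2.report_date = R.report_date
        ORDER BY T2.project
        """
    ).fetchall()
    conn.close()
    return rows


rows = latest_reports()
```

Each project keeps only the row whose report_date equals the maximum computed in the derived table R.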