Inner join and Group By error - sql

I have two tables:
tbl_album and tbl_gallery
How can I select the last image of each of the last three albums?
These are the columns of my tables:
tbl_album: Id,al_name
tbl_gallery: Id,album_id,ga_pic_title,ga_file_name
I use this query:
select al.Id, al.al_name, ga.ga_file_name
from tbl_album al inner join tbl_gallery ga
on al.Id=ga.album_id order by Id desc
I receive an error when I use a GROUP BY clause:
select al.Id, al.al_name, ga.ga_file_name
from tbl_album al inner join tbl_gallery ga
on al.Id=ga.album_id group by al.al_name order by Id desc
Msg 8120, Level 16, State 1, Line 1 Column 'tbl_album.Id' is invalid
in the select list because it is not contained in either an aggregate
function or the GROUP BY clause.
I do not want the al_name column to be repeated.
Is there a better way?

You have to include every non-aggregated column from the select list in the GROUP BY, as below (I think you are using SQL Server, or at least not MySQL). BTW, why do you need a GROUP BY in your posted query?
select al.Id, al.al_name, ga.ga_file_name
from tbl_album al
inner join tbl_gallery ga on al.Id=ga.album_id
group by al.Id, al.al_name, ga.ga_file_name
order by Id desc

Most databases support the row_number() function which really helps with what you want to do:
select id, al_name, ga_file_name
from (select al.Id, al.al_name, ga.ga_file_name,
row_number() over (partition by al.id order by ga.id desc) as seqnum,
dense_rank() over (order by al.id desc) as seqnum_album
from tbl_album al inner join
tbl_gallery ga
on al.Id = ga.album_id
) t
where seqnum = 1 and seqnum_album <= 3;
Note that I used the window function dense_rank() to determine the last three albums. You can also do this with an order by and a clause that limits the number of rows. Unfortunately, the latter depends on the database, so it might be top 3, or limit 3, or fetch first 3 rows only, or even something else.
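To see the window-function query in action, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The sample rows and the helper names (make_demo_db, last_image_of_last_albums) are made up for illustration, and it assumes a SQLite build with window functions (3.25 or later); the question itself appears to be about SQL Server, where the same SELECT should apply.

```python
import sqlite3

def make_demo_db():
    """Build a small in-memory database with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tbl_album (Id INTEGER PRIMARY KEY, al_name TEXT);
        CREATE TABLE tbl_gallery (Id INTEGER PRIMARY KEY, album_id INTEGER,
                                  ga_pic_title TEXT, ga_file_name TEXT);
        INSERT INTO tbl_album VALUES (1,'A'),(2,'B'),(3,'C'),(4,'D');
        INSERT INTO tbl_gallery VALUES
            (1,1,'t','a1.jpg'),(2,1,'t','a2.jpg'),
            (3,2,'t','b1.jpg'),
            (4,3,'t','c1.jpg'),(5,3,'t','c2.jpg'),
            (6,4,'t','d1.jpg'),(7,4,'t','d2.jpg'),(8,4,'t','d3.jpg');
    """)
    return conn

def last_image_of_last_albums(conn, n_albums=3):
    """Newest image (highest gallery Id) of each of the newest albums."""
    sql = """
        SELECT Id, al_name, ga_file_name
        FROM (SELECT al.Id AS Id, al.al_name AS al_name,
                     ga.ga_file_name AS ga_file_name,
                     -- 1 = newest image inside each album
                     ROW_NUMBER() OVER (PARTITION BY al.Id ORDER BY ga.Id DESC) AS seqnum,
                     -- 1 = newest album that has at least one image
                     DENSE_RANK() OVER (ORDER BY al.Id DESC) AS seqnum_album
              FROM tbl_album al
              INNER JOIN tbl_gallery ga ON al.Id = ga.album_id) t
        WHERE seqnum = 1 AND seqnum_album <= ?
        ORDER BY Id DESC
    """
    return conn.execute(sql, (n_albums,)).fetchall()

rows = last_image_of_last_albums(make_demo_db())
for row in rows:
    print(row)
```

Because of the inner join, albums without any image never get a rank, so "last three albums" here means the last three albums that have at least one image.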

Related

Apply OFFSET and LIMIT in ORACLE for complex Join Queries?

I'm using Oracle 11g and have a complex join query. I want to apply OFFSET and LIMIT to this query so that it can be used effectively in the Spring Batch framework.
I went through:
How do I limit the number of rows returned by an Oracle query after ordering? and
Alternatives to LIMIT and OFFSET for paging in Oracle
But things are not very clear to me.
My Query
SELECT DEPT.ID rowobjid, DEPT.CREATOR createdby, DEPT.CREATE_DATE createddate, DEPT.UPDATED_BY updatedby, DEPT.LAST_UPDATE_DATE updateddate,
DEPT.NAME name, DEPT.STATUS status, statusT.DESCR statusdesc,
REL.ROWID_DEPT1 rowidDEPT1, REL.ROWID_DEPT2 rowidDEPT2, DEPT2.DEPT_FROM_VAL parentcid, DEPT2.NAME parentname
FROM TEST.DEPT_TABLE DEPT
LEFT JOIN TEST.STATUS_TABLE statusT ON DEPT.STATUS = statusT.STATUS
LEFT JOIN TEST.C_REL_DEPT rel ON DEPT.ID=REL.ROWID_DEPT2
LEFT JOIN TEST.DEPT_TABLE DEPT2 ON REL.ROWID_DEPT1=DEPT2.ID
ORDER BY rowobjid asc;
The above query returns 10 million records.
Note: Neither database table has a PK, so I would need to use OFFSET and LIMIT.
In Oracle 11g you can use an analytic function such as ROW_NUMBER() within a subquery (the OFFSET and FETCH row-limiting clauses only exist from version 12c on). For example, to get the rows ranked 3rd through 8th, which corresponds to skipping 2 rows and fetching the next 6, when the result should be partitioned by CREATE_DATE and ordered by the department ID:
SELECT q.*
FROM (SELECT DEPT.ID rowobjid,
DEPT.CREATOR createdby,
DEPT.CREATE_DATE createddate,
DEPT.UPDATED_BY updatedby,
DEPT.LAST_UPDATE_DATE updateddate,
DEPT.NAME name,
DEPT.STATUS status,
statusT.DESCR statusdesc,
REL.ROWID_DEPT1 rowidDEPT1,
REL.ROWID_DEPT2 rowidDEPT2,
DEPT2.DEPT_FROM_VAL parentcid,
DEPT2.NAME parentname,
ROW_NUMBER() OVER (PARTITION BY DEPT.CREATE_DATE ORDER BY DEPT.ID) AS rn
FROM TEST.DEPT_TABLE DEPT
LEFT JOIN TEST.STATUS_TABLE statusT
ON DEPT.STATUS = statusT.STATUS
LEFT JOIN TEST.C_REL_DEPT rel
ON DEPT.ID = REL.ROWID_DEPT2
LEFT JOIN TEST.DEPT_TABLE DEPT2
ON REL.ROWID_DEPT1 = DEPT2.ID) q
WHERE rn BETWEEN 3 AND 8;
which returns 6 (8-3+1) rows for every CREATE_DATE that has at least 8 rows, because PARTITION BY restarts the numbering for each creation date. If you need to include ties (rows with equal department IDs within the same creation date), replace ROW_NUMBER() with the DENSE_RANK() window function and leave the rest of the query unchanged. In that case at least 6 rows are returned for every creation date with 8 or more distinct department IDs.
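As a rough illustration of paging with ROW_NUMBER(), the sketch below numbers the whole result set (no PARTITION BY, unlike the query above) and returns only the requested slice. It uses an in-memory SQLite database via Python's sqlite3 module as a stand-in for Oracle; the dept table, its rows, and the function name page_by_row_number are assumptions made for the example, and SQLite 3.25 or later is assumed for window-function support.

```python
import sqlite3

def page_by_row_number(conn, first, last):
    """Return the rows whose position in the ordered result is first..last (1-based)."""
    sql = """
        SELECT id, name
        FROM (SELECT d.id AS id, d.name AS name,
                     -- name is a tie-breaker so the numbering is deterministic
                     ROW_NUMBER() OVER (ORDER BY d.id, d.name) AS rn
              FROM dept d) q
        WHERE rn BETWEEN ? AND ?
        ORDER BY rn
    """
    return conn.execute(sql, (first, last)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (id INTEGER, name TEXT)")  # no primary key, as in the question
conn.executemany("INSERT INTO dept VALUES (?, ?)",
                 [(i, f"dept{i}") for i in range(1, 11)])

page = page_by_row_number(conn, 3, 8)
print(page)
```

A batch reader would call this with successive ranges (1 to 1000, 1001 to 2000, and so on). The ordering columns need to identify a row uniquely, otherwise rows can move between pages from one call to the next.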

SQL How to select customers with highest transaction amount by state

I am trying to write a SQL query that returns the name and purchase amount of the five customers in each state who have spent the most money.
Table schemas
customers
|_state
|_customer_id
|_customer_name
transactions
|_customer_id
|_transact_amt
Attempts look something like this
SELECT state, Sum(transact_amt) AS HighestSum
FROM (
SELECT name, transactions.transact_amt, SUM(transactions.transact_amt) AS HighestSum
FROM customers
INNER JOIN customers ON transactions.customer_id = customers.customer_id
GROUP BY state
) Q
GROUP BY transact_amt
ORDER BY HighestSum
I'm lost. Thank you.
Expected results are the names of customers with the top 5 highest transactions in each state.
ERROR: table name "customers" specified more than once
SQL state: 42712
First, you need for your JOIN to be correct. Second, you want to use window functions:
SELECT ct.*
FROM (SELECT c.customer_id, c.customer_name, c.state, SUM(t.transact_amt) AS total,
ROW_NUMBER() OVER (PARTITION BY c.state ORDER BY SUM(t.transact_amt) DESC) as seqnum
FROM customers c JOIN
transactions t
ON t.customer_id = c.customer_id
GROUP BY c.customer_id, c.customer_name, c.state
) ct
WHERE seqnum <= 5;
You seem to have several issues with SQL. I would start with understanding aggregation functions. You have a SUM() with the alias HighestSum. It is simply the total per customer.
You can get them using aggregation and then by using the RANK() window function. For example:
select
state,
rk,
customer_name
from (
select
*,
rank() over(partition by state order by total desc) as rk
from (
select
c.customer_id,
c.customer_name,
c.state,
sum(t.transact_amt) as total
from customers c
join transactions t on t.customer_id = c.customer_id
group by c.customer_id, c.customer_name, c.state
) x
) y
where rk <= 5
order by state, rk
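The practical difference between RANK() and ROW_NUMBER() shows up when totals are tied. The sketch below runs the same top-N-per-state query with either function against an in-memory SQLite database (via Python's sqlite3, assuming SQLite 3.25 or later); the sample customers, the amounts, and the helper name top_per_state are invented for the example.

```python
import sqlite3

def top_per_state(conn, func, n):
    """Top n customers per state by total spend, ranked with RANK or ROW_NUMBER."""
    # func is interpolated into the SQL text, so accept only the two known names
    if func not in ("RANK", "ROW_NUMBER"):
        raise ValueError(func)
    sql = f"""
        SELECT state, customer_name, total
        FROM (SELECT x.*,
                     {func}() OVER (PARTITION BY state ORDER BY total DESC) AS rk
              FROM (SELECT c.customer_id, c.customer_name, c.state,
                           SUM(t.transact_amt) AS total
                    FROM customers c
                    JOIN transactions t ON t.customer_id = c.customer_id
                    GROUP BY c.customer_id, c.customer_name, c.state) x) y
        WHERE rk <= ?
        ORDER BY state, rk, customer_name
    """
    return conn.execute(sql, (n,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, customer_name TEXT, state TEXT);
    CREATE TABLE transactions (customer_id INTEGER, transact_amt INTEGER);
    INSERT INTO customers VALUES (1,'Ann','CA'),(2,'Bob','CA'),(3,'Cy','CA'),(4,'Di','NY');
    INSERT INTO transactions VALUES (1,60),(1,40),(2,90),(3,90),(4,10);
""")

by_rank = top_per_state(conn, "RANK", 2)              # Bob and Cy tie for rank 2
by_row_number = top_per_state(conn, "ROW_NUMBER", 2)  # only one of Bob/Cy is kept
print(by_rank)
print(by_row_number)
```

With RANK() a "top 2" can contain more than two rows per state when totals tie; with ROW_NUMBER() it never does, but which of the tied customers is kept is arbitrary unless the ORDER BY includes a tie-breaker.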
There are two valid answers already. Here's a third:
SELECT *
FROM (
SELECT c.state, c.customer_name, t.*
, row_number() OVER (PARTITION BY c.state ORDER BY t.transact_sum DESC NULLS LAST, customer_id) AS rn
FROM (
SELECT customer_id, sum(transact_amt) AS transact_sum
FROM transactions
GROUP BY customer_id
) t
JOIN customers c USING (customer_id)
) sub
WHERE rn < 6
ORDER BY state, rn;
Major points
When aggregating all or most rows of a big table, it's typically substantially faster to aggregate before the join. Assuming referential integrity (FK constraints), we won't be aggregating rows that would be filtered otherwise. This might change from nice-to-have to a pure necessity when joining to more aggregated tables. Related:
Why does the following join increase the query time significantly?
Two SQL LEFT JOINS produce incorrect result
Add additional ORDER BY item(s) in the window function to define which rows to pick from ties. In my example, it's simply customer_id. If you have no tiebreaker, results are arbitrary in case of a tie, which may be OK. But every other execution might return different results, which typically is a problem. Or you include all ties in the result. Then we are back to rank() instead of row_number(). See:
PostgreSQL equivalent for TOP n WITH TIES: LIMIT "with ties"?
As long as transact_amt can be NULL (which has not been ruled out), any sum may end up NULL as well. With an unsuspecting ORDER BY t.transact_sum DESC those customers come out on top, because in PostgreSQL NULL sorts first in descending order. Use DESC NULLS LAST to avoid this pitfall. (Or define the column transact_amt as NOT NULL.)
PostgreSQL sort by datetime asc, null first?
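The three points above (aggregate before the join, add a tie-breaker, keep NULL sums at the bottom) can be sketched together as follows. This runs on an in-memory SQLite database through Python's sqlite3 rather than on PostgreSQL, so instead of NULLS LAST it sorts on the expression "transact_sum IS NULL" first, which is a portable way to push NULLs to the end. The sample data and the name top_customers are assumptions for the example; SQLite 3.25 or later is assumed.

```python
import sqlite3

SQL = """
    SELECT state, customer_name, transact_sum
    FROM (SELECT c.state, c.customer_name, t.transact_sum,
                 ROW_NUMBER() OVER (
                     PARTITION BY c.state
                     ORDER BY t.transact_sum IS NULL,  -- 0 before 1: NULL sums go last
                              t.transact_sum DESC,
                              c.customer_id            -- deterministic tie-breaker
                 ) AS rn
          -- aggregate first, then join the much smaller result to customers
          FROM (SELECT customer_id, SUM(transact_amt) AS transact_sum
                FROM transactions
                GROUP BY customer_id) t
          JOIN customers c ON c.customer_id = t.customer_id) sub
    WHERE rn <= ?
    ORDER BY state, rn
"""

def top_customers(conn, n):
    """Top n customers per state; ties broken by customer_id, NULL sums last."""
    return conn.execute(SQL, (n,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, customer_name TEXT, state TEXT);
    CREATE TABLE transactions (customer_id INTEGER, transact_amt INTEGER);
    INSERT INTO customers VALUES (1,'Ann','CA'),(2,'Bob','CA'),(3,'Cy','CA'),(4,'Di','NY');
    -- Ann only has a NULL amount, so her sum is NULL
    INSERT INTO transactions VALUES (1,NULL),(2,90),(3,90),(4,10);
""")

top2 = top_customers(conn, 2)
print(top2)
```

Bob and Cy are tied at 90, and the customer_id tie-breaker makes Bob come first on every run; Ann's NULL total never displaces a real amount.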

How to work around correlated subqueries that reference other tables, without using a join

I am working with the BigQuery public dataset bigquery-public-data.austin_crime.crime. My goal is to get output with three columns that show the description (of the crime), the count for that description, and the top district for that particular description (crime).
I am able to get the first two columns with this query.
select
a.description,
count(*) as district_count
from `bigquery-public-data.austin_crime.crime` a
group by description order by district_count desc
I was hoping to get everything done with one query, so I tried adding a correlated subquery to produce the third column, the top district for each description (crime):
select
a.description,
count(*) as district_count,
(
select district from
( select
district, rank() over(order by COUNT(*) desc) as rank
FROM `bigquery-public-data.austin_crime.crime`
where description = a.description
group by district
) where rank = 1
) as top_District
from `bigquery-public-data.austin_crime.crime` a
group by description
order by district_count desc
The error I am getting is this: "Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN."
I think I can do that with joins. Does someone have a better solution, possibly one that does not use a join?
Below is for BigQuery Standard SQL
#standardSQL
SELECT description,
ANY_VALUE(district_count) AS district_count,
STRING_AGG(district ORDER BY cnt DESC LIMIT 1) AS top_district
FROM (
SELECT description, district,
COUNT(1) OVER(PARTITION BY description) AS district_count,
COUNT(1) OVER(PARTITION BY description, district) AS cnt
FROM `bigquery-public-data.austin_crime.crime`
)
GROUP BY description
-- ORDER BY district_count DESC
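The same result can also be reached without a join or a correlated subquery by counting per (description, district) first and then keeping the best-ranked district per description. The sketch below swaps in ROW_NUMBER() plus a conditional MAX() for STRING_AGG(... ORDER BY ... LIMIT 1), because it runs on an in-memory SQLite database via Python's sqlite3 and not on BigQuery; the crime table contents and the name top_districts are invented for the example, and SQLite 3.25 or later is assumed.

```python
import sqlite3

SQL = """
    SELECT description,
           SUM(cnt) AS district_count,                              -- total rows per description
           MAX(CASE WHEN rn = 1 THEN district END) AS top_district  -- district ranked first
    FROM (SELECT description, district, cnt,
                 ROW_NUMBER() OVER (PARTITION BY description
                                    ORDER BY cnt DESC, district) AS rn
          FROM (SELECT description, district, COUNT(*) AS cnt
                FROM crime
                GROUP BY description, district) per_district) ranked
    GROUP BY description
    ORDER BY district_count DESC, description
"""

def top_districts(conn):
    """One row per description: total count and the district with the most rows."""
    return conn.execute(SQL).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crime (description TEXT, district TEXT)")
conn.executemany("INSERT INTO crime VALUES (?, ?)", [
    ("THEFT", "A"), ("THEFT", "A"), ("THEFT", "A"), ("THEFT", "B"),
    ("BURGLARY", "B"), ("BURGLARY", "B"), ("BURGLARY", "C"),
])

result = top_districts(conn)
print(result)
```

Ties between districts are broken alphabetically here through the extra "district" sort key; that choice is an assumption of the sketch, not something the question specifies.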

Need help with an inner join of two queries

Hi guys, can you help me create this inner join query?
The idea is that I first need to find the three keywords with the highest count, then show the count of each of those keywords per month (I need the month number only).
SELECT ReportRaw.Keyword, Format([DateApplying],'m') AS appdate, Count(ReportRaw.Keyword) AS CountOfKeyword1
FROM
(
SELECT TOP 3 Count(Keyword) AS CountOfKeyword,Keyword
FROM ReportRaw
GROUP BY Keyword
ORDER BY Count(Keyword) DESC;
) as T1
INNER JOIN ReportRaw
ON T1.Keyword = ReportRaw.Keyword
GROUP BY ReportRaw.Keyword, Format([DateApplying],'m') ;
The problem looks like the semicolon (;) after DESC inside the subquery. Assuming this is SQL Server,
I also made a tiny update to the query to get the month number:
SELECT #ReportRaw.Keyword, DATEPART(MONTH, [DateApplying]) AS appdate,
Count(#ReportRaw.Keyword) AS CountOfKeyword1
FROM
(
SELECT TOP 3 Count(Keyword) AS CountOfKeyword,Keyword
FROM #ReportRaw
GROUP BY Keyword
ORDER BY Count(Keyword) DESC
) as T1
INNER JOIN #ReportRaw
ON T1.Keyword = #ReportRaw.Keyword
GROUP BY #ReportRaw.Keyword, DATEPART(MONTH, [DateApplying]) ;
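For a quick check of the logic outside SQL Server, the sketch below runs the same shape of query on an in-memory SQLite database through Python's sqlite3. It is only an approximation under stated assumptions: LIMIT 3 stands in for TOP 3, CAST(strftime('%m', ...) AS INTEGER) stands in for DATEPART(MONTH, ...), dates are stored as ISO text, and the sample rows and the name keyword_counts_by_month are made up.

```python
import sqlite3

SQL = """
    SELECT r.Keyword,
           CAST(strftime('%m', r.DateApplying) AS INTEGER) AS appdate,
           COUNT(r.Keyword) AS CountOfKeyword1
    FROM (SELECT Keyword, COUNT(Keyword) AS CountOfKeyword
          FROM ReportRaw
          GROUP BY Keyword
          ORDER BY COUNT(Keyword) DESC, Keyword   -- no semicolon inside the subquery
          LIMIT 3) AS T1
    INNER JOIN ReportRaw r ON T1.Keyword = r.Keyword
    GROUP BY r.Keyword, CAST(strftime('%m', r.DateApplying) AS INTEGER)
    ORDER BY r.Keyword, appdate
"""

def keyword_counts_by_month(conn):
    """Monthly counts for the three most frequent keywords."""
    return conn.execute(SQL).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ReportRaw (Keyword TEXT, DateApplying TEXT)")
conn.executemany("INSERT INTO ReportRaw VALUES (?, ?)", [
    ("a", "2023-01-05"), ("a", "2023-01-20"), ("a", "2023-02-01"), ("a", "2023-03-09"),
    ("b", "2023-01-02"), ("b", "2023-01-03"), ("b", "2023-01-04"),
    ("c", "2023-02-10"), ("c", "2023-02-11"),
    ("d", "2023-01-15"),
])

monthly = keyword_counts_by_month(conn)
for row in monthly:
    print(row)
```

Keyword "d" occurs only once, so it falls outside the top three and does not appear in the monthly breakdown.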

ORDER BY in GROUP BY clause

I have a query
Select
(SELECT id FROM xyz M WHERE M.ID=G.ID AND ROWNUM=1 ) TOTAL_X,
count(*) from mno G where col1='M' group by col2
Now I have to fetch a random id from the subquery. For this I am doing:
Select
(SELECT id FROM xyz M WHERE M.ID=G.ID AND ROWNUM=1 order by dbms_random.value ) TOTAL_X,
count(*) from mno G where col1='M' group by col2
But Oracle is showing an error:
"Missing right parenthesis".
What is wrong with the query, and how can I write it to get a random id?
Please help.
Even if what you did was legal, it would not give you the result you want. The ROWNUM filter would be applied before the ORDER BY, so you would just be sorting one row.
You need something like this. I am not sure if this exact code will work given the correlated subquery, but the basic point is that you need to have a subquery that contains the ORDER BY without the ROWNUM filter, then apply the ROWNUM filter one level up.
WITH subq AS (
SELECT id FROM xyz M WHERE M.ID=G.ID order by dbms_random.value
)
SELECT (SELECT id FROM subq WHERE rownum = 1) total_x,
count(*)
from mno g where col1='M' group by col2
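The "sort in the inner query, cut off one level up" idea can be tried out in isolation. The sketch below uses an in-memory SQLite database via Python's sqlite3 as a stand-in for Oracle: random() plays the role of dbms_random.value and an outer LIMIT 1 plays the role of the ROWNUM = 1 filter. The xyz rows and the function name random_id are assumptions for the example, and the correlation with mno is left out on purpose.

```python
import sqlite3

def random_id(conn):
    """Pick one id at random: shuffle in the inner query, cut off in the outer one."""
    sql = """
        SELECT id
        FROM (SELECT id FROM xyz ORDER BY random()) shuffled  -- ordering happens here
        LIMIT 1                                               -- row limit is applied afterwards
    """
    return conn.execute(sql).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xyz (id INTEGER)")
conn.executemany("INSERT INTO xyz VALUES (?)", [(i,) for i in range(1, 6)])

print(random_id(conn))
```

If the row limit were applied in the same query block as the ORDER BY, as ROWNUM is in Oracle, the sort would only ever see the single row that survived the filter.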
You can't use ORDER BY in a subselect. It wouldn't matter either, because the row numbering is applied first, so you cannot influence it by using ORDER BY.
[edit]
I tried a solution. I don't have Oracle here, so you'll have to read between the typos.
In this case, I generate a single random value, get the count of records in xyz per mno.id, and generate a sequence for those records per mno.id.
Then, a level higher, I filter only those records whose index match with the random value.
This should give you a random id from xyz that matches the id in mno.
select
x.mnoId,
x.TOTAL_X
from
(SELECT
g.id as mnoId,
m.id as TOTAL_X,
count(*) over (partition by g.id) as MCOUNT,
row_number() over (partition by g.id order by m.id) as MINDEX,
r.RandomValue
from
mno g
inner join xyz m on m.id = g.id
cross join (select dbms_random.value as RandomValue from dual) r
where
g.col1 = 'M'
) x
where
x.MINDEX = 1 + trunc(x.MCOUNT * x.RandomValue)
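The index arithmetic in the last line is easier to follow with concrete numbers. The sketch below reproduces it on an in-memory SQLite database via Python's sqlite3 (assuming SQLite 3.25 or later): each row gets a 1-based position within its group, and the row at position 1 + trunc(count * r) is kept. The table layout with a grp column, the sample ids, and the name pick are hypothetical, and the random value r is passed in as a parameter so the outcome can be checked.

```python
import random
import sqlite3

SQL = """
    SELECT grp, id
    FROM (SELECT m.grp AS grp, m.id AS id,
                 COUNT(*)     OVER (PARTITION BY m.grp)               AS mcount,
                 ROW_NUMBER() OVER (PARTITION BY m.grp ORDER BY m.id) AS mindex
          FROM xyz m) x
    -- r is in [0, 1), so 1 + trunc(mcount * r) is always between 1 and mcount
    WHERE mindex = 1 + CAST(mcount * ? AS INTEGER)
    ORDER BY grp
"""

def pick(conn, r):
    """Keep one row per group: the one at position 1 + trunc(count * r)."""
    return conn.execute(SQL, (r,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xyz (grp TEXT, id INTEGER)")
conn.executemany("INSERT INTO xyz VALUES (?, ?)", [
    ("a", 10), ("a", 11), ("a", 12), ("a", 13),
    ("b", 20), ("b", 21),
])

print(pick(conn, 0.5))              # group a has 4 rows: position 3; group b has 2: position 2
print(pick(conn, random.random()))  # a different pick on each run
```

Because a single random value is shared by all groups, the chosen positions are correlated across groups; if each group should be drawn independently, ordering each partition by a per-row random value would be one alternative.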
The only difference between your two queries is the ORDER BY in the one that fails, right?
It so happens that ORDER BY doesn't fit inside a nested select.
You could do an ORDER BY inside a WHERE clause that contains a select, though.
Edit: @StevenV is right.
If you're trying to do what I suspect, this should work
Select A.Id, Count(*)
From MNO A
Join (Select ID From XYZ M Where M.ID=G.ID And Rownum=1 Order By Dbms_Random.Value ) B On (B.ID = A.ID)
GROUP BY A.ID