How to use a bunch of clause in a having statement. Oracle - sql

assume i have a query like this:
SELECT table1.id
FROM (
SELECT id, sum(column) as A
FROM table1
GROUP BY id
) a1
Left join (
SELECT id,
sum(column) as B
FROM table 2
GROUP BY Id
) a2
on table1.id=table2.id
.
.
.
.
Left join (
SELECT id, sum(column) as G
FROM table 7
GROUP BY id
) g1
on table1.id=table7.id
Having or where A+B - (C+D+E+F+G) >0
I tried both, none works.
Having return error on there's no group by in the first select and where doesn't return any rows.

First your question have some issues.
I'm going to guess you mean put alias a, b, c, d ....
instead of a1, a2, g1.
Also your left join should be something like a.id = b.id at the moment you create a subquery you have to use the alias instead of tablename.
If you fix that you should add a WHERE, I also guess you mean use the SUM() result
WHERE a.A + b.B - (c.C+ d.D+ e.E+ f.F+ g.G) > 0
.
SELECT a.id
FROM (
SELECT id, sum(column) as sumA
FROM table1
GROUP BY id
) a
Left join
(
SELECT id, sum(column) as sumB
FROM table 2
GROUP BY Id
) b
on a.id = b.id
.
.
.
.
Left join
(
SELECT id, sum(column) as sumG
FROM table 2
GROUP BY id
) g
on f.id = g.id
WHERE a.sumA + b.sumB - (c.sumC + d.sumD + e.sumE + f.sumF + g.sumG) >0

Juan has the right answer. I am just adding a SQLFiddle to help strengthen his answer. Please look at a smaller instance of the same solution here: http://sqlfiddle.com/#!4/81c275/1
Tables
create table table1(id int, col int);
insert into table1 values (1, 10);
insert into table1 values (2, 20);
insert into table1 values (2, 30);
create table table2(id int, col int);
insert into table2 values (1, 5);
insert into table2 values (2, 3);
insert into table2 values (2, 2);
create table table3(id int, col int);
insert into table3 values (1, 100);
insert into table3 values (2, 20);
insert into table3 values (2, 3);
SQL
select a1.id
from (select id, sum(col) as A from table1 group by id) a1
left join (select id, sum(col) as B from table2 group by id) a2
on a1.id = a2.id
left join (select id, sum(col) as C from table3 group by id) a3
on a1.id = a3.id
where A + B - (C) > 0
You can add more tables in the SQLFiddle with whatever values you please, and change the SQL accordingly by appending D, E, F, G etc after C in (C).
The above example will result in output of 2 since ID 2's A+B = 55 and C = 23. A+B-C > 0 for this record and therefore the output will be 2.

I believe that you need to take out the 'where' and move it up if you still need it.
So that it would look something like this,
select table1.id from(
...
...
...)
Having ((A+B)-(C+D+R+F+G)>0)
According to this site:
http://www.w3schools.com/sql/sql_having.asp

Related

max function does not when having case when clause

i have two tables.
one is as below
table a
ID, count
1, 123
2, 123
3, 123
table b
ID, count
table b is empty
when using
SELECT CASE
WHEN isnotnull(max(b.count)) THEN max(a.count) + max(b.count)
ELSE max(a.count)
FROM a, b
the only result is always NULL
i am very confused. why?
You don't need to use a JOIN, a simple SUM of two sub-queries will give you your desired result. Since you only add MAX(b.count) when it is non-NULL, we can just add it all the time but COALESCE it to 0 when it is NULL.
SELECT COALESCE((SELECT MAX(count) FROM b), 0) + (SELECT MAX(count) FROM a)
Another way to make this work is to UNION the count values from each table:
SELECT COALESCE(MAX(bcount), 0) + MAX(acount)
FROM (SELECT count AS acount, NULL AS bcount FROM a
UNION
SELECT NULL AS acount, count AS bcount FROM b) u
Note that if you use a JOIN it must be a FULL JOIN. If you use a LEFT JOIN you risk not seeing all the values from table b. For example, consider the case where table b has one entry: ID=4, count=456. A LEFT JOIN on ID will not include this value in the result table (since table a only has ID values of 1,2 and 3) so you will get the wrong result:
CREATE TABLE a (ID INT, count INT);
INSERT INTO a VALUES (1, 123), (2, 123), (3, 123);
CREATE TABLE b (ID INT, count INT);
INSERT INTO b VALUES (4, 456);
SELECT COALESCE(MAX(b.count), 0) + MAX(a.count)
FROM a
LEFT JOIN b ON a.ID = b.ID
Output
123 (should be 579)
To use a FULL JOIN you would write
SELECT COALESCE(MAX(b.count), 0) + MAX(a.count)
FROM a
FULL JOIN b ON a.ID = b.ID
Since, tableb is empty, max(b.count) will return NULL. And any operation done with NULL, results in NULL.
So, max(a.count) + max(b.count) is NULL.(this is 123 + NULL which will be NULL always). Hence, your query is returning NULL.
Just use a coalesce to assign a default value whenever NULL comes.
use coalesce() function and explicit join, avoid coma separated table name type old join method
select coalesce(max(a.count)+max(b.count),max(a.count))
from a left join b on a.id=b.id
Use left join
SELECT coalesce(max(a.count) + max(b.count),max(a.count))
FROM a left join b a.id=b.id

Oracle SQL - Comparing Rows

I have a problem I'm working on with Oracle SQL that goes something like this.
TABLE
PurchaseID CustID Location
----1------------1-----------A
----2------------1-----------A
----3------------2-----------A
----4------------2-----------B
----5------------2-----------A
----6------------3-----------B
----7------------3-----------B
I'm interested in querying the Table to return all instances where the same customer makes a purchase in different locations. So, for the table above, I would want:
OUTPUT
PurchaseID CustID Location
----3------------2-----------A
----4------------2-----------B
----5------------2-----------A
Any ideas on how to accomplish this? I haven't been able to think of how to do it, and most of my ideas seem like they would be pretty clunky. The database I'm using has 1MM+ records, so I don't want it to run too slowly.
Any help would be appreciated. Thanks!
SELECT *
FROM YourTable T
WHERE CustId IN (SELECT CustId
FROM YourTable
GROUP BY CustId
HAVING MIN(Location) <> MAX(Location))
You should be able to use something similar to the following:
select purchaseid, custid, location
from yourtable
where custid in (select custid
from yourtable
group by custid
having count(distinct location) >1);
See SQL Fiddle with Demo.
The subquery in the WHERE clause is returning all custids that have a total number of distinct locations that are greater than 1.
In English:
Select a row if another row exists with the same customer and a different location.
In SQL:
SELECT *
FROM atable t
WHERE EXISTS (
SELECT *
FROM atable
WHERE CustID = t.CustID
AND Location <> t.Location
);
Here's one approach using a sub-query
SELECT T1.PurchaseID
,T1.CustID
,T1.Location
FROM YourTable T1
INNER JOIN
(SELECT T2.CustID
,COUNT (DISTINCT T2.Location )
FROM YourTable T1
GROUP BY
T2.CustID
HAVING COUNT (DISTINCT T2.Location )>1
) SQ
ON SQ.CustID = T1.CustID
This should only require one full table scan.
create table test (PurchaseID number, CustID number, Location varchar2(1));
insert into test values (1,1,'A');
insert into test values (2,1,'A');
insert into test values (3,2,'A');
insert into test values (4,2,'B');
insert into test values (5,2,'A');
insert into test values (6,3,'B');
insert into test values (7,3,'A');
with repeatCustDiffLocations as (
select PurchaseID, custid, location, dense_rank () over (partition by custid order by location) r
from test)
select b.*
from repeatCustDiffLocations a, repeatCustDiffLocations b
where a.r > 1
and a.custid = b.custid;
This makes most sense to me as I was trying to return the rows with the same values throughout the table, specifically for two columns as shown in this stackoverflow answer here.
The answer to your problem in this format is:
SELECT DISTINCT a.*
FROM TEST a
INNER JOIN TEST b
ON a.CUSTOMERID = b.CUSTOMERID AND
a.LOCATION <> b.LOCATION;
However, the solution to a problem such as mine with two columns having matching values in multiple rows (2 in this instance, would yield no results because all PurchaseID's are unique):
SELECT DISTINCT a.*
FROM TEST a
INNER JOIN TEST b
ON a.CUSTOMERID = b.CUSTOMERID AND
a.PURCHASEID = b.PURCHASEID AND
a.LOCATION <> b.LOCATION;
Although, this wouldn't return the correct results based on the what needs to be queried, it shows that the query logic works
SELECT DISTINCT a.*
FROM TEST a
INNER JOIN TEST b
ON a.CUSTOMERID = b.CUSTOMERID AND
a.PURCHASEID <> b.PURCHASEID AND
a.LOCATION = b.LOCATION;
If anyone wants to try in Oracle here is the table and values to insert:
CREATE TABLE TEST (
PurchaseID integer,
CustomerID integer,
Location varchar(1));
INSERT ALL
INTO TEST VALUES (1, 1, 'A')
INTO TEST VALUES (2, 1, 'A')
INTO TEST VALUES (3, 2, 'A')
INTO TEST VALUES (4, 2, 'B')
INTO TEST VALUES (5, 2, 'A')
INTO TEST VALUES (6, 3, 'B')
INTO TEST VALUES (7, 3, 'B')
SELECT * FROM DUAL;

Parent Child Query without using SubQuery

Let say I have two tables,
Table A
ID Name
-- ----
1 A
2 B
Table B
AID Date
-- ----
1 1/1/2000
1 1/2/2000
2 1/1/2005
2 1/2/2005
Now I need this result without using sub query,
ID Name Date
-- ---- ----
1 A 1/2/2000
2 B 1/2/2005
I know how to do this using sub query but I want to avoid using sub query for some reason?
If I got your meaning right and you need the latest date from TableB, then the query below should do it:
select a.id,a.name,max(b.date)
from TableA a
join TableB b on b.aid = a.id
group by a.id,a.name
SELECT a.ID, a.Name, MAX(B.Date)
FROM TableA A
INNER JOIN TableB B
ON B.ID = A.ID
GROUP BY A.id, A.name
It's a simple aggregation. Looks like you want the highest date per id/name combo.
create table #t1 (id int, Name varchar(10))
create table #t2 (Aid int, Dt date)
insert #t1 values (1, 'A'), (2, 'B')
insert #t2 values (1, '1/1/2000'), (1, '1/2/2000'), (2, '1/1/2005'), (2, '1/2/2005')
;WITH cte (AId, MDt)
as
(
select Aid, MAX(Dt) from #t2 group by AiD
)
select #t1.Id, #t1.Name, cte.MDt
from #t1
join cte
on cte.AId = #t1.Id

Join to only the "latest" record with t-sql

I've got two tables. Table "B" has a one to many relationship with Table "A", which means that there will be many records in table "B" for one record in table "A".
The records in table "B" are mainly differentiated by a date, I need to produce a resultset that includes the record in table "A" joined with only the latest record in table "B". For illustration purpose, here's a sample schema:
Table A
-------
ID
Table B
-------
ID
TableAID
RowDate
I'm having trouble formulating the query to give me the resultset I'm looking for any help would be greatly appreciated.
SELECT *
FROM tableA A
OUTER APPLY (SELECT TOP 1 *
FROM tableB B
WHERE A.ID = B.TableAID
ORDER BY B.RowDate DESC) as B
select a.*, bm.MaxRowDate
from (
select TableAID, max(RowDate) as MaxRowDate
from TableB
group by TableAID
) bm
inner join TableA a on bm.TableAID = a.ID
If you need more columns from TableB, do this:
select a.*, b.* --use explicit columns rather than * here
from (
select TableAID, max(RowDate) as MaxRowDate
from TableB
group by TableAID
) bm
inner join TableB b on bm.TableAID = b.TableAID
and bm.MaxRowDate = b.RowDate
inner join TableA a on bm.TableAID = a.ID
table B join is optional: it depends if there are other columns you want
SELECT
*
FROM
tableA A
JOIN
tableB B ON A.ID = B.TableAID
JOIN
(
SELECT Max(RowDate) AS MaxRowDate, TableAID
FROM tableB
GROUP BY TableAID
) foo ON B.TableAID = foo.TableAID AND B.RowDate= foo.MaxRowDate
With ABDateMap AS (
SELECT Max(RowDate) AS LastDate, TableAID FROM TableB GROUP BY TableAID
),
LatestBRow As (
SELECT MAX(ID) AS ID, TableAID FROM ABDateMap INNER JOIN TableB ON b.TableAID=a.ID AND b.RowDate = LastDate GROUP BY TableAID
)
SELECT columns
FROM TableA a
INNER JOIN LatestBRow m ON m.TableAID=a.ID
INNER JOIN TableB b on b.ID = m.ID
Just for the clarity's sake and to benefit those who will stumble upon this ancient question. The accepted answer would return duplicate rows if there are duplicate RowDate in Table B. A safer and more efficient way would be to utilize ROW_NUMBER():
Select a.*, b.* -- Use explicit column list rather than * here
From [Table A] a
Inner Join ( -- Use Left Join if the records missing from Table B are still required
Select *,
ROW_NUMBER() OVER (PARTITION BY TableAID ORDER BY RowDate DESC) As _RowNum
From [Table B]
) b
On b.TableAID = a.ID
Where b._RowNum = 1
Try using this:
BEGIN
DECLARE #TB1 AS TABLE (ID INT, NAME VARCHAR(30) )
DECLARE #TB2 AS TABLE (ID INT, ID_TB1 INT, PRICE DECIMAL(18,2))
INSERT INTO #TB1 (ID, NAME) VALUES (1, 'PRODUCT X')
INSERT INTO #TB1 (ID, NAME) VALUES (2, 'PRODUCT Y')
INSERT INTO #TB2 (ID, ID_TB1, PRICE) VALUES (1, 1, 3.99)
INSERT INTO #TB2 (ID, ID_TB1, PRICE) VALUES (2, 1, 4.99)
INSERT INTO #TB2 (ID, ID_TB1, PRICE) VALUES (3, 1, 5.99)
INSERT INTO #TB2 (ID, ID_TB1, PRICE) VALUES (1, 2, 0.99)
INSERT INTO #TB2 (ID, ID_TB1, PRICE) VALUES (2, 2, 1.99)
INSERT INTO #TB2 (ID, ID_TB1, PRICE) VALUES (3, 2, 2.99)
SELECT A.ID, A.NAME, B.PRICE
FROM #TB1 A
INNER JOIN #TB2 B ON A.ID = B.ID_TB1 AND B.ID = (SELECT MAX(ID) FROM #TB2 WHERE ID_TB1 = A.ID)
END
This will fetch the latest record with JOIN. I think this will help someone
SELECT cmp.*, lr_entry.lr_no FROM
(SELECT * FROM lr_entry ORDER BY id DESC LIMIT 1)
lr_entry JOIN companies as cmp ON cmp.id = lr_entry.company_id

SQL to resequence items by groups

Lets say I have a database that looks like this:
tblA:
ID, Name, Sequence, tblBID
1 a 5 14
2 b 3 15
3 c 3 16
4 d 3 17
tblB:
ID, Group
14 1
15 1
16 2
17 3
I would like to sequence A so that the sequences go 1...n for each group of B.
So in this case, the sequences going down should be 1,2,1,1.
The ordering needs to be consistent with the current ordering, but there are no guarantees as to the current ordering.
I am not exactly a sql master and I am sure there is a fairly easy way to do this, but I really don't know the right route to take. Any hints?
If you are using SQL Server 2005+ or higher, you can use a ranking function:
Select tblA.Id, tblA.Name
, Row_Number() Over ( Partition By tblB.[Group] Order By tblA.Id ) As Sequence
, tblA.tblBID
From tblA
Join tblB
On tblB.tblBID = tblB.ID
Row_Number ranking function.
Here's another solution that would work in SQL Server 2000 and prior.
Select A.Id, A.Name
, (Select Count(*)
From tblB As B1
Where B1.[Group] = B.[Group]
And B1.Id < B.ID) + 1 As Sequence
, A.tblBID
From tblA As A
Join tblB As B
On B.Id = A.tblBID
EDIT
Also want to make it clear that I want to actually update tblA to reflect the proper sequences.
In SQL Server, you can use their proprietary From clause in an Update statement like so:
Update tblA
Set Sequence = (
Select Count(*)
From tblB As B1
Where B1.[Group] = B.[Group]
And B1.Id < B.ID
) + 1
From tblA As A
Join tblB As B
On B.Id = A.tblBID
The Hoyle ANSI solution might be something like:
Update tblA
Set Sequence = (
Select (Select Count(*)
From tblB As B1
Where B1.[Group] = B.[Group]
And B1.Id < B.ID) + 1
From tblA As A
Join tblB As B
On B.Id = A.tblBID
Where A.Id = tblA.Id
)
EDIT
Can we do that [the inner group] comparison based on A.Sequence instead of B.ID?
Select A1.*
, (Select Count(*)
From tblB As B2
Join tblA As A2
On A2.tblBID = B2.Id
Where B2.[Group] = B1.[Group]
And A2.Sequence < A1.Sequence) + 1
From tblA As A1
Join tblB As B1
On B1.Id = A1.tblBID
Because it's SQL 2000, we can't use a windowing function. That's okay.
Thomas's queries are good and will work. However, they will get worse and worse as the number of rows increases—with different characteristics depending on how wide (the number of groups) and how deep (the number of items per group). This is because those queries use a partial cross-join, perhaps we could call it a "pyramidal cross-join" where the crossing part is limited to right side values less than left side values rather than left crossing to all right values.
What to do?
I think you will be surprised to find that the following long and painful-looking script will outperform the pyramidal join at a certain size of data (which may not be all that big) and eventually, with really large data sets must be considered a screaming performer:
CREATE TABLE #tblA (
ID int identity(1,1) NOT NULL,
Name varchar(1) NOT NULL,
Sequence int NOT NULL,
tblBID int NOT NULL,
PRIMARY KEY CLUSTERED (ID)
)
INSERT #tblA VALUES ('a', 5, 14)
INSERT #tblA VALUES ('b', 3, 15)
INSERT #tblA VALUES ('c', 3, 16)
INSERT #tblA VALUES ('d', 3, 17)
CREATE TABLE #tblB (
ID int NOT NULL PRIMARY KEY CLUSTERED,
GroupID int NOT NULL
)
INSERT #tblB VALUES (14, 1)
INSERT #tblB VALUES (15, 1)
INSERT #tblB VALUES (16, 2)
INSERT #tblB VALUES (17, 3)
CREATE TABLE #seq (
seq int identity(1,1) NOT NULL,
ID int NOT NULL,
GroupID int NOT NULL,
PRIMARY KEY CLUSTERED (ID)
)
INSERT #seq
SELECT
A.ID,
B.GroupID
FROM
#tblA A
INNER JOIN #tblB B ON A.tblBID = b.ID
ORDER BY B.GroupID, A.Sequence
UPDATE A
SET A.Sequence = S.seq - X.MinSeq + 1
FROM
#tblA A
INNER JOIN #seq S ON A.ID = S.ID
INNER JOIN (
SELECT GroupID, MinSeq = Min(seq)
FROM #seq
GROUP BY GroupID
) X ON S.GroupID = X.GroupID
SELECT * FROM #tblA
DROP TABLE #seq
DROP TABLE #tblB
DROP TABLE #tblA
If I understood you correctly, then ORDER BY B.GroupID, A.Sequence is correct. If not, you can switch A.Sequence to B.ID.
Also, my index on the temp table should be experimented with. For a certain quantity of rows, and also the width and depth characteristics of those rows, clustering on one of the other two columns in the #seq table could be helpful.
Last, there is a possible different data organization possible: leaving GroupID out of the #seq table and joining again. I suspect it would be worse, but am not 100% sure.
Something like:
SELECT a.id, a.name, row_number() over (partition by b.group order by a.id)
FROM tblA a
JOIN tblB on a.tblBID = b.ID;