SQL query to fetch IDs present only after a certain date - sql

The tables look like this:
Table A
------------------------
ID | C_Start_Date
------------------------
1 | 2018-03-10
2 | 2018-03-15
Table B
----------------------------
ID | Invoice_Date
----------------------------
1 | 2018-01-15
1 | 2018-02-15
1 | 2018-03-15
2 | 2018-04-01
2 | 2018-04-04
I have to fetch ONLY those IDs for which every Invoice_Date is later than their C_Start_Date.
For example, from the above tables, the query should fetch only '2', because '1' has an entry in Table B with an Invoice_Date earlier than its C_Start_Date.

Working query:
SELECT DISTINCT ID FROM A
WHERE ID NOT IN(
SELECT A.ID FROM A
INNER JOIN B ON A.ID = B.ID
WHERE B.Invoice_Date < A.C_Start_Date)
Here is a fiddle to try it out: https://www.db-fiddle.com/f/kXXXJopWvsmHccdnPAgdmv/0

If I understand your requirements, the example below will solve your problem (written in T-SQL):
declare @TableA as table
(
ID int not null
,C_Start_Date date not null
)
declare @TableB as table
(
ID int not null
,Invoice_Date date not null
)
insert into @TableA
values
(1,'2018-03-10')
,(2,'2018-03-15')
insert into @TableB
values
(1,'2018-01-15')
,(1,'2018-02-15')
,(1,'2018-03-15')
,(2,'2018-04-01')
,(2,'2018-04-04')
select a.ID
from @TableA a
where not exists
(
select * from @TableB b where b.ID = a.ID and b.Invoice_Date < a.C_Start_Date
)
While the same result can be achieved with DISTINCT and a JOIN, I used NOT EXISTS because it generally performs better than aggregates and DISTINCT.
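For anyone who wants to check the NOT EXISTS logic without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for T-SQL here, and the plain table names A and B and the variable name are my own assumptions; dates are stored as ISO text so that string comparison matches date order.

```python
import sqlite3

# In-memory SQLite database standing in for the T-SQL table variables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, C_Start_Date TEXT);
    CREATE TABLE B (ID INTEGER, Invoice_Date TEXT);
    INSERT INTO A VALUES (1, '2018-03-10'), (2, '2018-03-15');
    INSERT INTO B VALUES (1, '2018-01-15'), (1, '2018-02-15'), (1, '2018-03-15'),
                         (2, '2018-04-01'), (2, '2018-04-04');
""")

# Keep an ID only if no invoice is dated before its start date.
ids_without_early_invoice = [row[0] for row in conn.execute("""
    SELECT a.ID
    FROM A a
    WHERE NOT EXISTS (
        SELECT 1 FROM B b
        WHERE b.ID = a.ID AND b.Invoice_Date < a.C_Start_Date
    )
    ORDER BY a.ID
""")]

print(ids_without_early_invoice)  # [2]
```

Note that NOT EXISTS also returns IDs that have no rows in B at all; add an EXISTS check or an inner join if those should be excluded.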

Related

SQL group with Recursive CTE

I'm working on SQL Server 2008. I believe the answer to my question lies in a recursive CTE, but any solution would be greatly appreciated.
In the sam_DB.dbo.example table below, where the PID is not null, it links back to an ID.
ID | PID
------ | ------
1 | NULL
2 | 1
3 | 2
4 | 3
5 | NULL
6 | 5
7 | 6
8 | NULL
9 | NULL
10 | 9
I want my output to have a new field (CID) that identifies each record in a chain of linkages from PID to ID as part of a group, as per below.
ID | PID | CID
------ | ------ | ------
1 | NULL | 1
2 | 1 | 1
3 | 2 | 1
4 | 3 | 1
5 | NULL | 2
6 | 5 | 2
7 | 6 | 2
8 | NULL | 3
9 | NULL | 4
10 | 9 | 4
You're correct that you'd need a CTE for this.
You need to define the first part of the query to select the top-level records (i.e. those which have no parent):
select ID, PID, ID
from @t
where PID is null
Then, for every row added to the CTE (first the rows returned by the query above, then each row added by this second part, repeating until no new rows are added), add all records from the source table whose parent ID matches the previously added row's ID.
select t.ID, t.PID, c.CID
from cte c
inner join @t t
on t.PID = c.ID
Aside from this logic, the only other thing to be aware of is that the CID column in the first expression takes the record's own ID, whilst for records returned by the second expression it takes the parent record's CID.
Full Code
--set up the demo data
declare @t table (ID int not null, PID int null)
insert @t
values (1, null)
, (2,1)
, (3,2)
, (4,3)
, (5,null)
, (6,5)
, (7,6)
, (8,null)
, (9,null)
, (10,9)
--actual demo
;with cte (ID, PID, CID) as (
--select out top most (ancestor) records; setting CID to ID (since they're the oldest ancestor in their own chain, given they don't have parents)
select ID, PID, ID
from @t
where PID is null
union all
--select each record that is a child of the record we previously selected, holding the ancestor as the parent record's ancestor
select t.ID, t.PID, c.CID
from cte c
inner join @t t
on t.PID = c.ID
)
select *
from cte
order by ID
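The recursion above labels each chain with its root's ID (1, 5, 8, 9), whereas the question asks for consecutive group numbers (1, 2, 3, 4). A rough sketch of both, run through Python's sqlite3 module (SQLite's WITH RECURSIVE stands in for the T-SQL CTE; the table name t and the RootID column are my own assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER PRIMARY KEY, PID INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, None), (2, 1), (3, 2), (4, 3), (5, None),
     (6, 5), (7, 6), (8, None), (9, None), (10, 9)],
)

rows = conn.execute("""
    WITH RECURSIVE cte (ID, PID, RootID) AS (
        -- anchor: rows without a parent are the root of their own chain
        SELECT ID, PID, ID FROM t WHERE PID IS NULL
        UNION ALL
        -- recursive step: a child inherits the root of its parent
        SELECT t.ID, t.PID, c.RootID
        FROM cte c JOIN t ON t.PID = c.ID
    )
    SELECT c.ID, c.PID, c.RootID,
           -- number the groups 1..N by counting roots up to this chain's root
           (SELECT COUNT(*) FROM t r
            WHERE r.PID IS NULL AND r.ID <= c.RootID) AS CID
    FROM cte c
    ORDER BY c.ID
""").fetchall()

for row in rows:
    print(row)  # e.g. (6, 5, 5, 2): ID 6, parent 5, root 5, group 2
```

The correlated COUNT is only one way to turn root IDs into consecutive numbers; on engines with window functions, a ROW_NUMBER over the root rows does the same job.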
You have to use a common table expression with the ROW_NUMBER window function:
CREATE TABLE #TblTemp(ID int,PID int)
INSERT INTO #TblTemp(ID ,PID ) VALUES (1,NULL),(2,1),(3,2),(4,3),(5,NULL),(6,5),(7,6),(8,NULL),(9,NULL),(10,9)
;WITH CTE (ID, PID, CID) AS (
SELECT ID, PID, ROW_NUMBER() OVER(ORDER BY ID) RN
FROM #TBLTEMP
WHERE PID IS NULL
UNION ALL
SELECT T.ID, T.PID, C.CID
FROM CTE C
INNER JOIN #TBLTEMP T
ON T.PID = C.ID
)
SELECT *
FROM CTE
ORDER BY ID
Here is a simple example:
-- shows how to create a recursive grouping for wrongly linked or corrupted pieces of a parent/child groups
declare @t table (item varchar(2), tr int null, rc int null)
insert @t select 'a',1,9 -- no links 'a' - is group parent
insert @t select 'b',2,1 -- links to 'a'
insert @t select 'c',3,2 -- links to 'b'
insert @t select 'd',4,3 -- links to 'c'
insert @t select 'e',6,7 -- no links 'e' - is a different group
insert @t select 'f',8,2 -- links to 'b'
-- grn-group name based on a parent item name;
-- gid-group name based on a parent item id;
-- tr-transactionID ; rc-recursiveID;
-- rc_New-new recursiveID to use; rc_Old - original recursiveID
;with cte as
(
select grn=s.item, gid=s.tr, s.item, s.tr, rc_New= t.tr, rc_Old=s.rc from @t s
left join @t t on t.tr=s.rc where (t.tr is NULL or s.rc is NULL)
union all
select c.grn, c.gid,s.item, s.tr, rc_New=s.rc, rc_Old=s.rc
from cte c join @t s on s.rc=c.tr where s.rc is not NULL
)
select * from cte order by 2,3
option (MAXRECURSION 32767, FAST 100)

Calculating overlap between groups

I have a table with two columns of interest, item_id and bucket_id. There are a fixed number of values for bucket_id and I'm okay with listing them out if I need to.
Each item_id can appear multiple times, but each occurrence will have a separate bucket_id value. For example, the item_id of 123 can appear twice in the table, once under bucket_id of A, once under B.
My goal is to determine how much overlap exists between each pair of bucket_id values and display it as an N-by-N matrix.
For example, consider the following small example table:
item_id bucket_id
========= ===========
111 A
111 B
111 C
222 B
222 D
333 A
333 C
444 C
So for this dataset, buckets A and B have one item_id in common, buckets C and D have no items in common, etc.
I would like to get the above table formatted into something like the following:
A B C D
===================================
A 2 1 2 0
B 1 2 1 1
C 2 1 3 0
D 0 1 0 1
In the above table, the intersect of a row and column tells you how many records exist in both bucket_id values. For example, where the A row intersects the C column we have a 2, because there are 2 records that exist in both bucket_id A and C. Because the intersection of X and Y is the same as the intersection of Y and X, the above table is mirrored across the diagonal.
I imagine the query involves a PIVOT, but I can't for the life of me figure out how to get it working.
You can do a manual pivot with a self-join and conditional aggregation:
SELECT t1.bucket_id,
SUM( CASE WHEN t2.bucket_id = 'A' THEN 1 ELSE 0 END ) AS A,
SUM( CASE WHEN t2.bucket_id = 'B' THEN 1 ELSE 0 END ) AS B,
SUM( CASE WHEN t2.bucket_id = 'C' THEN 1 ELSE 0 END ) AS C,
SUM( CASE WHEN t2.bucket_id = 'D' THEN 1 ELSE 0 END ) AS D
FROM table1 t1
JOIN table1 t2 ON t1.item_id = t2.item_id
GROUP BY t1.bucket_id
ORDER BY 1
;
Or you can use the Oracle PIVOT clause (works on 11.2 and above):
SELECT * FROM (
SELECT t1.bucket_id AS Y_bid,
t2.bucket_id AS x_bid
FROM table1 t1
JOIN table1 t2 ON t1.item_id = t2.item_id
)
PIVOT (
count(*) FOR x_bid in ('A','B','C','D')
)
ORDER BY 1
;
Examples: http://sqlfiddle.com/#!4/39d30/7
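To sanity-check the self-join idea against the sample data from the question, here is a small sketch using Python's sqlite3 module. It lets SQL count the bucket pairs and does the pivot in Python rather than in SQL; the variable names are my own, and it assumes each (item_id, bucket_id) pair appears only once, as the question states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (item_id INTEGER, bucket_id TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(111, "A"), (111, "B"), (111, "C"), (222, "B"),
     (222, "D"), (333, "A"), (333, "C"), (444, "C")],
)

# Self-join on item_id: every item contributes one row per pair of buckets it is in.
pair_counts = {
    (row_bucket, col_bucket): n
    for row_bucket, col_bucket, n in conn.execute("""
        SELECT t1.bucket_id, t2.bucket_id, COUNT(*)
        FROM table1 t1
        JOIN table1 t2 ON t1.item_id = t2.item_id
        GROUP BY t1.bucket_id, t2.bucket_id
    """)
}

# Pivot in Python; pairs that never occur together are missing and count as 0.
buckets = ["A", "B", "C", "D"]
matrix = [[pair_counts.get((r, c), 0) for c in buckets] for r in buckets]

for name, counts in zip(buckets, matrix):
    print(name, counts)
# A [2, 1, 2, 0]
# B [1, 2, 1, 1]
# C [2, 1, 3, 0]
# D [0, 1, 0, 1]
```

The diagonal holds the number of items in each bucket, since every item pairs with itself in the self-join.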
I believe this should get you the data you need. Pivoting the table could then be done programmatically (or in Excel, etc.).
-- This gets the distinct pairs of buckets
select distinct
a.name,
b.name
from
bucket a
join bucket b
where
a.name < b.name
order by
a.name,
b.name
+ --------- + --------- +
| name | name |
+ --------- + --------- +
| A | B |
| A | C |
| A | D |
| B | C |
| B | D |
| C | D |
+ --------- + --------- +
6 rows
-- This gets the distinct pairs of buckets with the counts you are looking for
select distinct
a.name,
b.name,
count(distinct bi.item_id)
from
bucket a
join bucket b
left outer join bucket_item ai on ai.bucket_name = a.name
left outer join bucket_item bi on bi.bucket_name = b.name and ai.item_id = bi.item_id
where
a.name < b.name
group by
a.name,
b.name
order by
a.name,
b.name
+ --------- + --------- + ------------------------------- +
| name | name | count(distinct bi.item_id) |
+ --------- + --------- + ------------------------------- +
| A | B | 2 |
| A | C | 1 |
| A | D | 0 |
| B | C | 2 |
| B | D | 0 |
| C | D | 0 |
+ --------- + --------- + ------------------------------- +
6 rows
Here's the entire example with the DDL and inserts to set it up (this is in MySQL, but the same ideas apply elsewhere):
use example;
drop table if exists bucket;
drop table if exists item;
drop table if exists bucket_item;
create table bucket (
name varchar(1)
);
create table item(
id int
);
create table bucket_item(
bucket_name varchar(1) references bucket(name),
item_id int references item(id)
);
insert into bucket values ('A');
insert into bucket values ('B');
insert into bucket values ('C');
insert into bucket values ('D');
insert into item values (111);
insert into item values (222);
insert into item values (333);
insert into item values (444);
insert into item values (555);
insert into bucket_item values ('A',111);
insert into bucket_item values ('A',222);
insert into bucket_item values ('A',333);
insert into bucket_item values ('B',222);
insert into bucket_item values ('B',333);
insert into bucket_item values ('B',444);
insert into bucket_item values ('C',333);
insert into bucket_item values ('C',444);
insert into bucket_item values ('D',555);
-- query to get distinct pairs of buckets
select distinct
a.name,
b.name
from
bucket a
join bucket b
where
a.name < b.name
order by
a.name,
b.name
;
select distinct
a.name,
b.name,
count(distinct bi.item_id)
from
bucket a
join bucket b
left outer join bucket_item ai on ai.bucket_name = a.name
left outer join bucket_item bi on bi.bucket_name = b.name and ai.item_id = bi.item_id
where
a.name < b.name
group by
a.name,
b.name
order by
a.name,
b.name
;

SQL Query MAX date and some fields from other table

I have two tables, say A and B.
Table : A
ID_Sender | Date
________________________
1 | 11-13-2013
1 | 11-12-2013
2 | 11-12-2013
2 | 11-11-2013
3 | 11-13-2013
4 | 11-11-2013
Table : B
ID | Tags
_______________________
1 | Company A
2 | Company A
3 | Company C
4 | Company D
result table:
Tags | Date
____________________________
Company A | 11-13-2013
Company C | 11-13-2013
Company D | 11-11-2013
I have already tried GROUP BY with MAX(Date), but with no luck. I also tried some inner joins and subqueries but failed to produce the output.
Here is my code so far, with an image of the output attached.
SELECT E.Tags, D.[Date] FROM
(SELECT A.ID_Sender AS Sendah, MAX(A.[Date]) AS Datee
FROM tblA A
LEFT JOIN tblB B ON A.ID_Sender = B.ID
GROUP BY A.ID_Sender) C
INNER JOIN tblA D ON D.ID_Sender = C.Sendah AND D.[Date] = C.Datee
INNER JOIN tblB E ON E.ID = D.ID_Sender
Any suggestions? I'm already pulling my hair out!
(maybe you guys can just give me some sql concepts that can be helpful, the answer is not that necessary cos I really really wanted to solve it on my own :) )
Thanks!
SELECT Tags, MAX(Date) AS [Date]
FROM dbo.B INNER JOIN dbo.A
ON B.ID = A.ID_Sender
GROUP BY B.Tags
Demo
The result
Company A November, 13 2013 00:00:00+0000
Company C November, 13 2013 00:00:00+0000
Company D November, 11 2013 00:00:00+0000
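As a quick check of the join-then-group approach, here is a minimal sketch using Python's sqlite3 module with the data from the question (SQLite stands in for SQL Server). One assumption of mine: the dates are stored as ISO text (YYYY-MM-DD), because MAX on strings in MM-DD-YYYY form only sorts correctly by coincidence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID_Sender INTEGER, Date TEXT);
    CREATE TABLE B (ID INTEGER, Tags TEXT);
    INSERT INTO A VALUES (1, '2013-11-13'), (1, '2013-11-12'), (2, '2013-11-12'),
                         (2, '2013-11-11'), (3, '2013-11-13'), (4, '2013-11-11');
    INSERT INTO B VALUES (1, 'Company A'), (2, 'Company A'),
                         (3, 'Company C'), (4, 'Company D');
""")

# Join first so each date row carries its tag, then take the latest date per tag.
latest_per_tag = conn.execute("""
    SELECT B.Tags, MAX(A.Date) AS Date
    FROM B
    JOIN A ON B.ID = A.ID_Sender
    GROUP BY B.Tags
    ORDER BY B.Tags
""").fetchall()

print(latest_per_tag)
# [('Company A', '2013-11-13'), ('Company C', '2013-11-13'), ('Company D', '2013-11-11')]
```

Grouping by the tag rather than by the sender ID is what merges senders 1 and 2 into a single 'Company A' row.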
Try this, and please correct me if I'm wrong. I am assuming that in table B, ID = 2 is Company B. If that is right, go ahead with this code.
declare @table1 table(ID_Sender int, Dates varchar(20))
insert into @table1 values
( 1 , '11-13-2013'),
(1 , '11-12-2013'),
(2 ,'11-12-2013'),
(2 ,'11-11-2013'),
(3 ,'11-13-2013'),
(4 ,'11-11-2013')
declare @table2 table ( id int, tags varchar(20))
insert into @table2 values
(1 ,'Company A'),
(2 , 'Company B'),
(3 , 'Company C'),
(4 , 'Company D')
;with cte as
(
select
t1.ID_Sender, t1.Dates, t2.tags
from @table1 t1
join
@table2 t2 on t1.ID_Sender = t2.id
)
select tags, MAX(dates) as dates from cte group by tags
First change your data so that ID 2 in table B is 'Company B'.
Here's my code:
Select B.Tags, max(A.Date) as 'Date'
from A, B
where B.ID = A.ID_Sender
group by B.Tags

IF ( Count(*) > 1)

I am trying to look through two tables, TableA and TableB, and print out the TableA.ID of any record that has a count of more than 1. TableA looks like this:
ID | Code
------------
1 | A
2 | B
3 | C
Table B Looks like
ID | AID | EffectiveDate | ExpirationDate
------------------------------------------------
1 | 1 | 2012-01-01 | 2012-12-31
2 | 1 | 2012-01-01 | 2012-12-31
3 | 2 | 2012-01-01 | 2012-12-31
4 | 3 | 2012-01-01 | 2012-12-31
The Query I am using looks like this:
DECLARE @MoreThanOne varchar(250)
SET @MoreThanOne = ''
IF((SELECT COUNT(*) FROM TableA
WHERE EXISTS(
SELECT TableB.ID
,TableB.EffectiveDate
,TableB.ExpirationDate
FROM TableB
WHERE TableB.AID = TableA.ID
and GETDATE() Between TableB.EffectiveDate and TableB.ExpirationDate
)
GROUP BY TableA.Code) > 1)
BEGIN
--SET @MoreThanOne = @MoreThanOne + TableA.Code + CHAR(10)
END
PRINT @MoreThanOne
I know that my nested query works: when reworked, it prints the counts for all of the unique codes in TableA.
I know that I cannot use what I commented out because I don't have access to TableA.Code there.
My question: is there another way to do this, or how can I get access to TableA.Code for the @MoreThanOne message?
Thanks For the help!
This query will get you the codes for all AIDs that are duplicated in table B:
SELECT Code
FROM TableA
WHERE ID IN
(
SELECT AID
FROM TableB
GROUP BY AID
HAVING COUNT(*) > 1
)
You may also wish to add the date-range WHERE condition from your query to the inner select.
Try this
SELECT TableA.ID, TableA.Code, Count(*) As Cnt
FROM TableB, TableA
WHERE TableB.AID = TableA.ID
and GETDATE() Between TableB.EffectiveDate and TableB.ExpirationDate
GROUP BY TableA.ID, TableA.Code
HAVING COUNT(*) > 1
You can do this very simply, with no join:
SELECT tableb.AID
FROM TableB
WHERE GETDATE() Between TableB.EffectiveDate and TableB.ExpirationDate
group by tableb.AID
having count(*) > 1
This simply aggregates tableB by AID, returning the values with more than one record. You only need to join to TableA if you want the code as well.
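Here is a rough sketch of the GROUP BY ... HAVING approach with the sample data, run through Python's sqlite3 module (SQLite stands in for SQL Server). Because the sample rows are all valid during 2012, I pass a fixed as-of date as a parameter instead of calling GETDATE(); that parameter and the variable names are my own assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER, Code TEXT);
    CREATE TABLE TableB (ID INTEGER, AID INTEGER, EffectiveDate TEXT, ExpirationDate TEXT);
    INSERT INTO TableA VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO TableB VALUES (1, 1, '2012-01-01', '2012-12-31'),
                              (2, 1, '2012-01-01', '2012-12-31'),
                              (3, 2, '2012-01-01', '2012-12-31'),
                              (4, 3, '2012-01-01', '2012-12-31');
""")

as_of = "2012-06-15"  # hypothetical stand-in for GETDATE()

# One row per TableA record that has more than one active TableB row.
duplicates = conn.execute("""
    SELECT a.ID, a.Code, COUNT(*) AS Cnt
    FROM TableA a
    JOIN TableB b ON b.AID = a.ID
    WHERE ? BETWEEN b.EffectiveDate AND b.ExpirationDate
    GROUP BY a.ID, a.Code
    HAVING COUNT(*) > 1
    ORDER BY a.ID
""", (as_of,)).fetchall()

# Build the newline-separated message the original script tried to PRINT.
more_than_one = "\n".join(code for _id, code, _cnt in duplicates)

print(duplicates)     # [(1, 'A', 2)]
print(more_than_one)  # A
```

Because the codes come back as a result set, there is no need for the IF block: the message is assembled from the rows directly.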
This should do you:
select Code = a.Code ,
Frequency = count(*)
from table_a a
join table_b b on b.aid = a.id
and current_timestamp between b.EffectiveDate and b.ExpirationDate
group by a.Code
having count(*) > 1
order by 2 desc , -- order in descending sequence by frequency
1 -- then ascending sequence by code

SQL right join, force return only one value from right hand side

table 1
---
id , name
table2
---
id , activity, datefield
table1 'right join' table2 will return more than one result from the right table (table2). How can I make it return only one result from table2, the one with the highest date?
You gave little information about your problem, but I'll try to give an example to help you.
You have a table "A" and a table "B", and you need to fetch the "top" date of table "B" for each related row of table "A".
Example tables:
Table A:
AID| NAME
----|-----
1 | Foo
2 | Bar
Table B:
BID | AID | DateField
----| ----| ----
1 | 1 | 2000-01-01
2 | 1 | 2000-01-02
3 | 2 | 2000-01-01
If you run this SQL:
SELECT * FROM A RIGHT JOIN B ON B.AID = A.AID
You get all the information from A and B that is related by AID (which in this theoretical case is the field common to both tables that links them):
A.AID | A.NAME | B.BID | B.AID | B.DateField
------|--------|-------|-------|--------------
1 | Foo | 1 | 1 | 2000-01-01
1 | Foo | 2 | 1 | 2000-01-02
2 | Bar | 3 | 2 | 2000-01-01
But you require only the last date for each element of Table A (the top date of B).
To get only the top date, group your query by B.AID and fetch the maximum date:
SELECT
B.AID, MAX(A.NAME), MAX(B.DateField)
FROM
A RIGHT JOIN B ON B.AID = A.AID
GROUP BY
B.AID
And The result of this operation is:
B.AID | A.NAME | B.DateField
------|--------|--------------
1 | Foo | 2000-01-02
2 | Bar | 2000-01-01
In this result I removed fields that are duplicated (like A.AID and B.AID, which form the relationship between the two tables) or are not required.
Tip: this also works if you have more tables in the query. The database first builds the joined result and then applies the grouping, which limits the rows from B to the top date.
Right join table1 to a subquery that selects id and max(datefield) from table2, grouped by id, joining on table1.id.
Analytics!
Test data:
create table t1
(id number primary key,
name varchar2(20) not null
);
create table t2
(id number not null,
activity varchar2(20) not null,
datefield date not null
);
insert into t1 values (1, 'foo');
insert into t1 values (2, 'bar');
insert into t1 values (3, 'baz');
insert into t2 values (1, 'foo activity 1', date '2009-01-01');
insert into t2 values (2, 'bar activity 1', date '2009-01-01');
insert into t2 values (2, 'bar activity 2', date '2010-01-01');
Query:
select id, name, activity, datefield
from (select t1.id, t1.name, t2.id as t2_id, t2.activity, t2.datefield,
max(datefield) over (partition by t1.id) as max_datefield
from t1
left join t2
on t1.id = t2.id
)
where ( (t2_id is null) or (datefield = max_datefield) )
The outer where clause will filter out all but the maximum date from t2 tuples, but leave in the null row where there was no matching row in t2.
Results:
ID NAME ACTIVITY DATEFIELD
---------- -------- ------------------- -------------------
1 foo foo activity 1 2009-01-01 00:00:00
2 bar bar activity 2 2010-01-01 00:00:00
3 baz
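For engines without analytic functions, the same result can be sketched with a left join whose ON clause picks the latest date through a correlated subquery. This is a swapped-in technique, not the analytic query above, shown here through Python's sqlite3 module with the same test data (SQLite stands in for Oracle, and dates are stored as ISO text, which is my assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE t2 (id INTEGER NOT NULL, activity TEXT NOT NULL, datefield TEXT NOT NULL);
    INSERT INTO t1 VALUES (1, 'foo'), (2, 'bar'), (3, 'baz');
    INSERT INTO t2 VALUES (1, 'foo activity 1', '2009-01-01'),
                          (2, 'bar activity 1', '2009-01-01'),
                          (2, 'bar activity 2', '2010-01-01');
""")

# Join each t1 row only to the t2 row carrying that id's maximum date.
# t1 rows with no activity are kept, with NULLs for the t2 columns.
latest = conn.execute("""
    SELECT t1.id, t1.name, t2.activity, t2.datefield
    FROM t1
    LEFT JOIN t2
           ON t2.id = t1.id
          AND t2.datefield = (SELECT MAX(x.datefield) FROM t2 x WHERE x.id = t1.id)
    ORDER BY t1.id
""").fetchall()

for row in latest:
    print(row)
# (1, 'foo', 'foo activity 1', '2009-01-01')
# (2, 'bar', 'bar activity 2', '2010-01-01')
# (3, 'baz', None, None)
```

If two activities share the same maximum date for one id, both are returned; a tie-breaker column would be needed to force exactly one row.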
To retrieve the Top N records from a query, you can use the following syntax:
SELECT *
FROM (your ordered by datefield desc query with join) alias_name
WHERE rownum <= 1
ORDER BY rownum;
PS: I am not familiar with PL/SQL so maybe I'm wrong
My solution is:
select * from table1 right join table2 on (table1.id = table2.id and table2.datefield = (select max(datefield) from table2 where table2.id = table1.id))