SQL SELECT repeating rows from table for specific time interval

I have a table and I want to find repeating rows within a specific time interval (the DATE range is an input parameter of the SQL query). The query should list all rows in that interval that share the same PERSON and TYPE values with at least one other row.
ID DATE PERSON TYPE
1 01.01.2017 PERSON1 TYPE1
2 02.02.2017 PERSON1 TYPE1
3 03.03.2017 PERSON2 TYPE1
4 04.04.2017 PERSON2 TYPE2
5 05.05.2017 PERSON2 TYPE1
6 06.06.2017 PERSON1 TYPE2
So, for example, if DATE is between 01.01 and 04.04, it should list the rows with ID 1 and 2.
If DATE is between 01.01 and 06.06, it should list the rows with ID 1, 2, 3 and 5, because 1 and 2 have the same person and type in that interval, and 3 and 5 have the same person and type in that interval.
SELECT ID FROM TABLE
WHERE DATE>='01.01.2017' AND DATE<='06.06.2017'
I can filter on the date range as above, but I am not sure how to even start defining the repeating condition based on the PERSON and TYPE columns.
Maybe an INNER JOIN can help with this, if it references the same table and matches those two columns while the third column, ID, is different? Something like TABLE.PERSON=TABLE.PERSON and TABLE.TYPE=TABLE.TYPE and TABLE.ID!=TABLE.ID. Of course the table is the same, but a different alias can be used for each side?

Please try...
SELECT ID AS ID
FROM tableName
JOIN
(
SELECT person AS person,
type AS type,
COUNT( person ) AS countOfPair
FROM tableName
WHERE date BETWEEN startDate AND endDate
GROUP BY person,
type
) tempTable ON tableName.person = tempTable.person AND
tableName.type = tempTable.type
WHERE countOfPair >= 2
AND tableName.date BETWEEN startDate AND endDate
The inner SELECT gathers each combination of person and type between your start and end dates (please replace startDate and endDate with however you are referencing those) and counts the rows in each combination.
The outer SELECT statement's JOIN then has the effect of appending the count of each combination to the end of each row containing that combination. The outer WHERE keeps only rows whose combination is repeated and whose own date also falls within the interval; without that second condition, rows outside the interval that share a repeated combination would be returned as well.
If you have any questions or comments, then please feel free to post a Comment accordingly.
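The join-on-grouped-pairs idea above can be tried out with a minimal sketch like the following. It uses Python's built-in sqlite3 module purely as a test bed; the table name events, the column name event_date and the helper repeated_ids are hypothetical, and the dates are stored as ISO strings (an assumption) so that string comparison follows chronological order.

```python
import sqlite3

# Hypothetical schema; dates are ISO strings so that comparing the
# strings gives the same result as comparing the dates.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, event_date TEXT, person TEXT, type TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        (1, "2017-01-01", "PERSON1", "TYPE1"),
        (2, "2017-02-02", "PERSON1", "TYPE1"),
        (3, "2017-03-03", "PERSON2", "TYPE1"),
        (4, "2017-04-04", "PERSON2", "TYPE2"),
        (5, "2017-05-05", "PERSON2", "TYPE1"),
        (6, "2017-06-06", "PERSON1", "TYPE2"),
    ],
)

REPEATS_SQL = """
SELECT e.id
FROM events AS e
JOIN (
    -- person/type pairs that occur at least twice inside the interval
    SELECT person, type
    FROM events
    WHERE event_date BETWEEN :start_date AND :end_date
    GROUP BY person, type
    HAVING COUNT(*) >= 2
) AS pairs
  ON pairs.person = e.person AND pairs.type = e.type
-- the outer rows must fall inside the interval as well
WHERE e.event_date BETWEEN :start_date AND :end_date
ORDER BY e.id
"""


def repeated_ids(start_date, end_date):
    """Return the ids of rows whose person/type pair repeats in the interval."""
    params = {"start_date": start_date, "end_date": end_date}
    return [row[0] for row in conn.execute(REPEATS_SQL, params)]


print(repeated_ids("2017-01-01", "2017-04-04"))
print(repeated_ids("2017-01-01", "2017-06-06"))
```

With the sample data from the question, the two calls correspond to the two intervals described there.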

You can try this (I don't know if your version supports window (analytic) functions).
X is the name of your table:
SELECT Y.ID, Y.DATE, Y.PERSON, Y.TYPE
FROM (
SELECT *, COUNT(*) OVER (PARTITION BY PERSON, TYPE) AS RC
FROM X
WHERE DATE >='01.01.2017' AND DATE <='04.04.2017'
) Y
WHERE RC>1
Or this if it doesn't support them:
SELECT X.ID, X.DATE, X.PERSON, X.TYPE
FROM X
INNER JOIN (
SELECT PERSON, TYPE, COUNT(*) AS RC
FROM X
WHERE DATE >='01.01.2017' AND DATE <='04.04.2017'
GROUP BY PERSON, TYPE
) Y ON X.PERSON = Y.PERSON AND X.TYPE = Y.TYPE
WHERE RC>1
AND X.DATE >='01.01.2017' AND X.DATE <='04.04.2017'
I suggest always using an explicit, appropriate conversion for date values.
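As a rough check of the window-function variant, here is a small sketch run through Python's sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions); the table name events, the column event_date and the helper repeated_ids_window are hypothetical, and dates are stored as ISO strings.

```python
import sqlite3

# Assumes SQLite >= 3.25 for window function support.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, event_date TEXT, person TEXT, type TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        (1, "2017-01-01", "PERSON1", "TYPE1"),
        (2, "2017-02-02", "PERSON1", "TYPE1"),
        (3, "2017-03-03", "PERSON2", "TYPE1"),
        (4, "2017-04-04", "PERSON2", "TYPE2"),
        (5, "2017-05-05", "PERSON2", "TYPE1"),
        (6, "2017-06-06", "PERSON1", "TYPE2"),
    ],
)

WINDOW_SQL = """
SELECT id
FROM (
    -- rc = number of rows in the interval sharing this row's person/type
    SELECT id, COUNT(*) OVER (PARTITION BY person, type) AS rc
    FROM events
    WHERE event_date >= ? AND event_date <= ?
) AS counted
WHERE rc > 1
ORDER BY id
"""


def repeated_ids_window(start_date, end_date):
    """Return ids of rows whose person/type pair repeats in the interval."""
    return [row[0] for row in conn.execute(WINDOW_SQL, (start_date, end_date))]


print(repeated_ids_window("2017-01-01", "2017-04-04"))
print(repeated_ids_window("2017-01-01", "2017-06-06"))
```

Because the date filter sits inside the derived table, the count and the returned rows are both limited to the interval in a single pass.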

Another method would be:
SELECT a.id
FROM tablename a NATURAL JOIN
(SELECT person,type FROM tablename
WHERE date>='01.01.2017' AND date<='06.06.2017'
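GROUP BY person, type HAVING COUNT(*)>1) b
WHERE a.date>='01.01.2017' AND a.date<='06.06.2017';
The NATURAL JOIN automatically joins on the columns person and type, because those are the only column names the two inputs share. The outer WHERE keeps rows outside the interval from being returned.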

Add "DISTINCT" clause to avoid redundancy
SELECT DISTINCT ID FROM TABLE
WHERE DATE>='01.01.2017' AND DATE<='06.06.2017'

Related

SQL: select rows from a certain table based on conditions in this and another table

I have two tables that share IDs in a PostgreSQL database.
I would like to select certain rows from table A, based on condition Y (in table A) and on condition Z in a different table (B).
For example:
Table A
ID | type
0 | E
1 | F
2 | E
3 | F
Table B
ID | date
1 | 01.01.2022
2 | 01.01.2022
3 | 01.01.2010
IDs must be unique: the same ID can appear only once in each table, and if the same ID is in both tables it means that both rows refer to the same object.
Using an SQL query, I would like to find all cases where:
1 - the same ID exists in both tables
2 - type is F
3 - date is after 31.12.2021
And again, only rows from table A will be returned.
So the only returned row should be: 1 F
It is a bit hard to understand what problem you are actually facing, as this is very basic SQL.
Use EXISTS:
select *
from a
where type = 'F'
and exists (select null from b where b.id = a.id and dt >= date '2022-01-01');
Or IN:
select *
from a
where type = 'F'
and id in (select id from b where dt >= date '2022-01-01');
Or, as the IDs are unique in both tables, join:
select a.*
from a
join b on b.id = a.id
where a.type = 'F'
and b.dt >= date '2022-01-01';
My favorite here is the IN clause, because you want to select data from table A where conditions are met. So no join needed, just a where clause, and IN is easier to read than EXISTS.
SELECT *
FROM A
WHERE type='F'
AND id IN (
SELECT id
FROM B
WHERE DATE>='2022-01-01' -- '2022' imo should be enough, need to check
);
I don't think joining is necessary.
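To compare the three formulations side by side, here is a small sketch that runs them against the sample data using Python's sqlite3 module. SQLite stands in for PostgreSQL here, which is an assumption: it has no date '...' literal, so the dates are stored and compared as ISO strings, and the column is called dt as in the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, dt TEXT);
    INSERT INTO a VALUES (0, 'E'), (1, 'F'), (2, 'E'), (3, 'F');
    INSERT INTO b VALUES (1, '2022-01-01'), (2, '2022-01-01'), (3, '2010-01-01');
    """
)

QUERIES = {
    "exists": """
        SELECT a.id, a.type FROM a
        WHERE a.type = 'F'
          AND EXISTS (SELECT 1 FROM b WHERE b.id = a.id AND b.dt >= '2022-01-01')
    """,
    "in": """
        SELECT a.id, a.type FROM a
        WHERE a.type = 'F'
          AND a.id IN (SELECT id FROM b WHERE dt >= '2022-01-01')
    """,
    "join": """
        SELECT a.id, a.type FROM a
        JOIN b ON b.id = a.id
        WHERE a.type = 'F' AND b.dt >= '2022-01-01'
    """,
}

# Run each variant and collect its result rows.
results = {name: conn.execute(sql).fetchall() for name, sql in QUERIES.items()}
for name, rows in results.items():
    print(name, rows)
```

All three variants return the same single row on this data. The join gives the same result as the other two only because the IDs are unique in both tables.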

SQL UNION ALL only include newer entries from 'bottom' table

Fair warning: I'm new to using SQL. I do so on an Oracle server either via AQT or with SQL Developer.
As I haven't been able to think or search my way to an answer, I put myself in your able hands...
I'd like to combine data from table A (high quality data) with data from table B (fresh data) such that the entries from B are only included when their date stamps are later than those available from table A.
Both tables include entries from multiple entities, and the latest date stamp varies between those entities.
On the 4th of January, the tables may look something like:
A
entity date type value
X 1.jan 1 1
X 1.jan 0 1
X 2.jan 1 1
Y 1.jan 1 1
Y 3.jan 1 1
B
entity date type value
X 1.jan 1 2
X 1.jan 0 2
X 2.jan 1 2
X 3.jan 1 1 (new entry)
Y 1.jan 1 2
Y 3.jan 1 2
Y 4.jan 1 1 (new entry)
I have made an attempt at some code that I hope clarifies my need:
WITH
AA AS
(SELECT entity, date, SUM(value)
FROM table_A
GROUP BY
entity,
date),
BB AS
(SELECT entity, date, SUM(value)
FROM table_B
WHERE date > ALL (SELECT date FROM AA)
GROUP BY
entity,
date
)
SELECT * FROM (SELECT * FROM AA UNION ALL SELECT * FROM BB)
Now, if the WHERE date > ALL (SELECT date FROM AA) would work separately for each entity, I think I would have what I need.
That is, for each entity I want all entries from A, and only newer entries from B.
As the data in table A often differs from that of B (values are often corrected), I don't think I can use something like: table A UNION ALL (table B MINUS table A)?
Thanks
Essentially you are looking for entries in BB which do not exist in AA. When you use date > ALL (SELECT date FROM AA), the comparison does not take the entity in question into consideration, so you will not get the correct records.
An alternative is to use a JOIN and filter out all entries that have a match in AA.
Something like below.
WITH
AA AS
(SELECT entity, date, SUM(value)
FROM table_A
GROUP BY
entity,
date),
BB AS
(SELECT b.entity, b.date, SUM(b.value)
FROM table_B b
LEFT OUTER JOIN AA
ON AA.entity = b.entity
AND AA.date = b.date
WHERE AA.date IS NULL
GROUP BY
b.entity,
b.date
)
SELECT * FROM (SELECT * FROM AA UNION ALL SELECT * FROM BB)
I find your question confusing, because I don't know where the aggregation is coming from.
The basic idea on getting newer rows from table_b uses conditions in the where clause, something like this:
select . . .
from table_a a
union all
select . . .
from table_b b
where b.date > (select max(a.date) from table_a a where a.entity = b.entity);
You can, of course, run this on your CTEs, if those are what you really want to combine.
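A minimal sketch of this per-entity filter, run on the sample data through Python's sqlite3 module rather than Oracle (an assumption made only so the example is runnable). The column is named day instead of date, the year 2024 is invented because the question gives only day and month, and the extra source column is added purely to show where each row came from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_a (entity TEXT, day TEXT, type INTEGER, value INTEGER);
    CREATE TABLE table_b (entity TEXT, day TEXT, type INTEGER, value INTEGER);
    INSERT INTO table_a VALUES
      ('X', '2024-01-01', 1, 1), ('X', '2024-01-01', 0, 1), ('X', '2024-01-02', 1, 1),
      ('Y', '2024-01-01', 1, 1), ('Y', '2024-01-03', 1, 1);
    INSERT INTO table_b VALUES
      ('X', '2024-01-01', 1, 2), ('X', '2024-01-01', 0, 2), ('X', '2024-01-02', 1, 2),
      ('X', '2024-01-03', 1, 1),
      ('Y', '2024-01-01', 1, 2), ('Y', '2024-01-03', 1, 2), ('Y', '2024-01-04', 1, 1);
    """
)

COMBINED_SQL = """
SELECT 'A' AS source, a.entity AS entity, a.day AS day, a.type AS type, a.value AS value
FROM table_a AS a
UNION ALL
SELECT 'B', b.entity, b.day, b.type, b.value
FROM table_b AS b
-- keep a B row only if it is newer than the latest A row of the same entity
WHERE b.day > (SELECT MAX(a2.day) FROM table_a AS a2 WHERE a2.entity = b.entity)
ORDER BY source, entity, day, type
"""

rows = conn.execute(COMBINED_SQL).fetchall()
from_b = [row for row in rows if row[0] == "B"]
for row in rows:
    print(row)
```

Note that an entity present only in table B would be dropped by this condition, because MAX over no rows is NULL and the comparison is then not true; wrapping the subquery in COALESCE with a very early date is one way to keep such rows.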
Use UNION instead of UNION ALL; it removes duplicate records.
SELECT * FROM (
SELECT *
FROM AA
UNION
SELECT *
FROM BB )

SQL - Count Results of 2 Columns

I have the following table which contains ID's and UserId's.
ID UserID
1111 11
1111 300
1111 51
1122 11
1122 22
1122 3333
1122 45
I'm trying to count the distinct number of IDs so that I get a total, but I also need a total of the UserIDs that have seen each particular ID. To get the IDs, I have to run a subquery against another table and pass its result into the main query. Now I just want the results to be displayed as follows.
So I get a total number of IDs and a total number of UserIDs. I would also like to add another column with the average number of UserIDs per ID.
TotalID Total_UserID Average
2 7 3.5
If possible I would also like to get an average, but I am not sure how to calculate that. I would need to count all the UserIDs for each ID, add them all together and then find the average. (Any advice on that calculation would be appreciated.)
Current Query.
SELECT DISTINCT(a.ID)
,COUNT(b.UserID)
FROM a
INNER JOIN b ON someID = someID
WHERE a.ID IN ( SELECT ID FROM c WHERE GROUPID = 9999)
GROUP BY a.ID
This lists all the IDs and counts the UserIDs for each one. I would like a total of both columns. I've tried wrapping the query in a
SELECT COUNT(*) FROM (
but this only counts the IDs, which is great, but how do I count the UserID column as well?
You seem to want this:
SELECT COUNT(DISTINCT a.ID), COUNT(b.UserID),
COUNT(b.UserID) * 1.0 / COUNT(DISTINCT a.ID)
FROM a INNER JOIN
b
ON someID = someID
WHERE a.ID IN ( SELECT ID FROM c WHERE GROUPID = 9999);
Note: DISTINCT is not a function. It applies to the whole row, so it is misleading to put an expression in parentheses after it.
Also, the GROUP BY is unnecessary.
The 1.0 is because SQL Server does integer arithmetic and this is a simple way to convert a number to a decimal form.
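The three aggregates can be checked against the sample data with a short sketch. It uses Python's sqlite3 module and a single hypothetical table named visits in place of the joined a, b and c tables, whose join conditions are not given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single table standing in for the result of the original joins.
conn.execute("CREATE TABLE visits (id INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        (1111, 11), (1111, 300), (1111, 51),
        (1122, 11), (1122, 22), (1122, 3333), (1122, 45),
    ],
)

TOTALS_SQL = """
SELECT COUNT(DISTINCT id)                        AS total_id,
       COUNT(user_id)                            AS total_user_id,
       -- multiply by 1.0 to force non-integer division
       COUNT(user_id) * 1.0 / COUNT(DISTINCT id) AS average
FROM visits
"""

total_id, total_user_id, average = conn.execute(TOTALS_SQL).fetchone()
print(total_id, total_user_id, average)
```

There is no GROUP BY, so the aggregates run over the whole result set and the query returns exactly one row.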
You can use
SELECT COUNT(DISTINCT a.ID) ...
to count all distinct values
I believe you want this:
select TotalID,
Total_UserID,
Total_UserID + TotalID as Total,
Total_UserID * 1.0 / TotalID as Average
from (
SELECT COUNT(DISTINCT a.ID) as TotalID
,COUNT(b.UserID) as Total_UserID
FROM a
INNER JOIN b ON someID = someID
WHERE a.ID IN ( SELECT ID FROM c WHERE GROUPID = 9999)
) x

SQL query with grouping and MAX

I have a table that looks like the following but also has more columns that are not needed for this instance.
ID DATE Random
-- -------- ---------
1 4/12/2015 2
2 4/15/2015 2
3 3/12/2015 2
4 9/16/2015 3
5 1/12/2015 3
6 2/12/2015 3
ID is the primary key
Random is a foreign key, but I am not actually using the table it points to.
I am trying to design a query that groups the results by Random and Date, selects the MAX Date within each grouping, and then gives me the associated ID.
If I run the following query
select top 100 ID, Random, MAX(Date) from DateBase group by Random, Date, ID
I get duplicate Randoms, since ID is the primary key and will always be unique.
The results I need would look something like this
ID DATE Random
-- -------- ---------
2 4/15/2015 2
4 9/16/2015 3
Another question: there could be times when several rows have the same date. What will MAX do in that case?
You can use NOT EXISTS() :
SELECT * FROM YourTable t
WHERE NOT EXISTS(SELECT 1 FROM YourTable s
WHERE s.random = t.random
AND s.date > t.date)
This will select only those rows that have no later date for the corresponding random value.
Can also be done using IN() :
SELECT * FROM YourTable t
WHERE (t.random,t.date) in (SELECT s.random,max(s.date)
FROM YourTable s
GROUP BY s.random)
Or with a join:
SELECT t.* FROM YourTable t
INNER JOIN (SELECT s.random,max(s.date) as max_date
FROM YourTable s
GROUP BY s.random) tt
ON(t.date = tt.max_date and tt.random = t.random)
In SQL Server you could do something like the following,
select a.* from DateBase a inner join
(select Random,
MAX(dt) as dt from DateBase group by Random) as x
on a.dt =x.dt and a.random = x.random
This method will work in all versions of SQL as there are no vendor specifics (you'll need to format the dates using your vendor specific syntax)
You can do this in two stages:
The first step is to work out the max date for each random:
SELECT MAX(DateField) AS MaxDateField, Random
FROM Example
GROUP BY Random
Now you can join back onto your table to get the max ID for each combination:
SELECT MAX(e.ID) AS ID
,e.DateField AS DateField
,e.Random
FROM Example AS e
INNER JOIN (
SELECT MAX(DateField) AS MaxDateField, Random
FROM Example
GROUP BY Random
) data
ON data.MaxDateField = e.DateField
AND data.Random = e.Random
GROUP BY e.DateField, e.Random
To answer your second question:
If there are multiples of the same date, the MAX(e.ID) will simply choose the highest number. If you want the lowest, you can use MIN(e.ID) instead.
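The two-stage approach, including the tie case, can be sketched as follows with Python's sqlite3 module. The names example, date_field and random_id are hypothetical stand-ins for the question's table and columns, dates are stored as ISO strings, and row 7 is an invented extra row added only to create a tie on the latest date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE example (id INTEGER PRIMARY KEY, date_field TEXT, random_id INTEGER)"
)
conn.executemany(
    "INSERT INTO example VALUES (?, ?, ?)",
    [
        (1, "2015-04-12", 2),
        (2, "2015-04-15", 2),
        (3, "2015-03-12", 2),
        (4, "2015-09-16", 3),
        (5, "2015-01-12", 3),
        (6, "2015-02-12", 3),
    ],
)

LATEST_SQL = """
SELECT MAX(e.id) AS id, e.date_field, e.random_id
FROM example AS e
JOIN (
    -- stage 1: latest date per random_id
    SELECT random_id, MAX(date_field) AS max_date
    FROM example
    GROUP BY random_id
) AS m
  ON m.random_id = e.random_id AND m.max_date = e.date_field
-- stage 2: collapse ties on the latest date to the highest id
GROUP BY e.date_field, e.random_id
ORDER BY e.random_id
"""

latest = conn.execute(LATEST_SQL).fetchall()
print(latest)

# Invented row that ties with id 4 on the latest date for random_id 3.
conn.execute("INSERT INTO example VALUES (7, '2015-09-16', 3)")
latest_with_tie = conn.execute(LATEST_SQL).fetchall()
print(latest_with_tie)
```

Without the outer MAX and GROUP BY, the tie would produce two rows for random_id 3 instead of one.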

SQL Query to generate an extra field from data in the table

I have a table with 3 fields like this sample table Tbl1
Person Cost FromDate
1 10 2009-1-1
1 20 2010-1-1
2 10 2009-1-1
I want to query it and get back the 3 fields and a generated field called ToDate that defaults to 2099-1-1 unless there is an actual ToDate implied from another entry for the person in the table.
select Person,Cost,FromDate,ToDate From Tbl1
Person Cost FromDate ToDate
1 10 2009-1-1 2010-1-1
1 20 2010-1-1 2099-1-1
2 10 2009-1-1 2099-1-1
You can select the minimum date from all of the same person's dates that are after the record's date. If there is none you get NULL. With COALESCE you change NULL into the default date:
select
Person,
Cost,
FromDate,
coalesce((select min(FromDate) from Tbl1 later where later.Person = Tbl1.Person and later.FromDate > Tbl1.FromDate), '2099-01-01') as ToDate
From Tbl1
order by Person, FromDate;
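Here is a small sketch of the correlated-subquery approach on the sample data, run with Python's sqlite3 module. It assumes the look-up is restricted to rows of the same person, which is what the expected output in the question requires; the lower-case table and column names are hypothetical, and dates are stored as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl1 (person INTEGER, cost INTEGER, from_date TEXT)")
conn.executemany(
    "INSERT INTO tbl1 VALUES (?, ?, ?)",
    [
        (1, 10, "2009-01-01"),
        (1, 20, "2010-01-01"),
        (2, 10, "2009-01-01"),
    ],
)

TO_DATE_SQL = """
SELECT t.person,
       t.cost,
       t.from_date,
       COALESCE(
           -- earliest later start date of the same person, if any
           (SELECT MIN(later.from_date)
            FROM tbl1 AS later
            WHERE later.person = t.person
              AND later.from_date > t.from_date),
           '2099-01-01'
       ) AS to_date
FROM tbl1 AS t
ORDER BY t.person, t.from_date
"""

periods = conn.execute(TO_DATE_SQL).fetchall()
for row in periods:
    print(row)
```

If the person condition were left out, person 2 would pick up person 1's 2010-01-01 start date as its end date instead of the default.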
Although Thorsten's answer is perfectly fine, it would be more efficient to use window-functions to match the derived end-dates.
;WITH nbrdTbl
AS ( SELECT Person, Cost, FromDate, row_nr = ROW_NUMBER() OVER (PARTITION BY Person ORDER BY FromDate ASC)
FROM Tbl1)
SELECT t.Person, t.Cost, t.FromDate, derived_end_date = COALESCE(nxt.FromDate, '20990101')
FROM nbrdTbl t
LEFT OUTER JOIN nbrdTbl nxt
ON nxt.Person = t.Person
AND nxt.row_nr = t.row_nr + 1
ORDER BY t.Person, t.FromDate
In a test on a 2,000-row table it is about 3 times as efficient according to the execution plan (78% vs 22%).