SQL Server - Compare on multiple columns in different rows within same table - sql

I have a table that holds the values read from 2 meters:
----------------------------
Date | MeterID | Value
----------------------------
1/2/14 A 1.3
2/2/14 A 1.8
2/2/14 B 3.8
3/3/14 A 1.2
4/3/14 A 1.8
4/3/14 B 2.9
I need a query that counts the number of days on which a reading exists for BOTH meters (A and B).
In the example above this should yield 2 as the result.
Thanks.

You can use a derived table (a subquery) to list the [Date]s that have readings for both MeterID A and B, and then COUNT() those [Date]s:
SELECT COUNT(*)
FROM ( SELECT [Date]
FROM [table]
GROUP BY [Date]
HAVING COUNT(DISTINCT [MeterID]) = 2
) t

Another solution, using set operations:
select count(*) from
(
select distinct [Date] from [table] where MeterID='A'
intersect
select distinct [Date] from [table] where MeterID='B'
) x
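Both queries above can be checked against the sample data from the question. The following is only a sketch: it runs through Python's built-in sqlite3 module rather than SQL Server, and the table name readings and the column names reading_date, meter_id and value are assumptions made for the example. The WHERE meter_id IN ('A', 'B') filter is an addition that keeps the HAVING test valid if other meters ever appear in the table.

```python
import sqlite3

# Sample readings copied from the question; table and column names are assumptions.
READINGS = [
    ("1/2/14", "A", 1.3),
    ("2/2/14", "A", 1.8),
    ("2/2/14", "B", 3.8),
    ("3/3/14", "A", 1.2),
    ("4/3/14", "A", 1.8),
    ("4/3/14", "B", 2.9),
]

# Variant 1: group by date and keep the dates that have two distinct meters.
GROUP_BY_SQL = """
SELECT COUNT(*)
FROM (SELECT reading_date
      FROM readings
      WHERE meter_id IN ('A', 'B')
      GROUP BY reading_date
      HAVING COUNT(DISTINCT meter_id) = 2) AS t
"""

# Variant 2: intersect the dates of meter A with the dates of meter B.
INTERSECT_SQL = """
SELECT COUNT(*)
FROM (SELECT reading_date FROM readings WHERE meter_id = 'A'
      INTERSECT
      SELECT reading_date FROM readings WHERE meter_id = 'B') AS x
"""


def count_days_with_both_meters(sql):
    """Load the sample data into an in-memory database and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (reading_date TEXT, meter_id TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", READINGS)
    (count,) = conn.execute(sql).fetchone()
    conn.close()
    return count


print(count_days_with_both_meters(GROUP_BY_SQL))
print(count_days_with_both_meters(INTERSECT_SQL))
```

Both calls print 2 for the sample data, matching the expected result in the question.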

Related

SQL SELECT repeating rows from table for specific time interval

I have a table and I want to find repeating rows within a specific time interval (the DATE range is an input parameter for the SQL query): the query should list all rows that share the same PERSON and TYPE values with another row in that interval.
ID DATE PERSON TYPE
1 01.01.2017 PERSON1 TYPE1
2 02.02.2017 PERSON1 TYPE1
3 03.03.2017 PERSON2 TYPE1
4 04.04.2017 PERSON2 TYPE2
5 05.05.2017 PERSON2 TYPE1
6 06.06.2017 PERSON1 TYPE2
So, for example, if DATE is between 01.01 and 04.04, it should list the rows with ID 1 and 2.
If DATE is between 01.01 and 06.06, it should list the rows with ID 1, 2, 3 and 5, because 1 and 2 have the same person and type in that interval, and so do 3 and 5.
SELECT ID FROM TABLE
WHERE DATE>='01.01.2017' AND DATE<='06.06.2017'
but I am not even sure how to start defining this repeating condition based on the PERSON and TYPE columns.
Maybe an INNER JOIN of the table with itself could help, matching on those two columns while requiring a different ID (a.PERSON=b.PERSON and a.TYPE=b.TYPE and a.ID!=b.ID, where a and b are two aliases for the same table)?
Please try...
SELECT ID AS ID
FROM tableName
JOIN
(
SELECT person AS person,
type AS type,
COUNT( person ) AS countOfPair
FROM tableName
WHERE date BETWEEN startDate AND endDate
GROUP BY person,
type
) tempTable ON tableName.person = tempTable.person AND
tableName.type = tempTable.type
WHERE countOfPair >= 2 AND
tableName.date BETWEEN startDate AND endDate
The inner SELECT gathers each combination of person and type in between your start and end dates (please replace startDate and endDate with however you are referencing those) and performs a count of them.
The outer SELECT statement's JOIN then has the effect of appending the count of each combination to the end of each row containing that combination. The outer SELECT then retrieves the ID from each row within the interval that has a repeated combination. The date filter is repeated in the outer WHERE so that rows outside the interval are not returned merely because their combination is repeated inside it.
If you have any questions or comments, then please feel free to post a Comment accordingly.
You can try this (I don't know whether your version supports window (analytic) functions):
(X is the name of your table)
SELECT Y.ID, Y.DATE, Y.PERSON, Y.TYPE
FROM (
SELECT *, COUNT(*) OVER (PARTITION BY PERSON, TYPE) AS RC
FROM X
WHERE DATE >='01.01.2017' AND DATE <='04.04.2017'
) Y
WHERE RC>1
Or this if it doesn't support them:
SELECT X.ID, X.DATE, X.PERSON, X.TYPE
FROM X
INNER JOIN (
SELECT PERSON, TYPE, COUNT(*) AS RC
FROM X
WHERE DATE >='01.01.2017' AND DATE <='04.04.2017'
GROUP BY PERSON, TYPE
) Y ON X.PERSON = Y.PERSON AND X.TYPE = Y.TYPE
WHERE RC>1
AND X.DATE >='01.01.2017' AND X.DATE <='04.04.2017'
I suggest always using an explicit conversion for date values instead of relying on how string literals are interpreted.
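To see the window-function variant against the question's data, here is a small sketch. It uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or later) instead of SQL Server, stores the dates as ISO strings so that string comparison orders them correctly, and uses the assumed names events and event_date for the table and the date column.

```python
import sqlite3

# Rows from the question, with dates rewritten as ISO strings (an assumption
# made so that plain string comparison works in SQLite).
ROWS = [
    (1, "2017-01-01", "PERSON1", "TYPE1"),
    (2, "2017-02-02", "PERSON1", "TYPE1"),
    (3, "2017-03-03", "PERSON2", "TYPE1"),
    (4, "2017-04-04", "PERSON2", "TYPE2"),
    (5, "2017-05-05", "PERSON2", "TYPE1"),
    (6, "2017-06-06", "PERSON1", "TYPE2"),
]

# Count rows per (person, type) inside the interval, then keep the repeats.
REPEATS_SQL = """
SELECT id
FROM (SELECT id,
             COUNT(*) OVER (PARTITION BY person, type) AS rc
      FROM events
      WHERE event_date BETWEEN :start_date AND :end_date) AS y
WHERE rc > 1
ORDER BY id
"""


def repeated_ids(start_date, end_date):
    """Return the IDs whose (person, type) pair occurs more than once in the interval."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER, event_date TEXT, person TEXT, type TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", ROWS)
    params = {"start_date": start_date, "end_date": end_date}
    ids = [row[0] for row in conn.execute(REPEATS_SQL, params)]
    conn.close()
    return ids


print(repeated_ids("2017-01-01", "2017-04-04"))
print(repeated_ids("2017-01-01", "2017-06-06"))
```

The first call returns [1, 2] and the second [1, 2, 3, 5], which are the two results the question asks for. Because the date filter sits inside the subquery, rows outside the interval never reach the count.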
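Another method would be:
SELECT a.id
FROM tablename a NATURAL JOIN
(SELECT person,type FROM tablename
WHERE date>='01.01.2017' AND date<='06.06.2017'
GROUP BY person, type HAVING COUNT(*)>1) b
WHERE a.date>='01.01.2017' AND a.date<='06.06.2017' ;
The NATURAL JOIN automatically joins on the columns person and type, because those are the only column names the two sides share. The outer WHERE keeps rows outside the interval from being returned.
Add a "DISTINCT" clause to avoid duplicate rows: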
SELECT DISTINCT ID FROM TABLE
WHERE DATE>='01.01.2017' AND DATE<='06.06.2017'

SQL query with grouping and MAX

I have a table that looks like the following but also has more columns that are not needed for this instance.
ID DATE Random
-- -------- ---------
1 4/12/2015 2
2 4/15/2015 2
3 3/12/2015 2
4 9/16/2015 3
5 1/12/2015 3
6 2/12/2015 3
ID is the primary key.
Random is a foreign key, but I am not actually using the table it points to.
I am trying to design a query that groups the results by Random and Date, selects the MAX Date within each group, and then gives me the associated ID.
If I run the following query
select top 100 ID, Random, MAX(Date) from DateBase group by Random, Date, ID
I get duplicate Randoms, since ID is the primary key and will always be unique.
The results I need would look something like this
ID DATE Random
-- -------- ---------
2 4/15/2015 2
4 9/16/2015 3
Another question: there could be times when several rows have the same date. What will MAX do in that case?
You can use NOT EXISTS() :
SELECT * FROM YourTable t
WHERE NOT EXISTS(SELECT 1 FROM YourTable s
WHERE s.random = t.random
AND s.date > t.date)
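This selects only the rows for which no other row with the same random value has a later date.
It can also be done using IN() with a row-value comparison (supported by e.g. Oracle, MySQL and PostgreSQL, but not by SQL Server):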
SELECT * FROM YourTable t
WHERE (t.random,t.date) in (SELECT s.random,max(s.date)
FROM YourTable s
GROUP BY s.random)
Or with a join:
SELECT t.* FROM YourTable t
INNER JOIN (SELECT s.random,max(s.date) as max_date
FROM YourTable s
GROUP BY s.random) tt
ON(t.date = tt.max_date and tt.random = t.random)
In SQL Server you could do something like the following:
select a.* from DateBase a inner join
(select Random,
MAX(dt) as dt from DateBase group by Random) as x
on a.dt =x.dt and a.random = x.random
This method should work in any SQL dialect, as it uses nothing vendor-specific (you'll need to format the dates using your vendor's syntax).
You can do this in two stages:
The first step is to work out the max date for each random:
SELECT MAX(DateField) AS MaxDateField, Random
FROM Example
GROUP BY Random
Now you can join back onto your table to get the max ID for each combination:
SELECT MAX(e.ID) AS ID
,e.DateField AS DateField
,e.Random
FROM Example AS e
INNER JOIN (
SELECT MAX(DateField) AS MaxDateField, Random
FROM Example
GROUP BY Random
) data
ON data.MaxDateField = e.DateField
AND data.Random = e.Random
GROUP BY e.DateField, e.Random
To answer your second question:
If there are multiples of the same date, the MAX(e.ID) will simply choose the highest number. If you want the lowest, you can use MIN(e.ID) instead.
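The two-stage query can be tried on the question's rows with the sketch below. It is run through Python's built-in sqlite3 module, not SQL Server; the dates are stored as ISO strings, the column random_id stands in for the Random column (the rename is an assumption), and the row with ID 7 is a hypothetical addition used only to show the tie case.

```python
import sqlite3

# Rows from the question (ID, DATE, Random), dates rewritten as ISO strings.
ROWS = [
    (1, "2015-04-12", 2),
    (2, "2015-04-15", 2),
    (3, "2015-03-12", 2),
    (4, "2015-09-16", 3),
    (5, "2015-01-12", 3),
    (6, "2015-02-12", 3),
]

# Stage 1 (subquery m): latest date per random_id.
# Stage 2: join back and take the highest id among rows that share that date.
LATEST_PER_GROUP_SQL = """
SELECT MAX(e.id) AS id, e.date_field, e.random_id
FROM example AS e
JOIN (SELECT random_id, MAX(date_field) AS max_date
      FROM example
      GROUP BY random_id) AS m
  ON m.random_id = e.random_id AND m.max_date = e.date_field
GROUP BY e.date_field, e.random_id
ORDER BY e.random_id
"""


def latest_per_group(rows):
    """Return one (id, date_field, random_id) row per random_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE example (id INTEGER PRIMARY KEY, date_field TEXT, random_id INTEGER)"
    )
    conn.executemany("INSERT INTO example VALUES (?, ?, ?)", rows)
    result = conn.execute(LATEST_PER_GROUP_SQL).fetchall()
    conn.close()
    return result


print(latest_per_group(ROWS))
# Hypothetical extra row that ties with ID 4 on the latest date for Random 3.
print(latest_per_group(ROWS + [(7, "2015-09-16", 3)]))
```

With the original rows this returns IDs 2 and 4, as requested in the question. With the tied row added, MAX(e.id) picks ID 7 for Random 3 instead of returning both rows.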

Generate SQL rows

Given a number of types and a number of occurrences per type, I would like to generate something like this in T-SQL:
Occurrence | Type
-----------------
0 | A
1 | A
0 | B
1 | B
2 | B
Both the number of types and the number of occurrences per type are presented as values in different tables.
While I can do this with WHILE loops, I'm looking for a better solution.
Thanks!
This works with a numbers table, which is what I would use.
SELECT Occurrence = ROW_NUMBER() OVER (PARTITION BY Type ORDER BY Type) - 1
, Type
FROM Numbers num
INNER JOIN #temp1 t
ON num.n BETWEEN 1 AND t.Occurrence
Tested with this sample data:
create table #temp1(Type varchar(10),Occurrence int)
insert into #temp1 VALUES('A',2)
insert into #temp1 VALUES('B',3)
How to create a numbers table? See http://sqlperformance.com/2013/01/t-sql-queries/generate-a-set-1
If you have a table with the columns type and num, you have two approaches. One way is to use recursive CTEs:
with CTE as (
select type, 0 as occurrence, num
from [table] t
union all
select type, 1 + occurrence, num
from cte
where occurrence + 1 < num
)
select cte.*
from cte;
You may have to set the MAXRECURSION option, if the number exceeds 100.
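The other way is to join in a numbers table. SQL Server ships with master..spt_values, an undocumented system table whose rows with type = 'P' hold the numbers 0 to 2047, and it is often used for this purpose:
select s.number as occurrence, t.type
from [table] t join
master..spt_values s
on s.type = 'P' and s.number < t.num ;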
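The recursive CTE approach can be sketched as follows, using the same sample data as the first answer (A with 2 occurrences, B with 3). This runs through Python's built-in sqlite3 module, where the keyword is WITH RECURSIVE; T-SQL writes plain WITH. The table name type_counts is an assumption, and the WHERE num > 0 guard in the anchor is an addition so that a type with zero occurrences produces no rows.

```python
import sqlite3

# (type, number of occurrences); the table name type_counts is an assumption.
TYPE_COUNTS = [("A", 2), ("B", 3)]

# Anchor: occurrence 0 for every type. Recursive step: add 1 until num is reached.
EXPAND_SQL = """
WITH RECURSIVE expanded(type, occurrence, num) AS (
    SELECT type, 0, num FROM type_counts WHERE num > 0
    UNION ALL
    SELECT type, occurrence + 1, num
    FROM expanded
    WHERE occurrence + 1 < num
)
SELECT occurrence, type
FROM expanded
ORDER BY type, occurrence
"""


def expand_occurrences(type_counts):
    """Return one (occurrence, type) row per occurrence of each type."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE type_counts (type TEXT, num INTEGER)")
    conn.executemany("INSERT INTO type_counts VALUES (?, ?)", type_counts)
    rows = conn.execute(EXPAND_SQL).fetchall()
    conn.close()
    return rows


for occurrence, type_name in expand_occurrences(TYPE_COUNTS):
    print(occurrence, "|", type_name)
```

For the sample data this yields occurrences 0 and 1 for A and 0, 1 and 2 for B, which is the layout shown in the question.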

SQL Server find the missing number

I have a table like below
id name year
--------------
1 A 2000
2 B 2000
2 B 2000
2 B 2000
5 C 2000
1 D 2001
3 E 2001
As you can see, in the year 2000 ids 3 and 4 are missing, and in the year 2001 id 2 is missing. I want to generate a second table that lists the missing items.
2nd table :
From-id to-id name year
--------------------------------
3 4 null 2000
2 null null 2001
Which method in a SQL query can solve my problem?
This problem is known as "Gaps and Islands in Sequences", and it is worth reading an article on it.
Here's something to get you started:
WITH cte AS
(
SELECT *
FROM
(VALUES
(1),(2),(3),(4),(5)
) Tally(number)
), cte2 as
(
SELECT DISTINCT [year]
FROM
(VALUES
(2000),(2000),(2001)
)tbl([year])
), cte3 as
(
SELECT *
FROM cte
CROSS JOIN cte2
)
SELECT *
FROM cte3
LEFT OUTER JOIN YourTable ON cte3.number = YourTable.id AND cte3.[year] = YourTable.[year]
A few notes: please avoid using reserved keywords as column names (such as year).
Furthermore, since I didn't know how you'd handle multiple missing ranges, I did not format the output to reflect a range. For example: what would your expected output be if your table contained only one row, with id=3?
I'd probably use ROW_NUMBER for this.
This query gives you what the correct ID should be (if I interpreted your question right):
SELECT
ROW_NUMBER() OVER (PARTITION BY yr ORDER BY name, yr) as "Correct ID", *
FROM misorder
It assigns a row number within each year: the number starts at 1 and increases by 1 for every further row with the same year.
And to find the rows whose id deviates from that number, I think this should be a working solution:
WITH missing AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY yr ORDER BY name, yr) as "Correct ID", *
FROM misorder
)
SELECT * FROM missing
WHERE "Correct ID" != "id"
It takes the first query as a base and selects only those records where the assumed correct ID is not equal to the currently assigned ID. You can turn this into a query that includes the ranges you mentioned, but I am not sure that is really necessary.
Hope this helps.
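For the range output the question asks for, one possible sketch is a classic gaps-and-islands query: list every expected id per year, keep the ones that are absent, and collapse consecutive missing ids into one range. This is not either answerer's query. It is run here through Python's built-in sqlite3 module (SQLite 3.25 or later for ROW_NUMBER), the names items and yr are assumptions, and it assumes ids should run from 1 up to the highest id present in each year.

```python
import sqlite3

# Rows from the question (id, name, year); table and column names are assumptions.
ITEMS = [
    (1, "A", 2000), (2, "B", 2000), (2, "B", 2000), (2, "B", 2000),
    (5, "C", 2000), (1, "D", 2001), (3, "E", 2001),
]

GAPS_SQL = """
WITH RECURSIVE
bounds AS (
    SELECT yr, MAX(id) AS max_id FROM items GROUP BY yr
),
expected(yr, id, max_id) AS (          -- every id from 1 to max_id, per year
    SELECT yr, 1, max_id FROM bounds
    UNION ALL
    SELECT yr, id + 1, max_id FROM expected WHERE id < max_id
),
missing AS (                           -- expected ids that do not occur
    SELECT e.yr, e.id,
           e.id - ROW_NUMBER() OVER (PARTITION BY e.yr ORDER BY e.id) AS grp
    FROM expected AS e
    WHERE NOT EXISTS (SELECT 1 FROM items AS i
                      WHERE i.yr = e.yr AND i.id = e.id)
)
SELECT MIN(id) AS from_id,
       CASE WHEN MAX(id) > MIN(id) THEN MAX(id) END AS to_id,  -- NULL for a single id
       yr
FROM missing
GROUP BY yr, grp
ORDER BY yr, from_id
"""


def missing_ranges(items):
    """Return (from_id, to_id, year) for each run of missing ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER, name TEXT, yr INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", items)
    rows = conn.execute(GAPS_SQL).fetchall()
    conn.close()
    return rows


print(missing_ranges(ITEMS))
```

Consecutive missing ids share the same value of id minus row number, which is what the grp column exploits. For the sample data the result is the range 3 to 4 for 2000 and the single id 2 (with a NULL to_id) for 2001.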

What's the most efficient way to match values between 2 tables based on most recent prior date?

I've got two tables in MS SQL Server:
dailyt - which contains daily data:
date val
---------------------
2014-05-22 10
2014-05-21 9.5
2014-05-20 9
2014-05-19 8
2014-05-18 7.5
etc...
And periodt - which contains data coming in at irregular periods:
date val
---------------------
2014-05-21 2
2014-05-18 1
Given a row in dailyt, I want to adjust its value by adding the corresponding value in periodt with the closest date prior or equal to the date of the dailyt row. So, the output would look like:
addt
date val
---------------------
2014-05-22 12 <- add 2 from 2014-05-21
2014-05-21 11.5 <- add 2 from 2014-05-21
2014-05-20 10 <- add 1 from 2014-05-18
2014-05-19 9 <- add 1 from 2014-05-18
2014-05-18 8.5 <- add 1 from 2014-05-18
I know that one way to do this is to join the dailyt and periodt tables on periodt.date <= dailyt.date, compute ROW_NUMBER() OVER (PARTITION BY dailyt.date ORDER BY periodt.date DESC), and then add a WHERE condition that keeps only row number 1.
Is there another way to do this that would be more efficient? Or is this pretty much optimal?
I think using APPLY would be the most efficient way:
SELECT d.Val,
p.Val,
NewVal = d.Val + ISNULL(p.Val, 0)
FROM Dailyt AS d
OUTER APPLY
( SELECT TOP 1 Val
FROM Periodt p
WHERE p.Date <= d.Date
ORDER BY p.Date DESC
) AS p;
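The same lookup can be tried outside SQL Server with a correlated scalar subquery, which is the portable counterpart of OUTER APPLY with TOP 1 when only one column is needed. The sketch below uses Python's built-in sqlite3 module, so it says nothing about SQL Server performance. The column names d_date, d_val, p_date and p_val are assumptions, and the row for 2014-05-17 is a hypothetical addition that shows the fallback to 0 when no earlier periodt row exists.

```python
import sqlite3

# Daily rows from the question plus one hypothetical earlier day (2014-05-17).
DAILY = [
    ("2014-05-22", 10), ("2014-05-21", 9.5), ("2014-05-20", 9),
    ("2014-05-19", 8), ("2014-05-18", 7.5), ("2014-05-17", 6.5),
]
PERIOD = [("2014-05-21", 2), ("2014-05-18", 1)]

# For each daily row, pick the latest periodt value dated on or before it.
ADJUST_SQL = """
SELECT d.d_date,
       d.d_val + COALESCE((SELECT p.p_val
                           FROM periodt AS p
                           WHERE p.p_date <= d.d_date
                           ORDER BY p.p_date DESC
                           LIMIT 1), 0) AS new_val
FROM dailyt AS d
ORDER BY d.d_date DESC
"""


def adjusted_values():
    """Return (date, adjusted value) for every daily row, newest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dailyt (d_date TEXT, d_val REAL)")
    conn.execute("CREATE TABLE periodt (p_date TEXT, p_val INTEGER)")
    conn.executemany("INSERT INTO dailyt VALUES (?, ?)", DAILY)
    conn.executemany("INSERT INTO periodt VALUES (?, ?)", PERIOD)
    rows = conn.execute(ADJUST_SQL).fetchall()
    conn.close()
    return rows


for d_date, new_val in adjusted_values():
    print(d_date, new_val)
```

For the question's five dates this gives 12, 11.5, 10, 9 and 8.5, matching the expected addt table; the extra day keeps its original 6.5.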
If there are relatively few periodt rows, there is another option that may prove quite efficient.
Convert periodt into a From/To ranges table using subqueries or CTEs. (Obviously performance depends on how efficiently this initial step can be done, which is why a small number of periodt rows is preferable.) Then the join to dailyt will be extremely efficient. E.g.
;WITH PIds AS (
SELECT ROW_NUMBER() OVER(ORDER BY PDate) RN, *
FROM @periodt
),
PRange AS (
SELECT f.PDate AS FromDate, t.PDate as ToDate, f.PVal
FROM PIds f
LEFT OUTER JOIN PIds t ON
t.RN = f.RN + 1
)
SELECT d.*, p.PVal
FROM @dailyt d
LEFT OUTER JOIN PRange p ON
d.DDate >= p.FromDate
AND (d.DDate < p.ToDate OR p.ToDate IS NULL)
ORDER BY 1 DESC
If you want to try the query, the following produces the sample data using table variables (declare them in the same batch, before the query). Note I added an extra row to dailyt to demonstrate the case where no periodt entry has an earlier or equal date.
DECLARE @dailyt table (
DDate date NOT NULL,
DVal float NOT NULL
)
INSERT INTO @dailyt(DDate, DVal)
SELECT '20140522', 10
UNION ALL SELECT '20140521', 9.5
UNION ALL SELECT '20140520', 9
UNION ALL SELECT '20140519', 8
UNION ALL SELECT '20140518', 7.5
UNION ALL SELECT '20140517', 6.5
DECLARE @periodt table (
PDate date NOT NULL,
PVal int NOT NULL
)
INSERT INTO @periodt
SELECT '20140521', 2
UNION ALL SELECT '20140518', 1
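The From/To range idea can also be sketched with the LEAD window function in place of the ROW_NUMBER self-join, which is a swapped-in technique rather than the query above. The example runs through Python's built-in sqlite3 module (SQLite 3.25 or later) on the same sample data; the lower-case table and column names are assumptions.

```python
import sqlite3

# Same sample data as the table variables above, with ISO date strings.
DAILY = [
    ("2014-05-22", 10), ("2014-05-21", 9.5), ("2014-05-20", 9),
    ("2014-05-19", 8), ("2014-05-18", 7.5), ("2014-05-17", 6.5),
]
PERIOD = [("2014-05-21", 2), ("2014-05-18", 1)]

# Each periodt row is valid from its own date until the next periodt date;
# the last range is open-ended (to_date IS NULL).
RANGE_SQL = """
WITH prange AS (
    SELECT p_date AS from_date,
           LEAD(p_date) OVER (ORDER BY p_date) AS to_date,
           p_val
    FROM periodt
)
SELECT d.d_date, d.d_val, p.p_val
FROM dailyt AS d
LEFT JOIN prange AS p
  ON d.d_date >= p.from_date
 AND (d.d_date < p.to_date OR p.to_date IS NULL)
ORDER BY d.d_date DESC
"""


def daily_with_period_value():
    """Return (date, daily value, matching period value or None), newest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dailyt (d_date TEXT, d_val REAL)")
    conn.execute("CREATE TABLE periodt (p_date TEXT, p_val INTEGER)")
    conn.executemany("INSERT INTO dailyt VALUES (?, ?)", DAILY)
    conn.executemany("INSERT INTO periodt VALUES (?, ?)", PERIOD)
    rows = conn.execute(RANGE_SQL).fetchall()
    conn.close()
    return rows


for row in daily_with_period_value():
    print(row)
```

The row for 2014-05-17 comes back with no period value, because no periodt entry is dated on or before it; every other day picks up 2 or 1 according to the range it falls in.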