How to bring the most recent value? - sql

I need to know my organisation's sectors, but my query returns every code the organisation has had since it was inserted into the system, grouped by its tag.
The most accurate query I have so far takes the MAX of the update timestamps, but I don't want all the records, just the most recent one.
Expected result: only the most recent value (B.code), discarding all the older B.code values.
SELECT A.organisation_ref, A.name, A.block_level, B.code_type, B.code , MAX(B.update_timestamp)
FROM [TB1].[DBO].[ORG] AS A
INNER JOIN [TB2].[DBO].[CODE] AS B
ON A.organisation_ref = B.organisation_ref AND B.CountryID = '76'
WHERE B.code_type = '1005'
GROUP BY A.organisation_ref, A.name, A.block_level, B.code_type, B.code
ORDER BY A.organisation_ref ASC
Result so Far:
organisation_ref organisation_name block_level code_type code update_timestamp
1 contoso A 7 1005 IAC 2008-05-12 19:27:41.567
1 contoso A 7 1005 IAE 2015-03-30 20:51:20.693
1 contoso A 7 1005 IN NULL
1 contoso A 7 1005 INE 2014-11-19 09:51:00.417
1 contoso A 7 1005 IQQ 2015-08-05 17:22:28.763
4 contoso B 0 1005 CUU 2011-10-25 11:34:58.420
4 contoso B 0 1005 DAB 2012-05-02 17:15:38.667
4 contoso B 0 1005 LLH 2015-10-08 08:25:43.260

You can use apply:
SELECT o.organisation_ref, o.name, o.block_level, c.code_type, c.code
FROM [TB1].[DBO].[ORG] o CROSS APPLY
(SELECT TOP (1) c.*
FROM [TB2].[DBO].[CODE] c
WHERE c.organisation_ref = o.organisation_ref AND
c.CountryID = '76' AND
c.code_type = '1005'
ORDER BY c.update_timestamp DESC
) c
ORDER BY o.organisation_ref ASC;
Notice that I fixed your table aliases so they are meaningful abbreviations for the tables, rather than meaningless arbitrary letters.
Also, if CountryID and code_type are strings, then the comparison to strings is fine. Otherwise, drop the single quotes so numbers are compared to numbers.
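For anyone who wants to try the "latest row per organisation" idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no CROSS APPLY, so this swaps in ROW_NUMBER() (it needs SQLite 3.25 or later) to pick the newest code per organisation. The lower-case table names `org` and `code` and the text-typed columns are assumptions made for the demo; the rows are the sample from the question.

```python
import sqlite3

# In-memory stand-ins for the ORG and CODE tables (names and types are assumed).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE org (organisation_ref INTEGER, name TEXT, block_level INTEGER);
CREATE TABLE code (organisation_ref INTEGER, code_type TEXT, code TEXT,
                   CountryID TEXT, update_timestamp TEXT);
INSERT INTO org VALUES (1, 'contoso A', 7), (4, 'contoso B', 0);
INSERT INTO code VALUES
  (1, '1005', 'IAC', '76', '2008-05-12 19:27:41.567'),
  (1, '1005', 'IAE', '76', '2015-03-30 20:51:20.693'),
  (1, '1005', 'IN',  '76', NULL),
  (1, '1005', 'INE', '76', '2014-11-19 09:51:00.417'),
  (1, '1005', 'IQQ', '76', '2015-08-05 17:22:28.763'),
  (4, '1005', 'CUU', '76', '2011-10-25 11:34:58.420'),
  (4, '1005', 'DAB', '76', '2012-05-02 17:15:38.667'),
  (4, '1005', 'LLH', '76', '2015-10-08 08:25:43.260');
""")

# Number each organisation's codes from newest to oldest and keep number 1.
# NULL timestamps sort last under DESC in SQLite, so they never win.
latest_codes = conn.execute("""
SELECT o.organisation_ref, o.name, c.code
FROM org o
JOIN (
    SELECT c.*,
           ROW_NUMBER() OVER (PARTITION BY c.organisation_ref
                              ORDER BY c.update_timestamp DESC) AS rn
    FROM code c
    WHERE c.CountryID = '76' AND c.code_type = '1005'
) c ON c.organisation_ref = o.organisation_ref AND c.rn = 1
ORDER BY o.organisation_ref
""").fetchall()

print(latest_codes)
```

On the sample rows this keeps IQQ for organisation 1 and LLH for organisation 4, one row per organisation.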

Related

How to count the last records of a given status?

I need to count the last few months a member has had a D status.
For example, I have the table below, where I have the months from February to August for 2 members.
year_month  member_id  status
2020_02     1010       D
2020_03     1010       D
2020_04     1010       D
2020_05     1010       A
2020_06     1010       A
2020_07     1010       D
2020_08     1010       D
2020_02     1030       A
2020_03     1030       A
2020_04     1030       A
2020_05     1030       D
2020_06     1030       A
2020_07     1030       A
2020_08     1030       D
I need to count the number of months a member has been in D status in a row. In this example the expected result would be:
member_id  count status D
1010       2
1030       1
For member 1010 I need to count July and August, because in June he had A status.
Can anyone help me, please?
I'm a beginner and I have no idea how I can do this.
We can first filter each member's rows down to the latest D records, that is, D rows with no later non-D row for the same member, and then aggregate by member to get the counts.
SELECT member_id, COUNT(*) AS count_status_D
FROM
(
SELECT member_id
FROM yourTable t1
WHERE status = 'D' AND
NOT EXISTS (SELECT 1
FROM yourTable t2
WHERE t2.member_id = t1.member_id AND
t2.year_month > t1.year_month AND
t2.status <> 'D')
) t
GROUP BY member_id;
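To check the NOT EXISTS query above against the sample data, here is a small sketch that loads the question's table into an in-memory SQLite database through Python's sqlite3 module. The text column types are an assumption; the table name `yourTable` is the placeholder used in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (year_month TEXT, member_id INTEGER, status TEXT)"
)

# Sample data from the question: one status letter per month, Feb to Aug 2020.
months = ["2020_02", "2020_03", "2020_04", "2020_05",
          "2020_06", "2020_07", "2020_08"]
statuses = {1010: "DDDAADD", 1030: "AAADAAD"}
rows = [(month, member, status)
        for member, sequence in statuses.items()
        for month, status in zip(months, sequence)]
conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)

# Keep only D rows that have no later non-D row, then count per member.
trailing_d = dict(conn.execute("""
SELECT member_id, COUNT(*) AS count_status_D
FROM (
    SELECT member_id
    FROM yourTable t1
    WHERE status = 'D' AND
          NOT EXISTS (SELECT 1
                      FROM yourTable t2
                      WHERE t2.member_id = t1.member_id AND
                            t2.year_month > t1.year_month AND
                            t2.status <> 'D')
) t
GROUP BY member_id
""").fetchall())

print(trailing_d)
```

One thing to keep in mind: a member whose latest month is not D has no qualifying rows and is simply absent from the result rather than shown with a count of 0.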
With SQL you could use for example:
SELECT COUNT(DISTINCT member_ID) AS member_ID FROM Table WHERE Status = 'D';
For more info: http://www-db.deis.unibo.it/courses/TW/DOCS/w3schools/sql/sql_func_count.asp.html

SQL Query Problem Involving (SUM, Group By, Order by, I guess? and maybe total, or even count)

Using a SQL query, find the top 5 stores by total Transaction Value. Which industries are they in, and how many of those stores are in each industry?
My SQL data looks like this:
Store Name  Industry  Transaction Value
Ace         A         196
Ace         A         193
Area        A         168
Apple       A         165
Boy         B         145
Boy         B         143
Bull        B         136
Bread       B         131
Cat         C         116
Cat         C         106
Cake        C         104
Candy       C         102
Dog         D         101
Dog         D         92
Door        D         80
Daddy       D         75
Egg         E         70
Egg         E         67
Earl        E         66
Eagle       E         61
This is just for your reference, Top 5 highest Transaction Value are:
No.  Store Name  Industry  Total Transaction Value
1    Ace         A         389
2    Boy         B         288
3    Cat         C         222
4    Dog         D         193
5    Area        A         168
SQL Query Results should look something like this:
Industry  No. of Stores
A         2
B         1
C         1
D         1
E         0
select a.industry, sum(case when b.name is null then 0 else 1 end) as no
from
(select distinct industry from transactions ) a
left join
(select name, industry
from transactions
group by name, industry
order by sum(transaction_value) desc limit 5) b
on a.industry = b.industry
group by a.industry
order by a.industry
I think I have a solution for you. I used a common table expression, CASE, SUM and GROUP BY:
WITH CTE AS
(
SELECT industry, SUM(TransactionValue) AS Transaction_Value,
COUNT(StoreName) AS StoreCount FROM MYTable
GROUP BY StoreName,industry
ORDER BY SUM(TransactionValue) DESC
Limit 5
)
SELECT T1.industry,
SUM((CASE WHEN c.industry IS NULL THEN 0
ELSE 1 END)) as CT
FROM
(SELECT DISTINCT Industry FROM MYTable) AS T1
LEFT JOIN CTE as c ON T1.industry=c.industry
GROUP BY T1.industry
Note: a subquery is not best practice, but in your case I don't think it will cause a performance issue. Also, please check the code: I do not have a Snowflake database installed, so there may be some syntax errors.
To get a deterministic result, you must be aware of ties. Let's say the top 9 results are
Cat/A/600, Dog/A/500, Cat/B/500, Dog/B/400, Cat/C/300, Dog/C/300, Cat/D/300, Dog/D/200, Cat/E/100
Which is the top fifth? Cat/C/300 or Dog/C/300 or Cat/D/300? Or none of them? If we pick a row arbitrarily (by LIMIT 5 or FETCH FIRST 5 ROWS ONLY) we prefer one industry over another.
In standard SQL we have the clause FETCH FIRST 5 ROWS WITH TIES, but Snowflake doesn't feature this, unfortunately. It does however feature DENSE_RANK. It ranks my sample rows thus:
#1: Cat/A/600
#2: Dog/A/500
#2: Cat/B/500
#3: Dog/B/400
#4: Cat/C/300
#4: Dog/C/300
#4: Cat/D/300
#5: Dog/D/200
#6: Cat/E/100
because the five top values are 600, 500, 400, 300, and 200.
The query:
select industry, count(case when rnk <= 5 then 1 end) as stores
from
(
select industry, dense_rank() over (order by sum(transaction_value) desc) as rnk
from mytable
group by store_name, industry
) ranked
group by industry
order by industry;
If you only want to show top industries:
select industry, count(*) as stores
from
(
select industry, dense_rank() over (order by sum(transaction_value) desc) as rnk
from mytable
group by store_name, industry
) ranked
where rnk <= 5
group by industry
order by industry;
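To see how DENSE_RANK treats the ties in the nine sample rows above, here is a rough sketch run against in-memory SQLite from Python (window functions need SQLite 3.25 or later). It is not Snowflake, so treat it only as a check of the logic. As an assumption for portability, the per-store totals are computed in an inner subquery first and ranked in a second step, rather than ranking directly over the aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (store_name TEXT, industry TEXT, transaction_value INTEGER)"
)
# The tie-heavy sample from the answer: store / industry / total.
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    ("Cat", "A", 600), ("Dog", "A", 500), ("Cat", "B", 500),
    ("Dog", "B", 400), ("Cat", "C", 300), ("Dog", "C", 300),
    ("Cat", "D", 300), ("Dog", "D", 200), ("Cat", "E", 100),
])

stores_per_industry = conn.execute("""
SELECT industry, COUNT(CASE WHEN rnk <= 5 THEN 1 END) AS stores
FROM (
    -- Rank the store totals; equal totals share a rank.
    SELECT industry, DENSE_RANK() OVER (ORDER BY total DESC) AS rnk
    FROM (
        SELECT store_name, industry, SUM(transaction_value) AS total
        FROM mytable
        GROUP BY store_name, industry
    ) AS totals
) AS ranked
GROUP BY industry
ORDER BY industry
""").fetchall()

print(stores_per_industry)
```

All three stores tied at 300 are counted, so eight stores fall within the five top values and only industry E ends up with 0.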

How to assign filters to row number () function in sql

I am trying to extract, for each case, only the first row after the row with Name = System where the town is not Austin.
In case 1001 there are 8 rows and row 4 is System; the output should be only the row with Name = Terry and Date Moved = 7/4/2019 (the next entry whose town is not Austin).
Case Name Town Date Moved Row #(Not in table)
1001 Ted Madisson 9/7/2018 1
1001 Joyal Boston 10/4/2018 2
1001 Beatrice Chicago 1/1/2019 3
1001 System Chicago 1/5/2019 4
1001 John Austin 4/11/2019 5
1001 Simon Austin 6/11/2019 6
1001 Terry Cleveland 7/4/2019 7
1001 Hawkins Newyork 8/4/2019 8
1002 Devon Boston 12/4/2018 1
1002 Joy Austin 12/7/2018 2
1002 Rachael Newyork 12/19/2018 3
1002 Bill Chicago 1/4/2019 4
1002 System Dallas 2/12/2019 5
1002 Phil Austin 3/16/2019 6
1002 Dan Seattle 5/18/2019 7
1002 Claire Birmingham 7/7/2019 8
I tried a subquery with the ROW_NUMBER function and a NOT IN ('Austin') filter:
ROW_NUMBER() OVER(PARTITION BY Case ORDER BY Moved_date ASC) AS ROWNUM
Please note there are > 10k cases.
You can try the script below:
WITH CTE AS
(
SELECT [Case],[Name],Town,[Date Moved],
ROW_NUMBER() OVER (PARTITION BY [Case] ORDER BY [Date Moved]) [Row #]
FROM your_table
)
SELECT A.*
FROM CTE A
INNER JOIN
(
SELECT C.[Case],C.Town,MAX(C.[Row #]) MRN
FROM CTE C
INNER JOIN
(
SELECT *
FROM CTE A
WHERE A.Name = 'System'
)D ON C.[Case] = D.[Case] AND C.[Row #] > D.[Row #]
AND C.Town = 'Austin'
GROUP BY C.[Case],C.Town
)B ON A.[Case] = B.[Case] AND A.[Row #] = B.MRN+1
Output is -
Case Name Town Date Moved Row #
1001 Terry Cleveland 7/4/2019 7
1002 Dan Seattle 5/18/2019 7
Here are three possibilities. I'm still concerned about ties though. The first one will return multiple rows while the others only one per case:
-- Option 1: join back on the earliest qualifying date (can return ties)
with matches as (
select t1."case", min(t2."Date Moved") as "Date Moved"
from Movements t1 inner join Movements t2 on t1."case" = t2."case"
where t1.name = 'System' and t2.Town <> 'Austin'
and t2."Date Moved" > t1."Date Moved"
group by t1."case"
)
select t.*
from Movements t inner join matches m
on m."case" = t."case" and m."Date Moved" = t."Date Moved";
-- Option 2: cross apply, correlated on the case
select nxt.*
from Movements m1 cross apply (
select top 1 m2.* from Movements m2
where m2."case" = m1."case"
and m2.Town <> 'Austin' and m2."Date Moved" > m1."Date Moved"
order by m2."Date Moved"
) as nxt
where m1.name = 'System';
-- Option 3: a running count of 'System' rows flags the rows that follow it
with m1 as (
select *,
count(case when name = 'System' then 1 end) over (partition by "case" order by "Date Moved") as flag
from Movements
), m2 as (
select *,
row_number() over (partition by "case" order by "Date Moved") as rn
from m1
where flag = 1 and name <> 'System' and Town <> 'Austin'
)
select * from m2 where rn = 1;
I'm basically assuming this is SQL Server. You might need a few minor tweaks if not.
It also does not require a town named Austin to fall between the "System" row and the desired row as I do not believe that was a stated requirement.
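The running-count approach (flag the rows that come after the System row, then take the first non-Austin one) can be tried on the question's sample data with a sketch like the one below. It uses Python's sqlite3 module rather than SQL Server and needs SQLite 3.25 or later. The column names `case_id` and `date_moved` and the ISO-formatted date strings are assumptions made so the rows sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Movements (case_id INTEGER, name TEXT, town TEXT, date_moved TEXT)"
)
# Sample rows from the question, with dates rewritten as yyyy-mm-dd.
conn.executemany("INSERT INTO Movements VALUES (?, ?, ?, ?)", [
    (1001, "Ted", "Madisson", "2018-09-07"),
    (1001, "Joyal", "Boston", "2018-10-04"),
    (1001, "Beatrice", "Chicago", "2019-01-01"),
    (1001, "System", "Chicago", "2019-01-05"),
    (1001, "John", "Austin", "2019-04-11"),
    (1001, "Simon", "Austin", "2019-06-11"),
    (1001, "Terry", "Cleveland", "2019-07-04"),
    (1001, "Hawkins", "Newyork", "2019-08-04"),
    (1002, "Devon", "Boston", "2018-12-04"),
    (1002, "Joy", "Austin", "2018-12-07"),
    (1002, "Rachael", "Newyork", "2018-12-19"),
    (1002, "Bill", "Chicago", "2019-01-04"),
    (1002, "System", "Dallas", "2019-02-12"),
    (1002, "Phil", "Austin", "2019-03-16"),
    (1002, "Dan", "Seattle", "2019-05-18"),
    (1002, "Claire", "Birmingham", "2019-07-07"),
])

first_after_system = conn.execute("""
WITH flagged AS (
    -- flag becomes 1 on the System row and stays 1 for the rows after it
    SELECT *,
           COUNT(CASE WHEN name = 'System' THEN 1 END)
               OVER (PARTITION BY case_id ORDER BY date_moved) AS flag
    FROM Movements
), candidates AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY case_id ORDER BY date_moved) AS rn
    FROM flagged
    WHERE flag = 1 AND name <> 'System' AND town <> 'Austin'
)
SELECT case_id, name, town, date_moved
FROM candidates
WHERE rn = 1
ORDER BY case_id
""").fetchall()

print(first_after_system)
```

This assumes one System row per case; with several, `flag = 1` only covers the stretch between the first and second System rows.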

SQL Prevent Multiple Counting

I found some similar SO questions regarding my issue and they fixed it by using sub-queries but I can't seem to apply it on my situation.
Goal
My goal is to count the animals that have been a breeder at least once in their lifespan.
I have 2 tables to keep track of when an animal became a breeder. Here's a simple look of how the tables are structured:
Animals (a)
id name
-------------------
100 Mouse
101 Cow
102 Pig
103 Dog
Breeding History (bh)
id animal_id code date
--------------------------------------------
500 100 B 2016-01-12
501 100 A 2016-01-25
502 101 B 2016-01-28
503 102 B 2016-02-02
504 100 B 2016-02-05
505 100 A 2016-02-08
In this scenario, my current query for counting works fine for both 101 | Cow and 102 | Pig since they only became a breeder (Code: B) once. The count for an animal who never became a breeder is also correct but it's not really a problem here. For an animal that became a breeder more than once in its lifespan e.g. 100 | Mouse it would be counted by the number of times it became a breeder.
Query
SELECT
a.name,
COUNT(CASE WHEN bh.code IN ('B') THEN 1 ELSE NULL END) AS breeder_count
FROM animals a
LEFT OUTER JOIN breeding_history bh
ON a.id = bh.animal_id
GROUP BY a.name
Result
name breeder_count
--------------------------
Mouse 2
Cow 1
Pig 1
Dog 0
The result shows that there are 2 mice that became a breeder when actually it was the same animal and should only be counted once.
You can use the DISTINCT keyword, so as to count a 'B' just once:
SELECT
a.name,
COUNT(DISTINCT CASE WHEN bh.code IN ('B') THEN 1 END) AS breeder_count
FROM animals a
LEFT OUTER JOIN breeding_history bh
ON a.id = bh.animal_id
GROUP BY a.name
As a side note, ELSE NULL is redundant and has been removed from the CASE expression.
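A quick way to confirm the effect of DISTINCT is to run the query on the question's sample rows. The sketch below does that with Python's sqlite3 module; the column types and the column name `date` are taken from the question's table layout and are otherwise assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE animals (id INTEGER, name TEXT);
CREATE TABLE breeding_history (id INTEGER, animal_id INTEGER, code TEXT, date TEXT);
INSERT INTO animals VALUES (100, 'Mouse'), (101, 'Cow'), (102, 'Pig'), (103, 'Dog');
INSERT INTO breeding_history VALUES
  (500, 100, 'B', '2016-01-12'),
  (501, 100, 'A', '2016-01-25'),
  (502, 101, 'B', '2016-01-28'),
  (503, 102, 'B', '2016-02-02'),
  (504, 100, 'B', '2016-02-05'),
  (505, 100, 'A', '2016-02-08');
""")

# The CASE yields 1 for every 'B' row and NULL otherwise, so
# COUNT(DISTINCT ...) can only ever be 0 or 1 per animal.
breeder_count = dict(conn.execute("""
SELECT a.name,
       COUNT(DISTINCT CASE WHEN bh.code IN ('B') THEN 1 END) AS breeder_count
FROM animals a
LEFT OUTER JOIN breeding_history bh
  ON a.id = bh.animal_id
GROUP BY a.name
""").fetchall())

print(breeder_count)
```

Mouse now counts once even though it has two 'B' rows, and Dog still shows 0 because its only joined value is NULL.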

How to number lines, with total at end?

My current sql:
select s.dcid, substr(s.lastfirst,0,3), to_char(a.att_date, 'mm/dd/yyyy'), a.periodid, p.name, a.attendance_codeid, ac.att_code, count(*)
from students s
join attendance a on s.id = a.studentid
join period p on a.periodid = p.id
join attendance_code ac on a.attendance_codeid = ac.id
WHERE ac.att_code IS NOT NULL
AND s.schoolid = 109
AND s.enroll_status = 0
AND s.student_number = 100887
AND a.att_date >= to_date('08/15/2013', 'mm/dd/yyyy')
group by s.dcid, s.lastfirst, to_char(a.att_date, 'mm/dd/yyyy'), a.periodid, p.name, a.attendance_codeid, ac.att_code
Output:
I would like to get the output to sequentially number each record where the count(*) column is, starting with 1 at each new group, and put a total at the bottom of the group, but I'm not sure how to do that. I have tried rollup at various parts of the group by expression, but it winds up giving subtotals for the dates, periodids, etc... I need it to total ONLY for the student (either s.dcid or s.lastfirst)
[Additional information per request...]
I'm hoping to achieve a report where my end users can search for students who have a given number of attendance records in a date range. For example, if the end user wants to find students who have 20 absences between 10/1/2013 and 10/31/2013, where the att_code is one of A,C,E,G... etc. Once the report runs, I want to show them the date the absence occurred, and the code that was used as a visual verification that the records found do indeed match their search criteria.
The output should look like the current output with the exception of the COUNT(*) column, which is where I'm hung up right now. I like how row_number sequentially numbers each record, but what I'm still seeking is how to reset the sequential numbering when the group (the student) changes.
For example...
DCID S.LASTFIRST A.ATT_DATE PERIODID NAME ATT_CODE COUNT(or # or Num...)
1006 Aco 08/29/2013 1704 4 W 1
1006 Aco 09/03/2013 1701 1 6 2
1006 Aco 09/05/2013 1706 6 G 3
...
1006 Aco 10/04/2013 1706 6 z 20
2543 Bro 08/29/2013 1704 4 W 1
2543 Bro 09/03/2013 1701 1 6 2
2543 Bro 09/05/2013 1706 6 G 3
...
2543 Bro 10/04/2013 1706 6 z 20
3121 Com 08/29/2013 1704 4 W 1
3121 Com 09/03/2013 1701 1 6 2
3121 Com 09/05/2013 1706 6 G 3
...
3121 Com 10/04/2013 1706 6 z 20
Of course, in this example, I am abbreviating the output by replacing row numbers 4 - 19 in each of the three groups with '...' I don't want to literally output this.
The ROW_NUMBER() analytical function will, unsurprisingly, number rows sequentially using the partitions and ordering you specify.
select s.dcid,
substr(s.lastfirst,0,3),
to_char(a.att_date, 'mm/dd/yyyy'),
a.periodid,
p.name,
a.attendance_codeid,
ac.att_code,
ROW_NUMBER() OVER ( ORDER BY s.dcid )
from students s
join attendance a on s.id = a.studentid
join period p on a.periodid = p.id
join attendance_code ac on a.attendance_codeid = ac.id
WHERE ac.att_code IS NOT NULL
AND s.schoolid = 109
AND s.enroll_status = 0
AND s.student_number = 100887
AND a.att_date >= to_date('08/15/2013', 'mm/dd/yyyy')
GROUP BY s.dcid,
s.lastfirst,
to_char(a.att_date, 'mm/dd/yyyy'),
a.periodid,
p.name,
a.attendance_codeid,
ac.att_code;
From your screenshot the COUNT() column is always 1 so the ROW_NUMBER() will also always be one (as that appears to be the maximum size of each group).
If this is not meant to be the case then you will need to be less restrictive in your GROUP BY clause - however you have not given enough information on what you expect the query to do for me to make any changes.
Use ROW_NUMBER function as follows:
SELECT s.dcid,
SUBSTR (s.lastfirst, 0, 3),
TO_CHAR (a.att_date, 'mm/dd/yyyy'),
a.periodid,
p.name,
a.attendance_codeid,
ac.att_code,
ROW_NUMBER() OVER (ORDER BY s.dcid) AS rownumber
-- I have ordered by s.dcid. You can order by whichever column you want.
FROM students s
JOIN attendance a ON s.id = a.studentid
JOIN period p ON a.periodid = p.id
JOIN attendance_code ac ON a.attendance_codeid = ac.id
WHERE ac.att_code IS NOT NULL
AND s.schoolid = 109
AND s.enroll_status = 0
AND s.student_number = 100887
AND a.att_date >= TO_DATE ('08/15/2013', 'mm/dd/yyyy');
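Since the question asks for numbering that restarts at each student and for a total per student, one possible direction is to add PARTITION BY to ROW_NUMBER() and use COUNT(*) OVER for the total. The sketch below is only an illustration, run with Python's sqlite3 module (SQLite 3.25 or later) instead of Oracle. The single flattened table `attendance_demo`, its columns, and the shortened rows are hypothetical stand-ins for the joined students/attendance data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attendance_demo "
    "(dcid INTEGER, lastfirst TEXT, att_date TEXT, att_code TEXT)"
)
# A few illustrative rows modelled on the question's example output.
conn.executemany("INSERT INTO attendance_demo VALUES (?, ?, ?, ?)", [
    (1006, "Aco", "2013-08-29", "W"),
    (1006, "Aco", "2013-09-03", "6"),
    (1006, "Aco", "2013-09-05", "G"),
    (2543, "Bro", "2013-08-29", "W"),
    (2543, "Bro", "2013-09-03", "6"),
])

numbered = conn.execute("""
SELECT dcid, lastfirst, att_date, att_code,
       -- restarts at 1 for every student
       ROW_NUMBER() OVER (PARTITION BY dcid ORDER BY att_date) AS num,
       -- the student's total, repeated on each of their rows
       COUNT(*)     OVER (PARTITION BY dcid)                   AS student_total
FROM attendance_demo
ORDER BY dcid, att_date
""").fetchall()

for row in numbered:
    print(row)
```

Here the total appears as an extra column on every row rather than as a separate summary line; a real total row at the bottom of each group would need something like a UNION ALL or ROLLUP on the student column only.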