How to number lines, with total at end?

My current sql:
select s.dcid, substr(s.lastfirst,0,3), to_char(a.att_date, 'mm/dd/yyyy'), a.periodid, p.name, a.attendance_codeid, ac.att_code, count(*)
from students s
join attendance a on s.id = a.studentid
join period p on a.periodid = p.id
join attendance_code ac on a.attendance_codeid = ac.id
WHERE ac.att_code IS NOT NULL
AND s.schoolid = 109
AND s.enroll_status = 0
AND s.student_number = 100887
AND a.att_date >= to_date('08/15/2013', 'mm/dd/yyyy')
group by s.dcid, s.lastfirst, to_char(a.att_date, 'mm/dd/yyyy'), a.periodid, p.name, a.attendance_codeid, ac.att_code
Output:
I would like to get the output to sequentially number each record where the count(*) column is, starting with 1 at each new group, and put a total at the bottom of the group, but I'm not sure how to do that. I have tried rollup at various parts of the group by expression, but it winds up giving subtotals for the dates, periodids, etc... I need it to total ONLY for the student (either s.dcid or s.lastfirst)
[Additional information per request...]
I'm hoping to build a report where my end users can search for students who have a given number of attendance records in a date range. For example, the end user might want to find students who have 20 absences between 10/1/2013 and 10/31/2013, where the att_code is one of A, C, E, G, etc. Once the report runs, I want to show them the date each absence occurred and the code that was used, as a visual verification that the records found do indeed match their search criteria.
The output should look like the current output with the exception of the COUNT(*) column, which is where I'm hung up right now. I like how row_number sequentially numbers each record, but what I'm still seeking is how to reset the sequential numbering when the group (the student) changes.
For example...
DCID  S.LASTFIRST  A.ATT_DATE  PERIODID  NAME  ATT_CODE  COUNT (or # or Num...)
1006  Aco          08/29/2013  1704      4     W         1
1006  Aco          09/03/2013  1701      1     6         2
1006  Aco          09/05/2013  1706      6     G         3
...
1006  Aco          10/04/2013  1706      6     z         20
2543  Bro          08/29/2013  1704      4     W         1
2543  Bro          09/03/2013  1701      1     6         2
2543  Bro          09/05/2013  1706      6     G         3
...
2543  Bro          10/04/2013  1706      6     z         20
3121  Com          08/29/2013  1704      4     W         1
3121  Com          09/03/2013  1701      1     6         2
3121  Com          09/05/2013  1706      6     G         3
...
3121  Com          10/04/2013  1706      6     z         20
Of course, in this example I am abbreviating the output by replacing rows 4 - 19 in each of the three groups with '...'. I don't want to literally output this.

The ROW_NUMBER() analytical function will, unsurprisingly, number rows sequentially using the partitions and ordering you specify.
select s.dcid,
substr(s.lastfirst,0,3),
to_char(a.att_date, 'mm/dd/yyyy'),
a.periodid,
p.name,
a.attendance_codeid,
ac.att_code,
ROW_NUMBER() OVER ( ORDER BY s.dcid )
from students s
join attendance a on s.id = a.studentid
join period p on a.periodid = p.id
join attendance_code ac on a.attendance_codeid = ac.id
WHERE ac.att_code IS NOT NULL
AND s.schoolid = 109
AND s.enroll_status = 0
AND s.student_number = 100887
AND a.att_date >= to_date('08/15/2013', 'mm/dd/yyyy')
GROUP BY s.dcid,
s.lastfirst,
to_char(a.att_date, 'mm/dd/yyyy'),
a.periodid,
p.name,
a.attendance_codeid,
ac.att_code;
From your screenshot, the COUNT(*) column is always 1, so each GROUP BY group appears to contain a single row; a ROW_NUMBER() partitioned by those same columns would likewise always be 1.
If this is not meant to be the case then you will need to be less restrictive in your GROUP BY clause - however you have not given enough information on what you expect the query to do for me to make any changes.

Use ROW_NUMBER function as follows:
SELECT s.dcid,
SUBSTR (s.lastfirst, 0, 3),
TO_CHAR (a.att_date, 'mm/dd/yyyy'),
a.periodid,
p.name,
a.attendance_codeid,
ac.att_code,
ROW_NUMBER() OVER (ORDER BY s.dcid) AS rownumber
-- I have ordered by s.dcid. You can order by whichever column you want.
FROM students s
JOIN attendance a ON s.id = a.studentid
JOIN period p ON a.periodid = p.id
JOIN attendance_code ac ON a.attendance_codeid = ac.id
WHERE ac.att_code IS NOT NULL
AND s.schoolid = 109
AND s.enroll_status = 0
AND s.student_number = 100887
AND a.att_date >= TO_DATE ('08/15/2013', 'mm/dd/yyyy');
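
Neither query above restarts the numbering when the student changes, which is what the updated question asks for. Here is a minimal sketch of one way to do that in Oracle, using a few hypothetical sample rows in place of the joined tables. The CTE name att and the aliases att_day, seq_num and student_total are illustrative and not from the original.

```sql
-- Hypothetical sample rows standing in for the joined students/attendance result.
WITH att (dcid, lastfirst, att_date, att_code) AS (
  SELECT 1006, 'Aco', DATE '2013-08-29', 'W' FROM dual UNION ALL
  SELECT 1006, 'Aco', DATE '2013-09-03', '6' FROM dual UNION ALL
  SELECT 1006, 'Aco', DATE '2013-09-05', 'G' FROM dual UNION ALL
  SELECT 2543, 'Bro', DATE '2013-08-29', 'W' FROM dual UNION ALL
  SELECT 2543, 'Bro', DATE '2013-09-03', '6' FROM dual
)
SELECT dcid,
       lastfirst,
       TO_CHAR(att_date, 'mm/dd/yyyy') AS att_day,
       att_code,
       -- PARTITION BY restarts the numbering at 1 for each student
       ROW_NUMBER() OVER (PARTITION BY dcid ORDER BY att_date) AS seq_num,
       -- same partition without ORDER BY: the student's total on every row
       COUNT(*) OVER (PARTITION BY dcid) AS student_total
FROM att
ORDER BY dcid, seq_num;
```

With these sample rows, seq_num runs 1, 2, 3 for student 1006 and 1, 2 for student 2543, while student_total shows 3 and 2 on every row of the respective student. This repeats the total on each row rather than printing a separate total line, so it avoids the ROLLUP subtotals; a dedicated total row per student would need something extra, such as a UNION ALL with a grouped query.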

Related

Inner Join - special time conditions

Given an hourly table A with full heart_rate records, e.g.:
User Hour Heart_rate
Joe 1 60
Joe 2 70
Joe 3 72
Joe 4 75
Joe 5 68
Joe 6 71
Joe 7 78
Joe 8 83
Joe 9 85
Joe 10 80
And a table B with the subset of hours where a purchase happened, e.g.:
User Hour Purchase
Joe 3 'Soda'
Joe 9 'Coke'
Joe 10 'Doughnut'
I want to keep only those records from A that are in B or at most 2 hours before a record in B, without duplication, preserving both the heart_rate from A and the item purchased from B, so the outcome is:
User Hour Heart_rate Purchase
Joe 1 60 null
Joe 2 70 null
Joe 3 72 'Soda'
Joe 7 78 null
Joe 8 83 null
Joe 9 85 'Coke'
Joe 10 80 'Doughnut'
How can the result be achieved with an inner join, without duplication (in this case of hours 8 and 9)? This is an MWE; assume multiple users, and timestamps instead of hours.
The obvious solution is to combine:
Inner join + deduplication
Left join
Can this be achieved in a more elegant way?

You could use an INNER join of the tables and conditional aggregation for the deduplication:
SELECT a.User, a.Hour, a.Heart_rate,
MAX(CASE WHEN a.Hour = b.Hour THEN b.Purchase END) Purchase
FROM a INNER JOIN b
ON b.User = a.User AND a.Hour BETWEEN b.Hour - 2 AND b.Hour
WHERE a.User = 'Joe' -- remove this line if you want results for all users
GROUP BY a.User, a.Hour, a.Heart_rate;
Or with MAX() window function:
SELECT DISTINCT a.*,
MAX(CASE WHEN a.Hour = b.Hour THEN b.Purchase END) OVER (PARTITION BY a.User, a.Hour) Purchase
FROM a INNER JOIN b
ON b.User = a.User AND a.Hour BETWEEN b.Hour - 2 AND b.Hour;
See the demo (for MySql but it is standard SQL).
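
Another way to sidestep the duplication is to never create it: filter with EXISTS, which cannot multiply rows, and fetch the purchase with a LEFT JOIN on the exact hour. The sketch below is not from the answers above. It inlines the question's sample data as CTEs, assumes MySQL 8 or later for the WITH clause, and assumes at most one purchase per user and hour. The alias w is illustrative.

```sql
-- Sample data copied from the question, inlined so the query runs on its own.
WITH a (User, Hour, Heart_rate) AS (
  SELECT 'Joe', 1, 60 UNION ALL SELECT 'Joe', 2, 70 UNION ALL
  SELECT 'Joe', 3, 72 UNION ALL SELECT 'Joe', 4, 75 UNION ALL
  SELECT 'Joe', 5, 68 UNION ALL SELECT 'Joe', 6, 71 UNION ALL
  SELECT 'Joe', 7, 78 UNION ALL SELECT 'Joe', 8, 83 UNION ALL
  SELECT 'Joe', 9, 85 UNION ALL SELECT 'Joe', 10, 80
),
b (User, Hour, Purchase) AS (
  SELECT 'Joe', 3, 'Soda' UNION ALL
  SELECT 'Joe', 9, 'Coke' UNION ALL
  SELECT 'Joe', 10, 'Doughnut'
)
SELECT a.User, a.Hour, a.Heart_rate, b.Purchase
FROM a
-- attach the purchase only when it happened in exactly this hour
LEFT JOIN b
  ON b.User = a.User AND b.Hour = a.Hour
-- keep a row only if some purchase falls in the window [Hour, Hour + 2]
WHERE EXISTS (
  SELECT 1
  FROM b AS w
  WHERE w.User = a.User
    AND a.Hour BETWEEN w.Hour - 2 AND w.Hour
)
ORDER BY a.User, a.Hour;
```

On the sample data this keeps hours 1, 2, 3, 7, 8, 9 and 10, with a purchase only on hours 3, 9 and 10, so no GROUP BY or DISTINCT is needed.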

Your solutions should work and sound good.
There is another way, using three SELECT statements.
The inner SELECT combines both tables with UNION ALL. Because only tables with the same columns can be combined, fields that exist in only one table have to be defined in the other as well and set to null. The column hour_eat is added to record when each purchase occurred. By sorting this combined table, we can arrange that each row from table B is immediately followed by the matching row from table A.
In the middle SELECT, lag(Purchase) fetches the Purchase from the preceding row. Looking only at the rows from the first table, the Purchase value from the second table is now in the right place. This comes in handy if timestamps rather than whole hours are used. The last_value expression calculates the time between the heart-rate measurement and the next purchase.
The outer SELECT filters the rows of interest: only rows from the first table, within the 2 hours before a purchase.
WITH
heart_tbl AS (
  # sample heart-rate data from the question, one value per hour
  SELECT "Joe" AS user, row_number() OVER () AS hour, heart_rate
  FROM unnest([60,70,72,75,68,71,78,83,85,80]) heart_rate
),
eat_tbl AS (
  SELECT "Joe" AS user, 3 AS hour, 'Soda' AS Purchase
  UNION ALL SELECT "Joe", 9, 'Coke'
  UNION ALL SELECT "Joe", 10, 'Doughnut'
)
SELECT user, hour, heart_rate, Purchase_, hours_till_Purchase
FROM
(
  SELECT *,
    # purchase rows sort just before the heart-rate row of the same hour
    lag(Purchase) OVER (ORDER BY hour, heart_rate IS NOT NULL) AS Purchase_,
    # zero or negative: hours from this row to the next purchase
    hour - last_value(hour_eat IGNORE NULLS) OVER (ORDER BY hour DESC, heart_rate IS NOT NULL) AS hours_till_Purchase
  FROM # combine both tables into one table (ordered by hours)
  (
    SELECT user, hour, heart_rate, null AS Purchase, null AS hour_eat FROM heart_tbl
    UNION ALL
    SELECT user, hour, null AS heart_rate, Purchase, hour FROM eat_tbl
  )
)
WHERE heart_rate IS NOT NULL AND hours_till_Purchase >= -2
ORDER BY hour

Conditional Aggregation with multiple case and group by

The query below gives me the average of CASE WHEN QuoteStatusID = 6, but I am having issues associating the average with the Street column.
QuoteTable
QuoteID  QuoteStateID  ProjectManager_userID  Shipping_AddressID
1        6             12                     56
2        6             12                     56
3        26            12                     56
4        6             12                     18
5        26            12                     18
Shipping_AddressID
56: 338 Elizabeth St
18: 83 East St
select [User].UserID, [User].fname, [User].lname,[User].JobTitle, address.Street,
(select avg(case when QuoteStatusID = 6 then 1.0 else 0 end) as QuoteAccept
from Quote q
where ProjectManager_UserID = userId
) as AcceptanceRate
from [User]
join quote on [user].UserID=Quote.ProjectManager_UserID
join Address on quote.Shipping_AddressID=Address.AddressID
where userID in (select distinct ProjectManager_UserID from quote)
order by AcceptanceRate desc;
Current output: 3/5 = 0.60
userid  fname  Lname  Street            AcceptanceRate
12      Jon    Smith  338 Elizabeth St  0.6
12      Jon    Smith  83 East St        0.6
Desired output: 2/3 = 0.66 and 1/2 = 0.50
userid  fname  Lname  Street            AcceptanceRate
12      Jon    Smith  338 Elizabeth St  0.66
12      Jon    Smith  83 East St        0.50

I think you don't need a sub-query. Just compute the avg as part of the query you have and use group by to give you distinct users and addresses.
select U.UserID, U.fname, U.lname, U.JobTitle, A.Street
, avg(case when Q.QuoteStatusID = 6 then 1.0 else 0 end) as AcceptanceRate
from [User] U
inner join Quote Q on Q.ProjectManager_UserID = U.UserID
inner join [Address] A on A.AddressID = Q.Shipping_AddressID
group by U.UserID, U.fname, U.lname, U.JobTitle, A.Street
order by AcceptanceRate desc;
Note: Short aliases make a query more readable. And you don't need your where clause, since the join on Quote already ensures the same condition.

Can you simply amend your avg to be:
select avg(case when QuoteStateID = 6 then 1.0 else 0 end) over(partition by Shipping_AddressId) as QuoteAccept
Edit:
To still use it as a subquery, it will also need correlating on Shipping_AddressId in the where clause.
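
The correlation mentioned in the edit can be sketched as follows. This is an illustration rather than the answerer's exact query: it inlines the question's five sample quotes as a CTE (SQL Server syntax), uses the column name QuoteStatusID as in the question's query, and leaves out the User and Address joins to stay short. The aliases q, q2 and v are illustrative.

```sql
-- The five sample quotes from the question, inlined so the query runs on its own.
WITH Quote (QuoteID, QuoteStatusID, ProjectManager_UserID, Shipping_AddressID) AS (
  SELECT *
  FROM (VALUES (1, 6, 12, 56),
               (2, 6, 12, 56),
               (3, 26, 12, 56),
               (4, 6, 12, 18),
               (5, 26, 12, 18)) AS v (a, b, c, d)
)
SELECT DISTINCT
       q.ProjectManager_UserID,
       q.Shipping_AddressID,
       -- correlate on BOTH the manager and the address, so each
       -- address gets its own average instead of the manager-wide one
       (SELECT AVG(CASE WHEN q2.QuoteStatusID = 6 THEN 1.0 ELSE 0 END)
        FROM Quote q2
        WHERE q2.ProjectManager_UserID = q.ProjectManager_UserID
          AND q2.Shipping_AddressID = q.Shipping_AddressID) AS AcceptanceRate
FROM Quote q
ORDER BY AcceptanceRate DESC;
```

On the sample rows this yields 2/3 for address 56 and 1/2 for address 18, matching the desired output. Correlating only on ProjectManager_UserID, as in the question, is what produces 3/5 on every row.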

How to bring the most recent value?

I need to know my organisation's sectors, but my code brings back every code the organisation has had since it was inserted into the system, grouped by its tag.
The most accurate query I have so far groups the updates by MAX, but I don't want all the records, just the most recent one.
Expected result: I need only the most recent value (B.code), discarding all the older B.code values.
SELECT A.organisation_ref, A.name, A.block_level, B.code_type, B.code , MAX(B.update_timestamp)
FROM [TB1].[DBO].[ORG] AS A
INNER JOIN [TB2].[DBO].[CODE] AS B
ON A.organisation_ref = B.organisation_ref AND B.CountryID = '76'
WHERE B.code_type = '1005'
GROUP BY A.organisation_ref, A.name, A.block_level, B.code_type, B.code
ORDER BY A.organisation_ref ASC
Result so Far:
organisation_ref organisation_name block_level code_type code update_timestamp
1 contoso A 7 1005 IAC 2008-05-12 19:27:41.567
1 contoso A 7 1005 IAE 2015-03-30 20:51:20.693
1 contoso A 7 1005 IN NULL
1 contoso A 7 1005 INE 2014-11-19 09:51:00.417
1 contoso A 7 1005 IQQ 2015-08-05 17:22:28.763
4 contoso B 0 1005 CUU 2011-10-25 11:34:58.420
4 contoso B 0 1005 DAB 2012-05-02 17:15:38.667
4 contoso B 0 1005 LLH 2015-10-08 08:25:43.260

You can use apply:
SELECT o.organisation_ref, o.name, o.block_level, c.code_type, c.code
FROM [TB1].[DBO].[ORG] o CROSS APPLY
(SELECT TOP (1) c.*
FROM [TB2].[DBO].[CODE] c
WHERE c.organisation_ref = o.organisation_ref AND
c.CountryID = '76' AND
c.code_type = '1005'
ORDER BY c.update_timestamp DESC
) c
ORDER BY o.organisation_ref ASC;
Notice that I fixed your table aliases so they are meaningful abbreviations for the tables, rather than meaningless arbitrary letters.
Also, if CountryID and code_type are strings, then the comparison to strings is fine. Otherwise, drop the single quotes so numbers are compared to numbers.
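
An alternative to APPLY that is often used for "latest row per group" is ROW_NUMBER() with a filter on the first row. The sketch below is an assumption-laden illustration, not part of the answer above: it inlines a few rows from the question's result as a CTE (SQL Server syntax), and the names code_rows, ranked, rn and v are hypothetical.

```sql
-- A few rows from the question's output, inlined so the query runs on its own.
WITH code_rows AS (
  SELECT *
  FROM (VALUES
    (1, 'IAC', CAST('2008-05-12T19:27:41' AS datetime2)),
    (1, 'IAE', CAST('2015-03-30T20:51:20' AS datetime2)),
    (1, 'IN',  CAST(NULL AS datetime2)),
    (1, 'IQQ', CAST('2015-08-05T17:22:28' AS datetime2)),
    (4, 'DAB', CAST('2012-05-02T17:15:38' AS datetime2)),
    (4, 'LLH', CAST('2015-10-08T08:25:43' AS datetime2))
  ) AS v (organisation_ref, code, update_timestamp)
),
ranked AS (
  SELECT organisation_ref, code, update_timestamp,
         -- newest row per organisation gets rn = 1;
         -- SQL Server sorts NULL timestamps last under DESC
         ROW_NUMBER() OVER (PARTITION BY organisation_ref
                            ORDER BY update_timestamp DESC) AS rn
  FROM code_rows
)
SELECT organisation_ref, code, update_timestamp
FROM ranked
WHERE rn = 1
ORDER BY organisation_ref;
```

With these rows the query returns IQQ for organisation 1 and LLH for organisation 4. If two codes share the same latest timestamp, ROW_NUMBER() picks one of them arbitrarily, so a tie-breaking column in the ORDER BY may be needed.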

How to assign filters to row number () function in sql

I am trying to extract, for each case, only the single row that follows the row with Name = System and whose town is not Austin.
In case 1001 there are 8 rows and row #4 is System; the output should be only the row with Name = Terry and Date Moved = 7/4/2019 (the next entry with Town <> Austin).
Case Name Town Date Moved Row #(Not in table)
1001 Ted Madisson 9/7/2018 1
1001 Joyal Boston 10/4/2018 2
1001 Beatrice Chicago 1/1/2019 3
1001 System Chicago 1/5/2019 4
1001 John Austin 4/11/2019 5
1001 Simon Austin 6/11/2019 6
1001 Terry Cleveland 7/4/2019 7
1001 Hawkins Newyork 8/4/2019 8
1002 Devon Boston 12/4/2018 1
1002 Joy Austin 12/7/2018 2
1002 Rachael Newyork 12/19/2018 3
1002 Bill Chicago 1/4/2019 4
1002 System Dallas 2/12/2019 5
1002 Phil Austin 3/16/2019 6
1002 Dan Seattle 5/18/2019 7
1002 Claire Birmingham 7/7/2019 8
I tried a subquery with the row number function and a NOT IN ('Austin') filter:
ROW_NUMBER() OVER(PARTITION BY Case ORDER BY Moved_date ASC) AS ROWNUM
Please note there are > 10k cases.

You can try the script below:
WITH CTE AS
(
SELECT [Case],[Name],Town,[Date Moved],
ROW_NUMBER() OVER (PARTITION BY [Case] ORDER BY [Date Moved]) [Row #]
FROM your_table
)
SELECT A.*
FROM CTE A
INNER JOIN
(
SELECT C.[Case],C.Town,MAX(C.[Row #]) MRN
FROM CTE C
INNER JOIN
(
SELECT *
FROM CTE A
WHERE A.Name = 'System'
)D ON C.[Case] = D.[Case] AND C.[Row #] > D.[Row #]
AND C.Town = 'Austin'
GROUP BY C.[Case],C.Town
)B ON A.[Case] = B.[Case] AND A.[Row #] = B.MRN+1
Output:
Case Name Town Date Moved Row #
1001 Terry Cleveland 7/4/2019 7
1002 Dan Seattle 5/18/2019 7

Here are three possibilities. I'm still concerned about ties though. The first one will return multiple rows while the others only one per case:
-- 1. Join back on the earliest qualifying date (may return several rows on ties)
with matches as (
select t1."case", min(t2."Date Moved") as "Date Moved"
from Movements t1 inner join Movements t2 on t1."case" = t2."case"
where t1.name = 'System' and t2.Town <> 'Austin'
and t2."Date Moved" > t1."Date Moved"
group by t1."case"
)
select t.*
from Movements t inner join matches m
on m."case" = t."case" and m."Date Moved" = t."Date Moved";
-- 2. cross apply with top 1: first non-Austin row after "System" in the same case
select m2.*
from Movements m1 cross apply (
select top 1 t.* from Movements t
where t."case" = m1."case"
and t.Town <> 'Austin' and t."Date Moved" > m1."Date Moved"
order by t."Date Moved"
) as m2
where m1.name = 'System';
-- 3. Running count of "System" rows as a flag, then the first qualifying row
with m1 as (
select *,
count(case when name = 'System' then 1 end) over (partition by "case" order by "Date Moved") as flag
from Movements
), m2 as (
select *,
row_number() over (partition by "case" order by "Date Moved") as rn
from m1
where flag = 1 and name <> 'System' and Town <> 'Austin'
)
select * from m2 where rn = 1;
I'm basically assuming this is SQL Server. You might need a few minor tweaks if not.
It also does not require a town named Austin to fall between the "System" row and the desired row as I do not believe that was a stated requirement.

pgsql -Showing top 10 products's sales and other products as 'others' and its sum of sales

I have a table called "products" with 100 records of sales details. My requirement is simple, but I have not been able to do it.
I need to show the top 10 product names with their sales, and all other products as "Others" with their combined sales, so my output will have 11 rows in total. The 11th row should be Others, with the sum of sales of all remaining products. Can anyone give me the logic?
The output should look like this:
Name sales
------ -----
1 colgate 9000
2 pepsodent 8000
3 closeup 7000
4 brittal 6000
5 ariies 5000
6 babool 4000
7 imami 3000
8 nepolop 2500
9 lactoteeth 2000
10 menwhite 1500
11 Others 6000 (sum of sales of remaining 90 products)
here is my sql query,
select case when rank<11 then prod_cat else 'Others' END as prod_cat,
total_sales,ID,rank from (select ROW_NUMBER() over (order by (sum(i.grandtotal)) desc) as rank,pc.name as prod_cat,sum(i.grandtotal) as total_sales, pc.m_product_category_id as ID
from adempiere.c_invoice i join adempiere.c_invoiceline il on il.c_invoice_id=i.c_invoice_id join adempiere.m_product p on p.m_product_id=il.m_product_id join adempiere.m_product_category pc on pc.m_product_category_id=p.m_product_category_id
where extract(year from i.dateacct)=extract(year from now())
group by pc.m_product_category_id) innersql
order by total_sales desc
o/p what i got is,
prod_cat total_sales id rank
-------- ----------- --- ----
BSHIRT 4511697.63 460000015 1
BT-SHIRT 2725167.03 460000016 2
SHIRT 2630471.56 1000003 3
BJEAN 1793514.07 460000005 4
JEAN 1115402.90 1000004 5
GT-SHIRT 1079596.33 460000062 6
T SHIRT 446238.60 1000006 7
PANT 405189.00 1000005 8
GDRESS 396789.02 460000059 9
BTROUSER 393739.48 460000017 10
Others 164849.41 1000009 11
Others 156677.00 1000008 12
Others 146678.00 1000007 13

As @e4c5 suggests, use UNION:
with
totals as (
select --pc.m_product_category_id as id,
pc.name as prod_cat,
sum(i.grandtotal) as total_sales,
ROW_NUMBER() over (order by sum(i.grandtotal) desc) as rank
from adempiere.c_invoice i
join adempiere.c_invoiceline il on (il.c_invoice_id=i.c_invoice_id)
join adempiere.m_product p on (p.m_product_id=il.m_product_id)
join adempiere.m_product_category pc on (pc.m_product_category_id=p.m_product_category_id)
where i.dateacct >= date_trunc('year', now()) and i.dateacct < date_trunc('year', now()) + interval '1' year
group by pc.m_product_category_id, pc.name
),
ranked_others as (
select prod_cat, total_sales, rank
from totals where rank <= 10
union
select 'Others', sum(total_sales), 11
from totals where rank > 10
)
select prod_cat, total_sales
from ranked_others
order by rank
Also, I recommend using sargable conditions like the one above, which is slightly more complicated than the one you implemented, but generally worth the extra effort.
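
The same "top N plus Others" result can also be produced in a single pass by collapsing every rank beyond N into one bucket and grouping on it. The sketch below is only an illustration under stated assumptions: it uses PostgreSQL syntax, a handful of made-up totals instead of the adempiere tables, and a cutoff of 3 instead of 10 to keep the sample small. The names totals, ranked and bucket are hypothetical.

```sql
-- Made-up per-category totals standing in for the aggregated invoice data.
WITH totals (prod_cat, total_sales) AS (
  VALUES ('colgate', 9000),
         ('pepsodent', 8000),
         ('closeup', 7000),
         ('babool', 4000),
         ('imami', 3000)
),
ranked AS (
  SELECT prod_cat,
         total_sales,
         -- ranks 1..3 keep their own bucket, everything else lands in bucket 4
         LEAST(ROW_NUMBER() OVER (ORDER BY total_sales DESC), 4) AS bucket
  FROM totals
)
SELECT CASE WHEN bucket <= 3 THEN MIN(prod_cat) ELSE 'Others' END AS prod_cat,
       SUM(total_sales) AS total_sales
FROM ranked
GROUP BY bucket
ORDER BY bucket;
```

With this sample data the result is colgate 9000, pepsodent 8000, closeup 7000 and Others 7000. For the original requirement the cutoff would be 10 and the overflow bucket 11; categories with equal totals are ranked in arbitrary order unless a tie-breaker is added to the ORDER BY.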
