Grouping 'like' rows in SQL and summing? - sql

I have a report that looks like this:
Notice the rows that I have circled (row 2 and 3 in the 1109 group). These rows have the same MemberSep, Location, and Consumer text. The only difference is they each have different values for the TODKWH001 and TODKWH002 fields.
What I'd like to do is group rows like this together and sum each of the TODKWH001 and TODKWH002 fields across the grouped rows.
So, instead of these two rows:
00002574027 00000003105401 YEAGER FMS PMP 50 13 00 0 1
00002574027 00000003105401 YEAGER FMS PMP 50 13 00 4998 81
I'd have just one row:
00002574027 00000003105401 YEAGER FMS PMP 50 13 00 4998 82
Can I do this in SQL? Or should I try to do the grouping in my report?
Also, here is my SQL that I use to populate the report now:
SELECT CAR1.CAV_MBRHISTDETL.MBRSEP,
CAR1.CAV_MBRHISTDETL.LOCATION,
CAR1.CAV_MBRHISTDETL.BILLTYPE,
CAR1.CAV_MBRHISTDETL.BILLMOYR,
CAR1.CAV_MBRHISTDETL.RATE,
CAR1.CAV_LOCINFODETL.DIST,
CAR1.CAV_DEMANDHISTDETL.TODKWH_001,
CAR1.CAV_DEMANDHISTDETL.TODKWH_002,
CAR1.CAV_LOCINFODETL.ADDR1, CAR1.CAV_DEMANDHISTDETL.READTYPE
FROM CAR1.CAV_LOCINFODETL, { oj CAR1.CAV_MBRHISTDETL LEFT OUTER JOIN
CAR1.CAV_DEMANDHISTDETL ON CAR1.CAV_MBRHISTDETL.MBRSEP =
CAR1.CAV_DEMANDHISTDETL.MBRSEP AND
CAR1.CAV_MBRHISTDETL.BILLMOYR = CAR1.CAV_DEMANDHISTDETL.BILLMOYR }
WHERE CAR1.CAV_MBRHISTDETL.LOCATION = CAR1.CAV_LOCINFODETL.LOCATION AND
(CAR1.CAV_MBRHISTDETL.BILLMOYR IN ('1104', '1105', '1106', '1107', '1108','1109')) AND
(CAR1.CAV_MBRHISTDETL.RATE = '0096') AND (CAR1.CAV_MBRHISTDETL.BILLTYPE IN ('00', '01'))
ORDER BY CAR1.CAV_LOCINFODETL.DIST,
CAR1.CAV_MBRHISTDETL.BILLMOYR,
CAR1.CAV_MBRHISTDETL.MBRSEP

Work this into your current query:
SELECT MemberSep, Location, Consumer, SUM(TODKWH001), SUM(TODKWH002)
FROM yourtable
GROUP BY MemberSep, Location, Consumer
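To see the effect on the two sample rows from the question, here is a minimal sketch that runs the same GROUP BY through Python's built-in sqlite3 module. The table name usage_report and the way the sample line is split into MemberSep, Location, and Consumer are assumptions made for illustration; the real query would group the joined CAR1 tables instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table standing in for the joined report data.
conn.execute(
    """
    CREATE TABLE usage_report (
        MemberSep TEXT,
        Location  TEXT,
        Consumer  TEXT,
        TODKWH001 INTEGER,
        TODKWH002 INTEGER
    )
    """
)

# The two rows from the question; the column split is an assumption.
conn.executemany(
    "INSERT INTO usage_report VALUES (?, ?, ?, ?, ?)",
    [
        ("00002574027", "00000003105401", "YEAGER FMS PMP", 0, 1),
        ("00002574027", "00000003105401", "YEAGER FMS PMP", 4998, 81),
    ],
)

# Rows that share MemberSep, Location and Consumer collapse into one row,
# and each TODKWH column is summed within that group.
grouped = conn.execute(
    """
    SELECT MemberSep, Location, Consumer,
           SUM(TODKWH001) AS TODKWH001,
           SUM(TODKWH002) AS TODKWH002
    FROM usage_report
    GROUP BY MemberSep, Location, Consumer
    """
).fetchall()

for row in grouped:
    print(row)
```

Every non-aggregated column in the SELECT list also has to appear in the GROUP BY, so any extra columns from the original query (BILLMOYR, DIST, and so on) would need to be added to both places.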

Related

DeCorrelated SubQueries in Google BigQuery?

I have been struggling with a problem for hours. I have found myself down multiple rabbit holes and into the realms of DeCorrelated SubQueries which are frankly beyond me...
I have two tables and I'm trying to pull from both without a common column to join against. I need to take a value from table 1, find the closest value (that is lower) in table 2, and then pull related data from table 2.
table_1
id  score
1   99.983545
2   98.674359
3   97.832475
4   96.184545
5   93.658572
6   89.963544
7   87.427353
8   82.883345
table_2
average_level  percentile
99.743545      99
97.994359      98
97.212485      97
96.987545      96
95.998573      95
88.213584      94
87.837384      93
80.982147      92
From the two tables above I need to:
Take the id and score
Identify the closest average_level that is lower than the score
Include the corresponding average_level and percentile
The hoped for output would look like this...
id  score      average_level  percentile
1   99.983545  99.743545      99
2   98.674359  97.994359      98
3   97.832475  97.212485      97
4   96.184545  95.998573      95
5   93.658572  88.213584      94
6   89.963544  88.213584      94
7   87.427353  80.982147      92
8   82.883345  80.982147      92
Any help or advice would be very much appreciated
You can do this by joining both tables on table_1.score >= table_2.average_level and then taking max(average_level) and max(percentile), grouped by the fields in table_1. These are the closest values in table_2 that are lower than or equal to the score (max(percentile) lines up with max(average_level) here because percentile rises with average_level in the sample data):
SELECT TABLE_1.ID, TABLE_1.SCORE,
MAX(TABLE_2.AVERAGE_LEVEL) AS AVERAGE_LEVEL,
MAX(TABLE_2.PERCENTILE) AS PERCENTILE
FROM TABLE_1 INNER JOIN TABLE_2
ON TABLE_1.SCORE >= TABLE_2.AVERAGE_LEVEL
GROUP BY TABLE_1.ID, TABLE_1.SCORE
ORDER BY TABLE_1.ID
I added a fiddle example here; it also includes @Ömer's answer.
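As a quick check of the range join, the sketch below loads the sample data from the question into an in-memory SQLite database via Python's sqlite3 module and runs the same query. SQLite is used here only as a convenient stand-in for BigQuery, which is an assumption of this example; the query itself is unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (id INTEGER, score REAL)")
conn.execute("CREATE TABLE table_2 (average_level REAL, percentile INTEGER)")

# Sample data copied from the question.
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?)",
    [
        (1, 99.983545), (2, 98.674359), (3, 97.832475), (4, 96.184545),
        (5, 93.658572), (6, 89.963544), (7, 87.427353), (8, 82.883345),
    ],
)
conn.executemany(
    "INSERT INTO table_2 VALUES (?, ?)",
    [
        (99.743545, 99), (97.994359, 98), (97.212485, 97), (96.987545, 96),
        (95.998573, 95), (88.213584, 94), (87.837384, 93), (80.982147, 92),
    ],
)

# Join every score to all levels at or below it, then keep the highest one.
matched = conn.execute(
    """
    SELECT table_1.id, table_1.score,
           MAX(table_2.average_level) AS average_level,
           MAX(table_2.percentile)    AS percentile
    FROM table_1
    INNER JOIN table_2 ON table_1.score >= table_2.average_level
    GROUP BY table_1.id, table_1.score
    ORDER BY table_1.id
    """
).fetchall()

for row in matched:
    print(row)
```

Because this is an inner join, a score that is below every average_level would produce no row at all; a LEFT JOIN would keep it with NULLs.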
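If we call the first table Score and the second one average, you can try this. Note that TOP(1) is SQL Server syntax, and that ordering by the absolute difference picks the nearest average_level in either direction, not only lower ones:
select *
from Score s
inner join average a on a.Percentile = (select top(1) al.Percentile from average al order by Abs(al.average_level - s.score))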

How do I exclude when all lines matches

I'm looking to exclude all rows for a key when any of its edit_codes matches a line_nbr.
Teradata SQL query:
SELECT d.*
FROM decision_table d
LEFT OUTER JOIN edit_codes e ON d.unique_key = e.unique_key
WHERE e.source_edit LIKE 'hp%'
AND d.line_nbr <> SUBSTR(e.edit_line_nbr, 3, 2)
Alternatively, I wrote the last condition as:
AND d.line_nbr = SUBSTR(e.edit_line_nbr, 3, 2)
Q1: Rows 2 and 3 (EDIT_CODES to LINE_NBR) don't match, but since rows 1 and 4 do, I want to exclude all of them. How do I write the query so that it captures whether line_nbr matches any edit code on any row?
row# EDIT_CODES LINE_NBR
1 HP01 01
2 HP02 01
3 HP01 02
4 HP02 02

Select row with shortest string in one column if there are duplicates in another column?

Let's say I have a database with rows like this
ID PNR NAME
1 35 Television
2 35 Television, flat screen
3 35 Television, CRT
4 87 Hat
5 99 Cup
6 99 Cup, small
I want to select each individual type of item (television, hat, cup) - but for the ones that have multiple entries in PNR I only want to select the one with the shortest NAME. So the result set would be
ID PNR NAME
1 35 Television
4 87 Hat
5 99 Cup
How would I construct such a query using SQLite? Is it even possible, or do I need to do this filtering in the application code?
Since SQLite 3.7.11, you can use MIN() or MAX() to select a row in a group:
SELECT ID,
PNR,
Name,
min(length(Name))
FROM MyTable
GROUP BY PNR;
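A minimal sketch of this behaviour, using Python's built-in sqlite3 module and the sample rows from the question. The ORDER BY is added here only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, PNR INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, 35, "Television"),
        (2, 35, "Television, flat screen"),
        (3, 35, "Television, CRT"),
        (4, 87, "Hat"),
        (5, 99, "Cup"),
        (6, 99, "Cup, small"),
    ],
)

# With a single MIN() in the query, the bare columns ID and Name are taken
# from the row that holds the minimum within each PNR group.
shortest = conn.execute(
    """
    SELECT ID, PNR, Name, min(length(Name))
    FROM MyTable
    GROUP BY PNR
    ORDER BY PNR
    """
).fetchall()

for row in shortest:
    print(row)
```

This relies on SQLite-specific handling of bare columns, so the same query is not portable to most other database engines.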
You can use the aggregate MIN(length(Name)) to find the minimum length among the names in each group; the slightly tricky part is getting the corresponding ID and NAME into the result. The following query should work:
select mt1.ID, mt1.PNR, mt1.Name
from MyTable mt1 inner join (
select pnr, min(length(Name)) as minlength
from MyTable group by pnr) mt2
on mt1.pnr = mt2.pnr and length(mt1.Name) = mt2.minlength
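The join version can be sketched the same way with Python's sqlite3 module. The row (7, 99, 'Mug') is not in the question; it is a made-up addition to show that this approach returns every name that ties for the shortest length within a PNR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, PNR INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, 35, "Television"),
        (2, 35, "Television, flat screen"),
        (3, 35, "Television, CRT"),
        (4, 87, "Hat"),
        (5, 99, "Cup"),
        (6, 99, "Cup, small"),
        (7, 99, "Mug"),  # hypothetical row: ties with 'Cup' at length 3
    ],
)

# The subquery finds the shortest name length per PNR; the outer join keeps
# every row whose name has exactly that length.
shortest_by_join = conn.execute(
    """
    SELECT mt1.ID, mt1.PNR, mt1.Name
    FROM MyTable mt1
    INNER JOIN (
        SELECT PNR, min(length(Name)) AS minlength
        FROM MyTable
        GROUP BY PNR
    ) mt2 ON mt1.PNR = mt2.PNR AND length(mt1.Name) = mt2.minlength
    ORDER BY mt1.ID
    """
).fetchall()

for row in shortest_by_join:
    print(row)
```

If only one row per PNR is wanted when names tie, an extra rule is needed, for example keeping the lowest ID.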

SQL - Delete value if incremental pattern not met

I have a table with a column of values with the following sample data that has been pulled for 1 user:
ID | Data
5 Record1
12 NULL
13 NULL
15 Record1
20 Record12
28 NULL
31 NULL
35 Record12
37 Record23
42 Record34
51 NULL
53 Record34
58 Record5
61 Record17
63 NULL
69 Record17
What I would like to do is delete any values in the Data column where the Data value does not have both a start and a finish record. So in the above, Record23 and Record5 would be deleted.
Please note that a Record(n) may appear more than once, so it's not as straightforward as doing a count on the Data value. It needs to be incremental: a record should always start and finish before another one starts, and if it starts and doesn't finish I want to remove it.
Sadly, SQL Server 2008 does not have LAG or LEAD, which would make the operation simpler.
You could use a common table expression to number the non-null values in ID order, find those with no adjacent row holding the same Data value, and delete them:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY id) rn FROM table1 WHERE data IS NOT NULL
)
DELETE c1 FROM cte c1
LEFT JOIN cte c2 ON (c1.rn = c2.rn+1 OR c1.rn = c2.rn-1) AND c1.data = c2.data
WHERE c2.id IS NULL
An SQLfiddle to test with.
If you just want to see which rows would be deleted, replace DELETE c1 with SELECT c1.*.
...and as always, remember to back up before running potentially destructive SQL for random people on the Internet.
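To preview which rows the CTE would target, here is a small sketch that runs the SELECT form of the query on the sample data using Python's sqlite3 module. SQLite stands in for SQL Server here, which is an assumption of this example, and ROW_NUMBER() needs SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, data TEXT)")

# Sample data from the question; None is stored as NULL.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        (5, "Record1"), (12, None), (13, None), (15, "Record1"),
        (20, "Record12"), (28, None), (31, None), (35, "Record12"),
        (37, "Record23"), (42, "Record34"), (51, None), (53, "Record34"),
        (58, "Record5"), (61, "Record17"), (63, None), (69, "Record17"),
    ],
)

# Number the non-null rows, then keep those whose previous and next
# numbered rows both hold a different data value.
orphan_ids = [
    row[0]
    for row in conn.execute(
        """
        WITH cte AS (
            SELECT id, data, ROW_NUMBER() OVER (ORDER BY id) AS rn
            FROM table1
            WHERE data IS NOT NULL
        )
        SELECT c1.id
        FROM cte c1
        LEFT JOIN cte c2
          ON (c1.rn = c2.rn + 1 OR c1.rn = c2.rn - 1) AND c1.data = c2.data
        WHERE c2.id IS NULL
        ORDER BY c1.id
        """
    )
]

print(orphan_ids)
```

The two ids returned correspond to Record23 and Record5, the unpaired values named in the question.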

Pivot results not working for me

I have the following query that generates my pivot results:
SELECT * FROM
(
SELECT
#tmp1.Name,
DATEDIFF(D,#tmp1.AuthDate,#tmp1.AuthExpirationDate) AS AuthLenInDays,
#tmp1.NbrOfAuthorizations,
#tmp1.MODE
FROM #tmp1
LEFT JOIN #tmp2
ON #tmp2.AuthID = #tmp1.AuthID
GROUP BY #tmp1.Name, #tmp1.NbrOfAuthorizations, #tmp1.AuthDate, #tmp1.AuthExpirationDate, #tmp1.MODE
) AS InnerTbl
PIVOT
(AVG(AuthLenInDays) FOR [MODE] IN ([Preservation])
) PivotResults1
The results are as follows:
Name NbrOfAuthorizations Preservation
Centro 1 79
Dennis 1 92
Therapy Center 1 68
Florez 1 92
I have two problems that I have not been able to figure out. I've tried everything I can think of, including other suggestions from Stack Overflow.
First, I can't figure out how to change the name of the right-most column (Preservation) in my results. It's an average, so I'd like to label that column 'Average'.
Second, the NbrOfAuthorizations needs to be summed for all the values in the table. I have tried using a pivot, which gets me close but not all the way there. I have also tried using a SUM in the InnerTbl query, but that isn't it either.
If I take my raw data, export it to Excel, and do a pivot there, I can see the numbers I should be getting. I am trying to take that process and do it purely in SQL. Based on the data in the table, the values for the SUM should be
Name NbrOfAuthorizations Preservation
Centro 5 79
Dennis 1 92
Therapy Center 57 68
Florez 1 92
Any masters of pivot out there?
Looks like you don't need pivot at all:
select
t1.Name,
sum(t1.NbrOfAuthorizations) as NbrOfAuthorizations,
avg(datediff(dd, t1.AuthDate, t1.AuthExpirationDate)) as AuthLenInDays
from #tmp1 as t1
-- the join looks unnecessary too, unless #tmp2 has multiple rows
-- for each row in #tmp1
-- left outer join #tmp2 as t2 on t2.AuthID = t1.AuthID
where t1.mode = 'Preservation'
group by t1.Name
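A rough sketch of the same idea, run through Python's sqlite3 module. The rows below are made up for illustration and are not the asker's data. An ordinary table named tmp1 replaces the temp table #tmp1, and because SQLite has no DATEDIFF, the day count is computed by subtracting julianday() values instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tmp1 (
        Name TEXT,
        AuthDate TEXT,
        AuthExpirationDate TEXT,
        NbrOfAuthorizations INTEGER,
        Mode TEXT
    )
    """
)

# Hypothetical rows: two Preservation rows for Centro (70 and 88 days),
# one for Dennis (92 days), and one row in another mode that is filtered out.
conn.executemany(
    "INSERT INTO tmp1 VALUES (?, ?, ?, ?, ?)",
    [
        ("Centro", "2013-01-01", "2013-03-12", 2, "Preservation"),
        ("Centro", "2013-02-01", "2013-04-30", 3, "Preservation"),
        ("Dennis", "2013-01-01", "2013-04-03", 1, "Preservation"),
        ("Dennis", "2013-01-01", "2013-02-01", 4, "Other"),
    ],
)

# Filter to the one mode of interest, then sum and average per name.
summary = conn.execute(
    """
    SELECT Name,
           SUM(NbrOfAuthorizations) AS NbrOfAuthorizations,
           AVG(julianday(AuthExpirationDate) - julianday(AuthDate)) AS Average
    FROM tmp1
    WHERE Mode = 'Preservation'
    GROUP BY Name
    ORDER BY Name
    """
).fetchall()

for row in summary:
    print(row)
```

The alias on the AVG column is also where the output column gets its name, so labelling it Average covers the renaming part of the question.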