I have a query that compares the StagingCabinCrew and StagingCockpitCrew counts from the staging schema with their data schema equivalents, DataCabinCrew and DataCockpitCrew.
Below is the query:
WITH CTE AS
(SELECT cd.*,
c.*,
DataFlight,
l.ScheduledDepartureDate,
l.ScheduledDepartureAirport
FROM
(SELECT *,
ROW_NUMBER() OVER(PARTITION BY LegKey
ORDER BY UpdateID DESC) AS RowNumber
FROM Data.Crew) c
INNER JOIN Data.CrewDetail cd ON c.UpdateID = cd.CrewUpdateID
AND cd.IsPassive = 1
AND RowNumber = 1
INNER JOIN
(SELECT *,
Carrier + CAST(FlightNumber AS VARCHAR) + Suffix AS DataFlight
FROM Data.Leg) l ON c.LegKey = l.LegKey )
SELECT StagingFlight,
sac.DepartureDate,
sac.DepartureAirport,
cte.DataFlight,
cte.ScheduledDepartureDate,
cte.ScheduledDepartureAirport,
SUM(CASE
WHEN sac.CREWTYPE = 'F' THEN 1
ELSE 0
END) AS StagingCabinCrew,
SUM(CASE
WHEN sac.CREWTYPE = 'C' THEN 1
ELSE 0
END) AS StagingCockpitCrew,
SUM(CASE
WHEN cte.CrewType = 'F' THEN 1
ELSE 0
END) AS DataCabinCrew,
SUM(CASE
WHEN cte.CrewType = 'C' THEN 1
ELSE 0
END) AS DataCockpitCrew
FROM
(SELECT *,
Airline + CAST(FlightNumber AS VARCHAR) + Suffix AS StagingFlight,
ROW_NUMBER() OVER(PARTITION BY Airline + CAST(FlightNumber AS VARCHAR) + Suffix
ORDER BY UpdateId DESC) AS StageRowNumber
FROM Staging.SabreAssignedCrew) sac
LEFT JOIN CTE cte ON StagingFlight = DataFlight
AND sac.DepartureDate = cte.ScheduledDepartureDate
AND sac.DepartureAirport = cte.ScheduledDepartureAirport
AND sac.CREWTYPE = cte.CrewType
WHERE MONTH(sac.DepartureDate) = MONTH(GETDATE())
AND YEAR(sac.DepartureDate) = YEAR(GETDATE())
AND StageRowNumber = 1 --AND cte.ScheduledDepartureDate IS NOT NULL
--AND cte.ScheduledDepartureAirport IS NOT NULL
GROUP BY StagingFlight,
sac.DepartureDate,
sac.DepartureAirport,
cte.DataFlight,
cte.ScheduledDepartureDate,
cte.ScheduledDepartureAirport
The results are correct. All I need to do is add a condition so that only rows where StagingCabinCrew <> DataCabinCrew AND StagingCockpitCrew <> DataCockpitCrew are returned.
If a row appears, then we have found an error in the data. I just need help adding this condition, because the columns it refers to are aliases for SUM and CASE expressions, which I cannot reference in the WHERE clause.
I'm guessing you are trying to use an alias in the same query.
You can't do this, because the alias isn't recognized in the WHERE clause:
SELECT field1 + field2 as myField
FROM yourTable
WHERE myField > 3
You need to wrap it in a subquery (here, a CTE):
with cte2 as (
SELECT field1 + field2 as myField
FROM yourTable
)
SELECT *
FROM cte2
WHERE myField > 3
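or repeat the expression (for an aggregate such as SUM, the repeated expression has to go in a HAVING clause rather than in WHERE):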
SELECT field1 + field2 as myField
FROM yourTable
WHERE field1 + field2 > 3
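Since the aliases in the question are aggregates (SUM over CASE), here is a minimal sketch of both options applied to that situation: filtering with HAVING, and wrapping the grouped query in a CTE. It runs against SQLite through Python's sqlite3 module purely so it can be executed as-is; the crew table, its columns and its rows are made up for illustration, and only the cabin crew comparison is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-in for the joined staging/data rows.
    CREATE TABLE crew (flight TEXT, source TEXT, crew_type TEXT);
    INSERT INTO crew VALUES
        ('AA100', 'staging', 'F'), ('AA100', 'staging', 'F'), ('AA100', 'data', 'F'),
        ('BB200', 'staging', 'F'), ('BB200', 'data', 'F');
""")

# Grouped query that defines the aggregate aliases.
grouped = """
    SELECT flight,
           SUM(CASE WHEN source = 'staging' AND crew_type = 'F' THEN 1 ELSE 0 END) AS staging_cabin,
           SUM(CASE WHEN source = 'data' AND crew_type = 'F' THEN 1 ELSE 0 END) AS data_cabin
    FROM crew
    GROUP BY flight
"""

# Option 1: repeat the aggregate expressions in HAVING (not WHERE).
having_rows = conn.execute(grouped + """
    HAVING SUM(CASE WHEN source = 'staging' AND crew_type = 'F' THEN 1 ELSE 0 END)
        <> SUM(CASE WHEN source = 'data' AND crew_type = 'F' THEN 1 ELSE 0 END)
    ORDER BY flight
""").fetchall()

# Option 2: wrap the grouped query in a CTE, then filter on the aliases.
cte_rows = conn.execute("WITH counts AS (" + grouped + """)
    SELECT flight, staging_cabin, data_cabin
    FROM counts
    WHERE staging_cabin <> data_cabin
    ORDER BY flight
""").fetchall()

print(having_rows)
print(cte_rows)
```

Both variants return only the flight whose staging and data counts differ. The CTE form is usually easier to read when several aliases have to be compared.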
I have a SELECT statement that looks a bit like this (shortened here, as it's just selecting fields from an existing table and nothing overly complicated):
SELECT
CASE
WHEN dbo.Account_Inventory.NUMBER IS NULL THEN 'C' + CAST(CAST(dbo.Account_Inventory.CUST_ID AS bigint) AS nvarchar)
WHEN dbo.Account_Inventory.CUST_ID IS NULL THEN 'A' + CAST(CAST(dbo.Account_Inventory.ACCT_NUM AS bigint) AS nvarchar)
ELSE 'M' + CAST(CAST(dbo.Account_Inventory.NUMBER AS bigint) AS varchar)
END AS ID,
CASE...
...FROM
dbo.Account_Inventory LEFT OUTER JOIN dbo.Dorm ON dbo.Account_Inventory.ACCT_NUM = dbo.Dorm.ACCT_NUM
WHERE
(dbo.Account_Inventory.ACCT_CLOSE_DT IS NULL) AND
(CASE
WHEN dbo.Account_Inventory.XYZ = 'Yes' AND dbo.Account_Inventory.BUS_LINE_CDE IN ('BB', 'BBM', 'ABC', 'ABCD') THEN 'ABC'
WHEN dbo.Account_Inventory.XYZ = 'YES' THEN 'EFG'
ELSE dbo.Account_Inventory.GLOBAL_BUSINESS
END IN ('BIG', 'SMAL','ABC'))
ORDER BY
ID, dbo.Account_Inventory.INT_DAILY_RATE DESC
After this, I want to add a field which will flag the first record (ID field) and mark it as "Unique" and the other records as "na".
Any help is appreciated!
You can use a CASE expression:
case when row_number() over (order by ID, Rate desc) = 1 then 'Unique' else 'na' end
Here is how I finally made it work
WITH "temp_results" AS (SELECT
CASE
WHEN dbo.Account_Inventory.NUMBER IS NULL THEN 'C' + CAST(CAST(dbo.Account_Inventory.CUST_ID AS bigint) AS nvarchar)
WHEN dbo.Account_Inventory.CUST_ID IS NULL THEN 'A' + CAST(CAST(dbo.Account_Inventory.ACCT_NUM AS bigint) AS nvarchar)
ELSE 'M' + CAST(CAST(dbo.Account_Inventory.NUMBER AS bigint) AS varchar)
END AS ID,
CASE...
*MORE CASE STATEMENTS*
...FROM
dbo.Account_Inventory LEFT OUTER JOIN dbo.Dorm ON dbo.Account_Inventory.ACCT_NUM = dbo.Dorm.ACCT_NUM
WHERE
(dbo.Account_Inventory.ACCT_CLOSE_DT IS NULL) AND
(CASE
WHEN dbo.Account_Inventory.XYZ = 'Yes' AND dbo.Account_Inventory.BUS_LINE_CDE IN ('BB', 'BBM', 'ABC', 'ABCD') THEN 'ABC'
WHEN dbo.Account_Inventory.XYZ = 'YES' THEN 'EFG'
ELSE dbo.Account_Inventory.GLOBAL_BUSINESS
END IN ('BIG', 'SMAL','ABC'))
ORDER BY
ID, dbo.Account_Inventory.INT_DAILY_RATE DESC OFFSET 0 ROWS)
SELECT "Rel ID", "Rel_Name"...
*LIST ALL FIELDS*
...(CASE WHEN ROW_NUMBER() OVER
(PARTITION BY "Rel_ID" ORDER BY Rel_ID, INT_DAILY_RATE desc)=1 THEN 'Unique' ELSE 'na' END) AS "Unique_Rel_Flag" FROM "temp_results"
What I was stuck on the longest was ending the first ORDER BY with "OFFSET 0 ROWS". SQL Server rejects an ORDER BY inside a CTE unless TOP, OFFSET or FOR XML is also specified, so it will not work without that.
ORDER BY
ID, dbo.Account_Inventory.INT_DAILY_RATE DESC OFFSET 0 ROWS)
Hope that helps someone else out down the line!
I have a simple select query with some joins like:
SELECT
[c].[column1]
, [c].[column2]
FROM [Customer] AS [c]
INNER JOIN ...
So I do a left join with my principal table as:
LEFT JOIN [Communication] AS [com] ON [c].[CustomerGuid] = [com].[ComGuid]
This relationship is 1 to *: one customer can have multiple communications.
So in my select I want to get the value 1 or 0 depending on a condition:
Condition:
if the Communication table has a row with ComTypeKey value 3 and another row with value 4, return 1, otherwise 0
So I tried something like:
SELECT
[c].[column1]
, [c].[column2]
, IIF([com].[ComTypeKey] = 3 AND [com].[ComTypeKey] = 4,1,0)
FROM [Customer] AS [c]
INNER JOIN ...
LEFT JOIN [Communication] AS [com] ON [c].[CustomerGuid] = [com].[ComGuid]
But it returns two rows, because there are 2 rows in Communication. My desired result is a single row with value 1 if my condition is true.
If you have multiple rows you need GROUP BY, then count the distinct relevant keys and compare that count to 2 to get (1, 0)
SELECT
[c].[column1]
, [c].[column2]
, CASE WHEN COUNT(DISTINCT CASE WHEN [ComTypeKey] IN (3,4) THEN [ComTypeKey] END) = 2 THEN 1 ELSE 0 END as FLAG_CONDITION
FROM [Customer] AS [c]
INNER JOIN ...
LEFT JOIN [Communication] AS [com]
ON [c].[CustomerGuid] = [com].[ComGuid]
GROUP BY
[c].[column1]
, [c].[column2]
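A small sketch of the "both keys present" check, using COUNT(DISTINCT ...) so that two rows of the same type are not mistaken for one row of each type, and so that customers without any communication get 0. It runs against SQLite via Python's sqlite3 only so that it is executable; the lower-case table names, column names and sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: customer 1 has types 3 and 4,
    -- customer 2 has type 3 twice, customer 3 has no communication.
    CREATE TABLE customer (customer_guid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE communication (com_guid INTEGER, com_type_key INTEGER);
    INSERT INTO customer VALUES (1, 'both'), (2, 'only 3 twice'), (3, 'none');
    INSERT INTO communication VALUES (1, 3), (1, 4), (2, 3), (2, 3);
""")

rows = conn.execute("""
    SELECT c.customer_guid,
           c.name,
           -- 1 only when both key 3 and key 4 occur for this customer
           CASE WHEN COUNT(DISTINCT CASE WHEN com.com_type_key IN (3, 4)
                                         THEN com.com_type_key END) = 2
                THEN 1 ELSE 0 END AS flag_condition
    FROM customer AS c
    LEFT JOIN communication AS com ON c.customer_guid = com.com_guid
    GROUP BY c.customer_guid, c.name
    ORDER BY c.customer_guid
""").fetchall()

for row in rows:
    print(row)
```

Each customer appears exactly once, and only the customer with both types is flagged.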
I'm not really sure I understand.
This will literally find if both values 3 and 4 exist for that CustomerGuid, and only select one of them in that case - not filtering out any record otherwise.
If this is not what you want, providing sample data with the expected result would remove the ambiguity.
SELECT Field1,
Field2,
...
FieldN
FROM (SELECT TMP.*,
CASE WHEN hasBothValues = 1 THEN
ROW_NUMBER() OVER ( PARTITION BY CustomerGuid ORDER BY (SELECT NULL) )
ELSE 1
END AS interim_rn
FROM (SELECT TD.*,
MAX(CASE WHEN Value1 = '3' THEN 1 ELSE 0 END) OVER
( PARTITION BY CustomerGuid ) *
MAX(CASE WHEN Value1 = '4' THEN 1 ELSE 0 END) OVER
( PARTITION BY CustomerGuid ) AS hasBothValues
FROM TEST_DATA TD
) TMP
) TMP2
WHERE interim_rn = 1
I have a program with which my users can look up all the data traffic that happened in the last 7 days. I use a stored procedure to get that data, 250 records at a time (the user can page through it). The problem was that the users got a lot of timeouts when they wanted to see that data.
Here is the stored procedure before I tried to optimize it.
@MaxRecCount INT,
@PageOffset INT,
@IncludeData BIT
SELECT [Client], [Schema], [Version], [Records], [Fetched], [Receipted], [ProvidedAt], [FetchedAt], [ReceiptedAt],[PacketIds], [Record] FROM (
SELECT TOP(@MaxRecCount) MAX(bai_ExportPendingArchive.[UserName]) AS Client,
MAX(bai_ExportPendingArchive.Category) AS [Schema],
MAX(bai_ExportPendingArchive.ContractVersion) AS [Version],
COUNT(*) AS [Records],
SUM (CASE WHEN bai_ExportPendingAckArchive.ExportPendingId IS NULL THEN 0 ELSE 1 END) as [Fetched],
SUM (CASE WHEN bai_ExportPendingAckArchive.Receipted IS NULL THEN 0 ELSE 1 END) as [Receipted],
MAX(bai_ExportArchive.Inserted) AS [ProvidedAt],
MAX(CASE WHEN bai_ExportPendingAckArchive.ExportPendingId IS NULL THEN NULL ELSE bai_ExportPendingAckArchive.Inserted END) AS [FetchedAt],
MAX(CASE WHEN bai_ExportPendingAckArchive.Receipted IS NULL THEN NULL ELSE bai_ExportPendingAckArchive.Receipted END) AS [ReceiptedAt],
bai_ExportArchive.PacketIds AS [PacketIds],
NULL AS [Record],
ROW_NUMBER() Over (Order By MAX(bai_ExportArchive.Inserted) desc) as [RowNumber]
FROM bai_ExportArchive
INNER JOIN bai_ExportPendingArchive ON bai_ExportArchive.Id = bai_ExportPendingArchive.ExportId
LEFT OUTER JOIN bai_ExportPendingAckArchive ON bai_ExportPendingAckArchive.ExportPendingId = bai_ExportPendingArchive.Id
GROUP BY bai_ExportPendingArchive.[UserName], bai_ExportArchive.PacketIds, bai_ExportPendingArchive.Category
) AS InnerTable WHERE RowNumber > (@PageOffset * @MaxRecCount) and RowNumber <= (@PageOffset * @MaxRecCount + @MaxRecCount)
ORDER BY RowNumber
@MaxRecCount, @PageOffset and @IncludeData are parameters that come from my C# method.
This version needed about 1:35 min to return the data I wanted. To make the stored procedure faster, I inserted a WHERE clause that filters on the Inserted column (I also created an index on this column) and used OFFSET FETCH:
The stored procedure after the optimization:
@MaxRecCount INT,
@PageOffset INT,
@IncludeData BIT
Declare @pageStart int
Declare @pageEnd int
SET @pageStart = @PageOffset * @MaxRecCount
SET @pageEnd = @pageStart + @MaxRecCount + 50
IF @IncludeData = 0
BEGIN
SELECT [Client], [Schema], [Version], [Records], [Fetched], [Receipted], [ProvidedAt], [FetchedAt], [ReceiptedAt],[PacketIds], [Record] FROM (
SELECT TOP(@MaxRecCount) bai_ExportPendingArchive.[UserName] AS Client,
bai_ExportPendingArchive.Category AS [Schema],
MAX(bai_ExportPendingArchive.ContractVersion) AS [Version],
COUNT(*) AS [Records],
SUM (CASE WHEN bai_ExportPendingAckArchive.ExportPendingId IS NULL THEN 0 ELSE 1 END) as [Fetched],
SUM (CASE WHEN bai_ExportPendingAckArchive.Receipted IS NULL THEN 0 ELSE 1 END) as [Receipted],
MAX(bai_ExportArchive.Inserted) AS [ProvidedAt],
MAX(CASE WHEN bai_ExportPendingAckArchive.ExportPendingId IS NULL THEN NULL ELSE bai_ExportPendingAckArchive.Inserted END) AS [FetchedAt],
MAX(CASE WHEN bai_ExportPendingAckArchive.Receipted IS NULL THEN NULL ELSE bai_ExportPendingAckArchive.Receipted END) AS [ReceiptedAt],
bai_ExportArchive.PacketIds AS [PacketIds],
NULL AS [Record],
ROW_NUMBER() Over (Order By MAX(bai_ExportArchive.Inserted) desc) as [RowNumber]
FROM bai_ExportArchive
INNER JOIN bai_ExportPendingArchive ON bai_ExportArchive.Id = bai_ExportPendingArchive.ExportId
LEFT OUTER JOIN bai_ExportPendingAckArchive ON bai_ExportPendingAckArchive.ExportPendingId = bai_ExportPendingArchive.Id
Where bai_ExportArchive.Inserted <= (Select bai_ExportArchive.Inserted from bai_ExportArchive Order by bai_ExportArchive.Inserted DESC Offset @pageStart ROWS FETCH NEXT 1 ROWS Only)
And bai_ExportArchive.Inserted > (Select bai_ExportArchive.Inserted from bai_ExportArchive Order by bai_ExportArchive.Inserted DESC Offset @pageEnd ROWS FETCH NEXT 1 ROWS Only)
GROUP BY bai_ExportPendingArchive.[UserName], bai_ExportArchive.PacketIds, bai_ExportPendingArchive.Category
) AS InnerTable
ORDER BY RowNumber
This version returns the data in about 2 s. The only problem is that I work on Microsoft SQL Server 2014, but my users use SQL Server 2008 and later. OFFSET FETCH doesn't work in SQL Server 2008, and now I'm clueless about how to optimize my stored procedure so that it is fast and still works on SQL Server 2008.
I'm thankful for any help :)
Try this method to handle the pagination in SQL Server 2005/2008.
First use a CTE for your select query with a ROW_NUMBER() column to identify the record number. After that you can select a range of records from this CTE using your page number and page size. An example is below:
DECLARE @P_PAGE_NUM INT = 0
,@P_PAGE_SIZE INT = 20
;WITH CTE
AS
( /*SELECT ROW_NUMBER() OVER (ORDER BY COL_to_SORT DESC) AS [ROW_NO]
,...
WHERE ....
*/ -- You can replace your select query here, but column [ROW_NO] should be there in your select list.
--ie ROW_NUMBER() OVER (ORDER BY put_column-to-sort-here DESC) AS [ROW_NO]
)
SELECT *
--,( SELECT COUNT(*) FROM CTE) AS [TOTAL_ROW_COUNT]
FROM CTE
WHERE (
ISNULL(@P_PAGE_NUM,0) = 0 OR
[ROW_NO] BETWEEN ( @P_PAGE_NUM - 1) * @P_PAGE_SIZE + 1
AND @P_PAGE_NUM * @P_PAGE_SIZE
)
ORDER BY [ROW_NO]
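To make the BETWEEN arithmetic concrete, here is a tiny sketch of the row range it selects for a 1-based page number. The helper name page_bounds is made up for illustration, and the "page 0 or NULL means return everything" branch of the query is not modelled. Note that the stored procedure in the question uses a 0-based page offset, so it would have to pass offset + 1 as the page number.

```python
from typing import Tuple


def page_bounds(page_num: int, page_size: int) -> Tuple[int, int]:
    """Return the first and last ROW_NO (inclusive) of a 1-based page.

    Mirrors: ROW_NO BETWEEN (page_num - 1) * page_size + 1
                        AND page_num * page_size
    """
    first_row = (page_num - 1) * page_size + 1
    last_row = page_num * page_size
    return first_row, last_row


# Page 1 with 20 rows per page covers rows 1..20.
print(page_bounds(1, 20))

# A 0-based offset of 2 with 250 rows per page is page 3: rows 501..750.
zero_based_offset = 2
print(page_bounds(zero_based_offset + 1, 250))
```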
I want to do something like this:
select id,
count(*) as total,
FOR temp IN SELECT DISTINCT somerow FROM mytable ORDER BY somerow LOOP
sum(case when somerow = temp then 1 else 0 end) temp,
END LOOP;
from mytable
group by id
order by id
I created working select:
select id,
count(*) as total,
sum(case when somerow = 'a' then 1 else 0 end) somerow_a,
sum(case when somerow = 'b' then 1 else 0 end) somerow_b,
sum(case when somerow = 'c' then 1 else 0 end) somerow_c,
sum(case when somerow = 'd' then 1 else 0 end) somerow_d,
sum(case when somerow = 'e' then 1 else 0 end) somerow_e,
sum(case when somerow = 'f' then 1 else 0 end) somerow_f,
sum(case when somerow = 'g' then 1 else 0 end) somerow_g,
sum(case when somerow = 'h' then 1 else 0 end) somerow_h,
sum(case when somerow = 'i' then 1 else 0 end) somerow_i,
sum(case when somerow = 'j' then 1 else 0 end) somerow_j,
sum(case when somerow = 'k' then 1 else 0 end) somerow_k
from mytable
group by id
order by id
This works, but it is 'static': if a new value is added to 'somerow', I will have to change the SQL manually to cover all the values in the somerow column. That is why I'm wondering if it is possible to do something with a for loop.
So what I want to get is this:
id somerow_a somerow_b ....
0 3 2 ....
1 2 10 ....
2 19 3 ....
. ... ...
. ... ...
. ... ...
So what I'd like to do is count all the rows that have a specific letter and group them by id (this id isn't a primary key, it repeats; there are about 80 different possible values for id).
http://sqlfiddle.com/#!15/18feb/2
Are arrays good for you? (SQL Fiddle)
select
id,
sum(totalcol) as total,
array_agg(somecol) as somecol,
array_agg(totalcol) as totalcol
from (
select id, somecol, count(*) as totalcol
from mytable
group by id, somecol
) s
group by id
;
id | total | somecol | totalcol
----+-------+---------+----------
1 | 6 | {b,a,c} | {2,1,3}
2 | 5 | {d,f} | {2,3}
In 9.2 it is possible to have a set of JSON objects (Fiddle)
select row_to_json(s)
from (
select
id,
sum(totalcol) as total,
array_agg(somecol) as somecol,
array_agg(totalcol) as totalcol
from (
select id, somecol, count(*) as totalcol
from mytable
group by id, somecol
) s
group by id
) s
;
row_to_json
---------------------------------------------------------------
{"id":1,"total":6,"somecol":["b","a","c"],"totalcol":[2,1,3]}
{"id":2,"total":5,"somecol":["d","f"],"totalcol":[2,3]}
In 9.3, with the addition of lateral, a single object (Fiddle)
select to_json(format('{%s}', (string_agg(j, ','))))
from (
select format('%s:%s', to_json(id), to_json(c)) as j
from
(
select
id,
sum(totalcol) as total_sum,
array_agg(somecol) as somecol_array,
array_agg(totalcol) as totalcol_array
from (
select id, somecol, count(*) as totalcol
from mytable
group by id, somecol
) s
group by id
) s
cross join lateral
(
select
total_sum as total,
somecol_array as somecol,
totalcol_array as totalcol
) c
) s
;
to_json
---------------------------------------------------------------------------------------------------------------------------------------
"{1:{\"total\":6,\"somecol\":[\"b\",\"a\",\"c\"],\"totalcol\":[2,1,3]},2:{\"total\":5,\"somecol\":[\"d\",\"f\"],\"totalcol\":[2,3]}}"
In 9.2 it is also possible to have a single object in a more convoluted way using subqueries instead of lateral
SQL is very rigid about the return type. It demands to know what to return beforehand.
For a completely dynamic number of resulting values, you can only use arrays like @Clodoaldo posted. That is effectively a static return type: you do not get individual columns for each value.
If you know the number of columns at call time ("semi-dynamic"), you can create a function taking (and returning) polymorphic parameters. Closely related answer with lots of details:
Dynamic alternative to pivot with CASE and GROUP BY
(You also find a related answer with arrays from @Clodoaldo there.)
Your remaining option is to use two round-trips to the server. The first to determine the actual query with the actual return type. The second to execute the query based on the first call.
Else, you have to go with a static query. While doing that, I see two nicer options for what you have right now:
1. Simpler expression
select id
, count(*) AS total
, count(somecol = 'a' OR NULL) AS somerow_a
, count(somecol = 'b' OR NULL) AS somerow_b
, ...
from mytable
group by id
order by id;
How does it work?
Compute percents from SUM() in the same SELECT sql query
SQL Fiddle.
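In short: the comparison yields TRUE, FALSE or NULL; TRUE OR NULL is TRUE while FALSE OR NULL is NULL; and count() only counts non-NULL values, so only matching rows are counted. A minimal sketch that compares this with the sum(case ...) form, run against SQLite through Python's sqlite3 as a stand-in (SQLite follows the same three-valued logic here); the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample rows, including one NULL value.
    CREATE TABLE mytable (id INTEGER, somecol TEXT);
    INSERT INTO mytable VALUES (1, 'a'), (1, 'b'), (1, 'b'), (2, 'a'), (2, NULL);
""")

rows = conn.execute("""
    SELECT id,
           count(*)                                        AS total,
           count(somecol = 'a' OR NULL)                    AS somerow_a,
           sum(CASE WHEN somecol = 'a' THEN 1 ELSE 0 END)  AS somerow_a_sum,
           count(somecol = 'b' OR NULL)                    AS somerow_b
    FROM mytable
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

The count(... OR NULL) and sum(case ...) columns agree for every id, including the group that contains a NULL.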
2. crosstab()
crosstab() is more complex at first, but written in C, optimized for the task and shorter for long lists. You need the additional module tablefunc installed. Read the basics here if you are not familiar:
PostgreSQL Crosstab Query
SELECT * FROM crosstab(
$$
SELECT id
, count(*) OVER (PARTITION BY id)::int AS total
, somecol
, count(*)::int AS ct -- casting to int, don't think you need bigint?
FROM mytable
GROUP BY 1,3
ORDER BY 1,3
$$
,
$$SELECT unnest('{a,b,c,d}'::text[])$$
) AS f (id int, total int, a int, b int, c int, d int);
How can I improve the SQL query below (SQL Server 2008)? I want to try to avoid sub-selects, and I'm using a couple of them to produce results like this
StateId TotalCount SFRCount OtherCount
---------------------------------------------------------
AZ 102 50 52
CA 2931 2750 181
etc...
SELECT
StateId,
COUNT(*) AS TotalCount,
(SELECT COUNT(*) AS Expr1 FROM Property AS P2
WHERE (PropertyTypeId = 1) AND (StateId = P.StateId)) AS SFRCount,
(SELECT COUNT(*) AS Expr1 FROM Property AS P3
WHERE (PropertyTypeId <> 1) AND (StateId = P.StateId)) AS OtherCount
FROM Property AS P
GROUP BY StateId
HAVING (COUNT(*) > 99)
ORDER BY StateId
This may work the same, hard to test without data
SELECT
StateId,
COUNT(*) AS TotalCount,
SUM(CASE WHEN PropertyTypeId = 1 THEN 1 ELSE 0 END) as SFRCount,
SUM(CASE WHEN PropertyTypeId <> 1 THEN 1 ELSE 0 END) as OtherCount
FROM Property AS P
GROUP BY StateId
HAVING (COUNT(*) > 99)
ORDER BY StateId
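Since there was no data to test with, here is a small sketch that runs both forms side by side and checks that they agree. It uses SQLite through Python's sqlite3 as a stand-in for SQL Server; the sample rows are made up, and the HAVING threshold is lowered from 99 to 2 so the tiny data set produces output. One row has a NULL PropertyTypeId to show that both forms leave it out of SFRCount and OtherCount alike.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data; NV has too few rows to pass the HAVING filter.
    CREATE TABLE Property (StateId TEXT, PropertyTypeId INTEGER);
    INSERT INTO Property VALUES
        ('AZ', 1), ('AZ', 1), ('AZ', 2),
        ('CA', 1), ('CA', 3), ('CA', NULL),
        ('NV', 1);
""")

# Original form: one correlated sub-select per count.
subselect_rows = conn.execute("""
    SELECT StateId,
           COUNT(*) AS TotalCount,
           (SELECT COUNT(*) FROM Property AS P2
             WHERE P2.PropertyTypeId = 1 AND P2.StateId = P.StateId) AS SFRCount,
           (SELECT COUNT(*) FROM Property AS P3
             WHERE P3.PropertyTypeId <> 1 AND P3.StateId = P.StateId) AS OtherCount
    FROM Property AS P
    GROUP BY StateId
    HAVING COUNT(*) > 2
    ORDER BY StateId
""").fetchall()

# Rewritten form: conditional aggregation in a single pass.
case_rows = conn.execute("""
    SELECT StateId,
           COUNT(*) AS TotalCount,
           SUM(CASE WHEN PropertyTypeId = 1 THEN 1 ELSE 0 END) AS SFRCount,
           SUM(CASE WHEN PropertyTypeId <> 1 THEN 1 ELSE 0 END) AS OtherCount
    FROM Property AS P
    GROUP BY StateId
    HAVING COUNT(*) > 2
    ORDER BY StateId
""").fetchall()

print(subselect_rows)
print(case_rows)
```

With a NULL PropertyTypeId present, SFRCount + OtherCount is smaller than TotalCount in both forms, which matters if OtherCount is instead derived as TotalCount minus SFRCount.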
Your alternative is a single self-join of Property using your WHERE conditions as a join parameter. The OtherCount can be derived by subtracting SFRCount from TotalCount in a derived query.
Another alternative would be to use the PIVOT function like this:
SELECT StateID, [1] + [2] AS TotalCount, [1] AS SFRCount, [2] AS OtherCount
FROM Property
PIVOT ( COUNT(PropertyTypeID)
FOR PropertyTypeID IN ([1],[2])
) AS pvt
WHERE [1] + [2] > 99
You would need to add an entry for each property type which could be daunting but it is another alternative. Scott has a great answer.
If PropertyTypeId is not null then you could do this with a single join. Count is faster than Sum, but is Count plus Join faster than Sum? The test case below mimics your data. docSVsys has 800,000 rows and there are about 300 unique values for caseID. The Count plus Join in this test case is slightly faster than the Sum. But if I remove the with (nolock) then Sum is about 1/4 faster. You would need to test with your data.
select GETDATE()
go;
select caseID, COUNT(*) as Ttl,
SUM(CASE WHEN mimeType = 'message/rfc822' THEN 1 ELSE 0 END) as SFRCount,
SUM(CASE WHEN mimeType <> 'message/rfc822' THEN 1 ELSE 0 END) as OtherCount,
COUNT(*) - SUM(CASE WHEN mimeType = 'message/rfc822' THEN 1 ELSE 0 END) as OtherCount2
from docSVsys with (nolock)
group by caseID
having COUNT(*) > 1000
select GETDATE()
go;
select docSVsys.caseID, COUNT(*) as Ttl
, COUNT(primaryCount.sID) as priCount
, COUNT(*) - COUNT(primaryCount.sID) as otherCount
from docSVsys with (nolock)
left outer join docSVsys as primaryCount with (nolock)
on primaryCount.sID = docSVsys.sID
and primaryCount.mimeType = 'message/rfc822'
group by docSVsys.caseID
having COUNT(*) > 1000
select GETDATE()
go;