Count items in SQL Server - sql

This is my database
I want to count the bikes that are currently available per RouteCode within the SAME EXPIRY WEEK. If the expiry weeks differ, the RouteCode may appear again; otherwise the RouteCode should appear once with its BikeQuantity.
This is my problem: RouteCode = G shows up twice with 1 bike each, even though both bikes expire in the same week. How can I get it to show 2 in the BikeQuantity column?

The first problem is that you are GROUPing by 'FoundDate'. You need to group by 'ExpiryWeek' if you want to aggregate (i.e. sum) the number of bikes on a per-ExpiryWeek basis.
The second problem is that you are SELECTing 'FoundDate' and 'ExpiryDate'. You cannot select these columns if you're grouping on 'ExpiryWeek' because there's no way to aggregate the data (dates) in those columns*. It follows that you should not order by 'ExpiryDate' (since this column won't appear in the output table).
( * ...because if you have two entries for a given expiry week, each with a different 'FoundDate', what would you expect to see in the 'FoundDate' column for that row of the result table?)
Change your SELECT clause to this:
SELECT li.RouteCode,
DATEPART(WK,DATEADD(WEEK, 4, FoundDate)) as ExpiryWeek,
COUNT(b.BikeId) As BikeQuantity
and change your GROUP BY clause to this:
GROUP BY li.RouteCode, DATEPART(WK,DATEADD(WEEK, 4, FoundDate))
ORDER BY DATEPART(WK,DATEADD(WEEK, 4, FoundDate));
and your query should work.
(Because 'ExpiryWeek' is a calculated column, you have to supply the same calculation in the GROUP BY and ORDER BY clauses - just specifying 'ExpiryWeek' won't work, at least for some variants of SQL. There are ways around this: for example, you could use a 'With' clause. See this answer for examples which avoid duplication: How to group by a Calculated Field)
Based on the latest code you have posted, the correct query should be:
select li.RouteCode,
DATEPART(WK,DATEADD(WEEK, 4, b.FoundDate)) as ExpiryWeek,
COUNT(b.BikeId) as BikeQuantity
FROM dbo.Bike b
LEFT OUTER JOIN dbo.Contact ct ON b.ContactId = ct.ContactId
INNER JOIN dbo.LocationInfo li ON li.PostCode = ct.PostCode
WHERE DATEADD(day, 30, FoundDate) >= GETDATE()
Group by li.RouteCode, DATEPART(WK,DATEADD(WEEK, 4, FoundDate))
ORDER BY DATEPART(WK,DATEADD(WEEK, 4, FoundDate));
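The grouping idea can be tried out on a toy data set. The sketch below is only an illustration, not the asker's schema: it uses SQLite through Python's sqlite3 module instead of SQL Server, a single hypothetical `bike` table, and `strftime('%W', ...)` as a stand-in for `DATEPART(WK, ...)`, so the week numbers will not necessarily match SQL Server's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified table: route code stored directly on the bike.
    CREATE TABLE bike (
        bike_id    INTEGER PRIMARY KEY,
        route_code TEXT,
        found_date TEXT
    );
    INSERT INTO bike (route_code, found_date) VALUES
        ('G', '2024-01-01'),
        ('G', '2024-01-03'),   -- expires in the same week as the row above
        ('G', '2024-01-10'),   -- expires one week later
        ('H', '2024-01-02');
""")

# Group by the route and by the *calculated* expiry week, repeating the
# calculation in GROUP BY just as the answer does for DATEPART/DATEADD.
rows = conn.execute("""
    SELECT route_code,
           strftime('%W', date(found_date, '+28 days')) AS expiry_week,
           COUNT(bike_id) AS bike_quantity
    FROM bike
    GROUP BY route_code, strftime('%W', date(found_date, '+28 days'))
    ORDER BY expiry_week, route_code
""").fetchall()

for row in rows:
    print(row)
```

Because neither FoundDate nor ExpiryDate is selected or grouped on, the two G bikes that expire in the same week collapse into one row with a quantity of 2.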

Related

Two Tables/Databases error: Only one expression can be specified in the select list when the subquery is not introduced with EXISTS

Using SQL Server I'm trying to multiply two columns of a distinct part number using 2 tables and 2 databases but it gives this error:
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.
This SQL joins two tables in different databases and shows the Final Part Number (FinalPartNo) for a bumper and the ancillary parts needed to put it together (bolts, brackets, etc.).
Query:
SELECT
tb_1.FinalPartNo,
tb_1.SubPartType,
tb_1.SubPart,
tb_1.FinalItemSubPartQuantity,
tb_2.PurchasedOrMfg,
tb_2.SalesWeek1,
tb_2.SalesWeek2
FROM [009Reports].[dbo].[ANC Parts] tb_1
JOIN [555].[cache].[PurchasingSupplyChainNeeds] tb_2 ON tb_1.FinalPartNo = tb_2.ItemNo
So it displays this:
If you look at the table, to simplify things, I highlighted 3 part numbers that all use the same SubPart. Two of them use 4 of the FinalSubPartQuantity and one uses 2 in the "install" step. For SalesWeek1 of the highlighted rows, a FinalPartNo that requires 4 FinalSubPartQuantity had two sales, so that totals 8 needed for that week. I don't need the FinalPartNo but added it to show that multiple FinalPartNo values share the same sub part.
Trying to figure to sum them up with a totals column for each SubPart for that week (for 52 weeks, just showing 2).
In this example, 03CSFY-0500350 for SalesWeek1 could total 150 having it on
multiple FinalPartNo and multiple steps (Fabricate, Assembly, Install).
So, I tried a subquery to make the SubPart distinct and multiply FinalItemSubPartQuantity by SalesWeek1 to get TotalSalesWeek1, but I am getting the error above. I'm trying to figure out the syntax.
SELECT
tb_1.SubPart,
tb_1.FinalItemSubPartQuantity,
TotalSalesWeek1 = (SELECT DISTINCT(tb_1.SubPart),
tb_1.FinalItemSubPartQuantity * tb_2.SalesWeek1),
TotalSalesWeek2 = (SELECT DISTINCT(tb_1.SubPart),
tb_1.FinalItemSubPartQuantity * tb_2.SalesWeek2)
FROM [009Reports].[dbo].[ANC Parts] tb_1
JOIN [555].[cache].[PurchasingSupplyChainNeeds] tb_2 ON tb_1.FinalPartNo = tb_2.ItemNo
I'm just trying to display:
SubPart/FinalSubPartQuantity/TotalSalesWk1/TotalSalesWk2/TotalSalesWk3/ up to week 52. So it just shows the sub part, the sum of all the FinalSubPartQuantity amounts for that part across the different FinalPartNo's, and the total sales (FinalItemSubPartQuantity * SalesWeek1, 2, 3...).
To summarize: the sub part and how many of that part were sold that week.
You can't set the TotalSalesWeek1 to two columns (DISTINCT(tb_1.SubPart) and tb_1.FinalItemSubPartQuantity * tb_2.SalesWeek1).
I would suggest something like the following
SELECT
tb_1.SubPart,
SUM(tb_1.FinalItemSubPartQuantity) FinalItemSubPartQuantity,
SUM(tb_1.FinalItemSubPartQuantity * tb_2.SalesWeek1) TotalSalesWeek1,
SUM(tb_1.FinalItemSubPartQuantity * tb_2.SalesWeek2) TotalSalesWeek2
FROM [009Reports].[dbo].[ANC Parts] tb_1
JOIN [555].[cache].[PurchasingSupplyChainNeeds] tb_2 ON tb_1.FinalPartNo = tb_2.ItemNo
GROUP BY tb_1.SubPart
The GROUP BY tb_1.SubPart at the end says you want each unique SubPart on a row, the SUMs in the SELECT explain that you want those values summed for each group.
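To see the effect of the GROUP BY on concrete numbers, here is a minimal sketch using SQLite through Python's sqlite3 module rather than SQL Server. The table names `anc_parts` and `supply_needs`, the shortened column names, and all data values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for [ANC Parts] and [PurchasingSupplyChainNeeds].
    CREATE TABLE anc_parts (final_part_no TEXT, sub_part TEXT, sub_part_qty INTEGER);
    CREATE TABLE supply_needs (item_no TEXT, sales_week1 INTEGER, sales_week2 INTEGER);

    INSERT INTO anc_parts VALUES
        ('A', 'BOLT', 4),
        ('B', 'BOLT', 4),
        ('C', 'BOLT', 2);
    INSERT INTO supply_needs VALUES
        ('A', 2, 1),
        ('B', 0, 3),
        ('C', 5, 0);
""")

# One output row per sub part; each SUM runs over every final part using it.
rows = conn.execute("""
    SELECT p.sub_part,
           SUM(p.sub_part_qty)                 AS sub_part_qty,
           SUM(p.sub_part_qty * n.sales_week1) AS total_sales_week1,
           SUM(p.sub_part_qty * n.sales_week2) AS total_sales_week2
    FROM anc_parts p
    JOIN supply_needs n ON p.final_part_no = n.item_no
    GROUP BY p.sub_part
""").fetchall()

print(rows)
```

For week 1 the total is 4*2 + 4*0 + 2*5 = 18 bolts, which is the per-row multiplication done first and the summing done afterwards.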

How do I do a sum per id?

SELECT distinct
A.PROPOLN, C.LIFCLNTNO, A.PROSASORG, sum (A.PROSASORG) as sum
FROM [FPRODUCTPF] A
join [FNBREQCPF] B on (B.IQCPLN=A.PROPOLN)
join [FLIFERATPF] C on (C.LIFPOLN=A.PROPOLN and C.LIFPRDCNT=A.PROPRDCNT and C.LIFBNFCNT=A.PROBNFCNT)
where C.LIFCLNTNO='2012042830507' and A.PROSASORG>0 and A.PROPRDSTS='10' and
A.PRORECSTS='1' and A.PROBNFLVL='M' and B.IQCODE='B10000' and B.IQAPDAT>20180101
group by C.LIFCLNTNO, A.PROPOLN, A.PROSASORG
This does not sum correctly, it returns two lines instead of one:
PROPOLN LIFCLNTNO PROSASORG sum
1 209814572 2012042830507 3881236 147486968
2 209814572 2012042830507 15461074 463832220
You are seeing two rows because A.PROSASORG has two different values for the "C.LIFCLNTNO, A.PROPOLN" grouping.
i.e.
C.LIFCLNTNO, A.PROPOLN, A.PROSASORG together give you two unique rows.
If you want a single row for C.LIFCLNTNO, A.PROPOLN, then you may want to use an aggregate on A.PROSASORG as well.
Your entire query is being filtered on your "C" table by the one LifClntNo,
so you can leave that out of your group by and just have it as a MAX() value
in your select since it will always be the same value.
As for summing the PROSASORG column (per the comment on the other answer), just sum it. Your column names do not make their purpose clear, so I don't know if it's just a number, a quantity, or something else. You might want to pull that column out of your query completely if you want one row per product ID.
For performance, I would suggest the following indexes:
Table        Index
FPRODUCTPF   ( PROPRDSTS, PRORECSTS, PROBNFLVL, PROPOLN )
FNBREQCPF    ( IQCODE, IQCPLN, IQAPDAT )
FLIFERATPF   ( LIFPOLN, LIFPRDCNT, LIFBNFCNT, LIFCLNTNO )
I have rewritten your query so that the conditions on each joined table sit in that table's JOIN clause instead of all being in the WHERE clause.
SELECT
P.PROPOLN,
max( L.LIFCLNTNO ) LIFCLNTNO,
sum (P.PROSASORG) as sum
FROM
[FPRODUCTPF] P
join [FNBREQCPF] N
on N.IQCODE = 'B10000'
and P.PROPOLN = N.IQCPLN
and N.IQAPDAT > 20180101
join [FLIFERATPF] L
on L.LIFCLNTNO='2012042830507'
and P.PROPOLN = L.LIFPOLN
and P.PROPRDCNT = L.LIFPRDCNT
and P.PROBNFCNT = L.LIFBNFCNT
where
P.PROPRDSTS = '10'
and P.PRORECSTS = '1'
and P.PROBNFLVL = 'M'
and P.PROSASORG > 0
group by
P.PROPOLN
Now, one additional issue you will PROBABLY run into. You are doing a query with multiple joins, and it appears that there may be multiple records in EACH of your FNBREQCPF and FLIFERATPF tables for the same FPRODUCTPF entry. If so, you will get a Cartesian result, because the PROSASORG value will be counted once for each combination of matching rows in the two other tables.
Ex: FProductPF has ID = X with a Prosasorg value of 3
FNBreQCPF has matching records of Y1 and Y2
FLIFERATPF has matching records of Z1, Z2 and Z3.
So now your total will be equal to 3 times 6 = 18.
If you look at the combinations, Y1:Z1, Y1:Z2, Y1:Z3 AND Y2:Z1, Y2:Z2, Y2:Z3 give you 6 qualifying entries, times the original value of 3, thus bloating your numbers -- IF such multiple records may exist in each respective table. Now, imagine your tables have 30 and 40 matching instances respectively: you have just bloated your totals by 1200 times.
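The inflation described above is easy to reproduce. The sketch below uses SQLite through Python's sqlite3 module with three made-up tables (`product`, `req`, `rate`) that mirror the X / Y1-Y2 / Z1-Z3 example. The EXISTS rewrite at the end is one possible way to avoid the fan-out, offered as an assumption about what the asker wants, not as part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables mirroring the X / Y / Z example.
    CREATE TABLE product (id TEXT, prosasorg INTEGER);
    CREATE TABLE req  (product_id TEXT, label TEXT);
    CREATE TABLE rate (product_id TEXT, label TEXT);

    INSERT INTO product VALUES ('X', 3);
    INSERT INTO req  VALUES ('X', 'Y1'), ('X', 'Y2');
    INSERT INTO rate VALUES ('X', 'Z1'), ('X', 'Z2'), ('X', 'Z3');
""")

# Joining both child tables yields 2 * 3 = 6 rows for product X,
# so the value 3 is summed six times.
inflated_total = conn.execute("""
    SELECT SUM(p.prosasorg)
    FROM product p
    JOIN req  r ON r.product_id = p.id
    JOIN rate t ON t.product_id = p.id
""").fetchone()[0]

# Testing for matching rows with EXISTS keeps one row per product.
plain_total = conn.execute("""
    SELECT SUM(p.prosasorg)
    FROM product p
    WHERE EXISTS (SELECT 1 FROM req  r WHERE r.product_id = p.id)
      AND EXISTS (SELECT 1 FROM rate t WHERE t.product_id = p.id)
""").fetchone()[0]

print(inflated_total, plain_total)
```

Whether the EXISTS form is appropriate depends on whether the child rows are only used as filters, as they appear to be in the query above.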

SQL - sum column for every date

This seemed like a very easy thing to do but I got stuck. I have a query like this:
select op.date, count(p.numberofoutstanding)
from people p
left join outstandingpunches op
on p.fullname = op.fullname
group by op.date
That outputs a table like this:
How can I sum over the dates so that the value in each row equals the sum of all counts up to that date? For example, the first row would be 27, the second would be 27 + 4, the third 27 + 4 + 11, etc.
I came across this and this question, and I saw people using OVER in their queries for this, but I'm confused about what I have to partition by. I tried partitioning by date, but it gives me incorrect results.
You can use a cumulative sum. This looks like:
select op.date, count(*),
sum(count(*)) over (order by op.date) as running_count
from people p join
outstandingpunches op
on p.fullname = op.fullname
group by op.date;
Note: I changed the join from a left join to an inner join. You are aggregating by a column in the second table. Your results have no examples of a NULL date column and that doesn't seem useful. Hence, it seems that rows are assumed to match.
I believe you need to use sum and not count.
select o.date_c,
sum(sum(p.numberofoutstanding)) over (order by o.date_c)
from people p
left join outstandingpunches o on p.fullname = o.fullname
group by o.date_c;
Keep in mind that I have renamed your column date to date_c. I believe you should not use data type names as column names.
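A running total of this kind can be checked on the numbers from the question (27, 4, 11). The sketch below uses SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later) and a single made-up `outstanding` table. It computes the per-date sums in a derived table first and then applies the window SUM, which should give the same result as the nested `sum(sum(...)) over (...)` form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one row per outstanding item, amounts are made up.
    CREATE TABLE outstanding (date_c TEXT, amount INTEGER);
    INSERT INTO outstanding VALUES
        ('2024-01-01', 20),
        ('2024-01-01', 7),
        ('2024-01-02', 4),
        ('2024-01-03', 11);
""")

# Inner query: one row per date. Outer query: cumulative sum ordered by date.
# No PARTITION BY is needed, because the total runs across all dates.
rows = conn.execute("""
    SELECT date_c,
           daily_total,
           SUM(daily_total) OVER (ORDER BY date_c) AS running_total
    FROM (
        SELECT date_c, SUM(amount) AS daily_total
        FROM outstanding
        GROUP BY date_c
    )
    ORDER BY date_c
""").fetchall()

for row in rows:
    print(row)
```

Partitioning by date, as tried in the question, restarts the sum for every date, which is why it only reproduces the per-date value.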

left join not doing as expected with sum and group by

This is all going to have to be pseudocode, as I am on my phone and have no internet access right now because I have just moved, but it's bugging the crap out of me. This also means I can't do code blocks, so please bear with me: I'll try.
I have a table with amounts in it, and I have a table with labels. I want to sum the amounts in the first table grouped by the labels. The problem is, if there are no records for a label existing in the table with the amounts then I don't get a record in the result set for that label. I need a record there with nulls for the amount tables field. Here is what some sample data might look like:
Amount_table:
Columns: id, tpa, amt, link_to_label_table
Data:
1, GTL, 2000, 1
2, GTL, 1000, 1
Label_table:
Columns: link_to_amount_table, label_name
Data:
1, Label1
2, Label2
Query:
Select at.tpa, sum(at.amt) as amt, lt.label_name
From Amount_table as at
Left join Label_table lt on lt.link_to_amount_table = at.link_to_label_table
Where at.tpa = 'GTL'
Group by lt.label_name, at.tpa
Now this returns:
GTL, 3000, Label1
I tried selecting from the labels table then left joining the amount table and it still didn't give my desired results which are:
GTL, 3000, Label1
Null, Null, Label2
Is this possible with the sum and group by? The fields being grouped by have to be there otherwise you get an error.
This is in DB2 by the way. Is there any way possible to get this to return the way I need it? I have to get the labels; they are dynamic.
On the face of it, you want to have your labels table as the dominant table and the amounts table as the one that is outer joined.
SELECT a.tpa, sum(a.amt) as amt, l.label_name
FROM Label_table AS l
LEFT JOIN Amount_table AS a
ON l.link_to_amount_table = a.link_to_label_table
GROUP BY l.label_name, a.tpa
You have a condition Amount_table.tpa = 'GTL'; it is not entirely clear why you have that, but presumably it is significant with more data in the tables. There are (at least) two ways you can incorporate that condition into the query (other than the one you chose - which eliminates the rows where a.tpa is null).
SELECT a.tpa, sum(a.amt) as amt, l.label_name
FROM Label_table AS l
LEFT JOIN Amount_table AS a
ON l.link_to_amount_table = a.link_to_label_table
AND a.tpa = 'GTL'
GROUP BY l.label_name, a.tpa
Or:
SELECT a.tpa, sum(a.amt) as amt, l.label_name
FROM Label_table AS l
LEFT JOIN (SELECT *
FROM Amount_table
WHERE tpa = 'GTL') AS a
ON l.link_to_amount_table = a.link_to_label_table
GROUP BY l.label_name, a.tpa
A decent optimizer will produce the same query plan for both, so it probably doesn't matter which you use. There's an argument that suggests the second alternative is cleaner in that the ON clause is primarily for joining conditions, and the filter condition on a.tpa is not a joining condition. There's another argument that says the first alternative avoids a sub-query and is therefore preferable. I'd validate that the query plans are the same and would probably choose the second, but it is a somewhat nebulous decision based on a mild preference.
You were so close on your second try. Change WHERE to AND. This has the effect of applying at.tpa='GTL' to the JOIN instead of applying it to the filter so you don't filter out the NULLs.
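The difference between filtering in WHERE and filtering in the ON clause can be shown with the sample data from the question. This is a sketch using SQLite through Python's sqlite3 module rather than DB2; the table and column names follow the question's description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Amount_table (id INTEGER, tpa TEXT, amt INTEGER, link_to_label_table INTEGER);
    CREATE TABLE Label_table (link_to_amount_table INTEGER, label_name TEXT);

    INSERT INTO Amount_table VALUES (1, 'GTL', 2000, 1), (2, 'GTL', 1000, 1);
    INSERT INTO Label_table VALUES (1, 'Label1'), (2, 'Label2');
""")

# Filter in WHERE: applied after the outer join, so the NULL-extended
# row for Label2 is thrown away again.
where_rows = conn.execute("""
    SELECT a.tpa, SUM(a.amt) AS amt, l.label_name
    FROM Label_table AS l
    LEFT JOIN Amount_table AS a
           ON l.link_to_amount_table = a.link_to_label_table
    WHERE a.tpa = 'GTL'
    GROUP BY l.label_name, a.tpa
    ORDER BY l.label_name
""").fetchall()

# Filter in ON: applied while matching, so Label2 survives with NULLs.
on_rows = conn.execute("""
    SELECT a.tpa, SUM(a.amt) AS amt, l.label_name
    FROM Label_table AS l
    LEFT JOIN Amount_table AS a
           ON l.link_to_amount_table = a.link_to_label_table
          AND a.tpa = 'GTL'
    GROUP BY l.label_name, a.tpa
    ORDER BY l.label_name
""").fetchall()

print(where_rows)
print(on_rows)
```

Only the second query returns a row for Label2, with NULL in both amount-table columns.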

Aggregate SQL Function to grab only the first from each group

I have 2 tables - an Account table and a Users table. Each account can have multiple users. I have a scenario where I want to execute a single query/join against these two tables, but I want all the Account data (Account.*) and only the first set of user data (specifically their name).
Instead of doing a "min" or "max" on my aggregated group, I wanted to do a "first". But, apparently, there is no "First" aggregate function in TSQL.
Any suggestions on how to go about getting this query? Obviously, it is easy to get the cartesian product of Account x Users:
SELECT User.Name, Account.* FROM Account, User
WHERE Account.ID = User.Account_ID
But how might I go about getting only the first user from the product, based on the order of their User.ID?
Rather than grouping, go about it like this...
select
*
from account a
join (
select
account_id,
row_number() over (order by account_id, id) -
rank() over (order by account_id) as row_num
from [user]
) first on first.account_id = a.id and first.row_num = 0
I know my answer is a bit late, but that might help others. There is a way to achieve a First() and Last() in SQL Server, and here it is :
Stuff(Min(Convert(Varchar, DATE_FIELD, 126) + Convert(Varchar, DESIRED_FIELD)), 1, 23, '')
Use Min() for First() and Max() for Last(). The DATE_FIELD should be the date that determines whether a record is the first or the last. The DESIRED_FIELD is the field whose first or last value you want. What it does is:
Add the date in ISO format at the start of the string (23 characters long)
Append the DESIRED_FIELD to that string
Get the MIN/MAX value of that string (since it starts with the date, you will get the first or last record)
Stuff that concatenated string to remove the first 23 characters (the date part)
Here you go!
EDIT: I ran into a problem with the first formula: when the DATE_FIELD has .000 as milliseconds, SQL Server returns the date as a string with NO milliseconds at all, so the STUFF removes the first 4 characters of the DESIRED_FIELD. I simply changed the format to "20" (without milliseconds) and it works well. The only downside is that if two records were created in the same second, the order between them may not be reliable... in which case you can revert to "126" for the format.
Stuff(Max(Convert(Varchar, DATE_FIELD, 20) + Convert(Varchar, DESIRED_FIELD)), 1, 19, '')
EDIT 2: My original intent was to return the last (or first) NON NULL row. I was asked how to return the last or first row, whether it is null or not. Simply add an ISNULL around the DESIRED_FIELD. When you concatenate two strings with the + operator and one of them is NULL, the result is NULL. So use the following:
Stuff(Max(Convert(Varchar, DATE_FIELD, 20) + IsNull(Convert(Varchar, DESIRED_FIELD), '')), 1, 19, '')
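The prefix-and-strip idea behind this formula can be illustrated outside SQL Server. The sketch below uses SQLite through Python's sqlite3 module, with `||` and `substr` standing in for `+` and `STUFF`, and a made-up `events` table whose timestamps are stored as fixed-width 19-character strings. It is an analogue of the technique, not the T-SQL formula itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table; created_at is always 'YYYY-MM-DD HH:MM:SS' (19 chars).
    CREATE TABLE events (grp INTEGER, created_at TEXT, val TEXT);
    INSERT INTO events VALUES
        (1, '2024-01-02 10:00:00', 'b'),
        (1, '2024-01-01 09:00:00', 'a'),
        (1, '2024-01-03 08:00:00', 'c'),
        (2, '2024-02-01 00:00:00', 'z');
""")

# Prefix each value with its fixed-width timestamp, take MIN/MAX of the
# combined string, then cut the 19-character prefix off again.
rows = conn.execute("""
    SELECT grp,
           substr(MIN(created_at || val), 20) AS first_val,
           substr(MAX(created_at || val), 20) AS last_val
    FROM events
    GROUP BY grp
    ORDER BY grp
""").fetchall()

print(rows)
```

The trick only works because every timestamp string has the same length and sorts chronologically as text, which is exactly the property the milliseconds problem in the first formula broke.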
Select *
From Accounts a
Left Join (
Select u.*,
row_number() over (Partition By u.AccountKey Order By u.UserKey) as Ranking
From Users u
) as UsersRanked
on UsersRanked.AccountKey = a.AccountKey and UsersRanked.Ranking = 1
This can be simplified by using the Partition By clause. In the above, if an account has three users, then the subquery numbers them 1, 2, and 3, and for a different AccountKey it resets the numbering. This means that for each unique AccountKey there will always be a 1, and potentially 2, 3, 4, etc.
Thus you filter on Ranking=1 to grab the first from each group.
This will give you one row per account, and if there is at least one user for that account, then it will give you the user with the lowest key(because I use a left join, you will always get an account listing even if no user exists). Replace Order By u.UserKey with another field if you prefer that the first user be chosen alphabetically or some other criteria.
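The row-numbering approach can be tried on a small data set. This is a sketch using SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later); the column names `AccountName` and `UserName` and all rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Accounts (AccountKey INTEGER PRIMARY KEY, AccountName TEXT);
    CREATE TABLE Users (UserKey INTEGER PRIMARY KEY, AccountKey INTEGER, UserName TEXT);

    INSERT INTO Accounts VALUES (1, 'Acme'), (2, 'Beta'), (3, 'Empty');
    INSERT INTO Users VALUES
        (10, 1, 'Zoe'),
        (11, 1, 'Adam'),
        (20, 2, 'Bob');
""")

# Number the users within each account by UserKey and keep only number 1.
# The LEFT JOIN keeps account 3 even though it has no users.
rows = conn.execute("""
    SELECT a.AccountKey, a.AccountName, r.UserName
    FROM Accounts a
    LEFT JOIN (
        SELECT u.*,
               row_number() OVER (PARTITION BY u.AccountKey
                                  ORDER BY u.UserKey) AS Ranking
        FROM Users u
    ) AS r
      ON r.AccountKey = a.AccountKey AND r.Ranking = 1
    ORDER BY a.AccountKey
""").fetchall()

for row in rows:
    print(row)
```

Account 1 gets Zoe (lowest UserKey), not Adam; ordering by `u.UserName` instead would pick the alphabetically first user.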
I've benchmarked all the methods; the simplest and fastest way to achieve this is to use OUTER/CROSS APPLY
SELECT u.Name, Account.* FROM Account
OUTER APPLY (SELECT TOP 1 * FROM [User] WHERE Account.ID = Account_ID ORDER BY ID) as u
CROSS APPLY works just like INNER JOIN and fetches the rows where both tables are related, while OUTER APPLY works like LEFT OUTER JOIN and fetches all rows from the left table (Account here)
You can use OUTER APPLY, see documentation.
SELECT User1.Name, Account.* FROM Account
OUTER APPLY
(SELECT TOP 1 Name
FROM [User]
WHERE Account.ID = [User].Account_ID
ORDER BY Name ASC) User1
SELECT (SELECT TOP 1 Name
FROM [User]
WHERE Account_ID = a.AccountID
ORDER BY UserID) [Name],
a.*
FROM Account a
The STUFF response from Dominic Goulet is slick. But, if your DATE_FIELD is SMALLDATETIME (instead of DATETIME), then the ISO 8601 length will be 19 instead of 23 (because SMALLDATETIME has no milliseconds) - so adjust the STUFF parameter accordingly or the return value from the STUFF function will be incorrect (missing the first four characters).
First and Last do not exist in SQL Server 2005 or 2008, but SQL Server 2012 has the First_Value and Last_Value functions. I tried to implement the aggregates First and Last for SQL Server 2005 and ran into the obstacle that SQL Server does not guarantee that an aggregate is calculated in a defined order. (See the SqlUserDefinedAggregateAttribute.IsInvariantToOrder property, which is not implemented.) This might be because the query analyser tries to execute the calculation of the aggregate on multiple threads and combine the results, which speeds up execution but does not guarantee the order in which elements are aggregated.
Define "First". What you think of as first is a coincidence that normally has to do with clustered index order but should not be relied on (you can contrive examples that break it).
You are right not to use MAX() or MIN(). While tempting, consider the scenario where the first name and last name are in separate fields. You might get names from different records.
Since it sounds like all you really care about is getting exactly one arbitrary record for each group, what you can do is just MIN or MAX an ID field for that record, and then join the table into the query on that ID.
There are a number of ways of doing this; here is a quick and dirty one.
Select (SELECT TOP 1 U.Name FROM Users U WHERE U.Account_ID = A.ID) AS [Name],
A.*
FROM Account A
(Slightly off-topic, but) I often run aggregate queries to list exception summaries, and then I want to know WHY a customer is in the results, so I use MIN and MAX to give 2 semi-random samples that I can look at in detail, e.g.
SELECT Customer.Id, COUNT(*) AS ProblemCount
, MIN(Invoice.Id) AS MinInv, MAX(Invoice.Id) AS MaxInv
FROM Customer
INNER JOIN Invoice on Invoice.CustomerId = Customer.Id
WHERE Invoice.SomethingHasGoneWrong=1
GROUP BY Customer.Id
Create and join with a subselect 'FirstUser' that returns the first user for each account
SELECT User.Name, Account.*
FROM Account, User,
(select min(user.id) id,account_id from User group by user.account_id) as firstUser
WHERE Account.ID = User.Account_ID
and User.id = firstUser.id and Account.ID = firstUser.account_id