How do I increase a counter by 1 for each repeated value? - sql

As part of my course in university I have to make a database in Microsoft Access which is somewhat limiting me on what I'm trying to do. I have a table that has the information of whether a player in a team was present for a fixture or not using the values "P", "R", and "M" (Played, Reserves, Missed). I want to make a query that counts a value of 1 for each value of P or R and a separate one for M, so that when I make a report that prints off a membership card, it shows the amount of fixtures they've played in and the amount of fixtures that they have missed.
Sorry if this isn't clear, I'll try to explain myself further if you ask but I'm not very good with this stuff. Thank you.
Edit: I'll use screenshot links if that's okay, here is the Fixture Attendance entity that shows if a member of a team attended a game or not. I'm making a membership card based off this one. I want to be able to display the No. of fixtures played by the member and the No. of fixtures missed based off the values in the above entity and use that information in a form I'm going to create. That will be a subform inside my Membership Card form.
I'm probably really bad at explaining this. I understand Access is rarely used in the real world, so I'm not sure why I'm doing this in the first place, and I don't feel like I'm getting any real knowledge of working with databases.

You should use the COUNT function.
http://office.microsoft.com/en-us/access-help/count-data-by-using-a-query-HA010096311.aspx

I am guessing that you want something like this:
select playerid,
       sum(iif(fixture in ("P", "R"), 1, 0)) as NumPR,
       sum(iif(fixture = "M", 1, 0)) as NumM
from table t
group by playerid;
The key idea here is putting the conditional part (iif()) inside the sum().

CASE WHEN can be used to translate the codes into 1's and 0's. Then use SUM with a GROUP BY to sum them.
SELECT player_id, SUM(Played), SUM(Reserve), SUM(Missed)
FROM
(SELECT player_id,
CASE WHEN present = 'P' THEN 1 ELSE 0 END AS Played,
CASE WHEN present = 'R' THEN 1 ELSE 0 END AS Reserve,
CASE WHEN present = 'M' THEN 1 ELSE 0 END AS Missed
FROM fixtures) AS f
GROUP BY player_id;
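For anyone who wants to try the conditional-sum idea outside Access, here is a minimal sketch using Python's built-in sqlite3 module. The table name fixture_attendance, its columns and the sample rows are assumptions made up for illustration; the real Access table will differ.

```python
import sqlite3

# Hypothetical table and sample data, made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fixture_attendance (player_id INTEGER, present TEXT)")
conn.executemany(
    "INSERT INTO fixture_attendance VALUES (?, ?)",
    [(1, "P"), (1, "R"), (1, "M"), (1, "P"), (2, "M"), (2, "M"), (2, "P")],
)

# Each CASE yields 1 or 0 per row; SUM adds them up within each player's group.
rows = conn.execute(
    """
    SELECT player_id,
           SUM(CASE WHEN present IN ('P', 'R') THEN 1 ELSE 0 END) AS played,
           SUM(CASE WHEN present = 'M' THEN 1 ELSE 0 END) AS missed
    FROM fixture_attendance
    GROUP BY player_id
    ORDER BY player_id
    """
).fetchall()

print(rows)  # [(1, 3, 1), (2, 1, 2)]
```

In Access itself the CASE expressions would be written with IIf(), as in the earlier answer.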

Related

SQL question with attempt on customer information

Schema
Question: List all paying customers with users who had 4 or 5 activities during the week of February 15, 2021; also include how many of the activities sent were paid, organic and/or app store. (i.e. include a column for each of the three source types).
My attempt so far:
SELECT source_type, COUNT(*)
FROM activities
WHERE activity_time BETWEEN '02-15-21' AND '02-19-21'
GROUP BY source_type
I would like to get a second opinion on it. I didn't include the accounts table because I don't believe that I need it for this query, but I could be wrong.
Have you tried to run this? It doesn't satisfy the brief on FOUR counts:
List all the ... customers (that match criteria)
There is no customer information included in the results at all, so this is an outright fail.
paying customers
This is the top-level criterion: only customers that are not free should be included in the results.
Criteria: users who had 4 or 5 activities
There has been no attempt to evaluate this user criteria in the query, and the results do not provide enough information to deduce it.
There is further ambiguity in this requirement: does it mean that results should only be included if the account has individual users with 4 or 5 activities each, or simply that the account should have 4 or 5 activities overall?
If this is a test question (clearly this is contrived, if it is not please ask for help on how to design a better schema) then the use of the term User is usually very specific and would suggest that you need to group by or otherwise make specific use of this facet in your query.
Bonus: (i.e. include a column for each of the three source types).
This is the only element that was attempted, as the data is grouped by source_type but the information cannot be correlated back to any specific user or customer.
Next time please include example data and the expected outcome with your post. In preparing the data for this post you would have come across these issues yourself and may have been inspired to ask a different question, or through the process of writing the post up you may have resolved the issue yourself.
Without further clarification, we can still start to evolve this query. A good place to start is to set the criteria aside and focus on the format of the output. The requirement mentions the following output requirements:
List Customers
Include a column for each of the source types.
Firstly, even though you don't think you need to, the request clearly states that Customer is an important facet in the output, and in your schema account holds the customer information, so although we do not need to, it makes the data readable by humans if we do include information from the account table.
This is a standard PIVOT-style response, then: we want a row for each customer, presenting a count for each of the values of source_type. Most RDBMS will support some variant of a PIVOT operator or function; however, we can achieve the same thing with simple CASE expressions that conditionally put a value into projected columns, one per value we want to aggregate. We can then use GROUP BY to evaluate the aggregation, in this case a COUNT.
The following syntax is for MS SQL; however, you can achieve something similar easily enough in other RDBMS.
OP please tag this question with your preferred database engine...
NOTE: there is NO filtering in this query... yet
SELECT accounts.company_id
, accounts.company_name
, paid = COUNT(st_paid)
, organic = COUNT(st_organic)
, app_store = COUNT(st_app_store)
FROM activities
INNER JOIN accounts ON activities.company_id = accounts.company_id
-- PIVOT the source_type
CROSS APPLY (SELECT st_paid = CASE source_type WHEN 'paid' THEN 1 END
,st_organic = CASE source_type WHEN 'organic' THEN 1 END
,st_app_store = CASE source_type WHEN 'app store' THEN 1 END
) as PVT
GROUP BY accounts.company_id, accounts.company_name
This results in the following shape of result:
company_id  company_name  paid  organic  app_store
apl01       apples        4     8        0
ora01       oranges       6     12       0
Criteria
When you are happy with the shape of the results and that all the relevant information is available, it is time to apply the criteria to filter this data.
From the requirement, the following criteria can be identified:
paying customers
The spec doesn't mention paying specifically, but it does include a note that (free customers have current_mrr = 0)
Now aren't we glad we did join on the account table :)
users who had 4 or 5 activities
This is very specific about explicitly 4 or 5 activities, no more, no less.
For the sake of simplicity, let's assume that the user facet of this requirement is not important and that it is simply a reference to all users on an account, not just users who have individually logged 4 or 5 activities on their own. Proving the per-user variant would require more demo data than I care to manufacture right now.
during the week of February 15, 2021.
This one was correctly identified in the original post, but we need to call it out just the same.
OP has used Monday to Friday of that week. There is no mention that weeks start on a Monday or end on a Friday, but we'll go along with it; it's only the syntax we need to explore today.
In the real world the actual values specified in the criteria should be parameterised, mainly because you don't want to manually re-construct the entire query every time, but also to sanitise input and prevent SQL injection attacks.
Even though it seems overkill for this post, using parameters even in simple queries helps to identify the variable elements, so I will use parameters for the 2nd criteria to demonstrate the concept.
DECLARE @from DateTime = '2021-02-15' -- Date in ISO format
DECLARE @to DateTime = (SELECT DateAdd(d, 5, @from)) -- 2021-02-20 00:00, so the range covers all of Friday 2021-02-19
/* NOTE: requirement only mentioned the start date, not the end
so your code should also only rely on the single fixed start date */
SELECT accounts.company_id, accounts.company_name
, paid = COUNT(st_paid), organic = COUNT(st_organic), app_store = COUNT(st_app_store)
FROM activities
INNER JOIN accounts ON activities.company_id = accounts.company_id
-- PIVOT the source_type
CROSS APPLY (SELECT st_paid = CASE source_type WHEN 'paid' THEN 1 END
,st_organic = CASE source_type WHEN 'organic' THEN 1 END
,st_app_store = CASE source_type WHEN 'app store' THEN 1 END
) as PVT
WHERE -- paid accounts = exclude 'free' accounts
accounts.current_mrr > 0
-- Date range filter
AND activity_time BETWEEN @from AND @to
GROUP BY accounts.company_id, accounts.company_name
-- The fun bit, we use HAVING to apply a filter AFTER the grouping is evaluated
-- Wording was explicitly 4 OR 5, not BETWEEN so we use IN for that
HAVING COUNT(source_type) IN (4,5)
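To check the logic without a SQL Server instance, the same shape of query can be sketched with Python's built-in sqlite3 module. Everything below is an assumption made for illustration: the sample rows, the 'fre01' account, and the text storage of activity_time. SQLite has no CROSS APPLY, so the CASE expressions sit directly inside COUNT, and the sketch swaps BETWEEN for a half-open date range so that midnight of the following Saturday is not included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE accounts (company_id TEXT PRIMARY KEY, company_name TEXT, current_mrr REAL);
    CREATE TABLE activities (company_id TEXT, source_type TEXT, activity_time TEXT);
    INSERT INTO accounts VALUES
        ('apl01', 'apples', 100), ('ora01', 'oranges', 50), ('fre01', 'freebies', 0);
    """
)
# Made-up activity rows: apl01 has 4 in the week, ora01 has 6, fre01 is a free account.
activities = (
    [("apl01", "paid", "2021-02-15 09:00:00")] * 2
    + [("apl01", "organic", "2021-02-16 10:00:00")] * 2
    + [("apl01", "paid", "2021-03-01 10:00:00")]  # outside the week
    + [("ora01", "organic", "2021-02-17 11:00:00")] * 6
    + [("fre01", "app store", "2021-02-18 12:00:00")] * 4
)
conn.executemany("INSERT INTO activities VALUES (?, ?, ?)", activities)

query = """
    SELECT a.company_id, a.company_name,
           COUNT(CASE WHEN act.source_type = 'paid' THEN 1 END) AS paid,
           COUNT(CASE WHEN act.source_type = 'organic' THEN 1 END) AS organic,
           COUNT(CASE WHEN act.source_type = 'app store' THEN 1 END) AS app_store
    FROM activities AS act
    JOIN accounts AS a ON a.company_id = act.company_id
    WHERE a.current_mrr > 0                      -- paying customers only
      AND act.activity_time >= ? AND act.activity_time < ?
    GROUP BY a.company_id, a.company_name
    HAVING COUNT(*) IN (4, 5)                    -- filter applied after grouping
"""
# The dates are bound as parameters rather than pasted into the SQL text.
rows = conn.execute(query, ("2021-02-15", "2021-02-20")).fetchall()
print(rows)  # [('apl01', 'apples', 2, 2, 0)]
```

Only 'apples' survives: 'oranges' has too many activities in the week and 'freebies' is filtered out by current_mrr.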
I believe you are missing some information there.
Without more information on the tables, I can only guess that you also have a customer table. I am going to assume there is a customer_id key that links both tables.
I would take your query and do something like:
SELECT customer_id,
SUM(num_operations) AS Total, -- total activities across all source types
MAX(CASE WHEN source_type = 'app' THEN num_operations END) AS app_totals,
MAX(CASE WHEN source_type = 'paid' THEN num_operations END) AS paid_totals,
MAX(CASE WHEN source_type = 'organic' THEN num_operations END) AS organic_totals
FROM (
SELECT customer_id, source_type, COUNT(*) AS num_operations
FROM activities
WHERE activity_time BETWEEN '02-15-21' AND '02-19-21'
GROUP BY customer_id, source_type
) tb1 GROUP BY customer_id
This is the most generic case I can think of, but it does not scale very well. If you get new source types, you need to modify the query, and the structure of the output table also changes. Depending on the SQL engine you are using (e.g. Microsoft SQL Server), you could also use a pivot function.
The previous query is a little bit rough, but it will give you a general idea. You can add ELSE branches to the CASE expressions to zero the fields when they have no values, join with the customer table if you want only active customers, etc.

Is there an easier way than nested iifs to count mismatched conditions in SQL server

The picture below shows a table of accounts and the outcomes that I want to count. Ie every time the account number is 106844 and the outcome is "MESSAGE" or "ESCALATION EMAIL" that should count as 1 where any other outcome counts as 0. What I would normally do is a horrible mess of iifs like
sum( iif([account] = '106719' and [Outcome] in ('MESSAGE','ESCALATION_EMAIL'),1,iif([account] = '310827' and [outcome] <> 'ABORT' and 'CALL_OUTCOME_LB' in ("Call patched to Customer Care","Message Taken"),1,iif( ... , 0) as [Total Outcomes]
and so on, but man, it feels like there's got to be an easier way, or one less prone to making a random mistake in the 7th nested iif and messing the whole thing up. Any ideas?
Don't use iif(). It is a function brought into SQL Server for back-compatibility to MS ACCESS. Why would you want to be backwards compatible to such a thing?
Use the ANSI standard CASE expression:
sum(case when account = '106719' and Outcome in ('MESSAGE', 'ESCALATION_EMAIL')
then 1
when account = '310827' and outcome <> 'ABORT' and
CALL_OUTCOME_LB in ('Call patched to Customer Care', 'Message Taken')
then 1
. . .
else 0
end) as Total_Outcomes
I would also advise you to name your columns so they don't need to be escaped (which is why "Total Outcomes" became "Total_Outcomes"). That simplifies the code.
Why not use a lookup table that has the Account and Outcome and use that? Then, as requirements change, you could update the lookup table and not worry about your code.
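A minimal sketch of the lookup-table idea, using Python's built-in sqlite3 module. The table names calls and counted_outcomes and the sample rows are hypothetical, and the ('310827', 'MESSAGE') rule is made up for the demo. Note that a plain lookup like this only covers "this account with this outcome counts" rules; a rule such as outcome <> 'ABORT' would need extra columns or stay in a CASE expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE calls (account TEXT, outcome TEXT);
    -- One row per (account, outcome) pair that should count as 1.
    CREATE TABLE counted_outcomes (
        account TEXT, outcome TEXT, PRIMARY KEY (account, outcome)
    );
    INSERT INTO counted_outcomes VALUES
        ('106719', 'MESSAGE'), ('106719', 'ESCALATION_EMAIL'), ('310827', 'MESSAGE');
    INSERT INTO calls VALUES
        ('106719', 'MESSAGE'), ('106719', 'ABORT'), ('106719', 'ESCALATION_EMAIL'),
        ('310827', 'MESSAGE'), ('310827', 'ABORT');
    """
)

# COUNT(k.account) skips NULLs, so only calls with a matching rule are counted.
total = conn.execute(
    """
    SELECT COUNT(k.account) AS total_outcomes
    FROM calls AS c
    LEFT JOIN counted_outcomes AS k
           ON k.account = c.account AND k.outcome = c.outcome
    """
).fetchone()[0]

print(total)  # 3
```

When the rules change, only the rows in counted_outcomes change; the query stays the same.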
Yeah... there is
The last parameter is for when the condition is false, everything else will fall in there.
SUM( IIF([ACCOUNT] = '106719' AND [OUTCOME] IN ('MESSAGE','ESCALATION_EMAIL'),1,0))

Trying to include ID column in Grouped SQL SELECT statement in order to drill down in web page

I know there has been a lot of discussion around this subject but I cannot find anything that points me in the direction of a definitive answer.
I have the below sql statement within a .net page in Webmatrix:
SELECT vehicle, vehicleDescription, count(vehicleDescription) AS 'Total'
FROM vehicles
WHERE (branchRequirement = 'Manchester')
AND (deliveryBranch = 'Manchester' OR deliveryBranch IS NULL)
AND (dateDeliveredToBranch > GETDATE() OR dateDeliveredToBranch IS NULL)
AND (vgc LIKE 'B_')
GROUP BY vehicle, vehicleDescription
The output is obviously GROUPED data for the chosen conditions.
What I am trying to do is provide a link in my Webgrid on the .net page which allows the user to open a child page with details of the GROUPED vehicles.
Where I'm getting stuck is I cannot include the vehicleID in the GROUP BY because they are obviously all UNIQUE.
Has anybody come across this or something similar with any degree of success as I am pulling my hair out with it which I can ill afford to do!
Thanks
M
I have come across similar issues and the solution I came up with was to use the information you already have. When the user clicks on the link, you know the vehicle and the vehicleDescription that the user wants to see. You should not need the vehicleId because you are not going to have one unique result. If they click on a vehicle that has a count of 3, the child page should have details about all 3 results.
In order to find the 3 results the user would want to see, you can alter your existing query and use it for the child page. The altered query should take the vehicle and vehicleDescription as parameters.
SELECT *
FROM vehicles
WHERE (branchRequirement = 'Manchester')
AND (deliveryBranch = 'Manchester' OR deliveryBranch IS NULL)
AND (dateDeliveredToBranch > GETDATE() OR dateDeliveredToBranch IS NULL)
AND (vgc LIKE 'B_')
AND vehicle = @vehicle
AND vehicleDescription = @vehicleDescription
Pass the parameters in .NET and you should end up with the same rows that you counted in your last query, since this query is essentially the same.
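The summary/detail pairing can be sketched end to end with Python's built-in sqlite3 module. This is only an illustration under assumptions: the sample rows are made up, the filter is cut down to branchRequirement alone, the details() helper is hypothetical, and SQLite uses ? placeholders where SQL Server would use named parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vehicles (vehicleID INTEGER PRIMARY KEY, vehicle TEXT, "
    "vehicleDescription TEXT, branchRequirement TEXT)"
)
conn.executemany(
    "INSERT INTO vehicles (vehicle, vehicleDescription, branchRequirement) VALUES (?, ?, ?)",
    [
        ("Van", "Short wheelbase", "Manchester"),
        ("Van", "Short wheelbase", "Manchester"),
        ("Van", "Long wheelbase", "Manchester"),
        ("Car", "Hatchback", "Leeds"),
    ],
)

# Parent page: one row per group, with a count.
summary = conn.execute(
    """
    SELECT vehicle, vehicleDescription, COUNT(*) AS total
    FROM vehicles
    WHERE branchRequirement = ?
    GROUP BY vehicle, vehicleDescription
    ORDER BY vehicle, vehicleDescription
    """,
    ("Manchester",),
).fetchall()


def details(vehicle, description):
    """Child page: same filter as the summary, plus the clicked group's values."""
    return conn.execute(
        """
        SELECT vehicleID, vehicle, vehicleDescription
        FROM vehicles
        WHERE branchRequirement = ?
          AND vehicle = ?
          AND vehicleDescription = ?
        ORDER BY vehicleID
        """,
        ("Manchester", vehicle, description),
    ).fetchall()


detail_rows = details("Van", "Short wheelbase")
print(summary)      # [('Van', 'Long wheelbase', 1), ('Van', 'Short wheelbase', 2)]
print(detail_rows)  # [(1, 'Van', 'Short wheelbase'), (2, 'Van', 'Short wheelbase')]
```

The link in the grid then only has to carry the vehicle and vehicleDescription values; the number of detail rows matches the count shown for that group.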

Writing a query to include certain values but exclude others when looking for a latest time period

I am trying to write a query that looks for people who have a certain code in their latest period (year), but not if they also have another code in that latest period (year). I'll be explicit just so my example makes sense.
I want people who have the code A1, A2, A3, A4 or A5 but not AG, AP or AQ. There are people who have an A1 code for a period (like 2014) and an AG code for the same period. I'd like to exclude them. Not everyone has a code, so the field value could be NULL.
Is there a way to express this in a different way (i.e. less characters) than the way I did?
SELECT
people.firstName
FROM
people
WHERE EXISTS (
SELECT *
FROM codes
WHERE
codes.people_id = people.id
AND period = (SELECT MAX(period) FROM codes codes2 WHERE codes2.people_id = codes.people_id)
AND code LIKE 'A[1-5]'
)
AND NOT EXISTS (
SELECT *
FROM codes
WHERE
codes.people_id = people.id
AND period = (
SELECT MAX(period)
FROM codes codes2
WHERE codes2.people_id = codes.people_id
)
AND code LIKE 'A[GPQ]'
)
Schema is as follows:
People
id (PK)
firstName
Codes
people_id (FK) many to one relation with People table
code (e.g. "A1", "A2", "AG")
period (e.g. "2013", "2014")
There are many ways you could do that. I'm not an SQL expert, but I can't see your query being too bad. If you want to try to reduce the number of sub-queries, you could consider using the GROUP BY clause along with a SUM aggregate function in a HAVING clause.
I started updating your code as follows:
SELECT
people.firstName
FROM
people
LEFT JOIN codes AS a15 ON a15.people_id = people.id AND a15.code LIKE 'A[1-5]'
LEFT JOIN codes AS agpq ON agpq.people_id = people.id AND agpq.code LIKE 'A[GPQ]'
GROUP BY
people.firstName
HAVING
SUM(CASE WHEN a15.code IS NULL THEN 0 ELSE 1 END) > 0
AND SUM(CASE WHEN agpq.code IS NULL THEN 0 ELSE 1 END) = 0
This however doesn't take into account anything to do with period specific requirements described. You could add the period to the GROUP BY clause or add it to a WHERE or one of the JOIN constraints but I'm not quite sure from your description exactly what you're after (I don't believe this is through any fault of your own, I just can't personally align the code provided to the description).
I would also like to point out that the SUM functions above will not give an accurate count of the number of matching codes. This is because if both A[GPQ] and A[1-5] return at least one row, the number of rows returned by each join is multiplied by the number returned by the other. The sums can, however, be used to determine whether there are any matching rows at all: if the criteria are matched, SUM(...) > 0.
I'm sure a more experienced SQL Developer / DBA will be able to poke many holes in my proposed query but it might give them or someone else something to work from and hopefully gives you ideas for alternatives to using sub-queries.
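One possible way to bring the period requirement into the GROUP BY / HAVING approach is to keep only each person's latest-period rows before aggregating. The sketch below uses Python's built-in sqlite3 module; the sample people and codes are made up, it assumes "latest period" means the latest period per person (as in the question's own sub-query), and it uses IN lists instead of the SQL Server-style LIKE 'A[1-5]' pattern, which SQLite's LIKE does not understand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE people (id INTEGER PRIMARY KEY, firstName TEXT);
    CREATE TABLE codes (people_id INTEGER, code TEXT, period TEXT);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (4, 'Dee');
    INSERT INTO codes VALUES
        (1, 'A1', '2014'),                     -- wanted code in latest period
        (2, 'A1', '2014'), (2, 'AG', '2014'),  -- AG in the same latest period
        (3, 'A2', '2013'), (3, 'AG', '2014'),  -- only AG in the latest period
        (4, 'AG', '2013'), (4, 'A3', '2014');  -- AG only in an older period
    """
)

rows = conn.execute(
    """
    SELECT p.firstName
    FROM people AS p
    JOIN codes AS c ON c.people_id = p.id
    -- keep only rows from this person's latest period
    WHERE c.period = (SELECT MAX(c2.period) FROM codes AS c2 WHERE c2.people_id = p.id)
    GROUP BY p.id, p.firstName
    HAVING SUM(CASE WHEN c.code IN ('A1', 'A2', 'A3', 'A4', 'A5') THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN c.code IN ('AG', 'AP', 'AQ') THEN 1 ELSE 0 END) = 0
    ORDER BY p.firstName
    """
).fetchall()

names = [r[0] for r in rows]
print(names)  # ['Ann', 'Dee']
```

Because codes is joined only once here, the sums are real counts and the row multiplication described above does not occur. People with no codes at all drop out through the inner join.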

SQL Group By

If I have a set of records
name amount Code
Dave 2 1234
Dave 3 1234
Daves 4 1234
I want this to group based on Code & Name, but the last row has a typo in the name, so this won't group.
What would be the best way to group these as:
Dave/Daves 9 1234
As a general rule if the data is wrong you should fix the data.
However if you want to do the report anyway you could come up with another criteria to group on, for example LEFT(Name, 4) would perform a grouping on the first 4 characters of the name.
You may also want to consider the CASE expression as a method (CASE WHEN name = 'Daves' THEN 'Dave' ELSE name END), but I really don't like this method, especially if you are proposing to use this for anything other than a one-off report.
If it's a workaround, try
SELECT cname, SUM(amount)
FROM (
SELECT CASE WHEN NAME = 'Daves' THEN 'Dave' ELSE name END AS cname, amount
FROM mytable
) t
GROUP BY cname
This of course will handle only this exact case.
For MySQL:
select
group_concat(distinct name separator '/'),
sum(amount),
code
from
T
group by
code
For MSSQL 2005+, group_concat() can be implemented as a .NET custom aggregate.
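The same group-by-code idea can be tried quickly with Python's built-in sqlite3 module, which also ships a group_concat() aggregate. This is a sketch rather than a port of the MySQL statement: SQLite takes the separator as a second argument, and the inner query (with its hypothetical name_total column) is used here to get each distinct name once before concatenating.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (name TEXT, amount INTEGER, code TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [("Dave", 2, "1234"), ("Dave", 3, "1234"), ("Daves", 4, "1234")],
)

# Inner query: one row per (code, name). Outer query: one row per code.
row = conn.execute(
    """
    SELECT group_concat(name, '/') AS names, SUM(name_total) AS total, code
    FROM (
        SELECT code, name, SUM(amount) AS name_total
        FROM T
        GROUP BY code, name
    )
    GROUP BY code
    """
).fetchone()

names, total, code = row
# The order of names inside group_concat is not guaranteed, so sort for display.
print(sorted(names.split("/")), total, code)  # ['Dave', 'Daves'] 9 1234
```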
Fix the typo? Otherwise grouping on the name is going to create a new group.
Fixing your data should be your highest priority instead of trying to devise ways to "work around" it.
It should also be noted that if you have this single typo in your data, it is likely that you have (or will have at some point in the future) even more screwy data that will not cleanly fit into your code, which will force you to invent more and more "work arounds" to deal with it, when you should be focusing on the cleanliness of your data.
If the name field is supposed to be a key, then the assumption has to be that Dave and Daves are two different items altogether, and thus should be grouped differently. If, however, it is a typo, then as others have suggested, fix the data.
Grouping on a freeform text field, if that is what this is, will always have issues. Data entry is never 100% accurate.
To me it makes more sense to group on the code alone, if that is the key field, and leave name out of the grouping altogether.