Grouping results within SQL - sql

I was thinking about doing this in Crystal, but I'm now thinking about doing it another way. This is the way I was thinking about doing it before: Count by group in crystal reports
I am pulling by transactions and need a count of unique people, but in the report they want to show the count of individual people and then show their services below, as shown in the image. What I'd like to do is somehow get a count of unique users that I could just throw in at the bottom. Crystal won't allow me to do a count by group, and users will have duplicates in the format they want it displayed. I am hoping that I can group it in the code and then just add it to the bottom of the report. I hope I'm getting across what I'm trying to accomplish. If I can somehow just add the total of unique users at the bottom of the report, it will finish it for me. Thanks in advance.
select
distinct p.patient_id,
pa.fname as 'Patient',
p.clinic_id,
p.service_id,
p.program_id,
p.protocol_id,
p.discharge_reason,
p.date_discharged
from patient_assignment p
join patient pa
on p.patient_id = pa.patient_id
where p.program_id not in ('TEST', 'SA', 'INTAKE' ) and (p.date_discharged between '2013-01-01 00:00:00.000' and '2013-06-01 00:00:00.000')
and p.patient_id not in ('00000004', '00001667', '00020354')

This is too long for a comment.
SQL Queries have fixed columns. You can't "just throw" a different type of row at the bottom. Although there are SQL solutions (such as concatenating all the fields to make a single row), these are not very palatable.
Instead, here are some other ways.
You can run another query to get the count.
You can query the "result" set to get the count (perhaps after fetching all the rows).
You can do some minor coding work in the application.
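As a rough sketch of the first option (a second query that returns only the count), something like the following could work. It uses Python's built-in sqlite3 purely so the example is runnable; the sample rows are made up, and only the table and column names come from the question. The same COUNT(DISTINCT ...) statement should carry over to other databases, but that is an assumption to check against your own server.

```python
import sqlite3

# In-memory database with made-up rows; only the names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient_assignment (
        patient_id TEXT, program_id TEXT, date_discharged TEXT
    );
    INSERT INTO patient_assignment VALUES
        ('00000101', 'OP',   '2013-02-10 00:00:00.000'),
        ('00000101', 'RES',  '2013-03-05 00:00:00.000'),
        ('00000102', 'OP',   '2013-04-01 00:00:00.000'),
        ('00000103', 'TEST', '2013-04-02 00:00:00.000');
""")

# Same filters as the detail query, but collapsed to one number:
# how many distinct patients appear in the filtered rows.
unique_patients = conn.execute("""
    SELECT COUNT(DISTINCT p.patient_id)
    FROM patient_assignment p
    WHERE p.program_id NOT IN ('TEST', 'SA', 'INTAKE')
      AND p.date_discharged BETWEEN '2013-01-01 00:00:00.000'
                                AND '2013-06-01 00:00:00.000'
""").fetchone()[0]

print(unique_patients)
```

The report can then show this single value in its footer, separate from the detail rows.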

Related

Quick one on Big Query SQL-Ecommerce Data

I am trying to replicate the Google Analytics data in BigQuery but couldn't do that.
Basically I am using Custom Dimension 40 (user subscription status)
but I am getting wrong numbers in BQ.
Can someone help me on this?
I am using this query but couldn't figure out the correct one.
SELECT
(SELECT value FROM hits.customDimensions where index=40) AS UserStatus,
COUNT(hits.transaction.transactionId) AS Unique_Purchases
FROM
`xxxxxxxxxxxxx.ga_sessions_2020*` AS GA, --new rollup
UNNEST(GA.hits) AS hits
WHERE
(SELECT value FROM hits.customDimensions where index=40) IN ("xx001","xxx002")
GROUP BY 1
I am getting this from BigQuery, which is wrong.
I have checked the dates as well but don't know why it's wrong.
Your question is rather unclear. But because you want something to be unique and numbers are mysteriously not what you want, I would suggest using COUNT(DISTINCT):
COUNT(DISTINCT hits.transaction.transactionId) AS Unique_Purchases
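To illustrate why the two counts can differ, here is a small sketch using sqlite3 rather than BigQuery. The flat hits table and its rows are invented for the example, and it assumes the same transactionId can show up on more than one hit, which may or may not be what is happening in your data.

```python
import sqlite3

# Flattened stand-in for the unnested hits; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hits (user_status TEXT, transaction_id TEXT);
    INSERT INTO hits VALUES
        ('xx001',  'T1'),
        ('xx001',  'T1'),   -- same transaction seen on a second hit
        ('xx001',  'T2'),
        ('xxx002', 'T3'),
        ('xxx002', NULL);   -- hit without a transaction
""")

# COUNT counts every non-NULL value; COUNT(DISTINCT) counts each value once.
rows = conn.execute("""
    SELECT user_status,
           COUNT(transaction_id)          AS all_rows,
           COUNT(DISTINCT transaction_id) AS unique_purchases
    FROM hits
    GROUP BY user_status
    ORDER BY user_status
""").fetchall()

for row in rows:
    print(row)
```

In this sample the plain COUNT for 'xx001' is higher than the distinct count because transaction T1 is counted twice.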
As far as I understand, you imported Google Analytics data into BigQuery and you are trying to group by the custom dimension with index 40 and values ("xx001","xxx002") in order to know how many hit transactions were performed for each of these dimension values.
Replicating your scenario and trying to execute the query you posted, I got the following error.
However, I created a query that could help with your use case. First, it selects the transactionId and dimension values where the transactionId is not null and the index equals 40; then it groups by the dimension value, filtered to the values "xx001" and "xxx002".
WITH tx AS (
SELECT
HIT.transaction.transactionId,
CD.value
FROM
`xxxxxxxxxxxxx.ga_sessions_2020*` AS GA,
UNNEST(GA.hits) AS HIT,
UNNEST(HIT.customDimensions) AS CD
WHERE
HIT.transaction.transactionId IS NOT NULL
AND
CD.index = 40
)
SELECT tx.value AS UserStatus, count(tx.transactionId) AS Unique_Purchases
FROM tx
WHERE tx.value IN ("xx001","xxx002")
GROUP BY tx.value
For further details about the format and schema of the data that is imported into BigQuery, I found this document.

MS Access 2013, How to add totals row within SQL

I'm in need of some assistance. I have searched and not found what I'm looking for. I have an assignment for school that requires me to use SQL. I have a query that pulls some columns from two tables:
SELECT Course.CourseNo, Course.CrHrs, Sections.Yr, Sections.Term, Sections.Location
FROM Course
INNER JOIN Sections ON Course.CourseNo = Sections.CourseNo
WHERE Sections.Term="spring";
I need to add a Totals row at the bottom to count the CourseNo and Sum the CrHrs. It has to be done through SQL query design as I need to paste the code. I know it can be done with the datasheet view but she will not accept that. Any advice?
To accomplish this, you can union your query together with an aggregation query. It's not clear from your question which columns you are trying to get "Totals" from, but here's an example of what I mean using your query and getting counts of each (kind of a useless example, but you should be able to apply it to what you are doing):
SELECT
[Course].[CourseNo]
, [Course].[CrHrs]
, [Sections].[Yr]
, [Sections].[Term]
, [Sections].[Location]
FROM
[Course]
INNER JOIN [Sections] ON [Course].[CourseNo] = [Sections].[CourseNo]
WHERE [Sections].[Term] = "spring"
UNION ALL
SELECT
"TOTALS"
, SUM([Course].[CrHrs])
, count([Sections].[Yr])
, Count([Sections].[Term])
, Count([Sections].[Location])
FROM
[Course]
INNER JOIN [Sections] ON [Course].[CourseNo] = [Sections].[CourseNo]
WHERE [Sections].[Term] = "spring"
You can prepare your "total" query separately, and then output both query results together with "UNION".
It might look like:
SELECT Course.CourseNo, Course.CrHrs, Sections.Yr, Sections.Term, Sections.Location
FROM Course
INNER JOIN Sections ON Course.CourseNo = Sections.CourseNo
WHERE Sections.Term="spring"
UNION
SELECT "Total", SUM(Course.CrHrs), SUM(Sections.Yr), Count(Sections.Term), Count(Sections.Location)
FROM Course
INNER JOIN Sections ON Course.CourseNo = Sections.CourseNo
WHERE Sections.Term="spring";
Whilst you can certainly union the aggregated totals query to the end of your original query, in my opinion this would be really bad practice and would be undesirable for any real-world application.
Consider that the resulting query could no longer be used for any meaningful analysis of the data: if displayed in a datagrid, the user would not be able to sort the data without the totals row being interspersed amongst the rest of the data; the user could no longer use the built-in Totals option to perform their own aggregate operation, and the insertion of a row only identifiable by the term totals could even conflict with other data within the set.
Instead, I would suggest displaying the totals within an entirely separate form control, using a separate query such as the following (based on your own example):
SELECT Count(Course.CourseNo) as Courses, Sum(Course.CrHrs) as Hours
FROM Course INNER JOIN Sections ON Course.CourseNo = Sections.CourseNo
WHERE Sections.Term = "spring";
However, since CrHrs is a field in your Course table and not in your Sections table, the above may yield multiples of the desired result, with the number of hours multiplied by the number of corresponding records in the Sections table.
If this is the case, the following may be more suitable:
SELECT Count(Course.CourseNo) as Courses, Sum(Course.CrHrs) as Hours
FROM
Course INNER JOIN
(SELECT DISTINCT s.CourseNo FROM Sections s WHERE s.Term = "spring") q
ON Course.CourseNo = q.CourseNo
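A small sketch of that multiplication effect, run through sqlite3 instead of Access so it can be executed as-is. The rows are invented, and the string literals use single quotes because that is what SQLite expects; the query shapes are otherwise the two shown above.

```python
import sqlite3

# Invented sample data: course C1 has two spring sections, C2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Course (CourseNo TEXT, CrHrs INTEGER);
    CREATE TABLE Sections (CourseNo TEXT, Term TEXT);
    INSERT INTO Course VALUES ('C1', 3), ('C2', 4);
    INSERT INTO Sections VALUES
        ('C1', 'spring'), ('C1', 'spring'), ('C2', 'spring'), ('C2', 'fall');
""")

# Plain join: every matching section repeats the course row,
# so C1's hours are added once per spring section.
per_section = conn.execute("""
    SELECT COUNT(Course.CourseNo), SUM(Course.CrHrs)
    FROM Course INNER JOIN Sections ON Course.CourseNo = Sections.CourseNo
    WHERE Sections.Term = 'spring'
""").fetchone()

# Join against distinct course numbers: each course contributes once.
per_course = conn.execute("""
    SELECT COUNT(Course.CourseNo), SUM(Course.CrHrs)
    FROM Course INNER JOIN
         (SELECT DISTINCT s.CourseNo FROM Sections s WHERE s.Term = 'spring') q
         ON Course.CourseNo = q.CourseNo
""").fetchone()

print(per_section)  # counts sections
print(per_course)   # counts courses
```

Which of the two is "right" depends on whether the totals row is meant to describe sections or courses.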

SSRS 2008 R2 - How to aggregate group data based on inner group filters?

I have made a simplified example to illustrate the question I'm trying to ask here. In my example I have sales orders and each sales order has multiple lines. I group by Sales Order Number, then by Sales Order Line (row groups).
I have found Group Filters very useful and flexible for filtering report data in specific areas of a table, so in my example I filter the SOLine group to exclude the SO line if it equals 3.
Then I want to have a group aggregate for the entire SO, to tell me a count of the SO lines within it. Unfortunately, when I use a COUNT() expression in a textbox within the Sales Order Number group scope, it counts all the lines, including SO Line 3, whereas I want it to take into account the line filtered out of its child group.
Below is a screenshot of my tablix and grouping:
On the SOLine group i have the following filter:
And below is the output i get when previewing the report:
I want that count to evaluate to 4, but ideally I want to keep using groups, as I've found they are much more efficient than SUM(IIF), which completely slowed down my actual report with its thousands of rows.
If this is not possible, please suggest the best alternatives I could use.
Many thanks.
Jacob

SQL count query not working

I have problem with this query:
SELECT Ordre.Objet, Count(Ordre.Objet) AS CompteDeObjet
FROM Ordre INNER JOIN Avis ON Ordre.[Ordre SAP] = Avis.[Ordre SAP]
GROUP BY Ordre.Objet, Avis.[Date Appel], Ordre.Objet
HAVING (((Avis.[Date Appel])>#8/1/2011# And (Avis.[Date Appel])<#10/20/2011#) AND ((Ordre.Objet) Is Not Null));
That I generated using Access 2003. It should count the number of items of each kind in the Objet column, but it only shows a count of one per different item...
Can't seem to figure out how to make this work.
[EDIT]
Considering the first two answers, I changed my code to the following, but I still get the same result:
SELECT Ordre.Objet, Count(Ordre.Objet) AS CompteDeObjet
FROM Ordre INNER JOIN Avis ON Ordre.[Ordre SAP] = Avis.[Ordre SAP]
WHERE (((Avis.[Date Appel])>#8/1/2011# And (Avis.[Date Appel])<#10/20/2011#) AND ((Ordre.Objet) Is Not Null))
GROUP BY Ordre.Objet;
[EDIT #2]
here is a sample of my data:
Ordre SAP   Objet
11147212    Simplex
11147214    Simplex
11147215    Simplex
11147216    Simplex
11147225    Simplex
11147240    Auto Level
11147243
11147247    CANOPY
11147259    Capteur
And here is what the query from the last edit gives me:
Auto Level  1
CANOPY      1
Capteur     1
Simplex     1
All of my data is included in the date range specified in the query.
Sorry, I don't know how to show this in a proper table, I'm new here...
The query will give one row / count per item you GROUP BY.
You are grouping on:
GROUP BY Ordre.Objet, Avis.[Date Appel], Ordre.Objet
So, you will get one count for each Objet / Date Appel combination.
Try moving what is in the HAVING clause to the WHERE clause. HAVING filters on grouped items, it may be stopping the query from counting the items properly.
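Here is a minimal sketch of both points, again using sqlite3 so it can be run directly. The table and column names are simplified, hypothetical versions of the ones in the question (no spaces, ISO date strings), and the rows are invented.

```python
import sqlite3

# Simplified stand-ins for the Ordre and Avis tables; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ordre (ordre_sap INTEGER, objet TEXT);
    CREATE TABLE avis (ordre_sap INTEGER, date_appel TEXT);
    INSERT INTO ordre VALUES (1, 'Simplex'), (2, 'Simplex'), (3, 'Simplex'), (4, 'CANOPY');
    INSERT INTO avis VALUES
        (1, '2011-08-02'), (2, '2011-08-03'), (3, '2011-08-04'), (4, '2011-08-02');
""")

# Grouping by objet AND date: each distinct date forms its own group,
# so every Simplex row ends up with a count of 1.
by_objet_and_date = conn.execute("""
    SELECT o.objet, COUNT(o.objet)
    FROM ordre o INNER JOIN avis a ON o.ordre_sap = a.ordre_sap
    GROUP BY o.objet, a.date_appel
    ORDER BY o.objet, a.date_appel
""").fetchall()

# Date filter moved to WHERE, grouping by objet only: one row per objet.
by_objet = conn.execute("""
    SELECT o.objet, COUNT(o.objet)
    FROM ordre o INNER JOIN avis a ON o.ordre_sap = a.ordre_sap
    WHERE a.date_appel > '2011-08-01' AND a.date_appel < '2011-10-20'
      AND o.objet IS NOT NULL
    GROUP BY o.objet
    ORDER BY o.objet
""").fetchall()

print(by_objet_and_date)
print(by_objet)
```

Note that both queries use an INNER JOIN, so an ordre without a matching avis row is not counted at all.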
Ok, so I figured out what my problem was:
I have way more Ordres than I have Avis, and not all Avis have an Ordre attached. My query somehow only counts the objects that have an Avis attached to them (because of the JOIN clause?)
So while testing, I just randomly assigned lots of different values to random records in the Ordre table. It just so happened that I put one of each on Ordres that had a corresponding Avis...
Silly me :)
Thanks everyone though, help is always appreciated

Slow Query - Help with Optimization

Hey guys. This is a follow-on from this question:
After getting the right data and making some tweaks based on requests from business, I've now got this mini-beast on my hands. This query should return the total number of new jobseeker registrations and the number of new uploaded CV's:
SELECT COUNT(j.jobseeker_id) as new_registrations,
(
SELECT
COUNT(c.cv_id)
FROM
tb_cv as c, tb_jobseeker, tb_industry
WHERE
UNIX_TIMESTAMP(c.created_at) >= '1241125200'
AND
UNIX_TIMESTAMP(c.created_at) <= '1243717200'
AND
tb_jobseeker.industry_id = tb_industry.industry_id
)
AS uploaded_cvs
FROM
tb_jobseeker as j, tb_industry as i
WHERE
j.created_at BETWEEN '2009-05-01' AND '2009-05-31'
AND
i.industry_id = j.industry_id
GROUP BY i.description, MONTH(j.created_at)
Notes:
- The two values in the UNIX TIMESTAMP functions are passed in as parameters from the report module in our backend.
Every time I run it, MySQL chokes and lingers silently into the ether of the Interweb.
Help is appreciated.
Update: Hey guys. Thanks a lot for all the thoughtful and helpful comments. I'm only 2 weeks into my role here, so I'm still learning the schema. So, this query is somewhere between a thumbsuck and an educated guess. Will start to answer all your questions now.
tb_cv is not connected to the other tables in the sub-query. I guess this is the root cause for the slow query. It causes generation of a Cartesian product, yielding a lot more rows than you probably need.
Other than that I'd say you need indexes on tb_jobseeker.created_at, tb_cv.created_at and tb_industry.industry_id, and you might want to get rid of the UNIX_TIMESTAMP() calls in the sub-query since they prevent use of an index. Use BETWEEN and the actual field values instead.
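To make the Cartesian product concrete, here is a tiny sketch with sqlite3 standing in for MySQL. The rows are invented and the date filter is left out to keep it short; the point is only how the unconnected tb_cv in the FROM list multiplies the count.

```python
import sqlite3

# Invented sample data: 3 CVs, 4 jobseekers, 2 industries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb_industry (industry_id INTEGER, description TEXT);
    CREATE TABLE tb_jobseeker (jobseeker_id INTEGER, industry_id INTEGER);
    CREATE TABLE tb_cv (cv_id INTEGER, jobseeker_id INTEGER, created_at TEXT);
    INSERT INTO tb_industry VALUES (1, 'IT'), (2, 'Finance');
    INSERT INTO tb_jobseeker VALUES (1, 1), (2, 1), (3, 2), (4, 2);
    INSERT INTO tb_cv VALUES
        (1, 1, '2009-05-02'), (2, 2, '2009-05-10'), (3, 3, '2009-05-20');
""")

# Shape of the original sub-query: tb_cv is not joined to anything,
# so every CV is paired with every jobseeker/industry match.
cross_joined = conn.execute("""
    SELECT COUNT(c.cv_id)
    FROM tb_cv AS c, tb_jobseeker, tb_industry
    WHERE tb_jobseeker.industry_id = tb_industry.industry_id
""").fetchone()[0]

# Counting the CVs directly.
actual_cvs = conn.execute("SELECT COUNT(cv_id) FROM tb_cv").fetchone()[0]

print(cross_joined, actual_cvs)
```

With real table sizes the intermediate row count grows as the product of the table sizes, which would be consistent with the query never finishing.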
Here is my attempt at understanding your query and writing a better version. I guess you want to get the count of new jobseeker registrations and new uploaded CVs per month per industry:
SELECT
i.industry_id,
i.description,
MONTH(j.created_at) AS month_created,
YEAR(j.created_at) AS year_created,
COUNT(DISTINCT j.jobseeker_id) AS new_registrations,
COUNT(cv.cv_id) AS uploaded_cvs
FROM
tb_cv AS cv
INNER JOIN tb_jobseeker AS j ON j.jobseeker_id = cv.jobseeker_id
INNER JOIN tb_industry AS i ON i.industry_id = j.industry_id
WHERE
j.created_at BETWEEN '2009-05-01' AND '2009-05-31'
AND cv.created_at BETWEEN '2009-05-01' AND '2009-05-31'
GROUP BY
i.industry_id,
i.description,
MONTH(j.created_at),
YEAR(j.created_at)
A few things I noticed while writing the query:
- You GROUP BY values you don't output in the end. Why? (I've added the grouped fields to the output list.)
- You JOIN three tables in the sub-query while only ever using values from one of them. Why? I don't see what it would be good for, other than filtering out CV records that don't have a jobseeker or an industry attached, which I find hard to imagine. (I've removed the entire sub-query and used a simple COUNT instead.)
- Your sub-query returns the same value every time. Did you maybe mean to correlate it in some way, perhaps to the industry?
- The sub-query runs once for every record in a grouped query without being wrapped in an aggregate function.
First and foremost, it may be worth moving the UNIX_TIMESTAMP conversions to the other side of the comparison (that is, apply the reverse function to the literal timestamp values on the other side of the >= and <=). That way the conversion is performed once for the query rather than once for every record in the inner query.
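A rough sketch of that idea, using sqlite3 as a stand-in for MySQL. Here strftime('%s', ...) plays the role of UNIX_TIMESTAMP() and datetime(..., 'unixepoch') plays the role of the reverse conversion (FROM_UNIXTIME() in MySQL). The rows, the index name idx_cv_created and the assumption that created_at is stored in UTC are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb_cv (cv_id INTEGER, created_at TEXT);
    CREATE INDEX idx_cv_created ON tb_cv (created_at);  -- hypothetical index
    INSERT INTO tb_cv VALUES
        (1, '2009-04-15 10:00:00'),
        (2, '2009-05-10 12:00:00'),
        (3, '2009-05-20 08:30:00'),
        (4, '2009-06-02 09:00:00');
""")
params = (1241125200, 1243717200)  # the two timestamps from the question

# Conversion applied to the column: evaluated for every row.
slow_sql = """
    SELECT COUNT(*) FROM tb_cv
    WHERE CAST(strftime('%s', created_at) AS INTEGER) >= ?
      AND CAST(strftime('%s', created_at) AS INTEGER) <= ?
"""
# Conversion applied to the literals: the bare column can use the index.
fast_sql = """
    SELECT COUNT(*) FROM tb_cv
    WHERE created_at >= datetime(?, 'unixepoch')
      AND created_at <= datetime(?, 'unixepoch')
"""

slow_count = conn.execute(slow_sql, params).fetchone()[0]
fast_count = conn.execute(fast_sql, params).fetchone()[0]

def plan(sql):
    # Column 3 of each EXPLAIN QUERY PLAN row is the human-readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(r[3] for r in rows)

slow_plan, fast_plan = plan(slow_sql), plan(fast_sql)
print(slow_count, fast_count)
print(slow_plan)
print(fast_plan)
```

Both forms should return the same count; the difference is in the plan, where the second form can do a range search on the index instead of scanning every row. MySQL's EXPLAIN output looks different, so treat this as an illustration of the principle rather than a prediction of what your server will print.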
Also, why does the uploaded_cvs query not have any where clause linking it to the outer query? Am I missing something here?