SQL Frequency report with null values in columns - sql

I need help creating a frequency report for data that spans multiple columns. Here's an example of my data:
[Sample Data Example]
And here's how I need the output to read, except that I can't figure out how to get the frequency.
[Frequency Report]
There may be null values in a column in a particular row, but I still need that distinct row counted. I have all of the code working except how to get the frequency/count.
This is what I've tried, but all I get are 0s for the frequency.
select distinct TEST, PANEL, UNITS, LOINC, count(distinct TEST, PANEL, UNITS, LOINC) as FREQ
from DEV_HEALTHTERM.SOURCE.LAB
GROUP BY TEST, PANEL, UNITS, LOINC
ORDER BY FREQ DESC;
Can someone please help?
Thank you.
Robin

You should count something that is not part of the GROUP BY. In your case you can just use COUNT(*), which counts every row in the group, including rows with nulls:
select TEST, PANEL, UNITS, LOINC, count(*) as FREQ
from DEV_HEALTHTERM.SOURCE.LAB
GROUP BY TEST, PANEL, UNITS, LOINC
ORDER BY FREQ DESC;
Also, you don't need DISTINCT: since you are grouping by those columns, it's redundant.
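To illustrate the point about nulls, here is a minimal sketch using Python's built-in sqlite3 module rather than the asker's database. The table name `lab` and the sample rows are made up for the demonstration. It shows that GROUP BY places null values in the same group and that COUNT(*) counts those rows, while COUNT applied to a column skips nulls.

```python
import sqlite3

# Hypothetical sample rows; None is stored as SQL NULL.
rows = [
    ("Glucose", "BMP", "mg/dL", "2345-7"),
    ("Glucose", "BMP", "mg/dL", "2345-7"),
    ("Glucose", None, "mg/dL", None),
    ("Glucose", None, "mg/dL", None),
    ("Glucose", None, "mg/dL", None),
    ("Sodium", "BMP", None, "2951-2"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lab (test TEXT, panel TEXT, units TEXT, loinc TEXT)")
conn.executemany("INSERT INTO lab VALUES (?, ?, ?, ?)", rows)

# COUNT(*) counts rows per group; COUNT(panel) counts only non-null panels.
freq = conn.execute(
    """
    SELECT test, panel, units, loinc,
           COUNT(*)     AS freq,
           COUNT(panel) AS non_null_panels
    FROM lab
    GROUP BY test, panel, units, loinc
    ORDER BY freq DESC
    """
).fetchall()

for row in freq:
    print(row)
```

The group whose panel and loinc are null still gets its full row count in `freq`, while `non_null_panels` is 0 for it, which is the kind of result you get when counting a nullable column instead of rows.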

Related

Selecting first record name from duplicate values

I'm writing a query where I want to select the PartNo, Description, Model, and AvaQty from a view.
But in our system, there are slightly different Descriptions or Models for the same Part Number.
As an example, Part A has a description like "This is Part A", and there is another record with a description like "This is Part Aa".
In my query, I want to remove the duplicates and show the sum of the Ava Qty. But because the descriptions and models differ for the same part number, I get duplicate rows in the final report.
This is my current code.
SELECT DISTINCT PART_NO as PartNo,
ad.INVENTORY_PART_API.Get_Description(contract,part_no) as PartDescription,
ad.Inventory_Product_Family_API.Get_Description(ad.Inventory_Part_API.Get_Part_Product_Family(CONTRACT, PART_NO)) as PartModel,
SUM( QTY_ONHAND - QTY_RESERVED) as AvaQty
FROM ad.INVENTORY_PART_IN_STOCK_UIV
WHERE CONTRACT is not null and
upper(ad.Sales_Part_API.Get_Catalog_Group(CONTRACT, PART_NO)) = upper('SPAM')OR
upper(ad.Sales_Part_API.Get_Catalog_Group(CONTRACT, PART_NO)) = upper('OTOA')
GROUP BY PART_NO,
ad.INVENTORY_PART_API.Get_Description(contract,part_no),
ad.Inventory_Product_Family_API.Get_Description(ad.Inventory_Part_API.Get_Part_Product_Family(CONTRACT, PART_NO))
This returns 14623 records, of which 46 are duplicates because the description or model differs. Is there any way to get the result without these duplicates?
I tried selecting only PartNo and Qty, without Description and Model, and then the records come back without duplicates. Is there a way to select PartNo, take the description and model from the first record among the duplicates, and sum the quantity? Thanks
You want one result row per part number, so group by part number only. There can be different descriptions per part number, so decide which one to show. Below, I show the alphabetically first one (MIN). You can also use MAX to show the alphabetically last one, or LISTAGG to show them all.
SELECT
part_no AS partno,
MIN(ad.inventory_part_api.get_description(contract,part_no)) AS partdescription,
MIN(ad.inventory_product_family_api.get_description(ad.inventory_part_api.get_part_product_family(contract, part_no))) AS partmodel,
SUM(qty_onhand - qty_reserved) AS avaqty
FROM ad.inventory_part_in_stock_uiv
WHERE contract IS NOT NULL
AND UPPER(ad.sales_part_api.get_catalog_group(contract, part_no)) IN ('SPAM', 'OTOA')
GROUP BY part_no
ORDER BY part_no;
As to your WHERE clause: You had WHERE (contract IS NOT NULL AND catgrp = 'SPAM') OR (catgrp = 'OTOA'), because AND has precedence over OR. In my query it is WHERE (contract IS NOT NULL) AND (catgrp = 'SPAM' OR catgrp = 'OTOA'). I suppose this is what you really want. Otherwise change it back.
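The precedence difference can be checked on a toy table. This is only a sketch in Python's sqlite3 with a made-up `parts` table, where a plain `catgrp` column stands in for the Get_Catalog_Group call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_no TEXT, contract TEXT, catgrp TEXT)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [
        ("A", "C1", "SPAM"),
        ("B", None, "SPAM"),
        ("C", "C1", "OTOA"),
        ("D", None, "OTOA"),
    ],
)

# Without parentheses, AND binds tighter than OR:
# (contract IS NOT NULL AND catgrp = 'SPAM') OR catgrp = 'OTOA'
unparenthesized = conn.execute(
    "SELECT part_no FROM parts "
    "WHERE contract IS NOT NULL AND catgrp = 'SPAM' OR catgrp = 'OTOA' "
    "ORDER BY part_no"
).fetchall()

# With parentheses, the null check applies to both catalog groups.
parenthesized = conn.execute(
    "SELECT part_no FROM parts "
    "WHERE contract IS NOT NULL AND (catgrp = 'SPAM' OR catgrp = 'OTOA') "
    "ORDER BY part_no"
).fetchall()

print(unparenthesized)
print(parenthesized)
```

Part D has a null contract but still passes the unparenthesized filter, because the OTOA branch never checks the contract.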
I would solve this task by using analytic functions.
For example, SUM(qty_onhand - qty_reserved) OVER (PARTITION BY PART_NO). In this case you don't have to use GROUP BY.
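One thing to keep in mind with the analytic approach is that a window function returns one row per input row, so the per-part total is repeated on every row of that part and the result still has to be collapsed, for example with DISTINCT. The sketch below uses Python's sqlite3 and a made-up `stock` table with a plain `description` column in place of the API calls. Using FIRST_VALUE to pick one description is an assumption made for the demo, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (part_no TEXT, description TEXT, "
    "qty_onhand INTEGER, qty_reserved INTEGER)"
)
conn.executemany(
    "INSERT INTO stock VALUES (?, ?, ?, ?)",
    [
        ("A", "This is Part A", 10, 2),
        ("A", "This is Part Aa", 5, 0),
        ("B", "This is Part B", 7, 7),
    ],
)

# Every row of a part carries the same description and total,
# so DISTINCT reduces the output to one row per part.
result = conn.execute(
    """
    SELECT DISTINCT
           part_no,
           FIRST_VALUE(description) OVER (
               PARTITION BY part_no ORDER BY description
           ) AS description,
           SUM(qty_onhand - qty_reserved) OVER (
               PARTITION BY part_no
           ) AS avaqty
    FROM stock
    ORDER BY part_no
    """
).fetchall()

print(result)
```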

SQL Sum Total with multiple assignments

select dc_id, whse_id, assg_id, START_DTIM,
UNIT_SHIP_CSE*prod_cub as TOTAL_CUBE
from exehoust.aseld
I attached a photo to show how the query currently populates. I want to sum the TOTAL_CUBE for each distinct ASSG_ID. I have tried case where sum and group by but keep failing. Basically want to do a SUM IF for each distinct ASSG_ID
You need to group by assg_id, but you also need to define what happens to all the other columns. I chose MIN only to give you a hint; you need to pick the appropriate function yourself.
select MIN(dc_id), MIN(whse_id), assg_id, MIN(START_DTIM),
SUM(UNIT_SHIP_CSE*prod_cub) as TOTAL_CUBE
from exehoust.aseld
GROUP BY assg_id
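Alternatively, use a window function, select assg_id, sum(UNIT_SHIP_CSE*prod_cub) over (partition by assg_id order by assg_id), to sum per assignment. This keeps one row per input row, with the group total repeated on each.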
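The ORDER BY inside the OVER clause deserves a closer look, because with the default window frame it can turn a group total into a running total. The following sketch uses Python's sqlite3 with invented rows for a table shaped like `aseld`. Ordering by the partition key makes all rows of a group peers, so each row gets the full total, whereas ordering by START_DTIM accumulates row by row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aseld (assg_id INTEGER, start_dtim TEXT, "
    "unit_ship_cse INTEGER, prod_cub INTEGER)"
)
conn.executemany(
    "INSERT INTO aseld VALUES (?, ?, ?, ?)",
    [
        (1, "2024-01-01 08:00", 2, 5),   # cube 10
        (1, "2024-01-01 09:00", 4, 5),   # cube 20
        (2, "2024-01-01 08:30", 1, 5),   # cube 5
    ],
)

totals = conn.execute(
    """
    SELECT assg_id,
           start_dtim,
           -- rows with equal assg_id are peers: full group total on each row
           SUM(unit_ship_cse * prod_cub) OVER (
               PARTITION BY assg_id ORDER BY assg_id
           ) AS total_cube,
           -- ordering by time gives a running total within the group
           SUM(unit_ship_cse * prod_cub) OVER (
               PARTITION BY assg_id ORDER BY start_dtim
           ) AS running_cube
    FROM aseld
    ORDER BY assg_id, start_dtim
    """
).fetchall()

for row in totals:
    print(row)
```

If only the group total is wanted, leaving ORDER BY out of the OVER clause avoids the ambiguity.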

How to write a query to parse data into multiple columns from one column

I have an existing table (link #1) and I am trying to write a query that reformats the data as seen in the second link. Basically it is a table listing the completed email types for a group of users. The "Completed Type" is a single column with multiple values. I am trying to parse out the individual values (3 of them) from "Completed Type" into their own columns, each with a total count. I would also like to add a separate column called "Completed", which is simply the sum of "Closed without response" and "Replied" for that particular user for that particular month.
I plan to then create a pivot in Excel that reads from the new query with the reformatted data. For the life of me, I can't figure out how to write this in SQL. I tried creating individual queries to total the different "Completed" types and then tried to union them, but it is not working.
Existing table
Future Query Output
Any advice or guidance you can provide in writing a SQL query in Access that will produce image # 2 would be GREATLY appreciated! Thank you in advance!
You can use CASE WHEN with SUM (Access SQL has no CASE expression, so use IIf there instead), for example:
select month,
id,
sum(case when completed_type = "completed" then 1 else 0 end) as completed
from table
group by month, id
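Extended to all three types plus the combined "Completed" column, the conditional aggregation idea could look like the sketch below. It runs in Python's sqlite3, not Access, and assumes a hypothetical `cases` table with one row per completed email and only the `month`, `id` and `completed_type` columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (month TEXT, id INTEGER, completed_type TEXT)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?, ?)",
    [
        ("2024-01", 1, "Replied"),
        ("2024-01", 1, "Replied"),
        ("2024-01", 1, "Sent"),
        ("2024-01", 1, "Closed without response"),
        ("2024-01", 2, "Sent"),
        ("2024-02", 1, "Closed without response"),
    ],
)

# Each SUM(CASE ...) counts the rows of one type within the (month, id) group.
pivot = conn.execute(
    """
    SELECT month,
           id,
           SUM(CASE WHEN completed_type = 'Replied' THEN 1 ELSE 0 END) AS replied,
           SUM(CASE WHEN completed_type = 'Sent' THEN 1 ELSE 0 END) AS sent,
           SUM(CASE WHEN completed_type = 'Closed without response'
                    THEN 1 ELSE 0 END) AS closed_without_response,
           SUM(CASE WHEN completed_type IN ('Replied', 'Closed without response')
                    THEN 1 ELSE 0 END) AS completed
    FROM cases
    GROUP BY month, id
    ORDER BY month, id
    """
).fetchall()

for row in pivot:
    print(row)
```

If the source table already stores a count per row, as a [Case Count] column would suggest, the `1` in each CASE would be replaced by that column.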
Use a crosstab query:
TRANSFORM
Sum([Case Count]) AS [SumOfCase Count]
SELECT
[Month],
ID,
[Adjusted Name],
Mgr,
Sup,
Region,
Region2,
Sum(Abs([Completed Type] Not Like "Closed*")) AS Completed
FROM
Cases
GROUP BY
[Month],
ID,
[Adjusted Name],
Mgr,
Sup,
Region,
Region2
ORDER BY
ID,
[Month] DESC
PIVOT
[Completed Type] In ("Replied","Sent","Closed without response");

Selecting percentage of group and population based on a field in a table

I have a table with user IDs and states. I need to assign 20% of users in each state to a control group by setting a flag in another table. I don't know how I would be able to ensure that the numbers are correct though. How would I go about even starting this?
As an example, take a look at this sqlfiddle:
http://sqlfiddle.com/#!4/8e49d/6/0
with counts as
(select stateid, count(userid) as num_users
from userstates
group by stateid)
select *
from (select x.stateid,
x.userid,
sum(1) over(partition by x.stateid order by x.userid) as runner,
y.num_users,
sum(1) over(partition by x.stateid order by x.userid) / y.num_users as pct
from userstates x
join counts y
on x.stateid = y.stateid)
where pct <= .2
There are a couple of assumptions I made:
-- I assumed that, if you could not pull exactly 20%, you would choose, for instance, 19%, rather than 21%. The query would need to be changed slightly if you want to pull 1 ID over 20% when exactly 20% is not possible (you can't pull a fraction of a username, so you have to choose one way or the other).
-- I assumed that you did not want a random 20%, and that 20% of the first user IDs, in order, would suffice. I would need to change the query slightly if you wanted the 20% from each group to be random.
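The same idea, keeping the first users of each state while their position stays within 20% of the state's size, can be tried out locally. This sketch uses Python's sqlite3 with invented data and swaps `sum(1) over (...)` for ROW_NUMBER and COUNT window functions. It compares `rn * 5 <= num_users` instead of dividing, because integer division in SQLite would truncate the ratio to 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE userstates (userid INTEGER, stateid INTEGER)")

# Hypothetical data: state 1 has users 1..10, state 2 has users 11..17.
users = [(u, 1) for u in range(1, 11)] + [(u, 2) for u in range(11, 18)]
conn.executemany("INSERT INTO userstates VALUES (?, ?)", users)

control_group = conn.execute(
    """
    SELECT stateid, userid
    FROM (
        SELECT stateid,
               userid,
               ROW_NUMBER() OVER (PARTITION BY stateid ORDER BY userid) AS rn,
               COUNT(*)     OVER (PARTITION BY stateid)                 AS num_users
        FROM userstates
    )
    WHERE rn * 5 <= num_users   -- same as rn / num_users <= 0.2, rounding down
    ORDER BY stateid, userid
    """
).fetchall()

print(control_group)
```

State 1 contributes 2 of its 10 users and state 2 contributes 1 of its 7 (about 14%), which matches the first assumption of rounding down rather than up. For a random 20%, the ORDER BY inside ROW_NUMBER could use a random expression (random() in SQLite, dbms_random.value in Oracle) instead of userid.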

BigQuery: GROUP BY clause for QUANTILES

Based on the BigQuery query reference, QUANTILES currently does not allow any kind of grouping by another column. I am mainly interested in getting medians grouped by a certain column. The only workaround I see right now is to run one quantile query per distinct group member, with the group member as a condition in the WHERE clause.
For example, to get the desired result I run the query below once for every distinct value in column-y.
SELECT QUANTILE( <column-x>, 1001)
FROM <table>
WHERE
<column-y> == <each distinct row in column-y>
Does the BigQuery team plan to support grouping with quantiles in the future?
Is there a better way to get what I am trying to get here?
Thanks
With the recently announced percentile_cont() window function you can get medians.
Look at the example in the announcement blog post:
http://googlecloudplatform.blogspot.com/2013/06/google-bigquery-bigger-faster-smarter-analytics-functions.html
SELECT MAX(median) AS median, room FROM (
SELECT percentile_cont(0.5) OVER (PARTITION BY room ORDER BY data) AS median, room
FROM [io_sensor_data.moscone_io13]
WHERE sensortype='temperature'
)
GROUP BY room
While there are efficient algorithms to compute quantiles, they are somewhat memory intensive, so trying to do multiple quantile calculations in a single query gets expensive.
There are plans to improve QUANTILES, but I don't know what the timeline is.
Do you need median? Can you filter outliers and do an average of the remainder?
If your per-group size is fixed, you may be able to hack it using a combination of order, nest and nth. For instance, if there are 9 distinct values of f2 per value of f1, for the median:
select f1,nth(5,f2) within record from (
select f1,nest(f2) f2 from (
select f1, f2 from table
group by f1,f2
order by f2
) group by f1
);
Not sure if the sorted order in subquery is guaranteed to survive the second group, but it worked in a simple test I tried.
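The underlying idea of the nth() trick, taking the middle element of each sorted group, can also be written with window functions so that it does not depend on a fixed group size. The sketch below is not BigQuery syntax. It runs on SQLite through Python's sqlite3, with a made-up `readings` table, and averages the one or two middle rows of each group.

```python
import sqlite3
import statistics

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (room TEXT, data INTEGER)")

# Hypothetical data: one odd-sized group and one even-sized group.
samples = {"a": [9, 1, 5, 3, 7], "b": [10, 2, 8, 4]}
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(room, value) for room, values in samples.items() for value in values],
)

# rn is the position within the sorted group and cnt is the group size.
# With integer division, (cnt + 1) / 2 and (cnt + 2) / 2 name the single
# middle row for odd cnt and the two middle rows for even cnt.
medians = conn.execute(
    """
    SELECT room, AVG(data) AS median
    FROM (
        SELECT room,
               data,
               ROW_NUMBER() OVER (PARTITION BY room ORDER BY data) AS rn,
               COUNT(*)     OVER (PARTITION BY room)               AS cnt
        FROM readings
    )
    WHERE rn IN ((cnt + 1) / 2, (cnt + 2) / 2)
    GROUP BY room
    ORDER BY room
    """
).fetchall()

# Cross-check against Python's own median.
expected = [(room, float(statistics.median(v))) for room, v in sorted(samples.items())]
print(medians, expected)
```

Because the ordering happens inside the window function, this version does not rely on a subquery's sort order surviving a later GROUP BY.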