I have a data set which contains people, dates, food, and quantity of food. I want a query where I specify a person and a date and have two values returned: the quantity of food eaten on the date chosen and the average quantity of food eaten over the previous 7 days.
So if I pick Abe on 1/10/2013 I get "1" and "3.6" because he ate 1 piece of fruit on 1/10 and an average of 3.6 pieces of fruit each day between 1/3 and 1/9.
name,thedate,qty,food
abe,1/2/2013,1,orange
abe,1/2/2013,3,pear
abe,1/3/2013,3,orange
abe,1/4/2013,2,orange
abe,1/4/2013,2,plum
abe,1/5/2013,1,orange
abe,1/7/2013,7,onion
abe,1/8/2013,2,orange
abe,1/9/2013,3,orange
abe,1/9/2013,2,pear
abe,1/10/2013,1,orange
jen,1/1/2013,2,orange
jen,1/4/2013,3,orange
jen,1/5/2013,2,orange
You need a correlated subquery to find this:
Select
Parent.name
, Parent.thedate
, Parent.qty,
(SELECT avg(qty)
FROM yourTable
where name = parent.name
and thedate < parent.theDate
and theDate>=dateadd("d", datediff("d",0, parent.theDate)-7,0)
group by name) as previousSeven
from yourTable Parent
If this is actually on a per-food basis, you can match on that too by adding "and food = parent.food" to the subquery's where clause. You then need to add food to the group by, too.
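As a rough check of the correlated-subquery idea, here is a minimal sketch that runs the same pattern against the sample data. It assumes SQLite (through Python's sqlite3 module) instead of Access, ISO-formatted dates, and SQLite's date(x, '-7 days') in place of the dateadd/datediff expression; the helper name previous_seven_avg is made up for illustration.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings (an assumption
# made so that plain string comparison orders them correctly in SQLite).
ROWS = [
    ("abe", "2013-01-02", 1, "orange"), ("abe", "2013-01-02", 3, "pear"),
    ("abe", "2013-01-03", 3, "orange"), ("abe", "2013-01-04", 2, "orange"),
    ("abe", "2013-01-04", 2, "plum"), ("abe", "2013-01-05", 1, "orange"),
    ("abe", "2013-01-07", 7, "onion"), ("abe", "2013-01-08", 2, "orange"),
    ("abe", "2013-01-09", 3, "orange"), ("abe", "2013-01-09", 2, "pear"),
    ("abe", "2013-01-10", 1, "orange"), ("jen", "2013-01-01", 2, "orange"),
    ("jen", "2013-01-04", 3, "orange"), ("jen", "2013-01-05", 2, "orange"),
]


def previous_seven_avg(name, thedate):
    """Return (qty, average qty per row over the previous 7 days) for each matching row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourTable (name TEXT, thedate TEXT, qty INTEGER, food TEXT)")
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?, ?)", ROWS)
    sql = """
        SELECT parent.qty,
               (SELECT AVG(qty)
                  FROM yourTable
                 WHERE name = parent.name
                   AND thedate < parent.thedate
                   AND thedate >= date(parent.thedate, '-7 days')) AS previousSeven
          FROM yourTable AS parent
         WHERE parent.name = ? AND parent.thedate = ?
    """
    rows = conn.execute(sql, (name, thedate)).fetchall()
    conn.close()
    return rows


result = previous_seven_avg("abe", "2013-01-10")
```

Note that avg(qty) averages over rows, not over days: for Abe on 1/10 the eight rows from 1/3 to 1/9 sum to 22, so this returns 2.75 and not the 3.6 from the question.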
Update
To find, not an average, but the sum of the number of fruit divided by the number of distinct days with data in the last 7 days, you will need something more like this (it gets a lot more complicated because Access doesn't support the Select count(distinct something) syntax):
Select
Name
, theDate
, qty
, sumOfPreviousSeven/distinctDaysWithDataLastSeven
from (
Select
Parent.name
, Parent.thedate
, Parent.qty
, (SELECT sum(qty)
FROM table4
where name = parent.name
and thedate < parent.theDate
and theDate>=dateadd("d", datediff("d",0, parent.theDate)-7,0)
group by name) as sumOfPreviousSeven
, (select top 1 count(distinctDates) from
(select dateadd("d", datediff("d",0, theDate),0) as distinctDates, name from table4
group by dateadd("d", datediff("d",0, theDate),0), name)
where name = parent.name
and distinctDates < parent.theDate
and distinctDates>=dateadd("d", datediff("d",0, parent.theDate)-7,0)
group by name) as distinctDaysWithDataLastSeven
from table4 Parent) as base
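For comparison, on an engine that does support count(distinct ...), the same "sum divided by distinct days with data" figure can be sketched much more compactly. The example below assumes SQLite via Python's sqlite3, ISO dates, and a hypothetical helper named per_day_average; it is not Access syntax.

```python
import sqlite3


def per_day_average(rows, name, thedate):
    """Sum of qty over the previous 7 days divided by the number of distinct days with data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table4 (name TEXT, thedate TEXT, qty INTEGER)")
    conn.executemany("INSERT INTO table4 VALUES (?, ?, ?)", rows)
    sql = """
        SELECT SUM(qty) * 1.0 / COUNT(DISTINCT thedate)
          FROM table4
         WHERE name = ?
           AND thedate < ?
           AND thedate >= date(?, '-7 days')
    """
    (value,) = conn.execute(sql, (name, thedate, thedate)).fetchone()
    conn.close()
    return value  # None when there is no data in the window


# Abe's rows from the question (food column omitted, dates in ISO form).
abe_rows = [
    ("abe", "2013-01-02", 1), ("abe", "2013-01-02", 3), ("abe", "2013-01-03", 3),
    ("abe", "2013-01-04", 2), ("abe", "2013-01-04", 2), ("abe", "2013-01-05", 1),
    ("abe", "2013-01-07", 7), ("abe", "2013-01-08", 2), ("abe", "2013-01-09", 3),
    ("abe", "2013-01-09", 2), ("abe", "2013-01-10", 1),
]
average = per_day_average(abe_rows, "abe", "2013-01-10")
```

For Abe on 1/10 this is 22 pieces over 6 days with data, roughly 3.67, which lines up with the 3.6 mentioned in the question.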
I have one dataset, and am trying to list all of the combinations of said dataset. However, I am unable to figure out how to include the combinations that are null. For example, Longitudinal? can be no and cohort can be 11-20, however for Region 1, there were no patients of that age in that region. How can I show a 0 for the count?
Here is the code:
SELECT "s_safe_005prod"."ig_eligi_group1"."site_name" AS "Site Name",
"s_safe_005prod"."ig_eligi_group1"."il_eligi_ellong" AS "Longitudinal?",
"s_safe_005prod"."ig_eligi_group1"."il_eligi_elcohort" AS "Cohort",
count(*) AS "count"
FROM "s_safe_005prod"."ig_eligi_group1"
GROUP BY "s_safe_005prod"."ig_eligi_group1"."site_name",
"s_safe_005prod"."ig_eligi_group1"."il_eligi_ellong",
"s_safe_005prod"."ig_eligi_group1"."il_eligi_elcohort"
ORDER BY "s_safe_005prod"."ig_eligi_group1"."site_name",
"s_safe_005prod"."ig_eligi_group1"."il_eligi_ellong" ASC,
"s_safe_005prod"."ig_eligi_group1"."il_eligi_elcohort" ASC
Create a cross join across the unique values from each of the three grouping fields to create a set of all possible combinations. Then left join that to the counts you have originally and coalesce null values to zero.
WITH groups AS
(
SELECT a.site_name, b.longitudinal, c.cohort
FROM (SELECT DISTINCT site_name FROM s_safe_005prod.ig_eligi_group1) a,
(SELECT DISTINCT il_eligi_ellong AS longitudinal FROM s_safe_005prod.ig_eligi_group1) b,
(SELECT DISTINCT il_eligi_elcohort AS cohort FROM s_safe_005prod.ig_eligi_group1) c
),
dat AS
(
SELECT site_name,
il_eligi_ellong AS longitudinal,
il_eligi_elcohort AS cohort,
count(*) AS "count"
FROM s_safe_005prod.ig_eligi_group1
GROUP BY site_name,
il_eligi_ellong,
il_eligi_elcohort
)
SELECT groups.site_name,
groups.longitudinal,
groups.cohort,
COALESCE(dat."count", 0) AS "count"
FROM groups
LEFT JOIN dat ON groups.site_name = dat.site_name
AND groups.longitudinal = dat.longitudinal
AND groups.cohort = dat.cohort;
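A small runnable sketch of the cross join plus left join approach, assuming SQLite through Python's sqlite3 and three made-up patient rows. The CTE is called combos here (instead of groups) and the count column is named n, purely to stay clear of keywords; the schema prefix is dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ig_eligi_group1 (site_name TEXT, il_eligi_ellong TEXT, il_eligi_elcohort TEXT)"
)
# Hypothetical sample data: one row per patient.
conn.executemany(
    "INSERT INTO ig_eligi_group1 VALUES (?, ?, ?)",
    [("Region 1", "no", "0-10"), ("Region 1", "yes", "11-20"), ("Region 2", "no", "11-20")],
)

sql = """
    WITH combos AS (
        -- every site x longitudinal x cohort combination that could occur
        SELECT a.site_name, b.longitudinal, c.cohort
          FROM (SELECT DISTINCT site_name FROM ig_eligi_group1) a
         CROSS JOIN (SELECT DISTINCT il_eligi_ellong AS longitudinal FROM ig_eligi_group1) b
         CROSS JOIN (SELECT DISTINCT il_eligi_elcohort AS cohort FROM ig_eligi_group1) c
    ),
    dat AS (
        -- the counts that actually exist
        SELECT site_name, il_eligi_ellong AS longitudinal, il_eligi_elcohort AS cohort,
               COUNT(*) AS n
          FROM ig_eligi_group1
         GROUP BY site_name, il_eligi_ellong, il_eligi_elcohort
    )
    SELECT combos.site_name, combos.longitudinal, combos.cohort, COALESCE(dat.n, 0)
      FROM combos
      LEFT JOIN dat ON combos.site_name = dat.site_name
                   AND combos.longitudinal = dat.longitudinal
                   AND combos.cohort = dat.cohort
     ORDER BY combos.site_name, combos.longitudinal, combos.cohort
"""
counts = conn.execute(sql).fetchall()
conn.close()
```

With this data, counts holds all 2 x 2 x 2 = 8 combinations, and combinations with no patients, such as Region 1 / no / 11-20, appear with a count of 0.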
I have two tables named shoes_type and shoes_list. The shoes_type table includes shoes_id, shoes_size, shoes_type, date, and project_id. The shoes_list table has shoes_quantity, shoes_id, shoes_color, date, and project_id.
I need to get the sum of shoes_quantity based on the shoes_type, shoes_size, date, and also project_id.
I get how to sum the shoes_quantity based on color by doing:
select shoes_color, sum(shoes_quantity)
from shoes_list group by shoes_color
Basically what I want to see is the total quantity of shoes based on the type, size, date and project_id. The type and size information are available on shoes_type table, while the quantity is coming from the shoes_list table. I expect to see something like:
shoes_type  shoes_size  total quantity  date      project_id
heels       5           3               19/10/02  1
sneakers    5           3               19/10/02  1
sneakers    6           1               19/10/05  1
heels       7           5               19/10/03  1
To get the desired result, I have tried:
select shoes_type, shoes_size, date, project_id, sum(shoes_quantity)
from shoes_type st
join shoes_list sl
on st.project_id = sl.project_id
and st.shoes_id = sl.shoes_id
and st.date = sl.date
group by shoes_type, shoes_size, date, project_id
Unfortunately, I got an error that says that the column reference "date" is ambiguous.
How should I fix this?
Thank you.
The date column exists in both tables, so you have to specify which table to select it from. Since the query aliases the tables as st and sl, replace date with st.date or sl.date.
Qualify all column references to remove the "ambiguous" column error:
select st.shoes_type, st.shoes_size, st.date, st.project_id, sum(sl.shoes_quantity)
from shoes_type st join
shoes_list sl
on st.project_id = sl.project_id and
st.shoes_id = sl.shoes_id and
st.date = sl.date
group by st.shoes_type, st.shoes_size, st.date, st.project_id;
If you want all columns from shoes_type, you might find that a correlated subquery is faster:
select st.*,
(select sum(sl.shoes_quantity)
from shoes_list sl
where st.project_id = sl.project_id and
st.shoes_id = sl.shoes_id and
st.date = sl.date
)
from shoes_type st;
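To see both the error and the fix in action, here is a minimal sketch assuming SQLite via Python's sqlite3 and a few made-up rows (the colours and IDs are invented). It first runs the unqualified query to show it is rejected, then the fully qualified one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shoes_type (shoes_id INTEGER, shoes_size INTEGER, shoes_type TEXT,
                             date TEXT, project_id INTEGER);
    CREATE TABLE shoes_list (shoes_quantity INTEGER, shoes_id INTEGER, shoes_color TEXT,
                             date TEXT, project_id INTEGER);
    -- hypothetical sample rows
    INSERT INTO shoes_type VALUES (1, 5, 'heels', '2019-10-02', 1),
                                  (2, 5, 'sneakers', '2019-10-02', 1);
    INSERT INTO shoes_list VALUES (2, 1, 'red', '2019-10-02', 1),
                                  (1, 1, 'blue', '2019-10-02', 1),
                                  (3, 2, 'white', '2019-10-02', 1);
""")

JOIN = """
      FROM shoes_type st
      JOIN shoes_list sl
        ON st.project_id = sl.project_id
       AND st.shoes_id = sl.shoes_id
       AND st.date = sl.date
"""

# Unqualified column names: date and project_id exist in both tables, so this fails.
ambiguous_error = None
try:
    conn.execute(
        "SELECT shoes_type, shoes_size, date, project_id, SUM(shoes_quantity)"
        + JOIN + "GROUP BY shoes_type, shoes_size, date, project_id"
    )
except sqlite3.OperationalError as exc:
    ambiguous_error = exc

# Every column qualified with its table alias: this runs.
totals = conn.execute(
    "SELECT st.shoes_type, st.shoes_size, st.date, st.project_id, SUM(sl.shoes_quantity)"
    + JOIN + "GROUP BY st.shoes_type, st.shoes_size, st.date, st.project_id"
    + " ORDER BY st.shoes_type"
).fetchall()
conn.close()
```

The exact error wording differs between database engines, but the cause is the same: the engine cannot tell which table's date you mean until the reference is qualified.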
In the query below, I get the product name, the count of each item in the product, and the count of customers buying the product, and then calculate the percentage of customers buying the product. I am hard-coding the total number of unique customers (1500). I want to know how I can compute this dynamically in my query. Joining based on purchase date is the only solution that comes to mind. Is there a more efficient way to achieve this?
Query below
(SELECT ProdName, COUNT(ProdName) AS No_of_Prods,
EXACT_COUNT_DISTINCT(cust) as No_of_cust,
(EXACT_COUNT_DISTINCT(cust)/1500)*100 as Percentage_of_cust
FROM
[Prod-cust]
WHERE
(STRFTIME_UTC_USEC(Timestamp,"%Y%m%d")) = (STRFTIME_UTC_USEC(DATE_ADD(CURRENT_TIMESTAMP(), -1, "day"), "%Y%m%d"))
GROUP BY
1,
ORDER BY
2 DESC)
The query for the total number of unique customers is below:
(SELECT EXACT_COUNT_DISTINCT(cust),
FROM
[Prod-cust]
WHERE
(STRFTIME_UTC_USEC(Timestamp,"%Y%m%d")) = (STRFTIME_UTC_USEC(DATE_ADD(CURRENT_TIMESTAMP(), -1, "day"), "%Y%m%d")))
SELECT
one.ProdName AS ProdName,
one.No_of_Prods AS No_of_Prods,
one.No_of_cust AS No_of_cust,
(one.No_of_cust/all.No_of_cust)*100 AS Percentage_of_cust
FROM (
SELECT
ProdName,
COUNT(ProdName) AS No_of_Prods,
EXACT_COUNT_DISTINCT(cust) AS No_of_cust
FROM [Prod-cust]
WHERE
(STRFTIME_UTC_USEC(TIMESTAMP,"%Y%m%d")) = (STRFTIME_UTC_USEC(DATE_ADD(CURRENT_TIMESTAMP(), -1, "day"), "%Y%m%d"))
GROUP BY 1
) AS one
CROSS JOIN (
SELECT EXACT_COUNT_DISTINCT(cust) AS No_of_cust,
FROM [Prod-cust]
WHERE
(STRFTIME_UTC_USEC(TIMESTAMP,"%Y%m%d")) = (STRFTIME_UTC_USEC(DATE_ADD(CURRENT_TIMESTAMP(), -1, "day"), "%Y%m%d"))
) AS all
SELECT number, RATIO_TO_REPORT(number) OVER() ratio
FROM (SELECT 10 number),(SELECT 40 number),(SELECT 70 number),(SELECT 20 number),(SELECT 30 number)
RATIO_TO_REPORT will add all the numbers, and give you the ratio number/(sum number).
Updated: If we need to count the number of DISTINCT customers, then a COUNT_DISTINCT() OVER() would work as a sub-query:
SELECT number, number/distincts ratio
FROM (
SELECT number, COUNT_DISTINCT(id) OVER() distincts
FROM (SELECT 10 number, 1 id),(SELECT 30 number, 2 id),
(SELECT 60 number, 3 id),(SELECT 20 number, 1 id),
(SELECT 40 number, 1 id)
)
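The cross join idea (compute the distinct-customer total once and attach it to every product row) can be sketched outside BigQuery as well. The example below assumes SQLite via Python's sqlite3, a made-up prod_cust table, COUNT(DISTINCT ...) in place of EXACT_COUNT_DISTINCT, and no date filter; the alias total is used in place of all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prod_cust (ProdName TEXT, cust TEXT)")
# Hypothetical purchases: 4 distinct customers in total.
conn.executemany(
    "INSERT INTO prod_cust VALUES (?, ?)",
    [("apple", "c1"), ("apple", "c2"), ("apple", "c1"),
     ("pear", "c3"), ("pear", "c1"), ("plum", "c4")],
)

sql = """
    SELECT one.ProdName,
           one.No_of_Prods,
           one.No_of_cust,
           one.No_of_cust * 100.0 / total.No_of_cust AS Percentage_of_cust
      FROM (SELECT ProdName,
                   COUNT(ProdName) AS No_of_Prods,
                   COUNT(DISTINCT cust) AS No_of_cust
              FROM prod_cust
             GROUP BY ProdName) AS one
     CROSS JOIN (SELECT COUNT(DISTINCT cust) AS No_of_cust
                   FROM prod_cust) AS total   -- single row, so the join adds no rows
     ORDER BY one.No_of_Prods DESC, one.ProdName
"""
percentages = conn.execute(sql).fetchall()
conn.close()
```

Because the second subquery always returns exactly one row, the cross join simply appends the total to each product row, which removes the need for the hard-coded 1500. In a real query both subqueries would carry the same date filter.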
Situation:
We have monthly files that get loaded into our data warehouse. However, instead of replacing old loads, the new files are just stacked on top of them. The files are loaded over a period of days.
So when running a SQL script we would get duplicate records. To counteract this, we run a union over 10-20 'customers', selecting MAX(LoadID) for each, e.g.:
SELECT
Customer,
column2,
column3
FROM
MyTable
WHERE
LOADID = (SELECT MAX(LOADID) FROM MyTable WHERE Customer = 'ASDA')
UNION
SELECT
Customer,
column2,
column3
FROM
MyTable
WHERE
LOADID = (SELECT MAX(LOADID) FROM MyTable WHERE Customer = 'TESCO')
The above union would have to be done for multiple customers, so I was thinking there surely has to be a more efficient way.
We can't use a plain MAX(LoadID) in the SELECT statement, as a possible scenario could entail the following:
Monday: Asda,Tesco,Waitrose loaded into DW (with LoadID as 124)
Tuesday: Sainsburys loaded in DW (with LoadID as 125)
Wednesday: New Tesco loaded in DW (with LoadID as 126)
So I would want LoadID 124 for Asda & Waitrose, 125 for Sainsburys, and 126 for Tesco.
Use window functions:
SELECT t.*
FROM (SELECT t.*, MAX(LOADID) OVER (PARTITION BY Customer) as maxLOADID
FROM MyTable t
) t
WHERE LOADID = maxLOADID;
Would a subquery to a derived table meet your needs?
select yourfields
from yourtables realTable join
(select customer, max(loadID) maxLoadId
from yourtables
group by customer) derivedTable on derivedTable.customer = realTable.customer
and loadId = maxLoadId
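Here is a small sketch of the derived-table approach run against the Monday/Tuesday/Wednesday scenario from the question. It assumes SQLite via Python's sqlite3 and a cut-down MyTable with only Customer and LOADID columns; the aliases t and latest are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Customer TEXT, LOADID INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [
        ("ASDA", 124), ("TESCO", 124), ("WAITROSE", 124),  # Monday load
        ("SAINSBURYS", 125),                               # Tuesday load
        ("TESCO", 126),                                    # Wednesday reload of Tesco
    ],
)

sql = """
    SELECT t.Customer, t.LOADID
      FROM MyTable t
      JOIN (SELECT Customer, MAX(LOADID) AS maxLoadId
              FROM MyTable
             GROUP BY Customer) latest      -- one row per customer with its newest load
        ON latest.Customer = t.Customer
       AND latest.maxLoadId = t.LOADID
     ORDER BY t.Customer
"""
latest_loads = conn.execute(sql).fetchall()
conn.close()
```

Tesco's older 124 rows drop out because only load 126 matches its per-customer maximum, while Asda and Waitrose keep 124 and Sainsburys keeps 125, all without one UNION branch per customer.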
I am trying to select the minimum price for each condition category. I did some searching and wrote the code below. However, it returns null for the selected fields. Is there a solution?
SELECT Sales.Sale_ID, Sales.Sale_Price, Sales.Condition
FROM Items
LEFT JOIN Sales ON ( Items.Item_ID = Sales.Item_ID
AND Sales.Expires_DateTime > NOW( )
AND Sales.Sale_Price = (
SELECT MIN( s2.Sale_Price )
FROM Sales s2
WHERE Sales.`Condition` = s2.`Condition` ) )
WHERE Items.ISBN =9780077225957
A little more complicated solution, but one that includes your Sale_ID is below.
SELECT Sale_Price, Sale_ID, `Condition`
FROM Sales
WHERE Sale_Price IN (SELECT MIN(Sale_Price)
FROM Sales
WHERE
Expires_DateTime > NOW()
AND
Item_ID IN
(SELECT Item_ID FROM Items WHERE ISBN = 9780077225957)
GROUP BY `Condition` )
LIMIT 1
The LIMIT 1 (MySQL's equivalent of TOP 1) is there in case more than one sale had the same minimum price and you only wanted one returned.
(Inner query taken directly from @Michael Ames's answer.)
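If the goal is one row per condition together with its Sale_ID, another option is to join Sales back to the grouped minimums on both the condition and the price. The sketch below assumes SQLite via Python's sqlite3, made-up sales rows, and a fixed "now" passed in as a parameter so the result is repeatable; the aliases s, m, cond and min_price are hypothetical.

```python
import sqlite3

ISBN = 9780077225957
NOW = "2024-01-01 00:00:00"  # stand-in for NOW(), fixed so the example is deterministic

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (Item_ID INTEGER, ISBN INTEGER);
    CREATE TABLE Sales (Sale_ID INTEGER, Item_ID INTEGER, Sale_Price REAL,
                        "Condition" TEXT, Expires_DateTime TEXT);
    -- hypothetical sample rows
    INSERT INTO Items VALUES (1, 9780077225957), (2, 111);
    INSERT INTO Sales VALUES
        (1, 1, 30.0, 'new',  '2099-01-01 00:00:00'),
        (2, 1, 25.0, 'new',  '2099-01-01 00:00:00'),
        (3, 1, 10.0, 'used', '2099-01-01 00:00:00'),
        (4, 1,  5.0, 'used', '2000-01-01 00:00:00'),   -- expired, must be ignored
        (5, 2,  1.0, 'new',  '2099-01-01 00:00:00');   -- different book, must be ignored
""")

sql = """
    SELECT s.Sale_ID, s.Sale_Price, s."Condition"
      FROM Sales s
      JOIN (SELECT "Condition" AS cond, MIN(Sale_Price) AS min_price
              FROM Sales
             WHERE Expires_DateTime > :now
               AND Item_ID IN (SELECT Item_ID FROM Items WHERE ISBN = :isbn)
             GROUP BY "Condition") m
        ON s."Condition" = m.cond
       AND s.Sale_Price = m.min_price
     WHERE s.Expires_DateTime > :now
       AND s.Item_ID IN (SELECT Item_ID FROM Items WHERE ISBN = :isbn)
     ORDER BY s."Condition"
"""
cheapest = conn.execute(sql, {"now": NOW, "isbn": ISBN}).fetchall()
conn.close()
```

Matching on the condition as well as the price keeps a cheap 'used' price from accidentally matching a 'new' sale. If two sales tie for the minimum within a condition, both rows come back.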
If you don't need Sales.Sale_ID, this solution is simpler:
SELECT MIN(Sale_Price), `Condition`
FROM Sales
WHERE Expires_DateTime > NOW()
AND Item_ID IN
(SELECT Item_ID FROM Items WHERE ISBN = 9780077225957)
GROUP BY `Condition`
Good luck!