How to put 0 in a column when no records were made in a day - sql

I'm trying to fetch the number of customers per date for the last 28 days. I also need to get 0 if there are no customers on a given day. However, I only know how to fetch the dates that have records, not the dates with 0 customers.
Instead of:
date customers
0 2022-01-02 1
1 2022-01-05 4
2 2022-01-06 1
I want to get
date customers
0 2022-01-01 0
1 2022-01-02 1
2 2022-01-03 0
3 2022-01-04 0
4 2022-01-05 4
5 2022-01-06 1

The simplest way is to add a calendar table that holds all dates in a single column and add a RIGHT OUTER JOIN to this table.
Another way is to use a CTE (recursive, or built with some CROSS JOINs) to generate the calendar dynamically, then join.
An old proverb says that you only find in a database what you put in it...
Of course, from a performance point of view the best approach is the fixed calendar table.
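To make the CTE variant concrete, here is a minimal sketch run through Python's built-in sqlite3 module. The table name visits, its columns, and the sample rows are assumptions made up to reproduce the counts in the question; the date arithmetic uses SQLite's date() function, so other engines will need their own syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table: one row per customer visit.
conn.execute("CREATE TABLE visits (visit_date TEXT, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("2022-01-02", 1),
        ("2022-01-05", 2), ("2022-01-05", 3), ("2022-01-05", 4), ("2022-01-05", 5),
        ("2022-01-06", 6),
    ],
)

query = """
WITH RECURSIVE calendar(day) AS (
    SELECT :first_day                      -- first date of the range
    UNION ALL
    SELECT date(day, '+1 day')             -- next date, until the range ends
    FROM calendar
    WHERE day < :last_day
)
SELECT c.day, COUNT(v.customer_id) AS customers   -- COUNT ignores NULLs, so empty days give 0
FROM calendar AS c
LEFT JOIN visits AS v ON v.visit_date = c.day
GROUP BY c.day
ORDER BY c.day
"""

rows = conn.execute(
    query, {"first_day": "2022-01-01", "last_day": "2022-01-06"}
).fetchall()

for day, customers in rows:
    print(day, customers)
```

For a rolling 28-day window in SQLite, the two bounds could be replaced by date('now', '-27 days') and date('now'). The calendar is on the left side of a LEFT JOIN here, which is equivalent to a RIGHT OUTER JOIN from the data to the calendar.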

Related

How to join a fact table to a dimension table which has a duplicate key value while avoiding duplications in fact table?

How do I join a fact table to a dimension table with a duplicate key value, while at the same time avoiding duplication in the fact table that would result from the join?
Dimension table: [image not included]
Fact table: [image not included]
Product look-up table (another dimension table): [image not included]
I thought of using the activation date as the next unique value, but they share a month in common.
I thought of creating a snowflake schema which connects the dimension table in question (marketing campaigns) to the product dimension, which in turn connects to the fact table with no issues.
edit:
I am designing a data warehouse which should answer how effective marketing campaigns are, based on purchase data.
Purchase data which will be the core of my fact table looks like this:
product_id timestamp sales_price user_id
1 5/9/2015 120 124
2 6/9/2015 150 129
the product lookup table looks like this:
id product_name model production_cost
6 ring 2019 300
5 headband 2018 200
the marketing campaigns look up table looks like this:
startdate enddate type amount_spent currency product_id
1/1/2019 7/1/2019 print 100,000 USD 6
6/1/2019 1/1/2020 socialmedia 10,000,000 USD 6
6/1/2019 1/1/2020 socialmedia 10,000,000 USD 3
The issue is that the marketing table has duplicate product id value of 6. So, when I use it as my natural key to create a surrogate primary key for that dimension table and pull that surrogate key to the fact table as a foreign key it's going to cause duplications for anything with product_id of 6 (as it's not unique). How do I connect marketing campaigns data to fact table, whilst keeping the data integrity intact -- that is no duplications?
I thought about combining start/end date with product_id to create a composite primary key, but they share/overlap a month (6/1/2019 to 7/1/2019)
I also thought about connecting the purchases (fact table) to product lookup and then product to marketing campaigns (a snowflake schema) to avoid the duplication.
I suggest you take the time to read the details of dimensional database design.
If you mean dimensional design, there is no such thing as a lookup table there; there is either a Slowly Changing Dimension (SCD), or just a Dimension. Your product lookup table could be a product dimension. Your Dimension table looks imperfect, too: it does contain the element of time, but not correctly. You need - usually in this order:
a completely arbitrary integer as a surrogate, primary, key - often populated by a sequence or defined as IDENTITY
a business identifier - that could be the SKU for a product, first part of a business unique identifier
the valid-from-date, second part of a business unique identifier
the valid-to-date, '9999-12-31' for the current row, or equal to the valid-from-date of its successor
Type 1 attributes, those for which no history is kept (a change simply overwrites the old value)
Type 2 attributes, those that change over time and need a new row every time they change
There can be more columns: the Boolean current-indicator, and an inserted timestamp and an updated timestamp.
The fact table is populated from the source transactions, after the dimension table. For each transaction row, you join with the SCD table on the business identifier (SKU in our case), which must be equal, and on the transaction's timestamp, which must be greater than or equal to the valid-from-date and less than the valid-to-date. You pick the surrogate key of the row found in the SCD to populate the fact table's foreign key.
This is an exemplary, minimal, customer SCD table, without the insert/change timestamps and without the current-indicator:
c_key  c_id  c_from_dt   c_to_dt     c_fname           c_lname      c_loy_lvl  c_org_id
66459  1     2022-01-25  9999-12-31  Arthur            Dent         1          1
34168  2     2022-01-25  9999-12-31  Ford              Prefect      2          2
2284   3     2021-12-25  9999-12-31  Zaphod            Beeblebrox   3          3
84768  4     2021-12-25  9999-12-31  Tricia            McMillan     4          4
80080  5     2022-01-25  9999-12-31  Gag               Halfrunt     5          5
57458  6     2022-01-25  9999-12-31  Prostetnic Vogon  Jeltz        6          6
1076   7     2022-01-25  9999-12-31  Lionel            Prosser      1          0
9782   8     2021-12-25  9999-12-31  Benji             Mouse        2          1
42655  9     2021-12-25  9999-12-31  Frankie           Mouse        3          2
57348  10    2021-09-25  2021-10-25  Wonko             The Sane     1          3
22279  10    2021-10-25  2021-11-25  Wonko             The Sane     2          3
3675   10    2021-11-25  2021-12-25  Wonko             The Sane     3          3
95534  10    2021-12-25  2022-01-25  Wonko             The Sane     4          3
69529  10    2022-01-25  9999-12-31  Wonko             The Sane     5          3
34845  11    2022-01-25  9999-12-31  Eccentrica        Gallumbitis  6          4
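The fact-table lookup described above can be sketched as follows, using Python's sqlite3 module and the five history rows for customer 10 (Wonko) from the table. The src_purchase table and its three transactions are hypothetical, invented only to exercise the join. The same pattern would apply to a marketing-campaign dimension keyed on product_id plus a validity range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Subset of the customer SCD: the history rows of business id 10.
CREATE TABLE customer_scd (
    c_key INTEGER PRIMARY KEY,   -- surrogate key
    c_id INTEGER,                -- business identifier
    c_from_dt TEXT,              -- valid-from (inclusive)
    c_to_dt TEXT,                -- valid-to (exclusive)
    c_loy_lvl INTEGER
);
INSERT INTO customer_scd VALUES
    (57348, 10, '2021-09-25', '2021-10-25', 1),
    (22279, 10, '2021-10-25', '2021-11-25', 2),
    (3675,  10, '2021-11-25', '2021-12-25', 3),
    (95534, 10, '2021-12-25', '2022-01-25', 4),
    (69529, 10, '2022-01-25', '9999-12-31', 5);

-- Hypothetical source transactions (not from the original post).
CREATE TABLE src_purchase (c_id INTEGER, purchase_ts TEXT, sales_price REAL);
INSERT INTO src_purchase VALUES
    (10, '2021-10-03 14:05:00', 120.0),
    (10, '2021-12-25 00:00:00', 150.0),
    (10, '2022-03-01 09:30:00', 99.0);
""")

# Each transaction matches exactly one SCD row: same business id, and
# valid-from <= transaction date < valid-to (half-open interval).
fact_rows = conn.execute("""
SELECT d.c_key, p.purchase_ts, p.sales_price
FROM src_purchase AS p
JOIN customer_scd AS d
  ON d.c_id = p.c_id
 AND date(p.purchase_ts) >= d.c_from_dt
 AND date(p.purchase_ts) <  d.c_to_dt
ORDER BY p.purchase_ts
""").fetchall()

for row in fact_rows:
    print(row)
```

Because the validity intervals are half-open and do not overlap, the purchase made exactly on 2021-12-25 picks up surrogate key 95534 only, so the join cannot duplicate fact rows.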

SQL Query - Identifying entries between payment dates greater than 6 years

I have this table (in reality it has more fields but for simplicity, it will demonstrate what I'm after)
Payment_Type  Person ID  Payment_date  Payment_Amount
Normal        1          2015-01-01    £1.00
Normal        1          2017-01-01    £2.00
Reversal      1          2022-01-09    £3.00
Normal        2          2016-12-29    £3.00
Reversal      2          2022-01-02    £4.00
I need 2 specific things from this:
I need all entries where there is over 6 years' difference between any given payment dates (when it's been greater than or equal to 6 years from the date of the latest payment date). I don't need to count them, I just need it to return all the entries that meet this criterion.
I also need it to specify where a normal payment hasn't been made for 6 years or more from today's date but a reversal has occurred within the last 6 years. (This might need to be a separate query, but I will take suggestions.)
I'm using Data Lake (Hue).
Thank you.
I've tried to run a sub query with join and union but I'm not getting the desired results so will need to start from scratch. Any advice/insight on this is greatly appreciated.
Ideally, query one will show:
Payment_Type  Person ID  Payment_date  Payment_Amount
Normal        1          2015-01-01    £1.00
Normal        1          2017-01-01    £2.00
Normal        2          2016-12-29    £3.00
Query 2 results should show:
Payment_Type  Person ID  Payment_date  Payment_Amount
Normal        1          2017-01-01    £2.00
Reversal      1          2022-01-09    £3.00
Normal        2          2016-12-29    £3.00
Reversal      2          2022-01-02    £4.00

SQL / SQLite Query that counts the number of text occurrences based on date

Below I have an example of a query that I am trying to perform. The INPUT is the table that I have access to. The OUTPUT is how I would like to organize the data. I am trying to group the data by date and output the number of UPs and the number of DOWNs, so each output row will be a string, int, int.
This query is a little more advanced than I've done before, but I think I will need something similar to this:
SELECT date, COUNT(*), COUNT(*) FROM Input GROUP BY date
I am not sure how to format the COUNTs to look for only UP or DOWN on a particular date. Could someone help point me in the right direction?
INPUT
date        time      status
2022-01-01  12:12:12  UP
2022-01-01  13:12:12  DOWN
2022-01-01  14:12:12  UP
2022-02-04  12:12:12  UP
2022-02-04  13:12:12  DOWN
2022-02-04  14:12:12  DOWN
2022-03-05  12:12:12  UP
2022-03-05  13:12:12  UP
2022-03-05  14:12:12  DOWN
OUTPUT
date        # of UP  # of DOWN
2022-01-01  2        1
2022-02-04  1        2
2022-03-05  2        1
You can try SUM with a conditional CASE expression:
select date,
  Sum(case when status='UP' then 1 else 0 end) NumUps,
  Sum(case when status='DOWN' then 1 else 0 end) NumDowns
from Input
group by date;
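Here is a small runnable check of the conditional-aggregation idea against the question's sample rows, using Python's sqlite3 module. The aliases num_up and num_down are my own naming. Keep in mind that SQLite compares strings case-sensitively by default, so the literals have to match the stored 'UP' and 'DOWN' exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Input (date TEXT, time TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO Input VALUES (?, ?, ?)",
    [
        ("2022-01-01", "12:12:12", "UP"),
        ("2022-01-01", "13:12:12", "DOWN"),
        ("2022-01-01", "14:12:12", "UP"),
        ("2022-02-04", "12:12:12", "UP"),
        ("2022-02-04", "13:12:12", "DOWN"),
        ("2022-02-04", "14:12:12", "DOWN"),
        ("2022-03-05", "12:12:12", "UP"),
        ("2022-03-05", "13:12:12", "UP"),
        ("2022-03-05", "14:12:12", "DOWN"),
    ],
)

# Each SUM adds 1 only for rows whose status matches, so one pass over the
# table yields both counts per date.
rows = conn.execute("""
SELECT date,
       SUM(CASE WHEN status = 'UP'   THEN 1 ELSE 0 END) AS num_up,
       SUM(CASE WHEN status = 'DOWN' THEN 1 ELSE 0 END) AS num_down
FROM Input
GROUP BY date
ORDER BY date
""").fetchall()

for row in rows:
    print(row)
```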

Apply a discount to order if user already ordered something else

I have a table with users, a table with levels, a table for submitted orders and processed orders.
Here's what the submitted orders looks like:
OrderId UserId Level_Name Discounted_Price Order_Date Price
1 1 OLE Core 0 2020-11-01 00:00:00.000 19.99
2 1 Xandadu 1 2020-11-01 00:00:00.000 0
3 2 Xandadu 0 2020-12-05 00:00:00.000 5
4 1 Eldorado 1 2021-01-31 00:00:00.000 9
5 2 Eldorado 0 2021-02-20 00:00:00.000 10
6 2 Birmingham Blues NULL 2021-07-10 00:00:00.000 NULL
What I am trying to do:
UserId 2 has an order for Birmingham Blues. They have already ordered Eldorado and so qualify for a discount on their Birmingham Blues order. Is there a way to check the entire table for this pattern and, if it exists, update Discounted_Price to 1 and change the price to, let's say, 10 for the Birmingham Blues order?
EDIT: I have researched the use of cursors, which I'm sure will do the job but they seem complicated and was hoping a simpler solution would be possible. A lot of threads seem to also avoid using cursors. I also looked at this question: T-SQL: Deleting all duplicate rows but keeping one and was thinking I could potentially use the answer to that in some way.
Based on your description and further comments, the following should hopefully meet your requirements - updating the row for the specified User where the values are currently NULL and the user has a qualifying existing order:
update s set
  s.Discounted_Price = 1,
  Price = 10
from submitted_Orders s
where s.userId = 2
  and s.Level_Name = 'Birmingham Blues'
  and s.Discounted_Price is null
  and s.Price is null
  and exists (
    select * from submitted_orders so
    where so.userId = s.userId
      and so.Level_Name = 'Eldorado'
      and so.Order_Date < s.Order_Date
  );
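To try the same idea without a SQL Server instance, the sketch below ports it to SQLite through Python's sqlite3 module. Two things differ from the statement above and are my assumptions, not part of the answer: the UPDATE uses a plain correlated EXISTS instead of T-SQL's UPDATE ... FROM, and the userId = 2 filter is dropped so that every user's pending Birmingham Blues order is checked, as the question asks for the entire table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE submitted_orders (
    OrderId INTEGER PRIMARY KEY, UserId INTEGER, Level_Name TEXT,
    Discounted_Price INTEGER, Order_Date TEXT, Price REAL
)""")
conn.executemany(
    "INSERT INTO submitted_orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "OLE Core", 0, "2020-11-01", 19.99),
        (2, 1, "Xandadu", 1, "2020-11-01", 0),
        (3, 2, "Xandadu", 0, "2020-12-05", 5),
        (4, 1, "Eldorado", 1, "2021-01-31", 9),
        (5, 2, "Eldorado", 0, "2021-02-20", 10),
        (6, 2, "Birmingham Blues", None, "2021-07-10", None),
    ],
)

# Set the discount on unpriced Birmingham Blues orders whose user has an
# earlier Eldorado order. The subquery is correlated on UserId.
cur = conn.execute("""
UPDATE submitted_orders
SET Discounted_Price = 1,
    Price = 10
WHERE Level_Name = 'Birmingham Blues'
  AND Discounted_Price IS NULL
  AND Price IS NULL
  AND EXISTS (
      SELECT 1
      FROM submitted_orders AS so
      WHERE so.UserId = submitted_orders.UserId
        AND so.Level_Name = 'Eldorado'
        AND so.Order_Date < submitted_orders.Order_Date
  )
""")
updated = cur.rowcount

order6 = conn.execute(
    "SELECT Discounted_Price, Price FROM submitted_orders WHERE OrderId = 6"
).fetchone()
print(updated, order6)
```

No cursor is needed: the set-based UPDATE evaluates the EXISTS condition per candidate row.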

SQL: Display joined data on a day to day basis anchored on a start date

Perhaps my title is misleading, but I am not sure how else to phrase this. I have two tables, tblL and tblDumpER. They are joined based on the field SubjectNumber. This is a one (tblL) to many (tblDumpER) relationship.
I need to write a query that will give me, for all my subjects, a value from tblDumpER associated with a date in tblL. This is to say:
SELECT tblL.SubjectNumber, tblDumpER.ER_Q1
FROM tblL
LEFT JOIN tblDumpER ON tblL.SubjectNumber=tblDumpER.SubjectNumber
WHERE tblL.RandDate=tblDumpER.ER_DATE And tblDumpER.ER_Q1 Is Not Null
This is straightforward enough. My problem is the value RandDate from tblL is different for every subject. However, it needs to be displayed as Day1 so I can have tblDumpER.ER_Q1 as Day1 for every subject. Then I need RandDate+1 As Day2, etc until I hit either null or Day84. The 'dumb' solution is to write 84 queries. This is obviously not practical. Any advice would be greatly appreciated!
I appreciate the responses so far but I don't think that I'm explaining this correctly so here is some example data:
SubjectNumber RandDate
1001 1/1/2013
1002 1/8/2013
1003 1/15/2013
SubjectNumber ER_DATE ER_Q1
1001 1/1/2013 5
1001 1/2/2013 6
1001 1/3/2013 2
1002 1/8/2013 1
1002 1/9/2013 10
1002 1/10/2013 8
1003 1/15/2013 7
1003 1/16/2013 4
1003 1/17/2013 3
Desired outcome:
(Where Day1=RandDate, Day2=RandDate+1, Day3=RandDate+2)
SubjectNumber Day1_ER_Q1 Day2_ER_Q1 Day3_ER_Q1
1001 5 6 2
1002 1 10 8
1003 7 4 3
This data is then going to be plotted on a graph with Day# on the X-axis and ER_Q1 on the Y-axis
I would do this in two steps:
Create a query that gets the MIN date for each SubjectNumber
Join this query to your existing query, so you can perform a DATEDIFF calculation on the MIN date and the date of the current record.
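The two steps could look roughly like this, run through Python's sqlite3 module. Some assumptions on my part: the dates are stored as ISO strings (YYYY-MM-DD) rather than the M/D/YYYY shown in the question, SQLite's julianday() stands in for DATEDIFF, and the final pivot into Day1..Day3 columns via conditional aggregation is an addition that the two steps do not spell out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblDumpER (SubjectNumber INTEGER, ER_DATE TEXT, ER_Q1 INTEGER)"
)
conn.executemany(
    "INSERT INTO tblDumpER VALUES (?, ?, ?)",
    [
        (1001, "2013-01-01", 5), (1001, "2013-01-02", 6), (1001, "2013-01-03", 2),
        (1002, "2013-01-08", 1), (1002, "2013-01-09", 10), (1002, "2013-01-10", 8),
        (1003, "2013-01-15", 7), (1003, "2013-01-16", 4), (1003, "2013-01-17", 3),
    ],
)

query = """
WITH first_day AS (              -- step 1: MIN date per subject
    SELECT SubjectNumber, MIN(ER_DATE) AS day1
    FROM tblDumpER
    GROUP BY SubjectNumber
),
numbered AS (                    -- step 2: day offset of every record
    SELECT e.SubjectNumber,
           e.ER_Q1,
           CAST(julianday(e.ER_DATE) - julianday(f.day1) AS INTEGER) + 1 AS day_no
    FROM tblDumpER AS e
    JOIN first_day AS f ON f.SubjectNumber = e.SubjectNumber
)
SELECT SubjectNumber,            -- pivot: one column per day number
       MAX(CASE WHEN day_no = 1 THEN ER_Q1 END) AS Day1_ER_Q1,
       MAX(CASE WHEN day_no = 2 THEN ER_Q1 END) AS Day2_ER_Q1,
       MAX(CASE WHEN day_no = 3 THEN ER_Q1 END) AS Day3_ER_Q1
FROM numbered
GROUP BY SubjectNumber
ORDER BY SubjectNumber
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If Day1 has to be tblL.RandDate even when no record exists on that date, first_day can be replaced by a join to tblL. For plotting Day# against ER_Q1, the long form produced by the numbered step (SubjectNumber, day_no, ER_Q1) may already be enough, which avoids writing 84 CASE columns.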
I'm not entirely sure of what it is that you need, but perhaps a calendar table would be of help. Just create a local table that contains all of the days of the year in it, then use that table to JOIN your dates up?