Is there a function in SQL that automatically generates more rows by month? - sql

I've got a large database that's got all our transactions and shipping costs from them, here's a simplified version:
Source Table
Date      ROUTE      Cost
01/20/21  USA to UK  $40
01/01/21  USA to UK  $40
01/10/21  USA to UK  $40
12/20/20  USA to UK  $30
11/20/20  USA to UK  $20
11/20/20  USA to UK  $20
And I want to see the average cost by month, so it would look like:
Route      Nov 2020  Dec 2020  Jan 2021
USA to UK  $20       $30       $40
How do I write code that I can rerun when, say, April comes around and I have to refresh this table, without having to create new columns for Feb, March, etc.?

Here is one way to do it with PIVOT in Snowflake: https://docs.snowflake.com/en/sql-reference/constructs/pivot.html
Assuming "monthname" is a column you derived from your "Date" column, something like this should help:
select * from yourTable
pivot(avg(cost) for monthname in ('January', 'February', 'March', 'April'))
order by route;
The values in your monthname column must match the ones listed in the parentheses.
This is a static solution, so you still have to adjust the list of months in the code every month. Writing a stored procedure that builds the statement would probably help here: https://docs.snowflake.com/en/sql-reference/stored-procedures.html
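To illustrate what such a procedure would have to do, here is a minimal sketch in Python that only builds the PIVOT statement text from a list of month names. The function name build_pivot_sql and the hard-coded month list are assumptions for illustration; in practice the list would come from something like a SELECT DISTINCT on the monthname column. It uses avg because the question asks for the average cost.

```python
def build_pivot_sql(table, months):
    """Return a Snowflake-style PIVOT statement covering the given month names.

    `table` is interpolated as-is, so it must be a trusted identifier.
    """
    # Quote each month as a SQL string literal, doubling embedded single quotes.
    in_list = ", ".join("'" + m.replace("'", "''") + "'" for m in months)
    return (
        f"select * from {table}\n"
        f"pivot(avg(cost) for monthname in ({in_list}))\n"
        "order by route;"
    )


# Hypothetical input: the months currently present in the data.
months = ["November", "December", "January"]
sql = build_pivot_sql("yourTable", months)
print(sql)
```

When a new month appears in the data, the generated IN list grows with it, so nothing has to be edited by hand.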

Related

How to calculate the sum across different tables separated by year, and what is the best data model to support it? Power BI

Desired Result
Choose a year from a slicer and return the summed result in a scorecard.
Problems
No single measure can be put in the scorecard that calculates across all tables when a year is selected in the slicer.
e.g.
sum(table20[Score])
sum(table21[Score])
sum(table22[Score])
I have to avoid appending the tables, as each single-year table contains millions of rows.
All tables are imported from a Power BI dataflow without using DirectQuery.
Data
table20
YEAR GEO Type Score
2020 Asia A 1
2020 Africa A 2.5
2020 CentralAfrica A 0
2020 Europe A 0
2020 MiddleEast A 0
2020 America A 1.5
table21
YEAR GEO Type Score
2021 Asia A 3
2021 Africa A 2
2021 CentralAfrica A 6
2021 Europe A 1
2021 MiddleEast A 2
2021 America A 8
table22
YEAR GEO Type Score
2022 Asia A 4
2022 Africa A 0
2022 CentralAfrica C 3
2022 Europe C 4
2022 MiddleEast A 1
2022 America A 5
Relationship
A calendar table links those tables.
Calander = CALENDAR(DATE(2020,1,1),DATE(2022,12,31))
What is the best way to achieve this without appending the tables into one larger dataset?
I am not sure whether aggregations based on imported tables can help with performance.
Please advise if my concept is fundamentally wrong, e.g. if appending the tables cannot be avoided.

How can I pull out the second highest product usage from a SQL Server table?

We have a product usage table for software. It has 4 fields: [product name], [usage month], [users] and [Country]. We must report the data by Country and Product Name for licensing purposes. Our rule is to report the second highest number of users per country for each product. The same products can be used in all countries. It is based on monthly usage numbers, so we want the second peak usage for FY 2020. Since all of the data is in one table, I am having trouble figuring out the SQL to get the information I need from the table.
I am thinking I need to do multiple selects (inner select? ) and group the data in a way to pull out the product name, peak usage and country. But that is where I am getting confused as to the best approach.
Example Data looks like this:
[product name], [usage month], [users], [Country]
Product1 January 831 United States of America
Product1 December 802 United States of America
Product1 September 687 United States of America
Product1 August 407 United States of America
Product1 July 799 United States of America
Product1 June 824 United States of America
Product1 April 802 United States of America
Product1 May 796 United States of America
Product1 February 847 United States of America
Product1 March 840 United States of America
Product1 November 818 United States of America
Product1 October 841 United States of America
Product2 March 1006 United States of America
Product2 February 1076 United States of America
Product2 April 890 United States of America
Product2 May 831 United States of America
Product2 September 538 United States of America
Product2 October 1053 United States of America
Product2 July 673 United States of America
Product2 August 87 United States of America
Product2 November 994 United States of America
Product2 January 1042 United States of America
Product2 December 952 United States of America
Product2 June 873 United States of America
I had originally thought about breaking this out into multiple tables and then trying SQL against each product table, but since this is something I will need to do monthly, I didn't want to redesign the ETL that loads the data because 1) I don't control that ETL and 2) I felt like that would be a move backwards for a repetitive task. We were also looking into Power BI to do this for us, but haven't found the right approach, and I would honestly rather have this in SQL.
If I follow you correctly:
select *
from (
    select t.*,
        row_number() over(partition by product_name, country order by users desc) rn
    from mytable t
) t
where rn = 2
This generates one row per product and country, that corresponds to the second highest number of users.
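As a quick check of the row_number() approach, here is a sketch that runs the same query shape in an in-memory SQLite database through Python, using a few of the rows from the question. The column names product_name and usage_month are assumed stand-ins for [product name] and [usage month], and window functions need SQLite 3.25 or newer.

```python
import sqlite3

usa = "United States of America"
# A subset of the sample rows from the question.
rows = [
    ("Product1", "January", 831, usa),
    ("Product1", "February", 847, usa),
    ("Product1", "October", 841, usa),
    ("Product1", "March", 840, usa),
    ("Product2", "February", 1076, usa),
    ("Product2", "October", 1053, usa),
    ("Product2", "January", 1042, usa),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (product_name TEXT, usage_month TEXT, users INTEGER, country TEXT)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)

# Rank the months within each product and country by users, then keep rank 2.
query = """
SELECT product_name, country, usage_month, users
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY product_name, country
                              ORDER BY users DESC) AS rn
    FROM mytable t
) AS ranked
WHERE rn = 2
ORDER BY product_name
"""
second_highest = conn.execute(query).fetchall()
for row in second_highest:
    print(row)
```

For this subset the second peak is October for both products (841 and 1053 users). If two months tie for the top value, row_number() still numbers them 1 and 2, so the tied value is what comes back; dense_rank() would be the choice if the second highest distinct value is wanted instead.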
For one country it should be fairly simple. This is off the top of my head, so it may need a bit of tweaking, and the table and column names are guesses that are likely off.
SELECT users
FROM ProductCounts
WHERE Country = @Country
ORDER BY users DESC
-- skip the highest value and return the next one (SQL Server 2012 or later)
OFFSET 1 ROWS FETCH NEXT 1 ROWS ONLY;
I don't really get a sense of how your data is entered to get a good feel of a better way to store the data to get the information you desire for your report.
You can use this. It returns the second highest user count for each country and product. Note that a country and product with only one distinct user count will not show up; there have to be at least two.
SELECT DISTINCT
    country, product, users
FROM
    ProductCounts
WHERE
    -- a row holds the second highest value when exactly two distinct values are >= it
    (SELECT COUNT(DISTINCT p.users) FROM ProductCounts AS p
     WHERE
         p.country = ProductCounts.country
         AND
         p.product = ProductCounts.product
         AND
         p.users >= ProductCounts.users) = 2
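Here is a small SQLite check of this counting approach, run from Python. It is a sketch with made-up rows, and it includes one choice of my own: COUNT(DISTINCT ...) together with SELECT DISTINCT, so that tied values are counted once and the result is the second highest distinct value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ProductCounts (country TEXT, product TEXT, users INTEGER)")
# Hypothetical data: Product1 has a tie at second place, Product2 has one row only.
conn.executemany(
    "INSERT INTO ProductCounts VALUES (?, ?, ?)",
    [
        ("US", "Product1", 100),
        ("US", "Product1", 90),
        ("US", "Product1", 90),
        ("US", "Product1", 80),
        ("US", "Product2", 50),
    ],
)

# Keep a row when exactly two distinct user counts in its group are >= its own.
query = """
SELECT DISTINCT country, product, users
FROM ProductCounts AS c
WHERE (SELECT COUNT(DISTINCT p.users)
       FROM ProductCounts AS p
       WHERE p.country = c.country
         AND p.product = c.product
         AND p.users >= c.users) = 2
"""
result = conn.execute(query).fetchall()
print(result)
```

Product2 does not appear because it has only one user count, which matches the caveat above.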

Normalize monthly payments

First, sorry for my bad English. I'm trying to normalize a table in a pension system where subscribers are paid monthly. I need to know who has been paid and who has not and how much they've been paid. I believe I'm using SQL Server. Here's an example:
id_subscriber id_receipt year month pay_value payment type_pay
12 1 2016 January 100 80 1
13 1 2016 January 100 100 1
14 1 2016 January 100 100 1
12 2 2016 February 100 100 2
13 2 2016 February 100 80 1
But I'm not happy repeating the year and the month for every single subscriber. It doesn't seem right. Is there a better way to store this data?
EDIT:
The case is as follows: this company has many subscribers who must pay monthly, and payment can be made in various ways. They produce a single receipt for many customers, and for each customer that receipt may cover one or more installments.
These are my other tables:
tbl_subscriber
id_suscriber(PK) first_name last_name address tel_1 tel_2
12 Juan Perez xxx xxx xxx
13 Pedro Lainez xxx xxx xxx
14 Maria Lopez xxx xxx xxx
tbl_receipt
id_receipt(PK) value elaboration_date deposit_date
1 1,000.00 2015-09-16 2015-09-20
2 890.00 2015-12-01 2015-12-18
tbl_type_paym
id type description
1 bank xxxx
2 ventanilla xxx
This basically seems fine. You could split dates out into a separate table and reference that, but that strikes me as a rather silly way to do it. I would recommend storing the month as an integer instead of a varchar column, though. Besides not storing the same string over and over, you can do comparisons more sensibly.
You could also use date values, although that might not be worth the trouble when you don't want greater granularity than the month.
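To show what the integer month buys you, here is a minimal sketch using SQLite from Python with the rows from the question. The table name tbl_payment is an assumption, since the question does not name this table. With month stored as 1..12, range filters and chronological ordering work directly, whereas the names would sort 'February' before 'January'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tbl_payment (          -- hypothetical table name
        id_subscriber INTEGER,
        id_receipt    INTEGER,
        year          INTEGER,
        month         INTEGER,          -- 1..12 instead of 'January', 'February', ...
        pay_value     INTEGER,
        payment       INTEGER,
        type_pay      INTEGER
    )
""")
conn.executemany(
    "INSERT INTO tbl_payment VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (12, 1, 2016, 1, 100, 80, 1),
        (13, 1, 2016, 1, 100, 100, 1),
        (14, 1, 2016, 1, 100, 100, 1),
        (12, 2, 2016, 2, 100, 100, 2),
        (13, 2, 2016, 2, 100, 80, 1),
    ],
)

# Who paid less than they owed in January or February 2016, in chronological order?
underpaid = conn.execute("""
    SELECT id_subscriber, year, month, pay_value - payment AS shortfall
    FROM tbl_payment
    WHERE payment < pay_value
      AND year = 2016
      AND month BETWEEN 1 AND 2
    ORDER BY year, month, id_subscriber
""").fetchall()
print(underpaid)
```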

Teradata Default List

There is one table that contains the amounts and states that I need. However, this table contains yearly information, and I want it by month. For example, the table shows information for Kentucky for 2011, and that's it. For California it shows about 5 different years. But I need it repeated by month.
So if in 2011 Kentucky had 12 total, then I need a query that shows 12 for January, February, May, and so on, repeatedly.
Right now I get this output with a dumb query I have:
Kentucky 12 January
California 800 January
This is done easily by grouping by State, Quantity and Month
I want to make sure that no matter what the Quantity is, each State has ALL months
Kentucky 12 January
Kentucky 12 February
Kentucky 12 May
California 800 January
California 800 February
California 800 May
Any idea on how to do this with Teradata SQL?
The overall query would look something like this:
SELECT
    state_quantities.state,
    state_quantities.quantity,
    all_months.month_name
FROM state_quantities
CROSS JOIN (
    ...
) all_months
What goes between the brackets for all_months depends on what you mean by "all months".
If you mean all months that appear in state_quantities irrespective of state (so if you have Kentucky with January, California with February and Florida with May, you'd only get those three months) you could use something like this:
SELECT
    month_name
FROM state_quantities
GROUP BY month_name
If you want all 12 months, you would join to a table containing all 12 months. In the absence of that, you could use sys_calendar.calendar (syntax below might be off):
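SELECT
    -- the inner CAST formats the date as a month name; the outer CAST turns it
    -- into text so that the grouping is by month name and not by calendar date
    CAST(CAST(calendar_date AS FORMAT 'MMMM') AS VARCHAR(9)) AS month_name
FROM sys_calendar.calendar
GROUP BY 1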
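The cross join idea can be tried out without a Teradata system. The sketch below uses SQLite from Python, with a small all_months helper table standing in for the derived table in the parentheses; the month_no column is an assumption added only so the months can be sorted in calendar order.

```python
import sqlite3

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE state_quantities (state TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO state_quantities VALUES (?, ?)",
    [("Kentucky", 12), ("California", 800)],
)

# Helper table with one row per month (stand-in for the all_months subquery).
conn.execute("CREATE TABLE all_months (month_no INTEGER, month_name TEXT)")
conn.executemany(
    "INSERT INTO all_months VALUES (?, ?)",
    list(enumerate(MONTH_NAMES, start=1)),
)

# Every state row is paired with every month row.
rows = conn.execute("""
    SELECT s.state, s.quantity, m.month_name
    FROM state_quantities AS s
    CROSS JOIN all_months AS m
    ORDER BY s.state, m.month_no
""").fetchall()
for row in rows[:3]:
    print(row)
```

Two states times twelve months gives 24 rows, each carrying the state's quantity unchanged.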

How can I create a tabular report in SQL when the column names are in the database, not the query?

http://www.geocities.com/colinpriley/sql/sqlitepg09.htm has a nice technique for creating a tabular report where the column names for the table can be coded in the query but in my case, the columns should be values from the database. Say I have daily sales figures like:
Transaction Date Rep Product Amount
1 July 1 Bob A12 $10
2 July 2 Bob B24 $12
3 July 2 Ted A12 $25
...
and I want a weekly summary report that shows how much of each product each rep sold:
Rep  A12  B24
Bob  $10  $12
Ted  $25  $0
My column names come from the Product column. Say, any product that has a row in the specified date range should have a column in the report. But other products -- which weren't sold in that time frame -- should not have a column of all 0s. How can I do that? Bonus points if it works in SQLite.
TIA.
http://weblogs.asp.net/wallen/archive/2005/02/18/376150.aspx has a good way to extract columns
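Since the question asks about SQLite, here is one possible sketch (not the technique from either linked page): look up the products sold in the date range first, then generate one SUM(CASE ...) column per product. The table name sales, its column names, the year 2024, the C36 row, and the function weekly_report are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER, sale_date TEXT, rep TEXT, product TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2024-07-01", "Bob", "A12", 10),
        (2, "2024-07-02", "Bob", "B24", 12),
        (3, "2024-07-02", "Ted", "A12", 25),
        (4, "2024-07-20", "Ted", "C36", 99),  # hypothetical row outside the week
    ],
)


def weekly_report(conn, start, end):
    """Return (header, rows) with one column per product sold between start and end."""
    products = [
        r[0]
        for r in conn.execute(
            "SELECT DISTINCT product FROM sales "
            "WHERE sale_date BETWEEN ? AND ? ORDER BY product",
            (start, end),
        )
    ]
    if not products:
        return ["rep"], []
    # Product values are bound as parameters; only the column alias is
    # interpolated, quoted as an identifier with embedded quotes doubled.
    cols = ", ".join(
        'SUM(CASE WHEN product = ? THEN amount ELSE 0 END) AS "{}"'.format(
            p.replace('"', '""')
        )
        for p in products
    )
    sql = (
        f"SELECT rep, {cols} FROM sales "
        "WHERE sale_date BETWEEN ? AND ? GROUP BY rep ORDER BY rep"
    )
    cur = conn.execute(sql, (*products, start, end))
    header = [d[0] for d in cur.description]
    return header, cur.fetchall()


header, rows = weekly_report(conn, "2024-07-01", "2024-07-07")
print(header)
print(rows)
```

Products with no sales in the range never get a column, because the column list is built from the same date filter as the report itself.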