How to write an SQL statement that selects data broken up by each month of the year? - sql

I am looking for a way to write an SQL statement that selects data for each month of the year, separately.
In the SQL statement below, I am trying to count the number of instances in the TOTAL_PRECIP_IN and TOTAL_SNOWFALL_IN columns when either column is greater than 0. In my data table, I have information for those two columns ("TOTAL_PRECIP_IN" and "TOTAL_SNOWFALL_IN") for each day of the year (365 total entries).
I want to break up my data by each calendar month, but am not sure of the best way to do this. In the statement below, I am using a UNION statement to break up the months of January and February. If I keep using UNION statements for the remaining months of the year, I can get the answer I am looking for. However, using 11 different UNION statements cannot be the optimal solution.
Can anyone give me a suggestion how I can edit my SQL statement to measure from the first day of the month, to the last day of the month for every month of the year?
select monthname(OBSERVATION_DATE) as "Month", sum(case when TOTAL_PRECIP_IN or TOTAL_SNOWFALL_IN > 0 then 1 else 0 end) AS "Days of Rain" from EMP_BASIC
where OBSERVATION_DATE between '2019-01-01' and '2019-01-31'
and CITY = 'Olympia'
group by "Month"
UNION
select monthname(OBSERVATION_DATE) as "Month", sum(case when TOTAL_PRECIP_IN or TOTAL_SNOWFALL_IN > 0 then 1 else 0 end) from EMP_BASIC
where OBSERVATION_DATE between '2019-02-01' and '2019-02-28'
and CITY = 'Olympia'
group by "Month"

Your table structure is too unclear for me to give you the exact query you need. But the general idea is to sum your value and then group by the month name and/or the month number. Since you wrote that you only want values greater than 0, you can put this condition in the WHERE clause. So your query will be something like this:
SELECT MONTHNAME(yourdate) AS month,
MONTH(yourdate) AS monthnr,
SUM(yourvalue) AS yoursum
FROM yourtable
WHERE yourvalue > 0
GROUP BY MONTHNAME(yourdate), MONTH(yourdate)
ORDER BY MONTH(yourdate);
I created an example here: db<>fiddle
You might need to modify this general construct for your concrete purpose (for example, to handle different years, NULL values, etc.). Note that this is an example for a MySQL database, because you wrote about MONTHNAME(), which is mostly used in MySQL. If you are using another database type, you may need to make some modifications. To make sure that answers match your database type, please tag it in your question.
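Applied to the table from the question, the same idea might look like the sketch below. It assumes MySQL (because of MONTHNAME()), one row per city per day, and that a day counts as a "day of rain" when either column is greater than 0. The aliases are my own, so adjust them as needed.

```sql
-- Sketch only: assumes MySQL and one row per city per day in EMP_BASIC.
SELECT MONTH(OBSERVATION_DATE)     AS month_nr,
       MONTHNAME(OBSERVATION_DATE) AS month_name,
       -- Test each column separately; "a OR b > 0" would not compare a to 0.
       SUM(CASE WHEN TOTAL_PRECIP_IN > 0 OR TOTAL_SNOWFALL_IN > 0
                THEN 1 ELSE 0 END) AS days_of_rain
FROM EMP_BASIC
WHERE OBSERVATION_DATE >= '2019-01-01'
  AND OBSERVATION_DATE <  '2020-01-01'   -- the whole year in one range
  AND CITY = 'Olympia'
GROUP BY MONTH(OBSERVATION_DATE), MONTHNAME(OBSERVATION_DATE)
ORDER BY month_nr;
```

Grouping by the month replaces the chain of UNIONs: one pass over the year returns one row per month that has data.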

Related

SQL Server query date and amount

I am trying to create an SQL query which is based on the following info.
I have an amount bought and sold for each day for articles. I am trying to have a query that shows:
Total "amount" per "article" per "month"
"amount" should be split into "positive total" and "negative total", summing up all positive "amount" and all negative "amount" separately.
The date has the format "yyyy-mm-dd 00:00:00.000"
I tried the following
SELECT article, date, SUM (amount) Total FROM shop group by FORMAT(date, 'yyyy_MM'), article
I get the following message
"date is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause"
If I take the date out of the query everything works fine and it calculates the totals.
You need a function that maps every date in a month to the same value, and you need to use it in both your SELECT and GROUP BY clauses. In MySQL, LAST_DAY() is a decent function to use. So is DATE_FORMAT(date, '%Y-%m'), by the way. (In SQL Server, EOMONTH() plays the role of LAST_DAY().)
Try this:
SELECT article, LAST_DAY(date) month_ending, SUM (amount) Total
FROM shop
GROUP BY LAST_DAY(date), article
There's a general writeup of solutions to this problem here.
Original answer for MySQL
There are mainly two mistakes in your query:
FORMAT is a function that converts a number to a string. So, MySQL will convert your date to a number first (which should not even be possible and raise an error, but MySQL does convert it to some number nonetheless) and then make sense of the format 'yyyy_MM', maybe taking MM to mean Myanmar, I don't know. I assume you get a different value for each day, instead of one value per month. You want DATE_FORMAT(date, '%Y-%m') instead.
You try to group by month, but then you display the date. Which date? A month has up to 31 different dates. You must display the month you grouped by instead (i.e. again DATE_FORMAT(date, '%Y-%m')).
As to separating positive and negative amounts, you can use conditional aggregation, i.e. CASE WHEN inside the aggregation function (SUM).
SELECT
DATE_FORMAT(date, '%Y-%m') AS month,
article,
SUM(CASE WHEN amount > 0 THEN amount ELSE 0 END) AS positive_total,
SUM(CASE WHEN amount < 0 THEN amount ELSE 0 END) AS negative_total
FROM shop
GROUP BY DATE_FORMAT(date, '%Y-%m'), article
ORDER BY DATE_FORMAT(date, '%Y-%m'), article;
Updated answer for SQL Server
In SQL Server FORMAT(date, 'yyyy_MM') is a function to get the year and month from a date. The query is hence:
SELECT
FORMAT(date, 'yyyy_MM') AS month,
article,
SUM(CASE WHEN amount > 0 THEN amount ELSE 0 END) AS positive_total,
SUM(CASE WHEN amount < 0 THEN amount ELSE 0 END) AS negative_total
FROM shop
GROUP BY FORMAT(date, 'yyyy_MM'), article
ORDER BY FORMAT(date, 'yyyy_MM'), article;

How to generate one row per week when specifying a month in the WHERE clause PSQL

I have a database (psql) which is supposed to work as a Library Management System. I have a table called borrowed which contains some information about each borrowing of books. Each book has a borrowdate column, a date value specifying when the book was borrowed from the Library. Each book also has a returndate column, which tells us when the book was returned. If it hasn't been returned, the value of returndate will be null.
I'm now trying to write a query that presents a table that shows a monthly report for the number of books borrowed/returned for each week (for example week 1-4). I've managed to write a query that shows a table of the number of books borrowed, returned, and not returned (missing) for a specified time interval. Time interval is provided in the WHERE clause.
How can I split this time interval up into 4 equally sized parts so that I'll get one row for each part? Essentially, I want to get one row per week in a month. See my query below!
SELECT
ROW_NUMBER() OVER(ORDER BY COUNT(borrowid)) AS week,
COUNT(borrowid) AS borrowedbooks,
SUM(CASE WHEN returndate IS NULL THEN 1 ELSE 0 END) AS returnedbooks,
SUM(CASE WHEN returndate IS NULL THEN 1 ELSE 1 END) -
SUM(CASE WHEN returndate IS NULL THEN 1 ELSE 0 END) AS missingbooks
FROM borrowed
WHERE borrowdate >= '20190901'
AND borrowdate <= '20190930'
;
Thank you so much for any help!
The key is aggregation, but by what? You can calculate the date as a week and group by that calculation:
SELECT TO_CHAR(borrowdate, 'IYYY-IW') AS week,
COUNT(borrowid) AS borrowedbooks,
COUNT(*) FILTER (WHERE returndate IS NOT NULL) AS returnedbooks,
COUNT(*) FILTER (WHERE returndate IS NULL) AS missingbooks
FROM borrowed
WHERE borrowdate >= '20190901' AND borrowdate <= '20190930'
GROUP BY week
ORDER BY week;
Note that this uses TO_CHAR() with the 'IYYY-IW' pattern, which yields the ISO 8601 year and week number, i.e. a standard definition of "week".
Note also that I changed the conditional aggregation logic. Postgres supports the standard FILTER clause, so I've used that.
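If you would rather number the weeks within the month (week 1, 2, 3, ...) than use ISO week numbers, a rough sketch is to bucket on the day of the month. This assumes Postgres and that a "week" simply means days 1-7, 8-14, and so on; the week_of_month alias is my own. Days 29-31 land in a fifth bucket, so a month does not split into exactly four equal parts.

```sql
-- Sketch only: buckets by day of month, not by calendar week.
SELECT (EXTRACT(DAY FROM borrowdate)::int - 1) / 7 + 1 AS week_of_month,
       COUNT(*)                                        AS borrowedbooks,
       COUNT(*) FILTER (WHERE returndate IS NOT NULL)  AS returnedbooks,
       COUNT(*) FILTER (WHERE returndate IS NULL)      AS missingbooks
FROM borrowed
WHERE borrowdate >= DATE '2019-09-01'
  AND borrowdate <  DATE '2019-10-01'   -- half-open range covers all of September
GROUP BY week_of_month
ORDER BY week_of_month;
```

Weeks without any borrowings still produce no row. To get a row for those as well, you would have to join against a generated list of weeks.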

SQL - Case when using two date columns

I am trying to write a case statement in SQL to allow me to separate out new employees from tenured employees based on when they received a contact from a customer.
I have two date columns, the day they started work and the day they received a contact.
This is what I'm aiming for:
CASE WHEN start_date is equal or within 70 days of Contact_day THEN 'New Hire'
WHEN start_date is after 70 days of Contact_day THEN 'Tenured'
END AS Agent_tenure
I'm not sure how to write this out in SQL. Could somebody help me please.
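I believe you are using SQL Server. An agent is a new hire when the contact came within 70 days of the start date, so compare Contact_day with start_date plus 70 days:
SELECT CASE WHEN Contact_day <= DATEADD(DAY, 70, start_date) THEN 'New Hire'
ELSE 'Tenured' END AS Agent_tenure
FROM Tablename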

SQL: Average value per day

I have a table called 'tweets'. It includes (amongst others) the columns 'tweet_id', 'created_at' (dd/mm/yyyy hh:mm:ss), 'classified' and 'processed_text'. Within the 'processed_text' column there are certain strings such as '{TICKER|IBM}', to which I will refer as ticker-strings.
My target is to get the average value of 'classified' per ticker-string per day. The column 'classified' holds the numerical values -1, 0 and 1.
At this moment, I have a working SQL query for the average value of ‘classified’ for one ticker-string per day. See the script below.
SELECT Date( `created_at` ) , AVG( `classified` ) AS Classified
FROM `tweets`
WHERE `processed_text` LIKE '%{TICKER|IBM}%'
GROUP BY Date( `created_at` )
There are however two problems with this script:
It does not include days on which there were zero ‘processed_text’s like {TICKER|IBM}. I would however like it to spit out the value zero in this case.
I have 100+ different ticker-strings and would thus like to have a script which can process multiple strings at the same time. I can also do them manually, one by one, but this would cost me a terrible lot of time.
When I had a similar question for counting the ‘tweet_id’s per ticker-string, somebody else suggested using the following:
SELECT d.date, coalesce(IBM, 0) as IBM, coalesce(GOOG, 0) as GOOG,
coalesce(BAC, 0) AS BAC
FROM dates d LEFT JOIN
(SELECT DATE(created_at) AS date,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|IBM}%' then tweet_id
END) as IBM,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|GOOG}%' then tweet_id
END) as GOOG,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|BAC}%' then tweet_id
END) as BAC
FROM tweets
GROUP BY date
) t
ON d.date = t.date;
This script worked perfectly for counting the tweet_ids per ticker-string. As I stated, however, I am now looking to find the average classified score per ticker-string. My question is therefore: could someone show me how to adjust this script so that I can calculate the average classified score per ticker-string per day?
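SELECT d.date, t.ticker,
COALESCE(COUNT(DISTINCT t.tweet_id), 0) AS tweets,
COALESCE(AVG(t.classified), 0) AS avg_classified
FROM dates d
LEFT JOIN
(SELECT DATE(created_at) AS date,
tweet_id,
classified,
-- extract the text between '{TICKER|' and the next '}' (first ticker only)
SUBSTR(processed_text,
LOCATE('{TICKER|', processed_text) + 8,
LOCATE('}', processed_text, LOCATE('{TICKER|', processed_text))
- LOCATE('{TICKER|', processed_text) - 8) AS ticker
FROM tweets
WHERE processed_text LIKE '%{TICKER|%') t
ON d.date = t.date
GROUP BY d.date, t.ticker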
This will put each ticker on its own row, not a column. If you want them moved to columns, you have to pivot the result. How you do this depends on the DBMS. Some have built-in features for creating pivot tables. Others (e.g. MySQL) do not and you have to write tricky code to do it; if you know all the possible values ahead of time, it's not too hard, but if they can change you have to write dynamic SQL in a stored procedure.
See MySQL pivot table for how to do it in MySQL.
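When the tickers are known ahead of time, the pivoted form can be written directly with conditional aggregation. This is only a sketch, assuming MySQL and the dates table (with a date column) from the query in the question. Only three tickers are shown, and each further ticker needs its own line.

```sql
-- Sketch only: one column per ticker, average 'classified' per day.
-- AVG ignores NULLs, and a CASE without ELSE yields NULL for non-matching
-- tweets, so each column averages only the tweets for its ticker.
SELECT d.date,
       COALESCE(AVG(CASE WHEN t.processed_text LIKE '%{TICKER|IBM}%'
                         THEN t.classified END), 0) AS IBM,
       COALESCE(AVG(CASE WHEN t.processed_text LIKE '%{TICKER|GOOG}%'
                         THEN t.classified END), 0) AS GOOG,
       COALESCE(AVG(CASE WHEN t.processed_text LIKE '%{TICKER|BAC}%'
                         THEN t.classified END), 0) AS BAC
FROM dates d
LEFT JOIN tweets t
       ON DATE(t.created_at) = d.date
GROUP BY d.date;
```

Days without matching tweets come out as 0 because of the COALESCE. Drop it if you prefer NULL, which distinguishes "no tweets" from "average of exactly 0".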

View data by date after Format 'mmyy'

I'm trying to answer questions like, how many POs per month do we have? Or, how many lines are there in every PO by month, etc. The original PO dates are all formatted #1/1/2013#. So my first step was to Format each PO record date into 'mmyy' so I could group and COUNT them.
This worked well but, now I cannot view the data by date... For example, I cannot ask 'How many POs after December did we get?' I think this is because SQL does not recognize mm/yy as a comparable date.
Any ideas how I could restructure this?
There are 2 queries I wrote. This is the query to format the dates. This is also the query I was trying to add the date filter to (ex: >#3/14#)
SELECT qryALL_PO.POLN, Format([PO CREATE DATE],"mm/yy") AS [Date]
FROM qryALL_PO
GROUP BY qryALL_PO.POLN, Format([PO CREATE DATE],"mm/yy");
My group and counting query is:
SELECT qryALL_PO.POLN, Sum(qryALL_PO.[LINE QUANTITY]) AS SUM_QTY_PO
FROM qryALL_PO
GROUP BY qryALL_PO.POLN;
You can still count and group dates, as long as you have a way to determine the part of the date you are looking for.
In Access you can use year and month for example to get the year and month part of the date:
select year(mydate)
, month(mydate)
, count(*)
from tableX
group
by year(mydate)
, month(mydate)
Alternatively, you can format the date as 'yyyy-mm', so that the resulting strings sort chronologically, and then use '>' to express "after".
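Put together, a sketch for Access could filter on the real date column first and only format the date for grouping. Access SQL does not accept comments, so the assumptions are listed here instead: the cutoff date #12/31/2013# and the aliases po_month and record_count are made up for illustration, and each record in qryALL_PO is taken to be one PO line.

```sql
SELECT Format([PO CREATE DATE], "yyyy-mm") AS po_month,
       Count(*) AS record_count
FROM qryALL_PO
WHERE [PO CREATE DATE] > #12/31/2013#
GROUP BY Format([PO CREATE DATE], "yyyy-mm")
ORDER BY Format([PO CREATE DATE], "yyyy-mm");
```

Filtering on [PO CREATE DATE] itself keeps the comparison a true date comparison. The "yyyy-mm" string is then only used for grouping and display.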