SQL query getting multiple where-claused aliases

SQL query getting multiple where-claused aliases - sql

Hoping you can help with this issue.
I have an energymanagement software running on a system. The data logged is the total value, logged in the column Value. This is done every hour. Along is some other data, here amongst a boolean called Active and an integer called Day.
What I'm going for, is one query that gets me the a list of sorted days, the total powerusage of the day, and the peak-powerusage of the day.
The peak-power usage is counted by using Max/Min of the value where Active is present. Somedays, however, the Active bit isn't set, and the result of this query alone would yield NULL.
This is my query:
SELECT
A.Day, A.Forbrug, B.Peak
FROM
(SELECT
Day, Max(Value) - Min(Value) AS Forbrug
FROM
EL_HT1_K
WHERE
MONTH = 8 AND YEAR = 2016
GROUP By Day) A,
(SELECT
Day, Max(Value) - Min(Value) AS Peak
FROM
EL_HT1_K
WHERE
Month = 8 AND Year = 2016 AND Active = 1
GROUP BY Day) B
WHERE
A.Day = B.Day
Which only returns the result where query B (Peak-usage) would yield results.
What I want, is that the rest of the results from inner query A, still is shown, even though query B yields 0/null for that day.
Is this possible, and how?
FYI. The reason I need this to be in one query, is that the scada system has some difficulties handling multiple queries.

I think you just want conditional aggregation. Based on your description, this seems to be the query you want:
SELECT Day, SUM(Value) as total,
MAX(CASE WHEN Active = 1 THEN Value END) as Peak,
FROM EL_HT1_K
WHERE Month = 8 AND Year = 2016
GROUP BY Day;

Related

SQL: Apply an aggregate result per day using window functions

Consider a time-series table that contains three fields time of type timestamptz, balance of type numeric, and is_spent_column of type text.
The following query generates a valid result for the last day of the given interval.
SELECT
MAX(DATE_TRUNC('DAY', (time))) as last_day,
SUM(balance) FILTER ( WHERE is_spent_column is NULL ) AS value_at_last_day
FROM tbl
2010-07-12 18681.800775017498741407984000
However, I am in need of an equivalent query based on window functions to report the total value of the column named balance for all the days up to and including the given date .
Here is what I've tried so far, but without any valid result:
SELECT
DATE_TRUNC('DAY', (time)) AS daily,
SUM(sum(balance) FILTER ( WHERE is_spent_column is NULL ) ) OVER ( ORDER BY DATE_TRUNC('DAY', (time)) ) AS total_value_per_day
FROM tbl
group by 1
order by 1 desc
2010-07-12 16050.496339044977568391974000
2010-07-11 13103.159119670350269890284000
2010-07-10 12594.525752964512456914454000
2010-07-09 12380.159588711091681327014000
2010-07-08 12178.119542536668113577014000
2010-07-07 11995.943973804127033140014000
EDIT:
Here is a sample dataset:
LINK REMOVED
The running total can be computed by applying the first query above on the entire dataset up to and including the desired day. For example, for day 2009-01-31, the result is 97.13522530000000000000, or for day 2009-01-15 when we filter time as time < '2009-01-16 00:00:00' it returns 24.446144000000000000.
What I need is an alternative query that computes the running total for each day in a single query.
EDIT 2:
Thank you all so very much for your participation and support.
The reason for differences in result sets of the queries was on the preceding ETL pipelines. Sorry for my ignorance!
Below I've provided a sample schema to test the queries.
https://www.db-fiddle.com/f/veUiRauLs23s3WUfXQu3WE/2
Now both queries given above and the query given in the answer below return the same result.

Consider calculating running total via window function after aggregating data to day level. And since you aggregate with a single condition, FILTER condition can be converted to basic WHERE:
SELECT daily,
SUM(total_balance) OVER (ORDER BY daily) AS total_value_per_day
FROM (
SELECT
DATE_TRUNC('DAY', (time)) AS daily,
SUM(balance) AS total_balance
FROM tbl
WHERE is_spent_column IS NULL
GROUP BY 1
) AS daily_agg
ORDER BY daily

SQL query with summed statistical data, grouped by date

I'm trying to wrap my head around a problem with making a query for a statistical overview of a system.
The table I want to pull data from is called 'Event', and holds the following columns (among others, only the necessary is posted):
date (as timestamp)
positionId (as number)
eventType (as string)
Another table that most likely is necessary is 'Location', with, among others, holds the following columns:
id (as number)
clinic (as boolean)
What I want is a sum of events in different conditions, grouped by days. The user can give an input over the range of days wanted, which means the output should only show a line per day inside the given limits. The columns should be the following:
date: a date, grouping the data by days
deliverySum: A sum of entries for the given day, where eventType is 'sightingDelivered', and the Location with id=posiitonId has clinic=true
pickupSum: Same as deliverySum, but eventType is 'sightingPickup'
rejectedSum: A sum over events for the day, where the positionId is 4000
acceptedSum: Same as rejectedSum, but positionId is 3000
So, one line should show the sums for the given day over the different criteria.
I'm fairly well read in SQL, but my experience is quite low, which lead to me asking here.
Any help would be appreciated

SQL Server has neither timestamps nor booleans, so I'll answer this for MySQL.
select date(date),
sum( e.eventtype = 'sightingDelivered' and l.clinic) as deliverySum,
sum( e.eventtype = 'sightingPickup' and l.clinic) as pickupSum,
sum( e.position_id = 4000 ) as rejectedSum,
sum( e.position_id = 3000 ) as acceptedSum
from event e left join
location l
on e.position_id = l.id
where date >= $date1 and date < $date2 + interval 1 day
group by date(date);

SQL statement to match dates that are the closest?

I have the following table, let's call it Names:
Name Id Date
Dirk 1 27-01-2015
Jan 2 31-01-2015
Thomas 3 21-02-2015
Next I have the another table called Consumption:
Id Date Consumption
1 26-01-2015 30
1 01-01-2015 20
2 01-01-2015 10
2 05-05-2015 20
Now the problem is, that I think that doing this using SQL is the fastest, since the table contains about 1.5 million rows.
So the problem is as follows, I would like to match each Id from the Names table with the Consumption table provided that the difference between the dates are the lowest, so we have: Dirk consumes on 27-01-2015 about 30. In case there are two dates that have the same "difference", I would like to calculate the average consumption on those two dates.
While I know how to join, I do not know how to code the difference part.
Thanks.
DBMS is Microsoft SQL Server 2012.
I believe that my question differs from the one mentioned in the comments, because it is much more complicated since it involves comparison of dates between two tables rather than having one date and comparing it with the rest of the dates in the table.

This is how you could it in SQL Server:
SELECT Id, Name, AVG(Consumption)
FROM (
SELECT n.Id, Name, Consumption,
RANK() OVER (PARTITION BY n.Id
ORDER BY ABS(DATEDIFF(d, n.[Date], c.[Date]))) AS rnk
FROM Names AS n
INNER JOIN Consumption AS c ON n.Id = c.Id ) t
WHERE t.rnk = 1
GROUP BY Id, Name
Using RANK with PARTITION BY n.Id and ORDER BY ABS(DATEDIFF(d, n.[Date], c.[Date])) you can locate all matching records per Id: all records with the smallest difference in days are going to have rnk = 1.
Then, using AVG in the outer query, you are calculating the average value of Consumption between all matching records.
SQL Fiddle Demo

SQL: Average value per day

I have a database called ‘tweets’. The database 'tweets' includes (amongst others) the rows 'tweet_id', 'created at' (dd/mm/yyyy hh/mm/ss), ‘classified’ and 'processed text'. Within the ‘processed text’ row there are certain strings such as {TICKER|IBM}', to which I will refer as ticker-strings.
My target is to get the average value of ‘classified’ per ticker-string per day. The row ‘classified’ includes the numerical values -1, 0 and 1.
At this moment, I have a working SQL query for the average value of ‘classified’ for one ticker-string per day. See the script below.
SELECT Date( `created_at` ) , AVG( `classified` ) AS Classified
FROM `tweets`
WHERE `processed_text` LIKE '%{TICKER|IBM}%'
GROUP BY Date( `created_at` )
There are however two problems with this script:
It does not include days on which there were zero ‘processed_text’s like {TICKER|IBM}. I would however like it to spit out the value zero in this case.
I have 100+ different ticker-strings and would thus like to have a script which can process multiple strings at the same time. I can also do them manually, one by one, but this would cost me a terrible lot of time.
When I had a similar question for counting the ‘tweet_id’s per ticker-string, somebody else suggested using the following:
SELECT d.date, coalesce(IBM, 0) as IBM, coalesce(GOOG, 0) as GOOG,
coalesce(BAC, 0) AS BAC
FROM dates d LEFT JOIN
(SELECT DATE(created_at) AS date,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|IBM}%' then tweet_id
END) as IBM,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|GOOG}%' then tweet_id
END) as GOOG,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|BAC}%' then tweet_id
END) as BAC
FROM tweets
GROUP BY date
) t
ON d.date = t.date;
This script worked perfectly for counting the tweet_ids per ticker-string. As I however stated, I am not looking to find the average classified scores per ticker-string. My question is therefore: Could someone show me how to adjust this script in such a way that I can calculate the average classified scores per ticker-string per day?

SELECT d.date, t.ticker, COALESCE(COUNT(DISTINCT tweet_id), 0) AS tweets
FROM dates d
LEFT JOIN
(SELECT DATE(created_at) AS date,
SUBSTR(processed_text,
LOCATE('{TICKER|', processed_text) + 8,
LOCATE('}', processed_text, LOCATE('{TICKER|', processed_text))
- LOCATE('{TICKER|', processed_text) - 8)) t
ON d.date = t.date
GROUP BY d.date, t.ticker
This will put each ticker on its own row, not a column. If you want them moved to columns, you have to pivot the result. How you do this depends on the DBMS. Some have built-in features for creating pivot tables. Others (e.g. MySQL) do not and you have to write tricky code to do it; if you know all the possible values ahead of time, it's not too hard, but if they can change you have to write dynamic SQL in a stored procedure.
See MySQL pivot table for how to do it in MySQL.

Confused on count(*) and self joins

I want to return all application dates for the current month and for the current year. This must be simple, however I can not figure it out. I know I have 2 dates for the current month and 90 dates for the current year. Right, Left, Outer, Inner I have tried them all, just throwing code at the wall trying to see what will stick and none of it works. I either get 2 for both columns or 180 for both columns. Here is my latest select statement.
SELECT count(a.evdtApplication) AS monthApplicationEntered,
count (b.evdtApplication) AS yearApplicationEntered
FROM tblEventDates a
RIGHT OUTER JOIN tblEventDates b ON a.LOANid = b.loanid
WHERE datediff(mm,a.evdtApplication,getdate()) = 0
AND datediff(yy,a.evdtApplication, getdate()) = 0
AND datediff(yy,b.evdtApplication,getdate()) = 0

You don't need any joins at all.
You want to count the loanID column from tblEventDates, and you want to do it conditionally based on the date matching the current month or the current year.
SO:
SELECT SUM( CASE WHEN Month(a.evdtApplication) = MONTH(GEtDate() THEN 1 END) as monthTotal,
count(*)
FROM tblEventDates a
WHERE a.evdtApplication BETWEEN '2008-01-01' AND '2008-12-31'
What that does is select all the event dates this year, and add up the ones which match your conditions. If it doesn't match the current month it won't add 1. Actually, don't even need to do a condition for the year because you're just querying everything for that year.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas