Assume data with a structure like this:
WITH CAL AS(
SELECT 2022 YR, '01' PERIOD UNION ALL
SELECT 2022 YR, '02' PERIOD UNION ALL
SELECT 2022 YR, '03' PERIOD UNION ALL
SELECT 2022 YR, '04' PERIOD UNION ALL
SELECT 2022 YR, '05' PERIOD UNION ALL
SELECT 2022 YR, '06' PERIOD UNION ALL
SELECT 2022 YR, '07' PERIOD UNION ALL
SELECT 2022 YR, '08' PERIOD UNION ALL
SELECT 2022 YR, '09' PERIOD UNION ALL
SELECT 2022 YR, '10' PERIOD UNION ALL
SELECT 2022 YR, '11' PERIOD UNION ALL
SELECT 2022 YR, '12' PERIOD ),
Data AS (
SELECT 2022 YR, '01' PERIOD, 10 qty UNION ALL
SELECT 2022 YR, '02' PERIOD, 5 qty UNION ALL
SELECT 2022 YR, '04' PERIOD, 10 qty UNION ALL
SELECT 2022 YR, '05' PERIOD, 7 qty UNION ALL
SELECT 2022 YR, '09' PERIOD, 1 qty)
SELECT *
FROM CAL A
LEFT JOIN data B
on A.YR = B.YR
and A.Period = B.Period
WHERE A.Period <10 and A.YR = 2022
ORDER by A.period
Giving us:
+------+--------+------+--------+-----+
| YR | PERIOD | YR | PERIOD | qty |
+------+--------+------+--------+-----+
| 2022 | 01 | 2022 | 01 | 10 |
| 2022 | 02 | 2022 | 02 | 5 |
| 2022 | 03 | | | |
| 2022 | 04 | 2022 | 04 | 10 |
| 2022 | 05 | 2022 | 05 | 7 |
| 2022 | 06 | | | |
| 2022 | 07 | | | |
| 2022 | 08 | | | |
| 2022 | 09 | 2022 | 09 | 1 |
+------+--------+------+--------+-----+
With Expected result of:
+------+--------+------+--------+-----+
| YR | PERIOD | YR | PERIOD | qty |
+------+--------+------+--------+-----+
| 2022 | 01 | 2022 | 01 | 10 |
| 2022 | 02 | 2022 | 02 | 5 |
| 2022 | 03 | 2022 | 03 | 5 | -- SQL derives
| 2022 | 04 | 2022 | 04 | 10 |
| 2022 | 05 | 2022 | 05 | 7 |
| 2022 | 06 | 2022 | 06 | 7 | -- SQL derives
| 2022 | 07 | 2022 | 07 | 7 | -- SQL derives
| 2022 | 08 | 2022 | 08 | 7 | -- SQL derives
| 2022 | 09 | 2022 | 09 | 1 |
+------+--------+------+--------+-----+
QUESTION:
How would one go about filling in the gaps in periods 03, 06, 07 and 08 with a quantity taken from the nearest earlier period/year? Note that the example is limited to one year, but a gap could fall on period 01 of 2022, in which case we would need to return the 2021 period 12 quantity if populated, or keep going back until a quantity is found or no such record exists.
LIMITS:
I am unable to use table value functions. (No lateral, no Cross Apply)
I'm unable to use analytics (no lead/lag)
correlated subqueries are iffy.
Why the limits? This must be done in a HANA graphical calculation view, which supports neither of those concepts. I've not done enough with correlated subqueries there to know whether they're possible.
I can create any number of inline views or materialized datasets needed.
STATISTICS:
This table has over a million rows and grows at a rate of product*location*periods*years, so 1000*20*12*6 = 1.4 mil+ rows in 6 years with just 20 locations and 1000 products...
Each product's inventory may be recorded at the end of a month for a given location. (No activity for a product/location means no record, hence a gap. Silly mainframe save-storage technique used in an RDBMS... I mean, how do I know the system didn't just error on inserting the record for that material, or omit it for some reason...)
In the cases where it is not recorded, we need to fill in the gap. The example provided is broken down to the bare bones, without location and material, as I do not believe they are salient to a solution.
ISSUE:
I'll need to convert the SQL to a "HANA Graphical calculation view"
Yes, I know I could create a SQL Script to do this. This is not allowed.
Yes, I know I could create a table function to do this. This is not allowed.
This must be accomplished through a graphical calculation view, which supports basic SQL functions:
basic joins (INNER, OUTER, FULL OUTER, CROSS), filters, aggregation, a basic rank at a significant performance impact if all records are evaluated (and a few other things), but not window functions, cross apply, or lateral...
As to why: it has to do with maintenance and staffing. The staffed area is a reporting area that uses tools to create views used in universes. The area wishes to keep all scripts out of use to keep employee costs lower, as SQL knowledge wouldn't be required for future staff positions, though it helps!
For those familiar this issue is sourced from MBEWH table in an ECC implementation
This can be done with graphical calculation views in SAP HANA.
It's not pretty and probably not very efficient, though.
Whether the people who are supposedly able to maintain graphical calc. views but not SQL statements will be able to maintain this successfully is rather questionable.
First, the approach in SQL, so that the approach becomes clear:
create column table calendar
( yr integer
, period nvarchar (2)
, primary key (yr, period));
insert into calendar
( select year (generated_period_start) as yr
, ABAP_NUMC( month(generated_period_start), 2) as period
from series_generate_date ('INTERVAL 1 MONTH', '2022-01-01', '2023-01-01'));
create column table data
( yr integer
, period nvarchar (2)
, qty integer
, primary key (yr, period));
insert into data values (2022, '01', 10);
insert into data values (2022, '02', 5);
insert into data values (2022, '04', 10);
insert into data values (2022, '05', 7);
insert into data values (2022, '09', 1);
SELECT *
FROM CALendar A
LEFT JOIN data B
on A.YR = B.YR
and A.Period = B.Period
WHERE A.Period <'10' and A.YR =2022
ORDER BY A.period;
/*
YR PERIOD YR PERIOD QTY
2,022 01 2,022 01 10
2,022 02 2,022 02 5
2,022 03 ? ? ?
2,022 04 2,022 04 10
2,022 05 2,022 05 7
2,022 06 ? ? ?
2,022 07 ? ? ?
2,022 08 ? ? ?
2,022 09 2,022 09 1
*/
The ABAP_NUMC() function creates ABAP NUMC strings (with leading zeroes) from integers. Other than this it's pretty much the tables from OP.
The general approach is to use the CALENDAR table as the main driving table that establishes for which dates/periods there will be output rows.
This is outer joined with the DATA table, leaving "missing" rows with NULL in the corresponding columns.
Next, the DATA table is joined again, this time with YEAR||PERIOD combinations that are strictly smaller than the YEAR||PERIOD from the CALENDAR table. This gives us rows for all the previous records in DATA.
Next, we need to pick which of the previous rows we want to look at.
This is done via the ROWNUM() function and a filter to the first record.
As graphical calculation views don't support ROWNUM() this can be exchanged with RANK() - this works as long as there are no two actual DATA records for the same YEAR||PERIOD combination.
Finally, in the projection we use COALESCE to switch between the actual information available in DATA and - if that is NULL - the previous period information.
/*
CAL_YR CAL_PER COALESCE(DAT_YR,PREV_YR) COALESCE(DAT_PER,PREV_PER) COALESCE(DAT_QTY,PREV_QTY)
2,022 01 2,022 01 10
2,022 02 2,022 02 5
2,022 03 2,022 02 5
2,022 04 2,022 04 10
2,022 05 2,022 05 7
2,022 06 2,022 05 7
2,022 07 2,022 05 7
2,022 08 2,022 05 7
2,022 09 2,022 09 1
*/
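For readers who want to try the logic outside HANA, here is a minimal sketch of the approach just described, run against an in-memory SQLite database from Python. It is not the statement used to produce the output above. SQLite only stands in for HANA here, and the step "pick the first previous row" is done with a MAX() aggregation over the YEAR||PERIOD key instead of ROWNUM()/RANK(); the names prev_key and prev_yr_per are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (yr INTEGER, period TEXT, PRIMARY KEY (yr, period));
    CREATE TABLE data (yr INTEGER, period TEXT, qty INTEGER, PRIMARY KEY (yr, period));
""")
# Same sample rows as in the question: twelve periods, five recorded quantities.
conn.executemany("INSERT INTO calendar VALUES (?, ?)",
                 [(2022, f"{m:02d}") for m in range(1, 13)])
conn.executemany("INSERT INTO data VALUES (?, ?, ?)",
                 [(2022, "01", 10), (2022, "02", 5), (2022, "04", 10),
                  (2022, "05", 7), (2022, "09", 1)])

FILL_SQL = """
WITH prev_key AS (
    -- For every calendar row: the largest YEAR||PERIOD key in data that is
    -- strictly smaller (assumption: MAX() replaces the rank-and-filter step).
    SELECT c.yr, c.period, MAX(p.yr || p.period) AS prev_yr_per
    FROM calendar c
    LEFT JOIN data p ON p.yr || p.period < c.yr || c.period
    GROUP BY c.yr, c.period
)
SELECT c.yr, c.period,
       COALESCE(b.yr, p.yr),          -- actual row if present, else previous row
       COALESCE(b.period, p.period),
       COALESCE(b.qty, p.qty)
FROM calendar c
JOIN prev_key k ON k.yr = c.yr AND k.period = c.period
LEFT JOIN data b ON b.yr = c.yr AND b.period = c.period
LEFT JOIN data p ON p.yr || p.period = k.prev_yr_per
WHERE c.yr = 2022 AND c.period < '10'
ORDER BY c.yr, c.period
"""

rows = conn.execute(FILL_SQL).fetchall()
for row in rows:
    print(row)
```

Comparing the concatenated YEAR||PERIOD strings only works because both parts have a fixed width (four-digit year, two-digit period); with that in place, a gap in period 01 would pick up period 12 of the previous year without extra logic.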
So far, so good.
The graphical calc. view for that looks like this:
As it's cumbersome to screenshot every single node, I will include just the most important ones:
1. CAL_DAT_PREV
Since only equality joins are supported in graphical calc. views we have to emulate the "larger than" join. For that, I created two calculated/constant columns join_const with the same value (integer 1 in this case) and joined on those.
2. PREVS_ARE_OLDER
This is the second part of the emulated "larger than" join: this projection keeps only the records where cal_yr_per is larger than or equal to prev_yr_per. Equal values must be allowed here, since we don't want to lose records for which there is no smaller YEAR||PERIOD combination. Alternatively, one could insert an initial record into the DATA table that is guaranteed to be smaller than all other entries, e.g. YEAR=0001 and PERIOD=00 or something similar. If you're familiar with SAP application tables, then you've seen this approach.
By the way - for convenience reasons, I created calculated columns that combine the YEAR and PERIOD for the different tables - cal_yr_per, dat_yr_per, and prev_yr_per.
3. RANK_1
Here the rank is created for prev_yr_per, picking the first one only, and starting a new group for every new value of cal_yr_per.
This value is returned via Rank_Column.
4. REDUCE_PREV
The final piece of the puzzle: using a filter on Rank_Column = 1 we ensure to only get one "previous" row for every "calendar" row.
Also: by means of IF(ISNULL(...), ... , ...) we emulate COALESCE(...) in three calculated columns, aptly named FILL....
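To make the data flow through the four nodes easier to follow, here is a rough emulation in plain Python. It is only a sketch: the sample rows are hypothetical (chosen so that one gap crosses a year boundary), the function names are mine, and I am assuming the rank orders prev_yr_per from largest to smallest within each cal_yr_per group.

```python
from itertools import groupby

def yr_per(yr, period):
    """Fixed-width YEAR||PERIOD key, e.g. (2022, '01') -> '202201'."""
    return f"{yr:04d}{period}"

def fill_gaps(calendar, data):
    actual = {yr_per(y, p): (y, p, q) for y, p, q in data}

    # 1. CAL_DAT_PREV: joining on a constant column pairs every calendar row
    #    with every data row; "dat" is the outer-joined actual record (or None).
    joined = [(c, actual.get(yr_per(*c)), p) for c in calendar for p in data]

    # 2. PREVS_ARE_OLDER: keep pairs where cal_yr_per >= prev_yr_per.
    older = [(c, d, p) for c, d, p in joined
             if yr_per(*c) >= yr_per(p[0], p[1])]

    # 3. RANK_1: order prev_yr_per descending inside each cal_yr_per group
    #    (two stable sorts: inner key first, then the grouping key).
    older.sort(key=lambda r: yr_per(r[2][0], r[2][1]), reverse=True)
    older.sort(key=lambda r: yr_per(*r[0]))

    # 4. REDUCE_PREV: keep rank 1 only, then IF(ISNULL(dat), prev, dat).
    result = []
    for _, group in groupby(older, key=lambda r: yr_per(*r[0])):
        c, d, p = next(group)
        fill = d if d is not None else p
        result.append((c[0], c[1]) + fill)
    return result

# Hypothetical sample: 2022/01 and 2022/03 have no record of their own.
calendar = [(2021, "12"), (2022, "01"), (2022, "02"), (2022, "03")]
data = [(2021, "12", 4), (2022, "02", 5)]

filled = fill_gaps(calendar, data)
for row in filled:
    print(row)
```

Note that a calendar row with no data record at or before it drops out in step 2, which is exactly why the "initial record" trick mentioned above can be handy.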
And that's the nuts and bolts of this solution.
"It works on my computer!" is probably the best I can say about it.
SELECT "CAL_YR", "CAL_PERIOD"
, "DAT_YR", "DAT_PER", "DAT_QTY"
, "FILL_YR", "FILL_QTY", "FILL_PER"
FROM "_SYS_BIC"."scratch/QTY_FILLUP"
ORDER BY "CAL_YR" asc, "CAL_PERIOD" asc;
/*
CAL_YR CAL_PERIOD DAT_YR DAT_PER DAT_QTY FILL_YR FILL_QTY FILL_PER
2,022 01 2,022 01 10 2,022 10 01
2,022 02 2,022 02 5 2,022 5 02
2,022 03 ? ? ? 2,022 5 02
2,022 04 2,022 04 10 2,022 10 04
2,022 05 2,022 05 7 2,022 7 05
2,022 06 ? ? ? 2,022 7 05
2,022 07 ? ? ? 2,022 7 05
2,022 08 ? ? ? 2,022 7 05
2,022 09 2,022 09 1 2,022 1 09
2,022 10 ? ? ? 2,022 1 09
2,022 11 ? ? ? 2,022 1 09
2,022 12 ? ? ? 2,022 1 09
*/
In SQL Server I have 2 tables that look like this:
TEST SCRIPT 'a collection of test scripts'
(PK)
ID Description Count
------------------------
A12 Proj/Num/Dev 12
B34 Gone/Tri/Tel 43
C56 Geff/Ben/Dan 03
SCRIPT HISTORY 'the history of the aforementioned scripts'
(FK) (PK)
ScriptID ID Machine Date Time Passes
----------------------------------
A12 01 DEV012 6/26/15 16:54 4
A12 02 DEV596 6/28/15 13:12 9
A12 03 COM199 3/12/14 14:22 10
B34 04 COM199 6/30/13 15:45 12
B34 05 DEV012 6/30/15 13:13 14
B34 06 DEV444 6/12/15 11:14 14
C56 07 COM321 6/29/14 02:19 12
C56 08 ANS042 6/24/14 20:10 18
C56 09 COM432 6/30/15 12:24 4
C56 10 DEV444 4/20/12 23:55 2
In a single query, how would I write a select statement that takes just one entry for each DISTINCT script in TEST SCRIPT and pairs it with the values in only the TOP 1 most recent run time in SCRIPT HISTORY?
For example, the solution to the example tables above would be:
OUTPUT
ScriptID ID Machine Date Time Passes
---------------------------------------------------
A12 02 DEV596 6/28/15 13:12 9
B34 05 DEV012 6/30/15 13:13 14
C56 09 COM432 6/30/15 12:24 4
The way you describe the problem is almost directly as cross apply:
select h.*
from testscript ts cross apply
(select top 1 h.*
from history h
where h.scriptid = ts.id
order by h.date desc, h.time desc
) h;
Please try something like this:
select *
from SCRIPT SCR
left join (select MAX(SCRIPT_HISTORY.Date) as Date, SCRIPT_HISTORY.ScriptID
from SCRIPT_HISTORY
group by SCRIPT_HISTORY.ScriptID
) SH on SCR.ID = SH.ScriptID
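If cross apply is not an option, the "latest row per script" idea can also be sketched with a join back to the history table on the maximum timestamp. The example below runs in SQLite from Python rather than SQL Server, and it assumes the date and time are stored together in one sortable ISO-formatted column (run_at), which is not how the question's tables are laid out; table and column names are adapted for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_script (id TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE script_history (
        id TEXT PRIMARY KEY,
        script_id TEXT REFERENCES test_script(id),
        machine TEXT,
        run_at TEXT,      -- assumption: 'YYYY-MM-DD HH:MM', so text order = time order
        passes INTEGER);
""")
conn.executemany("INSERT INTO test_script VALUES (?, ?)", [
    ("A12", "Proj/Num/Dev"), ("B34", "Gone/Tri/Tel"), ("C56", "Geff/Ben/Dan")])
conn.executemany("INSERT INTO script_history VALUES (?, ?, ?, ?, ?)", [
    ("01", "A12", "DEV012", "2015-06-26 16:54", 4),
    ("02", "A12", "DEV596", "2015-06-28 13:12", 9),
    ("03", "A12", "COM199", "2014-03-12 14:22", 10),
    ("04", "B34", "COM199", "2013-06-30 15:45", 12),
    ("05", "B34", "DEV012", "2015-06-30 13:13", 14),
    ("06", "B34", "DEV444", "2015-06-12 11:14", 14),
    ("07", "C56", "COM321", "2014-06-29 02:19", 12),
    ("08", "C56", "ANS042", "2014-06-24 20:10", 18),
    ("09", "C56", "COM432", "2015-06-30 12:24", 4),
    ("10", "C56", "DEV444", "2012-04-20 23:55", 2),
])

LATEST_SQL = """
SELECT ts.id, h.id, h.machine, h.run_at, h.passes
FROM test_script ts
JOIN (SELECT script_id, MAX(run_at) AS run_at   -- newest run per script
      FROM script_history
      GROUP BY script_id) latest
  ON latest.script_id = ts.id
JOIN script_history h                            -- fetch the full row for that run
  ON h.script_id = latest.script_id AND h.run_at = latest.run_at
ORDER BY ts.id
"""

latest_runs = conn.execute(LATEST_SQL).fetchall()
for row in latest_runs:
    print(row)
```

One caveat with this variant: if two runs of the same script share the exact same timestamp, both rows come back, whereas top 1 would return just one of them.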
I'm having trouble with something that I thought would've been simple...
I have a simple model Statistic that stores a date (created_at), a user_fingerprint and a structure_id. From that, I'd like to create a graph to show #visitors per day.
So I did
@structure.statistics.order('DATE(created_at) ASC').group('DATE(created_at)').count
Which works and return what I expect:
=> {Sat, 18 May 2014=>50, Mon, 19 May 2014=>90}
Now I'd like the same, but I want to collapse all rows that share the same pair (day of created_at, user_fingerprint). For instance:
| created_at | user_fingerprint | structure_id |
|----------------------|------------------|--------------|
| Sat, 18 May 2014 2PM | '124512341' | 12 |
| Sat, 18 May 2014 4PM | '124512341' | 12 |
| Mon, 19 May 2014 6PM | '124512341' | 12 |
With this data, I would have:
=> {Sat, 18 May 2014=>1, Mon, 19 May 2014=>1}
# instead of
=> {Sat, 18 May 2014=>2, Mon, 19 May 2014=>1}
I would be able to do it in Ruby but I wondered if I could directly do it with SQL & Arel.
Solution regarding your answers
Here is what I did at the end:
@impressions = {}
# The following is to ensure I will have a key when there is no stat for a day.
(15.days.ago.to_date..Date.today).each { |date| @impressions[date] = 0 }
@structure.statistics.where( Statistic.arel_table[:created_at].gt(Date.today - 15.days) )
.order('DATE(created_at) ASC')
.group('DATE(created_at)')
.select('DATE(created_at) as created_at, COUNT(DISTINCT(user_fingerprint)) as user_count')
.each{ |stat| @impressions[stat.created_at] = stat.user_count }
I need to do a bit of Ruby though but that's good for me.
Your query would look something like this (Oracle dialect):
select trunc(created_at), count(distinct user_fingerprint)
from statistic
group by trunc(created_at)
The function for getting the date portion out of a datetime column differs by database:
oracle: trunc(dt_column)
sql server: cast(dt_column As Date)
mysql: DATE(dt_column)
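As a quick check of the difference between counting rows and counting distinct visitors per day, here is a small sketch using SQLite from Python (SQLite's DATE() plays the role of the functions listed above). The table name statistics and the ISO timestamps are assumptions made for the example; the three rows mirror the sample in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE statistics (
    created_at TEXT, user_fingerprint TEXT, structure_id INTEGER)""")
conn.executemany("INSERT INTO statistics VALUES (?, ?, ?)", [
    ("2014-05-18 14:00:00", "124512341", 12),
    ("2014-05-18 16:00:00", "124512341", 12),
    ("2014-05-19 18:00:00", "124512341", 12),
])

DAILY_SQL = """
SELECT DATE(created_at)                 AS day,
       COUNT(*)                         AS hits,      -- every row counts
       COUNT(DISTINCT user_fingerprint) AS visitors   -- each fingerprint once per day
FROM statistics
WHERE structure_id = ?
GROUP BY DATE(created_at)
ORDER BY day
"""

daily = conn.execute(DAILY_SQL, (12,)).fetchall()
for day, hits, visitors in daily:
    print(day, hits, visitors)
```

Grouping by the day alone and counting distinct fingerprints inside each group is what collapses the two visits on 18 May into one.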
@structure.statistics.order('DATE(created_at) ASC').group('DATE(created_at)').select('count(distinct(user_fingerprint)) as user_count').first.user_count
I just started to program in SQL and I have a bit of a problem (n.b., I am working off a table that comes from a game). My table is something like this, where ID refers to a single person, H to a certain hour of playing and IF to a certain condition:
ID H IF
01 1 0
01 2 0
01 3 0
02 1 0
02 2 1
03 1 0
03 2 1
03 3 0
03 4 1
In this case player 01 played for three hours, player 02 for two hours and player 03 for four hours. In each of these hours they may or may not have performed an action. If they did, a 1 appears in the IF column.
Now, my question is: how can I query so that I have a table with only the IDs of the people who never performed the action? I do not want to rule out only the rows with IF = 1, I want to rule out all the rows with that ID. In this case it should become:
01 1 0
01 2 0
01 3 0
Any help?
This should do it.
select *
from table
where Id not in (select Id from table where IF = 1)
SELECT ID FROM Table GROUP BY ID HAVING SUM(IF)=0
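Both suggestions can be tried side by side with a small SQLite sketch in Python. The table is called plays and the IF column is renamed acted here, since TABLE and IF are keywords in many dialects; those names are my own choice for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plays (id TEXT, h INTEGER, acted INTEGER)")
conn.executemany("INSERT INTO plays VALUES (?, ?, ?)", [
    ("01", 1, 0), ("01", 2, 0), ("01", 3, 0),
    ("02", 1, 0), ("02", 2, 1),
    ("03", 1, 0), ("03", 2, 1), ("03", 3, 0), ("03", 4, 1),
])

# Variant 1: keep all rows of players that never appear with acted = 1.
not_in_rows = conn.execute("""
    SELECT id, h, acted
    FROM plays
    WHERE id NOT IN (SELECT id FROM plays WHERE acted = 1)
    ORDER BY id, h
""").fetchall()

# Variant 2: one row per player whose flags sum to zero.
having_ids = [row[0] for row in conn.execute("""
    SELECT id FROM plays GROUP BY id HAVING SUM(acted) = 0
""")]

print(not_in_rows)
print(having_ids)
```

The first variant returns the full rows asked for in the question, while the second returns only the IDs, so it would need to be joined back to the table to get the hours.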
I've been trying to develop a cashflow statement in access 2007. This would be so easily done in excel using formulas such as
=SUM(B6:M6)/COUNTIF(B6:M6,">0")
but I can't seem to wrap my head around this when it comes to access. And I need this for every company we enter data on. The cashflow statement is supposed to look like this (Since I can't yet post a pic):
----------------------------------------------------------------------------------------------------------------
Particulars | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Average |
Sales---------------------->
Salary------>
Transportation----->
and about 10 other items in the row, all with entries for Jan till Dec, however, sometimes we take 6 months worth of data and sometimes for all 12 months. (Imagine a basic excel sheet with items on the first column and headers for the next 12-13 columns).
In access, I made tables for each Item with columns as the months, eg. tblRcpt--> |rcpt_ID|Jan|Feb|... and so on till dec for all the items. Then they will be arranged and presented in an entry form which would be designed to look similar to the above table while later I would query and link them together to present the complete cashflow statement.
Now comes the question, I need to Average together the columns (as you can see in the right most column), BUT I only want to average together those months that have been filled (Sometimes in accounting people enter '0' where there is no data), so I can't just sum the columns and divide by twelve. It has to be dynamic, all functions seem to center around counting and averaging ROWs, not COLUMNs.
Thanks for just bearing with me and reading this, any help would be much appreciated.
Try this (written with IIf, since Access SQL has no CASE expression):
(Jan + Feb + ... + Dec) /
( IIf(Jan = 0, 0, 1)
+ IIf(Feb = 0, 0, 1)
+ ...
+ IIf(Dec = 0, 0, 1) )
as Avg
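The arithmetic behind that expression is easy to check in a few lines of Python. This is only an illustration of the idea (sum of the months divided by the number of non-zero months); the function name is made up, and unlike the expression above it also guards against the case where every month is zero, which would otherwise be a division by zero.

```python
def average_filled(months):
    """Average only the months that were actually filled in (non-zero, non-empty)."""
    filled = [m for m in months if m not in (0, None)]
    if not filled:          # nothing entered at all: avoid dividing by zero
        return None
    return sum(filled) / len(filled)

# Six months of data entered, the rest left at 0.
sales = [500, 1000, 0, 750, 250, 0, 1500, 0, 0, 200, 0, 0]
print(average_filled(sales))     # 4200 / 6 = 700.0
print(average_filled([0] * 12))  # None
```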
Your table structure should be:
Particulars | Month | Amount
Sales 1 500
Sales 2 1000
Salary 1 80000
...and so on. You can either not enter rows when you don't have a value for that month, or you can handle them in the SQL statement (as I have below):
SELECT Particulars, AVG(Amount) AS AverageAmount
FROM MyTable
WHERE Amount <> 0
GROUP BY Particulars;
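Here is a small sketch of that long-format layout, run in SQLite from Python instead of Access. The table name cashflow and the sample amounts (including the explicit zero rows) are assumptions for the example; the point is just that filtering out zero amounts before AVG() gives the "average of filled months" per item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cashflow (particulars TEXT, month INTEGER, amount REAL)")
conn.executemany("INSERT INTO cashflow VALUES (?, ?, ?)", [
    ("Sales", 1, 500), ("Sales", 2, 1000), ("Sales", 3, 0),   # month 3 entered as 0
    ("Salary", 1, 80000), ("Salary", 2, 0),
])

AVG_SQL = """
SELECT particulars, AVG(amount) AS average_amount
FROM cashflow
WHERE amount <> 0            -- drops zero rows; NULL amounts fail the test too
GROUP BY particulars
ORDER BY particulars
"""

averages = conn.execute(AVG_SQL).fetchall()
for name, avg in averages:
    print(name, avg)
```

With one row per item and month, the same query keeps working whether 6 or 12 months were entered, which is the dynamic behaviour the question asks for.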