How to select variable (with a year in the name) for calculation based on value of a date field using SAS SQL - sql

With SAS SQL (or just SAS) I need to use a variable for a calculation based on the year portion of a different date field. The variable's name contains the year that I'd need to match from the year portion of the other date variable. How can I select the right variable to use for my calculation?
For example, I need to select which one of these to use:
GRADE_2013
GRADE_2014
GRADE_2015
by looking at a date field of the format 15JAN2014 - so from that year of 2014 I want to grab the value from GRADE_2014 to use in another calculation.

You have a few options: one is an array with a year index, and another is the VVALUEX function, which looks up the value of a variable by name.
Data One;
set Have;
array grades(2013:2015) grade_2013-grade_2015;
*Array method;
variable_want1 = grades(year(date_field));
*VValueX method;
variable_want2 = vvaluex('grade_'||put(year(date_field), 4.));
run;

Generally, this sort of problem becomes much easier if you can transpose your data into a more normalized format.
So instead of having three grade_YYYY variables with year suffixes on each, transpose each record into three records, with variables YEAR and GRADE.
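As a rough illustration of that transposition, here is a minimal sketch in Python rather than SAS (in SAS itself, PROC TRANSPOSE is the usual tool). The record layout, the `id` field, and the helper name `to_long` are assumptions made for the example, not part of the original data.

```python
import re

def to_long(record):
    """Turn one wide record with GRADE_YYYY fields into (id, year, grade) rows."""
    rows = []
    for name, value in record.items():
        match = re.fullmatch(r"GRADE_(\d{4})", name)
        if match:
            rows.append({"id": record["id"],
                         "year": int(match.group(1)),
                         "grade": value})
    return rows

# Hypothetical wide record: one column per year.
wide = {"id": 1, "GRADE_2013": 71, "GRADE_2014": 85, "GRADE_2015": 90}
long_rows = to_long(wide)

# With the year stored as data instead of in a column name,
# picking the grade for a given year is a plain filter.
grade_2014 = next(r["grade"] for r in long_rows if r["year"] == 2014)
```

Once the data is in this shape, matching on the year of the date field becomes an ordinary join or filter condition instead of a choice between variables.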

Thanks so much for the great answers...I'm a novice so I'll have to go the non-array approach at least to start.

Related

How to Filter Data with Multiple Parameters

I'd like to tally a series of data based on the day and user name. The data is being fed from a query, and I am not looking to use a pivot table because I would like to archive the data past what the program I am pulling the data from stores. Below is a sample of the data I have collected.
I want to tally the Column D "FULL_PLLT_QTY", but only for the date in Column G "SHIFT_DT" and the Column I "Name".
EX. I want to tally Column D for 6/7/2017 for Smith, R.W.
Is there a way to do this for a large range of dates and names? Also, the names appear on multiple dates. Any help with this would be much appreciated!
=SUMIFS(D:D,G:G,"6/7/2017 0:00",I:I,"Smith, R.W.")
Matching the date condition will depend on the exact format of column G. The above assumes it is just a string. If it is a date then you probably need to use =DATE(2017,6,7) (or =DATE(2017,7,6) if 6/7 means 6 July) instead of the literal string "6/7/2017 0:00".
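For the "large range of dates and names" part, the same idea as SUMIFS can be sketched outside Excel as a grouped sum keyed on (date, name). This is only an illustration in Python; the function name `tally` and the sample rows are made up, and the tuple layout (shift date, name, quantity) is an assumption about how the query data would be loaded.

```python
from collections import defaultdict
from datetime import date

def tally(rows):
    """Sum the quantity for every (shift date, name) pair in one pass."""
    totals = defaultdict(int)
    for shift_date, name, qty in rows:
        totals[(shift_date, name)] += qty
    return dict(totals)

# Hypothetical sample rows: (SHIFT_DT, Name, FULL_PLLT_QTY).
rows = [
    (date(2017, 6, 7), "Smith, R.W.", 4),
    (date(2017, 6, 7), "Smith, R.W.", 6),
    (date(2017, 6, 8), "Smith, R.W.", 3),
    (date(2017, 6, 7), "Jones, A.", 5),
]
totals = tally(rows)
```

Each key in `totals` corresponds to one SUMIFS cell, so names that appear on several dates get one entry per date.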

Limiting data on monthly basis from start date to system date dynamically in Tibco spotfire

I've tried limiting data on monthly basis in spotfire and it's working fine.
Now I'm trying to get the records from the start of the current month up to the current date.
For example, if the current date is Sept 21, then I should get the records from Sept 01 to Sept 21 (dynamically).
I have a property control to input the number of months.
The easiest way to do this is with Month and Year. For example, in your visualization:
Right Click > Properties > Data > Limit Data Using Expressions (Edit)
Then, use this expression:
Month([TheDate]) = Month(DateTimeNow()) and Year([TheDate]) = Year(DateTimeNow())
This will limit the data to only those rows with the current Year/Month combination in your data column. Just replace [TheDate] with whatever your date column name is.
In other places, you can wrap this in an IF statement if you'd like. It's redundant in this case, but sometimes helps with readability.
IF(Month([TheDate]) = Month(DateTimeNow()) and Year([TheDate]) = Year(DateTimeNow()),TRUE,FALSE)
@san - Adding to @scsimon's answer. If you would like to precisely limit values between the 1st of the current month and the current date, you could add the below expression to the 'Limit data using expression' section.
[Date]>=date(1&'-'&Month(DateTimeNow())&'-'&year(DateTimeNow())) and [Date]<=DateTimeNow()
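The month-to-date window used above can be checked with a small sketch. This is Python, not Spotfire expression language, and the function name `in_month_to_date` and the sample dates are assumptions for illustration only.

```python
from datetime import date

def in_month_to_date(d, today):
    """True when d falls between the 1st of today's month and today, inclusive."""
    month_start = today.replace(day=1)
    return month_start <= d <= today

# Hypothetical "current date" of Sept 21 and a few sample rows.
today = date(2023, 9, 21)
dates = [date(2023, 8, 31), date(2023, 9, 1), date(2023, 9, 21), date(2023, 9, 22)]
kept = [d for d in dates if in_month_to_date(d, today)]
```

Only the rows from Sept 01 through Sept 21 survive; the last day of August and any future date in September are excluded.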

Why is SQL Server returning a different order when using 'month' in 'where'?

I run a procedure call that calculates sums into table rows. At first I thought the procedure was not working as expected, so I wasted half a day trying to fix what actually works fine.
Later I actually took a look at the SELECT that gets the data on screen and was surprised by this:
YEAR(M.date) = 2016
--and MONTH(M.date) = 2
and
YEAR(M.date) = 2016
and MONTH(M.date) = 2
So the second example returns a different sorting than the first.
The thing is I do calculations on the whole year. Display data on year + month parameters.
Can someone explain why this is happening and how to avoid this?
In my procedure that calls the SELECT for on screen data I have it implemented like so:
and (@month = 0 or (month(M.date) = @month))
and year(M.date) = @year
So the month parameter is optional if the user wants to see the data for the whole year and year parameter is mandatory.
You are ordering by the date column. However, the date column is not unique -- multiple rows have the same date. The ORDER BY returns these in arbitrary order. In fact, you might get a different ordering for the same query running at different times.
To "fix" this, you need to include another column (or columns) that is unique for each row. In your case, that would appear to be the id column:
order by date, id
Another way to think about this is that in SQL the sorts are not stable. That is, they do not preserve the original ordering of the data. This is easy to remember, because there is no "original ordering" for a table or result set. Remember, tables represent unordered sets.
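A small sketch of why the tie-breaker matters, written in Python with made-up rows. Python's own sort is stable, so the two different input orders below stand in for the database being free to read the rows in any order.

```python
# Hypothetical rows: two of them share the same date.
rows_a = [
    {"id": 2, "date": "2016-02-01"},
    {"id": 1, "date": "2016-02-01"},
    {"id": 3, "date": "2016-01-15"},
]
# Same rows, but arriving in a different physical order.
rows_b = list(reversed(rows_a))

# Ordering by date alone: rows with equal dates keep whatever order they arrived in.
by_date_a = [r["id"] for r in sorted(rows_a, key=lambda r: r["date"])]
by_date_b = [r["id"] for r in sorted(rows_b, key=lambda r: r["date"])]

# Ordering by (date, id): the unique id breaks every tie, so the result is fixed.
by_date_id_a = [r["id"] for r in sorted(rows_a, key=lambda r: (r["date"], r["id"]))]
by_date_id_b = [r["id"] for r in sorted(rows_b, key=lambda r: (r["date"], r["id"]))]
```

Here `by_date_a` and `by_date_b` disagree on the order of ids 1 and 2, while the two `(date, id)` orderings are identical, which is the effect of `order by date, id`.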

Conditional mean calculation in excel

I have a dataset organized as follows:
The column A is the name
The column B is the date
The column C is the value registered for that person in that day
How can I calculate, for the whole dataset, the mean of each person's values over the past 30 days, without manually sorting by name and checking the dates?
Try the AVERAGEIFS function with the EDATE function giving you a one month window.
=AVERAGEIFS(C:C, A:A, "Jack", B:B, ">"&EDATE(TODAY(), -1), B:B, "<="&TODAY())
    
You can use a nested array formula (Ctrl+Shift+Enter instead of Enter):
=AVERAGE(IF($A$2:$A$15=A2,IF($B$2:$B$15>=TODAY()-30,$C$2:$C$15,""),""))
Add columns for these two formulas:
=COUNTIF(A1:A4,"Jack")
=SUMIF(A1:A4,"Jack",C1:C4)
That will give you the count, and it will give you the sum. With those two you can calculate the mean.
That's the basic idea, anyway.
Of course, you can add another count for the date ranges. It's all the same sort of thing.
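The conditional mean itself (filter by name, filter by a 30-day window, then average) can be sketched as follows. This is Python rather than an Excel formula; the function name `mean_last_30_days`, the sample rows, and the fixed "today" are assumptions for illustration.

```python
from datetime import date, timedelta

def mean_last_30_days(rows, name, today):
    """Average the values for one name over the 30 days up to and including today."""
    cutoff = today - timedelta(days=30)
    values = [value for n, d, value in rows
              if n == name and cutoff <= d <= today]
    # Return None rather than dividing by zero when nothing matches.
    return sum(values) / len(values) if values else None

# Hypothetical rows: (name, date, value), matching columns A, B and C.
today = date(2024, 3, 31)
rows = [
    ("Jack", date(2024, 3, 30), 10.0),
    ("Jack", date(2024, 3, 5), 20.0),
    ("Jack", date(2024, 1, 1), 100.0),   # older than 30 days, ignored
    ("Jill", date(2024, 3, 29), 7.0),
]
jack_mean = mean_last_30_days(rows, "Jack", today)
```

No sorting is needed: each row is tested against both conditions independently, which is also how AVERAGEIFS evaluates its criteria.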

How can I query just the month and day of a DATE column?

I have a date of birth DATE column in a customer table with ~13 million rows. I would like to query this table to find all customers who were born on a certain month and day of that month, but any year.
Can I do this by casting the date into a char and doing a subscript query on the cast, or should I create an additional char column, update it to hold just the month and day, or create three new integer columns to hold month, day and year, respectively?
This will be a very frequently used query criteria...
EDIT:... and the table has ~13 million rows.
Can you please provide an example of your best solution?
If it will be frequently used, consider a 'functional index'. Searching on that term at the Informix 11.70 InfoCentre produces a number of relevant hits.
You can use:
WHERE MONTH(date_col) = 12 AND DAY(date_col) = 25;
You can also play games such as:
WHERE MONTH(date_col) * 100 + DAY(date_col) = 1225;
This might be more suitable for a functional index, but isn't as clear for everyday use. Note that in the absence of a functional index, invoking functions on a column in the criterion means that an index is unlikely to be used.
You could easily write a stored procedure too:
CREATE FUNCTION mmdd(date_val DATE DEFAULT TODAY) RETURNING SMALLINT AS mmdd;
RETURN MONTH(date_val) * 100 + DAY(date_val);
END FUNCTION;
And use it as:
WHERE mmdd(date_col) = 1225;
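To see what the month * 100 + day key does, here is the same arithmetic as a small Python sketch. The sample birthdays are made up, and this only illustrates the encoding, not Informix behaviour or index usage.

```python
from datetime import date

def mmdd(d):
    """Encode month and day as one integer, e.g. 25 December -> 1225."""
    return d.month * 100 + d.day

# Hypothetical dates of birth in different years.
birthdays = [date(1980, 12, 25), date(1995, 12, 25), date(2001, 1, 25)]

# Matching on the key ignores the year entirely.
matches = [d for d in birthdays if mmdd(d) == 1225]
```

Because the day never exceeds 31, multiplying the month by 100 keeps every month/day pair distinct, so a single integer comparison is enough.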
Depending on how frequently you do this and how fast it needs to run you might think about splitting the date column into day, month and year columns. This would make search faster but cause all sorts of other problems when you want to retrieve a whole date (and also problems in validating that it is a date) - not a great idea.
Assuming speed isn't a problem I would do something like:
select *
FROM Table
WHERE Month(*DateOfBirthColumn*) = *SomeMonth* AND DAY(*DateOfBirthColumn*) = *SomeDay*
I don't have Informix in front of me at the moment but I think the syntax is right.