SAS intersect with date values - sql

I have two tables: one created through a connection to Teradata and the other created by importing an Excel file.
I need to find the records that are in one table but not in the other, based on three fields. With the first two fields the result is what I would expect, but when I add the third field, which is a date, none of the records match. When I look at the tables the two dates appear identical, but SAS does not treat them as identical.
My code looks like this:
PROC SQL;
CREATE TABLE fields_that_do_not_match AS
SELECT /*FIRST TWO FIELDS*/ date_a FROM table_a
EXCEPT
SELECT /*FIRST TWO FIELDS*/ date_b FROM table_b;
QUIT;
Is there something else I should consider when comparing dates?
When I look at the properties of the date fields, both use the DATE9. format, are numeric, and are 8 bytes long. Both dates show 14FEB2022 when I query the tables, but I don't know whether one of the tables holds additional information that is not displayed because of the format.
Thanks in advance.

The dates did not match because one of them had a decimal (fractional) part, which the DATE9. format does not display, so I needed to truncate the values to whole days. I added INT(DATE_A) and it worked after that.
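The effect can be illustrated outside SAS. A SAS date is a count of days since 1 January 1960, so a value carrying a fraction prints as the same day but is not equal to the whole-day value. The Python sketch below is only an illustration, not SAS code; the 0.25 fraction and the variable names are assumptions made for the example.

```python
from datetime import date

SAS_EPOCH = date(1960, 1, 1)  # SAS date values count days from this day


def sas_date(d):
    """Return the SAS-style day number for a calendar date."""
    return (d - SAS_EPOCH).days


# Hypothetical values: date_a carries a fractional part, date_b does not.
table_a = {sas_date(date(2022, 2, 14)) + 0.25}
table_b = {sas_date(date(2022, 2, 14))}

# Comparing the raw values: the fractional value finds no match.
raw_difference = table_a - table_b

# Truncating to whole days first (what INT() does in SAS) removes the mismatch.
truncated_difference = {int(v) for v in table_a} - {int(v) for v in table_b}

print(len(raw_difference), len(truncated_difference))
```

In the PROC SQL query the same idea means selecting INT(date_a) rather than date_a, and applying it to both sides if either could hold fractions.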

Related

Null values not being returned in Postgresql

I have two tables in PostgreSQL that I need to join while keeping the rows that have no match (the null values), so that I can fill in other values in a column of the joined result.
Table one:
I filtered by date, because this data is generated daily and I only need the current_date
Table two: All names.
In table two I have 9 names that are not found in table one.
When I try to perform the join, I only get the 9 names from table one as a result.
That happens when I filter the date from table one to current_date.
If I don't filter table one by date, the null value is returned, that is, the name that is in table two but not in table one.
What I need is to join the two tables and where there is no asset referring to the second table, fill it with 0 (zero).
In this part I understood that I must use COALESCE(vcm.ativo,0).
But first I need the names of the second table to appear as well.
The result should be like this:
If someone could help me, I'll be grateful.
As pointed out in a comment by the asker, the solution turned out to be
with todays_data as (
select vcm.cooperativa, vcm.ativo
from sga_bi.veiculos_coop_mensal as vcm
where data = current_date
)
select coop.nome, COALESCE(vcmm.ativo,0)
from sga.cooperativas as coop
left outer join todays_data as vcmm
on coop.nome = vcmm.cooperativa
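The asker's original query is not shown, so the following is an assumption about the cause: a date filter placed in the WHERE clause after a LEFT JOIN discards the unmatched rows, whereas filtering before the join (as the CTE above does) keeps them. The sketch reproduces both variants with Python's built-in `sqlite3` module rather than PostgreSQL; the sample rows and the fixed `TODAY` value are made up for illustration.

```python
import sqlite3

TODAY = "2022-02-14"  # fixed stand-in for current_date so the example is repeatable

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cooperativas (nome TEXT)")
conn.execute(
    "CREATE TABLE veiculos_coop_mensal (cooperativa TEXT, ativo INTEGER, data TEXT)"
)
conn.executemany("INSERT INTO cooperativas VALUES (?)", [("A",), ("B",)])
# 'A' has a row for today; 'B' only has a row from the day before.
conn.executemany(
    "INSERT INTO veiculos_coop_mensal VALUES (?, ?, ?)",
    [("A", 5, TODAY), ("B", 7, "2022-02-13")],
)

# Filter in WHERE: applied after the join, so 'B' is dropped.
filtered_in_where = conn.execute(
    """
    SELECT coop.nome, COALESCE(vcm.ativo, 0)
    FROM cooperativas AS coop
    LEFT JOIN veiculos_coop_mensal AS vcm ON coop.nome = vcm.cooperativa
    WHERE vcm.data = ?
    ORDER BY coop.nome
    """,
    (TODAY,),
).fetchall()

# Filter inside the CTE: applied before the join, so 'B' survives with 0.
filtered_before_join = conn.execute(
    """
    WITH todays_data AS (
        SELECT cooperativa, ativo FROM veiculos_coop_mensal WHERE data = ?
    )
    SELECT coop.nome, COALESCE(t.ativo, 0)
    FROM cooperativas AS coop
    LEFT JOIN todays_data AS t ON coop.nome = t.cooperativa
    ORDER BY coop.nome
    """,
    (TODAY,),
).fetchall()

print(filtered_in_where)
print(filtered_before_join)
```

Moving the date condition into the ON clause of the LEFT JOIN should have the same effect as the CTE, since it also restricts the right-hand side before unmatched rows are filled with NULL.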

I need to retrieve a column from already retrieved column from a table

I have a table with a column like the one below, which I retrieved with this query:
select distinct HDD_WP_RPTNG_AS_OF_SID
from wcadbo.WCA_MDW_D_HLDNGS_DATE
order by HDD_WP_RPTNG_AS_OF_SID desc;
Table:
HDD_WP_RPTNG_AS_OF_SID
20210501
20210430
20210429
20210428
It contains dates in integer format.
I wrote a query to retrieve another column of these dates in date format and I named column as AS_OF_DATE - like this:
SELECT DISTINCT
HDD_WP_RPTNG_AS_OF_SID,
to_date(HDD_WP_RPTNG_AS_OF_SID,'YYYYMMDD') AS_OF_DATE
FROM
WCADBO.WCA_MDW_D_HLDNGS_DATE
ORDER BY
HDD_WP_RPTNG_AS_OF_SID DESC;
Result set:
HDD_WP_RPTNG_AS_OF_SID AS_OF_DATE
----------------------------------
20210501 01-MAY-21
20210430 30-APR-21
20210429 29-APR-21
20210428 28-APR-21
Now I need another column, Display_Date, of char type, which gives LastAvailableDate for the latest date in the previous column and gives the date as a string for all other dates.
I wrote this query, but it is not working:
SELECT
HDD_WP_RPTNG_AS_OF_SID,
AS_OF_DATE,
Display_date
FROM
(SELECT DISTINCT
HDD_WP_RPTNG_AS_OF_SID,
to_date(HDD_WP_RPTNG_AS_OF_SID,'YYYYMMDD') AS_OF_DATE
FROM
WCADBO.WCA_MDW_D_HLDNGS_DATE
ORDER BY
HDD_WP_RPTNG_AS_OF_SID DESC)
WHERE
Display_Date = (CASE
WHEN AS_OF_DATE = '01-MAY-21'
THEN 'Last_Available_date'
ELSE TO_CHAR(AS_OF_DATE, 'MON DD YYYY')
END);
In the end I need three columns. One is already in the table, slightly modified. The other two are temporary ones (AS_OF_DATE and Display_Date) that I need to compute.
I'm a beginner in SQL and couldn't figure out how to derive a column from another temporary column.
Kindly help. Thank you.
BTW, I am doing this in Oracle SQL Developer.
It looks like you want something like this
SELECT
subQ.HDD_WP_RPTNG_AS_OF_SID,
subQ.AS_OF_DATE,
(CASE
WHEN subQ.AS_OF_DATE = date '2021-05-01'
THEN 'Last_Available_date'
ELSE TO_CHAR(subQ.AS_OF_DATE, 'MON DD YYYY')
END) Display_date
FROM
(SELECT DISTINCT
tbl.HDD_WP_RPTNG_AS_OF_SID,
to_date(tbl.HDD_WP_RPTNG_AS_OF_SID,'YYYYMMDD') AS_OF_DATE
FROM
WCADBO.WCA_MDW_D_HLDNGS_DATE tbl) subQ
ORDER BY
subQ.HDD_WP_RPTNG_AS_OF_SID DESC
Comments
If you want to add computed columns, that is done in the projection (the select list)
Always compare dates to dates and strings to strings. So in your case statement, compare the date as_of_date against another date. In this case I'm using a date literal. You could also call to_date on a string parameter.
If you want the results of the query ordered by a particular column, you want that order by applied at the outermost layer of the query, not in an inline view.
You basically always want to use aliases when referring to any column in a query. It's less critical in situations where everything is coming from one table, but as soon as you start referencing multiple tables in a query, it becomes annoying to look at a query and not be sure where a column is coming from. Even in a query like this, where there is an inline view and an outer query, it makes the query easier to read if you're explicit about where the columns are coming from.
Do you really need the distinct? I kept it because it was part of the original query, but I get antsy whenever I see a distinct, particularly from people learning SQL. Doing a distinct is a relatively expensive operation, and it is very commonly used to cover up an underlying issue in the query (i.e. that you're getting multiple rows because some other column you aren't showing has multiple values) that ought to be addressed correctly (i.e. by adding an additional predicate to ensure that you're only getting each hdd_wp_rptng_as_of_sid once).
Storing dates as strings in tables (as is apparently done with hdd_wp_rptng_as_of_sid) is a really bad practice. If one person writes one row to the table where the string isn't in the right format, your query will suddenly stop working and start throwing errors, for example.
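The first and third comments (compute the new column in the select list, and order at the outermost level) can be tried on a small scale. This sketch uses Python's built-in `sqlite3` module, not Oracle, so `to_date` and `to_char` are not available and the date formatting is left out. As a variation on the answer, it finds the latest value with `MAX` instead of a hard-coded date; the table name `hldngs_date` and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hldngs_date (as_of_sid INTEGER)")
# Hypothetical rows, including one duplicate so DISTINCT has something to do.
conn.executemany(
    "INSERT INTO hldngs_date VALUES (?)",
    [(20210501,), (20210430,), (20210430,), (20210429,)],
)

rows = conn.execute(
    """
    SELECT sub.as_of_sid,
           -- computed column lives in the select list, not in WHERE
           CASE
               WHEN sub.as_of_sid = (SELECT MAX(as_of_sid) FROM hldngs_date)
               THEN 'Last_Available_date'
               ELSE CAST(sub.as_of_sid AS TEXT)
           END AS display_date
    FROM (SELECT DISTINCT as_of_sid FROM hldngs_date) AS sub
    ORDER BY sub.as_of_sid DESC  -- ordering applied at the outermost level
    """
).fetchall()

for row in rows:
    print(row)
```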

Combine two columns of table into one in SQL

I would like to combine two columns (each from a different table) into one column.
As shown below, both are expiry dates and I would like them to be combined. Only one of the two will be present in a given row: if one is present, the other is not. At times neither is present. I have looked at CONCAT in SQL, but it is used to concatenate values.
I need some guidance on this.
If you are using SQL Server and can update the blanks in the Expiry column to NULLs, then you can do this:
ISNULL(Expiry,Expiration_date)
Check whether the first one exists and take it; otherwise take the second one, like below:
select if(Expiry!='',Expiry, Expiration_date) as expiry from table
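Both answers can be folded into one expression that handles blanks and NULLs together: `NULLIF` turns an empty string into NULL and `COALESCE` then picks the first non-NULL value. The sketch below runs it with Python's built-in `sqlite3` module; the `items` table, its single-table layout, and the sample rows are assumptions made for illustration, since the question's actual tables are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER, expiry TEXT, expiration_date TEXT)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (1, "2022-01-31", None),  # only the first column is set
        (2, "", "2022-06-30"),    # first column is blank, second is set
        (3, None, "2022-09-15"),  # first column is NULL, second is set
        (4, None, None),          # neither is set
    ],
)

rows = conn.execute(
    """
    SELECT id,
           COALESCE(NULLIF(expiry, ''), NULLIF(expiration_date, '')) AS expiry
    FROM items
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

When neither column holds a value the result is NULL, which leaves it to the caller to decide how a missing date should be shown.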

Postgres copy to select * with a type cast

I have a pair of SQL tables in Postgres: a staging table and the main table. Among the various reasons for the staging table, the data I am uploading has irregular and differing formats in all of the date columns. During the upload these values go into the staging table as varchars so they can be manipulated into usable formats.
In the main table the date fields are of type 'date'; in the staging table they are of type varchar.
The question is, does postgres support a copy expression similar to
insert into production_t select *,textdate::date from staging_t
I need to change the format of a single field during the copy process. I know I can type out all of the column names in the insert and typecast the date columns there, but this table has over 200 columns and is one of 10 tables with similar issues. I want to accomplish this insert plus typecast in one line that I can apply to all tables, rather than having to type 2000+ lines of SQL queries.
You have to write every column of such a query, there is no shorthand.
May I say that a design with 200 columns is questionable.
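Since every column has to be listed, one way to avoid typing the statement by hand is to generate it. The helper below is a hypothetical sketch, not a Postgres feature: `build_insert`, the column names `id` and `name`, and the cast mapping are all invented for the example. In practice the column list could be read from `information_schema.columns`. Identifiers are inserted as-is, so it assumes plain names that need no quoting.

```python
def build_insert(target, source, columns, casts):
    """Build an INSERT ... SELECT that lists every column and casts some.

    columns: ordered column names shared by both tables.
    casts:   mapping of column name -> type, for columns that need a cast.
    """
    select_items = [
        f"{col}::{casts[col]}" if col in casts else col
        for col in columns
    ]
    return (
        f"INSERT INTO {target} ({', '.join(columns)}) "
        f"SELECT {', '.join(select_items)} FROM {source}"
    )


# Hypothetical three-column table; a real one would pass all 200+ names.
sql = build_insert(
    "production_t",
    "staging_t",
    ["id", "name", "textdate"],
    {"textdate": "date"},
)
print(sql)
```

The same function can then be called once per table with that table's column list and date columns.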

Filter Rows - Pentaho

We are getting inputs from two different tables and passing them to a Filter rows step.
But we are getting the error below.
The DATE_ADDED table has only one column, DATE_ADDED, and similarly the TODAYS_DATE table has a single column, TODAYS_DATE.
The condition given in the filter is DATE_ADDED < TODAYS_DATE.
The transformation is
Can someone tell me where I am making the mistake?
It won't work like this. You expect a join of two streams (like SQL JOIN of two tables) but actually you will have a union (like SQL UNION).
When two streams meet at a step they must have identical columns (names, order and types), and the result will be the union of both streams with the same structure as the inputs.
When you combine streams with different structures (different column names, in your case) you will get unpredictable column names and, in effect, only one column, so there is nothing to compare with.
To do what you need, use the Merge Join step (do not forget to sort the streams on the joining key).
Both the column names and the types should be identical if you want to merge the columns in a single step. Right-click each step and choose the output fields option to verify the data types.
If data type issues arise, or you want to rename the columns, you can place a select step after each table step, set the Date type (in your case) in the Meta-data tab, and rename the fields there as well.
Hope this helps... :)