Date column showing duplicate records even after using to_date(to_char(JOB_CLOSED_DATE,'dd-mon-yy')) - sql

I have an Oracle query that should remove duplicate records from a date column that also contains a time component. Because of the time component, duplicate records show up when I include other columns along with the date column. Please see the attached image from Power BI. Is there any way to get rid of the duplicate records?
Select distinct
to_date(to_char(JOB_CLOSED_DATE,'dd-mon-yy'))
From DWH_FACT_DISCRETE_JOB_WIP

First, implicit casting from char to date is very dangerous in Oracle: to_date() without a format mask depends on the session's NLS_DATE_FORMAT setting, which can result in hard-to-find bugs.
Second, use the trunc() function instead of to_char() to get the date without the time.
Select distinct trunc(JOB_CLOSED_DATE)
From DWH_FACT_DISCRETE_JOB_WIP
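To see the effect of dropping the time part before applying DISTINCT, here is a minimal sketch in Python using the built-in sqlite3 module. It is not Oracle: SQLite's date() stands in for Oracle's trunc(), and the table name job_wip and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: three timestamps spread over two calendar days.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_wip (job_closed_date TEXT)")
conn.executemany(
    "INSERT INTO job_wip VALUES (?)",
    [("2022-02-14 08:15:00",), ("2022-02-14 17:40:00",), ("2022-02-15 09:00:00",)],
)

# DISTINCT on the raw timestamps keeps every row, because the times differ.
raw_rows = conn.execute(
    "SELECT DISTINCT job_closed_date FROM job_wip ORDER BY 1"
).fetchall()

# date() drops the time part (SQLite's stand-in for Oracle's trunc()),
# so DISTINCT now collapses rows that fall on the same day.
day_rows = conn.execute(
    "SELECT DISTINCT date(job_closed_date) FROM job_wip ORDER BY 1"
).fetchall()

print(len(raw_rows), [row[0] for row in day_rows])
```

Note that if other columns are selected alongside the date, rows still repeat whenever those columns differ; truncation only removes the differences caused by the time part.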

Related

How to add column to an existing table and calculate the value

I want to add a new column that holds the difference between consecutive alarmTime values, using this code:
ALTER TABLE [DIALinkDataCenter].[dbo].[DIAL_deviceHistoryAlarm]
ADD dif AS (DATEDIFF(HOUR, LAG((alarmTime)) OVER (ORDER BY (alarmTime)), (alarmTime)));
How can I add this calculation to the table? I always get an error like this:
Windowed functions can only appear in the SELECT or ORDER BY clauses.
You are using the syntax for a computed column, which shows a calculated value (ADD columnname AS expression).
This, however, only works on values found in the same row. You cannot have a computed column that looks at other rows.
You might now consider creating a normal column and filling it with calculated values, but this is something you shouldn't do. Don't store values redundantly. You can always get the difference in an ad-hoc query. If you store it redundantly instead, you will have to account for it in every insert, update, and delete. And if at some point you find rows where the difference doesn't match the time values, which column holds the correct value and which the incorrect one, alarmtime or dif? You won't be able to tell.
What you can do instead is create a view for convenience:
create view v_dial_devicehistoryalarm as
select
dha.*,
datediff(hour, lag(alarmtime) over (order by alarmtime), alarmtime) as dif
from dial_devicehistoryalarm dha;
Demo: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=b7f9b5eef33e72955c7f135952ef55b5
Remember, though, that your view will probably read and sort the whole table every time you access it. If you query only a certain time range, it will therefore be faster to calculate the differences in that query instead.
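The view approach can be tried out locally with a small Python sketch using the built-in sqlite3 module. This is an illustration under assumptions, not the SQL Server code: the table and view names are made up, window functions need SQLite 3.25 or newer, and because SQLite has no DATEDIFF the gap is computed as elapsed whole hours from epoch seconds (SQL Server's DATEDIFF(HOUR, ...) counts hour boundaries crossed, which can give a different number).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE device_history_alarm (alarm_time TEXT)")
conn.executemany(
    "INSERT INTO device_history_alarm VALUES (?)",
    [("2022-02-14 08:00:00",), ("2022-02-14 10:30:00",), ("2022-02-14 15:30:00",)],
)

# The view computes the gap to the previous alarm on every read; nothing is stored.
# Elapsed whole hours are derived from epoch seconds because SQLite has no DATEDIFF.
conn.execute("""
    CREATE VIEW v_device_history_alarm AS
    SELECT
        alarm_time,
        (CAST(strftime('%s', alarm_time) AS INTEGER)
         - LAG(CAST(strftime('%s', alarm_time) AS INTEGER)) OVER (ORDER BY alarm_time)
        ) / 3600 AS dif
    FROM device_history_alarm
""")

rows = conn.execute(
    "SELECT alarm_time, dif FROM v_device_history_alarm ORDER BY alarm_time"
).fetchall()
for alarm_time, dif in rows:
    print(alarm_time, dif)
```

The first row has no predecessor, so its dif is NULL (None in Python), just as LAG returns NULL for the first row in SQL Server.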

SAS intersect with date values

I have two tables: one was created using a connection to Teradata and the other was created by importing an Excel file.
I need to find the records that are in one table but not in the other, based on three of the fields. The first two fields return what I would expect, but when I add the third field, which is a date field, none of the records match. When I look into the tables, the two dates appear identical, but somehow SAS does not consider them to be identical.
My code looks like this:
PROC SQL;
CREATE TABLE fields_that_do_not_match AS
SELECT /*FIRST TWO FIELDS*/ date_a FROM table_a
EXCEPT
SELECT /*FIRST TWO FIELDS*/ date_b FROM table_b;
QUIT;
Is there something else I should be considering when comparing dates?
When I look at the properties of the date fields, both use the DATE9. format, are of numeric type, and are 8 bytes long. Both dates show 14FEB2022 when I query the tables, but I don't know whether one of the tables holds additional information that is not displayed because of the format.
Thanks in advance.
The dates did not match because one of them had decimal (fractional) values, which needed to be dropped. I used INT(DATE_A), and it worked after that.
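The effect can be reproduced outside SAS with a short Python sketch. It relies on the fact that SAS date values count days since 1 January 1960, and it assumes the DATE9. format displays only the day part; the keys "k1" and "k2" and the helper as_date9 are made up for illustration.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)  # SAS date values count days from this date

def as_date9(sas_value):
    """Mimic a DATE9.-style display: the day only, e.g. 14FEB2022."""
    day = SAS_EPOCH + timedelta(days=int(sas_value))
    return day.strftime("%d%b%Y").upper()

whole_day = (date(2022, 2, 14) - SAS_EPOCH).days  # integer day number
with_fraction = whole_day + 0.5                   # same day, hidden fractional part

table_a = {("k1", "k2", with_fraction)}  # hypothetical rows: two keys plus the date
table_b = {("k1", "k2", whole_day)}

# Both values display identically, yet the EXCEPT-style difference is not empty.
unmatched_raw = table_a - table_b

# Dropping the fraction, as INT(DATE_A) does, makes the rows compare equal.
unmatched_fixed = {(a, b, int(d)) for a, b, d in table_a} - table_b

print(as_date9(with_fraction), as_date9(whole_day), len(unmatched_raw), len(unmatched_fixed))
```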

Fetching repeated data from database without repetition

I have a table in my database like this.
I need to fetch the data so that each date appears only once, with the variety and the number of grafts summed up.
I get this instead: result
I think this is the query you are looking for:
select sum(variety), sum(number_grafts), date
from <table_name>
group by date
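Since the table image is not available, it is unclear whether variety is numeric. If it is a text label (an assumption), summing it is probably not what is wanted, and grouping by both date and variety may be closer to the goal. A minimal sketch in Python with the built-in sqlite3 module, using a made-up table grafts and made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grafts (variety TEXT, number_grafts INTEGER, graft_date TEXT)"
)
conn.executemany(
    "INSERT INTO grafts VALUES (?, ?, ?)",
    [
        ("A", 10, "2021-05-01"),
        ("A", 5, "2021-05-01"),
        ("B", 7, "2021-05-01"),
        ("A", 3, "2021-05-02"),
    ],
)

# One row per (date, variety) pair, with the graft counts added up.
totals = conn.execute("""
    SELECT graft_date, variety, SUM(number_grafts) AS total_grafts
    FROM grafts
    GROUP BY graft_date, variety
    ORDER BY graft_date, variety
""").fetchall()

for row in totals:
    print(row)
```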

I need to retrieve a column from already retrieved column from a table

I have a table with a column like the one below, which I retrieved with this query:
select distinct HDD_WP_RPTNG_AS_OF_SID
from wcadbo.WCA_MDW_D_HLDNGS_DATE
order by HDD_WP_RPTNG_AS_OF_SID desc;
Table:
HDD_WP_RPTNG_AS_OF_SID
20210501
20210430
20210429
20210428
It contains dates in integer format.
I wrote a query to retrieve another column of these dates in date format and I named column as AS_OF_DATE - like this:
SELECT DISTINCT
HDD_WP_RPTNG_AS_OF_SID,
to_date(HDD_WP_RPTNG_AS_OF_SID,'YYYYMMDD') AS_OF_DATE
FROM
WCADBO.WCA_MDW_D_HLDNGS_DATE
ORDER BY
HDD_WP_RPTNG_AS_OF_SID DESC;
Result set:
HDD_WP_RPTNG_AS_OF_SID AS_OF_DATE
----------------------------------
20210501 01-MAY-21
20210430 30-APR-21
20210429 29-APR-21
20210428 28-APR-21
Now I need another column, Display_Date, of char type, which shows LastAvailableDate for the latest date in the previous column and the date as a string for all other dates.
I wrote this query, but it is not working:
SELECT
HDD_WP_RPTNG_AS_OF_SID,
AS_OF_DATE,
Display_date
FROM
(SELECT DISTINCT
HDD_WP_RPTNG_AS_OF_SID,
to_date(HDD_WP_RPTNG_AS_OF_SID,'YYYYMMDD') AS_OF_DATE
FROM
WCADBO.WCA_MDW_D_HLDNGS_DATE
ORDER BY
HDD_WP_RPTNG_AS_OF_SID DESC)
WHERE
Display_Date = (CASE
WHEN AS_OF_DATE = '01-MAY-21'
THEN 'Last_Available_date'
ELSE TO_CHAR(AS_OF_DATE, 'MON DD YYYY')
END);
In the end I need three columns: one that is already in the table, slightly modified, and two temporary ones (AS_OF_DATE and Display_Date) that I need to derive.
I'm a beginner in SQL and couldn't figure out how to derive a column from another temporary column.
Kindly help. Thank you.
By the way, I am doing this in Oracle SQL Developer.
It looks like you want something like this:
SELECT
subQ.HDD_WP_RPTNG_AS_OF_SID,
subQ.AS_OF_DATE,
(CASE
WHEN subQ.AS_OF_DATE = date '2021-05-01'
THEN 'Last_Available_date'
ELSE TO_CHAR(subQ.AS_OF_DATE, 'MON DD YYYY')
END) Display_date
FROM
(SELECT DISTINCT
tbl.HDD_WP_RPTNG_AS_OF_SID,
to_date(tbl.HDD_WP_RPTNG_AS_OF_SID,'YYYYMMDD') AS_OF_DATE
FROM
WCADBO.WCA_MDW_D_HLDNGS_DATE tbl) subQ
ORDER BY
subQ.HDD_WP_RPTNG_AS_OF_SID DESC
Comments
If you want to add computed columns, that is done in the projection (the select list).
Always compare dates to dates and strings to strings. So in your case statement, compare the date as_of_date against another date. In this case I'm using a date literal. You could also call to_date on a string parameter.
If you want the results of the query ordered by a particular column, you want that order by applied at the outermost layer of the query, not in an inline view.
You basically always want to use aliases when referring to any column in a query. It's less critical in situations where everything is coming from one table, but as soon as you start referencing multiple tables in a query, it becomes annoying to look at a query and not be sure where a column is coming from. Even in a query like this, where there is an inline view and an outer query, it is easier to read the query if you're explicit about where the columns are coming from.
Do you really need the distinct? I kept it because it was a part of the original query, but I get antsy whenever I see a distinct, particularly from people learning SQL. Doing a distinct is a relatively expensive operation, and it is very commonly used to cover up an underlying issue in the query (i.e. that you're getting multiple rows because some other column you aren't showing has multiple values) that ought to be addressed correctly (i.e. by adding an additional predicate to ensure that you're only getting each hdd_wp_rptng_as_of_sid once).
Storing dates as strings or numbers in tables (as is apparently done with hdd_wp_rptng_as_of_sid) is a really bad practice. If one person writes one row to the table where the value isn't in the right format, your query will suddenly stop working and start throwing errors, for example.
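The derivation of the two extra columns can also be sketched in plain Python, which makes the logic easy to check by hand. This is only an illustration: unlike the query above, it derives the latest date from the data instead of hard-coding 2021-05-01 (an assumption about the intent), and the function name display_rows is made up.

```python
from datetime import datetime

sids = [20210501, 20210430, 20210429, 20210428]  # sample values from the question

def display_rows(sid_values):
    """Return (sid, as_of_date, display_date) tuples, newest first."""
    latest = max(sid_values)  # YYYYMMDD integers sort in date order
    rows = []
    for sid in sorted(set(sid_values), reverse=True):
        # Equivalent of to_date(sid, 'YYYYMMDD'); raises ValueError on a bad value.
        as_of_date = datetime.strptime(str(sid), "%Y%m%d").date()
        if sid == latest:
            display = "Last_Available_date"
        else:
            # Equivalent of TO_CHAR(as_of_date, 'MON DD YYYY').
            display = as_of_date.strftime("%b %d %Y").upper()
        rows.append((sid, as_of_date, display))
    return rows

rows = display_rows(sids)
for row in rows:
    print(row)
```

The ValueError raised for a malformed value mirrors the last comment above: a single badly formatted row is enough to make the conversion fail.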

SQL Server 2012: MAX on varchar columns not working as expected

I am using aggregates like MAX on varchar columns. When I execute a simple select with MAX on a varchar column, I get a 4-character value, "övrö". In that column there are lots of values that are much longer. My understanding is that it should return a value like "asdfsadfasdfasdfsadfsdafsdafsadfasdfsadfasdf".
Please correct me if I am misunderstanding/misusing MAX, as I have to use this concept in a complicated SQL query to avoid duplicate records.
If MAX on varchar returns the top record in alphabetical order, how come it still produces duplicate values when I use it for specific columns (#2 and #3 in the attached image)? The red-bordered area shows the duplicates, which should be just a single row.
EDIT - ISSUE RESOLVED
The problem was that I had added the aggregated string columns to the GROUP BY clause. Once I removed them, the duplicates were gone.
Thanks for the help.
MIN and MAX return the smallest and largest values. For varchar, that is not based on length; imagine the values sorted alphabetically (according to the column's collation). MAX takes the last element while MIN takes the first.
If you want the maximum length, you need max(len(column)) in SQL Server.
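The difference between the largest value and the longest value can be checked with a small Python sketch using the built-in sqlite3 module. It is only an approximation of SQL Server: SQLite spells the function LENGTH() where SQL Server uses LEN(), it compares text byte-wise by default rather than by a SQL Server collation, and the table t and its rows are made up.

```python
import sqlite3

long_value = "asdfsadfasdfasdfsadfsdafsdafsadfasdfsadfasdf"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)", [("övrö",), (long_value,), ("abc",)]
)

# MAX(name) picks the value that sorts last; MAX(LENGTH(name)) is the longest length.
max_value, max_length = conn.execute(
    "SELECT MAX(name), MAX(LENGTH(name)) FROM t"
).fetchone()

# To get the longest string itself, order by length instead of by value.
longest_value = conn.execute(
    "SELECT name FROM t ORDER BY LENGTH(name) DESC LIMIT 1"
).fetchone()[0]

print(max_value, max_length, longest_value)
```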