DISTINCT doesn't work in this query - SQL

I want to display my data without duplicates in any of the columns.
I tried DISTINCT and GROUP BY, but it still doesn't work.
My SQL:
SELECT DISTINCT rl.NIK,date(rl.enroll),LEFT(TIME(rl.enroll),8) AS time
FROM RTattandenceLog rl, mstEmp e
WHERE DATE(rl.enroll)=CURDATE()-1 AND e.idDept=3 AND e.NIK=rl.NIK
This is the result:
The rows I crossed out should not be displayed.

Basically, you want an aggregation query with JOIN. I'm not sure why you are separating out the date/time into two columns instead of just using:
SELECT rl.NIK, DATE(rl.enroll),
CAST(MIN(TIME(rl.enroll)) as CHAR)
FROM RTattandenceLog rl JOIN
mstEmp e
ON e.NIK = rl.NIK
WHERE rl.enroll < CURDATE() AND
rl.enroll >= CURDATE() - INTERVAL 1 DAY AND
e.idDept = 3
GROUP BY rl.NIK, DATE(rl.enroll);
Notes:
This uses proper, explicit, standard, readable JOIN syntax. Never use commas in the FROM clause.
The date comparisons do not use DATE(). That makes the query more compatible with indexes and helps the optimizer.
There is no implicit conversion of a time value into a string. Not sure why a time is not good enough (date seems to be), but this explicitly converts to a string. Implicit conversions are the cause of both semantic errors and performance problems.
I don't understand why you would want to split the date and time into separate columns. Perhaps this is sufficient:
SELECT rl.NIK, MIN(rl.enroll)
FROM RTattandenceLog rl JOIN
mstEmp e
ON e.NIK = rl.NIK
WHERE rl.enroll < CURDATE() AND
rl.enroll >= CURDATE() - INTERVAL 1 DAY AND
e.idDept = 3
GROUP BY rl.NIK, DATE(rl.enroll);

Don't you actually want to create groups here based on the NIK and the enroll date? And show the earliest time for every group?
Something like this:
SELECT rl.NIK, date(rl.enroll), LEFT(TIME(MIN(rl.enroll)), 8) AS time
FROM RTattandenceLog rl, mstEmp e
WHERE DATE(rl.enroll) = CURDATE() - 1 AND e.idDept = 3 AND e.NIK = rl.NIK
GROUP BY rl.NIK, date(rl.enroll)
I do not use MariaDB myself, so I cannot test it. But it's pretty standard SQL syntax, so I assume it should work.
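The grouping idea from the answers above can be tried out quickly with Python's built-in sqlite3 module. This is only a sketch: the table name is borrowed from the question, the rows are made up, and SQLite's date() and time() functions stand in for the MariaDB ones.

```python
import sqlite3

# Hypothetical in-memory stand-in for RTattandenceLog; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE RTattandenceLog (NIK TEXT, enroll TEXT)")
conn.executemany(
    "INSERT INTO RTattandenceLog VALUES (?, ?)",
    [
        ("A01", "2020-01-10 07:58:12"),
        ("A01", "2020-01-10 07:58:40"),  # repeated scan, same employee and day
        ("A02", "2020-01-10 08:03:05"),
        ("A02", "2020-01-10 17:01:44"),
    ],
)

# One row per employee and day, keeping only the earliest scan.
first_scans = conn.execute(
    """
    SELECT NIK, date(enroll) AS day, time(MIN(enroll)) AS first_time
    FROM RTattandenceLog
    GROUP BY NIK, date(enroll)
    ORDER BY NIK
    """
).fetchall()

print(first_scans)
```

DISTINCT could not do this because the time column differs between the repeated scans, so every row is already distinct; grouping by employee and day and aggregating the time is what collapses them.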

Related

Optimization on large tables

I have the following query that joins two large tables. I am trying to join on patient_id and records that are not older than 30 days.
select * from
chairs c
join data id
on c.patient_id = id.patient_id
and to_date(c.from_date, 'YYYYMMDD') - to_date(id.from_date, 'YYYYMMDD') >= 0
and to_date (c.from_date, 'YYYYMMDD') - to_date(id.from_date, 'YYYYMMDD') < 30
Currently, this query takes 2 hours to run. What indexes can I create on these tables to make this query run faster?
I will take a shot in the dark, because as others said it depends on what the table structure, the indices, and the output of the planner are.
The most obvious thing here is that as long as it is possible, you want to represent dates as some date datatype instead of strings. That is the first and most important change you should make here. No index can save you if you transform strings. Because very likely, the problem is not the patient_id, it's your date calculation.
Other than that, forcing hash joins on the patient_id and then doing the filtering could help if for some reason the planner decided to do nested loops for that condition. But that is for after you fixed your date representation AND you still have a problem AND you see that the planner does nested loops on that attribute.
Some observations if you are stuck with string fields for the dates:
YYYYMMDD date strings are ordered and can be used for <,> and =.
Building strings from the data in chairs to use to JOIN on data will make good use of an index like one on data for patient_id, from_date.
So my suggestion would be to write expressions that build the date strings you want to use in the JOIN. Or to put it another way: do not transform the child table data from a string to something else.
Example expression that takes 30 days off a string date and returns a string date:
select to_char(to_date('20200112', 'YYYYMMDD') - INTERVAL '30 DAYS','YYYYMMDD')
Untested:
select * from
chairs c
join data id
on c.patient_id = id.patient_id
and id.from_date between to_char(to_date(c.from_date, 'YYYYMMDD') - INTERVAL '29 DAYS','YYYYMMDD') -- 29, not 30: BETWEEN is inclusive and the original difference is strictly below 30
and c.from_date
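The two observations about YYYYMMDD strings can be checked outside the database. The sketch below mirrors the to_char(to_date(...) - INTERVAL ...) expression in plain Python; the helper name shift_yyyymmdd is made up for this illustration.

```python
from datetime import datetime, timedelta


def shift_yyyymmdd(value: str, days: int) -> str:
    """Return the YYYYMMDD string that lies `days` days after `value`."""
    shifted = datetime.strptime(value, "%Y%m%d") + timedelta(days=days)
    return shifted.strftime("%Y%m%d")


lower = shift_yyyymmdd("20200112", -30)
print(lower)

# Plain string comparison agrees with date order because the format is
# fixed-width with the year first, then the month, then the day.
print(lower < "20200101" < "20200112")
```

Because the bounds are computed from the row in chairs, the column in data is compared as stored, which is what lets an index on it be used.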
For this query:
select *
from chairs c join data
id
on c.patient_id = id.patient_id and
to_date(c.from_date, 'YYYYMMDD') - to_date(id.from_date, 'YYYYMMDD') >= 0 and
to_date (c.from_date, 'YYYYMMDD') - to_date(id.from_date, 'YYYYMMDD') < 30;
You should start with indexes on (patient_id, from_date) -- you can put them in both tables.
The date comparisons are problematic. Storing the values as actual dates can help. But it is not a 100% solution because comparison operations are still needed.
Depending on what you are actually trying to accomplish there might be other ways of writing the query. I might encourage you to ask a new question, providing sample data, desired results, and a clear explanation of what you really want. For instance, this query is likely to return a lot of rows. And that just takes time as well.
Your query has a non-sargable predicate because it wraps the columns in functions that must be evaluated for every row. You need to remove those functions and compare the columns directly. As an example:
SELECT *
FROM chairs AS c
JOIN data AS id
ON c.patient_id = id.patient_id
AND c.from_date >= id.from_date
AND c.from_date < id.from_date + INTERVAL '30 days' -- assumes from_date is stored as a DATE, not a string
Will run faster with those two indexes :
CREATE INDEX X_SQLpro_001 ON chairs (patient_id, from_date);
CREATE INDEX X_SQLpro_002 ON data (patient_id, from_date);
Also try to avoid SELECT * and list only the necessary columns.
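The effect of a sargable versus a non-sargable predicate can be observed with any engine that exposes its query plan. The sketch below uses SQLite through Python's sqlite3 module purely as an illustration; the question is about a different database, whose planner output looks different, and the index name here is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (patient_id INTEGER, from_date TEXT)")
conn.execute("CREATE INDEX ix_data_from_date ON data (from_date)")


def plan(sql: str) -> str:
    """Join the detail column of EXPLAIN QUERY PLAN into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# The column is compared as stored, so the index can narrow the search.
direct = plan(
    "SELECT * FROM data WHERE from_date BETWEEN '20191213' AND '20200112'"
)

# The column is wrapped in an expression, so every row has to be visited.
wrapped = plan(
    "SELECT * FROM data "
    "WHERE CAST(from_date AS INTEGER) BETWEEN 20191213 AND 20200112"
)

print(direct)
print(wrapped)
```

In SQLite the first plan is reported as a SEARCH using the index, while the second is a SCAN of the table.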

Use the current date automatically in a WHERE clause - SQL

I have on my DB the dates that I can filter like this:
select *
where
a.y=2021 and a.m=2 and a.d=7
However, if I run this query tomorrow, I'll have to go in and change it manually.
Is there a way to do this automatically, so that if I run the query tomorrow I'll get d=8, the day after d=9, and so on?
I tried to use getdate, but I get the following error:
SQL Error [6]: Query failed (#20210207_153809_06316_2g4as): line 2:7: Function 'getdate' not registered
I also don't know if that is the right solution. Does anybody know how to fix that?
You can use NOW() to get the current date and time, and YEAR, MONTH, and DAY to get the parts of the date:
SELECT *
FROM [TABLE]
WHERE a.y=YEAR(NOW()) and a.m=MONTH(NOW()) and a.d=DAY(NOW())
The best solution is to have a date column in your data. Then you can just use:
where datecol = current_date
Or whatever your particular database uses for the current date.
Absent that, you have to split the current date into parts. In Standard SQL, this looks like:
where y = extract(year from current_date) and
m = extract(month from current_date) and
d = extract(day from current_date)
That said, date functions notoriously vary among databases, so the exact syntax depends on your database.
For instance, a common way to write this in SQL Server would be:
where y = year(getdate()) and
m = month(getdate()) and
d = day(getdate())
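When the query is issued from application code, another option is to compute the three parts on the client and bind them as parameters, which sidesteps the differences between database date functions. A minimal sketch with Python's sqlite3 module follows; the events table and its label column are made up, and only the y, m, and d columns come from the question.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (y INTEGER, m INTEGER, d INTEGER, label TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [(2021, 2, 7, "batch A"), (2021, 2, 8, "batch B")],
)


def labels_for(day: date) -> list:
    """Return the labels whose y/m/d columns match the given date."""
    rows = conn.execute(
        "SELECT label FROM events WHERE y = ? AND m = ? AND d = ?",
        (day.year, day.month, day.day),
    ).fetchall()
    return [label for (label,) in rows]


print(labels_for(date(2021, 2, 7)))
print(labels_for(date.today()))  # follows the calendar without editing the query
```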

Why do I get different results when filtering a date period with date_part versus an exact date literal?

I'm trying to count the distinct values of a column in a table.
I have one piece of logic, and I tried to write it in two ways,
but I get different results from the two queries.
Can anyone help clarify this for me? I don't know whether my code or my reasoning is wrong.
SQL
select count(distinct membership_id) from members_membership m
where date_part(year,m.membership_expires)>=2019
and date_part(month,m.membership_expires)>=7
and date_part(day,m.membership_expires)>=1
and date_part(year,m.membership_creationdate)<=2019
and date_part(month,m.membership_creationdate)<=7
and date_part(day,m.membership_creationdate)<=1
;
select count(distinct membership_id) from members_membership m
where m.membership_expires>='2019-07-01'
and m.membership_creationdate<='2019-07-01'
;
I actually think that this is the query you intend to run:
SELECT
COUNT(DISTINCT membership_id)
FROM members_membership m
WHERE
m.membership_expires >= '2019-07-01' AND
m.membership_creationdate < '2019-07-01';
It doesn't make sense for a membership to expire at the same moment it gets created, so if it expires on midnight of 1st-July 2019, then it should have been created strictly before that point in time.
That being said, the problem with the first query is that, e.g., the restriction on the month being on or before July would apply to every year, not just 2019. It is difficult to write a date inequality using the year, month, and day terms separately. For this reason, the second version you used is preferable. It is also sargable, meaning that an index on membership_expires or membership_creationdate can be used.
There is an issue with the first query:
select count(distinct membership_id) from members_membership m
where date_part(year,m.membership_expires)>=2019
and date_part(month,m.membership_expires)>=7
and date_part(day,m.membership_expires)>=1
and date_part(year,m.membership_creationdate)<=2019
and date_part(month,m.membership_creationdate)<=7
and date_part(day,m.membership_creationdate)<=1; -- no day of the month is less than 1, so only the 1st passes
-- this condition is satisfied only by the first day of a month (such as 01-Jul-2019), but I think you need all the dates up to 01-Jul-2019
The condition and date_part(day,m.membership_creationdate)<=1 is the culprit.
Even membership_creationdate = 15-jan-1901 does not satisfy it.
You should always compare date columns as whole dates, using date functions or date literals, rather than part by part, to avoid this type of issue. (Your second query is perfectly fine.)
Cheers!!
The reason could be due to a time component.
The proper comparison for the first query is:
select count(distinct membership_id)
from members_membership m
where m.membership_expires >= '2019-07-01' and
m.membership_creationdate < '2019-07-02'
-- note: the upper bound uses < rather than <=, with the next day as the limit
This logic should work regardless of whether or not the "date" has a time component.
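The difference between the two queries is easy to reproduce outside the database. The sketch below compares one made-up date both ways; the helper name parts_on_or_after is hypothetical and simply mimics the three date_part conditions.

```python
from datetime import date


def parts_on_or_after(d: date, year: int, month: int, day: int) -> bool:
    """Part-by-part test, as in the date_part version of the query."""
    return d.year >= year and d.month >= month and d.day >= day


expires = date(2020, 1, 15)  # made-up expiry date, clearly after 2019-07-01

print(parts_on_or_after(expires, 2019, 7, 1))  # the month test 1 >= 7 fails
print(expires >= date(2019, 7, 1))             # the whole-date comparison succeeds
```

A membership expiring in January 2020 is excluded by the part-by-part filter even though it is later than 1 July 2019, which is why the two counts differ.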

ORA-00907: missing right parenthesis using count with expression inside it

SELECT Hotel_Name, COUNT(H_CHECK.Hotel_checkIn >= 'JUL-1-2016' AND H_CHECK.Hotel_checkIn <= 'JUL-31-2016') FROM HOTEL, H_CHECK
GROUP BY Hotel_Name
ORA-00907: missing right parenthesis
I have tried placing the parentheses in many ways, but I couldn't find the solution. I'm using Oracle Application Express 11g.
This is the task:
Display the hotel name that has more than 2 customers checked in on July 2016.
Once you fix your immediate syntax problem, you need proper JOIN syntax.
One way to fix the problem is simply to move the conditions to a WHERE clause, resulting in a query like this:
SELECT Hotel_Name, COUNT(hc.hotel_id)
FROM HOTEL h LEFT JOIN
H_CHECK hc
ON h.hotel_id = hc.hotel_id -- I don't know what the right join condition is
WHERE hc.Hotel_checkIn >= DATE '2016-07-01' AND
hc.Hotel_checkIn <= DATE '2016-07-31'
GROUP BY Hotel_Name;
You cannot put a bare condition inside COUNT in the SELECT list of your SQL query.
COUNT (
H_CHECK.Hotel_checkIn >= 'JUL-1-2016'
AND H_CHECK.Hotel_checkIn <= 'JUL-31-2016')
This is wrong. You can do it like this:
SELECT Hotel_Name,
COUNT (1)
FROM HOTEL
join H_CHECK
ON H_CHECK.Hotel_checkIn >= 'JUL-1-2016'
AND H_CHECK.Hotel_checkIn <= 'JUL-31-2016'
GROUP BY Hotel_Name
having count(1) > 2;
You're missing the CASE ... END from your conditions inside your count. You're after something like:
SELECT Hotel_Name,
COUNT(case when H_CHECK.Hotel_checkIn >= 'JUL-1-2016'
AND H_CHECK.Hotel_checkIn <= 'JUL-31-2016'
then 1
end)
FROM HOTEL, H_CHECK
GROUP BY Hotel_Name;
However, there are a number of concerns I have regarding your query:
If your hotel_checkin column is of DATE datatype, then you should be comparing it to DATEs, not strings, i.e. H_CHECK.Hotel_checkIn >= to_date('07-01-2016', 'mm-dd-yyyy'). This way, you avoid relying on implicit conversion of the string into a date, which depends on your NLS_DATE_FORMAT parameter setting. That setting could be changed, which may cause your query to fail.
FROM HOTEL, H_CHECK: don't use the old-style method of joining; instead, use the ANSI-style method: FROM hotel cross join h_check
Did you really mean the join to be a cross join or did you forget to add the join conditions?
You should alias the hotel_name column to aid maintainability.
You should also give your count column a name.
Since you're only counting rows with a checkin between 1st and 31st July 2016, you should move this condition into the where clause (as XING has shown in their answer) *unless* you need to also show hotels that don't have any checkins within that time period.
Your condition assumes that there are no time elements to the hotel_checkin column - ie. everything is set to midnight. If, however, you could have a date with a time, bear in mind that your count will ignore all rows with a checkin date of 31st July 2016 that are after midnight. In which case, your upper bound needs to change to: H_CHECK.Hotel_checkIn < to_date('08-01-2016', 'mm-dd-yyyy')
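Several of the points above (CASE inside COUNT, an explicit join condition, named columns, and a half-open upper bound) can be seen together in a small runnable sketch. It uses SQLite through Python's sqlite3 module rather than Oracle, stores the check-ins as ISO text, and assumes a hotel_id join column and sample rows that are not in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE hotel (hotel_id INTEGER PRIMARY KEY, hotel_name TEXT);
    CREATE TABLE h_check (hotel_id INTEGER, hotel_checkin TEXT);
    INSERT INTO hotel VALUES (1, 'Alpha'), (2, 'Beta');
    INSERT INTO h_check VALUES
        (1, '2016-07-01 09:00:00'),
        (1, '2016-07-15 14:30:00'),
        (1, '2016-07-31 18:45:00'),  -- late on the last day of July
        (1, '2016-08-01 00:00:00'),  -- first moment of August
        (2, '2016-07-10 11:00:00');
    """
)

# COUNT ignores NULLs, so a CASE without ELSE counts only the matching rows.
july_counts = conn.execute(
    """
    SELECT h.hotel_name,
           COUNT(CASE WHEN c.hotel_checkin >= '2016-07-01'
                       AND c.hotel_checkin <  '2016-08-01'
                      THEN 1 END) AS july_checkins
    FROM hotel h
    LEFT JOIN h_check c ON c.hotel_id = h.hotel_id
    GROUP BY h.hotel_name
    ORDER BY h.hotel_name
    """
).fetchall()

print(july_counts)
```

The check-in late on 31 July is counted because the upper bound is "before 1 August" rather than "up to 31 July". Appending a HAVING clause on the same CASE count would keep only the hotels with more than 2 check-ins.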

Invalid Operation On An ANSI DATETIME (Subtracting one timestamp from another in Teradata)

I would like to create a WHERE condition to return results where only 1 day has passed between two timestamps. I tried this:
SELECT * FROM RDMAVWSANDBOX.VwNIMEventFct
INNER JOIN VwNIMUserDim ON VwNIMUserDim.NIM_USER_ID = VwNIMEventFct.NIM_USER_ID
INNER JOIN rdmatblsandbox.TmpNIMSalesForceDB ON TmpNIMSalesForceDB.EMAIL = VwNIMUserDim.USER_EMAIL_ADDRESS
WHERE (CONTRACT_EFFECTIVE_DATE - EVENT_TIMESTAMP) =1
But the result was an error message "Invalid Operation On An ANSI DATETIME value".
I guess that, looking at the code now, Teradata has no way of knowing whether the "1" in "= 1" is a day, hour or year.
How would I select data where only 1 day has passed between CONTRACT_EFFECTIVE_DATE and EVENT_TIMESTAMP?
Same again for 2 days, and 3 days etc?
If both columns are DATEs you can use = 1, which means one day.
For Timestamps you need to tell what kind of interval you want:
WHERE (CONTRACT_EFFECTIVE_DATE - EVENT_TIMESTAMP) DAY = INTERVAL '1' DAY
But I'm not sure if this is what you really want. What's your definition of 1 day?
Edit:
Based on your comment the best way should be:
WHERE CAST(CONTRACT_EFFECTIVE_DATE AS DATE) - CAST(EVENT_TIMESTAMP AS DATE) = 1
This avoids dealing with INTERVAL arithmetic :-)
Not sure about Teradata, but I think most versions of SQL have built-in date math functions. In MSSQL for instance you could do this:
...
WHERE DATEDIFF(DAY, CONTRACT_EFFECTIVE_DATE, EVENT_TIMESTAMP) = 1
Or if you wanted to make sure 24 hours had passed you could do:
...
WHERE DATEDIFF(HOUR, CONTRACT_EFFECTIVE_DATE, EVENT_TIMESTAMP) = 24
Other SQL dialects have their own versions of this, and you may have to use 'D' or 'DD' instead of 'DAY' or something (and maybe 'HH' instead of 'HOUR' likewise).
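The question "what's your definition of 1 day?" matters because a calendar-day difference and an elapsed-time difference can disagree. The sketch below shows both in plain Python; the function names and the two timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def calendar_days_between(later: datetime, earlier: datetime) -> int:
    """Difference in calendar dates, ignoring the time of day."""
    return (later.date() - earlier.date()).days


def full_days_elapsed(later: datetime, earlier: datetime) -> int:
    """Number of complete 24-hour periods between the two timestamps."""
    return (later - earlier) // timedelta(days=1)


event = datetime(2020, 1, 1, 23, 0)    # 23:00 on 1 January
contract = datetime(2020, 1, 2, 1, 0)  # 01:00 on 2 January, two hours later

print(calendar_days_between(contract, event))
print(full_days_elapsed(contract, event))
```

Casting both values to DATE before subtracting, as in the edited Teradata answer, corresponds to calendar_days_between: the two timestamps here are one calendar day apart even though only two hours have passed.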