Applying knowledge of SQL for everyday workplace activities

My question is how to properly write a SQL query for the question quoted below.
There is a table in an HMO database which stores doctors' working
hours. The table has the following fields:
"FirstName", "LastName", "Date", "HoursWorked". Write a SQL statement
which retrieves the average working hours for the period January-March for a
doctor named Joe Doe.
So far I have:
SELECT HoursWorked
FROM Table
WHERE DATE = (January - March) AND
SELECT AVG(HoursWorked) FROM Table WHERE FirstName="Joe",LastName="Doe"

A few pointers, as this sounds like a homework question (which we don't answer for you here, but we can try to give you some guidance).
Put everything you want to return right after SELECT, and put all your search conditions at the end, in the WHERE clause.
So the general format would be:
SELECT Column1,
Column2,
Column3
FROM YourTable
WHERE Column4 = Restriction1
AND Column5 = Restriction2
The next thing you need to think about is how the dates are stored in your database table. Hopefully they're kept in a column of type datetime or date (the options depend on the database engine you're using, e.g. Microsoft SQL Server, Oracle or MySQL). In reality, some older databases store dates in all sorts of formats, which makes this much harder, but since I'm assuming it's a homework-type question, let's assume it's a datetime column.
You specify restrictions by comparing columns to a value, so if you wanted all rows where the date was on or after midnight on the 2nd of March 2012, you would have the WHERE clause:
WHERE MyDateColumn >= '2012-03-02 00:00:00'
Note that to avoid confusion, we usually try to format dates as "Year-Month-Day Hour:Minute:Second". This is because in different countries, dates are often written in different formats and this is considered a Universal format which is understood (by computers at least) everywhere.
So you would want to combine a couple of these comparisons in your WHERE clause: one for dates on or after the start of the period AND one for dates before the end of it.
If you give this a go and see where you get to, update your question with your progress and someone will be able to help get it finished if you have problems.
If you don't have access to an actual database and need to experiment with syntax, try this site: http://sqlfiddle.com/

You already have most of the answer written:
SELECT AVG(HoursWorked) FROM Table WHERE FirstName="Joe",LastName="Doe"
You only need to fix the query:
SELECT AVG(HoursWorked) as AVGWORKED FROM Table WHERE FirstName='Joe' AND LastName='Doe'
That query will give you the average hours worked for Joe Doe across all dates. To limit it to a period, add another "AND" condition on the date. If you are using SQL Server (2012 or later), you can use the built-in function DateFromParts(year,month,day) to build a date, or you can convert a string to a date with Convert(Date,'MM/dd/yyyy'); other database engines have their own conversion functions.
Example
SELECT AVG(HoursWorked) as AVGWORKED FROM Table WHERE FirstName='Joe' AND LastName='Doe' AND DateColumn between DateFromParts(year,month,day) and Convert(Date,'MM/dd/yyyy')
In the example I showed both approaches (DateFromParts for the start date and Convert(Date, ...) for the end date); the arguments are placeholders for you to fill in.
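For anyone who wants to try the combined filter without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The table name DoctorHours, the column name WorkDate, the sample rows and the year 2012 are all assumptions made for illustration (the question does not say which year is meant), and SQLite stores the dates as ISO-formatted text rather than a true datetime type. It uses a half-open range (>= start, < day after end), which also behaves sensibly if the column carries a time part.

```python
import sqlite3

# Hypothetical table and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DoctorHours ("
    "FirstName TEXT, LastName TEXT, WorkDate TEXT, HoursWorked REAL)"
)
conn.executemany(
    "INSERT INTO DoctorHours VALUES (?, ?, ?, ?)",
    [
        ("Joe", "Doe", "2012-01-10", 8.0),
        ("Joe", "Doe", "2012-02-15", 6.0),
        ("Joe", "Doe", "2012-03-31", 10.0),
        ("Joe", "Doe", "2012-04-01", 2.0),   # outside January-March
        ("Ann", "Lee", "2012-02-01", 12.0),  # a different doctor
    ],
)

# Name filter plus a half-open date range: on or after 1 January,
# strictly before 1 April.
query = """
    SELECT AVG(HoursWorked) AS AvgWorked
    FROM DoctorHours
    WHERE FirstName = ?
      AND LastName = ?
      AND WorkDate >= ?
      AND WorkDate < ?
"""
avg_hours = conn.execute(
    query, ("Joe", "Doe", "2012-01-01", "2012-04-01")
).fetchone()[0]
print(avg_hours)
```

Only the three Joe Doe rows inside the period contribute to the average; the April row and the other doctor's row are filtered out.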

Related

Average age using months_between()

So I have a table with the birth dates and I need to average the people's age. How do I do that? I know I have to use months_between(). Thank you in advance!
Why do you think you need months_between? You don't (unless you have a very specific and unusual definition of "average age").
Over a long enough period (like 40+ years, say), a person's age in years can be calculated (within a narrow approximation window) as the age in days, divided by 365.25. The age in days is simply a difference between two dates, SYSDATE and DATE_OF_BIRTH or BORN. The first one is provided by the system and the second is in your table. Assuming, that is, that you want age as of today; otherwise change SYSDATE to whatever "as-of" (fixed) date you want to use.
So, something like
select [some columns here], AVG(SYSDATE - BORN)/365.25 as avg_age
from your_table
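The same days-divided-by-365.25 idea can be tried out with Python's built-in sqlite3 module. This is only a sketch: SQLite has no SYSDATE and cannot subtract dates directly, so it uses julianday() to get the difference in days, and a fixed as-of date so the result is reproducible. The table name people and the sample birth dates are assumptions.

```python
import sqlite3

# Hypothetical table; BORN is stored as ISO text because SQLite has no DATE type.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, born TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("A", "2000-01-01"), ("B", "1990-01-01")],
)

# Fixed as-of date instead of the current date, so the answer does not drift.
as_of = "2020-01-01"

# julianday(x) - julianday(y) is the difference in days, which plays the
# role of SYSDATE - BORN in the Oracle query above.
avg_age = conn.execute(
    "SELECT AVG(julianday(?) - julianday(born)) / 365.25 FROM people",
    (as_of,),
).fetchone()[0]
print(round(avg_age, 2))
```

With one person aged 20 and one aged 30 on the as-of date, the result comes out very close to 25, which shows how small the approximation error is.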
Not clear why you would select max(born) from dual; surely you didn't call your table dual? Nor did you change the standard dual table to add your own data to it?
When people ask you what datatype you use for born in your tables, what you see on the screen when you query for it is not sufficient; the screen will show a string (it's the only thing a screen shows) and doesn't necessarily reflect what's in the database. To get the proper answer, run DESCRIBE table_name; that will show all the columns in table_name and their datatype. Note that DESCRIBE table_name is a SQL*Plus command (understood by Toad and SQL Developer - whatever you use to communicate with the database), so it doesn't need a ; or a / at the end. Just type it at the prompt and hit ENTER.
Good luck!

SQL for Next/Prior Business Day from Calendar table (in MS Access)

I have a Calendar table pulled from our mainframe DBs and saved as a local Access table. The table has history back to the 1930s (and I know we use back to the 50s in at least one place), resulting in 31k records. This Calendar table has 3 fields of interest:
Bus_Dt - every day, not just business days. Primary Key
Bus_Day_Ind - indicates if the day was a valid business day for the stock market.
Prir_Bus_Dt - the prior business day. Contains some errors (about 50), all old.
I have written a query to retrieve the first business day on or after the current calendar day, but it runs extremely slowly (5+ minutes). I have examined the showplan output and see it is being run via an x-join, which between 30k+ record tables gives a solution space (and date comparisons) on the order of hundreds of millions. However, the actual task is not hard, and could be performed comfortably by Excel in minimal time using a simple sort.
My question is thus: is there any way to fix the poor performance of the query, or is this an inherent failing of SQL? (DB2 run on the mainframe is also slow, though not crushingly so. Throwing cycles at the problem and all that.) Secondarily, if I were to trust prir_bus_dt, can I get there better? Or restrict the date range (aka "cheat"), or use any other tricks I haven't thought of yet?
SQL:
SELECT TE2Clndr.BUS_DT AS Cal_Dt
, Min(TE2Clndr_1.BUS_DT) AS Next_Bus_Dt
FROM TE2Clndr
, TE2Clndr AS TE2Clndr_1
WHERE TE2Clndr_1.BUS_DAY_IND="Y" AND
TE2Clndr.BUS_DT<=[te2clndr_1].[bus_dt]
GROUP BY TE2Clndr.BUS_DT;
Showplan:
Inputs to Query
Table 'TE2Clndr'
Table 'TE2Clndr'
End inputs to Query
01) Restrict rows of table TE2Clndr
by scanning
testing expression "TE2Clndr_1.BUS_DAY_IND="Y""
store result in temporary table
02) Inner Join table 'TE2Clndr' to result of '01)'
using X-Prod join
then test expression "TE2Clndr.BUS_DT<=[te2clndr_1].[bus_dt]"
03) Group result of '02)'
Again, the question is, can this be made better (faster), or is this already as good as it gets?
I have a new query that is much faster for the same job, but it depends on the prir_bus_dt field (which has some errors). It also isn't great theory since prior business day is not necessarily available on everyone's calendar. So I don't consider this "the" answer, merely an answer.
New query:
SELECT TE2Clndr.BUS_DT as Cal_Dt
, Max(TE2Clndr_1.BUS_DT) AS Next_Bus_Dt
FROM TE2Clndr
INNER JOIN TE2Clndr AS TE2Clndr_1
ON TE2Clndr.PRIR_BUS_DT = TE2Clndr_1.PRIR_BUS_DT
GROUP BY TE2Clndr.BUS_DT;
What about this approach?
select min(bus_dt)
from te2Clndr
where bus_dt >= date()
and bus_day_ind = 'Y'
This is my reference for date() representing the current date
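To experiment with this outside Access, here is a small sketch using Python's built-in sqlite3 module. The four sample rows are made up, dates are stored as ISO text, and the as-of date is passed as a parameter instead of calling date(). The second query, a correlated subquery that returns the next business day for every calendar day, is an assumption about what the original cross-join query was after; whether Access optimizes it any better has not been verified here.

```python
import sqlite3

# Hypothetical miniature of the calendar table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE te2clndr (bus_dt TEXT PRIMARY KEY, bus_day_ind TEXT)"
)
conn.executemany(
    "INSERT INTO te2clndr VALUES (?, ?)",
    [
        ("2024-07-05", "Y"),  # Friday
        ("2024-07-06", "N"),  # Saturday
        ("2024-07-07", "N"),  # Sunday
        ("2024-07-08", "Y"),  # Monday
    ],
)


def next_business_day(as_of):
    """First business day on or after as_of, or None if there is none."""
    row = conn.execute(
        "SELECT MIN(bus_dt) FROM te2clndr "
        "WHERE bus_dt >= ? AND bus_day_ind = 'Y'",
        (as_of,),
    ).fetchone()
    return row[0]


# Same lookup for every calendar day, written as a correlated subquery
# rather than a cross join followed by GROUP BY.
all_days = conn.execute(
    """
    SELECT c.bus_dt,
           (SELECT MIN(n.bus_dt)
            FROM te2clndr AS n
            WHERE n.bus_dt >= c.bus_dt
              AND n.bus_day_ind = 'Y') AS next_bus_dt
    FROM te2clndr AS c
    ORDER BY c.bus_dt
    """
).fetchall()

print(next_business_day("2024-07-06"))
print(all_days)
```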

Converting date-time format and querying based on a date condition in SQL

I have around 20,000 entries in a SQL table for which a date column is of the form
YYYY-MM-DD HH-SS. I would like to convert this format to a YYYY-MM-DD format so I can run a query on all of the entries that will count the number of entries based on
a) the month under which they fall
b) the day
I'm new to SQL and not sure if there is a way to loop through all of the entries and check based on the required criteria; and as such, would greatly appreciate any help.
Unfortunately, I cannot send a screenshot of the table since the data is classified.
You don't need to change the data in the table. Most databases have year() and month() functions, so you could do:
select year(datecol), month(datecol), count(*)
from sqltable
group by year(datecol), month(datecol)
order by year(datecol), month(datecol);
If these specific functions are not available, then I'm sure your database supports something similar.
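As one example of "something similar", SQLite has no year() or month() but offers strftime() and date(). The sketch below uses Python's built-in sqlite3 module; the table and column names follow the query above, while the sample rows and the 'YYYY-MM-DD HH:MM' text format are assumptions. It counts entries per month and per day without modifying the stored values.

```python
import sqlite3

# Hypothetical sample data with a date-and-time column stored as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sqltable (datecol TEXT)")
conn.executemany(
    "INSERT INTO sqltable VALUES (?)",
    [
        ("2023-01-05 10:30",),
        ("2023-01-05 14:00",),
        ("2023-01-20 09:15",),
        ("2023-02-01 08:00",),
    ],
)

# a) Count per month: strftime stands in for year() and month().
per_month = conn.execute(
    """
    SELECT strftime('%Y', datecol) AS yr,
           strftime('%m', datecol) AS mon,
           COUNT(*)
    FROM sqltable
    GROUP BY yr, mon
    ORDER BY yr, mon
    """
).fetchall()

# b) Count per day: date() drops the time part on the fly.
per_day = conn.execute(
    """
    SELECT date(datecol) AS d, COUNT(*)
    FROM sqltable
    GROUP BY d
    ORDER BY d
    """
).fetchall()

print(per_month)
print(per_day)
```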

Default value for datetime

How can you search for dates (datetimes) that contain a default value, i.e. ''? I guess it is not:
select * from table where dateofbirth=''
All the dates seem to have a default value of '1900-01-01'. However, there are people in my database who have a date of birth on or before this date (mainly historic people). Therefore I cannot do:
select * from table where dateofbirth='1900-01-01'
I know that some versions of SQL Server have a default date of: 1899-12-31.
I guess it is better to use nulls for unknown dates. I cannot do that in this case.
I have read through lots of questions on here about finding dates using SQL but I have not found an answer to my specific question.
You can get the default DateTime value as:
SELECT CONVERT(DATETIME, 0)
And apply it to the filter as appropriate:
SELECT * FROM [Table] WHERE DateOfBirth = CONVERT(DATETIME, 0)
Or, if you need to select earlier dates as well:
SELECT * FROM [Table] WHERE DateOfBirth <= CONVERT(DATETIME, 0)
The best you are going to get is what you listed:
select * from table where dateofbirth='1900-01-01'
As you know, the problem is that if someone was really born on 1/1/1900, you will also include them. But there's really no way for your query to know the difference.
To fix this, you would need to change what your system is using for the default value (e.g. NULL or change to datetime2 or date datatype and use 1/1/0001). Then update all your 1/1/1900 values to the new default value. Yes, this will erroneously update any existing people with 1/1/1900 birthdays, but at least it will prevent any future occurrences.
In SQL Server terms, a default value for column X is only used when a new record is first created and a value is not provided for that column. After the initial creation of the record, the value is just a value, same as any other. Within a single table there is no way to distinguish between records that hold the default value in column X because it was supplied and records that hold it because it was defaulted.
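This is easy to see in a quick experiment. The sketch below uses Python's built-in sqlite3 module rather than SQL Server, so it only illustrates the general behaviour of column defaults; the table name person and the sample rows are made up.

```python
import sqlite3

# Hypothetical table with a default on the date-of-birth column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    "name TEXT, dateofbirth TEXT NOT NULL DEFAULT '1900-01-01')"
)

# No date supplied: the default is stored.
conn.execute("INSERT INTO person (name) VALUES ('Unknown')")
# A person genuinely born on the same date as the default.
conn.execute("INSERT INTO person VALUES ('Born1900', '1900-01-01')")
# A historic person born before the default date.
conn.execute("INSERT INTO person VALUES ('Older', '1850-06-15')")

# Once stored, the defaulted value and the supplied value look identical.
matches = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM person "
        "WHERE dateofbirth = '1900-01-01' ORDER BY name"
    )
]
print(matches)
```

Both the defaulted row and the genuinely supplied row come back, and nothing in the table says which is which.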
This won't help you now, but an alternative to nulls that is sometimes used is a 'magic value'. In the case of dates of birth, the maximum datetime value of 31st December 9999 could be used to indicate an unknown value (assuming your system isn't expected to be in use in 8,000 years' time :). Some people (including me) don't really approve of the use of magic values because there's no way in the database of indicating their magic status.
Rhys

How can I query just the month and day of a DATE column?

I have a date of birth DATE column in a customer table with ~13 million rows. I would like to query this table to find all customers who were born on a certain month and day of that month, but any year.
Can I do this by casting the date into a char and doing a subscript query on the cast, or should I create an additional char column and update it to hold just the month and day, or create three new integer columns to hold month, day and year, respectively?
This will be a very frequently used query criteria...
EDIT:... and the table has ~13 million rows.
Can you please provide an example of your best solution?
If it will be frequently used, consider a 'functional index'. Searching on that term at the Informix 11.70 InfoCentre produces a number of relevant hits.
You can use:
WHERE MONTH(date_col) = 12 AND DAY(date_col) = 25;
You can also play games such as:
WHERE MONTH(date_col) * 100 + DAY(date_col) = 1225;
This might be more suitable for a functional index, but isn't as clear for everyday use. Note that in the absence of a functional index, invoking functions on a column in the criterion means that an index is unlikely to be used.
You could easily write a stored procedure too:
CREATE FUNCTION mmdd(date_val DATE DEFAULT TODAY) RETURNING SMALLINT AS mmdd;
RETURN MONTH(date_val) * 100 + DAY(date_val);
END FUNCTION;
And use it as:
WHERE mmdd(date_col) = 1225;
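The month * 100 + day idea can be tried without Informix. The sketch below uses Python's built-in sqlite3 module, where an index on an expression plays roughly the role of a functional index; this is a stand-in, not Informix syntax. The table name customer, the column dob, the index name and the sample rows are assumptions, and dates are stored as 'YYYY-MM-DD' text so substr() can pick out the month and day.

```python
import sqlite3

# Hypothetical customer table with dates of birth as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, dob TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [
        (1, "1980-12-25"),
        (2, "1995-12-25"),
        (3, "1980-12-24"),
        (4, "2001-01-25"),
    ],
)

# Month * 100 + day, computed from the text form of the date.
MMDD = (
    "CAST(substr(dob, 6, 2) AS INTEGER) * 100 "
    "+ CAST(substr(dob, 9, 2) AS INTEGER)"
)

# Index on the expression itself (SQLite's analogue of a functional index).
conn.execute(f"CREATE INDEX idx_customer_mmdd ON customer ({MMDD})")

# The WHERE clause repeats the same expression so the index can apply.
born_dec_25 = [
    row[0]
    for row in conn.execute(
        f"SELECT id FROM customer WHERE {MMDD} = ? ORDER BY id", (1225,)
    )
]
print(born_dec_25)
```

The query has to use the same expression as the index for the index to be a candidate, which is one reason wrapping the expression in a named function (as in the Informix version) is convenient.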
Depending on how frequently you do this and how fast it needs to run you might think about splitting the date column into day, month and year columns. This would make search faster but cause all sorts of other problems when you want to retrieve a whole date (and also problems in validating that it is a date) - not a great idea.
Assuming speed isn't a problem, I would do something like:
select *
FROM Table
WHERE Month(*DateOfBirthColumn*) = *SomeMonth* AND DAY(*DateOfBirthColumn*) = *SomeDay*
I don't have Informix in front of me at the moment, but I think the syntax is right.