I have a requirement to create a report that counts a total from 2 date fields into one. A simplified example of the table I'm querying is:
ID, FirstName, LastName, InitialApplicationDate, UpdatedApplicationDate
I need to query the two date fields in a way that creates similar output to the following:
Date | TotalApplications
I would need the date output to include both InitialApplicationDate and UpdatedApplicationDate fields and the TotalApplications output to be a count of the total for both types of date fields. Originally I thought maybe a Union would work; however, that returns 2 separate records for each date. Any ideas how I might accomplish this?
The simplest way, I think, is to unpivot using apply and then aggregate:
select v.thedate, count(*)
from t cross apply
(values (InitialApplicationDate), (UpdatedApplicationDate)) v(thedate)
group by v.thedate;
You might want to add where thedate is not null if either column could be NULL.
Note that the above will count the same application twice, once for each date. That appears to be your intention.
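If your engine has no CROSS APPLY, the same idea can be sketched with UNION ALL wrapped in an outer GROUP BY. This is only an illustration using Python's built-in sqlite3 module; the table name applications and the sample rows are made up, and this is UNION ALL swapped in for the apply approach above, not the same syntax.

```python
import sqlite3

# Hypothetical table and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE applications ("
    "ID INTEGER, InitialApplicationDate TEXT, UpdatedApplicationDate TEXT)"
)
conn.executemany(
    "INSERT INTO applications VALUES (?, ?, ?)",
    [
        (1, "2019-01-01", "2019-01-02"),
        (2, "2019-01-02", None),
        (3, "2019-01-02", "2019-01-03"),
    ],
)

# Stack both date columns into one column, then aggregate the stacked rows.
# Aggregating OUTSIDE the UNION ALL is what merges the two records per date.
query = """
SELECT thedate, COUNT(*) AS TotalApplications
FROM (
    SELECT InitialApplicationDate AS thedate FROM applications
    UNION ALL
    SELECT UpdatedApplicationDate AS thedate FROM applications
) AS d
WHERE thedate IS NOT NULL
GROUP BY thedate
ORDER BY thedate
"""
totals = conn.execute(query).fetchall()
print(totals)
```

A plain UNION (without ALL) would remove duplicate dates before counting, so UNION ALL is the one to use here.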
I have a table called VIEWS with Id, Day, Month, name of video, name of browser... but I'm interested only in Id, Day and Month.
The ID can be duplicate because the user (ID) can watch a video multiple days in multiple months.
This is the query for the minimum date and the maximum date.
SELECT ID, CONCAT(MIN(DAY), '/', MIN(MONTH)) AS MIN_DATE,
CONCAT(MAX(DAY), '/', MAX(MONTH)) AS MAX_DATE
FROM Views
GROUP BY ID
I want to insert the two columns from this select (MIN_DATE and MAX_DATE) into two new columns with INSERT INTO.
What would the INSERT INTO query look like?
To do what you are trying to do (there are some issues with your solution, please read my comments below), first you need to add the new columns to the table.
ALTER TABLE Views ADD MIN_DATE VARCHAR(10)
ALTER TABLE Views ADD MAX_DATE VARCHAR(10)
Then you need to UPDATE your new columns (not INSERT, because you don't want new rows). Determine the min/max for each ID, then join the result back to the table to be able to update each row. You can't update directly from a GROUP BY as rows are grouped and lose their original row.
;WITH MinMax AS
(
SELECT
ID,
CONCAT(MIN(V.DAY), '/', MIN(V.MONTH)) AS MIN_DATE,
CONCAT(MAX(V.DAY), '/', MAX(V.MONTH)) AS MAX_DATE
FROM
Views AS V
GROUP BY
ID
)
UPDATE V SET
MIN_DATE = M.MIN_DATE,
MAX_DATE = M.MAX_DATE
FROM
MinMax AS M
INNER JOIN Views AS V ON M.ID = V.ID
The problems that I see with this design are:
Storing aggregated columns: you usually want to do this only for performance reasons (which I believe is not the case here), as querying the aggregated (grouped) rows is faster because there are fewer rows to read. The problem is that you will have to update the grouped values each time one of the original rows is updated, which adds extra processing time. Another option would be to update the aggregated values periodically, but you will have to accept that for a period of time the grouped values do not really represent the tracking table.
Keeping aggregated columns on the same table as the data they are aggregating: this is a normalization problem. Updating or inserting a row will trigger an update of all rows with the same ID, as the min/max values might have changed. Also, the min/max values will always be repeated on all rows that belong to the same ID, which wastes space. If you have to save aggregated data, you should save it in a different table, which still causes the problems I listed in the previous point.
Using a text data type to store dates: you always want to work with dates using a proper DATETIME data type. This will not only enable you to use date functions like DATEADD or DATEDIFF, but also save space (varchars that store dates need more bytes than DATETIME). I don't see the year part in your query; it should be considered when computing a min/max (this might depend on what you are storing in this table).
Computing the min/max incorrectly: If you have the following rows:
ID DAY MONTH
1 5 1
1 3 2
The current result of your query would be 3/1 as MIN_DATE and 5/2 as MAX_DATE, which I believe is not what you are trying to find. The lowest here should be the 5th of January and the highest the 3rd of February. This is a consequence of storing date parts as independent values and not the whole date as a DATETIME.
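To make the last point concrete, here is a small sketch using Python's built-in sqlite3 module with the two rows above. The MONTH * 100 + DAY key is just one illustrative way to compare month and day together; it is not from the original query, it still ignores the year, and a real date column remains the better fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Views (ID INTEGER, DAY INTEGER, MONTH INTEGER)")
conn.executemany("INSERT INTO Views VALUES (?, ?, ?)", [(1, 5, 1), (1, 3, 2)])

# Independent MIN/MAX per column mixes parts of different rows.
separate = conn.execute(
    "SELECT MIN(DAY) || '/' || MIN(MONTH), MAX(DAY) || '/' || MAX(MONTH) "
    "FROM Views GROUP BY ID"
).fetchone()

# Combine month and day into one sortable key (MONTH * 100 + DAY),
# take MIN/MAX of the key, then split it back into day and month.
combined = conn.execute(
    """
    SELECT (lo % 100) || '/' || (lo / 100), (hi % 100) || '/' || (hi / 100)
    FROM (
        SELECT ID,
               MIN(MONTH * 100 + DAY) AS lo,
               MAX(MONTH * 100 + DAY) AS hi
        FROM Views
        GROUP BY ID
    )
    """
).fetchone()

print(separate)  # day and month taken from different rows
print(combined)  # day and month taken from the same row
```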
What you usually want to do for this scenario is to group directly on the query that needs the data grouped, so you will do the GROUP BY on the SELECT that needs the min/max. Having an index by ID would make the grouping very fast. Thus, you save the storage space you would use to keep the aggregated values and also the result is always the real grouped result at the time that you are querying.
It would be something like the following:
;WITH MinMax AS
(
SELECT
ID,
CONCAT(MIN(V.DAY), '/', MIN(V.MONTH)) AS MIN_DATE, -- Date problem (varchar + min/max computed separately)
CONCAT(MAX(V.DAY), '/', MAX(V.MONTH)) AS MAX_DATE -- Date problem (varchar + min/max computed separately)
FROM
Views AS V
GROUP BY
ID
)
SELECT
V.*,
M.MIN_DATE,
M.MAX_DATE
FROM
MinMax AS M
INNER JOIN Views AS V ON M.ID = V.ID
total novice here with SQL SUM function question. So, SUM function itself works as I expected it to:
select ID, sum(amount)
from table1
group by ID
There are several records for each ID and my goal is to summarize each ID on one row where the next column would give me the summarized amount of column AMOUNT.
This works fine; however, I also need to filter on certain criteria in the summarized amount field, i.e. only look for results where the summarized amount is bigger than, smaller than, or between certain numbers.
This is the part I'm struggling with, as I can't seem to use column AMOUNT, as this messes up summarizing results.
Column name for summarized results is shown as "00002", however using this in the between or > / < clause does not work either. Tried this:
select ID, sum(amount)
from table1
where 00002 > 1000
group by ID
No error message, just blank result, however plenty of summarized results with values over 1000.
Unfortunately not sure on the engine the database runs on, however it should be some IBM-based product.
The WHERE clause will filter individual rows that don't match the condition before aggregating them.
If you want to do post aggregation filtering you need to use the HAVING Clause.
HAVING will apply the filter to the results after being grouped.
select ID, sum(amount)
from table1
group by ID
having sum(amount) > 1000
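Here is a small runnable sketch of the difference, using Python's built-in sqlite3 module rather than your IBM engine. The sample amounts are made up; the WHERE/HAVING behaviour shown is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (ID INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, 600), (1, 700), (2, 300), (2, 200), (3, 1500)],
)

# WHERE filters single rows before grouping: only the 1500 row passes,
# so ID 1 (600 + 700 = 1300) is lost even though its total exceeds 1000.
where_rows = conn.execute(
    "SELECT ID, SUM(amount) FROM table1 "
    "WHERE amount > 1000 GROUP BY ID ORDER BY ID"
).fetchall()

# HAVING filters the groups after SUM has been computed.
having_rows = conn.execute(
    "SELECT ID, SUM(amount) FROM table1 "
    "GROUP BY ID HAVING SUM(amount) > 1000 ORDER BY ID"
).fetchall()

# A range condition on the total works the same way.
between_rows = conn.execute(
    "SELECT ID, SUM(amount) FROM table1 "
    "GROUP BY ID HAVING SUM(amount) BETWEEN 400 AND 1400 ORDER BY ID"
).fetchall()

print(where_rows, having_rows, between_rows)
```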
I have a table in which Employee Punches are saved. For each date for each employee there are columns in the table as Punch1, Punch2 till Punch10.
I want all of this Punch column data in a single column. E.g. if in a row I have dates stored in Punch1, Punch2, Punch3, Punch4 and so on, I want all this data in a single column.
How to achieve this?
UNPIVOT can be used to normalize your table:
If your table is called EmployeePunches, it would look like this:
SELECT UserID, Punch
FROM
(
SELECT UserID, Punch1, Punch2, Punch3, Punch4
FROM EmployeePunches
) AS ep
UNPIVOT
(
Punch FOR Punches IN (Punch1, Punch2, Punch3, Punch4)
) AS up
Using UNION ALL works too, but there you will have 1 select statement per Punch.
With UNPIVOT you only need 1 statement and just add the Punch columns you need.
"Horizontal" (string) concatenation:
If for each row, you want to derive a new column Punches1To10 that contains all the timestamps as e.g. a comma-separated list (such as 'xxxx-xx-xa, xxxx-xx-xb, xxxx-xx-xc, …'), then FOR XML will be what you're looking for.
See this article for a tutorial on this, and the SO question "Row concatenation with FOR XML, but with multiple columns?"
"Vertical" (table) concatenation:
Visually speaking, if you want to vertically stack the single columns Punch1, Punch2, etc. then you would concatenate the result of several select statements using UNION ALL. For just two columns, this would look like this:
SELECT Punch1 AS Punch FROM YourTable
UNION ALL
SELECT Punch2 AS Punch FROM YourTable
With three columns, it's going to be:
SELECT Punch1 AS Punch FROM Punches
UNION ALL
SELECT Punch2 AS Punch FROM Punches
UNION ALL
SELECT Punch3 AS Punch FROM Punches;
Either way, consider normalizing your table first!
This could quickly get out of hand the more PunchN columns you have.
Therefore, may I recommend that you first redesign this table into something a little more normalized.
For example, instead of having several columns named Punch1, Punch2, etc. (where each of them contains the same type of data), just have two columns: one containing the 1, 2, etc. from the PunchN column names, the other containing the timestamps:
PunchN Date
1 xxxx-xx-xa
1 xxxx-xx-xb
1 xxxx-xx-xc
2 xxxx-xx-xd
2 xxxx-xx-xe
…
Like this answer shows, the database system can do something like this for you through UNPIVOT.
Now, no matter how many Punch columns you have, your query e.g. for "vertical" concatenation would always be the same:
SELECT Date FROM Punches;
(The "horizontal" concatenation would become simpler, too.)
I think you are looking for something like this
select Employee,Punch_date,Identifier
from YourTable
cross apply (values (Punch1,'Punch1'),
(Punch2,'Punch2'),
(Punch3,'Punch3'),
.......
(Punch4,'Punch4')) tc(Punch_date,Identifier)
The Identifier column helps you find which Punch number each punch date came from for each Employee.
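For engines without UNPIVOT or CROSS APPLY, the same stacked result with an identifier can be sketched with a generated UNION ALL. The example below uses Python's built-in sqlite3 module; the table layout, the three punch columns, and the sample times are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeePunches ("
    "UserID INTEGER, Punch1 TEXT, Punch2 TEXT, Punch3 TEXT)"
)
conn.executemany(
    "INSERT INTO EmployeePunches VALUES (?, ?, ?, ?)",
    [(1, "08:00", "12:00", "17:00"), (2, "09:00", "18:00", None)],
)

# Column names come from a fixed list in code, never from user input.
punch_columns = ["Punch1", "Punch2", "Punch3"]

# One SELECT per punch column, each tagging its rows with the column name.
parts = [
    f"SELECT UserID, {col} AS Punch, '{col}' AS Identifier FROM EmployeePunches"
    for col in punch_columns
]

# Stack the SELECTs and drop empty punches, as UNPIVOT would.
query = (
    "SELECT UserID, Punch, Identifier FROM ("
    + " UNION ALL ".join(parts)
    + ") AS p WHERE Punch IS NOT NULL ORDER BY UserID, Identifier"
)
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Generating the SELECT list keeps the query manageable when there are ten punch columns, but it does not replace normalizing the table.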
I have a table with some search results. The search results may be repeated because each result may be found using a different metric. I want to query this table and select only the distinct results using the ID column. So to summarize, I have a table with an ID column but the IDs may be repeated, and I want to select only one of each ID with MS Access SQL. How should I go about doing this?
Ok, I have some more info after trying a couple of the suggestions. The Mins and Maxes won't work because the column they are operating on cannot be shown. I get an error like "You tried to execute a query that does not include the specified expression..." I now have all my data sorted; here is what it looks like:
ID|Description|searchScore
97 test 1
97 test .95
120 ball .94
97 test .8
120 ball .7
So the problem is that, since the rows were put into the table using different search criteria, I have duplicated rows with different scores. What I want to do is select only one of each ID, sorted by the searchScore descending. Any ideas?
SELECT DISTINCT ID
FROM Search_Table;
Based on the last update to your question, the following query seems appropriate.
SELECT ID, [Description], Max(searchScore)
FROM Search_Table
GROUP BY ID, [Description];
However that's nearly the same as Gordon's suggestion from yesterday, so I'm unsure whether this is what you want.
Here is a way where you can get one of the search criteria:
select id, min(search_criteria)
from t
group by id
This will always return the first one alphabetically. You can also easily get the last one using max().
You could also use:
select id, first(search_criteria)
from t
group by id
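To check the grouping idea against the sample data from the question, here is a sketch using Python's built-in sqlite3 module instead of MS Access. It keeps the highest searchScore per ID and sorts by that score; first() is Access-specific and is not used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Search_Table (ID INTEGER, Description TEXT, searchScore REAL)"
)
conn.executemany(
    "INSERT INTO Search_Table VALUES (?, ?, ?)",
    [
        (97, "test", 1),
        (97, "test", 0.95),
        (120, "ball", 0.94),
        (97, "test", 0.8),
        (120, "ball", 0.7),
    ],
)

# Every selected column is either in GROUP BY or inside an aggregate,
# which avoids the "does not include the specified expression" error.
best = conn.execute(
    """
    SELECT ID, Description, MAX(searchScore) AS bestScore
    FROM Search_Table
    GROUP BY ID, Description
    ORDER BY MAX(searchScore) DESC
    """
).fetchall()
print(best)
```

This only collapses to one row per ID when Description is the same for every row of that ID, as it is in the sample.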
I am trying to create a query that will return results for the number of distinct users who have accessed something by date. Right now I have a query that will display 2 columns, the first being date and the second being user name. It will list all the distinct users who accessed the application on a certain date but they will each have their own distinct row. Here is the query that does that:
SELECT DISTINCT logdate, User AS ReportUser
FROM table
WHERE appname='abcd1234' AND logdate >=DATE-30
I have tried putting COUNT() around User but it says selected non-aggregate values must be part of the associated group.
Any idea how I can get this query to show just one row per date for the past 30 days, with the count of distinct users?
This will be the right approach for that.
SELECT logdate, Count(User) AS ReportUser
FROM table
WHERE appname='abcd1234' AND logdate >=DATE-30
GROUP BY 1
Never use DISTINCT in Teradata. It always slows down your query performance. Use GROUP BY instead.
CORRECTION In Teradata 13 the optimizer is able to determine which version is more efficient, based on statistics. This can be found in the Release Summary for Teradata 13 under "Group By and DISTINCT Performance Equivalence". - http://www.info.teradata.com/edownload.cfm?itemid=083440012 (PDF)
Use GROUP BY after the WHERE clause:
SELECT logdate, COUNT (User) AS ReportUser
FROM table
WHERE appname='abcd1234' AND logdate >=DATE-30
GROUP BY logdate
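One thing to watch: COUNT(User) counts every matching log row, so a user who accessed the application several times on the same date is counted several times. If the goal is distinct users per date, COUNT(DISTINCT ...) is the standard SQL form. The sketch below shows the difference using Python's built-in sqlite3 module, not Teradata; the table name access_log, the column username, and the rows are made up, and the date filter is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE access_log (logdate TEXT, username TEXT, appname TEXT)"
)
conn.executemany(
    "INSERT INTO access_log VALUES (?, ?, 'abcd1234')",
    [
        ("2020-01-01", "ann"),
        ("2020-01-01", "ann"),  # same user, second access that day
        ("2020-01-01", "bob"),
        ("2020-01-02", "ann"),
    ],
)

# accesses counts log rows; distinct_users counts each user once per date.
counts = conn.execute(
    """
    SELECT logdate,
           COUNT(username) AS accesses,
           COUNT(DISTINCT username) AS distinct_users
    FROM access_log
    WHERE appname = 'abcd1234'
    GROUP BY logdate
    ORDER BY logdate
    """
).fetchall()
print(counts)
```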