How to get last value from a table category wise? - sql

I have a problem retrieving the last value of every category from my table, which should not be sorted. For example, I want the daily inventory value at the last appearance of nov-1 in the table, i.e. "471", without sorting the daily inventory column. Is there a way to achieve this?
Similarly, I need to get the last daily inventory value for the following week, and I should be able to do this for multiple items in the table too.
P.S.: nov-1 means the 1st week of November.

Question from comments of initial post: will I be able to achieve what I need if I introduce a column id? If so, how can I do it?
Here's a way to do it (no guarantee that it's the most efficient way to do it)...
;WITH SetID AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY Week ORDER BY Week) AS rowid, * FROM <TableName>
),
MaxRow AS
(
SELECT LastRecord = MAX(rowid), Week
FROM SetID
GROUP BY Week
)
SELECT a.*
FROM SetID a
INNER JOIN MaxRow b
ON a.rowid = b.LastRecord
AND b.Week = a.Week
ORDER BY a.Week
I feel like there's more to the table though, and this is also untested on large amounts of data. I'd be afraid that a different rowid could be assigned on each run, since ordering by the same column that is used in PARTITION BY leaves the order of rows within each week undefined. (I haven't used ROW_NUMBER() enough to know if this would return unexpected data.)
I suppose this example is to reinforce the idea that, if you had a dedicated row ID on the table, it's possible. Also, I believe @Larnu's comment to you on your original post - introducing an ID column that retains current order, but reinserting all your data - is a concern too.
Here's a SQLFiddle example.
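To illustrate the "dedicated row ID" idea above, here is a minimal sketch run through Python's sqlite3 module (SQLite 3.25+ for window functions). The table name inventory, its columns, and all sample values except 471 are assumptions made up for the illustration; the only requirement is an id column that reflects insertion order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: id is assigned in insertion order, so it can define "last".
conn.execute(
    "CREATE TABLE inventory ("
    "id INTEGER PRIMARY KEY, item TEXT, week TEXT, daily_inventory INTEGER)"
)
conn.executemany(
    "INSERT INTO inventory (item, week, daily_inventory) VALUES (?, ?, ?)",
    [
        ("A", "nov-1", 500), ("A", "nov-1", 490), ("A", "nov-1", 471),
        ("A", "nov-2", 460), ("A", "nov-2", 455),
        ("B", "nov-1", 90), ("B", "nov-1", 95),
    ],
)

# Number the rows of each (item, week) from the newest id down; rn = 1 is the last row.
query = """
SELECT item, week, daily_inventory
FROM (
    SELECT i.*,
           ROW_NUMBER() OVER (PARTITION BY item, week ORDER BY id DESC) AS rn
    FROM inventory AS i
) AS ranked
WHERE rn = 1
ORDER BY item, week
"""
last_rows = conn.execute(query).fetchall()
print(last_rows)
```

Because the window is ordered by id rather than by the partition column, the same row is picked on every run.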

Related

Query monitoring changes in the field

I need to write a query that shows the changes certain fields have undergone in a given date period.
Example: from the CAM_CONCEN table, return the records where the CONTACT field of an ACCOUNT_NUMBER was modified in the 6 months before a given date.
I would be grateful if you could guide me.
You can use LAG() to peek at the previous row of a particular subset of rows (the same account in this case).
For example:
select *
from (
select c.*,
lag(contact) over(partition by account_number
order by change_date) as prev_contact
from cam_concen c
) x
where contact <> prev_contact
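The query above does not restrict the period. A rough sketch of adding the 6-month window, run through Python's sqlite3 module (SQLite 3.25+), might look like this. The sample rows, the as_of reference date, and storing change_date as ISO text are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cam_concen (account_number INTEGER, contact TEXT, change_date TEXT)"
)
conn.executemany(
    "INSERT INTO cam_concen VALUES (?, ?, ?)",
    [
        (1, "alice@x", "2023-01-10"),
        (1, "alice@x", "2023-08-01"),
        (1, "alice@y", "2023-10-15"),  # changed inside the window
        (2, "bob@x", "2023-09-01"),
        (2, "bob@x", "2023-11-01"),    # no change
        (3, "carol@x", "2022-12-01"),
        (3, "carol@y", "2023-02-01"),  # changed, but before the window
    ],
)

# LAG runs over the full history first; the date filter is applied afterwards,
# so the first row inside the window is still compared with the row before it.
query = """
SELECT account_number, prev_contact, contact, change_date
FROM (
    SELECT c.*,
           LAG(contact) OVER (PARTITION BY account_number
                              ORDER BY change_date) AS prev_contact
    FROM cam_concen AS c
) AS x
WHERE contact <> prev_contact
  AND change_date >= date(:as_of, '-6 months')
  AND change_date <= :as_of
ORDER BY account_number, change_date
"""
changes = conn.execute(query, {"as_of": "2023-12-01"}).fetchall()
print(changes)
```

Note that contact <> prev_contact is never true when either side is NULL, so the first row of each account and changes to or from NULL are not reported.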

INSERT INTO two columns from a SELECT query

I have a table called VIEWS with Id, Day, Month, name of video, name of browser... but I'm interested only in Id, Day and Month.
The ID can be duplicate because the user (ID) can watch a video multiple days in multiple months.
This is the query for the minimum date and the maximum date.
SELECT ID, CONCAT(MIN(DAY), '/', MIN(MONTH)) AS MIN_DATE,
CONCAT(MAX(DAY), '/', MAX(MONTH)) AS MAX_DATE
FROM Views
GROUP BY ID
I want to insert the two columns from this select (MIN_DATE and MAX_DATE) into two new columns using INSERT INTO.
What would the INSERT INTO query look like?
To do what you are trying to do (there are some issues with your solution, please read my comments below), first you need to add the new columns to the table.
ALTER TABLE Views ADD MIN_DATE VARCHAR(10)
ALTER TABLE Views ADD MAX_DATE VARCHAR(10)
Then you need to UPDATE your new columns (not INSERT, because you don't want new rows). Determine the min/max for each ID, then join the result back to the table to be able to update each row. You can't update directly from a GROUP BY as rows are grouped and lose their original row.
;WITH MinMax AS
(
SELECT
ID,
CONCAT(MIN(V.DAY), '/', MIN(V.MONTH)) AS MIN_DATE,
CONCAT(MAX(V.DAY), '/', MAX(V.MONTH)) AS MAX_DATE
FROM
Views AS V
GROUP BY
ID
)
UPDATE V SET
MIN_DATE = M.MIN_DATE,
MAX_DATE = M.MAX_DATE
FROM
MinMax AS M
INNER JOIN Views AS V ON M.ID = V.ID
The problems that I see with this design are:
Storing aggregated columns: you usually want to do this only for performance reasons (which I believe is not the case here), as querying the aggregated (grouped) rows is faster because there are fewer rows to read. The problem is that you will have to update the grouped values each time one of the original rows is updated, which adds extra processing time. Another option would be to update the aggregated values periodically, but you will have to accept that for a period of time the grouped values do not really represent the tracking table.
Keeping aggregated columns on the same table as the data they are aggregating: this is a normalization problem. Updating or inserting a row will trigger an update of all rows with the same ID, as the min/max values might have changed. Also, the min/max values will always be repeated on all rows that belong to the same ID, which wastes space. If you had to save aggregated data, you would need to save it in a different table, which causes the problems I listed in the previous point.
Using a text data type to store dates: you always want to work with dates in a proper DATETIME data type. This will not only enable you to use date functions like DATEADD or DATEDIFF, but also save space (varchars that store dates need more bytes than DATETIME). I don't see the year part in your query; it should be considered when computing a min/max (this might depend on what you are storing in this table).
Computing the min/max incorrectly: If you have the following rows:
ID DAY MONTH
1 5 1
1 3 2
The current result of your query would be 3/1 as MIN_DATE and 5/2 as MAX_DATE, which I believe is not what you are trying to find. The lowest here should be the 5th of January and the highest the 3rd of February. This is a consequence of storing date parts as independent values and not the whole date as a DATETIME.
What you usually want to do for this scenario is to group directly on the query that needs the data grouped, so you will do the GROUP BY on the SELECT that needs the min/max. Having an index by ID would make the grouping very fast. Thus, you save the storage space you would use to keep the aggregated values and also the result is always the real grouped result at the time that you are querying.
It would be something like the following:
;WITH MinMax AS
(
SELECT
ID,
CONCAT(MIN(V.DAY), '/', MIN(V.MONTH)) AS MIN_DATE, -- Date problem (varchar + min/max computed separately)
CONCAT(MAX(V.DAY), '/', MAX(V.MONTH)) AS MAX_DATE -- Date problem (varchar + min/max computed separately)
FROM
Views AS V
GROUP BY
ID
)
SELECT
V.*,
M.MIN_DATE,
M.MAX_DATE
FROM
MinMax AS M
INNER JOIN Views AS V ON M.ID = V.ID
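The min/max problem described above can be reproduced with the two sample rows. The sketch below uses Python's sqlite3 module, so || replaces CONCAT; combining month and day into one sortable number (MONTH * 100 + DAY) is just an illustrative stand-in for a real date column, not something from the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Views (ID INTEGER, DAY INTEGER, MONTH INTEGER)")
# The two sample rows from the answer: 5 January and 3 February.
conn.executemany("INSERT INTO Views VALUES (?, ?, ?)", [(1, 5, 1), (1, 3, 2)])

# MIN/MAX taken on each part separately mixes parts from different rows.
separate = conn.execute("""
    SELECT ID,
           MIN(DAY) || '/' || MIN(MONTH) AS MIN_DATE,
           MAX(DAY) || '/' || MAX(MONTH) AS MAX_DATE
    FROM Views
    GROUP BY ID
""").fetchall()

# MIN/MAX taken on one combined key keeps day and month from the same row.
combined = conn.execute("""
    SELECT ID,
           (MIN(MONTH * 100 + DAY) % 100) || '/' || (MIN(MONTH * 100 + DAY) / 100) AS MIN_DATE,
           (MAX(MONTH * 100 + DAY) % 100) || '/' || (MAX(MONTH * 100 + DAY) / 100) AS MAX_DATE
    FROM Views
    GROUP BY ID
""").fetchall()

print(separate)  # [(1, '3/1', '5/2')]
print(combined)  # [(1, '5/1', '3/2')]
```

With a proper date column the same result falls out of a plain MIN(date) and MAX(date), and the year is handled too.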

How to work past "At most one record can be returned by this subquery"

I'm having trouble understanding this error through all the researching I have done. I have the following query
SELECT M.[PO Concatenate], Sum(M.SumofAward) AS TotalAward, (SELECT TOP 1 M1.[Material Group] FROM
[MGETCpreMG] AS M1 WHERE M1.[PO Concatenate]=M.[PO Concatenate] ORDER BY M1.SumofAward DESC) AS TopGroup
FROM MGETCpreMG AS M
GROUP BY M.[PO Concatenate];
For a brief instant it shows the results I want, but then the "At most one record can be returned by this subquery" error appears and replaces all the data with #Name?
For context, [MGETCpreMG] is a query off a main table [MG ETC] that was used to consolidate Award for differing Material Groups on a PO transaction ([PO Concatenate])
SELECT [MG ETC].[PO Concatenate], Sum([MG ETC].Award) AS SumOfAward, [MG ETC].[Material Group]
FROM [MG ETC]
GROUP BY [MG ETC].[PO Concatenate], [MG ETC].[Material Group]
ORDER BY [MG ETC].[PO Concatenate];
I'm thinking it lies in my inability to understand how to utilize a subquery.
What about the case in which the subquery can return more than one value? Simply add an additional column to the ORDER BY.
So, a common sub query might be to get the last invoice. So you might have:
select ID, CompanyName,
(SELECT TOP 1 InvoiceDate from tblInvoice
where tblInvoice.CustomerID = tblCustomers.ID
Order by InvoiceDate DESC)
As LastInvoiceDate
From tblCustomers
Now the above might work for some time, but then it will blow up, since you might have two invoices on the same day!
So, all you have to do is add that extra ORDER BY column - say on the PK of the child table, like this:
Order by InvoiceDate DESC,ID DESC)
So TOP 1 will respect the additional ORDER BY columns you add, and thus only ever return one row, even if several rows tie on the first column. (In Access, TOP returns every row that ties on the ORDER BY values, which is why a tie makes the subquery return more than one record.)
I suppose in the above we could forget the InvoiceDate and always take the highest autonumber ID, but for a lot of queries you can't always be sure - it might be that we want the most expensive invoice. And again, if the max value was the same for two large invoice amounts, two rows could be returned. So, simply add a 2nd column to the ORDER BY clause that further orders the data, and TOP 1 will only pull the first value. Your top group query is such a case. Just tack on the extra ORDER BY "ID", or whatever the autonumber ID column is.
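To check whether ties are really the cause, a query like the following can list the PO values whose top award is shared by several material groups. This is a sketch run on SQLite through Python's sqlite3 module, with a hypothetical po_groups table standing in for [MGETCpreMG] and made-up rows. SQLite's LIMIT 1 never returns ties, so the second query only illustrates the tie-breaker; in Access the equivalent would be TOP 1 with the extra ORDER BY column. Since [MGETCpreMG] is grouped by PO and material group, the material group itself is unique within a PO and can serve as the tie-breaker.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the [MGETCpreMG] query: one row per PO and material group.
conn.execute(
    "CREATE TABLE po_groups (po TEXT, material_group TEXT, sum_of_award INTEGER)"
)
conn.executemany(
    "INSERT INTO po_groups VALUES (?, ?, ?)",
    [
        ("PO-1", "Steel", 100), ("PO-1", "Copper", 100), ("PO-1", "Zinc", 40),
        ("PO-2", "Steel", 70), ("PO-2", "Copper", 30),
    ],
)

# POs where more than one row holds the maximum award, i.e. where TOP 1 would tie.
tied = conn.execute("""
    SELECT g.po, COUNT(*) AS rows_at_top
    FROM po_groups AS g
    WHERE g.sum_of_award = (SELECT MAX(g2.sum_of_award)
                            FROM po_groups AS g2
                            WHERE g2.po = g.po)
    GROUP BY g.po
    HAVING COUNT(*) > 1
""").fetchall()

# The second ORDER BY column makes the choice among tied rows deterministic.
top_group = conn.execute("""
    SELECT g.po,
           (SELECT g1.material_group
            FROM po_groups AS g1
            WHERE g1.po = g.po
            ORDER BY g1.sum_of_award DESC, g1.material_group ASC
            LIMIT 1) AS top_group
    FROM po_groups AS g
    GROUP BY g.po
    ORDER BY g.po
""").fetchall()

print(tied)       # [('PO-1', 2)]
print(top_group)  # [('PO-1', 'Copper'), ('PO-2', 'Steel')]
```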

SQL Server: I have multiple records per day and I want to return only the first of the day

I have some records that track inquiries by DATETIME. There is a glitch in the system, and sometimes a record is entered multiple times on the same day. I have a query with a bunch of correlated subqueries attached to these, but the numbers are off because these leads show up multiple times whenever the glitch occurred. I need the first entry of the day. I tried fooling around with MIN, but I couldn't quite get it to work.
I currently have this, I am not sure if I am on the right track though.
SELECT SL.UserID, MIN(SL.Added) OVER (PARTITION BY SL.UserID)
FROM SourceLog AS SL
Here's one approach using row_number():
select *
from (
select *,
row_number() over (partition by userid, cast(added as date) order by added) rn
from sourcelog
) t
where rn = 1
You could use group by along with min to accomplish this.
How you do it depends on how your data is structured. If you assign a unique sequential number to each record as it is created, you could just return the lowest number created per day. Otherwise you would need to return the ID of the record with the earliest DATETIME value per day.
--Assumes sequential IDs
select
min(Id)
from
[YourTable]
group by
--the conversion is used to strip the time value out of the date/time
convert(date, [YourDateTime])
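As a runnable illustration of the row_number() approach, here is a sketch using Python's sqlite3 module (SQLite 3.25+). The Id column and the sample rows are assumptions; date(Added) plays the role of cast(added as date), and Id is added to the ORDER BY as a tie-breaker for records with identical timestamps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SourceLog (Id INTEGER PRIMARY KEY, UserID INTEGER, Added TEXT)"
)
conn.executemany(
    "INSERT INTO SourceLog (UserID, Added) VALUES (?, ?)",
    [
        (1, "2023-05-01 09:00:00"),
        (1, "2023-05-01 09:00:05"),  # glitch: same user, same day
        (1, "2023-05-02 14:30:00"),
        (2, "2023-05-01 11:15:00"),
        (2, "2023-05-01 11:15:00"),  # glitch: identical timestamp
    ],
)

# One partition per user per calendar day; rn = 1 is the earliest record in it.
first_per_day = conn.execute("""
    SELECT Id, UserID, Added
    FROM (
        SELECT s.*,
               ROW_NUMBER() OVER (PARTITION BY UserID, date(Added)
                                  ORDER BY Added, Id) AS rn
        FROM SourceLog AS s
    ) AS t
    WHERE rn = 1
    ORDER BY UserID, Added
""").fetchall()
print(first_per_day)
```

Unlike the min(Id) variant, this returns the whole row, so the result can be joined or filtered further without a second lookup.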

Find row number in a sort based on row id, then find its neighbours

Say that I have some SELECT statement:
SELECT id, name FROM people
ORDER BY name ASC;
I have a few million rows in the people table and the ORDER BY clause can be much more complex than what I have shown here (possibly operating on a dozen columns).
I retrieve only a small subset of the rows (say rows 1..11) in order to display them in the UI. Now, I would like to solve following problems:
Find the number of a row with a given id.
Display the 5 items before and the 5 items after a row with a given id.
Problem 2 is easy to solve once I have solved problem 1, as I can then use something like this if I know that the item I was looking for has row number 1000 in the sorted result set (this is the Firebird SQL dialect):
SELECT id, name FROM people
ORDER BY name ASC
ROWS 995 TO 1005;
I also know that I can find the rank of a row by counting all of the rows which come before the one I am looking for, but this can lead to very long WHERE clauses with tons of OR and AND in the condition. And I have to do this repeatedly. With my test data, this takes hundreds of milliseconds, even when using properly indexed columns, which is way too slow.
Is there some means of achieving this by using SQL:2003 features (such as row_number, supported in Firebird 3.0)? I am by no means an SQL guru and I need some pointers here. Could I create a cached view where the result would include a rank/dense rank/row index?
Firebird appears to support window functions (called analytic functions in Oracle). So you can do the following:
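To find the "row" number of a row with a given id:
select rownum
from (select id, row_number() over (partition by NULL order by name, id) as rownum
from t
) t -- number all rows first; a where at the inner level would run before row_number()
where id = <id>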
This assumes the id's are unique.
To solve the second problem:
select t.*
from (select id, row_number() over (partition by NULL order by name, id) as rownum
from t
) t join
(select rownum
from (select id, row_number() over (partition by NULL order by name, id) as rownum
from t
) r -- number all rows first, then pick the target row
where id = <id>
) tid
on t.rownum between tid.rownum - 5 and tid.rownum + 5
I might suggest something else, though, if you can modify the table structure. Most databases offer the ability to add an auto-increment column when a row is inserted. If your records are never deleted, this can serve as your counter, simplifying your queries.
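Both problems can be sketched end to end with Python's sqlite3 module (SQLite 3.25+). The people rows, the target id, and the neighbour count k are made up for the illustration; the point is that the numbering happens in a CTE over all rows and the filter on id is applied afterwards, because a WHERE at the same level would run before row_number().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
# name01..name20, with ids running opposite to the name order,
# so the sorted position is clearly not the id.
conn.executemany(
    "INSERT INTO people (id, name) VALUES (?, ?)",
    [(21 - n, f"name{n:02d}") for n in range(1, 21)],
)

# Number every row once, then join the numbered set to the target row.
query = """
WITH numbered AS (
    SELECT id, name, ROW_NUMBER() OVER (ORDER BY name, id) AS rownum
    FROM people
)
SELECT n.rownum, n.id, n.name
FROM numbered AS n
JOIN numbered AS target ON target.id = :id
WHERE n.rownum BETWEEN target.rownum - :k AND target.rownum + :k
ORDER BY n.rownum
"""
neighbours = conn.execute(query, {"id": 11, "k": 2}).fetchall()
print(neighbours)
```

Here id 11 sits at position 10 in the name order, and the query returns positions 8 through 12. Numbering all rows still means sorting the whole table, so this does not by itself address the performance concern in the question.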