Filling one table using another table in SQL Server - sql

I have two SQL tables as follows:
As you may note, the first table has a monthly frequency (date column), while the second table has a quarterly frequency. Here is what I would like to do:
For each issueid from table 1, I would like to look at the date, determine what is the previous end of quarter, and go fetch data from table 2 corresponding to that issue for that end of quarter, and insert it in the first table in the last two columns.
For example: take issueid 123456 and date 1/31/2014. The previous end of quarter is 12/31/2013. I would like to go to table 1, copy q_exp and q_act that correspond to that issueid and 12/31/2013, and paste it into the first table.
Of course, I would like to fill the entire first table and minimize manual inserts.
Any help would be appreciated! Thanks!

Try the following query
UPDATE issues
SET q_exp=(SELECT TOP 1 q.q_exp
FROM quarterlyTable q
WHERE q.issueid=i.issueid
AND q.[date]<=i.[date]
ORDER BY q.[date] DESC)
,q_act= (SELECT TOP 1 q.q_act
FROM quarterlyTable q
WHERE q.issueid=i.issueid
AND q.[date]<=i.[date]
ORDER BY q.[date] DESC)
FROM issues i

Related

Update table based on LAG value and condition

I'm trying to update the table with a lagging value for specific field when that field is different from '1900-01-01 00:00:00'
select Ticket_ID, Business_Area, Priority, HF_Client_Name, closed_date, Closed_Date_ID, Next_Create_date from schema.table_name order by closed_date desc;
And this is the result I'm trying to get:
I truly appreciate any help I can get. My DB is MYSQL db.
I'm working on a solution with a LAG function, an will paste my code as soon as I have something worth showing.
Thank you all!
Rosa
I've managed to solve this by creating a temp table which contains all the tickets with 0. I joined my original table without the 0 tickets to this second temp table.
The most important piece was to create a temp table with maximal next created date based on that month:
select a.business_area, a.priority, a.hf_client_name, DATE_FORMAT(a.closed_date ,'%Y-%m') month_of_ticket, max(next_created_date) next_created_date
from tmp_next_lead_date a
group by a.business_area, a.priority, a.hf_client_name, DATE_FORMAT(a.closed_date ,'%Y-%m');
After that I just joined the two tables based on month of ticket and the rest of the key.
select a.ticket_id, a.business_area, a.priority, a.hf_client_name, a.closed_date,
/*DATE_FORMAT(a.closed_date ,'%Y-%m') month_of_ticket,*/ a.closed_date_id, b.next_created_date
from tmp_missing_combination a
join tmp_next_created_Date b
on a.business_area=b.business_area
and a.hf_client_name=b.hf_client_name
and **DATE_FORMAT(a.closed_date ,'%Y-%m')=b.month_of_ticket** ;

SQL Server Sum multiple rows into one - no temp table

I would like to see a most concise way to do what is outlined in this SO question: Sum values from multiple rows into one row
that is, combine multiple rows while summing a column.
But how to then delete the duplicates. In other words I have data like this:
Person Value
--------------
1 10
1 20
2 15
And I want to sum the values for any duplicates (on the Person col) into a single row and get rid of the other duplicates on the Person value. So my output would be:
Person Value
-------------
1 30
2 15
And I would like to do this without using a temp table. I think that I'll need to use OVER PARTITION BY but just not sure. Just trying to challenge myself in not doing it the temp table way. Working with SQL Server 2008 R2
Simply put, give me a concise stmt getting from my input to my output in the same table. So if my table name is People if I do a select * from People on it before the operation that I am asking in this question I get the first set above and then when I do a select * from People after the operation, I get the second set of data above.
Not sure why not using Temp table but here's one way to avoid it (tho imho this is an overkill):
UPDATE MyTable SET VALUE = (SELECT SUM(Value) FROM MyTable MT WHERE MT.Person = MyTable.Person);
WITH DUP_TABLE AS
(SELECT ROW_NUMBER()
OVER (PARTITION BY Person ORDER BY Person) As ROW_NO
FROM MyTable)
DELETE FROM DUP_TABLE WHERE ROW_NO > 1;
First query updates every duplicate person to the summary value. Second query removes duplicate persons.
Demo: http://sqlfiddle.com/#!3/db7aa/11
All you're asking for is a simple SUM() aggregate function and a GROUP BY
SELECT Person, SUM(Value)
FROM myTable
GROUP BY Person
The SUM() by itself would sum up the values in a column, but when you add a secondary column and GROUP BY it, SQL will show distinct values from the secondary column and perform the aggregate function by those distinct categories.

SQL - Insert using Column based on SELECT result

I currently have a table called tempHouses that looks like:
avgprice | dates | city
dates are stored as yyyy-mm-dd
However I need to move the records from that table into a table called houses that looks like:
city | year2002 | year2003 | year2004 | year2005 | year2006
The information in tempHouses contains average house prices from 1995 - 2014.
I know I can use SUBSTRING to get the year from the dates:
SUBSTRING(dates, 0, 4)
So basically for each city in tempHouses.city I need to get the the average house price from the above years into one record.
Any ideas on how I would go about doing this?
This is an SQL Server approach, and a PIVOT may be a better, but here's one way:
SELECT City,
AVG(year2002) AS year2002,
AVG(year2003) AS year2003,
AVG(year2004) AS year2004
FROM (
SELECT City,
CASE WHEN Dates BETWEEN '2002-01-01T00:00:00' AND '2002-12-31T23:59:59' THEN avgprice
ELSE 0
END AS year2002,
CASE WHEN Dates BETWEEN '2003-01-01T00:00:00' AND '2003-12-31T23:59:59' THEN avgprice
ELSE 0
END AS year2003
CASE WHEN Dates BETWEEN '2004-01-01T00:00:00' AND '2004-12-31T23:59:59' THEN avgprice
ELSE 0
END AS year2004
-- Repeat for each year
)
GROUP BY City
The inner query gets the data into the correct format for each record (City, year2002, year2003, year2004), whilst the outer query gets the average for each City.
There many be many ways to do this, and performance may be the deciding factor on which one to choose.
The best way would be to use a script to perform the query execution for you because you will need to run it multiple times and you extract the data based on year. Make sure that the only required columns are city & row id:
http://dev.mysql.com/doc/refman/5.0/en/insert-select.html
INSERT INTO <table> (city) VALUES SELECT DISTINCT `city` from <old_table>;
Then for each city extract the average values, insert them into a temporary table and then insert into the main table.
SELECT avg(price), substring(dates, 0, 4) dates from <old_table> GROUP BY dates;
Otherwise you're looking at a combination query using joins and potentially unions to extrapolate the data. Because you're flattening the table into a single row per city it's going to be a little tough to do. You should create indexes first on the date column if you don't want the database query to fail with memory limits or just take a very long time to execute.

finding consecutive date pairs in SQL

I have a question here that looks a little like some of the ones that I found in search, but with solutions for slightly different problems and, importantly, ones that don't work in SQL 2000.
I have a very large table with a lot of redundant data that I am trying to reduce down to just the useful entries. It's a history table, and the way it works, if two entries are essentially duplicates and consecutive when sorted by date, the latter can be deleted. The data from the earlier entry will be used when historical data is requested from a date between the effective date of that entry and the next non-duplicate entry.
The data looks something like this:
id user_id effective_date important_value useless_value
1 1 1/3/2007 3 0
2 1 1/4/2007 3 1
3 1 1/6/2007 NULL 1
4 1 2/1/2007 3 0
5 2 1/5/2007 12 1
6 3 1/1/1899 7 0
With this sample set, we would consider two consecutive rows duplicates if the user_id and the important_value are the same. From this sample set, we would only delete row with id=2, preserving the information from 1-3-2007, showing that the important_value changed on 1-6-2007, and then showing the relevant change again on 2-1-2007.
My current approach is awkward and time-consuming, and I know there must be a better way. I wrote a script that uses a cursor to iterate through the user_id values (since that breaks the huge table up into manageable pieces), and creates a temp table of just the rows for that user. Then to get consecutive entries, it takes the temp table, joins it to itself on the condition that there are no other entries in the temp table with a date between the two dates. In the pseudocode below, UDF_SameOrNull is a function that returns 1 if the two values passed in are the same or if they are both NULL.
WHILE (##fetch_status <> -1)
BEGIN
SELECT * FROM History INTO #history WHERE user_id = #UserId
--return entries to delete
SELECT h2.id
INTO #delete_history_ids
FROM #history h1
JOIN #history h2 ON
h1.effective_date < h2.effective_date
AND dbo.UDF_SameOrNull(h1.important_value, h2.important_value)=1
WHERE NOT EXISTS (SELECT 1 FROM #history hx WHERE hx.effective_date > h1.effective_date and hx.effective_date < h2.effective_date)
DELETE h1
FROM History h1
JOIN #delete_history_ids dh ON
h1.id = dh.id
FETCH NEXT FROM UserCursor INTO #UserId
END
It also loops over the same set of duplicates until there are none, since taking out rows creates new consecutive pairs that are potentially dupes. I left that out for simplicity.
Unfortunately, I must use SQL Server 2000 for this task and I am pretty sure that it does not support ROW_NUMBER() for a more elegant way to find consecutive entries.
Thanks for reading. I apologize for any unnecessary backstory or errors in the pseudocode.
OK, I think I figured this one out, excellent question!
First, I made the assumption that the effective_date column will not be duplicated for a user_id. I think it can be modified to work if that is not the case - so let me know if we need to account for that.
The process basically takes the table of values and self-joins on equal user_id and important_value and prior effective_date. Then, we do 1 more self-join on user_id that effectively checks to see if the 2 joined records above are sequential by verifying that there is no effective_date record that occurs between those 2 records.
It's just a select statement for now - it should select all records that are to be deleted. So if you verify that it is returning the correct data, simply change the select * to delete tcheck.
Let me know if you have questions.
select
*
from
History tcheck
inner join History tprev
on tprev.[user_id] = tcheck.[user_id]
and tprev.important_value = tcheck.important_value
and tprev.effective_date < tcheck.effective_date
left join History checkbtwn
on tcheck.[user_id] = checkbtwn.[user_id]
and checkbtwn.effective_date < tcheck.effective_date
and checkbtwn.effective_date > tprev.effective_date
where
checkbtwn.[user_id] is null
OK guys, I did some thinking last night and I think I found the answer. I hope this helps someone else who has to match consecutive pairs in data and for some reason is also stuck in SQL Server 2000.
I was inspired by the other results that say to use ROW_NUMBER(), and I used a very similar approach, but with an identity column.
--create table with identity column
CREATE TABLE #history (
id int,
user_id int,
effective_date datetime,
important_value int,
useless_value int,
idx int IDENTITY(1,1)
)
--insert rows ordered by effective_date and now indexed in order
INSERT INTO #history
SELECT * FROM History
WHERE user_id = #user_id
ORDER BY effective_date
--get pairs where consecutive values match
SELECT *
FROM #history h1
JOIN #history h2 ON
h1.idx+1 = h2.idx
WHERE h1.important_value = h2.important_value
With this approach, I still have to iterate over the results until it returns nothing, but I can't think of any way around that and this approach is miles ahead of my last one.

Getting the last record in SQL in WHERE condition

i have loanTable that contain two field loan_id and status
loan_id status
==============
1 0
2 9
1 6
5 3
4 5
1 4 <-- How do I select this??
4 6
In this Situation i need to show the last Status of loan_id 1 i.e is status 4. Can please help me in this query.
Since the 'last' row for ID 1 is neither the minimum nor the maximum, you are living in a state of mild confusion. Rows in a table have no order. So, you should be providing another column, possibly the date/time when each row is inserted, to provide the sequencing of the data. Another option could be a separate, automatically incremented column which records the sequence in which the rows are inserted. Then the query can be written.
If the extra column is called status_id, then you could write:
SELECT L1.*
FROM LoanTable AS L1
WHERE L1.Status_ID = (SELECT MAX(Status_ID)
FROM LoanTable AS L2
WHERE L2.Loan_ID = 1);
(The table aliases L1 and L2 could be omitted without confusing the DBMS or experienced SQL programmers.)
As it stands, there is no reliable way of knowing which is the last row, so your query is unanswerable.
Does your table happen to have a primary id or a timestamp? If not then what you want is not really possible.
If yes then:
SELECT TOP 1 status
FROM loanTable
WHERE loan_id = 1
ORDER BY primaryId DESC
-- or
-- ORDER BY yourTimestamp DESC
I assume that with "last status" you mean the record that was inserted most recently? AFAIK there is no way to make such a query unless you add timestamp into your table where you store the date and time when the record was added. RDBMS don't keep any internal order of the records.
But if last = last inserted, that's not possible for current schema, until a PK addition:
select top 1 status, loan_id
from loanTable
where loan_id = 1
order by id desc -- PK
Use a data reader. When it exits the while loop it will be on the last row. As the other posters stated unless you put a sort on the query, the row order could change. Even if there is a clustered index on the table it might not return the rows in that order (without a sort on the clustered index).
SqlDataReader rdr = SQLcmd.ExecuteReader();
while (rdr.Read())
{
}
string lastVal = rdr[0].ToString()
rdr.Close();
You could also use a ROW_NUMBER() but that requires a sort and you cannot use ROW_NUMBER() directly in the Where. But you can fool it by creating a derived table. The rdr solution above is faster.
In oracle database this is very simple.
select * from (select * from loanTable order by rownum desc) where rownum=1
Hi if this has not been solved yet.
To get the last record for any field from a table the easiest way would be to add an ID to each record say pID. Also say that in your table you would like to hhet the last record for each 'Name', run the simple query
SELECT Name, MAX(pID) as LastID
INTO [TableName]
FROM [YourTableName]
GROUP BY [Name]/[Any other field you would like your last records to appear by]
You should now have a table containing the Names in one column and the last available ID for that Name.
Now you can use a join to get the other details from your primary table, say this is some price or date then run the following:
SELECT a.*,b.Price/b.date/b.[Whatever other field you want]
FROM [TableName] a LEFT JOIN [YourTableName]
ON a.Name = b.Name and a.LastID = b.pID
This should then give you the last records for each Name, for the first record run the same queries as above just replace the Max by Min above.
This should be easy to follow and should run quicker as well
If you don't have any identifying columns you could use to get the insert order. You can always do it like this. But it's hacky, and not very pretty.
select
t.row1,
t.row2,
ROW_NUMBER() OVER (ORDER BY t.[count]) AS rownum from (
select
tab.row1,
tab.row2,
1 as [count]
from table tab) t
So basically you get the 'natural order' if you can call it that, and add some column with all the same data. This can be used to sort by the 'natural order', giving you an opportunity to place a row number column on the next query.
Personally, if the system you are using hasn't got a time stamp/identity column, and the current users are using the 'natural order', I would quickly add a column and use this query to create some sort of time stamp/incremental key. Rather than risking having some automation mechanism change the 'natural order', breaking the data needed.
I think this code may help you:
WITH cte_Loans
AS
(
SELECT LoanID
,[Status]
,ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS RN
FROM LoanTable
)
SELECT LoanID
,[Status]
FROM LoanTable L1
WHERE RN = ( SELECT max(RN)
FROM LoanTable L2
WHERE L2.LoanID = L1.LoanID)