Update part of the table for a specific category - sql

Let's imagine I have a similar table to this one:
ID | Country | time       | location 1 | location 2 | count_clients
---+---------+------------+------------+------------+--------------
 1 | PL      | 2019-01-01 | JAK        | ADD3       | 23
 2 | PL      | 2019-03-01 | GGF        | ADD5       | 34
 3 | PL      | 2019-01-01 | J3K        | 55D3       | 67
 4 | NL      | 2019-04-01 | FDK        | AGH3       | 2
 5 | NL      | 2019-01-01 | GGK        | AFF3       | 234
It's an aggregated table. The source contains one row per client; my table aggregates it to show the number of clients per country, time, location 1 and location 2. It's updated by loading new rows only (new dates). First they are loaded to a stage table and then, after some modifications, to the final table. The values loaded to the stage table are already aggregated, and the stage table contains only the new rows.
BUT I just learned that rows in the source table can be deleted, which means a "count_clients" value can change or disappear. What's also important: I know which country, location 1 and location 2 are affected, but I don't know WHEN they were changed (was it before or after the last load? I don't know).
Do you know any smart ways to handle it? Currently I load the new rows plus the rows affected by changes to the stage table, then remove the affected rows from the final table and load the stage rows into the final table.
The source table is huge, so I'm looking for a solution that updates only the part of the table affected by the changes. Please remember that the stage table holds only the new rows that need to be inserted plus the rows that were changed. I wanted to use the MERGE statement, but to do that I would need to use a part of the table as the target, not the whole table. I tried it, but it didn't work.
I tried to do something like:
MERGE INTO (SELECT country, [time], location1, location2, [count]
            FROM myFinalTable
            JOIN myStageTable
              ON country = country AND location1 = location1 AND location2 = location2)  -- target = only rows affected by the change
USING myStageTable
   ON country = country AND location1 = location1 AND location2 = location2
WHEN MATCHED THEN
    UPDATE SET [count] = [count]
WHEN NOT MATCHED BY TARGET THEN INSERT  -- insert new uploads
WHEN NOT MATCHED BY SOURCE THEN DELETE
but it looks like I can't use a 'select' statement as the MERGE target..?
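For what it's worth, a pattern that does work in SQL Server is to restrict the target through an updatable CTE rather than a derived table; WHEN NOT MATCHED BY SOURCE then only deletes inside that subset. A minimal sketch, assuming the table names myFinalTable/myStageTable and the columns shown above:

```sql
-- Limit the MERGE target to rows whose (country, location1, location2)
-- appears in the stage table; deletes stay confined to this subset.
WITH affected AS (
    SELECT country, [time], location1, location2, count_clients
    FROM myFinalTable f
    WHERE EXISTS (SELECT 1
                  FROM myStageTable s
                  WHERE s.country   = f.country
                    AND s.location1 = f.location1
                    AND s.location2 = f.location2)
)
MERGE affected AS tgt
USING myStageTable AS src
   ON tgt.country   = src.country
  AND tgt.[time]    = src.[time]
  AND tgt.location1 = src.location1
  AND tgt.location2 = src.location2
WHEN MATCHED AND tgt.count_clients <> src.count_clients THEN
    UPDATE SET count_clients = src.count_clients
WHEN NOT MATCHED BY TARGET THEN
    INSERT (country, [time], location1, location2, count_clients)
    VALUES (src.country, src.[time], src.location1, src.location2, src.count_clients)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;
```

Because the CTE selects from a single base table, it is updatable, so the UPDATE/INSERT/DELETE actions fall through to myFinalTable while rows outside the affected subset are never touched.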

Get the row with latest start date from multiple tables using sub select

I have data from 3 tables, copied below. I am not using joins to get the data; I don't know how to use joins in a multiple-table scenario. My situation is to update the EFF_STOP_TS of the rows with the old EFF_START_TS dates to SYSDATE in one of the tables when more than 2 rows are returned for a particular user.
subscription_id | Client_id
----------------+----------
20685413        | 37455837

reward_account_id | subscription_id | CURRENCY_BAL_AMT | CREATE_TS
------------------+-----------------+------------------+----------
439111697         | 20685413        | -40              | 1-09-10

REWARD_ACCT_DETAIL_ID | REWARD_ACCOUNT_ID | EFF_START_TS | EFF_STOP_TS
----------------------+-------------------+--------------+------------
230900968             | 439111697         | 14-06-11     | 15-01-19
47193932              | 439111697         | 19-02-14     | 19-12-21
243642632             | 439111697         | 18-03-23     | 99-12-31
247192972             | 439111697         | 17-11-01     | 17-11-01
The SQL should update the EFF_STOP_TS of the last table for every row except the second one (47193932), because that has the latest EFF_START_TS.
The expected result is to update the EFF_STOP_TS column of 230900968, 243642632 and 247192972 to SYSDATE.
As per my understanding, you need to update it per REWARD_ACCOUNT_ID, so you can try the code below:
UPDATE REWARD_ACCT_DETAIL RAD
SET EFF_STOP_TS = SYSDATE
WHERE EFF_START_TS NOT IN (SELECT MAX(EFF_START_TS)
                           FROM REWARD_ACCT_DETAIL RAD1
                           WHERE RAD.REWARD_ACCOUNT_ID = RAD1.REWARD_ACCOUNT_ID)
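An equivalent that also survives ties on EFF_START_TS is to rank the rows per account and update everything but the top row. This is a sketch, assuming Oracle (given the SYSDATE above) and that REWARD_ACCT_DETAIL_ID is unique:

```sql
-- Keep only the row with the latest EFF_START_TS per account;
-- stamp every other row's EFF_STOP_TS with SYSDATE.
UPDATE REWARD_ACCT_DETAIL
SET EFF_STOP_TS = SYSDATE
WHERE REWARD_ACCT_DETAIL_ID NOT IN (
    SELECT REWARD_ACCT_DETAIL_ID
    FROM (SELECT REWARD_ACCT_DETAIL_ID,
                 ROW_NUMBER() OVER (PARTITION BY REWARD_ACCOUNT_ID
                                    ORDER BY EFF_START_TS DESC) AS rn
          FROM REWARD_ACCT_DETAIL)
    WHERE rn = 1
)
```

Because ROW_NUMBER picks exactly one row per REWARD_ACCOUNT_ID, two rows sharing the same maximum EFF_START_TS cannot both escape the update, which the MAX-based version would allow.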

Trying to find non-duplicate entries in mostly identical tables(access)

I have 2 different databases. They track different things about inventory; in essence they share 3 common fields: location, item number and quantity. I've extracted these into 2 tables with only those fields. Every time I find an answer, it doesn't cover all the test cases, just some of the fields.
Items can be in multiple locations, and in turn each location can have multiple items. The primary key would be location and item number.
I need to flag when an entry doesn't match on all three fields.
I've only been able to find queries that match on an ID or so, or whose queries are beyond my comprehension. In the tables below, I'd need a query that would show that rows 1, 2 and 5 had issues. I'd run it on each table and verify the results against a physical inventory.
Please refrain from commenting on how silly it is to have information in 2 different databases; all I get in response is to deal with it =P
Table A
Location | ItemNum | QTY
---------+---------+----
1a1a     | as1001  | 5
1a1b     | as1003  | 10
1a1b     | as1004  | 2
1a1c     | as1005  | 15
1a1d     | as1005  | 15
Table B
Location | ItemNum | QTY
---------+---------+----
1a1a     | as1001  | 10
1a1d     | as1003  | 10
1a1b     | as1004  | 2
1a1c     | as1005  | 15
1a1e     | as1005  | 15
This article seemed to do what I wanted but I couldn't get it to work.
To find entries in Table A that don't have an exactly matching entry in Table B:
SELECT A.*
FROM A
LEFT JOIN B
  ON A.Location = B.Location
 AND A.ItemNum  = B.ItemNum
 AND A.QTY      = B.QTY
WHERE B.Location IS NULL
Just swap all the A's and B's to get the list of entries in B with no matching entry in A.
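If you'd rather see the mismatches from both sides in a single result, the two directions can be combined with a UNION. A sketch in Access SQL, assuming the extracted tables are literally named A and B:

```sql
-- Rows in A with no exact counterpart in B, then rows in B with
-- no exact counterpart in A, each tagged with its source table.
SELECT 'A' AS SourceTable, A.Location, A.ItemNum, A.QTY
FROM A LEFT JOIN B
  ON (A.Location = B.Location)
 AND (A.ItemNum = B.ItemNum)
 AND (A.QTY = B.QTY)
WHERE B.Location IS NULL
UNION ALL
SELECT 'B', B.Location, B.ItemNum, B.QTY
FROM B LEFT JOIN A
  ON (B.Location = A.Location)
 AND (B.ItemNum = A.ItemNum)
 AND (B.QTY = A.QTY)
WHERE A.Location IS NULL;
```

On your sample data this should flag both copies of rows 1 and 2 (quantity and location mismatches) and row 5 (location mismatch), while the exact matches in rows 3 and 4 drop out.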

Microsoft SQL report builder: Multiple row 'duplicates' to one row per ID

Microsoft SQL Server Report Builder
I have a query which returns a variety of information from our Community Intelligence database which runs something like this:
SELECT
    viewadhocorganisation.[org_id],
    -- ...other fields
    viewadhocorganisationmainactivities.[Main Activity]
FROM
    viewadhocorganisation
    LEFT JOIN viewadhocorganisationmainactivities
        ON viewadhocorganisation.[OrganisationID] = viewadhocorganisationmainactivities.[Organisation ID]
WHERE
    viewadhocorganisation.[city name] = 'MyCity'
This returns each organisation with its respective fields, with each of its main activities listed separately on a new row (i.e. duplicating), hence SELECT DISTINCT does nothing in this instance, as each row is not strictly a duplicate.
i.e. the return is:
ID | Org Name | other fields | Main Activity
1 | Org 1 | other fields | Activity 1
1 | Org 1 | other fields | Activity 2
1 | Org 1 | other fields | Activity 3
1 | Org 1 | other fields | Activity 4
2 | Org 2 | other fields | Activity 1
2 | Org 2 | other fields | Activity 5
2 | Org 2 | other fields | Activity 7
2 | Org 2 | other fields | Activity 8
Main Activity is a text string populated from a separate lookup table (maintained by a central sysadmin). I have tried various SUM and AGGREGATE expressions (and also various JOIN and LOOKUPSET attempts, but I kept running into errors, though I may be using them incorrectly) and have yet to find a solution that produces the desired output, where all main activities are in one row separated by commas:
Output required:
Org_ID | Org Name | other fields | Main Activity
1 | Org 1 | other fields | Activity 1, Activity 2, Activity 3
2 | Org 2 | other fields | Activity 1, Activity 5, Activity 7
The intention is to get a dump of information to integrate into the Google Maps API, showing the addresses of Org 1, Org 2, etc. together with their main activities. I already have a procedure for that, but I am unable to collate the Main Activity field into one row.
Edit: I have no access to the back end and can only report from the views and tables created by our vendor
You can use temporary tables to achieve this: build your desired string of activities in a temporary table from viewadhocorganisationmainactivities, then join that temp table with your main table, i.e. viewadhocorganisation.
You need to join the table with itself and use STUFF to get the strings in a comma-separated format. See this fiddle
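Since the dataset query runs on SQL Server, one way to build that comma-separated column inside the query itself is the usual STUFF + FOR XML PATH pattern. A sketch using the view and column names from the question, with no back-end changes needed:

```sql
-- One row per organisation; the correlated subquery gathers all of its
-- activities into a single ', '-separated string, and STUFF strips the
-- leading separator.
SELECT
    o.[org_id],
    -- ...other fields
    STUFF((SELECT ', ' + a.[Main Activity]
           FROM viewadhocorganisationmainactivities AS a
           WHERE a.[Organisation ID] = o.[OrganisationID]
           FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),
          1, 2, '') AS [Main Activity]
FROM viewadhocorganisation AS o
WHERE o.[city name] = 'MyCity';
```

Because the LEFT JOIN is replaced by a correlated subquery, the duplicated rows disappear. On SQL Server 2017 or later the same thing is a one-line STRING_AGG, if your server supports it.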

SQL: Creating a common table from multiple similar tables

I have multiple databases on a server, each with a large table where most rows are identical across all databases. I'd like to move this table to a shared database and then have an override table in each application database which has the differences between the shared table and the original table.
The aim is to make updating and distributing the data easier as well as keeping database sizes down.
Problem constraints
The table is a hierarchical data store with date based validity.
table DATA (
    ID int primary key,
    CODE nvarchar,
    PARENT_ID int foreign key references DATA(ID),
    END_DATE datetime,
    ...
)
Each unique CODE in DATA may have a number of rows, but at most a single row where END_DATE is null or greater than the current time (a single valid row per CODE). New references are only made to valid rows.
Updating the shared database should not require anything to be run in application databases. This means any override tables are final once they have been generated.
Existing references to DATA.ID must point to the same CODE, but other columns do not need to be the same. This means any current rows can be invalidated if necessary and multiple occurrences of the same CODE may be combined.
PARENT_ID references must have same parent CODE before and after the split. The actual PARENT_ID value may change if necessary.
The shared table is updated regularly from an external source and these updates need to be reflected in each database's DATA. CODEs that do not appear in the external source can be thought of as invalid, new references to these will not be added.
Existing functionality will continue to use DATA, so the new view (or alternative) must be transparent. It may, however, contain more rows than the original provided earlier constraints are met.
New functionality will use the shared table directly.
Select performance is a concern, insert/update/delete is not.
The solution needs to support SQL Server 2008 R2.
Possible solution
-- in a single shared DB
DATA_SHARED (table)
-- in each app DB
DATA_SHARED (synonym to DATA_SHARED in shared DB)
DATA_OVERRIDE (table)
DATA (view of DATA_SHARED and DATA_OVERRIDE)
Take an existing DATA table to become DATA_SHARED.
Exclude IDs with more than one possible CODE so only rows common across all databases remain. These missing rows will be added back once the data is updated the first time.
Unfortunately every DATA_OVERRIDE will need all rows that differ in any table, not only rows that differ between DATA_SHARED and the previous DATA. There are several IDs that differ only in a single database, this causes all other databases to inflate. Ideas?
This solution causes DATA_SHARED to have a discontinuous ID space. It's a mild annoyance rather than a major issue, but worth noting.
edit: I should be able to keep all of the rows in DATA_SHARED, just invalidate them, then I only need to store differing rows in DATA_OVERRIDE.
I can't think of any situations where PARENT_ID references become invalid, thoughts?
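For the initial split, the differing rows in each application database can be computed with a set difference. A sketch of a hypothetical one-off step (EXCEPT is available on SQL Server 2008 R2 and treats NULLs as equal, which helps with nullable PARENT_ID values):

```sql
-- Run once per application database: capture only the rows whose
-- (ID, CODE, PARENT_ID) tuple differs from the shared copy, which is
-- reached here through the DATA_SHARED synonym described above.
INSERT INTO DATA_OVERRIDE (ID, CODE, PARENT_ID)
SELECT ID, CODE, PARENT_ID
FROM DATA
EXCEPT
SELECT ID, CODE, PARENT_ID
FROM DATA_SHARED;
```

Applied to the example below, DB1 would capture only row 2 (A1), while DB2 would capture rows 2, 4 and 5, matching the DATA_OVERRIDE contents shown.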
Before:
DB1.DATA
ID | CODE | PARENT_ID | END_DATE
1 | A | NULL | NULL
2 | A1 | 1 | 2020
3 | A2 | 1 | 2010
DB2.DATA
ID | CODE | PARENT_ID | END_DATE
1 | A | NULL | NULL
2 | X | NULL | NULL
3 | A2 | 1 | 2010
4 | X1 | 2 | NULL
5 | A1 | 1 | 2020
After initial processing (DATA_SHARED created from DB1.DATA):
SHARED.DATA_SHARED
ID | CODE | PARENT_ID | END_DATE
1 | A | NULL | NULL
3 | A2 | 1 | 2010
-- END_DATE is omitted from DATA_OVERRIDE as every row is implicitly invalid
DB1.DATA_OVERRIDE
ID | CODE | PARENT_ID
2 | A1 | 1
DB2.DATA_OVERRIDE
ID | CODE | PARENT_ID
2 | X |
4 | X1 | 2
5 | A1 | 1
After update from external data where A1 exists in source but X and X1 don't:
SHARED.DATA_SHARED
ID | CODE | PARENT_ID | END_DATE
1 | A | NULL | NULL
3 | A2 | 1 | 2010
6 | A1 | 1 | 2020
edit: The DATA view would be something like:
select S.ID, ...
from DATA_SHARED S
left join DATA_OVERRIDE O on S.ID = O.ID
where O.ID is null
union all
select ID, ...
from DATA_OVERRIDE
Given the small number of rows in DATA_OVERRIDE, performance is good enough.
Alternatives
I also considered an approach where instead of DATA_SHARED sharing IDs with the original DATA, there would be mapping tables to link DATA.IDs to DATA_SHARED.IDs. This would mean DATA_SHARED would have a much cleaner ID-space and there could be less data duplication, but the DATA view would require some fairly heavy joins. The additional complexity is also a significant negative.
Conclusion
Thank you for your time if you made it all the way to the end, this question ended up quite long as I was thinking it through as I wrote it. Any suggestions or comments would be appreciated.

Need Strategy To Generate Snapshot Of Table Contents After a Series of Random Inserts

There are 10 rooms with a set of inventory items. When an item is added to or deleted from a room, a new row gets inserted into an MS-SQL table. I need the latest update for each room.
Take this series of inserts:
id | room | descriptor1 | descriptor2 | descriptor3
---+------+-------------+-------------+------------
1  | A    | blue        | 2           | large
2  | B    | red         | 1           | small
3  | A    | blue        | 1           | large
What the resulting table needs to show:
room | descriptor1 | descriptor2 | descriptor3
-----+-------------+-------------+------------
A    | blue        | 1           | large
B    | red         | 1           | small
Ideally, I would write a trigger that updates a room status table; I could then just query the room status table (SELECT *) to obtain the result. However, the table does not belong to me: I only have read access to this constantly updated table, so I need to poll it periodically or whenever I need a report.
How do I do this in MS-SQL? I have some inkling of how I would obtain the status of just one room, something like:
SELECT descriptor1, descriptor2, descriptor3
FROM myTable mt1
WHERE id = (SELECT MAX(id)
            FROM myTable mt2
            WHERE room = 'A');
Since I have 10 rooms, I would need to run this query 10 times. Can it be narrowed down to a single query? What happens when there are 100 rooms? Is there a better way?
Thanks!
Matt
You were very close:
SELECT room, descriptor1, descriptor2, descriptor3
FROM myTable mt1
WHERE id IN (SELECT MAX(id)
             FROM myTable
             GROUP BY room);
Instead of creating a trigger to update a static table, you should look into creating a view.
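Wrapping that query in a view gives you a reusable "room status table" without needing write access to the source. A sketch (RoomStatus is a hypothetical name):

```sql
-- Read-only snapshot: the latest row per room, recomputed on every query.
CREATE VIEW RoomStatus AS
SELECT room, descriptor1, descriptor2, descriptor3
FROM myTable
WHERE id IN (SELECT MAX(id)
             FROM myTable
             GROUP BY room);
```

After that, SELECT * FROM RoomStatus behaves like the status table you described, always reflecting the current state of the underlying table and working the same whether there are 10 rooms or 100.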