Complicated SQL query vs datatable iteration and processing - sql

I have a three-table structure in SQL Server 2012: people, connections and messages. The relevant schema looks like this:
People: Id (pk bigint), name...
Connections: Id (pk bigint), IdPpl1 fk, IdPpl2 fk
Messages: Id (pk uniqueidentifier), Idconnection (fk), Messagetype (smallint)
On the Connections table, IdPpl1 and IdPpl2 are foreign keys to People.Id. The same two people can appear in this table twice, with their columns swapped, e.g.:
Id  IdPpl1  IdPpl2
..  ......  ......
3   101     105
8   105     101
9   101     106
10  106     101
The above situation is valid. In fact, two rows is the maximum number of times the same two people can appear in the table.
The Messages table records which connection sent each message.
Id  IdConnection  Messagetype
..  ............  ...........
24  3             1
25  8             1
26  3             2
27  8             2
28  9             3
29  10            2
(Note: the messages are one-way, which is why there can be two rows in the Connections table for the same two people: on the first row one person is the sender and the other the receiver, and on the second row they swap.)
Given a People Id, I need a SQL query to show "least connectiontype messages mutually sent by mutually connected people" and an extra column indicating whether the Messagetype matches or not. The result should look like this for People Id 101:
Person_id  Person_name  IdConnection  MatchingMsgType
.........  ...........  ............  ...............
105        John         3             1
106        Peter        9             0
The first row appears because of MsgIds 24 and 25. A potential row corresponding with messages 26 and 27 won't appear because a previous matching messagetype was found.
The second row appears because of MsgIds 28 and 29, marking the messagetype as non-matching.
Currently I fetch all the messages related to a person and iterate through the datatable, sorting, filtering and processing in memory.
Would you go with a full-SQL solution (I want to preserve full isolation between app tiers), or is the datatable iteration more suitable?
Thanks in advance!!

It depends on the size of the result set of your current db query (the one returning all rows related to a user). It is not clear whether rows are ever removed from your tables. If not, your solution does not scale, since the number of matching rows will grow forever. If instead you can guarantee that the number of resulting rows has some bound (for example, the maximum number of connections a user can have open at the same time), then your solution might be good enough.
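If you do want to try the full-SQL route, one possible shape for the query is sketched below. It is only a sketch under assumptions: it reads "matching" as "the lowest Messagetype sent in each direction is the same", it runs against an in-memory SQLite database from Python rather than SQL Server 2012, the name 'Ann' for person 101 is made up, and integer message Ids stand in for the uniqueidentifier key.

```python
import sqlite3

# Sample data copied from the question; 'Ann' is a made-up name for person 101.
SCHEMA_AND_SAMPLE_DATA = """
CREATE TABLE People (Id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Connections (Id INTEGER PRIMARY KEY, IdPpl1 INTEGER, IdPpl2 INTEGER);
CREATE TABLE Messages (Id INTEGER PRIMARY KEY, Idconnection INTEGER, Messagetype INTEGER);
INSERT INTO People VALUES (101, 'Ann'), (105, 'John'), (106, 'Peter');
INSERT INTO Connections VALUES (3, 101, 105), (8, 105, 101), (9, 101, 106), (10, 106, 101);
INSERT INTO Messages VALUES (24, 3, 1), (25, 8, 1), (26, 3, 2), (27, 8, 2), (28, 9, 3), (29, 10, 2);
"""

# Assumption: "matching" means the lowest Messagetype is equal in both directions.
QUERY = """
WITH MinType AS (
    SELECT Idconnection, MIN(Messagetype) AS MinMsgType
    FROM Messages
    GROUP BY Idconnection
)
SELECT p.Id, p.name, c_out.Id AS IdConnection,
       CASE WHEN m_out.MinMsgType = m_in.MinMsgType THEN 1 ELSE 0 END AS MatchingMsgType
FROM Connections c_out
JOIN Connections c_in                       -- the same two people, columns swapped
  ON c_in.IdPpl1 = c_out.IdPpl2 AND c_in.IdPpl2 = c_out.IdPpl1
JOIN MinType m_out ON m_out.Idconnection = c_out.Id
JOIN MinType m_in  ON m_in.Idconnection = c_in.Id
JOIN People p ON p.Id = c_out.IdPpl2
WHERE c_out.IdPpl1 = ?
ORDER BY p.Id
"""

def build_sample_db():
    """Create an in-memory database holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_SAMPLE_DATA)
    return conn

def mutual_message_summary(conn, person_id):
    """Return one row per person who exchanged messages with person_id in both directions."""
    return conn.execute(QUERY, (person_id,)).fetchall()

if __name__ == "__main__":
    for row in mutual_message_summary(build_sample_db(), 101):
        print(row)
```

The inner joins mean a pair only shows up when both directions have at least one message; whether that is the intended meaning of "mutually sent" is a guess on my part.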

Related

The IDs change in the column

I am quite a novice in programming and I kind of need your help regarding SQL and an issue I noticed.
I have a table:
date     ID    secondary ID  expenses
jul2020  258   0004          1000
jul2020  xxx   xxxx          xxx
......   ....  ....          .....
aug2020  258   0008          2000
aug2020  xxx   xxxx          xxx
aug2020  500   0004          1000
ID and secondary ID should be unique and always match, but I noticed that they do not: in any given row, either the ID or the secondary ID is correct. I want to sum the expenses column (column D) per unique ID.
Thanks for reading; any ideas would be very helpful.
UPDATE: everything is numeric, even the ID. As you can see, we have different dates (and multiple customers for the same date). I noticed that for customer 258 with secondary ID 0004, either the ID or the secondary ID changes over the years. I want to assign the same ID and the same secondary ID as on the first date (or any date, just to be consistent). I want to do this because I want to know how much each customer has spent over the years. There are about 50m observations.

Is it possible to match the "next" unmatched record in a SQL query where there is no strictly unique common field between tables?

Using Access 2010 and its version of SQL, I am trying to find a way to relate two tables in a query where I do not have strict, unique values in each table, using concatenated fields that are mostly unique, then matching each unmatched next record (measured by a date field or the record id) in each table.
My business receives checks that we do not cash ourselves, but rather forward to a client for processing. I am trying to build a query that will match the checks that we forward to the client with a static report that we receive from the client indicating when checks were cashed. I have no control over what the client reports back to us.
When we receive a check, we record the name of the payor, the date that we received the check, the client's account number, the amount of the check, and some other details in a table called "Checks". We add a matching field which comes as close as we can get to a unique identifier to match against the client reports (more on that in a minute).
Checks:
ID  Name  Acct  Amt    Our_Date  Match
__  ____  ____  _____  ________  __________
1   Dave  1001  10.51  2/14/14   1001*10.51
2   Joe   1002  12.14  2/28/14   1002*12.14
3   Sam   1003  50.00  3/01/14   1003*50.00
4   Sam   1003  50.00  4/01/14   1003*50.00
5   Sam   1003  50.00  5/01/14   1003*50.00
The client does not report back to us the date that WE received the check, the check number, or anything else useful for making unique matches. They report the name, account number, amount, and the date of deposit. The client's report comes weekly. We take that weekly report and append the records to make a second table out of it.
Return:
ID   Name  Acct  Amt    Their_Date  Unique1
___  ____  ____  _____  __________  __________
355  Dave  1001  10.51  3/25/14     1001*10.51
378  Joe   1002  12.14  4/04/14     1002*12.14
433  Sam   1003  50.00  3/08/14     1003*50.00
599  Sam   1003  50.00  5/11/14     1003*50.00
Instead of giving us back the date we received the check, we get back the date that they processed it. There is no way to make a rule to compare the two dates, because the deposit dates vary wildly. So the closest thing I can get for a unique identifier is a concatenated field of the account number and the amount.
I am trying to match the records on these two tables so that I know when the checks we forward get deposited. If I do a simple join using the two concatenated fields, it works most of the time, but we run into a problem with payors like Sam, above, who is making regular monthly payments of the same amount. In a simple join, if one of Sam's payments appears in the Return table, it matches to all of the records in the Checks table.
To limit that behavior and match the first Sam entry on the Return table to the first Sam entry on the Checks table, I wrote the following query:
SELECT return.*, checks.*
FROM return, checks
WHERE checks.id = (SELECT TOP 1 id
                   FROM checks
                   WHERE match = return.unique1
                   ORDER BY [our_date]);
This works when there is only one of Sam's records in the Return table. The problem comes when the second entry for Sam hits the Return table (Return.ID 599) as the client's weekly reports are added to the table. When that happens, the query appropriately (for my purposes) only lists that two of Sam's checks have been processed, but uses the "Top 1 ID" record to supply the row's details from the Return table:
Checks_Return_query:
Checks.ID  Name  Acct  Amt    Our_Date  Their_Date  Return.ID
_________  ____  ____  _____  ________  __________  _________
1          Dave  1001  10.51  2/14/14   3/25/14     355
2          Joe   1002  12.14  2/28/14   4/04/14     378
3          Sam   1003  50.00  3/01/14   3/08/14     433
4          Sam   1003  50.00  4/01/14   3/08/14     433
In other words, the query repeats the Return table info for record Return.ID 433 instead of matching Return.ID 599, which is I guess what I should expect from the TOP 1 operator.
So I am trying to figure out how I can get the query to take the two concatenated fields in Checks and Return, compare them to find matching sets, then select the next unmatched record in Checks (with "next" being measured either by the ID or Our_Date) with the next unmatched record in Return (again, with "next" being measured either by the ID or Their_Date).
I spent many hours in a dark room turning the query into various joins, and back again, looking at functions like WHERE NOT IN, WHERE NOT EXISTS, FIRST() NEXT() MIN() MAX(). I am afraid I am way over my head.
I am beginning to think that I may have a structural problem, and may need to write the "matched" records in this query to another table of completed transactions, so that I can differentiate between "matched" and "unmatched" records better. But that still wouldn't help me if two of Sam's transactions are on the same weekly report I get from my client.
Are there any suggestions as to query functions I should look into for further research, or confirmation that I am barking up the wrong tree?
Thanks in advance.
I'd say that you really need another table of completed transactions; it could be a temporary table.
Regarding your concern "... if two of Sam's transactions are on the same weekly report", you can use a cursor to write records one by one instead of in a set-based transaction.
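The one-by-one idea can also be illustrated outside Access. The following Python sketch is not Access SQL and is only meant to show the pairing logic: it assumes the rows have already been read into lists of tuples, and the helper name pair_in_order is made up. It groups both tables by the concatenated key, sorts each group by date, and pairs the first check with the first return, the second with the second, and so on.

```python
from collections import defaultdict
from datetime import date

def pair_in_order(checks, returns):
    """Pair the n-th check with the n-th return that shares the same key.

    checks:  iterable of (check_id, match_key, our_date)
    returns: iterable of (return_id, unique1_key, their_date)
    """
    checks_by_key = defaultdict(list)
    for check_id, key, our_date in checks:
        checks_by_key[key].append((our_date, check_id))
    returns_by_key = defaultdict(list)
    for return_id, key, their_date in returns:
        returns_by_key[key].append((their_date, return_id))

    matched, unmatched = [], []
    for key, check_rows in checks_by_key.items():
        check_rows.sort()                                  # oldest check first
        return_rows = sorted(returns_by_key.get(key, []))  # oldest deposit first
        for position, (_, check_id) in enumerate(check_rows):
            if position < len(return_rows):
                matched.append((check_id, return_rows[position][1]))
            else:
                unmatched.append(check_id)                 # not deposited yet
    return sorted(matched), sorted(unmatched)

# Sample rows taken from the question.
checks = [
    (1, "1001*10.51", date(2014, 2, 14)),
    (2, "1002*12.14", date(2014, 2, 28)),
    (3, "1003*50.00", date(2014, 3, 1)),
    (4, "1003*50.00", date(2014, 4, 1)),
    (5, "1003*50.00", date(2014, 5, 1)),
]
returns = [
    (355, "1001*10.51", date(2014, 3, 25)),
    (378, "1002*12.14", date(2014, 4, 4)),
    (433, "1003*50.00", date(2014, 3, 8)),
    (599, "1003*50.00", date(2014, 5, 11)),
]

matched, unmatched = pair_in_order(checks, returns)
print(matched)    # (check id, return id) pairs
print(unmatched)  # check ids with no deposit reported yet
```

Because the pairing depends only on position within each key, it also copes with two of Sam's deposits arriving on the same weekly report.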

Create a Parent–Children Web Part Page

I am trying to figure out how to do something that I would think is commonplace, but I cannot find how to do.
Given two Custom Lists, one with a field that is essentially a primary key, and the other with what is essentially a foreign key, I want to show all the rows from the first in one area of the display, and the related records for the selected row of the first, in a second part of the screen.
I am thinking this would be side–by–side web parts on a web-part page.
So:
ID  pkID  Data              ID  fkID  Data
___________________         ______________________________
| 1  100  Row one. |        |  8  100  Related one/one   |
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯         |  9  100  Related one/two   |
  2  113  Row two.          | 10  100  Related one/three |
  3  118  Row n.            ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
                              11  113  Related two/one
                              12  113  Related two/two
                              13  118  Related n/one
(That is my attempt to show what is established between the two lists: top row selected on the left, related records from the other list on the right.)
Surely this is common enough that there is a way to readily do this?
I suppose I might need to create a means of asserting that a row is 'selected.'
You will note that I am not using the ID field that "belongs" to SharePoint.
You can create lookup fields to establish that relationship. SharePoint 2010 even allows you to enforce the relationship as in a SQL database, so for instance you can declare what happens if you try to delete a parent that has children (Cascade, Prevent, etc.).
Have a read here:
http://office.microsoft.com/en-au/sharepoint-server-help/create-list-relationships-by-using-unique-and-lookup-columns-HA101729901.aspx
As for displaying them visually, you might have to create some web parts for it, as the only out-of-the-box (OOB) support is the link to the child entity from the main entity on the parent list.

Getting all the rows even though they are not mapped in Dimension usage

I have a fact table that can store two types of transactions, TrxType1 and TrxType2, with an attribute called Owner_Id mapped to Dim Owner. The problem is that only one type of transaction, TrxType1, has an owner; the other has no such relationship. Hence, when querying the cube, I am not getting the records for TrxType2.
Is there a way to manage this? I have already tried changing Null Processing to UnknownMember, but I still cannot see them.
In my practice I always add a None value to dictionary tables and map all blank values to this member.
But if you don't have any transactions of type TrxType2, how can you count them?
If you have the following fact table:
Type_Id  Owner_id  ...
__________________________________________
1        13        (just for example)
1        8
0        11
0        4
Dictionary TrxType:
___________________________
id  Code
0   None
1   TrxType1
2   TrxType2
your dimension can have the following hierarchy
            Count of rows
All         4
-None       2
-TrxType1   2
-TrxType2   0
If your situation is different, please post an example.

custom sorting or ordering a table without resorting the whole shebang

For ten years we've been using the same custom sorting on our tables. I'm wondering if there is another solution that involves fewer updates, especially since we'd now like to have a replication/publication date and wouldn't like our replication to replicate unnecessary entries. I had a look into nested sets, but they don't seem to do the job for us.
Base table:
id | a_sort
---+-------
 1 | 10
 2 | 20
 3 | 30
After inserting:
insert into table (a_sort) values(15)
An entry at the second position.
id | a_sort
---+-------
 1 | 10
 2 | 20
 3 | 30
 4 | 15
Ordering the table with:
select * from table order by a_sort
and resorting all the a_sort entries, updating at least id=(2,3,4)
will of course produce the desired output:
id | a_sort
---+-------
 1 | 10
 4 | 20
 2 | 30
 3 | 40
The column names, the column count, datatypes, a possible join, possible triggers or the way the resorting is done are irrelevant to the problem. Also, we've found some pretty neat ways to do this task fast.
The only question is: how the heck can we reduce the updates in the db to 1 or 2 at most?
Seems like an awfully common problem.
The captain obvious in me once thought "use an a_sort float(53), insert using a fixed value of ordervaluefirstentry+abs(ordervaluefirstentry-ordervaluenextentry)/2".
But this would only allow around 1040 "in between" entries - so never resorting seems a bit problematic ;)
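To get a feel for how quickly the midpoint idea runs out of room, the following Python sketch keeps inserting a new entry directly after the first one, using the formula above, and counts how many distinct values fit. Python floats are 64-bit doubles, the same format as float(53). The function name is made up for illustration, and the count depends on the magnitudes of the two starting values, so treat it as an experiment rather than a fixed limit.

```python
def count_midpoint_inserts(first, nxt):
    """Count how often a new sort value still fits strictly between first and nxt
    when every new entry is inserted directly after the first one."""
    count = 0
    while True:
        mid = first + abs(first - nxt) / 2
        if mid <= first or mid >= nxt:
            # Rounding collapsed the midpoint onto a neighbour: out of room.
            return count
        nxt = mid  # the new entry becomes the next neighbour
        count += 1

if __name__ == "__main__":
    print(count_midpoint_inserts(10.0, 20.0))
```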
You really didn't describe what you're doing with this data, so forgive me if this is a crazy idea for your situation:
You could make a sort of 'linked list' where instead of a column of values, you have a column for the 'next highest valued' id. This would decrease the number of updates to a maximum of 2.
You can make it doubly linked and also have a column for next lowest, which would bring the maximum number of updates to 3.
See:
http://en.wikipedia.org/wiki/Linked_list
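A minimal Python sketch of the singly linked variant, using an in-memory dict in place of the table. The names (rows, next_id, insert_after, in_order) are made up for illustration. Inserting writes the new row and changes one existing row's pointer, so two rows are touched regardless of table size.

```python
def insert_after(rows, prev_id, new_id):
    """Insert new_id directly after prev_id and return the ids of the rows written."""
    rows[new_id] = {"next_id": rows[prev_id]["next_id"]}  # write 1: the new row
    rows[prev_id]["next_id"] = new_id                     # write 2: the predecessor
    return [new_id, prev_id]

def in_order(rows, head_id):
    """Walk the pointers from head_id and return the ids in sort order."""
    order = []
    current = head_id
    while current is not None:
        order.append(current)
        current = rows[current]["next_id"]
    return order

# The base table from the question: ids 1, 2, 3 in that order.
rows = {1: {"next_id": 2}, 2: {"next_id": 3}, 3: {"next_id": None}}

# Insert id 4 at the second position, i.e. directly after id 1.
touched = insert_after(rows, prev_id=1, new_id=4)
print(touched)
print(in_order(rows, head_id=1))
```

The trade-off is that reading the rows in order means following the pointers (for example with a recursive query) instead of using a plain ORDER BY.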