SQL query relating to grouping entries

When I normalised my database, I used a text value to group together entries giving them the same foreign key. However, I also had 2 other fields prior to normalisation which used reference numbers to group together entries, one into pairs of entries and one into groups of entries. Because these grouped entries did not necessarily have the same text value, some entries will have been left out of the grouping (shared foreign key id).
I need a query which selects all entries which share a PairRef or GroupRef where the entries in that group do not all have the same ForeignKeyID.
Example:
ID  PairRef  GroupRef  ForeignKeyID  TextValue (in linked table2)
1   25       25        123           'Text value 123'
2   25       25        255           'Text value 255'
3   1        50        201           'Text value 201'
4   1        50        201           'Text value 201'
5   2        50        202           'Text value 202'
6   2        50        202           'Text value 202'
7   3        50        203           'Text value 203'
8   3        50        203           'Text value 203'
I then need to be able to edit the data to group them together. The problem is that in order to do this, I would need the query to be from more than one table because I need to see the text associated with the foreign key. I have found that using phpMyAdmin, although I can create queries from more than one table using inner joins, the results of these queries cannot be edited in the way that queries from one table can.
I guess the alternative would be to run an update query on the query results. Could you give an example of a quick and easy way of doing an update query on query results without losing the original query, which needs to be used repeatedly?
In the above example, for the regrouping of ID 1 and ID 2 which share the same PairRef, I would need to physically look at TextValue 123 and 255 and depending which one was the more appropriate text label, I would decide on which entry to change. Let's say that 'Text value 123' was the value I wanted to retain for that grouping. I would update ID 2 to ForeignKeyID 123, which would obviously automatically change the TextValue for that entry to 'Text value 123'.
For the regrouping of IDs 3 to 8, which share the same GroupRef, if I decided after looking at the data to regroup them all as 'Text value 201', I would change IDs 5, 6, 7 and 8 to ForeignKeyID = 201, which would automatically change the TextValue of each of those entries to 'Text value 201'.
IDs 1 to 8 would then no longer appear on the query results because the grouping problem would have been resolved and they would no longer meet the query criteria.
I need to find the easiest possible way of doing this, as grouping entries together is one of the main purposes of the database and there is a lot of this editing to do.
Thank you

For the first part (select all entries which share a PairRef or GroupRef where the entries in that group do not all have the same ForeignKeyID), the following query can be used. It groups by PairRef and selects PairRefs which have more than 1 distinct ForeignKeyID. Then, all entries which have these PairRefs are selected. Similarly, the data is grouped by GroupRef also. All GroupRefs which have more than 1 distinct ForeignKeyID are selected. Then, all entries which have these GroupRefs are selected.
SELECT
    T1.*
FROM Table1 T1
INNER JOIN Table2 T2
    ON T1.ForeignKeyID = T2.ForeignKeyID
WHERE T1.PairRef IN
    (
        -- PairRefs whose entries do not all share one ForeignKeyID
        SELECT PairRef
        FROM Table1
        GROUP BY PairRef
        HAVING COUNT(DISTINCT ForeignKeyID) > 1
    )
   OR T1.GroupRef IN
    (
        -- GroupRefs whose entries do not all share one ForeignKeyID
        SELECT GroupRef
        FROM Table1
        GROUP BY GroupRef
        HAVING COUNT(DISTINCT ForeignKeyID) > 1
    );
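As a rough way to try the query against the example rows, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for MySQL here, the two-table schema is an assumption reconstructed from the question, and rows 9 and 10 (plus ForeignKeyID 300) are invented to show a pair that is already consistent. The sketch also selects T2.TextValue, which the query above does not, so the labels can be compared side by side.

```python
import sqlite3

# Assumed schema based on the question; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table2 (ForeignKeyID INTEGER PRIMARY KEY, TextValue TEXT);
CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, PairRef INTEGER,
                     GroupRef INTEGER, ForeignKeyID INTEGER);
INSERT INTO Table2 VALUES
    (123, 'Text value 123'), (255, 'Text value 255'),
    (201, 'Text value 201'), (202, 'Text value 202'),
    (203, 'Text value 203'), (300, 'Text value 300');
INSERT INTO Table1 VALUES
    (1, 25, 25, 123), (2, 25, 25, 255),
    (3, 1, 50, 201), (4, 1, 50, 201),
    (5, 2, 50, 202), (6, 2, 50, 202),
    (7, 3, 50, 203), (8, 3, 50, 203),
    (9, 4, 60, 300), (10, 4, 60, 300);  -- invented rows: already consistent
""")

# Same filter as the answer's query, with TextValue added to the select list.
MISMATCH_SQL = """
SELECT T1.ID, T1.PairRef, T1.GroupRef, T1.ForeignKeyID, T2.TextValue
FROM Table1 T1
INNER JOIN Table2 T2 ON T1.ForeignKeyID = T2.ForeignKeyID
WHERE T1.PairRef IN (SELECT PairRef FROM Table1 GROUP BY PairRef
                     HAVING COUNT(DISTINCT ForeignKeyID) > 1)
   OR T1.GroupRef IN (SELECT GroupRef FROM Table1 GROUP BY GroupRef
                      HAVING COUNT(DISTINCT ForeignKeyID) > 1)
ORDER BY T1.ID
"""

rows = conn.execute(MISMATCH_SQL).fetchall()
for row in rows:
    print(row)
```

With this data, IDs 1 to 8 come back and the invented rows 9 and 10 do not, because their pair and group each have a single ForeignKeyID.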
For the second part (edit the data to group them together), I do not understand why you would need to see the TextValue from table2 (if it is corresponding to ForeignKeyID in table1). Anyway, once you have seen the PairRefs / GroupRefs, which have different ForeignKeyID values, you can run an update statement for each PairRef / GroupRef, since it seems like a manual process.
UPDATE Table1
SET ForeignKeyID = <ForeignKeyID to be set>
WHERE PairRef = <PairRef to update>;

UPDATE Table1
SET ForeignKeyID = <ForeignKeyID to be set>
WHERE GroupRef = <GroupRef to update>;
You may need to run the first query again to check the data, because the UPDATE query for GroupRef might result in different values for PairRef.
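The update-then-recheck loop could look roughly like the sketch below, again using Python's sqlite3 as a stand-in engine. The helper name mismatched_ids is hypothetical, and the chosen target values (123 for PairRef 25, 201 for GroupRef 50) are the ones the question uses as its example decision. The check query is kept as a reusable string, so running the updates does not lose it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, PairRef INTEGER, "
             "GroupRef INTEGER, ForeignKeyID INTEGER)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?)", [
    (1, 25, 25, 123), (2, 25, 25, 255),
    (3, 1, 50, 201), (4, 1, 50, 201),
    (5, 2, 50, 202), (6, 2, 50, 202),
    (7, 3, 50, 203), (8, 3, 50, 203),
])

# Reusable check: entries whose pair or group has mixed ForeignKeyIDs.
CHECK_SQL = """
SELECT ID FROM Table1
WHERE PairRef IN (SELECT PairRef FROM Table1 GROUP BY PairRef
                  HAVING COUNT(DISTINCT ForeignKeyID) > 1)
   OR GroupRef IN (SELECT GroupRef FROM Table1 GROUP BY GroupRef
                   HAVING COUNT(DISTINCT ForeignKeyID) > 1)
ORDER BY ID
"""

def mismatched_ids(connection):
    """Return the IDs that still need regrouping (hypothetical helper)."""
    return [row[0] for row in connection.execute(CHECK_SQL)]

before = mismatched_ids(conn)

# Apply the manual decisions from the question in one transaction.
with conn:
    conn.execute("UPDATE Table1 SET ForeignKeyID = ? WHERE PairRef = ?", (123, 25))
    conn.execute("UPDATE Table1 SET ForeignKeyID = ? WHERE GroupRef = ?", (201, 50))

after = mismatched_ids(conn)
print(before, after)
```

Using parameter placeholders instead of editing the SQL text each time keeps the statements reusable for every PairRef or GroupRef that needs fixing.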
Here is a SQL Fiddle demo. Thank you, @JohnLBevan, for the stub.

Related

Query to find duplicate values for two fields

Sorry for the title, but I didn't know how to explain it.
I have a table that has 2 fields, A and B.
I want to find all rows in the table that have a duplicate A (more than one record), but A should be considered a duplicate only if B differs between the rows.
Example:
Field A  Field B
10       10
10       10   // this is not a duplicate
10       10
10       5    // this is a duplicate
How can I do this in a single query?
Let's break this down into how you would go about constructing such a query. You don't make it clear whether you're looking for all values of A or all rows but let's assume all values of A initially.
The first step therefore is to create a list of all values of A. This can be done two ways, DISTINCT or GROUP BY. I'm going to use GROUP BY because of what else you want to do:
select a
from your_table
group by a
This returns a single column that is unique on A. Now, how can you change this to restrict which values of A come back? The most obvious thing to use is the HAVING clause, which allows you to restrict on aggregated values. For instance, the following will give you all values of A which appear only once in the table:
select a
from your_table
group by a
having count(*) = 1
That is, the number of rows in each group of A is 1. You don't want this, of course; you want to do this with the column B. There needs to be more than one value of B in a group for the situation you want to identify to be possible (if there's only one value of B then it's impossible). This gets us to:
select a
from your_table
group by a
having count(b) > 1
This still isn't enough, as you want two different values of B. The above just counts the number of rows in the group that have a non-null B. Inside an aggregate function you can use the DISTINCT keyword to count only unique values, bringing us to:
select a
from your_table
group by a
having count(distinct b) > 1
To transcribe this into English: select all unique values of A from YOUR_TABLE that have more than one distinct value of B in the group.
You can use this method, or something similar, to build up your own queries as you create them. Determine what you want to achieve and slowly build up to it.
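To see the difference between the last two steps, here is a small sketch run through Python's sqlite3 module (any SQL engine should behave the same way for this query). The rows for A = 20 are invented for illustration: they repeat A but always with the same B, so they should not count as duplicates under the question's definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO your_table VALUES (?, ?)", [
    (10, 10), (10, 10), (10, 10), (10, 5),  # rows from the question
    (20, 7), (20, 7),                       # invented: repeated A, same B
])

def values_of_a(having_clause):
    """Run the GROUP BY query with the given HAVING condition."""
    sql = (f"SELECT a FROM your_table GROUP BY a "
           f"HAVING {having_clause} ORDER BY a")
    return [row[0] for row in conn.execute(sql)]

repeated = values_of_a("COUNT(b) > 1")            # A appears more than once
differing = values_of_a("COUNT(DISTINCT b) > 1")  # A appears with different Bs

print(repeated)   # [10, 20]
print(differing)  # [10]
```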
select FIELD from your_table group by FIELD having count(b) > 1
Take into consideration that this counts every row in the group, repeats included.
For example, if you have the values
1
1
2
1
the count for value 1 is 3, not 2.

in sql how to return single row of data from more than one row in the same table

I have a single table of activities, some labelled 'Assessment' (type_id of 50) and some 'Counselling' (type_id of 9), with the dates of the activities. I need to compare these dates to find how long people wait for counselling after assessment. The table contains rows for many people, each identified by 'id'. My problem is how to produce a result row with both the assessment details and the counselling details for the same person, so that I can compare the dates. I've tried joining the table to itself and tried nested subqueries, but I just can't fathom it. I'm using Access 2010, by the way.
Please forgive my stupidity, but here's an example of joining the table to itself that doesn't work, producing nothing (not surprising):
Table looks like:
ID  TYPE_ID  ACTIVITY_DATE_TIME
-------------------------------
1   9        20130411
1   50       20130511
2   9        20130511
3   9        20130511
In the above, the last two rows (IDs 2 and 3) have only one type of activity, so I want to ignore them and work only on the cases where the same ID has both an assessment and a counselling type_id.
SELECT
civicrm_activity.id, civicrm_activity.type_id,
civicrm_activity.activity_date_time,
civicrm_activity_1.type_id,
civicrm_activity_1.activity_date_time
FROM
civicrm_activity INNER JOIN civicrm_activity AS civicrm_activity_1
ON civicrm_activity.id = civicrm_activity_1.id
WHERE
civicrm_activity.type_id=9
AND civicrm_activity_1.type_id=50;
I'm actually wondering whether this is in fact not possible to do with SQL? I hope it is possible? Thank you for your patience!
It sounds to me like you only want the ID numbers that have a TYPE_ID entry of both 9 and 50.
SELECT DISTINCT id FROM civicrm_activity WHERE type_id = 9 AND id IN (SELECT id FROM civicrm_activity WHERE type_id = 50);
This will give you a list of ids that have entries with both type_id 9 and type_id 50. With that list you can then go and get the specifics.
Use this SQL for the time of type_id 9:
SELECT activity_date_time FROM civicrm_activity WHERE id = <id from the previous query> AND type_id = 9;
Use this SQL for the time of type_id 50:
SELECT activity_date_time FROM civicrm_activity WHERE id = <id from the previous query> AND type_id = 50;
Your query looks OK to me, too. The one problem might be that you use only one table alias. I don't know, but perhaps Access treats the table name "specially" such that, in effect, the WHERE clause says
WHERE
civicrm_activity.type_id=9
AND civicrm_activity.type_id=50;
That would certainly explain zero rows returned!
To fix that, use an alias for each copy of the table. I suggest shorter ones (note that Access expects INNER JOIN rather than a bare JOIN):
SELECT A.id, A.type_id, A.activity_date_time,
       B.type_id, B.activity_date_time
FROM civicrm_activity AS A
INNER JOIN civicrm_activity AS B
    ON A.id = B.id
WHERE A.type_id = 9
  AND B.type_id = 50;
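The self-join with two aliases can be checked quickly outside Access. The sketch below runs it through Python's sqlite3 module on the four sample rows from the question; SQLite is only a stand-in here, so it says nothing about how Access itself resolves the unaliased table name, and the dates are stored as the plain integers shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE civicrm_activity "
             "(id INTEGER, type_id INTEGER, activity_date_time INTEGER)")
conn.executemany("INSERT INTO civicrm_activity VALUES (?, ?, ?)", [
    (1, 9, 20130411),
    (1, 50, 20130511),
    (2, 9, 20130511),
    (3, 9, 20130511),
])

# Alias A holds the type_id 9 row, alias B the type_id 50 row for the same id.
SELF_JOIN_SQL = """
SELECT A.id, A.type_id, A.activity_date_time,
       B.type_id, B.activity_date_time
FROM civicrm_activity AS A
INNER JOIN civicrm_activity AS B ON A.id = B.id
WHERE A.type_id = 9
  AND B.type_id = 50
ORDER BY A.id
"""

rows = conn.execute(SELF_JOIN_SQL).fetchall()
print(rows)  # [(1, 9, 20130411, 50, 20130511)]
```

Only id 1 has both activity types, so it is the only id in the result; both dates sit on one row and can be compared directly.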

Update a table based on a results of a group by

I've got a tricky update problem I'm trying to solve. There are two tables that contain the same three columns plus additional varied columns, looking like this:
Table1 {pers_id, loc_id, pos, ... }
Table2 {pers_id, loc_id, pos, ... }
None of the fields are unique. The first two fields collectively identify the records in a table (or tables) as belonging to the same entity. Table1 could have 15 records belonging to an entity, and table2 could have 4 records belonging to the same entity. The third column 'pos' is an index from 0 to whatever, and this is the column that I'm trying to update.
In Table1 and in Table2, the pos column begins at 0, and increments based on user selection, so that in the example (15 records in table1 and 4 records in table2), table1 contains 'pos' values of 0 - 14, and Table2 contains 'pos' values of 0-3.
I want to increment the pos field in Table1 with the results of the count of similar entities in Table2. This is the sql statement that correctly gives me the results from table2:
select table2.pers_id, table2.loc_id, count(*) as pos_increment from table2 group by table2.pers_id, table2.loc_id;
The end result of the update, in the example (15 records in table1 and 4 records in table2), would be all records in Table1 of the same entity being incremented by 4 (the result of the specific entity group by). 0 would be changed to 4, 14 to 18, etc.
Is this achievable in a single statement?
Since you only need to increment the pos field the solution is really simple:
update table1 t1
set t1.pos = t1.pos +
    (select count(1)
     from table2 t2
     where t2.pers_id = t1.pers_id
       and t2.loc_id = t1.loc_id)
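Here is a small sketch of the same correlated-subquery update, run through Python's sqlite3 module rather than Oracle. The sample rows are invented, and the table alias on the UPDATE target is dropped because the sketch refers to the outer table by name instead. The second entity has no rows in table2, which shows that COUNT returns 0 there and leaves its pos values unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (pers_id INTEGER, loc_id INTEGER, pos INTEGER);
CREATE TABLE table2 (pers_id INTEGER, loc_id INTEGER, pos INTEGER);
""")
# Invented data: entity (1, 1) has 3 rows in table1 and 4 rows in table2;
# entity (2, 1) has 2 rows in table1 and none in table2.
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                 [(1, 1, 0), (1, 1, 1), (1, 1, 2), (2, 1, 0), (2, 1, 1)])
conn.executemany("INSERT INTO table2 VALUES (?, ?, ?)",
                 [(1, 1, 0), (1, 1, 1), (1, 1, 2), (1, 1, 3)])

# Each table1 row is shifted by the number of table2 rows for its entity.
with conn:
    conn.execute("""
        UPDATE table1
        SET pos = pos + (SELECT COUNT(*)
                         FROM table2 t2
                         WHERE t2.pers_id = table1.pers_id
                           AND t2.loc_id = table1.loc_id)
    """)

result = conn.execute(
    "SELECT pers_id, loc_id, pos FROM table1 ORDER BY pers_id, loc_id, pos"
).fetchall()
print(result)
```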
Yes, this is possible. You can use MERGE for some of these updates, and there are ways to relate values between the update and the subselect. I have done this in the past, but it's tricky and I don't have an existing example.
You can find several examples on this site, some for Oracle and some for other databases, that will work with slight modifications.

Oracle / SQL - Count number of occurrences of values in a single column

Okay, I probably could have come up with a better title, but wasn't sure how to word it so let me explain.
Say I have a table with the column 'CODE'. Each record in my table will have either 'A', 'B', or 'C' as its value in the 'CODE' column. What I would like is to get a count of how many 'A's, 'B's, and 'C's I have.
I know I could accomplish this with 3 different queries, but I'm wondering if there is a way to do it with just 1.
Use:
SELECT t.code,
COUNT(*) AS numInstances
FROM YOUR_TABLE t
GROUP BY t.code
The output will resemble:
code  numInstances
------------------
A     3
B     5
C     1
If a code exists that has not been used, it will not show up. You'd need to LEFT JOIN to the table containing the list of codes in order to see those that don't have any references.
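The LEFT JOIN variant might look like the following sketch, run through Python's sqlite3 module. The lookup table name `codes` and the unused code 'D' are assumptions made for illustration; the question does not say whether such a table exists. Counting `t.code` instead of `*` matters here, because an unmatched code still produces one row from the LEFT JOIN, with NULLs on the right-hand side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE codes (code TEXT PRIMARY KEY);   -- hypothetical lookup table
CREATE TABLE your_table (code TEXT);
INSERT INTO codes VALUES ('A'), ('B'), ('C'), ('D');
""")
# Same counts as the sample output above: A x3, B x5, C x1, D unused.
conn.executemany("INSERT INTO your_table VALUES (?)",
                 [("A",)] * 3 + [("B",)] * 5 + [("C",)])

# COUNT(t.code) skips the NULL produced for codes with no matching rows.
COUNT_SQL = """
SELECT c.code, COUNT(t.code) AS numInstances
FROM codes c
LEFT JOIN your_table t ON t.code = c.code
GROUP BY c.code
ORDER BY c.code
"""

counts = conn.execute(COUNT_SQL).fetchall()
print(counts)  # [('A', 3), ('B', 5), ('C', 1), ('D', 0)]
```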

SQL Precedence Query

I have a logging table which has three columns. One column is a unique identifier, one column is called "Name" and the other is "Status".
Values in the Name column can repeat, so you might see the Name "Joe" in multiple rows. "Joe" might have a row with a status of "open", another row with a status of "closed", another with "waiting" and maybe one with "hold". Using a defined precedence, in this highest-to-lowest order ("Closed", "Hold", "Waiting", "Open"), I would like to pull the highest-ranking row for each Name and ignore the others. Does anyone know a simple way to do this?
BTW, not every Name will have all status representations, so "Joe" might only have a row for "waiting" and "hold", or maybe just "waiting".
I would create a second table named something like "Status_Precedence", with rows like:
Status | Order
---------------
Closed | 1
Hold | 2
Waiting | 3
Open | 4
In your query of the other table, do a join to this table (on Status_Precedence.Status) and then you can ORDER BY Status_Precedence.Order.
If you don't want to create another table, you can assign a numeric precedence using a CASE expression:
SELECT Name, Status,
       CASE Status
           WHEN 'Closed'  THEN 1
           WHEN 'Hold'    THEN 2
           WHEN 'Waiting' THEN 3
           WHEN 'Open'    THEN 4
       END AS StatusID
FROM Logging
ORDER BY StatusID -- order based on the CASE result
A lookup table is also a good solution though.
I ended up using matt b's solution, with this final query to filter out the lower-ranked rows (lower being higher numbered).
SELECT *
FROM [Table] tb
LEFT JOIN Status_Precedence sp ON tb.Status = sp.Status
WHERE sp.Rank = (SELECT MIN(sp2.Rank)
                 FROM [Table] tb2
                 LEFT JOIN Status_Precedence sp2 ON tb2.Status = sp2.Status
                 WHERE tb2.[Name] = tb.[Name]) -- correlate on Name so each name keeps only its best-ranked row
ORDER BY tb.[Name]
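To check that a lookup-table query keeps exactly one status per name, here is a minimal sketch run through Python's sqlite3 module. The sample names, the table name Logging (taken from the CASE answer) and the precedence column name `prec` are assumptions for illustration. The subquery is correlated on the name, so each name is compared against its own best (lowest) precedence number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Status_Precedence (status TEXT PRIMARY KEY, prec INTEGER);
CREATE TABLE Logging (id INTEGER PRIMARY KEY, name TEXT, status TEXT);
INSERT INTO Status_Precedence VALUES
    ('Closed', 1), ('Hold', 2), ('Waiting', 3), ('Open', 4);
-- Invented rows: not every name has every status.
INSERT INTO Logging (name, status) VALUES
    ('Joe', 'Open'), ('Joe', 'Closed'), ('Joe', 'Waiting'),
    ('Ann', 'Waiting'), ('Ann', 'Hold'),
    ('Bob', 'Open');
""")

# Keep a row only if its precedence equals the minimum for that name.
TOP_STATUS_SQL = """
SELECT l.name, l.status
FROM Logging l
INNER JOIN Status_Precedence sp ON sp.status = l.status
WHERE sp.prec = (SELECT MIN(sp2.prec)
                 FROM Logging l2
                 INNER JOIN Status_Precedence sp2 ON sp2.status = l2.status
                 WHERE l2.name = l.name)
ORDER BY l.name
"""

top = conn.execute(TOP_STATUS_SQL).fetchall()
print(top)  # [('Ann', 'Hold'), ('Bob', 'Open'), ('Joe', 'Closed')]
```

If a name had two rows with the same best status, both would be returned; deduplicating those would need an extra tie-breaker such as the unique identifier.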