SQL query select from table and group on other column - sql

I'm phrasing the question title poorly as I'm not sure what to call what I'm trying to do but it really should be simple.
I've a link / join table with two ID columns. I want to run a check before saving new rows to the table.
The user can save attributes through a webpage but I need to check that the same combination doesn't exist before saving it. With one record it's easy as obviously you just check if that attributeId is already in the table, if it is don't allow them to save it again.
However, if the user chooses a combination of that attribute and another one then they should be allowed to save it.
Here's an image of what I mean:
So if a user now tried to save an attribute with ID of 1 it will stop them, but I need it to also stop them if they tried ID's of 1, 10 so long as both 1 and 10 had the same productAttributeId.
I'm confusing this in my explanation but I'm hoping the image will clarify what I need to do.
This should be simple so I presume I'm missing something.

If I understand the question properly, you want to prevent the combination of AttributeId and ProductAttributeId from being reused. If that's the case, simply make them a combined primary key, which is by nature UNIQUE.
If that's not feasible, create a stored procedure that runs a query against the join for instances of the AttributeId. If the query returns 0 instances, insert the row.
Here's some light code to present the idea (may need to be modified to work with your database):
SELECT COUNT(1) FROM MyJoinTable WHERE AttributeId = #RequestedID
IF ##ROWCOUNT = 0
BEGIN
INSERT INTO MyJoinTable ...
END

You can control your inserts via a stored procedure. My understanding is that
users can select a combination of Attributes, such as
just 1
1 and 10 together
1,4,5,10 (4 attributes)
These need to enter the table as a single "batch" against a (new?) productAttributeId
So if (1,10) was chosen, this needs to be blocked because 1-2 and 10-2 already exist.
What I suggest
The stored procedure should take the attributes as a single list, e.g. '1,2,3' (comma separated, no spaces, just integers)
You can then use a string splitting UDF or an inline XML trick (as shown below) to break it into rows of a derived table.
Test table
create table attrib (attributeid int, productattributeid int)
insert attrib select 1,1
insert attrib select 1,2
insert attrib select 10,2
Here I use a variable, but you can incorporate as a SP input param
declare #t nvarchar(max) set #t = '1,2,10'
select top(1)
t.productattributeid,
count(t.productattributeid) count_attrib,
count(*) over () count_input
from (select convert(xml,'<a>' + replace(#t,',','</a><a>') + '</a>') x) x
cross apply x.x.nodes('a') n(c)
cross apply (select n.c.value('.','int')) a(attributeid)
left join attrib t on t.attributeid = a.attributeid
group by t.productattributeid
order by countrows desc
Output
productattributeid count_attrib count_input
2 2 3
The 1st column gives you the productattributeid that has the most matches
The 2nd column gives you how many attributes were matched using the same productattributeid
The 3rd column is how many attributes exist in the input
If you compare the last 2 columns and the counts
match - you can use the productattributeid to attach to the product which has all these attributes
don't match - then you need to do an insert to create a new combination

Related

Matrix table index SQL Server 2008

I have a table with two columns built from another table of names, one identity and one a name like this:
ID---Name
1----Mike
2----Jeff
3----Robert
...down to however many
Could be 10 rows, could be 100. This will vary depending on input from other tables that are always changing but never be over 160 or so.
Now, pairings of names will have some meaning and thus a decimal data type score will be associated with said pairing (how at this point doesn’t matter, just need to build it for now...numbers just illustrative). I envision a matrix kind of like this:
ID------Name------Mike-------Jeff--------Robert-------- ...out to however many
1 -------Mike-------NULL------100.1------5.4-------- ...out to however many
2 -------Jeff---------100.1------NULL-----21.23--------- ...out to however many
3 ------Robert-------5.4--------21.23-----NULL---------...out to however many
…down to however many happen to be in the first table…
Maybe this isn’t quite the most optimal way to go (Yes, I know there are duplicates in the table but I plan to structure the queries such that the duplicates are ignored) but at this point am not aware of many viable options. After searching around, I thought maybe I wanted a pivot but that doesn’t seem to fit what I have here because I’m leaving the names in the column and associating them as column heads for a paired score. Then I thought maybe I wanted to store a variable as the value of each row and then add them as the columns. That was no help. My latest iteration was maybe creating a temp table as an exact copy with and identity column, then trying to select the specific name by the identity and looping through them but I can’t even seem to grab the first name and make it a column name in addition to a row value under the name column...see below
--create a table of names with an identity column
CREATE TABLE myTable2
(
ID INT IDENTITY(1,1),
Name VARCHAR(5),
);
--add names to the table from a different table
INSERT INTO myTable1 (Name)
SELECT Name
FROM myTable1
--create a temp table with the same values
SELECT ID, Name
INTO #new
FROM myTable2
GROUP BY ID, Name
--insert name from first row as a column head
INSERT INTO myTable2 (SELECT Number FROM #new WHERE ID =1)
So, in the last bit there, INSERT INTO”, I want to copy the names, in this instance “Mike” and make it ALSO a column head in the same table where it is a row (like in my second table). I get an error message that the syntax is not correct for the statement. Why isn’t this allowed? How can I get it to do what I want? It also has been suggested by someone that knows way more about this stuff than me, that maybe instead of building the table as a matrix, build it as below. It is possible here to get rid of the duplicates this way and I would except I have no idea where to even begin doing this…
Name1-----------Name2-----------Calculated Value
Mike--------------Mike-------------NULL
Jeff---------------Mike-------------100.1
Robert-------------Mike-------------5.4
Mike--------------Jeff-------------100.1
Jeff----------------Jeff-------------NULL
Robert------------Jeff-------------21.23
Mike--------------Robert-----------5.4
Jeff---------------Robert-----------21.23
Robert------------Robert-----------NULL
...etc
Any help suggestions or pointing of me in the right and most appropriate direction would be greatly appreciated!
EDIT: Here's how I solved my problem. Looks like the Cartesian product was the way to go. Thanks #Alex Kudryashev
--create a table of cross joined names
CREATE TABLE cartNames
(
Name1 VARCHAR(5),
Name2 VARCHAR(5),
);
--create two temporary tables from a source table of names
SELECT Name AS Name1
INTO #name1
FROM names
GROUP BY Name
SELECT Name AS Name2
INTO #Name2
FROM names
GROUP BY Name
--populate the Cartesian table
INSERT INTO cartNames
SELECT * FROM #name1 CROSS JOIN #name2
--get rid of the temp tables
DROP TABLE #Name1
DROP TABLE #Name2
--add columns and populate calculated scores
---
It looks like you want to create a Cartesian Product. There is very easy way to do so.
declare #tbl table(name varchar(10))
insert #tbl(name) values('MIke'),('Jeff'),('Robert')
select t1.name name1,t2.name name2, some_udf(t1.name,t2.name) calc_value
from #tbl t1 cross join #tbl t2

SQL Server Unique Composite Key of Two Field With Second Field Auto-Increment

I have the following problem, I want to have Composite Primary Key like:
PRIMARY KEY (`base`, `id`);
for which when I insert a base the id to be auto-incremented based on the previous id for the same base
Example:
base id
A 1
A 2
B 1
C 1
Is there a way when I say:
INSERT INTO table(base) VALUES ('A')
to insert a new record with id 3 because that is the next id for base 'A'?
The resulting table should be:
base id
A 1
A 2
B 1
C 1
A 3
Is it possible to do it on the DB exactly since if done programmatically it could cause racing conditions.
EDIT
The base currently represents a company, the id represents invoice number. There should be auto-incrementing invoice numbers for each company but there could be cases where two companies have invoices with the same number. Users logged with a company should be able to sort, filter and search by those invoice numbers.
Ever since someone posted a similar question, I've been pondering this. The first problem is that DBs don't provide "partitionable" sequences (that would restart/remember based on different keys). The second is that the SEQUENCE objects that are provided are geared around fast access, and can't be rolled back (ie, you will get gaps). This essentially this rules out using a built-in utility... meaning we have to roll our own.
The first thing we're going to need is a table to store our sequence numbers. This can be fairly simple:
CREATE TABLE Invoice_Sequence (base CHAR(1) PRIMARY KEY CLUSTERED,
invoiceNumber INTEGER);
In reality the base column should be a foreign-key reference to whatever table/id defines the business(es)/entities you're issuing invoices for. In this table, you want entries to be unique per issued-entity.
Next, you want a stored proc that will take a key (base) and spit out the next number in the sequence (invoiceNumber). The set of keys necessary will vary (ie, some invoice numbers must contain the year or full date of issue), but the base form for this situation is as follows:
CREATE PROCEDURE Next_Invoice_Number #baseKey CHAR(1),
#invoiceNumber INTEGER OUTPUT
AS MERGE INTO Invoice_Sequence Stored
USING (VALUES (#baseKey)) Incoming(base)
ON Incoming.base = Stored.base
WHEN MATCHED THEN UPDATE SET Stored.invoiceNumber = Stored.invoiceNumber + 1
WHEN NOT MATCHED BY TARGET THEN INSERT (base) VALUES(#baseKey)
OUTPUT INSERTED.invoiceNumber ;;
Note that:
You must run this in a serialized transaction
The transaction must be the same one that's inserting into the destination (invoice) table.
That's right, you'll still get blocking per-business when issuing invoice numbers. You can't avoid this if invoice numbers must be sequential, with no gaps - until the row is actually committed, it might be rolled back, meaning that the invoice number wouldn't have been issued.
Now, since you don't want to have to remember to call the procedure for the entry, wrap it up in a trigger:
CREATE TRIGGER Populate_Invoice_Number ON Invoice INSTEAD OF INSERT
AS
DECLARE #invoiceNumber INTEGER
BEGIN
EXEC Next_Invoice_Number Inserted.base, #invoiceNumber OUTPUT
INSERT INTO Invoice (base, invoiceNumber)
VALUES (Inserted.base, #invoiceNumber)
END
(obviously, you have more columns, including others that should be auto-populated - you'll need to fill them in)
...which you can then use by simply saying:
INSERT INTO Invoice (base) VALUES('A');
So what have we done? Mostly, all this work was about shrinking the number of rows locked by a transaction. Until this INSERT is committed, there are only two rows locked:
The row in Invoice_Sequence maintaining the sequence number
The row in Invoice for the new invoice.
All other rows for a particular base are free - they can be updated or queried at will (deleting information out of this kind of system tends to make accountants nervous). You probably need to decide what should happen when queries would normally include the pending invoice...
you can use the trigger for before insert and assign the next value by taking the max(id) with "base" filter which is "A" in this case.
That will give you the max(id) value as 2 and than increment it by max(id)+1. now push the new value to the "id" field. before insert.
I think this may help you
MSSQL Triggers: http://msdn.microsoft.com/en-in/library/ms189799.aspx
Test Table
CREATE TABLE MyTable
( base CHAR(1),
id INT
)
GO
Trigger Definition
CREATE TRIGGER dbo.tr_Populate_ID
ON dbo.MyTable
INSTEAD OF INSERT
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO MyTable (base,id)
SELECT i.base, ISNULL(MAX(mt.id),0) +1 AS NextValue
FROM inserted i left join MyTable mt
on i.base = mt.base
GROUP BY i.base
END
Test
Execute the following statement multiple times and you will see the next values available in that group will be assigned to ID.
INSERT INTO MyTable VALUES
('A'),
('B'),
('C')
GO
SELECT * FROM MyTable
GO

Separating multiple values in one column in MS SQL

I have a field in an application that allows a user to select multiple values. When I query this field in the DB, if multiple values were selected the result gets displayed as one long word. There are no commas or space between the multiple selected values. Is there any way those values can be split by a comma?
Here’s my query:
SELECT HO.Value
FROM HAssessment ha
INNER JOIN HObservation HO
ON HO.AssessmentiD = ha.AssessmentID
AND HO.Patient_Oid = 2255231
WHERE Ho.FindingAbbr = 'A_R_CardHx'
------------------------------------------------
Result:
AnginaArrhythmiaCADCChest Pain
-------------------------
I would like to see:
Angina, Arrhythmia, CADC, Chest Pain
------------------------------------------
Help!
There's no easy solution to this.
The most expedient would be writing a string splitting function. From your sample data, it seems the values are all concatenated together without any separators. This means you'll have to come up with a list of all possible values (hopefully this is a query from some symptoms table...) and parse each one out from the string. This will be complex and painful.
A straightforward way to do this would be to test each valid symptom value to see whether it's contained within HObservation.Value, stuff all the found values together, and return the result. Note that this will perform very poorly.
Here's an example in TSQL. You'd be much better off doing this at the application layer, though, or better yet, normalizing your database (see below for more on that).
declare #symptoms table (symptom varchar(100))
insert into #symptoms (symptom)
values ('Angina'),('Arrhythmia'),('CADC'),('Chest Pain')
declare #value varchar(100)
set #value = 'AnginaArrhythmiaCADCChest Pain'
declare #result varchar(100)
select #result = stuff((
SELECT ', ' + s.symptom
FROM #symptoms s
WHERE patindex('%' + s.symptom + '%',#value) > 0
FOR XML PATH('')),1,1,'')
select #result
The real answer is to restructure your database. Put each distinct item found in HObservation.Value (this means Angina, Arrhythmia, etc. as separate rows) in to some other table if such a table doesn't exist already. I'll call this table Symptom. Then create a lookup table to link HObservation with Symptom. Then drop the HObservation.Value column entirely. Do the splitting work in the application level, and make multiple inserts in to the lookup table.
Example, based on sample data from your question:
HObservation
------------
ID Value
1 AnginaArrhythmiaCADC
Becomes:
HObservation
------------
ID
1
Symptom
-------
ID Value
1 Angina
2 Arrhythmia
3 CADC
HObservationSymptom
-------------------
ID HObservationID SymptomID
1 1 1
1 1 2
1 1 3
Note that if this is a production system (or you want to preserve the existing data for some other reason), you'll still have to write code to do the string splitting.

SQL: I need to take two fields I get as a result of a SELECT COUNT statement and populate a temp table with them

So I have a table which has a bunch of information and a bunch of records. But there will be one field in particular I care about, in this case #BegAttField# where only a subset of records have it populated. Many of them have the same value as one another as well.
What I need to do is get a count (minus 1) of all duplicates, then populate the first record in the bunch with that count value in a new field. I have another field I call BegProd that will match #BegAttField# for each "first" record.
I'm just stuck as to how to make this happen. I may have been on the right path, but who knows. The SELECT statement gets me two fields and as many records as their are unique #BegAttField#'s. But once I have them, I haven't been able to work with them.
Here's my whole set of code, trying to use a temporary table and SELECT INTO to try and populate it. (Note: the fields with # around the names are variables for this 3rd party app)
CREATE TABLE #temp (AttCount int, BegProd varchar(255))
SELECT COUNT(d.[#BegAttField#])-1 AS AttCount, d.[#BegAttField#] AS BegProd
INTO [#temp] FROM [Document] d
WHERE d.[#BegAttField#] IS NOT NULL GROUP BY [#BegAttField#]
UPDATE [Document] d SET d.[#NumAttach#] =
SELECT t.[AttCount] FROM [#temp] t INNER JOIN [Document] d1
WHERE t.[BegProd] = d1.[#BegAttField#]
DROP TABLE #temp
Unfortunately I'm running this script through a 3rd party database application that uses SQL as its back-end. So the errors I get are simply: "There is already an object named '#temp' in the database. Incorrect syntax near the keyword 'WHERE'. "
Comment out the CREATE TABLE statement. The SELECT INTO creates that #temp table.

Return multiple tables from a T-SQL function in SQL Server 2008

You can return a single table from a T-SQL Function in SQL Server 2008.
I am wondering if it is possible to return more than one table.
The scenario is that I have three queries that filter 3 different tables. Each table is filtered against 5 filter tables that I would like to return from a function; rather than copy and paste their creation in each query.
An simplified example of what this would look like with copy and paste:
FUNCTION GetValuesA(#SomeParameter int) RETURNS #ids TABLE (ID int) AS
WITH Filter1 As ( Select id FROM FilterTable1 WHERE Attribute=SomeParameter )
, Filter2 As ( Select id FROM FilterTable2 WHERE Attribute=SomeParameter )
INSERT INTO #IDs
SELECT ID FROM ValueTableA
WHERE ColA IN (SELECT id FROM Filter1)
AND ColB IN (SELECT id FROM Filter2)
RETURN
-----------------------------------------------------------------------------
FUNCTION GetValuesB(#SomeParameter int) RETURNS #ids TABLE (ID int) AS
WITH Filter1 As ( Select id FROM FilterTable1 WHERE Attribute=SomeParameter )
, Filter2 As ( Select id FROM FilterTable2 WHERE Attribute=SomeParameter )
INSERT INTO #IDs
SELECT ID FROM ValueTableB
WHERE ColA IN (SELECT id FROM Filter1)
AND ColB IN (SELECT id FROM Filter2)
AND ColC IN (SELECT id FROM Filter2)
RETURN
So, the only difference between the two queries is the Table being filtered, and HOW (the Where clause).
I would like to know if I could return Filter1 & Filter2 from a function. I am also open to suggestions on different ways to approach this problem.
No.
Conceptually, how would you expect to handle a function that returned a variable number of tables? You would JOIN on two tables at once? What if the returned fields don't line up?
Is there some reason you can't have a TVF for each filter?
As others say, NO. A function in TSQL must return exactly one result (although that result can come in the form of a table with numerous values).
There are a couple of ways you could achieve something similar though. A stored procedure can execute multiple select statements and deliver the results up to whatever called it, whether that be an application layer or something like SSMS. Many libraries require you to add additional commands to access more result sets though. For instance, in Pyodbc to access result sets after the first one you need to call cursor.nextset()
Also, inside a function you could UNION several result sets together although that would require each result set to have the same columns. One way to achieve that if they have a different column structure is to add in nulls for the missing columns for each select statement. If you needed to know which select statement returned the value, you could also add a column which indicated that. This should work with your simplified example since in each case it is just returning a single ID column, but it could get awkward very quickly if the column names or types are radically different.