Placing different rows in succession - sql

I started working with Access about a month ago, and I'm building a tool for a preventive medicine department so they can use a digital version of their current paper form.
While the program is nearly finished, the doctor who requested it now wants to export to Excel (the easy part) all the data for a patient, their treatment, and all the medicines used during that treatment in a single line (the problem).
I've been beating my head against this for two days, trying things and searching on Google, but all I could find was how to put the values from a column into a single cell, and that's not how it has to be displayed.
So far, my best attempt (which is far from a good one) has been something like this:
CREATE TABLE Patient
(`SIP` int, `name` varchar(10));
INSERT INTO Patient
(`SIP`, `name`)
VALUES
(70,'John');
-- A patient can have multiple treatments
CREATE TABLE Treatment
(`id` int, `SIPFK` int);
INSERT INTO Treatment
(`id`,`SIPFK`)
VALUES
(1,70);
-- A treatment can have multiple medicines used while it's open
CREATE TABLE Medicine
(`Id` int, `Name` varchar(8), `TreatFK` int);
INSERT INTO Medicine
(`Id`, `Name`, `TreatFK`)
VALUES
(7, 'Apples', 1),
(7, 'Tomatoes', 1),
(7, 'Potatoes', 1),
(8, 'Banana', 2),
(8, 'Peach', 2);
-- The query
select c.id, c.Name, p.id as id2, p.Name as name2, r.id as id3, r.Name as name3
from Medicine as c, Medicine as p, Medicine as r
where c.id = 7 and p.id=7 and r.id=7;
The output I was trying to get was:
7 | Apples | 7 | Tomatoes | 7 | Potatoes
The Medicine table will have more columns than that, and I need to show every row related to a treatment in a single row along with the treatment.
But the values keep repeating themselves on different rows and the output on the subsequent columns besides the first ones is not as expected. Also GROUP BY won't solve the problem and DISTINCT doesn't work.
The output of the query is as follows: sqlfiddle.com
If anyone could give me a hint, I would be grateful.
EDIT: Since Access is a derp and won't let me use any good SQL fix, nor recognize DISTINCT to stop the query data from repeating, I will look for a way to organize the rows directly in the exported Excel file.
Thank you all for your help, I'll save it cause I'm sure it'll save me hours of hands in the head.

This is a bit problematic, because MS Access does not support recursive CTEs and I don't see a way of doing this without ranking.
Hence, I have tried to reproduce the result by using a subquery that ranks the medicines
and stores them in a temporary table.
create table newtable
select c.id
, c.Name
,(SELECT COUNT(T1.Name) FROM Medicine AS T1 WHERE t1.id=c.id and T1.Name >= c.Name) AS Rank
from Medicine as c;
Afterwards, it is easy because my query is mostly based on Ranks and IDs.
select distinct id
,(select Name from newtable t2 where t1.id=t2.id and rank=1) as firstMed
,(select Name from newtable t2 where t1.id=t2.id and rank=2) as secMed
,(select Name from newtable t2 where t1.id=t2.id and rank=3) as ThirdMed
from newtable t1;
In my opinion, the self-join concept and the notion of recursive CTEs are the most important points for this particular example, and it would be good practice to do some research on them.
for reference: http://sqlfiddle.com/#!2/f80a9/2
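The rank-then-pivot idea above can be tried end to end. This is only a sketch under stated assumptions: it runs on Python's built-in sqlite3 module as a stand-in engine (the question is about Access and the fiddle uses MySQL), it renames the Rank column to rnk, and it keeps the same >= comparison, so names are ranked in reverse alphabetical order within each Id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Medicine (Id INTEGER, Name TEXT, TreatFK INTEGER);
INSERT INTO Medicine (Id, Name, TreatFK) VALUES
  (7, 'Apples', 1), (7, 'Tomatoes', 1), (7, 'Potatoes', 1),
  (8, 'Banana', 2), (8, 'Peach', 2);

-- Step 1: number the medicines within each Id via a correlated COUNT.
CREATE TABLE newtable AS
SELECT c.Id, c.Name,
       (SELECT COUNT(t1.Name) FROM Medicine AS t1
         WHERE t1.Id = c.Id AND t1.Name >= c.Name) AS rnk
FROM Medicine AS c;
""")

# Step 2: one scalar subquery per output column (three columns assumed here).
pivoted = conn.execute("""
SELECT DISTINCT t1.Id,
  (SELECT Name FROM newtable t2 WHERE t2.Id = t1.Id AND t2.rnk = 1) AS firstMed,
  (SELECT Name FROM newtable t2 WHERE t2.Id = t1.Id AND t2.rnk = 2) AS secMed,
  (SELECT Name FROM newtable t2 WHERE t2.Id = t1.Id AND t2.rnk = 3) AS thirdMed
FROM newtable t1
ORDER BY t1.Id
""").fetchall()

for row in pivoted:
    print(row)
```

Ids with fewer than three medicines get NULL (None in Python) in the unused columns, and an Id with more than three would need additional subquery columns.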

Related

Create New SQL Table w/o duplicates

I'm learning how to create tables in SQL pulling data from existing tables from two different databases. I am trying to create a table combining two tables without duplicates. I've seen some say using UNION but I could not get that to work.
Say TABLE 1 has 2 COLUMNS (IdNumber, Material) and TABLE 2 has 3 COLUMNS (IdNumber, Size, Description)
How can I create a new table (named TABLE3) that combines those two but only shows the columns (PartDescription, Weight, Color), without duplicates?
What I have done so far is as follows:
CREATE TABLE #Materialsearch (IdNumber varchar(30), Material varchar(30))
CREATE TABLE #Sizesearch (idnumber varchar(30), Size varchar(30), Description varchar(50))
INSERT INTO #Materialsearch (IdNumber, Material)
SELECT [IdNumber],[Material]
FROM [datalist].[dbo].[Table1]
WHERE Material LIKE 'Steel' AND IdNumber NOT LIKE 'Steel'
INSERT INTO #Sizesearch (idnumber, Size, Description)
SELECT [idNumber],[itemSize], [ShortDesc]
FROM [515dap].[dbo].[Table2]
WHERE itemSize LIKE '1' AND idnumber NOT LIKE 'Steel'
SELECT DISTINCT #Materialsearch.IdNumber, #Materialsearch.Material,
#Sizesearch.Size, #Sizesearch.Description
FROM #Materialsearch
INNER JOIN #Sizesearch
ON #Materialsearch.IdNumber = #Sizesearch.idnumber
ORDER BY #Materialsearch.IdNumber
DROP TABLE #Materialsearch
DROP TABLE #Sizesearch
This would show all items that are made from steel but do not have steel as their itemid's.
Thanks for your help
I'm not 100% sure what you're after - but you may find this useful.
You could use a FULL OUTER JOIN, which takes all rows from both tables, matches the ones it can, then reports all rows.
I'd suggest (for your understanding) running
SELECT A.*, B.*
FROM #Materialsearch AS A
FULL OUTER JOIN #Sizesearch AS B ON A.[IdNumber] = B.[IdNumber]
Then to get the relevant data, you just need some tweaks on that e.g.,
SELECT
ISNULL(A.[IdNumber], B.[IdNumber]) AS [IdNumber],
A.Material,
B.Size,
B.Description
FROM #Materialsearch AS A
FULL OUTER JOIN #Sizesearch AS B ON A.[IdNumber] = B.[IdNumber]
Edit: Changed typoed INNER JOINs to FULL OUTER JOINs. Oops :( Thank you very much @Thorsten for finding it!
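To see what the full outer join returns, here is a small sketch using Python's sqlite3 module. Several things in it are assumptions for illustration: the sample rows are invented, the temp tables are plain tables, COALESCE-free NULL columns stand in for the ISNULL call, and because older SQLite builds lack FULL OUTER JOIN the join is emulated with a LEFT JOIN plus a UNION ALL of the unmatched right-hand rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Materialsearch (IdNumber TEXT, Material TEXT);
CREATE TABLE Sizesearch (IdNumber TEXT, Size TEXT, Description TEXT);
-- Hypothetical sample data: A1 only has a material, A3 only has a size.
INSERT INTO Materialsearch VALUES ('A1', 'Steel'), ('A2', 'Steel');
INSERT INTO Sizesearch VALUES ('A2', '1', 'Bolt'), ('A3', '1', 'Nut');
""")

# Emulated FULL OUTER JOIN: all left rows (matched or not),
# then the right rows that have no partner on the left.
combined = conn.execute("""
SELECT a.IdNumber, a.Material, b.Size, b.Description
FROM Materialsearch AS a
LEFT JOIN Sizesearch AS b ON a.IdNumber = b.IdNumber
UNION ALL
SELECT b.IdNumber, NULL, b.Size, b.Description
FROM Sizesearch AS b
WHERE NOT EXISTS (SELECT 1 FROM Materialsearch AS a
                  WHERE a.IdNumber = b.IdNumber)
ORDER BY 1
""").fetchall()

for row in combined:
    print(row)
```

Rows that exist in only one table still appear, with NULL in the columns that come from the other table.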

Select Value Of CSV in MasterTable From Reference Table

Consider these two tables:
--Subscriber_File---
ID GenreId FileName
01 1,2 TestFile.pdf
--MasterGenre--
ID Genrename
1 TEst1
2 Test2
When I issue this query, I'd like the result to be formatted as follows
Select * From Subscriber_File
ID GenreId FileName GenreName
1 1,2 TestFile.pdf TEst1,Test2
How can this be done?
Your data is not normalized. Specifically, in the one row for Subscriber_File, you have two facts in one place: the fact that the one entry is related to both MasterGenre 1 and MasterGenre 2. What if it were related to three MasterGenres? What if 10? The code required to associate your facts quickly escalates into an unmanageable mess.
The standard solution, when using relational database systems, is to normalize your data, such that each “repeating fact” is represented by one row in a table. (Google "database normalization" and you'll find thousands of articles on the subject. Really.) Here, you might end up with:
Table Subscriber
SubscriberId
FileName
(01, TestFile.pdf)
Table Genre
GenreId
GenreName
(1, Test1)
(2, Test2)
Table SubscriberGenre
SubscriberId
GenreId
(01, 1)
(01, 2)
At which point querying the data becomes trivial:
SELECT sub.SubscriberId, gen.GenreId, sub.FileName, gen.GenreName
FROM Subscriber sub
INNER JOIN SubscriberGenre subgen
ON subgen.SubscriberId = sub.SubscriberId
INNER JOIN Genre gen
ON gen.GenreId = subgen.GenreId
This should produce the result set
(01, 1, TestFile.pdf, Test1)
(01, 2, TestFile.pdf, Test2)
Hmm, you’re still challenged with converting those two lines into the one with a “1,2” value. I’ll let someone else answer that; my main point is that without normalized table structures, you’ll have trouble getting anything done.
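For the remaining step of folding the two rows back into one "1,2" style value, one possible approach is a string aggregate over the normalized tables. The sketch below is an assumption-laden illustration rather than the answerer's method: it uses SQLite's GROUP_CONCAT through Python's sqlite3 module (SQL Server 2017+ and PostgreSQL offer STRING_AGG for the same purpose), and SQLite does not guarantee the order of items inside the concatenated string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Subscriber (SubscriberId TEXT PRIMARY KEY, FileName TEXT);
CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, GenreName TEXT);
CREATE TABLE SubscriberGenre (SubscriberId TEXT, GenreId INTEGER);
INSERT INTO Subscriber VALUES ('01', 'TestFile.pdf');
INSERT INTO Genre VALUES (1, 'Test1'), (2, 'Test2');
INSERT INTO SubscriberGenre VALUES ('01', 1), ('01', 2);
""")

# One output row per subscriber; genres are collapsed into CSV strings.
row = conn.execute("""
SELECT sub.SubscriberId,
       GROUP_CONCAT(gen.GenreId, ',')   AS GenreIds,
       sub.FileName,
       GROUP_CONCAT(gen.GenreName, ',') AS GenreNames
FROM Subscriber AS sub
JOIN SubscriberGenre AS sg ON sg.SubscriberId = sub.SubscriberId
JOIN Genre AS gen ON gen.GenreId = sg.GenreId
GROUP BY sub.SubscriberId, sub.FileName
""").fetchone()

print(row)
```

The comma-separated form is produced only at query time, so the stored data stays normalized.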

Is SQL GROUP BY a design flaw? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Why does SQL require that I specify on which attributes to group? Why can't it just use all non-aggregates?
If an attribute is not aggregated and is not in the GROUP BY clause then nondeterministic choice would be the only option assuming tuples are unordered (mysql kind of does this) and that is a huge gotcha. As far as I know, Postgresql requires that all attributes not appearing in the GROUP BY must be aggregated, which reinforces that it is superfluous.
Am I missing something or is this a language design flaw that promotes loose implementations and makes queries harder to write?
If I am missing something, what is an example query where group attributes can not be inferred?   
You don't have to group by exactly the same thing you're selecting, e.g.:
SQL:select priority,count(*) from rule_class
group by priority
PRIORITY COUNT(*)
70 1
50 4
30 1
90 2
10 4
SQL:select decode(priority,50,'Norm','Odd'),count(*) from rule_class
group by priority
DECO COUNT(*)
Odd 1
Norm 4
Odd 1
Odd 2
Odd 4
SQL:select decode(priority,50,'Norm','Odd'),count(*) from rule_class
group by decode(priority,50,'Norm','Odd')
DECO COUNT(*)
Norm 4
Odd 8
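The difference between the last two queries can be reproduced with a small sketch. It assumes SQLite via Python's sqlite3 module, rebuilds rule_class from the counts shown in the first output, and swaps Oracle's decode for a standard CASE expression; the ORDER BY clauses are added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rule_class (priority INTEGER)")

# Row counts per priority, taken from the first result set above.
counts = {70: 1, 50: 4, 30: 1, 90: 2, 10: 4}
conn.executemany(
    "INSERT INTO rule_class (priority) VALUES (?)",
    [(p,) for p, n in counts.items() for _ in range(n)],
)

# CASE stands in for decode(priority, 50, 'Norm', 'Odd').
label = "CASE priority WHEN 50 THEN 'Norm' ELSE 'Odd' END"

# Same select list, two different groupings.
by_priority = conn.execute(
    f"SELECT {label}, COUNT(*) FROM rule_class "
    "GROUP BY priority ORDER BY priority"
).fetchall()
by_label = conn.execute(
    f"SELECT {label}, COUNT(*) FROM rule_class "
    f"GROUP BY {label} ORDER BY 1"
).fetchall()

print(by_priority)
print(by_label)
```

Grouping by priority yields five rows, four of them labelled Odd, while grouping by the expression yields two, so the grouping cannot be inferred from the select list alone.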
There is one more reason why SQL requires that I specify which attributes to group on.
Let's say we have two simple tables, friend and car, where we store info about our friends and their cars.
And let's say we want to show all our friends' data (from table friend) and, for every one of our friends, how many cars they own now, have sold, have crashed, and the total number. Oh, and we want the elders first, younger last.
We'd do something like:
SELECT f.id
, f.firstname
, f.lastname
, f.birthdate
, COUNT(NOT c.sold AND NOT c.crashed) AS owned
, COUNT(c.sold) AS sold
, COUNT(c.crashed) AS crashed
, COUNT(c.friendid) AS totalcars
FROM friend f
LEFT JOIN car c -- to catch (shame!) those friends who have never had a car
ON f.id = c.friendid
GROUP BY f.id
, f.firstname
, f.lastname
, f.birthdate
ORDER BY f.birthdate
But do we really need all those fields in the GROUP BY? Isn't every friend uniquely determined by his id? In other words, aren't the firstname, lastname and birthdate functionally dependent on f.id? Why not just do (as we can in MySQL):
SELECT f.id
, f.firstname
, f.lastname
, f.birthdate
, COUNT(NOT c.sold AND NOT c.crashed) AS owned
, COUNT(c.sold) AS sold
, COUNT(c.crashed) AS crashed
, COUNT(c.friendid) AS totalcars
FROM friend f
LEFT JOIN car c -- to catch (shame!) those friends who have never had a car
ON f.id = c.friendid
GROUP BY f.id
ORDER BY f.birthdate
And what if we had 20 fields in the SELECT (plus ORDER BY) parts? Isn't the second query shorter, clearer and probably faster (in the RDBMS that accept it)?
I say yes. So do the SQL:1999 and SQL:2003 specs, if this article is correct: Debunking group by myths
I would say that if you have a large number of items in the GROUP BY clause, then perhaps the core info should be pulled out into a tabular subquery which you inner join to.
There is probably a performance hit, but it makes for neater code.
select id, count(a), b, c, d
from table
group by
id, b, c, d
becomes
select t.id, myCountTable.myCount, t.b, t.c, t.d
from table t
inner join (
select id, count(*) as myCount
from table
group by id
) as myCountTable on myCountTable.id = t.id
That said, I'm interested to hear counter-arguments for doing this as opposed to having a large group by clause.
I agree it is verbose that the GROUP BY list is not implicitly the same as the non-aggregated SELECT columns. In SAS there are data aggregation operations that are more succinct.
Also: it's hard to come up with an example where it would be useful to have a longer list of columns in the GROUP BY list than in the SELECT list. The best I can come up with is ...
create table people
( Nam char(10)
,Adr char(10)
)
insert into people values ('Peter', 'Tibet')
insert into people values ('Peter', 'OZ')
insert into people values ('Peter', 'OZ')
insert into people values ('Joe', 'NY')
insert into people values ('Joe', 'Texas')
insert into people values ('Joe', 'France')
-- Give me people where there is a duplicate address record
select * from people where nam in
(
select nam
from People
group by nam, adr -- group list different from select list
having count(*) > 1
)
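The duplicate-address query above can be checked with a short sketch. It assumes SQLite through Python's sqlite3 module and adds an ORDER BY purely so the output order is predictable; the table and rows are the ones from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (Nam TEXT, Adr TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [
        ("Peter", "Tibet"), ("Peter", "OZ"), ("Peter", "OZ"),
        ("Joe", "NY"), ("Joe", "Texas"), ("Joe", "France"),
    ],
)

# The subquery groups by (Nam, Adr) but selects only Nam,
# so the GROUP BY list is longer than the select list.
dupes = conn.execute("""
SELECT Nam, Adr FROM people
WHERE Nam IN (
    SELECT Nam FROM people
    GROUP BY Nam, Adr
    HAVING COUNT(*) > 1
)
ORDER BY Adr
""").fetchall()

print(dupes)
```

Only Peter has a repeated (Nam, Adr) pair, so all three of Peter's rows come back and none of Joe's.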
If your issue is just about an easier way to write scripts, here is one tip:
In SQL Server Management Studio, write your query as text, something like select * from my_table.
After that, select the text, right-click and choose "Design Query in Editor..".
Management Studio will open a new editor with all the fields filled in. After that, right-click again and select "Add Group By".
Management Studio will add the code for you.
I find this method extremely useful for insert statements. When I need to write a script to insert a lot of fields into a table, I just do select * from table_where_want_to_insert and after that change the type of the select statement to insert.
I Agree
I quite agree with the question. I asked the same one here.
I honestly think it's a language flaw.
I realise that there are arguments against that, but I have yet to use a GROUP BY clause containing anything other than all the non-aggregated fields from the SELECT clause in the real world.
This thread provides some useful explanations.
http://social.msdn.microsoft.com/Forums/en/transactsql/thread/52482614-bfc8-47db-b1b6-deec7363bd1a
I'd say it is more likely to be a language design choice that decisions be explicit, not implicit. For instance, what if I wish to group the data in a different order than that in which I output the columns? Or if I want to group by columns that aren't included in the columns selected? Or if I want to output grouped columns only and not use aggregate functions? Only by explicitly stating my preferences in the group by clause are my intentions clear.
You also have to remember that SQL is a very old language (it dates from the early 1970s). Look at how LINQ flipped everything around in order to make IntelliSense work - it looks obvious to us now, but SQL predates IDEs and so couldn't have taken such issues into account.
The "superfluous" attributes influence the ordering of the result.
Consider:
create table gb (
a number,
b varchar(3),
c varchar(3)
);
insert into gb values ( 3, 'foo', 'foo');
insert into gb values ( 1, 'foo', 'foo');
insert into gb values ( 0, 'foo', 'foo');
insert into gb values ( 20, 'foo', 'bar');
insert into gb values ( 11, 'foo', 'bar');
insert into gb values ( 13, 'foo', 'bar');
insert into gb values ( 170, 'bar', 'foo');
insert into gb values ( 144, 'bar', 'foo');
insert into gb values ( 130, 'bar', 'foo');
insert into gb values (2002, 'bar', 'bar');
insert into gb values (1111, 'bar', 'bar');
insert into gb values (1331, 'bar', 'bar');
This statement
select sum(a), b, c
from gb
group by b, c;
results in
44 foo bar
444 bar foo
4 foo foo
4444 bar bar
while this one
select sum(a), b, c
from gb
group by c, b;
results in
444 bar foo
44 foo bar
4 foo foo
4444 bar bar

Insert results of subquery into table with a constant

The outline of the tables in question are as follows:
I have a table, let's call it join, that has two columns, both foreign keys to other tables. Let's call the two columns userid and buildingid so join looks like
+--------------+
| join |
|--------------|
|userid |
|buildingid |
+--------------+
I basically need to insert a bunch of rows into this table. Each user will be assigned to multiple buildings by having multiple entries in this table. So user 13 might be assigned to buildings 1, 2, and 3 by the following
13 1
13 2
13 3
I'm trying to figure out how to do this in a query if the building numbers are constant, that is, I'm assigning a group of people to the same buildings. Basically, (this is wrong) I want to do
insert into join (userid, buildingid) values ((select userid from users), 1)
Does that make sense? I've also tried using
select 1
The error I'm running into is that the subquery returns more than one result. I also attempted to create a join, basically with a static select query that was also unsuccessful.
Any thoughts?
Thanks,
Chris
Almost! When you want to insert the values of a query, don't try to put them in the values clause. insert can take a select as the source of the values!
insert into join (userid, buildingid)
select userid, 1 from users
Also, in the spirit of learning more, you can create a table that doesn't exist by using the following syntax:
select userid, 1 as buildingid
into join
from users
That only works if the table doesn't exist, though, but it's a quick and dirty way to create table copies!
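Here is a minimal sketch of the INSERT ... SELECT form with a constant column, assigning every user to buildings 1, 2 and 3. It assumes SQLite through Python's sqlite3 module, and the table name user_building is hypothetical (join is a reserved word in most engines, so the sketch avoids it).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY);
INSERT INTO users (userid) VALUES (13), (14), (15);
-- Hypothetical name for the question's "join" table.
CREATE TABLE user_building (userid INTEGER, buildingid INTEGER);
""")

# One INSERT ... SELECT per building; the building id is a constant
# in the select list, supplied here as a bound parameter.
for building_id in (1, 2, 3):
    conn.execute(
        "INSERT INTO user_building (userid, buildingid) "
        "SELECT userid, ? FROM users",
        (building_id,),
    )

assignments = conn.execute(
    "SELECT userid, buildingid FROM user_building "
    "ORDER BY userid, buildingid"
).fetchall()
print(assignments)
```

Each statement inserts one row per user, so three users and three buildings give nine rows in total.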

A Simple Sql Select Query

I know I am sounding dumb but I really need help on this.
I have a Table (let's say Meeting) which Contains a column Participants.
The Participants dataType is varchar(Max) and it stores Participant's Ids in comma separated form like 1,2.
Now my problem is I am passing a parameter called @ParticipantsID in my Stored Procedure and want to do something like this:
Select Participants from Meeting where Participants in (@ParticipantsID)
Unfortunately I am missing something crucial here.
Can someone point that out?
I've been there before... I changed the DB design to have one record contain a single reference to the other table. If you can't change your DB structures and you have to live with this, I found this solution on CodeProject.
New Function
IF EXISTS(SELECT * FROM sysobjects WHERE ID = OBJECT_ID('UF_CSVToTable'))
DROP FUNCTION UF_CSVToTable
GO
CREATE FUNCTION UF_CSVToTable
(
@psCSString VARCHAR(8000)
)
RETURNS @otTemp TABLE(sID VARCHAR(20))
AS
BEGIN
DECLARE @sTemp VARCHAR(10)
WHILE LEN(@psCSString) > 0
BEGIN
SET @sTemp = LEFT(@psCSString, ISNULL(NULLIF(CHARINDEX(',', @psCSString) - 1, -1),
LEN(@psCSString)))
SET @psCSString = SUBSTRING(@psCSString,ISNULL(NULLIF(CHARINDEX(',', @psCSString), 0),
LEN(@psCSString)) + 1, LEN(@psCSString))
INSERT INTO @otTemp VALUES (@sTemp)
END
RETURN
END
GO
New Sproc
SELECT *
FROM
TblJobs
WHERE
iCategoryID IN (SELECT * FROM UF_CSVToTable(@sCategoryID))
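The same "turn the CSV into individual values, then use IN" idea can also be done on the client side. The sketch below is an illustration under assumptions, not the CodeProject code: it uses Python's sqlite3 module, the helper csv_to_ids and the iJobID column are hypothetical, and the split values are passed as bound parameters rather than spliced into the SQL text.

```python
import sqlite3


def csv_to_ids(csv_ids):
    """Split a string such as '1, 2,3' into [1, 2, 3], skipping empty pieces."""
    return [int(piece) for piece in csv_ids.split(",") if piece.strip()]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TblJobs (iJobID INTEGER, iCategoryID INTEGER);
INSERT INTO TblJobs VALUES (100, 1), (101, 2), (102, 3);
""")

ids = csv_to_ids("1,3")
# Build one "?" placeholder per id; only the placeholders are
# interpolated into the SQL, the values themselves stay parameters.
placeholders = ", ".join("?" for _ in ids)
jobs = conn.execute(
    f"SELECT iJobID, iCategoryID FROM TblJobs "
    f"WHERE iCategoryID IN ({placeholders}) ORDER BY iJobID",
    ids,
).fetchall()
print(jobs)
```

This handles a CSV parameter coming into the query; it does not help when the CSV is stored in the column itself, which is what the normalization advice addresses.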
You would not typically organise your SQL database in quite this way. What you are describing are two entities (Meeting & Participant) that have a many-to-many relationship, i.e. a meeting can have zero or more participants, and a participant can attend more than one meeting. To model this in SQL you would use three tables: a meeting table, a participant table and a MeetingParticipant table. The MeetingParticipant table holds the links between meetings & participants. So, you might have something like this (excuse any SQL syntax errors)
create table Meeting
(
MeetingID int,
Name varchar(50),
Location varchar(100)
)
create table Participant
(
ParticipantID int,
FirstName varchar(50),
LastName varchar(50)
)
create table MeetingParticipant
(
MeetingID int,
ParticipantID int
)
To populate these tables you would first create some Participants:
insert into Participant(ParticipantID, FirstName, LastName) values(1, 'Tom', 'Jones')
insert into Participant(ParticipantID, FirstName, LastName) values(2, 'Dick', 'Smith')
insert into Participant(ParticipantID, FirstName, LastName) values(3, 'Harry', 'Windsor')
and create a Meeting or two
insert into Meeting(MeetingID, Name, Location) values(10, 'SQL Training', 'Room 1')
insert into Meeting(MeetingID, Name, Location) values(11, 'SQL Training', 'Room 2')
and now add some participants to the meetings
insert into MeetingParticipant(MeetingID, ParticipantID) values(10, 1)
insert into MeetingParticipant(MeetingID, ParticipantID) values(10, 2)
insert into MeetingParticipant(MeetingID, ParticipantID) values(11, 2)
insert into MeetingParticipant(MeetingID, ParticipantID) values(11, 3)
Now you can select all the meetings and the participants for each meeting with
select m.MeetingID, p.ParticipantID, m.Location, p.FirstName, p.LastName
from Meeting m
join MeetingParticipant mp on m.MeetingID=mp.MeetingID
join Participant p on mp.ParticipantID=p.ParticipantID
the above should produce
MeetingID ParticipantID Location FirstName LastName
10 1 Room 1 Tom Jones
10 2 Room 1 Dick Smith
11 2 Room 2 Dick Smith
11 3 Room 2 Harry Windsor
If you want to find out all the meetings that "Dick Smith" is in you would write something like this
select m.MeetingID, m.Location
from Meeting m join MeetingParticipant mp on m.MeetingID=mp.MeetingID
where
mp.ParticipantID=2
and get
MeetingID Location
10 Room 1
11 Room 2
I have omitted important things like indexes, primary keys and missing attributes such as meeting dates, but it is clearer without all the goo.
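The three-table model can be exercised with a short sketch. It assumes SQLite through Python's sqlite3 module and adds primary and foreign keys that the tables above leave out; the data is the same sample data, and the query looks up the meetings for participant 2 ("Dick Smith") by joining on MeetingID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Meeting (
    MeetingID INTEGER PRIMARY KEY, Name TEXT, Location TEXT);
CREATE TABLE Participant (
    ParticipantID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
-- Link table: one row per (meeting, participant) pair.
CREATE TABLE MeetingParticipant (
    MeetingID INTEGER REFERENCES Meeting(MeetingID),
    ParticipantID INTEGER REFERENCES Participant(ParticipantID),
    PRIMARY KEY (MeetingID, ParticipantID));

INSERT INTO Participant VALUES
    (1, 'Tom', 'Jones'), (2, 'Dick', 'Smith'), (3, 'Harry', 'Windsor');
INSERT INTO Meeting VALUES
    (10, 'SQL Training', 'Room 1'), (11, 'SQL Training', 'Room 2');
INSERT INTO MeetingParticipant VALUES (10, 1), (10, 2), (11, 2), (11, 3);
""")

# All meetings attended by a given participant.
meetings = conn.execute("""
SELECT m.MeetingID, m.Location
FROM Meeting m
JOIN MeetingParticipant mp ON m.MeetingID = mp.MeetingID
WHERE mp.ParticipantID = ?
ORDER BY m.MeetingID
""", (2,)).fetchall()
print(meetings)
```

Because each link is its own row, the lookup is a plain equality filter instead of a string search inside a comma-separated column.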
Your table is not normalized. If you want to query for individual participants, they should be split into their own table, along the lines of:
Meeting
MeetingId primary key
Other stuff
Persons
PersonId primary key
Other stuff
Participants
MeetingId foreign key Meeting(MeetingId)
PersonId foreign key Persons(PersonId)
primary key MeetingId,PersonId
Otherwise, you have to resort to all sorts of trickery (what I call SQL gymnastics) to find out what you want. That trickery never scales well - your queries become slow very quickly as the table grows.
With a properly normalized database, the queries can remain fast well into the multi-millions of records (I work with DB2/z where we are used to truly huge tables).
There are valid reasons for sometimes reverting to second normal form (or even first) for performance, but that should be a very carefully considered decision (and based on actual performance data). All databases should initially start off in 3NF.
SELECT * FROM Meeting WHERE Participants = '12' OR Participants LIKE '%,12,%' OR Participants LIKE '12,%' OR Participants LIKE '%,12'
where 12 is the ID you are looking for....
Ugly, what a nasty model.
If I understand your question correctly, you are trying to pass in a comma-separated list of participant IDs and see if it is in your list. This link lists several ways to do such a thing:
http://vyaskn.tripod.com/passing_arrays_to_stored_procedures.htm
codezy.blogspot.com
If you store the participant ids in a comma-separated list (as text) in the database, you cannot easily query it (as a list) using SQL. You would have to resort to string-operations.
You should consider changing your schema to use another table to map meetings to participants:
create table meeting_participants (
meeting_id integer not null , -- foreign key
participant_id integer not null
);
That table would have multiple rows per meeting (one for each participant).
You can then query that table for individual participants, or number of participants, and such.
If participants is a separate data type you should be storing it as a child table of your meeting table. e.g.
MEETING
PARTICIPANT 1
PARTICIPANT 2
PARTICIPANT 3
Each participant would hold the meeting ID so you can do a query
SELECT * FROM participants WHERE meeting_id = 1
However, if you must store a comma separated list (for some external reason) then you can do a string search to find the appropriate record. This would be a very inefficient way to do a query though.
That is not the best way to store the information you have.
If it is all you have got, then you need to do a contains-style search (not an IN). The best answer is to have another table that links Participants to Meetings.
Try SELECT Meeting, Participants FROM Meeting WHERE CONTAINS(Participants, @ParticipantId) (in SQL Server, CONTAINS requires a full-text index on the column).