Display 2 columns for each header - sql

In SQL Server 2008 I have a table People (Id, Gender, Name).
Gender is either Male or Female. There can be many people with the same name.
I would like to write a query that displays for each gender the top 2 names
by count and their count, like this:
Male        Female
Adam  23    Rose   34
Max   20    Jenny  15
I think that PIVOT might be used but all the examples I have seen display only one column for each header.

Here is an example on SQL Fiddle -- http://sqlfiddle.com/#!3/b3477/1
This uses a couple of common table expressions to separate the genders.
create table People
(
Id int,
Gender varchar(50),
Name varchar(50)
)
;
insert into People values (1, 'Male', 'Bob');
insert into People values (2, 'Male', 'Bob');
insert into People values (3, 'Male', 'Bill');
insert into People values (4, 'Male', 'Chuck');
insert into People values (5, 'Female', 'Anne');
insert into People values (6, 'Female', 'Anne');
insert into People values (7, 'Female', 'Bobbi');
insert into People values (8, 'Female', 'Jane');
with cteMale as
(
select Name as 'MaleName', Count(*) as Num, ROW_NUMBER() over(order by count(*) desc, Name) RowNum
from People
where Gender = 'Male'
group by Name
)
,
cteFemale as
(
select Name as 'FemaleName', Count(*) as Num, ROW_NUMBER() over(order by count(*) desc, Name) RowNum
from People
where Gender = 'Female'
group by Name
)
select a.MaleName, a.Num as MaleNum, b.femaleName, b.Num as FemaleNum
from cteMale a
join cteFemale b on
a.RowNum = b.RowNum
where a.RowNum <= 2

Use a windowing function. Below is a complete solution using a temporary table #people.
-- use temp db
use tempdb;
go
-- drop test table
--drop table #people;
--go
-- create test table
create table #people (my_id int, my_gender char(1), my_name varchar(25));
go
-- clear test table
delete from #people;
-- three count
insert into #people values
(23, 'M', 'Adam'),
(34, 'F', 'Rose');
go 3
-- two count
insert into #people values
(20, 'M', 'Max'),
(15, 'F', 'Jenny');
go 2
-- one count
insert into #people values
(20, 'M', 'John'),
(15, 'F', 'Julie');
go
-- grab top two by gender
;
with cte_Get_Top_Two as
(
select ROW_NUMBER() OVER(PARTITION BY my_gender ORDER BY count(*) DESC) AS my_window,
my_gender, my_name, count(*) as total
from #people
group by my_gender, my_name
)
select * from cte_Get_Top_Two where my_window in (1, 2)
go
Here is the output.
PS: You can drop my_id from the table, since it does not relate to your problem; removing it does not change the solution.
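The windowed query above returns one row per name, not the side-by-side layout from the question. One way to get that layout, sketched here under assumptions, is to rank names per gender and then collapse each rank into a single row with conditional aggregation. The sketch runs the SQL through Python's sqlite3 module purely so it can be tried without a SQL Server instance (it needs SQLite 3.25 or later for window functions). The CTE names counts and ranked and the helper top_two_per_gender are made up for illustration; the sample rows are the ones from the first answer.

```python
import sqlite3

# Rank names within each gender, then pivot each rank into one output row.
PIVOT_SQL = """
WITH counts AS (
    SELECT Gender, Name, COUNT(*) AS Num
    FROM People
    GROUP BY Gender, Name
),
ranked AS (
    SELECT Gender, Name, Num,
           ROW_NUMBER() OVER (PARTITION BY Gender
                              ORDER BY Num DESC, Name) AS RowNum
    FROM counts
)
SELECT MAX(CASE WHEN Gender = 'Male'   THEN Name END) AS MaleName,
       MAX(CASE WHEN Gender = 'Male'   THEN Num  END) AS MaleNum,
       MAX(CASE WHEN Gender = 'Female' THEN Name END) AS FemaleName,
       MAX(CASE WHEN Gender = 'Female' THEN Num  END) AS FemaleNum
FROM ranked
WHERE RowNum <= 2
GROUP BY RowNum
ORDER BY RowNum
"""


def top_two_per_gender(conn):
    """Return one row per rank: (MaleName, MaleNum, FemaleName, FemaleNum)."""
    return conn.execute(PIVOT_SQL).fetchall()


# In-memory copy of the sample People table from the first answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Id INTEGER, Gender TEXT, Name TEXT)")
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?)",
    [
        (1, "Male", "Bob"), (2, "Male", "Bob"),
        (3, "Male", "Bill"), (4, "Male", "Chuck"),
        (5, "Female", "Anne"), (6, "Female", "Anne"),
        (7, "Female", "Bobbi"), (8, "Female", "Jane"),
    ],
)

for row in top_two_per_gender(conn):
    print(row)
```

Unlike an inner join between a male and a female CTE, grouping by the row number still returns a row when one gender has fewer than two distinct names; the missing side simply comes back as NULL.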

Related

Find value contained in the HierarchyId at any level

I need to find a particular value contained in the SQL Server HierarchyId column. The value can occur at any level. Here is a sample code to illustrate the issue:
CREATE TABLE mytable
(
Id INT NOT NULL PRIMARY KEY,
TeamName VARCHAR(20) NOT NULL,
MyHierarchyId HIERARCHYID NOT NULL
);
INSERT INTO mytable(Id, TeamName, MyHierarchyId)
VALUES (1, 'Corporate','/1/');
INSERT INTO mytable(Id, TeamName, MyHierarchyId)
VALUES (2, 'Group A','/1/2/');
INSERT INTO mytable(Id, TeamName, MyHierarchyId)
VALUES (3, 'Team X','/1/2/3/');
INSERT INTO mytable(Id, TeamName, MyHierarchyId)
VALUES (4, 'Group B','/1/4/');
INSERT INTO mytable(Id, TeamName, MyHierarchyId)
VALUES (5, 'Team Y','/1/4/5/');
INSERT INTO mytable(Id, TeamName, MyHierarchyId)
VALUES (6, 'Team Z','/1/4/6/');
Now I would like to find all the records that are associated with Id = 4. This means records 4, 5 and 6. I could use a brute-force method like this:
SELECT [M].[Id],
[M].[TeamName],
[M].[MyHierarchyId],
[M].[MyHierarchyId].ToString() AS Lineage
FROM [dbo].[mytable] AS [M]
WHERE [M].[MyHierarchyId].ToString() LIKE '%4%'
But I suspect this will be very inefficient. Once again, the problem is that the level of the node I am searching for is not known in advance.
Thank you for any recommendations.
You can use IsDescendantOf()
Select *
from mytable
Where MyHierarchyID.IsDescendantOf( (select MyHierarchyID from mytable where id=4) ) = 1
Results
Id  TeamName  MyHierarchyId
4   Group B   0x5C20
5   Team Y    0x5C3180
6   Team Z    0x5C3280
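To see why this differs from the LIKE '%4%' approach, it can help to think of each hierarchy value as its string path: a node is a descendant of another when its path starts with the other node's path. The sketch below is only a conceptual model in Python, not how SQL Server stores or compares hierarchyid values; the helper names is_descendant_of and subtree are made up for illustration, and the rows are the sample data from the question.

```python
# Sample rows from the question: (Id, TeamName, path as returned by ToString()).
ROWS = [
    (1, "Corporate", "/1/"),
    (2, "Group A", "/1/2/"),
    (3, "Team X", "/1/2/3/"),
    (4, "Group B", "/1/4/"),
    (5, "Team Y", "/1/4/5/"),
    (6, "Team Z", "/1/4/6/"),
]


def is_descendant_of(path, ancestor_path):
    """Prefix test on string paths; a node counts as its own descendant."""
    return path.startswith(ancestor_path)


def subtree(rows, node_id):
    """Ids of the node with the given Id and everything below it."""
    ancestor_path = next(path for row_id, _, path in rows if row_id == node_id)
    return [row_id for row_id, _, path in rows
            if is_descendant_of(path, ancestor_path)]


print(subtree(ROWS, 4))

# A substring search would also hit unrelated nodes such as '/1/14/',
# while the prefix test does not.
print("4" in "/1/14/", is_descendant_of("/1/14/", "/1/4/"))
```

Because every path ends with a slash, the prefix test does not confuse '/1/4/' with '/1/40/', which is the kind of false positive the substring search is prone to.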

SQL Server ranking weirdness using FREETEXTTABLE across multiple columns

I have been struggling to get my head around how SQL Server full text search ranks my results.
Consider the following FREETEXTTABLE search:
DECLARE @SearchTerm varchar(55) = 'Peter Alex'
SELECT ftt.[RANK], v.*
FROM FREETEXTTABLE (vMembersFTS, (Surname, FirstName, MiddleName, MemberRef, Passport), @SearchTerm) ftt
INNER JOIN vMembersFTS v ON v.ID = ftt.[KEY]
ORDER BY ftt.[RANK] DESC;
This returns the following results and rankings:
RANK  ID  MemberRef  Passport  FirstName  MiddleName  Surname  Salutation
----  --  ---------  --------  ---------  ----------  -------  ----------
18    2   AB-002               Pete                   Peters
18    9   AB-006               George                 Alex     Mr Alex
18    13  AB-009               Peter      David       Alex     Mr Alex
14    3   AB-003               Peter      Alex        Jones
As the results above show, the last row appears with a rank of only 14 even though it has what I consider a good match on both 'Peter' and 'Alex', whereas the first row, which ranks 18, has only a single match on 'Peter' (admittedly the surname is 'Peters').
This is a contrived example, but goes some way to illustrate my frustrations and lack of knowledge.
I have spent quite a bit of time researching, but I am feeling a bit out of my depth now. I'm sure that I'm doing something stupid such as searching across multiple columns.
I welcome your help and support. Thanks in advance.
Thanks,
Kaine
(BTW I am using SQL Server 2012)
Here is the SQL you can use to repeat the test yourself:
-- Create the Contacts table.
CREATE TABLE dbo.Contacts
(
ID int NOT NULL PRIMARY KEY,
FirstName varchar(55) NULL,
MiddleName varchar(55) NULL,
Surname varchar(55) NOT NULL,
Salutation varchar(55) NULL,
Passport varchar(55) NULL
);
GO
-- Create the Members table.
CREATE TABLE dbo.Members
(
ContactsID int NOT NULL PRIMARY KEY,
MemberRef varchar(55) NOT NULL
);
GO
-- Create the FTS view.
CREATE VIEW dbo.vMembersFTS WITH SCHEMABINDING AS
SELECT c.ID,
m.MemberRef,
ISNULL(c.Passport, '') AS Passport,
ISNULL(c.FirstName, '') AS FirstName,
ISNULL(c.MiddleName, '') AS MiddleName,
c.Surname,
ISNULL(c.Salutation, '') AS Salutation
FROM dbo.Contacts c
INNER JOIN dbo.Members AS m ON m.ContactsID = c.ID
GO
-- Create the view index for FTS.
CREATE UNIQUE CLUSTERED INDEX IX_vMembersFTS_ID ON dbo.vMembersFTS (ID);
GO
-- Create the FTS catalogue and stop-list.
CREATE FULLTEXT CATALOG ContactsFTSCatalog WITH ACCENT_SENSITIVITY = OFF;
CREATE FULLTEXT STOPLIST ContactsSL FROM SYSTEM STOPLIST;
GO
-- Create the member full-text index.
CREATE FULLTEXT INDEX ON dbo.vMembersFTS
(Surname, Firstname, MiddleName, Salutation, MemberRef, Passport)
KEY INDEX IX_vMembersFTS_ID
ON ContactsFTSCatalog
WITH STOPLIST = ContactsSL;
GO
-- Insert some data.
INSERT INTO Contacts VALUES (1, 'John', NULL, 'Smith', NULL, NULL);
INSERT INTO Contacts VALUES (2, 'Pete', NULL, 'Peters', NULL, NULL);
INSERT INTO Contacts VALUES (3, 'Peter', 'Alex', 'Jones', NULL, NULL);
INSERT INTO Contacts VALUES (4, 'Philip', NULL, 'Smith', NULL, NULL);
INSERT INTO Contacts VALUES (5, 'Harry', NULL, 'Dukes', NULL, NULL);
INSERT INTO Contacts VALUES (6, 'Joe', NULL, 'Jones', NULL, NULL);
INSERT INTO Contacts VALUES (7, 'Alex', NULL, 'Phillips', 'Mr Phillips', NULL);
INSERT INTO Contacts VALUES (8, 'Alexander', NULL, 'Paul', 'Alex', NULL);
INSERT INTO Contacts VALUES (9, 'George', NULL, 'Alex', 'Mr Alex', NULL);
INSERT INTO Contacts VALUES (10, 'James', NULL, 'Castle', NULL, NULL);
INSERT INTO Contacts VALUES (11, 'John', NULL, 'Alexander', NULL, NULL);
INSERT INTO Contacts VALUES (12, 'Robert', NULL, 'James', 'Mr James', NULL);
INSERT INTO Contacts VALUES (13, 'Peter', 'David', 'Alex', 'Mr Alex', NULL);
INSERT INTO Members VALUES (1, 'AB-001');
INSERT INTO Members VALUES (2, 'AB-002');
INSERT INTO Members VALUES (3, 'AB-003');
INSERT INTO Members VALUES (5, 'AB-004');
INSERT INTO Members VALUES (8, 'AB-005');
INSERT INTO Members VALUES (9, 'AB-006');
INSERT INTO Members VALUES (11, 'AB-007');
INSERT INTO Members VALUES (12, 'AB-008');
INSERT INTO Members VALUES (13, 'AB-009');
-- Run the FTS query.
DECLARE @SearchTerm varchar(55) = 'Peter Alex'
SELECT ftt.[RANK], v.*
FROM FREETEXTTABLE (vMembersFTS, (Surname, FirstName, MiddleName, MemberRef, Passport), @SearchTerm) ftt
INNER JOIN vMembersFTS v ON v.ID = ftt.[KEY]
ORDER BY ftt.[RANK] DESC;
The rank is assigned based on the order of the columns in your query:
DECLARE @SearchTerm varchar(55) = 'Peter Alex'
SELECT ftt.[RANK], v.*
FROM FREETEXTTABLE (vMembersFTS, (Surname, FirstName, MiddleName, MemberRef, Passport), @SearchTerm) ftt
INNER JOIN vMembersFTS v ON v.ID = ftt.[KEY]
ORDER BY ftt.[RANK] DESC;
So in your case, a match on SurName trumps FirstName, and both trump MiddleName.
Your top 3 results have a rank of 18 as all three match on Surname. The last record has a rank of 14 for matching on FirstName and MiddleName but not SurName.
You can find details on the rank calculations here: https://technet.microsoft.com/en-us/library/ms142524(v=sql.105).aspx
If you want to allocate equal weight to these you can, but you'd have to use CONTAINSTABLE and not FREETEXTTABLE.
Info can be found here: https://technet.microsoft.com/en-us/library/ms189760(v=sql.105).aspx

sql query to join two tables and a boolean flag to indicate whether it contains any words from third table

I have 3 tables with the following schema:
create table main (
main_id int PRIMARY KEY,
secondary_id int NOT NULL
);
create table secondary (
secondary_id int NOT NULL,
tags varchar(100)
);
create table bad_words (
words varchar(100) NOT NULL
);
insert into main values (1, 1001);
insert into main values (2, 1002);
insert into main values (3, 1003);
insert into main values (4, 1004);
insert into secondary values (1001, 'good word');
insert into secondary values (1002, 'bad word');
insert into secondary values (1002, 'good word');
insert into secondary values (1002, 'other word');
insert into secondary values (1003, 'ugly');
insert into secondary values (1003, 'bad word');
insert into secondary values (1004, 'pleasant');
insert into secondary values (1004, 'nice');
insert into bad_words values ('bad word');
insert into bad_words values ('ugly');
insert into bad_words values ('worst');
expected output
----------------
1, 1001, good word, 0
2, 1002, bad word,good word,other word, 1
3, 1003, ugly,bad word, 1
4, 1004, pleasant,nice, 0
(the last column is a boolean flag indicating whether the tags contain any of the words from the bad_words table)
I am trying to use CASE to select 1 or 0 for the last column and a join to combine the main and secondary tables, but I am getting confused and stuck. Can someone please help me with a query? These tables are stored in Redshift, so I want a query that is compatible with Redshift.
You can use the above schema to try your query in SQL Fiddle.
EDIT: I have updated the schema and expected output by removing the PRIMARY KEY from the secondary table so that it is easier to join with the bad_words table.
You can use EXISTS and a regex comparison with \m and \M (markers for beginning and end of a word, respectively):
with
main(main_id, secondary_id) as (values (1, 1000), (2, 1001), (3, 1002), (4, 1003)),
secondary(secondary_id, tags) as (values (1000, 'very good words'), (1001, 'good and bad words'), (1002, 'ugly'),(1003, 'pleasant')),
bad_words(words) as (values ('bad'), ('ugly'), ('worst'))
select *, exists (select 1 from bad_words where s.tags ~* ('\m'||words||'\M'))::int as flag
from main m
join secondary s using (secondary_id)
select main_id, a.secondary_id, tags, case when c.words is not null then 1 else 0 end
from main a
join secondary b on b.secondary_id = a.secondary_id
left outer join bad_words c on c.words like b.tags
SELECT m.main_id, m.secondary_id, t.tags, t.is_bad_word
FROM srini.main m
JOIN (
SELECT st.secondary_id, st.tags, exists (select 1 from srini.bad_words b where st.tags like '%'+b.words+'%') is_bad_word
FROM
( SELECT secondary_id, LISTAGG(tags, ',') as tags
FROM srini.secondary
GROUP BY secondary_id ) st
) t on t.secondary_id = m.secondary_id;
This worked for me in Redshift and produced the following output with the schema above.
1  1001  good word                       false
3  1003  ugly,bad word                   true
2  1002  good word,other word,bad word   true
4  1004  pleasant,nice                   false
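For anyone who wants to try the aggregate-then-check approach without a Redshift cluster, here is a rough local equivalent run through Python's sqlite3 module. It is a sketch under assumptions: SQLite's GROUP_CONCAT stands in for LISTAGG, || replaces + for concatenation, the srini schema prefix is dropped, and the helper names build_db and flag_bad_words are made up for illustration. The data is the schema from the question.

```python
import sqlite3

# Concatenate the tags per secondary_id, then flag rows whose combined
# tags contain any entry of bad_words as a substring.
FLAG_SQL = """
SELECT m.main_id, m.secondary_id, t.tags,
       EXISTS (SELECT 1 FROM bad_words b
               WHERE t.tags LIKE '%' || b.words || '%') AS is_bad_word
FROM main m
JOIN (SELECT secondary_id, GROUP_CONCAT(tags, ',') AS tags
      FROM secondary
      GROUP BY secondary_id) t
  ON t.secondary_id = m.secondary_id
ORDER BY m.main_id
"""


def build_db():
    """Create an in-memory database holding the sample data from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE main (main_id INTEGER PRIMARY KEY, secondary_id INTEGER NOT NULL)")
    conn.execute("CREATE TABLE secondary (secondary_id INTEGER NOT NULL, tags TEXT)")
    conn.execute("CREATE TABLE bad_words (words TEXT NOT NULL)")
    conn.executemany("INSERT INTO main VALUES (?, ?)",
                     [(1, 1001), (2, 1002), (3, 1003), (4, 1004)])
    conn.executemany("INSERT INTO secondary VALUES (?, ?)", [
        (1001, "good word"),
        (1002, "bad word"), (1002, "good word"), (1002, "other word"),
        (1003, "ugly"), (1003, "bad word"),
        (1004, "pleasant"), (1004, "nice"),
    ])
    conn.executemany("INSERT INTO bad_words VALUES (?)",
                     [("bad word",), ("ugly",), ("worst",)])
    return conn


def flag_bad_words(conn):
    """Return (main_id, secondary_id, tags, is_bad_word) rows ordered by main_id."""
    return conn.execute(FLAG_SQL).fetchall()


for row in flag_bad_words(build_db()):
    print(row)
```

Note that this is a substring test, so a bad word that happens to appear inside a longer tag would also be flagged; the regex answer with word-boundary markers avoids that. The order of the tags inside the concatenated string is not guaranteed in SQLite.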

convert marks into percentage

How do I convert the marks obtained by a student into a percentage?
That is, there are two exams. I want to take a certain percentage of the marks from each exam (say x% and y%) so that the total adds up to 100%.
Based on the limited info that you have provided, I think you might be asking for the following:
create table student
(
id int,
s_name varchar(10)
)
insert into student values (1, 'Jim')
insert into student values (2, 'Bob')
insert into student values (3, 'Jane')
create table exams
(
id int,
e_name varchar(10)
)
insert into exams values (1, 'Test 1')
insert into exams values (2, 'Test 2')
insert into exams values (3, 'Test 3')
insert into exams values (4, 'Test 4')
create table exam_student
(
e_id int,
s_id int,
dt datetime,
score decimal(5,2)
)
insert into exam_student values(1, 1, '2012-08-01', 65.0)
insert into exam_student values(1, 2, '2012-08-01', 85.0)
insert into exam_student values(2, 1, '2012-08-02', 75.0)
insert into exam_student values(2, 2, '2012-08-02', 42.0)
select avg(es.score) as ScorePct, s_id, s.s_name
from exam_student es
inner join exams e
on es.e_id = e.id
inner join student s
on es.s_id = s.id
group by s_id, s_name
Results:
If you provide more details on exactly what you are looking for that would be helpful in answering your question.
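If what you are after is a weighted total rather than a plain average, where exam 1 contributes x% and exam 2 contributes y% with x + y = 100, the arithmetic could look like the sketch below. This is a guess at the intent: the 40/60 weights, the assumption that each exam is marked out of 100, and the function name weighted_percentage are all made up for illustration. The scores are the ones from the exam_student rows above.

```python
def weighted_percentage(marks, max_marks, weights):
    """Combine per-exam marks into a single percentage.

    marks, max_marks and weights are dicts keyed by exam id;
    the weights are percentages and must add up to 100.
    """
    if sum(weights.values()) != 100:
        raise ValueError("weights must add up to 100")
    total = sum(weights[exam] * marks[exam] / max_marks[exam] for exam in weights)
    return round(total, 2)


# Hypothetical weighting: exam 1 counts for 40%, exam 2 for 60%.
WEIGHTS = {1: 40, 2: 60}
# Assumption: both exams are marked out of 100.
MAX_MARKS = {1: 100, 2: 100}

jim = weighted_percentage({1: 65.0, 2: 75.0}, MAX_MARKS, WEIGHTS)
bob = weighted_percentage({1: 85.0, 2: 42.0}, MAX_MARKS, WEIGHTS)
print(jim, bob)
```

Each exam's marks are first scaled by that exam's maximum and then by its weight, so the result stays on a 0 to 100 scale even if the two exams have different maximum marks.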

Generating combinations in SQL Server

I have a table that contains groups ('G1', 'G2', etc.) and a table that contains persons ('P1', 'P2', etc.) with an m:m relationship between them, so one user can belong to several groups, and one group consists of several users.
I have a rule that is satisfied only if a certain number of members of each group is present (i.e. at least 2 members of G1 and at least 1 member of G2 must be present), and I have a list of users that are present. One person cannot fulfil more than one requirement, so if P1 and P2 are members of both G1 and G2, the rule still needs a third person, who can be a member of either G1 or G2.
Any ideas how can this be done in SQL Server?
Creation scripts:
create table Groups (GroupID int, Name nvarchar(100))
insert into Groups values (1, 'First')
insert into Groups values (2, 'Second')
insert into Groups values (3, 'Third')
create table Persons (PersonID int, Name nvarchar(100))
insert into Persons values (1, 'One')
insert into Persons values (2, 'Two')
insert into Persons values (3, 'Three')
insert into Persons values (4, 'Four')
insert into Persons values (5, 'Five')
insert into Persons values (6, 'Six')
create table PersonGroups (PersonID int, GroupID int)
-- p1 and p2 are members of g1
insert into PersonGroups values (1, 1)
insert into PersonGroups values (2, 1)
-- p2, p3 and p4 are members of g2
insert into PersonGroups values (2, 2)
insert into PersonGroups values (3, 2)
insert into PersonGroups values (4, 2)
-- p2, p4, p5 and p6 are members of g3
insert into PersonGroups values (2, 3)
insert into PersonGroups values (4, 3)
insert into PersonGroups values (5, 3)
insert into PersonGroups values (6, 3)
So, if a rule needs one person from each group to be present, then (1,3,5), (1,2,3) and (2,3,4) would be valid, and (3,5,6) would not be valid.
Create header table for rules
create table #ruleset (Id int, name varchar(100))
insert into #ruleset
select 1,'At least 1 person from each group'
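Create a child table that holds, for each rule, one entry per group.
-- drop the table only if it already exists, so the script also runs the first time
if object_id('tempdb..#ruleset_Grouprules') is not null drop table #ruleset_Grouprules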
create table #ruleset_Grouprules(Id int identity(1,1), RuleId int,
GroupID int, MinUsers int, MaxUsers int)
insert into #ruleset_Grouprules (RuleId, groupId, MinUsers, MaxUsers)
select 1,1,1,null
union all
select 1,2,1,null
union all
select 1,3,1,null
You can use NULL in the MinUsers column to represent no minimum amount
You can use NULL in the MaxUsers column to represent no maximum amount
This query will show you whether the group rules have passed or not.
select r.id, r.Name, gr.GroupId,
case when x.GroupQty>=isnull(gr.MinUsers, x.GroupQty)
and x.GroupQty<=isnull(gr.MaxUsers, x.GroupQty)
then 1 else 0 end as GroupValid
from #ruleset r
join #ruleset_Grouprules gr on gr.RuleId=r.Id
join (
select g.groupID, count(*) GroupQty
from Groups g
join PersonGroups pg on pg.GroupID=g.GroupID
join Persons p on p.PersonID=pg.PersonID
group by g.GroupID
)x on x.GroupID=gr.GroupID
You can then aggregate on this query to compare sum(GroupValid)=count(*) with a group by r.id to check if the entire Rule is valid. I left it like that so you can see the working data.
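Note that counting members per group does not by itself enforce the requirement that one person cannot fulfil more than one slot. That part is an assignment problem: each required slot must be matched to a different present person. Below is a minimal sketch of that check in Python using augmenting paths (bipartite matching); it is an illustration of the idea, not a T-SQL solution, and the names rule_satisfied, MEMBERSHIP and ONE_FROM_EACH are made up. The membership data mirrors the PersonGroups rows from the question.

```python
def rule_satisfied(present, membership, required):
    """Check whether the present persons can fill every required slot,
    with each person filling at most one slot.

    present    -- iterable of person ids that are present
    membership -- dict: person id -> set of group ids the person belongs to
    required   -- dict: group id -> minimum number of members needed
    """
    present = list(present)
    # Expand "group G needs n members" into n separate slots for G.
    slots = [group for group, count in required.items() for _ in range(count)]
    assigned = {}  # person id -> index of the slot that person currently fills

    def try_fill(slot, seen):
        group = slots[slot]
        for person in present:
            if group in membership.get(person, set()) and person not in seen:
                seen.add(person)
                # Take a free person, or move an already assigned person's
                # old slot over to somebody else.
                if person not in assigned or try_fill(assigned[person], seen):
                    assigned[person] = slot
                    return True
        return False

    return all(try_fill(index, set()) for index in range(len(slots)))


# Membership taken from the PersonGroups inserts in the question.
MEMBERSHIP = {1: {1}, 2: {1, 2, 3}, 3: {2}, 4: {2, 3}, 5: {3}, 6: {3}}
ONE_FROM_EACH = {1: 1, 2: 1, 3: 1}

for combo in [(1, 3, 5), (1, 2, 3), (2, 3, 4), (3, 5, 6)]:
    print(combo, rule_satisfied(combo, MEMBERSHIP, ONE_FROM_EACH))
```

The four combinations are the ones listed in the question; the first three come out as satisfied and (3, 5, 6) does not, because nobody in it belongs to group 1.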