SQL aliasing with FROM AS - sql

SELECT A.barName AS BarName1, B.barName AS BarName2
FROM (
SELECT Sells.barName, COUNT(barName) AS count
FROM Sells
GROUP BY barName
) AS A, B
WHERE A.count = B.count
I'm trying to do a self join on this table that I created, but I'm not sure how to alias the table twice in this format (i.e. FROM AS). Unfortunately, this is a school assignment where I can't create any new tables. Anyone have experience with this syntax?
edit: For clarification I'm using PostgreSQL 8.4. The schema for the tables I'm dealing with are as follows:
Drinkers(name, addr, hobby, frequent)
Bars(name, addr, owner)
Beers(name, brewer, alcohol)
Drinks(drinkerName, drinkerAddr, beerName, rating)
Sells(barName, beerName, price, discount)
Favorites(drinkerName, drinkerAddr, barName, beerName, season)
Again, this is for a school assignment, so I'm given read-only access to the above tables.
What I'm trying to find is pairs of bars (Name1, Name2) that sell the same set of drinks. My thinking in doing the above was to try and find pairs of bars that sell the same number of drinks, then list the names and drinks side by side (BarName1, Drink1, BarName2, Drink2) to try and compare if they are indeed the same set.

You have not mentioned what RDBMS you use.
If Oracle or MS SQL, you can do something like this (I use my sample data table, but you can try it with your tables):
create table some_data (
parent_id int,
id int,
name varchar(10)
);
insert into some_data values(1, 2, 'val1');
insert into some_data values(2, 3, 'val2');
insert into some_data values(3, 4, 'val3');
with data as (
select * from some_data
)
select *
from data d1
left join data d2 on d1.parent_id = d2.id
In your case this query
SELECT Sells.barName, COUNT(barName) AS count
FROM Sells
GROUP BY barName
should be placed in WITH section and referenced from main query 2 times as A and B.

It is slightly unclear what you are trying to achive. Are you looking for a list bar names, with how many times they appear in the table? If so, there are a couple ways you could do this. Firstly:
SELECT SellsA.barName AS BarName1, SellsB.count AS Count
FROM
(
SELECT DISTINCT barName
FROM Sells
) SellsA
LEFT JOIN
(
SELECT Sells.barName, COUNT(barName) AS count
FROM Sells
GROUP BY barName
) AS SellsB
ON SellsA.barName = SellsB.barName
Secondly, if you are using MSSQL:
SELECT barNamr, MAX(rn) AS Count
FROM
(
SELECT barName,
ROW_NUMBER() OVER(ORDR BY barName PARTITION BY barName) as rn
FROM Sells
) CountSells
GROUP BY barName
Thirdly, you could avoid a self-join in MSSQL, by using OVER():
SELECT
barName
COUNT(*) OVER(ORDER BY barName PARTITION BY barName) AS Count
FROM Sells

Related

Generating Lines based on a value from a column in another table

I have the following table:
EventID=00002,DocumentID=0005,EventDesc=ItemsReceived
I have the quantity in another table
DocumentID=0005,Qty=20
I want to generate a result of 20 lines (depending on the quantity) with an auto generated column which will have a sequence of:
ITEM_TAG_001,
ITEM_TAG_002,
ITEM_TAG_003,
ITEM_TAG_004,
..
ITEM_TAG_020
Here's your sql query.
with cte as (
select 1 as ctr, t2.Qty, t1.EventID, t1.DocumentId, t1.EventDesc from tableA t1
inner join tableB t2 on t2.DocumentId = t1.DocumentId
union all
select ctr + 1, Qty, EventID, DocumentId, EventDesc from cte
where ctr <= Qty
)select *, concat('ITEM_TAG_', right('000'+ cast(ctr AS varchar(3)),3)) from cte
option (maxrecursion 0);
Output:
Best is to introduce a numbers table, very handsome in many places...
Something along:
Create some test data:
DECLARE #MockNumbers TABLE(Number BIGINT);
DECLARE #YourTable1 TABLE(DocumentID INT,ItemTag VARCHAR(100),SomeText VARCHAR(100));
DECLARE #YourTable2 TABLE(DocumentID INT, Qty INT);
INSERT INTO #MockNumbers SELECT TOP 100 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values;
INSERT INTO #YourTable1 VALUES(1,'FirstItem','qty 5'),(2,'SecondItem','qty 7');
INSERT INTO #YourTable2 VALUES(1,5), (2,7);
--The query
SELECT CONCAT(t1.ItemTag,'_',REPLACE(STR(A.Number,3),' ','0'))
FROM #YourTable1 t1
INNER JOIN #YourTable2 t2 ON t1.DocumentID=t2.DocumentID
CROSS APPLY(SELECT Number FROM #MockNumbers WHERE Number BETWEEN 1 AND t2.Qty) A;
The result
FirstItem_001
FirstItem_002
[...]
FirstItem_005
SecondItem_001
SecondItem_002
[...]
SecondItem_007
The idea in short:
We use an INNER JOIN to get the quantity joined to the item.
Now we use APPLY, which is a row-wise action, to bind as many rows to the set, as we need it.
The first item will return with 5 lines, the second with 7. And the trick with STR() and REPLACE() is one way to create a padded number. You might use FORMAT() (v2012+), but this is working rather slowly...
The table #MockNumbers is a declared table variable containing a list of numbers from 1 to 100. This answer provides an example how to create a pyhsical numbers and date table. Any database should have such a table...
If you don't want to create a numbers table, you can search for a tally table or tally on the fly. There are many answers showing approaches how to create a list of running numbers...a

Firebird select from table distinct one field

The question I asked yesterday was simplified but I realize that I have to report the whole story.
I have to extract the data of 4 from 4 different tables into a Firebird 2.5 database and the following query works:
SELECT
PRODUZIONE_T t.CODPRODUZIONE,
PRODUZIONE_T.NUMEROCOMMESSA as numeroco,
ANGCLIENTIFORNITORI.RAGIONESOCIALE1,
PRODUZIONE_T.DATACONSEGNA,
PRODUZIONE_T.REVISIONE,
ANGUTENTI.NOMINATIVO,
ORDINI.T_DATA,
FROM PRODUZIONE_T
LEFT OUTER JOIN ORDINI_T ON PRODUZIONE_T.CODORDINE=ORDINI_T.CODORDINE
INNER JOIN ANGCLIENTIFORNITORI ON ANGCLIENTIFORNITORI.CODCLIFOR=ORDINI_T.CODCLIFOR
LEFT OUTER JOIN ANGUTENTI ON ANGUTENTI.IDUTENTE = PRODUZIONE_T.RESPONSABILEUC
ORDER BY right(numeroco,2) DESC, left(numeroco,3) desc
rows 1 to 500;
However the query returns me double (or more) due to the REVISIONE column.
How do I select only the rows of a single NUMEROCOMMESSA with the maximum REVISIONE value?
This should work:
select COD, ORDER, S.DATE, REVISION
FROM TAB1
JOIN
(
select ORDER, MAX(REVISION) as REVISION
FROM TAB1
Group By ORDER
) m on m.ORDER = TAB1.ORDER and m.REVISION = TAB1.REVISION
Here you go - http://sqlfiddle.com/#!6/ce7cf/4
Sample Data (as u set it in your original question):
create table TAB1 (
cod integer primary key,
n_order varchar(10) not null,
s_date date not null,
revision integer not null );
alter table tab1 add constraint UQ1 unique (n_order,revision);
insert into TAB1 values ( 1, '001/18', '2018-02-01', 0 );
insert into TAB1 values ( 2, '002/18', '2018-01-31', 0 );
insert into TAB1 values ( 3, '002/18', '2018-01-30', 1 );
The query:
select *
from tab1 d
join ( select n_ORDER, MAX(REVISION) as REVISION
FROM TAB1
Group By n_ORDER ) m
on m.n_ORDER = d.n_ORDER and m.REVISION = d.REVISION
Suggestions:
Google and read the classic book: "Understanding SQL" by Martin Gruber
Read Firebird SQL reference: https://www.firebirdsql.org/file/documentation/reference_manuals/fblangref25-en/html/fblangref25.html
Here is yet one more solution using Windowed Functions introduced in Firebird 3 - http://sqlfiddle.com/#!6/ce7cf/13
I do not have Firebird 3 at hand, so can not actually check if there would not be some sudden incompatibility, do it at home :-D
SELECT * FROM
(
SELECT
TAB1.*,
ROW_NUMBER() OVER (
PARTITION BY n_order
ORDER BY revision DESC
) AS rank
FROM TAB1
) d
WHERE rank = 1
Read documentation
https://community.modeanalytics.com/sql/tutorial/sql-window-functions/
https://www.firebirdsql.org/file/documentation/release_notes/html/en/3_0/rnfb30-dml-windowfuncs.html
Which of the three (including Gordon's one) solution would be faster depends upon specific database - the real data, the existing indexes, the selectivity of indexes.
While window functions can make the join-less query, I am not sure it would be faster on real data, as it maybe can just ignore indexes on order+revision cortege and do the full-scan instead, before rank=1 condition applied. While the first solution would most probably use indexes to get maximums without actually reading every row in the table.
The Firebird-support mailing list suggested a way to break out of the loop, to only use a single query: The trick is using both windows functions and CTE (common table expression): http://sqlfiddle.com/#!18/ce7cf/2
WITH TMP AS (
SELECT
*,
MAX(revision) OVER (
PARTITION BY n_order
) as max_REV
FROM TAB1
)
SELECT * FROM TMP
WHERE revision = max_REV
If you want the max revision number in Firebird:
select t.*
from tab1 t
where t.revision = (select max(t2.revision) from tab1 t2 where t2.order = t.order);
For performance, you want an index on tab1(order, revision). With such an index, performance should be competitive with any other approach.

Get distinct individual column values (not distinct pairs) from two tables in single query

I have two tables like the following. One is for sport talents of some people and second for arts talents. One may not have a sport talent to list and same applies for art talent.
CREATE TABLE SPORT_TALENT(name varchar(10), TALENT varchar(10));
CREATE TABLE ART_TALENT(name varchar(10), TALENT varchar(10));
INSERT INTO SPORT_TALENT(name, TALENT) VALUES
('Steve', 'Footbal')
,('Steve', 'Golf')
,('Bob' , 'Golf')
,('Mary' , 'Tennnis');
INSERT INTO ART_TALENT(name, TALENT) VALUES
('Steve', 'Dancer')
, ('Steve', 'Singer')
, ('Bob' , 'Dancer')
, ('Bob' , 'Singer')
, ('John' , 'Dancer');
Now I want to list down sport talent and art talent of one person. I would like to avoid duplication. But I don't mind if there is a "null" in any output. I tried the following
select distinct sport_talent.talent as s_talent,art_talent.talent as a_talent
from sport_talent
JOIN art_talent on sport_talent.name=art_talent.name
where (sport_talent.name='Steve' or art_talent.name='Steve');
s_talent | a_talent
----------+----------
Footbal | Dancer
Golf | Singer
Footbal | Singer
Golf | Dancer
I would like to avoid redundancy and need something like the following (distinct values of sport talents + distinct values of art talents).
s_talent | a_talent
----------+----------
Footbal | Dancer
Golf | Singer
As mentioned in subject, I am not looking for distinct combinations. But at the same time, it's OK if there are some records with "null" value in one column. I am relatively new to SQL.
Try:
SELECT s_talent, a_talent
FROM (
SELECT distinct on (talent) talent as s_talent,
dense_rank() over (order by talent) as x
FROM SPORT_TALENT
WHERE name='Steve'
) x
FULL OUTER JOIN (
SELECT distinct on (talent) talent as a_talent,
dense_rank() over (order by talent) as x
FROM ART_TALENT
WHERE name='Steve'
) y
ON x.x = y.x
Demo: http://sqlfiddle.com/#!15/66e04/3
There are no duplicates in your query. Each of the four records in your query return is unique. This result may not be what you want, but seems like its problem is not the duplicate.
Postgres 9.4
... introduces unnest() with multiple arguments. Does exactly what you want, and should be fast, too. Per documentation:
The special table function UNNEST may be called with any number of
array parameters, and it returns a corresponding number of columns, as
if UNNEST (Section 9.18) had been called on each parameter separately
and combined using the ROWS FROM construct.
About ROWS FROM:
Compare result of two table functions using one column from each
SELECT *
FROM unnest(
ARRAY(SELECT DISTINCT talent FROM sport_talent WHERE name = 'Steve')
, ARRAY(SELECT DISTINCT talent FROM art_talent WHERE name = 'Steve')
) AS t(s_talent, a_talent);
Postgres 9.3 or older
SELECT s_talent, a_talent
FROM (
SELECT talent AS s_talent, row_number() OVER () AS rn
FROM sport_talent
WHERE name = 'Steve'
GROUP BY 1
) s
FULL JOIN (
SELECT talent AS a_talent, row_number() OVER () AS rn
FROM art_talent
WHERE name = 'Steve'
GROUP BY 1
) a USING (rn);
Similar previous answers with more explanation:
What type of JOIN to use
Sort columns independently, such that all nulls are last per column
This is similar to what #kordirko posted, but uses GROUP BY to get distinct talents, which is evaluated before window functions. So we only need a bare row_number() and not the more expensive dense_rank().
About the sequence of events in a SELECT query:
Best way to get result count before LIMIT was applied
SQL Fiddle.

Matching a set of child records between two similar table hierarchies

I have two similar table hierarchies:
Owner -> OwnerGroup -> Parent
and
Owner2 -> OwnerGroup2
I would like to determine if there is an exact match of Owners that exists in Owner2 based on a set of values. There are approximately a million rows in each Owner table. Some OwnerGroups contain up to 100 Owners.
So basically if there is an OwnerGroup than contains Owners "Smith", "John" and "Smith, "Jane", I want to know the id of the OwnerGroup2s that are exact matches.
The first attempt at this was to generate a join per Owner (which required dynamic sql being generated in the application:
select og.id
from owner_group2 og
-- dynamic bit starts here
join owner2 o1 on
(og.id = o1.og_id) AND
(o1.given_names = 'JOHN' and o1.surname='SMITH')
-- dynamic bit ends here
join owner2 o2 on
(og.id = o2.og_id) AND
(o2.given_names = 'JANE' and o2.surname='SMITH');
This works fine until for small numbers of owners, but when we have to deal with the 100 Owners in a group scenario as this query plan means there 100 nested loops and it takes almost a minute to run.
Another option I had was to use something around the intersect operator. E.g.
select * from (
select o.surname, o.given_names
from owner1 o1
join owner_group1 og1 on o1.og_id = og1.id
where
og1.parent_id = 1936233
)
intersect
select o.surname, o.given_names
from owner2 o2
join owner_group2 og2 on og2.id = o2.og_id;
I'm not sure how to suck out the owner2.id in this scenario either - and it was still running in the 4-5 second range.
I feel like I am missing something obvious - so please feel free to provide some better solutions!
You're on the right track with intersect, you just need to go a bit further. You need to join the results of it back to the owner_groups2 table to find the ids.
You can use the listagg function to convert the groups into comma-separated lists of the names (note - requires 11g). You can then take the intersection of these name lists to find the matches and join this back to the list in owner_groups2.
I've created a simplified example below, in it "Dave, Jill" is the group that is present in both tables.
create table grps (id integer, name varchar2(100));
create table grps2 (id integer, name varchar2(100));
insert into grps values (1, 'Dave');
insert into grps values(1, 'Jill');
insert into grps values (2, 'Barry');
insert into grps values(2, 'Jane');
insert into grps2 values(3, 'Dave');
insert into grps2 values(3, 'Jill');
insert into grps2 values(4, 'Barry');
with grp1 as (
SELECT id, listagg(name, ',') within group (order by name) n
FROM grps
group by id
), grp2 as (
SELECT id, listagg(name, ',') within group (order by name) n
FROM grps2
group by id
)
SELECT * FROM grp2
where n in (
-- find the duplicates
select n from grp1
intersect
select n from grp2
);
Note this will still require a full scan of owner_groups2; I can't think of a way you can avoid this. So your query is likely to remain slow.

Database design - Multiple 1 to many relationships

What would be the best way to model 1 table with multiple 1 to many relatiionships.
With the above schema if Report contains 1 row, Grant 2 rows and Donation 12. When I join the three together I end up with a Cartesian product and result set of 24. Report joins to Grant and creates 2 rows, then Donation joins on that to make 24 rows.
Is there a better way to model this to avoid the caresian product?
example code
DECLARE #Report
TABLE (
ReportID INT,
Name VARCHAR(50)
)
INSERT
INTO #Report
(
ReportID,
Name
)
SELECT 1,'Report1'
DECLARE #Grant
TABLE (
GrantID INT IDENTITY(1,1) PRIMARY KEY(GrantID),
GrantMaker VARCHAR(50),
Amount DECIMAL(10,2),
ReportID INT
)
INSERT
INTO #Grant
(
GrantMaker,
Amount,
ReportID
)
SELECT 'Grantmaker1',10,1
UNION ALL
SELECT 'Grantmaker2',999,1
DECLARE #Donation
TABLE (
DonationID INT IDENTITY(1,1) PRIMARY KEY(DonationID),
DonationMaker VARCHAR(50),
Amount DECIMAL(10,2),
ReportID INT
)
INSERT
INTO #Donation
(
DonationMaker,
Amount,
ReportID
)
SELECT 'Grantmaker1',10,1
UNION ALL
SELECT 'Grantmaker2',3434,1
UNION ALL
SELECT 'Grantmaker3',45645,1
UNION ALL
SELECT 'Grantmaker4',3,1
UNION ALL
SELECT 'Grantmaker5',34,1
UNION ALL
SELECT 'Grantmaker6',23,1
UNION ALL
SELECT 'Grantmaker7',67,1
UNION ALL
SELECT 'Grantmaker8',78,1
UNION ALL
SELECT 'Grantmaker9',98,1
UNION ALL
SELECT 'Grantmaker10',43,1
UNION ALL
SELECT 'Grantmaker11',107,1
UNION ALL
SELECT 'Grantmaker12',111,1
SELECT *
FROM #Report r
INNER JOIN
#Grant g
ON r.ReportID = g.ReportID
INNER JOIN
#Donation d
ON r.ReportID = d.ReportID
Update 1 2011-03-07 15:20
Cheers for the feedback so far, to add to this scenario there are also 15 other 1 to many relationships coming from the one report table. These tables can't for various business reasons be grouped together.
Is there any relationship at all between Grants and Donations? If there isn't, does it make sense to pull back a query that shows a pseudo relationship between them?
I'd do one query for grants:
SELECT r.*, g.*
FROM #Report r
JOIN #Grant g ON r.ReportID = g.ReportID
And another for donations:
SELECT r.*, d.*
FROM #Report r
JOIN #Donation d ON r.ReportID = d.ReportID
Then let your application show the appropriate data.
However, if Grants and Donations are similar, then just make a more generic table such as Contributions.
Contributions
-------------
ContributionID (PK)
Maker
Amount
Type
ReportID (FK)
Now your query is:
SELECT r.*, c.*
FROM #Report r
JOIN #Contribution c ON r.ReportID = c.ReportID
WHERE c.Type = 'Grant' -- or Donation, depending on the application
If you're going to join on ReportID, then no, you can't avoid a lot of rows. When you omit the table "Report", and just join "Donation" to "Grant" on ReportId, you still get 24 rows.
SELECT *
FROM Grant g
INNER JOIN
Donation d
ON g.ReportID = d.ReportID
But the essential point is that it doesn't make sense in the real world to match up donations and grants. They're completely independent things that essentially have nothing to do with each other.
In the database, the statement immediately above will join each row in Grants to every matching row in Donation. The resulting 24 rows really shouldn't surprise you.
When you need to present independent things to the user, you should use a report writer or web application (for example) that selects the independent things, well, independently. Select donations and put them into one section of a report or web page, then select grants and put them into another section of the report or web page, and so on.
If the table "Report" is supposed to help you record which sections go into a particular report, then you need a structure more like this:
create table reports (
reportid integer primary key,
report_name varchar(35) not null unique
);
create table report_sections (
reportid integer not null references reports (reportid),
section_name varchar(35), -- Might want to reference a table of section names
section_order integer not null,
primary key (reportid, section_name)
);
The donation and grant tables look almost identical. You could make them one table and add a column that is something like DonationType. Would reduce complexity by 1 table. Now if donations and grants are completely different and have different subtables associated with them then keeping them seperate and only joining on one at a time would be ideal.