SQL function to sort by most popular content - sql

I don't know if this is possible with SQL:
I have two tables, one of content, each with an integer ID, and a table of comments each with an "On" field denoting the content it is on. I'd like to receive the content in order of how many comments have it in their "On" field, and was hoping SQL could do it.

-- ON is a reserved word, so the column name has to be quoted (backticks in MySQL)
SELECT comments."on" AS content_id, COUNT(comment_id) AS num_comments
FROM comments
GROUP BY comments."on"
ORDER BY num_comments DESC
If you need all the fields of the content, you can do a join:
SELECT contents.*, COUNT(comments.comment_id) AS num_comments
FROM contents
LEFT JOIN comments ON contents.content_id = comments."on"
GROUP BY contents.content_id
ORDER BY num_comments DESC

select c.id, count(*) as cnt
from Content c, Comment cmt
where c.id = cmt."on"
group by c.id
order by cnt desc

Let's assume your tables look like this (I wrote this in pseudo-SQL - syntax may differ depending on database you are using). From the description you provided, it is not clear how you are joining the tables. Nevertheless, I think it looks something like this (with the caveat that all primary keys, indexes, and so forth are missing):
CREATE TABLE [dbo].[Content] (
[ContentID] [int] NOT NULL,
[ContentText] [varchar](50) NOT NULL
)
CREATE TABLE [dbo].[ContentComments] (
[ContentCommentID] [int] NOT NULL,
[ContentCommentText] [varchar](50) NOT NULL,
[ContentID] [int] NOT NULL
)
ALTER TABLE [dbo].[ContentComments] WITH CHECK ADD CONSTRAINT
[FK_ContentComments_Content] FOREIGN KEY([ContentID])
REFERENCES [dbo].[Content] ([ContentID])
Here is how you would write your query to get the content sorted by the number of comments each piece of content has. The DESC sorts the content items from those with the most comments to those with the least comments.
SELECT Content.ContentID, COUNT(ContentComments.ContentCommentID) AS CommentCount
FROM Content
INNER JOIN ContentComments
ON Content.ContentID = ContentComments.ContentID
GROUP BY Content.ContentID
ORDER BY COUNT(ContentComments.ContentCommentID) DESC
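Note that the INNER JOIN above leaves out content that has no comments yet. If those rows should appear with a count of 0, a LEFT JOIN is one way to get them. This is only a sketch in T-SQL, assuming the Content and ContentComments tables defined above; the table variables and sample rows are made up for illustration.

```sql
-- Hypothetical sample data in table variables that mirror the schema above.
DECLARE @Content TABLE (ContentID int NOT NULL, ContentText varchar(50) NOT NULL);
DECLARE @ContentComments TABLE (
    ContentCommentID int NOT NULL,
    ContentCommentText varchar(50) NOT NULL,
    ContentID int NOT NULL
);

INSERT INTO @Content VALUES (1, 'First post'), (2, 'Second post'), (3, 'No comments yet');
INSERT INTO @ContentComments VALUES
    (10, 'Nice', 1),
    (11, 'Agreed', 1),
    (12, 'Thanks', 2);

-- LEFT JOIN keeps every content row. COUNT(column) skips the NULLs that
-- unmatched rows produce, so uncommented content gets a count of 0.
SELECT c.ContentID, c.ContentText, COUNT(cc.ContentCommentID) AS CommentCount
FROM @Content AS c
LEFT JOIN @ContentComments AS cc
    ON cc.ContentID = c.ContentID
GROUP BY c.ContentID, c.ContentText
ORDER BY CommentCount DESC, c.ContentID;
```

With this sample data, content 1 comes first with 2 comments, then content 2 with 1, and content 3 is still listed with 0.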

Related

Group by on non id field

I have the following setup of tables:
CREATE TABLE public.tags (
tag_id int4 NOT NULL,
creation_timestamp timestamp NULL,
"name" varchar(255) NULL,
CONSTRAINT tags_pkey PRIMARY KEY (tag_id)
);
-- public.tag_targets definition
-- Drop table
-- DROP TABLE public.tag_targets;
CREATE TABLE public.tag_targets (
id int4 NOT NULL,
creation_timestamp timestamp NULL,
target_id int8 NULL,
target_name varchar(255) NULL,
last_update_timestamp timestamp NULL,
tag_id int4 NULL,
CONSTRAINT tag_targets_pkey PRIMARY KEY (id),
CONSTRAINT fkcesi55mqvysjv63c1xf2j15oh FOREIGN KEY (tag_id) REFERENCES tags(tag_id)
);
I am trying to run the following query:
SELECT *
FROM tag_targets tt, tags t
WHERE tt.tag_id = t.tag_id
AND (t."name" IN ('Keeper', 'Pk'))
GROUP by tt.target_id
However it wants the PK of both Tags and Tagtarget in the group by:
ERROR: column "tt.id" must appear in the GROUP BY clause or be used in an aggregate function
Is there any way to group on the target_id column? Also, feel free to give feedback on the table design; I went for a generic mapping table and an independent tags table.
The problem is that you are requesting SELECT * but your GROUP BY lists only tt.target_id. Generally speaking, every column in the SELECT list must either appear in GROUP BY or be wrapped in an aggregate function. Oversimplifying: your database doesn't know which of a group's many values to return for a column that is neither grouped nor aggregated.
Try running the following query to see whether you get something back:
SELECT tt.target_id, count(*)
FROM tag_targets tt, tags t
WHERE tt.tag_id = t.tag_id
AND (t."name" IN ('Keeper', 'Pk'))
GROUP by tt.target_id
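To make the rule concrete: once you group by tt.target_id, every other column has to be collapsed into one value per group by an aggregate. Here is a sketch in PostgreSQL, assuming the columns from the question. The demo_ temporary tables and their rows are invented for illustration, and string_agg is just one possible choice of aggregate.

```sql
-- Hypothetical stand-ins for the question's tables, trimmed to the columns used here.
CREATE TEMP TABLE demo_tags (tag_id int4 PRIMARY KEY, "name" varchar(255));
CREATE TEMP TABLE demo_tag_targets (id int4 PRIMARY KEY, target_id int8, tag_id int4);

INSERT INTO demo_tags VALUES (1, 'Keeper'), (2, 'Pk'), (3, 'Other');
INSERT INTO demo_tag_targets VALUES (1, 100, 1), (2, 100, 2), (3, 200, 1), (4, 300, 3);

-- One row per target_id: the tag rows are counted and their names are
-- concatenated, so no ungrouped column is left in the SELECT list.
SELECT tt.target_id,
       count(*) AS tag_count,
       string_agg(t."name", ', ' ORDER BY t."name") AS tag_names
FROM demo_tag_targets tt
JOIN demo_tags t ON tt.tag_id = t.tag_id
WHERE t."name" IN ('Keeper', 'Pk')
GROUP BY tt.target_id
ORDER BY tt.target_id;
```

With these rows, target 100 comes back with a count of 2, target 200 with a count of 1, and target 300 is filtered out by the WHERE clause.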
Unrelated but your syntax of table1, table2 with the join in the "where" clause is the non-ANSI syntax. It's not wrong or anything, but the ANSI syntax of explicit joins is preferred for a litany of reasons I won't go into:
SELECT *
FROM
tag_targets tt
join tags t on
tt.tag_id = t.tag_id
where
t."name" IN ('Keeper', 'Pk')
On the surface, when you say group I am wondering if you mean "sort..." I am assuming you are new to SQL, so if that's an oversimplification, forgive me, but this would be perhaps what you wanted -- an "order by" instead of a group by.
SELECT *
FROM
tag_targets tt
join tags t on
tt.tag_id = t.tag_id
where
t."name" IN ('Keeper', 'Pk')
order by
tt.target_id
If, on the other hand, you only wanted a single record for each target_id (which is truly a "group by target_id"), then perhaps this is what you wanted: one record per target_id. In that case you have to decide how to prioritize which row is selected. In this example, I pick the one with the most recent update timestamp:
SELECT distinct on (tt.target_id)
*
FROM
tag_targets tt
join tags t on
tt.tag_id = t.tag_id
where
t."name" IN ('Keeper', 'Pk')
order by
tt.target_id, tt.last_update_timestamp desc
Not confident on either of these suggestions, so if they miss the mark, post some sample data and expected results.

How to write query/create view to limit multiple records to show only max value

Consider the following three tables: a list of contacts, a list of statuses with a defined "rank", and a join table that links a contact to multiple statuses.
CREATE TABLE public."Contacts"
(
name character varying COLLATE pg_catalog."default",
email character varying COLLATE pg_catalog."default",
contactid integer NOT NULL DEFAULT nextval('"Contacts_contactid_seq"'::regclass),
CONSTRAINT "Contacts_pkey" PRIMARY KEY (contactid)
)
CREATE TABLE public.statusoptions
(
option character varying COLLATE pg_catalog."default" NOT NULL,
"Rank" integer,
CONSTRAINT "ListOptions_pkey" PRIMARY KEY (option)
)
CREATE TABLE public."ContactStatus"
(
contactid integer NOT NULL,
option character varying COLLATE pg_catalog."default" NOT NULL,
CONSTRAINT "Options_pkey" PRIMARY KEY (contactid, option),
CONSTRAINT fk_1 FOREIGN KEY (contactid)
REFERENCES public."Contacts" (contactid) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION,
CONSTRAINT fk_2 FOREIGN KEY (option)
REFERENCES public.statusoptions (option) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION
)
The following query returns all rows.
select "Contacts".contactid, "Contacts".name, "ContactStatus".option, statusoptions."Rank" as
currentRank
from "Contacts","ContactStatus", statusoptions
where "Contacts".contactid = "ContactStatus".contactid
and statusoptions.option="ContactStatus".option
This returns a record set that looks like this:
Contactid  name    Status            CurrentRank
1          "john"  "apply"           1
1          "john"  "Manager Review"  4
2          "bill"  "apply"           1
2          "bill"  "1st interview"   2
1          "john"  "1st interview"   2
What I need is a query/view that always returns JUST the row with the MAX current rank for each contact. So the expected result I want from this view is:
Contactid  name    Status            CurrentRank
1          "john"  "Manager Review"  4
2          "bill"  "1st interview"   2
At any time, I could change the "Rank" value in the statusoptions table, which would change the view accordingly.
Is this possible?
You can use distinct on:
select distinct on(c.contactid)
c.contactid,
c.name,
cs.option,
s."Rank" as currentRank
from
"Contacts" c
inner join "ContactStatus" cs on c.contactid = cs.contactid
inner join statusoptions s on s.option = cs.option
order by c.contactid, s."Rank" desc
Notes:
Always use explicit, standard joins (with the on clause) instead of old-school, implicit joins (with a comma in the from clause and the join condition in the where clause).
(Short) table aliases make the query shorter and easier to read.
Consider avoiding quoted table and column names unless absolutely necessary; quoting makes the identifiers case-sensitive, while by default they are not.
In Postgres, you can use distinct on
I think you want:
select distinct on (c.contactid) c.contactid, c.name, cs.option, so."Rank" as currentRank
from "Contacts" c join
"ContactStatus" cs
on c.contactid = cs.contactid join
statusoptions so
on so.option = cs.option
order by c.contactid, so."Rank" desc;
Notes:
Use proper, explicit, standard JOIN syntax.
Never use commas in the FROM clause.
Table aliases make a query easier to write and to read.
You should avoid quoting table names and column names. That just clutters up queries unnecessarily.
distinct on usually has better performance than alternatives such as row_number().
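For comparison, here is what the row_number() alternative mentioned above might look like. It is only a sketch in PostgreSQL, assuming the columns from the question. The demo_ temporary tables are hypothetical stand-ins, filled with the rows from the question's sample output.

```sql
-- Hypothetical stand-ins for "Contacts", statusoptions and "ContactStatus".
CREATE TEMP TABLE demo_contacts (contactid integer PRIMARY KEY, name varchar);
CREATE TEMP TABLE demo_statusoptions (option varchar PRIMARY KEY, "Rank" integer);
CREATE TEMP TABLE demo_contactstatus (contactid integer, option varchar);

INSERT INTO demo_contacts VALUES (1, 'john'), (2, 'bill');
INSERT INTO demo_statusoptions VALUES
    ('apply', 1), ('1st interview', 2), ('Manager Review', 4);
INSERT INTO demo_contactstatus VALUES
    (1, 'apply'), (1, 'Manager Review'), (1, '1st interview'),
    (2, 'apply'), (2, '1st interview');

-- Number each contact's statuses from highest to lowest rank,
-- then keep only the first row per contact.
SELECT contactid, name, option, currentrank
FROM (
    SELECT c.contactid,
           c.name,
           cs.option,
           s."Rank" AS currentrank,
           row_number() OVER (PARTITION BY c.contactid
                              ORDER BY s."Rank" DESC) AS rn
    FROM demo_contacts c
    JOIN demo_contactstatus cs ON cs.contactid = c.contactid
    JOIN demo_statusoptions s ON s.option = cs.option
) ranked
WHERE rn = 1
ORDER BY contactid;
```

This returns john with "Manager Review" (rank 4) and bill with "1st interview" (rank 2), the same two rows the question asks for. Unlike distinct on, this form also works in databases other than Postgres that support window functions.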
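You can also compute max("Rank") per contact and keep only the rows that match it:
select c.contactid, c.name, cs.option, so."Rank" as currentRank
from "Contacts" c
join "ContactStatus" cs on c.contactid = cs.contactid
join statusoptions so on so.option = cs.option
where so."Rank" = (
select max(so2."Rank")
from "ContactStatus" cs2
join statusoptions so2 on so2.option = cs2.option
where cs2.contactid = c.contactid
)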

How Can i subtract two columns in different tables in SQL

I want a query that subtracts a column in one table from a column in another table, but it keeps raising an error ...
SELECT FlightDate,
Plane,
Destination,
Capacity
FROM
FlightSchedule,
Routes,
Aircrafts
WHERE
FlightSchedule.RID=Routes.RouteID
AND FlightSchedule.Plane=Aircrafts.AcID
AND (SELECT SUM(Capacity)
FROM Aircrafts)
-
(SELECT Class , count(*)
FROM Tickets);
I would split this up into 2 parts. First get the data you need, then perform the math. You are also using old SQL syntax. You should re-format using the new JOIN syntax. It's also easier to read.
First, create a temporary table to hold your flight schedule and aircraft info.
CREATE TABLE #AIRCRAFTCAPACITY
(
[RouteID] [INT] NOT NULL,
[FlightDate] [datetime] NOT NULL,
[Plane] [varchar](50) NOT NULL,
[Destination] [varchar](50) NOT NULL,
[Capacity] [INT] NOT NULL,
[NoOfSeatsRemaining] [INT] NULL
);
Then insert the data you need.
INSERT INTO #AIRCRAFTCAPACITY
(
[RouteID],
[FlightDate],
[Plane],
[Destination],
[Capacity]
)
SELECT R.RouteID,
FS.FlightDate,
FS.Plane,
FS.Destination,
A.Capacity
FROM
FlightSchedule FS
INNER JOIN Routes R
ON FS.RID = R.RouteID
INNER JOIN Aircrafts A
ON FS.Plane = A.AcID
Now perform the math to calculate the remaining capacity. I've assumed that Tickets has a ROUTEID column and that you are calculating this per route, but I'm sure you can adjust your SQL accordingly.
UPDATE A
SET
[NoOfSeatsRemaining] = A.[Capacity] - T.TICKETS_SOLD
FROM
#AIRCRAFTCAPACITY A
INNER JOIN
(
-- Tickets sold per route; a derived table cannot reference the outer alias A
SELECT ROUTEID, COUNT(*) AS TICKETS_SOLD
FROM Tickets
GROUP BY ROUTEID
) T
ON A.RouteID = T.ROUTEID
Apologies if the syntax is a little off; it's hard to construct SQL when you don't have the underlying tables.
But hopefully it will help.
The query is wrong on multiple levels.
First of all, you cannot put the details of SELECT capacity as a sub query under WHERE. The WHERE clause is for conditional parameters.
Try this:
WITH Vacancy AS (
SELECT (SELECT SUM(Capacity) FROM Aircrafts) - (SELECT COUNT(*) FROM Tickets) AS Vacancy
)
SELECT FlightDate, Plane, Destination, Vacancy.Vacancy
FROM FlightSchedule, Routes, Aircrafts, Vacancy
WHERE FlightSchedule.RID=Routes.RouteID AND FlightSchedule.Plane=Aircrafts.AcID;
You also need a condition on the ticket count (for example, to count tickets per flight), but I don't know the schema of Tickets, so I've left it out.
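As an illustration of what such a per-flight condition could look like, here is a sketch in T-SQL. The schema is an assumption: the FlightID columns on FlightSchedule and Tickets, the column types, and all sample rows are hypothetical, since the question does not show the real tables. Routes is left out to keep the example short.

```sql
-- Hypothetical schema and data: the FlightID columns and every row are assumptions.
DECLARE @Aircrafts TABLE (AcID int NOT NULL, Capacity int NOT NULL);
DECLARE @FlightSchedule TABLE (FlightID int NOT NULL, FlightDate date NOT NULL, Plane int NOT NULL);
DECLARE @Tickets TABLE (TicketID int NOT NULL, FlightID int NOT NULL);

INSERT INTO @Aircrafts VALUES (1, 180), (2, 90);
INSERT INTO @FlightSchedule VALUES (100, '20240105', 1), (101, '20240106', 2);
INSERT INTO @Tickets VALUES (1, 100), (2, 100), (3, 100), (4, 101);

-- Count tickets per flight in a derived table, then subtract that count
-- from the capacity of the plane assigned to the flight.
SELECT fs.FlightDate,
       fs.Plane,
       a.Capacity,
       a.Capacity - COALESCE(t.TicketsSold, 0) AS SeatsRemaining
FROM @FlightSchedule AS fs
INNER JOIN @Aircrafts AS a
    ON fs.Plane = a.AcID
LEFT JOIN (
    SELECT FlightID, COUNT(*) AS TicketsSold
    FROM @Tickets
    GROUP BY FlightID
) AS t
    ON t.FlightID = fs.FlightID
ORDER BY fs.FlightDate;
```

The LEFT JOIN plus COALESCE means a flight with no tickets sold still shows its full capacity instead of NULL. With the sample rows, the two flights have 177 and 89 seats remaining.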

Returns all values in 3 tables

I have three tables:
CREATE TABLE [dbo].[Data]
(
[PorID] [int] NOT NULL,
[HourS] [int] NOT NULL
)
CREATE TABLE [dbo].[TimeData]
(
[HId] [bigint] NOT NULL,
[HName] [varchar](50) NOT NULL,
[HHour] [int] NOT NULL
)
CREATE TABLE [dbo].PortInfo
(
[Id] [bigint] NOT NULL,
[PortName] [varchar](50) NOT NULL
)
Even if a port is not present in the Data table, the query should return rows for every port in the PortInfo table. Similarly, it should always return 24 records for each port. The result should list all ports for each record even if they don't exist in the Data table.
Updated Answer
Based on what you are telling me in the comments and me filling in some blanks, this is what I assume you are looking for. This will produce a record for every hour, for every port.
SELECT
td.HHour,
td.HName,
pi.Id,
pi.PortName,
d.PorID,
d.HourS
FROM
dbo.TimeData td
FULL OUTER JOIN
dbo.PortInfo pi
ON (1 = 1)
LEFT OUTER JOIN
dbo.Data d
ON (d.PorID = pi.Id)
AND (d.HourS = td.HHour)
Some feedback to make this process easier. Share your schema (relationships) and/or some sample data. Also, consider creating more logical/intuitive names for your columns so that relationships and content may be implied.
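The FULL OUTER JOIN ... ON (1 = 1) in the query above pairs every hour with every port, which is what a CROSS JOIN does when both tables contain rows. The sketch below spells that out in T-SQL, assuming the three tables from the question; the table variables and sample rows are hypothetical, and only three hours are loaded to keep the output short.

```sql
-- Hypothetical sample rows in table variables that mirror the question's tables.
DECLARE @Data TABLE (PorID int NOT NULL, HourS int NOT NULL);
DECLARE @TimeData TABLE (HId bigint NOT NULL, HName varchar(50) NOT NULL, HHour int NOT NULL);
DECLARE @PortInfo TABLE (Id bigint NOT NULL, PortName varchar(50) NOT NULL);

INSERT INTO @TimeData VALUES (1, '00:00', 0), (2, '01:00', 1), (3, '02:00', 2);
INSERT INTO @PortInfo VALUES (1, 'Port A'), (2, 'Port B');
INSERT INTO @Data VALUES (1, 0), (1, 2);

-- CROSS JOIN builds the full port-by-hour grid (2 x 3 = 6 rows here);
-- the LEFT JOIN then attaches Data rows where they exist.
SELECT p.Id, p.PortName, td.HHour, td.HName, d.HourS
FROM @PortInfo AS p
CROSS JOIN @TimeData AS td
LEFT JOIN @Data AS d
    ON d.PorID = p.Id
   AND d.HourS = td.HHour
ORDER BY p.Id, td.HHour;
```

Port B has no rows in @Data, yet it still appears once per hour with HourS as NULL. With 24 rows in TimeData, each port would get 24 rows.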
Original Answer
This sounds like what you are looking for. The query below will return all port information (from PortInfo) even if its Id is not in the Data table (PorId). This is done by using a LEFT JOIN onto the PortInfo table.
SELECT
po.Id,
po.PortName,
d.HourS
FROM
dbo.PortInfo po
LEFT JOIN
dbo.Data d
ON (d.PorID = po.Id)
Now, you don't mention how or whether the third table, TimeData, should be used, but if you want that information in your result as well, you can simply LEFT JOIN it too:
SELECT
po.Id,
po.PortName,
d.HourS,
td.HName,
td.HHour
FROM
dbo.PortInfo po
LEFT JOIN
dbo.Data d
ON (d.PorID = po.Id)
LEFT JOIN
dbo.TimeData td
ON (td.HId = d.HourS) -- I assume this is the link, you may need to update if not.

T-SQL - get count of joined entries

I wonder how best to write the following query for Microsoft SQL Server.
I have three tables: surveys, survey_presets and survey_scenes. They have the following columns:
CREATE TABLE [dbo].[surveys](
[id] [int] IDENTITY(1,1) NOT NULL,
[caption] [nvarchar](255) NOT NULL,
[creation_time] [datetime] NOT NULL,
)
CREATE TABLE [dbo].[survey_presets](
[id] [int] IDENTITY(1,1) NOT NULL,
[survey_id] [int] NOT NULL,
[preset_id] [int] NOT NULL,
)
CREATE TABLE [dbo].[survey_scenes](
[id] [int] IDENTITY(1,1) NOT NULL,
[survey_id] [int] NOT NULL,
[scene_id] [int] NOT NULL,
)
Both survey_presets and survey_scenes have foreign keys on surveys for survey_id column.
Now I want to select all surveys with the count of corresponding presets and scenes for each. Here is the "pseudo-query" of what I want:
SELECT
surveys.*,
COUNT(survey_presets, where survey_presets.survey_id = surveys.id),
COUNT(survey_scenes, where survey_scenes.survey_id = surveys.id)
FROM surveys
ORDER BY surveys.creation_time
I can do a mess with SELECT DISTINCT, JOIN, GROUP BY, etc., but I'm new to T-SQL and I doubt my query will be optimal in any sense.
I would do the counting in subqueries to avoid Cartesian products. A survey might have several matching rows in presets and several in scenes, so a plain join of all three tables would multiply the counts. Alternatively, you could write a simple join query and avoid the multiplication by counting distinct survey_presets.id and distinct survey_scenes.id.
SELECT
surveys.*,
isnull(presets_count, 0) presets_count,
isnull(scenes_count, 0) scenes_count
FROM surveys
LEFT JOIN
(
SELECT survey_id,
count(*) presets_count
FROM survey_presets
GROUP BY survey_id
) presets
ON surveys.id = presets.survey_id
LEFT JOIN
(
SELECT survey_id,
count(*) scenes_count
FROM survey_scenes
GROUP BY survey_id
) scenes
ON surveys.id = scenes.survey_id
ORDER BY surveys.creation_time
How it works
You can put a special kind of subquery, called a derived table, in the FROM clause of your query. A derived table is a normal query enclosed in parentheses and followed by a table alias. It cannot use any column from the outer query, but it can expose columns that you use in the ON clause to join the derived table to the main body of the query.
In this case each derived table simply counts rows grouped by survey_id; the joins connect the counts to surveys.
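If you prefer something closer to the pseudo-query in the question, correlated scalar subqueries are another option. This is a sketch in T-SQL, not a performance recommendation; the table variables and sample rows are hypothetical stand-ins for the three tables in the question.

```sql
-- Hypothetical sample data in table variables that mirror the question's tables.
DECLARE @surveys TABLE (id int NOT NULL, caption nvarchar(255) NOT NULL, creation_time datetime NOT NULL);
DECLARE @survey_presets TABLE (id int NOT NULL, survey_id int NOT NULL, preset_id int NOT NULL);
DECLARE @survey_scenes TABLE (id int NOT NULL, survey_id int NOT NULL, scene_id int NOT NULL);

INSERT INTO @surveys VALUES (1, N'First', '20240101'), (2, N'Second', '20240102');
INSERT INTO @survey_presets VALUES (1, 1, 10), (2, 1, 11);
INSERT INTO @survey_scenes VALUES (1, 1, 20), (2, 1, 21), (3, 1, 22);

-- Each subquery is evaluated per survey row, so the two counts
-- never multiply each other the way a three-table join can.
SELECT s.*,
       (SELECT COUNT(*) FROM @survey_presets AS p WHERE p.survey_id = s.id) AS presets_count,
       (SELECT COUNT(*) FROM @survey_scenes AS sc WHERE sc.survey_id = s.id) AS scenes_count
FROM @surveys AS s
ORDER BY s.creation_time;
```

With this data the first survey gets 2 presets and 3 scenes, and the second gets 0 and 0. Joining all three tables directly and using COUNT(*) would instead report 6 for the first survey, because each preset row is paired with each scene row.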
SELECT surveys.ID, surveys.caption, surveys.creation_time,
count(DISTINCT survey_presets.id) as survey_presets,
count(DISTINCT survey_scenes.id) as survey_scenes
FROM surveys
LEFT OUTER JOIN survey_presets on survey_presets.survey_id = surveys.id
LEFT OUTER JOIN survey_scenes on survey_scenes.survey_id = surveys.id
GROUP BY surveys.ID, surveys.caption, surveys.creation_time
ORDER BY surveys.creation_time