How do I tally votes in MySQL?

I've got a database table called votes with three columns: 'timestamp', 'voter', and 'voted_for'.
Each entry in the table represents one vote. I want to tally all of the votes for each 'voted_for', subject to two conditions:
Each voter can vote only once; if a voter has cast multiple votes, only the most recent one counts.
Only votes made before a specified time are counted.

Try this:
SELECT v.voted_for, COUNT(*) AS votes
FROM votes v
INNER JOIN (
    -- latest vote per voter, looking only at votes cast before the cutoff
    SELECT voter, MAX(timestamp) AS last_time
    FROM votes
    WHERE timestamp < {date and time of last vote allowed}
    GROUP BY voter
) a ON a.voter = v.voter AND a.last_time = v.timestamp
GROUP BY v.voted_for
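To check a tally query against both conditions, a small in-memory run can help. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and the voters and rows are made-up sample data. It picks each voter's latest vote before the cutoff and then counts per 'voted_for'.

```python
import sqlite3


def tally_votes(conn, cutoff):
    """Count each voter's most recent vote cast before `cutoff`."""
    sql = """
        SELECT v.voted_for, COUNT(*) AS votes
        FROM votes v
        JOIN (
            -- latest vote per voter among votes before the cutoff
            SELECT voter, MAX(timestamp) AS last_time
            FROM votes
            WHERE timestamp < ?
            GROUP BY voter
        ) a ON a.voter = v.voter AND a.last_time = v.timestamp
        GROUP BY v.voted_for
        ORDER BY v.voted_for
    """
    return conn.execute(sql, (cutoff,)).fetchall()


# Hypothetical sample data, not taken from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE votes (timestamp TEXT, voter TEXT, voted_for TEXT)")
conn.executemany(
    "INSERT INTO votes VALUES (?, ?, ?)",
    [
        ("2010-11-18 20:00:00", "ann", "red"),
        ("2010-11-18 20:30:00", "ann", "blue"),  # ann changes her vote
        ("2010-11-18 20:10:00", "bob", "red"),
        ("2010-11-18 21:30:00", "bob", "blue"),  # cast after the cutoff
        ("2010-11-18 20:45:00", "cat", "blue"),
    ],
)

result = tally_votes(conn, "2010-11-18 21:05:00")
print(result)  # [('blue', 2), ('red', 1)]
```

Bob's late vote is ignored, so his earlier vote for 'red' still counts. If one voter could have two rows with exactly the same timestamp, both would be counted, so a tie-breaker column would be needed in that case.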

The following may prove helpful:
drop table if exists users;
create table users
(
user_id int unsigned not null auto_increment primary key,
username varbinary(32) not null,
unique key users_username_idx(username)
)engine=innodb;
insert into users (username) values
('f00'),('foo'),('bar'),('bAr'),('bish'),('bash'),('bosh');
drop table if exists picture;
create table picture
(
picture_id int unsigned not null auto_increment primary key,
user_id int unsigned not null, -- owner of the picture, the user who uploaded it
tot_votes int unsigned not null default 0, -- total number of votes
tot_rating int unsigned not null default 0, -- accumulative ratings
avg_rating decimal(5,2) not null default 0, -- tot_rating / tot_votes
key picture_user_idx(user_id)
)engine=innodb;
insert into picture (user_id) values
(1),(2),(3),(4),(5),(6),(7),(1),(1),(2),(3),(6),(7),(7),(5);
drop table if exists picture_vote;
create table picture_vote
(
picture_id int unsigned not null,
user_id int unsigned not null,-- voter
rating tinyint unsigned not null default 0, -- rating 0 to 5
primary key (picture_id, user_id)
)engine=innodb;
delimiter #
create trigger picture_vote_before_ins_trig before insert on picture_vote
for each row
proc_main:begin
declare total_rating int unsigned default 0;
declare total_votes int unsigned default 0;
if exists (select 1 from picture_vote where
picture_id = new.picture_id and user_id = new.user_id) then
leave proc_main;
end if;
select tot_rating + new.rating, tot_votes + 1 into total_rating, total_votes
from picture where picture_id = new.picture_id;
-- counts/stats
update picture set
tot_votes = total_votes,
tot_rating = total_rating,
avg_rating = total_rating / total_votes
where picture_id = new.picture_id;
end proc_main #
delimiter ;
insert into picture_vote (picture_id, user_id, rating) values
(1,1,5),(1,2,3),(1,3,3),(1,4,2),(1,5,1),
(2,1,1),(2,2,2),(2,3,3),(2,4,4),(2,5,5),(2,6,1),(2,7,2),
(3,1,5),(3,2,5),(3,3,5),(3,4,5),(3,5,5),(3,6,5),(3,7,5);
select * from users order by user_id;
select * from picture order by picture_id;
select * from picture_vote order by picture_id, user_id;

SELECT voted_for,COUNT(DISTINCT voter)
FROM votes
WHERE timestamp < '2010-11-18 21:05:00'
GROUP BY voted_for

Related

Column 'Users.Name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

I have the following tables:
create table dbo.Users
(
    Id int not null primary key clustered (Id),
    Name nvarchar(255) not null
)
create table dbo.UserSkills
(
    UserId int not null,
    SkillId int not null,
    primary key clustered (UserId, SkillId)
)
Given a set of Skills Ids I need to get the users that have all these Skills Ids:
select Users.*
from Users
inner join UserSkills on Users.Id = UserSkills.UserId
where UserSkills.SkillId in (149, 305)
group by Users.Id
having count(*) = 2
I get the following error:
Column 'Users.Name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
What am I missing?
Side questions:
Is there a faster query to accomplish the same result?
How can I pass the skill IDs, e.g. (149, 305), as a parameter, and use the number of IDs in @SkillsIds in having count(*) = 2 instead of the literal 2?
UPDATE
The following code is working and I get the User John.
declare @Users table
(
    Id int not null primary key clustered (Id),
    [Name] nvarchar(255) not null
);
declare @Skills table
(
    SkillId int not null primary key clustered (SkillId)
);
declare @UserSkills table
(
    UserId int not null,
    SkillId int not null,
    primary key clustered (UserId, SkillId)
);
insert into @Users
values (1, 'John'), (2, 'Mary');
insert into @Skills
values (148), (149), (304), (305);
insert into @UserSkills
values (1, 149), (1, 305), (2, 148), (2, 149);
select u.Id, u.Name
from @Users as u
inner join @UserSkills as us on u.Id = us.UserId
where us.SkillId in (149, 305)
group by u.Id, u.Name
having count(*) = 2
If user has 40 columns, is there a way to not enumerate all the columns in the Select and Group By since Id is the only column needed to group?
First, your tables are broken, unless Name has only a single character. You need a length:
create table Users (
    Id int not null primary key clustered (Id),
    Name nvarchar(255) not null
);
Always use a length when specifying char(), varchar(), and related types in SQL Server.
As for your query, SQL Server will not process select * together with group by. List each column in both the select and the group by:
select u.id, u.name
from Users u join
UserSkills us
on u.Id = us.UserId
where us.SkillId in (149, 305)
group by u.Id, u.name
having count(*) = 2;
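On the side question about passing the skill IDs as a parameter: one option is to build the IN list from bound parameters in the calling code and bind the number of IDs into the having clause. The sketch below shows the idea with Python's sqlite3 module standing in for SQL Server (an assumption for the sake of a runnable example); the helper name users_with_all_skills is hypothetical, and the rows are the sample data from the question's update.

```python
import sqlite3


def users_with_all_skills(conn, skill_ids):
    """Return (Id, Name) for users who hold every skill in `skill_ids`."""
    wanted = sorted(set(skill_ids))  # duplicates would break the count check
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT u.Id, u.Name
        FROM Users u
        JOIN UserSkills us ON us.UserId = u.Id
        WHERE us.SkillId IN ({placeholders})
        GROUP BY u.Id, u.Name
        HAVING COUNT(*) = ?
        ORDER BY u.Id
    """
    # Bind the IDs first, then the number of IDs for the HAVING clause.
    return conn.execute(sql, (*wanted, len(wanted))).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE UserSkills (
        UserId INTEGER NOT NULL,
        SkillId INTEGER NOT NULL,
        PRIMARY KEY (UserId, SkillId)
    );
    INSERT INTO Users VALUES (1, 'John'), (2, 'Mary');
    INSERT INTO UserSkills VALUES (1, 149), (1, 305), (2, 148), (2, 149);
""")

print(users_with_all_skills(conn, [149, 305]))  # [(1, 'John')]
```

The count check relies on the (UserId, SkillId) primary key, which guarantees that a user cannot match the same skill twice. Inside SQL Server itself, a table variable or table-valued parameter holding the skill IDs could play the same role as the Python list.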

SQL query to calculate average and insert into a table

I'm working on a song archive database and I'm stuck on some queries. I would like to:
Calculate the rating of each user as the average score of their Comments, and insert that rating into Users
Calculate how many Purchases each user has
Calculate the average score of each Song from the Comments table
Calculate how many credits each user has spent on their purchases
Below you can find my tables...
CREATE TABLE Users
(
username NVARCHAR( 30 ) NOT NULL PRIMARY KEY,
pass NVARCHAR( 16 ),
email NVARCHAR( 50 ),
city NVARCHAR( 10 ),
credits INT,
rating INT
)
CREATE TABLE Songs
(
song_id INT NOT NULL IDENTITY ( 1, 1 ) PRIMARY KEY,
song_name NVARCHAR( 30 ),
username NVARCHAR( 30 ),
genre INT,
price INT,
song_length INT,
listens INT
)
CREATE TABLE Genres
(
genre_id INT NOT NULL IDENTITY ( 1, 1 ) PRIMARY KEY,
genre_name NVARCHAR( 16 )
)
CREATE TABLE Purchases
(
purchase_id INT NOT NULL IDENTITY ( 1, 1 ) PRIMARY KEY,
song_id INT,
username NVARCHAR( 30 ),
date_purchased DATETIME
)
CREATE TABLE Comments
(
comment_id INT NOT NULL IDENTITY ( 1, 1 ) PRIMARY KEY,
username NVARCHAR( 30 ),
song_id INT,
text NVARCHAR( 30 ),
score INT
)
I answered some of your questions. In addition to the individual queries, I arranged them as common table expressions, which I think is a convenient way to use them.
To calculate how many credits each user has spent on purchases, I would need to know your logic for how users spend their credits.
WITH CTE_PurchasesByUser AS
(
    SELECT p.username AS username, COUNT(*) AS NrOfPurchases
    FROM Purchases p
    GROUP BY p.username
),
CTE_AverageScoreBySong AS
(
    -- score is INT, so cast before averaging to avoid a truncated result
    SELECT c.song_id AS song_id, AVG(CAST(c.score AS FLOAT)) AS AverageScore
    FROM Comments c
    GROUP BY c.song_id
),
CTE_AverageScoreByUser AS
(
    SELECT u.username AS username, AVG(CAST(c.score AS FLOAT)) AS AverageScore
    FROM Users u
    INNER JOIN Comments c ON u.username = c.username
    GROUP BY u.username
)
SELECT u.*, ISNULL(bbu.NrOfPurchases, 0) AS NrOfPurchases, asu.AverageScore
FROM Users u
LEFT JOIN CTE_PurchasesByUser bbu ON u.username = bbu.username
LEFT JOIN CTE_AverageScoreByUser asu ON u.username = asu.username
This SQL ran with your tables, yet I didn't test it with data rows...
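One thing worth testing with data rows is integer division: score is an INT column, and dividing an integer sum by an integer count drops the fractional part. The sketch below shows the effect using Python's sqlite3 module, which happens to truncate integer division in the same way; the comment rows are made-up sample data, and the column aliases truncated_avg and exact_avg are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (username TEXT, song_id INTEGER, score INTEGER)"
)
# Hypothetical sample data: song 1 has scores 4 and 5, song 2 has a single 3.
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?)",
    [("ann", 1, 4), ("bob", 1, 5), ("ann", 2, 3)],
)

rows = conn.execute("""
    SELECT song_id,
           SUM(score) / COUNT(score)       AS truncated_avg,  -- integer / integer
           SUM(score) * 1.0 / COUNT(score) AS exact_avg       -- forced to floating point
    FROM Comments
    GROUP BY song_id
    ORDER BY song_id
""").fetchall()

print(rows)  # [(1, 4, 4.5), (2, 3, 3.0)]
```

For song 1 the integer version reports 4 instead of 4.5. If Users.rating stays an INT column, the stored rating loses the fraction either way, so a decimal column type may be worth considering.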

Optimize SQL query with 3 FOR loops

I have a fully working SQL query. However, it is very very slow. I am looking for a way to optimize it.
CREATE TABLE trajectory_geom (
id SERIAL PRIMARY KEY,
trajectory_id BIGINT,
user_id BIGINT,
geom GEOMETRY(Linestring, 4326)
);
INSERT INTO trajectory_geom (trajectory_id, user_id, geom)
SELECT
p.trajectory_id,
p.user_id,
ST_Transform(ST_MakeLine(p.geom), 4326)
FROM point p
GROUP BY p.trajectory_id, p.user_id
;
DO $$
DECLARE
urow record;
vrow record;
wrow record;
BEGIN
FOR wrow IN
SELECT DISTINCT(p.user_id) FROM point p
LOOP
raise notice 'User id: %', wrow.user_id;
FOR vrow IN
SELECT DISTINCT(p.trajectory_id) FROM point p WHERE p.user_id = wrow.user_id
LOOP
FOR urow IN
SELECT
analyzed_tr.*
FROM trajectory_start_end_geom analyzed_tr
WHERE
analyzed_tr.user_id = wrow.user_id
AND
ST_Intersects (
(
analyzed_tr.start_geom
)
,
(
SELECT g.geom
FROM trajectory_geom g
WHERE g.trajectory_id = vrow.trajectory_id
)
) = TRUE
LOOP
INSERT INTO trajectories_intercepting_with_starting_point (initial_trajectory_id, mathced_trajectory_id, user_id)
SELECT
vrow.trajectory_id,
urow.trajectory_id,
wrow.user_id
WHERE urow.trajectory_id <> vrow.trajectory_id
;
END LOOP;
END LOOP;
END LOOP;
END;
$$;
It has 3 nested loops. How can I avoid them?
Basically, I loop over all user IDs, then for each user I loop over all of their trajectories and check whether each trajectory intersects the starting area of any other trajectory of the same user.
Schema:
CREATE TABLE public.trajectory_start_end_geom
(
id integer NOT NULL DEFAULT nextval('trajectory_start_end_geom_id_seq'::regclass),
trajectory_id bigint,
user_id bigint,
start_geom geometry(Polygon,4326),
end_geom geometry(Polygon,4326),
CONSTRAINT trajectory_start_end_geom_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
CREATE TABLE public.trajectory_geom
(
id integer NOT NULL DEFAULT nextval('trajectory_geom_id_seq'::regclass),
trajectory_id bigint,
user_id bigint,
geom geometry(LineString,4326),
CONSTRAINT trajectory_geom_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
CREATE TABLE public.point
(
id integer NOT NULL DEFAULT nextval('point_id_seq'::regclass),
user_id bigint,
date date,
"time" time without time zone,
lat double precision,
lon double precision,
trajectory_id integer,
geom geometry(Geometry,4326),
CONSTRAINT point_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
Try this set-based query, which replaces the loops with joins:
INSERT INTO trajectories_intercepting_with_starting_point
    (initial_trajectory_id, mathced_trajectory_id, user_id)
SELECT
    TG2.trajectory_id AS initial_trajectory_id, -- trajectory whose line is tested
    TG.trajectory_id AS matched_trajectory_id,  -- trajectory whose start area it crosses
    TG.user_id
FROM Trajectory_geom AS TG
JOIN Trajectory_geom AS TG2 ON TG.user_id = TG2.user_id
    AND TG.trajectory_id <> TG2.trajectory_id
JOIN Trajectory_start_end_geom AS TSE ON TSE.trajectory_id = TG.trajectory_id
WHERE ST_Intersects(TSE.start_geom, TG2.geom)
This should do the trick:
WITH vrow AS(
INSERT INTO trajectory_geom (trajectory_id, user_id, geom)
SELECT
p.trajectory_id,
p.user_id,
ST_Transform(ST_MakeLine(p.geom), 4326) AS geom
FROM point p
GROUP BY p.trajectory_id, p.user_id
RETURNING trajectory_id, user_id, geom
)
INSERT INTO trajectories_intercepting_with_starting_point (initial_trajectory_id, mathced_trajectory_id, user_id)
SELECT
vrow.trajectory_id,
urow.trajectory_id,
vrow.user_id
FROM trajectory_start_end_geom AS urow
JOIN vrow
ON urow.user_id = vrow.user_id
AND urow.trajectory_id <> vrow.trajectory_id
AND ST_Intersects(urow.start_geom, vrow.geom)
If you don't need the insert into trajectory_geom, eliminating it (and the CTE) will speed things up.
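To see that a single join can return the same pairs as the three nested loops, here is a toy comparison. It is only a sketch under loud assumptions: Python's sqlite3 module stands in for PostgreSQL, 1-D intervals (lo, hi) stand in for geometries, and an interval-overlap test stands in for ST_Intersects. The tables traj_line and traj_start and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- traj_line plays the role of trajectory_geom,
    -- traj_start the role of trajectory_start_end_geom.start_geom
    CREATE TABLE traj_line  (trajectory_id INTEGER, user_id INTEGER, lo REAL, hi REAL);
    CREATE TABLE traj_start (trajectory_id INTEGER, user_id INTEGER, lo REAL, hi REAL);
    INSERT INTO traj_line  VALUES (1, 10, 0, 5), (2, 10, 4, 9), (3, 11, 0, 9);
    INSERT INTO traj_start VALUES (1, 10, 0, 1), (2, 10, 4, 5), (3, 11, 0, 1);
""")


def with_loops(conn):
    """Mirror the nested loops: users, then trajectories, then start areas."""
    out = []
    users = conn.execute("SELECT DISTINCT user_id FROM traj_line").fetchall()
    for (user_id,) in users:
        lines = conn.execute(
            "SELECT trajectory_id, lo, hi FROM traj_line WHERE user_id = ?",
            (user_id,),
        ).fetchall()
        for line_id, lo, hi in lines:
            starts = conn.execute(
                "SELECT trajectory_id FROM traj_start "
                "WHERE user_id = ? AND lo <= ? AND ? <= hi",
                (user_id, hi, lo),
            ).fetchall()
            for (start_id,) in starts:
                if start_id != line_id:
                    out.append((line_id, start_id, user_id))
    return sorted(out)


def set_based(conn):
    """Same pairs from one join; the overlap test sits in the ON clause."""
    return conn.execute("""
        SELECT l.trajectory_id, s.trajectory_id, l.user_id
        FROM traj_line l
        JOIN traj_start s
          ON s.user_id = l.user_id
         AND s.trajectory_id <> l.trajectory_id
         AND s.lo <= l.hi AND l.lo <= s.hi
        ORDER BY 1, 2, 3
    """).fetchall()


print(with_loops(conn), set_based(conn))  # [(1, 2, 10)] [(1, 2, 10)]
```

In the real schema, the planner can only do this efficiently if the geometry columns are indexed, so spatial indexes on start_geom and geom are worth checking as well.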

Insert if Update fails

Using H2, I want to attempt to update a row. If it doesn't exist, I'd like to insert it. I'd like to do it all in one single SQL statement, if possible, to avoid concurrency issues.
So far I have the update:
UPDATE RATING
SET NUM_RATINGS = (SELECT NUM_RATINGS + 1 FROM RATING WHERE EVENT_UID = :eventUid)
, SUM_RATINGS = (SELECT SUM_RATINGS + :newRating FROM RATING WHERE EVENT_UID = :eventUid)
WHERE EVENT_UID = :eventUid AND EXISTS ( SELECT * FROM RATING WHERE EVENT_UID = :eventUid)
The table definition is:
CREATE TABLE RATING (
ID BIGINT NOT NULL,
EVENT_UID VARCHAR(255) NOT NULL,
SUM_RATINGS BIGINT NOT NULL,
NUM_RATINGS INT NOT NULL,
PRIMARY KEY (ID),
FOREIGN KEY (EVENT_UID) REFERENCES EVENT(UID)
)
Can anyone improve the Update statement?
How do I, in the same SQL statement, add an Insert like the following, if the row does not exist?
INSERT INTO RATING ( ID , EVENT_UID , NUM_RATINGS , SUM_RATINGS )
VALUES (2, 'BWEIY-A4', 1, 4)
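On the first point, the subqueries in the UPDATE are not needed, because a column on the right-hand side of SET already refers to the row being updated. A minimal sketch of "update, then insert if nothing was updated" follows. It uses Python's sqlite3 module rather than H2 and runs two statements inside one transaction instead of a single statement, so treat it as an illustration of the logic only; the helper add_rating and its new_id argument are hypothetical, and the EVENT foreign key is left out to keep the example self-contained.

```python
import sqlite3


def add_rating(conn, event_uid, new_rating, new_id):
    """Bump the running totals for an event, creating the row if needed."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "UPDATE RATING "
            "SET NUM_RATINGS = NUM_RATINGS + 1, SUM_RATINGS = SUM_RATINGS + ? "
            "WHERE EVENT_UID = ?",
            (new_rating, event_uid),
        )
        if cur.rowcount == 0:  # no existing row for this event
            conn.execute(
                "INSERT INTO RATING (ID, EVENT_UID, NUM_RATINGS, SUM_RATINGS) "
                "VALUES (?, ?, 1, ?)",
                (new_id, event_uid, new_rating),
            )


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE RATING (
        ID BIGINT NOT NULL PRIMARY KEY,
        EVENT_UID VARCHAR(255) NOT NULL,
        SUM_RATINGS BIGINT NOT NULL,
        NUM_RATINGS INT NOT NULL
    )
""")

add_rating(conn, "BWEIY-A4", 4, new_id=2)  # first call inserts the row
add_rating(conn, "BWEIY-A4", 5, new_id=3)  # second call updates it
totals = conn.execute(
    "SELECT ID, NUM_RATINGS, SUM_RATINGS FROM RATING WHERE EVENT_UID = 'BWEIY-A4'"
).fetchall()
print(totals)  # [(2, 2, 9)]
```

With several concurrent sessions, two callers could both see zero updated rows, so a unique constraint on EVENT_UID would be a sensible safeguard against duplicate rows. H2 also documents a MERGE statement, which may cover the single-statement requirement; its exact syntax should be checked against the H2 documentation.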

Update: subquery returned more than 1 value

I am having a problem and can't figure out how to fix this query. I have a table variable in which one column should hold a calculated value: the row's count divided by the sum of the counts in its group. I don't know how to write this so that I avoid the error.
Declare @Temp Table
(
    ZipCode char(5) Not Null,
    StateFacilityId varchar (50) Not Null,
    Cnt int Not Null,
    MarketShare float,
    Row int Not Null,
    Primary Key Clustered (ZipCode, StateFacilityId)
);
Insert Into @Temp (ZipCode, StateFacilityId, Cnt, Row)
Select d.ZipCode, d.StateFacilityId, Cnt = COUNT(*), Row = ROW_NUMBER() OVER (PARTITION BY ZipCode ORDER BY Count(*) DESC)
From [MarketShareIQData].[dbo].[tblServicesDetail] d
Group By d.ZipCode, d.StateFacilityId
;
Update @Temp
Set MarketShare = (h.Cnt / (
    Select SUM(h.Cnt)
    From @Temp h
    Group By ZipCode
))
From @Temp h
A group by would return one row per group. I'm guessing you're looking for the single group with matching zipcode. You could do that like:
update h
set MarketShare = h.Cnt * 1.0 /  -- * 1.0 avoids integer division, since Cnt is an int
    (
        select sum(h2.Cnt)
        from @Temp h2
        where h2.ZipCode = h.ZipCode
    )
from @Temp h
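The correlated subquery can be tried on a few rows to confirm that it yields exactly one total per zip code. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server (an assumption), with a hypothetical table zip_counts and made-up counts. The count is multiplied by 1.0 so that the division is not carried out in integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE zip_counts (
        ZipCode TEXT NOT NULL,
        StateFacilityId TEXT NOT NULL,
        Cnt INTEGER NOT NULL,
        MarketShare REAL,
        PRIMARY KEY (ZipCode, StateFacilityId)
    )
""")
# Hypothetical sample data: two facilities in one zip code, one in another.
conn.executemany(
    "INSERT INTO zip_counts (ZipCode, StateFacilityId, Cnt) VALUES (?, ?, ?)",
    [("10001", "A", 3), ("10001", "B", 1), ("20002", "A", 5)],
)

# The subquery is correlated on ZipCode, so it returns a single value per row.
conn.execute("""
    UPDATE zip_counts
    SET MarketShare = Cnt * 1.0 / (
        SELECT SUM(z2.Cnt)
        FROM zip_counts z2
        WHERE z2.ZipCode = zip_counts.ZipCode
    )
""")

shares = conn.execute(
    "SELECT ZipCode, StateFacilityId, MarketShare "
    "FROM zip_counts ORDER BY ZipCode, StateFacilityId"
).fetchall()
print(shares)  # [('10001', 'A', 0.75), ('10001', 'B', 0.25), ('20002', 'A', 1.0)]
```

The shares within each zip code add up to 1, which is a quick sanity check for the real data as well.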