Simple SQL (hopefully!) Query Question - sql

I hope someone can help me with my faltering steps to formulate a SQL query for the following problem.
I have a simple table that records visitor names and dates. The relationship is many to many, in that for any given date there are many visitors, and for any given visitor there will be one or more dates (i.e. repeat visits). There is a complicating third column that records the name of the exhibit(s) the visitor interacted with. The data might look like this:
NAME ART DATE
Joe Picture 1 23-1-09
Joe Picture 2 23-1-09
Joe Picture 3 23-1-09
Janet Picture 2 23-1-09
Joe Picture 2 31-2-09
I want to know what the distribution of single and multiple visits are, in other words, how many people only visited once, how many people visited on 2 separate days, how many on 3 separate days, and so on.
Can anyone help please? Thank you in anticipation!
Frankie

If you want to count every recorded row for each visitor, including several exhibits viewed on the same date, you could use:
SELECT [Name],
COUNT(*) AS Count_Dates
FROM MyTable
GROUP BY [Name]
However, if you don't want to count multiple visits on the same date, you could use the following:
SELECT [Name],
COUNT(*) AS Count_Dates
FROM
(
SELECT DISTINCT [Name],
[Date]
FROM MyTable
) a
GROUP BY [Name]
To get the distribution itself, that is, how many visitors came on exactly x distinct dates, group those per-visitor counts again, as in the query below. It will not return rows for counts that no visitor has: for example, if nobody visited on 8 separate dates, there won't be a row for Count_Dates = 8. If you want a full list from 0 to 10 visits, you could create a temp table of Count_Dates, insert the values 0 to 10, and then left join the main query to it.
SELECT Count_Dates,
COUNT(*) AS Count_Visitors
FROM (SELECT [Name], COUNT(DISTINCT [Date]) AS Count_Dates FROM MyTable GROUP BY [Name]) a
GROUP BY Count_Dates
ORDER BY Count_Dates
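As a quick check, here is a minimal sketch that runs the distribution query against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in database; the table name MyTable and the in-memory setup are assumptions for illustration only.

```python
import sqlite3

# Hypothetical in-memory copy of the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable ([Name] TEXT, [Art] TEXT, [Date] TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        ("Joe", "Picture 1", "23-1-09"),
        ("Joe", "Picture 2", "23-1-09"),
        ("Joe", "Picture 3", "23-1-09"),
        ("Janet", "Picture 2", "23-1-09"),
        ("Joe", "Picture 2", "31-2-09"),
    ],
)

# Inner query: distinct visit dates per visitor.
# Outer query: how many visitors share each count.
distribution = conn.execute(
    """
    SELECT Count_Dates, COUNT(*) AS Count_Visitors
    FROM (
        SELECT [Name], COUNT(DISTINCT [Date]) AS Count_Dates
        FROM MyTable
        GROUP BY [Name]
    ) a
    GROUP BY Count_Dates
    ORDER BY Count_Dates
    """
).fetchall()

print(distribution)  # [(1, 1), (2, 1)]
```

Janet visited on one date and Joe on two, so one visitor falls in each bucket.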

SELECT NAME, COUNT(ART) AS num_exhibits, COUNT(DISTINCT DATE) AS num_days
FROM MyTable GROUP BY NAME;
This will give you each name along with the total number of exhibit interactions for that name and the number of distinct dates visited.
To get the average number of exhibits per date you can do:
SELECT
NAME,
COUNT(ART) AS num_exhibits,
COUNT(DISTINCT DATE) AS num_days,
-- repeat the aggregates: most databases do not let one SELECT item reference another item's alias
(COUNT(ART) * 1.0 / COUNT(DISTINCT DATE)) AS avg_exhibit_per_day
FROM MyTable GROUP BY NAME;
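Here is a rough sketch of the per-visitor counts and the exhibits-per-day average, run through Python's sqlite3 module on the question's sample rows. The table name MyTable is an assumption. The sketch counts distinct dates, repeats the aggregate expressions instead of referring to their aliases, and multiplies by 1.0 to avoid integer division.

```python
import sqlite3

# Hypothetical in-memory copy of the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable ([Name] TEXT, [Art] TEXT, [Date] TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        ("Joe", "Picture 1", "23-1-09"),
        ("Joe", "Picture 2", "23-1-09"),
        ("Joe", "Picture 3", "23-1-09"),
        ("Janet", "Picture 2", "23-1-09"),
        ("Joe", "Picture 2", "31-2-09"),
    ],
)

per_visitor = conn.execute(
    """
    SELECT [Name],
           COUNT([Art]) AS num_exhibits,
           COUNT(DISTINCT [Date]) AS num_days,
           COUNT([Art]) * 1.0 / COUNT(DISTINCT [Date]) AS avg_exhibit_per_day
    FROM MyTable
    GROUP BY [Name]
    ORDER BY [Name]
    """
).fetchall()

print(per_visitor)  # [('Janet', 1, 1, 1.0), ('Joe', 4, 2, 2.0)]
```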

Related

Delete duplicates using dense rank

I have a sales data table with cust_ids and their transaction dates.
I want to create a table that stores, for every customer, their cust_id, their last purchased date (on the basis of transaction dates) and the count of times they have purchased.
I wrote this code:
SELECT
cust_xref_id, txn_ts,
DENSE_RANK() OVER (PARTITION BY cust_xref_id ORDER BY CAST(txn_ts as timestamp) DESC) AS rank,
COUNT(txn_ts)
FROM
sales_data_table
But I understand that the above code would give an output like this (attached example picture)
How do I modify the code to get an output like :
I am a beginner in SQL queries and would really appreciate any help! :)
This would be an aggregation query which changes the table key from (customer_id, date) to (customer_id)
SELECT
cust_xref_id,
MAX(txn_ts) as last_purchase_date,
COUNT(txn_ts) as count_purchase_dates
FROM
sales_data_table
GROUP BY
cust_xref_id
You are looking for the last purchase date and the count of distinct transaction dates (if a person buys twice on the same date, it should be counted once).
Although you said you want a count of dates, your sample data shows that you want a count of distinct dates: customer 284214 transacted 9 times, but DISTINCT gives 7.
Here is the SQL you can use to get that result.
SELECT
cust_xref_id,
MAX(txn_ts) as last_purchase_date,
COUNT(DISTINCT txn_ts) as count_purchase_dates -- DISTINCT counts each txn_ts value once; cast txn_ts to a date first if it carries a time of day
FROM sales_data_table
GROUP BY 1
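To see how much the DISTINCT matters, here is a small sketch using Python's sqlite3 module. The rows are invented for illustration, and it assumes txn_ts is stored as an ISO-formatted timestamp string. It also shows that counting distinct timestamps and counting distinct calendar dates can differ when a customer buys twice on the same day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_data_table (cust_xref_id INTEGER, txn_ts TEXT)")
# Hypothetical rows: customer 1 buys twice on 2021-01-05 and once on 2021-02-01.
conn.executemany(
    "INSERT INTO sales_data_table VALUES (?, ?)",
    [
        (1, "2021-01-05 10:00:00"),
        (1, "2021-01-05 15:30:00"),
        (1, "2021-02-01 09:00:00"),
        (2, "2021-03-10 12:00:00"),
    ],
)

summary = conn.execute(
    """
    SELECT cust_xref_id,
           MAX(txn_ts) AS last_purchase_date,
           COUNT(DISTINCT txn_ts) AS count_timestamps,
           COUNT(DISTINCT date(txn_ts)) AS count_purchase_dates
    FROM sales_data_table
    GROUP BY cust_xref_id
    ORDER BY cust_xref_id
    """
).fetchall()

for row in summary:
    print(row)
# (1, '2021-02-01 09:00:00', 3, 2)
# (2, '2021-03-10 12:00:00', 1, 1)
```

The date() call is SQLite's; other databases would use something like CAST(txn_ts AS date) instead.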

Link data from sql from 2/3 columns

I don't know if the question title is very clear, but here is my question:
I have a table, UsersMovements, which contains users along with their movements.
UsersMovements:
ID
UserID
MovementID
Comments
Time/Date
I need a query that tells me whether users 1, 2 and 3 have been in a common MovementID, given that I don't know the MovementID in advance.
The real case is that I want to see whether the X users I select have been in the same area (within a limited time interval, assuming I have the date/time in the table).
Thank you
If you want to select the list of movements that include user IDs 1, 2 and 3, you can use GROUP BY with HAVING:
select movementid
from UsersMovements
where userid in(1,2,3)
group by movementid
having count(distinct userid)=3
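A minimal sketch of the same GROUP BY / HAVING idea, run through Python's sqlite3 module so the list of users can vary. The sample rows and the variable names (wanted, common) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UsersMovements ("
    "ID INTEGER PRIMARY KEY, UserID INTEGER, MovementID INTEGER)"
)
# Hypothetical data: only movement 10 contains all of users 1, 2 and 3.
conn.executemany(
    "INSERT INTO UsersMovements (UserID, MovementID) VALUES (?, ?)",
    [
        (1, 10), (2, 10), (3, 10),
        (1, 20), (2, 20),
        (1, 30), (1, 30), (3, 30), (4, 30),  # user 1 appears twice, user 2 never
    ],
)

wanted = [1, 2, 3]
placeholders = ", ".join("?" for _ in wanted)

# COUNT(DISTINCT UserID) must equal the number of wanted users,
# so repeated rows for one user do not count twice.
common = [
    row[0]
    for row in conn.execute(
        f"""
        SELECT MovementID
        FROM UsersMovements
        WHERE UserID IN ({placeholders})
        GROUP BY MovementID
        HAVING COUNT(DISTINCT UserID) = ?
        ORDER BY MovementID
        """,
        wanted + [len(wanted)],
    )
]

print(common)  # [10]
```

To restrict the check to a time interval, you would add a condition on the date/time column to the WHERE clause.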

Make table ID appear as a column and select across all tables

I've been requested by my superiors to write a query that will search every table in a database (each representative of a road and their total counts of traffic) and take the total counts by hour of motorcycles. Here's what I have so far whilst testing on one table:
WITH
totalCount AS
(
SELECT DATEDIFF(dd,0,event_time) AS DaySerial,
DATEPART(dd,event_time) AS theDay,
DATEDIFF(mm,0,event_time) AS MonthSerial,
DATEPART(mm,event_time) AS MonthofYear,
DATEDIFF(hh,0,event_time) AS HourSerial,
DATEPART(hh,event_time) AS Hour,
COUNT(*) AS HourlyCount,
DATEDIFF(yy,0,event_time) AS YearSerial,
DATEPART(yy,event_time) AS theYear
FROM [RUD].dbo.[10011E]
WHERE length <='1.7'
GROUP BY DATEDIFF(hh,0,event_time),
DATEPART(hh,event_time),
DATEDIFF(dd,0,event_time),
DATEPART(dd,event_time),
DATEDIFF(mm,0,event_time),
DATEPART(mm,event_time),
DATEDIFF(yy,0,event_time),
DATEPART(yy,event_time)
)
SELECT
theYear,
MonthofYear,
theDay,
Hour,
AVG(HourlyCount) AS Avg_Count
FROM
totalCount
GROUP BY
theYear,
MonthofYear,
theDay,
Hour
ORDER BY
theYear,
MonthofYear,
theDay,
Hour
Now I'm sure some of this is redundant or not needed, that's ok for now (I'm new to SQL btw, which is why some of this will be redundant). Basically as it stands, I list the year, month, date, hour and hourly count of motorcycles for one road. Now my two questions:
How do I take this query and make it so that it searches across every single table in the RUD database? Do I just need to list them all and UNION them, or is there a quicker way?
I realise if I search through every table gathering only the above (year, month, day, hour, hourly count) I will end up with the right data but with no way to distinguish which road all the counts are coming from. Is there a way to select the table ID (in this example, 10011E is the ID, and is the assigned name for a specific road) and place it in a column next to the rows that were selected from it?
If anyone needs clarification on what I mean, please let me know! Thanks!
One option would be to use UNION ALL and add an extra column that identifies the source table. You'll have to write out each of your tables in this case, but it's perhaps your fastest option:
SELECT ID, 'YourTable' TableName
FROM YourTable
UNION ALL
SELECT ID, 'YourOtherTable'
FROM YourOtherTable
....
Alternatively, dynamic SQL could produce the same results. You might not have to type out all your table names, but it comes with a performance hit.
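As a rough illustration of the dynamic approach, the sketch below builds the UNION ALL statement from the database catalog instead of typing each table out. It uses Python's sqlite3 module as a stand-in (the question is about SQL Server, where the catalog would be something like INFORMATION_SCHEMA.TABLES). The second table "10012W", the sample rows and the helper names are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two hypothetical road tables; "10012W" is invented for illustration.
for road in ("10011E", "10012W"):
    conn.execute(f'CREATE TABLE "{road}" (event_time TEXT, length REAL)')
conn.executemany('INSERT INTO "10011E" VALUES (?, ?)', [
    ("2014-05-01 08:15:00", 1.5),
    ("2014-05-01 08:40:00", 1.6),
    ("2014-05-01 09:05:00", 4.2),  # longer than 1.7, so filtered out
])
conn.executemany('INSERT INTO "10012W" VALUES (?, ?)', [
    ("2014-05-01 08:20:00", 1.4),
])

# Read the table names from the catalog instead of listing them by hand.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

def quote_ident(name):
    """Quote a table name so it can be used as an identifier."""
    return '"' + name.replace('"', '""') + '"'

def quote_literal(value):
    """Quote a string so it can be used as a SQL text literal."""
    return "'" + value.replace("'", "''") + "'"

# One SELECT per table, each tagged with the table name as road_id.
parts = [
    f"SELECT {quote_literal(t)} AS road_id, "
    f"strftime('%Y-%m-%d %H', event_time) AS event_hour, "
    f"COUNT(*) AS hourly_count "
    f"FROM {quote_ident(t)} WHERE length <= 1.7 "
    f"GROUP BY strftime('%Y-%m-%d %H', event_time)"
    for t in tables
]
query = " UNION ALL ".join(parts) + " ORDER BY road_id, event_hour"
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
# ('10011E', '2014-05-01 08', 2)
# ('10012W', '2014-05-01 08', 1)
```

Table names cannot be passed as bound parameters, which is why the sketch quotes them itself; only do this with names that come from the catalog, never from user input.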

SQL - Add total from columns of two separate tables

Complete novice here, trying to find out the total number of students from both the part-time students and full-time students and display the total in a named column.
partTimeStudents(bannerID, moduleCode, modStartDate, rvisitorID)
fullTimeStudents(bannerID, courseCode, crsStartDate, rvisitorID)
Thank you in advance for any help given :)
select
(select count(*) from partTimeStudents)+
(select count(*) from fullTimeStudents) as Total
I agree with what others have said about the database design, but here's one example of a query that would fulfill your requirement.
SELECT SUM(students_count) AS Total
FROM (
SELECT COUNT(*) AS students_count
FROM partTimeStudents
UNION ALL
SELECT COUNT(*) AS students_count
FROM fullTimeStudents
) counts -- most databases require an alias on a derived table
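For what it's worth, here is a small sketch showing that the scalar-subquery version and the UNION ALL version give the same total. It runs both through Python's sqlite3 module on made-up rows; only two columns per table are created, to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE partTimeStudents (bannerID INTEGER, moduleCode TEXT)")
conn.execute("CREATE TABLE fullTimeStudents (bannerID INTEGER, courseCode TEXT)")
# Hypothetical rows: 2 part-time and 3 full-time students.
conn.executemany("INSERT INTO partTimeStudents VALUES (?, ?)",
                 [(1, "M1"), (2, "M2")])
conn.executemany("INSERT INTO fullTimeStudents VALUES (?, ?)",
                 [(3, "C1"), (4, "C1"), (5, "C2")])

# Approach 1: add two scalar subqueries.
total_scalar = conn.execute(
    """
    SELECT (SELECT COUNT(*) FROM partTimeStudents)
         + (SELECT COUNT(*) FROM fullTimeStudents) AS Total
    """
).fetchone()[0]

# Approach 2: stack the two counts with UNION ALL and sum them.
total_union = conn.execute(
    """
    SELECT SUM(students_count) AS Total
    FROM (
        SELECT COUNT(*) AS students_count FROM partTimeStudents
        UNION ALL
        SELECT COUNT(*) AS students_count FROM fullTimeStudents
    ) counts
    """
).fetchone()[0]

print(total_scalar, total_union)  # 5 5
```

Note that UNION ALL (not UNION) matters here: if both tables happened to have the same count, plain UNION would collapse the two rows into one.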
You should not use two tables just to distinguish full-time from part-time students. You can simply add a flag column to a single table, like:
students(bannerID, Code, modStartDate, rvisitorID, timeFlag)
The timeFlag column records whether the student is full time or part time.
Now to get the count of all students (full time + part time):
select count(*) from students;
And to get the counts for full-time or part-time students:
select count(*) from students where timeFlag=1; -- assuming 1 is for full time
select count(*) from students where timeFlag=0; -- assuming 0 is for part time

Compute Users average weight

I have two tables, Users and DoctorVisit
User
- UserID
- Name
DoctorsVisit
- UserID
- Weight
- Date
The doctorVisit table contains all the visits a particular user did to the doctor.
The user's weight is recorded per visit.
Query: Sum up all the Users weight, using the last doctor's visit's numbers. (then divide by number of users to get the average weight)
Note: some users may have not visited the doctor at all, while others may have visited many times.
I need the average weight of all users, but using the latest weight.
Update
I want the average weight across all users.
If I understand your question correctly, you should be able to get the average weight of all users based on their last visit from the following SQL statement. We use a subquery to get the last visit as a filter.
SELECT AVG(latest.weight) FROM (SELECT uv.weight FROM uservisit uv INNER JOIN
(SELECT userid, MAX(dateVisited) AS DateVisited FROM uservisit GROUP BY userid) us
ON us.UserID = uv.UserId AND us.DateVisited = uv.DateVisited) latest
I should point out that this assumes UserID uniquely identifies a user. Also, if DateVisited holds only a date and not a time, a patient who visits twice on the same day could skew the data.
This should get you each user's average weight across all of their visits (users with no visits get NULL):
select user.name, temp.AvgWeight
from user left outer join (select userid, avg(weight) as AvgWeight
from doctorsvisit
group by userid) temp
on user.userid = temp.userid
Write a query to select the most recent weight for each user (QueryA), and use that query as an inner select of a query to select the average (QueryB), e.g.,
SELECT AVG(weight) FROM (QueryA)
I think there's a mistake in your specs.
If you divide by all the users, your average will be too low. Each user that has no doctor visits will tend to drag the average towards zero. I don't believe that's what you want.
I'm too lazy to come up with an actual query, but it's going to be one of these things where you use a self join between the base table and a query with a group by that pulls out all the relevant Id, Visit Date pairs from the base table. The only thing you need the User table for is the Name.
We had a sample of the same problem in here a couple of weeks ago, I think. By the "same problem", I mean the problem where we want an attribute of the representative of a group, but where the attribute we want isn't included in the group by clause.
I think this will work, though I could be wrong:
Use an inner select to make sure you have the most recent visit, then use AVG. Your User table in this example is superfluous: since you have no weight data there and you don't care about user names, it doesn't do you any good to examine it.
SELECT AVG(dv.Weight)
FROM DoctorsVisit dv
WHERE dv.Date = (
SELECT MAX(Date)
FROM DoctorsVisit innerdv
WHERE innerdv.UserID = dv.UserID
)
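Here is a minimal sketch of the correlated-subquery approach on invented data, using Python's sqlite3 module. The three users and their weights are assumptions. It also computes the "divide by all users" figure to show how a user with no visits drags that number down.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (UserID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE DoctorsVisit (UserID INTEGER, Weight REAL, [Date] TEXT)")
# Hypothetical data: user 3 has never visited the doctor.
conn.executemany("INSERT INTO Users VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cid")])
conn.executemany("INSERT INTO DoctorsVisit VALUES (?, ?, ?)", [
    (1, 80.0, "2009-01-01"),
    (1, 70.0, "2009-03-01"),  # latest visit for user 1
    (2, 90.0, "2009-02-01"),  # only visit for user 2
])

latest_filter = """
    dv.[Date] = (
        SELECT MAX(innerdv.[Date])
        FROM DoctorsVisit innerdv
        WHERE innerdv.UserID = dv.UserID
    )
"""

# Average over users who have at least one visit.
avg_latest = conn.execute(
    f"SELECT AVG(dv.Weight) FROM DoctorsVisit dv WHERE {latest_filter}"
).fetchone()[0]

# Sum of latest weights divided by ALL users, including those with no visits.
avg_all_users = conn.execute(
    f"""
    SELECT SUM(dv.Weight) / (SELECT COUNT(*) FROM Users)
    FROM DoctorsVisit dv
    WHERE {latest_filter}
    """
).fetchone()[0]

print(avg_latest)               # 80.0
print(round(avg_all_users, 2))  # 53.33
```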
If you're using SQL Server 2005 or later you don't need the subquery with the GROUP BY.
You can use the ROW_NUMBER function with PARTITION BY.
SELECT AVG(a.weight) FROM
(select
ROW_NUMBER() OVER(PARTITION BY dv.UserId ORDER BY Date desc) as ID,
dv.weight
from
DoctorsVisit dv) a
WHERE a.Id = 1
As someone else has mentioned though, this is the average weight across all the users who have VISITED the doctor. If you want the average weight across ALL of the users then anyone not visiting the doctor will give a misleading average.
Here's my stab at the solution:
select
avg(a.Weight) as AverageWeight
from
DoctorsVisit as a
inner join
(select
UserID,
max (Date) as LatestDate
from
DoctorsVisit
group by
UserID) as b
on a.UserID = b.UserID and a.Date = b.LatestDate;
Note that the User table isn't used at all.
This average entirely omits users who have no doctor's visits at all, or whose weight is recorded as NULL in their latest visit. It is also skewed if a user has more than one visit on the same date and that date is the user's latest, because each of those weigh-ins is counted separately.
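The same-date skew can be seen in a small sketch using Python's sqlite3 module and invented rows. The second query, which first averages each user's tied latest weigh-ins so every user contributes once, is just one possible workaround and is an assumption, not something stated in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DoctorsVisit (UserID INTEGER, Weight REAL, [Date] TEXT)")
# Hypothetical data: user 1 was weighed twice on their latest date.
conn.executemany("INSERT INTO DoctorsVisit VALUES (?, ?, ?)", [
    (1, 85.0, "2009-01-01"),
    (1, 70.0, "2009-03-01"),
    (1, 74.0, "2009-03-01"),
    (2, 90.0, "2009-02-01"),
])

latest_join = """
    DoctorsVisit AS a
    INNER JOIN (
        SELECT UserID, MAX([Date]) AS LatestDate
        FROM DoctorsVisit
        GROUP BY UserID
    ) AS b
    ON a.UserID = b.UserID AND a.[Date] = b.LatestDate
"""

# Plain join: user 1 contributes two rows (70 and 74), user 2 one row (90).
join_avg = conn.execute(
    f"SELECT AVG(a.Weight) FROM {latest_join}"
).fetchone()[0]

# Workaround sketch: collapse ties to one value per user, then average.
per_user_avg = conn.execute(
    f"""
    SELECT AVG(UserWeight)
    FROM (
        SELECT a.UserID, AVG(a.Weight) AS UserWeight
        FROM {latest_join}
        GROUP BY a.UserID
    ) AS per_user
    """
).fetchone()[0]

print(join_avg)      # 78.0
print(per_user_avg)  # 81.0
```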