I have two tables called User_History and List_Of_Event.
I'm trying to get all the results from the User_History table with the query below, but it returns nothing when the Event_ID column in User_History is blank.
How can I still get all the rows even if the Event_Id column in User_History is entirely blank/empty?
Select Event_Name As 'Event', GIN, GID, UPN, OneDrive, SharePoint, Mailbox, Event_Date, Extra
From List_of_Events E
Inner Join User_History H
on E.Event_Id = H.Event_Id
List_Of_Event has Event_Id (int) and Event_Name (varchar).
User_History has Event_Id (int) and other varchar columns.
You need a LEFT JOIN from User_History to the List_of_Events table, so that history rows without a matching event are kept:
Select Event_Name As 'Event'
, GIN, GID, UPN
, OneDrive, SharePoint
, Mailbox, Event_Date, Extra
From User_History H
left Join List_of_Events E
on E.Event_Id = H.Event_Id
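To see the difference on a small scale, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the reduced column list are assumptions made for illustration, not the asker's real data. The history row whose Event_Id is NULL disappears under the inner join but survives under the left join, with Event_Name coming back as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE List_of_Events (Event_Id INTEGER, Event_Name TEXT);
    CREATE TABLE User_History (Event_Id INTEGER, UPN TEXT);
    INSERT INTO List_of_Events VALUES (1, 'Created'), (2, 'Deleted');
    -- The second history row has no event assigned (NULL Event_Id).
    INSERT INTO User_History VALUES (1, 'alice@example.com'), (NULL, 'bob@example.com');
""")

# Inner join: only history rows with a matching event are returned.
inner_rows = conn.execute("""
    SELECT E.Event_Name, H.UPN
    FROM List_of_Events E
    INNER JOIN User_History H ON E.Event_Id = H.Event_Id
    ORDER BY H.UPN
""").fetchall()

# Left join from User_History: every history row is returned.
left_rows = conn.execute("""
    SELECT E.Event_Name, H.UPN
    FROM User_History H
    LEFT JOIN List_of_Events E ON E.Event_Id = H.Event_Id
    ORDER BY H.UPN
""").fetchall()

print(inner_rows)
print(left_rows)
```

The table on the left side of a LEFT JOIN is the one whose rows are always preserved, which is why User_History moves into the FROM clause.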
Related
I have two tables, user and friend; please see the ER diagram.
I want to use SELECT to create one table with "name" from the USER table, "id" as the foreign key for the two tables, and the count of friend_id as the number of friends each user has.
Here is my code:
SELECT name, id, (SELECT count(friend_id) as number
FROM friend
GROUP BY user_id)
FROM user
ORDER BY number DESC
I'm wondering what's the problem with these lines. Thank you!
You can use a subquery to calculate the count.
SELECT name, id, COALESCE(f.Count, 0) AS friend_count
FROM user u
LEFT JOIN (
SELECT user_id, COUNT(DISTINCT friend_id) AS Count
FROM friend
GROUP BY user_id
) f ON f.user_id = u.id
ORDER BY friend_count DESC
I used a LEFT JOIN so that a user without any rows in friend is still returned, with a friend count of 0 (thanks to COALESCE). I also added DISTINCT so that a friend who appears more than once is counted only once; this might not be necessary, especially if you have a UNIQUE INDEX on the columns (user_id, friend_id).
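As a quick check of that behaviour, the following sketch runs the same query shape against an in-memory SQLite database. The three users and the friend rows (including one deliberate duplicate) are made-up sample data, and the table name is quoted only as a precaution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE friend (user_id INTEGER, friend_id INTEGER);
    INSERT INTO "user" VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cat');
    -- Ann's friendship with user 3 is stored twice; Cat has no friends.
    INSERT INTO friend VALUES (1, 2), (1, 3), (1, 3), (2, 1);
""")

friend_counts = conn.execute("""
    SELECT u.name, u.id, COALESCE(f.cnt, 0) AS friend_count
    FROM "user" u
    LEFT JOIN (
        SELECT user_id, COUNT(DISTINCT friend_id) AS cnt
        FROM friend
        GROUP BY user_id
    ) f ON f.user_id = u.id
    ORDER BY friend_count DESC, u.id
""").fetchall()

print(friend_counts)
```

Cat still appears with a count of 0, and Ann's duplicated friend row is counted once.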
Just add a WHERE clause that ties the subquery to the current user's id, and remove the GROUP BY, because the subquery then covers only one user. The alias also has to sit outside the subquery so that ORDER BY can refer to it.
SELECT name, id, (SELECT COUNT(friend_id)
FROM friend
WHERE user_id = user.id) AS number
FROM user
ORDER BY number DESC
I think this will be correct for your purpose:
CREATE TABLE #user(
id VARCHAR(22),
[name] VARCHAR(255),
)
CREATE TABLE #friend(
user_id VARCHAR(22),
friend_id VARCHAR(22)
)
-- COUNT returns 0 (never NULL) when no rows match, so the correlated
-- subquery needs neither COALESCE nor GROUP BY.
SELECT name, id, (SELECT COUNT(friend_id)
FROM #friend f
WHERE f.user_id = u.id) AS number
FROM #user u
ORDER BY number DESC
--Same query with join:
SELECT u.[name], u.id, COUNT(f.friend_id) AS number
FROM #user u
LEFT JOIN #friend f ON f.user_id = u.id
GROUP BY u.[name], u.id
ORDER BY number DESC
In short: 3 table inner join duplicates records
I have data in BigQuery in 3 tables:
Pageviews with columns:
timestamp
user_id
title
path
Contacts with columns:
website_user_id
email
company_id
Companies with columns:
id
name
I want to display all recorded pageviews and, if user and/or company is known, display this data next to pageview.
First, I join contact and pageviews data (SQL is generated by Metabase business intelligence tool):
SELECT
`analytics.pageviews`.`timestamp` AS `timestamp`,
`analytics.pageviews`.`title` AS `title`,
`analytics.pageviews`.`path` AS `path`,
`Contacts`.`email` AS `email`
FROM `analytics.pageviews`
INNER JOIN `analytics.contacts` `Contacts` ON `analytics.pageviews`.`user_id` = `Contacts`.`website_user_id`
ORDER BY `timestamp` DESC
It works as expected and I can see pageviews attributed to known contacts.
Next, I'd like to show pageviews of contacts with known company and which company is this:
SELECT
`analytics.pageviews`.`timestamp` AS `timestamp`,
`analytics.pageviews`.`title` AS `title`,
`analytics.pageviews`.`path` AS `path`,
`Contacts`.`email` AS `email`,
`Companies`.`name` AS `name`
FROM `analytics.pageviews`
INNER JOIN `analytics.contacts` `Contacts` ON `analytics.pageviews`.`user_id` = `Contacts`.`website_user_id`
INNER JOIN `analytics.companies` `Companies` ON `Contacts`.`company_id` = `Companies`.`id`
ORDER BY `timestamp` DESC
With this query I would expect to see only pageviews where associated contact AND company are known (just another column for company name). The problem is, I get duplicate rows for every pageview (sometimes 5, sometimes 20 identical rows).
I want to avoid selecting DISTINCT timestamps because it can lead to excluding valid pageviews from different users but with identical timestamp.
How to approach this?
Your description sounds like you have duplicates in companies. This is easy to test for:
select c.id, count(*)
from `analytics.companies` c
group by c.id
having count(*) >= 2;
You can get the details using window functions:
select c.*
from (select c.*, count(*) over (partition by c.id) as cnt
from `analytics.companies` c
) c
where cnt >= 2
order by cnt desc, id;
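To illustrate how a duplicated company id multiplies the joined rows, here is a small sketch with Python's sqlite3 module. The table layouts are trimmed versions of the ones in the question and the rows are invented, so treat it as a demonstration of the mechanism rather than a reproduction of the BigQuery data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pageviews (ts TEXT, user_id TEXT, path TEXT);
    CREATE TABLE contacts (website_user_id TEXT, email TEXT, company_id INTEGER);
    CREATE TABLE companies (id INTEGER, name TEXT);
    INSERT INTO pageviews VALUES ('2020-01-01 10:00', 'u1', '/pricing');
    INSERT INTO contacts VALUES ('u1', 'a@example.com', 10);
    -- Company 10 was loaded twice, so its id is not unique.
    INSERT INTO companies VALUES (10, 'Acme'), (10, 'Acme'), (20, 'Globex');
""")

# One pageview, but the join matches both copies of company 10.
joined = conn.execute("""
    SELECT p.ts, p.path, c.email, co.name
    FROM pageviews p
    INNER JOIN contacts c ON p.user_id = c.website_user_id
    INNER JOIN companies co ON c.company_id = co.id
""").fetchall()

# The duplicate check from the answer pinpoints the offending id.
dupes = conn.execute("""
    SELECT id, COUNT(*)
    FROM companies
    GROUP BY id
    HAVING COUNT(*) >= 2
""").fetchall()

print(len(joined), dupes)
```

Each extra copy of a company row adds one more identical output row per pageview, which would match the "sometimes 5, sometimes 20" pattern if some ids were loaded that many times.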
Goal: Create a query that calculates the ratio of ids that have/don't have a particular attribute.
Table 1: events
Fields: event_id, event_name, user_id
Field event_id is unique key/index
Field event_name has 3 potential values, one of which is the one being inspected.
Field user_id is a foreign key from Table 2
Table 2: users
Fields: id (and a long list of other attributes that aren't pertinent)
To get the list of user_ids with the qualifying attribute, I created the following:
SELECT DISTINCT events.user_id AS viewing_ids
FROM events
WHERE event_name = 'view_user_profile'
As I would expect, this provides the list of users that have the corresponding event_name associated with their user_id.
The next part is where I'm getting mixed up. Yes, I could wrap the select in COUNT(DISTINCT ...) to get the number of ids that have the event 'view_user_profile', but that only provides half the answer. What I need to do is join that list with the full list of ids from the users table and then determine whether each id exists in it or not.
I'm thinking the initial SELECT needs to be
SELECT
(CASE WHEN viewers IS NULL THEN false
ELSE true END) AS has_viewed_profile
, COUNT(user_id) AS users
FROM
(SELECT DISTINCT events.user_id AS viewing_ids
FROM events
WHERE event_name = 'view_user_profile') viewers
LEFT JOIN
users
ON
??? = users.id
This is where I get lost, I don't have a column name for viewers...
I think this is what you want:
-- Multiply by 1.0 so that engines with integer division do not truncate the ratio to 0.
select count(e.user_id) * 1.0 / count(*) as view_ratio
from users u left join
(select distinct e.user_id
from events e
where e.event_name = 'view_user_profile'
) e
on e.user_id = u.id;
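A minimal sketch of this pattern, run against an in-memory SQLite database with invented rows, is shown here. It multiplies by 1.0 because SQLite (like several other engines) performs integer division on two integer counts, and it adds a second, hypothetical query that gives the true/false breakdown the question was aiming for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE events (event_id INTEGER PRIMARY KEY, event_name TEXT, user_id INTEGER);
    INSERT INTO users VALUES (1), (2), (3), (4);
    -- Users 1 and 2 viewed a profile (user 1 twice); user 3 only logged in.
    INSERT INTO events VALUES
        (1, 'view_user_profile', 1),
        (2, 'view_user_profile', 1),
        (3, 'view_user_profile', 2),
        (4, 'login', 3);
""")

viewers = """
    SELECT DISTINCT user_id
    FROM events
    WHERE event_name = 'view_user_profile'
"""

# COUNT(e.user_id) skips the NULLs produced by unmatched users.
ratio = conn.execute(f"""
    SELECT COUNT(e.user_id) * 1.0 / COUNT(*) AS view_ratio
    FROM users u
    LEFT JOIN ({viewers}) e ON e.user_id = u.id
""").fetchone()[0]

# Same join, grouped into "has viewed" (1) versus "has not" (0).
breakdown = conn.execute(f"""
    SELECT e.user_id IS NOT NULL AS has_viewed_profile, COUNT(*) AS users
    FROM users u
    LEFT JOIN ({viewers}) e ON e.user_id = u.id
    GROUP BY has_viewed_profile
    ORDER BY has_viewed_profile
""").fetchall()

print(ratio, breakdown)
```

Starting from users and left-joining the viewer list keeps every user in the result, so the unmatched rows are exactly the users without the event.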
There is probably a much better way to create these views. I have limited SQL experience, so this is the way I designed it; I am hoping some of you SQL gurus can point me in a more efficient direction.
I essentially have 3 tables (sometimes 4) in my view, here is the essential structure:
Table USER
USER_ID | EMAIL | PASSWORD | CREATED_DATE
(Indexes: USER_ID)
Table USER_META
ID | USER_ID | NAME | VALUE
(Indexes: ID,USER_ID,NAME)
Table USER_SCORES
ID | USER_ID | GAME_ID | SCORE | CREATED_DATE
(Indexes: ID,USER_ID)
All the tables use the first ID column as an auto-increment primary key.
The second table "USER_META" is where I keep all the contact info and other misc. Primarily it is first_name,last_name, street,city, etc. - Depending on the user this could be 4 items or 140, which is why I use this table instead of having 150 columns in my USER table.
For reports, searching and editing I need about 20 values from USER_META, so I have views that look like this:
View V_USR_META
select USER_ID,EMAIL,
(select VALUE from USER_META
where NAME = 'FIRST_NAME' and USER_ID = u.USER_ID) as first_name,
(select VALUE from USER_META
where NAME = 'LAST_NAME' and USER_ID = u.USER_ID) as last_name,
(select VALUE from USER_META
where NAME = 'CITY' and USER_ID = u.USER_ID) as city,
(select VALUE from USER_META
where NAME = 'STATE' and USER_ID = u.USER_ID) as state,
(select VALUE from USER_META
where NAME = 'ZIP' and USER_ID = u.USER_ID) as zip,
/* 10 more selects for different meta values here */
(select max(SCORE) from USER_SCORES
where USER_ID = u.USER_ID) as high_score,
(select top (1) CREATED_DATE from USER_SCORES
where USER_ID = u.USER_ID
order by id desc) as last_game
from USER u
This gets pretty slow, and there are actually many more subqueries; this is just to illustrate the query. I also have to query a few other tables to get misc. info about the user.
I use the view when searching for a user, searches use name or userid or email or score, etc. I also use it to populate the user information screen when I present all the data in one place.
So - Is there a better way to write the view?
An alternative to all of those correlated subqueries would be to use max with case:
select u.USER_ID,
u.EMAIL,
max(case when um.name = 'FIRST_NAME' then um.value end) first_name,
max(case when um.name = 'LAST_NAME' then um.value end) last_name
...
from USER u
left join USER_META um
on u.user_id = um.user_id
group by u.user_id, u.email
Then you could add the user_scores results:
select u.USER_ID,
u.EMAIL,
max(case when um.name = 'FIRST_NAME' then um.value end) first_name,
max(case when um.name = 'LAST_NAME' then um.value end) last_name
...,
max(us.score) maxscore,
max(us.created_date) maxcreateddate
from USER u
left join USER_META um
on u.user_id = um.user_id
left join USER_SCORES us
on u.user_id = us.user_id
group by u.user_id, u.email
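The max-with-case approach can be tried out with the sketch below, which uses Python's sqlite3 module and a handful of invented rows (the e-mail addresses, names, and scores are assumptions). Only two meta values are pivoted to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "USER" (USER_ID INTEGER PRIMARY KEY, EMAIL TEXT);
    CREATE TABLE USER_META (ID INTEGER PRIMARY KEY, USER_ID INTEGER, NAME TEXT, VALUE TEXT);
    CREATE TABLE USER_SCORES (ID INTEGER PRIMARY KEY, USER_ID INTEGER, SCORE INTEGER);
    INSERT INTO "USER" VALUES (1, 'ann@example.com'), (2, 'bob@example.com');
    -- User 2 has no LAST_NAME row and no scores.
    INSERT INTO USER_META VALUES
        (1, 1, 'FIRST_NAME', 'Ann'),
        (2, 1, 'LAST_NAME', 'Lee'),
        (3, 2, 'FIRST_NAME', 'Bob');
    INSERT INTO USER_SCORES VALUES (1, 1, 50), (2, 1, 80);
""")

# Each CASE yields the value for one meta name and NULL otherwise;
# MAX then collapses the group to the single non-NULL value.
rows = conn.execute("""
    SELECT u.USER_ID,
           u.EMAIL,
           MAX(CASE WHEN um.NAME = 'FIRST_NAME' THEN um.VALUE END) AS first_name,
           MAX(CASE WHEN um.NAME = 'LAST_NAME' THEN um.VALUE END) AS last_name,
           MAX(us.SCORE) AS high_score
    FROM "USER" u
    LEFT JOIN USER_META um ON u.USER_ID = um.USER_ID
    LEFT JOIN USER_SCORES us ON u.USER_ID = us.USER_ID
    GROUP BY u.USER_ID, u.EMAIL
    ORDER BY u.USER_ID
""").fetchall()

print(rows)
```

Joining both child tables multiplies the intermediate rows per user, which is harmless for MAX but would inflate COUNT or SUM; for those, aggregating each child table separately before joining is the safer shape.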
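-- Alternative: pivot USER_META and pre-aggregate USER_SCORES in CTEs, then join.
WITH Meta AS (
SELECT USER_ID
,FIRST_NAME
,LAST_NAME
,CITY
,STATE
,ZIP
FROM (
-- Keep only the pivot inputs; an extra column such as ID would
-- become a grouping column and produce one row per meta entry.
SELECT USER_ID, NAME, VALUE
FROM USER_META
) AS src
PIVOT (
MAX(VALUE) FOR NAME IN (FIRST_NAME, LAST_NAME, CITY, STATE, ZIP)
) AS p
)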
,MaxScores AS (
SELECT USER_ID
,MAX(SCORE) AS Score
FROM USER_SCORES
GROUP BY USER_ID
)
,LastGames AS (
SELECT USER_ID
,MAX(CREATED_DATE) AS GameDate
FROM USER_SCORES
GROUP BY USER_ID
)
SELECT USER.USER_ID
,USER.EMAIL
,Meta.FIRST_NAME
,Meta.LAST_NAME
,Meta.CITY
,Meta.STATE
,Meta.ZIP
,MaxScores.Score
,LastGames.GameDate
FROM USER
INNER JOIN Meta
ON USER.USER_ID = Meta.USER_ID
LEFT JOIN MaxScores
ON USER.USER_ID = MaxScores.USER_ID
LEFT JOIN LastGames
ON USER.USER_ID = LastGames.USER_ID
I am using the following query to get the transactions made to and from a user. I then want to retrieve the username for both the sender_id and the recipient_id, but I can only seem to get it for one or the other. Does anyone have any ideas how I can get both?
SELECT us.name, ta.amount, ta.recipient_id, ta.sender_id, ta.timestamp_insert
FROM `transactions` AS ta
JOIN users AS us
ON ta.recipient_id=us.u_id
WHERE ta.sender_id =111111 OR ta.recipient_id = 111111
LIMIT 0 , 10
Transactions Table Columns:
transaction_id
tw_id
tw
sender_id
recipient_id
amount
timestamp_insert
timestamp_start
timestamp_complete
transaction_status
User Table Columns:
u_id,
name
You need to join twice, thus:
SELECT ta.amount, ta.recipient_id, ta.sender_id, ta.timestamp_insert, sender.name as Sender, recipient.name as Recipient
FROM `transactions` AS ta
JOIN users AS recipient
ON ta.recipient_id=recipient.u_id
JOIN users AS sender
ON ta.sender_id=sender.u_id
WHERE ta.sender_id =111111 OR ta.recipient_id = 111111
LIMIT 0 , 10
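Here is a small runnable sketch of the double join, using Python's sqlite3 module. The transactions table is reduced to the columns the query needs, and the users and amounts are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (u_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE transactions (
        transaction_id INTEGER PRIMARY KEY,
        sender_id INTEGER,
        recipient_id INTEGER,
        amount INTEGER
    );
    INSERT INTO users VALUES (111111, 'Ann'), (222222, 'Ben'), (333333, 'Cy');
    INSERT INTO transactions VALUES
        (1, 111111, 222222, 5),
        (2, 222222, 111111, 7),
        (3, 222222, 333333, 9);
""")

# The users table is joined twice under different aliases:
# once to resolve the sender, once to resolve the recipient.
rows = conn.execute("""
    SELECT ta.amount, sender.name AS Sender, recipient.name AS Recipient
    FROM transactions AS ta
    JOIN users AS recipient ON ta.recipient_id = recipient.u_id
    JOIN users AS sender ON ta.sender_id = sender.u_id
    WHERE ta.sender_id = 111111 OR ta.recipient_id = 111111
    ORDER BY ta.transaction_id
""").fetchall()

print(rows)
```

If a transaction could reference a user id that is missing from users, switching both joins to LEFT JOIN would keep that transaction with a NULL name instead of dropping it.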